Wednesday, June 27, 2007

My 64-bit porting experiences - VI

6. Excessive code-reuse

We have always been instructed to reuse code, as it makes the code more maintainable. But it can sometimes be taken too far (too much of a good thing ?!).

6.1 Objects of different sizes in a union

Class A{
int getVal {return bVal;}
exp *getExpr {return bExpr;}
int hasExpr {return isExpr;}
union {
int bVal;
exp *bExpr;
}
int isExpr;
};

The object of this class ‘A’ can have a data-member that is either a constant value (bVal), or an expression (bExpr) if the value is not constant. When the value is not constant, ‘isExpr’ is set to ‘1’, and the user is expected to use the expression returned by ‘getExpr’ call.

In the following piece of code , ‘obj_a’ is an object of class ‘A’:

int tmp = 0;
exp *tmpExpr = NULL;
if (cond)
tmp = 1;
else
tmp = obj_a->getVal();

if(obj_a->hasExpr())
{
if(tmp == 1)
tmpExpr = createExpr(1);
else
tmpExpr = obj_a->getExpr();
}
else
tmpExpr = createExpr(tmp);

Incorrect values in a testcase, in 64-bit mode, were traced to this piece of code. Even though ‘obj_a’ contained expression (which represented a value other than ‘1’), and ‘cond’ was ‘0’, this code inferred the value represented by ‘obj_a’ to be ‘1’.

Reason: ‘obj_a’ represented an expression, so obj_getVal() should not have been used. But it was called, and it returned ‘1’. In 64-bit mode, the members of the union – int and pointer – have sizes 32 and 64 bits respectively. When the integer value was queried, the higher 32 bits were returned, which constituted ‘1’ in integer value [the pointer was 0x1_hhhh_hhhh]. This code worked without problems in 32-bit, because int and pointer have same size, and a pointer cannot be 0x1.


6.2 Overriding functions through implicit casting

This problem was encountered at many places in the code, in different forms.

A hash-table to hash pointers had been implemented, and widely used. The interface functions to insert new records, or find existing ones, accepted char** arguments (and dereferenced them to obtain the pointer). Later on, there was a requirement to hash integer values as well. Instead of implementing a new hash-table (or even writing new interface functions), the previous table and its interface was reused – by passing int* on the actual. When the interface functions dereferenced the pointer passed at the actual, the resulting value was int, and therefore not a valid address.

Sunday, June 17, 2007

My 64-bit porting experiences -V

5. Use of compatible, but incorrect datatypes

An API function, ‘func1’ returns int*, which actually points to an array of two integers. The part of code that uses this function is as follows:

long *lp = func1(…);
int lval = lp[0];
int rval = lp[1];

The results were correct in 32-bit, because int and long are same size. On 64-bit, the return values were incorrect, because of the difference in size of int and long. ‘lp[0]’ represents a 64-bit value, which contains both the int values that are returned by ‘func1’. When it is assigned to ‘lval’, the lower 32 bits are assigned to ‘lval’, which was the value that needed to be assigned to ‘rval’. ‘lp[1]’ reads next 64 bits from memory and assigns the lower 32 bits of it to ‘rval’. So, ‘lval’ gets incorrect value, while ‘rval’ gets junk.

Sunday, June 10, 2007

My 64-bit porting experiences - IV

4. Masking operations
The following may be considered as example of oversight, or assumption on datatype size. I have made this an independent section, because masking operations are frequently used in C code for fast storage and access of data. Masks are usually implemented through macros, and problems in these macros are difficult to debug.

4.1 Storing additional information in a pointer
It may be considered that manipulation of pointer addresses will be very prone to errors. But given the fact that addresses are always word-aligned, the last bit in a pointer value is always 0. To take advantage of this fact, applications sometimes store a boolean property in the last bit of the address, and save runtime memory. As long as it takes care to reset the last bit of such pointers before dereferencing them, the applications are safe. The set and reset operations are usually coded as macros in ‘C’ code, in the interest of runtime.

In the following code snippet, ‘x’ is a pointer under manipulation. M_IS_LIST queries the property on the pointer, M_SET_LIST sets the property on the pointer, and M_GET_LIST retrieves the pointer when before it is de-referenced.
#define M_IS_LIST(x) (((unsigned long)(x)) & 0x1U)
#define M_SET_LIST(x) ((x) |= 0x1U)
#define M_GET_LIST(x) (((unsigned long)(x)) & ~0x1U)

This works for 32-bit mode. In 64-bit, I was getting corrupt pointers at some point in code, which was traced to these macros after considerable effort.

Reason: In its own context, a constant is treated as integer value. So, 0x1U is considered as a 32 bit value. Bit-wise AND with a pointer results in a 32-bit quantity (the smaller of the values being AND’ed).

Solution: To make the mask the same size as the variable:
#define M_IS_LIST(x) (((unsigned long)(x)) & 0x1UL)
#define M_SET_LIST(x) ((x) |= 0x1UL)
#define M_GET_LIST(x) (((unsigned long)(x)) & ~0x1UL)


4.2 Manipulating addresses

Yet another memory corruption was caused by the following code snippet, which purported to provide the next pointer to be processed:

if (((unsigned long)pstr) & 0x3U)
pstr = (char *)(((unsigned long)(pstr + 4)) & ~0x3U);

It was very difficult to understand the function of this simple line of code, due to use of constants, and absence of comments. I cannot stress enough the necessity of documentation in code, especially with tricky calculations like this.

Well, it re-aligns a pointer, based on the understanding that the addresses are aligned at 4 bytes (last two bits are 0). However, in 64-bit mode, the addresses are aligned at 8 bytes (last 3 bits are 0). Therefore, calculations needed to be modified [though I wish it could be more generic], as follows:

#ifdef 64_BIT_BUILD
#define addrMask 0x7UL
#else
#define addrMask 0x3UL
#endif

if (((unsigned long)pstr) & addrMask)
pstr = (char *)(((unsigned long)(pstr + sizeof(long))) & ~addrMask);