Sunday, June 10, 2007

My 64-bit porting experiences - IV

4. Masking operations
The following may be considered an example of oversight, or of an assumption about datatype size. I have made this an independent section because masking operations are frequently used in C code for fast storage and access of data. Masks are usually implemented through macros, and problems in these macros are difficult to debug.

4.1 Storing additional information in a pointer
Manipulating pointer values may seem very error-prone. But because objects of word size or larger are aligned on at least a word boundary, the last bit of a pointer to such an object is always 0. To take advantage of this fact, applications sometimes store a boolean property in the last bit of the address, and save runtime memory. As long as the application takes care to reset the last bit of such a pointer before dereferencing it, it is safe. The set and reset operations are usually coded as macros in ‘C’ code, in the interest of runtime speed.

In the following code snippet, ‘x’ is a pointer under manipulation. M_IS_LIST queries the property on the pointer, M_SET_LIST sets the property on the pointer, and M_GET_LIST retrieves the original pointer value before it is dereferenced.
#define M_IS_LIST(x) (((unsigned long)(x)) & 0x1U)
#define M_SET_LIST(x) ((x) |= 0x1U)
#define M_GET_LIST(x) (((unsigned long)(x)) & ~0x1U)

This works in 32-bit mode. In 64-bit mode, I was getting corrupt pointers at some point in the code, which I traced to these macros after considerable effort.

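Reason: A constant gets its type on its own, independent of the expression it appears in. So, 0x1U is an unsigned int, which is 32 bits wide. The complement ~0x1U is therefore computed in 32 bits and yields 0xFFFFFFFE. Only then is it widened to the 64-bit unsigned long for the bit-wise AND, and because it is unsigned it is zero-extended to 0x00000000FFFFFFFE. M_GET_LIST therefore clears the upper 32 bits of the address along with the last bit. Note that the result of the AND is still a 64-bit value; it is the mask that is too narrow. M_IS_LIST and M_SET_LIST only touch the last bit and are not affected, but it is cleaner to keep all three masks the same type.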
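The effect can be reproduced in isolation. The following is only a minimal sketch: the function names are made up for illustration, and fixed-width types from <stdint.h> stand in for unsigned int and unsigned long so that the behaviour does not depend on the platform it is compiled on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clears the last bit using a 32-bit mask,
 * which is what `& ~0x1U` amounts to when unsigned long is 64 bits. */
uint64_t clear_last_bit_narrow(uint64_t addr)
{
    uint32_t mask = ~(uint32_t)0x1;  /* 0xFFFFFFFE, computed in 32 bits */
    return addr & mask;              /* mask is zero-extended to 0x00000000FFFFFFFE */
}

/* Hypothetical helper: clears the last bit using a 64-bit mask,
 * which is what `& ~0x1UL` amounts to when unsigned long is 64 bits. */
uint64_t clear_last_bit_wide(uint64_t addr)
{
    uint64_t mask = ~(uint64_t)0x1;  /* 0xFFFFFFFFFFFFFFFE */
    uint64_t result = addr & mask;
    assert((result >> 32) == (addr >> 32));  /* upper half is untouched */
    return result;
}
```

For an address such as 0x00007fff12345679, the narrow version returns 0x12345678 (upper half lost), while the wide version returns 0x00007fff12345678.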

Solution: Make the mask the same size as the variable:
#define M_IS_LIST(x) (((unsigned long)(x)) & 0x1UL)
#define M_SET_LIST(x) ((x) |= 0x1UL)
#define M_GET_LIST(x) (((unsigned long)(x)) & ~0x1UL)
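A variant that does not depend on the size of long at all is to do the arithmetic in uintptr_t, the integer type from <stdint.h> intended for holding pointer values. This is only a sketch of that idea, not the code from the application; TAG_MASK and the ptr_* function names are assumptions introduced here, and the round trip through an integer relies on the usual (implementation-defined) pointer conversions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the mask has the same width as a pointer by construction. */
#define TAG_MASK ((uintptr_t)1)

/* Returns non-zero if the property bit is set on p. */
static inline int ptr_is_list(const void *p)
{
    return (int)((uintptr_t)p & TAG_MASK);
}

/* Returns p with the property bit set; p must have its last bit clear. */
static inline void *ptr_set_list(void *p)
{
    assert(((uintptr_t)p & TAG_MASK) == 0);
    return (void *)((uintptr_t)p | TAG_MASK);
}

/* Returns the original pointer, safe to dereference. */
static inline void *ptr_get(void *p)
{
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

Using inline functions instead of macros also lets the compiler check the argument types, at no runtime cost on an optimizing compiler.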


4.2 Manipulating addresses

Yet another memory corruption was caused by the following code snippet, which purported to provide the next pointer to be processed:

if (((unsigned long)pstr) & 0x3U)
    pstr = (char *)(((unsigned long)(pstr + 4)) & ~0x3U);

It was very difficult to understand the function of this simple line of code, due to the use of constants and the absence of comments. I cannot stress enough the necessity of documentation in code, especially with tricky calculations like this.

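Well, it re-aligns a pointer: if the pointer is not on a 4-byte boundary, it is moved up to the next one, based on the understanding that the addresses are aligned at 4 bytes (last two bits are 0). However, in 64-bit mode, the addresses are aligned at 8 bytes (last 3 bits are 0). In addition, ~0x3U has the same problem as the mask in section 4.1: it is zero-extended to 0x00000000FFFFFFFC and wipes out the upper 32 bits of the address. Therefore, the calculation needed to be modified [though I wish it could be more generic], as follows: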

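#ifdef BUILD_64_BIT   /* build flag; a macro name cannot begin with a digit */
#define addrMask 0x7UL
#else
#define addrMask 0x3UL
#endif

/* If pstr is not aligned, move it up to the next aligned address. */
if (((unsigned long)pstr) & addrMask)
    pstr = (char *)(((unsigned long)(pstr + sizeof(long))) & ~addrMask);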
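A more generic form, which I did not use in the port, would derive the mask from the alignment itself rather than from a build flag. The sketch below is only an illustration under that assumption; align_up is a made-up name, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns p moved up to the next multiple of `align`,
 * or p itself if it is already aligned. `align` must be a power of two. */
static inline char *align_up(char *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t mask = (uintptr_t)align - 1;           /* e.g. 0x7 for align == 8 */
    return (char *)(((uintptr_t)p + mask) & ~mask);  /* mask is pointer-width */
}
```

The call site then becomes pstr = align_up(pstr, sizeof(long)); and needs no #ifdef, because both the mask and the increment follow from sizeof(long) on the platform being built.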
