Tuesday, April 17, 2007

My 64-bit porting experiences - I

1. Oversight

1.1 Implicit functions

The following code examples ran without trouble when compiled and run as 32-bit, but crashed as 64-bit. Many of the crashes could have been avoided if compiler and/or lint warnings had been attended to.

char *str = (char *)malloc(n * sizeof(char));

This is a perfectly valid way to allocate a string of ‘n’ characters dynamically and assign it to ‘str’, a pointer of the appropriate type. Strangely, on 64-bit, the address returned held only 32 bits, and the program crashed with SIGBUS. After several rounds of trial and error, it turned out that the ‘C’ file containing this code did not include stdlib.h, the standard header that provides the declaration of ‘malloc’.
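For comparison, here is a minimal sketch of the same allocation with the header in place. The helper name ‘alloc_string’ and the zero-fill are my own illustration, not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>   /* declares malloc, so its return type is known to be void * */
#include <string.h>

/* Hypothetical helper: allocates a zero-filled buffer of n characters.
   Returns NULL if the allocation fails. */
char *alloc_string(size_t n)
{
    assert(n > 0);                         /* malloc(0) may legitimately return NULL */
    char *str = malloc(n * sizeof(char));  /* no cast needed in C */
    if (str != NULL)
        memset(str, 0, n);
    return str;
}
```

Leaving out the cast is a deliberate choice here: without a declaration in scope, assigning an int result straight to a pointer usually draws a compiler diagnostic, whereas the explicit (char *) cast tends to hide it.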

ptr = func_ret_ptr(…);

The function ‘func_ret_ptr’ returns a pointer of a specific type, which is assigned to a pointer variable of the same type. The function comes from another library file, which has a corresponding header file, and that header is included in the file containing the above line of code. Here too, on 64-bit, the value assigned to ‘ptr’ held only 32 bits. A similar function from the same library correctly returned a 64-bit value. It took hours of debugging to notice that the declaration of this particular function was missing from the header file.

In yet another case, a function that returned a pointer was defined in one file, and used in another. An extern declaration provided the declaration to the code file that used the function. However, the extern declaration incorrectly declared the return type of the function to be int.

The explanation is rather obvious. When the declaration of a function is not found, the ‘C’ compiler implicitly declares it as returning int. In 32-bit mode, int and pointer are the same size, so the int value assigned to a pointer still represents a valid address. In 64-bit mode, a pointer is 64 bits wide while int is still 32 bits, so only the lower 32 bits of the returned address survive; the result is not a valid address, and a pointer assigned this value cannot be dereferenced successfully. [It is only now, as I write this document, that it has become obvious to me that all these cases are similar. Earlier, when I worked on these problems, I had thought that the malloc issue was due to a difference in compiler behavior with the standard libraries.]
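The effect can be modelled with plain integer arithmetic. The sketch below is only a simulation: ‘through_implicit_int’ is a hypothetical helper, and it assumes a 32-bit int that is sign-extended when widened back to pointer size, which is what common 64-bit compilers do. It does not call an undeclared function, since many current compilers reject that outright.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models a 64-bit address being returned through a
   32-bit int and then widened back to 64 bits. */
uint64_t through_implicit_int(uint64_t addr)
{
    uint32_t low = (uint32_t)(addr & 0xFFFFFFFFu);  /* upper 32 bits are dropped */
    uint64_t widened = low;

    /* Widening a signed int sign-extends: if bit 31 is set,
       the upper half is filled with ones. */
    if (low & 0x80000000u)
        widened |= UINT64_C(0xFFFFFFFF00000000);

    assert((uint32_t)widened == low);  /* only the low half survives intact */
    return widened;
}
```

An address that already fits in 31 bits, such as 0x1000, comes back unchanged, which is why the same code can appear to work for small addresses. An address such as 0x00007F1234567890 comes back as 0x34567890, which points somewhere else entirely.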

1.2 Needless casting

Type * ptr_var = (Type *)(int)(func_ret_ptr(…));

There is no need to cast the value returned by the function to int. The code worked in 32-bit mode, but in 64-bit mode the cast discarded the 32 most significant bits of the pointer returned by the function.
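If a pointer really must pass through an integer, the type intended for that is intptr_t from stdint.h (optional in the standard, but provided by the usual 32-bit and 64-bit platforms). The following is a small sketch with hypothetical helper names; the original code needed no integer cast at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round-trips a pointer through intptr_t, which is
   wide enough to hold any object pointer without losing bits. */
void *round_trip_intptr(void *p)
{
    assert(sizeof(intptr_t) >= sizeof(void *));
    intptr_t as_integer = (intptr_t)p;
    return (void *)as_integer;
}

/* Hypothetical helper: number of pointer bytes that a cast to int would
   discard on the current platform (0 where int and pointer are the same size). */
size_t bytes_lost_by_int_cast(void)
{
    return sizeof(void *) > sizeof(int) ? sizeof(void *) - sizeof(int) : 0;
}
```

On a typical 32-bit build the second function returns 0, and on a typical 64-bit build it returns 4, which matches the 32 lost bits described above.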

2 comments:

Maverick said...

Type mismatch is very easy to debug if it comes up in an error, but if it comes in a warning, then usually i ignore it the first time. n dynamic memory allocation is something tht im very bad at, im kinda scared abt the whole malloc-calloc-pointer thing. n whenevr im in doubt whether to typecast some variable or not, i usually go ahead n cast it, cos i feel unnecessary type casting wont harm nething :)

Sigma said...

Ha ha ha ... now you know better :-))
No error, just crash :-) [Well, there would have been a warning, but the warning would have been for implicit declaration, rather than size/type mismatch.] And as I said, this code was working perfectly fine on 32-bit, for years.
Type-casting indiscriminately is not such a good idea, especially if you are working with C++ and have a lot of inheritance.