This article applies to the Oracle Developer Studio (previously known as Oracle Solaris Studio) compilers. The principal cause of problems when converting 32-bit applications to 64-bit applications is the change in size of the long and pointer types relative to the int type. When converting 32-bit programs to 64-bit programs, only long types and pointer types change in size from 32 bits to 64 bits; integers of type int stay at 32 bits in size. This can cause trouble with data truncation when assigning pointer or long types to int types. Also, problems with sign extension can occur when assigning expressions using types shorter than the size of an int to an unsigned long or a pointer. This article discusses how to avoid or eliminate these problems.
The biggest difference between the 32-bit and the 64-bit compilation environments is the change in data-type models. The C data-type model for 32-bit applications is the ILP32 model, so named because the int and long types, and pointers, are 32-bit data types. The data-type model for 64-bit applications is the LP64 data model, so named because long and pointer types grow to 64 bits. The remaining C integer types and the floating-point types are the same in both data-type models.
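As a rough illustration of the two data models, the following sketch reports which model a program was compiled for by looking only at the sizes of int, long, and pointers. The function name data_model_name is hypothetical and not part of any Oracle header.

```c
#include <stddef.h>

/* Hypothetical helper (not from the article): names the data model
 * based only on the sizes of int, long, and pointers. */
const char *data_model_name(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return "ILP32";   /* int, long, and pointers are all 32 bits */
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return "LP64";    /* long and pointers are 64 bits, int is 32 */
    return "other";       /* some other data model */
}
```

Compiling the same source once for a 32-bit target and once with -m64 should make the function return a different string for each build.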
It is not unusual for current 32-bit applications to assume that the int type, long type, and pointers are the same size. Because the sizes of long and pointer types change in the LP64 data model, this change alone is the principal cause of ILP32-to-LP64 conversion problems.
lint Utility to Detect Problems with 64-bit long and Pointer Types
Use lint to check code that is written for both the 32-bit and the 64-bit compilation environment. Specify the -errchk=longptr64 option to generate LP64 warnings. This option checks portability to an environment in which long integers and pointers are 64 bits in size and plain integers are 32 bits in size. It checks assignments of pointer expressions and long integer expressions to plain integers, even when explicit casts are used.
Use the -errchk=longptr64,signext option to find code where the normal ISO C value-preserving rules allow the extension of the sign of a signed-integral value in an expression of unsigned-integral type. Use the -m64 option of lint when you want to check code that you intend to run in the Solaris 64-bit SPARC or x86 64-bit environment.
When lint generates warnings, it prints the line number of the offending code, a message that describes the problem, and whether or not a pointer is involved. The warning message also indicates the sizes of the involved data types. When you know a pointer is involved and you know the size of the data types, you can find specific 64-bit problems and avoid the pre-existing problems between 32-bit and smaller types.
You can suppress the warning for a given line of code by placing a comment of the form "NOTE(LINTED(<optional message>))" on the previous line. This is useful when you want lint to ignore certain lines of code such as casts and assignments. Exercise extreme care when you use the "NOTE(LINTED(<optional message>))" comment because it can mask real problems. When you use NOTE, also add #include <note.h> to the source file. Refer to the lint man page for more information.
Since plain integers and pointers are the same size in the ILP32 compilation environment, 32-bit code commonly relies on this assumption. Pointers are often cast to int or unsigned int for address arithmetic. You can instead cast your pointers to unsigned long, because long and pointer types are the same size in both the ILP32 and LP64 data-type models. However, rather than explicitly using unsigned long, use uintptr_t because it expresses your intent more closely and makes the code more portable, insulating it against future changes. To use the uintptr_t and intptr_t types, you need to #include <inttypes.h>.
Consider the following example:
char *p;
p = (char *) ((int)p & PAGEOFFSET);
% cc ..
warning: conversion of pointer loses bits
The following version will function correctly when compiled to both 32-bit and 64-bit targets:
char *p;
p = (char *) ((uintptr_t)p & PAGEOFFSET);
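The same uintptr_t technique can be wrapped in small helpers. This is a minimal sketch: the 4096-byte page size and the names EXAMPLE_PAGESIZE, EXAMPLE_PAGEOFFSET, offset_in_page, and page_base are assumptions made for illustration, and real code would take the page size from the system.

```c
#include <inttypes.h>   /* uintptr_t */
#include <stddef.h>     /* size_t */

/* Assumed page size, for illustration only. */
#define EXAMPLE_PAGESIZE   ((uintptr_t)4096)
#define EXAMPLE_PAGEOFFSET (EXAMPLE_PAGESIZE - 1)

/* Offset of p within its (assumed) page. The pointer is converted to
 * uintptr_t, so no address bits are lost on ILP32 or LP64. */
size_t offset_in_page(const void *p)
{
    return (size_t)((uintptr_t)p & EXAMPLE_PAGEOFFSET);
}

/* Address of the start of the (assumed) page that contains p. */
void *page_base(const void *p)
{
    return (void *)((uintptr_t)p & ~EXAMPLE_PAGEOFFSET);
}
```

Because the mask is built from a uintptr_t value, it is as wide as a pointer in both data models and the upper address bits survive the AND operation.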
Because integers and longs are never really distinguished in the ILP32 data-type model, your existing code probably uses them indiscriminately. Modify any code that uses integers and longs interchangeably so it conforms to the requirements of both the ILP32 and LP64 data-type models. While an integer and a long are both 32 bits in the ILP32 data-type model, a long is 64 bits in the LP64 data-type model.
Consider the following example:
int waiting;
long w_io;
long w_swap;
...
waiting = w_io + w_swap;
% cc
warning: assignment of 64-bit integer to 32-bit integer
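The article does not show a fix for this warning. The sketch below shows two possibilities, under the assumption that either the receiving variable can be widened to long or the value can be range-checked before it is narrowed. The function names total_waiting and long_to_int are hypothetical.

```c
#include <limits.h>

/* Option 1: keep the sum in a long so that no bits are discarded. */
long total_waiting(long w_io, long w_swap)
{
    return w_io + w_swap;
}

/* Option 2 (hypothetical helper): narrow to int only after a range check.
 * Returns 0 on success, or -1 if value does not fit in an int. */
int long_to_int(long value, int *out)
{
    if (value < INT_MIN || value > INT_MAX)
        return -1;
    *out = (int)value;   /* safe: the value is known to fit */
    return 0;
}
```

The first option is usually the simpler change. The second is worth considering when the int is part of an interface that cannot change.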
Sign extension is a common problem when you convert to the 64-bit compilation environment because the type conversion and promotion rules are somewhat obscure. To prevent sign-extension problems, use explicit casting to achieve the intended results.
To understand why sign extension occurs, it helps to understand the conversion rules for ISO C. The conversion rules that seem to cause the most sign extension problems between the 32-bit and the 64-bit compilation environment come into effect during the following operations:
Integral promotion: You can use a char, short, enumerated type, or bit-field, whether signed or unsigned, in any expression that calls for an integer. If an integer can hold all possible values of the original type, the value is converted to an integer; otherwise, the value is converted to an unsigned integer.
Conversion between signed and unsigned integers: When an integer with a negative sign is promoted to an unsigned integer of the same or larger type, it is first promoted to the signed equivalent of the larger type, then converted to the unsigned value.
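The second rule can be seen in isolation with two small functions. This is an illustrative sketch, and the names widen_signed and widen_zero_extended are hypothetical.

```c
#include <limits.h>

/* Direct conversion of an int to unsigned long. A negative value wraps
 * modulo ULONG_MAX + 1, which in LP64 appears as sign extension into
 * the upper 32 bits. */
unsigned long widen_signed(int v)
{
    return (unsigned long)v;
}

/* Conversion through unsigned int first. The value is made unsigned at
 * its original width, so the upper bits are zero-filled when widened. */
unsigned long widen_zero_extended(int v)
{
    return (unsigned long)(unsigned int)v;
}
```

For an argument of -1, the first function returns ULONG_MAX and the second returns UINT_MAX. The two results differ only when long is wider than int, as it is in LP64.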
When the following example is compiled as a 64-bit program, the addr variable becomes sign-extended, even though both addr and a.base are unsigned types.
% cat test.c
#include <stdio.h>

struct foo {
    unsigned int base:19, rehash:13;
};

int main(int argc, char *argv[])
{
    struct foo a;
    unsigned long addr;

    a.base = 0x40000;
    addr = a.base << 13;                 /* Sign extension here! */
    printf("addr 0x%lx\n", addr);
    addr = (unsigned int)(a.base << 13); /* No sign extension here! */
    printf("addr 0x%lx\n", addr);
    return 0;
}
This sign extension occurs because the conversion rules are applied as follows:
The structure member a.base is converted from an unsigned int bit field to an int because of the integral promotion rule. In other words, because the unsigned 19-bit field fits within a 32-bit integer, the bit field is promoted to an integer rather than an unsigned integer. Thus, the expression a.base << 13 is of type int. If the result were assigned to an unsigned int, this would not matter because no sign extension has yet occurred.
The expression a.base << 13 is of type int, but it is converted to a long and then to an unsigned long before being assigned to addr, because of signed and unsigned integer promotion rules. The sign extension occurs when performing the int to long conversion.
Thus, when compiled as a 64-bit program, the result is as follows:
% cc -o test64 -m64 test.c
% ./test64
addr 0xffffffff80000000
addr 0x80000000
%
When compiled as a 32-bit program, the size of an unsigned long is the same as the size of an int, so there is no sign extension.
% cc -o test test.c
% ./test
addr 0x80000000
addr 0x80000000
%
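An alternative to casting the result of the shift is to cast the operand before shifting, so that the shift itself is carried out in unsigned long arithmetic. This is a sketch of that variant, not the form used in test.c, and the function name base_to_addr is hypothetical.

```c
struct foo {
    unsigned int base:19, rehash:13;
};

/* The cast binds more tightly than <<, so a.base is converted to
 * unsigned long first and the shift never produces a signed int. */
unsigned long base_to_addr(struct foo a)
{
    return (unsigned long)a.base << 13;
}
```

With a.base set to 0x40000, this returns 0x80000000 in both the 32-bit and the 64-bit build, because there is no intermediate int value to sign-extend.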
Check the internal data structures in an application for holes; that is, extra padding appearing between fields in the structure to meet alignment requirements. This extra padding is allocated when long or pointer fields grow to 64 bits for the LP64 data-type model and appear after an int that remains at 32 bits in size. Since long and pointer types are 64-bit aligned in the LP64 data-type model, padding appears between the int and the long or pointer type. In the following example, member p is 64-bit aligned, and so padding appears between the member k and member p.
struct bar {
    int i;
    long j;
    int k;
    char *p;
}; /* sizeof (struct bar) = 32 bytes */
Also, structures are aligned to the size of the largest member within them. Thus, in the above structure, padding appears between member i and member j.
When you repack a structure, follow the simple rule of moving the long and pointer fields to the beginning of the structure. Consider the following structure definition:
struct bar {
    char *p;
    long j;
    int i;
    int k;
}; /* sizeof (struct bar) = 24 bytes */
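One way to check for holes without counting bytes by hand is to compare sizeof for the structure against the sum of its member sizes. The sketch below does this for both layouts. The names bar_original, bar_repacked, padding_original, and padding_repacked are hypothetical, chosen so that both layouts can coexist in one file.

```c
#include <stddef.h>   /* size_t */

struct bar_original { int i; long j; int k; char *p; };
struct bar_repacked { char *p; long j; int i; int k; };

/* Bytes of padding in the original member order. */
size_t padding_original(void)
{
    return sizeof(struct bar_original)
         - (2 * sizeof(int) + sizeof(long) + sizeof(char *));
}

/* Bytes of padding after moving the long and pointer members first. */
size_t padding_repacked(void)
{
    return sizeof(struct bar_repacked)
         - (2 * sizeof(int) + sizeof(long) + sizeof(char *));
}
```

In an LP64 build with 8-byte alignment for long and pointers, the first function should report 8 bytes of padding and the second should report none, matching the 32-byte and 24-byte sizes given above.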
Be sure to check the members of unions because their fields can change size between the ILP32 and the LP64 data-type models, making the size of the members different. In the following union, member _d and member array _l are the same size in the ILP32 model, but different in the LP64 model because long types grow to 64 bits in the LP64 model, but double types do not.
typedef union {
    double _d;
    long _l[2];
} llx_t;
The size of the members can be rebalanced by changing the type of the _l array member from type long to type int.
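The rebalanced union might look like the following sketch. The type name llx_balanced_t and the function llx_members_balanced are hypothetical. The check assumes nothing about the platform and simply compares the two member sizes.

```c
/* The union with the array member changed from long to int. */
typedef union {
    double _d;
    int    _l[2];
} llx_balanced_t;

/* Returns 1 when both members occupy the same number of bytes.
 * sizeof does not evaluate its operand, so the null pointer is never
 * dereferenced. */
int llx_members_balanced(void)
{
    return sizeof(((llx_balanced_t *)0)->_d)
        == sizeof(((llx_balanced_t *)0)->_l);
}
```

With a 64-bit double and a 32-bit int, which holds in both ILP32 and LP64, the function returns 1.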
A lack of precision can cause the loss of data in some constant expressions. Be explicit when you specify the data types in your constant expression. Specify the type of each integer constant by adding some combination of the suffixes u, U, l, and L. You can also use casts to specify the type of a constant expression. Consider the following example:
int i = 32;
long j = 1 << i; /* j will not get 0x100000000 because the RHS is a 32-bit int expression */
The above code can be made to work as intended in the LP64 model by appending the type suffix to the constant 1, as follows:
int i = 32;
long j = 1L << i; /* now j will get 0x100000000, as intended */
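The L suffix gives the intended result only where long is 64 bits. If the same expression must also work in a 32-bit build, one option is a fixed-width 64-bit type, as in the sketch below. The function name bit64 is hypothetical, and UINT64_C comes from the standard fixed-width integer headers.

```c
#include <inttypes.h>   /* uint64_t, UINT64_C */

/* Returns a 64-bit value with only bit n set. The caller must pass an
 * n in the range 0 to 63. The constant is 64 bits wide in both ILP32
 * and LP64, so the shift is never performed in a 32-bit type. */
uint64_t bit64(unsigned int n)
{
    return UINT64_C(1) << n;
}
```

Here bit64(32) yields 0x100000000 in either data model.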
Make sure the format strings for printf(3S), sprintf(3S), scanf(3S), and sscanf(3S) can accommodate long or pointer arguments. For pointer arguments, the conversion operation given in the format string should be %p to work in both the 32-bit and 64-bit compilation environments. For long arguments, the long size specification, l, should be prepended to the conversion operation character in the format string.
Also, check to be sure that buffers passed as the first argument to sprintf contain enough storage to accommodate the expanded number of digits used to convey long and pointer values. For example, a pointer is expressed by 8 hex digits in the ILP32 data model but expands to 16 in the LP64 data model.
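A sketch of both format-string rules follows. It uses snprintf rather than sprintf so that an undersized buffer is truncated instead of overrun. That choice, and the function names format_long and format_pointer, are assumptions made for illustration.

```c
#include <stdio.h>    /* snprintf */
#include <stddef.h>   /* size_t */

/* Formats a long with the l size specification. Returns the number of
 * characters that the full result needs, as snprintf does. */
int format_long(char *buf, size_t size, long value)
{
    return snprintf(buf, size, "%ld", value);
}

/* Formats a pointer with %p, which works in both 32-bit and 64-bit
 * builds. The exact text produced is implementation-defined. */
int format_pointer(char *buf, size_t size, const void *p)
{
    return snprintf(buf, size, "%p", (void *)p);
}
```

If the return value is greater than or equal to size, the output was truncated and the buffer needs to be larger.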
sizeof() Operator is an unsigned long
In the LP64 data-type model, sizeof() has the effective type of an unsigned long. If sizeof() is passed to a function expecting an argument of type int, or assigned or cast to an int, the truncation could cause a loss of data. This is only likely to be problematic in large database programs containing extremely long arrays.
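Where a size really must be handed to an interface that takes an int, a range check makes the truncation explicit. The following is a minimal sketch, and the helper name size_to_int is hypothetical.

```c
#include <limits.h>   /* INT_MAX */
#include <stddef.h>   /* size_t */

/* Hypothetical helper: converts a size to int only when it fits.
 * Returns 0 on success, or -1 if n is larger than INT_MAX. */
int size_to_int(size_t n, int *out)
{
    if (n > (size_t)INT_MAX)
        return -1;
    *out = (int)n;
    return 0;
}
```

Where the interface can be changed, declaring the parameter as size_t avoids the conversion altogether.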
For data structures that are shared between 32-bit and 64-bit versions of an application, stick with data types that have a common size between ILP32 and LP64 programs. Avoid using long data types and pointers. Also, avoid using derived data types that change in size between 32-bit and 64-bit applications. For example, the following types defined in <sys/types.h> change in size between the ILP32 and LP64 data models:
clock_t, which represents the system time in clock ticks
dev_t, which is used for device numbers
off_t, which is used for file sizes and offsets
ptrdiff_t, which is the signed integral type for the result of subtracting two pointers
size_t, which reflects the size, in bytes, of objects in memory
ssize_t, which is used by functions that return a count of bytes or an error indication
time_t, which counts time in seconds
Using the derived data types in <sys/types.h> is a good idea for internal data, because it helps to insulate the code from data-model changes. However, precisely because the sizes of these types are prone to change with the data model, using them is not recommended in data that is shared between 32-bit and 64-bit applications, or in other situations where the data size must remain fixed. Nevertheless, as with the sizeof() operator discussed above, before making any changes to the code, consider whether the loss of precision will actually have any practical impact on the program.
For binary interface data, consider using the fixed-width integer types in <inttypes.h>. These types are good for explicit binary representations.
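As an illustration of a fixed-width layout, the sketch below defines a record that could be shared between 32-bit and 64-bit builds. The structure name shared_record, its fields, and the function shared_record_size are hypothetical. The layout is only expected to be identical on ABIs that add no padding between these members, which is the case when the 64-bit fields come first as they do here.

```c
#include <inttypes.h>   /* int64_t, uint32_t */
#include <stddef.h>     /* size_t */

/* Hypothetical shared record: every field has the same width in ILP32
 * and LP64, unlike long, pointers, off_t, or time_t. */
struct shared_record {
    int64_t  offset;      /* fixed width, instead of off_t */
    int64_t  timestamp;   /* fixed width, instead of time_t */
    uint32_t length;
    uint32_t flags;
};

/* Size of the record in bytes, for comparison across builds. */
size_t shared_record_size(void)
{
    return sizeof(struct shared_record);
}
```

Placing the 64-bit members first follows the same repacking rule described for structures earlier, so the two 32-bit members fill what would otherwise be tail padding.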
Be aware that a type change in one area can result in an unexpected 64-bit conversion in another area. For example, check all the callers of a function that previously returned an int and now returns an ssize_t.
long Arrays on Performance
Large arrays of long or unsigned long types can cause serious performance degradation in the LP64 data-type model as compared to arrays of int or unsigned int types. Large arrays of long types cause significantly more cache misses and consume more memory. Therefore, if int works just as well as long for the application's purposes, it is better to use int rather than long. This is also an argument for using arrays of int types instead of arrays of pointers. Some C applications suffer from serious performance degradation after conversion to the LP64 data-type model because they rely on many large arrays of pointers.
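The memory side of this effect is easy to estimate. The sketch below computes the storage needed for arrays of the same element count in each type. The function names are hypothetical, and the sketch measures footprint only, not cache behavior.

```c
#include <stddef.h>   /* size_t */

/* Bytes needed for count elements of each type. */
size_t int_array_bytes(size_t count)
{
    return count * sizeof(int);
}

size_t long_array_bytes(size_t count)
{
    return count * sizeof(long);
}

size_t pointer_array_bytes(size_t count)
{
    return count * sizeof(void *);
}
```

In an LP64 build, the long and pointer results are twice the int result for the same count, which is why fewer elements fit in each cache line.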