by Giri Mandalika, May, 2005 (updated March 2006 and June 2016)
Hiding non-interface symbols of a library within the library makes the library more robust and less vulnerable to symbol collisions from outside the library. This symbol scope reduction also improves the performance of an application by reducing the runtime relocation costs of the dynamic libraries. To indicate the appropriate linker scoping in a source program, you can now use language extensions built into the Oracle Developer Studio C/C++ compilers as described here.
With older Studio compilers, linker mapfiles were the only way to change the default symbol processing by the linker. With the help of mapfiles, all non-interface 1 symbols of an object can be hidden within a load module 2, thereby making the object more robust and less vulnerable to symbol collisions. This symbol scope reduction helps improve the performance of an application by reducing the runtime relocation costs of the dynamic objects. The other reason for symbol scoping is to ensure that clients use only the intended interface to the library, and not the functions that are internal to the library.
The mapfile mechanism is useful with languages such as C, but difficult to exploit with languages such as C++. There are two major hurdles:
* The link-editor 3 only processes symbols in their mangled form. For example, even a simple interface such as void printstring(char *str) has a C++ symbolic representation something like __1cLprintstring6Fpc_v_. As no tool exists that can determine a symbol's mangled name other than the compilers themselves, trying to establish definitions of this sort within a mapfile is not a simple task.
* Also, changes to a function's signature, or to a typedef that a function signature uses, can invalidate the mapfile because the mangled names of the symbols could have changed. For versioned libraries, this invalidation is a good thing because the function signature has changed. The fact that mapfiles survive changes in parameter types in C is a problem.
We can avoid the specification problems in mapfiles by specifying the linker symbol scope within the source program. Oracle Developer Studio introduced new syntax for specifying this scope, and a new option for controlling default scope behavior. With these new features, programmers do not need mapfiles for linker scoping.
There are reasons programmers might still need mapfiles. The primary reason is library versioning. For other reasons, see the Linker and Libraries Guide for the full list of mapfile capabilities. Compiler-assisted linker scoping also helps with construction of mapfiles, because the set of globally visible symbols in a candidate library becomes the actual set of globally visible symbols, and the only remaining task is to assign symbols to versions.
This article introduces linker scoping with simple examples, and outlines the benefits of this feature for developing end-user applications. All the content of this article is equally applicable to C and C++, unless otherwise specified. Note that the terms shared library, dynamic library, load module and dynamic module are used interchangeably throughout the article.
Oracle added linker scoping as a language extension with the release of the Oracle Developer Studio 8 C/C++ compilers. Using this feature, the programmer can indicate the appropriate symbol scoping within the source program. The following paragraphs briefly explain the need for such a feature.
Default Behavior of the Oracle Solaris Linker
With Oracle Solaris (and UNIX in general), external names (symbols) have global scope by default in a dynamic object. This is because, without a linker scoping mechanism, the static linker makes all symbols global in scope. That is, it puts all the symbols into the dynamic symbol table of the resulting binary object, so other binary modules can access those symbols. Such symbols are called external or exported symbols.
At program startup, the dynamic linker 4 (also referred to as the runtime linker) loads all dynamic libraries specified at link time before starting execution of the application. Because shared libraries are not available to the executable until runtime, shared library calls get special treatment in executable objects. To do this, the dynamic linker maintains a linked list of the link maps in the address space of the executing process, one for each dynamically linked object. The symbol search mechanism traverses this list to bind the objects of an application. The Procedure Linkage Table (PLT) facilitates this binding.
Relocation Processing
The PLT can be used to redirect function calls between the executable and a shared object, or between different shared objects and is purely an optimization strategy designed to permit lazy symbol resolution at run time.
Once all the dependencies for the executable are loaded, the runtime linker updates the memory image of the executable and its dependencies to reflect the real addresses for data and function references. This is also known as relocation processing.
The dynamic relocations that the dynamic linker performs are only necessary for the global (sometimes referred to as external or exported) symbols. The static linker resolves references to local symbols (for example, names of static functions) statically when it links the binary. So, when an application is made out of dynamic libraries with the default global scoping, it will pay some penalty during application startup time and the performance may suffer during runtime due to the overhead of PLT processing.
A considerable amount of startup time is spent performing symbolic relocations 5. Generally a lot more time is spent relocating symbols from dependency objects than relocating symbols from the executable itself. To gain noticeable reduction in startup time, we have to somehow decrease the amount of relocation processing.
As stated earlier, the dynamic linker maintains a linked list of the link maps in the memory of the executing process, one for each dynamically linked object. So, the symbol search mechanism requires the runtime linker to traverse the whole link-map list, looking in each object's symbol table to find the required symbol definition. This is known as a symbolic relocation. Because there can be many link maps containing many symbols, symbolic relocations are time consuming and expensive. The process of looking up symbol values needs to be done only for symbolic relocations that reference data. Symbolic entries from the .plt section are not relocated at startup because they are relocated on demand. Non-symbolic relocations, however, do not require a lookup and thus are not expensive and do not affect the application startup time. Because relocation processing can be the most expensive operation during application startup, it is desirable to have fewer symbols that can be relocated. See the Appendix for instructions on estimating the number of relocations in a library.
It can be summarized as follows:
Each global symbol has a run-time overhead for binding the symbol. This overhead may occur for all symbols at program startup, or it may occur only for referenced symbols upon first reference. In addition, each use of a symbol will have a run-time overhead for the indirection of the binding tables.
A symbol that needs binding is visible in the library as a relocation. Reducing the number of relocations will reduce both forms of overhead, and yield faster libraries.
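To make the cost argument concrete, here is a toy model in C of the link-map traversal described above. It is only a sketch: the names toy_object and toy_lookup and the probe counter are hypothetical, and the layout does not reflect the real data structures of the Solaris runtime linker. A linear scan is used for simplicity, whereas real ELF symbol tables are hashed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for one dynamically linked object on the link-map list.
   Hypothetical layout, not the runtime linker's real structures. */
struct toy_object {
    const char *name;               /* for example "libempl.so" */
    const char *const *symbols;     /* names in its dynamic symbol table */
    size_t nsymbols;
    const struct toy_object *next;  /* next object in load order */
};

/* Walk the list in load order and return the first object that defines
   `symbol`, or NULL. *probes counts the symbol-table entries examined. */
const struct toy_object *toy_lookup(const struct toy_object *head,
                                    const char *symbol, size_t *probes)
{
    assert(symbol != NULL && probes != NULL);
    *probes = 0;
    for (const struct toy_object *obj = head; obj != NULL; obj = obj->next) {
        for (size_t i = 0; i < obj->nsymbols; i++) {
            (*probes)++;
            if (strcmp(obj->symbols[i], symbol) == 0)
                return obj;
        }
    }
    return NULL;
}
```

In this model, every symbol removed from an object's exported list shortens the scan for all later lookups, which is the effect that scope reduction aims for.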
Reducing the Number of Relocations
One way of reducing the relocations is to have fewer symbols visible outside the application or library. This can be done by declaring locally used functions and global data private to the application or library. Declaring a function with the static keyword in C/C++ programs makes the function local to the module, and the symbol will not appear in the dynamic symbol table (.dynsym). The elfdump 6 or nm 7 utilities can be used to examine the symbol table of an object file.
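As a minimal illustration of the static approach, consider the sketch below. The function names clamp_percent and apply_discount are hypothetical and are not part of the article's examples.

```c
#include <assert.h>

/* Local helper: `static` gives it internal linkage, so it gets no entry
   in .dynsym when this file is linked into a shared object. */
static int clamp_percent(int value)
{
    if (value < 0)
        return 0;
    if (value > 100)
        return 100;
    return value;
}

/* Interface function: external linkage, so it is exported by default. */
int apply_discount(int price, int percent)
{
    assert(price >= 0);
    return price - (price * clamp_percent(percent)) / 100;
}
```

The limitation is that static only works when the helper and all of its callers live in the same source file. A helper shared by several files of one library still needs external linkage, which is where mapfiles or linker scoping come in.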
Another way is to use the mapfile option to control the scope of functions and symbols. But due to the overhead of maintaining map files with the changes in source code and compiler versions explained earlier, it is not a preferable scheme to be used with C++ applications.
Yet another way is to indicate the appropriate linker scoping within the source program with the help of language extensions in the Oracle Developer Studio C/C++ compilers. The following paragraphs explain the linker scoping in detail with some examples.
With the release of Oracle Developer Studio 8 compilers, C and C++ are now capable of describing symbol visibility. Although the symbol visibility is specified in the source file, it actually defines how a symbol can be accessed once it has become part of an executable or shared object. The default visibility of a symbol is specified by the symbol's binding type.
By using a combination of linker scope specifier directives and command line options, the programmer can define the runtime interface of a C/C++ object. These definitions are then encoded in the symbol table, and used by the link-editor in a similar manner as reading definitions from a mapfile. With this interface definition technique, the compilation method can greatly reduce the number of symbols that would normally be employed in runtime relocations. In addition, as the compiler knows what implementation symbols must remain global within the final object, these symbols are given the appropriate visibility attribute to ensure their correct usage.
Language/Compiler Extensions
There is a new compiler flag: -xldscope={global|symbolic|hidden}
-xldscope accepts one of the values: global, symbolic, or hidden. This command line option sets the default linker scope for user-defined external symbols. The compiler issues an error if you specify -xldscope without an argument. Multiple instances of this option on the command line override each other until the rightmost instance is reached. Symbols with explicit linker scope qualifiers, declarations of external symbols, static symbols, and local symbols are not affected by the -xldscope option.
There is a new C/C++ source language interface:
The __global, __symbolic, and __hidden declaration specifiers were introduced to specify symbol visibility at declarations and definitions of external symbols and class types. These specifiers are applicable to external functions, variables, and classes, and they take precedence over the command line (-xldscope) option.
With no specifier, the symbol linker scoping remains unchanged from any prior declarations. If the symbol has no prior declaration, the symbol will have the default linker scoping.
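The rule about prior declarations can be sketched as follows. The macro names SCOPE_GLOBAL and SCOPE_HIDDEN and both functions are hypothetical. The GCC-style visibility attributes in the fallback branch are an assumption added only so that the sketch also compiles with other compilers; they are not part of the Oracle Developer Studio syntax described here.

```c
#include <assert.h>

/* Hypothetical wrapper macros. Only the first branch uses the specifiers
   described in this article; the other branches are portability fallbacks. */
#if defined(__SUNPRO_C) || defined(__SUNPRO_CC)
#  define SCOPE_GLOBAL __global
#  define SCOPE_HIDDEN __hidden
#elif defined(__GNUC__)
#  define SCOPE_GLOBAL __attribute__((visibility("default")))
#  define SCOPE_HIDDEN __attribute__((visibility("hidden")))
#else
#  define SCOPE_GLOBAL
#  define SCOPE_HIDDEN
#endif

/* Declarations carry the scope, as they would in a header file. */
SCOPE_HIDDEN int round_to_cents(double amount);
SCOPE_GLOBAL int price_in_cents(double unit_price, int quantity);

/* No specifier on the definitions: each one keeps the linker scope
   given by its prior declaration. */
int round_to_cents(double amount)
{
    assert(amount >= 0.0);
    return (int)(amount * 100.0 + 0.5);
}

int price_in_cents(double unit_price, int quantity)
{
    return round_to_cents(unit_price) * quantity;
}
```

Keeping the specifiers on the declarations means the definitions do not have to repeat them, so the scope of each symbol is decided in one place.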
Global Scoping
The __global specifier can be used to make the symbol definition global in linker scope. This is the default scoping for extern symbols. With global scope, all references to the symbol bind to the definition in the first dynamic load module (shared library) that defines the symbol. To make all symbols global in scope, the programmer need not use any special flags, as it is the default. Note that -xldscope=global is the default assumed by the compiler; so, specifying -xldscope=global explicitly on the command line has no additional effect beyond overriding a previous -xldscope on the same command line.
Symbolic Scoping
Symbolic scoping (also known as protected) is more restrictive than global linker scoping; all references within a library that match definitions within the library will bind to those definitions. Outside of the library, the symbol appears as though it were global. That is, the link-editor first tries to find the definition of the symbol within the library. If found, the symbol is bound to the definition at link time; otherwise the search continues outside the library, as is the case with global symbols. For variables, there is an extra complication of copy relocations 8.
Symbolic scoping ensures that the library uses its own versions of specific functions, no matter what might appear elsewhere in the program. There are times when symbolic scoping of a set of symbols is exactly what we want. For instance, symbolic scoping fits well in a scenario, where there is an encryption function, with the requirement that it must not be overridden by any other function from any other library irrespective of the link order of those libraries during link time.
On the downside, we lose the flexibility of library interposition, as the resulting symbols are non-interposable. Library interposition is a useful technique for tuning performance, collecting runtime statistics, or debugging applications. For example, if libc were built with symbolic scoping, we could not take advantage of faster memory allocator libraries such as libmtmalloc.so for multi-threaded applications by simply preloading libmtmalloc and interposing malloc. To do so, the symbol malloc must be interposable with global binding.
With the __symbolic specifier, symbol definitions will have symbolic linker scope. With -xldscope=symbolic on the command line and without any linker scoping specifiers in the source code, all the symbols of the library get symbolic scoping. This linker scoping corresponds to the linker option -Bsymbolic.
Be aware that with symbolic scoping, you can wind up with multiple copies of an object or function in a program when only one should be present. For example, suppose a symbol X is defined in library L scoped symbolic. If X is also defined in the main program or another library that is linked ahead of L, library L will use its own copy of X, but everything else in the program will use a different copy of X. When using the -Bsymbolic linker option, this problem extends to every symbol defined in the library, not just the ones you intend to be symbolic.
-Bsymbolic
Some interfaces created by languages such as C++ provide implementation details of the language itself. These implementation interfaces often must remain global within a group of similar dynamic objects, as one interface must interpose on all the others for the correct execution of the application. As users generally are not aware of what implementation symbols are created, they can blindly demote these symbols to local with options like -Bsymbolic. For this reason, -Bsymbolic has never been supported with C++, and its use has been discouraged with C++.
-xldscope=symbolic
-xldscope=symbolic is considerably safer than -Bsymbolic at link time. -Bsymbolic is a big hammer that affects every non-local symbol. With the compiler option, certain compiler-generated symbols that need to be global remain global. Also, the compiler options do not break exception handling, whereas the linker -Bsymbolic option can break exception handling. A linker mapfile is an alternative solution. Check the introductory paragraphs for the problems associated with linker mapfiles.
Hidden Scoping
Symbols with the __hidden specifier will have hidden linker scoping. Hidden linker scoping is the most restrictive scope of all. All references within a dynamic load module bind to a definition within that module, and the symbol will not be visible outside of the module. That is, the symbol will be local to the library in which it was defined, and other libraries cannot know of the existence of such a symbol.
Using -xldscope=hidden requires at least one __global or __symbolic declaration specifier. Otherwise, the result is a library that is completely unusable. The mixed use of -xldscope=hidden and __symbolic yields the same effect as __declspec(dllexport) in DLLs on Windows (explained later in the article).
Summary of Linker Scoping
Declaration Specifier | -xldscope Value | Reference Binding | Visibility of Definitions |
---|---|---|---|
__global | global | First Module | All Modules |
__symbolic | symbolic | Same Module | All Modules |
__hidden | hidden | Same Module | Same Module only |
The linker will choose the most restrictive scoping specified for all definitions.
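The binding rules in the table can be modeled with a small C sketch. This is a toy simulation under simplifying assumptions, not how the link-editor or runtime linker is implemented; the types toy_scope, toy_def, and toy_module and the function toy_bind are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_scope { TOY_GLOBAL, TOY_SYMBOLIC, TOY_HIDDEN };

struct toy_def {
    const char *name;
    enum toy_scope scope;
};

/* One load module and the symbols it defines. */
struct toy_module {
    const struct toy_def *defs;
    size_t ndefs;
};

static const struct toy_def *find_def(const struct toy_module *m,
                                      const char *name)
{
    for (size_t i = 0; i < m->ndefs; i++)
        if (strcmp(m->defs[i].name, name) == 0)
            return &m->defs[i];
    return NULL;
}

/* Return the index of the module that a reference to `name`, made from
   module `from`, binds to, or -1 if no visible definition exists.
   `mods` is in load order. */
int toy_bind(const struct toy_module *mods, size_t nmods, size_t from,
             const char *name)
{
    assert(from < nmods);

    /* Symbolic and hidden definitions bind within the same module. */
    const struct toy_def *own = find_def(&mods[from], name);
    if (own != NULL && own->scope != TOY_GLOBAL)
        return (int)from;

    /* Otherwise bind to the first module with a visible definition;
       hidden definitions are visible in their own module only. */
    for (size_t i = 0; i < nmods; i++) {
        const struct toy_def *d = find_def(&mods[i], name);
        if (d != NULL && d->scope != TOY_HIDDEN)
            return (int)i;
    }
    return -1;
}
```

With an application that defines a global encrypt loaded ahead of a library that defines a symbolic encrypt, toy_bind returns the library's own index for references made inside the library and the application's index for references made from the application. That is the "two copies" situation described in the discussion of symbolic scoping.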
Linker scoping specifiers are applicable to struct, class, and union declarations and definitions. Consider the following example:
__hidden struct __symbolic BinaryTree node;
The declaration specifier before the struct keyword applies to variable node. The class key modifier after the struct keyword applies to the type BinaryTree.
Rules for Using These Specifiers
Additional Notes
With the -xldscope flag, only the definitions are affected. Declarations are not affected by -xldscope.
struct linker scoping.
Library functions declared with the __hidden or __symbolic specifiers can be generated inline when building the library. They are not supposed to be overridden by clients. If you intend to allow a client to override a function in a library, you must ensure that the function is not generated inline in the library. The compiler inlines a function if you:
* specify the function name with -xinline
* compile at -xO4 or higher, in which case inlining can happen automatically
* use the inline specifier, or
* use #pragma inline
A library function that clients may override should be declared with the __global specifier, should not be declared inline, and should be protected from inlining by use of the -xinline compiler option.
The -xldscope option does not apply to tentative 10 definitions; tentative definitions continue to have global scope.
When compiling with -xldscope=symbolic, if the same object file is used in building more than one library, then dynamically loading and unloading those libraries and referencing their common symbols may lead to a crash at run time because of a possible symbol conflict. This is due to the globalization of static symbols to support "fix and continue" debugging. These global names must be interposable for "fix and continue" to work. If the same object file, say x.o, has to be linked into more than one library, use an object file (x.o) with a different timestamp each time you build a new library; that is, compile the original source again just before building a new library. Alternatively, compile the original source to create object files with different names, say x_1.o, x_2.o, and so on, and use those unique object file names in building new libraries.
% cat employee.c
__global const float lversion = 1.2;
__symbolic int taxrate;
__hidden struct employee {
int empid;
char *name;
} Employee;
__global void createemployee(int id, char *name) { }
__symbolic void deleteemployee(int id) { }
__hidden void modifyemployee(int id) { }
% cc -c employee.c
% elfdump -s employee.o | egrep -i "lver|tax|empl" | grep -v "employee.c"
[5] 0x00000004 0x00000004 OBJT GLOB P 0 COMMON taxrate
[6] 0x00000004 0x00000008 OBJT GLOB H 0 COMMON Employee
[7] 0x00000068 0x00000018 FUNC GLOB H 0 .text modifyemployee
[8] 0x00000040 0x00000018 FUNC GLOB P 0 .text deleteemployee
[9] 0x00000010 0x0000001c FUNC GLOB D 0 .text createemployee
[10] 0x00000000 0x00000004 OBJT GLOB D 0 .rodata lversion
In this example, though different visibility was specified for the symbols, scoping constraints are not in effect in the ELF relocatable object. Because of this, all symbols have global (GLOB) binding. However, the object file holds the corresponding ELF symbol visibility attribute for each symbol, according to the scope specified for it.
Variable lversion and function createemployee have attribute D, which stands for DEFAULT visibility (that is, __global). So those two symbols are visible outside of the defining component, the executable file or shared object.
taxrate and deleteemployee have attribute P, which stands for PROTECTED visibility (__symbolic). A symbol that is defined in the current component is protected if the symbol is visible in other components but cannot be preempted. Any reference to such a symbol from within the defining component must be resolved to the definition in that component. This resolution must occur even if a symbol definition exists in another component that would interpose by the default rules.
Function modifyemployee and structure Employee are HIDDEN, with attribute H (__hidden). A symbol that is defined in the current component is hidden if its name is not visible to other components. Such a symbol is necessarily protected. This attribute is used to control the external interface of a component. An object named by such a symbol can still be referenced from another component if its address is passed outside.
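The last point, that a hidden object remains reachable through its address, can be sketched like this. The names LIB_GLOBAL, LIB_HIDDEN, modify_impl, and get_modifier are hypothetical, and the GCC-style attributes are an assumed fallback so the sketch compiles outside Oracle Developer Studio.

```c
#include <assert.h>

/* Hypothetical wrapper macros; only the first branch uses the
   specifiers described in this article. */
#if defined(__SUNPRO_C) || defined(__SUNPRO_CC)
#  define LIB_GLOBAL __global
#  define LIB_HIDDEN __hidden
#elif defined(__GNUC__)
#  define LIB_GLOBAL __attribute__((visibility("default")))
#  define LIB_HIDDEN __attribute__((visibility("hidden")))
#else
#  define LIB_GLOBAL
#  define LIB_HIDDEN
#endif

typedef int (*modify_fn)(int id);

/* Hidden: the name modify_impl cannot be referenced by other modules. */
LIB_HIDDEN int modify_impl(int id)
{
    assert(id >= 0);
    return id + 1000;
}

/* Global accessor: passes the address of the hidden function outside,
   so callers can still reach it without knowing its name. */
LIB_GLOBAL modify_fn get_modifier(void)
{
    return modify_impl;
}
```

A client would call get_modifier() and invoke the returned pointer. Only get_modifier needs an entry in the dynamic symbol table.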
A hidden symbol contained in a relocatable object is either removed or converted to local (LOCL) binding when the object is included in an executable file or shared object. It can be seen in the following example:
% cc -G -o libempl.so employee.o
% elfdump -sN.dynsym libempl.so | egrep -i "lver|tax|empl"
[5] 0x00000298 0x00000018 FUNC GLOB P 0 .text deleteemployee
[6] 0x00010360 0x00000004 OBJT GLOB P 0 .bss taxrate
[9] 0x000002f4 0x00000004 OBJT GLOB D 0 .rodata lversion
[11] 0x00000268 0x0000001c FUNC GLOB D 0 .text createemployee
% elfdump -sN.symtab libempl.so | egrep -i "lver|tax|empl" \
| grep -v "libempl.so" | grep -v "employee.c"
[19] 0x000002c0 0x00000018 FUNC LOCL H 0 .text modifyemployee
[20] 0x00010358 0x00000008 OBJT LOCL H 0 .bss Employee
[36] 0x00000298 0x00000018 FUNC GLOB P 0 .text deleteemployee
[37] 0x00010360 0x00000004 OBJT GLOB P 0 .bss taxrate
[40] 0x000002f4 0x00000004 OBJT GLOB D 0 .rodata lversion
[42] 0x00000268 0x0000001c FUNC GLOB D 0 .text createemployee
Because of the __hidden specifier, Employee and modifyemployee were locally bound (LOCL) with hidden (H) visibility and didn't show up in the dynamic symbol table; hence Employee and modifyemployee cannot go into the procedure linkage table (PLT), and the run-time linker need only deal with four out of six symbols.
Default Scope
At this point, it is worth looking at the default scope of symbols without any linker scoping mechanism in force, to observe in practice what we have learned so far:
% cat employee.c
const float lversion = 1.2;
int taxrate;
struct employee {
int empid;
char *name;
} Employee;
void createemployee(int id, char *name) { }
void deleteemployee(int id) { }
void modifyemployee(int id) { }
% cc -c employee.c
% elfdump -s employee.o | egrep -i "lver|tax|empl" | grep -v "employee.c"
[5] 0x00000004 0x00000004 OBJT GLOB D 0 COMMON taxrate
[6] 0x00000004 0x00000008 OBJT GLOB D 0 COMMON Employee
[7] 0x00000068 0x00000018 FUNC GLOB D 0 .text modifyemployee
[8] 0x00000040 0x00000018 FUNC GLOB D 0 .text deleteemployee
[9] 0x00000010 0x0000001c FUNC GLOB D 0 .text createemployee
[10] 0x00000000 0x00000004 OBJT GLOB D 0 .rodata lversion
% cc -G -o libempl.so employee.o
% elfdump -sN.dynsym libempl.so | egrep -i "lver|tax|empl"
[1] 0x00000344 0x00000004 OBJT GLOB D 0 .rodata lversion
[4] 0x000103a8 0x00000008 OBJT GLOB D 0 .bss Employee
[6] 0x000002e8 0x00000018 FUNC GLOB D 0 .text deleteemployee
[9] 0x00000310 0x00000018 FUNC GLOB D 0 .text modifyemployee
[11] 0x000103b0 0x00000004 OBJT GLOB D 0 .bss taxrate
[13] 0x000002b8 0x0000001c FUNC GLOB D 0 .text createemployee
% elfdump -sN.symtab libempl.so | egrep -i "lver|tax|empl" \
| grep -v "libempl.so" | grep -v "employee.c"
[30] 0x00000344 0x00000004 OBJT GLOB D 0 .rodata lversion
[33] 0x000103a8 0x00000008 OBJT GLOB D 0 .bss Employee
[35] 0x000002e8 0x00000018 FUNC GLOB D 0 .text deleteemployee
[38] 0x00000310 0x00000018 FUNC GLOB D 0 .text modifyemployee
[40] 0x000103b0 0x00000004 OBJT GLOB D 0 .bss taxrate
[42] 0x000002b8 0x0000001c FUNC GLOB D 0 .text createemployee
From the above elfdump output, all six symbols have global binding. So all six appear in the dynamic symbol table, and the run-time linker has to deal with every one of them.
Suggestions for Establishing an Object Interface
* Define all interface symbols using the __global directive, and reduce all other symbols to local using the -xldscope=hidden compiler option. This model provides the most flexibility. All global symbols are interposable, and allow for any copy relocations to be processed correctly. Or,
* Define all interface functions using the __symbolic directive, data objects using the __global directive, and reduce all other symbols to local using the -xldscope=hidden compiler option. Symbolic symbols are globally visible, but have been internally bound to. This means that these symbols do not require symbolic runtime relocation, but cannot be interposed upon or have copy relocations against them. Note that the problem of copy relocations applies only to data, not to functions. This mixed model, in which functions are symbolic and data objects are global, will yield more optimization opportunities in the compiler.
In short: if we do not want a user to interpose upon our interfaces, and don't export data items, the second model (the mixed model with __symbolic and __global) is the best. If in doubt, it is better to stick to the more flexible use of __global (the first model).
Exporting of symbols in dynamic libraries can be controlled with the help of the __global, __symbolic, and __hidden declaration specifiers. Look at the following header file:
% cat tax.h
int taxrate = 33;
float calculatetax(float);
If taxrate is not needed by any code outside of the module, we can hide it with the __hidden specifier and compile with the -xldscope=global option. Or we can leave taxrate at the default scope, make calculatetax() visible outside of the module by adding the __global or __symbolic specifier, and compile the code with the -xldscope=hidden option. Let's have a look at both approaches.
First approach:
% more tax.h
__hidden int taxrate = 33;
float calculatetax(float);
% more tax.c
#include "tax.h"
float calculatetax(float amount) {
return ((float) ((amount * taxrate)/100));
}
% cc -c -KPIC tax.c
% cc -G -o libtax.so tax.o
% elfdump -s tax.o | egrep "tax"
[8] 0x00000010 0x00000068 FUNC GLOB D 0 .text calculatetax
[9] 0x00000000 0x00000004 OBJT GLOB H 0 .data taxrate
% elfdump -s libtax.so | egrep "tax"
[6] 0x00000240 0x00000068 FUNC GLOB D 0 .text calculatetax
[23] 0x00010350 0x00000004 OBJT LOCL H 0 .data taxrate
[42] 0x00000240 0x00000068 FUNC GLOB D 0 .text calculatetax
Second approach:
% more tax.h
int taxrate = 33;
__global float calculatetax(float);
% more tax.c
#include "tax.h"
float calculatetax(float amount) {
return ((float) ((amount * taxrate)/100));
}
% cc -c -xldscope=hidden -Kpic tax.c
% cc -G -o libtax.so tax.o
% elfdump -s tax.o | egrep "tax"
[8] 0x00000010 0x00000068 FUNC GLOB D 0 .text calculatetax
[9] 0x00000000 0x00000004 OBJT GLOB H 0 .data taxrate
% elfdump -s libtax.so | egrep "tax"
[6] 0x00000240 0x00000068 FUNC GLOB D 0 .text calculatetax
[23] 0x00010350 0x00000004 OBJT LOCL H 0 .data taxrate
[42] 0x00000240 0x00000068 FUNC GLOB D 0 .text calculatetax
Now it is clear that the same symbol visibility can be achieved by changing the specifier, the -xldscope command line flag, or both. (Appendix (1) shows the binding types with all possible source interfaces (specifiers) and the command line option, -xldscope.)
Let's try to build a driver that invokes the calculatetax() function. But first, let's modify tax.h slightly to make calculatetax() non-interposable (refer to bullet 2 in the "Suggestions for Establishing an Object Interface" section) and build libtax.so.
% cat tax.h
int taxrate = 33;
__symbolic float calculatetax(float);
% cc -c -xldscope=hidden -Kpic tax.c
% cc -G -o libtax.so tax.o
% cat driver.c
#include <stdio.h>
#include "tax.h"
int main() {
printf("** Tax on $2525 = %0.2f **\n", calculatetax(2525));
return (0);
}
% cc -R. -L. -o driver driver.c -ltax
Undefined first referenced
symbol in file
calculatetax driver.o (symbol scope specifies local binding)
ld: fatal: Symbol referencing errors. No output written to driver
% elfdump -s driver.o | egrep "calc"
[7] 0x00000000 0x00000000 FUNC GLOB P 0 UNDEF calculatetax
Building the driver program failed because even the client program (driver.c) is trying to export (__symbolic) the definition of calculatetax() instead of importing (__global) it. The declarations within header files shared between the library and the clients must ensure that clients and implementation have different values for the linker scoping of public symbols. So, the simple fix is to export the symbol while the library is being built and import it when the client program needs it. This can be done either by copying tax.h to another file and changing the specifier to __global, or by using preprocessor conditionals (with the -D option) to alter the declaration depending on whether the header file is used in building the library or by a client.
Using separate header files for clients and library leads to code maintenance problems. Even though using a compiler directive eases the pain of writing and maintaining two header files, unfortunately it places lots of implementation details in the public header file. The following example illustrates this by introducing the compiler directive BUILDLIB for building the library.
% more tax.h
int taxrate = 33;
#ifdef BUILDLIB
__symbolic float calculatetax(float);
#else
__global float calculatetax(float);
#endif
When the library is built, the private compiler directive BUILDLIB is defined, so the symbol calculatetax is exported. When a client program is built with the same header file, BUILDLIB is left undefined, and calculatetax is made available to the client; that is, the symbol is imported.
You may want to emulate this system by defining macros for your own libraries. This implies that you have to define a compiler switch (analogous to BUILDLIB) yourself. This can be done with the -D flag of the Oracle Developer Studio C/C++ compilers. Using the -D option on the command line is equivalent to including a #define directive at the beginning of the source. Define the switch when you build your library, and leave it undefined when you publish your headers for use by library clients. Because the header tests the switch with #ifdef, defining it as zero would still select the library branch.
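One way to package the switch is a single macro that expands to the right specifier, so the declaration itself is written only once. The sketch below assumes a hypothetical macro name TAX_API, makes taxrate a static variable for brevity, and falls back to an empty macro on compilers other than Oracle Developer Studio. It illustrates the idea and is not the article's own header.

```c
#include <assert.h>

/* TAX_API is a hypothetical macro name. */
#if defined(__SUNPRO_C) || defined(__SUNPRO_CC)
#  ifdef BUILDLIB
#    define TAX_API __symbolic  /* building the library: export */
#  else
#    define TAX_API __global    /* building a client: import */
#  endif
#else
#  define TAX_API               /* other compilers: default scope */
#endif

/* This part would live in tax.h. */
TAX_API float calculatetax(float amount);

/* This part would live in tax.c, compiled with -DBUILDLIB. */
static int taxrate = 33;

float calculatetax(float amount)
{
    assert(amount >= 0.0f);
    return (amount * taxrate) / 100;
}
```

The public header still reveals that the library is built with a BUILDLIB switch, but the implementation detail is confined to one macro definition instead of being repeated on every declaration.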
Let's continue with the example by adding the directive BUILDLIB to the compile line that builds libtax.so.
% make
Compiling tax.c ..
cc -c -xldscope=hidden -Kpic -DBUILDLIB tax.c
Building libtax library ..
cc -G -o libtax.so tax.o
Building driver program ..
cc -ltax -o driver driver.c
Executing driver ..
./driver
** Tax on $2525 = 833.25 **
The following is an alternative implementation for the above example, with simple interface in public header file. The idea behind this approach is to use a second header file that redeclares symbols with a more restrictive linker scope for use within the library.
% cat tax_public.h
float calculatetax(float);
% cat tax_private.h
#include "tax_public.h"
int taxrate = 33;
% cat tax_private.c
#include "tax_private.h"
__symbolic float calculatetax(float amount) {
return ((float) ((amount * taxrate)/100));
}
% cat tax_private.h
#include "tax_public.h"
int taxrate = 33;
__symbolic float calculatetax(float);
To export the symbol calculatetax, the private header should be used while building the library.
% cat tax_private.c
#include "tax_private.h"
float calculatetax(float amount) {
return ((float) ((amount * taxrate)/100));
}
% cc -c -xldscope=hidden -KPIC tax_private.c
% cc -G -o libtax_private.so tax_private.o
The public header should be used while building the client program, so that the client can access calculatetax, since it will have global visibility.
% cat driver.c
#include <stdio.h>
#include "tax_public.h"
int main() {
printf("** Tax on $2525 = %0.2f **\n", calculatetax(2525));
return (0);
}
% cc -ltax_private -o driver driver.c
% ./driver
** Tax on $2525 = 833.25 **
The trade-off with this alternate approach is that we need two sets of header files, one for exporting the symbols and the other for importing them.
__declspec
A new keyword, __declspec, supports the dllexport and dllimport storage-class attributes (or specifiers) to facilitate the porting of applications developed using Microsoft Windows compilers to Oracle Solaris.
Syntax:
storage... __declspec( dllimport ) type declarator...
storage... __declspec( dllexport ) type declarator...
On Windows, these attributes define the symbols exported (the library as a provider) and imported (the library as a client).
On Oracle Solaris, __declspec(dllimport) maps to __global and __declspec(dllexport) maps to the __symbolic specifier. Note that the semantics of these keywords are somewhat different on Microsoft and Oracle platforms. So, applications being developed natively on the Oracle platform are strongly encouraged to stick to the Oracle-specified syntax, instead of using Microsoft-specific extensions to C/C++.
__declspec(dllexport)
While building a shared library, all the global symbols of the library should be explicitly exported using the __declspec keyword. To export a symbol, the declaration takes the following form:
__declspec(dllexport) type name
__declspec(dllexport) char *printstring();
struct __declspec(dllexport) MyClass {...}
Data, functions, classes, or class member functions from a shared library can be exported using the __declspec(dllexport) keyword.
When building a library, we typically create a header file that contains the function prototypes and/or classes we are exporting, and add __declspec(dllexport) to the declarations in the header file.
Oracle Developer Studio compilers map __declspec(dllexport) to __symbolic; hence the following two declarations are equivalent:
__symbolic void printstring();
__declspec(dllexport) void printstring();
__declspec(dllimport)
To import the symbols that were exported with __declspec(dllexport), a client that wants to use the library must reverse the declaration by replacing dllexport with dllimport:
__declspec(dllimport) char *printstring();
class __declspec(dllimport) MyClass{...}
A program that uses public symbols defined by a shared library is said to be importing them. When creating header files for applications that build against the library, __declspec(dllimport) should be used on the declarations of the public symbols.
Oracle Developer Studio compilers map __declspec(dllimport) to __global; hence the following two declarations are equivalent:
__global void printstring();
__declspec(dllimport) void printstring();
Windows C/C++ compilers may accept code that is declared with __declspec(dllexport) but is actually imported. Such code will not build with Oracle Developer Studio compilers; the dllexport/dllimport attributes must be correct. This constraint increases the effort necessary to port existing Windows code to Oracle Solaris, especially for large C++ libraries and applications.
For example, the following code may compile and run on Windows, but fails at link time with Oracle Developer Studio on Oracle Solaris.
% cat util.h
__declspec(dllexport) long multiply (int, int);
% cat util.cpp
#include "util.h"
long multiply (int x, int y) {
return (x * y);
}
% cat test.cpp
#include <stdio.h>
#include "util.h"
int main() {
printf(" 25 * 25 = %ld", multiply(25, 25));
return (0);
}
% CC -G -o libutil.so util.cpp
% CC -o test test.cpp -L. -R. -lutil
Undefined                       first referenced
 symbol                             in file
multiply                            test.o  (symbol scope specifies local binding)
ld: fatal: Symbol referencing errors. No output written to test
% elfdump -CsN.symtab test.o | grep multiply
[3] 0x00000000 0x00000000 FUNC GLOB P 0 UNDEF long multiply(int,int)
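One common way to keep the dllexport/dllimport attributes correct on both sides is to select them with a build-time macro, so that the library and its clients share a single header. The following is only a sketch under stated assumptions, not part of the original example: the macro names UTIL_BUILD and UTIL_API are hypothetical, the library build is assumed to pass -DUTIL_BUILD, and the GCC-style fallback branch is included only so that the fragment also compiles with compilers that lack __declspec.

```cpp
// util.h portion: one header shared by the library and its clients.
// UTIL_BUILD and UTIL_API are hypothetical names chosen for this sketch.
// Assumption: the library's own build passes -DUTIL_BUILD; clients do not.
#if defined(_WIN32) || defined(__SUNPRO_CC)
#  if defined(UTIL_BUILD)
#    define UTIL_API __declspec(dllexport)   // library side: provider
#  else
#    define UTIL_API __declspec(dllimport)   // client side: consumer
#  endif
#elif defined(__GNUC__)
// Assumption: a GCC-compatible compiler without __declspec support.
#  define UTIL_API __attribute__((visibility("default")))
#else
#  define UTIL_API
#endif

UTIL_API long multiply(int, int);

// util.cpp portion: the definition itself needs no attribute because the
// declaration above has already been seen.
long multiply(int x, int y) {
    return (x * y);
}
```

With this arrangement, test.cpp includes the same util.h without defining UTIL_BUILD, so it sees the dllimport form of the declaration rather than the dllexport form that caused the link failure above.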
Let's conclude by stating some of the benefits of reduced linker scoping.
The following paragraphs explain some of the benefits of the linker scoping feature. We can take advantage of most of the benefits listed just by reducing the scope of all or most of the symbols in our application from global to local.
With C++, namespaces are the preferred method for avoiding name collisions. But applications that rely heavily on C-style programming and do not use the namespace mechanism are vulnerable to name collisions.
Name collisions are hard to detect and debug. Third-party libraries can create havoc when some of their symbol names coincide with those in the application. For example, if a third-party shared library uses a global symbol with the same name as a global symbol in one of the application's shared libraries, the symbol from the third-party library may interpose on ours and unintentionally change the functionality of the application without any warning. With symbolic scoping, we can make it hard to interpose symbols and ensure that the correct symbol is used at run time.
Reducing the exported interfaces of shared objects greatly reduces the runtime overhead of processing these objects and improves both application startup time and runtime performance. With reduced symbol visibility, the symbol count drops, so runtime symbol lookup costs less; the relocation count also drops, so fixing up the objects prior to their use costs less.
Access to thread-local storage can be significantly faster because the compiler knows the inter-object, intra-linker-module relationship between a reference to a symbol and the definition of that symbol. If the backend knows that a symbol will not be exported from a dynamic library or executable, it can perform optimizations that it could not perform when it only knew the scope relative to the relocatable object being built.
-Kpic
With most symbols hidden, there are fewer symbols in the library, and the library may be able to use the more efficient -Kpic rather than the less efficient -KPIC.
PIC-compiled code allows the linker to keep a read-only version of the text (code) segment for a given shared library. The dynamic linker can share this text segment among all running processes that reference it at a given time. PIC also helps reduce the number of relocations.
The strip(1) utility is not enough to hide the names of the application's routines and data items; stripping eliminates the local symbols but not the global symbols.
Dynamically linked binaries (both executables and shared libraries) use two symbol tables: the static symbol table and the dynamic symbol table. The dynamic symbol table is used by the runtime linker. It has to be present even in stripped executables, or else the dynamic linker cannot find the symbols it needs. The strip utility can only remove the static symbol table.
By making most of the symbols of the application local in scope, the symbol information for such local symbols in a stripped binary is really gone and is not available at runtime, so no one can extract it.
Note that although linker scoping is an easier mechanism to use, it is not the only one; the same can be done with mapfiles.
By reducing scope of symbols, the linker symbols that the client can link to are aligned with the supported interface of the library; and the client cannot link to functions that are not supported and may do damage to the operation of the library.
ELF's exported symbol table format is quite a space hog. Due to the reduced linker scope, there will be a noticeable drop in the sizes of the binaries being built.
In 64-bit mode, the linker on Oracle Solaris 8 and earlier versions has a limitation: it can handle only up to 32768 (0x8000) PLT entries. This means that we can't link very large shared libraries in 64-bit mode. The linker reports the following error message if the limit is exceeded:
Assertion failed: pltndx < 0x8000
The linker needs PLT entries only for the global symbols. If we use linker scoping to reduce the scope of most of the symbols to local, this limitation is likely to become irrelevant.
The use of linker mapfiles for linker scoping is difficult with C++ because the mapfiles require linker names, which are not the same names used in the program source (explained in the introductory paragraphs). Linker scoping is a viable alternative to mapfiles for reducing the scope of symbols. With linker scoping, the header files of the library need not change. The source files may be compiled with the -xldscope flag to indicate the default linker scoping, and individual symbols that need a different linker scoping are specified in the source.
Note that linker mapfiles provide many features beyond linker scoping, including assigning addresses to symbols and internal library versioning.
The declaration specifiers (__global, __symbolic, and __hidden) will have priority over the command-line -xldscope option. The following table shows the resulting binding and visibility when the code is compiled with each combination of specifier and command-line option:
Declaration Specifier | -xldscope=global | -xldscope=symbolic | -xldscope=hidden | no -xldscope |
---|---|---|---|---|
__global | GLOB DEFAULT | GLOB DEFAULT | GLOB DEFAULT | GLOB DEFAULT |
__symbolic | GLOB PROTECTED | GLOB PROTECTED | GLOB PROTECTED | GLOB PROTECTED |
__hidden | LOCL HIDDEN | LOCL HIDDEN | LOCL HIDDEN | LOCL HIDDEN |
no specifier | GLOB DEFAULT | GLOB PROTECTED | LOCL HIDDEN | GLOB DEFAULT |
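The precedence rule in the table can be modeled in a few lines. The following is a minimal sketch, not compiler code: the names Spec, Scope, effective_scope, and describe are hypothetical, and the absence of -xldscope is treated as equivalent to -xldscope=global because those two columns of the table are identical.

```cpp
// Hypothetical model of the precedence rule shown in the table above.
// Scope values correspond to GLOB DEFAULT, GLOB PROTECTED, and LOCL HIDDEN.
enum class Scope { Global, Symbolic, Hidden };

// Declaration specifier attached to a symbol, if any.
enum class Spec { None, Global, Symbolic, Hidden };

// 'option' is the -xldscope value in effect. Assumption: callers pass
// Scope::Global when no -xldscope option was given at all.
constexpr Scope effective_scope(Spec spec, Scope option) {
    return spec == Spec::Global   ? Scope::Global
         : spec == Spec::Symbolic ? Scope::Symbolic
         : spec == Spec::Hidden   ? Scope::Hidden
         : option;  // no specifier: the command-line option decides
}

// Binding and visibility text as it appears in the table cells.
constexpr const char* describe(Scope s) {
    return s == Scope::Global   ? "GLOB DEFAULT"
         : s == Scope::Symbolic ? "GLOB PROTECTED"
         : "LOCL HIDDEN";
}
```

For instance, effective_scope(Spec::Global, Scope::Hidden) yields Scope::Global, matching the __global row under -xldscope=hidden, while effective_scope(Spec::None, Scope::Hidden) yields Scope::Hidden, matching the last row.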
In the following example, the library is compiled with -xldscope=hidden and its interface functions are defined with __global or __symbolic.
% cat external.h
extern void non_library_function();
inline void non_library_inline() {
non_library_function();
}
% cat public.h
extern void interposable();
extern void non_interposable();
struct container {
virtual void method();
void non_virtual();
};
% cat private.h
extern void inaccessible();
% cat library.c
#include "external.h"
#include "public.h"
#include "private.h"
__global void interposable() { }
__symbolic void non_interposable() { }
__symbolic void container::method() { }
__hidden void container::non_virtual() { }
void inaccessible() {
non_library_inline();
}
Compiling library.c results in the following linker scopings in library.o.
------------------------------------------------
function name              linker scoping
------------------------------------------------
non_library_function       undefined
non_library_inline         hidden
interposable               global
non_interposable           symbolic
container::method          symbolic
container::non_virtual     hidden
inaccessible               hidden
------------------------------------------------
With the __symbolic specifier in the template class definition, all members of all instances of class Stack will have symbolic scope, unless overridden.
% cat stack.cpp
template <class Type>
class __symbolic Stack
{
private:
Type items[25];
int top;
public:
Stack();
bool isempty();
bool isfull();
bool push(const Type & item);
bool pop(Type & item);
};
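The listing above only declares the members of Stack. A possible completion is sketched below, purely for illustration. STACK_SCOPE is a hypothetical macro that expands to __symbolic only when the Oracle Developer Studio C++ compiler is detected, so that the fragment also compiles elsewhere; the member bodies are assumptions and are not taken from the original source.

```cpp
// STACK_SCOPE is a hypothetical macro introduced for this sketch.
#if defined(__SUNPRO_CC)
#  define STACK_SCOPE __symbolic
#else
#  define STACK_SCOPE
#endif

template <class Type>
class STACK_SCOPE Stack {
private:
    Type items[25];
    int top;  // number of items currently stored
public:
    Stack() : top(0) {}
    bool isempty() const { return top == 0; }
    bool isfull() const { return top == 25; }
    bool push(const Type& item);
    bool pop(Type& item);
};

// The out-of-class definitions do not repeat the specifier; this sketch
// relies on the class-level specifier described in the text above.
template <class Type>
bool Stack<Type>::push(const Type& item) {
    if (isfull()) {
        return false;  // no room left
    }
    items[top++] = item;
    return true;
}

template <class Type>
bool Stack<Type>::pop(Type& item) {
    if (isempty()) {
        return false;  // nothing to remove
    }
    item = items[--top];
    return true;
}
```

A member that needs a different scope would carry its own specifier, as the phrase "unless overridden" in the text suggests.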
% cat mylib_public.h
float getlibversion();
int checklibversion();
% cat mylib_private.h
#include "mylib_public.h"
const float libversion = 2.2;
% cat mylib.cpp
#include "mylib_private.h"
float getlibversion() {
return (libversion);
}
int checklibversion() {
return ((getlibversion() < 2.0) ? 1 : 0);
}
% CC -G -o libmylib.so mylib.cpp
% cat thirdpartylib.h
const float libversion = 1.5;
float getlibversion();
% cat thirdpartylib.cpp
#include "thirdpartylib.h"
float getlibversion() {
return (libversion);
}
% CC -G -o libthirdparty.so thirdpartylib.cpp
% cat versioncheck.cpp
#include <stdio.h>
#include "mylib_public.h"
int main() {
if (checklibversion()) {
printf("\n** Obsolete version being used .. Can\'t proceed further! **\n");
} else {
printf("\n** Met the library version requirement .. Good to Go! ** \n");
}
return (0);
}
% CC -o vercheck -lthirdparty -lmylib versioncheck.cpp
% ./vercheck
** Obsolete version being used .. Can't proceed further! **
Since checklibversion() and getlibversion() are within the same load module, checklibversion() of the mylib library expects to call the getlibversion() from the mylib library. However, the linker picked up the getlibversion() from the thirdparty library, since it was linked before mylib when the executable was built.
To avoid failures like this, it is suggested to bind the symbols to their definitions within the module itself by using symbolic scoping. Compiling the mylib library's source with -xldscope=symbolic gives all the symbols of the module symbolic scope. This produces the desired behavior and makes symbol collisions less likely by ensuring that the library uses its local definition of the routine rather than a definition that occurs earlier in the link order:
% CC -G -o libmylib.so -xldscope=symbolic mylib.cpp
% CC -o vercheck -lthirdparty -lmylib versioncheck.cpp
% ./vercheck
** Met the library version requirement .. Good to Go! **
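The difference between the two runs can be illustrated with a toy model of symbol lookup. This is a deliberately simplified sketch and not how ld(1) or ld.so.1(1) is implemented: the Module type and the resolve function are hypothetical, and the model only captures two rules, namely that the first definition in search order normally wins and that a symbolically scoped module binds its own references to its own definitions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model only; all names here are hypothetical.
struct Module {
    std::string name;
    std::vector<std::string> defines;  // symbols this module defines
    bool symbolic;                     // true if built with -xldscope=symbolic
};

static bool defines_symbol(const Module& m, const std::string& sym) {
    for (const std::string& d : m.defines) {
        if (d == sym) {
            return true;
        }
    }
    return false;
}

// Returns the name of the module whose definition satisfies a reference to
// 'sym' made from modules[caller]. 'modules' is listed in search (link)
// order. An empty string means that no module defines the symbol.
std::string resolve(const std::vector<Module>& modules,
                    std::size_t caller, const std::string& sym) {
    const Module& from = modules.at(caller);
    // Symbolic scope: a reference from inside the module binds to the
    // module's own definition.
    if (from.symbolic && defines_symbol(from, sym)) {
        return from.name;
    }
    // Otherwise the first definition in search order wins (interposition).
    for (const Module& m : modules) {
        if (defines_symbol(m, sym)) {
            return m.name;
        }
    }
    return std::string();
}
```

With libthirdparty.so listed before libmylib.so, a reference to getlibversion from libmylib.so resolves to libthirdparty.so while the symbolic flag is false, and to libmylib.so once the flag is true, mirroring the two vercheck runs above.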
To get the number of relocations that the linker may perform, run the following commands:
For the total number of relocations:
% elfdump -r <DynamicObject> | grep -v NONE | grep -c R_
For the number of non-symbolic relocations:
% elfdump -r <DynamicObject> | grep -c RELATIVE
For example:
% elfdump -r /usr/lib/libc.so | grep -v NONE | grep -c R_
2562
% elfdump -r /usr/lib/libc.so | grep -c RELATIVE
1868
The number of symbolic relocations is calculated by subtracting the number of non-symbolic relocations from the total number of relocations; for the libc.so example above, that is 2562 - 1868 = 694. This number also includes the relocations in the procedure linkage table.
ld(1), also called the link-editor, creates load modules from object modules.
ld.so.1(1) performs the runtime linking of dynamic executables and shared libraries. It brings shared libraries into an executing application and handles the symbols in those libraries as well as in the dynamic executable images. That is, the dynamic linker creates a process image from load modules.
Giri Mandalika is an engineering consultant at Oracle working with independent software vendors to make sure their products run well on the Oracle platform. He holds a Master's degree in Computer Science from The University of Texas at Dallas.