By Max Bruning, June 2006
Many developers are writing applications to run under the Linux operating system. With the many new features of the Solaris 10 OS, and with the new emphasis Sun has placed on supporting the Solaris OS on AMD and Intel processor-based machines, developers are becoming interested in developing their applications on the Solaris platform. This article examines similarities and differences in the development environments of both operating systems. It should benefit anyone responsible for porting applications from Linux to the Solaris OS, as well as programmers with prior Linux experience who want to learn development on the Solaris OS.
In this article, the term "Solaris" refers to the Solaris 10 OS (and OpenSolaris), and "Linux" refers to Linux 2.6. Many of the details covered will also apply to earlier versions of Solaris and Linux. The Linux distribution is meant to be generic, though examples have been tested on SuSe 9.1. Also, the article concentrates on applications written using the C programming language, though C++ should behave the same. Since Java technology-based applications should not be making function calls specific to Linux or the Solaris OS, they should be portable as is.
This article discusses similarities and differences that will be visible to application programmers and analysts on the Solaris OS and Linux. It is not meant as an exhaustive description of differences, nor is it meant to show that one OS is superior to the other. Rather, the article tries to help developers experienced in one of the OSes to work with the other OS as quickly as possible.
A simple application that is POSIX-compliant and doesn't use any system calls or library functions specific to the Solaris OS or Linux should be portable between the OSes without changes. You should be able to write your app, compile it for the Solaris OS or Linux, and simply recompile it for the other OS, and it should work. Most of the system calls and library routines on both OSes fall into this category.
Many system calls in Linux exist as library functions in the Solaris OS, and vice versa. For instance, sched_setscheduler() is a system call in Linux, while in the Solaris OS it is a library function that calls the priocntl(2) system call. The priocntl(2) system call does not exist in Linux, but then Linux does not support multiple schedulers beyond time share and real time. The next section of this article groups system calls into functional sections and compares what is available in each OS.
Most of the applications and toolkits from the Linux world will compile and run without changes. These include gcc, emacs, MySQL, perl, and many others. Precompiled binaries for many packages are available at http://www.sunfreeware.com.
Various administrative differences exist between the Solaris OS and Linux, and within Linux, between different distributions. The Solaris 10 OS has introduced the "Service Management Facility" (SMF), which is a big change from previous versions of Solaris. This article does not cover system administration differences, except where they affect developers.
Most of the system calls and libraries that exist in Linux also exist in the Solaris OS. This section will cover system calls and library routines that are different between the two systems. The system calls and library routines are categorized as follows:
The Solaris OS keeps a list of system calls in /usr/include/sys/syscall.h. Linux maintains the same information in /usr/include/asm/unistd.h. (Note that both Linux and the Solaris OS have unistd.h and syscall.h files, and that in some cases, the files agree in content.)
Documentation for system calls is available in the Solaris OS and on Linux at /usr/share/man/man2. (The Solaris OS has a symbolic link from /usr/man to the same place.) Library routines are documented in various manual sections. See man intro.3 for an overview of the library sections on Linux and on the Solaris OS. Note that the Solaris OS breaks down the library routines more finely than Linux. For instance, aio_read() is documented at aio_read(3RT) on the Solaris OS, while on Linux, it is documented at aio_read(3). The practical result is that the Solaris section name tells you which library to link: a program using aio_read() on the Solaris OS must be linked with the real-time library via -lrt. On Linux the section number gives no such hint, and whether -lrt is needed depends on the C library version.
Both Linux and the Solaris OS come with over 200 different libraries, with more than 50,000 functions defined within the libraries.
The following table lists some libraries on Linux and the Solaris OS. Note that this is not meant to be a complete listing. Also note that some of these libraries must be downloaded and installed separately from normal installation of the system.
Solaris OS | Linux | Description |
---|---|---|
libc | libc | The standard C library (POSIX, SysV, ANSI, etc.) See man libc on Solaris OS. |
libucb | libc | UCB (University of California, Berkeley) compatibility library |
libmalloc | libc | There are several different malloc libraries; the default is in libc. |
libsocket | libc | Socket library (sockets are in libc on Linux). |
libxnet | libc | X/Open Networking library |
libresolv | libresolv | DNS routines (and on Solaris OS, inet_* routines) |
libnsl | libnsl/libc | Network services library (on Linux, NIS/NIS+ routines) |
librpc | librpc | RPC functions |
libslp | libslp | Service Location Protocol |
libsasl | libsasl | Simple Authentication and Security Layer |
libaio | libaio | Asynchronous I/O library |
libdoor | | Door support ( door_create() , door_return() , etc.) |
librt | librt | POSIX Real Time library |
libcfgadm | | Configuration administration library |
libcontract | | Contract management library (see man contract.4 on Solaris OS) |
libcpc | | CPU performance counter library (on Linux, may need to install kernel module?) |
libdat | | Direct Access Transport Library (see http://www.datcollaborative.org) |
libelf | libelf | ELF support library |
libm | libm | Math library |
The next sections take a closer look at some of the system calls and libraries. We'll concentrate on what's different between the systems.
Most of the socket and networking code should simply need to be recompiled for the OS you are using, and the resulting executable should work. This section compares network-related system calls and library routines that are typically used on the Solaris OS and Linux.
socket()
The socket() routine, in addition to the AF_UNIX, AF_INET, and AF_INET6 domain arguments, has additional values on the Solaris OS and Linux. On the Solaris OS, the AF_NCA domain is used to specify the Network Cache and Accelerator (see nca(1)) for use with a socket. Most of the address families (domains) exist on both Linux and the Solaris OS. Note: See /usr/include/sys/socket.h on the Solaris OS and /usr/include/linux/socket.h on Linux for the possible address families. You may, however, need to download or write code to support some of the domains.
Linux has several additional domains documented on the socket(2) man page. The additional documented domains on Linux are:
AF_IPX - Novell IPX protocols (may be for SuSe only?).
AF_NETLINK - Kernel/user interface device, allows users to access kernel modules. Note: Other ways exist to do this on the Solaris OS (and on Linux for that matter).
AF_X25 - X25 protocol. On the Solaris OS, this domain is included with the Solstice X.25 product.
AF_AX25 - Amateur radio AX.25 protocol.
AF_ATMPVC - Permanent Virtual Circuits over ATM.
AF_APPLETALK - See man ddp on Linux. Also exists on the Solaris OS but is not documented.
AF_PACKET - See man packet.7 on Linux. Raw packet interface. On the Solaris OS, open the NIC device and use getmsg(2)/putmsg(2) to receive/send raw packets using DLPI. (See Data Link Provider Interface (DLPI), Version 2 for details on DLPI.)
bind()
The Linux man page (man bind.2) includes some information about different address families besides AF_INET and AF_UNIX. The Solaris man page is man bind.3socket.
listen()
On both Linux and the Solaris OS, the backlog argument (the second argument to listen()) refers to the queue length for established connections that are waiting to be accepted. The Linux man page says this, while the Solaris man page just refers to the "queue of pending connections".
accept()
Linux supports three connection-based socket types: SOCK_STREAM, SOCK_SEQPACKET, and SOCK_RDM, whereas the Solaris OS only documents SOCK_STREAM. The Linux implementation does not inherit some socket flags. This may differ from other implementations.
connect()
The Linux man page (man connect.2) documents SOCK_SEQPACKET, while the Solaris OS does not. Linux breaks the association between a connectionless socket and connect() by connecting to an address with sa_family in struct sockaddr set to AF_UNSPEC. This behavior is not documented in the Solaris OS.
send()/recv()
As with the other socket library functions, these behave almost identically between the systems. Linux has some additional flags argument documentation on the man page.
shutdown()
No noticeable difference between the Solaris OS and Linux.
It can be useful to look at an application where some of the differences appear. The tracedump program uses a packet capture library (libpcap) to read Ethernet packets at the user level. The code to read raw Ethernet is quite different between the Solaris OS and Linux. (libpcap can also be used to examine the differences with other systems, such as FreeBSD, HP-UX, and AIX.) The applicable code in libpcap is at pcap-linux.c and pcap-dlpi.c. The DLPI code is used for Solaris, HP-UX, AIX, and other operating systems. Linux provides a mechanism for reading raw packets via the standard socket calls. The Solaris OS uses the getmsg(2) and putmsg(2) calls to receive and send DLPI packets.
The following code demonstrates a way to do user-level packet capture on a network interface in the Solaris OS. This is followed by the analogous code in Linux. This code is a greatly simplified extraction from the libpcap library, and most error checking has been left out.
#include <sys/types.h>
#include <sys/dlpi.h>
#include <sys/stream.h>
#include <stdio.h>
#include <errno.h>
#include <stropts.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
int
main(int argc, char *argv[])
{
int fd;
union DL_primitives dlp;
dl_bind_req_t bindreq;
dl_attach_req_t attachreq;
dl_promiscon_req_t promisconreq;
struct strbuf ctl, data;
int flags;
char buffer[8192];
if (argc != 2) {
fprintf(stderr, "Usage: %s device\n", argv[0]);
return (1);
}
fd = open(argv[1], O_RDWR); /* for instance, /dev/elxl0 */
if (fd == -1) {
perror(argv[1]);
return (1);
}
/* attach to a specific interface */
attachreq.dl_primitive = DL_ATTACH_REQ;
attachreq.dl_ppa = 0; /* assume we want /dev/xxx0 */
ctl.maxlen = 0;
ctl.len = sizeof(attachreq);
ctl.buf = (char *)&attachreq;
flags = 0;
/* send attach req */
putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
ctl.maxlen = sizeof(dlp);
ctl.len = 0;
ctl.buf = (char *)&dlp;
/* get ok ack, may contain error */
getmsg(fd, &ctl, (struct strbuf*)NULL, &flags);
memset((char *)&bindreq, 0, sizeof(bindreq));
/* the following bind might not need to be done */
bindreq.dl_primitive = DL_BIND_REQ;
bindreq.dl_sap = 0;
bindreq.dl_max_conind = 1;
bindreq.dl_service_mode = DL_CLDLS;
bindreq.dl_conn_mgmt = 0;
bindreq.dl_xidtest_flg = 0;
ctl.maxlen = 0;
ctl.len = sizeof(bindreq);
ctl.buf = (char *)&bindreq;
flags = 0;
/* send bind req */
putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
ctl.maxlen = sizeof(dlp);
ctl.len = 0;
ctl.buf = (char *)&dlp;
/* get bind ack */
getmsg(fd, &ctl, (struct strbuf*)NULL, &flags);
promisconreq.dl_primitive = DL_PROMISCON_REQ;
promisconreq.dl_level = DL_PROMISC_PHYS;
ctl.maxlen = 0;
ctl.len = sizeof(promisconreq);
ctl.buf = (char *)&promisconreq;
flags = 0;
/* send promiscuous on req */
putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
ctl.maxlen = sizeof(dlp);
ctl.len = 0;
ctl.buf = (char *)&dlp;
/* get ok ack */
getmsg(fd, &ctl, (struct strbuf*)NULL, &flags);
promisconreq.dl_primitive = DL_PROMISCON_REQ;
promisconreq.dl_level = DL_PROMISC_SAP;
ctl.maxlen = 0;
ctl.len = sizeof(promisconreq);
ctl.buf = (char *)&promisconreq;
flags = 0;
/* send promiscuous on req */
putmsg(fd, &ctl, (struct strbuf *)NULL, flags);
ctl.maxlen = sizeof(dlp);
ctl.len = 0;
ctl.buf = (char *)&dlp;
/* get ok ack */
getmsg(fd, &ctl, (struct strbuf*)NULL, &flags);
/* read and echo to stdout whatever comes to us */
while (1) {
data.buf = buffer;
data.maxlen = sizeof(buffer);
data.len = 0;
ctl.buf = (char *) &dlp;
ctl.maxlen = sizeof(dlp);
ctl.len = 0;
flags = 0;
getmsg(fd, &ctl, &data, &flags);
write(1, "\nCTL:\n", 6);
write(1, ctl.buf, ctl.len);
write(1, "\nDAT:\n", 6);
write(1, data.buf, data.len);
}
}
The Solaris code forms DLPI requests and gets DLPI responses to tell the interface that the application wants a copy of all packets arriving at the interface.
The code in Linux is much simpler, as a socket(2) call allows one to specify raw packets. Linux does not use DLPI or STREAMS.
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <netinet/in.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <net/if_arp.h>
#include <stdio.h>
int
main(int argc, char *argv[])
{
int sock_fd = -1;
struct sockaddr_ll sll, from;
struct packet_mreq mr;
socklen_t fromlen;
int packet_len;
char buffer[8192];
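sock_fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
if (sock_fd == -1) {
perror("socket"); /* packet sockets normally need root privileges */
return 1;
}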
memset(&sll, 0, sizeof(sll));
sll.sll_family = AF_PACKET;
sll.sll_ifindex = 0;
sll.sll_protocol = htons(ETH_P_ALL);
bind(sock_fd, (struct sockaddr *) &sll, sizeof(sll));
while (1) {
fromlen = sizeof(from);
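packet_len = recvfrom(
sock_fd, buffer, sizeof(buffer), MSG_TRUNC,
(struct sockaddr *) &from, &fromlen);
if (packet_len == -1)
break;
/* with MSG_TRUNC the return value is the full packet length, */
/* which can be larger than the buffer; never write past it */
if (packet_len > (int)sizeof(buffer))
packet_len = sizeof(buffer);
write(1, buffer, packet_len);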
}
}
A process on both the Solaris OS and Linux is a running instance of a program. In both the Solaris OS and in Linux (2.6), a process is a container for an address space and one or more threads. Every process in the system has a unique process ID (PID), which remains unique for some time after the process dies. Processes are created using fork(2) and its variants. On Linux, processes (and threads) can also be created using clone(2), but pthread_create(3) is more portable. On the Solaris OS, the undocumented lwp_create() system call is somewhat analogous to clone(2).
vfork() performs similarly on both systems. The Solaris OS also has fork1() and forkall(). With fork1(), the child process has only the thread that executed the fork call; with forkall(), all the threads that were in the parent are replicated in the child. The default fork() behaves as fork1(); forkall() must be requested explicitly. forkall() does not exist in Linux (that is, Linux only supports fork1() semantics).
The ps -elfL command can be used on both the Solaris OS and Linux to see the threads in a process. Both systems report the number of LWPs and the lwpid for each thread in the process. Note that an lwpid is unique across processes in Linux. In the Solaris OS, the lwpid is unique within the process. In Linux, the process ID of a multithreaded process is actually a thread group ID. The thread group ID is equivalent to the process ID of the main thread. Sending a signal (via kill(1)/kill(2)) to any lwpid is equivalent to sending the signal to the process. In the Solaris OS, you send the signal to the pid. In both cases, if the default action is taken, the process typically exits and all threads are terminated. See the man page for ps(1) for more details.
Both Linux and the Solaris OS support the notion of binding a process or thread to a processor. Linux allows binding to a set of processors for non-exclusive use of those processors. The Solaris OS allows binding to a set of processors for exclusive use (that is, CPU fencing), but does not allow binding to a group for non-exclusive use (except via Solaris Zones?). Linux does not have a mechanism for CPU fencing, though implementations can be found on the web (see, for example, the CPUSETS for Linux page on the bullopensource.org site). The Linux system calls that are processor affinity based are sched_setaffinity(2) and sched_getaffinity(2). The Solaris OS has the following:
processor_bind(2) to bind/unbind LWPs or processes to a processor
pset_create(2) to set up a processor set
pbind(1) and psrset(1), which are command-line interfaces
For completeness, output of the ps(1) command, first on Linux, then on the Solaris OS, is shown in the section on Threads.
On Linux and the Solaris OS, all forms of the exec system call result in calling execve(2). The Solaris OS documents all six flavors of exec(2) on the same manual page. The Linux man page exec(3) documents execv, execl, execle, execlp, and execvp. A separate page covers execve(2).
The /proc file system exists in slightly different variations on Linux and the Solaris OS. On both systems, /proc is a directory containing files whose names are the process IDs of the current active processes on the system. Each PID-named file is in turn a directory. /proc on Linux has various other directories besides processes. Most of these deal with processors, devices, and statistics on the system. On Linux, one looks in /proc to find information about processes, processors, devices, machine architecture, and so on. On the Solaris OS, the same kind of information is typically available by using a command. For instance, prtconf(1) can be used to learn about machine configuration on the Solaris OS. On Linux, this is done largely by looking at files in /proc.
The virtual address space used by processes can be examined using pmap(1) on the Solaris OS, and by catting the /proc/pid/maps file on Linux, as shown below. See pmap(1) on the Solaris OS and proc(5) on Linux for more details.
<-- on solaris, address space of this instance of bash -->
bash-3.00$ pmap -x $$
1043: /usr/bin/bash -i
Address Kbytes RSS Anon Locked Mode Mapped File
08045000 12 12 4 - rw--- [ stack ]
08050000 528 468 - - r-x-- bash
080E3000 76 72 8 - rwx-- bash
080F6000 124 108 40 - rwx-- [ heap ]
FED8E000 4 4 - - rwxs- [ anon ]
FEDA0000 4 4 - - rwx-- [ anon ]
FEDB0000 760 660 - - r-x-- libc.so.1
FEE7E000 24 24 8 - rw--- libc.so.1
FEE84000 8 8 - - rw--- libc.so.1
FEE90000 24 8 4 - rwx-- [ anon ]
FEEA0000 524 324 - - r-x-- libnsl.so.1
FEF33000 20 20 4 - rw--- libnsl.so.1
FEF38000 32 - - - rw--- libnsl.so.1
FEF50000 44 40 - - r-x-- libsocket.so.1
FEF6B000 4 4 - - rw--- libsocket.so.1
FEF70000 4 4 4 - rwx-- [ anon ]
FEF80000 144 132 - - r-x-- libcurses.so.1
FEFB4000 28 24 - - rw--- libcurses.so.1
FEFBB000 8 - - - rw--- libcurses.so.1
FEFC0000 4 4 - - r-x-- libdl.so.1
FEFC7000 140 140 - - r-x-- ld.so.1
FEFFA000 4 4 4 - rwx-- ld.so.1
FEFFB000 8 8 4 - rwx-- ld.so.1
-------- ------- ------- ------- -------
total Kb 2528 2072 80 -
bash-3.00$
For the equivalent on Linux, see Figure 1. Note that Linux shows the full path name to libraries (the output has been edited to only show the library name). To get the full path names to libraries on the Solaris OS, use pldd(1).
Figure 1: Examining Virtual Address Space Used by Processes in Linux
Linux and the Solaris OS support POSIX threads, Linux via The Native POSIX Thread Library for Linux (PDF), and the Solaris OS as part of the standard C library. See Multithreaded Programming Guide (PDF), specifically, Chapter 5 Programming with the Solaris Software, for details of threads on the Solaris OS. Also quite good is the white paper Multithreading in the Solaris Operating Environment.
In addition to POSIX threads, the Solaris OS supports "Solaris threads". The threads(5) man page describes the similarities and differences between the POSIX thread library and the Solaris thread library. The implementations are interoperable and can be used with care within the same application. The following is straight from the man page.
Most of the functions in the libpthread and libthread libraries have a counterpart in the other corresponding library. POSIX function names, with the exception of the semaphore names, have a "pthread" prefix. Names for similar POSIX and Solaris functions have similar endings. Typically, similar POSIX and Solaris functions have the same number and use of arguments.
fork(2) calls.
The following is a very simple MT program. Very few differences are found in the ways in which multithreaded applications work between the two OSes. Of course, the underlying implementations have several differences.
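#include <pthread.h>
#include <stdio.h>
void *fcn(void *);
int
main(int argc, char *argv[])
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, fcn, NULL) != 0) {
        fprintf(stderr, "pthread_create failed\n");
        return 1;
    }
    /* pthread_t is an integer type on both systems; cast it for printf */
    (void) printf("main thread id = %lx\n", (unsigned long)pthread_self());
    pthread_join(tid, NULL);
    return 0;
}
void *
fcn(void *arg)
{
    printf("new thread id = %lx\n", (unsigned long)pthread_self());
    return NULL;
}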
Use the following to compile and run the program on the Solaris platform:
bash-3.00$ cc simplepthread.c -o simplepthread
bash-3.00$ ./simplepthread
main thread id = 1
new thread id = 2
bash-3.00$
Using gcc on the Solaris platform gives the same results. On Linux it appears thus:
max@linux:~/source> cc simplepthread.c
/tmp/cc8u7kZs.o(.text+0x1e): In function `main':
simplepthread.c: undefined reference to `pthread_create'
/tmp/cc8u7kZs.o(.text+0x4a):simplepthread.c: undefined reference
to `pthread_join'
collect2: ld returned 1 exit status
max@linux:~/source> cc simplepthread.c -lpthread -o simplepthread
max@linux:~/source> ./simplepthread
main thread id = 4015c6c0
new thread id = 4035cbb0
max@linux:~/source>
On Linux, the POSIX thread library needs to be explicitly linked. Note that Solaris 9 and earlier versions also require this. In the Solaris 10 OS, POSIX threads are in the standard C library (libc.so). Note also that the Solaris OS assigns thread IDs using a monotonically increasing integer starting at 1. Linux uses the user virtual address of the pthread structure (a structure used internally by the thread library).
Visibility to threads is provided on both systems by the ps(1) command, and via the /proc file system. See Figure 2 for the output of the ps(1) command on the Solaris platform and Figure 3 for the output on Linux. You'll see that, given the same options, the output is very similar between the machines.
Figure 2: Output of ps(1) Command on Solaris Platform
Figure 3: Output of ps(1) Command on Linux
The command shows state, user, PID, parent PID, LWP ID, number of LWPs (for user processes, this is the number of threads), scheduling class, scheduling priority, user virtual size, wait channel, start time, tty, time spent running, and command. Linux does not report ADDR, whereas the Solaris OS shows the (kernel) virtual address of the proc_t data structure, which the kernel uses to maintain the process. Linux shows WCHAN as a symbol, while the Solaris OS shows it as an address. In the Solaris OS, the WCHAN column is the address of a synchronization variable on which the thread is blocked. On Linux, WCHAN is the routine in which the thread is sleeping. To get the equivalent information in the Solaris OS, use ::threadlist -v inside of mdb -k.
Note that on a machine running a 64-bit kernel (that is, SPARC or AMD64 architecture based), the ADDR and WCHAN fields will display a question mark (?). To see the values for these two fields, use ps -e -o addr,wchan,comm.
More likely, you are interested in what the application threads are doing. For this, use pstack(1) on the process ID of interest. There is a pstack on Linux, but it must be downloaded. Search for it on http://rpmfind.net/linux/RPM/. Note that it only gives the stack backtrace of one thread (the thread ID that is passed to it as an argument). If you want a backtrace of all threads within a process, you need to pass the thread IDs as separate arguments.
<-- get user-level stack(s) of a process on Solaris -->
bash-3.00$ pstack `pgrep mozilla-bin`
21528: /usr/sfw/bin/../lib/mozilla/mozilla-bin -UILocale en-US
----------------- lwp# 1 / thread# 1 --------------------
fef68967 pollsys (896dac8, 9, 0, 0)
fef2b2aa poll (896dac8, 9, ffffffff) + 52
fe793242 g_main_context_iterate () + 39d
----------------- lwp# 2 / thread# 2 --------------------
fef68967 pollsys (fbf5bd04, 1, 0, 0)
fef2b2aa poll (fbf5bd04, 1, ffffffff) + 52
fede047d _pr_poll_with_poll (816fa0c, 1, ffffffff, fbf5bf64,
fc0558aa, 816fa0c) + 2d5
fede05f1 PR_Poll (816fa0c, 1, ffffffff) + 11
fc0558aa __1cYnsSocketTransportServiceEPoll6M_i_ (816f6b8) + 58
fc055f7d __1cYnsSocketTransportServiceDRun6M_I_ (816f6b8) + 18f
fc3d1262 __1cInsThreadEMain6Fpv_v_ (816eb60) + 32
fede1693 _pt_root (816fcc0) + 9e
fef67b30 _thr_setup (feec2400) + 51
fef67f40 _lwp_start (feec2400, 0, 0, 0, 0, 0)
----------------- lwp# 4 / thread# 4 --------------------
fef67f7b lwp_park (0, fa87deb8, 0)
fef620bb cond_wait_queue (825cfec, 816b8d0, fa87deb8, 0) + 3e
fef62462 cond_wait_common (825cfec, 816b8d0, fa87deb8) + 1e9
fef62691 _cond_timedwait (825cfec, 816b8d0, fa87df38) + 4a
fef62722 cond_timedwait (825cfec, 816b8d0, fa87df38) + 27
fef62761 pthread_cond_timedwait (825cfec, 816b8d0,
fa87df38) + 21
feddc598 pt_TimedWait (825cfec, 816b8d0, f1c) + b8
feddc767 PR_WaitCondVar (825cfe8, f1c) + 64
fc3d417e __1cLTimerThreadDRun6M_I_ (81e5108) + 16e
fc3d1262 __1cInsThreadEMain6Fpv_v_ (820d690) + 32
fede1693 _pt_root (820e6b0) + 9e
fef67b30 _thr_setup (fb520400) + 51
fef67f40 _lwp_start (fb520400, 0, 0, 0, 0, 0)
bash-3.00$
Here is an equivalent on Linux. It is interesting that programs like Mozilla and xemacs are stripped on Linux and not stripped on the Solaris OS.
max@linux:~> cd /proc/`pgrep mozilla`/task
max@linux:/proc/3991/task> pstack *
3991: /opt/mozilla/lib/mozilla-bin
(No symbols found)
0xffffe410: ???? (8803488, 8, ffffffff, 8803488, 9, 400fbea0) + 40
0x404b0a6d: ???? (8129258, 4035236c, 57f, 4011e4e6, 4048de14,
403513c4) + 20
0x404b0d07: ???? (814b898, 814b898, 0, 0, 415a8f64, 814b898) + 30
0x401dc11f: ???? (8106350, bfffee80, bfffede8, 807673e, 8084cf4, 0)
0x415c4006: ???? (8106350, 0)
0x414fbae4: ???? (8105ee8, 0, 8079c2c, bfffee90, 80a67b8,
40ad841c) + 1f0
0x08059b7c: ???? (80e7f08, bffff058, 40017068, 14, 4081ccf8,
1) + 90
0x08055a47: ???? (1, bffff134, bffff13c, 4081ccf8, 406eebd0,
400168c0) + 40
0x405f2500: ???? (8055840, 1, bffff134, 80557b0, 8055740,
4000d330) + 40000ed8
4001: /opt/mozilla/lib/mozilla-bin
(No symbols found)
0xffffe410: ???? (413eb7f0, 1, ffffffff, 18, 413eb7f8, 0) + 230
0x400c7439: ???? (818911c, 1, ffffffff, 40c5a0a8, ffffffff,
8188dec)
0x40bc8a52: ???? (8188dc8, 8188df4, 1, 8188dec, 8188f7c, 1) + 10
0x40bc8bcb: ???? (8188dc8, 413ebbb0, 40102ce0, 400d5238,
8189478, 0)
0x40a8da6b: ???? (81893f8, 8189478, 4000ca40, 40102be8, 0, 0)
0x400cb7a6: ???? (8189478, 413ebac4, 0, 0, 0, 0) + 54
0x400fa9dd: ???? (413ebbb0, 0, 0, 0, 0, 0) + bec144d4
4004: /opt/mozilla/lib/mozilla-bin
(No symbols found)
0xffffe410: ???? (40656756, 400d5238, 81ed160, 81ed2d0, 41ffba08,
400c5721) + 170fd55
crawl: Input/output error
Error tracing through process 4004
0x1afcdbf8: ????max@linux:/proc/3991/task>
Solaris threads are given a default user stack size of 1MB. For Linux, the default stack size is 2MB (SuSe 9.1).
Both OSes support POSIX synchronization mechanisms, i.e., mutexes, condition variables, reader/writer locks, semaphores, and barriers. The underlying mechanisms rely on mutexes. In Solaris, user-level mutexes are implemented using "adaptive" spin locks. On Linux, the mechanism is the "futex", or fast user level mutex. Both mechanisms avoid going into the kernel in the non-contention case, and should give comparable performance and behavior.
The Solaris user-level adaptive spin mutexes are described in Multithreading in the Solaris Operating Environment. Linux futexes are described in Futexes Are Tricky (PDF).
The Solaris OS mechanisms lwp_park() and lwp_unpark(), and Linux mechanisms futex_up() and futex_down(), can be used by applications. However, I have not found any source code examples. It is probably best to stick with the POSIX APIs. If you want to compare relative speeds of the POSIX locking mechanisms (as well as performance of various other library routines and system calls), I recommend getting a copy of the libmicro micro benchmark and trying it out on both the Solaris OS and Linux.
Without describing differences in the kernels' handling of memory, we can say that at user level several different memory allocation (malloc) libraries exist, most of which are available (or can be built) for either OS. A comparison of some of the user-level memory allocators can be found in the Sun Developer Network article A Comparison of Memory Allocators in Multiprocessors. "A Memory Allocator" at http://gee.cs.oswego.edu/dl/html/malloc.html contains a (dated) description of a memory allocator used on Linux. More comments can be found in the source code.
At application level, the Solaris OS and Linux both offer POSIX timer routines, including timer_create(), timer_delete(), and nanosleep(). The Solaris OS has an additional timer, CLOCK_HIGHRES, that attempts to use an optimal hardware source, and may give close to nanosecond resolution. A CLOCK_HIGH_RES timer may give similar resolution on Linux, but needs to be installed as a kernel patch (see the home page for the high resolution timers project at http://high-res-timers.sourceforge.net/ for details). The following is example code that uses the CLOCK_HIGHRES timer to fire on user-specified intervals for a user-specified duration. The interval is specified in nanoseconds, and the duration in seconds. When the program completes, it prints the number of times the timer fired, and the number of times the timer was "overrun". The "overrun" value is a count of the number of timer expirations that occurred between the time a timer fired (causing a signal to be generated), and the time the signal is handled (see timer_getoverrun(3RT)). Running the program in the real-time scheduling class with too short an interval may cause the system to hard hang.
#include <pthread.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <time.h>
#include <errno.h>
#define DURATION 120 /* default time to run in seconds */
/* default .5 seconds in nanosecs */
#define INTERVAL (1000*1000*500)
void* timer_fcn(void* arg);
void* signaler_thd(void* arg);
/* Program globals (errno itself is declared by <errno.h>) */
int duration = DURATION;
int interval = INTERVAL;
int
main(int argc, char *argv[])
{
sigset_t mask;
pthread_t wtid = 0;
pthread_t stid = 0;
int rval;
int n;
if (argc >=2) {
errno = 0;
if (argc == 2)
duration = strtol(argv[1], NULL, 0);
else if (argc == 3) {
interval = strtol(argv[1], NULL, 0);
duration = strtol(argv[2], NULL, 0);
}
if (errno || argc > 3 || interval <= 0
|| duration <= 0) {
fprintf(stderr, "Usage: %s [[interval] duration]\n",
argv[0]);
fprintf(stderr, "interval nsecs, duration seconds\n");
exit(1);
}
}
/* block SIGALRM and SIGUSR1; threads created later inherit this mask */
sigemptyset(&mask);
sigaddset(&mask, SIGALRM);
sigaddset(&mask, SIGUSR1);
rval = pthread_sigmask(SIG_BLOCK, &mask, NULL);
if(rval != 0) {
printf("%s: pthread_sigmask failed, errno = %d.\n",
argv[0], rval);
exit(1);
}
rval = pthread_create(&wtid, NULL, timer_fcn, NULL);
if (rval != 0) { /* Waiter create call failed */
/* pthread functions return the error number; errno is not set */
printf ("Waiter create call failed: %d.\n", rval);
exit (1);
}
/* Do signaler thread */
rval = pthread_create(&stid, NULL, signaler_thd, &mask);
if (rval != 0) { /* Signaler call create failed */
printf ("Signaler call create failed: %d.\n", rval);
exit (1);
}
/* Wait for waiter and signaler to finish */
rval = pthread_join(stid, NULL);
if (rval != 0) { /* Signaler call join failed */
printf ("Signaler call join failed: %d.\n", rval);
exit (1);
}
rval = pthread_join(wtid, NULL);
if (rval != 0) { /* Waiter call join failed */
printf ("Waiter call join failed: %d.\n", rval);
exit (1);
}
printf("done\n");
exit(0);
}
pthread_mutex_t mp = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
int time_expired = 0;
int timerentered;
int timeroverrun;
timer_t itimerid;
void *
timer_fcn(void *arg)
{
struct itimerspec value;
struct sigevent event;
value.it_interval.tv_sec = 0;
value.it_interval.tv_nsec = interval; /* nsec intervals */
value.it_value.tv_sec = 1; /* starting in 1 second */
value.it_value.tv_nsec = 0; /* plus 0 nanosecs */
event.sigev_notify = SIGEV_SIGNAL;
event.sigev_signo = SIGALRM;
event.sigev_value.sival_int = 0;
if (timer_create(CLOCK_HIGHRES, &event,
&itimerid) == -1) {
perror("timer_create failed");
exit(1);
}
/* the second arg can be set to TIMER_ABSTIME */
if (timer_settime(itimerid, 0, &value, NULL) == -1) {
/* else time value is relative to when the call is made */
perror("timer_settime failed");
exit(1);
}
pthread_mutex_lock(&mp);
while (time_expired == 0)
pthread_cond_wait(&cv, &mp);
printf("timerentered = %d\n", timerentered);
printf("timeroverrun = %d\n", timeroverrun);
pthread_mutex_unlock(&mp);
exit(0);
}
int timerset;
void *
signaler_thd(void *arg)
{
int signo;
while (1) {
signo = sigwait(arg);
if (signo == SIGALRM) {
if (!timerset) {
struct itimerspec value;
struct sigevent event;
timer_t endtimerid;
++timerset;
value.it_interval.tv_sec = 0;
value.it_interval.tv_nsec = 0;
value.it_value.tv_sec = duration; /*wait duration secs*/
value.it_value.tv_nsec = 0; /* plus 0 nanosecs */
event.sigev_notify = SIGEV_SIGNAL;
event.sigev_signo = SIGUSR1;
event.sigev_value.sival_int = 0;
if (timer_create(CLOCK_HIGHRES, &event,
&endtimerid) == -1) {
perror("timer_create failed");
exit(1);
}
/* the second arg can be set to TIMER_ABSTIME */
if (timer_settime(endtimerid, 0, &value, NULL)
== -1) {
perror("timer_settime failed");
exit(1);
}
} else { /* if (!timerset) */
++timerentered;
timeroverrun += timer_getoverrun(itimerid);
}
} else { /* SIGUSR1 */
struct itimerspec value;
struct sigevent event;
/* cancel the interval timer */
value.it_interval.tv_sec = 0;
value.it_interval.tv_nsec = 0; /* nanosecond intervals */
/* setting the following to 0 should stop the timer */
value.it_value.tv_sec = 0;
value.it_value.tv_nsec = 0; /* plus 0 nanosecs */
event.sigev_notify = SIGEV_SIGNAL;
event.sigev_signo = SIGALRM;
event.sigev_value.sival_int = 0;
pthread_mutex_lock(&mp);
if (timer_settime(itimerid, 0, &value, NULL) == -1) {
perror("timer_settime failed");
exit(1);
}
++time_expired;
pthread_cond_signal(&cv);
pthread_mutex_unlock(&mp);
}
}
}
And here are some examples of running the compiled code.
<-- realtime library and best optimization -->
bash-3.00$ cc timerex1.c -lrt -o timerex1 -O -fast
bash-3.00$ ./timerex1 <-- only root can use high res timer
timer_create failed: Not owner
bash-3.00$ su
Password:
<-- default interval is .5 seconds, duration is 120 seconds -->
# ./timerex1
timerentered = 240 <-- timer fired every .5 seconds
timeroverrun = 0
# ./timerex1 1000000 10 <-- interval is 1 msec for 10 secs
timerentered = 9912
timeroverrun = 88
# priocntl -e -c RT ./timerex1 1000000 10 <-- run it real time
timerentered = 10000 <-- timer fired once each msec for 10 secs
timeroverrun = 0
# ./timerex1 100000 10 <-- interval is 100 usecs for 10 seconds
timerentered = 99615 <-- we missed a few
timeroverrun = 386
# priocntl -e -c RT ./timerex1 100000 10 <-- try real time
timerentered = 99871 <-- almost 1 every 100 microseconds
timeroverrun = 129
# ./timerex1 10000 10 <-- interval is 10 microseconds
timerentered = 485905 <-- here we miss over half
timeroverrun = 514125 <-- (sig handler takes > 10 usecs?)
<-- using RT 1 usec interval causes hang on my machine -->
# priocntl -e -c RT ./timerex1 1000 10
Both the Solaris OS and Linux support System V IPC (shared memory, message queues, and semaphores). Both systems also support pipes and the POSIX real-time shared memory operations (shm_open(), shm_unlink(), and so on). Both systems support the tmpfs file system, which uses memory and swap space for files. The Solaris OS places /tmp, /var/run, and /etc/svc/volatile in tmpfs. Linux uses /dev/shm. Both systems allow other mount points to be added.
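Because shm_open() and shm_unlink() are POSIX calls, the same source should build on both systems. The sketch below creates a shared memory object, writes through one mapping, and reads the data back through a second mapping. The function name shm_round_trip and the object name used in the usage note are illustrative assumptions. On the Solaris OS, link with -lrt as in the timer example; older Linux C libraries need the same flag.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a POSIX shared memory object, write msg through one mapping,
 * and copy it back out through a second mapping of the same object.
 * Returns 0 on success, -1 on failure.
 */
static int
shm_round_trip(const char *name, const char *msg, char *out, size_t outlen)
{
    size_t len = strlen(msg) + 1;
    char *w, *r;
    int fd, rc = -1;

    if (len > outlen)
        return -1;
    fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, (off_t)len) == 0) {
        w = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        r = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        if (w != MAP_FAILED && r != MAP_FAILED) {
            memcpy(w, msg, len);   /* visible through both mappings */
            memcpy(out, r, len);
            rc = 0;
        }
        if (w != MAP_FAILED)
            munmap(w, len);
        if (r != MAP_FAILED)
            munmap(r, len);
    }
    close(fd);
    shm_unlink(name);   /* remove the name once we are done with it */
    return rc;
}
```

A call such as shm_round_trip("/demo_shm", "hello", buf, sizeof (buf)) should leave "hello" in buf. In a real application the two mappings would normally live in different processes.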
Here are the steps for using tmpfs on the Solaris OS; the steps for Linux follow. Note that "swap" on the Solaris OS uses memory as well as disk (if needed). In other words, files created in /tmp are stored in memory. If memory gets full, the pageout daemon may write data from /tmp to swap space on disk.
# mkdir /foo
<-- create a tmpfs file system using swap on /foo
# mount -F tmpfs swap /foo
# df -h /foo
Filesystem size used avail capacity Mounted on
swap 652M 0K 652M 0% /foo
# df -h /tmp
Filesystem size used avail capacity Mounted on
swap 652M 52K 652M 1% /tmp
#
And here are the analogous steps on Linux.
linux:/home/max # mkdir /foo
<-- tmpfs also uses swap space and memory -->
linux:/home/max # mount tmpfs /foo -t tmpfs
linux:/home/max # df -h /foo
Filesystem Size Used Avail Use% Mounted on
tmpfs 248M 0 248M 0% /foo
linux:/home/max # df -h /dev/shm
Filesystem Size Used Avail Use% Mounted on
tmpfs 248M 16K 248M 1% /dev/shm
linux:/home/max #
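The size and available-space figures that df prints in both transcripts can also be read from a program with the POSIX statvfs() call, which exists on both systems. This is a minimal sketch; the helper name fs_space is an assumption made for illustration.

```c
#include <sys/statvfs.h>

/*
 * Report total and available bytes for the file system holding path,
 * roughly the "size" and "avail" columns of df.
 * Returns 0 on success, -1 on failure.
 */
static int
fs_space(const char *path, unsigned long long *total,
    unsigned long long *avail)
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) != 0)
        return -1;
    /* f_frsize is the unit in which f_blocks and f_bavail are counted */
    *total = (unsigned long long)vfs.f_blocks * vfs.f_frsize;
    *avail = (unsigned long long)vfs.f_bavail * vfs.f_frsize;
    return 0;
}
```

Calling fs_space("/foo", &total, &avail) after either mount command above should report the space of the new tmpfs file system.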
It might be interesting to run the libmicro benchmarks mentioned earlier in the article to get some idea of relative performance between the systems.
The Solaris OS and Linux treat signals similarly. Some signals exist in the Solaris OS and not in Linux, and vice versa. Also, some of the same signals use different signal numbers. Both OSes recommend using sigaction(2) over signal() to catch signals, and using sigwait() to handle asynchronous signals in multithreaded applications. The sigwait(3) manual page on Linux has a BUGS section. The Linux signal handling differs from the POSIX standard. POSIX states that an asynchronously delivered signal (a signal sent to the process from outside) is handled by any one thread that does not currently have the signal blocked. In Linux, asynchronous signals may be sent to specific threads (signals can be sent to the thread ID via kill(1)). The Solaris OS implements the POSIX standard for this: there is no way to send a signal to a specific thread from outside the process. kill(1) sends a signal to the process, not to a specific thread within it.
Some of the differences are described in "Building Applications with the Linux Standard Base" at http://lsbbook.gforge.freestandards.org/sig-handling.html. Note that this page may not be entirely accurate. For instance, the page says that Linux sets SIGBUS to SIGUNUSED because there is no "bus error" in Linux. However, the Linux man page for mmap(2) documents receiving SIGBUS when accessing a memory range that does not correspond to a valid location in the file that was mapped with mmap(). (The Solaris OS does the same.)
On both the Solaris OS and Linux, signals are handled when a non-held, non-ignored signal is found pending for a thread returning from kernel to user mode. On both systems, SIGKILL and SIGSTOP take priority over other signals. Otherwise, on the Solaris OS, signals are handled in an undocumented order (lowest signal number first). On Linux, signals are handled in the order they are delivered (again, except for SIGKILL and SIGSTOP).
On the Solaris OS, to see the signal settings for a running process, use psig.
bash-3.00$ psig $$ <-- signal disp for current shell
954: /usr/bin/bash -i
HUP caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
INT caught sigint_sighandler 0
QUIT ignored
ILL caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
TRAP caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
ABRT caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
EMT caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
FPE caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
KILL default
BUS caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
SEGV caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
SYS caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
PIPE caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
ALRM caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
TERM ignored
USR1 caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
USR2 caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
CLD blocked,caught 0x807d4d7 0
PWR default
WINCH caught 0x807e182 0 <-- not all syms are present
URG default
POLL default
STOP default
TSTP ignored
CONT default
TTIN ignored
TTOU ignored
VTALRM caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
PROF default
XCPU caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
XFSZ caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
WAITING default
LWP default
FREEZE default
THAW default
CANCEL default
LOST caught termination_unwind_protect 0 HUP,INT,
ILL,TRAP,ABRT,EMT,FPE,BUS,SEGV,SYS,PIPE,ALRM,TERM,USR1,USR2,
VTALRM,XCPU,XFSZ,LOST
XRES default
JVM1 default
JVM2 default
RTMIN default
RTMIN+1 default
RTMIN+2 default
RTMIN+3 default
RTMAX-3 default
RTMAX-2 default
RTMAX-1 default
RTMAX default
bash-3.00$
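As far as I can tell, Linux has no direct equivalent of psig. The SigBlk, SigIgn, and SigCgt lines in /proc/<pid>/status report the blocked, ignored, and caught signals as hexadecimal bitmasks, but they do not show handler names or per-handler masks. Someone has probably implemented a kernel patch or module to give you the full information, and it should certainly be doable with User Mode Linux.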
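For a rough approximation on Linux, the signal bitmask fields in /proc/<pid>/status can be decoded from a program. The sketch below is Linux-only and covers much less than psig: it reads one field (SigBlk, SigIgn, or SigCgt) and tests a bit. The helper names read_sig_mask and mask_has_signal are hypothetical.

```c
#include <signal.h>
#include <stdio.h>
#include <string.h>

/*
 * Linux only: read a signal bitmask field such as "SigCgt", "SigIgn",
 * or "SigBlk" from /proc/<pid>/status. Pass "self" as pid for the
 * calling process. Returns 0 on success and stores the mask in *mask.
 */
static int
read_sig_mask(const char *pid, const char *field, unsigned long long *mask)
{
    char path[64], line[256];
    size_t flen = strlen(field);
    FILE *fp;
    int rc = -1;

    snprintf(path, sizeof (path), "/proc/%s/status", pid);
    if ((fp = fopen(path, "r")) == NULL)
        return -1;
    while (fgets(line, sizeof (line), fp) != NULL) {
        if (strncmp(line, field, flen) == 0 && line[flen] == ':') {
            /* the value is a hexadecimal bitmask after the colon */
            if (sscanf(line + flen + 1, "%llx", mask) == 1)
                rc = 0;
            break;
        }
    }
    fclose(fp);
    return rc;
}

/* Bit (signo - 1) of the mask corresponds to signal number signo. */
static int
mask_has_signal(unsigned long long mask, int signo)
{
    return (int)((mask >> (signo - 1)) & 1ULL);
}
```

For example, after a process ignores SIGUSR2, reading "SigIgn" for that process and calling mask_has_signal(mask, SIGUSR2) should return 1.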
Generally, if you are developing a POSIX-compliant application on Linux or the Solaris OS, the application should port to the other OS simply by recompilation. Of course, many applications will have parts that are not addressed by POSIX. For instance, device ioctl(2) handling tends to be OS (and, of course, device) specific.
Getting documentation for the Solaris OS is reasonably straightforward, since most of the documentation is at https://docs.oracle.com. Getting documentation for Linux is sometimes simple (search on the web), and sometimes not so simple. You'll find that Linux typically offers multiple ways to do the same thing (different implementations of threads, for example). My impression is that much of the Linux documentation is in the source code itself. This is fine if you have access to all the source code. You do have access to all of it, but it is not in one place. In fact, it seems scattered all over the place.