Introducing the Basics of Service Management Facility (SMF) on Oracle Solaris 11

by Glynn Foster
Published August 2012

Simple examples of administering services on Oracle Solaris 11 with the Service Management Facility.

The Service Management Facility (SMF), first introduced in Oracle Solaris 10, is a feature of the operating system for managing system and application services, replacing the legacy init scripting start-up mechanism common to prior releases of Oracle Solaris and other UNIX operating systems. SMF improves the availability of a system by ensuring that essential system and application services run continuously even in the event of hardware or software failures. SMF is one of the components of the wider Oracle Solaris Predictive Self Healing capability.

This article gives an introduction to SMF and demonstrates some simple examples of administering services on Oracle Solaris 11. More advanced administration topics will be covered in another article.

An Overview of SMF

Before we look at some command line examples, let's quickly explore some of the features of SMF and the benefits it can bring in terms of improving application resiliency in a typical data center environment. SMF is the software framework that is responsible for managing services on a system—whether they are critical system services essential to the working operation of the system or application services, such as a database or Web server.

Each service instance has a well-defined state (for example, online, offline, disabled, or maintenance) and usually a set of dependencies on other services that must be running first. Because these dependencies are explicit, independent services can be started in parallel during system startup, resulting in a much faster boot than with the legacy init framework, which starts scripts one at a time and waits for each to complete. Each service is usually started by the SMF master restarter daemon, svc.startd, though this task can be delegated to an alternative restarter, as is the case for internet services delegated to inetd.

Behind each service is a service manifest that describes some basic information about the service, which service dependencies are required, any required service configuration, and how SMF should start and stop the service. A service, once started, can start several different processes that are tied together as part of a service contract. This means that an administrator needs to manage only the higher-level service, rather than worrying about a series of individual processes and the order in which they must start. If a service fails for any reason, whether because of a hardware or a software fault, SMF will automatically detect the failure and restart the service and any dependent services.

SMF also includes the ability to run multiple instances of a given service and share common configuration across those instances. This is especially useful when you want to run multiple Apache Web server instances, for example, that might differ only by a given port number and document root. SMF stores service configuration data in a configuration repository, including the current state of each service instance on the system as well as the configuration data related to that service and service instance. The configuration repository is managed by the SMF configuration repository daemon, svc.configd.

Each service on the system can be described using a Fault Management Resource Identifier (FMRI) that shows the service name, the service instance, and an associated category. For example, the SSH server has the following FMRI:

svc:/network/ssh:default

In this case, the service name is ssh, the service instance is default, and the category is network. All SMF-related FMRIs are prefixed with the svc:/ scheme, except for "legacy services," which are prefixed with the lrc:/ scheme, as we will see below. Administrators use FMRIs as the main way to manipulate services on an Oracle Solaris system. In some cases, we can use abbreviated forms to refer to the same service, which we will see a little later in this article.
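The three parts of an FMRI can be pulled apart with plain shell parameter expansion. The following is only a minimal sketch: the function names fmri_category, fmri_name, and fmri_instance are hypothetical helpers, not SMF commands, and the sketch assumes that everything between the svc:/ scheme and the last slash counts as the category. It handles only the svc:/ scheme, not lrc:/ entries.

```shell
# Hypothetical helpers (not part of SMF) that split an FMRI of the form
#   svc:/<category>/<name>:<instance>
# Assumption: the category is everything before the last "/" in the path.

fmri_category() {
    rest=${1#svc:/}          # drop the scheme
    rest=${rest%%:*}         # drop the ":instance" suffix, if any
    case $rest in
        */*) printf '%s\n' "${rest%/*}" ;;
        *)   printf '\n' ;;  # no category present
    esac
}

fmri_name() {
    rest=${1#svc:/}
    rest=${rest%%:*}
    printf '%s\n' "${rest##*/}"   # last path component
}

fmri_instance() {
    rest=${1#svc:/}
    case $rest in
        *:*) printf '%s\n' "${rest##*:}" ;;
        *)   printf '\n' ;;  # FMRI names a service, not an instance
    esac
}
```

For svc:/network/ssh:default, these return network, ssh, and default respectively.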

The SMF framework is always active on an Oracle Solaris 11 system, and it is started (and restarted) through the default init process, as shown in Figure 1.

Figure 1. SMF Framework

The SMF Command Line

There are several commands administrators can use from the command line to administer services and make configuration changes to the system. Table 1 provides a quick summary of the different command line options that are available.

Table 1. Summary of SMF Commands

Command   Description
svcadm    Manage the state of service instances
svcs      Provide information about services, including their status
svcprop   Get information about service configuration properties
svccfg    Import, export, and modify service configuration

In this article, we will use the root account to execute our commands for simplicity. Some SMF commands require privileges and others do not. Users can gain the required privileges by assuming the root role or by being granted the solaris.smf.manage and/or solaris.smf.modify authorizations.

Getting Information About Services Running on a System

Let's first take a look at the svcs command to generate a list of service instances that are being run on the system. Without any additional options, svcs provides a quick one-line status of each enabled service instance, as shown in Listing 1.





# svcs
STATE          STIME    FMRI
legacy_run     Jun_14   lrc:/etc/rcS_d/S99openconnect-clean
legacy_run     Jun_14   lrc:/etc/rc2_d/S47pppd
legacy_run     Jun_14   lrc:/etc/rc2_d/S81dodatadm_udaplt
legacy_run     Jun_14   lrc:/etc/rc2_d/S89PRESERVE
disabled       Jun_14   svc:/platform/i86pc/acpihpd:default
disabled       Jun_14   svc:/network/ipsec/policy:default
disabled       Jun_14   svc:/network/nis/domain:default
online         Jun_14   svc:/system/early-manifest-import:default
online         Jun_14   svc:/system/svc/restarter:default
online         Jun_14   svc:/network/tcp/congestion-control:vegas
online         Jun_14   svc:/network/tcp/congestion-control:highspeed
online         Jun_14   svc:/network/sctp/congestion-control:highspeed
online         Jun_14   svc:/network/sctp/congestion-control:vegas
online         Jun_14   svc:/network/tcp/congestion-control:newreno
online         Jun_14   svc:/network/sctp/congestion-control:cubic
online         Jun_14   svc:/network/tcp/congestion-control:cubic
...
online         Jun_14   svc:/system/zones:default
online         Jun_14   svc:/system/power:default
online         Jun_14   svc:/system/hal:default
online         Jun_14   svc:/application/texinfo-update:default
online         Jun_14   svc:/application/pkg/update:default

Listing 1. Example svcs Output

There are a couple of things to note about the output shown in Listing 1. First, the svcs command also lists some legacy services that are being started through the rc*.d script-initiated mechanism. Also, the command lists some service instances that are temporarily disabled until the next system reboot. We can get a list of all service instances, including disabled or incomplete ones, by using the -a option to svcs, as shown in Listing 2.




# svcs -a
STATE          STIME    FMRI
legacy_run     Jun_14   lrc:/etc/rcS_d/S99openconnect-clean
legacy_run     Jun_14   lrc:/etc/rc2_d/S47pppd
legacy_run     Jun_14   lrc:/etc/rc2_d/S81dodatadm_udaplt
legacy_run     Jun_14   lrc:/etc/rc2_d/S89PRESERVE
disabled       Jun_14   svc:/system/device/mpxio-upgrade:default
disabled       Jun_14   svc:/network/install:default
disabled       Jun_14   svc:/network/ipfilter:default
disabled       Jun_14   svc:/network/ipsec/ike:default
disabled       Jun_14   svc:/network/ipsec/manual-key:default
disabled       Jun_14   svc:/system/name-service-cache:default
disabled       Jun_14   svc:/network/ldap/client:default
disabled       Jun_14   svc:/network/nis/client:default
disabled       Jun_14   svc:/network/ibd-post-upgrade:default
disabled       Jun_14   svc:/network/inetd-upgrade:default
disabled       Jun_14   svc:/network/nfs/status:default
disabled       Jun_14   svc:/network/nfs/nlockmgr:default
...
online         Jun_14   svc:/system/zones:default
online         Jun_14   svc:/system/power:default
online         Jun_14   svc:/system/hal:default
online         Jun_14   svc:/application/texinfo-update:default
online         Jun_14   svc:/application/pkg/update:default

Listing 2. List of All Services

As we can see in Listing 2, we get a number of new service instances not listed with the previous command. To get an idea of just how many differences there are, we can get a quick count of the lines of output and get the number of disabled (or incomplete) services. In this case, it amounts to 111 disabled services on this system.




# svcs | wc -l
     147
# svcs -a | wc -l
     258
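Subtracting the two line counts gives the number of instances that only the -a listing shows. As a more direct alternative, the rows can be counted by state. The sketch below assumes the three-column svcs layout shown in the listings above; count_state is a hypothetical helper name, not an SMF command. Its result can be larger than the subtraction, because the default svcs listing already includes a few disabled instances.

```shell
# Hypothetical helper: count the rows of svcs-style output (read from
# standard input) whose STATE column equals the first argument.
# The header row is ignored automatically because "STATE" is not a state.
count_state() {
    awk -v want="$1" '$1 == want { n++ } END { print n + 0 }'
}

# Intended use on a Solaris system:
#   svcs -a | count_state disabled
#   svcs -a | count_state online
```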

Now that we've seen a listing of all service instances, let's explore one of the service instances and get some more information about it. In this example, let's choose the svc:/system/zones:default service instance. We can use the -l option and the service name to get more information, as shown in Listing 3.




# svcs -l svc:/system/zones:default
fmri         svc:/system/zones:default
name         Zones autoboot and graceful shutdown
enabled      true
state        online
next_state   none
state_time   June 14, 2012 08:30:31 PM NZST
logfile      /var/svc/log/system-zones:default.log
restarter    svc:/system/svc/restarter:default
manifest     /etc/svc/profile/generic.xml
manifest     /lib/svc/manifest/system/zones.xml
manifest     /lib/svc/manifest/system/zonestat.xml
dependency   require_all/none svc:/milestone/multi-user-server (online)
dependency   optional_all/none svc:/system/pools:default (disabled)
dependency   optional_all/none svc:/system/pools/dynamic:default (disabled)
dependency   optional_all/none svc:/system/zones-monitoring (online)

Listing 3. Getting Information About a Service Instance

This command lists a lot of information about the svc:/system/zones:default service instance, including a description, detail about the state, where on the file system messages about it are being logged, what service is responsible for starting and restarting it, related service manifests, and dependency information.

As we can see from the description, this service instance is responsible for autobooting zones during system startup and shutting them down. From Listing 3, we can see that this service instance has four dependencies, one of which is required and three of which are optional. Another way to view dependency information is to use the -d option to svcs. While this gives us information about the state of the dependent service, it does not tell us what the dependency relationship might be:




# svcs -d svc:/system/zones:default
STATE          STIME    FMRI
disabled       Jun_14   svc:/system/pools:default
disabled       Jun_14   svc:/system/pools/dynamic:default
online         Jun_14   svc:/system/zones-monitoring:default
online         Jun_14   svc:/milestone/multi-user-server:default
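Because svcs -d omits the dependency relationship, one way to see both the relationship and the state in a compact form is to filter the dependency lines of svcs -l. This is a minimal sketch that assumes the "dependency grouping/restart_on FMRI (state)" layout shown in Listing 3; deps_by_grouping is a hypothetical helper name.

```shell
# Hypothetical helper: read `svcs -l` output on standard input and print
# one line per dependency in the form "<grouping> <fmri> <state>".
deps_by_grouping() {
    awk '$1 == "dependency" {
        split($2, g, "/")        # g[1] = grouping, g[2] = restart_on value
        state = $4
        gsub(/[()]/, "", state)  # strip the surrounding parentheses
        print g[1], $3, state
    }'
}

# Intended use on a Solaris system:
#   svcs -l svc:/system/zones:default | deps_by_grouping
```

For the output in Listing 3, this would print one require_all line and three optional_all lines.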

Let's now have a look at another related service instance, svc:/system/zones-monitoring:default, and see what services depend on this service using the -D option to svcs:




# svcs -D svc:/system/zones-monitoring:default
STATE          STIME    FMRI
online         Jun_14   svc:/system/zones:default

The result, svc:/system/zones:default, is relatively unsurprising since we had already determined that relationship in the previous example. One of the key features of SMF is that administrators manage services rather than the individual processes themselves. But what if we wanted to know what processes were being started by a given service instance? We can look at this easily by using the -p option to svcs, which ps helps to confirm:




# svcs -p zones-monitoring
STATE          STIME    FMRI
online         Jun_14   svc:/system/zones-monitoring:default
               Jun_14        216 zonestatd

# ps 216
   PID TT       S  TIME COMMAND
   216 ?        S  0:01 /usr/lib/zones/zonestatd

Up until now, we have mostly used the full FMRI on the command line to specify the service that we are interested in. SMF also supports abbreviated FMRIs. All of the following examples of getting information about the svc:/system/system-log:default service instance are equivalent because they each uniquely identify the service:




# svcs -l svc:/system/system-log:default
# svcs -l system/system-log:default
# svcs -l system-log:default
# svcs -l system-log

Starting and Stopping Services

Now that we've looked at what services are running on the system and retrieved some basic information about those services, let's now look at how we can administer the state of those services with the svcadm command. For the next few examples, we'll take a look at the svc:/application/management/net-snmp:default service instance, which is responsible for managing the /usr/sbin/snmpd SNMP agent that is used to collect information about a system through a set of Management Information Bases (MIBs). We can check the initial state of this service and the types of dependencies it has using the svcs command, as shown in Listing 4.




# svcs net-snmp
STATE          STIME    FMRI
disabled       Jun_14   svc:/application/management/net-snmp:default
# svcs -d net-snmp
STATE          STIME    FMRI
disabled       Jun_14   svc:/network/rpc/rstat:default
online         Jun_14   svc:/system/cryptosvc:default
online         Jun_14   svc:/milestone/network:default
online         Jun_14   svc:/system/filesystem/local:default
online         Jun_14   svc:/milestone/name-services:default
online         Jun_14   svc:/system/system-log:default
online         Jun_14   svc:/milestone/multi-user:default
# svcs -l net-snmp
fmri         svc:/application/management/net-snmp:default
name         net-snmp SNMP daemon
enabled      false
state        disabled
next_state   none
state_time   June 19, 2012 01:50:37 PM NZST
logfile      /var/svc/log/application-management-net-snmp:default.log
restarter    svc:/system/svc/restarter:default
contract_id
manifest     /etc/svc/profile/generic.xml
manifest     /lib/svc/manifest/application/management/net-snmp.xml
dependency   require_all/none svc:/milestone/multi-user (online)
dependency   require_all/none svc:/system/filesystem/local (online)
dependency   optional_all/none svc:/milestone/name-services (online)
dependency   optional_all/none svc:/system/system-log (online)
dependency   optional_all/none svc:/network/rpc/rstat (disabled)
dependency   require_all/restart svc:/system/cryptosvc (online)
dependency   require_all/restart svc:/milestone/network (online)
dependency   require_all/refresh file://localhost/etc/net-snmp/snmp/snmpd.conf (online)

Listing 4. Checking the Initial State and Dependencies of a Service

The net-snmp service instance is initially disabled, but all its required dependencies are online (only one optional dependency, svc:/network/rpc/rstat, is disabled). Let's go ahead and enable this using the svcadm enable command:




# svcadm enable net-snmp
# svcs -p net-snmp
STATE          STIME    FMRI
online          9:33:40 svc:/application/management/net-snmp:default
                9:33:40     6062 snmpd

As we can see above, the /usr/sbin/snmpd agent daemon has now been started, and we can verify that the SNMP agent is working using the snmpwalk command, as shown in Listing 5.




# snmpwalk -v 1 -c public localhost
SNMPv2-MIB::sysDescr.0 = STRING: SunOS rampage 5.11 11.1 i86pc
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.3
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (186832) 0:31:08.32
SNMPv2-MIB::sysContact.0 = STRING: "System administrator"
SNMPv2-MIB::sysName.0 = STRING: rampage
SNMPv2-MIB::sysLocation.0 = STRING: "System administrators office"
SNMPv2-MIB::sysServices.0 = INTEGER: 72
SNMPv2-MIB::sysORLastChange.0 = Timeticks: (31) 0:00:00.31
SNMPv2-MIB::sysORID.1 = OID: SNMP-FRAMEWORK-MIB::snmpFrameworkMIBCompliance
SNMPv2-MIB::sysORID.2 = OID: SNMP-MPD-MIB::snmpMPDCompliance
SNMPv2-MIB::sysORID.3 = OID: SNMP-USER-BASED-SM-MIB::usmMIBCompliance
SNMPv2-MIB::sysORID.4 = OID: SNMPv2-MIB::snmpMIB
SNMPv2-MIB::sysORID.5 = OID: TCP-MIB::tcpMIB
SNMPv2-MIB::sysORID.6 = OID: IP-MIB::ip
SNMPv2-MIB::sysORID.7 = OID: UDP-MIB::udpMIB
SNMPv2-MIB::sysORID.8 = OID: SNMP-VIEW-BASED-ACM-MIB::vacmBasicGroup
SNMPv2-MIB::sysORDescr.1 = STRING: The SNMP Management Architecture MIB.
SNMPv2-MIB::sysORDescr.2 = STRING: The MIB for Message Processing and Dispatching.
...
NOTIFICATION-LOG-MIB::nlmConfigGlobalEntryLimit.0 = Gauge32: 1000
NOTIFICATION-LOG-MIB::nlmConfigGlobalAgeOut.0 = Gauge32: 1440 minutes
NOTIFICATION-LOG-MIB::nlmStatsGlobalNotificationsLogged.0 = Counter32: 0 notifications
NOTIFICATION-LOG-MIB::nlmStatsGlobalNotificationsBumped.0 = Counter32: 0 notifications

Listing 5. Verifying That the SNMP Agent is Working

Before we go any further, let's also take a quick look at SMF's ability to restart any processes in the event of a hardware or software failure. As we saw above, the /usr/sbin/snmpd agent daemon is running with a process ID of 6062. Let's kill that process and see what happens:




# kill -9 6062
# svcs -p net-snmp
STATE          STIME    FMRI
online          9:38:12 svc:/application/management/net-snmp:default
                9:38:12     6065 snmpd

We can see that the /usr/sbin/snmpd process has restarted with a new process ID of 6065 and the service is still online! Permanently disabling the service is also simple by using the svcadm disable command, as follows:




# svcadm disable net-snmp
# svcs net-snmp
STATE          STIME    FMRI
disabled        9:44:40 svc:/application/management/net-snmp:default
# snmpwalk -v 1 -c public localhost
Timeout: No response from localhost


If we had chosen to, we could also have disabled the service temporarily until the next reboot using the -t option. Each service in SMF is always in one of a few different states, as shown in Table 2.

Table 2. SMF Service States
State          Description
uninitialized  The initial state of every service instance, until its restarter (usually svc.startd) moves it to another state.
offline        The instance is enabled but not yet running or unable to run.
online         The instance is enabled and running.
maintenance    The instance is enabled but unable to run for some reason, and administrative action will be required.
disabled       The instance is disabled.
legacy_run     The service is not directly managed by SMF, but it was started at some point.

If, for any reason, we wanted to restart a service, we could use the svcadm restart command.

SMF Milestones

SMF milestones are services that aggregate multiple service dependencies and describe a specific state of system readiness on which other services can depend. Administrators can see the list of milestones that are defined by using the svcs command, as shown in Listing 6.




# svcs milestone*
STATE          STIME    FMRI
online         Jun_30   svc:/milestone/unconfig:default
online         Jun_30   svc:/milestone/config:default
online         Jun_30   svc:/milestone/devices:default
online         Jun_30   svc:/milestone/network:default
online         Jun_30   svc:/milestone/single-user:default
online         Jun_30   svc:/milestone/name-services:default
online         Jun_30   svc:/milestone/self-assembly-complete:default
online         Jun_30   svc:/milestone/multi-user:default
online         Jun_30   svc:/milestone/multi-user-server:default

Listing 6. Listing Milestones

Some of the above milestones correspond to the traditional system run levels S (svc:/milestone/single-user), 2 (svc:/milestone/multi-user), and 3 (svc:/milestone/multi-user-server). Others correspond to internal implementation of the system configuration framework, sysconfig. While changing milestones is possible with svcadm, it is recommended that administrators continue to use the init command.

Making Some Basic Configuration Changes to Services

From time to time, it might be necessary to modify some of the configuration behind a service instance. One of the significant changes in Oracle Solaris 11 is that some of the system configuration traditionally located in /etc was moved into the SMF configuration repository. One of the primary drivers for this change was to manage system upgrades more seamlessly: local configuration is preserved, while vendor-provided configuration can easily be merged in for any options that haven't been locally modified. To support this, the SMF configuration repository, managed by the svc.configd daemon, now stores its configuration in a series of layers: administrative customizations, followed by configuration provided through site profiles, system profiles, and manifests. We will cover this in more detail in another article.

At the heart of the configuration repository are property groups and properties. Property groups are exactly what they say they are—a set of properties that have been organized into a logical grouping. Within each property group, an arbitrary number of properties can exist storing a variety of different configuration types—simple strings, integers, Booleans, and network addresses, to name a few. Properties and property groups can be specific to a given service instance or global across all instances of a particular service. A property might have different values set on a parent service and a service instance, and the value from the service instance will take precedence.

Before we go into detail about how to make changes in the SMF repository, let's quickly look at the svcprop command and how we can use it to list property groups and properties of a given service or service instance. Listing 7 shows it being used with the svc:/network/dns/client:default instance.




# svcprop dns/client:default
general/complete astring
general/enabled boolean true
general/action_authorization astring solaris.smf.manage.name-service.dns.client
general/entity_stability astring Unstable
general/single_instance boolean true
general/value_authorization astring solaris.smf.manage.name-service.dns.client
config/value_authorization astring solaris.smf.value.name-service.dns.client
config/nameserver net_address 192.168.0.1
sysconfig/group astring naming_services
milestoneconfig_network_dns_client/entities fmri svc:/milestone/config
milestoneconfig_network_dns_client/external boolean true
milestoneconfig_network_dns_client/grouping astring optional_all
milestoneconfig_network_dns_client/restart_on astring none
milestoneconfig_network_dns_client/type astring service
location_dns-client/entities fmri svc:/network/location:default
...
restarter/state_timestamp time 1339662573.051463000
restarter_actions/auxiliary_tty boolean false
restarter_actions/auxiliary_fmri astring svc:/network/location:default
general_ovr/enabled boolean true

Listing 7. Listing Property Groups and Properties

In Listing 7, we are using svcprop without any other options, and we get a composed view by default—one that includes properties from both the parent service and the service instance. If we just wanted to look at the instance properties, we can use the -C option, as shown in Listing 8.




# svcprop -C dns/client:default
general/complete astring
general/enabled boolean true
restarter/logfile astring /var/svc/log/network-dns-client:default.log
restarter/start_pid count 572
restarter/start_method_timestamp time 1339662573.041262000
restarter/start_method_waitstatus integer 0
restarter/transient_contract count
restarter/auxiliary_state astring dependencies_satisfied
restarter/next_state astring none
restarter/state astring online
restarter/state_timestamp time 1339662573.051463000
restarter_actions/auxiliary_tty boolean false
restarter_actions/auxiliary_fmri astring svc:/network/location:default
general_ovr/enabled boolean true

Listing 8. Listing Only Instance Properties

If we want to focus on a particular property, we can use the -p option to specify the property group and property. In this case, we're going to find the config/nameserver property on the service rather than on the service instance. This property is used as a replacement for the legacy /etc/resolv.conf file in previous versions of Oracle Solaris, though the value is mirrored to that file for compatibility with applications that might be parsing it.




# svcprop -p config/nameserver dns/client
192.168.0.1
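Every property name in the svcprop listings above has the form property-group/property. As a rough illustration of that structure, the following sketch reduces svcprop output to its distinct property groups. The helper name list_property_groups is hypothetical, and the sketch assumes that the first whitespace-separated column of each line is the group/property name, as in Listing 7.

```shell
# Hypothetical helper: read svcprop output on standard input and print
# each distinct property group once, sorted.
list_property_groups() {
    awk '{ split($1, p, "/"); print p[1] }' | LC_ALL=C sort -u
}

# Intended use on a Solaris system:
#   svcprop dns/client:default | list_property_groups
```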

Now that we've seen how to query properties, let's take a look at another command, svccfg, that we can use to set properties. svccfg provides a number of different ways to set properties: directly on the command line, through an interactive text-based interface, or through a text editor. Let's keep with our svc:/network/dns/client example and see how easy it is to set the name server configuration.




# svccfg -s dns/client setprop config/nameserver = 10.0.0.1
# svccfg -s dns/client listprop config/nameserver
config/nameserver net_address 10.0.0.1

Changes made to an existing service in the repository typically do not take effect until the service instance has been refreshed.




# svcprop -p config/nameserver dns/client
192.168.0.1
# svcadm refresh dns/client:default
# svcprop -p config/nameserver dns/client
10.0.0.1
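Since setting a property and refreshing the instance usually go together, the two steps can be wrapped in a small shell function. This is only a sketch: set_and_refresh is a hypothetical name, and it simply chains the same svccfg setprop and svcadm refresh invocations used above, stopping if the first one fails.

```shell
# Hypothetical wrapper: set a property on a service, then refresh one of
# its instances so that the running configuration picks up the change.
# Usage: set_and_refresh <service> <instance> <group/property> <value>
set_and_refresh() {
    svc=$1
    inst=$2
    prop=$3
    value=$4
    # Same form as: svccfg -s dns/client setprop config/nameserver = 10.0.0.1
    svccfg -s "$svc" setprop "$prop" = "$value" || return 1
    # Same form as: svcadm refresh dns/client:default
    svcadm refresh "$svc:$inst"
}

# Intended use on a Solaris system:
#   set_and_refresh dns/client default config/nameserver 10.0.0.1
```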


Equally, we could have used the interactive interface to make these changes. Let's change the value of config/nameserver back to what it was originally, 192.168.0.1, as shown in Listing 9.




# svccfg
svc:> select dns/client
svc:/network/dns/client> listprop config/nameserver
config/nameserver net_address 10.0.0.1
svc:/network/dns/client> describe config/nameserver
config/nameserver net_address 10.0.0.1
    The value used to construct the "nameserver" directive in resolv.conf(4)
svc:/network/dns/client> setprop config/nameserver = 192.168.0.1
svc:/network/dns/client> listprop config/nameserver
config/nameserver net_address 192.168.0.1
svc:/network/dns/client> select default
svc:/network/dns/client:default> refresh
svc:/network/dns/client:default> exit

Listing 9. Using the Interactive Interface

svccfg supports a number of other useful commands, such as listpg to list property groups on a given service, editprop to open up a text editor to more easily allow configuration of multiple properties at the same time, and extract to allow administrators to easily capture service customizations as an XML file that can be applied on other systems. We will cover more of these in another article.

Monitoring the State of Services

One of the new features added to SMF in Oracle Solaris 11 is the ability to monitor the state of services and get notified if they change, either through e-mail or SNMP traps. Notifications can be quickly set to check if any SMF services go into maintenance mode or if a particular service goes online, for example. As a quick example, let's set an e-mail notification to be sent anytime an SMF service goes into maintenance mode, as shown in Listing 10.




# svccfg setnotify -g maintenance mailto:admin@mycompany.com
# svccfg listnotify -g
    Event: to-maintenance (source: svc:/system/svc/global:default)
        Notification Type: smtp
            Active: true
            to: admin@mycompany.com

    Event: from-maintenance (source: svc:/system/svc/global:default)
        Notification Type: smtp
            Active: true
            to: admin@mycompany.com

Listing 10. Example of Setting a Notification

By default, SMF will use an existing simple e-mail template to fill in the values of any SMF service that has gone into or out of the maintenance state; however, this can be modified easily by setting a parameter, msg_template, in the mailto: address, as follows:

# svccfg setnotify -g maintenance "'mailto:admin@mycompany.com?msg_template=/usr/local/share/new-smf-email-template'"

We can also monitor individual services. In this case, let's monitor the svc:/network/http:apache22 Apache Web server default instance for any changes away from its current online state:




# svcs http:apache22
STATE          STIME    FMRI
online         Jun_14   svc:/network/http:apache22
# svccfg -s http:apache22 setnotify from-online mailto:admin@mycompany.com
# svccfg -s http:apache22 listnotify
    Event: from-online (source: svc:/network/http:apache22)
        Notification Type: smtp
            Active: true
            to: admin@mycompany.com


Troubleshooting

Now that we have covered some of the basics of administration with SMF, let's quickly take a look at some of the things we can do to troubleshoot what might be wrong with a service. To quickly get an idea of what services are not running due to errors, we can use the -xv options to svcs, as shown in Listing 11.




# svcs -xv
svc:/system/identity:node (system identity (nodename))
 State: disabled since June 22, 2012 08:11:14 PM NZST
Reason: Disabled by an administrator.
   See: http://sun.com/msg/SMF-8000-05
   See: man -M /usr/share/man -s 4 nodename
   See: /var/svc/log/system-identity:node.log
Impact: 5 dependent services are not running:
        svc:/network/rpc/bind:default
        svc:/network/rpc/gss:default
        svc:/system/filesystem/autofs:default
        svc:/network/rpc/smserver:default
        svc:/network/nfs/mapid:default

Listing 11. Determining Which Services Have Errors

In this case, we have a simple problem: svc:/system/identity:node has been disabled, which prevents five dependent services from running. Enabling it fixes the problem.

Another reason for failure might be a missing configuration file, as in this example with svc:/application/management/net-snmp:default:




# svcs -xv
svc:/application/management/net-snmp:default (net-snmp SNMP daemon)
 State: offline since June 22, 2012 08:17:28 PM NZST
Reason: Dependency file://localhost/etc/net-snmp/snmp/snmpd.conf is absent.
   See: http://sun.com/msg/SMF-8000-E2
   See: man -M /usr/share/man/ -s 8 snmpd
   See: /var/svc/log/application-management-net-snmp:default.log
Impact: This service is not running.

Once we have fixed the problem (by ensuring that the snmpd.conf file exists), we need to restart the service.

Another failure might be due to an incorrect configuration file or missing executables, as is the case here with svc:/network/http:apache22:




# svcs -xv
svc:/network/http:apache22 (Apache 2.2 HTTP server)
 State: maintenance since June 22, 2012 08:23:35 PM NZST
Reason: Method failed.
   See: http://sun.com/msg/SMF-8000-8Q
   See: man -M /usr/apache2/2.2/man -s 8 httpd
   See: http://httpd.apache.org
   See: /var/svc/log/network-http:apache22.log
Impact: This service is not running.

In this case, it's not clear from a quick summary of the error what the fault is; however, it's clear that the service is now in maintenance state requiring explicit administrative intervention. The next logical step is to look at the service log located at /var/svc/log/network-http:apache22.log, as shown in Listing 12, which soon reveals the problem.




# tail /var/svc/log/network-http\:apache22.log
[ Jun 22 20:22:34 Method "stop" exited with status 0. ]
[ Jun 22 20:22:34 Executing start method ("/lib/svc/method/http-apache22 start"). ]
Apache version is 2.2
[ Jun 22 20:22:35 Method "start" exited with status 0. ]
[ Jun 22 20:23:35 Stopping because service restarting. ]
[ Jun 22 20:23:35 Executing stop method ("/lib/svc/method/http-apache22 stop"). ]
Apache version is 2.2
/usr/apache2/2.2/bin/apachectl[86]: /usr/apache2/2.2/bin/httpd: not found [No such file or directory]
Server failed to start. Check the error log (defaults to /var/apache2/2.2/logs/error_log) for more information, if any.
[ Jun 22 20:23:35 Method "stop" exited with status 95. ]

Listing 12. Checking the Service Log

We can easily see that our system is missing the /usr/apache2/2.2/bin/httpd executable file. This can be fixed easily by restoring the missing file using the IPS package manager with a pkg fix apache-22 command. Once we have identified and fixed the problem, we need to clear the state of the SMF service:




# svcadm clear http:apache22
# svcs http:apache22
STATE          STIME    FMRI
online         20:34:04 svc:/network/http:apache22
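In a longer service log, the interesting entries are usually the methods that exited with a nonzero status. The following sketch filters a log for those entries. It assumes the bracketed 'Method "name" exited with status N.' format seen in Listing 12, and failed_methods is a hypothetical helper name.

```shell
# Hypothetical helper: read an SMF service log on standard input and print
# only the lines that report a method exiting with a nonzero status.
failed_methods() {
    awk '/Method ".*" exited with status/ {
        status = $(NF - 1)        # for example "95." (the last field is "]")
        sub(/\.$/, "", status)    # drop the trailing period
        if (status + 0 != 0) print
    }'
}

# Intended use on a Solaris system:
#   failed_methods < /var/svc/log/network-http:apache22.log
```

Applied to the log excerpt in Listing 12, this would print only the final line, which reports the stop method exiting with status 95.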


While some of the examples above have relied on checking the output of the status of a service through svcs or the service log located in /var/svc/log, sometimes you will need to check the log of the SMF restarter for that service, either svc.startd or a delegated restarter. In the case of the former, the log can be found at /var/svc/log/svc.startd.log. In another article, we will cover other troubleshooting tips in case the tips above don't work.
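The troubleshooting cases in this section can be summarized as a simple lookup from service state to a likely next step. This is a rough sketch rather than a complete decision procedure: suggest_action is a hypothetical helper, and the suggestions only restate the commands used in the examples above.

```shell
# Hypothetical helper: map a service state to the next step used in the
# examples in this article. "<fmri>" is a placeholder for the service.
suggest_action() {
    case $1 in
        maintenance) echo "fix the cause, then run: svcadm clear <fmri>" ;;
        disabled)    echo "svcadm enable <fmri>" ;;
        offline)     echo "check dependencies with: svcs -xv <fmri>" ;;
        online)      echo "no action needed" ;;
        *)           echo "unrecognized state: $1" ;;
    esac
}
```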

Summary

The Service Management Facility (SMF) provides a number of benefits for administrators managing system services and applications on Oracle Solaris 11, including automatic service restart, consolidated service configuration, and integration into the fault management framework. Unlike the legacy init system, administrators manage services—rather than processes—with full service dependency checking and parallel service startup, leading to a more consistent system state and more manageability.

About the Author

Glynn Foster is a Principal Product Manager for Oracle Solaris and works on technology areas that include the Image Packaging System and Service Management Facility. Glynn joined Oracle in 2010 as part of the Sun Microsystems acquisition.

Revision 1.0, 08/06/2012