The Oracle+PHP Cookbook: Improving Performance Through Persistent Connections

The concept of persistent database connections is a question mark for many PHP developers. When exactly is it a wise decision to use them?
by John Coggeshall
Published February 2006

Over the years, in my work as a technical consultant specializing in PHP, I have been asked about the appropriateness of persistent database connections more times than I can remember. After all, many of the most popular database extensions for PHP have functionality to establish such a connection.

To really understand the answer to this question, you need to understand the lifecycle of PHP script execution and how your Web server affects that lifecycle—and that's exactly where I'll begin this recipe.

Downloads for this article

How Apache and PHP Serve Requests

The most common PHP setup is PHP running as a module within an Apache Web server—generally, Apache 1.3.x (although PHP will run fine in the latest versions of Apache as well)—as long as you're running newer Apache 2.0+ versions in pre-fork mode. The seemingly unrelated fact that PHP must run in pre-fork mode is the framework for understanding the pros and cons of persistent database connections. Not sure what I mean? Let's take a look at the way Apache works.

When a client such as a Web browser connects to a Web site served by Apache, a process is dispatched to handle that request. In Apache 1.3.x, each of these processes (called "child" processes) are mutually independent. That is to say, unlike a thread-based model, each process has a completely independent memory and execution space in the server. This approach has long been considered a benefit to Apache because it makes the entire server incredibly stable—if a particular request causes Apache to crash, it will be far more likely to crash only the particular, individual child that handled the request.

This process is illustrated in Figure 1, where red indicates a segmentation fault caused by a particular HTTP request.

Figure 1 Segmentation fault in action

While this architecture is clearly a benefit for ensuring a stable Web server environment, the side effect of this "sandboxing" of each individual HTTP request is that everything regarding the request is by necessity also isolated from other requests. Because most PHP setups (by recommendation) are done running PHP as a module within Apache (mod_php), PHP scripts running in this pre-fork model themselves are subject to the same mutual isolation of requests. Thus, all variables and resources available within PHP are allocated and destroyed within the context of a single Apache child—and this is where persistent Oracle connections start coming into play.

Because all PHP resources are defined within the context of a single Apache child, when PHP establishes a database connection to an Oracle database the only place that connection exists is within that specific child. Thus, when a executing the same PHP script that makes a database connection concurrently for multiple different clients, each child must establish its own connection. (See Figure 2.)

Figure 1 Concurrent database connections across multiple clients

If each child process is executing the same PHP script, you can imagine the inefficiencies the pre-fork Apache model imposes on PHP as well as Oracle, as theoretically only one database connection was needed for all four children. Furthermore, even with four individual connections to the database using the standard connection methods provided by the Oracle APIs for PHP, this connection is broken and re-established for every single request. Even if each child must have its own connection, re-establishing the connection every request seems quite unnecessary for the vast majority of cases—and thus PHP attempts to address this issue through the notion of persistent database connections. This approach, unfortunately, only solves part of the problem—while persistent connections prevent the connection from being broken on every request, the PHP resource that is persisted across requests still only exists within a single child.

(Hint: If you are curious about how many total connections can occur to your Oracle database, take a look at the MaxClients configuration directive in your Apache Web server. Assuming your PHP scripts make X connections to your database per request, Oracle should be prepared to accept MaxClients * X database connections under peak load.)

Keeping Apache From Killing Off Your Database Connections

With the knowledge of how PHP, Apache, and Oracle interact, you are now equipped to make the most of every single connection to the database, as establishing one can easily be the most expensive operation when working with the database per request.

At first glance, apparently not much can be done to improve performance other than to persist the connection within a single child. However, making this assumption would be a mistake—the trick isn't to optimize PHP, but rather your server architecture itself.

Unless you are running some sort of remote procedure call (RPC) server that handles nothing but RPC requests, your Apache servers are probably doing more then just serving PHP-enabled pages that make use of the database connection. In fact, it's almost guaranteed that for every single "page" request (loading the index.php Web page from the browser), the browser may make 10, 20, or even 30 individual requests to the Web server for images, flash files, or other embedded documents within frames (and so on). Depending on how your servers are configured, this fact can render any benefit of using persistent connections effectively moot. Fine-tuning your Web servers running PHP to make the most of every single database connection request is critical, so let's take a look at the relevant Apache 1.3.x configuration settings: KeepAlive, MaxRequestsPerChild, and MaxSpareServers.

Finding Where and When Apache Kills Your Database Connections

Now that you have an idea about what's going on with the Web server as it relates to PHP and persistent connections, let's take a look at how you can improve things. The basic goal here is to maximize the number of times a persistent connection is used before it is closed implicitly by Apache itself.

To begin, take a look at the MaxRequestsPerChild configuration directive. This is an important directive when working with persistent connections, because it defines, as its name implies, the maximum number of HTTP requests a single Apache child can serve before the process is killed and restarted. While under ideal circumstances this setting is set to zero (meaning each child accepts an unlimited number of connections), if anything running within your Web server is unstable, leaking memory, and so on, this mechanism must be used to keep the server in top shape. Thus, if your children are being killed after X requests, then X-1 is the total number requests by which you can take advantage of a persistent database connection. This is the first directive you should look for, as it has the most direct impact on your persistent connection performance.

Similarly, the MaxSpareServers configuration directive is designed to "throttle back" the number of resources used on your Web server during low load times by killing off unnecessary Apache child processes that are not actually handling requests. Depending on the type of load patterns you experience on your site, Apache—although it does tend to manage this process somewhat intelligently—may very well be killing off children, and hence database connections, prematurely. For instance, for Web servers that experience frequent spikes in traffic, it makes sense to have significantly more spare child processes running, while servers that experience a heavy sustained followed by lulls for extended periods can kill them off to free up resources for other tasks.

Making the Most of Each Connection

Now that we've established the two primary means by which Apache kills off its children, and thus any persistent connections you may have established, let's now take a look at ways to optimize this behavior. While it is not always possible to simply not kill off children (especially if your child processes tend to leak memory over time), properly configuring Apache can make a big difference. Here are a few things you can do to improve performance.

Turn off MaxRequestsPerChild. Unless your child processes are leaking memory or otherwise breaking over time, there is no reason to set a request lifetime on your Apache children—he default value of zero (unlimited) is the best choice here. If you absolutely must set it to something other than zero, be as liberal as possible to make the most of your persistent connection.
Be liberal with spare processes. In almost every case, you will have one or more Apache Web servers running on dedicated machines. As such there is nothing wrong with taking a liberal approach to the number of "spare" servers you have laying around waiting for requests. After all, you aren't paying them by the hour, and keeping them alive allows you to avoid an expensive reconnect process later.
Use KeepAlive. If for whatever reason you must cap the maximum number of requests per child process, you can still make better use of your persistent connections by enabling the KeepAlive directive in your Apache configuration. A "keep alive" request means that a connecting client can open a single connection to a single child process and request multiple documents before closing. While they are both called a "request" (and indeed, multiple document requests are made to the server), Apache only counts this as a single request when the MaxRequestsPerChild configuration directive is involved. Thus, KeepAlive will help you preserve your persistent connection as long as possible before forcing a re-connect when the child dies and another takes its place.
Isolate PHP to its own Web farm. Another trick for improving performance, for your overall site as well as your Oracle database, is to isolate Web servers running PHP scripts from those serving static content. Remember, Apache doesn't distinguish a PHP script request from any other request, so if the same servers which are executing your PHP code are also serving static images, you'll run through even the most liberal MaxRequestsPerChild settings quickly. Rather, even if it's on the same machine, move serving of non-PHP resources to a separate Apache Web server and use mod_proxy to transparently serve them as usual. You'll not only improve performance overall (since now you can really streamline your static servers for optimal serving of static files), but you also will make your persistent connections as close to maximum performance as possible.

Optimizing the OCI Extension

Zend Core for Oracle (which includes a refactoring of the OCI extension for PHP) offers a number of new management tools for making the most of your persistent connections. These tools come in the form of a number of PHP configuration directives that you can use in the php.ini file.

oci8.max_persistent. The oci8.max_persistent configuration directive allows you to control how many persistent connections PHP will allow to the database server per process. In a properly designed application, it is unlikely you would need multiple persistent connections to the same server and such actions should be considered likely to be a bug. Using this setting you can help flush out these excessive connections to your database and thereby improve overall performance.
oci8.persistent_timeout. The oci8.persistent_timeout configuration directive allows you to control how long a persistent connection will be maintained in an idle state prior to being closed automatically. Note that Oracle itself can also control this behavior via user profiles or the Oracle Net layer.
oci8.ping_interval. The oci8.ping_interval configuration directive allows you to control how many seconds must pass before PHP will check the status of an existing persistent connection before attempting to use it to perform queries. The longer this interval, the more efficiently the connection will be established--however, be aware that this efficiency comes at the cost of being more likely to provide a bad database connection to the PHP application if the connection is lost between requests.

Conclusion

You now have everything you need to make the most of each and every connection made to your Oracle database—and when to use (or not use) persistent connections. While the information here is specific to database connections, any persistent resource following the same pattern can be optimized using the techniques described. John Coggeshall is a Senior Technical Consultant for Zend Technologies where he provides professional services to clients around the world. He got started with PHP in 1997 and is the author of three published books and over 100 articles on PHP technologies with some of the biggest names in the industry such as Sams Publishing, Apress, and O'Reilly. John also is a active contributor to the PHP core as the author of the tidy extension, a member of the Zend Education Advisory Board, and frequent speaker at PHP-related conferences worldwide.