The Java NIO.2 File System in JDK 7

By Janice J. Heiss and Sharon Zakhour, May 2009

JSR 203, a major feature of JDK 7 under the leadership of Sun software engineer Alan Bateman as an OpenJDK project, contains three primary elements that offer new input/output (I/O) APIs for the Java platform:

  • An extensive File I/O API system addresses feature requests that developers have sought since the inception of the JDK.
  • A socket channel API addresses multicasting, socket binding associated with channels, and related issues.
  • An asynchronous I/O API enables mapping to I/O facilities, completion ports, and various I/O event port mechanisms to enhance scalability.

This article provides a basic overview of the first element, the File I/O API. The abbreviation NIO generally refers to new I/O APIs that allow for I/O operations in Java technology. The java.nio, java.nio.channels, and java.nio.charset packages have been in existence since the inclusion of JSR 51 in Java version 1.4.* JSR 203 adds NIO.2 in JDK 7.

In NIO.2, the file system API is contained in a new package, java.nio.file, with two subpackages. The java.nio.file.attribute subpackage supports bulk access to file attributes. The java.nio.file.spi subpackage is a service provider interface (SPI) for pluggable file system implementations, designed for advanced developers who wish to create their own providers.

The Need for Java NIO.2 File Revision

The Java I/O File API, as it was originally created, presented challenges for developers. It was not initially written to be extended. Many of its methods reported failure through a return value rather than by throwing an I/O exception, which resulted in considerable frustration for developers. Applications often failed during file deletion, leaving developers confused as to why no useful error message had been generated. The rename method behaved inconsistently across volumes and file systems: Some files were easily renamed, but others were not. There was no efficient way to read several pieces of file metadata at once. And developers wanted greater access to metadata such as file permissions, as well as more efficient file copy support and file change notification.

Developers also requested the ability to develop their own file system implementations by, for example, keeping a pseudofile system in memory, or by formatting files as zip files.

The Java NIO.2 packages address these and other needs.

The Path Class Operations

In java.nio.file, the class that most developers will use is the Path class, java.nio.file.Path, which in the new API is the equivalent of the java.io.File class that was created in Java version 1.0. A path is a file reference that locates a file using a system-dependent path. In other words, it is a path to a file in the file system. The file itself is not required to exist.

"Most developers will likely use the Path class and little else," says project lead Alan Bateman. "Think of Path as the equivalent of java.io.File in the new API. Interoperability with java.io.File and existing code is achieved using the toPath method, so existing code can be retrofitted to use the new API without many changes."

"We've added this method so that if an application is failing to delete and the developer does not know why, the new API can construct a path to the file object and the developer can invoke its delete method. The new delete method throws an I/O exception and tells you exactly why the file could not be deleted."
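The following is a minimal sketch of the behavior described above, written against the API as it shipped in JDK 7, where delete is a static method of the java.nio.file.Files class rather than a method of Path. The class name DeleteDemo and the helper explainDelete are hypothetical names chosen for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

class DeleteDemo {
    // Tries to delete the file behind a legacy java.io.File and reports the outcome.
    static String explainDelete(File legacy) {
        Path path = legacy.toPath();      // bridge from the old API to the new one
        try {
            Files.delete(path);           // throws on failure instead of returning false
            return "deleted";
        } catch (NoSuchFileException e) {
            return "no such file";
        } catch (DirectoryNotEmptyException e) {
            return "directory not empty";
        } catch (IOException e) {
            return "I/O error: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nio2-delete");
        Path file = Files.createFile(dir.resolve("report.txt"));
        System.out.println(explainDelete(dir.toFile()));   // dir still contains report.txt
        System.out.println(explainDelete(file.toFile()));
        System.out.println(explainDelete(file.toFile()));  // the file is already gone
        System.out.println(explainDelete(dir.toFile()));
    }
}
```

The old File.delete call would have returned false in each failing case, with no indication of which one had occurred.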

Two types of operations are available in java.nio.file.Path. First, syntactic operations allow developers to, among other things, manipulate paths, get a parent directory, extract path components, and iterate over the components of a path. Second, file operations use the path to locate a file and then act on it: create a file, open it for I/O, delete it, create a directory, and so on.

A path is either absolute or relative. An absolute path can be used to locate a file without requiring further information. An absolute path always contains the root element and the complete directory list required to locate the file. For example: /home/sally/statusReport is an absolute path. All the information needed to locate the file is contained in the path string.

A nonabsolute or relative path must be combined with other path information in order to locate a file. For example, joe/foo is a relative path. Without more information, a program cannot reliably locate the joe/foo directory on the file system.

"You can think of a path as similar to a string in its logic, though it's not exactly like a string, because it depends on the operating system you're on how the particular operation will function," explains Bateman. "Suppose we have a path of /home/alanb/foo. If I call its getParent, the parent directory, I get /home/alanb. You can work your way back to the root of the path by using something like the getParent method."
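The syntactic operations discussed above can be sketched as follows. The example assumes a UNIX-like default file system and uses Paths.get from the shipped JDK 7 API. The class PathSyntaxDemo and its ancestors helper are hypothetical. None of these calls touches the disk, so the paths need not exist.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class PathSyntaxDemo {
    // Collects every ancestor of a path by calling getParent until it returns null.
    static List<Path> ancestors(Path path) {
        List<Path> result = new ArrayList<>();
        for (Path p = path.getParent(); p != null; p = p.getParent()) {
            result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        Path absolute = Paths.get("/home/alanb/foo");
        Path relative = Paths.get("joe/foo");

        System.out.println(absolute.getParent());   // the parent directory
        System.out.println(ancestors(absolute));    // works back toward the root
        for (Path name : absolute) {                // iterates over the name elements
            System.out.println(name);
        }
        System.out.println(relative.isAbsolute());
        // A relative path becomes locatable once it is resolved against a known base.
        System.out.println(Paths.get("/home/sally").resolve(relative));
    }
}
```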

Directories in NIO.2

In the java.io.File API created for Java version 1.0, the list and listFiles methods returned an array of the names of files and directories. These methods did not scale to large directories, so in listing a large directory over a network, the list method might hang for long periods. If an application was serving multiple clients or getting directory lists, the virtual machine (VM) might run out of memory.

In Java NIO.2, a directory is opened as a stream that returns an iterator, which allows for greater scaling. A DirectoryStream is an object for iterating over the entries in a directory, and each element it returns represents one entry. When iteration is complete, the developer must invoke the stream's close method to release the underlying resources.

In addition, the platform representation of the file names in the directory is preserved so that the files can be accessed again -- this is very important where the file names are stored as sequences of bytes, for example. A further advantage to having a handle to an open directory is that it is possible to perform operations relative to the directory, something that is important for security-sensitive applications. When iterating over a directory, the entries can be filtered. The API has built-in support for glob and regex patterns to filter by name or to develop arbitrary filters.
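As a rough illustration of directory iteration with a glob filter, the sketch below uses Files.newDirectoryStream from the shipped JDK 7 API. The class DirectoryListingDemo and its list helper are hypothetical, and the result is sorted only to make the output stable, because iteration order is not specified.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class DirectoryListingDemo {
    // Returns the sorted names of the entries in dir whose file names match the glob.
    static List<String> list(Path dir, String glob) throws IOException {
        List<String> names = new ArrayList<>();
        // try-with-resources closes the stream even if iteration fails.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nio2-list");
        Files.createFile(dir.resolve("a.txt"));
        Files.createFile(dir.resolve("b.log"));
        Files.createFile(dir.resolve("c.txt"));
        System.out.println(list(dir, "*.txt"));
    }
}
```

Because entries are produced one at a time, a caller that only needs the first few matches can stop early without reading the whole directory.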

The FileVisitor Interface -- Developing Recursive Operations

"Suppose," says Bateman, "that you have a file tree and you'd like to do something on all of the files or a subset of the file tree. We have a utility method in the Files class called walkFileTree."

"If you provide a starting point and a file visitor, it will invoke various methods on the file visitor as it walks through the file in the file tree. We expect people to use this if they are developing a recursive copy, a recursive move, a recursive delete, or a recursive operation that sets permissions or performs another operation on each of the files."

When walking a file tree, developers encounter errors. Parts of the file tree may be inaccessible, or a link may exist to a file that does not exist or that is not mounted. "In NIO.2, these issues are all covered," said Bateman. "We have a visitFileFailed method and a preVisitDirectoryFailed method that notify you when a file cannot be visited or a directory is not opened. Various recovery actions become available. Each method returns a FileVisitResult."
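A recursive delete of the kind Bateman mentions might look like the sketch below. It is written against the FileVisitor interface as it shipped in JDK 7, where failure to open a directory is also reported through visitFileFailed. The class RecursiveDeleteDemo and its deleteTree helper are hypothetical, and the choice to skip unreadable entries is only one possible recovery action.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

class RecursiveDeleteDemo {
    // Deletes root and everything beneath it; returns how many entries were removed.
    static int deleteTree(Path root) throws IOException {
        final int[] deleted = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                deleted[0]++;
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Recovery choice for this sketch: report the problem and keep walking.
                System.err.println("skipped " + file + ": " + exc);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;            // iterating over the directory failed
                }
                Files.delete(dir);        // its entries have been visited and deleted
                deleted[0]++;
                return FileVisitResult.CONTINUE;
            }
        });
        return deleted[0];
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("nio2-walk");
        Files.createFile(Files.createDirectory(root.resolve("sub")).resolve("a.txt"));
        Files.createFile(root.resolve("b.txt"));
        System.out.println(deleteTree(root) + " entries deleted");
    }
}
```

Directories are deleted in postVisitDirectory rather than preVisitDirectory because a directory can be removed only after its entries are gone.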

Symbolic Links

Although most file system objects are directories or files, some systems support the notion of symbolic links, also referred to as a symlink or soft link. A symbolic link is a special file that serves as a reference to another file. For the most part, symbolic links are transparent to applications, and operations on symbolic links are automatically redirected to the file or directory being pointed to, the target of the link. However, when a symbolic link is deleted, removed, or renamed, it is the link itself that is deleted, removed, or renamed -- not the target of the link.

The java.nio.file API has full support for symbolic links based on the long-standing semantics of UNIX symbolic links -- something that Java developers have long requested. This works on Windows Vista and newer Windows operating systems as well. By default, symbolic links are followed with a couple of exceptions, such as move and delete. In a few cases, the application can specify an option to follow or not follow links. This is important when reading file attributes or walking file trees, for example.

A symbolic link is usually transparent to the user. Reading or writing to a symbolic link works like reading or writing to any other file or directory. Applicable methods in the API are constructed so that they have an option to configure what to do when encountering a symbolic link.

Occasionally, a created symbolic link can cause a circular reference, wherein the target of a link points back to the original link. The circular reference may be indirect -- for example, directory A points to directory B, which points to directory C, which contains a subdirectory pointing back to directory A. Circular references can cause havoc when a program is recursively walking a directory structure. The API protects against such scenarios by reporting loops.
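The link semantics described in this section can be sketched as follows, using the shipped JDK 7 API. The example assumes a file system and user account that permit creating symbolic links, which is typical on UNIX-like systems but may require extra privileges on Windows. The class SymlinkDemo and its describe helper are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

class SymlinkDemo {
    // Describes an existing path without following a final symbolic link.
    static String describe(Path path) throws IOException {
        if (Files.isSymbolicLink(path)) {
            return "link -> " + Files.readSymbolicLink(path);
        }
        return Files.isDirectory(path, LinkOption.NOFOLLOW_LINKS) ? "directory" : "file";
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nio2-links");
        Path target = Files.write(dir.resolve("target.txt"),
                "hello".getBytes(StandardCharsets.UTF_8));
        Path link = Files.createSymbolicLink(dir.resolve("link.txt"), target);

        System.out.println(describe(link));
        // Reading through the link is transparent: the bytes come from the target.
        System.out.println(new String(Files.readAllBytes(link), StandardCharsets.UTF_8));
        // Deleting the link removes only the link itself.
        Files.delete(link);
        System.out.println("target still exists: " + Files.exists(target));
    }
}
```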

The WatchService API and File Change Notification

The java.nio.file package has a WatchService API to support file change notification. The main goal here is to help with performance issues in applications that are currently forced to poll the file system. This midlevel API is relatively easy to customize and build on. Developers can use it as is or create a high-level API on top of it to suit their needs.

To implement a watch service, do the following:

  • Create a WatchService "watcher" for the file system.
  • For each directory that you want monitored, register the directory with the watcher. When you register a directory, specify the type of events for which you want to receive notification. You receive a WatchKey instance for each directory that you register.
  • Implement an infinite loop to wait for incoming events.
  • When an event occurs, the key is signaled and placed into the watcher's queue.
  • The key is retrieved from the watcher's queue. You can obtain the file name from the key.
  • Retrieve each pending event for the key -- there may be multiple events -- and process as needed.
  • Reset the key and resume waiting for events.
  • The WatchService will watch all registered objects until it is closed by invoking its close method.
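The steps above can be sketched as follows, using the shipped JDK 7 API. To keep the example finite, it waits for a single signaled key with a timeout instead of looping forever; a long-running application would typically call the watcher's take method inside a loop. The class WatchDemo and its nextEvents helper are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

class WatchDemo {
    // Waits up to timeoutSeconds for one signaled key and returns its events
    // as "KIND name" strings; returns an empty list on timeout.
    static List<String> nextEvents(WatchService watcher, long timeoutSeconds)
            throws InterruptedException {
        List<String> seen = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
        if (key == null) {
            return seen;
        }
        for (WatchEvent<?> event : key.pollEvents()) {   // a key may carry several events
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                continue;                                // events were lost or discarded
            }
            // For entry events, the context is the file name relative to the directory.
            seen.add(event.kind().name() + " " + event.context());
        }
        key.reset();                                     // re-arm the key for later events
        return seen;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        Path dir = Files.createTempDirectory("nio2-watch");
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_DELETE,
                    StandardWatchEventKinds.ENTRY_MODIFY);
            Files.createFile(dir.resolve("dropped.jar"));
            System.out.println(nextEvents(watcher, 30));
        }   // closing the watcher cancels its registrations
    }
}
```

The generous timeout allows for platforms where the watch service falls back to polling the file system.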

These WatchKeys are thread-safe and can be used with the java.util.concurrent package. Developers can dedicate a thread pool to this effort.

The WatchService API is designed for applications that need to be notified about file change events. It is well suited for any application, like an editor or integrated development environment (IDE) that potentially has many open files and needs to ensure that the files are in sync with the file system. It is also well suited for the application server that watches a directory, perhaps waiting for .jsp or .jar files to drop, in order to deploy them. It is not designed for indexing a hard drive.

Most file system implementations have native support for file change notification -- the WatchService API takes advantage of this where available. But when a file system does not support this mechanism, the watch service will poll the file system, waiting for events.

Two Security Models

The API supports two security models: the traditional POSIX or UNIX file permissions and the Access Control List (ACL) model, which is based on the NFSv4 ACL model. An implementation may also support additional or alternative security models.
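For the POSIX model, a minimal sketch using the shipped JDK 7 API is shown below. It assumes a default file system that supports the "posix" attribute view, which the code checks before doing anything. The class PosixPermissionsDemo and its helpers are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

class PosixPermissionsDemo {
    // True when the default file system exposes the "posix" attribute view.
    static boolean posixSupported() {
        return FileSystems.getDefault().supportedFileAttributeViews().contains("posix");
    }

    // Replaces the file's permissions with the given rwx string and
    // returns what the file system reports afterward.
    static String restrict(Path file, String rwx) throws IOException {
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString(rwx);
        Files.setPosixFilePermissions(file, perms);
        return PosixFilePermissions.toString(Files.getPosixFilePermissions(file));
    }

    public static void main(String[] args) throws IOException {
        if (!posixSupported()) {
            System.out.println("POSIX permissions are not supported here");
            return;
        }
        Path file = Files.createTempFile("nio2-perms", ".txt");
        System.out.println(restrict(file, "rw-r-----"));
    }
}
```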

An ACL is essentially a list of entries, with each entry consisting of four components. The type component indicates whether the entry grants or denies access. The principal component identifies whom the entry applies to -- for example, a group or user. The permissions component lists the permissions concerned, drawn from a set that is a superset of the POSIX permissions. A flags component indicates how entries are inherited and propagated.
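The four components map directly onto the AclEntry builder in the shipped JDK 7 API, as the sketch below suggests. It only constructs an entry and does not apply it to a file, because ACL support depends on the file system. The class AclEntryDemo and its allowRead helper are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.AclEntry;
import java.nio.file.attribute.AclEntryFlag;
import java.nio.file.attribute.AclEntryPermission;
import java.nio.file.attribute.AclEntryType;
import java.nio.file.attribute.UserPrincipal;

class AclEntryDemo {
    // Builds an entry that lets the principal read data and attributes,
    // and that is inherited by files created in a directory carrying it.
    static AclEntry allowRead(UserPrincipal who) {
        return AclEntry.newBuilder()
                .setType(AclEntryType.ALLOW)                         // grant rather than deny
                .setPrincipal(who)                                   // the user or group concerned
                .setPermissions(AclEntryPermission.READ_DATA,
                                AclEntryPermission.READ_ATTRIBUTES)  // what is granted
                .setFlags(AclEntryFlag.FILE_INHERIT)                 // how the entry propagates
                .build();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("nio2-acl", ".txt");
        UserPrincipal owner = Files.getOwner(file);
        System.out.println(allowRead(owner));
    }
}
```

To apply such an entry, a program would request an AclFileAttributeView for the file through Files.getFileAttributeView, which returns null when the file system does not support ACLs.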

File Attributes and the java.nio.file.attribute Package

The java.nio.file.attribute package provides access to file attributes or metadata, an area that is highly file-system specific. The package groups together the related attributes and defines a view of the commonly used ones. An implementation is required to support a basic view that defines attributes that are common to most file systems, such as file type, size, and time stamps. An implementation may also support additional views.

The package defines views for other common groups of attributes, including a subset of the POSIX attributes. For the most part, developers don't need to be concerned with this package but will instead use the static methods defined by the Attributes class for common cases.

Where possible, attributes are read in bulk, which enhances the performance of applications that need several attributes of the same file. Dynamic access is also supported, allowing attributes to be treated as name-value pairs, useful in avoiding compile-time dependencies on implementation-specific classes at the expense of type-safety. The API also allows initial file attributes to be set when creating files.
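The difference between bulk and dynamic access can be sketched as follows. The example uses the readAttributes methods of the Files class from the shipped JDK 7 API. The class AttributesDemo and its helpers are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Map;

class AttributesDemo {
    // Bulk, type-safe read: one call fetches the whole basic view.
    static String summarize(Path file) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        String type = attrs.isDirectory() ? "directory"
                : attrs.isRegularFile() ? "file" : "other";
        return type + ", " + attrs.size() + " bytes";
    }

    // Dynamic read: attributes are addressed by name and come back as untyped values,
    // so the caller must know (and cast to) the expected type.
    static long sizeByName(Path file) throws IOException {
        Map<String, Object> attrs = Files.readAttributes(file, "basic:size,lastModifiedTime");
        return (Long) attrs.get("size");
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("nio2-attrs", ".txt");
        Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(summarize(file));
        System.out.println(sizeByName(file));
    }
}
```

The cast in sizeByName is the type-safety cost mentioned above: a misspelled attribute name or wrong type is detected only at run time.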

The java.nio.file.spi Package

Only developers who are defining new file system providers or file type detectors need directly use the service provider interface (SPI) package known as java.nio.file.spi. In addition to customizing providers, it is possible to replace or interpose on the default provider. Developers who wish to extend the default provider can install their own provider that delegates to the default provider, something that might be useful for those who want to develop virtual file systems.

"The file system provider can be extended with a concrete implementation, so that it becomes the factory that creates the file systems, along with all the other objects in the factory, which allows you to access files," observes Bateman. "You can develop your own provider for such things as zip files and memory file systems -- people are doing interesting work with this."

Because the file system provider is stackable, it enables developers to interpose their own provider on top of the default provider and to delegate to the default provider. "That's interesting when you want to, for example, log all system operations or augment the default provider by adding or changing existing functionality," explains Bateman.
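Writing a provider is beyond a short example, but using a non-default provider is not. The sketch below assumes the zip file system provider that is bundled with Oracle and OpenJDK builds and registered under the "jar" URI scheme. It treats a zip file as a file system and reads and writes an entry with the same Files methods used for ordinary files. The class ZipFileSystemDemo and its helpers are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Map;

class ZipFileSystemDemo {
    // Opens the zip file as its own file system, creating it first if requested.
    static FileSystem openZip(Path zip, boolean create) throws IOException {
        URI uri = URI.create("jar:" + zip.toUri());
        Map<String, String> env = Collections.singletonMap("create", String.valueOf(create));
        return FileSystems.newFileSystem(uri, env);
    }

    // Writes text to an entry inside the zip; closing the file system flushes the archive.
    static void writeEntry(Path zip, String entry, String text) throws IOException {
        try (FileSystem fs = openZip(zip, true)) {
            Files.write(fs.getPath(entry), text.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Reads an entry back as text.
    static String readEntry(Path zip, String entry) throws IOException {
        try (FileSystem fs = openZip(zip, false)) {
            return new String(Files.readAllBytes(fs.getPath(entry)), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path zip = Files.createTempDirectory("nio2-zip").resolve("demo.zip");
        writeEntry(zip, "/notes.txt", "stored through a provider");
        System.out.println(readEntry(zip, "/notes.txt"));
    }
}
```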

The java.nio.file.spi package contains one other class, the FileTypeDetector, which is used to probe a file to guess its file type.
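A custom detector can be quite small, as the following sketch suggests. It guesses a content type from the file extension alone. The class ExtensionTypeDetector is hypothetical, and the content type strings it returns are illustrative choices rather than values mandated by the API.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.spi.FileTypeDetector;
import java.util.Locale;

class ExtensionTypeDetector extends FileTypeDetector {
    // Guesses a content type from the extension; null means "unknown",
    // which lets other installed detectors have a try.
    @Override
    public String probeContentType(Path path) throws IOException {
        Path name = path.getFileName();
        if (name == null) {
            return null;                          // for example, a root directory
        }
        String lower = name.toString().toLowerCase(Locale.ROOT);
        if (lower.endsWith(".jar")) {
            return "application/java-archive";
        }
        if (lower.endsWith(".jsp")) {
            return "text/x-jsp";                  // illustrative value
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        FileTypeDetector detector = new ExtensionTypeDetector();
        System.out.println(detector.probeContentType(Paths.get("app.jar")));
        System.out.println(detector.probeContentType(Paths.get("data.bin")));
    }
}
```

In a real deployment the detector would not be called directly. It would be listed in a META-INF/services/java.nio.file.spi.FileTypeDetector file so that the platform can locate it through the standard service-provider mechanism.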

Looking Ahead

So how does Java NIO.2 fit into the overall direction of JDK 7? Alan Bateman remarks, "The file system API will be a significant boon to applications that today are forced to resort to native code to do many basic file system operations. Finally, the platform has support for copying and moving files, symbolic links, and file permissions, and for many other basic features whose previous absence inhibited effective access to the file system."

* As used in this article, the term "Java version" refers to the Java Platform, Standard Edition. See Java SE Naming and Versions for additional information.
