Object Serialization: Frequently Asked Questions


    Questions About Serialization in the Latest JDK:

  • Why does the javadoc standard doclet generate many warnings about missing @serial and/or @serialData tags?

    In order to provide a developer with a final opportunity to acknowledge that a default serializable field should be serialized, javadoc emits a warning when the field is missing an @serial tag within a javadoc comment. The @serial tag acts as a confirmation that the default serializable field is appropriate to be serialized and that it is worth supporting this field for all future compatible versions of the class. To silence the warning, simply add an @serial tag within the javadoc comment above the field.

    An Externalizable.writeExternal method must have an @serialData tag to document its data layout or a warning is emitted that the tag is missing.

    See the javadoc reference page for a complete description of the three new serialization javadoc tags, @serial, @serialField, and @serialData. Also, see Section 1.6, "Documenting Serializable Fields and Data for a Class," of the Java Object Serialization Specification for additional information.
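    As a minimal sketch of where the tags go, the following uses two hypothetical classes (Account and GridPoint are illustrative names, not part of any library). The @serial tag sits in the comment of a default serializable field, and the @serialData tag sits in the comment of writeExternal.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.Serializable;

/** Hypothetical class with one documented default serializable field. */
class Account implements Serializable {
    private static final long serialVersionUID = 1L;

    /**
     * Account balance in cents.
     *
     * @serial Always non-negative.
     */
    private long balanceCents;

    Account(long balanceCents) { this.balanceCents = balanceCents; }

    long balanceCents() { return balanceCents; }
}

/** Hypothetical Externalizable class that documents its data layout. */
class GridPoint implements Externalizable {
    private int x;
    private int y;

    /** Public no-arg constructor, required for Externalizable classes. */
    public GridPoint() { }

    GridPoint(int x, int y) { this.x = x; this.y = y; }

    int x() { return x; }

    int y() { return y; }

    /**
     * Writes this point to the stream.
     *
     * @serialData The x coordinate is written as an {@code int}, followed by
     *             the y coordinate as an {@code int}.
     */
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }

    /** Reads the two coordinates in the order they were written. */
    @Override
    public void readExternal(ObjectInput in) throws IOException {
        x = in.readInt();
        y = in.readInt();
    }
}
```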

  • Why do I see javadoc warnings stating that I am missing @serial tags for private fields if I am not running javadoc with the "-private" switch?

    Default serializable fields are computed independent of accessibility. Therefore, private serializable fields still need an @serial tag and are documented in the serialized form. All serializable fields, regardless of accessibility, must be documented as part of the serialized form since these members are exposed outside of the JVM* and are part of the serialized form for the class.

    Questions About the Serialization Subsystem:

  • Why must classes be marked serializable in order to be written to an ObjectOutputStream?

    The decision to require that classes implement the java.io.Serializable interface was not made lightly. The design called for a balance between the needs of developers and the needs of the system to be able to provide a predictable and safe mechanism. The most difficult design constraint to satisfy was the safety and security of Java classes.

    If classes were to be marked as being serializable, the design team worried that a developer, either out of forgetfulness, laziness, or ignorance might not declare a class as being Serializable and then make that class useless for RMI or for purposes of persistence. We worried that the requirement would place on a developer the burden of knowing how a class was to be used by others in the future, an essentially unknown condition. Indeed, our preliminary design, as reflected in the alpha API, concluded that the default case for a class ought to be that the objects in the class be serializable. We later changed our design only after security and correctness considerations convinced us that the default had to be that an object not be serialized.

    Security Restrictions:
    The first consideration that caused us to change the default behavior of objects had to do with security, in particular the privacy of fields declared to be private, package protected, or protected. The Java runtime restricts access to such fields for either read or write to a subset of the objects within the runtime.

    No such restriction can be made on an object once it has been serialized; the stream of bytes that are the result of object serialization can be read and altered by any object that has access to that stream. This allows any object access to the state of a serialized object, which can violate the privacy guarantees that users of the language expect. Furthermore, the bytes in the stream can be altered in arbitrary ways, allowing reconstruction of an object that was never created within the protections of a Java environment. There are cases in which the recreation of such an object could compromise not only the privacy guarantees expected by users of the Java environment, but also the integrity of the environment itself.

    These violations cannot be guarded against, since the whole idea of serialization is to allow an object to be converted into a form that can be moved outside of the Java environment (and therefore outside of the privacy and integrity guarantees of that environment) and then be brought back into the environment. Requiring objects to be declared Serializable does mean that the class designer must make an active decision to allow the possibility of such a breach in privacy or integrity. A developer who does not know about serialization should not be open to compromise because of this lack of knowledge. In addition, the developer who declares a class to be Serializable must only do so after giving some thought to the possible consequences of that declaration.

    Note that this sort of security problem is not one that can be dealt with by the mechanism of a security manager. Since serialization is intended to allow the transport of an object from one virtual machine to some other (either over space, as it is used in RMI, or over time, as when the stream is saved to a file), the mechanisms used for security need to be independent of the runtime environment of any particular virtual machine. We wanted to avoid as much as possible the problem of being able to serialize an object in one virtual machine and not being able to deserialize that object in some other virtual machine. Since the security manager is part of the runtime environment, using the security manager for serialization would have violated this requirement.

    Forcing a Conscious Decision:
    While security concerns were the first reason for considering the design change, a reason that we feel is at least as convincing is that serialization should only be added to a class after some design consideration. It is far too easy to design a class that falls apart under serialization and reconstruction. By requiring a class designer to declare support for the Serializable interface, we hoped that the designer would also give some thought to the process of serializing that class.

    Examples are easy to cite. Many classes deal with information that only makes sense in the context of the runtime in which the particular object exists; examples of such information include file handles, open socket connections, security information, etc. Such data can easily be dealt with by simply declaring the fields as transient, but such a declaration is only necessary if the object is going to be serialized. A novice (or forgetful, or hurried) programmer may neglect to mark fields as transient in much the same way he or she may neglect to mark the class as implementing the Serializable interface. Such a case should not lead to incorrect behavior; the way to avoid this is to not serialize objects that are not marked as implementing Serializable.

    Another example of this sort is the "simple" object that is the root of a graph which spans a large number of objects. Serializing such an object could result in serializing several others, since serialization works over an entire graph. Doing something like this should be a conscious decision, not one that happens by default.

    The need for this sort of thought was brought home to us when we were going through the base Java class libraries marking the system classes as Serializable (where appropriate). Originally, we thought that this would be a fairly simple process, and that most of the system classes could just be marked as implementing Serializable and then use the default implementation with no other changes. What we found was that this was far less often the case than we had suspected. In a large number of the classes, careful thought had to be given to whether or not a field should be marked as transient or whether it made sense to serialize the class at all.

    Of course, there is no way to guarantee that a programmer or class designer is actually going to think about these issues when marking a class as Serializable. However, by requiring the class to declare itself as implementing the Serializable interface, we do require that some thought be given by the programmer. Having serialization be the default state of an object would mean that lack of thought could cause bad effects in a program, something that the overall design of the Java programming environment has attempted to avoid.
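    The transient case mentioned above can be sketched as follows. Session and its openHandle field are hypothetical; a plain Object stands in for a file handle or socket that is only meaningful in the VM that created it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    /** Hypothetical class mixing plain data with runtime-only state. */
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;

        final String user;

        // Stand-in for a file handle or socket. Without "transient",
        // writeObject would fail because Object is not Serializable.
        transient Object openHandle = new Object();

        Session(String user) { this.user = user; }
    }

    /** Serializes the session to memory and reads it back. */
    static Session roundTrip(Session s) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(s);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buf.toByteArray()))) {
            return (Session) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Session copy = roundTrip(new Session("duke"));
        System.out.println(copy.user);       // prints "duke"
        System.out.println(copy.openHandle); // prints "null"
    }
}
```

    The transient field comes back as null because it is skipped on write and the class's own constructor and field initializers are not run on read, so a class in this position has to re-create such state itself after deserialization.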

  • When a Serializable object is written with writeObject, then modified and written a second time, why is the modification missing when the stream is deserialized?

    The ObjectOutputStream class keeps track of each object it serializes and sends only the handle if the object is written into the stream a subsequent time. This is the way it deals with graphs of objects. The corresponding ObjectInputStream keeps track of all of the objects it has created and their handles so when the handle is seen again it can return the same object. Both output and input streams keep this state until they are freed.

    Alternatively, the ObjectOutputStream class implements a reset method that discards the memory of having sent an object, so sending an object again will make a copy.
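    A small sketch of this behavior, using an int array as the object that is modified between writes (ResetDemo and secondCopy are hypothetical names):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class ResetDemo {
    /**
     * Writes an array, modifies it, writes it again, and returns the
     * second object read back from the stream.
     */
    static int[] secondCopy(boolean resetBetweenWrites)
            throws IOException, ClassNotFoundException {
        int[] data = {1};
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(data);   // full contents are written
            data[0] = 2;             // modify the already-written object
            if (resetBetweenWrites) {
                out.reset();         // forget that "data" was sent
            }
            out.writeObject(data);   // handle only, unless reset was called
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buf.toByteArray()))) {
            in.readObject();
            return (int[]) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(secondCopy(false)[0]); // prints 1: modification is missing
        System.out.println(secondCopy(true)[0]);  // prints 2: a fresh copy was written
    }
}
```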

  • Why is OutOfMemoryError thrown after writing a large number of objects into an ObjectOutputStream?

    The ObjectOutputStream maintains a table mapping objects written into the stream to a handle. The first time an object is written to a stream, its contents are written into the stream; subsequent writes of the object result in a handle to the object being written into the stream. This table maintains references to objects that might otherwise be unreachable by the application, which can result in unexpectedly running out of memory. A call to the ObjectOutputStream.reset() method resets the object/handle table to its initial state, allowing all previously written objects to be eligible for garbage collection.
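    One way to apply this when writing many objects is to call reset() at a fixed interval. The sketch below assumes a hypothetical helper, BulkWriter.writeAll, and an interval chosen by the caller; neither is part of the serialization API.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class BulkWriter {
    /**
     * Writes every record, clearing the stream's object/handle table after
     * each group of resetInterval objects so that earlier records can be
     * garbage collected.
     */
    static void writeAll(ObjectOutputStream out,
                         Iterable<? extends Serializable> records,
                         int resetInterval) throws IOException {
        int sinceReset = 0;
        for (Serializable record : records) {
            out.writeObject(record);
            if (++sinceReset >= resetInterval) {
                out.reset();
                sinceReset = 0;
            }
        }
        out.flush();
    }
}
```

    The trade-off is that an object written both before and after a reset is sent twice and is read back as two separate objects, so the interval should not split a group of records that must share references.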

  • How do I get the serialVersionUID of an array class?

    Run the serialver tool, supplying the name of the class, as shown in the example that follows:

    serialver "[Ljava.lang.String;"
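    The same value can also be looked up programmatically. This is a sketch using java.io.ObjectStreamClass; the ArraySuid class and its of method are hypothetical names used for illustration.

```java
import java.io.ObjectStreamClass;

class ArraySuid {
    /** Returns the serialVersionUID of a serializable class, including array classes. */
    static long of(Class<?> type) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(type);
        if (desc == null) {
            // lookup returns null when the class is not serializable
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return desc.getSerialVersionUID();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // The array class name has the same form as the serialver argument.
        Class<?> stringArray = Class.forName("[Ljava.lang.String;");
        System.out.println(stringArray.getName() + ": " + of(stringArray) + "L");
    }
}
```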

  • Why is UTFDataFormatException thrown by DataOutputStream.writeUTF() when serializing a String?

    DataOutputStream.writeUTF() does not support writing strings whose encoded form is longer than 65535 bytes (64K). The first two bytes of a UTF string in the stream hold the length of the string, so a longer string cannot be represented. If a java.lang.String can be larger than 64K, it needs to be stored in the stream by an alternative method rather than depending on writeUTF, the default method of storing a String in the stream.
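    One possible alternative method, sketched here under the assumption that both writer and reader agree on the format, is to write a four-byte length followed by the UTF-8 bytes. LongStrings is a hypothetical helper, and the resulting bytes are not compatible with readUTF.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class LongStrings {
    /** Writes the string as an int byte count followed by its UTF-8 bytes. */
    static void write(DataOutput out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    /** Reads a string that was written by write(DataOutput, String). */
    static String read(DataInput in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

    Because ObjectOutputStream implements DataOutput and ObjectInputStream implements DataInput, these helpers can be called from a class's own writeObject and readObject methods.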

  • How do I serialize a tree of objects?

    Here's a brief example that shows how to serialize a tree of objects.

    import java.io.*;
    
    
    class tree implements java.io.Serializable {
      public tree left;
      public tree right;
      public int id;
      public int level;
      private static int count = 0;
      public tree(int l) {
        id = count++;
        level = l;
        if (l > 0) {
           left = new tree(l-1);
           right = new tree(l-1);
        }
      }
      public void print(int levels) {
        for (int i = 0; i < level; i++)
          System.out.print("  ");
        System.out.println("node " + id);
    
        if (level <= levels && left != null)
           left.print(levels);
    
        if (level <= levels && right != null)
           right.print(levels);
      }
    
      public static void main (String argv[]) {
    
        try {
           /* Create a file to write the serialized tree to. */
           FileOutputStream ostream = new FileOutputStream("tree.tmp");
           /* Create the output stream */
           ObjectOutputStream p = new ObjectOutputStream(ostream);
    
           /* Create a tree with three levels. */
           tree base = new tree(3);
    
           p.writeObject(base); // Write the tree to the stream.
           p.flush();
           ostream.close();    // close the file.
    
           /* Open the file and set to read objects from it. */
           FileInputStream istream = new FileInputStream("tree.tmp");
           ObjectInputStream q = new ObjectInputStream(istream);
    
           /* Read a tree object, and all the subtrees */
           tree new_tree = (tree)q.readObject();
    
           new_tree.print(3);  // Print out the top 3 levels of the tree
        } catch (Exception ex) {
           ex.printStackTrace();
        }
      }
    }
           
    
  • If class A does not implement Serializable but a subclass B implements Serializable, will the fields of class A be serialized when B is serialized?

    Only the fields of Serializable classes are written out and restored. The object can be restored only if its closest non-serializable superclass has an accessible no-arg constructor, which is run to initialize the fields of the non-serializable supertypes. If the subclass has access to the state of the superclass, it can implement writeObject and readObject to save and restore that state.
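    A minimal sketch of a subclass saving its superclass state this way. Base and Derived are hypothetical stand-ins for A and B, and the name field is assumed to be accessible to the subclass.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Not Serializable: its fields are not written by default. */
class Base {
    protected String name;

    Base() { this.name = "default"; }   // run when a Derived is deserialized

    Base(String name) { this.name = name; }
}

/** Serializable subclass that saves and restores the inherited field itself. */
class Derived extends Base implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int value;

    Derived(String name, int value) {
        super(name);
        this.value = value;
    }

    int value() { return value; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();   // writes Derived's own fields
        out.writeObject(name);      // then the superclass state
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        name = (String) in.readObject();   // replaces the no-arg constructor's value
    }

    /** Serializes a Derived to memory and reads it back. */
    static Derived roundTrip(Derived d) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(d);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buf.toByteArray()))) {
            return (Derived) in.readObject();
        }
    }
}
```

    Without the two private methods, the deserialized object would carry whatever the Base no-arg constructor assigns ("default" in this sketch) rather than the original name.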

  • Does object serialization support encryption?

    Object serialization does not contain any encryption/decryption in itself. It writes to and reads from Java Streams, so it can be coupled with any available encryption technology. Object serialization can be used in many different ways, from simple persistence (writing to and reading from files) to RMI communication across hosts.

    RMI's use of serialization leaves encryption and decryption to the lower network transport. We expect that when a secure channel is needed, the network connections will be made using SSL or the like.
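    As an illustration of coupling the object streams with an encryption layer, the sketch below wraps them in the javax.crypto cipher streams. The class name, the choice of AES in CBC mode, and the key and IV handling are assumptions made for the example; this mode provides no integrity protection, so treat it as a demonstration of stream layering rather than a recommended design.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

class EncryptedSerialization {
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    /** Serializes obj through a cipher stream and returns the ciphertext. */
    static byte[] encrypt(Serializable obj, SecretKey key, byte[] iv)
            throws IOException, GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // Closing the object stream also closes the cipher stream,
        // which writes the final padded block.
        try (ObjectOutputStream out =
                new ObjectOutputStream(new CipherOutputStream(buf, cipher))) {
            out.writeObject(obj);
        }
        return buf.toByteArray();
    }

    /** Decrypts the ciphertext and deserializes the object it contains. */
    static Object decrypt(byte[] data, SecretKey key, byte[] iv)
            throws IOException, GeneralSecurityException, ClassNotFoundException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        try (ObjectInputStream in = new ObjectInputStream(
                new CipherInputStream(new ByteArrayInputStream(data), cipher))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey key = generator.generateKey();
        byte[] iv = new byte[16];           // AES block size
        new SecureRandom().nextBytes(iv);

        byte[] sealed = encrypt("hello", key, iv);
        System.out.println(decrypt(sealed, key, iv)); // prints "hello"
    }
}
```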

  • The object serialization classes are stream oriented. How do I write objects to a random access file?

    Currently there is no direct way to write objects to a random access file.

    You can use the ByteArray I/O streams as an intermediate place to write and read bytes to/from the random access file and create Object I/O streams from the byte streams to write/read the objects. You just have to make sure that you have the entire object in the byte stream or reading/writing the object will fail.

    For example, java.io.ByteArrayOutputStream can be used to receive the bytes of ObjectOutputStream. From it you can get a byte[] of the result which, in turn, can be used with ByteArrayInputStream as input to ObjectInput.
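    The approach described above can be sketched as follows. The record layout (a four-byte length followed by the serialized bytes) and the ObjectRecords helper are assumptions made for the example, not a format defined by the serialization API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.RandomAccessFile;
import java.io.Serializable;

class ObjectRecords {
    /** Appends obj as a length-prefixed record and returns the record's offset. */
    static long append(RandomAccessFile file, Serializable obj) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);           // the entire object lands in buf
        }
        byte[] bytes = buf.toByteArray();
        long offset = file.length();
        file.seek(offset);
        file.writeInt(bytes.length);
        file.write(bytes);
        return offset;
    }

    /** Reads back the object stored in the record that starts at offset. */
    static Object readAt(RandomAccessFile file, long offset)
            throws IOException, ClassNotFoundException {
        file.seek(offset);
        byte[] bytes = new byte[file.readInt()];
        file.readFully(bytes);
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }
}
```

    Each record carries its own stream header, so records can be read in any order, at the cost of objects in different records no longer sharing references.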

  • How can I create an ObjectInputStream from an ObjectOutputStream without a file in between?

    ObjectOutputStream and ObjectInputStream work to/from any stream object. You could use a ByteArrayOutputStream and then get the array and insert it into a ByteArrayInputStream. You could also use the piped stream classes as well. Any java.io class that extends the OutputStream and InputStream classes can be used.

    Alternatively, the ObjectOutputStream class implements a reset method that discards the memory of having sent an object, so sending an object again will make a copy.
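    The byte-array approach can be sketched as an in-memory copy routine. InMemoryCopy is a hypothetical helper name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class InMemoryCopy {
    /** Serializes the object to a byte array and reads it straight back. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buf.toByteArray()))) {
            return (T) in.readObject();
        }
    }
}
```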

  • Can I compute diff(serial(x),serial(y))?

    Yes. Serialization produces the same stream each time the same object is serialized, so the two streams can be compared. You will need to create a new ObjectOutputStream to serialize each object.
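    A sketch of such a comparison, reducing the diff to a byte-for-byte equality check. SerialDiff, serial, and sameSerialForm are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

class SerialDiff {
    /** Serializes obj with its own, newly created ObjectOutputStream. */
    static byte[] serial(Serializable obj) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        }
        return buf.toByteArray();
    }

    /** Returns true when both objects serialize to identical bytes. */
    static boolean sameSerialForm(Serializable x, Serializable y) throws IOException {
        return Arrays.equals(serial(x), serial(y));
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sameSerialForm("abc", "abc")); // prints "true"
        System.out.println(sameSerialForm("abc", "abd")); // prints "false"
    }
}
```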

  • Can I compress the serial representation of my objects using my own zip/unzip methods?

    ObjectOutputStream writes to an OutputStream. If your zip object extends the OutputStream class, there is no problem compressing the serialized data.
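    For instance, the standard java.util.zip streams can stand in for a custom zip class. This is a sketch; CompressedSerialization is a hypothetical helper name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CompressedSerialization {
    /** Serializes obj through a GZIP stream and returns the compressed bytes. */
    static byte[] write(Serializable obj) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // Closing the object stream closes the GZIP stream, which
        // finishes the compressed data.
        try (ObjectOutputStream out =
                new ObjectOutputStream(new GZIPOutputStream(buf))) {
            out.writeObject(obj);
        }
        return buf.toByteArray();
    }

    /** Decompresses the bytes and deserializes the object they contain. */
    static Object read(byte[] compressed) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new GZIPInputStream(new ByteArrayInputStream(compressed)))) {
            return in.readObject();
        }
    }
}
```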

  • Can I execute methods on compressed versions of my objects, for example isempty(zip(serial(x)))?

    This is not really viable for arbitrary objects because of the encoding of objects. For a particular object (such as String) you can compare the resulting bit streams. The encoding is stable, in that every time the same object is encoded it is encoded to the same set of bits.

  • Why can't a file that contains multiple appended ObjectOutputStreams be deserialized by one ObjectInputStream?

    Using the default implementation of serialization, there must be a one-to-one mapping between ObjectOutputStream construction and ObjectInputStream construction. The ObjectOutputStream constructor writes a stream header and the ObjectInputStream constructor reads this stream header. A workaround is to subclass ObjectOutputStream and override writeStreamHeader(). The overriding writeStreamHeader() should call the super writeStreamHeader method if it is the first write to the file and it should call ObjectOutputStream.reset() if it is appending to a pre-existing ObjectOutputStream within the file.
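    A sketch of this workaround follows. AppendingObjectOutputStream and the open helper are hypothetical names, and a ByteArrayOutputStream stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

class AppendDemo {
    /** Writes a reset marker where a second stream header would otherwise go. */
    static class AppendingObjectOutputStream extends ObjectOutputStream {
        AppendingObjectOutputStream(OutputStream out) throws IOException {
            super(out);
        }

        @Override
        protected void writeStreamHeader() throws IOException {
            reset();
        }
    }

    /** Chooses the stream class depending on whether earlier data exists. */
    static ObjectOutputStream open(OutputStream out, boolean appending)
            throws IOException {
        return appending
                ? new AppendingObjectOutputStream(out)
                : new ObjectOutputStream(out);
    }

    /** Writes two objects with two output streams, then reads both with one input stream. */
    static List<Object> demo() throws IOException, ClassNotFoundException {
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        try (ObjectOutputStream out = open(file, false)) {
            out.writeObject("first");
        }
        try (ObjectOutputStream out = open(file, true)) {
            out.writeObject("second");
        }
        List<Object> result = new ArrayList<>();
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(file.toByteArray()))) {
            result.add(in.readObject());
            result.add(in.readObject());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints "[first, second]"
    }
}
```

    Because writeStreamHeader() runs inside the superclass constructor, before any subclass fields are assigned, this sketch decides between the two cases when the stream is created rather than branching inside the override.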

    Questions About Using Serialization within the JDK:

  • When a local object is serialized and passed as a parameter in an RMI call, are the byte codes for the local object's methods also passed? What about object coherency, if the remote VM application "keeps" the object handle?

    The bytecodes for a local object's methods are not passed directly in the ObjectOutputStream, but the object's class may need to be loaded by the receiver if the class is not already available locally. (The class files themselves are not serialized, just the names of the classes.) All classes must be able to be loaded during deserialization using the normal class loading mechanisms. For applets this means they are loaded by the AppletClassLoader.

    There are no coherency guarantees for local objects passed to a remote VM, since such objects are passed by copying their contents (a true pass-by-value).

  • Which JDK 1.1 system classes will be marked Serializable?

    The following list shows the classes that are marked Serializable. Note that classes that extend these classes are also serializable.

    • java.lang.Character
    • java.lang.Boolean
    • java.lang.String
    • java.lang.StringBuffer
    • java.lang.Throwable - including all subtypes of Exception
    • java.lang.Number - including Integer, Long, etc.
    • java.util.Hashtable
    • java.util.Random
    • java.util.Vector - includes Stack
    • java.util.Date
    • java.util.BitSet
    • java.io.File
    • java.net.InetAddress
    • java.rmi.server.RemoteObject
    • The AWT classes
    • Arrays of primitives
    • Arrays of objects are Serializable though the objects may not be.

    There are many classes for which serialization makes no sense, such as those representing the state of something in the current VM (e.g. java.io.FileInputStream), or which are exceedingly hard to serialize correctly (e.g. java.lang.Thread).

  • Are there any plans to support the serialization of threaded objects?

    In JDK 1.1, Threads will not be serializable. In the present implementation, if you attempt to serialize and then deserialize a thread, there is no explicit allocation of a new native thread or stack; all that happens is that the Java object is allocated with none of the native implementation. In short, it just won't work and will fail in unpredictable ways.

    The difficulty with threads is that they have so much state which is intricately tied into the virtual machine that it is difficult or impossible to re-establish the context somewhere else. For example, saving the Java call stack is insufficient because if there were native methods that had called C procedures that in turn called Java, there would be an incredible mix of Java constructs and C pointers to deal with. Also, serializing the stack would imply serializing any object reachable from any stack variable.

    If a thread were resumed in the same VM, it would be sharing a lot of state with the original thread, and would therefore fail in unpredictable ways if both threads were running at once, just like two C threads trying to share a stack. When deserialized in a separate VM, it's hard to tell what might happen.

  • I am having problems deserializing AWT components. How can I make this work?

    AWT has not yet been modified to work well with Serialization. When you serialize AWT widgets, the Peer objects that map the AWT functions to the local window system are serialized as well. When you deserialize (reconstitute) the AWT widgets, the old Peers are recreated, but they are out of date. Peers are native to the local window system and contain pointers to data structures in the local address space, and therefore cannot be moved.

    Workaround:
    Remove the top level widget from its container (so the widgets are no longer live). The peers are discarded at this point and you will save only the AWT widget state. Later, when you deserialize and read the widgets back in, add the top level widget to the frame to make the AWT widgets appear. You may need to add a show call.

    Note that JDK 1.1 AWT widgets will be serializable, but they will not interoperate with JDK 1.0.2 widgets.

  • If I try to serialize a font or image object and reconstitute it in a different VM, my application dies. Why?

    AWT does not yet work well with serialization and you will therefore have trouble trying to pass fonts and images. This is because each contains memory pointers that are valid only in the originating VM, which will cause a segmentation violation when passed to a new VM.

    These problems should be corrected by the time JDK 1.1 releases. As a workaround for fonts, you will need to pass the information necessary to recreate a new font object that duplicates the characteristics of the font object in the originating VM. There is no current workaround to allow images to be passed correctly.

  • *As used on this web site, the terms "Java virtual machine" or "JVM" mean a virtual machine for the Java platform.