OKI OSID Serialization

Mark J. Norton

Oct. 8, 2003


Many applications built on the OKI OSID interfaces will require some degree of data persistence, in some cases extensive. The OSID definitions support persistence by requiring that certain objects be serializable. This is accomplished by implementing the Serializable interface defined in the java.io package of the Java class library. Persistence of data is intended to be an optional function and is thus not explicitly included in the OSID specifications.

This document reviews some of the potential needs of persistent data and how it might be implemented in the OSIDs.

Potentially Persistent Data

The Open Service Interface Definitions (OSID) are a set of pure interface descriptions. The current binding is in the Java programming language, though other bindings such as C#, WSDL, and PHP are under development. Because the OSIDs do not include any implementation details, the data held in OSID classes is left to individual developers. However, much of the data likely to be persisted can be inferred from the methods associated with key OSID classes, in particular the Manager classes.

The following diagram illustrates some of the data which might be persisted in a selection of OSIDs. This is not intended to be complete, since it is highly implementation dependent. Furthermore, the focus is on the Common Services Layer, though services included in the Educational Services Layer will also likely have persistent data needs.



In most cases, persistent data is collected into arrays or hash maps, though there may be certain flags and other state data that need to be preserved as well.

Persistency Strategies

Partially for the sake of functions like Remote Method Invocation, Java has very good support for object serialization. In older languages like C++, serialization is associated with CORBA and other distributed object systems. Other languages may or may not have commonly available support for object serialization and persistence.

Three strategies for persistence are commonly used: file, database, and XML. Let’s examine each of these approaches in a bit more detail.

File Persistence

Of the three basic approaches to data persistence, file based persistence is perhaps the simplest. Java provides direct support for object serialization in its ObjectOutputStream and ObjectInputStream classes. A very good treatment of this topic is found in “Java I/O” by Elliotte Rusty Harold, 1999, O’Reilly & Associates, Inc.

File based persistence consists of serializing one or more objects and writing them out to a file in a specified order. Serialization captures the data aspects of each object (including nested objects) and encodes them as a series of atomic data types (byte, char, int, float, etc.). Restoring objects is the reverse of this process, done by reading the file in the same order in which it was written. Restored objects typically must be cast to their concrete types.
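
The write-then-read ordering described above can be sketched as follows. This is a minimal illustration, not part of any OSID implementation: the class name FilePersistenceSketch, the temporary file, and the sample data are all assumptions made for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Vector;

// Hypothetical demo class; names and data are illustrative only.
class FilePersistenceSketch {

    // Serialize both objects to the file in a fixed order.
    static void save(File file, Vector<String> agents, HashMap<String, String> flags)
            throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(agents);
            out.writeObject(flags);
        }
    }

    // Read the objects back in the order they were written.
    // The caller casts each element to its concrete type.
    static Object[] load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            Object agents = in.readObject();
            Object flags = in.readObject();
            return new Object[] { agents, flags };
        }
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("osid_demo", ".ser");
        file.deleteOnExit();

        Vector<String> agents = new Vector<>();
        agents.add("agent-1");
        agents.add("agent-2");
        HashMap<String, String> flags = new HashMap<>();
        flags.put("restored", "false");

        save(file, agents, flags);
        Object[] state = load(file);

        // Casts are needed because readObject() returns Object.
        Vector<String> restoredAgents = (Vector<String>) state[0];
        HashMap<String, String> restoredFlags = (HashMap<String, String>) state[1];
        System.out.println(restoredAgents.equals(agents) && restoredFlags.equals(flags));
    }
}
```

Reading the two objects in the opposite order would still succeed at the stream level, but the casts would then fail with a ClassCastException, which is why the read order must mirror the write order.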

Using files is certainly the simplest approach, but it does have some drawbacks. Serializing objects, and more importantly, restoring them can be a lengthy process. There is usually no support for random access of persisted data, forcing all objects to be restored when the application is launched (though this can be managed with appropriate design decisions).

Database Persistence

Another common approach to data persistence is saving state into a database. Two kinds of database systems are used: relational and object oriented. Naturally, an object oriented database is the simpler of the two, having native support for saving and accessing objects. Object oriented databases are not all that common, though that continues to change as native object support is added to commercially available databases such as those from Oracle, Sybase, and Microsoft.

More typical is the use of relational databases for object persistence. Although very sophisticated approaches to object persistence are possible, the simplest is to develop database schemas that closely correspond to the data to be saved. Thus an array of Agent objects might be persisted as a table with fields for an identifier, a type, and a reference to a properties table.

A more flexible option is to take advantage of database support for saving binary objects. Binary chunks of data (blobs) allow objects to be serialized in a manner similar to the file approach and saved as such. With binary storage in a database, persisted objects can be accessed individually rather than only in the sequence in which they were written.
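
As a rough sketch of the blob approach, the following serializes an object to a byte array and restores it again. The class name BlobSketch and the in-memory map are assumptions made for illustration; the map merely stands in for a database table keyed by identifier. In a real system the byte array would typically be bound to a binary column, for example with the JDBC methods PreparedStatement.setBytes and ResultSet.getBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not part of the OSIDs.
class BlobSketch {

    // Serialize one object into a byte array suitable for a binary column.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Rebuild an object from a previously stored byte array.
    static Object fromBytes(byte[] blob) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // The map stands in for a table with an identifier column and a blob column.
        Map<String, byte[]> table = new HashMap<>();
        table.put("agent-1", toBytes("Alice"));
        table.put("agent-2", toBytes("Bob"));

        // Only the requested row is deserialized; the other blob is left untouched.
        String name = (String) fromBytes(table.get("agent-2"));
        System.out.println(name);
    }
}
```

Because each blob is stored under its own key, a single object can be restored on demand without reading the others.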

Databases add random access to persisted objects. This allows information to be restored on an as-needed basis rather than pulling it all in at once. This can improve system performance, especially where a large amount of data is to be saved. Naturally, any database approach to object persistence will require a database system. Such systems range in cost from free to very expensive.

XML Persistence

Finally, there is XML based object persistence. Since XML is a method of formatting files using tags, XML persistence is a subset of file based persistence, but one with some special properties that merit separate treatment.

Rather than saving object data to a file in a serial, binary form, XML persistence saves data using a nested tag format described by a DTD, XML schema, or RDF schema. Those familiar with XML will see that it is simple to describe basic data types and the hierarchical organization needed to represent arbitrarily complex objects, including those with array data.

XML serialization and persistence is mentioned in several XML projects. Some of these include Jigsaw, Web Objects (Common Document Architecture), LOTP, the Document Object Model, SOAP, WebBroker, and many others. Searching on www.w3c.org will give you a lot more detail on how object serialization is used by them.

Since XML serialization produces an XML document, it can be handled by the Document Object Model and related technologies, such as XPath and XQuery. This in turn provides some random access to saved object data, though it does require parsing the file to find the data. Finally, XML documents are human readable. This is both a plus and a minus: serialized objects can be read for analysis and diagnostic purposes, but the same readability raises security issues.
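
One way to experiment with XML serialization in Java is the java.beans.XMLEncoder and java.beans.XMLDecoder pair from the standard library. The sketch below is only an illustration of the idea: the class name XmlPersistenceSketch and the sample data are assumptions, and the tag format produced by XMLEncoder is its own, not one defined by OKI.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.Vector;

// Hypothetical demo class; not part of the OSIDs.
class XmlPersistenceSketch {

    // Encode an object graph as an XML document held in memory.
    static byte[] toXml(Object obj) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (XMLEncoder encoder = new XMLEncoder(buffer)) {
            encoder.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Decode the first object found in the XML document.
    static Object fromXml(byte[] xml) {
        try (XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(xml))) {
            return decoder.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Vector<String> agents = new Vector<>();
        agents.add("agent-1");
        agents.add("agent-2");

        byte[] xml = toXml(agents);

        // The document is plain text, so it can be printed and inspected.
        System.out.println(new String(xml, "UTF-8"));

        Object restored = fromXml(xml);
        System.out.println(agents.equals(restored));
    }
}
```

Printing the encoded document makes the trade-off concrete: the stored values appear as readable text, which helps diagnosis but also exposes the data to anyone who can read the file.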

Implementing Persistence in OKI OSIDs

As stated earlier, much of the data to be persisted in the OSIDs is maintained by the various OSID managers. In order to support compatibility across implementations and versions, these managers are loaded on demand by osid.OsidLoader. As such, the managers are not directly instantiated by the OSID implementations or the applications which use them.

Data persistence could be included in the constructors for the OSID managers, but that would force applications using those managers to include persistence. A more flexible approach is to add a few additional methods to each of the OSID managers to provide saving and restoring of state information. Three methods are recommended:

public void writeState() throws osid.OsidException;
public void readState() throws osid.OsidException;
public boolean isRestored();


To save the state of a manager requiring persistence of state data, call the writeState() method. Similarly, to restore state data, call readState(). Since it may be unknown whether state has already been restored, the isRestored() method is provided as a simple test.

A sample implementation for osid.shared.SharedManager would be:

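import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Vector;

public class SharedManager extends OsidManager implements osid.shared.SharedManager {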
    private boolean restored = false;
    public static final String SERIALIZED_FILE_NAME = "osid_shared_manager.sid";
    private Vector ids = null;
    private Vector agents = null;
    private Vector groups = null;

    public void writeState() throws osid.shared.SharedException {
        try {
            FileOutputStream fout = new FileOutputStream
                (osid_mjn.shared.SharedManager.SERIALIZED_FILE_NAME);
            ObjectOutputStream  oout = new ObjectOutputStream (fout);
            oout.writeObject (agents);
            oout.writeObject (groups);
            // oout.writeObject (ids);
            oout.close();
        }
        catch (java.io.IOException ex) {
            throw new osid.shared.SharedException ("I/O error on writing state data out of SharedManager.");
        }
    }

    public void readState() throws osid.shared.SharedException {
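        //  isEmpty() is not shown here; it is assumed to report whether the
        //  manager currently holds no agents or groups.
        if (this.isEmpty()) {
            //  Open up the serialization file, if it exists.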
            FileInputStream fin = null;
            try {
                fin = new FileInputStream (osid_mjn.shared.SharedManager.SERIALIZED_FILE_NAME);
            }
            catch (java.io.IOException exio) {
                //  This error is most likely to be a file not found, which means
                //  that there is no state to load.  Just return.
                restored = true;
                return;
            }

            try {
                //  Create an object stream from file stream.
                ObjectInputStream oin = new ObjectInputStream (fin);

                //  Read the persistent data and cast it to objects.
                agents = (Vector)oin.readObject();
                groups = (Vector)oin.readObject();
                // ids = (Vector)oin.readObject();

                // Close the input stream.
                oin.close();
                restored = true;
            }
            catch (java.io.IOException ex2) {
                throw new osid.shared.SharedException ("I/O error on reading state data into SharedManager.");
            }
            catch (java.lang.ClassNotFoundException ex4) {
                throw new osid.shared.SharedException ("Class not found error on reading state data into SharedManager.");
            }
        }
        else
            throw new osid.shared.SharedException ("Attempt to read state into a modified SharedManager.");
    }

    public boolean isRestored() {
        return (restored);
    }
}

Note that in this implementation, the vector of Ids maintained by the SharedManager is not persisted (though it could be). This was decided in part because Id objects will tend to be persisted at a lower level, such as part of an Agent, Group, etc.

By including persistence as methods that extend SharedManager, persistence can be implemented and used as needed without impacting the overall interoperability of the implementation, since those who do not require it will not call the methods. However, there does need to be some agreement on what these extension names will be. Perhaps OKI might consider an OsidPersistence interface definition which would define the three methods described above. Alternatively, some form of discovery may be included.
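
To make the suggestion concrete, here is one possible shape for such an interface, together with a simple form of discovery based on an instanceof test. Everything in this sketch is hypothetical: OKI does not define OsidPersistence, the OsidExceptionStandIn class only stands in for osid.OsidException so that the example compiles without the OSID classes, and DemoManager exists solely to exercise the interface.

```java
// Hypothetical sketch; none of these types are defined by OKI.
class PersistenceDiscoverySketch {

    // Restore state only if the manager opts in to persistence.
    // Returns true when the manager supports persistence, false otherwise.
    static boolean restoreIfSupported(Object manager) throws OsidExceptionStandIn {
        if (manager instanceof OsidPersistence) {
            OsidPersistence persistent = (OsidPersistence) manager;
            if (!persistent.isRestored()) {
                persistent.readState();
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        DemoManager manager = new DemoManager();
        System.out.println(restoreIfSupported(manager));
        System.out.println(manager.isRestored());
        System.out.println(restoreIfSupported(new Object()));
    }
}

// Stand-in for osid.OsidException (assumption made for this example).
class OsidExceptionStandIn extends Exception {
    OsidExceptionStandIn(String message) {
        super(message);
    }
}

// The three methods recommended above, gathered into one interface.
interface OsidPersistence {
    void writeState() throws OsidExceptionStandIn;
    void readState() throws OsidExceptionStandIn;
    boolean isRestored();
}

// Minimal manager used only to exercise the interface.
class DemoManager implements OsidPersistence {
    private boolean restored = false;

    public void writeState() {
        // A real manager would serialize its agents and groups here.
    }

    public void readState() {
        // A real manager would deserialize its state here.
        restored = true;
    }

    public boolean isRestored() {
        return restored;
    }
}
```

With an arrangement like this, an application could treat persistence as an optional capability: managers that implement the interface are restored, and those that do not are simply used as they are.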