How can I use serialization in conjunction with Isolated Storage?

Serialization is the ability to read and write an arbitrary object graph (although reading is sometimes called deserialization). When you've got an isolated storage stream, or any other stream for that matter, it's serialization that lets you save anything, from simple name-value pairs to entire object hierarchies. However, before we can talk about serializing objects, we need to talk about where they’re going to serialize to.

Whenever an object is serialized, it has to go somewhere. It may go into memory, a file, a database record or a socket. Where the data is actually written generally does not matter to the object itself. It needs to store the same data regardless of where it goes. All the object generally cares about is that bytes can be written and read, and sometimes we’d like to skip around amongst the bytes. This need manifests itself in the abstract base class Stream from the System.IO namespace:
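As a minimal sketch of those three needs (reading, writing and skipping around), the following asks a stream which of them it supports. The StreamCapabilities class and its Describe method are hypothetical names used only for illustration; CanRead, CanWrite and CanSeek are members of Stream itself.

```csharp
using System;
using System.IO;

static class StreamCapabilities {
    // Written against the abstract Stream type, so any derived stream can be passed in
    public static string Describe(Stream stream) {
        return string.Format("read={0}, write={1}, seek={2}",
            stream.CanRead, stream.CanWrite, stream.CanSeek);
    }

    static void Main() {
        using( Stream stream = new MemoryStream() ) {
            // A default MemoryStream supports all three operations
            Console.WriteLine(Describe(stream)); // read=True, write=True, seek=True
        }
    }
}
```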

.NET provides several classes that derive from Stream, including MemoryStream, FileStream and IsolatedStorageFileStream. The MemoryStream is fun to play with because it has no permanent side effects:
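The sketch below is one way such an experiment might look, not necessarily the original listing. The MemoryStreamDemo class and RoundTrip method are hypothetical names, and UTF-8 is an assumed encoding for the manual string-to-bytes conversion.

```csharp
using System;
using System.IO;
using System.Text;

static class MemoryStreamDemo {
    // Writes a string to the stream as bytes, then reads the bytes back
    public static string RoundTrip(Stream stream, string text) {
        // Convert the string to bytes by hand and write them
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        stream.Write(bytes, 0, bytes.Length);

        // Seek back to the beginning before reading
        stream.Seek(0, SeekOrigin.Begin);

        // Read the bytes back and convert them to a string by hand.
        // A single Read is enough for a MemoryStream; other kinds of
        // stream may return fewer bytes per call and need a loop.
        byte[] buffer = new byte[bytes.Length];
        int read = stream.Read(buffer, 0, buffer.Length);
        return Encoding.UTF8.GetString(buffer, 0, read);
    }

    static void Main() {
        // The using block closes the stream, even if an exception is thrown
        using( Stream stream = new MemoryStream() ) {
            Console.WriteLine(RoundTrip(stream, "Wahoo!")); // Wahoo!
        }
    }
}
```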

This code creates a specific implementation of the abstract Stream class, making sure to close it, even in the face of exceptions. The code then uses the stream for writing and reading bytes, being careful to seek back to the beginning of the stream in between. We could have written the exact same code for any stream.

However, the manual conversion of the string object back and forth between the bytes is kind of a pain. To avoid writing that code, we’ve got the StreamWriter and StreamReader classes:
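A minimal sketch of the text-oriented approach, assuming a string and an integer as the data to store. The TextStreamDemo class and its Write and Read methods are hypothetical names; the writer is flushed rather than closed so that the underlying stream stays open for reading.

```csharp
using System;
using System.IO;

static class TextStreamDemo {
    // Writes each piece of data as text on its own line
    public static void Write(Stream stream, string s, int n) {
        StreamWriter writer = new StreamWriter(stream);
        writer.WriteLine(s);
        writer.WriteLine(n);
        writer.Flush(); // push buffered text into the underlying stream
    }

    // Reads the lines back, parsing the integer out of its string form
    public static void Read(Stream stream, out string s, out int n) {
        StreamReader reader = new StreamReader(stream);
        s = reader.ReadLine();
        n = int.Parse(reader.ReadLine());
    }

    static void Main() {
        using( Stream stream = new MemoryStream() ) {
            Write(stream, "Wahoo!", 452);
            stream.Seek(0, SeekOrigin.Begin);

            string s;
            int n;
            Read(stream, out s, out n);
            Console.WriteLine("{0}, {1}", s, n); // Wahoo!, 452
        }
    }
}
```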

This code is considerably simpler since the conversion to bytes is managed by the stream writer and reader as they work on the stream. However, the stream writer and reader are oriented towards text only, which is why we wrote each piece of data on its own line and why we had to parse the integer back out of the string when reading. To avoid the conversion to and from strings, we can write the data in its native binary format and pull it out typed using the BinaryWriter and BinaryReader classes:
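A minimal sketch of the binary approach, again assuming a string and an integer as the data. The BinaryStreamDemo class and its methods are hypothetical names; note that the reader has to ask for the values in the same order and with the same types that the writer used.

```csharp
using System;
using System.IO;

static class BinaryStreamDemo {
    // Writes the values in their native binary form, with no string conversion
    public static void Write(Stream stream, string s, int n) {
        BinaryWriter writer = new BinaryWriter(stream);
        writer.Write(s);
        writer.Write(n);
        writer.Flush();
    }

    // Reads the values back, typed, in the order they were written
    public static void Read(Stream stream, out string s, out int n) {
        BinaryReader reader = new BinaryReader(stream);
        s = reader.ReadString();
        n = reader.ReadInt32();
    }

    static void Main() {
        using( Stream stream = new MemoryStream() ) {
            Write(stream, "Wahoo!", 452);
            stream.Seek(0, SeekOrigin.Begin);

            string s;
            int n;
            Read(stream, out s, out n);
            Console.WriteLine("{0}, {1}", s, n); // Wahoo!, 452
        }
    }
}
```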

Using the BinaryWriter and BinaryReader, there was no string conversion, but our code still had to keep track of the types of the objects we were writing and the order in which they needed to be read. We could group the data into a custom class and read it all at once, but the BinaryWriter and BinaryReader don’t support custom classes, only built-in simple types. To read and write arbitrary objects, we need a formatter.

A formatter is an object that knows how to write arbitrary objects to a stream. The formatter promises to provide this functionality by implementing the IFormatter interface from the System.Runtime.Serialization namespace:

A formatter has two jobs. The first is to serialize arbitrary objects, including their fields and their fields’ fields, all the way down. It knows which fields to serialize using Reflection, which is the .NET API for finding out information about a type at run-time. An object is written to a stream with the Serialize method and read from a stream with the Deserialize method.

The second job of a formatter is to translate the data into some format at the byte level. The .NET Framework provides two formatters, the BinaryFormatter and the SoapFormatter.

The BinaryFormatter, from the System.Runtime.Serialization.Formatters.Binary namespace, writes the data in a binary format, just like the BinaryWriter. The SoapFormatter, from the System.Runtime.Serialization.Formatters.Soap namespace, writes data in XML according to the Simple Object Access Protocol (SOAP) Specification. While SOAP is the core protocol of web services, using the SOAP formatter for the purposes of serializing settings or document data has nothing to do with web services or even the web. However, it is a handy format for a human to read the data back out again.

There is one stipulation on any type that a formatter is to serialize. It must be marked with the SerializableAttribute or the formatter will throw a run-time exception. Once the type (and the type of any contained field) is marked as serializable, serializing an object is a matter of creating a formatter and asking it to serialize the object:
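A minimal sketch of that round trip, assuming the .NET Framework and a project reference to System.Runtime.Serialization.Formatters.Soap.dll. The file name mydata.xml and the FormatterDemo class are hypothetical, and MyData is declared here as a top-level type with public fields only to keep the sketch short.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Soap;

// Without this attribute the formatter throws at run-time
[Serializable]
class MyData {
    public string s = "Wahoo!";
    public int n = 452;
}

static class FormatterDemo {
    // Serializes the object to a file, then deserializes it back out
    public static MyData RoundTrip(string path, MyData data) {
        IFormatter formatter = new SoapFormatter();

        using( Stream stream = new FileStream(path, FileMode.Create) ) {
            formatter.Serialize(stream, data);
        }

        using( Stream stream = new FileStream(path, FileMode.Open) ) {
            // Deserialize returns object, so cast to the top-level type
            return (MyData)formatter.Deserialize(stream);
        }
    }

    static void Main() {
        MyData data = RoundTrip("mydata.xml", new MyData()); // hypothetical file name
        Console.WriteLine("{0}, {1}", data.s, data.n);
    }
}
```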

After creating the formatter, the code makes a call to Serialize, which writes the type information for the MyData object and then recursively writes all of the data for the fields of the object. Reading the object back out is accomplished with a call to Deserialize and a cast to the top-level object, which reads all fields recursively.

Because we chose the SOAP formatter and a FileStream, we can examine the data that the formatter wrote:

Here we can see that an instance of Form1.MyData was written and that it contains two fields, one called s with a value “Wahoo!” and one called n with a value “452”, which was just what the code meant to write.

We have some control over what the formatter writes, although probably not the way that you’d expect. For example, if we decide that we want to serialize the MyData class, but not the n field, we can’t stop the formatter by marking the field as protected or private. To be consistent at deserialization, an object is going to need the protected and private fields just as much as it needs the public ones (in fact, fields shouldn’t be public at all!). However, applying the NonSerializedAttribute to a field will cause it to be skipped by the formatter:
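A minimal sketch of the attribute in use, assuming the .NET Framework. It swaps in the BinaryFormatter and a MemoryStream purely to avoid the extra assembly reference that the SOAP formatter needs; the NonSerializedDemo class is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class MyData {
    public string s = "Wahoo!";

    // Skipped by the formatter
    [NonSerialized]
    public int n = 452;
}

static class NonSerializedDemo {
    public static MyData RoundTrip(MyData data) {
        IFormatter formatter = new BinaryFormatter();
        using( Stream stream = new MemoryStream() ) {
            formatter.Serialize(stream, data);
            stream.Seek(0, SeekOrigin.Begin);
            return (MyData)formatter.Deserialize(stream);
        }
    }

    static void Main() {
        MyData copy = RoundTrip(new MyData());

        // n was never written, and field initializers do not run during
        // deserialization, so it comes back as the default value for int
        Console.WriteLine("{0}, {1}", copy.s, copy.n); // Wahoo!, 0
    }
}
```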

Serializing an instance of this type shows that the formatter is skipping the non-serialized field:

Good candidates for the non-serialized attribute are fields that are calculated, cached or transient, since they don’t need to be stored. However, when an object is deserialized, the non-serialized fields may need to be recalculated to put the object into a valid state. For example, expanding the duties of the n field of the MyData type to be a cache of the s field’s length, there’s no need to persist n, as it can be recalculated at any time.

However, the MyData object needs to be notified when s changes to keep n valid. Using properties keeps the n and s fields in sync until an object is deserialized, at which point only s is set, and not through its property. MyData can implement IDeserializationCallback to be notified once deserialization has finished:
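A minimal sketch of that arrangement, assuming the .NET Framework and using the BinaryFormatter with a MemoryStream for brevity. The property names String and Length and the CallbackDemo class are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class MyData : IDeserializationCallback {
    string s;

    // Cache of s.Length, so there is no need to persist it
    [NonSerialized]
    int n;

    public MyData(string s) { this.String = s; }

    public string String {
        get { return s; }
        set { s = value; n = s.Length; } // keep the cache valid on every change
    }

    public int Length { get { return n; } }

    // Called by the formatter once the object graph has been deserialized;
    // s was set directly, not through the property, so recalculate n here
    void IDeserializationCallback.OnDeserialization(object sender) {
        n = s.Length;
    }
}

static class CallbackDemo {
    public static MyData RoundTrip(MyData data) {
        IFormatter formatter = new BinaryFormatter();
        using( Stream stream = new MemoryStream() ) {
            formatter.Serialize(stream, data);
            stream.Seek(0, SeekOrigin.Begin);
            return (MyData)formatter.Deserialize(stream);
        }
    }

    static void Main() {
        MyData copy = RoundTrip(new MyData("Wahoo!"));
        Console.WriteLine("{0}, {1}", copy.String, copy.Length); // Wahoo!, 6
    }
}
```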

If you’ve got any fields marked as non-serialized, chances are you should be handling IDeserializationCallback to set those fields at deserialization-time.

To gain even more control over the serialization process, you can implement the ISerializable interface and a special constructor:
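A minimal sketch of custom serialization, assuming the .NET Framework and using the BinaryFormatter with a MemoryStream for brevity. The entry names MyString and MyInt, the accessor properties and the CustomSerializationDemo class are all hypothetical.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
sealed class MyData : ISerializable {
    string s = "Wahoo!";
    int n = 452;

    public MyData() {}

    // Special constructor: the formatter finds it through reflection
    // and calls it during deserialization
    private MyData(SerializationInfo info, StreamingContext context) {
        s = info.GetString("MyString");
        n = info.GetInt32("MyInt");
    }

    // Called by the formatter during serialization to collect
    // the name/value pairs to write
    public void GetObjectData(SerializationInfo info, StreamingContext context) {
        info.AddValue("MyString", s);
        info.AddValue("MyInt", n);
    }

    public string S { get { return s; } }
    public int N { get { return n; } }
}

static class CustomSerializationDemo {
    public static MyData RoundTrip(MyData data) {
        IFormatter formatter = new BinaryFormatter();
        using( Stream stream = new MemoryStream() ) {
            formatter.Serialize(stream, data);
            stream.Seek(0, SeekOrigin.Begin);
            return (MyData)formatter.Deserialize(stream);
        }
    }

    static void Main() {
        MyData copy = RoundTrip(new MyData());
        Console.WriteLine("{0}, {1}", copy.S, copy.N); // Wahoo!, 452
    }
}
```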

Implementing ISerializable.GetObjectData puts your class on the hook to populate the name/value pairs that the formatter is using to fill the stream during serialization. GetObjectData is provided with two pieces of information: a place to put the fields to serialize (called the serialization information) and where the object is going (called the context state). GetObjectData should add all of the fields to the serialization info that it would like to have serialized, naming each one. It’s this name that’s used by the formatter to write the data:

Deserialization happens with the special constructor that also takes a serialization info and a context state, this time to pull the data out. The SerializationInfo class provides several methods for pulling out typed data. For built-in types, the specific method, like GetString, can be used. For general types, the GetValue method can be used. For example, the following two lines of code are equivalent, with the latter the only choice for custom types:
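The equivalence can be sketched without a formatter at all by filling a SerializationInfo by hand, which stands in here for the one a formatter would pass to the special constructor. The entry name MyString and the SerializationInfoDemo class are hypothetical.

```csharp
using System;
using System.Runtime.Serialization;

static class SerializationInfoDemo {
    // Returns true if both ways of pulling the value out agree
    public static bool SameResult(string value) {
        // Hand-built stand-in for the info a formatter would supply
        SerializationInfo info =
            new SerializationInfo(typeof(object), new FormatterConverter());
        info.AddValue("MyString", value);

        // Specific method for a built-in type
        string viaGetString = info.GetString("MyString");

        // General method, the only choice for custom types
        string viaGetValue = (string)info.GetValue("MyString", typeof(string));

        return viaGetString == viaGetValue;
    }

    static void Main() {
        Console.WriteLine(SameResult("Wahoo!")); // True
    }
}
```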

While the types that hold the data are always subject to the .NET rules of versioning, that doesn’t really help you when it comes to reading and writing old versions of the data using new versions of the object. To support that, all you can do is write a version ID into the stream as part of the implementation of ISerializable:
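A minimal sketch of version-aware reading, under several assumptions: the entry names (Version, MyString, MyInt, OldStrings), the version strings "1.0" and "2.0" and the VersioningDemo class are hypothetical, and a hand-built SerializationInfo stands in for data written by an older version of the class.

```csharp
using System;
using System.Collections;
using System.Runtime.Serialization;

[Serializable]
class MyData : ISerializable {
    string s = "Wahoo!";
    int n = 452;
    ArrayList oldStrings = new ArrayList(); // added in version 2.0

    public MyData() {}

    // Public only so this sketch can call it directly; a formatter
    // finds the constructor through reflection
    public MyData(SerializationInfo info, StreamingContext context) {
        s = info.GetString("MyString");
        n = info.GetInt32("MyInt");

        // Use the version ID to decide what else to read
        switch( info.GetString("Version") ) {
            case "1.0":
                break; // version 1.0 data has no OldStrings entry
            case "2.0":
                oldStrings = (ArrayList)info.GetValue("OldStrings", typeof(ArrayList));
                break;
        }
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context) {
        info.AddValue("Version", "2.0"); // always write the current version
        info.AddValue("MyString", s);
        info.AddValue("MyInt", n);
        info.AddValue("OldStrings", oldStrings);
    }

    public ArrayList OldStrings { get { return oldStrings; } }
}

static class VersioningDemo {
    // Builds the name/value pairs a version 1.0 writer would have produced
    public static SerializationInfo MakeVersion1Info() {
        SerializationInfo info =
            new SerializationInfo(typeof(MyData), new FormatterConverter());
        info.AddValue("Version", "1.0");
        info.AddValue("MyString", "Wahoo!");
        info.AddValue("MyInt", 452);
        return info;
    }

    static void Main() {
        StreamingContext context = new StreamingContext(StreamingContextStates.All);

        // Old data migrates forward: the new field keeps its default value
        MyData migrated = new MyData(MakeVersion1Info(), context);
        Console.WriteLine(migrated.OldStrings.Count); // 0
    }
}
```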

This implementation writes a Version string on all of the data it writes, then uses that string to decide what data to read back in at run-time. As the data in a class changes, marking it with a version provides a way to migrate old data forward (and even save to old versions, if you’d like).

Notice also that the new hunk of data added was an ArrayList. Just like the simple types, the collection classes along with a large number of other classes in the .NET Framework can be serialized, making this model useful for all kinds of things, including saving user settings into isolated storage.

Settings & Streams

The combination of an isolated storage stream and formatters makes for a great way to store user settings, like the state of a form when it's closed and opened:

using System;
using System.ComponentModel;
using System.Drawing;
using System.Windows.Forms;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters;
using System.Runtime.Serialization.Formatters.Soap;
using System.IO;
using System.IO.IsolatedStorage;
...
// Custom type to manage serializable form data
[SerializableAttribute]
class FormData {
  public Point Location;
  public Size ClientSize;
  public FormWindowState WindowState;

  public FormData(Form form) {
    this.Location = form.Location;
    this.ClientSize = form.ClientSize;
    this.WindowState = form.WindowState;
  }
}

void MainForm_Closing(object sender, CancelEventArgs e) {
  // Save the form's position before it closes
  IsolatedStorageFile store =
    IsolatedStorageFile.GetUserStoreForAssembly();
  using( Stream stream =
           new IsolatedStorageFileStream("MainForm.txt",
                                         FileMode.Create,
                                         store) ) {
    // Restore the window state to save location and
    // client size at restored state
    FormWindowState state = this.WindowState;
    this.WindowState = FormWindowState.Normal;

    // Capture the restored location and size, but record the
    // window state the form had before it was restored
    FormData data = new FormData(this);
    data.WindowState = state;

    // Serialize custom FormData object
    IFormatter formatter = new SoapFormatter();
    formatter.Serialize(stream, data);
  }
}

void MainForm_Load(object sender, EventArgs e) {
  // Restore the form's position
  try {
    IsolatedStorageFile store =
      IsolatedStorageFile.GetUserStoreForAssembly();
    using( Stream stream =
             new IsolatedStorageFileStream("MainForm.txt",
                                           FileMode.Open,
                                           store) ) {
      // Don't let the form's position be set automatically
      this.StartPosition = FormStartPosition.Manual;

      // Deserialize custom FormData object
      IFormatter formatter = new SoapFormatter();
      FormData data = (FormData)formatter.Deserialize(stream);

      // Set data from FormData object
      this.Location = data.Location;
      this.ClientSize = data.ClientSize;
      this.WindowState = data.WindowState;
    }
  }
  // Don't let missing or unreadable settings stop the form
  // from loading; report the problem and carry on
  catch( Exception ex ) {
    MessageBox.Show(ex.Message, ex.GetType().Name);
  }
}

To have something to serialize, we’ve got a custom type called FormData which keeps track of the location, client size and window state. When it’s time to save the form data, the code creates an instance of the new type and then hands it to the formatter, along with the isolated storage stream. Likewise, when loading, we use a formatter to deserialize the form data and use it to restore the form.

How Did I Figure This Out?

Streams have been a common serialization abstraction since C++ and COM. The streams in .NET are an extension and simplification of those. It was the history I already knew, along with some research into how streams in .NET differed from what I was familiar with, that made it pretty obvious what the .NET serialization story was. I find that the more you dig into any one technology, e.g. C++ streams, the easier it is to figure out how the next new technology works when it comes along. Building on a base of knowledge, it's just a matter of figuring out what was different about .NET streams. I didn't have to learn anything from scratch.
