Q. So IsolatedStorage provides an assembly-user-specific read/write wrapper on a free-form text file? I was hoping for some simple name=value handling à la SetPrivateProfileString, but I guess I have to invent my own format for these items. Am I missing something obvious here?

Asked by Anne Brumme. Answered by the Wonk on February 7, 2003

A.

Serialization is the ability to read and write an arbitrary object graph (although reading is sometimes called deserialization). When you've got an isolated storage stream, or any other stream for that matter, it's serialization that lets you save anything, from simple name-value pairs to entire object hierarchies. However, before we can talk about serializing objects, we need to talk about where they’re going to serialize to.

Streams

Whenever an object is serialized, it has to go somewhere. It may go into memory, a file, a database record or a socket. Where the data is actually written generally doesn't matter to the object itself. It needs to store the same data regardless of where it goes. All the object generally cares about is that bytes can be written and read, and sometimes that we can skip around among the bytes. This need manifests itself in the abstract base class Stream from the System.IO namespace:

 

abstract class Stream : MarshalByRefObject, IDisposable {

  // Fields

  public static readonly System.IO.Stream Null;

 

  // Properties

  public bool CanRead { virtual get; }

  public bool CanSeek { virtual get; }

  public bool CanWrite { virtual get; }

  public long Length { virtual get; }

  public long Position { virtual get; virtual set; }

 

  // Methods

  public virtual IAsyncResult BeginRead(...);

  public virtual IAsyncResult BeginWrite(...);

  public virtual void Close();

  public virtual int EndRead(IAsyncResult asyncResult);

  public virtual void EndWrite(IAsyncResult asyncResult);

  public virtual void Flush();

  public virtual int Read(byte[] buffer, int offset, int count);

  public virtual int ReadByte();

  public virtual long Seek(long offset, System.IO.SeekOrigin origin);

  public virtual void SetLength(long value);

  public virtual void Write(byte[] buffer, int offset, int count);

  public virtual void WriteByte(byte value);

}

 

.NET provides several classes that derive from Stream, including MemoryStream, FileStream and IsolatedStorageFileStream. The MemoryStream is fun to play with because it has no permanent side effects:

 

using System;

using System.IO;

using System.Text;            // UnicodeEncoding

using System.Windows.Forms;   // MessageBox

...

string s = "Wahoo!";

int n = 452;

using( Stream stream = new MemoryStream() ) {

  // Write to the stream

  byte[] bytes1 = UnicodeEncoding.Unicode.GetBytes(s);

  byte[] bytes2 = BitConverter.GetBytes(n);

  stream.Write(bytes1, 0, bytes1.Length);

  stream.Write(bytes2, 0, bytes2.Length);

 

  // Reset the stream to the beginning

  stream.Seek(0, SeekOrigin.Begin);

 

  // Read from the stream

  byte[] bytes3 = new byte[stream.Length - 4];

  byte[] bytes4 = new byte[4];

  stream.Read(bytes3, 0, bytes3.Length);

  stream.Read(bytes4, 0, bytes4.Length);

 

  // Do something with the data

  MessageBox.Show(UnicodeEncoding.Unicode.GetString(bytes3) + " " +

    BitConverter.ToInt32(bytes4, 0));

}

 

This code creates a specific implementation of the abstract Stream class, making sure to close it, even in the face of exceptions. The code then uses the stream for writing and reading bytes, being careful to seek back to the beginning of the stream in between. We could have written the exact same code for any stream.
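To illustrate that last point, here is a minimal sketch in which the write-and-read-back logic takes a Stream parameter and is run against both a MemoryStream and a FileStream. The StreamDemo class and its RoundTrip helper are names made up for this example, and the file is a throwaway created with Path.GetTempFileName:

```csharp
using System;
using System.IO;
using System.Text;

class StreamDemo {
  // Hypothetical helper: writes a string and an int to any stream that
  // can read, write and seek, then reads them back
  public static string RoundTrip(Stream stream, string s, int n) {
    // Write to the stream
    byte[] bytes1 = Encoding.Unicode.GetBytes(s);
    byte[] bytes2 = BitConverter.GetBytes(n);
    stream.Write(bytes1, 0, bytes1.Length);
    stream.Write(bytes2, 0, bytes2.Length);

    // Reset the stream to the beginning
    stream.Seek(0, SeekOrigin.Begin);

    // Read everything back; Read may return fewer bytes than requested,
    // so loop until the buffer is full or the stream ends
    byte[] all = new byte[stream.Length];
    int total = 0;
    while( total < all.Length ) {
      int read = stream.Read(all, total, all.Length - total);
      if( read == 0 ) break;
      total += read;
    }

    // The last four bytes are the int, the rest is the string
    string s2 = Encoding.Unicode.GetString(all, 0, all.Length - 4);
    int n2 = BitConverter.ToInt32(all, all.Length - 4);
    return s2 + " " + n2;
  }

  static void Main() {
    // Same code, in-memory stream
    using( Stream memory = new MemoryStream() ) {
      Console.WriteLine(RoundTrip(memory, "Wahoo!", 452));
    }

    // Same code, file stream
    string path = Path.GetTempFileName();
    try {
      using( Stream file = new FileStream(path, FileMode.Create) ) {
        Console.WriteLine(RoundTrip(file, "Wahoo!", 452));
      }
    }
    finally {
      File.Delete(path);
    }
  }
}
```

The helper never asks what kind of stream it was handed. It relies only on the members declared by the Stream base class.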

 

However, the manual conversion of the string object back and forth between the bytes is kind of a pain. To avoid writing that code, we’ve got the StreamWriter and StreamReader classes:

 

string s = "Wahoo!";

int n = 452;

using( Stream stream = new MemoryStream() ) {

 

  // Write to the stream

  StreamWriter writer = new StreamWriter(stream);

  writer.WriteLine(s);

  writer.WriteLine(n);

  writer.Flush(); // Flush the buffer

 

  // Reset the stream to the beginning

  stream.Seek(0, SeekOrigin.Begin);

 

  // Read from the stream

  StreamReader reader = new StreamReader(stream);

  string s2 = reader.ReadLine();

  int n2 = int.Parse(reader.ReadLine());

 

  // Do something with the data

  MessageBox.Show(s2 + " " + n2);

}

 

This code is considerably simpler since the conversion to bytes is managed by the stream writer and reader as they work on the stream. However, the stream writer and reader are oriented towards text only, which is why we wrote each piece of data on its own line and why we had to parse the integer back out of the string when reading. To avoid the conversion to and from strings, we can write the data in its native binary format and pull it out typed using the BinaryWriter and BinaryReader classes:

 

string s = "Wahoo!";

int n = 452;

using( Stream stream = new MemoryStream() ) {

 

  // Write to the stream

  BinaryWriter writer = new BinaryWriter(stream);

  writer.Write(s);

  writer.Write(n);

  writer.Flush(); // Flush the buffer

 

  // Reset the stream to the beginning

  stream.Seek(0, SeekOrigin.Begin);

 

  // Read from the stream

  BinaryReader reader = new BinaryReader(stream);

  string s2 = reader.ReadString();

  int n2 = reader.ReadInt32();

 

  // Do something with the data

  MessageBox.Show(s2 + " " + n2);

}

 

Using the BinaryWriter and BinaryReader, there was no string conversion, but our code still had to keep track of the types of the objects we were writing and the order in which they needed to be read. We could group the data into a custom class and read it all at once, but the BinaryWriter and BinaryReader don’t support custom classes, only built-in simple types. To read and write arbitrary objects, we need a formatter.
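The bookkeeping described above can be sketched as follows. This is only an illustration of what saving a small custom class looks like with nothing but BinaryWriter and BinaryReader; the Pair class and its Save and Load methods are invented for the example:

```csharp
using System;
using System.IO;

// Hypothetical class used only for this sketch
class Pair {
  public string Name;
  public int Value;

  public Pair(string name, int value) {
    Name = name;
    Value = value;
  }

  // Write each field by hand, in a fixed order
  public void Save(Stream stream) {
    BinaryWriter writer = new BinaryWriter(stream);
    writer.Write(Name);
    writer.Write(Value);
    writer.Flush(); // Flush the buffer
  }

  // Read each field back in exactly the same order
  public static Pair Load(Stream stream) {
    BinaryReader reader = new BinaryReader(stream);
    string name = reader.ReadString();
    int value = reader.ReadInt32();
    return new Pair(name, value);
  }
}

class PairDemo {
  static void Main() {
    using( Stream stream = new MemoryStream() ) {
      new Pair("Wahoo!", 452).Save(stream);

      // Reset the stream to the beginning
      stream.Seek(0, SeekOrigin.Begin);

      Pair copy = Pair.Load(stream);
      Console.WriteLine(copy.Name + " " + copy.Value);
    }
  }
}
```

Every field added to Pair means another line in both Save and Load, and the two have to stay in step. That hand-maintained pairing is the work a formatter takes over.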

Formatters

A formatter is an object that knows how to write arbitrary objects to a stream. The formatter promises to provide this functionality by implementing the IFormatter interface from the System.Runtime.Serialization namespace:

 

interface IFormatter {

  // Properties

  SerializationBinder Binder { get; set; }

  StreamingContext Context { get; set; }

  ISurrogateSelector SurrogateSelector { get; set; }

 

  // Methods

  object Deserialize(Stream serializationStream);

  void Serialize(Stream serializationStream, object graph);

}

 

A formatter has two jobs. The first is to serialize arbitrary objects, including their fields and their fields' fields, all the way down. It knows which fields to serialize using Reflection, which is the .NET API for finding out information about a type at run-time. An object is written to a stream with the Serialize method and read from a stream with the Deserialize method.

 

The second job of a formatter is to translate the data into some format at the byte level. The .NET Framework provides two formatters, the BinaryFormatter and the SoapFormatter.

 

The BinaryFormatter, from the System.Runtime.Serialization.Formatters.Binary namespace, writes the data in a binary format, just like the BinaryWriter. The SoapFormatter, from the System.Runtime.Serialization.Formatters.Soap namespace, writes data in XML according to the Simple Object Access Protocol (SOAP) Specification. While SOAP is the core protocol of web services, using the SOAP formatter for the purposes of serializing settings or document data has nothing to do with web services or even the web. However, it is a handy format for a human to read the data back out again.

 

There is one stipulation on any type that a formatter is to serialize. It must be marked with the SerializableAttribute or the formatter will throw a run-time exception. Once the type (and the type of any contained field) is marked as serializable, serializing an object is a matter of creating a formatter and asking it to serialize the object:

 

using System.IO;              // Stream, FileStream

using System.Runtime.Serialization;

using System.Runtime.Serialization.Formatters;

using System.Runtime.Serialization.Formatters.Soap;

using System.Windows.Forms;   // MessageBox

...

[SerializableAttribute]

class MyData {

  public string s = "Wahoo!";

  public int n = 452;

}

 

static void DoSerialize() {

  MyData data = new MyData();

  using( Stream stream =

           new FileStream(@"c:\temp\mydata.xml", FileMode.Create) ) {

    // Write to the stream

    IFormatter formatter = new SoapFormatter();

    formatter.Serialize(stream, data);

 

    // Reset the stream to the beginning

    stream.Seek(0, SeekOrigin.Begin);

 

    // Read from the stream

    MyData data2 = (MyData)formatter.Deserialize(stream);

 

    // Do something with the data

    MessageBox.Show(data2.s + " " + data2.n);

  }

}

 

After creating the formatter, the code makes a call to Serialize, which writes the type information for the MyData object and then recursively writes all of the data for the fields of the object. Reading the object back out is accomplished with a call to Deserialize and a cast to the top-level object, which reads all fields recursively.

 

Because we chose the SOAP formatter and a FileStream, we can examine the data that the formatter wrote:

 

<SOAP-ENV:Envelope ...>

<SOAP-ENV:Body>

<a1:Form1_x002B_MyData id="ref-1" ...>

<s id="ref-3">Wahoo!</s>

<n>452</n>

</a1:Form1_x002B_MyData>

</SOAP-ENV:Body>

</SOAP-ENV:Envelope>

 

Here we can see that an instance of Form1.MyData was written and that it contains two fields, one called s with a value “Wahoo!” and one called n with a value “452”, which was just what the code meant to write.

Non-Serialized

We have some control over what the formatter writes, although probably not the way that you’d expect. For example, if we decide that we want to serialize the MyData class, but not the n field, we can’t stop the formatter by marking the field as protected or private. To be consistent at deserialization, an object is going to need the protected and private fields just as much as it needs the public ones (in fact, fields shouldn’t be public at all!). However, applying the NonSerializedAttribute to a field will cause it to be skipped by the formatter:

 

[SerializableAttribute]

class MyData {

  public string s = "Wahoo!";

  [NonSerialized] public int n = 452;

}

 

Serializing an instance of this type shows that the formatter is skipping the non-serialized field:

 

<SOAP-ENV:Envelope ...>

<SOAP-ENV:Body>

<a1:Form1_x002B_MyData id="ref-1" ...>

<s id="ref-3">Wahoo!</s>

</a1:Form1_x002B_MyData>

</SOAP-ENV:Body>

</SOAP-ENV:Envelope>

IDeserializationCallback

Good candidates for the non-serialized attribute are fields that are calculated, cached or transient, since they don’t need to be stored. However, when an object is deserialized, the non-serialized fields may need to be recalculated to put the object into a valid state. For example, expanding the duties of the n field of the MyData type to be a cache of the s field’s length, there’s no need to persist n, as it can be recalculated at any time.

 

However, the MyData object needs to be notified when s changes to keep n valid. Using properties keeps the n and s fields in sync until an object is deserialized, at which point only s is set, and not through the property. MyData can implement IDeserializationCallback to get that notification:

 

[SerializableAttribute]

class MyData : IDeserializationCallback {

  string s = "Wahoo!";

  [NonSerialized] int n = 6;

 

  public string String {

    get { return s; }

    set { s = value; n = s.Length; }

  }

 

  public int Length {

    get { return n; }

  }

 

  #region Implementation of IDeserializationCallback

  public void OnDeserialization(object sender) {

    // Cache the string's length

    n = s.Length;

  }

  #endregion

}

 

If you’ve got any fields marked as non-serialized, chances are you should be handling IDeserializationCallback to set those fields at deserialization-time.
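To see the callback at work, here is a small sketch that round-trips a MyData object through a BinaryFormatter and a MemoryStream and then checks the cached length on the copy. The CallbackDemo class and its RoundTrip helper are made up for this example, and the sketch assumes the .NET Framework used throughout this answer, where BinaryFormatter is available:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class MyData : IDeserializationCallback {
  string s = "Wahoo!";
  [NonSerialized] int n = 6;

  public string String {
    get { return s; }
    set { s = value; n = s.Length; }
  }

  public int Length {
    get { return n; }
  }

  public void OnDeserialization(object sender) {
    // Recalculate the cached length, which was not stored
    n = s.Length;
  }
}

class CallbackDemo {
  // Hypothetical helper: serialize to memory, then deserialize a copy
  public static MyData RoundTrip(MyData data) {
    using( Stream stream = new MemoryStream() ) {
      IFormatter formatter = new BinaryFormatter();
      formatter.Serialize(stream, data);

      // Reset the stream to the beginning
      stream.Seek(0, SeekOrigin.Begin);

      return (MyData)formatter.Deserialize(stream);
    }
  }

  static void Main() {
    MyData data = new MyData();
    data.String = "Serialization";

    MyData copy = RoundTrip(data);
    Console.WriteLine(copy.String + " " + copy.Length);
  }
}
```

Only s travels through the stream. The copy's n is whatever OnDeserialization sets it to, so taking the callback out is a quick way to see what the formatter leaves in a non-serialized field.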

ISerializable

To gain even more control over the serialization process, you can implement the ISerializable interface and a special constructor:

 

[SerializableAttribute]

class MyData : ISerializable {

  string s = "Wahoo!";

  int n = 6;

 

  public string String {

    get { return s; }

    set { s = value; n = s.Length; }

  }

 

  public int Length {

    get { return n; }

  }

 

  public MyData() {}

 

  #region Implementation of ISerializable

  public MyData(

    SerializationInfo info, StreamingContext context) {

    // Get value from name/value pairs

    s = info.GetString("MyString");

 

    // Cache the string's length

    n = s.Length;

  }

 

  public void GetObjectData(

    SerializationInfo info, StreamingContext context) {

    // Add value to name/value pairs

    info.AddValue("MyString", s);

  }

  #endregion

}

 

Implementing ISerializable.GetObjectData puts your class on the hook to populate the name/value pairs that the formatter is using to fill the stream during serialization. GetObjectData is provided with two pieces of information: a place to put the fields to serialize (called the serialization information) and where the object is going (called the context state). GetObjectData should add all of the fields to the serialization info that it would like to have serialized, naming each one. It’s this name that’s used by the formatter to write the data:

 

<SOAP-ENV:Envelope ...>

<SOAP-ENV:Body>

<a1:Form1_x002B_MyData id="ref-1" ...>

<MyString id="ref-3">Wahoo!</MyString>

</a1:Form1_x002B_MyData>

</SOAP-ENV:Body>

</SOAP-ENV:Envelope>

 

Deserialization happens with the special constructor that also takes a serialization info and a context state, this time to pull the data out. The SerializationInfo class provides several methods for pulling out typed data. For built-in types, the specific method, like GetString, can be used. For general types, the GetValue method can be used. For example, the following two lines of code are equivalent, with the latter the only choice for custom types:

 

s = info.GetString("MyString");

s = (string)info.GetValue("MyString", typeof(string));

Data Versioning

While the types that hold the data are always subject to the .NET rules of versioning, that doesn’t really help you when it comes to reading and writing old versions of the data using new versions of the object. To support that, all you can do is write a version ID into the stream as part of the implementation of ISerializable:

 

[SerializableAttribute]

class MyData : ISerializable {

  string s = "Wahoo!";

  int n = 6;

  ArrayList oldStrings = new ArrayList(); // v2.0 addition

  static string version = "2.0";

  ...

#region Implementation of ISerializable

  public MyData(

    SerializationInfo info, StreamingContext context) {

    // Read the data based on the version

    string version = info.GetString("Version");

    switch( version ) {

      case "1.0":

        s = info.GetString("MyString");

        n = s.Length;

        break;

 

      case "2.0":

        s = info.GetString("MyString");

        n = s.Length;

        oldStrings =

          (ArrayList)info.GetValue("OldStrings", typeof(ArrayList));

        break;

    }

  }

 

  public void GetObjectData(

    SerializationInfo info, StreamingContext context) {

    // Tag the data with a version

    info.AddValue("Version", version);

    info.AddValue("MyString", s);

    info.AddValue("OldStrings", oldStrings);

  }

#endregion

}

 

This implementation writes a Version string on all of the data it writes, then uses that string to decide what data to read back in at run-time. As the data in a class changes, marking it with a version provides a way to migrate old data forward (and even save to old versions, if you’d like).

 

Notice also that the new hunk of data added was an ArrayList. Just like the simple types, the collection classes, along with a large number of other classes in the .NET Framework, can be serialized, making this model useful for all kinds of things, including storing user settings in isolated storage.
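Since the collection classes can be serialized, simple name/value settings of the kind the question asks about don't need a custom format at all. As a rough sketch, a Hashtable can be handed straight to a formatter. The SettingsDemo class, its Save and Load helpers, and the setting names are all invented for this example, which again assumes the .NET Framework's BinaryFormatter:

```csharp
using System;
using System.Collections;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

class SettingsDemo {
  // Hypothetical helper: write a table of name/value pairs to a stream
  public static void Save(Stream stream, Hashtable settings) {
    IFormatter formatter = new BinaryFormatter();
    formatter.Serialize(stream, settings);
  }

  // Hypothetical helper: read the table back from a stream
  public static Hashtable Load(Stream stream) {
    IFormatter formatter = new BinaryFormatter();
    return (Hashtable)formatter.Deserialize(stream);
  }

  static void Main() {
    // Made-up setting names, for illustration only
    Hashtable settings = new Hashtable();
    settings["UserName"] = "Anne";
    settings["FontSize"] = 12;

    using( Stream stream = new MemoryStream() ) {
      Save(stream, settings);

      // Reset the stream to the beginning
      stream.Seek(0, SeekOrigin.Begin);

      Hashtable copy = Load(stream);
      Console.WriteLine(copy["UserName"] + " " + copy["FontSize"]);
    }
  }
}
```

The MemoryStream is only there to keep the sketch free of side effects. Because Save and Load take a Stream, any other stream, including an isolated storage stream, can be passed in instead.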

Settings & Streams

The combination of an isolated storage stream and formatters makes for a great way to store user settings, like the state of a form when it's closed and opened:

 

using System.Runtime.Serialization;

using System.Runtime.Serialization.Formatters;

using System.Runtime.Serialization.Formatters.Soap;

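using System.IO;

using System.IO.IsolatedStorage;

using System.ComponentModel;   // CancelEventArgs

using System.Drawing;          // Point, Size

using System.Windows.Forms;    // Form, FormWindowState, MessageBox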

...

// Custom type to manage serializable form data

[SerializableAttribute]

class FormData {

  public Point Location;

  public Size ClientSize;

  public FormWindowState WindowState;

 

  public FormData(Form form) {

    this.Location = form.Location;

    this.ClientSize = form.ClientSize;

    this.WindowState = form.WindowState;

  }

}

 

void MainForm_Closing(object sender, CancelEventArgs e) {

  // Save the form's position before it closes

  IsolatedStorageFile store =

    IsolatedStorageFile.GetUserStoreForAssembly();

  using( Stream stream =

            new IsolatedStorageFileStream("MainForm.txt",

            FileMode.Create,

            store) ) {

    // Restore the window state to save location and

    // client size at restored state

    FormWindowState state = this.WindowState;

    this.WindowState = FormWindowState.Normal;

 

    // Serialize custom FormData object

    IFormatter formatter = new SoapFormatter();

    FormData data = new FormData(this);

    data.WindowState = state; // Save the state the form was closed in

    formatter.Serialize(stream, data);

  }

}

 

void MainForm_Load(object sender, EventArgs e) {

  // Restore the form's position

  try {

    IsolatedStorageFile store =

      IsolatedStorageFile.GetUserStoreForAssembly();

    using( Stream stream =

              new IsolatedStorageFileStream("MainForm.txt",

              FileMode.Open,

              store) ) {

      // Don't let the form's position be set automatically

      this.StartPosition = FormStartPosition.Manual;

 

      // Deserialize custom FormData object

      IFormatter formatter = new SoapFormatter();

      FormData data = (FormData)formatter.Deserialize(stream);

 

      // Set data from FormData object

      this.Location = data.Location;

      this.ClientSize = data.ClientSize;

      this.WindowState = data.WindowState;

    }

  }

  // Don't let missing settings scare the user

  catch( Exception ex ) {

    MessageBox.Show(ex.Message, ex.GetType().Name);

  }

}

 

To have something to serialize, we’ve got a custom type called FormData which keeps track of the location, client size and window state. When it’s time to save the form data, the code creates an instance of the new type and then hands it to the formatter, along with the file stream opened up in the special folder. Likewise, when loading, we use a formatter to deserialize the form data and use it to restore the form.

How Did I Figure This Out?

Streams have been a common serialization abstraction since C++ and COM. The streams in .NET are an extension and simplification of those. It was the history I already knew, along with some research into what was different about streams in .NET from what I was familiar with, that made it pretty obvious what the .NET serialization story was. I find that the more you dig into any one technology, e.g. C++ streams, the easier it is to figure out how the next new technology works when it comes along. Building on a base of knowledge, it's just a matter of figuring out what was different about .NET streams. I didn't have to learn anything from scratch.
