# Chapter 24 Runtime Serialization
# Serialization/Deserialization Quick Start
Let’s start off by looking at some code.
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;
    internal static class QuickStart {
        public static void Main() {
            // Create a graph of objects to serialize them to the stream
            var objectGraph = new List<String> { "Jeff", "Kristin", "Aidan", "Grant" };
            Stream stream = SerializeToMemory(objectGraph);
            // Reset everything for this demo
            stream.Position = 0;
            objectGraph = null;
            // Deserialize the objects and prove it worked
            objectGraph = (List<String>) DeserializeFromMemory(stream);
            foreach (var s in objectGraph) Console.WriteLine(s);
        }
        private static MemoryStream SerializeToMemory(Object objectGraph) {
            // Construct a stream that is to hold the serialized objects
            MemoryStream stream = new MemoryStream();
            // Construct a serialization formatter that does all the hard work
            BinaryFormatter formatter = new BinaryFormatter();
            // Tell the formatter to serialize the objects into the stream
            formatter.Serialize(stream, objectGraph);
            // Return the stream of serialized objects back to the caller
            return stream;
        }
        private static Object DeserializeFromMemory(Stream stream) {
            // Construct a serialization formatter that does all the hard work
            BinaryFormatter formatter = new BinaryFormatter();
            // Tell the formatter to deserialize the objects from the stream
            return formatter.Deserialize(stream);
        }
    }
Wow, look how simple this is! The SerializeToMemory method constructs a System.IO.MemoryStream object. This object identifies where the serialized block of bytes is to be placed. Then the method constructs a BinaryFormatter object (which can be found in the System.Runtime.Serialization.Formatters.Binary namespace). A formatter is a type (implementing the System.Runtime.Serialization.IFormatter interface) that knows how to serialize and deserialize an object graph. The Framework Class Library (FCL) ships with two formatters: the BinaryFormatter (used in this code example) and a SoapFormatter (which can be found in the System.Runtime.Serialization.Formatters.Soap namespace and is implemented in the System.Runtime.Serialization.Formatters.Soap.dll assembly).
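💡 Note: The SoapFormatter class has been obsolete since .NET Framework 3.5; do not use it in production code. It can still be useful when debugging serialization code, because it produces XML text that is easy to read. To serialize and deserialize XML in production code, see the XmlSerializer and DataContractSerializer classes.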
To serialize a graph of objects, just call the formatter’s Serialize method and pass it two things: a reference to a stream object and a reference to the object graph that you want to serialize. The stream object identifies where the serialized bytes should be placed and can be an object of any type derived from the System.IO.Stream abstract base class. This means that you can serialize an object graph to a MemoryStream, a FileStream, a NetworkStream, and so on.
The second parameter to Serialize is a reference to an object. This object could be anything: an Int32, a String, a DateTime, an Exception, a List, a Dictionary, and so on. The object referred to by the objectGraph parameter may refer to other objects. For example, objectGraph may refer to a collection that refers to a set of objects. These objects may also refer to other objects. When the formatter’s Serialize method is called, all objects in the graph are serialized to the stream.
Formatters know how to serialize the complete object graph by referring to the metadata that describes each object’s type. The Serialize method uses reflection to see what instance fields are in each object’s type as it is serialized. If any of these fields refer to other objects, then the formatter’s Serialize method knows to serialize these objects, too.
Formatters have very intelligent algorithms. They know to serialize each object in the graph no more than once out to the stream. That is, if two objects in the graph refer to each other, then the formatter detects this, serializes each object just once, and avoids entering into an infinite loop.
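The cycle handling described above can be checked with a small experiment. The following is a minimal sketch, not code from the chapter: the Node type and the CycleDemo class are hypothetical names introduced for illustration, and the code assumes a runtime on which BinaryFormatter is available and enabled (for example, .NET Framework).

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type used only for this illustration
[Serializable]
public sealed class Node {
    public String Name;
    public Node Partner;   // May refer back to the object that refers to this one
}

public static class CycleDemo {
    // Serializes the graph rooted at 'root' and immediately deserializes it
    public static Node RoundTrip(Node root) {
        using (var stream = new MemoryStream()) {
            var formatter = new BinaryFormatter();
            formatter.Serialize(stream, root);
            stream.Position = 0;
            return (Node) formatter.Deserialize(stream);
        }
    }

    public static void Main() {
        var a = new Node { Name = "A" };
        var b = new Node { Name = "B", Partner = a };
        a.Partner = b;   // a -> b -> a forms a cycle

        Node copy = RoundTrip(a);

        // The copy is a new graph with the same shape as the original:
        // following both references leads back to the copy's own root.
        Console.WriteLine(Object.ReferenceEquals(copy.Partner.Partner, copy)); // True
        Console.WriteLine(Object.ReferenceEquals(copy, a));                    // False
    }
}
```

If the formatter wrote an object every time it encountered a reference to it, Serialize would never return for this graph; the first check shows that the shared reference survives the round trip.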
In my SerializeToMemory method, when the formatter’s Serialize method returns, the MemoryStream is simply returned to the caller. The application uses the contents of this flat byte array any way it wants. For example, it could save it in a file, copy it to the clipboard, send it over a wire, or whatever.
The DeserializeFromMemory method deserializes a stream back into an object graph. This method is even simpler than serializing an object graph. In this code, a BinaryFormatter is constructed and then its Deserialize method is called. This method takes the stream as a parameter and returns a reference to the root object within the deserialized object graph.
Internally, the formatter’s Deserialize method examines the contents of the stream, constructs instances of all the objects that are in the stream, and initializes the fields in all these objects so that they have the same values they had when the object graph was serialized. Typically, you will cast the object reference returned from the Deserialize method into the type that your application is expecting.
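💡 Note: Here is an interesting and useful method that uses serialization to make a deep copy (clone) of an object:
    private static Object DeepClone(Object original) {
        // Construct a temporary memory stream
        using (MemoryStream stream = new MemoryStream()) {
            // Construct a serialization formatter that does all the hard work
            BinaryFormatter formatter = new BinaryFormatter();
            // This line is explained in this chapter's "Streaming Contexts" section (Section 24.6)
            formatter.Context = new StreamingContext(StreamingContextStates.Clone);
            // Serialize the object graph into the memory stream
            formatter.Serialize(stream, original);
            // Seek back to the start of the memory stream before deserializing
            stream.Position = 0;
            // Deserialize the graph into a new set of objects and
            // return the root of the graph (deep copy) to the caller
            return formatter.Deserialize(stream);
        }
    }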
At this point, I’d like to add a few notes to our discussion. First, it is up to you to ensure that your code uses the same formatter for both serialization and deserialization. For example, don’t write code that serializes an object graph by using the SoapFormatter and then deserializes the graph by using the BinaryFormatter. If Deserialize can’t decipher the contents of the stream, then a System.Runtime.Serialization.SerializationException exception will be thrown.
The second thing I’d like to point out is that it is possible and also quite useful to serialize multiple object graphs out to a single stream. For example, let’s say that we have the following two class definitions.
    [Serializable] internal sealed class Customer { /* ... */ }
    [Serializable] internal sealed class Order { /* ... */ }
And then, in the main class of our application, we define the following static fields.
    private static List<Customer> s_customers = new List<Customer>();
    private static List<Order> s_pendingOrders = new List<Order>();
    private static List<Order> s_processedOrders = new List<Order>();
We can now serialize our application’s state to a single stream with a method that looks like this.
    private static void SaveApplicationState(Stream stream) {
        // Construct a serialization formatter that does all the hard work
        BinaryFormatter formatter = new BinaryFormatter();
        // Serialize our application's entire state
        formatter.Serialize(stream, s_customers);
        formatter.Serialize(stream, s_pendingOrders);
        formatter.Serialize(stream, s_processedOrders);
    }
To reconstruct our application’s state, we would deserialize the state with a method that looks like this.
    private static void RestoreApplicationState(Stream stream) {
        // Construct a serialization formatter that does all the hard work
        BinaryFormatter formatter = new BinaryFormatter();
        // Deserialize our application's entire state (same order as serialized)
        s_customers = (List<Customer>) formatter.Deserialize(stream);
        s_pendingOrders = (List<Order>) formatter.Deserialize(stream);
        s_processedOrders = (List<Order>) formatter.Deserialize(stream);
    }
The third and last thing I’d like to point out has to do with assemblies. When serializing an object, the full name of the type and the name of the type’s defining assembly are written to the stream. By default, BinaryFormatter outputs the assembly’s full identity, which includes the assembly’s file name (without extension), version number, culture, and public key information. When deserializing an object, the formatter first grabs the assembly identity and ensures that the assembly is loaded into the executing AppDomain by calling System.Reflection.Assembly’s Load method (discussed in Chapter 23, “Assembly Loading and Reflection”).
After an assembly has been loaded, the formatter looks in the assembly for a type matching that of the object being deserialized. If the assembly doesn’t contain a matching type, an exception is thrown and no more objects can be deserialized. If a matching type is found, an instance of the type is created and its fields are initialized from the values contained in the stream. If the type’s fields don’t exactly match the names of the fields as read from the stream, then a SerializationException exception is thrown and no more objects can be deserialized. Later in this chapter, I’ll discuss some sophisticated mechanisms that allow you to override some of this behavior.
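💡 Important: Some extensible applications use Assembly.LoadFrom to load an assembly and then construct objects from types defined in the loaded assembly. These objects can be serialized to a stream without any trouble. However, when deserializing, the formatter loads the assembly by calling Assembly’s Load method (not LoadFrom). In most cases, the CLR will be unable to locate the assembly file, and a SerializationException is thrown. Many developers are puzzled by this: serialization worked, so they naturally expect deserialization to work too.
If your application serializes objects whose types are defined in an assembly loaded with Assembly.LoadFrom, I recommend that, before calling the formatter’s Deserialize method, you implement a method whose signature matches the System.ResolveEventHandler delegate and register it with System.AppDomain’s AssemblyResolve event. (Unregister the method as soon as Deserialize returns.) Now, whenever the formatter fails to load an assembly, the CLR calls your ResolveEventHandler method and passes it the identity of the assembly that failed to load. The method can extract the assembly’s file name from the identity and use it to construct the path where the application knows the file can be found. The method can then call Assembly.LoadFrom to load the assembly and return a reference to the resulting Assembly.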
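The recommendation in the note above might be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the PluginDeserializer class, the PluginDirectory location, and the assumption that each assembly lives in a file named after its simple name with a .dll extension are all hypothetical. The code also assumes a runtime on which BinaryFormatter is available and enabled.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.Serialization.Formatters.Binary;

public static class PluginDeserializer {
    // Hypothetical: the directory the application originally passed to Assembly.LoadFrom
    public static String PluginDirectory =
        Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Plugins");

    // The signature matches the System.ResolveEventHandler delegate
    public static Assembly ResolvePluginAssembly(Object sender, ResolveEventArgs args) {
        // args.Name holds the identity of the assembly that failed to load, for example
        // "MyPlugin, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"
        String simpleName = new AssemblyName(args.Name).Name;

        // Assumption: the file is named <simple name>.dll inside PluginDirectory
        String path = Path.Combine(PluginDirectory, simpleName + ".dll");

        // Returning null tells the CLR that this handler could not resolve the assembly
        return File.Exists(path) ? Assembly.LoadFrom(path) : null;
    }

    public static Object Deserialize(Stream stream) {
        // Register the handler only for the duration of the Deserialize call
        AppDomain.CurrentDomain.AssemblyResolve += ResolvePluginAssembly;
        try {
            return new BinaryFormatter().Deserialize(stream);
        }
        finally {
            AppDomain.CurrentDomain.AssemblyResolve -= ResolvePluginAssembly;
        }
    }
}
```

Wrapping the call in try/finally ensures the handler is unregistered even when Deserialize throws, which keeps the handler from affecting unrelated assembly loads elsewhere in the AppDomain.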
This section covered the basics of how to serialize and deserialize object graphs. In the remaining sections, we’ll look at what you must do in order to define your own serializable types, and we’ll also look at various mechanisms that allow you to have greater control over serialization and deserialization.
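💡 Summary: Serialization is the process of converting an object or an object graph into a stream of bytes; deserialization is the process of converting a stream of bytes back into an object graph. Being able to convert between objects and byte streams is very useful: once objects have been serialized into an in-memory byte stream, the data can easily be processed further, for example by encrypting or compressing it. A formatter is a type that implements the System.Runtime.Serialization.IFormatter interface and knows how to serialize and deserialize an object graph. The FCL ships with two formatters: BinaryFormatter and SoapFormatter (defined in the System.Runtime.Serialization.Formatters.Soap namespace and implemented in the System.Runtime.Serialization.Formatters.Soap.dll assembly).
To serialize an object graph, call the formatter’s Serialize method and pass it a reference to a stream object and a reference to the object graph. The stream identifies where the serialized bytes should be placed and can be an object of any type derived from the System.IO.Stream abstract base class, such as a MemoryStream, a FileStream, or a NetworkStream. The second parameter is a reference to any object: an Int32, a String, a DateTime, an Exception, a List<String>, a Dictionary<Int32, DateTime>, and so on. The object referred to by the objectGraph parameter may refer to other objects (for example, a collection that refers to a set of objects, which in turn refer to other objects), and Serialize writes every object in the graph to the stream. The formatter works out how to serialize the complete graph from the metadata describing each object’s type: Serialize uses reflection to discover the instance fields of each type, and any objects those fields refer to are serialized as well.
The formatter’s Deserialize method examines the contents of the stream, constructs instances of all the objects in it, and initializes their fields to the values they had when the graph was serialized. You typically cast the returned reference to the type your application expects.
Three points are worth remembering. First, you must use the same formatter for serialization and deserialization; for example, do not serialize a graph with SoapFormatter and deserialize it with BinaryFormatter. If Deserialize cannot decipher the stream, it throws a System.Runtime.Serialization.SerializationException. Second, you can serialize multiple object graphs to a single stream, which is often useful. Third, when an object is serialized, the full name of its type and the identity of the type’s defining assembly are written to the stream. By default, BinaryFormatter outputs the assembly’s full identity, including its file name (without extension), version number, culture, and public key information. When deserializing, the formatter first obtains the assembly identity and ensures that the assembly is loaded into the executing AppDomain by calling System.Reflection.Assembly’s Load method.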
# Making a Type Serializable
When a type is designed, the developer must make the conscious decision as to whether or not to allow instances of the type to be serializable. By default, types are not serializable. For example, the following code does not perform as expected.
    internal struct Point { public Int32 x, y; }
    private static void OptInSerialization() {
        Point pt = new Point { x = 1, y = 2 };
        using (var stream = new MemoryStream()) {
            new BinaryFormatter().Serialize(stream, pt); // throws SerializationException
        }
    }
If you were to build and run this code in your program, you’d see that the formatter’s Serialize method throws a System.Runtime.Serialization.SerializationException exception. The problem is that the developer of the Point type has not explicitly indicated that Point objects may be serialized. To solve this problem, the developer must apply the System.SerializableAttribute custom attribute to this type as follows. (Note that this attribute is defined in the System namespace, not the System.Runtime.Serialization namespace.)
    [Serializable]
    internal struct Point { public Int32 x, y; }
Now, if we rebuild the application and run it, it does perform as expected and the Point objects will be serialized to the stream. When serializing an object graph, the formatter checks that every object’s type is serializable. If any object in the graph is not serializable, the formatter’s Serialize method throws the SerializationException exception.
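💡 Note: When serializing an object graph, some objects’ types may be serializable while others are not. For performance reasons, the formatter does not verify up front that every object in the graph is serializable. It is therefore entirely possible for some objects to have been written to the stream before the SerializationException is thrown, in which case the stream contains corrupt data. If you think some objects in a graph might not be serializable, your code should be able to recover gracefully from this situation. One option is to serialize the graph into a MemoryStream first and then, only if every object serializes successfully, copy the bytes from the MemoryStream to the real destination stream (such as a file or the network).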
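The buffering approach suggested in the note above could look something like this. It is a minimal sketch: SafeSerializer, TrySerialize, and the Unmarked type are hypothetical names, and the code assumes a runtime on which BinaryFormatter is available and enabled, so that a non-serializable object results in a SerializationException.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type that deliberately lacks [Serializable]
public sealed class Unmarked { }

public static class SafeSerializer {
    // Writes to 'destination' only if the entire graph serializes successfully
    public static Boolean TrySerialize(Stream destination, Object graph) {
        using (var buffer = new MemoryStream()) {
            try {
                new BinaryFormatter().Serialize(buffer, graph);
            }
            catch (SerializationException) {
                // Any partial output stays in 'buffer'; 'destination' is untouched
                return false;
            }
            // Every object serialized, so copy the bytes to the real target
            buffer.WriteTo(destination);
            return true;
        }
    }

    public static void Main() {
        var target = new MemoryStream();
        // The array is serializable, but one of its elements is not
        Boolean ok = TrySerialize(target, new Object[] { "fine", new Unmarked() });
        Console.WriteLine("{0}, bytes written: {1}", ok, target.Length);
    }
}
```

The cost of this design is that the whole serialized graph is held in memory before anything reaches the destination, which may matter for very large graphs.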
The SerializableAttribute custom attribute may be applied to reference types (class), value types (struct), enumerated types (enum), and delegate types (delegate) only. (Note that enumerated and delegate types are always serializable, so there is no need to explicitly apply the SerializableAttribute attribute to these types.) In addition, the SerializableAttribute attribute is not inherited by derived types. So, given the following two type definitions, a Person object can be serialized, but an Employee object cannot.
    [Serializable]
    internal class Person { ... }
    internal class Employee : Person { ... }
To fix this, you would just apply the SerializableAttribute attribute to the Employee type as well.
    [Serializable]
    internal class Person { ... }
    [Serializable]
    internal class Employee : Person { ... }
Note that this problem was easy to fix. However, the reverse—defining a type derived from a base type that doesn’t have the SerializableAttribute attribute applied to it—is not easy to fix. But, this is by design; if the base type doesn’t allow instances of its type to be serialized, its fields cannot be serialized, because a base object is effectively part of the derived object. This is why System.Object has the SerializableAttribute attribute applied to it.
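One quick way to see that the attribute is not inherited is to query Type’s IsSerializable property. The sketch below uses its own hypothetical Person, Employee, and Manager types (public versions defined only for this illustration) and is not code from the chapter.

```csharp
using System;

[Serializable] public class Person { }
public class Employee : Person { }                  // Attribute is NOT inherited
[Serializable] public class Manager : Person { }    // Attribute applied explicitly

public static class InheritanceDemo {
    public static void Main() {
        Console.WriteLine(typeof(Person).IsSerializable);    // True
        Console.WriteLine(typeof(Employee).IsSerializable);  // False
        Console.WriteLine(typeof(Manager).IsSerializable);   // True
    }
}
```

A check like this can be handy in a unit test that guards against someone adding a derived type and forgetting the attribute.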
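💡 Note: In general, it is recommended that most of the types you define be serializable, because this gives users of your types a lot of flexibility. Be aware, however, that serialization reads all of an object’s fields, whether they are declared public, protected, internal, or private. If instances of a type contain sensitive or secure data (such as passwords), or data that has no meaning or value once transferred, do not make the type serializable.
If you are using a type that was not designed for serialization and you do not have its source code to add serialization support, all is not lost. In the “Overriding the Assembly and/or Type When Deserializing an Object” section at the end of this chapter (Section 24.9), I explain how to make any non-serializable type serializable.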
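💡 Summary: When designing a type, the developer must decide deliberately whether instances of the type may be serialized. Types are not serializable by default; to opt in, apply the System.SerializableAttribute custom attribute to the type (note that this attribute is defined in the System namespace, not System.Runtime.Serialization). When serializing an object graph, the formatter checks that every object’s type is serializable, and Serialize throws a SerializationException if any object is not. SerializableAttribute can be applied only to reference types (class), value types (struct), enumerated types (enum), and delegate types (delegate). Enumerated and delegate types are always serializable, so there is no need to apply the attribute to them explicitly. The attribute is not inherited by derived types. If a base type does not allow its instances to be serialized, its fields cannot be serialized, because a base object is effectively part of the derived object; this is why System.Object has SerializableAttribute applied to it.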
# Controlling Serialization and Deserialization
When you apply the SerializableAttribute custom attribute to a type, all instance fields (public, private, protected, and so on) are serialized. However, a type may define some instance fields that should not be serialized. In general, there are two reasons why you would not want some of a type’s instance fields to be serialized:
- The field contains information that would not be valid when deserialized. For example, an object that contains a handle to a Windows kernel object (such as a file, process, thread, mutex, event, semaphore, and so on) would have no meaning when deserialized into another process or machine because Windows’ kernel handles are process-relative values.
- The field contains information that is easily calculated. In this case, you select which fields do not need to be serialized, thus improving your application’s performance by reducing the amount of data transferred.
The following code uses the System.NonSerializedAttribute custom attribute to indicate which fields of the type should not be serialized. (Note that this attribute is also defined in the System namespace, not the System.Runtime.Serialization namespace.)
    [Serializable]
    internal class Circle {
        private Double m_radius;
        [NonSerialized]
        private Double m_area;
        public Circle(Double radius) {
            m_radius = radius;
            m_area = Math.PI * m_radius * m_radius;
        }
        ...
    }
In the preceding code, objects of Circle may be serialized. However, the formatter will serialize the values in the object’s m_radius field only. The value in the m_area field will not be serialized because it has the NonSerializedAttribute attribute applied to it. This attribute can be applied only to a type’s fields, and it continues to apply to this field when inherited by another type. Of course, you may apply the NonSerializedAttribute attribute to multiple fields within a type.
So, let’s say that our code constructs a Circle object as follows.
    Circle c = new Circle(10);
Internally, the m_area field is set to a value of approximately 314.159. When this object gets serialized, only the value of the m_radius field (10) gets written to the stream. This is exactly what we want, but now we have a problem when the stream is deserialized back into a Circle object. When deserialized, the Circle object will get its m_radius field set to 10, but its m_area field will be initialized to 0—not 314.159!
The following code demonstrates how to modify the Circle type to fix this problem.
    [Serializable]
    internal class Circle {
        private Double m_radius;
        [NonSerialized]
        private Double m_area;
        public Circle(Double radius) {
            m_radius = radius;
            m_area = Math.PI * m_radius * m_radius;
        }
        [OnDeserialized]
        private void OnDeserialized(StreamingContext context) {
            m_area = Math.PI * m_radius * m_radius;
        }
    }
I’ve changed Circle so that it now contains a method marked with the System.Runtime.Serialization.OnDeserializedAttribute custom attribute. Whenever an instance of a type is deserialized, the formatter checks whether the type defines a method with this attribute on it and then the formatter invokes this method. When this method is called, all the serializable fields will be set correctly, and they may be accessed to perform any additional work that would be necessary to fully deserialize the object.
In the preceding modified version of Circle, I made the OnDeserialized method simply calculate the area of the circle by using the m_radius field and place the result in the m_area field. Now, m_area will have the desired value of 314.159.
In addition to the OnDeserializedAttribute custom attribute, the System.Runtime.Serialization namespace also defines OnSerializingAttribute, OnSerializedAttribute, and OnDeserializingAttribute custom attributes, which you can apply to your type’s methods to have even more control over serialization and deserialization. Here is a sample class that applies each of these attributes to a method.
    [Serializable]
    public class MyType {
        Int32 x, y; [NonSerialized] Int32 sum;
        public MyType(Int32 x, Int32 y) {
            this.x = x; this.y = y; sum = x + y;
        }
        [OnDeserializing]
        private void OnDeserializing(StreamingContext context) {
            // Example: Set default values for fields in a new version of this type
        }
        [OnDeserialized]
        private void OnDeserialized(StreamingContext context) {
            // Example: Initialize transient state from fields
            sum = x + y;
        }
        [OnSerializing]
        private void OnSerializing(StreamingContext context) {
            // Example: Modify any state before serializing
        }
        [OnSerialized]
        private void OnSerialized(StreamingContext context) {
            // Example: Restore any state after serializing
        }
    }
Whenever you use any of these four attributes, the method you define must take a single StreamingContext parameter (discussed in the “Streaming Contexts” section later in this chapter) and return void. The name of the method can be anything you want it to be. Also, you should declare the method as private to prevent it from being called by normal code; the formatters run with enough security that they can call private methods.
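💡 Note: When serializing a set of objects, the formatter first calls all of the objects’ methods marked with the OnSerializing attribute. Next, it serializes all of the objects’ fields. Finally, it calls all of the objects’ methods marked with the OnSerialized attribute. Similarly, when deserializing a set of objects, the formatter first calls all of the objects’ methods marked with the OnDeserializing attribute, then deserializes all of the objects’ fields, and finally calls all of the objects’ methods marked with the OnDeserialized attribute.
Note also that, during deserialization, when the formatter sees a type that offers a method marked with the OnDeserialized attribute, it adds a reference to the object to an internal list. After all the objects have been deserialized, the formatter traverses this list in reverse order and calls each object’s OnDeserialized method. When this method is called, all the serializable fields will be set correctly, and they can be accessed to perform any additional work needed to fully deserialize the object. The methods are called in reverse order because this lets inner objects finish their deserialization before the outer objects that contain them.
For example, imagine a collection object (such as a Hashtable or Dictionary) that internally uses a hash table to maintain its items. The collection type can implement a method marked with the OnDeserialized attribute. Even though the collection object is deserialized first (before its items), its OnDeserialized method is called last (after all of its items’ OnDeserialized methods). This allows every item to complete deserialization with all of its fields initialized correctly, so that a good hash code can be calculated. The collection then creates its internal buckets and uses the items’ hash codes to place the items into them. The “Controlling the Serialized/Deserialized Data” section later in this chapter (Section 24.5) shows an example of how the Dictionary class uses this technique.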
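The callback ordering described in the note above can be observed for a single object with a sketch like the following. The CallbackLog and Tracked types are hypothetical and exist only for this illustration; the code assumes a runtime on which BinaryFormatter is available and enabled. A static list is used for logging because instance field initializers do not run when an object is deserialized.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical helper that records the order in which callbacks run
public static class CallbackLog {
    public static readonly List<String> Entries = new List<String>();
}

[Serializable]
public sealed class Tracked {
    public Int32 Value = 5;

    [OnSerializing]
    private void BeforeSave(StreamingContext context) { CallbackLog.Entries.Add("OnSerializing"); }

    [OnSerialized]
    private void AfterSave(StreamingContext context) { CallbackLog.Entries.Add("OnSerialized"); }

    [OnDeserializing]
    private void BeforeLoad(StreamingContext context) { CallbackLog.Entries.Add("OnDeserializing"); }

    [OnDeserialized]
    private void AfterLoad(StreamingContext context) { CallbackLog.Entries.Add("OnDeserialized"); }
}

public static class CallbackOrderDemo {
    // Round-trips one Tracked object and returns the recorded callback order
    public static List<String> Run() {
        CallbackLog.Entries.Clear();
        using (var stream = new MemoryStream()) {
            var formatter = new BinaryFormatter();
            formatter.Serialize(stream, new Tracked());
            stream.Position = 0;
            formatter.Deserialize(stream);
        }
        return CallbackLog.Entries;
    }

    public static void Main() {
        // Prints: OnSerializing, OnSerialized, OnDeserializing, OnDeserialized
        Console.WriteLine(String.Join(", ", Run()));
    }
}
```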
If you serialize an instance of a type, add a new field to the type, and then try to deserialize the object that did not contain the new field, the formatter throws a SerializationException with a message indicating that the data in the stream being deserialized has the wrong number of members. This is very problematic in versioning scenarios where it is common to add new fields to a type in a newer version. Fortunately, you can use the System.Runtime.Serialization.OptionalFieldAttribute attribute to help you.
You apply the OptionalFieldAttribute attribute to each new field you add to a type. Now, when the formatters see this attribute applied to a field, the formatters will not throw the SerializationException exception if the data in the stream does not contain the field.
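As an illustration of how this might look in practice, the sketch below adds a field to a second version of a type. CustomerRecord and its fields are hypothetical names, and pairing OptionalField with an OnDeserializing method that supplies a default is one possible approach rather than a requirement.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical type: version 1 had only m_name; version 2 adds m_email
[Serializable]
public sealed class CustomerRecord {
    private String m_name;

    [OptionalField]
    private String m_email;   // May be absent from streams written by version 1

    public CustomerRecord(String name, String email) {
        m_name = name;
        m_email = email;
    }

    public String Name  { get { return m_name; } }
    public String Email { get { return m_email; } }

    [OnDeserializing]
    private void OnDeserializing(StreamingContext context) {
        // Constructors do not run during deserialization, so set a default here.
        // If the stream contains m_email, the formatter overwrites this value;
        // if it does not, the default remains in place.
        m_email = "unknown";
    }
}
```

Without the OnDeserializing method, an m_email value missing from the stream would simply be left as null.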
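💡 Summary: When SerializableAttribute is applied to a type, all of its instance fields (public, private, protected, and so on) are serialized. A type may, however, define instance fields that should not be serialized, generally for one of two reasons. First, the field may contain information that is not valid after deserialization; for example, a handle to a Windows kernel object (such as a file, process, thread, mutex, event, or semaphore) has no meaning when deserialized into another process or machine, because kernel handles are process-relative values. Second, the field may contain information that is easily calculated; leaving such fields out reduces the amount of data transferred and improves performance. Use the System.NonSerializedAttribute custom attribute to mark fields that should not be serialized. This attribute is also defined in the System namespace (not System.Runtime.Serialization); it can be applied only to fields, it continues to apply when the field is inherited by a derived type, and it may be applied to multiple fields in a type.
Do not use C#’s automatically implemented properties in types marked with [Serializable]. The compiler generates the names of the backing fields, and the generated names can differ each time the code is recompiled, which prevents instances of the type from being deserialized.
A method can be marked with the System.Runtime.Serialization.OnDeserializedAttribute custom attribute. Whenever an instance of a type is deserialized, the formatter checks whether the type defines a method with this attribute and, if so, invokes it. When the method is called, all the serializable fields have been set correctly and can be accessed to perform any additional work needed to fully deserialize the object. The System.Runtime.Serialization namespace also defines the OnSerializingAttribute, OnSerializedAttribute, and OnDeserializingAttribute custom attributes, which can be applied to a type’s methods for even more control over serialization and deserialization. A method marked with any of these four attributes must take a single StreamingContext parameter and return void. The method can have any name, and it should be declared private to prevent normal code from calling it; the formatters run with enough security that they can call private methods.
If you serialize an instance of a type, add a new field to the type, and then try to deserialize an object that does not contain the new field, the formatter throws a SerializationException with a message indicating that the data in the stream has the wrong number of members. This is a problem in versioning scenarios, where new fields are commonly added in newer versions of a type. Fortunately, you can apply the System.Runtime.Serialization.OptionalFieldAttribute attribute to each newly added field; when the formatter sees this attribute on a field, it does not throw a SerializationException if the stream does not contain that field.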
# How Formatters Serialize Type Instances
In this section, I give a bit more insight into how a formatter serializes an object’s fields. This knowledge can help you understand the more advanced serialization and deserialization techniques explained in the remainder of this chapter.
To make things easier for a formatter, the FCL offers a FormatterServices type in the System.Runtime.Serialization namespace. This type has only static methods in it, and no instances of the type may be instantiated. The following steps describe how a formatter automatically serializes an object whose type has the SerializableAttribute attribute applied to it.
- The formatter calls FormatterServices’s GetSerializableMembers method:
      public static MemberInfo[] GetSerializableMembers(Type type, StreamingContext context);
  This method uses reflection to get the type’s public and private instance fields (excluding any fields marked with the NonSerializedAttribute attribute). The method returns an array of MemberInfo objects, one for each serializable instance field.
- The object being serialized and the array of System.Reflection.MemberInfo objects are then passed to FormatterServices’ static GetObjectData method:
      public static Object[] GetObjectData(Object obj, MemberInfo[] members);
  This method returns an array of Objects where each element identifies the value of a field in the object being serialized. This Object array and the MemberInfo array are parallel. That is, element 0 in the Object array is the value of the member identified by element 0 in the MemberInfo array.
- The formatter writes the assembly’s identity and the type’s full name to the stream.
- The formatter then enumerates over the elements in the two arrays, writing each member’s name and value to the stream.
The following steps describe how a formatter automatically deserializes an object whose type has the SerializableAttribute attribute applied to it:
- The formatter reads the assembly’s identity and full type name from the stream. If the assembly is not currently loaded into the AppDomain, it is loaded (as described earlier). If the assembly can’t be loaded, a SerializationException exception is thrown and the object cannot be deserialized. If the assembly is loaded, the formatter passes the assembly identity information and the type’s full name to FormatterServices’ static GetTypeFromAssembly method:
      public static Type GetTypeFromAssembly(Assembly assem, String name);
  This method returns a System.Type object indicating the type of object that is being deserialized.
- The formatter calls FormatterServices’s static GetUninitializedObject method:
      public static Object GetUninitializedObject(Type type);
  This method allocates memory for a new object but does not call a constructor for the object. However, all the object’s bytes are initialized to null or 0.
- The formatter now constructs and initializes a MemberInfo array as it did before by calling the FormatterServices’s GetSerializableMembers method. This method returns the set of fields that were serialized and that need to be deserialized.
- The formatter creates and initializes an Object array from the data contained in the stream.
- The reference to the newly allocated object, the MemberInfo array, and the parallel Object array of field values is passed to FormatterServices’ static PopulateObjectMembers method:
      public static Object PopulateObjectMembers(
          Object obj, MemberInfo[] members, Object[] data);
  This method enumerates over the arrays, initializing each field to its corresponding value. At this point, the object has been completely deserialized.
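To make the steps above concrete, here is a minimal sketch that calls the FormatterServices methods directly and skips the stream entirely. The Sample type and the ManualFormatter class are hypothetical; a real formatter would write the member names and values to a stream between the two halves. Note that newer .NET versions mark FormatterServices as obsolete, so this is intended for illustration on runtimes where it is still supported.

```csharp
using System;
using System.Reflection;
using System.Runtime.Serialization;

// Hypothetical type used only for this illustration
[Serializable]
public sealed class Sample {
    public Int32 Id;
    private String m_name;
    [NonSerialized] public Int32 Cached;

    public Sample(Int32 id, String name) { Id = id; m_name = name; Cached = 99; }
    public String Name { get { return m_name; } }
}

public static class ManualFormatter {
    // Copies the serializable fields of 'source' into a new object
    // that is created without running any constructor
    public static Object CopySerializableFields(Object source) {
        Type type = source.GetType();

        // Serialization side: find the serializable fields and read their values
        MemberInfo[] members = FormatterServices.GetSerializableMembers(type);
        Object[] values = FormatterServices.GetObjectData(source, members);

        // Deserialization side: allocate a zeroed object and fill in the fields
        Object target = FormatterServices.GetUninitializedObject(type);
        return FormatterServices.PopulateObjectMembers(target, members, values);
    }

    public static void Main() {
        var copy = (Sample) CopySerializableFields(new Sample(7, "seven"));
        // Cached is marked [NonSerialized] and no constructor ran, so it stays 0
        Console.WriteLine("{0} {1} {2}", copy.Id, copy.Name, copy.Cached); // 7 seven 0
    }
}
```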
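💡 Summary: To make things easier for a formatter, the FCL offers a FormatterServices type in the System.Runtime.Serialization namespace. The type has only static methods and cannot be instantiated.
A formatter automatically serializes an object whose type has SerializableAttribute applied to it as follows. 1. The formatter calls FormatterServices’s GetSerializableMembers method, which uses reflection to get the type’s public and private instance fields (excluding fields marked with NonSerializedAttribute) and returns an array of MemberInfo objects, one for each serializable instance field. 2. The object being serialized and the array of System.Reflection.MemberInfo objects are passed to FormatterServices’s static GetObjectData method, which returns an array of Objects in which each element is the value of a field in the object being serialized. The Object array and the MemberInfo array are parallel: element 0 in the Object array is the value of the member identified by element 0 in the MemberInfo array. 3. The formatter writes the assembly’s identity and the type’s full name to the stream. 4. The formatter enumerates the elements of the two arrays, writing each member’s name and value to the stream.
A formatter automatically deserializes such an object as follows. 1. The formatter reads the assembly’s identity and the full type name from the stream. If the assembly is not currently loaded into the AppDomain, it is loaded (as described earlier); if it cannot be loaded, a SerializationException is thrown and the object cannot be deserialized. If the assembly is loaded, the formatter passes the assembly identity information and the type’s full name to FormatterServices’s static GetTypeFromAssembly method, which returns a System.Type object representing the type of the object being deserialized. 2. The formatter calls FormatterServices’s static GetUninitializedObject method, which allocates memory for a new object without calling a constructor; all of the object’s bytes are initialized to null or 0. 3. The formatter constructs and initializes a MemberInfo array as before by calling FormatterServices’s GetSerializableMembers method, which returns the set of fields that were serialized and now need to be deserialized. 4. The formatter creates and initializes an Object array from the data contained in the stream. 5. The reference to the newly allocated object, the MemberInfo array, and the parallel Object array of field values are passed to FormatterServices’s static PopulateObjectMembers method, which enumerates the arrays and initializes each field to its corresponding value. At this point, the object has been completely deserialized.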
# Controlling the Serialized/Deserialized Data
As discussed earlier in this chapter, the best way to get control over the serialization and deserialization process is to use the OnSerializing, OnSerialized, OnDeserializing, OnDeserialized, NonSerialized, and OptionalField attributes. However, there are some very rare scenarios where these attributes do not give you all the control you need. In addition, the formatters use reflection internally and reflection is slow, which increases the time it takes to serialize and deserialize objects. To get complete control over what data is serialized/deserialized or to eliminate the use of reflection, your type can implement the System.Runtime.Serialization.ISerializable interface, which is defined as follows.
    public interface ISerializable {
        void GetObjectData(SerializationInfo info, StreamingContext context);
    }
This interface has just one method in it, GetObjectData. But most types that implement this interface will also implement a special constructor that I’ll describe shortly.
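💡 Important: The biggest problem with the ISerializable interface is that once a type implements it, all derived types must implement it too, and the derived types must remember to call the base class’s GetObjectData method and special constructor. In addition, once a type implements the interface, it can never remove it without losing compatibility with derived types. It is always safe for sealed types to implement ISerializable. Using the custom attributes described earlier in this chapter avoids all of the potential problems of the ISerializable interface.
💡 Important: The ISerializable interface and the special constructor are intended to be used by formatters. However, other code could call GetObjectData, which might return sensitive information, or could construct an object by passing in corrupt data. For this reason, it is recommended that you apply the following attribute to your GetObjectData method and special constructor: [SecurityPermissionAttribute(SecurityAction.Demand, SerializationFormatter = true)]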
When a formatter serializes an object graph, it looks at each object. If its type implements the ISerializable interface, then the formatter ignores all custom attributes and instead constructs a new System.Runtime.Serialization.SerializationInfo object. This object contains the actual set of values that should be serialized for the object.
When constructing a SerializationInfo, the formatter passes two parameters: Type and System.Runtime.Serialization.IFormatterConverter. The Type parameter identifies the object that is being serialized. Two pieces of information are required to uniquely identify a type: the string name of the type and its assembly’s identity (which includes the assembly name, version, culture, and public key). When a SerializationInfo object is constructed, it obtains the type’s full name (by internally querying Type’s FullName property) and stores this string in a private field. You can obtain the type’s full name by querying SerializationInfo’s FullTypeName property. Likewise, the constructor obtains the type’s defining assembly (by internally querying Type’s Module property followed by querying Module’s Assembly property followed by querying Assembly’s FullName property) and stores this string in a private field. You can obtain the assembly’s identity by querying SerializationInfo’s AssemblyName property.
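💡 Note: Although you can set a SerializationInfo’s FullTypeName and AssemblyName properties, this is discouraged. If you want to change the type that is being serialized, call SerializationInfo’s SetType method instead, passing a reference to the desired Type object. Calling SetType ensures that the type’s full name and defining assembly are set correctly. An example of calling SetType is shown in the “Serializing a Type As a Different Type and Deserializing an Object As a Different Object” section later in this chapter (Section 24.7).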
After the SerializationInfo object is constructed and initialized, the formatter calls the type’s GetObjectData method, passing it the reference to the SerializationInfo object. The GetObjectData method is responsible for determining what information is necessary to serialize the object and adding this information to the SerializationInfo object. GetObjectData indicates what information to serialize by calling one of the many overloaded AddValue methods provided by the SerializationInfo type. AddValue is called once for each piece of data that you want to add.
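Before looking at a full-size type, it may help to see the mechanics on something small. The following sketch uses a hypothetical sealed Temperature type; the "Celsius" entry name is an arbitrary choice. For demonstration purposes only, Main constructs the SerializationInfo by hand to show what GetObjectData adds to it; normally the formatter does this.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical sealed type that takes full control of its serialized data
[Serializable]
public sealed class Temperature : ISerializable {
    private readonly Double m_celsius;

    public Temperature(Double celsius) { m_celsius = celsius; }
    public Double Celsius { get { return m_celsius; } }

    // Special constructor that the formatter calls during deserialization;
    // private is sufficient because the class is sealed
    private Temperature(SerializationInfo info, StreamingContext context) {
        m_celsius = info.GetDouble("Celsius");
    }

    // Called by the formatter during serialization; one AddValue call per piece of data
    public void GetObjectData(SerializationInfo info, StreamingContext context) {
        info.AddValue("Celsius", m_celsius);
    }
}

public static class SerializationInfoDemo {
    public static void Main() {
        // Build a SerializationInfo the way a formatter would: a Type plus an IFormatterConverter
        var info = new SerializationInfo(typeof(Temperature), new FormatterConverter());
        new Temperature(21.5).GetObjectData(info, new StreamingContext(StreamingContextStates.All));

        Console.WriteLine(info.FullTypeName);          // The type's full name
        Console.WriteLine(info.MemberCount);           // Number of values added
        Console.WriteLine(info.GetDouble("Celsius"));  // The value stored under "Celsius"
    }
}
```

The security attribute recommended earlier is omitted here to keep the sketch short.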
The following code shows an approximation of how the Dictionary type implements the ISerializable and IDeserializationCallback interfaces to take control over the serialization and deserialization of its objects.
[Serializable]
public class Dictionary<TKey, TValue> : ISerializable, IDeserializationCallback {
    // Private fields go here (not shown)
    private SerializationInfo m_siInfo; // Only used for deserialization
    // Special constructor (required by ISerializable) to control deserialization
    [SecurityPermissionAttribute(SecurityAction.Demand, SerializationFormatter = true)]
    protected Dictionary(SerializationInfo info, StreamingContext context) {
        // During deserialization, save the SerializationInfo for OnDeserialization
        m_siInfo = info;
    }
    // Method to control serialization
    [SecurityCritical]
    public virtual void GetObjectData(SerializationInfo info, StreamingContext context) {
        info.AddValue("Version", m_version);
        info.AddValue("Comparer", m_comparer, typeof(IEqualityComparer<TKey>));
        info.AddValue("HashSize", (m_buckets == null) ? 0 : m_buckets.Length);
        if (m_buckets != null) {
            KeyValuePair<TKey, TValue>[] array = new KeyValuePair<TKey, TValue>[Count];
            CopyTo(array, 0);
            info.AddValue("KeyValuePairs", array, typeof(KeyValuePair<TKey, TValue>[]));
        }
    }
    // Method called after all key/value objects have been deserialized
    // (implicit implementation of IDeserializationCallback.OnDeserialization)
    public virtual void OnDeserialization(Object sender) {
        if (m_siInfo == null) return; // Never set, return
        Int32 num = m_siInfo.GetInt32("Version");
        Int32 num2 = m_siInfo.GetInt32("HashSize");
        m_comparer = (IEqualityComparer<TKey>)
            m_siInfo.GetValue("Comparer", typeof(IEqualityComparer<TKey>));
        if (num2 != 0) {
            m_buckets = new Int32[num2];
            for (Int32 i = 0; i < m_buckets.Length; i++) m_buckets[i] = -1;
            m_entries = new Entry<TKey, TValue>[num2];
            m_freeList = -1;
            KeyValuePair<TKey, TValue>[] pairArray = (KeyValuePair<TKey, TValue>[])
                m_siInfo.GetValue("KeyValuePairs", typeof(KeyValuePair<TKey, TValue>[]));
            if (pairArray == null)
                ThrowHelper.ThrowSerializationException(
                    ExceptionResource.Serialization_MissingKeys);
            for (Int32 j = 0; j < pairArray.Length; j++) {
                if (pairArray[j].Key == null)
                    ThrowHelper.ThrowSerializationException(
                        ExceptionResource.Serialization_NullKey);
                Insert(pairArray[j].Key, pairArray[j].Value, true);
            }
        } else { m_buckets = null; }
        m_version = num;
        m_siInfo = null;
    }
}
Each AddValue method takes a String name and some data. Usually, the data is of a simple value type like Boolean, Char, Byte, SByte, Int16, UInt16, Int32, UInt32, Int64, UInt64, Single, Double, Decimal, or DateTime. However, you can also call AddValue, passing it a reference to an Object such as a String. After GetObjectData has added all of the necessary serialization information, it returns to the formatter.
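💡 Note: You should always call one of the overloaded AddValue methods to add serialization information for your type. If a field’s type implements the ISerializable interface, don’t call GetObjectData on the field. Instead, call AddValue to add the field; the formatter will see that the field’s type implements ISerializable and will call GetObjectData for you. If you were to call GetObjectData on the field object yourself, the formatter wouldn’t know to create a new object when deserializing the stream.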
The formatter now takes all of the values added to the SerializationInfo object and serializes each of them out to the stream. You’ll notice that the GetObjectData method is passed another parameter: a reference to a System.Runtime.Serialization.StreamingContext object. Most types’ GetObjectData methods will completely ignore this parameter, so I will not discuss it now. Instead, I’ll discuss it in the “Streaming Contexts” section later in this chapter.
So now you know how to set all of the information used for serialization. At this point, let’s turn our attention to deserialization. As the formatter extracts an object from the stream, it allocates memory for the new object (by calling the System.Runtime.Serialization.FormatterServices type’s static GetUninitializedObject method). Initially, all of this object’s fields are set to 0 or null. Then, the formatter checks if the type implements the ISerializable interface. If the type implements this interface, the formatter attempts to call a special constructor whose parameters are identical to those of the GetObjectData method.
If your class is sealed, then it is highly recommended that you declare this special constructor to be private. This prevents any code from accidentally calling it, which increases security. If your class is not sealed, then you should declare this special constructor as protected so that only derived classes can call it. Note that the formatters are able to call this special constructor no matter how it is declared.
This constructor receives a reference to a SerializationInfo object containing all of the values added to it when the object was serialized. The special constructor can call any of the GetBoolean, GetChar, GetByte, GetSByte, GetInt16, GetUInt16, GetInt32, GetUInt32, GetInt64, GetUInt64, GetSingle, GetDouble, GetDecimal, GetDateTime, GetString, and GetValue methods, passing in a string corresponding to the name used to serialize a value. The value returned from each of these methods is then used to initialize the fields of the new object.
When deserializing an object’s fields, you should call the Get method that matches the type of value that was passed to the AddValue method when the object was serialized. In other words, if the GetObjectData method called AddValue, passing it an Int32 value, then the GetInt32 method should be called for the same value when deserializing the object. If the value’s type in the stream doesn’t match the type you’re trying to get, then the formatter will attempt to use an IFormatterConverter object to “cast” the stream’s value to the desired type.
As I mentioned earlier, when a SerializationInfo object is constructed, it is passed an object whose type implements the IFormatterConverter interface. Because the formatter is responsible for constructing the SerializationInfo object, it chooses whatever IFormatterConverter type it wants. Microsoft’s BinaryFormatter and SoapFormatter types always construct an instance of the System.Runtime.Serialization.FormatterConverter type. Microsoft’s formatters don’t offer any way for you to select a different IFormatterConverter type.
The FormatterConverter type calls the System.Convert class’s static methods to convert values between the core types, such as converting an Int32 to an Int64. However, to convert a value between other arbitrary types, the FormatterConverter calls Convert’s ChangeType method to cast the serialized (or original) type to an IConvertible interface and then calls the appropriate interface method. Therefore, to allow objects of a serializable type to be deserialized as a different type, you may want to consider having your type implement the IConvertible interface. Note that the FormatterConverter object is used only when deserializing objects and when you’re calling a Get method whose type doesn’t match the type of the value in the stream.
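The conversion behavior just described can be observed without a formatter. The following is a minimal sketch, assuming the .NET Framework, that constructs a SerializationInfo by hand and reads an Int32 value back through mismatched Get methods. The ConverterDemo class, its BuildInfo helper, and the “Count” name are hypothetical and exist only for illustration.

```csharp
using System;
using System.Runtime.Serialization;

internal static class ConverterDemo {
    // Hypothetical helper: builds a SerializationInfo the way a formatter would,
    // using the same FormatterConverter type that Microsoft's formatters use
    public static SerializationInfo BuildInfo() {
        var info = new SerializationInfo(typeof(Object), new FormatterConverter());
        info.AddValue("Count", 42);   // Stored as an Int32
        return info;
    }

    public static void Main() {
        SerializationInfo info = BuildInfo();

        // Matching Get method: no conversion is needed
        Int32 asInt32 = info.GetInt32("Count");

        // Mismatched Get methods: the FormatterConverter converts the Int32
        Int64 asInt64 = info.GetInt64("Count");
        String asString = info.GetString("Count");

        Console.WriteLine("{0} {1} {2}", asInt32, asInt64, asString);
    }
}
```

Calling the Get method that matches the original AddValue call avoids the conversion entirely, which is why the text recommends it.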
Instead of calling the various Get methods previously listed, the special constructor could instead call GetEnumerator, which returns a System.Runtime.Serialization.SerializationInfoEnumerator object that can be used to iterate through all the values contained within the SerializationInfo object. Each value enumerated is a System.Runtime.Serialization.SerializationEntry object.
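As a rough illustration of the enumeration approach, the sketch below walks every SerializationEntry in a hand-built SerializationInfo. The EnumeratorDemo class, its Describe helper, and the value names are hypothetical; a real special constructor would run similar code against the info object that the formatter passes in.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

internal static class EnumeratorDemo {
    // Hypothetical helper: returns "name:type=value" for each entry in the info object.
    // foreach calls SerializationInfo's GetEnumerator method behind the scenes.
    public static List<String> Describe(SerializationInfo info) {
        var lines = new List<String>();
        foreach (SerializationEntry entry in info) {
            lines.Add(entry.Name + ":" + entry.ObjectType.Name + "=" + entry.Value);
        }
        return lines;
    }

    public static void Main() {
        var info = new SerializationInfo(typeof(Object), new FormatterConverter());
        info.AddValue("Name", "Jeff");   // Added as an Object (a String)
        info.AddValue("Count", 4);       // Added as an Int32
        foreach (String line in Describe(info)) Console.WriteLine(line);
    }
}
```

Enumerating is handy when the set of names isn’t known in advance, for example when a type must tolerate values written by an older or newer version of itself.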
Of course, you are welcome to define a type of your own that derives from a type that implements ISerializable’s GetObjectData and special constructor. If your type also implements ISerializable, then your implementation of GetObjectData and your implementation of the special constructor must call the same functions in the base class in order for the object to be serialized and deserialized properly. Do not forget to do this or the objects will not serialize or deserialize correctly. The next section explains how to properly define an ISerializable type whose base type doesn’t implement this interface.
If your derived type doesn’t have any additional fields in it and therefore has no special serialization/deserialization needs, then you do not have to implement ISerializable at all. Like all interface members, GetObjectData is virtual and will be called to properly serialize the object. In addition, the formatter treats the special constructor as “virtualized.” That is, during deserialization, the formatter will check the type that it is trying to instantiate. If that type doesn’t offer the special constructor, then the formatter will scan base classes until it finds one that implements the special constructor.
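Here is a minimal sketch of a derived type chaining to a base type that implements ISerializable. The Shape and Circle types, their fields, and the RoundTrip helper are all hypothetical; RoundTrip plays the formatter’s role by hand so that the example doesn’t depend on a particular formatter.

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
internal class Shape : ISerializable {
    private String m_name;
    public Shape(String name) { m_name = name; }
    public String Name { get { return m_name; } }

    // Special constructor: protected because the class is not sealed
    protected Shape(SerializationInfo info, StreamingContext context) {
        m_name = info.GetString("Name");
    }

    public virtual void GetObjectData(SerializationInfo info, StreamingContext context) {
        info.AddValue("Name", m_name);
    }
}

[Serializable]
internal sealed class Circle : Shape {
    private Double m_radius;
    public Circle(String name, Double radius) : base(name) { m_radius = radius; }
    public Double Radius { get { return m_radius; } }

    // Private because the class is sealed; chains to the base special constructor
    private Circle(SerializationInfo info, StreamingContext context)
        : base(info, context) {
        m_radius = info.GetDouble("Radius");
    }

    public override void GetObjectData(SerializationInfo info, StreamingContext context) {
        base.GetObjectData(info, context);   // Let the base type add its values
        info.AddValue("Radius", m_radius);   // Then add this type's values
    }

    // Hypothetical helper that does by hand what a formatter would do
    public Circle RoundTrip() {
        var context = new StreamingContext(StreamingContextStates.Clone);
        var info = new SerializationInfo(typeof(Circle), new FormatterConverter());
        GetObjectData(info, context);
        return new Circle(info, context);
    }
}

internal static class BaseChainDemo {
    public static void Main() {
        Circle copy = new Circle("Unit", 1.0).RoundTrip();
        Console.WriteLine("Name={0}, Radius={1}", copy.Name, copy.Radius);
    }
}
```

If Circle’s GetObjectData forgot to call base.GetObjectData, the base constructor’s GetString("Name") call would throw a SerializationException because no value with that name would exist.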
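💡 Important: The code in a special constructor typically extracts its fields from the SerializationInfo object that is passed to it. As the fields are extracted, you are not guaranteed that the objects are fully deserialized, so the code in the special constructor should not attempt to manipulate the objects that it extracts.
If your type must access members (such as call methods) of an extracted object, then it is recommended that your type also provide a method that has the OnDeserialized attribute applied to it or have your type implement the IDeserializationCallback interface’s OnDeserialization method (as shown in the previous Dictionary example). When this method is called, all objects have had their fields set. However, there is no guarantee about the order in which multiple objects have their OnDeserialized or OnDeserialization methods called. So, although the fields may be initialized, you still don’t know whether a referenced object is completely deserialized if that referenced object also provides an OnDeserialized method or implements the IDeserializationCallback interface.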
# How to Define a Type That Implements ISerializable When the Base Type Doesn’t Implement This Interface
As mentioned earlier, the ISerializable interface is extremely powerful, because it allows a type to take complete control over how instances of the type get serialized and deserialized. However, this power comes at a cost: The type is now responsible for serializing all of its base type’s fields as well. Serializing the base type’s fields is easy if the base type also implements the ISerializable interface; you just call the base type’s GetObjectData method.
However, someday, you may find yourself defining a type that needs to take control of its serialization, but whose base type does not implement the ISerializable interface. In this case, your derived class must manually serialize the base type’s fields by grabbing their values and adding them to the SerializationInfo collection. Then, in your special constructor, you will also have to get the values out of the collection and somehow set the base class’s fields. Doing all of this is easy (albeit tedious) if the base class’s fields are public or protected, but it can be very difficult or impossible to do if the base class’s fields are private.
The following code shows how to properly implement ISerializable’s GetObjectData method and its implied constructor so that the base type’s fields are serialized.
using System;
using System.Reflection;
using System.Runtime.Serialization;
using System.Security.Permissions;
[Serializable]
internal class Base {
    protected String m_name = "Jeff";
    public Base() { /* Make the type instantiable */ }
}
[Serializable]
internal sealed class Derived : Base, ISerializable {
    private DateTime m_date = DateTime.Now;
    public Derived() { /* Make the type instantiable */ }
    // If this constructor didn't exist, we'd get a SerializationException
    // This constructor should be protected if this class were not sealed
    [SecurityPermissionAttribute(SecurityAction.Demand, SerializationFormatter = true)]
    private Derived(SerializationInfo info, StreamingContext context) {
        // Get the set of serializable members for our base class
        Type baseType = this.GetType().BaseType;
        MemberInfo[] mi = FormatterServices.GetSerializableMembers(baseType, context);
        // Deserialize the base class's fields from the info object
        for (Int32 i = 0; i < mi.Length; i++) {
            // Get the field and set it to the deserialized value
            FieldInfo fi = (FieldInfo)mi[i];
            fi.SetValue(this, info.GetValue(baseType.FullName + "+" + fi.Name, fi.FieldType));
        }
        // Deserialize the values that were serialized for this class
        m_date = info.GetDateTime("Date");
    }
    // Not virtual: a sealed class cannot declare new virtual members
    [SecurityPermissionAttribute(SecurityAction.Demand, SerializationFormatter = true)]
    public void GetObjectData(SerializationInfo info, StreamingContext context) {
        // Serialize the desired values for this class
        info.AddValue("Date", m_date);
        // Get the set of serializable members for our base class
        Type baseType = this.GetType().BaseType;
        MemberInfo[] mi = FormatterServices.GetSerializableMembers(baseType, context);
        // Serialize the base class's fields to the info object
        for (Int32 i = 0; i < mi.Length; i++) {
            // Prefix the field name with the full name of the base type
            info.AddValue(baseType.FullName + "+" + mi[i].Name,
                ((FieldInfo)mi[i]).GetValue(this));
        }
    }
    public override String ToString() {
        return String.Format("Name={0}, Date={1}", m_name, m_date);
    }
}
In this code, there is a base class, Base, which is marked only with the SerializableAttribute custom attribute. Derived from Base is Derived, which also is marked with the SerializableAttribute attribute and also implements the ISerializable interface. A derived class can define a field with the same name as a field in its base class (imagine that Derived also defined a String field called m_name), and when calling SerializationInfo’s AddValue method, you can’t add multiple values with the same name. The preceding code handles this situation by identifying each base class field by its class name prepended to the field’s name. For example, when the GetObjectData method calls AddValue to serialize Base’s m_name field, the name of the value is written as “Base+m_name.”
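To see why the prefix matters, here is a small sketch, assuming the .NET Framework, showing that SerializationInfo rejects a second value with an existing name and that prefixing each name with its class name avoids the collision. The DuplicateNameDemo class, its two helper methods, and the “Derived+m_name” entry are hypothetical.

```csharp
using System;
using System.Runtime.Serialization;

internal static class DuplicateNameDemo {
    // Returns true if adding the same name twice throws a SerializationException
    public static Boolean SameNameTwiceThrows() {
        var info = new SerializationInfo(typeof(Object), new FormatterConverter());
        info.AddValue("m_name", "from Base");
        try {
            info.AddValue("m_name", "from Derived");
            return false;
        }
        catch (SerializationException) {
            return true;
        }
    }

    // Returns the number of values stored when each name carries a class prefix
    public static Int32 PrefixedMemberCount() {
        var info = new SerializationInfo(typeof(Object), new FormatterConverter());
        info.AddValue("Base+m_name", "from Base");
        info.AddValue("Derived+m_name", "from Derived");
        return info.MemberCount;
    }

    public static void Main() {
        Console.WriteLine("Duplicate name rejected: " + SameNameTwiceThrows());
        Console.WriteLine("Prefixed values stored:  " + PrefixedMemberCount());
    }
}
```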
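💡 Summary: As discussed earlier, the best way to control serialization and deserialization is to use the OnSerializing, OnSerialized, OnDeserializing, OnDeserialized, NonSerialized, and OptionalField attributes. In some rare cases, however, these attributes don’t give you all the control you need. In addition, the formatters use reflection internally, and reflection is slow, which increases the time it takes to serialize and deserialize objects. To take complete control over the data that is serialized/deserialized, and to avoid the use of reflection, your type can implement the System.Runtime.Serialization.ISerializable interface. This interface has just one method, GetObjectData.
When a formatter serializes an object graph, it looks at each object. If an object’s type implements ISerializable, the formatter ignores all custom attributes and instead constructs a new System.Runtime.Serialization.SerializationInfo object, which contains the set of values to serialize for the object. When constructing the SerializationInfo, the formatter passes two parameters: Type and System.Runtime.Serialization.IFormatterConverter. The Type parameter identifies the object being serialized. Two pieces of information uniquely identify a type: its string name and its assembly’s identity (assembly name, version, culture, and public key). The SerializationInfo object obtains the type’s full name (by internally querying Type’s FullName property) and stores it in a private field that you can read through the FullTypeName property. Similarly, the constructor obtains the type’s defining assembly (by querying Type’s Module property, then Module’s Assembly property, then Assembly’s FullName property) and stores this string in a private field that you can read through the AssemblyName property.
After the SerializationInfo object is constructed and initialized, the formatter calls the type’s GetObjectData method, passing it a reference to the SerializationInfo object. GetObjectData decides what information is needed to serialize the object and adds it by calling one of the many AddValue overloads, once for each piece of data. Each AddValue method takes a String name and some data. The data is usually a simple value type such as Boolean, Char, Byte, SByte, Int16, UInt16, Int32, UInt32, Int64, UInt64, Single, Double, Decimal, or DateTime, but you can also pass a reference to an Object (such as a String). After GetObjectData has added all of the necessary information, it returns to the formatter, which serializes all of the added values to the stream. GetObjectData also receives a reference to a System.Runtime.Serialization.StreamingContext object; most types ignore this parameter, which is discussed in a later section.
Now for deserialization. As the formatter extracts an object from the stream, it allocates memory for the new object (by calling the System.Runtime.Serialization.FormatterServices type’s static GetUninitializedObject method). Initially, all of the object’s fields are set to 0 or null. The formatter then checks whether the type implements ISerializable. If it does, the formatter attempts to call a special constructor whose parameters are identical to those of GetObjectData. If your class is sealed, it is highly recommended that you declare this constructor private, which prevents any code from accidentally calling it and so improves security. Otherwise, declare it protected so that only derived classes can call it. The formatters can call this constructor no matter how it is declared.
When deserializing an object’s fields, call the Get method that matches the type of the value passed to AddValue when the object was serialized. In other words, if GetObjectData called AddValue with an Int32 value, call GetInt32 for that value. If the value’s type in the stream doesn’t match the type you’re trying to get, the formatter attempts to use an IFormatterConverter object to cast the stream’s value to the type you specified. As mentioned earlier, the SerializationInfo object is constructed with an object whose type implements IFormatterConverter. Because the formatter constructs the SerializationInfo object, the formatter chooses the IFormatterConverter type. Microsoft’s BinaryFormatter and SoapFormatter always construct an instance of System.Runtime.Serialization.FormatterConverter and offer no way to select a different type. FormatterConverter calls System.Convert’s static methods to convert values between the core types, such as converting an Int32 to an Int64. To convert between other arbitrary types, FormatterConverter calls Convert’s ChangeType method to cast the serialized (or original) type to an IConvertible interface and then calls the appropriate interface method. So, to allow objects of a serializable type to be deserialized as a different type, consider having your type implement IConvertible. The FormatterConverter object is used only during deserialization, when you call a Get method whose type doesn’t match the type of the value in the stream.
Of course, you can define your own type that derives from a type that implements ISerializable’s GetObjectData method and the special constructor. If your type also implements ISerializable, your GetObjectData method and your special constructor must call the corresponding members of the base class so that the object is serialized and deserialized correctly. Someday, you may need to define a type that takes control of its serialization but whose base class doesn’t implement ISerializable. In this case, the derived class must manually serialize the base class’s fields by getting their values and adding them to the SerializationInfo collection. Then, in your special constructor, you must also get the values out of the collection and somehow set the base class’s fields. This is easy if the base class’s fields are public or protected, but it is very difficult or impossible if they are private.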
# Streaming Contexts
As mentioned earlier, there are many destinations for a serialized set of objects: same process, different process on the same machine, different process on a different machine, and so on. In some rare situations, an object might want to know where it is going to be deserialized so that it can emit its state differently. For example, an object that wraps a Windows semaphore object might decide to serialize its kernel handle if the object knows that it will be deserialized into the same process, because kernel handles are valid within a process. However, the object might decide to serialize the semaphore’s string name if it knows that the object will be deserialized on the same machine but into a different process. Finally, the object might decide to throw an exception if it knows that it will be deserialized in a process running on a different machine because a semaphore is valid only within a single machine.
A number of the methods mentioned earlier in this chapter accept a StreamingContext. A StreamingContext structure is a very simple value type offering just two public read-only properties, as shown in Table 24-1.
A method that receives a StreamingContext structure can examine the State property’s bit flags to determine the source or destination of the objects being serialized/deserialized. Table 24-2 shows the possible bit flag values.
Now that you know how to get this information, let’s discuss how you would set this information. The IFormatter interface (which is implemented by both the BinaryFormatter and the SoapFormatter types) defines a read/write StreamingContext property called Context. When you construct a formatter, the formatter initializes its Context property so that StreamingContextStates is set to All and the reference to the additional state object is set to null.
After the formatter is constructed, you can construct a StreamingContext structure using any of the StreamingContextStates bit flags, and you can optionally pass a reference to an object containing any additional context information you need. Now, all you need to do is set the formatter’s Context property with this new StreamingContext object before calling the formatter’s Serialize or Deserialize methods. Code demonstrating how to tell a formatter that you are serializing/deserializing an object graph for the sole purpose of cloning all the objects in the graph is shown in the DeepClone method presented earlier in this chapter.
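The semaphore scenario described at the start of this section can be sketched as follows. This is only an illustration, assuming the .NET Framework: the NamedResource type, its “Name” and “IsClone” values, and the Capture helper are hypothetical, and Capture calls GetObjectData directly with a chosen StreamingContext instead of going through a formatter.

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
internal sealed class NamedResource : ISerializable {
    private readonly String m_name;
    public NamedResource(String name) { m_name = name; }

    private NamedResource(SerializationInfo info, StreamingContext context) {
        m_name = info.GetString("Name");
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context) {
        // Refuse to be serialized if the destination might be another machine
        if ((context.State & StreamingContextStates.CrossMachine) != 0)
            throw new SerializationException("NamedResource can't leave this machine.");

        info.AddValue("Name", m_name);
        // Record whether the graph is being cloned within the same process
        info.AddValue("IsClone", (context.State & StreamingContextStates.Clone) != 0);
    }
}

internal static class StreamingContextDemo {
    // Hypothetical helper: calls GetObjectData with the given state flags.
    // With a real formatter, you would instead set
    //   formatter.Context = new StreamingContext(states);
    // before calling Serialize or Deserialize.
    public static SerializationInfo Capture(StreamingContextStates states) {
        var info = new SerializationInfo(typeof(NamedResource), new FormatterConverter());
        new NamedResource("MySemaphore").GetObjectData(info, new StreamingContext(states));
        return info;
    }

    public static void Main() {
        Console.WriteLine(Capture(StreamingContextStates.Clone).GetBoolean("IsClone"));
        Console.WriteLine(Capture(StreamingContextStates.CrossProcess).GetBoolean("IsClone"));
        try { Capture(StreamingContextStates.CrossMachine); }
        catch (SerializationException e) { Console.WriteLine(e.Message); }
    }
}
```

Note that a formatter’s default Context has its state set to All, which includes the CrossMachine flag, so a type written this way would throw unless the caller narrows the context first.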
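💡 Summary: A serialized set of objects can have many destinations: the same process, a different process on the same machine, a different process on a different machine, and so on. In some rare situations, an object might want to know where it is going to be deserialized so that it can emit its state differently. For example, an object that wraps a Windows semaphore might serialize its kernel handle if it knows that it will be deserialized into the same process, because kernel handles are valid within a process. If it will be deserialized into a different process on the same machine, it might serialize the semaphore’s string name instead. Finally, if it will be deserialized in a process on a different machine, it might throw an exception, because a semaphore is valid only within a single machine. Many of the methods mentioned in this chapter accept a StreamingContext. The StreamingContext structure is a very simple value type offering just two public read-only properties, State and Context. A method that receives a StreamingContext can examine the State property’s bit flags to determine the source or destination of the objects being serialized/deserialized, and the Context property is a reference to an object containing any context information the user wants. The IFormatter interface (implemented by both BinaryFormatter and SoapFormatter) defines a read/write property of type StreamingContext called Context. When a formatter is constructed, it initializes its Context property so that StreamingContextStates is set to All and the reference to the additional state object is set to null. After the formatter is constructed, you can construct a StreamingContext structure using any of the StreamingContextStates bit flags, optionally passing a reference to an object containing any additional context information you need. Then, all you need to do is set the formatter’s Context property to this new StreamingContext before calling the formatter’s Serialize or Deserialize methods.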
# Serializing a Type As a Different Type and Deserializing an Object As a Different Object
The .NET Framework’s serialization infrastructure is quite rich, and in this section, we discuss how a developer can design a type that can serialize or deserialize itself into a different type or object. Below are some examples where this is interesting:
Some types (such as System.DBNull and System.Reflection.Missing) are designed to have only one instance per AppDomain. These types are frequently called singletons. If you have a reference to a DBNull object, serializing and deserializing it should not cause a new DBNull object to be created in the AppDomain. After deserializing, the returned reference should refer to the AppDomain’s already-existing DBNull object.
Some types (such as System.Type, System.Reflection.Assembly, and other reflection types like MemberInfo) have one instance per type, assembly, member, and so on. Imagine you have an array where each element references a MemberInfo object. It’s possible that five array elements reference a single MemberInfo object. After serializing and deserializing this array, the five elements that referred to a single MemberInfo object should all refer to a single MemberInfo object. What’s more, these elements should refer to the one MemberInfo object that exists for the specific member in the AppDomain. You could also imagine how this could be useful for polling database connection objects or any other type of object.
For remotely controlled objects, the CLR serializes information about the server object that, when deserialized on the client, causes the CLR to create a proxy object. The type of the proxy object is different from the type of the server object, but this is transparent to the client code. When the client calls instance methods on the proxy object, the proxy code internally remotes the call to the server, which actually performs the request.
Let’s look at some code that shows how to properly serialize and deserialize a singleton type.
// There should be only one instance of this type per AppDomain
[Serializable]
public sealed class Singleton : ISerializable {
    // This is the one instance of this type
    private static readonly Singleton s_theOneObject = new Singleton();
    // Here are the instance fields
    public String Name = "Jeff";
    public DateTime Date = DateTime.Now;
    // Private constructor allowing this type to construct the singleton
    private Singleton() { }
    // Method returning a reference to the singleton
    public static Singleton GetSingleton() { return s_theOneObject; }
    // Method called when serializing a Singleton
    // I recommend using an explicit interface method implementation here
    [SecurityPermissionAttribute(SecurityAction.Demand, SerializationFormatter = true)]
    void ISerializable.GetObjectData(SerializationInfo info, StreamingContext context) {
        info.SetType(typeof(SingletonSerializationHelper));
        // No other values need to be added
    }
    [Serializable]
    private sealed class SingletonSerializationHelper : IObjectReference {
        // Method called after this object (which has no fields) is deserialized
        public Object GetRealObject(StreamingContext context) {
            return Singleton.GetSingleton();
        }
    }
    // NOTE: The special constructor is NOT necessary because it's never called
}
The Singleton class represents a type that allows only one instance of itself to exist per AppDomain. The following code tests the Singleton’s serialization and deserialization code to ensure that only one instance of the Singleton type ever exists in the AppDomain.
private static void SingletonSerializationTest() {
    // Create an array with multiple elements referring to the one Singleton object
    Singleton[] a1 = { Singleton.GetSingleton(), Singleton.GetSingleton() };
    Console.WriteLine("Do both elements refer to the same object? "
        + (a1[0] == a1[1])); // "True"
    using (var stream = new MemoryStream()) {
        BinaryFormatter formatter = new BinaryFormatter();
        // Serialize and then deserialize the array elements
        formatter.Serialize(stream, a1);
        stream.Position = 0;
        Singleton[] a2 = (Singleton[])formatter.Deserialize(stream);
        // Prove that it worked as expected:
        Console.WriteLine("Do both elements refer to the same object? "
            + (a2[0] == a2[1])); // "True"
        Console.WriteLine("Do all elements refer to the same object? "
            + (a1[0] == a2[0])); // "True"
    }
}
Now, let’s walk through the code to understand what’s happening. When the Singleton type is loaded into the AppDomain, the CLR calls its static constructor, which constructs a Singleton object and saves a reference to it in a static field, s_theOneObject. The Singleton class doesn’t offer any public constructors, which prevents any other code from constructing any other instances of this class.
In SingletonSerializationTest, an array is created consisting of two elements; each element references the Singleton object. The two elements are initialized by calling Singleton’s static GetSingleton method. This method returns a reference to the one Singleton object. The first call to Console’s WriteLine method displays “True,” verifying that both array elements refer to the same exact object.
Now, SingletonSerializationTest calls the formatter’s Serialize method to serialize the array and its elements. When serializing the first Singleton, the formatter detects that the Singleton type implements the ISerializable interface and calls the GetObjectData method. This method calls SetType, passing in the SingletonSerializationHelper type, which tells the formatter to serialize the Singleton object as a SingletonSerializationHelper object instead. Because AddValue is not called, no additional field information is written to the stream. Because the formatter automatically detected that both array elements refer to a single object, the formatter serializes only one object.
After serializing the array, SingletonSerializationTest calls the formatter’s Deserialize method. When deserializing the stream, the formatter tries to deserialize a SingletonSerializationHelper object because this is what the formatter was “tricked” into serializing. (In fact, this is why the Singleton class doesn’t provide the special constructor that is usually required when implementing the ISerializable interface.) After constructing the SingletonSerializationHelper object, the formatter sees that this type implements the System.Runtime.Serialization.IObjectReference interface. This interface is defined in the FCL as follows.
public interface IObjectReference {
    Object GetRealObject(StreamingContext context);
}
When a type implements this interface, the formatter calls the GetRealObject method. This method returns a reference to the object that you really want a reference to now that deserialization of the object has completed. In my example, the SingletonSerializationHelper type has GetRealObject return a reference to the Singleton object that already exists in the AppDomain. So, when the formatter’s Deserialize method returns, the a2 array contains two elements, both of which refer to the AppDomain’s Singleton object. The SingletonSerializationHelper object used to help with the deserialization is immediately unreachable and will be garbage collected in the future.
The second call to WriteLine displays “True,” verifying that both of a2’s array elements refer to the exact same object. The third and last call to WriteLine also displays “True,” proving that the elements in both arrays all refer to the exact same object.
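💡 Summary: Some types (such as System.DBNull and System.Reflection.Missing) are designed to have only one instance per AppDomain. These types are frequently called singletons. Given a reference to a DBNull object, serializing and deserializing it should not cause a new DBNull object to be created in the AppDomain. After deserialization, the returned reference should refer to the AppDomain’s already-existing DBNull object. Some types (such as System.Type, System.Reflection.Assembly, and other reflection types like MemberInfo) have one instance per type, assembly, member, and so on. For example, imagine an array in which each element references a MemberInfo object and five of the elements reference a single MemberInfo object. After serializing and deserializing this array, those five elements should still refer to a single MemberInfo object (not to five different objects). What’s more, the MemberInfo object they refer to must be the one that exists for the specific member in the AppDomain. This capability is also useful for polling database connection objects or any other type of object. For remotely controlled objects, the CLR serializes information about the server object. When this information is deserialized on the client, the CLR creates a proxy object. The type of the proxy object differs from the type of the server object, but this is transparent to the client code. The client calls instance methods directly on the proxy object, and the proxy code internally remotes the call to the server, which actually performs the request. If a type implements the System.Runtime.Serialization.IObjectReference interface, the formatter calls its GetRealObject method. This method returns a reference to the object that you really want a reference to after the object has been deserialized.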
# Serialization Surrogates
Up to now, I’ve been discussing how to modify a type’s implementation to control how a type serializes and deserializes instances of itself. However, the formatters also allow code that is not part of the type’s implementation to override how a type serializes and deserializes its objects. There are two main reasons why application code might want to override a type’s behavior:
It allows a developer the ability to serialize a type that was not originally designed to be serialized.
It allows a developer to provide a way to map one version of a type to a different version of a type.
Basically, to make this mechanism work, you first define a “surrogate type” that takes over the actions required to serialize and deserialize an existing type. Then, you register an instance of your surrogate type with the formatter telling the formatter which existing type your surrogate type is responsible for acting on. When the formatter detects that it is trying to serialize or deserialize an instance of the existing type, it will call methods defined by your surrogate object. Let’s build a sample that demonstrates how all this works.
A serialization surrogate type must implement the System.Runtime.Serialization.ISerializationSurrogate interface, which is defined in the FCL as follows.
public interface ISerializationSurrogate {
    void GetObjectData(Object obj, SerializationInfo info, StreamingContext context);
    Object SetObjectData(Object obj, SerializationInfo info, StreamingContext context,
        ISurrogateSelector selector);
}
Now, let’s walk through an example that uses this interface. Let’s say your program contains some DateTime objects that contain values that are local to the user’s computer. What if you want to serialize the DateTime objects to a stream but you want the values to be serialized in universal time? This would allow you to send the data over a network stream to another machine in another part of the world and have the DateTime value be correct. Although you can’t modify the DateTime type that ships with the FCL, you can define your own serialization surrogate class that can control how DateTime objects are serialized and deserialized. Here is how to define the surrogate class.
internal sealed class UniversalToLocalTimeSerializationSurrogate : ISerializationSurrogate {
    public void GetObjectData(Object obj, SerializationInfo info, StreamingContext context) {
        // Convert the DateTime from local to UTC
        info.AddValue("Date", ((DateTime)obj).ToUniversalTime().ToString("u"));
    }
    public Object SetObjectData(Object obj, SerializationInfo info, StreamingContext context,
        ISurrogateSelector selector) {
        // Convert the DateTime from UTC to local
        return DateTime.ParseExact(info.GetString("Date"), "u", null).ToLocalTime();
    }
}
The GetObjectData method here works just like the ISerializable interface’s GetObjectData method. The only difference is that ISerializationSurrogate’s GetObjectData method takes one additional parameter: a reference to the “real” object that is to be serialized. In the GetObjectData method above, this object is cast to DateTime, the value is converted from local time to universal time, and a string (formatted using the universal sortable date/time pattern, “u”) is added to the SerializationInfo collection.
The SetObjectData method is called in order to deserialize a DateTime object. When this method is called, it is passed a reference to a SerializationInfo object. SetObjectData gets the string date out of this collection, parses it as a universal sortable date/time formatted string, and then converts the resulting DateTime object from universal time to the machine’s local time.
The Object that is passed for SetObjectData’s first parameter is a bit strange. Just before calling SetObjectData, the formatter allocates (via FormatterServices’s static GetUninitializedObject method) an instance of the type that the surrogate is a surrogate for. The instance’s fields are all 0/null and no constructor has been called on the object. The code inside SetObjectData can simply initialize the fields of this instance by using the values from the passed-in SerializationInfo object and then have SetObjectData return null. Alternatively, SetObjectData could create an entirely different object or even a different type of object and return a reference to this new object, in which case, the formatter will ignore any changes that may or may not have happened to the object it passed in to SetObjectData.
In my example, my UniversalToLocalTimeSerializationSurrogate class acts as a surrogate for the DateTime type, which is a value type. And so, the obj parameter refers to a boxed instance of a DateTime. There is no way to change the fields in most value types (because they are supposed to be immutable) and so, my SetObjectData method ignores the obj parameter and returns a new DateTime object with the desired value in it.
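For contrast, here is a minimal sketch of the other option: a surrogate for a reference type that fills in the instance the formatter passes to SetObjectData and then returns null. The Employee type, the EmployeeSerializationSurrogate class, and the RoundTrip helper are hypothetical names invented for illustration, and the sketch assumes the .NET Framework, where BinaryFormatter is available.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type that was not designed for serialization (no [Serializable] attribute)
internal sealed class Employee {
    public String Name;
    public Int32 Id;
}

internal sealed class EmployeeSerializationSurrogate : ISerializationSurrogate {
    public void GetObjectData(Object obj, SerializationInfo info, StreamingContext context) {
        Employee e = (Employee)obj;
        info.AddValue("Name", e.Name);
        info.AddValue("Id", e.Id);
    }

    public Object SetObjectData(Object obj, SerializationInfo info, StreamingContext context,
        ISurrogateSelector selector) {
        // obj is the uninitialized Employee the formatter allocated; no constructor has run
        Employee e = (Employee)obj;
        e.Name = info.GetString("Name");
        e.Id = info.GetInt32("Id");
        return null; // Returning null tells the formatter to keep the instance it passed in
    }
}

internal static class ReferenceTypeSurrogateDemo {
    // Serializes an Employee to memory and deserializes it again via the surrogate
    public static Employee RoundTrip(Employee original) {
        using (var stream = new MemoryStream()) {
            var formatter = new BinaryFormatter();
            var ss = new SurrogateSelector();
            ss.AddSurrogate(typeof(Employee), formatter.Context,
                new EmployeeSerializationSurrogate());
            formatter.SurrogateSelector = ss;

            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (Employee)formatter.Deserialize(stream);
        }
    }
}
```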
At this point, I’m sure you’re all wondering how the formatter knows to use this ISerializationSurrogate type when it tries to serialize/deserialize a DateTime object. The following code demonstrates how to test the UniversalToLocalTimeSerializationSurrogate class.
```csharp
private static void SerializationSurrogateDemo() {
    using (var stream = new MemoryStream()) {
        // 1. Construct the desired formatter
        IFormatter formatter = new SoapFormatter();

        // 2. Construct a SurrogateSelector object
        SurrogateSelector ss = new SurrogateSelector();

        // 3. Tell the surrogate selector to use our surrogate for DateTime objects
        ss.AddSurrogate(typeof(DateTime), formatter.Context,
            new UniversalToLocalTimeSerializationSurrogate());
        // NOTE: AddSurrogate can be called multiple times to register multiple surrogates

        // 4. Tell the formatter to use our surrogate selector
        formatter.SurrogateSelector = ss;

        // Create a DateTime that represents the local time on the machine & serialize it
        DateTime localTimeBeforeSerialize = DateTime.Now;
        formatter.Serialize(stream, localTimeBeforeSerialize);

        // The stream displays the Universal time as a string to prove it worked
        stream.Position = 0;
        Console.WriteLine(new StreamReader(stream).ReadToEnd());

        // Deserialize the Universal time string & convert it to a local DateTime
        stream.Position = 0;
        DateTime localTimeAfterDeserialize = (DateTime)formatter.Deserialize(stream);

        // Prove it worked correctly:
        Console.WriteLine("LocalTimeBeforeSerialize ={0}", localTimeBeforeSerialize);
        Console.WriteLine("LocalTimeAfterDeserialize={0}", localTimeAfterDeserialize);
    }
}
```
After steps 1 through 4 have executed, the formatter is ready to use the registered surrogate types. When the formatter’s Serialize method is called, each object’s type is looked up in the set maintained by the SurrogateSelector. If a match is found, then the ISerializationSurrogate object’s GetObjectData method is called to get the information that should be written out to the stream.
When the formatter’s Deserialize method is called, the type of the object about to be deserialized is looked up in the formatter’s SurrogateSelector and if a match is found, then the ISerializationSurrogate object’s SetObjectData method is called to set the fields within the object being deserialized.
Internally, a SurrogateSelector object maintains a private hash table. When AddSurrogate is called, the Type and StreamingContext make up the key and the ISerializationSurrogate object is the key’s value. If a key with the same Type/StreamingContext already exists, then AddSurrogate throws an ArgumentException. By including a StreamingContext in the key, you can register one surrogate type object that knows how to serialize/deserialize a DateTime object to a file and register a different surrogate object that knows how to serialize/deserialize a DateTime object to a different process.
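The Type/StreamingContext key can be sketched as follows. This is an illustration only: the NamedDateTimeSurrogate class (whose Name field exists just so we can see which instance the selector hands back) and the ContextKeyDemo helpers are hypothetical, and the choice of the File and CrossProcess states simply mirrors the file/other-process scenario described above.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical placeholder surrogate; Name only identifies which instance was selected
internal sealed class NamedDateTimeSurrogate : ISerializationSurrogate {
    public readonly String Name;
    public NamedDateTimeSurrogate(String name) { Name = name; }

    public void GetObjectData(Object obj, SerializationInfo info, StreamingContext context) {
        info.AddValue("Ticks", ((DateTime)obj).Ticks);
    }

    public Object SetObjectData(Object obj, SerializationInfo info, StreamingContext context,
        ISurrogateSelector selector) {
        return new DateTime(info.GetInt64("Ticks"));
    }
}

internal static class ContextKeyDemo {
    // Registers two surrogates for the same Type under two different StreamingContexts
    public static SurrogateSelector BuildSelector() {
        var ss = new SurrogateSelector();
        ss.AddSurrogate(typeof(DateTime),
            new StreamingContext(StreamingContextStates.File),
            new NamedDateTimeSurrogate("file"));
        ss.AddSurrogate(typeof(DateTime),
            new StreamingContext(StreamingContextStates.CrossProcess),
            new NamedDateTimeSurrogate("cross-process"));
        return ss;
    }

    // Returns the name of the surrogate registered for DateTime under the given state
    public static String Lookup(SurrogateSelector ss, StreamingContextStates state) {
        ISurrogateSelector ignored;
        var surrogate = (NamedDateTimeSurrogate)ss.GetSurrogate(
            typeof(DateTime), new StreamingContext(state), out ignored);
        return surrogate == null ? null : surrogate.Name;
    }

    // Returns true if registering the same Type/StreamingContext pair again throws
    public static Boolean DuplicateThrows(SurrogateSelector ss) {
        try {
            ss.AddSurrogate(typeof(DateTime),
                new StreamingContext(StreamingContextStates.File),
                new NamedDateTimeSurrogate("duplicate"));
            return false;
        }
        catch (ArgumentException) {
            return true;
        }
    }
}
```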
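💡 Note: The BinaryFormatter class has a bug that prevents a surrogate from serializing objects with cyclical references. To work around this problem, you need to pass a reference to your ISerializationSurrogate object to FormatterServices’s static GetSurrogateForCyclicalReference method. This method returns an ISerializationSurrogate object, which you can then pass to the SurrogateSelector’s AddSurrogate method. However, when you use the GetSurrogateForCyclicalReference method, your surrogate’s SetObjectData method must modify the values inside the object referred to by SetObjectData’s obj parameter and ultimately return null or obj to the calling method. The downloadable code that accompanies this book shows how to modify the UniversalToLocalTimeSerializationSurrogate class and the SerializationSurrogateDemo method to support cyclical references.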
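The registration described in this note might look like the following sketch. It is not the book's companion code: the Node type, the NodeSerializationSurrogate class, and the RoundTrip helper are hypothetical names, and the sketch assumes the .NET Framework 4.0 or later, where FormatterServices.GetSurrogateForCyclicalReference is available. The surrogate modifies obj in place and returns null, as the note requires.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type whose instances can refer to each other (or to themselves)
internal sealed class Node {
    public String Name;
    public Node Next;
}

internal sealed class NodeSerializationSurrogate : ISerializationSurrogate {
    public void GetObjectData(Object obj, SerializationInfo info, StreamingContext context) {
        Node n = (Node)obj;
        info.AddValue("Name", n.Name);
        info.AddValue("Next", n.Next);
    }

    public Object SetObjectData(Object obj, SerializationInfo info, StreamingContext context,
        ISurrogateSelector selector) {
        // Modify the object passed in; do not create a replacement object
        Node n = (Node)obj;
        n.Name = info.GetString("Name");
        n.Next = (Node)info.GetValue("Next", typeof(Node));
        return null; // Must return null or obj when wrapped for cyclical references
    }
}

internal static class CyclicalSurrogateDemo {
    public static Node RoundTrip(Node root) {
        using (var stream = new MemoryStream()) {
            var formatter = new BinaryFormatter();
            var ss = new SurrogateSelector();

            // Wrap our surrogate before registering it with the selector
            ISerializationSurrogate wrapped =
                FormatterServices.GetSurrogateForCyclicalReference(new NodeSerializationSurrogate());
            ss.AddSurrogate(typeof(Node), formatter.Context, wrapped);
            formatter.SurrogateSelector = ss;

            formatter.Serialize(stream, root);
            stream.Position = 0;
            return (Node)formatter.Deserialize(stream);
        }
    }
}
```

A caller could build a node whose Next field refers back to the node itself and pass it to RoundTrip to exercise the cyclical case.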
# Surrogate Selector Chains
Multiple SurrogateSelector objects can be chained together. For example, you could have a SurrogateSelector that maintains a set of serialization surrogates that are used for serializing types into proxies that get remoted across the wire or between AppDomains. You could also have a separate SurrogateSelector object that contains a set of serialization surrogates that are used to convert Version 1 types into Version 2 types.
If you have multiple SurrogateSelector objects that you’d like the formatter to use, you must chain them together into a linked list. The SurrogateSelector type implements the ISurrogateSelector interface, which defines three methods. All three of these methods are related to chaining. Here is how the ISurrogateSelector interface is defined.
```csharp
public interface ISurrogateSelector {
    void ChainSelector(ISurrogateSelector selector);
    ISurrogateSelector GetNextSelector();
    ISerializationSurrogate GetSurrogate(Type type, StreamingContext context,
        out ISurrogateSelector selector);
}
```
The ChainSelector method inserts an ISurrogateSelector object immediately after the ISurrogateSelector object being operated on (‘this’ object). The GetNextSelector method returns a reference to the next ISurrogateSelector object in the chain or null if the object being operated on is the end of the chain.
The GetSurrogate method looks up a Type/StreamingContext pair in the ISurrogateSelector object identified by this. If the pair cannot be found, then the next ISurrogateSelector object in the chain is accessed, and so on. If a match is found, then GetSurrogate returns the ISerializationSurrogate object that handles the serialization/deserialization of the type looked up. In addition, GetSurrogate also returns the ISurrogateSelector object that contained the match; this is usually not needed and is ignored. If none of the ISurrogateSelector objects in the chain have a match for the Type/StreamingContext pair, GetSurrogate returns null.
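Here is a minimal sketch of a two-selector chain that exercises all three methods. The TicksDateTimeSurrogate class and the SelectorChainDemo helper are hypothetical names invented for illustration; only the second selector has a surrogate registered, so a lookup that starts at the first selector has to walk the chain.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical surrogate used only so the chain has something to find
internal sealed class TicksDateTimeSurrogate : ISerializationSurrogate {
    public void GetObjectData(Object obj, SerializationInfo info, StreamingContext context) {
        info.AddValue("Ticks", ((DateTime)obj).Ticks);
    }

    public Object SetObjectData(Object obj, SerializationInfo info, StreamingContext context,
        ISurrogateSelector selector) {
        return new DateTime(info.GetInt64("Ticks"));
    }
}

internal static class SelectorChainDemo {
    public static readonly StreamingContext Context =
        new StreamingContext(StreamingContextStates.All);

    // Builds first -> second, with the DateTime surrogate registered only in 'second'
    public static SurrogateSelector BuildChain(out SurrogateSelector second) {
        var first = new SurrogateSelector();
        second = new SurrogateSelector();
        second.AddSurrogate(typeof(DateTime), Context, new TicksDateTimeSurrogate());

        first.ChainSelector(second); // Insert 'second' immediately after 'first'
        return first;
    }

    public static void Main() {
        SurrogateSelector second;
        SurrogateSelector first = BuildChain(out second);

        // GetNextSelector walks the linked list; the end of the chain yields null
        Console.WriteLine(Object.ReferenceEquals(first.GetNextSelector(), second));
        Console.WriteLine(second.GetNextSelector() == null);

        // The lookup starts at 'first', misses, and continues with 'second'
        ISurrogateSelector owner;
        ISerializationSurrogate found = first.GetSurrogate(typeof(DateTime), Context, out owner);
        Console.WriteLine(found is TicksDateTimeSurrogate);
        Console.WriteLine(Object.ReferenceEquals(owner, second));

        // No selector in the chain knows about String, so the result is null
        Console.WriteLine(first.GetSurrogate(typeof(String), Context, out owner) == null);
    }
}
```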
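💡 Note: The FCL defines an ISurrogateSelector interface and also defines a SurrogateSelector type that implements this interface. However, it is extremely rare that you will ever have to define your own type that implements the ISurrogateSelector interface. The only reason to implement ISurrogateSelector yourself is if you need more flexibility when mapping one type to another. For example, you might want to serialize all types that inherit from a specific base class in a special way. The System.Runtime.Remoting.Messaging.RemotingSurrogateSelector class is a perfect example. When serializing objects for remoting purposes, the CLR formats the objects by using the RemotingSurrogateSelector. This surrogate selector serializes all objects that derive from System.MarshalByRefObject in a special way so that deserialization causes proxy objects to be created on the client side.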
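💡 Summary: The earlier sections discussed how to modify a type’s implementation to control how the type serializes and deserializes instances of itself. However, formatters also allow code that is not part of the type’s implementation to override how the type serializes and deserializes its objects. There are two main reasons why application code might want to override a type’s behavior: (1) it allows a developer to serialize a type that was not originally designed to be serialized, and (2) it allows a developer to provide a way to map one version of a type to a different version of the type. Basically, to make this mechanism work, you first define a “surrogate type” that takes over the actions required to serialize and deserialize an existing type. Then you register an instance of your surrogate type with the formatter, telling the formatter which existing type your surrogate type is responsible for acting on. Whenever the formatter is about to serialize or deserialize an instance of the existing type, it calls the methods defined by your surrogate object.
A serialization surrogate type must implement the System.Runtime.Serialization.ISerializationSurrogate interface, which is defined in the FCL with the GetObjectData and SetObjectData methods. The GetObjectData method works much like the ISerializable interface’s GetObjectData method. The only difference is that ISerializationSurrogate’s GetObjectData method takes one additional parameter: a reference to the “real” object that is to be serialized. When the formatter’s Serialize method is called, each object’s type is looked up in the set (a hash table) maintained by the SurrogateSelector. If a match is found, the ISerializationSurrogate object’s GetObjectData method is called to get the information that should be written out to the stream. When the formatter’s Deserialize method is called, the type of the object about to be deserialized is looked up in the formatter’s SurrogateSelector. If a match is found, the ISerializationSurrogate object’s SetObjectData method is called to set the fields within the object being deserialized.
Internally, a SurrogateSelector object maintains a private hash table. When AddSurrogate is called, the Type and StreamingContext make up the key, and the ISerializationSurrogate object is the key’s value. If a key with the same Type/StreamingContext already exists, AddSurrogate throws an ArgumentException. By including a StreamingContext in the key, you can register one surrogate object that knows how to serialize/deserialize a DateTime object to a file and register a different surrogate object that knows how to serialize/deserialize a DateTime object to a different process.
Multiple SurrogateSelector objects can be chained together. For example, you could have one SurrogateSelector that maintains a set of serialization surrogates used for serializing types into proxies that get remoted across the wire or between AppDomains, and a separate SurrogateSelector that maintains a set of serialization surrogates used to convert Version 1 types into Version 2 types. If you have multiple SurrogateSelector objects that you’d like the formatter to use, you must chain them together into a linked list. The SurrogateSelector type implements the ISurrogateSelector interface, which defines three methods, all of which are related to chaining. The ChainSelector method inserts an ISurrogateSelector object immediately after the ISurrogateSelector object being operated on (the this object). The GetNextSelector method returns a reference to the next ISurrogateSelector object in the chain, or null if the object being operated on is the end of the chain. The GetSurrogate method looks up a Type/StreamingContext pair in the ISurrogateSelector object identified by this. If the pair cannot be found, the next ISurrogateSelector object in the chain is accessed, and so on. If a match is found, GetSurrogate returns the ISerializationSurrogate object that handles the serialization/deserialization of the type looked up. In addition, GetSurrogate returns the ISurrogateSelector object that contained the match; this is usually not needed and is ignored. If none of the ISurrogateSelector objects in the chain has a match for the Type/StreamingContext pair, GetSurrogate returns null.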
# Overriding the Assembly and/or Type When Deserializing an Object
When serializing an object, formatters output the type’s full name and the full name of the type’s defining assembly. When deserializing an object, formatters use this information to know exactly what type of object to construct and initialize. The earlier discussion about the ISerializationSurrogate interface showed a mechanism allowing you to take over the serialization and deserialization duties for a specific type. A type that implements the ISerializationSurrogate interface is tied to a specific type in a specific assembly.
However, there are times when the ISerializationSurrogate mechanism doesn’t provide enough flexibility. Here are some scenarios when it might be useful to deserialize an object into a different type than it was serialized as:
- A developer might decide to move a type’s implementation from one assembly to a different assembly. For example, the assembly’s version number changes, making the new assembly different from the original assembly.
- An object on a server gets serialized into a stream that is sent to a client. When the client processes the stream, it could deserialize the object to a completely different type whose code knows how to remotely invoke method calls to the server’s object.
- A developer makes a new version of a type and wants to deserialize any already-serialized objects into the new version of the type.
The System.Runtime.Serialization.SerializationBinder class makes deserializing an object to a different type very easy. To do this, you first define your own type that derives from the abstract SerializationBinder type. In the following code, assume that version 1.0.0.0 of your assembly defined a class called Ver1 and assume that the new version of your assembly defines the Ver1ToVer2SerializationBinder class and also defines a class called Ver2.
```csharp
internal sealed class Ver1ToVer2SerializationBinder : SerializationBinder {
    public override Type BindToType(String assemblyName, String typeName) {
        // Deserialize any Ver1 object from version 1.0.0.0 into a Ver2 object

        // Calculate the assembly name that defined the Ver1 type
        AssemblyName assemVer1 = Assembly.GetExecutingAssembly().GetName();
        assemVer1.Version = new Version(1, 0, 0, 0);

        // If deserializing the Ver1 object from v1.0.0.0, turn it into a Ver2 object
        if (assemblyName == assemVer1.ToString() && typeName == "Ver1")
            return typeof(Ver2);

        // Else, just return the same type being requested
        return Type.GetType(String.Format("{0}, {1}", typeName, assemblyName));
    }
}
```
Now, after you construct a formatter, construct an instance of Ver1ToVer2SerializationBinder and set the formatter’s Binder read/write property to refer to the binder object. After setting the Binder property, you can now call the formatter’s Deserialize method. During deserialization, the formatter sees that a binder has been set. As each object is about to be deserialized, the formatter calls the binder’s BindToType method, passing it the assembly name and type that the formatter wants to deserialize. At this point, BindToType decides what type should actually be constructed and returns this type.
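The steps just described can be sketched in a single assembly as follows. To keep the sketch self-contained, I make several assumptions that are not in the text: Ver1 and Ver2 are given one identically named field so the formatter can map the serialized value onto the new type, and the hypothetical TypeNameOnlyBinder matches on the type name alone instead of checking for version 1.0.0.0 of the assembly. The sketch also assumes the .NET Framework, where BinaryFormatter is available.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical old and new versions of a type; both expose a field with the same name
[Serializable]
internal sealed class Ver1 { public Int32 x; }

[Serializable]
internal sealed class Ver2 { public Int32 x; }

// Simplified binder (an assumption for this demo): it ignores the assembly version
internal sealed class TypeNameOnlyBinder : SerializationBinder {
    public override Type BindToType(String assemblyName, String typeName) {
        // Redirect any Ver1 object in the stream to the Ver2 type
        if (typeName == "Ver1") return typeof(Ver2);

        // Else, just return the same type being requested
        return Type.GetType(String.Format("{0}, {1}", typeName, assemblyName));
    }
}

internal static class BinderDemo {
    // Serializes a Ver1 object and deserializes the same bytes as a Ver2 object
    public static Ver2 SerializeVer1DeserializeVer2(Ver1 original) {
        using (var stream = new MemoryStream()) {
            // Serialize without a binder so the stream records the Ver1 type name
            new BinaryFormatter().Serialize(stream, original);
            stream.Position = 0;

            // Construct a formatter, set its Binder property, and then deserialize
            var formatter = new BinaryFormatter();
            formatter.Binder = new TypeNameOnlyBinder();
            return (Ver2)formatter.Deserialize(stream);
        }
    }
}
```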
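💡 Note: The SerializationBinder class also makes it possible to change the assembly/type information while serializing an object by overriding its BindToName method, which looks like this: public virtual void BindToName(Type serializedType, out string assemblyName, out string typeName). During serialization, the formatter calls this method, passing it the type the formatter wants to serialize. You can then return (via the two out parameters) the assembly and type that you actually want written to the stream. If you return null and null (which is what the default implementation does), then no change is performed.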
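Here is a minimal sketch of a binder that overrides both methods so that a type is written to the stream under a different name and mapped back when read. The CustomerRecord type, the "Contracts.Customer" name, and the LogicalNameBinder class are hypothetical, and the sketch assumes the .NET Framework 4.0 or later, where BindToName and BinaryFormatter are both available.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type to serialize
[Serializable]
internal sealed class CustomerRecord { public String Name; }

internal sealed class LogicalNameBinder : SerializationBinder {
    // Hypothetical name to record in the stream in place of the real type name
    private const String LogicalName = "Contracts.Customer";

    public override void BindToName(Type serializedType,
        out String assemblyName, out String typeName) {
        if (serializedType == typeof(CustomerRecord)) {
            assemblyName = serializedType.Assembly.FullName;
            typeName = LogicalName;
        } else {
            // null and null: perform no change for any other type
            assemblyName = null;
            typeName = null;
        }
    }

    public override Type BindToType(String assemblyName, String typeName) {
        // Map the name written by BindToName back to the real type
        if (typeName == LogicalName) return typeof(CustomerRecord);
        return Type.GetType(String.Format("{0}, {1}", typeName, assemblyName));
    }
}

internal static class BindToNameDemo {
    public static CustomerRecord RoundTrip(CustomerRecord original) {
        using (var stream = new MemoryStream()) {
            var formatter = new BinaryFormatter();
            formatter.Binder = new LogicalNameBinder();

            formatter.Serialize(stream, original);   // The formatter calls BindToName
            stream.Position = 0;
            return (CustomerRecord)formatter.Deserialize(stream); // ...and BindToType
        }
    }
}
```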
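💡 Summary: When serializing an object, formatters output the type’s full name and the full name of the type’s defining assembly. When deserializing an object, formatters use this information to know exactly what type of object to construct and initialize. The earlier discussion showed how the ISerializationSurrogate interface lets you take over the serialization and deserialization duties for a specific type. A type that implements the ISerializationSurrogate interface is tied to a specific type in a specific assembly. However, there are times when the ISerializationSurrogate mechanism doesn’t provide enough flexibility. In the following scenarios, it is useful to deserialize an object into a different type than it was serialized as: (1) a developer might decide to move a type’s implementation from one assembly to a different assembly; for example, the assembly’s version number changes, making the new assembly different from the original assembly; (2) an object on a server gets serialized into a stream that is sent to a client, and when the client processes the stream, it could deserialize the object to a completely different type whose code knows how to remotely invoke method calls to the server’s object; (3) a developer makes a new version of a type and wants to deserialize any already-serialized objects into the new version of the type.
The System.Runtime.Serialization.SerializationBinder class makes deserializing an object to a different type very easy. To do this, you first define your own type that derives from the abstract SerializationBinder type. Then, after you construct a formatter, construct an instance of your derived type and set the formatter’s Binder read/write property to refer to the binder object. After setting the Binder property, call the formatter’s Deserialize method. During deserialization, the formatter sees that a binder has been set. As each object is about to be deserialized, the formatter calls the binder’s BindToType method, passing it the assembly name and type that the formatter wants to deserialize. BindToType then decides what type should actually be constructed and returns this type.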