# Chapter 8 Methods

# Instance Constructors and Classes (Reference Types)

Constructors are special methods that allow an instance of a type to be initialized to a good state. Constructor methods are always called .ctor (for constructor) in a method definition metadata table. When creating an instance of a reference type, memory is allocated for the instance’s data fields, the object’s overhead fields (type object pointer and sync block index) are initialized, and then the type’s instance constructor is called to set the initial state of the object.

When constructing a reference type object, the memory allocated for the object is always zeroed out before the type’s instance constructor is called. Any fields that the constructor doesn’t explicitly overwrite are guaranteed to have a value of 0 or null.

Unlike other methods, instance constructors are never inherited. That is, a class has only the instance constructors that the class itself defines. Because instance constructors are never inherited, you cannot apply the following modifiers to an instance constructor: virtual, new, override, sealed, or abstract. If you define a class that does not explicitly define any constructors, the C# compiler defines a default (parameterless) constructor for you whose implementation simply calls the base class’s parameterless constructor.

For example, if you define the following class.

public class SomeType { 
}

it is as though you wrote the code as follows.

public class SomeType { 
 public SomeType() : base() { } 
}

If the class is abstract, the compiler-produced default constructor has protected accessibility; otherwise, the constructor is given public accessibility. If the base class doesn’t offer a parameterless constructor, the derived class must explicitly call a base class constructor or the compiler will issue an error. If the class is static (sealed and abstract), the compiler will not emit a default constructor at all into the class definition.

A type can define several instance constructors. Each constructor must have a different signature, and each can have different accessibility. For verifiable code, a class’s instance constructor must call its base class’s constructor before accessing any of the inherited fields of the base class. The C# compiler will generate a call to the default base class’s constructor automatically if the derived class’s constructor does not explicitly invoke one of the base class’s constructors. Ultimately, System.Object’s public, parameterless constructor gets called. This constructor does nothing; it simply returns. This is because System.Object defines no instance data fields, and therefore its constructor has nothing to do.

In a few situations, an instance of a type can be created without an instance constructor being called. In particular, calling Object’s MemberwiseClone method allocates memory, initializes the object’s overhead fields, and then copies the source object’s bytes to the new object. Also, a constructor is usually not called when deserializing an object with the runtime serializer. The deserialization code allocates memory for the object without calling a constructor by using the System.Runtime.Serialization.FormatterServices type's GetUninitializedObject or GetSafeUninitializedObject methods (as discussed in Chapter 24, “Runtime Serialization”).
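As a minimal sketch of the two cases above, the following program counts constructor calls on a hypothetical Tracked type (the type, its fields, and the ShallowCopy helper are my own names, not part of the FCL). It assumes FormatterServices is available; recent .NET versions flag that type as obsolete, and RuntimeHelpers.GetUninitializedObject is an alternative there.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical type that counts how many times its constructor runs.
public sealed class Tracked {
    public static Int32 s_ctorCalls;   // Incremented only by the constructor
    public Int32 m_value;

    public Tracked() {
        s_ctorCalls++;
        m_value = 42;
    }

    // MemberwiseClone is protected, so expose it through a public helper.
    public Tracked ShallowCopy() { return (Tracked) MemberwiseClone(); }
}

public static class Program {
    public static void Main() {
        Tracked original = new Tracked();        // The constructor runs once
        Tracked copy = original.ShallowCopy();   // Bytes are copied; no constructor call

        // Allocates zeroed memory for a Tracked without running any constructor.
        Tracked raw = (Tracked) FormatterServices.GetUninitializedObject(typeof(Tracked));

        Console.WriteLine(Tracked.s_ctorCalls);  // 1
        Console.WriteLine(copy.m_value);         // 42 (copied from original)
        Console.WriteLine(raw.m_value);          // 0 (memory was zeroed)
    }
}
```

The constructor count stays at 1 even though three Tracked objects exist, which is why a type should not rely on its constructor having run for every instance.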

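💡 Important: Do not call virtual methods from a constructor. If the type being instantiated overrides the virtual method, the derived type’s implementation executes, but at that point the fields in the inheritance hierarchy have not all been initialized (the constructor of the type being instantiated has not run yet). Calling a virtual method can therefore produce unpredictable behavior. Ultimately, this is because the actual type whose implementation of the method executes is not selected until run time.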
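The hazard can be sketched with two hypothetical classes, Base and Derived (all names here are illustrative). Base’s constructor calls a virtual method, and the override in Derived reads a field that Derived’s constructor has not yet assigned.

```csharp
using System;

public class Base {
    public String m_seen;   // What the virtual method returned during Base's constructor

    public Base() {
        // Virtual call from a constructor: dispatches to the most derived override.
        m_seen = Describe();
    }

    public virtual String Describe() { return "Base"; }
}

public class Derived : Base {
    private String m_name;

    public Derived() {
        m_name = "Derived";   // Runs only after Base's constructor has returned
    }

    public override String Describe() {
        // m_name is still null when Base's constructor calls this method.
        return m_name ?? "(m_name not set yet)";
    }
}

public static class Program {
    public static void Main() {
        Derived d = new Derived();
        Console.WriteLine(d.m_seen);       // (m_name not set yet)
        Console.WriteLine(d.Describe());   // Derived
    }
}
```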

C# offers a simple syntax that allows the initialization of fields defined within a reference type when an instance of the type is constructed.

internal sealed class SomeType { 
 private Int32 m_x = 5; 
}

When a SomeType object is constructed, its m_x field will be initialized to 5. How does this happen? Well, if you examine the Intermediate Language (IL) for SomeType’s constructor method (also called .ctor), you’ll see the code shown here.

.method public hidebysig specialname rtspecialname 
 instance void .ctor() cil managed
{
 // Code size 14 (0xe)
 .maxstack 8
 IL_0000: ldarg.0
 IL_0001: ldc.i4.5
 IL_0002: stfld int32 SomeType::m_x
 IL_0007: ldarg.0
 IL_0008: call instance void [mscorlib]System.Object::.ctor()
 IL_000d: ret
} // end of method SomeType::.ctor

In this code, you see that SomeType’s constructor contains code to store a 5 into m_x and then calls the base class’s constructor. In other words, the C# compiler allows the convenient syntax that lets you initialize the instance fields inline and translates this to code in the constructor method to perform the initialization. This means that you should be aware of code explosion, as illustrated by the following class definition.

internal sealed class SomeType { 
 private Int32 m_x = 5; 
 private String m_s = "Hi there"; 
 private Double m_d = 3.14159; 
 private Byte m_b; 
 // Here are some constructors. 
 public SomeType() { ... } 
 public SomeType(Int32 x) { ... } 
 public SomeType(String s) { ...; m_d = 10; } 
}

When the compiler generates code for the three constructor methods, the beginning of each method includes the code to initialize m_x, m_s, and m_d. After this initialization code, the compiler inserts a call to the base class’s constructor, and then the compiler appends to the method the code that appears in the constructor methods. For example, the code generated for the constructor that takes a String parameter includes the code to initialize m_x, m_s, and m_d, call the base class’s (Object’s) constructor, and then overwrite m_d with the value 10. Note that m_b is guaranteed to be initialized to 0 even though no code exists to explicitly initialize it.

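💡 Note: The compiler initializes all fields that use the convenient syntax before calling the base class’s constructor, to maintain the impression given by the source code that these fields always have a value. A potential problem arises when the base class’s constructor calls a virtual method that calls back into a method defined by the derived class. In that case, the fields initialized with the convenient syntax have already been initialized before the virtual method is called.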
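To illustrate the ordering described in this note, here is a small sketch with hypothetical Base and Derived classes. One field of Derived uses an inline initializer and another is assigned in the constructor body; a virtual method called from Base’s constructor records what each field holds at that moment.

```csharp
using System;

public class Base {
    public Int32 m_seenInline, m_seenCtor;   // Values observed during Base's constructor

    public Base() { Report(); }              // Virtual call made before Derived's constructor body runs

    protected virtual void Report() { }
}

public class Derived : Base {
    private Int32 m_inline = 5;   // Inline initializer: emitted before the call to Base's constructor
    private Int32 m_ctor;         // Assigned in the constructor body: runs after Base's constructor

    public Derived() { m_ctor = 7; }

    protected override void Report() {
        m_seenInline = m_inline;
        m_seenCtor = m_ctor;
    }
}

public static class Program {
    public static void Main() {
        Derived d = new Derived();
        Console.WriteLine(d.m_seenInline);   // 5
        Console.WriteLine(d.m_seenCtor);     // 0
    }
}
```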

Because there are three constructors in the preceding class, the compiler generates the code to initialize m_x, m_s, and m_d three times—once per constructor. If you have several initialized instance fields and a lot of overloaded constructor methods, you should consider defining the fields without the initialization, creating a single constructor that performs the common initialization, and having each constructor explicitly call the common initialization constructor. This approach will reduce the size of the generated code. Here is an example using C#’s ability to explicitly have a constructor call another constructor by using the this keyword.

internal sealed class SomeType { 
 // Do not explicitly initialize the fields here. 
 private Int32 m_x; 
 private String m_s; 
 private Double m_d; 
 private Byte m_b; 
 // This constructor sets all fields to their default. 
 // All of the other constructors explicitly invoke this constructor. 
 public SomeType() { 
 m_x = 5; 
 m_s = "Hi there"; 
 m_d = 3.14159; 
 m_b = 0xff; 
 } 
 // This constructor sets all fields to their default, then changes m_x. 
 public SomeType(Int32 x) : this() { 
 m_x = x; 
 } 
 // This constructor sets all fields to their default, then changes m_s. 
 public SomeType(String s) : this() { 
 m_s = s; 
 } 
 // This constructor sets all fields to their default, then changes m_x & m_s. 
 public SomeType(Int32 x, String s) : this() { 
 m_x = x; 
 m_s = s; 
 } 
}

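💡 Summary: When an instance of a reference type is created, memory is first allocated for the instance’s data fields, and then the object’s overhead fields (type object pointer and sync block index) are initialized; they are called overhead fields because they are a cost that every object must pay when it is created. Finally, the type’s instance constructor is called to set the object’s initial state. When a reference type object is constructed, the memory allocated for it is always zeroed before the instance constructor is called, so any field that the constructor does not explicitly overwrite is guaranteed to be 0 or null. Because instance constructors are never inherited, they cannot be marked virtual, new, override, sealed, or abstract. If a class is abstract, the compiler-generated default constructor has protected accessibility; otherwise it is public. If the base class offers no parameterless constructor, the derived class must explicitly call a base class constructor or the compiler reports an error. If a class is static (sealed and abstract; a static class is an abstract sealed class in metadata), the compiler emits no default constructor at all. For code to be verifiable, a class’s instance constructor must call its base class’s constructor before accessing any inherited field. If a derived class’s constructor does not explicitly call a base class constructor, the C# compiler automatically generates a call to the base class’s default constructor. In a few situations, an instance can be created without an instance constructor being called. One typical example is Object’s MemberwiseClone method, which allocates memory, initializes the overhead fields, and copies the source object’s bytes to the new object. Also, deserializing an object with the runtime serializer usually does not call a constructor: the deserialization code allocates memory with the GetUninitializedObject or GetSafeUninitializedObject method of System.Runtime.Serialization.FormatterServices, and no constructor is called in the process. The C# compiler offers a convenient syntax for initializing instance fields inline. In each constructor, the compiler emits this initialization code first, then a call to the base class’s constructor, and then the constructor’s own code. In C#, the this keyword lets one constructor explicitly call another, which reduces the amount of generated code.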

# Instance Constructors and Structures (Value Types)

Value type (struct) constructors work quite differently from reference type (class) constructors. The common language runtime (CLR) always allows the creation of value type instances, and there is no way to prevent a value type from being instantiated. For this reason, value types don’t actually even need to have a constructor defined within them, and the C# compiler doesn't emit default parameterless constructors for value types. Examine the following code.

internal struct Point { 
 public Int32 m_x, m_y; 
} 
internal sealed class Rectangle { 
 public Point m_topLeft, m_bottomRight; 
}

To construct a Rectangle, the new operator must be used, and a constructor must be specified. In this case, the default constructor automatically generated by the C# compiler is called. When memory is allocated for the Rectangle, the memory includes the two instances of the Point value type. For performance reasons, the CLR doesn’t attempt to call a constructor for each value type field contained within the reference type. But as I mentioned earlier, the fields of the value types are initialized to 0/null.

The CLR does allow you to define constructors on value types. The only way that these constructors will execute is if you write code to explicitly call one of them, as in Rectangle’s constructor, shown here.

internal struct Point { 
 public Int32 m_x, m_y; 
 public Point(Int32 x, Int32 y) { 
 m_x = x; 
 m_y = y; 
 } 
} 
internal sealed class Rectangle { 
 public Point m_topLeft, m_bottomRight; 
 public Rectangle() { 
 // In C#, new on a value type calls the constructor to 
 // initialize the value type's fields. 
 m_topLeft = new Point(1, 2); 
 m_bottomRight = new Point(100, 200); 
 } 
}

A value type’s instance constructor is executed only when explicitly called. So if Rectangle’s constructor didn’t initialize its m_topLeft and m_bottomRight fields by using the new operator to call Point’s constructor, the m_x and m_y fields in both Point fields would be 0.
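A minimal sketch of this behavior, reusing the Point and Rectangle types from above but deliberately initializing only one of the two fields (that change is mine, made for illustration):

```csharp
using System;

internal struct Point {
    public Int32 m_x, m_y;
    public Point(Int32 x, Int32 y) { m_x = x; m_y = y; }
}

internal sealed class Rectangle {
    public Point m_topLeft, m_bottomRight;

    public Rectangle() {
        // Only m_topLeft gets an explicit constructor call.
        m_topLeft = new Point(1, 2);
        // m_bottomRight is left alone, so it keeps its zeroed memory.
    }
}

public static class Program {
    public static void Main() {
        Rectangle r = new Rectangle();
        Console.WriteLine(r.m_topLeft.m_x + ", " + r.m_topLeft.m_y);           // 1, 2
        Console.WriteLine(r.m_bottomRight.m_x + ", " + r.m_bottomRight.m_y);   // 0, 0
    }
}
```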

In the Point value type defined earlier, no default parameterless constructor is defined. However, let’s rewrite that code as follows.

internal struct Point { 
 public Int32 m_x, m_y; 
 public Point() { 
 m_x = m_y = 5; 
 } 
}
internal sealed class Rectangle { 
 public Point m_topLeft, m_bottomRight; 
 public Rectangle() { 
 } 
}

Now when a new Rectangle is constructed, what do you think the m_x and m_y fields in the two Point fields, m_topLeft and m_bottomRight, would be initialized to: 0 or 5? (Hint: This is a trick question.)

Many developers (especially those with a C++ background) would expect the C# compiler to emit code in Rectangle’s constructor that automatically calls Point’s default parameterless constructor for the Rectangle’s two fields. However, to improve the run-time performance of the application, the C# compiler doesn’t automatically emit this code. In fact, many compilers will never emit code to call a value type’s default constructor automatically, even if the value type offers a parameterless constructor. To have a value type’s parameterless constructor execute, the developer must add explicit code to call a value type’s constructor.

Based on the information in the preceding paragraph, you should expect the m_x and m_y fields in Rectangle’s two Point fields to be initialized to 0 in the code shown earlier because there are no explicit calls to Point’s constructor anywhere in the code.

However, I did say that my original question was a trick question. The trick part is that C# doesn’t allow a value type to define a parameterless constructor. So the previous code won’t actually compile. The C# compiler produces the following message when attempting to compile that code: error CS0568: Structs cannot contain explicit parameterless constructors.

C# purposely disallows value types from defining parameterless constructors to remove any confusion a developer might have about when that constructor gets called. If the constructor can’t be defined, the compiler can never generate code to call it automatically. Without a parameterless constructor, a value type’s fields are always initialized to 0/null.

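💡 Note: Strictly speaking, a value type’s fields are guaranteed to be 0 or null only when the value type is a field nested within a reference type. Stack-based value type fields have no such guarantee. However, for code to be verifiable, any stack-based value type field must be written to before it is read; allowing a read before a write would be a security hole. For all stack-based value types, C# and other compilers that produce verifiable code ensure that the fields are zeroed, or at least written to before being read, so that no verification exception is thrown at run time. For the most part, this means you can ignore this note and assume that your value types have their fields initialized to 0 or null.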

Keep in mind that although C# doesn’t allow value types with parameterless constructors, the CLR does. So if the unobvious behavior described earlier doesn’t bother you, you can use another programming language (such as IL assembly language) to define your value type with a parameterless constructor.

Because C# doesn’t allow value types with parameterless constructors, compiling the following type produces the following message: error CS0573: 'SomeValType.m_x': cannot have instance field initializers in structs.

internal struct SomeValType { 
 // You cannot do inline instance field initialization in a value type. 
 private Int32 m_x = 5; 
}

In addition, because verifiable code requires that every field of a value type be written to prior to any field being read, any constructors that you do have for a value type must initialize all of the type’s fields. The following type defines a constructor for the value type but fails to initialize all of the fields.

internal struct SomeValType { 
 private Int32 m_x, m_y; 
 // C# allows value types to have constructors that take parameters. 
 public SomeValType(Int32 x) { 
 m_x = x; 
 // Notice that m_y is not initialized here. 
 } 
}

When compiling this type, the C# compiler produces the following message: error CS0171: Field 'SomeValType.m_y' must be fully assigned before control leaves the constructor. To fix the problem, assign a value (usually 0) to m_y in the constructor.

As an alternative way to initialize all the fields of a value type, you can actually do the following.

// C# allows value types to have constructors that take parameters.
public SomeValType(Int32 x) {
 // Looks strange but compiles fine and initializes all fields to 0/null.
 this = new SomeValType();
 m_x = x; // Overwrite m_x's 0 with x
 // Notice that m_y was initialized to 0.
}

In a value type’s constructor, this represents an instance of the value type itself and you can actually assign to it the result of newing up an instance of the value type, which really just zeroes out all the fields. In a reference type’s constructor, this is considered read-only, so you cannot assign to it at all.

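💡 Summary: The CLR always allows value type instances to be created, and there is no way to prevent a value type from being instantiated. Value types therefore do not need to define a constructor, and the C# compiler does not emit a default parameterless constructor for them. For performance reasons, the CLR does not call a constructor for each value type field contained in a reference type, but, as noted earlier, those fields are initialized to 0 or null. The CLR does allow value types to define constructors (with parameters), but they execute only when called explicitly. To improve run-time performance, many compilers never emit code that calls a value type’s parameterless constructor automatically, even if the type offers one. In fact, the C# compiler does not allow a value type to define a parameterless constructor. (In my tests it could be defined, presumably because C# 10 and later allow parameterless struct constructors.) Compiling such code produces error CS0568: Structs cannot contain explicit parameterless constructors. Because a parameterless constructor cannot be defined, the compiler never generates code to call one automatically, and without a parameterless constructor a value type’s fields are always initialized to 0 or null. Note that although C# does not allow value types with parameterless constructors, the CLR does. Because C# does not allow value types to define parameterless constructors, instance fields of a value type cannot be initialized inline; compiling such code produces error CS0573: cannot have instance field initializers in structs. (In my tests the compiler reported CS8983 instead, which asks for an explicitly declared constructor, and even then the field initializer did not take effect.) In addition, for code to be verifiable, every field of a value type must be assigned before any field is read; otherwise the C# compiler reports error CS0171: the field must be fully assigned before control is returned to the caller. In a value type’s constructor, this represents an instance of the value type itself, and a new instance of the value type can be assigned to it, which zeroes all the fields. In a reference type’s constructor, this is considered read-only and cannot be assigned to.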

# Type Constructors

In addition to instance constructors, the CLR supports type constructors (also known as static constructors, class constructors, or type initializers). A type constructor can be applied to interfaces (although C# doesn’t allow this), reference types, and value types. Just as instance constructors are used to set the initial state of an instance of a type, type constructors are used to set the initial state of a type. By default, types don’t have a type constructor defined within them. If a type has a type constructor, it can have no more than one. In addition, type constructors never have parameters. In C#, here’s how to define a reference type and a value type that have type constructors.

internal sealed class SomeRefType { 
 static SomeRefType() { 
 // This executes the first time a SomeRefType is accessed. 
 } 
} 
internal struct SomeValType { 
 // C# does allow value types to define parameterless type constructors. 
 static SomeValType() { 
 // This executes the first time a SomeValType is accessed. 
 } 
}

You’ll notice that you define type constructors just as you would parameterless instance constructors, except that you must mark them as static. Also, type constructors should always be private; C# makes them private for you automatically. In fact, if you explicitly mark a type constructor as private (or anything else) in your source code, the C# compiler issues the following error: error CS0515: 'SomeValType.SomeValType()': access modifiers are not allowed on static constructors. Type constructors should be private to prevent any developer-written code from calling them; the CLR is always capable of calling a type constructor.

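💡 Important: Although you can define a type constructor within a value type, you should never actually do so, because there are times when the CLR does not call a value type’s static type constructor. Here is an example: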

internal struct SomeValType { 
 static SomeValType() { 
 Console.WriteLine("This never gets displayed"); 
 } 
 public Int32 m_x; 
} 
public sealed class Program { 
 public static void Main() { 
 SomeValType[] a = new SomeValType[10]; 
 a[0].m_x = 123; 
 Console.WriteLine(a[0].m_x); // Displays 123 
 } 
}

The calling of a type constructor is a tricky thing. When the just-in-time (JIT) compiler is compiling a method, it sees what types are referenced in the code. If any of the types define a type constructor, the JIT compiler checks if the type’s type constructor has already been executed for this AppDomain. If the constructor has never executed, the JIT compiler emits a call to the type constructor into the native code that the JIT compiler is emitting. If the type constructor for the type has already executed, the JIT compiler does not emit the call because it knows that the type is already initialized.

Now, after the method has been JIT-compiled, the thread starts to execute it and will eventually get to the code that calls the type constructor. In fact, it is possible that multiple threads will be executing the same method concurrently. The CLR wants to ensure that a type’s constructor executes only once per AppDomain. To guarantee this, when a type constructor is called, the calling thread acquires a mutually exclusive thread synchronization lock. So if multiple threads attempt to simultaneously call a type’s static constructor, only one thread will acquire the lock and the other threads will block. The first thread will execute the code in the static constructor. After the first thread leaves the constructor, the waiting threads will wake up and will see that the constructor’s code has already been executed. These threads will not execute the code again; they will simply return from the constructor method. In addition, if any of these methods ever get called again, the CLR knows that the type constructor has already executed and will ensure that the constructor is not called again.

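💡 Note: Because the CLR guarantees that a type constructor executes only once per AppDomain and in a thread-safe manner, a type constructor is a great place to initialize any singleton objects that the type requires.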
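One way to sketch a singleton that relies on this guarantee is shown below. The Settings type and its members are hypothetical names chosen for the example; the counter exists only so the demo can confirm that a single instance was created.

```csharp
using System;
using System.Threading.Tasks;

public sealed class Settings {
    public static Int32 s_instancesCreated;        // For demonstration only
    private static readonly Settings s_instance;   // Assigned once, by the type constructor

    // The CLR runs this at most once per AppDomain, under its own lock.
    static Settings() { s_instance = new Settings(); }

    private Settings() { s_instancesCreated++; }

    public static Settings Instance { get { return s_instance; } }
}

public static class Program {
    // Reads the singleton from several threads and checks that all references match.
    public static Boolean AllSame() {
        Settings[] seen = new Settings[16];
        Parallel.For(0, seen.Length, i => { seen[i] = Settings.Instance; });
        foreach (Settings s in seen) {
            if (!Object.ReferenceEquals(s, seen[0])) return false;
        }
        return seen[0] != null;
    }

    public static void Main() {
        Console.WriteLine(AllSame());                     // True
        Console.WriteLine(Settings.s_instancesCreated);   // 1
    }
}
```

No explicit lock appears in Settings; the thread safety comes entirely from the CLR’s handling of the type constructor.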

Within a single thread, there is a potential problem that can occur if two type constructors each contain code that references the other’s type. For example, ClassA has a type constructor containing code that references ClassB, and ClassB has a type constructor containing code that references ClassA. In this situation, the CLR still guarantees that each type constructor’s code executes only once; however, it cannot guarantee that ClassA’s type constructor code has run to completion before executing ClassB’s type constructor. You should certainly try to avoid writing code that sets up this scenario. In fact, because the CLR is responsible for calling type constructors, you should always avoid writing any code that requires type constructors to be called in a specific order.
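The circular scenario can be sketched as follows. ClassA and ClassB are the names used above; the s_value fields and the arithmetic are assumptions added so that the effect is observable.

```csharp
using System;

internal static class ClassA {
    public static Int32 s_value;
    // Touching ClassB here triggers ClassB's type constructor.
    static ClassA() { s_value = ClassB.s_value + 1; }
}

internal static class ClassB {
    public static Int32 s_value;
    // ClassA's type constructor is already in progress on this thread,
    // so this read sees ClassA.s_value while it is still 0.
    static ClassB() { s_value = ClassA.s_value + 1; }
}

public static class Program {
    public static void Main() {
        // ClassA is accessed first, so its type constructor starts first.
        Console.WriteLine(ClassA.s_value);   // 2
        Console.WriteLine(ClassB.s_value);   // 1
    }
}
```

Each type constructor ran exactly once, but ClassB’s ran against a half-initialized ClassA. If Main touched ClassB first, the two results would swap, which is the order dependence to avoid.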

Finally, if a type constructor throws an unhandled exception, the CLR considers the type to be unusable. Attempting to access any fields or methods of the type will cause a System.TypeInitializationException to be thrown.
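A short sketch of this failure mode, using a hypothetical Broken type whose type constructor always throws:

```csharp
using System;

internal static class Broken {
    public static Int32 s_value = 1;

    static Broken() {
        throw new InvalidOperationException("type constructor failed");
    }
}

public static class Program {
    // Tries to read a static field and reports what went wrong.
    public static String Probe() {
        try {
            return Broken.s_value.ToString();
        }
        catch (TypeInitializationException e) {
            // The original exception is available through InnerException.
            return e.InnerException.GetType().Name;
        }
    }

    public static void Main() {
        Console.WriteLine(Probe());   // InvalidOperationException
        Console.WriteLine(Probe());   // InvalidOperationException (the type stays unusable)
    }
}
```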

The code in a type constructor has access only to a type’s static fields, and its usual purpose is to initialize those fields. As it does with instance fields, C# offers a simple syntax that allows you to initialize a type’s static fields.

internal sealed class SomeType { 
 private static Int32 s_x = 5; 
}

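💡 Note: Although C# does not allow a value type to use inline field initialization syntax for its instance fields, it does allow it for static fields. In other words, if you change the SomeType type defined above from a class to a struct, the code compiles and works as expected.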

When this code is built, the compiler automatically generates a type constructor for SomeType. It’s as if the source code had originally been written as follows.

internal sealed class SomeType { 
 private static Int32 s_x; 
 static SomeType() { s_x = 5; } 
}

Using ILDasm.exe, it’s easy to verify what the compiler actually produced by examining the IL for the type constructor. Type constructor methods are always called .cctor (for class constructor) in a method definition metadata table.

In the code below, you see that the .cctor method is private and static. In addition, notice that the code in the method does in fact load a 5 into the static field s_x.

.method private hidebysig specialname rtspecialname static 
 void .cctor() cil managed
{
 // Code size 7 (0x7)
 .maxstack 8
 IL_0000: ldc.i4.5
 IL_0001: stsfld int32 SomeType::s_x
 IL_0006: ret
} // end of method SomeType::.cctor

Type constructors shouldn’t call a base type’s type constructor. Such a call isn’t necessary because none of a type’s static fields are shared or inherited from its base type.

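💡 Note: Some languages, such as Java, expect that accessing a type causes its type constructor and the type constructors of all of its base types to be called. In addition, interfaces implemented by the type must also have their type constructors called. The CLR does not offer this behavior. However, compilers and developers can obtain it by using the RunClassConstructor method of System.Runtime.CompilerServices.RuntimeHelpers. A language that requires this behavior would have its compiler emit code into a type’s type constructor that calls this method for all base types. When RunClassConstructor is used to call a type constructor, the CLR knows whether the type constructor has already executed and, if it has, does not call it again.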
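As a rough sketch of how RunClassConstructor can be used this way, the hypothetical DerivedType below forces its base type’s type constructor to run from inside its own. The Log helper and both type names are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

internal static class Log {
    public static readonly List<String> Entries = new List<String>();
}

internal class BaseType {
    static BaseType() { Log.Entries.Add("BaseType"); }
}

internal class DerivedType : BaseType {
    static DerivedType() {
        // Explicitly run the base type's type constructor first.
        RuntimeHelpers.RunClassConstructor(typeof(BaseType).TypeHandle);
        Log.Entries.Add("DerivedType");
    }
}

public static class Program {
    public static void Main() {
        RuntimeHelpers.RunClassConstructor(typeof(DerivedType).TypeHandle);
        Console.WriteLine(String.Join(", ", Log.Entries));   // BaseType, DerivedType

        // Asking again does nothing: the CLR knows both constructors already ran.
        RuntimeHelpers.RunClassConstructor(typeof(DerivedType).TypeHandle);
        Console.WriteLine(Log.Entries.Count);                // 2
    }
}
```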

Finally, assume that you have this code.

internal sealed class SomeType { 
 private static Int32 s_x = 5; 
 static SomeType() { 
 s_x = 10; 
 } 
}

In this case, the C# compiler generates a single type constructor method. This constructor first initializes s_x to 5 and then initializes s_x to 10. In other words, when the C# compiler generates IL code for the type constructor, it first emits the code required to initialize the static fields followed by the explicit code contained in your type constructor method.
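To see that ordering at run time, the sketch below adds a hypothetical s_seenInCtor field and an X property to SomeType (neither is in the original class) so that the value of s_x can be observed before and after the explicit code runs.

```csharp
using System;

internal sealed class SomeType {
    public static Int32 s_seenInCtor;   // Value of s_x when the explicit code starts
    private static Int32 s_x = 5;

    static SomeType() {
        s_seenInCtor = s_x;   // The inline initializer has already stored 5
        s_x = 10;
    }

    public static Int32 X { get { return s_x; } }
}

public static class Program {
    public static void Main() {
        Console.WriteLine(SomeType.s_seenInCtor);   // 5
        Console.WriteLine(SomeType.X);              // 10
    }
}
```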

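💡 Important: Developers occasionally ask me whether there is a way to execute some code when a type is unloaded. You should first know that types are unloaded only when the AppDomain unloads. When the AppDomain unloads, the object that identifies the type (the type object) becomes unreachable, and the garbage collector reclaims its memory. This behavior leads many developers to believe that they could add a static Finalize method to the type, which would be called automatically when the type is unloaded. Unfortunately, the CLR does not support static Finalize methods. All is not lost, however: to execute some code when an AppDomain unloads, you can register a callback method with the System.AppDomain type’s DomainUnload event.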

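💡 Summary: A type constructor can be applied to interfaces (although the C# compiler doesn’t allow this), reference types, and value types. Just as instance constructors set the initial state of an instance of a type, type constructors set the initial state of the type itself. By default a type has no type constructor; if it defines one, it can define only one, and a type constructor never has parameters. Type constructors are always private, and C# marks them private automatically, so that developer-written code cannot call them; the CLR is always responsible for calling them. When the JIT compiler compiles a method, it sees which types the code references. If any of those types defines a type constructor, the JIT compiler checks whether that type constructor has already executed for the current AppDomain. If it has never executed, the JIT compiler emits a call to it into the native code being generated; if it has, no call is emitted. Multiple threads may execute the same method concurrently, and the CLR wants a type constructor to execute only once per AppDomain. To guarantee this, the calling thread acquires a mutually exclusive thread synchronization lock when it calls a type constructor, so if multiple threads try to call a type’s static constructor simultaneously, only one acquires the lock and the others block. Because the CLR is responsible for calling type constructors, no code should require them to be called in a specific order. The code in a type constructor has access only to the type’s static fields, and its usual purpose is to initialize them. Type constructor methods are always called .cctor (for class constructor). A type constructor shouldn’t call a base type’s type constructor; such a call is unnecessary because none of a type’s static fields are shared with or inherited from its base type. When the C# compiler generates IL for a type constructor, it first emits the code that initializes the static fields and then the code that appears explicitly in your type constructor method.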

# Operator Overload Methods

Some programming languages allow a type to define how operators should manipulate instances of the type. For example, a lot of types (such as System.String, System.Decimal, and System.DateTime) overload the equality (==) and inequality (!=) operators. The CLR doesn’t know anything about operator overloading because it doesn’t even know what an operator is. Your programming language defines what each operator symbol means and what code should be generated when these special symbols appear.

For example, in C#, applying the + symbol to primitive numbers causes the compiler to generate code that adds the two numbers together. When the + symbol is applied to String objects, the C# compiler generates code that concatenates the two strings together. For inequality, C# uses the != symbol, while Microsoft Visual Basic uses the <> symbol. Finally, the ^ symbol means exclusive OR (XOR) in C#, but it means exponent in Visual Basic.

Although the CLR doesn’t know anything about operators, it does specify how languages should expose operator overloads so that they can be readily consumed by code written in a different programming language. Each programming language gets to decide for itself whether it will support operator overloads, and if it does, the syntax for expressing and using them. As far as the CLR is concerned, operator overloads are simply methods.

Your choice of programming language determines whether or not you get the support of operator overloading and what the syntax looks like. When you compile your source code, the compiler produces a method that identifies the behavior of the operator. The CLR specification mandates that operator overload methods be public and static methods. In addition, C# (and many other languages) requires that at least one of the operator method’s parameters must be the same as the type that the operator method is defined within. The reason for this restriction is that it enables the C# compiler to search for a possible operator method to bind to in a reasonable amount of time.

Here is an example of an operator overload method defined in a C# class definition.

public sealed class Complex { 
 public static Complex operator+(Complex c1, Complex c2) { ... } 
}

The compiler emits a metadata method definition entry for a method called op_Addition; the method definition entry also has the specialname flag set, indicating that this is a “special” method. When language compilers (including the C# compiler) see a + operator specified in source code, they look to see if one of the operand’s types defines a specialname method called op_Addition whose parameters are compatible with the operand’s types. If this method exists, the compiler emits code to call this method. If no such method exists, a compilation error occurs.
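The metadata described here can be inspected with reflection. The sketch below fills in the Complex class with hypothetical Re and Im fields (the original elides the body) and then looks up the method that the compiler emitted for the + operator.

```csharp
using System;
using System.Reflection;

public sealed class Complex {
    public readonly Double Re, Im;   // Hypothetical fields for the example

    public Complex(Double re, Double im) { Re = re; Im = im; }

    public static Complex operator +(Complex c1, Complex c2) {
        return new Complex(c1.Re + c2.Re, c1.Im + c2.Im);
    }
}

public static class Program {
    public static void Main() {
        // The C# compiler emitted the operator as a method named op_Addition.
        MethodInfo mi = typeof(Complex).GetMethod("op_Addition");
        Console.WriteLine(mi.IsSpecialName);             // True
        Console.WriteLine(mi.IsPublic && mi.IsStatic);   // True

        // Using the operator and invoking the method produce the same result.
        Complex a = new Complex(1, 2), b = new Complex(3, 4);
        Complex viaOperator = a + b;
        Complex viaMethod = (Complex) mi.Invoke(null, new Object[] { a, b });
        Console.WriteLine(viaOperator.Re + ", " + viaOperator.Im);   // 4, 6
        Console.WriteLine(viaMethod.Re + ", " + viaMethod.Im);       // 4, 6
    }
}
```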

Tables 8-1 and 8-2 show the set of unary and binary operators that C# supports being overloaded, their symbols, and the corresponding Common Language Specification (CLS) method name that the compiler emits. I’ll explain the tables’ third columns in the next section.

[Table 8-1: C# unary operators, their symbols, and CLS-compliant method names]

[Table 8-2: C# binary operators, their symbols, and CLS-compliant method names]

The CLR specification defines many additional operators that can be overloaded, but C# does not support these additional operators. Therefore, they are not in mainstream use, so I will not list them here. If you are interested in the complete list, please see the ECMA specifications (www.ecma-international.org/publications/standards/Ecma-335.htm) for the Common Language Infrastructure (CLI), Partition I, Concepts and Architecture, Sections 10.3.1 (unary operators) and 10.3.2 (binary operators).

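💡 Note: If you examine the core numeric types (Int32, Int64, UInt32, and so on) in the Framework Class Library (FCL), you’ll see that they don’t define any operator overload methods. The reason is that compilers look specifically for operations on these primitive types and emit IL instructions that directly manipulate instances of these types. If the types were to offer methods and compilers were to emit code to call them, each call would incur a run-time performance cost. Plus, the method would ultimately have to execute some IL instructions to perform the expected operation anyway. This is why the core FCL types do not define any operator overload methods. For developers, this means that if the programming language you’re using doesn’t support one of the core FCL types, you can’t perform any operations on instances of that type.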

# Operators and Programming Language Interoperability

Operator overloading can be a very useful tool, allowing developers to express their thoughts with succinct code. However, not all programming languages support operator overloading. When using a language that doesn’t support operator overloading, the language will not know how to interpret the + operator (unless the type is a primitive in that language), and the compiler will emit an error. When using languages that do not support operator overloading, the language should allow you to call the desired op_* method directly (such as op_Addition).

If you are using a language that doesn’t support + operator overloading to be defined in a type, obviously, this type could still offer an op_Addition method. From C#, you might expect that you could call this op_Addition method by using the + operator, but you cannot. When the C# compiler detects the + operator, it looks for an op_Addition method that has the specialname metadata flag associated with it so that the compiler knows for sure that the op_Addition method is intended to be an operator overload method. Because the op_Addition method is produced by a language that doesn’t support operator overloads, the method won’t have the specialname flag associated with it, and the C# compiler will produce a compilation error. Of course, code in any language can explicitly call a method that just happens to be named op_Addition, but the compilers won’t translate a usage of the + symbol to call this method.

💡 Note: The FCL’s System.Decimal type is a good example of how to overload operators and define the friendly method names according to Microsoft’s design guidelines.
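For instance, Decimal exposes Add and Negate alongside its + and unary - operators. A brief sketch comparing the two forms (the helper method name is my own):

```csharp
using System;

public static class Program {
    // Returns true when the friendly methods agree with the operators.
    public static Boolean FriendlyNamesMatch(Decimal a, Decimal b) {
        Boolean addMatches = Decimal.Add(a, b) == a + b;
        Boolean negateMatches = Decimal.Negate(a) == -a;
        return addMatches && negateMatches;
    }

    public static void Main() {
        Console.WriteLine(FriendlyNamesMatch(1.5m, 2.25m));   // True
    }
}
```

A language without operator overloading can call Decimal.Add directly, which is the interoperability case that the friendly names are meant to cover.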

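💡 Summary: Some languages allow a type to define how operators manipulate instances of the type. The CLR knows nothing about operator overloading; it doesn’t even know what an operator is. The programming language defines what each operator symbol means and what code is generated when the symbol appears. The CLR specification mandates that operator overload methods be public and static. In addition, C# (and many other languages) requires that at least one parameter of an operator overload method be of the type in which the method is defined, so that the C# compiler can find a candidate operator method to bind to in a reasonable amount of time. The CLR specification defines many additional overloadable operators that C# does not support. In a language that doesn’t support operator overloading, the language does not know how to interpret the + operator (unless the type is a primitive in that language) and the compiler reports an error; such a language should let you call the desired op_* method directly (for example, op_Addition). A type written in a language that doesn’t support defining a + operator overload can still offer an op_Addition method, but the + operator in C# will not call it: when the C# compiler sees +, it looks for an op_Addition method that carries the specialname metadata flag, which tells it that the method is intended to be an operator overload. Because this op_Addition method was produced by a language without operator overload support, it lacks the specialname flag, and the C# compiler reports a compilation error. Code in any language can, of course, explicitly call a method that happens to be named op_Addition, but compilers will not translate a use of the + symbol into a call to it.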

# Conversion Operator Methods

Occasionally, you need to convert an object from one type to an object of a different type. For example, I’m sure you’ve had to convert a Byte to an Int32 at some point in your life. When the source type and the target type are a compiler’s primitive types, the compiler knows how to emit the necessary code to convert the object.

If the source type or target type is not a primitive, the compiler emits code that has the CLR perform the conversion (cast). In this case, the CLR just checks if the source object’s type is the same type as the target type (or derived from the target type). However, it is sometimes natural to want to convert an object of one type to a completely different type. For example, the System.Xml.Linq.XElement class allows you to convert an Extensible Markup Language (XML) element to a Boolean, (U)Int32, (U)Int64, Single, Double, Decimal, String, DateTime, DateTimeOffset, TimeSpan, Guid, or the nullable equivalent of any of these types (except String). You could also imagine that the FCL included a Rational data type and that it might be convenient to convert an Int32 object or a Single object to a Rational object. Moreover, it also might be nice to convert a Rational object to an Int32 or a Single object.

To make these conversions, the Rational type should define public constructors that take a single parameter: an instance of the type that you’re converting from. You should also define public instance ToXxx methods that take no parameters (just like the very popular ToString method). Each method will convert an instance of the defining type to the Xxx type. Here’s how to correctly define conversion constructors and methods for a Rational type.

public sealed class Rational { 
 // Constructs a Rational from an Int32 
 public Rational(Int32 num) { ... } 
 // Constructs a Rational from a Single 
 public Rational(Single num) { ... } 
 // Converts a Rational to an Int32 
 public Int32 ToInt32() { ... } 
 // Converts a Rational to a Single 
 public Single ToSingle() { ... } 
}

By invoking these constructors and methods, a developer using any programming language can convert an Int32 or a Single object to a Rational object and convert a Rational object to an Int32 or a Single object. The ability to do these conversions can be quite handy, and when designing a type, you should seriously consider what conversion constructors and methods make sense for your type.

In the previous section, I discussed how some programming languages offer operator overloading. Well, some programming languages (such as C#) also offer conversion operator overloading. Conversion operators are methods that convert an object from one type to another type. You define a conversion operator method by using special syntax. The CLR specification mandates that conversion overload methods be public and static methods. In addition, C# (and many other languages) requires that either the parameter or the return type must be the same as the type that the conversion method is defined within. The reason for this restriction is that it enables the C# compiler to search for a possible operator method to bind to in a reasonable amount of time. The following code adds four conversion operator methods to the Rational type.

public sealed class Rational { 
   // Constructs a Rational from an Int32 
   public Rational(Int32 num) { ... } 
   // Constructs a Rational from a Single 
   public Rational(Single num) { ... } 
   // Converts a Rational to an Int32 
   public Int32 ToInt32() { ... } 
   // Converts a Rational to a Single 
   public Single ToSingle() { ... }

   // Implicitly constructs and returns a Rational from an Int32 
   public static implicit operator Rational(Int32 num) { 
      return new Rational(num); 
   } 
   // Implicitly constructs and returns a Rational from a Single 
   public static implicit operator Rational(Single num) { 
      return new Rational(num); 
   } 
   // Explicitly returns an Int32 from a Rational 
   public static explicit operator Int32(Rational r) { 
      return r.ToInt32(); 
   } 
   // Explicitly returns a Single from a Rational 
   public static explicit operator Single(Rational r) { 
      return r.ToSingle(); 
   } 
}

For conversion operator methods, you must indicate whether a compiler can emit code to call a conversion operator method implicitly or whether the source code must explicitly indicate when the compiler is to emit code to call a conversion operator method. In C#, you use the implicit keyword to indicate to the compiler that an explicit cast doesn’t have to appear in the source code in order to emit code that calls the method. The explicit keyword allows the compiler to call the method only when an explicit cast exists in the source code.

After the implicit or explicit keyword, you tell the compiler that the method is a conversion operator by specifying the operator keyword. After the operator keyword, you specify the type that an object is being cast to; in the parentheses, you specify the type that an object is being cast from.

Defining the conversion operators in the preceding Rational type allows you to write code like this (in C#).

public sealed class Program { 
   public static void Main() { 
      Rational r1 = 5;          // Implicit cast from Int32 to Rational 
      Rational r2 = 2.5F;       // Implicit cast from Single to Rational 
      Int32 x = (Int32) r1;     // Explicit cast from Rational to Int32 
      Single s = (Single) r2;   // Explicit cast from Rational to Single 
   } 
}

Under the covers, the C# compiler detects the casts (type conversions) in the code and internally generates IL code that calls the conversion operator methods defined by the Rational type. But what are the names of these methods? Well, compiling the Rational type and examining its metadata shows that the compiler produces one method for each conversion operator defined. For the Rational type, the metadata for the four conversion operator methods looks like this.

public static Rational op_Implicit(Int32 num) 
public static Rational op_Implicit(Single num) 
public static Int32 op_Explicit(Rational r) 
public static Single op_Explicit(Rational r)

As you can see, methods that convert an object from one type to another are always named op_Implicit or op_Explicit. You should define an implicit conversion operator only when precision or magnitude isn’t lost during a conversion, such as when converting an Int32 to a Rational. However, you should define an explicit conversion operator if precision or magnitude is lost during the conversion, as when converting a Rational object to an Int32. If an explicit conversion fails, you should indicate this by having your explicit conversion operator method throw an OverflowException or an InvalidOperationException.
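The guideline above can be sketched with a small, runnable Rational. The chapter never shows the type’s fields, so the Int64 numerator/denominator pair and the FromFraction factory below are assumptions made purely for illustration; only the implicit/explicit split and the OverflowException behavior come from the text.

```csharp
using System;

// Hypothetical layout: the Int64 numerator/denominator fields are an
// assumption for illustration, not the chapter's implementation.
public sealed class Rational {
    private readonly Int64 m_numerator;
    private readonly Int64 m_denominator;

    private Rational(Int64 numerator, Int64 denominator) {
        if (denominator == 0) throw new DivideByZeroException();
        m_numerator = numerator;
        m_denominator = denominator;
    }

    // Conversion constructor: every Int32 is exactly representable
    public Rational(Int32 num) : this(num, 1) { }

    // Hypothetical factory so the demo can build values such as 7/2
    public static Rational FromFraction(Int64 numerator, Int64 denominator) {
        return new Rational(numerator, denominator);
    }

    // Truncates toward zero; throws OverflowException if the result
    // does not fit in an Int32
    public Int32 ToInt32() {
        return checked((Int32)(m_numerator / m_denominator));
    }

    // Nothing is lost going from Int32 to Rational, so this is implicit
    public static implicit operator Rational(Int32 num) { return new Rational(num); }

    // The fraction can be lost or the value can overflow, so this is explicit
    public static explicit operator Int32(Rational r) { return r.ToInt32(); }
}

public static class RationalDemo {
    public static void Main() {
        Rational r = 5;                                          // calls op_Implicit
        Console.WriteLine((Int32) r);                            // calls op_Explicit; prints 5
        Console.WriteLine((Int32) Rational.FromFraction(7, 2));  // prints 3

        try {
            Int32 tooBig = (Int32) Rational.FromFraction(Int64.MaxValue, 1);
            Console.WriteLine(tooBig);
        }
        catch (OverflowException) {
            Console.WriteLine("OverflowException");              // this branch runs
        }
    }
}
```

Using checked inside ToInt32 keeps the overflow behavior independent of the project’s compiler settings, which is one reasonable way to honor the guideline.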

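💡Note: The two op_Explicit methods take the same parameter, a Rational. However, the methods differ by their return type, an Int32 and a Single. This is an example of two methods that differ only by their return type. The CLR fully supports the ability for a type to define multiple methods that differ only by return type, but very few languages expose this ability. As you’re probably aware, C++, C#, Visual Basic, and Java do not allow you to define multiple methods in a type that differ only by their return type. A few languages (such as IL assembly language) allow the developer to explicitly select which of these methods to call. Of course, IL assembly language programmers shouldn’t take advantage of this ability, because the methods they define would not be callable from other programming languages. Even though C# doesn’t expose this ability to the C# programmer, the compiler does take advantage of it internally when a type defines conversion operator methods.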

C# has full support for conversion operators. When it detects code where you’re using an object of one type and an object of a different type is expected, the compiler searches for an implicit conversion operator method capable of performing the conversion. If one exists, the compiler emits a call to it in the resulting IL code. If the compiler sees source code that is explicitly casting an object from one type to another type, the compiler searches for an implicit or explicit conversion operator method. If one exists, the compiler emits the call to the method. If the compiler can’t find an appropriate conversion operator method, it issues an error and doesn’t compile the code.

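💡Note: C# generates code to invoke explicit conversion operator methods only when you use a cast expression; these methods are never invoked when you use C#’s as or is operators. (See Section 4.2.)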
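A short sketch of that note follows. The Meters type is hypothetical and exists only for this demonstration; the point is that conversion operators are bound at compile time from the static type of the expression, so as, is, and a cast from Object never reach them.

```csharp
using System;

// Hypothetical wrapper type used only for this demonstration
public sealed class Meters {
    public readonly Double Value;
    public Meters(Double value) { Value = value; }

    // User-defined implicit conversion from Double
    public static implicit operator Meters(Double value) { return new Meters(value); }
}

public static class CastVersusAsDemo {
    public static void Main() {
        Object boxed = 3.0;                    // a boxed Double, not a Meters

        Meters viaCast = (Meters) 3.0;         // static type is Double: op_Implicit is called
        Console.WriteLine(viaCast.Value);      // prints 3

        Console.WriteLine(boxed is Meters);    // prints False: only the runtime type is checked
        Meters viaAs = boxed as Meters;        // no operator call; the result is null
        Console.WriteLine(viaAs == null);      // prints True

        try {
            // The static type here is Object, so the compiler emits a plain
            // type-checking cast instead of a call to op_Implicit.
            Meters bad = (Meters) boxed;
            Console.WriteLine(bad.Value);
        }
        catch (InvalidCastException) {
            Console.WriteLine("InvalidCastException");   // this branch runs
        }
    }
}
```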

To really understand operator overload methods and conversion operator methods, I strongly encourage you to examine the System.Decimal type as a role model. Decimal defines several constructors that allow you to convert objects from various types to a Decimal. It also offers several ToXxx methods that let you convert a Decimal object to another type. Finally, the type defines several conversion operators and operator overload methods as well.

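💡Summary: In addition to defining constructors and other methods to perform type conversions, some programming languages (such as C#) offer conversion operator overloading. Conversion operators are methods that convert an object from one type to another type, and you define them by using special syntax. The CLR specification mandates that conversion operator overload methods be public and static. In addition, C# (and many other languages) requires that either the parameter type or the return type be the same as the type in which the conversion method is defined. This restriction enables the C# compiler to find an operator method to bind to in a reasonable amount of time. For each conversion operator method, you must indicate whether the compiler can emit code to call it implicitly or only when the source code contains an explicit cast. In C#, the implicit keyword tells the compiler that no explicit cast has to appear in the source code in order to emit code that calls the method; the explicit keyword tells the compiler to call the method only when it finds an explicit cast. After the implicit or explicit keyword, the operator keyword tells the compiler that the method is a conversion operator. After the operator keyword, you specify the type the object is being converted to; in the parentheses, you specify the type it is being converted from. In metadata, methods that convert an object from one type to another are always named op_Implicit or op_Explicit. Define an implicit conversion operator only when the conversion loses no precision or magnitude; otherwise, define an explicit conversion operator and have it throw an OverflowException or an InvalidOperationException when the conversion fails. When the compiler detects code that requires an implicit conversion and a matching implicit conversion operator method exists, it emits a call to that method in the resulting IL code. When the compiler sees source code that explicitly casts an object from one type to another, it searches for an implicit or explicit conversion operator method capable of performing the conversion. If one exists, the compiler emits IL code to call it; if none is found, the compiler issues an error and doesn’t compile the code.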

# Extension Methods

The best way to understand C#’s extension methods feature is by way of an example. In the “StringBuilder Members” section in Chapter 14, “Chars, Strings, and Working with Text,” I mention how the StringBuilder class offers fewer methods than the String class for manipulating a string and how strange this is, considering that the StringBuilder class is the preferred way of manipulating a string because it is mutable. So, let’s say that you would like to define some of these missing methods yourself to operate on a StringBuilder. For example, you might want to define your own IndexOf method as follows.

public static class StringBuilderExtensions {
   public static Int32 IndexOf(StringBuilder sb, Char value) {
      for (Int32 index = 0; index < sb.Length; index++)
         if (sb[index] == value) return index;
      return -1;
   }
}

Now that you have defined this method, you can use it as the following code demonstrates.

StringBuilder sb = new StringBuilder("Hello. My name is Jeff."); // The initial string
// Change period to exclamation and get # characters in 1st sentence (5).
Int32 index = StringBuilderExtensions.IndexOf(sb.Replace('.', '!'), '!');

This code works just fine, but it is not ideal from a programmer’s perspective. The first problem is that a programmer who wants to get the index of a character within a StringBuilder must know that the StringBuilderExtensions class even exists. The second problem is that the code does not reflect the order of operations that are being performed on the StringBuilder object, making the code difficult to write, read, and maintain. The programmer wants to call Replace first and then call IndexOf; but when you read the last line of code from left to right, IndexOf appears first on the line and Replace appears second. Of course, you could alleviate this problem and make the code’s behavior more understandable by rewriting it like this.

// First, change period to exclamation mark
sb.Replace('.', '!');
// Now, get # characters in 1st sentence (5)
Int32 index = StringBuilderExtensions.IndexOf(sb, '!');

However, a third problem exists with both versions of this code that affects understanding the code’s behavior. The use of StringBuilderExtensions is overpowering and distracts a programmer’s mind from the operation that is being performed: IndexOf. If the StringBuilder class had defined its own IndexOf method, then we could rewrite the code above as follows.

// Change period to exclamation and get # characters in 1st sentence (5).
Int32 index = sb.Replace('.', '!').IndexOf('!');

Wow, look how great this is in terms of code maintainability! In the StringBuilder object, we’re going to replace a period with an exclamation mark and then find the index of the exclamation mark.

Now, I can explain what C#’s extension methods feature does. It allows you to define a static method that you can invoke using instance method syntax. Or, in other words, we can now define our own IndexOf method and the three problems mentioned above go away. To turn the IndexOf method into an extension method, we simply add the this keyword before the first argument.

public static class StringBuilderExtensions {
   public static Int32 IndexOf(this StringBuilder sb, Char value) {
      for (Int32 index = 0; index < sb.Length; index++)
         if (sb[index] == value) return index;
      return -1;
   }
}

Now, when the compiler sees code like the following, the compiler first checks if the StringBuilder class or any of its base classes offers an instance method called IndexOf that takes a single Char parameter.

Int32 index = sb.IndexOf('X');

If a matching instance method exists, then the compiler produces IL code to call it. If no matching instance method exists, then the compiler will look at any static classes that define static methods called IndexOf that take as their first parameter a type matching the type of the expression being used to invoke the method. This parameter must also be marked with the this keyword. In this example, the expression is sb, which is of the StringBuilder type. In this case, the compiler is looking specifically for an IndexOf method that takes two parameters: a StringBuilder (marked with the this keyword) and a Char. The compiler will find our IndexOf method and produce IL code that calls our static method.
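The lookup order described above can be observed with a small experiment. The Greeter and GreeterExtensions types are hypothetical names invented for this sketch; they are not part of the chapter’s code.

```csharp
using System;

// Hypothetical type with one instance method
public sealed class Greeter {
    // Instance method: preferred whenever it is applicable
    public String Greet(String name) { return "instance: " + name; }
}

public static class GreeterExtensions {
    // Same name and parameters as the instance method, so instance syntax
    // never binds to it while Greeter.Greet(String) exists
    public static String Greet(this Greeter g, String name) { return "extension: " + name; }

    // No instance method named Shout exists, so instance syntax binds here
    public static String Shout(this Greeter g, String name) { return name.ToUpperInvariant() + "!"; }
}

public static class BindingDemo {
    public static void Main() {
        Greeter g = new Greeter();
        Console.WriteLine(g.Greet("Jeff"));                     // prints "instance: Jeff"
        Console.WriteLine(g.Shout("Jeff"));                     // prints "JEFF!"
        Console.WriteLine(GreeterExtensions.Greet(g, "Jeff"));  // prints "extension: Jeff"
    }
}
```

The hidden extension method is still reachable through ordinary static method syntax, as the last call shows.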

OK—so this now explains how the compiler improves the last two problems related to code understandability that I mentioned earlier. However, I haven’t yet addressed the first problem: how does a programmer know that an IndexOf method even exists that can operate on a StringBuilder object? The answer to this question is found in Microsoft Visual Studio’s IntelliSense feature. In the editor, when you type a period, Visual Studio’s IntelliSense window opens to show you the list of instance methods that are available. Well, that IntelliSense window also shows you any extension methods that exist for the type of expression you have to the left of the period. Figure 8-1 shows Visual Studio’s IntelliSense window; the icon for an extension method has a down arrow next to it, and the tooltip next to the method indicates that the method is really an extension method. This is truly awesome because it is now easy to define your own methods to operate on various types of objects and have other programmers discover your methods naturally when using objects of these types.

Figure 8-1: Visual Studio’s IntelliSense window, showing extension methods.

# Rules and Guidelines

There are some additional rules and guidelines that you should know about extension methods:

  • C# supports extension methods only; it does not offer extension properties, extension events, extension operators, and so on.

  • Extension methods (methods with this before their first argument) must be declared in nongeneric, static classes. However, there is no restriction on the name of the class; you can call it whatever you want. Of course, an extension method must have at least one parameter, and only the first parameter can be marked with the this keyword.

  • The C# compiler looks only for extension methods defined in static classes that are themselves defined at the file scope. In other words, if you define the static class nested within another class, the C# compiler will emit the following message: error CS1109: Extension method must be defined in a top-level static class; StringBuilderExtensions is a nested class.

  • Because the static classes can have any name you want, it takes the C# compiler time to find extension methods because it must look at all the file-scope static classes and scan their static methods for a match. To improve performance and also to avoid considering an extension method that you may not want, the C# compiler requires that you “import” extension methods. For example, if someone has defined a StringBuilderExtensions class in a Wintellect namespace, then a programmer who wants to have access to this class’s extension methods must put a using Wintellect; directive at the top of his or her source code file.

  • It is possible that multiple static classes could define the same extension method. If the compiler detects that two or more extension methods exist, then the compiler issues the following message: error CS0121: The call is ambiguous between the following methods or properties: 'StringBuilderExtensions.IndexOf(StringBuilder, char)' and 'AnotherStringBuilderExtensions.IndexOf(StringBuilder, char)'. To fix this error, you must modify your source code. Specifically, you cannot use the instance method syntax to call this static method anymore; instead, you must use the static method syntax and name the static class explicitly to tell the compiler which method you want to invoke.

  • You should use this feature sparingly, because not all programmers are familiar with it. For example, when you extend a type with an extension method, you are actually extending derived types with this method as well. Therefore, you should not define an extension method whose first parameter is System.Object, because this method will be callable for all expression types and this will really pollute Visual Studio’s IntelliSense window.

  • There is a potential versioning problem that exists with extension methods. If, in the future, Microsoft adds an IndexOf instance method to their StringBuilder class with the same prototype as my code is attempting to call, then when I recompile my code, the compiler will bind to Microsoft’s IndexOf instance method instead of my static IndexOf method. Because of this, my program will experience different behavior. This versioning problem is another reason why this feature should be used sparingly.
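The import and ambiguity rules in the list above can be sketched in a single file. The First and Second namespaces, their class names, and the CountOf method are all hypothetical; the ambiguous call is left commented out so that the sketch compiles.

```csharp
using System;
using System.Text;
using First;    // imports the extension methods declared in namespace First
using Second;   // imports the extension methods declared in namespace Second

namespace First {
    public static class CaseSensitiveExtensions {
        // Counts exact matches of value
        public static Int32 CountOf(this StringBuilder sb, Char value) {
            Int32 count = 0;
            for (Int32 i = 0; i < sb.Length; i++)
                if (sb[i] == value) count++;
            return count;
        }
    }
}

namespace Second {
    public static class CaseInsensitiveExtensions {
        // Counts matches of value while ignoring case
        public static Int32 CountOf(this StringBuilder sb, Char value) {
            Int32 count = 0;
            for (Int32 i = 0; i < sb.Length; i++)
                if (Char.ToUpperInvariant(sb[i]) == Char.ToUpperInvariant(value)) count++;
            return count;
        }
    }
}

public static class AmbiguityDemo {
    public static void Main() {
        StringBuilder sb = new StringBuilder("Hello");

        // Int32 n = sb.CountOf('h');   // error CS0121: both imported methods match

        // Static method syntax names the class, which removes the ambiguity
        Console.WriteLine(CaseSensitiveExtensions.CountOf(sb, 'h'));    // prints 0
        Console.WriteLine(CaseInsensitiveExtensions.CountOf(sb, 'h'));  // prints 1
    }
}
```

Removing either using directive is another way out: with only one namespace imported, sb.CountOf('h') binds without complaint.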

# Extending Various Types with Extension Methods

In this chapter, I demonstrated how to define an extension method for a class, StringBuilder. I’d like to point out that because invoking an extension method is really just the invocation of a static method, no null check is performed on the value of the expression used to invoke the method.

// sb is null
StringBuilder sb = null;
// Calling extension method: NullReferenceException will NOT be thrown when calling IndexOf
// NullReferenceException will be thrown inside IndexOf’s for loop
sb.IndexOf('X');
// Calling instance method: NullReferenceException WILL be thrown when calling Replace
sb.Replace('.', '!');

I’d also like to point out that you can define extension methods for interface types, as the following code shows.

public static void ShowItems<T>(this IEnumerable<T> collection) {
   foreach (var item in collection) 
      Console.WriteLine(item);
}

The extension method above can now be invoked using any expression that results in a type that implements the IEnumerable<T> interface.

public static void Main() {
 // Shows each Char on a separate line in the console
 "Grant".ShowItems();
 // Shows each String on a separate line in the console
 new[] { "Jeff", "Kristin" }.ShowItems();
 // Shows each Int32 value on a separate line in the console
 new List<Int32>() { 1, 2, 3 }.ShowItems();
}

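💡Important: Extension methods are the cornerstone of Microsoft’s Language Integrated Query (LINQ) technology. For a good example of a class that offers many extension methods, see the static System.Linq.Enumerable class and all its static extension methods in the documentation. Every extension method in this class extends either the IEnumerable or the IEnumerable<T> interface.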

You can define extension methods for delegate types, too. Here is an example.

public static void InvokeAndCatch<TException>(this Action<Object> d, Object o)
 where TException : Exception {
 try { d(o); }
 catch (TException) { }
}

And here is an example of how to invoke it.

Action<Object> action = o => Console.WriteLine(o.GetType()); // Throws NullReferenceException
action.InvokeAndCatch<NullReferenceException>(null); // Swallows NullReferenceException

You can also add extension methods to enumerated types. I show an example of this in the “Adding Methods to Enumerated Types” section in Chapter 15, “Enumerated Types and Bit Flags.”

And last but not least, I want to point out that the C# compiler allows you to create a delegate (see Chapter 17, “Delegates,” for more information) that refers to an extension method over an object.

public static void Main() {
   // Create an Action delegate that refers to the static ShowItems extension method
   // and has the first argument initialized to reference the "Jeff" string. 
   Action a = "Jeff".ShowItems;
   .
   .
   .
   // Invoke the delegate that calls ShowItems passing it a reference to the "Jeff" string.
   a();
}

In the preceding code, the C# compiler generates IL code to construct an Action delegate. When creating a delegate, the constructor is passed the method that should be called and is also passed a reference to an object that should be passed to the method’s hidden this parameter. Normally, when you create a delegate that refers to a static method, the object reference is null because static methods don’t have a this parameter. However, in this example, the C# compiler generated some special code that creates a delegate that refers to a static method (ShowItems) and the target object of the static method is the reference to the “Jeff” string. Later, when the delegate is invoked, the CLR will call the static method and will pass to it the reference to the “Jeff” string. This is a little hacky, but it works great and it feels natural so long as you don’t think about what is happening internally.

# The Extension Attribute

It would be best if this concept of extension methods was not C#-specific. Specifically, we want programmers to define a set of extension methods in some programming language and for people in other programming languages to take advantage of them. For this to work, the compiler of choice must support searching static types and methods for potentially matching extension methods. And compilers need to do this quickly so that compilation time is kept to a minimum.

In C#, when you mark a static method’s first parameter with the this keyword, the compiler internally applies a custom attribute to the method and this attribute is persisted in the resulting file’s metadata. The attribute is defined in the System.Core.dll assembly, and it looks like this.

// Defined in the System.Runtime.CompilerServices namespace
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class | AttributeTargets.Assembly)]
public sealed class ExtensionAttribute : Attribute {
}

In addition, this attribute is applied to the metadata for any static class that contains at least one extension method. And this attribute is also applied to the metadata for any assembly that contains at least one static class that contains an extension method. So now, when compiling code that invokes an instance method that doesn’t exist, the compiler can quickly scan all the referenced assemblies to know which ones contain extension methods. Then it can scan only these assemblies for static classes that contain extension methods, and it can scan just the extension methods for potential matches to compile the code as quickly as possible.
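One way to see the three levels of marking is to query the metadata with reflection. This is only a sketch: it reuses the chapter’s StringBuilderExtensions.IndexOf method, and the ExtensionAttributeDemo class name is hypothetical. With the C# compiler, each line is expected to print True.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Text;

public static class StringBuilderExtensions {
    public static Int32 IndexOf(this StringBuilder sb, Char value) {
        for (Int32 index = 0; index < sb.Length; index++)
            if (sb[index] == value) return index;
        return -1;
    }
}

public static class ExtensionAttributeDemo {
    public static void Main() {
        Type type = typeof(StringBuilderExtensions);
        MethodInfo method = type.GetMethod("IndexOf");
        Assembly assembly = type.Assembly;

        // The attribute applied to the method whose first parameter has 'this'
        Console.WriteLine(method.IsDefined(typeof(ExtensionAttribute), false));
        // The attribute applied to the static class that contains the method
        Console.WriteLine(type.IsDefined(typeof(ExtensionAttribute), false));
        // The attribute applied to the assembly that contains the class
        Console.WriteLine(assembly.IsDefined(typeof(ExtensionAttribute), false));
    }
}
```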

💡Note: The ExtensionAttribute class is defined in the System.Core.dll assembly. This means that the resulting assembly produced by the compiler will have a reference to System.Core.dll embedded in it even if I do not use any types from System.Core.dll and do not even reference System.Core.dll when compiling my code. However, this is not too bad a problem because the ExtensionAttribute is used only at compile time; at run time, System.Core.dll will not have to be loaded unless the application consumes something else in this assembly.

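💡Summary: C#’s extension methods feature lets you define a static method in a static class and invoke it by using instance method syntax. As the name suggests, an extension method extends the methods of a type. When compiling a method call, the compiler first checks whether the class or any of its base classes defines an instance method with that name. If one does, the compiler produces IL code to call it. If no matching instance method exists, the compiler looks for static classes that define a static method with that name whose first parameter matches the type of the expression used to invoke the method; this parameter must also be marked with the this keyword. Some rules and guidelines apply to extension methods. C# supports extension methods only; it does not offer extension properties, extension events, extension operators, and so on. Extension methods must be declared in nongeneric, static classes. An extension method must have at least one parameter, and only the first parameter can be marked with the this keyword. The C# compiler looks only for extension methods in static classes that are themselves defined at file scope. When you extend a type with an extension method, you extend its derived types as well. Because invoking an extension method is really a static method call, no code is generated to check that the value of the expression used to invoke the method is not null. You can also define extension methods for interface types, delegate types, and enumerated types. The C# compiler allows you to create a delegate that refers to an extension method over an object. In C#, when you mark a static method’s first parameter with the this keyword, the compiler internally applies a custom attribute to the method, and this attribute is persisted in the resulting file’s metadata. The attribute is also applied to the metadata for any static class that contains at least one extension method and, likewise, to the metadata for any assembly that contains at least one such static class.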

# Partial Methods

Imagine that you use a tool that produces a C# source code file containing a type definition. The tool knows that there are potential places within the code it produces where you might want to customize the type’s behavior. Normally, customization would be done by having the tool-produced code invoke virtual methods. The tool-produced code would also have to contain definitions for these virtual methods, and the way these methods would be implemented is to do nothing and simply return. Now, if you want to customize the behavior of the class, you’d define your own class, derive it from the base class, and then override any virtual methods, implementing them so that they have the behavior you desire. Here is an example.

// Tool-produced code in some source code file:
internal class Base {
   private String m_name;

   // Called before changing the m_name field
   protected virtual void OnNameChanging(String value) { 
   }

   public String Name {
      get { return m_name; }
      set { 
         OnNameChanging(value.ToUpper()); // Inform class of potential change
         m_name = value;                  // Change the field
      }
   }
}

// Developer-produced code in some other source code file:
internal class Derived : Base {
   protected override void OnNameChanging(string value) {
      if (String.IsNullOrEmpty(value)) 
         throw new ArgumentNullException("value");
   }
}

Unfortunately, there are two problems with the preceding code:

  • The type must be a class that is not sealed. You cannot use this technique for sealed classes or for value types (because value types are implicitly sealed). In addition, you cannot use this technique for static methods because they cannot be overridden.

  • There are efficiency problems here. A type is being defined just to override a method; this wastes a small amount of system resources. And, even if you do not want to override the behavior of OnNameChanging, the base class code still invokes a virtual method that simply does nothing but return. Also, ToUpper is called whether OnNameChanging accesses the argument passed to it or not.

C#’s partial methods feature allows you the option of overriding the behavior of a type while fixing the aforementioned problems. The code below uses partial methods to accomplish the same semantic as the previous code.

// Tool-produced code in some source code file:
internal sealed partial class Base {
   private String m_name;

   // This defining-partial-method-declaration is called before changing the m_name field
   partial void OnNameChanging(String value);

   public String Name {
      get { return m_name; }
      set { 
         OnNameChanging(value.ToUpper()); // Inform class of potential change
         m_name = value;                  // Change the field
      }
   }
}

// Developer-produced code in some other source code file:
internal sealed partial class Base {
   // This implementing-partial-method-declaration is called before m_name is changed 
   partial void OnNameChanging(String value) {
      if (String.IsNullOrEmpty(value)) 
         throw new ArgumentNullException("value");
   }
}

There are several things to notice about this new version of the code:

  • The class is now sealed (although it doesn’t have to be). In fact, the class could be a static class or even a value type.

  • The tool-produced code and the developer-produced code are really two partial definitions that ultimately make up one type definition. For more information about partial types, see the “Partial Classes, Structures, and Interfaces” section in Chapter 6, “Type and Member Basics.”

  • The tool-produced code defined a partial method declaration. This method is marked with the partial token and it has no body.

  • The developer-produced code implemented the partial method declaration. This method is also marked with the partial token and it has a body.

Now, when you compile this code, you see the same effect as the original code I showed you. Again, the big benefit here is that you can rerun the tool and produce new code in a new source code file, but your code remains in a separate file and is unaffected. And, this technique works for sealed classes, static classes, and value types.

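💡Note: In Visual Studio’s editor, if you type partial and press the spacebar, the IntelliSense window lists all of the enclosing type’s partial method declarations that do not yet have matching implementing declarations. You can then select a partial method from the window, and Visual Studio produces the method prototype for you automatically. This is a nice feature that improves your productivity.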

But, there is another big improvement we get with partial methods. Let’s say that you do not need to modify the behavior of the tool-produced type. In this case, you do not supply your source code file at all. If you just compile the tool-produced code by itself, the compiler produces IL code and metadata as if the tool-produced code looked like this.

// Logical equivalent of tool-produced code if there is no 
// implementing partial method declaration:
internal sealed class Base {
   private String m_name;

   public String Name {
      get { return m_name; }
      set { 
         m_name = value; // Change the field
      }
   }
}

That is, if there is no implementing partial method declaration, the compiler will not emit any metadata representing the partial method. In addition, the compiler will not emit any IL instructions to call the partial method. And the compiler will not emit code that evaluates any arguments that would have been passed to the partial method. In this example, the compiler will not emit code to call the ToUpper method. The result is that there is less metadata/IL, and the run-time performance is awesome!
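The claim that argument expressions are not evaluated can be checked with a counter. The Order type, its Describe helper, and the DescribeCalls field are hypothetical names invented for this sketch; the only thing taken from the text is the behavior of a partial method that has no implementing declaration.

```csharp
using System;

internal sealed partial class Order {
    public static Int32 DescribeCalls;   // counts how often the argument is built

    // Defining declaration only: no implementing declaration exists anywhere
    partial void OnTotalChanging(String description);

    private Decimal m_total;
    public Decimal Total {
        get { return m_total; }
        set {
            // Both the call and its argument expression are dropped at compile time
            OnTotalChanging(Describe(value));
            m_total = value;
        }
    }

    private static String Describe(Decimal value) {
        DescribeCalls++;
        return "new total: " + value;
    }
}

public static class PartialMethodDemo {
    public static void Main() {
        Order order = new Order();
        order.Total = 10m;
        Console.WriteLine(Order.DescribeCalls);   // prints 0
    }
}
```

Adding a second partial class Order part that implements OnTotalChanging would bring the call back, and the counter would then read 1 after the assignment.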

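💡Note: Partial methods work similarly to the System.Diagnostics.ConditionalAttribute attribute. However, partial methods work within a single type only, whereas the ConditionalAttribute can be used to optionally invoke methods defined in another type.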
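For comparison, here is a minimal sketch of the ConditionalAttribute approach mentioned in the note. The Tracer and ConditionalDemo types and the VERBOSE_TRACE symbol are hypothetical; the sketch assumes the symbol is not defined when the code is compiled.

```csharp
using System;
using System.Diagnostics;

public static class Tracer {
    // Calls to this method are compiled only when the calling code defines
    // the (hypothetical) VERBOSE_TRACE symbol
    [Conditional("VERBOSE_TRACE")]
    public static void Trace(String message) { Console.WriteLine(message); }
}

public static class ConditionalDemo {
    public static Int32 BuildCalls;   // counts how often the argument is built

    private static String BuildMessage() {
        BuildCalls++;
        return "expensive message";
    }

    public static void Main() {
        // Without VERBOSE_TRACE, the call and its argument are both omitted
        Tracer.Trace(BuildMessage());
        Console.WriteLine(BuildCalls);   // prints 0 when VERBOSE_TRACE is not defined
    }
}
```

Unlike a partial method, Tracer.Trace lives in a different type from its caller, and its metadata and body are always emitted; only the call sites are conditional.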

# Rules and Guidelines

There are some additional rules and guidelines that you should know about partial methods:

  • They can only be declared within a partial class or struct.

  • Partial methods must always have a return type of void, and they cannot have any parameters marked with the out modifier. These restrictions are in place because the method might not exist at run time: you can’t initialize a variable to what the method might return, and you can’t have an out parameter because the method would have to initialize it. A partial method may have ref parameters, may be generic, may be instance or static, and may be marked as unsafe.

  • Of course, the defining partial method declaration and the implementing partial method declaration must have identical signatures. If both have custom attributes applied to them, then the compiler combines both methods’ attributes together. Any attributes applied to a parameter are also combined.

  • If there is no implementing partial method declaration, then you cannot have any code that attempts to create a delegate that refers to the partial method. Again, the reason is that the method doesn’t exist at run time. The compiler produces this message: error CS0762: Cannot create delegate from method 'Base.OnNameChanging(string)' because it is a partial method without an implementing declaration.

  • Partial methods are always considered to be private methods. However, the C# compiler forbids you from putting the private keyword before the partial method declaration.

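💡Summary: Suppose a tool produces a C# source code file containing a type definition, and you want to customize the type’s behavior at certain places in that code. Normally, customization is done by having the tool-produced code invoke virtual methods, but that approach has two problems. First, the type must be a class that is not sealed. Second, it is inefficient, because a lot of unnecessary code gets called. C#’s partial methods feature lets you override the behavior of a type while fixing both problems. The benefit of partial methods is that you can rerun the tool and produce new code in a new source code file while your own code remains in a separate file and is unaffected. In addition, this technique works for sealed classes, static classes, and value types. If there is no implementing partial method declaration, the compiler emits no metadata representing the partial method, no IL instructions to call it, and no IL instructions to evaluate the arguments that would have been passed to it. Some rules and guidelines apply to partial methods. They can be declared only within a partial class or struct. Their return type is always void, and no parameter can be marked with the out modifier. These two restrictions exist because the method might not exist at run time, so you can’t initialize a variable to what the method might return; similarly, an out parameter is not allowed because the method would have to initialize it, and the method might not exist. A partial method may have ref parameters, may be generic, may be instance or static, and may be marked as unsafe. Of course, the defining declaration and the implementing declaration must have identical signatures. If both have custom attributes applied to them, the compiler combines the attributes of both methods; attributes applied to parameters are combined as well. If there is no implementing declaration, you cannot write code that creates a delegate referring to the partial method, again because the method doesn’t exist at run time. Partial methods are always considered to be private, but the C# compiler forbids you from putting the private keyword before a partial method declaration.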