# Chapter 6 Type and Member Basics
# The Different Kinds of Type Members
A type can define zero or more of the following kinds of members:
Constants A constant is a symbol that identifies a never-changing data value. These symbols are typically used to make code more readable and maintainable. Constants are always associated with a type, not an instance of a type. Logically, constants are always static members. Discussed in Chapter 7, “Constants and Fields.”
Fields A field represents a read-only or read/write data value. A field can be static, in which case the field is considered part of the type’s state. A field can also be instance (nonstatic), in which case it’s considered part of an object’s state. I strongly encourage you to make fields private so that the state of the type or object can’t be corrupted by code outside of the defining type. Discussed in Chapter 7.
Instance constructors An instance constructor is a special method used to initialize a new object’s instance fields to a good initial state. Discussed in Chapter 8, “Methods.”
Type constructors A type constructor is a special method used to initialize a type’s static fields to a good initial state. Discussed in Chapter 8.
Methods A method is a function that performs operations that change or query the state of a type (static method) or an object (instance method). Methods typically read and write to the fields of the type or object. Discussed in Chapter 8.
Operator overloads An operator overload is a method that defines how an object should be manipulated when certain operators are applied to the object. Because not all programming languages support operator overloading, operator overload methods are not part of the Common Language Specification (CLS). Discussed in Chapter 8.
Conversion operators A conversion operator is a method that defines how to implicitly or explicitly cast or convert an object from one type to another type. As with operator overload methods, not all programming languages support conversion operators, so they’re not part of the CLS. Discussed in Chapter 8.
Properties A property is a mechanism that allows a simple, field-like syntax for setting or querying part of the logical state of a type (static property) or object (instance property) while ensuring that the state doesn’t become corrupt. Properties can be parameterless (very common) or parameterful (fairly uncommon but used frequently with collection classes). Discussed in Chapter 10, “Properties.”
Events A static event is a mechanism that allows a type to send a notification to one or more static or instance methods. An instance (nonstatic) event is a mechanism that allows an object to send a notification to one or more static or instance methods. Events are usually raised in response to a state change occurring in the type or object offering the event. An event consists of two methods that allow static or instance methods to register and unregister interest in the event. In addition to the two methods, events typically use a delegate field to maintain the set of registered methods. Discussed in Chapter 11, “Events.”
Types A type can define other types nested within it. This approach is typically used to break a large, complex type down into smaller building blocks to simplify the implementation.
Again, the purpose of this chapter isn’t to describe these various members in detail but to set the stage and explain what these various members all have in common.
Regardless of the programming language you’re using, the corresponding compiler must process your source code and produce metadata and Intermediate Language (IL) code for each kind of member in the preceding list. The format of the metadata is identical regardless of the source programming language you use, and this feature is what makes the CLR a common language run time. The metadata is the common information that all languages produce and consume, enabling code in one programming language to seamlessly access code written in a completely different programming language.
This common metadata format is also used by the CLR, which determines how constants, fields, constructors, methods, properties, and events all behave at run time. Simply stated, metadata is the key to the whole Microsoft .NET Framework development platform; it enables the seamless integration of languages, types, and objects.
The following C# code shows a type definition that contains an example of all the possible members. The code shown here will compile (with warnings), but it isn’t representative of a type that you’d normally create; most of the methods do nothing of any real value. Right now, I just want to show you how the compiler translates this type and its members into metadata. Once again, I’ll discuss the individual members in the next few chapters.
using System; | |
public sealed class SomeType { // 1 | |
// Nested class | |
private class SomeNestedType { } // 2 | |
// Constant, read-only, and static read/write field | |
private const Int32 c_SomeConstant = 1; // 3 | |
private readonly String m_SomeReadOnlyField = "2"; // 4 | |
private static Int32 s_SomeReadWriteField = 3; // 5 | |
// Type constructor | |
static SomeType() { } // 6 | |
// Instance constructors | |
public SomeType(Int32 x) { } // 7 | |
public SomeType() { } // 8 | |
// Instance and static methods | |
private String InstanceMethod() { return null; } // 9 | |
public static void Main() {} // 10 | |
// Instance property | |
public Int32 SomeProp { // 11 | |
get { return 0; } // 12 | |
set { } // 13 | |
} | |
// Instance parameterful property (indexer) | |
public Int32 this[String s] { // 14 | |
get { return 0; } // 15 | |
set { } // 16 | |
} | |
// Instance event | |
public event EventHandler SomeEvent; // 17 | |
} |
If you were to compile the type just defined and examine the metadata in ILDasm.exe, you’d see the output shown in Figure 6-1.
Notice that all the members defined in the source code cause the compiler to emit some metadata. In fact, some of the members cause the compiler to generate additional members as well as additional metadata. For example, the event member (17) causes the compiler to emit a field, two methods, and some additional metadata. I don’t expect you to fully understand what you’re seeing here now. But as you read the next few chapters, I encourage you to look back to this example to see how the member is defined and what effect it has on the metadata produced by the compiler.
💡小结:类型中可以定义各种各样的成员,这些成员有:常量、字段、实例构造器、类型构造器、方法、操作符重载、转换操作符、属性、事件、类型。值得一提的是,C# 中的常量总与类型关联,不与类型的实例关联。常量逻辑上总是静态成员。无论什么编程语言,编译器都必须能处理源代码,为上述每种成员生成元数据和 IL 代码。所有编程语言生成的元数据格式完全一致。这正是 CLR 成为 “公共语言运行时” 的原因。元数据是所有语言都生成和使用的公共信息。正是由于有元数据,用一种语言写的代码才能无缝访问用另一种语言写的代码。元数据是整个 Microsoft.NET Framework 开发平台的关键,它实现了编程语言、类型和对象的无缝集成,源代码中定义的所有成员都会造成编译器生成元数据。
# Type Visibility
When defining a type at file scope (versus defining a type nested within another type), you can specify the type’s visibility as being either public or internal. A public type is visible to all code within the defining assembly as well as all code written in other assemblies. An internal type is visible to all code within the defining assembly, and the type is not visible to code written in other assemblies. If you do not explicitly specify either of these when you define a type, the C# compiler sets the type’s visibility to internal (the more restrictive of the two). Here are some examples.
using System; | |
// The type below has public visibility and can be accessed by code | |
// in this assembly as well as code written in other assemblies. | |
public class ThisIsAPublicType { ... } | |
// The type below has internal visibility and can be accessed by code | |
// in this assembly only. | |
internal class ThisIsAnInternalType { ... } | |
// The type below is internal because public/internal | |
// is not explicitly stated. | |
class ThisIsAlsoAnInternalType { ... } |
# Friend Assemblies
Imagine the following scenario: a company has one team, TeamA, that is defining a bunch of utility types in one assembly, and they expect these types to be used by members in another team, TeamB. For various reasons such as time schedules or geographical location, or perhaps different cost centers or reporting structures, these two teams cannot build all of their types into a single assembly; instead, each team produces its own assembly file.
In order for TeamB’s assembly to use TeamA’s types, TeamA must define all of their utility types as public. However, this means that their types are publicly visible to any and all assemblies; developers in another company could write code that uses the public utility types, and this is not desirable. Maybe the utility types make certain assumptions that TeamB ensures when they write code that uses TeamA’s types. What we’d like to have is a way for TeamA to define their types as internal while still allowing TeamB to access the types. The CLR and C# support this via friend assemblies. This friend assembly feature is also useful when you want to have one assembly containing code that performs unit tests against the internal types within another assembly.
When an assembly is built, it can indicate other assemblies it considers “friends” by using the
InternalsVisibleTo
attribute defined in theSystem.Runtime.CompilerServices
namespace. The attribute has a string parameter that identifies the friend assembly’s name and public key (the string you pass to the attribute must not include a version, culture, or processor architecture). Note that friend assemblies can access all of an assembly’s internal types as well as these type’s internal members. Here is an example of how an assembly can specify two other strongly named assemblies named “Wintellect” and “Microsoft” as its friend assemblies.
using System; | |
using System.Runtime.CompilerServices; // For InternalsVisibleTo attribute | |
// This assembly's internal types can be accessed by any code written | |
// in the following two assemblies (regardless of version or culture): | |
[assembly:InternalsVisibleTo("Wintellect, PublicKey=12345678...90abcdef")] | |
[assembly:InternalsVisibleTo("Microsoft, PublicKey=b77a5c56...1934e089")] | |
internal sealed class SomeInternalType { ... } | |
internal sealed class AnotherInternalType { ... } |
Accessing the above assembly’s internal types from a friend assembly is trivial. For example, here’s how a friend assembly called “Wintellect” with a public key of “12345678...90abcdef” can access the internal type
SomeInternalType
in the preceding assembly.
using System; | |
internal sealed class Foo { | |
private static Object SomeMethod() { | |
// This "Wintellect" assembly accesses the other assembly's | |
// internal type as if it were a public type | |
SomeInternalType sit = new SomeInternalType(); | |
return sit; | |
} | |
} |
Because the internal members of the types in an assembly become accessible to friend assemblies, you should think carefully about what accessibility you specify for your type’s members and which assemblies you declare as your friends. Note that the C# compiler requires you to use the /out: compiler switch when compiling the friend assembly (the assembly that does not contain the
InternalsVisibleTo
attribute). The switch is required because the compiler needs to know the name of the assembly being compiled in order to determine if the resulting assembly should be considered a friend assembly. You would think that the C# compiler could determine this on its own because it normally determines the output file name on its own; however, the compiler doesn’t decide on an output file name until it is finished compiling the code. So requiring the /out: compiler switch improves the performance of compiling significantly.
Also, if you are compiling a module (as opposed to an assembly) using C#’s /t:module switch, and this module is going to become part of a friend assembly, you need to compile the module by using the C# compiler’s /moduleassemblyname: switch as well. This tells the compiler what assembly the module will be a part of so the compiler can allow code in the module to access the other assembly’s internal types.
💡小结:public 类型不仅对定义程序集中的所有代码可见,还对其他程序集中的代码可见。internal 类型则仅对定义程序集中的所有代码可见,对其他程序集中的代码不可见。定义类型时不显示指定可见性,C# 编译器会将其默认为 internal 级别的可见性。有时我们会希望网文另一个程序集中的 internal 类型,CLR 和 C# 通过友元程序集(friend assembly)提供这方面的支持。用一个程序集中的代码对另一个程序集中的内部类型进行单元测试时,友元程序集功能也能派上用场。生成程序集时,可用 System.Runtime.CompilerServices
命名空间中的 InternalsVisibleTo
特性表明它认为是 “友元” 的其他程序集。该特性获取标识友元程序集名称和公钥的字符串参数(传给该特性的字符串绝不能包含版本、语言文化和处理器架构)。C# 编译器在编译友元程序集(不含 InternalsVisibleTo
特性的程序集)时要求使用编译器开关 /out:<file>
。使用这个编译器开关的原因在于,编译器需要知道准备编译的程序集的名称,从而判断生成的程序集是不是友元程序集。同样地,如果使用 C# 编译器的 /t:module
开关来编译模块(而不是编译成程序集),而且该模块将成为某个友元程序集的一部分,那么还需要使用 C# 编译器的 /moduleassemblyname:<string>
开关来编译该模块。
# Member Accessibility
When defining a type’s member (which includes nested types), you can specify the member’s accessibility. A member’s accessibility indicates which members can be legally accessed from referent code. The CLR defines the set of possible accessibility modifiers, but each programming language chooses the syntax and term it wants developers to use when applying the accessibility to a member. For example, the CLR uses the term Assembly to indicate that a member is accessible to any code within the same assembly, whereas the C# term for this is internal.
Table 6-1 shows the six accessibility modifiers that can be applied to a member. The rows of the table are in order from most restrictive (Private) to least restrictive (Public).
Of course, for any member to be accessible, it must be defined in a type that is visible. For example, if AssemblyA defines an internal type with a public method, code in AssemblyB cannot call the public method because the internal type is not visible to AssemblyB.
When compiling code, the language compiler is responsible for checking that the code is referencing types and members correctly. If the code references some type or member incorrectly, the compiler has the responsibility of emitting the appropriate error message. In addition, the just-in-time (JIT) compiler also ensures that references to fields and methods are legal when compiling IL code into native CPU instructions at run time. For example, if the JIT compiler detects code that is improperly attempting to access a private field or method, the JIT compiler throws a
FieldAccessException
or aMethodAccessException
, respectively.
Verifying the IL code ensures that a referenced member’s accessibility is properly honored at run time, even if a language compiler ignored checking the accessibility. Another, more likely, possibility is that the language compiler compiled code that accessed a public member in another type (in another assembly); but at run time, a different version of the assembly is loaded, and in this new version, the public member has changed and is now protected or private.
In C#, if you do not explicitly declare a member’s accessibility, the compiler usually (but not always) defaults to selecting private (the most restrictive of them all). The CLR requires that all members of an interface type be public. The C# compiler knows this and forbids the programmer from explicitly specifying accessibility on interface members; the compiler just makes all the members public for you.
Furthermore, you’ll notice the CLR offers an accessibility called Family and Assembly. However, C# doesn’t expose this in the language. The C# team felt that this accessibility was for the most part useless and decided not to incorporate it into the C# language.
💡Note:从 C#7.2 开始支持 private protected,它是 protected 和 internal 可访问性的交集。若一个成员是 private protected 的,那么该成员只能够在其类型中或被其相同程序集中的子类型访问(它的可访问性比 protected 和 internal 都低)。
When a derived type is overriding a member defined in its base type, the C# compiler requires that the original member and the overriding member have the same accessibility. That is, if the member in the base class is protected, the overriding member in the derived class must also be protected. However, this is a C# restriction, not a CLR restriction. When deriving from a base class, the CLR allows a member’s accessibility to become less restrictive but not more restrictive. For example, a class can override a protected method defined in its base class and make the overridden method public (more accessible). However, a class cannot override a protected method defined in its base class and make the overridden method private (less accessible). The reason a class cannot make a base class method more restricted is because a user of the derived class could always cast to the base type and gain access to the base class’s method. If the CLR allowed the derived type’s method to be less accessible, it would be making a claim that was not enforceable.
💡小结:定义类型的成员(包括嵌套类型)时,可指定成员的可访问性。任何成员要想被访问,都必须在可见的类型中定义。编译代码时,编程语言的编译器会检查代码是不是正确引用了类型和成员。如果代码不正确地引用了类型或成员,编译器会生成一条合适的错误消息。另外,在运行时将 IL 代码编译成本机 CPU 指令时,JIT 编译器也会确保对字段和方法的引用合法。通过对 IL 代码进行验证,可确保被引用成员的可访问性在运行时得到正确兑现 -- 即使语言的编译器忽略了对可访问性的检查。在 C# 中,如果没有显式声明成员的可访问性,编译器通常默认选择 private。CLR 要求接口类型的所有成员都具有 public 可访问性。C# 编译器知道这一点,因此禁止开发人员显式指定接口成员的可访问性;编译器自动将所有成员的可访问性设为 public。派生类型重写基类型定义的成员时,C# 编译器要求原始成员和重写成员具有相同的可访问性。但这时 C# 的限制,不是 CLR 的。从基类派生时,CLR 允许放宽但不允许收紧成员的可访问性限制。之所以不能在派生类中收紧对基类方法的访问,是因为 CLR 承诺派生类总能转型为基类,并获取对基类方法的访问权。如果允许派生类收紧限制,CLR 的承诺就无法兑现了。
# Static Classes
There are certain classes that are never intended to be instantiated, such as Console, Math, Environment, and ThreadPool. These classes have only static members and, in fact, the classes exist simply as a way to group a set of related members together. For example, the Math class defines a bunch of methods that do math-related operations. C# allows you to define non-instantiable classes by using the C# static keyword. This keyword can be applied only to classes, not structures (value types) because the CLR always allows value types to be instantiated and there is no way to stop or prevent this.
The compiler enforces many restrictions on a static class:
- The class must be derived directly from System.Object because deriving from any other base class makes no sense because inheritance applies only to objects, and you cannot create an instance of a static class.
- The class must not implement any interfaces because interface methods are callable only when using an instance of a class.
- The class must define only static members (fields, methods, properties, and events). Any instance members cause the compiler to generate an error.
- The class cannot be used as a field, method parameter, or local variable because all of these would indicate a variable that refers to an instance, and this is not allowed. If the compiler detects any of these uses, the compiler issues an error.
Here is an example of a static class that defines some static members; this code compiles (with a warning) but the class doesn’t do anything interesting.
using System; | |
public static class AStaticClass { | |
public static void AStaticMethod() { } | |
public static String AStaticProperty { | |
get { return s_AStaticField; } | |
set { s_AStaticField = value; } | |
} | |
private static String s_AStaticField; | |
public static event EventHandler AStaticEvent; | |
} |
If you compile the code above into a library (DLL) assembly and look at the result by using ILDasm.exe, you’ll see what is shown in Figure 6-2. As you can see in Figure 6-2, defining a class by using the static keyword causes the C# compiler to make the class both abstract and sealed. Furthermore, the compiler will not emit an instance constructor method into the type. Notice that there is no instance constructor (.ctor) method shown in Figure 6-2.
💡小结:有一些永远不需要实例化的类,这些类常被称作工具类。在 C# 中,要用 static 关键字定义不可实例化的类。该关键字只能应用于类,不能应用于结构(值类型)。因为 CLR 总是允许值类型实例化,这时没办法阻止的。C# 编译器对静态类进行了一些限制,例如静态类必须直接从基类 System.Object
派生,从其他任何基类派生都没有意义。静态类不能实现任何接口,这时因为只有使用类的实例时,才可调用类的接口方法。使用关键字 static 定义类,将导致 C# 编译器将该类标记为 abstract 和 sealed。另外,编译器也不会在类型中生成实例构造器方法。
# Partial Classes, Structures, and Interfaces
In this section, I discuss partial classes, structures, and interfaces. The partial keyword tells the C# compiler that the source code for a single class, structure, or interface definition may span one or more source code files. It should be noted that the compiler combines all of a type’s partials together at compile time; the CLR always works on complete type definitions. There are three main reasons why you might want to split the source code for a type across multiple files:
Source control Suppose a type’s definition consists of a lot of source code, and a programmer checks it out of source control to make changes. No other programmer will be able to modify the type at the same time without doing a merge later. Using the partial keyword allows you to split the code for the type across multiple source code files, each of which can be checked out individually so that multiple programmers can edit the type at the same time.
Splitting a class, structure, or interface into distinct logical units within a single file I sometimes create a single type that provides multiple features so that the type can provide a complete solution. To simplify my implementation, I will sometimes declare the same partial type repeatedly within a single source code file. Then, in each part of the partial type, I implement one feature with all its fields, methods, properties, events, and so on. This allows me to easily see all the members that provide a single feature grouped together, which simplifies my coding. Also, I can easily comment out a part of the partial type to remove a whole feature from the type and replace it with another implementation (via a new part of the partial type).
Code spitters In Microsoft Visual Studio, when you create a new project, some source code files are created automatically as part of the project. These source code files contain templates that give you a head start at building these kinds of projects. When you use the Visual Studio designers and drag and drop controls onto the design surface, Visual Studio writes source code for you automatically and spits this code into the source code files. This really improves your productivity. Historically, the generated code was emitted into the same source code file that you were working on. The problem with this is that you might edit the generated code accidentally and cause the designers to stop functioning correctly. Starting with Visual Studio 2005, when you create a new form, user control, and so on, Visual Studio creates two source code files: one for your code and the other for the code generated by the designer. Because the designer code is in a separate file, you’ll be far less likely to accidentally edit it.
The partial keyword is applied to the types in all files. When the files are compiled together, the compiler combines the code to produce one type that is in the resulting .exe or .dll assembly file (or .netmodule module file). As I stated in the beginning of this section, the partial types feature is completely implemented by the C# compiler; the CLR knows nothing about partial types at all. This is why all of the source code files for the type must use the same programming language, and they must all be compiled together as a single compilation unit.
💡小结:partial 关键字告诉 C# 编译器:类、结构或接口的定义源代码可能要分散到一个或多个源代码文件中。将类型源代码分散到多个文件的原因可能有以下几种:源代码控制、在同一个文件中将类或结构分解成不同的逻辑单元、代码拆分。要将 partial 关键字应用于所有文件中的类型,这些文件编译到一起时,编译器会合并代码,在最后的.exe 或.dll 程序集文件(或.netmodule 模块文件)中生成单个类型。“分布类型” 功能完全由 C# 编译器实现,CLR 对该功能一无所知,这解释了一个类型的所有源代码文件为什么必须使用相同的语言,而且必须作为一个编译单元编译到一起。
# Components, Polymorphism, and Versioning
Object-oriented programming (OOP) has been around for many, many years. When it was first used in the late 1970s/early 1980s, applications were much smaller in size and all the code to make the application run was written by one company. Sure, there were operating systems back then and applications did make use of what they could out of those operating systems, but the operating systems offered very few features compared with the operating systems of today.
Today, software is much more complex and users demand that applications offer rich features such as GUIs, menu items, mouse input, tablet input, printer output, networking, and so on. For this reason, our operating systems and development platforms have grown substantially over recent years. Furthermore, it is no longer feasible or even cost effective for application developers to write all of the code necessary for their application to work the way users expect. Today, applications consist of code produced by many different companies. This code is stitched together using an object-oriented paradigm.
Component Software Programming (CSP) is OOP brought to this level. Here are some attributes of a component:
■ A component (an assembly in the .NET Framework) has the feeling of being “published.”
■ A component has an identity (a name, version, culture, and public key).
■ A component forever maintains its identity (the code in an assembly is never statically linked into another assembly; .NET always uses dynamic linking).
■ A component clearly indicates the components it depends upon (reference metadata tables).
■ A component should document its classes and members. C# offers this by allowing in-source Extensible Markup Language (XML) documentation along with the compiler’s /doc commandline switch.
■ A component must specify the security permissions it requires. The CLR’s code access security (CAS) facilities enable this.
■ A component publishes an interface (object model) that won’t change for any servicings. A servicing is a new version of a component whose intention is to be backward compatible with the original version of the component. Typically, a servicing version includes bug fixes, security patches, and possibly some small feature enhancements. But a servicing cannot require any new dependencies or any additional security permissions.
As indicated by the last bullet, a big part of CSP has to do with versioning. Components will change over time and components will ship on different time schedules. Versioning introduces a whole new level of complexity for CSP that didn’t exist with OOP, with which all code was written, tested, and shipped as a single unit by a single company. In this section, I’m going to focus on component versioning.
In the .NET Framework, a version number consists of four parts: a major part, a minor part, a build part, and a revision part. For example, an assembly whose version number is 1.2.3.4 has a major part of 1, a minor part of 2, a build part of 3, and a revision part of 4. The major/minor parts are typically used to represent a consistent and stable feature set for an assembly and the build/revision parts are typically used to represent a servicing of this assembly’s feature set.
Let’s say that a company ships an assembly with version 2.7.0.0. If the company later wants to fix a bug in this component, they would produce a new assembly in which only the build/revision parts of the version are changed, something like version 2.7.1.34. This indicates that the assembly is a servicing whose intention is to be backward compatible with the original component (version 2.7.0.0).
On the other hand, if the company wants to make a new version of the assembly that has significant changes to it and is therefore not intended to be backward compatible with the original assembly, the company is really creating a new component and the new assembly should be given a version number in which the major/minor parts are different from the existing component (version 3.0.0.0, for example).
💡注意:此处只是说明应该如何看待版本号。遗憾的是,CLR 不以这种方式看待版本号。现在,CLR 将版本号看成是固定值,如果某个程序集依赖版本号为 1.2.3.4 的另 个程序集,CLR 只会尝试加载版本号为 1.2.3.4 的程序集 (除非设置了绑定重定向)。
Now that we’ve looked at how we use version numbers to update a component’s identity to reflect a new version, let’s take a look at some of the features offered by the CLR and programming languages (such as C#) that allow developers to write code that is resilient to changes that may be occurring in components that they are using.
Versioning issues come into play when a type defined in a component (assembly) is used as the base class for a type in another component (assembly). Obviously, if the base class versions (changes) underneath the derived class, the behavior of the derived class changes as well, probably in a way that causes the class to behave improperly. This is particularly true in polymorphism scenarios in which a derived type overrides virtual methods defined by a base type.
C# offers five keywords that you can apply to types and/or type members that impact component versioning. These keywords map directly to features supported in the CLR to support component versioning. Table 6-2 contains the C# keywords related to component versioning and indicates how each keyword affects a type or type member definition.
I will demonstrate the value and use of all these keywords in the upcoming section titled “Dealing with Virtual Methods When Versioning Types.” But before we get to a versioning scenario, let’s focus on how the CLR actually calls virtual methods.
# How the CLR Calls Virtual Methods, Properties, and Events
In this section, I will be focusing on methods, but this discussion is relevant to virtual properties and virtual events as well. Properties and events are actually implemented as methods; this will be shown in their corresponding chapters.
Methods represent code that performs some operation on the type (static methods) or an instance of the type (nonstatic methods). All methods have a name, a signature, and a return type (that may be void). The CLR allows a type to define multiple methods with the same name as long as each method has a different set of parameters or a different return type. So it’s possible to define two methods with the same name and same parameters as long as the methods have a different return type. However, except for IL assembly language, I’m not aware of any language that takes advantage of this “feature”; most languages (including C#) require that methods differ by parameters and ignore a method’s return type when determining uniqueness. (C# actually relaxes this restriction when defining conversion operator methods; see Chapter 8 for details.)
The Employee class shown below defines three different kinds of methods.
internal class Employee { | |
// A nonvirtual instance method | |
public Int32 GetYearsEmployed() { ... } | |
// A virtual method (virtual implies instance) | |
public virtual String GetProgressReport() { ... } | |
// A static method | |
public static Employee Lookup(String name) { ... } | |
} |
When the compiler compiles this code, the compiler emits three entries in the resulting assembly’s method definition table. Each entry has flags set indicating if the method is instance, virtual, or static.
When code is written to call any of these methods, the compiler emitting the calling code examines the method definition’s flags to determine how to emit the proper IL code so that the call is made correctly. The CLR offers two IL instructions for calling a method:
■ The
call
IL instruction can be used to call static, instance, and virtual methods. When thecall
instruction is used to call a static method, you must specify the type that defines the method that the CLR should call. When thecall
instruction is used to call an instance or virtual method, you must specify a variable that refers to an object. Thecall
instruction assumes that this variable is not null. In other words, the type of the variable itself indicates which type defines the method that the CLR should call. If the variable’s type doesn’t define the method, base types are checked for a matching method. Thecall
instruction is frequently used to call a virtual method nonvirtually.■ The
callvirt
IL instruction can be used to call instance and virtual methods, not static methods. When thecallvirt
instruction is used to call an instance or virtual method, you must specify a variable that refers to an object. When thecallvirt
IL instruction is used to call a nonvirtual instance method, the type of the variable indicates which type defines the method that the CLR should call. When thecallvirt
IL instruction is used to call a virtual instance method, the CLR discovers the actual type of the object being used to make the call and then calls the method polymorphically. In order to determine the type, the variable being used to make the call must not be null. In other words, when compiling this call, the JIT compiler generates code that verifies that the variable’s value is not null. If it is null, the callvirt instruction causes the CLR to throw aNullReferenceException
. This additional check means that the callvirt IL instruction executes slightly more slowly than the call instruction. Note that this null check is performed even when the callvirt instruction is used to call a nonvirtual instance method.
So now, let’s put this together to see how C# uses these different IL instructions.
using System; | |
public sealed class Program { | |
public static void Main() { | |
Console.WriteLine(); // Call a static method | |
Object o = new Object(); | |
o.GetHashCode(); // Call a virtual instance method | |
o.GetType(); // Call a nonvirtual instance method | |
} | |
} |
If you were to compile the code above and look at the resulting IL, you’d see the following.
.method public hidebysig static void Main() cil managed { | |
.entrypoint | |
// Code size 26 (0x1a) | |
.maxstack 1 | |
.locals init (object o) | |
IL_0000: call void System.Console::WriteLine() | |
IL_0005: newobj instance void System.Object::.ctor() | |
IL_000a: stloc.0 | |
IL_000b: ldloc.0 | |
IL_000c: callvirt instance int32 System.Object::GetHashCode() | |
IL_0011: pop | |
IL_0012: ldloc.0 | |
IL_0013: callvirt instance class System.Type System.Object::GetType() | |
IL_0018: pop | |
IL_0019: ret | |
} // end of method Program::Main |
Notice that the C# compiler uses the call IL instruction to call Console’s WriteLine method. This is expected because WriteLine is a static method. Next, notice that the
callvirt
IL instruction is used to callGetHashCode
. This is also expected, becauseGetHashCode
is a virtual method. Finally, notice that the C# compiler also uses thecallvirt
IL instruction to call theGetType
method. This is surprising becauseGetType
is not a virtual method. However, this works because while JIT-compiling this code, the CLR will know thatGetType
is not a virtual method, and so the JIT-compiled code will simply callGetType
nonvirtually.
Of course, the question is, why didn’t the C# compiler simply emit the call instruction instead? The answer is because the C# team decided that the JIT compiler should generate code to verify that the object being used to make the call is not null. This means that calls to nonvirtual instance methods are a little slower than they could be. It also means that the following C# code will cause a
NullReferenceException
to be thrown. In some other programming languages, the intention of the following code would run just fine.
using System; | |
public sealed class Program { | |
public Int32 GetFive() { return 5; } | |
public static void Main() { | |
Program p = null; | |
Int32 x = p.GetFive(); // In C#, NullReferenceException is thrown | |
} | |
} |
Theoretically, the preceding code is fine. Sure, the variable p is null, but when calling a nonvirtual method (
GetFive
), the CLR needs to know just the data type of p, which is Program. IfGetFive
did get called, the value of the this argument would be null. Because the argument is not used inside theGetFive
method, noNullReferenceException
would be thrown. However, because the C# compiler emits acallvirt
instruction instead of a call instruction, the preceding code will end up throwing theNullReferenceException
.
💡重要提示:将方法定义为非虚方法后,将来永远都不要把它更改为虚方法。这是因为某些编译器会用 call
而不是 callvirt
调用非虚方法。如果方法从非虚变成虚,而引用代码没有重新编译,会以非虚方式调用虚方法,造成应用程序行为无法预料。用 C# 写的引用代码不会出问题,因为 C# 用 callvirt
指令调用所有实例方法。但如果引用代码是用其他语言写的,就可能出问题。
Sometimes, the compiler will use a call instruction to call a virtual method instead of using a
callvirt
instruction. At first, this may seem surprising, but the code below demonstrates why it is sometimes required.
xxxxxxxxxx internal class SomeClass { // ToString is a virtual method defined in the base class: Object. public override String ToString() { // Compiler uses the 'call' IL instruction to call // Object’s ToString method nonvirtually. // If the compiler were to use 'callvirt' instead of 'call', this // method would call itself recursively until the stack overflowed. return base.ToString(); }} |
When calling base.ToString (a virtual method), the C# compiler emits a call instruction to ensure that the
ToString
method in the base type is called nonvirtually. This is required because ifToString
were called virtually, the call would execute recursively until the thread’s stack overflowed, which obviously is not desired.
Compilers tend to use the call instruction when calling methods defined by a value type because value types are sealed. This implies that there can be no polymorphism even for their virtual methods, which causes the performance of the call to be faster. In addition, the nature of a value type instance guarantees it can never be null, so a
NullReferenceException
will never be thrown. Finally, if you were to call a value type’s virtual method virtually, the CLR would need to have a reference to the value type’s type object in order to refer to the method table within it. This requires boxing the value type. Boxing puts more pressure on the heap, forcing more frequent garbage collections and hurting performance.
Regardless of whether call or callvirt is used to call an instance or virtual method, these methods always receive a hidden this argument as the method’s first parameter. The this argument refers to the object being operated on.
When designing a type, you should try to minimize the number of virtual methods you define. First, calling a virtual method is slower than calling a nonvirtual method. Second, virtual methods cannot be inlined by the JIT compiler, which further hurts performance. Third, virtual methods make versioning of components more brittle, as described in the next section. Fourth, when defining a base type, it is common to offer a set of convenience overloaded methods. If you want these methods to be polymorphic, the best thing to do is to make the most complex method virtual and leave all of the convenience overloaded methods nonvirtual. By the way, following this guideline will also improve the ability to version a component without adversely affecting the derived types. Here is an example.
public class Set { | |
private Int32 m_length = 0; | |
// This convenience overload is not virtual | |
public Int32 Find(Object value) { | |
return Find(value, 0, m_length); | |
} | |
// This convenience overload is not virtual | |
public Int32 Find(Object value, Int32 startIndex) { | |
return Find(value, startIndex, m_length - startIndex); | |
} | |
// The most feature-rich method is virtual and can be overridden | |
public virtual Int32 Find(Object value, Int32 startIndex, Int32 endIndex) { | |
// Actual implementation that can be overridden goes here... | |
} | |
// Other methods go here | |
} |
# Using Type Visibility and Member Accessibility Intelligently
With the .NET Framework, applications are composed of types defined in multiple assemblies produced by various companies. This means that the developer has little control over the components he or she is using and the types defined within those components. The developer typically doesn’t have access to the source code (and probably doesn’t even know what programming language was used to create the component), and components tend to version with different schedules. Furthermore, due to polymorphism and protected members, a base class developer must trust the code written by the derived class developer. And, of course, the developer of a derived class must trust the code that he is inheriting from a base class. These are just some of the issues that you need to really think about when designing components and types.
In this section, I’d like to say just a few words about how to design a type with these issues in mind. Specifically, I’m going to focus on the proper way to set type visibility and member accessibility so that you’ll be most successful.
First, when defining a new type, compilers should make the class sealed by default so that the class cannot be used as a base class. Instead, many compilers, including C#, default to unsealed classes and allow the programmer to explicitly mark a class as sealed by using the sealed keyword. Obviously, it is too late now, but I think that today’s compilers have chosen the wrong default and it would be nice if this could change with future compilers. There are three reasons why a sealed class is better than an unsealed class:
Versioning When a class is originally sealed, it can change to unsealed in the future without breaking compatibility. However, after a class is unsealed, you can never change it to sealed in the future as this would break all derived classes. In addition, if the unsealed class defines any unsealed virtual methods, ordering of the virtual method calls must be maintained with new versions or there is the potential of breaking derived types in the future.
Performance As discussed in the previous section, calling a virtual method doesn’t perform as well as calling a nonvirtual method because the CLR must look up the type of the object at run time in order to determine which type defines the method to call. However, if the JIT compiler sees a call to a virtual method using a sealed type, the JIT compiler can produce more efficient code by calling the method nonvirtually. It can do this because it knows there can’t possibly be a derived class if the class is sealed. For example, in the code below, the JIT compiler can call the virtual
ToString
method nonvirtually.
using System; public sealed class Point { private Int32 m_x, m_y; public Point(Int32 x, Int32 y) { m_x = x; m_y = y; } public override String ToString() { return String.Format("({0}, {1})", m_x, m_y);}
public static void Main() { Point p = new Point(3, 4);// The C# compiler emits the callvirt instruction here but the
// JIT compiler will optimize this call and produce code that
// calls ToString nonvirtually because p's type is Point,
// which is a sealed class
Console.WriteLine(p.ToString());}
}
- Security and predictability A class must protect its own state and not allow itself to ever become corrupted. When a class is unsealed, a derived class can access and manipulate the base class’s state if any data fields or methods that internally manipulate fields are accessible and not private. In addition, a virtual method can be overridden by a derived class, and the derived class can decide whether to call the base class’s implementation. By making a method, property, or event virtual, the base class is giving up some control over its behavior and its state. Unless carefully thought out, this can cause the object to behave unpredictably, and it opens up potential security holes.
The problem with a sealed class is that it can be a big inconvenience to users of the type. Occasionally, developers want to create a class derived from an existing type in order to attach some additional fields or state information for their application’s own use. In fact, they may even want to define some helper or convenience methods on the derived type to manipulate these additional fields. Although the CLR offers no mechanism to extend an already-built type with helper methods or fields, you can simulate helper methods by using C#’s extension methods (discussed in Chapter 8) and you can simulate tacking state onto an object by using the
ConditionalWeakTable
class (discussed in Chapter 21, “The Managed Heap and Garbage Collection.”
Here are the guidelines I follow when I define my own classes:
When defining a class, I always explicitly make it sealed unless I truly intend for the class to be a base class that allows specialization by derived classes. As stated earlier, this is the opposite of what C# and many other compilers default to today. I also default to making the class internal unless I want the class to be publicly exposed outside of my assembly. Fortunately, if you do not explicitly indicate a type’s visibility, the C# compiler defaults to internal. If I really feel that it is important to define a class that others can derive but I do not want to allow specialization, I will simulate creating a closed class by using the above technique of sealing the virtual methods that my class inherits.
Inside the class, I always define my data fields as private and I never waver on this. Fortunately, C# does default to making fields private. I’d actually prefer it if C# mandated that all fields be private and that you could not make fields protected, internal, public, and so on. Exposing state is the easiest way to get into problems, have your object behave unpredictably, and open potential security holes. This is true even if you just declare some fields as internal. Even within a single assembly, it is too hard to track all code that references a field, especially if several developers are writing code that gets compiled into the same assembly.
Inside the class, I always define my methods, properties, and events as private and nonvirtual. Fortunately, C# defaults to this as well. Certainly, I’ll make a method, property, or event public to expose some functionality from the type. I try to avoid making any of these members protected or internal, because this would be exposing my type to some potential vulnerability. However, I would sooner make a member protected or internal than I would make a member virtual because a virtual member gives up a lot of control and really relies on the proper behavior of the derived class.
There is an old OOP adage that goes like this: when things get too complicated, make more types. When an implementation of some algorithm starts to get complicated, I define helper types that encapsulate discrete pieces of functionality. If I’m defining these helper types for use by a single super-type, I’ll define the helper types nested within the super-type. This allows for scoping and also allows the code in the nested, helper type to reference the private members defined in the super-type. However, there is a design guideline rule, enforced by the Code Analysis tool (FxCopCmd.exe) in Visual Studio, which indicates that publicly exposed nested types should be defined at file or assembly scope and not be defined within another type. This rule exists because some developers find the syntax for referencing nested types cumbersome. I appreciate this rule, and I never define public nested types.
# Dealing with Virtual Methods When Versioning Types
As was stated earlier, in a Component Software Programming environment, versioning is a very important issue. I talked about some of these versioning issues in Chapter 3, “Shared Assemblies and Strongly Named Assemblies,” when I explained strongly named assemblies and discussed how an administrator can ensure that an application binds to the assemblies that it was built and tested with. However, other versioning issues cause source code compatibility problems. For example, you must be very careful when adding or modifying members of a type if that type is used as a base type. Let’s look at some examples.
CompanyA has designed the following type, Phone.
namespace CompanyA { | |
public class Phone { | |
public void Dial() { | |
Console.WriteLine("Phone.Dial"); | |
// Do work to dial the phone here. | |
} | |
} | |
} |
Now imagine that CompanyB defines another type, BetterPhone, which uses CompanyA’s Phone type as its base.
namespace CompanyB { | |
public class BetterPhone : CompanyA.Phone { | |
public void Dial() { | |
Console.WriteLine("BetterPhone.Dial"); | |
EstablishConnection(); | |
base.Dial(); | |
} | |
protected virtual void EstablishConnection() { | |
Console.WriteLine("BetterPhone.EstablishConnection"); | |
// Do work to establish the connection. | |
} | |
} | |
} |
When CompanyB attempts to compile its code, the C# compiler issues the following message.
warning CS0108: 'CompanyB.BetterPhone.Dial()' hides inherited member 'CompanyA.Phone.Dial()'.
Use the new keyword if hiding was intended.” This warning is notifying the developer that BetterPhone is defining a Dial method, which will hide the Dial method defined in Phone. This new method could change the semantic meaning of Dial (as defined by CompanyA when it originally created the Dial method).
It’s a very nice feature of the compiler to warn you of this potential semantic mismatch. The compiler also tells you how to remove the warning by adding the new keyword before the definition of Dial in the BetterPhone class. Here’s the fixed BetterPhone class.
namespace CompanyB { | |
public class BetterPhone : CompanyA.Phone { | |
// This Dial method has nothing to do with Phone's Dial method. | |
public new void Dial() { | |
Console.WriteLine("BetterPhone.Dial"); | |
EstablishConnection(); | |
base.Dial(); | |
} | |
protected virtual void EstablishConnection() { | |
Console.WriteLine("BetterPhone.EstablishConnection"); | |
// Do work to establish the connection. | |
} | |
} | |
} |
At this point, CompanyB can use BetterPhone.Dial in its application. Here’s some sample code that CompanyB might write.
public sealed class Program { | |
public static void Main() { | |
CompanyB.BetterPhone phone = new CompanyB.BetterPhone(); | |
phone.Dial(); | |
} | |
} |
When this code runs, the following output is displayed.
BetterPhone.Dial
BetterPhone.EstablishConnection
Phone.Dial
This output shows that CompanyB is getting the behavior it desires. The call to Dial is calling the new Dial method defined by BetterPhone, which calls the virtual
EstablishConnection
method and then calls the Phone base type’s Dial method.
Now let’s imagine that several companies have decided to use CompanyA’s Phone type. Let’s further imagine that these other companies have decided that the ability to establish a connection in the Dial method is a really useful feature. This feedback is given to CompanyA, which now revises its Phone class.
namespace CompanyA { | |
public class Phone { | |
public void Dial() { | |
Console.WriteLine("Phone.Dial"); | |
EstablishConnection(); | |
// Do work to dial the phone here. | |
} | |
protected virtual void EstablishConnection() { | |
Console.WriteLine("Phone.EstablishConnection"); | |
// Do work to establish the connection. | |
} | |
} | |
} |
Now when CompanyB compiles its BetterPhone type (derived from this new version of CompanyA’s Phone), the compiler issues this message.
warning CS0114: 'CompanyB.BetterPhone.EstablishConnection()' hides inherited member 'CompanyA.
Phone.EstablishConnection()'. To make the current member override that implementation, add the
override keyword. Otherwise, add the new keyword.
The compiler is alerting you to the fact that both Phone and BetterPhone offer an EstablishConnection method and that the semantics of both might not be identical; simply recompiling BetterPhone can no longer give the same behavior as it did when using the first version of the Phone type.
If CompanyB decides that the EstablishConnection methods are not semantically identical in both types, CompanyB can tell the compiler that the Dial and EstablishConnection method defined in BetterPhone is the correct method to use and that it has no relationship with the EstablishConnection method defined in the Phone base type. CompanyB informs the compiler of this by adding the new keyword to the EstablishConnection method.
namespace CompanyB { | |
public class BetterPhone : CompanyA.Phone { | |
// Keep 'new' to mark this method as having no | |
// relationship to the base type's Dial method. | |
public new void Dial() { | |
Console.WriteLine("BetterPhone.Dial"); | |
EstablishConnection(); | |
base.Dial(); | |
} | |
// Add 'new' to mark this method as having no | |
// relationship to the base type's EstablishConnection method. | |
protected new virtual void EstablishConnection() { | |
Console.WriteLine("BetterPhone.EstablishConnection"); | |
// Do work to establish the connection. | |
} | |
} | |
} |
In this code, the new keyword tells the compiler to emit metadata, making it clear to the CLR that BetterPhone’s EstablishConnection method is intended to be treated as a new function that is introduced by the BetterPhone type. The CLR will know that there is no relationship between Phone’s and BetterPhone’s methods.
When the same application code (in the Main method) executes, the output is as follows.
BetterPhone.Dial
BetterPhone.EstablishConnection
Phone.Dial
Phone.EstablishConnection
This output shows that Main’s call to Dial calls the new Dial method defined by BetterPhone.Dial, which in turn calls the virtual
EstablishConnection
method that is also defined by BetterPhone. When BetterPhone’sEstablishConnection
method returns, Phone’s Dial method is called. Phone’s Dial method callsEstablishConnection
, but because BetterPhone’sEstablishConnection
is marked with new, BetterPhone’sEstablishConnection
method isn’t considered an override of Phone’s virtualEstablishConnection
method. As a result, Phone’s Dial method calls Phone’sEstablishConnection
method—this is the expected behavior.
💡注意:如果编译器像原生 C++ 编译器那样默认将方法视为重写, BetterPhone
的开发者就不能使用 Dial
和 EstablishConnection
方法名了。这极有可能造成整个源代码 base 的连锁反应,破坏源代码和二进制兼容性。这种波及面太大的改变是我们不希望的,尤其是中大型的项目。但是,如果更改方法名只会造成源代码发生适度更新,就应该更改方法名,避免 Dial
和 EstablishConnection
方法的两种不同的含义使开发人员产生混淆。
Alternatively, CompanyB could have gotten the new version of CompanyA’s Phone type and decided that Phone’s semantics of Dial and EstablishConnection are exactly what it’s been looking for. In this case, CompanyB would modify its BetterPhone type by removing its Dial method entirely. In addition, because CompanyB now wants to tell the compiler that BetterPhone’s EstablishConnection method is related to Phone’s EstablishConnection method, the new keyword must be removed. Simply removing the new keyword isn’t enough, though, because now the compiler can’t tell exactly what the intention is of BetterPhone’s EstablishConnection method. To express his intent exactly, the CompanyB developer must also change BetterPhone’s EstablishConnection method from virtual to override. The following code shows the new version of BetterPhone.
namespace CompanyB { | |
public class BetterPhone : CompanyA.Phone { | |
// Delete the Dial method (inherit Dial from base). | |
// Remove 'new' and change 'virtual' to 'override' to | |
// mark this method as having a relationship to the base | |
// type's EstablishConnection method. | |
protected override void EstablishConnection() { | |
Console.WriteLine("BetterPhone.EstablishConnection"); | |
// Do work to establish the connection. | |
} | |
} | |
} |
Now when the same application code (in the Main method) executes, the output is as follows.
Phone.Dial
BetterPhone.EstablishConnection
This output shows that Main’s call to Dial calls the Dial method defined by Phone and inherited by BetterPhone. Then when Phone’s Dial method calls the virtual EstablishConnection method, BetterPhone’s EstablishConnection method is called because it overrides the virtual EstablishConnection method defined by Phone.
💡小结:组件软件编程(Component Software Programming,CSP)是 OOP 发展到极致的成果。版本控制使 CSP 的复杂性上升到了 OOP 无法企及的高度。.NET Framework 将版本号分为了 4 个部分:主版本号(major version)、次版本号(minor version)、内部版本号(build number)和修订号(revision)。major/minor 部分通常代表程序集的一个连续的、稳定的功能集,而 build/revision 部分通常代表对这个功能集的一次维护。将一个组件(程序集)中定义的类型作为另一个组件(程序集)中的一个类型的基类使用时,便会发生版本控制问题。如果基类的版本(被修改得)低于派生类,派生类的行为也会改变,这可能造成类的行为失常。在多态情形中,由于派生类型会重写基类型定义的虚方法,所以这个问题显得尤其突出。CLR 允许类型定义多个同名方法,只要每个方法都有一组不同的参数或者一个不同的返回类型。但除了 IL 汇编语言,大多数语言(包括 C#)在判断方法的唯一性时,除了方法名之外,都只以参数为准,方法返回类型会被忽略。CLR 提供了两个方法调用指令,分别是 call
和 callvirt
。其中 call
可调用静态方法、实例方法和虚方法,用 call
指令调用静态方法时指定方法定义的类型,调用实例方法或虚方法时会假设变量不为 null。 callvirt
可调用实例方法和虚方法,不能调用静态方法。 callvirt
无论时调用实例方法还是虚方法都会检查变量是不是 null,正是由于有这种额外的检查,所以 callvirt
指令的执行速度比 call
指令稍慢。调用实例方法或者虚方法时,JIT 编译器总是会调用 callvirt
来验证发出调用的对象不为 null,除非编译器要确保以非虚方式调用基类方法的时候。编译器调用值类型定义的方法时倾向于使用 call
指令,因为值类型时密封的。这意味着即使值类型含有虚方法也不用考虑多态性,这使调用更快。此外,值类型实例的本质保证它永不为 null,所以永远不抛出 NullReferenceException
异常。最后,如果以虚方式调用值类型中的虚方法,CLR 要获取对值类型的类型对象的引用,以便引用(类型对象中的)方法表,这要求对值类型装箱。无论用 call
还是 callvirt
调用实例方法或虚方法,这些方法通常接收隐藏的 this
实参作为方法第一个参数。 this
实参引用要操作的对象。设计类型时应尽量减少虚方法的数量。如果希望这些方法是多态的,最好的办法就是使最复杂的方法成为虚方法,使所有重载的简便方法成为非虚方法。