# Chapter 22 CLR Hosting and AppDomains

# CLR Hosting

The .NET Framework runs on top of Windows. This means that the .NET Framework must be built using technologies that Windows can interface with. For starters, all managed module and assembly files must use the Windows portable executable (PE) file format and be either a Windows executable (EXE) file or a DLL.

When developing the CLR, Microsoft implemented it as a COM server contained inside a DLL; that is, Microsoft defined a standard COM interface for the CLR and assigned GUIDs to this interface and the COM server. When you install the .NET Framework, the COM server representing the CLR is registered in the Windows registry just as any other COM server would. If you want more information about this topic, refer to the MetaHost.h C++ header file that ships with the .NET Framework SDK. This header file defines the GUIDs and the unmanaged ICLRMetaHost interface definition.

Any Windows application can host the CLR. However, you shouldn’t create an instance of the CLR COM server by calling CoCreateInstance; instead, your unmanaged host should call the CLRCreateInstance function declared in MetaHost.h. The CLRCreateInstance function is implemented in the MSCorEE.dll file, which is usually found in the C:\Windows\System32 directory. This DLL is affectionately referred to as the shim, and its job is to determine which version of the CLR to create; the shim DLL doesn’t contain the CLR COM server itself.

A single machine may have multiple versions of the CLR installed, but there will be only one version of the MSCorEE.dll file (the shim). The version of MSCorEE.dll installed on the machine is the version that shipped with the latest version of the CLR installed on the machine. Therefore, this version of MSCorEE.dll knows how to find any previous versions of the CLR that may be installed.

The actual CLR code is contained in a file whose name has changed with different versions of the CLR. For versions 1.0, 1.1, and 2.0, the CLR code is in a file called MSCorWks.dll, and for version 4, the CLR code is in a file called Clr.dll. Because you can have multiple versions of the CLR installed on a single machine, these files are installed into different directories as follows.

  • Version 1.0 is in C:\Windows\Microsoft.NET\Framework\v1.0.3705

  • Version 1.1 is in C:\Windows\Microsoft.NET\Framework\v1.1.4322

  • Version 2.0 is in C:\Windows\Microsoft.NET\Framework\v2.0.50727

  • Version 4 is in C:\Windows\Microsoft.NET\Framework\v4.0.30319
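Managed code can ask which CLR version it is running on and where that CLR was loaded from. The following is a minimal sketch, not part of the chapter's sample code; the `RuntimeInfo` class and its method names are made up for illustration, while `RuntimeEnvironment` is the real type in `System.Runtime.InteropServices`.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper class used only for this illustration
public static class RuntimeInfo {
    // Version string of the CLR that is running the calling code
    public static String GetClrVersion() {
        return RuntimeEnvironment.GetSystemVersion();
    }

    // Directory from which the running CLR was loaded
    public static String GetClrDirectory() {
        return RuntimeEnvironment.GetRuntimeDirectory();
    }

    private static void Main() {
        Console.WriteLine("CLR version   = {0}", GetClrVersion());
        Console.WriteLine("CLR directory = {0}", GetClrDirectory());
    }
}
```

On a machine with several CLR versions installed, the directory printed should correspond to one of the version-specific directories listed above.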

The CLRCreateInstance function can return an ICLRMetaHost interface. A host application can call this interface’s GetRuntime function, specifying the version of the CLR that the host would like to create. The shim then loads the desired version of the CLR into the host’s process.

By default, when a managed executable starts, the shim examines the executable file and extracts the information indicating the version of the CLR that the application was built and tested with. However, an application can override this default behavior by placing requiredRuntime and supportedRuntime entries in its XML configuration file (described in Chapter 2, “Building, Packaging, Deploying, and Administering Applications and Types,” and Chapter 3, “Shared Assemblies and Strongly Named Assemblies”).

The GetRuntime function returns a pointer to the unmanaged ICLRRuntimeInfo interface from which the ICLRRuntimeHost interface is obtained via the GetInterface method. The hosting application can call methods defined by this interface to:

  • Set Host managers. Tell the CLR that the host wants to be involved in making decisions related to memory allocations, thread scheduling/synchronization, assembly loading, and more. The host can also state that it wants notifications of garbage collection starts and stops and when certain operations time out.

  • Get CLR managers. Tell the CLR to prevent the use of some classes/members. In addition, the host can tell which code can and can’t be debugged and which methods in the host should be called when a special event—such as an AppDomain unload, CLR stop, or stack overflow exception—occurs.

  • Initialize and start the CLR.

  • Load an assembly and execute code in it.

  • Stop the CLR, thus preventing any more managed code from running in the Windows process.

There are many reasons why hosting the CLR is useful. Hosting allows any application to offer CLR features and a programmability story and to be at least partially written in managed code. Any application that hosts the runtime offers many benefits to developers who are trying to extend the application. Here are some of the benefits:

  • Programming can be done in any programming language.

  • Code is just-in-time (JIT)–compiled for speed (versus being interpreted).

  • Code uses garbage collection to avoid memory leaks and corruption.

  • Code runs in a secure sandbox.

  • The host doesn’t need to worry about providing a rich development environment. The host makes use of existing technologies: languages, compilers, editors, debuggers, profilers, and more.

If you are interested in using the CLR for hosting scenarios, I highly recommend that you get Steven Pratschner’s excellent book, Customizing the Microsoft .NET Framework Common Language Runtime (Microsoft Press 2005), even though it focuses on pre-4 versions of the CLR.

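💡 Note: A Windows process does not need to load the CLR at all; the CLR is loaded only when managed code is going to execute in the process. Prior to .NET Framework 4, the CLR allowed only one instance of itself to reside in a Windows process. In other words, a process could contain no CLR, or exactly one of CLR v1.0, CLR v1.1, or CLR v2.0. Allowing only one CLR version per process is a severe limitation. For example, Microsoft Office Outlook could not load two add-ins that were built and tested against different versions of the .NET Framework.

However, with .NET Framework 4, Microsoft supports loading CLR v2.0 and v4.0 side by side in a single Windows process, so components written for .NET Framework 2.0 and 4.0 can run at the same time without compatibility problems. This is an exciting feature because it greatly broadens the scenarios in which .NET Framework components can be used. You can use the ClrVer.exe tool to see which version (or versions) of the CLR is loaded into a given process.

After a CLR is loaded into a Windows process, it can never be unloaded; calling the AddRef and Release methods on the ICLRRuntimeHost interface has no effect. The only way to unload the CLR from a process is to terminate the process, which causes Windows to clean up all of the resources the process was using.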

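💡 Summary: Hosting lets any application use the features of the CLR. In particular, it allows existing applications to be at least partially written in managed code. Hosting also gives applications the ability to be customized and extended through programming. The .NET Framework runs on top of Windows, which means that it must be built using technologies that Windows understands. For starters, all managed module and assembly files must use the Windows PE file format and be either a Windows EXE file or a DLL. Microsoft implemented the CLR as a COM server contained inside a DLL; that is, Microsoft defined a standard COM interface for the CLR and assigned GUIDs to this interface and the COM server. When the .NET Framework is installed, the COM server representing the CLR is registered in the Windows registry like any other COM server. Any Windows application can host the CLR. However, don’t create an instance of the CLR COM server by calling CoCreateInstance; instead, the unmanaged host should call the CLRCreateInstance function declared in MetaHost.h. CLRCreateInstance is implemented in MSCorEE.dll, which is usually found in the C:\Windows\System32 directory. This DLL is affectionately called the shim, and its job is to decide which version of the CLR to create; the shim DLL doesn’t contain the CLR COM server itself. A machine may have multiple versions of the CLR installed, but it has only one version of MSCorEE.dll (the shim): the one that shipped with the latest CLR installed on the machine. Therefore, this version of MSCorEE.dll knows how to find older versions of the CLR on the machine. The file that contains the actual CLR code has a different name in different CLR versions: MSCorWks.dll for versions 1.0, 1.1, and 2.0, and Clr.dll for version 4. The CLRCreateInstance function can return an ICLRMetaHost interface. A host application can call this interface’s GetRuntime function, specifying the version of the CLR that the host wants to create, and the shim then loads that version of the CLR into the host’s process. By default, when a managed executable starts, the shim examines the executable file and extracts the version of the CLR that the application was built and tested with. An application can override this default by placing requiredRuntime and supportedRuntime entries in its XML configuration file. GetRuntime returns a pointer to the unmanaged ICLRRuntimeInfo interface, from which the ICLRRuntimeHost interface is obtained via the GetInterface method. The host can call methods defined by this interface to set host managers, get CLR managers, initialize and start the CLR, load an assembly and execute code in it, and stop the CLR, which prevents any more managed code from running in the Windows process.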

# AppDomains

When the CLR COM server initializes, it creates an AppDomain. An AppDomain is a logical container for a set of assemblies. The first AppDomain created when the CLR is initialized is called the default AppDomain; this AppDomain is destroyed only when the Windows process terminates.

In addition to the default AppDomain, a host using either unmanaged COM interface methods or managed type methods can instruct the CLR to create additional AppDomains. The whole purpose of an AppDomain is to provide isolation. Here are the specific features offered by an AppDomain:

  • Objects created by code in one AppDomain cannot be accessed directly by code in another AppDomain. When code in an AppDomain creates an object, that object is “owned” by that AppDomain. In other words, the object is not allowed to live beyond the lifetime of the AppDomain whose code constructed it. Code in other AppDomains can access another AppDomain’s object only by using marshal-by-reference or marshal-by-value semantics. This enforces a clean separation and boundary because code in one AppDomain can’t have a direct reference to an object created by code in a different AppDomain. This isolation allows AppDomains to be easily unloaded from a process without affecting code running in other AppDomains.

  • AppDomains can be unloaded. The CLR doesn’t support the ability to unload a single assembly from an AppDomain. However, you can tell the CLR to unload an AppDomain, which will cause all of the assemblies currently contained in it to be unloaded as well.

  • AppDomains can be individually secured. When created, an AppDomain can have a permission set applied to it that determines the maximum rights granted to assemblies running in the AppDomain. This allows a host to load some code and be assured that the code cannot corrupt or read important data structures used by the host itself.

  • AppDomains can be individually configured. When created, an AppDomain can have a set of configuration settings associated with it. These settings mostly affect how the CLR loads assemblies into the AppDomain. There are configuration settings related to search paths, version binding redirects, shadow copying, and loader optimizations.
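To make the configuration and unloading features concrete, here is a minimal sketch that assumes .NET Framework 4 (where creating additional AppDomains is supported). The `ConfiguredDomainDemo` class, the friendly name, and the particular setting values are illustrative assumptions, not part of the chapter's sample; `AppDomainSetup`, `AppDomain.CreateDomain`, and `AppDomain.Unload` are the real APIs.

```csharp
using System;

// Hypothetical demo class used only for this illustration
public static class ConfiguredDomainDemo {
    // Creates an AppDomain with explicit configuration settings
    public static AppDomain CreateConfiguredDomain() {
        AppDomainSetup setup = new AppDomainSetup {
            // Root directory the CLR probes when loading assemblies into the AppDomain
            ApplicationBase = AppDomain.CurrentDomain.BaseDirectory,
            // Load assemblies from a shadow copy so the original files are not locked
            ShadowCopyFiles = "true",
            // Share assemblies installed in the GAC in a domain-neutral fashion
            LoaderOptimization = LoaderOptimization.MultiDomainHost
        };
        // Passing null for the evidence leaves the security settings unchanged
        return AppDomain.CreateDomain("Configured AD", null, setup);
    }

    private static void Main() {
        AppDomain ad = CreateConfiguredDomain();
        Console.WriteLine("Name            = {0}", ad.FriendlyName);
        Console.WriteLine("ApplicationBase = {0}", ad.SetupInformation.ApplicationBase);
        Console.WriteLine("Shadow copying  = {0}", ad.ShadowCopyFiles);

        // Unloads the AppDomain together with every assembly loaded into it
        AppDomain.Unload(ad);
    }
}
```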

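💡 Important: A great feature of Windows is that it runs each application in its own process address space. This ensures that code in one application cannot access code or data in use by another application. Process isolation prevents security holes, data corruption, and other unpredictable behaviors, which makes Windows and the applications running on it robust. Unfortunately, creating processes in Windows is very expensive. The Win32 CreateProcess function is very slow, and Windows requires a lot of memory to virtualize a process’s address space.

However, if applications consist entirely of managed code that is verifiably safe and doesn’t call out into unmanaged code, there is no problem with running multiple managed applications in a single Windows process. AppDomains provide the isolation required to secure, configure, and terminate each of these applications.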

Figure 22-1 shows a single Windows process that has one CLR COM server running in it. This CLR is currently managing two AppDomains (although there is no hard-coded limit to the number of AppDomains that could be running in a single Windows process). Each AppDomain has its own loader heap, each of which maintains a record of which types have been accessed since the AppDomain was created. These type objects were discussed in Chapter 4, “Type Fundamentals”; each type object in the loader heap has a method table, and each entry in the method table points to JIT-compiled native code if the method has been executed at least once.

In addition, each AppDomain has some assemblies loaded into it. AppDomain #1 (the default AppDomain) has three assemblies: MyApp.exe, TypeLib.dll, and System.dll. AppDomain #2 has two assemblies loaded into it: Wintellect.dll and System.dll.

You’ll notice that the System.dll assembly has been loaded into both AppDomains. If both AppDomains are using a single type from System.dll, both AppDomains will have a type object for the same type allocated in each loader heap; the memory for the type object is not shared by all of the AppDomains. Furthermore, as code in an AppDomain calls methods defined by a type, the method’s Intermediate Language (IL) code is JIT-compiled, and the resulting native code is associated with each AppDomain; the code for the method is not shared by all AppDomains that call it.

Not sharing the memory for the type objects or native code is wasteful. However, the whole purpose of AppDomains is to provide isolation; the CLR needs to be able to unload an AppDomain and free up all of its resources without adversely affecting any other AppDomain. Replicating the CLR data structures ensures that this is possible. It also ensures that a type used by multiple AppDomains has a set of static fields for each AppDomain.
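The per-AppDomain static fields can be observed directly. The sketch below assumes .NET Framework 4; the `StaticsPerDomainDemo` class and its member names are hypothetical. It sets a static field in the default AppDomain and then runs the same static method inside a second AppDomain by using `AppDomain.DoCallBack`.

```csharp
using System;

// Hypothetical demo class used only for this illustration
public static class StaticsPerDomainDemo {
    // The CLR keeps a separate copy of this field for each AppDomain that uses the type
    public static Int32 s_value;

    // Reports the copy of s_value that belongs to the AppDomain the thread is executing in
    public static void ShowValue() {
        Console.WriteLine("{0}: s_value={1}",
            AppDomain.CurrentDomain.FriendlyName, s_value);
    }

    private static void Main() {
        s_value = 42;      // Changes only the default AppDomain's copy
        ShowValue();

        AppDomain ad2 = AppDomain.CreateDomain("AD #2");
        ad2.DoCallBack(ShowValue);   // The thread transitions into AD #2 and runs ShowValue there
        AppDomain.Unload(ad2);
    }
}
```

The first line of output reports 42 for the default AppDomain. The second line reports 0 for AD #2, because the new AppDomain has its own type object and therefore its own, freshly initialized, static field.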

FIGURE 22-1 A single Windows process hosting the CLR and two AppDomains.

Some assemblies are expected to be used by several AppDomains. MSCorLib.dll is the best example. This assembly contains System.Object, System.Int32, and all of the other types that are so integral to the .NET Framework. This assembly is automatically loaded when the CLR initializes, and all AppDomains share the types in this assembly. To reduce resource usage, MSCorLib.dll is loaded in an AppDomain-neutral fashion; that is, the CLR maintains a special loader heap for assemblies that are loaded in a domain-neutral fashion. All type objects in this loader heap and all native code for methods of these types are shared by all AppDomains in the process. Unfortunately, the benefit gained by sharing these resources does come with a price: assemblies that are loaded domain-neutral can never be unloaded. The only way to reclaim the resources used by them is to terminate the Windows process to cause Windows to reclaim the resources.

# Accessing Objects Across AppDomain Boundaries

Code in one AppDomain can communicate with types and objects contained in another AppDomain. However, access to these types and objects is allowed only through well-defined mechanisms. The following Ch22-1-AppDomains sample application demonstrates how to create a new AppDomain, load an assembly into it, and construct an instance of a type defined in that assembly. The code shows the different behaviors when constructing a type that is marshaled by reference, a type that is marshaled by value, and a type that can’t be marshaled at all. The code also shows how these differently marshaled objects behave when the AppDomain that created them is unloaded. The Ch22-1-AppDomains sample application has very little code in it, but I have added a lot of comments. After the code listing, I’ll walk through the code, explaining what the CLR is doing.

```csharp
using System;
using System.Reflection;
using System.Runtime.Remoting;
using System.Threading;

public static class Program {
    private static void Main() {
        Marshalling();
    }

    private static void Marshalling() {
        // Get a reference to the AppDomain that the calling thread is executing in
        AppDomain adCallingThreadDomain = Thread.GetDomain();

        // Every AppDomain is assigned a friendly string name (helpful for debugging)
        // Get this AppDomain's friendly string name and display it
        String callingDomainName = adCallingThreadDomain.FriendlyName;
        Console.WriteLine("Default AppDomain's friendly name={0}", callingDomainName);

        // Get and display the assembly in our AppDomain that contains the 'Main' method
        String exeAssembly = Assembly.GetEntryAssembly().FullName;
        Console.WriteLine("Main assembly={0}", exeAssembly);

        // Define a local variable that can refer to an AppDomain
        AppDomain ad2 = null;

        // *** DEMO 1: Cross-AppDomain Communication Using Marshal-by-Reference ***
        Console.WriteLine("{0}Demo #1", Environment.NewLine);

        // Create new AppDomain (security and configuration match current AppDomain)
        ad2 = AppDomain.CreateDomain("AD #2", null, null);
        MarshalByRefType mbrt = null;

        // Load our assembly into the new AppDomain, construct an object, marshal
        // it back to our AD (we really get a reference to a proxy)
        mbrt = (MarshalByRefType)
            ad2.CreateInstanceAndUnwrap(exeAssembly, "MarshalByRefType");

        Console.WriteLine("Type={0}", mbrt.GetType()); // The CLR lies about the type

        // Prove that we got a reference to a proxy object
        Console.WriteLine("Is proxy={0}", RemotingServices.IsTransparentProxy(mbrt));

        // This looks like we're calling a method on MarshalByRefType but we're not.
        // We're calling a method on the proxy type. The proxy transitions the thread
        // to the AppDomain owning the object and calls this method on the real object.
        mbrt.SomeMethod();

        // Unload the new AppDomain
        AppDomain.Unload(ad2);
        // mbrt refers to a valid proxy object; the proxy object refers to an invalid AppDomain

        try {
            // We're calling a method on the proxy type. The AD is invalid, exception is thrown
            mbrt.SomeMethod();
            Console.WriteLine("Successful call.");
        }
        catch (AppDomainUnloadedException) {
            Console.WriteLine("Failed call.");
        }

        // *** DEMO 2: Cross-AppDomain Communication Using Marshal-by-Value ***
        Console.WriteLine("{0}Demo #2", Environment.NewLine);

        // Create new AppDomain (security and configuration match current AppDomain)
        ad2 = AppDomain.CreateDomain("AD #2", null, null);

        // Load our assembly into the new AppDomain, construct an object, marshal
        // it back to our AD (we really get a reference to a proxy)
        mbrt = (MarshalByRefType)
            ad2.CreateInstanceAndUnwrap(exeAssembly, "MarshalByRefType");

        // The object's method returns a COPY of the returned object;
        // the object is marshaled by value (not by reference).
        MarshalByValType mbvt = mbrt.MethodWithReturn();

        // Prove that we did NOT get a reference to a proxy object
        Console.WriteLine("Is proxy={0}", RemotingServices.IsTransparentProxy(mbvt));

        // This looks like we're calling a method on MarshalByValType and we are.
        Console.WriteLine("Returned object created " + mbvt.ToString());

        // Unload the new AppDomain
        AppDomain.Unload(ad2);
        // mbvt refers to valid object; unloading the AppDomain has no impact.

        try {
            // We're calling a method on an object; no exception is thrown
            Console.WriteLine("Returned object created " + mbvt.ToString());
            Console.WriteLine("Successful call.");
        }
        catch (AppDomainUnloadedException) {
            Console.WriteLine("Failed call.");
        }

        // *** DEMO 3: Cross-AppDomain Communication Using non-marshalable type ***
        Console.WriteLine("{0}Demo #3", Environment.NewLine);

        // Create new AppDomain (security and configuration match current AppDomain)
        ad2 = AppDomain.CreateDomain("AD #2", null, null);

        // Load our assembly into the new AppDomain, construct an object, marshal
        // it back to our AD (we really get a reference to a proxy)
        mbrt = (MarshalByRefType)
            ad2.CreateInstanceAndUnwrap(exeAssembly, "MarshalByRefType");

        // The object's method returns a non-marshalable object; exception
        NonMarshalableType nmt = mbrt.MethodArgAndReturn(callingDomainName);
        // We won't get here...
    }
}

// Instances can be marshaled-by-reference across AppDomain boundaries
public sealed class MarshalByRefType : MarshalByRefObject {
    public MarshalByRefType() {
        Console.WriteLine("{0} ctor running in {1}",
            this.GetType().ToString(), Thread.GetDomain().FriendlyName);
    }

    public void SomeMethod() {
        Console.WriteLine("Executing in " + Thread.GetDomain().FriendlyName);
    }

    public MarshalByValType MethodWithReturn() {
        Console.WriteLine("Executing in " + Thread.GetDomain().FriendlyName);
        MarshalByValType t = new MarshalByValType();
        return t;
    }

    public NonMarshalableType MethodArgAndReturn(String callingDomainName) {
        // NOTE: callingDomainName is [Serializable]
        Console.WriteLine("Calling from '{0}' to '{1}'.",
            callingDomainName, Thread.GetDomain().FriendlyName);
        NonMarshalableType t = new NonMarshalableType();
        return t;
    }
}

// Instances can be marshaled-by-value across AppDomain boundaries
[Serializable]
public sealed class MarshalByValType : Object {
    private DateTime m_creationTime = DateTime.Now; // NOTE: DateTime is [Serializable]

    public MarshalByValType() {
        Console.WriteLine("{0} ctor running in {1}, Created on {2:D}",
            this.GetType().ToString(),
            Thread.GetDomain().FriendlyName,
            m_creationTime);
    }

    public override String ToString() {
        return m_creationTime.ToLongDateString();
    }
}

// Instances cannot be marshaled across AppDomain boundaries
// [Serializable]
public sealed class NonMarshalableType : Object {
    public NonMarshalableType() {
        Console.WriteLine("Executing in " + Thread.GetDomain().FriendlyName);
    }
}
```

If you build and run the Ch22-1-AppDomains application, you get the following output.

```
Default AppDomain's friendly name=Ch22-1-AppDomains.exe
Main assembly=Ch22-1-AppDomains, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null

Demo #1
MarshalByRefType ctor running in AD #2
Type=MarshalByRefType
Is proxy=True
Executing in AD #2
Failed call.

Demo #2
MarshalByRefType ctor running in AD #2
Executing in AD #2
MarshalByValType ctor running in AD #2, Created on Saturday, June 23, 2012
Is proxy=False
Returned object created Saturday, June 23, 2012
Returned object created Saturday, June 23, 2012
Successful call.

Demo #3
MarshalByRefType ctor running in AD #2
Calling from 'Ch22-1-AppDomains.exe' to 'AD #2'.
Executing in AD #2

Unhandled Exception: System.Runtime.Serialization.SerializationException:
Type 'NonMarshalableType' in assembly 'Ch22-1-AppDomains, Version=0.0.0.0,
Culture=neutral, PublicKeyToken=null' is not marked as serializable.
   at MarshalByRefType.MethodArgAndReturn(String callingDomainName)
   at Program.Marshalling()
   at Program.Main()
```

Now, I will discuss what this code and the CLR are doing.

Inside the Marshalling method, I first get a reference to an AppDomain object that identifies the AppDomain the calling thread is currently executing in. In Windows, a thread is always created in the context of one process, and the thread lives its entire lifetime in that process. However, a one-to-one correspondence doesn’t exist between threads and AppDomains. AppDomains are a CLR feature; Windows knows nothing about AppDomains. Because multiple AppDomains can be in a single Windows process, a thread can execute code in one AppDomain and then execute code in another AppDomain. From the CLR’s perspective, a thread is executing code in one AppDomain at a time. A thread can ask the CLR what AppDomain it is currently executing in by calling System.Threading.Thread’s static GetDomain method. The thread could also query System.AppDomain’s static, read-only CurrentDomain property to get the same information.
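The two ways of asking for the current AppDomain can be compared with a few lines of code. This is a minimal sketch; the `CurrentDomainDemo` class and its `SameDomain` helper are hypothetical names, not part of the Ch22-1-AppDomains sample.

```csharp
using System;
using System.Threading;

// Hypothetical demo class used only for this illustration
public static class CurrentDomainDemo {
    // True if both APIs report the same AppDomain object for the calling thread
    public static Boolean SameDomain() {
        return Object.ReferenceEquals(Thread.GetDomain(), AppDomain.CurrentDomain);
    }

    private static void Main() {
        Console.WriteLine("Same AppDomain object: {0}", SameDomain());
        Console.WriteLine("Friendly name: {0}", AppDomain.CurrentDomain.FriendlyName);
        Console.WriteLine("Is default AppDomain: {0}",
            AppDomain.CurrentDomain.IsDefaultAppDomain());
    }
}
```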

When an AppDomain is created, it can be assigned a friendly name. A friendly name is just a String that you can use to identify an AppDomain. This is typically useful in debugging scenarios. Because the CLR creates the default AppDomain before any of our code can run, the CLR uses the executable file’s file name as the default AppDomain’s friendly name. My Marshalling method queries the default AppDomain’s friendly name by using System.AppDomain’s read-only FriendlyName property.

Next, my Marshalling method queries the strong-name identity of the assembly (loaded into the default AppDomain) that defines the entry point method Main that calls Marshalling. This assembly defines several types: Program, MarshalByRefType, MarshalByValType, and NonMarshalableType. At this point, we’re ready to look at the three demos that are all pretty similar to each other.

# Demo #1: Cross-AppDomain Communication Using Marshal-by-Reference

In Demo #1, System.AppDomain’s static CreateDomain method is called, instructing the CLR to create a new AppDomain in the same Windows process. The AppDomain type actually offers several overloads of the CreateDomain method; I encourage you to study them and select the version that is most appropriate when you are writing code to create a new AppDomain. The version of CreateDomain that I call accepts three arguments:

  • A String identifying the friendly name I want assigned to the new AppDomain. I’m passing in “AD #2” here.

  • A System.Security.Policy.Evidence identifying the evidence that the CLR should use to calculate the AppDomain’s permission set. I’m passing null here so that the new AppDomain will inherit the same permission set as the AppDomain creating it. Usually, if you want to create a security boundary around code in an AppDomain, you’d construct a System.Security.PermissionSet object, add the desired permission objects to it (instances of types that implement the IPermission interface), and then pass the resulting PermissionSet object reference to the overloaded version of the CreateDomain method that accepts a PermissionSet.

  • A System.AppDomainSetup identifying the configuration settings the CLR should use for the new AppDomain. Again, I’m passing null here so that the new AppDomain will inherit the same configuration settings as the AppDomain creating it. If you want the AppDomain to have a special configuration, construct an AppDomainSetup object, set its various properties to whatever you want, such as the name of the configuration file, and then pass the resulting AppDomainSetup object reference to the CreateDomain method.

Internally, the CreateDomain method creates a new AppDomain in the process. This AppDomain will be assigned the specified friendly name, security, and configuration settings. The new AppDomain will have its very own loader heap, which will be empty because there are currently no assemblies loaded into the new AppDomain. When you create an AppDomain, the CLR does not create any threads in this AppDomain; no code runs in the AppDomain unless you explicitly have a thread call code in the AppDomain.
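As a sketch of the PermissionSet-based overload of CreateDomain, the code below creates an AppDomain whose code is granted only the right to execute. It assumes .NET Framework 4; the `SandboxDemo` class and the friendly name are hypothetical. Reusing the host's own base directory as ApplicationBase keeps the example short and is not meant as a recommendation for a real sandbox.

```csharp
using System;
using System.Security;
using System.Security.Permissions;

// Hypothetical demo class used only for this illustration
public static class SandboxDemo {
    // Creates an AppDomain whose code is granted only permission to execute
    public static AppDomain CreateSandbox() {
        // Start with an empty permission set and add the single permission we want to grant
        PermissionSet grantSet = new PermissionSet(PermissionState.None);
        grantSet.AddPermission(new SecurityPermission(SecurityPermissionFlag.Execution));

        // This overload requires an AppDomainSetup whose ApplicationBase is set
        AppDomainSetup setup = new AppDomainSetup {
            ApplicationBase = AppDomain.CurrentDomain.BaseDirectory
        };

        // No evidence is passed; no assemblies are listed as fully trusted
        return AppDomain.CreateDomain("Sandbox AD", null, setup, grantSet);
    }

    private static void Main() {
        AppDomain sandbox = CreateSandbox();
        Console.WriteLine("Created '{0}', fully trusted={1}",
            sandbox.FriendlyName, sandbox.IsFullyTrusted);
        AppDomain.Unload(sandbox);
    }
}
```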

Now to create an instance of an object in the new AppDomain, we must first load an assembly into the new AppDomain and then construct an instance of a type defined in this assembly. This is precisely what the call to AppDomain’s public, instance CreateInstanceAndUnwrap method does. When calling CreateInstanceAndUnwrap, I pass two arguments: a String identifying the assembly I want loaded into the new AppDomain (referenced by the ad2 variable) and another String identifying the name of the type that I want to construct an instance of. Internally, CreateInstanceAndUnwrap causes the calling thread to transition from the current AppDomain into the new AppDomain. Now, the thread (which is inside the call to CreateInstanceAndUnwrap) loads the specified assembly into the new AppDomain and then scans the assembly’s type definition metadata table, looking for the specified type (“MarshalByRefType”). After the type is found, the thread calls the MarshalByRefType’s parameterless constructor. Now the thread transitions back to the default AppDomain so that CreateInstanceAndUnwrap can return a reference to the new MarshalByRefType object.

💡 Note: Some overloads of the CreateInstanceAndUnwrap method let you pass arguments to the type’s constructor.
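For example, the eight-parameter overload available in .NET Framework 4 accepts an Object[] of constructor arguments. In this minimal sketch, the `Greeter` type and the `CtorArgsDemo` class are hypothetical; the constructor argument is a String, which is serializable and can therefore cross the AppDomain boundary.

```csharp
using System;
using System.Reflection;

// Hypothetical marshal-by-reference type whose constructor takes an argument
public sealed class Greeter : MarshalByRefObject {
    private readonly String m_name;

    public Greeter(String name) { m_name = name; }

    public String Greet() {
        return String.Format("Hello {0} from {1}",
            m_name, AppDomain.CurrentDomain.FriendlyName);
    }
}

// Hypothetical demo class used only for this illustration
public static class CtorArgsDemo {
    public static Greeter CreateIn(AppDomain ad, String name) {
        return (Greeter) ad.CreateInstanceAndUnwrap(
            typeof(Greeter).Assembly.FullName,   // Assembly to load into the AppDomain
            typeof(Greeter).FullName,            // Type to construct
            false,                               // ignoreCase
            BindingFlags.Default,                // Use the default constructor lookup
            null,                                // binder
            new Object[] { name },               // Constructor arguments
            null,                                // culture
            null);                               // activationAttributes
    }

    private static void Main() {
        AppDomain ad2 = AppDomain.CreateDomain("AD #2");
        Greeter g = CreateIn(ad2, "Jeff");
        Console.WriteLine(g.Greet());   // Executes inside AD #2 via the proxy
        AppDomain.Unload(ad2);
    }
}
```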

Although this sounds all fine and good, there is a problem: the CLR cannot allow a variable (root) living in one AppDomain to reference an object created in another AppDomain. If CreateInstanceAndUnwrap simply returned the reference to the object, isolation would be broken, and isolation is the whole purpose of AppDomains! So, just before CreateInstanceAndUnwrap returns the object reference, it performs some additional logic.

You’ll notice that the MarshalByRefType type is derived from a very special base class: System.MarshalByRefObject. When CreateInstanceAndUnwrap sees that it is marshalling an object whose type is derived from MarshalByRefObject, the CLR will marshal the object by reference across the AppDomain boundaries. Here is what it means to marshal an object by reference from one AppDomain (the source AppDomain where the object is really created) to another AppDomain (the destination AppDomain from where CreateInstanceAndUnwrap is called).

When a source AppDomain wants to send or return the reference of an object to a destination AppDomain, the CLR defines a proxy type in the destination AppDomain’s loader heap. This proxy type is defined using the original type’s metadata, and therefore, it looks exactly like the original type; it has all of the same instance members (properties, events, and methods). The instance fields are not part of the type, but I’ll talk more about this in a moment. This new type does have some instance fields defined inside of it, but these fields are not identical to those of the original type. Instead, these fields indicate which AppDomain “owns” the real object and how to find the real object in the owning AppDomain. (Internally, the proxy object uses a GCHandle instance that refers to the real object. The GCHandle type is discussed in Chapter 21, “The Managed Heap and Garbage Collection.”)

After this type is defined in the destination AppDomain, CreateInstanceAndUnwrap creates an instance of this proxy type, initializes its fields to identify the source AppDomain and the real object, and returns a reference to this proxy object to the destination AppDomain. In my Ch22-1-AppDomains application, the mbrt variable will be set to refer to this proxy. Notice that the object returned from CreateInstanceAndUnwrap is actually not an instance of the MarshalByRefType type. The CLR will usually not allow you to cast an object of one type to an incompatible type. However, in this situation, the CLR does allow the cast, because this new type has the same instance members as defined on the original type. In fact, if you use the proxy object to call GetType, it actually lies to you and says that it is a MarshalByRefType object.

However, it is possible to prove that the object returned from CreateInstanceAndUnwrap is actually a reference to a proxy object. To do this, my Ch22-1-AppDomains application calls System.Runtime.Remoting.RemotingServices’s public, static IsTransparentProxy method, passing in the reference returned from CreateInstanceAndUnwrap. As you can see from the output, IsTransparentProxy returns true, indicating that the object is a proxy.

Now, my Ch22-1-AppDomains application uses the proxy to call the SomeMethod method. Because the mbrt variable refers to a proxy object, the proxy’s implementation of this method is called. The proxy’s implementation uses the information fields inside the proxy object to transition the calling thread from the default AppDomain to the new AppDomain. Any actions now performed by this thread run under the new AppDomain’s security and configuration settings. Next, the thread uses the proxy object’s GCHandle field to find the real object in the new AppDomain, and then it uses the real object to call the real SomeMethod method.

There are two ways to prove that the calling thread has transitioned from the default AppDomain to the new AppDomain. First, inside the SomeMethod method, I call Thread.GetDomain().FriendlyName. This will return “AD #2” (as evidenced by the output) because the thread is now running in the new AppDomain created by using AppDomain.CreateDomain with “AD #2” as the friendly name parameter. Second, if you step through the code in a debugger and display the Call Stack window, the [AppDomain Transition] line marks where a thread has transitioned across an AppDomain boundary. See the Call Stack window near the bottom of Figure 22-2.

FIGURE 22-2 The Debugger’s Call Stack window showing an AppDomain transition.

When the real SomeMethod method returns, it returns to the proxy’s SomeMethod method, which transitions the thread back to the default AppDomain, and then the thread continues executing code in the default AppDomain.

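💡 Note: When a thread in one AppDomain calls a method in another AppDomain, the thread transitions between the two AppDomains. This means that method calls across AppDomain boundaries are executed synchronously. At any given time, a thread is in just one AppDomain, and it executes code using that AppDomain’s security and configuration settings. If you want code in multiple AppDomains to execute concurrently, create additional threads and have them execute whatever code you want in whatever AppDomains you want.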

The next thing that my Ch22-1-AppDomains application does is call AppDomain’s public, static Unload method to force the CLR to unload the specified AppDomain, including all of the assemblies loaded into it. A garbage collection is forced to free up any objects that were created by code in the unloading AppDomain. At this point, the default AppDomain’s mbrt variable still refers to a valid proxy object; however, the proxy object no longer refers to a valid AppDomain (because it has been unloaded).

When the default AppDomain attempts to use the proxy object to call the SomeMethod method, the proxy’s implementation of this method is called. The proxy’s implementation determines that the AppDomain that contained the real object has been unloaded, and the proxy’s SomeMethod method throws an AppDomainUnloadedException to let the caller know that the operation cannot complete.

Wow! The CLR team at Microsoft had to do a lot of work to ensure AppDomain isolation, but it is important work because these features are used heavily and are being used more and more by developers every day. Obviously, accessing objects across AppDomain boundaries by using marshal-by-reference semantics has some performance costs associated with it, so you typically want to keep the use of this feature to a minimum.

I promised you that I’d talk a little more about instance fields. A type derived from MarshalByRefObject can define instance fields. However, these instance fields are not defined as being part of the proxy type and are not contained inside a proxy object. When you write code that reads from or writes to an instance field of a type derived from MarshalByRefObject, the JIT compiler emits code that uses the proxy object (to find the real AppDomain/object) by calling System.Object’s FieldGetter or FieldSetter methods, respectively. These methods are private and undocumented; they are basically methods that use reflection to get and set the value in a field. So although you can access fields of a type derived from MarshalByRefObject, the performance is particularly bad because the CLR really ends up calling methods to perform the field access. In fact, the performance is bad even if the object that you are accessing is in your own AppDomain.

Performance problems when accessing instance fields

The following code demonstrates the extent of the performance penalty:

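// NonMBRO is an ordinary type; MBRO must derive from MarshalByRefObject
// for the timing comparison below to show the field-access penalty.
private sealed class NonMBRO : Object             { public Int32 x; }
private sealed class MBRO    : MarshalByRefObject { public Int32 x; }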

private static void FieldAccessTiming() {
    const Int32 count = 100000000;
    NonMBRO nonMbro = new NonMBRO();
    MBRO mbro = new MBRO();

    Stopwatch sw = Stopwatch.StartNew();
    for (Int32 c = 0; c < count; c++ ) nonMbro.x++;
    Console.WriteLine("{0}", sw.Elapsed);       // 00:00:00.4073560

    sw = Stopwatch.StartNew();
    for (Int32 c = 0; c < count; c++) mbro.x++;
    Console.WriteLine("{0}", sw.Elapsed);       // 00:00:02.5388665 
}

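When I ran this code, accessing the instance field of the NonMBRO class (derived from Object) took only about 0.4 seconds, whereas accessing the instance field of the MBRO class (derived from MarshalByRefObject) took 2.54 seconds. In other words, accessing an instance field of a class derived from MarshalByRefObject took roughly six times as long!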

From a usability standpoint, a type derived from MarshalByRefObject should really avoid defining any static members. The reason is that static members are always accessed in the context of the calling AppDomain. No AppDomain transition can occur because a proxy object contains the information identifying which AppDomain to transition to, but there is no proxy object when calling a static member. Having a type’s static members execute in one AppDomain while instance members execute in another AppDomain would make a very awkward programming model.

Because there are no roots in the second AppDomain, the original object referred to by the proxy could be garbage collected. Of course, this is not ideal. On the other hand, if the original object is held in memory indefinitely, then the proxy could go away and the original object would still live; this is also not ideal. The CLR solves this problem by using a lease manager. When a proxy for an object is created, the CLR keeps the object alive for five minutes. If no calls have been made through the proxy after five minutes, then the object is deactivated and will have its memory freed at the next garbage collection. After each call into the object, the lease manager renews the object’s lease so that it is guaranteed to remain in memory for another two minutes before being deactivated. If an application attempts to call into an object through a proxy after the object’s lease has expired, the CLR throws a System.Runtime.Remoting.RemotingException.

It is possible to override the default lease times of five minutes and two minutes by overriding MarshalByRefObject’s virtual InitializeLifetimeService method. For more information, see the section titled “Lifetime Leases” in the .NET Framework SDK documentation.
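As a rough sketch of what such an override can look like, the two hypothetical types below change the lease behavior in different ways. The type names and the chosen durations are illustrative assumptions; the `ILease` members used here belong to the .NET Framework's `System.Runtime.Remoting.Lifetime` namespace.

```csharp
using System;
using System.Runtime.Remoting.Lifetime;

// Hypothetical type: lengthens the default five-minute / two-minute lease.
public class LongLivedService : MarshalByRefObject {
    public override object InitializeLifetimeService() {
        ILease lease = (ILease)base.InitializeLifetimeService();
        // Lease settings can be changed only while the lease is in its initial state.
        if (lease.CurrentState == LeaseState.Initial) {
            lease.InitialLeaseTime = TimeSpan.FromMinutes(30);  // illustrative value
            lease.RenewOnCallTime  = TimeSpan.FromMinutes(10);  // illustrative value
        }
        return lease;
    }
}

// Hypothetical type: returning null means no lease is created, so the object
// stays reachable through its proxies until its AppDomain is unloaded.
public class ImmortalService : MarshalByRefObject {
    public override object InitializeLifetimeService() {
        return null;
    }
}
```

Returning null is the simpler choice when the object's lifetime should simply match its AppDomain's; adjusting the lease keeps the automatic cleanup but on a schedule that suits the application.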

# Demo #2: Cross-AppDomain Communication Using Marshal-by-Value

Demo #2 is very similar to Demo #1. Again, another AppDomain is created exactly as Demo #1 did it. Then, CreateInstanceAndUnwrap is called to load the same assembly into the new AppDomain and create an instance of a MarshalByRefType object in this new AppDomain. Next, the CLR creates a proxy to the object and the mbrt variable (in the default AppDomain) is initialized referring to the proxy. Now, using the proxy, I call MethodWithReturn. This method, which takes no arguments, will execute in the new AppDomain to create an instance of the MarshalByValType type before returning a reference to the object to the default AppDomain.

MarshalByValType is not derived from System.MarshalByRefObject, and therefore, the CLR cannot define a proxy type to create an instance from; the object can’t be marshaled by reference across the AppDomain boundary.

However, because MarshalByValType is marked with the [Serializable] custom attribute, MethodWithReturn is allowed to marshal the object by value. The next paragraph describes what it means to marshal an object by value from one AppDomain (the source AppDomain) to another AppDomain (the destination AppDomain). For more information about the CLR’s serialization and deserialization mechanisms, see Chapter 24, “Runtime Serialization.”

When a source AppDomain wants to send or return a reference to an object to a destination AppDomain, the CLR serializes the object’s instance fields into a byte array. This byte array is copied from the source AppDomain to the destination AppDomain. Then, the CLR deserializes the byte array in the destination AppDomain. This forces the CLR to load the assembly that defines the type being deserialized into the destination AppDomain if it is not already loaded. Then, the CLR creates an instance of the type and uses the values in the byte array to initialize the object’s fields so that they have values identical to those they had in the original object. In other words, the CLR makes an exact duplicate of the source object in the destination AppDomain. MethodWithReturn then returns a reference to this copy; the object has been marshaled by value across the AppDomain’s boundary.

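💡 Important: When loading the assembly, the CLR uses the destination AppDomain’s policies and configuration settings (the AppDomain might have a different AppBase directory or different version binding redirections). These policy differences might prevent the CLR from locating the assembly. If the assembly cannot be loaded, an exception is thrown and the destination AppDomain does not receive a reference to the object.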

At this point, the object in the source AppDomain and the object in the destination AppDomain live separate lifetimes, and their states can change independently of each other. If there are no roots in the source AppDomain keeping the original object alive (as in my Ch22-1-AppDomains application), its memory will be reclaimed at the next garbage collection.

To prove that the object returned from MethodWithReturn is not a reference to a proxy object, my Ch22-1-AppDomains application calls System.Runtime.Remoting.RemotingServices’s public, static IsTransparentProxy method, passing in the reference returned from MethodWithReturn. As you can see from the output, IsTransparentProxy returns false, indicating that the object is a real object, not a proxy.

Now, my program uses the real object to call the ToString method. Because the mbvt variable refers to a real object, the real implementation of this method is called, and no AppDomain transition occurs. This can be evidenced by examining the debugger’s Call Stack window, which will not show an [Appdomain Transition] line.

To further prove that no proxy is involved, my Ch22-1-AppDomains application unloads the new AppDomain and then attempts to call the ToString method again. Unlike in Demo #1, the call succeeds this time because unloading the new AppDomain had no impact on objects “owned” by the default AppDomain, and this includes the object that was marshaled by value.
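The behavior walked through in Demo #2 can be approximated with the sketch below. The `Snapshot` and `SnapshotFactory` types are hypothetical stand-ins for MarshalByValType and MarshalByRefType, and the sketch assumes the .NET Framework.

```csharp
using System;
using System.Runtime.Remoting;

// Hypothetical marshal-by-value type: [Serializable], not a MarshalByRefObject.
[Serializable]
public sealed class Snapshot {
    public DateTime CreatedAt = DateTime.Now;
    public override string ToString() { return CreatedAt.ToString("o"); }
}

// Hypothetical marshal-by-reference type that creates Snapshot objects
// inside whichever AppDomain it lives in.
public sealed class SnapshotFactory : MarshalByRefObject {
    public Snapshot Create() { return new Snapshot(); }
}

public static class MarshalByValueDemo {
    // Returns true if the returned object is a real copy that survives the unload.
    public static bool CopySurvivesUnload() {
        AppDomain ad2 = AppDomain.CreateDomain("AD #2");
        SnapshotFactory factory = (SnapshotFactory)ad2.CreateInstanceAndUnwrap(
            typeof(SnapshotFactory).Assembly.FullName,
            typeof(SnapshotFactory).FullName);

        // Serialized in AD #2, deserialized into a new object in this AppDomain.
        Snapshot copy = factory.Create();
        bool isProxy = RemotingServices.IsTransparentProxy(copy);
        Console.WriteLine("Is proxy: {0}", isProxy);

        AppDomain.Unload(ad2);

        // The copy is owned by the calling AppDomain, so this call still works.
        string text = copy.ToString();
        Console.WriteLine("Still usable after unload: {0}", text);
        return !isProxy && text.Length > 0;
    }
}
```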

# Demo #3: Cross-AppDomain Communication Using Non-Marshalable Types

Demo #3 starts out very similar to Demos #1 and #2. Just as in Demos #1 and #2, an AppDomain is created. Then, CreateInstanceAndUnwrap is called to load the same assembly into the new AppDomain, create a MarshalByRefType object in this new AppDomain, and have the mbrt variable refer to a proxy to this object.

Then, using this proxy, I call MethodArgAndReturn, which accepts an argument. Again, the CLR must maintain AppDomain isolation, so it cannot simply pass the reference to the argument into the new AppDomain. If the type of the object is derived from MarshalByRefObject, the CLR will make a proxy for it and marshal it by reference. If the object’s type is marked as [Serializable], the CLR will serialize the object (and its children) to a byte array, marshal the byte array into the new AppDomain, and then deserialize the byte array into an object graph, passing the root of the object graph into the MethodArgAndReturn method.

In this particular demo, I am passing a System.String object across AppDomain boundaries. The System.String type is not derived from MarshalByRefObject, so the CLR cannot create a proxy. Fortunately, System.String is marked as [Serializable], and therefore the CLR can marshal it by value, which allows the code to work. Note that for String objects, the CLR performs a special optimization. When marshaling a String object across an AppDomain boundary, the CLR just passes the reference to the String object across the boundary; it does not make a copy of the String object. The CLR can offer this optimization because String objects are immutable; therefore, it is impossible for code in one AppDomain to corrupt a String object’s characters. For more about String immutability, see Chapter 14, “Chars, Strings, and Working with Text.”

Inside MethodArgAndReturn, I display the string passed into it to show that the string came across the AppDomain boundary, and then I create an instance of the NonMarshalableType type and return a reference to this object to the default AppDomain. Because NonMarshalableType is not derived from System.MarshalByRefObject and is also not marked with the [Serializable] custom attribute, MethodArgAndReturn is not allowed to marshal the object by reference or by value—the object cannot be marshaled across an AppDomain boundary at all! To report this, MethodArgAndReturn throws a SerializationException in the default AppDomain. Because my program doesn’t catch this exception, the program just dies.
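A host that does not want to die in this situation could catch the exception, along the lines of the sketch below. The `NotMarshalable` and `Gateway` types are hypothetical stand-ins for NonMarshalableType and MarshalByRefType, and the sketch assumes the .NET Framework.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical type: neither derived from MarshalByRefObject nor [Serializable].
public sealed class NotMarshalable { }

// Hypothetical marshal-by-reference type whose method returns a
// non-marshalable object.
public sealed class Gateway : MarshalByRefObject {
    public NotMarshalable Make(string callingDomainName) {
        // The String argument arrives intact in this AppDomain.
        Console.WriteLine("Called from '{0}'", callingDomainName);
        return new NotMarshalable();
    }
}

public static class NonMarshalableDemo {
    // Returns true if the cross-AppDomain return failed with SerializationException.
    public static bool ReturnFails() {
        AppDomain ad2 = AppDomain.CreateDomain("AD #2");
        try {
            Gateway gateway = (Gateway)ad2.CreateInstanceAndUnwrap(
                typeof(Gateway).Assembly.FullName, typeof(Gateway).FullName);
            try {
                gateway.Make(AppDomain.CurrentDomain.FriendlyName);
                return false;   // not expected
            }
            catch (SerializationException e) {
                Console.WriteLine("Could not marshal the result: {0}", e.Message);
                return true;
            }
        }
        finally {
            AppDomain.Unload(ad2);
        }
    }
}
```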

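💡 Summary: When the CLR COM server initializes, it creates an AppDomain. An AppDomain is a logical container for a set of assemblies. The first AppDomain created when the CLR initializes is called the default AppDomain, and it is destroyed only when the Windows process terminates. In addition to the default AppDomain, a host using either unmanaged COM interface methods or managed type methods can instruct the CLR to create additional AppDomains. AppDomains are designed to provide isolation. An object created by code in one AppDomain is “owned” by that AppDomain; in other words, the object cannot outlive the AppDomain whose code created it. Code in one AppDomain can access an object in another AppDomain only by using marshal-by-reference or marshal-by-value semantics. This enforces a clean separation and boundary, because code in one AppDomain cannot hold a direct reference to an object created by code in a different AppDomain. The CLR does not support unloading a specific assembly from an AppDomain. However, you can tell the CLR to unload an AppDomain, which unloads all of the assemblies it currently contains. When an AppDomain is created, a permission set is applied to it that determines the maximum rights granted to assemblies running in it. Because of these permissions, a host that loads some code can be sure that the code cannot corrupt (or read) important data structures used by the host itself. An AppDomain also has a set of configuration settings associated with it when it is created. These settings mainly affect how the CLR loads assemblies into the AppDomain.

Each AppDomain has its own loader heap, and each loader heap records which types have been accessed since the AppDomain was created. These type objects were discussed in Chapter 4; each type object in the loader heap has a method table, and each entry in the method table points to JIT-compiled native code if the method has executed at least once. If two AppDomains use a type from the same assembly, each AppDomain’s loader heap allocates its own type object for that type; the memory for type objects is not shared between the two AppDomains. Likewise, when code in an AppDomain calls a method defined by a type, the method’s IL code is JIT-compiled and the resulting native code is associated with each AppDomain separately rather than being shared by all AppDomains that call it. Not sharing the memory for type objects or native code is somewhat wasteful. However, the whole purpose of AppDomains is to provide isolation; the CLR must be able to unload an AppDomain and free all of its resources without affecting any other AppDomain. The MSCorLib.dll assembly is loaded automatically when the CLR initializes, and all AppDomains share the types in this assembly. To reduce resource usage, MSCorLib.dll is loaded in an AppDomain-neutral fashion. That is, the CLR maintains a special loader heap for assemblies that are loaded in an AppDomain-neutral fashion. All type objects in this loader heap, and all native code that the JIT compiler produces for methods defined by these types, are shared by all AppDomains in the process. Assemblies loaded in an AppDomain-neutral fashion can never be unloaded. The only way to reclaim their resources is to terminate the Windows process and let Windows reclaim them.

In Windows, a thread is always created in the context of a process, and the thread’s entire lifetime falls within that process’s lifetime. However, there is no one-to-one correspondence between threads and AppDomains. AppDomains are a CLR feature; Windows knows nothing about them. Because a Windows process can contain multiple AppDomains, a thread can execute code in one AppDomain and then execute code in another. From the CLR’s perspective, a thread executes code in only one AppDomain at a time. An AppDomain can be assigned a friendly name when it is created. The friendly name is a String that identifies the AppDomain, mainly for the benefit of debugging. Internally, the CreateDomain method creates an AppDomain in the process and gives it the specified friendly name, security, and configuration settings. When the CLR creates an AppDomain, it does not create any threads in it; no code runs in the AppDomain unless you explicitly have a thread call code in it.

The CLR does not allow a variable (root) in one AppDomain to refer to an object created in another AppDomain. There are two ways to communicate across AppDomains: marshal-by-reference and marshal-by-value. When CreateInstanceAndUnwrap sees that it is marshaling an object whose type is derived from MarshalByRefObject, the CLR marshals the object by reference across the AppDomain boundary. When a source AppDomain wants to send or return a reference to an object to a destination AppDomain, the CLR defines a proxy type in the destination AppDomain’s loader heap. The proxy type looks exactly like the original type; it has all of the same instance members (properties, events, and methods). The instance fields, however, are not part of the proxy type. The proxy type does define a few instance fields of its own, but these fields are not identical to those of the original type. Instead, they indicate which AppDomain “owns” the real object and how to find the real object in the owning AppDomain (internally, the proxy object uses a GCHandle instance to refer to the real object). After the proxy type is defined in the destination AppDomain, CreateInstanceAndUnwrap creates an instance of the proxy type, initializes its fields to identify the source AppDomain and the real object, and returns a reference to this proxy object to the destination AppDomain.

A type derived from MarshalByRefObject can define instance fields, but these fields are not part of the proxy type and are not contained inside a proxy object. When you write code that reads from or writes to an instance field of a type derived from MarshalByRefObject, the JIT compiler emits code that uses the proxy object (to find the real AppDomain and object) by calling System.Object’s FieldGetter method (for reads) or FieldSetter method (for writes). These methods are private and undocumented; essentially, they use reflection to get or set the field’s value. So although you can access fields of a type derived from MarshalByRefObject, the performance is poor because the CLR ends up calling methods to perform the field access. In fact, the performance is poor even if the object you are accessing is in your own AppDomain. From a usability standpoint, a type derived from MarshalByRefObject should really avoid defining any static members, because static members are always accessed in the context of the calling AppDomain. The information identifying which AppDomain to transition to is contained in the proxy object, but there is no proxy object when calling a static member, so no AppDomain transition can occur.

Because there are no roots in the second AppDomain, the original object referred to by the proxy could be garbage collected. This is not ideal. On the other hand, if the original object were held in memory indefinitely, the proxy could go away while the original object lived on; this is also not ideal. The CLR solves this problem by using a lease manager. When a proxy for an object is created, the CLR keeps the object alive for five minutes. If no calls are made through the proxy within five minutes, the object is deactivated and its memory is freed at the next garbage collection. After each call into the object, the lease manager renews the object’s lease so that it is guaranteed to remain in memory for another two minutes. If an application attempts to call into an object through a proxy after the object’s lease has expired, the CLR throws a System.Runtime.Remoting.RemotingException. The default lease times of five minutes and two minutes can be changed by overriding MarshalByRefObject’s virtual InitializeLifetimeService method.

An object whose type is marked with the [Serializable] custom attribute can be marshaled by value. When a source AppDomain wants to send or return a reference to an object to a destination AppDomain, the CLR serializes the object’s instance fields into a byte array. The byte array is copied from the source AppDomain to the destination AppDomain. Then, the CLR deserializes the byte array in the destination AppDomain, which forces the CLR to load the assembly that defines the type being deserialized into the destination AppDomain if it is not already loaded. Next, the CLR creates an instance of the type and uses the values in the byte array to initialize the object’s fields so that they match those of the source object. In other words, the CLR makes an exact duplicate of the source object in the destination AppDomain and then returns a reference to this copy; the object has been marshaled by value across the AppDomain boundary. From this point on, the object in the source AppDomain and the object in the destination AppDomain have separate lifetimes, and their states can change independently. Note that the CLR performs a special optimization for String objects. When marshaling a String object across an AppDomain boundary, the CLR just passes the reference to the String object across the boundary; it does not make a copy. The CLR can offer this optimization because String objects are immutable; code in one AppDomain cannot corrupt a String object’s fields.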

# AppDomain Unloading

One of the great features of AppDomains is that you can unload them. Unloading an AppDomain causes the CLR to unload all of the assemblies in the AppDomain, and the CLR frees the AppDomain’s loader heap as well. To unload an AppDomain, you call AppDomain’s Unload static method (as the Ch22-1-AppDomains application does). This call causes the CLR to perform a lot of actions to gracefully unload the specified AppDomain:

  1. The CLR suspends all threads in the process that have ever executed managed code.
  2. The CLR examines all of the threads’ stacks to see which threads are currently executing code in the AppDomain being unloaded, or which threads might return at some point to code in the AppDomain that is being unloaded. The CLR forces any threads that have the unloading AppDomain on their stack to throw a ThreadAbortException (resuming the thread’s execution). This causes the threads to unwind, executing any finally blocks on their way out so that cleanup code executes. If no code catches the ThreadAbortException, it will eventually become an unhandled exception that the CLR swallows; the thread dies, but the process is allowed to continue running. This is unusual, because for all other unhandled exceptions, the CLR kills the process.

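💡 Important: The CLR will not immediately abort a thread that is currently executing code in a finally block, a catch block, a class constructor, a critical execution region, or unmanaged code. If the CLR allowed this, cleanup code, error recovery code, type initialization code, critical code, or arbitrary code that the CLR knows nothing about would not be able to complete, resulting in unpredictable application behavior and possibly security holes. An aborting thread is allowed to finish executing these code blocks; then, at the end of the code block, the CLR forces the thread to throw a ThreadAbortException.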

  3. After all threads discovered in step 2 have left the AppDomain, the CLR then walks the heap and sets a flag in each proxy object that referred to an object created by the unloaded AppDomain. These proxy objects now know that the real object they referred to is gone. If any code now calls a method on an invalid proxy object, the method will throw an AppDomainUnloadedException.
  4. The CLR forces a garbage collection to occur, reclaiming the memory used by any objects that were created by the now unloaded AppDomain. The Finalize methods for these objects are called, giving the objects a chance to clean themselves up properly.
  5. The CLR resumes all of the remaining threads. The thread that called AppDomain.Unload will now continue running; calls to AppDomain.Unload occur synchronously.

My Ch22-1-AppDomains application uses just one thread to do all of the work. Whenever my code calls AppDomain.Unload, there are no threads in the unloading AppDomain, and therefore, the CLR doesn’t have to throw any ThreadAbortException exceptions. I’ll talk more about ThreadAbortException later in this chapter.

By the way, when a thread calls AppDomain.Unload, the CLR waits 10 seconds for the threads in the unloading AppDomain to leave it. If after 10 seconds, the thread that called AppDomain.Unload doesn’t return, it will throw a CannotUnloadAppDomainException, and the AppDomain may or may not be unloaded in the future.
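A caller that wants to survive a failed unload could wrap the call as in the sketch below. The `SafeUnloader` helper is a hypothetical name introduced for illustration; it only reports the failure and leaves any retry policy to the host.

```csharp
using System;

public static class SafeUnloader {
    // Attempts to unload the given AppDomain.
    // Returns true on success, false if the CLR reported that it could not unload it.
    public static bool TryUnload(AppDomain domain) {
        try {
            AppDomain.Unload(domain);   // synchronous: returns after the unload completes
            return true;
        }
        catch (CannotUnloadAppDomainException e) {
            // The AppDomain may or may not be unloaded at some later point.
            Console.WriteLine("Unload failed: {0}", e.Message);
            return false;
        }
    }
}
```

One easy way to see the failure path is to pass the default AppDomain, which the CLR refuses to unload: when called from the default AppDomain, `SafeUnloader.TryUnload(AppDomain.CurrentDomain)` returns false.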

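💡 Note: If the thread that calls AppDomain.Unload happens to be in the AppDomain that is being unloaded, the CLR creates another thread that attempts to unload the AppDomain. The first thread is forced to throw a ThreadAbortException and unwind. The new thread waits for the AppDomain to unload and then terminates. If the AppDomain fails to unload, the new thread throws a CannotUnloadAppDomainException; however, because you did not write the code that the new thread executes, you cannot catch this exception.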

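💡 Summary: One of the great features of AppDomains is that you can unload them. Unloading an AppDomain causes the CLR to unload all of the assemblies in the AppDomain and to free the AppDomain’s loader heap. To unload an AppDomain, you call AppDomain’s static Unload method. This causes the CLR to perform a series of actions to gracefully unload the specified AppDomain. The CLR suspends all threads in the process that have ever executed managed code. The CLR then examines all of the threads’ stacks to see which threads are currently executing code in the AppDomain being unloaded, or which threads might return at some point to code in that AppDomain. The CLR forces any thread that has the unloading AppDomain on its stack to throw a ThreadAbortException (resuming the thread’s execution). This causes the thread to unwind, executing any finally blocks on the way out so that cleanup code runs. If no code catches the ThreadAbortException, it eventually becomes an unhandled exception that the CLR swallows: the thread dies, but the process is allowed to continue running. This is unusual, because for all other unhandled exceptions, the CLR kills the process. After all of the threads discovered in step 2 have left the AppDomain, the CLR walks the heap and sets a flag in each proxy object that referred to an object created by the unloaded AppDomain. These proxy objects now know that the real object they referred to is gone, and any code that calls a method on an invalid proxy object gets an AppDomainUnloadedException. The CLR forces a garbage collection, reclaiming the memory used by any objects that were created by the now unloaded AppDomain. The Finalize methods of these objects are called, giving the objects a chance to clean themselves up properly. The CLR resumes all of the remaining threads. The thread that called AppDomain.Unload continues running; calls to AppDomain.Unload occur synchronously, so the calling thread resumes only after Unload returns. By the way, when a thread calls AppDomain.Unload, the CLR gives the threads in the unloading AppDomain 10 seconds to leave it. If the call to AppDomain.Unload has not returned after 10 seconds, the CLR throws a CannotUnloadAppDomainException, and the AppDomain may or may not be unloaded in the future.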

# AppDomain Monitoring

A host application can monitor the resources that an AppDomain consumes. Some hosts will use this information to decide when to forcibly unload an AppDomain should its memory or CPU consumption rise above what the host considers reasonable. Monitoring can also be used to compare the resource consumption of different algorithms to determine which uses fewer resources. Because AppDomain monitoring incurs additional overhead, hosts must explicitly turn the monitoring on by setting AppDomain’s static MonitoringIsEnabled property to true. This turns on monitoring for all AppDomains. After monitoring is turned on, it cannot be turned off; attempting to set the MonitoringIsEnabled property to false causes an ArgumentException to be thrown.

After monitoring is turned on, your code can query the following four read-only properties offered by the AppDomain class:

  • MonitoringSurvivedProcessMemorySize This static Int64 property returns the number of bytes that are currently in use by all AppDomains controlled by the current CLR instance. The number is accurate as of the last garbage collection.

  • MonitoringTotalAllocatedMemorySize This instance Int64 property returns the number of bytes that have been allocated by a specific AppDomain. The number is accurate as of the last garbage collection.

  • MonitoringSurvivedMemorySize This instance Int64 property returns the number of bytes that are currently in use by a specific AppDomain. The number is accurate as of the last garbage collection.

  • MonitoringTotalProcessorTime This instance TimeSpan property returns the amount of CPU usage incurred by a specific AppDomain.

The following class shows how to use three of these properties to see what has changed within an AppDomain between two points in time.

private sealed class AppDomainMonitorDelta : IDisposable {
    private AppDomain m_appDomain;
    private TimeSpan m_thisADCpu;
    private Int64 m_thisADMemoryInUse;
    private Int64 m_thisADMemoryAllocated;

    static AppDomainMonitorDelta() {
        // Make sure that AppDomain monitoring is turned on
        AppDomain.MonitoringIsEnabled = true;
    }

    public AppDomainMonitorDelta(AppDomain ad) {
        // Record the starting values; null means "the current AppDomain"
        m_appDomain = ad ?? AppDomain.CurrentDomain;
        m_thisADCpu = m_appDomain.MonitoringTotalProcessorTime;
        m_thisADMemoryInUse = m_appDomain.MonitoringSurvivedMemorySize;
        m_thisADMemoryAllocated = m_appDomain.MonitoringTotalAllocatedMemorySize;
    }

    public void Dispose() {
        // The memory figures are accurate as of the last GC, so force one first
        GC.Collect();
        Console.WriteLine("FriendlyName={0}, CPU={1}ms", m_appDomain.FriendlyName,
            (m_appDomain.MonitoringTotalProcessorTime - m_thisADCpu).TotalMilliseconds);
        Console.WriteLine(" Allocated {0:N0} bytes of which {1:N0} survived GCs",
            m_appDomain.MonitoringTotalAllocatedMemorySize - m_thisADMemoryAllocated,
            m_appDomain.MonitoringSurvivedMemorySize - m_thisADMemoryInUse);
    }
}

The following code shows how to use the AppDomainMonitorDelta class.

private static void AppDomainResourceMonitoring() {
    using (new AppDomainMonitorDelta(null)) {
        // Allocate about 10 million bytes that will survive collections
        var list = new List<Object>();
        for (Int32 x = 0; x < 1000; x++) list.Add(new Byte[10000]);

        // Allocate about 20 million bytes that will NOT survive collections
        for (Int32 x = 0; x < 2000; x++) new Byte[10000].GetType();

        // Spin the CPU for about 5 seconds
        Int64 stop = Environment.TickCount + 5000;
        while (Environment.TickCount < stop) ;
    }
}

When I execute this code, I get the following output.

FriendlyName=03-Ch22-1-AppDomains.exe, CPU=5031.25ms
 Allocated 30,159,496 bytes of which 10,085,080 survived GCs

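💡 Summary: A host application can monitor the resources that an AppDomain consumes. Some hosts use this information to decide whether an AppDomain’s memory or CPU consumption has risen above a reasonable level and, if so, forcibly unload the AppDomain. Monitoring can also be used to compare the resource consumption of different algorithms to determine which one uses fewer resources. Because AppDomain monitoring itself incurs overhead, a host must explicitly turn it on by setting AppDomain’s static MonitoringIsEnabled property to true. After monitoring is turned on, it cannot be turned off; setting the MonitoringIsEnabled property to false throws an ArgumentException.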

# AppDomain First-Chance Exception Notifications

Each AppDomain can have associated with it a series of callback methods that get invoked when the CLR begins looking for catch blocks within an AppDomain. These methods can perform logging, or a host can use this mechanism to monitor exceptions being thrown within an AppDomain. The callbacks cannot handle the exception or swallow it in any way; they are just receiving a notification that the exception has occurred. To register a callback method, just add a delegate to AppDomain’s instance FirstChanceException event.

Here is how the CLR processes an exception: when the exception is first thrown, the CLR invokes any FirstChanceException callback methods registered with the AppDomain that is throwing the exception. Then, the CLR looks for any catch blocks on the stack that are within the same AppDomain. If a catch block handles the exception, then processing of the exception is complete and execution continues as normal. If the AppDomain has no catch block to handle the exception, then the CLR walks up the stack to the calling AppDomain and throws the same exception object again (after serializing and deserializing it). At this point, it is as if a brand new exception is being thrown, and the CLR invokes any FirstChanceException callback methods registered with the now current AppDomain. This continues until the top of the thread’s stack is reached. At that point, if the exception is not handled by any code, the CLR terminates the whole process.
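Registering a callback can look roughly like the sketch below. The `FirstChanceDemo` class and its `Run` method are hypothetical names used only for illustration; the callback merely logs and counts the notification, because it cannot handle or swallow the exception.

```csharp
using System;
using System.Runtime.ExceptionServices;

public static class FirstChanceDemo {
    // Throws and catches one exception; returns how many first-chance
    // notifications were observed for it.
    public static int Run() {
        int notifications = 0;

        EventHandler<FirstChanceExceptionEventArgs> handler = (sender, e) => {
            // Invoked before the CLR searches this AppDomain for a catch block.
            // Keep this code simple: it should not throw exceptions of its own.
            if (e.Exception is InvalidOperationException) {
                notifications++;
                Console.WriteLine("First chance in '{0}': {1}",
                    AppDomain.CurrentDomain.FriendlyName, e.Exception.Message);
            }
        };

        AppDomain.CurrentDomain.FirstChanceException += handler;
        try {
            try {
                throw new InvalidOperationException("demo failure");
            }
            catch (InvalidOperationException) {
                // Handled here; the callback has already been notified.
            }
        }
        finally {
            AppDomain.CurrentDomain.FirstChanceException -= handler;
        }
        return notifications;
    }
}
```

The notification arrives even though the exception is caught immediately, which is what makes the event useful for logging.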

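💡 Summary: Each AppDomain can have associated with it a set of callback methods that are invoked when the CLR begins looking for catch blocks within the AppDomain. These methods can perform logging, or a host can use this mechanism to monitor exceptions thrown within an AppDomain. The callbacks cannot handle the exception or swallow it in any way (that is, pretend it never happened); they just receive a notification that the exception has occurred. To register a callback method, add a delegate to AppDomain’s instance FirstChanceException event. When an exception is first thrown, the CLR invokes any FirstChanceException callback methods registered with the AppDomain that is throwing the exception. Then, the CLR looks for any catch blocks on the stack that are within the same AppDomain. If a catch block handles the exception, processing of the exception is complete and execution continues as normal. If the AppDomain has no catch block to handle the exception, the CLR walks up the stack to the calling AppDomain and throws the same exception object again (after serializing and deserializing it). At this point, it is as if a brand new exception is being thrown, and the CLR invokes any FirstChanceException callback methods registered with the now current AppDomain. This continues until the top of the thread’s stack is reached. At that point, if the exception has not been handled by any code, the CLR terminates the whole process.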

# How Hosts Use AppDomains

So far, I’ve talked about hosts and how they load the CLR. I’ve also talked about how the hosts tell the CLR to create and unload AppDomains. To make the discussion more concrete, I’ll describe some common hosting and AppDomain scenarios. In particular, I’ll explain how different application types host the CLR and how they manage AppDomains.

# Executable Applications

Console UI applications, NT Service applications, Windows Forms applications, and Windows Presentation Foundation (WPF) applications are all examples of self-hosted applications that have managed EXE files. When Windows initializes a process by using a managed EXE file, Windows loads the shim, and the shim examines the CLR header information contained in the application’s assembly (the EXE file). The header information indicates the version of the CLR that was used to build and test the application. The shim uses this information to determine which version of the CLR to load into the process. After the CLR loads and initializes, it again examines the assembly’s CLR header to determine which method is the application’s entry point (Main). The CLR invokes this method, and the application is now up and running.

As the code runs, it accesses other types. When referencing a type contained in another assembly, the CLR locates the necessary assembly and loads it into the same AppDomain. Any additionally referenced assemblies also load into the same AppDomain. When the application’s Main method returns, the Windows process terminates (destroying the default AppDomain and all other AppDomains).

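💡 Note: By the way, you can shut down the Windows process (including all of its AppDomains) by calling System.Environment’s static Exit method. Exit is the most graceful way of terminating a process because it first calls the Finalize methods of all of the objects on the managed heap and then releases all of the unmanaged COM objects held by the CLR. Finally, Exit calls the Win32 ExitProcess function.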

It’s possible for the application to tell the CLR to create additional AppDomains in the process’s address space. In fact, this is what my Ch22-1-AppDomains application did.

# Microsoft Silverlight Rich Internet Applications

Microsoft’s Silverlight runtime technology uses a special CLR that is different from the normal desktop version of the .NET Framework. After the Silverlight runtime is installed, navigating to a website that uses Silverlight causes the Silverlight CLR (CoreClr.dll) to load in your browser (which may or may not be Windows Internet Explorer—you may not even be using a Windows machine). Each Silverlight control on the page runs in its own AppDomain. When the user closes a tab or navigates to another website, any Silverlight controls no longer in use have their AppDomains unloaded. The Silverlight code running in the AppDomain runs in a limited-security sandbox so that it cannot harm the user or the machine in any way.

# Microsoft ASP.NET and XML Web Services Applications

ASP.NET is implemented as an ISAPI DLL (implemented in ASPNet_ISAPI.dll). The first time a client requests a URL handled by the ASP.NET ISAPI DLL, ASP.NET loads the CLR. When a client makes a request of a web application, ASP.NET determines if this is the first time a request has been made. If it is, ASP.NET tells the CLR to create a new AppDomain for this web application; each web application is identified by its virtual root directory. ASP.NET then tells the CLR to load the assembly that contains the type exposed by the web application into this new AppDomain, creates an instance of this type, and starts calling methods in it to satisfy the client’s web request. If the code references more types, the CLR will load the required assemblies into the web application’s AppDomain.

When future clients make requests of an already running web application, ASP.NET doesn’t create a new AppDomain; instead, it uses the existing AppDomain, creates a new instance of the web application’s type, and starts calling methods. The methods will already be JIT-compiled into native code, so the performance of processing all subsequent client requests is excellent.

If a client makes a request of a different web application, ASP.NET tells the CLR to create a new AppDomain. This new AppDomain is typically created inside the same worker process as the other AppDomains. This means that many web applications run in a single Windows process, which improves the efficiency of the system overall. Again, the assemblies required by each web application are loaded into an AppDomain created for the sole purpose of isolating that web application’s code and objects from other web applications.

A fantastic feature of ASP.NET is that the code for a website can be changed on the fly without shutting down the web server. When a website’s file is changed on the hard disk, ASP.NET detects this, unloads the AppDomain that contains the old version of the files (when the last currently running request finishes), and then creates a new AppDomain, loading into it the new versions of the files. To make this happen, ASP.NET uses an AppDomain feature called shadow copying.
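For your own hosts, shadow copying can be requested when an AppDomain is created. The sketch below is not ASP.NET's actual code; the `ShadowCopyDemo` helper is a hypothetical name, and it relies on the .NET Framework's AppDomainSetup.ShadowCopyFiles property, which takes the string "true" rather than a Boolean.

```csharp
using System;

public static class ShadowCopyDemo {
    // Builds the settings for an AppDomain whose assemblies are copied to a
    // cache directory before loading, so the originals stay replaceable on disk.
    public static AppDomainSetup CreateSetup(string appBase) {
        return new AppDomainSetup {
            ApplicationBase = appBase,
            ShadowCopyFiles = "true"    // note: a String, not a Boolean
        };
    }

    // Creates an AppDomain that uses shadow copying for assemblies under appBase.
    public static AppDomain CreateShadowCopiedDomain(string friendlyName, string appBase) {
        return AppDomain.CreateDomain(friendlyName, null, CreateSetup(appBase));
    }
}
```

A host would then load the add-in or website assemblies into the returned AppDomain and unload that AppDomain when the files on disk change.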

# Microsoft SQL Server

Microsoft SQL Server is an unmanaged application because most of its code is still written in unmanaged C++. SQL Server allows developers to create stored procedures by using managed code. The first time a request comes in to the database to run a stored procedure written in managed code, SQL Server loads the CLR. Stored procedures run in their own secured AppDomain, prohibiting the stored procedures from adversely affecting the database server.

This functionality is absolutely incredible! It means that developers will be able to write stored procedures in the programming language of their choice. The stored procedure can use strongly typed data objects in its code. The code will also be JIT-compiled into native code when executed instead of being interpreted. And developers can take advantage of any types defined in the Framework Class Library (FCL) or in any other assembly. The result is that our job becomes much easier and our applications perform much better. What more could a developer ask for?!

# Your Own Imagination

Productivity applications such as word processors and spreadsheets also allow users to write macros in any programming language they choose. These macros will have access to all of the assemblies and types that work with the CLR. They will be compiled, so they will execute fast, and, most important, these macros will run in a secure AppDomain so that users don’t get hit with any unwanted surprises. Your own applications can use this ability, too, in any way you want.

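💡 Summary: Console UI applications, NT Service applications, Windows Forms applications, and Windows Presentation Foundation (WPF) applications are all self-hosted applications (that is, they host the CLR themselves), and they all have managed EXE files. When Windows initializes a process by using a managed EXE file, Windows loads the shim. The shim examines the CLR header information contained in the application’s assembly (the EXE file). The header information indicates the version of the CLR that was used to build and test the application. The shim uses this information to determine which version of the CLR to load into the process. After the CLR loads and initializes, it examines the assembly’s CLR header again to determine which method is the application’s entry point (Main). The CLR invokes this method, and the application is now up and running. As the code runs, it accesses other types. When code references a type contained in another assembly, the CLR locates the necessary assembly and loads it into the same AppDomain. When the application’s Main method returns, the Windows process terminates (destroying the default AppDomain and all other AppDomains).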

# Advanced Host Control

In this section, I’ll mention some more advanced topics related to hosting the CLR. My intent is to give you a taste of what is possible, and this will help you to understand more of what the CLR is capable of. I encourage you to seek out other texts if you find this information particularly interesting.

# Managing the CLR by Using Managed Code

The System.AppDomainManager class allows a host to override CLR default behavior by using managed code instead of using unmanaged code. Of course, using managed code makes implementing a host easier. All you need to do is define your class and derive it from the System.AppDomainManager class, overriding any virtual methods where you want to take over control. Your class should then be built into its very own assembly and installed into the global assembly cache (GAC) because the assembly needs to be granted full-trust, and all assemblies in the GAC are always granted full-trust.

Then, you need to tell the CLR to use your AppDomainManager-derived class. In code, the best way to do this is to create an AppDomainSetup object initializing its AppDomainManagerAssembly and AppDomainManagerType properties, both of which are of type String. Set the AppDomainManagerAssembly property to the string identifying the strong-name identity of the assembly that defines your AppDomainManager-derived class, and then set the AppDomainManagerType property to the full name of your AppDomainManager-derived class. Alternatively, AppDomainManager can be set in your application’s XML configuration file by using the appDomainManagerAssembly and appDomainManagerType elements. In addition, a native host could query for the ICLRControl interface and call this interface’s SetAppDomainManagerType function, passing in the identity of the GAC-installed assembly and the name of the AppDomainManager-derived class.
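The in-code approach can be sketched as follows. The `ManagerSetup` helper is a hypothetical name, and the assembly identity and type name passed to it are whatever your own AppDomainManager-derived class uses.

```csharp
using System;

public static class ManagerSetup {
    // Builds an AppDomainSetup that names the AppDomainManager-derived class
    // the CLR should use for AppDomains created with these settings.
    public static AppDomainSetup Build(string managerAssemblyIdentity,
                                       string managerTypeFullName) {
        AppDomainSetup setup = new AppDomainSetup();
        // Strong-name identity of the assembly defining the manager class
        setup.AppDomainManagerAssembly = managerAssemblyIdentity;
        // Full name (namespace included) of the manager class
        setup.AppDomainManagerType = managerTypeFullName;
        return setup;
    }
}
```

The resulting object would then be passed as the AppDomainSetup argument of AppDomain.CreateDomain.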

Now, let’s talk about what an AppDomainManager-derived class can do. The purpose of the AppDomainManager-derived class is to allow a host to maintain control even when an add-in tries to create AppDomains of its own. When code in the process tries to create a new AppDomain, the AppDomainManager-derived object in that AppDomain can modify security and configuration settings.

It can also decide to fail an AppDomain creation, or it can decide to return a reference to an existing AppDomain instead. When a new AppDomain is created, the CLR creates a new AppDomainManager-derived object in the AppDomain. This object can also modify configuration settings, how execution context is flowed between threads, and permissions granted to an assembly.
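The behaviors just described might look roughly like the following. This is a sketch for the .NET Framework, not a definitive host. The class name HostDomainManager, the "Blocked" and "Shared" naming rules, and the choice of DisallowCodeDownload as the setting to tighten are assumptions made purely for illustration.

```csharp
using System;
using System.Security.Policy;

// Hypothetical manager class; a real one would be built into its own
// assembly and installed into the GAC.
public sealed class HostDomainManager : AppDomainManager {
    // Called when code in this AppDomain asks for a new AppDomain.
    public override AppDomain CreateDomain(String friendlyName,
        Evidence securityInfo, AppDomainSetup appDomainInfo) {

        // Fail the creation outright (illustrative naming rule).
        if (friendlyName != null &&
            friendlyName.StartsWith("Blocked", StringComparison.Ordinal))
            throw new InvalidOperationException(
                "The host does not allow this AppDomain to be created.");

        // Return a reference to an existing AppDomain instead of a new one.
        if (friendlyName == "Shared")
            return AppDomain.CurrentDomain;

        // Modify configuration settings before the AppDomain is created.
        if (appDomainInfo == null) appDomainInfo = new AppDomainSetup();
        appDomainInfo.DisallowCodeDownload = true;

        // Let the CLR perform the actual creation.
        return CreateDomainHelper(friendlyName, securityInfo, appDomainInfo);
    }

    // Called by the CLR on the manager object created in each new AppDomain.
    public override void InitializeNewDomain(AppDomainSetup appDomainInfo) {
        // A host could adjust per-AppDomain settings here.
    }
}
```

The protected static CreateDomainHelper method is how an override hands the request back to the CLR after adjusting it.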

# Writing a Robust Host Application

A host can tell the CLR what actions to take when a failure occurs in managed code. Here are some examples (listed from least severe to most severe):

  • The CLR can abort a thread if the thread is taking too long to execute and return a response. (I’ll discuss this more in the next section.)

  • The CLR can unload an AppDomain. This aborts all of the threads that are in the AppDomain and causes the problematic code to be unloaded.

  • The CLR can be disabled. This stops any more managed code from executing in the process, but unmanaged code is still allowed to run.

  • The CLR can exit the Windows process. This aborts all of the threads and unloads all of the AppDomains first so that cleanup operations occur, and then the process terminates.

The CLR can abort a thread or AppDomain gracefully or rudely. A graceful abort means that cleanup code executes. In other words, code in finally blocks runs, and objects have their Finalize methods executed. A rude abort means that cleanup code is not guaranteed to execute. In other words, code in finally blocks may not run, and objects may not have their Finalize methods executed. A graceful abort cannot abort a thread that is in a catch or finally block. However, a rude abort will abort a thread that is in a catch or finally block. Unfortunately, a thread that is in unmanaged code or in a constrained execution region (CER) cannot be aborted at all.

A host can set what is called an escalation policy, which tells the CLR how to deal with managed code failures. For example, SQL Server tells the CLR what to do should an unhandled exception be thrown while the CLR is executing managed code. When a thread experiences an unhandled exception, the CLR first attempts to upgrade the exception to a graceful thread abort. If the thread does not abort in a specified time period, the CLR attempts to upgrade the graceful thread abort to a rude thread abort.

What I just described is what usually happens. However, if the thread experiencing the unhandled exception is in a critical region, the policy is different. A thread that is in a critical region is a thread that has entered a thread synchronization lock that must be released by the same thread, for example, a thread that has called Monitor.Enter, Mutex’s WaitOne, or one of ReaderWriterLock’s AcquireReaderLock or AcquireWriterLock methods. Successfully waiting for an AutoResetEvent, ManualResetEvent, or Semaphore doesn’t cause the thread to be in a critical region because another thread can signal these synchronization objects. When a thread is in a critical region, the CLR believes that the thread is accessing data that is shared by multiple threads in the same AppDomain. After all, this is probably why the thread took the lock. If the thread is accessing shared data, just terminating the thread isn’t good enough, because other threads may then try to access the shared data that is now corrupt, causing the AppDomain to run unpredictably or with possible security vulnerabilities.

So, when a thread in a critical region experiences an unhandled exception, the CLR first attempts to upgrade the exception to a graceful AppDomain unload in an effort to get rid of all of the threads and data objects that are currently in use. If the AppDomain doesn’t unload in a specified amount of time, the CLR upgrades the graceful AppDomain unload to a rude AppDomain unload.

# How a Host Gets Its Thread Back

Normally, a host application wants to stay in control of its threads. Let’s take a database server as an example. When a request comes into the database server, a thread picks up the request and then dispatches the request to another thread that is to perform the actual work. This other thread may need to execute code that wasn’t created and tested by the team that produced the database server. For example, imagine a request coming into the database server to execute a stored procedure written in managed code by the company running the server. It’s great that the database server can run the stored procedure code in its own AppDomain, which is locked down with security. This prevents the stored procedure from accessing any objects outside of its own AppDomain, and it also prevents the code from accessing resources that it is not allowed to access, such as disk files or the clipboard.

But what if the code in the stored procedure enters an infinite loop? In this case, the database server has dispatched one of its threads into the stored procedure code, and this thread is never coming back. This puts the server in a precarious position; the future behavior of the server is unknown. For example, the performance might be terrible now because a thread is in an infinite loop. Should the server create more threads? Doing so uses more resources (such as stack space), and these threads could also enter an infinite loop themselves.

To solve these problems, the host can take advantage of thread aborting. Figure 22-3 shows the typical architecture of a host application trying to solve the runaway thread problem. Here’s how it works (the numbers correspond to the circled numbers in the figure):

  1. A client sends a request to the server.
  2. A server thread picks up this request and dispatches it to a thread pool thread to perform the actual work.
  3. A thread pool thread picks up the client request and executes trusted code written by the company that built and tested the host application.
  4. This trusted code then enters a try block, and from within the try block, calls across an AppDomain boundary (via a type derived from MarshalByRefObject). This AppDomain contains the untrusted code (perhaps a stored procedure) that was not built and tested by the company that produced the host application. At this point, the server has given control of its thread to some untrusted code; the server is feeling nervous right now.
  5. When the host originally received the client’s request, it recorded the time. If the untrusted code doesn’t respond to the client in some administrator-set amount of time, the host calls Thread’s Abort method, asking the CLR to stop the thread pool thread, forcing it to throw a ThreadAbortException.
  6. At this point, the thread pool thread starts unwinding, calling finally blocks so that cleanup code executes. Eventually, the thread pool thread crosses back over the AppDomain boundary. Because the host’s stub code called the untrusted code from inside a try block, the host’s stub code has a catch block that catches the ThreadAbortException.
  7. In response to catching the ThreadAbortException, the host calls Thread’s ResetAbort method. I’ll explain the purpose of this call shortly.
  8. Now that the host’s code has caught the ThreadAbortException, the host can return some sort of failure back to the client and allow the thread pool thread to return to the pool so that it can be used for a future client request.

Figure 22-3 (image not reproduced): the host architecture described in steps 1 through 8 above.

Let me now clear up a few loose ends about this architecture. First, Thread’s Abort method is asynchronous. When Abort is called, it sets the target thread’s AbortRequested flag and returns immediately. When the runtime detects that a thread is to be aborted, the runtime tries to get the thread to a safe place. A thread is in a safe place when the runtime feels that it can stop what the thread is doing without causing disastrous effects. A thread is in a safe place if it is performing a managed blocking operation such as sleeping or waiting. A thread can be corralled to a safe place by using hijacking (described in Chapter 21). A thread is not in a safe place if it is executing a type’s class constructor, code in a catch or finally block, code in a CER, or unmanaged code.

After the thread reaches a safe place, the runtime will detect that the AbortRequested flag is set for the thread. This causes the thread to throw a ThreadAbortException. If this exception is not caught, the exception will be unhandled, all pending finally blocks will execute, and the thread will kill itself gracefully. Unlike all other exceptions, an unhandled ThreadAbortException does not cause the application to terminate. The runtime silently eats this exception and the thread dies, but the application and all of its remaining threads continue to run just fine.

In my example, the host catches the ThreadAbortException, allowing the host to regain control of the thread and return it to the pool. But there is a problem: What is to stop the untrusted code from catching the ThreadAbortException itself to keep control of the thread? The answer is that the CLR treats the ThreadAbortException in a very special manner. Even when code catches the ThreadAbortException, the CLR doesn’t allow the exception to be swallowed. In other words, at the end of the catch block, the CLR automatically rethrows the ThreadAbortException exception.

This CLR feature raises another question: If the CLR rethrows the ThreadAbortException at the end of a catch block, how can the host catch it to regain control of the thread? Inside the host’s catch block, there is a call to Thread’s ResetAbort method. Calling this method tells the CLR to stop rethrowing the ThreadAbortException at the end of each catch block.

This raises yet another question: What’s to stop the untrusted code from catching the ThreadAbortException and calling Thread’s ResetAbort method itself to keep control of the thread? The answer is that Thread’s ResetAbort method requires the caller to have the SecurityPermission with the ControlThread flag set to true. When the host creates the AppDomain for the untrusted code, the host will not grant this permission, and now, the untrusted code cannot keep control of the host’s thread.

I should point out that there is still a potential hole in this story: while the thread is unwinding from its ThreadAbortException, the untrusted code can execute catch and finally blocks. Inside these blocks, the untrusted code could enter an infinite loop, preventing the host from regaining control of its thread. A host application fixes this problem by setting an escalation policy (discussed earlier). If an aborting thread doesn’t finish in a reasonable amount of time, the CLR can upgrade the thread abort to a rude thread abort, a rude AppDomain unload, disabling of the CLR, or killing of the process. I should also note that the untrusted code could catch the ThreadAbortException and, inside the catch block, throw some other kind of exception. If this other exception is caught, at the end of the catch block, the CLR automatically rethrows the ThreadAbortException.

It should be noted, though, that most untrusted code is not actually intended to be malicious; it simply takes too long to execute by the host’s standards. Usually, catch and finally blocks contain very little code, and this code usually executes quickly without any infinite loops or long-running tasks. And so it is very unlikely that the escalation policy will have to go into effect for the host to regain control of its thread.

By the way, the Thread class actually offers two Abort methods: one takes no parameters, and the other takes an Object parameter allowing you to pass anything. When code catches the ThreadAbortException, it can query its read-only ExceptionState property. This property returns the object that was passed to Abort. This allows the thread calling Abort to specify some additional information that can be examined by code catching the ThreadAbortException. The host can use this to let its own handling code know why it is aborting threads.
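The abort-and-recover sequence can be sketched in a few lines. This is a simplified illustration, not the architecture from the figure: it uses a dedicated thread rather than a thread pool thread, it omits the AppDomain boundary and the security lockdown, and the names AbortDemo, UntrustedWork, and HostStub are hypothetical. It assumes the .NET Framework; on .NET Core and later, Thread.Abort throws PlatformNotSupportedException.

```csharp
using System;
using System.Threading;

public static class AbortDemo {
    // Stand-in for untrusted code that never returns.
    private static void UntrustedWork() {
        while (true) Thread.Sleep(10);
    }

    // Host stub: calls the untrusted code from inside a try block and
    // regains control of the thread if the call is aborted.
    private static void HostStub(Object state) {
        String[] result = (String[]) state;
        try {
            UntrustedWork();
            result[0] = "completed";
        }
        catch (ThreadAbortException e) {
            // ExceptionState returns the object that was passed to Abort.
            result[0] = "aborted: " + e.ExceptionState;
            // Without this call, the CLR rethrows at the end of the catch block.
            Thread.ResetAbort();
        }
        // Reached only because ResetAbort canceled the pending abort.
        result[1] = "thread returned to host";
    }

    // Runs the stub and aborts it if it exceeds the timeout.
    public static String[] Run(TimeSpan timeout) {
        String[] result = new String[2];
        Thread worker = new Thread(HostStub);
        worker.Start(result);
        if (!worker.Join(timeout)) {
            worker.Abort("request timed out"); // asynchronous: returns immediately
            worker.Join();                     // wait for the unwind to finish
        }
        return result;
    }

    public static void Main() {
        foreach (String line in Run(TimeSpan.FromMilliseconds(200)))
            Console.WriteLine(line);
    }
}
```

Because the worker spends its time in Thread.Sleep, a managed blocking operation, it is in a safe place when the abort is requested. In a real host, the untrusted code would run in an AppDomain that has not been granted SecurityPermission with the ControlThread flag, so only the host’s stub could call ResetAbort.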

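🍓:) Thread’s Abort method is asynchronous. When Abort is called, it sets the target thread’s AbortRequested flag and returns immediately. When the runtime detects that a thread is to be aborted, it tries to get the thread to a safe place, that is, a point where the runtime feels it can stop what the thread is doing without causing disastrous effects. A thread performing a managed blocking operation (such as sleeping or waiting) is in a safe place. A thread executing a type’s class constructor, code in a catch or finally block, code in a CER, or unmanaged code is not in a safe place.

🍓:) After the thread reaches a safe place, the runtime detects that the thread’s AbortRequested flag is set, and the thread throws a ThreadAbortException. If the exception is not caught, it becomes unhandled, all pending finally blocks execute, and the thread dies gracefully. Unlike all other exceptions, an unhandled ThreadAbortException does not terminate the application. The runtime silently eats the exception and the thread dies, but the application and all of its remaining threads continue to run.

🍓:) One problem: what stops the untrusted code from catching the ThreadAbortException itself to keep control of the thread? The CLR treats ThreadAbortException in a very special manner. Even when code catches it, the CLR doesn’t allow the exception to be swallowed; at the end of the catch block, the CLR automatically rethrows the ThreadAbortException.

🍓:) This raises another question: if the CLR rethrows the ThreadAbortException at the end of a catch block, how can the host catch it and regain control of the thread? The host’s catch block calls Thread’s ResetAbort method, which tells the CLR to stop rethrowing the ThreadAbortException at the end of each catch block.

🍓:) This raises yet another question: what stops the untrusted code from catching the ThreadAbortException and calling Thread’s ResetAbort method itself? ResetAbort requires the caller to have been granted SecurityPermission with the ControlThread flag set to true. When the host creates the AppDomain for the untrusted code, it does not grant this permission, so the untrusted code cannot keep control of the host’s thread.

🍓:) A potential hole remains: while the thread is unwinding from its ThreadAbortException, the untrusted code can execute catch and finally blocks, and inside these blocks it could enter an infinite loop, preventing the host from regaining control of its thread. A host fixes this by setting an escalation policy (discussed earlier). If an aborting thread doesn’t finish in a reasonable amount of time, the CLR can upgrade the thread abort to a rude thread abort, a rude AppDomain unload, disabling of the CLR, or killing of the process. Note also that the untrusted code could catch the ThreadAbortException and throw some other kind of exception inside the catch block. If this other exception is caught, the CLR automatically rethrows the ThreadAbortException at the end of that catch block.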

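💡 Summary: The System.AppDomainManager class allows a host to override CLR default behavior by using managed code instead of unmanaged code, which makes implementing a host easier. All you need to do is define a class derived from System.AppDomainManager and override any virtual methods where you want to take over control. Then build the class into its own assembly and install that assembly into the GAC, because the assembly needs to be granted full trust and all assemblies in the GAC are always granted full trust. The purpose of the AppDomainManager-derived class is to let a host maintain control even when an add-in tries to create AppDomains of its own. When code in the process tries to create a new AppDomain, the AppDomainManager-derived object in that AppDomain can modify security and configuration settings. It can also fail the AppDomain creation or return a reference to an existing AppDomain. After a new AppDomain is created, the CLR creates a new AppDomainManager-derived object in it; this object can also modify configuration settings, how execution context flows between threads, and the permissions granted to an assembly. A host can tell the CLR what actions to take when a failure occurs in managed code. The CLR can abort a thread that is taking too long to execute and return a response. The CLR can unload an AppDomain, which aborts all of the threads in that AppDomain and causes the problematic code to be unloaded. The CLR can be disabled, which stops any more managed code from executing in the process while still allowing unmanaged code to run. The CLR can exit the Windows process; this first aborts all of the threads and unloads all of the AppDomains so that cleanup operations occur, and then the process terminates. The CLR can abort a thread or AppDomain gracefully or rudely. Graceful means that cleanup code executes: code in finally blocks runs, and objects have their Finalize methods executed. Rude means that cleanup code is not guaranteed to execute: code in finally blocks may not run, and objects may not have their Finalize methods executed. A graceful abort cannot abort a thread that is in a catch or finally block, whereas a rude abort can. Unfortunately, a thread that is in unmanaged code or in a constrained execution region (CER) cannot be aborted at all. A host can set what is called an escalation policy, which tells the CLR how to deal with managed code failures. When a thread is in a critical region, the CLR believes that the thread is accessing data shared by multiple threads in the same AppDomain; after all, this is probably why the thread took the lock. Just terminating a thread that is accessing shared data isn’t good enough, because other threads may then access data that is now corrupt, causing the AppDomain to run unpredictably or with possible security vulnerabilities. So, when a thread in a critical region experiences an unhandled exception, the CLR first attempts to upgrade the exception to a graceful AppDomain unload in an effort to get rid of all of the threads and data objects currently in use in that AppDomain. If the AppDomain doesn’t unload in a specified amount of time, the CLR upgrades the graceful AppDomain unload to a rude AppDomain unload.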