.NET CLR Overview

The Microsoft .NET Framework is a software framework that can be installed on computers running Microsoft Windows operating systems. It includes a large library of coded solutions to common programming problems and a virtual machine that manages the execution of programs written specifically for the framework. The .NET Framework is a Microsoft offering and is intended to be used by most new applications created for the Windows platform.

The framework’s Base Class Library provides a large range of features including user interface, data access, database connectivity, cryptography, web application development, numeric algorithms, and network communications. The class library is used by programmers, who combine it with their own code to produce applications.

Programs written for the .NET Framework execute in a software environment that manages the program’s runtime requirements. Also part of the .NET Framework, this runtime environment is known as the Common Language Runtime (CLR). The CLR provides the appearance of an application virtual machine so that programmers need not consider the capabilities of the specific CPU that will execute the program. The CLR also provides other important services such as security, memory management, and exception handling. The class library and the CLR together constitute the .NET Framework.

The Common Language Runtime (CLR) is the foundation of the .NET Framework. The CLR acts as an agent that manages code at execution time, providing core services such as memory management, thread management, and remoting, while also enforcing strict type safety and other forms of code accuracy that promote security and robustness. The concept of code management is a fundamental principle of the CLR. Code that targets the CLR is known as managed code, while code that does not target the CLR is known as unmanaged code.

Developers using the CLR write code in a language such as C# or VB.NET. At compile time, a .NET compiler converts such code into Common Intermediate Language (CIL) code. At runtime, the CLR’s just-in-time (JIT) compiler converts the CIL code into native code for the CPU on which the program is running. Alternatively, the CIL code can be compiled to native code in a separate step prior to runtime by using the Native Image Generator (NGEN). This can speed up later runs of the software, chiefly at startup, because the CIL-to-native compilation no longer has to happen at runtime.
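The compile-once behavior of a JIT compiler can be illustrated with a toy model. The sketch below is not how the CLR is implemented: the `ToyRuntime` class, its tiny stack-based instruction set (`ldarg`, `ldc`, `add`, `mul`, `ret`), and the use of a Python lambda as a stand-in for native code are all assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

Instruction = Tuple[str, int]  # (opcode, operand); add, mul and ret ignore the operand


def compile_method(il: List[Instruction], arg_count: int) -> Callable[..., int]:
    """Stand-in for the JIT: translate toy IL into a host-language function."""
    stack: List[str] = []  # symbolic evaluation stack holding expression text
    for op, operand in il:
        if op == "ldarg":
            stack.append(f"a{operand}")
        elif op == "ldc":
            stack.append(repr(operand))
        elif op in ("add", "mul"):
            right, left = stack.pop(), stack.pop()
            symbol = "+" if op == "add" else "*"
            stack.append(f"({left} {symbol} {right})")
        elif op == "ret":
            params = ", ".join(f"a{i}" for i in range(arg_count))
            return eval(f"lambda {params}: {stack.pop()}")
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise ValueError("method ended without ret")


class ToyRuntime:
    """Compiles each method the first time it is called, then reuses the result."""

    def __init__(self, methods: Dict[str, Tuple[int, List[Instruction]]]) -> None:
        self._il = methods                       # name -> (arg_count, instructions)
        self._native: Dict[str, Callable[..., int]] = {}
        self.compile_count = 0

    def call(self, name: str, *args: int) -> int:
        if name not in self._native:             # first call: compile and cache
            arg_count, il = self._il[name]
            self._native[name] = compile_method(il, arg_count)
            self.compile_count += 1
        return self._native[name](*args)         # later calls skip compilation


runtime = ToyRuntime({
    "scaled_sum": (2, [("ldarg", 0), ("ldarg", 1), ("add", 0),
                       ("ldc", 3), ("mul", 0), ("ret", 0)]),
})
print(runtime.call("scaled_sum", 1, 2), runtime.compile_count)
```

Calling `scaled_sum` a second time reuses the cached function, so `compile_count` stays at 1. In the same spirit, NGEN can be thought of as filling that cache before the program ever runs.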

  • The .NET Framework provides a run-time environment called the common language runtime, which runs the code and provides services that make the development process easier.
  • The CLR is the core runtime engine in the Microsoft .NET Framework for executing applications.
  • The common language runtime supplies managed code with services such as cross-language integration, code access security, object lifetime management, resource management, type safety, pre-emptive threading, metadata services (type reflection), and debugging and profiling support.
  • The CLR is a multi-language execution environment. There are currently over 15 compilers being built by Microsoft and other companies that produce code that will execute in the CLR.
  • It is Microsoft’s implementation of the Common Language Infrastructure (CLI) standard, which defines an execution environment for program code.
  • In the CLR, code is expressed in a form of bytecode called the Common Intermediate Language (CIL, previously known as MSIL—Microsoft Intermediate Language).
  • You can create source code files using any programming language that supports the CLR. Then, you
    use the corresponding compiler to check syntax and analyze the source code. Regardless of which compiler you use, the
    result is a managed module. A managed module is a standard Windows portable executable (PE) file that requires the
    CLR to execute.

A Managed Module is composed of the following parts:

PE header : This is the standard Windows PE file header, which is similar to the Common Object File Format (COFF) header. The PE header indicates the type of file (GUI, CUI, or DLL) and also has a timestamp indicating when the file was built. For modules that contain only IL code (see below, Intermediate Language Code), the bulk of the information in the PE header is ignored. For modules that contain native CPU code, this header contains information about the native CPU code.
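As a rough illustration of where the file type and timestamp live, the following Python sketch reads them from the raw header bytes. The field offsets follow the published PE/COFF layout. The function names and the `build_sample_header` helper (which fabricates just enough bytes to exercise the parser) are hypothetical and exist only for this example.

```python
import struct
from datetime import datetime, timezone

IMAGE_FILE_DLL = 0x2000                  # Characteristics bit: the file is a DLL
SUBSYSTEM_NAMES = {2: "GUI", 3: "CUI"}   # Windows GUI / console subsystem values


def describe_pe_header(image: bytes) -> dict:
    """Return the machine type, file kind, and build timestamp of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)    # e_lfanew field
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, _sections, timestamp, _symtab, _nsyms,
     _opt_size, characteristics) = struct.unpack_from("<HHIIIHH", image, pe_offset + 4)
    # Subsystem is at offset 68 of the optional header (PE32 and PE32+ alike).
    (subsystem,) = struct.unpack_from("<H", image, pe_offset + 24 + 68)
    if characteristics & IMAGE_FILE_DLL:
        kind = "DLL"
    else:
        kind = SUBSYSTEM_NAMES.get(subsystem, "other")
    return {
        "machine": machine,
        "kind": kind,
        "built": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }


def build_sample_header(characteristics: int, subsystem: int, timestamp: int) -> bytes:
    """Hypothetical helper: fabricate a minimal header, not a loadable file."""
    pe_offset = 0x80
    image = bytearray(pe_offset + 24 + 70)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[pe_offset:pe_offset + 4] = b"PE\0\0"
    struct.pack_into("<HHIIIHH", image, pe_offset + 4,
                     0x014C, 0, timestamp, 0, 0, 0xE0, characteristics)
    struct.pack_into("<H", image, pe_offset + 24 + 68, subsystem)
    return bytes(image)


print(describe_pe_header(build_sample_header(0x0102, 3, 86400)))
```

To inspect a real file, pass `open(path, "rb").read()` instead of the fabricated bytes. Note that some modern toolchains store a hash rather than a build time in the timestamp field, so treat the decoded date with caution.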

CLR header : This header contains the information (interpreted by the CLR and utilities) that makes this a managed module. It includes the version of the CLR required, some flags, the MethodDef metadata token of the managed module’s entry point method (Main method), and the location and size of the module’s metadata, resources, strong name, and other, less significant items.
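The fields listed above map onto a fixed 72-byte structure (named `IMAGE_COR20_HEADER` in the Windows SDK headers). The sketch below decodes the first 40 bytes of such a structure in Python. The `ClrHeader` class, the `parse_clr_header` function, and the sample values are hypothetical, and locating the header inside a real file (through the PE data directories) is left out.

```python
import struct
from typing import NamedTuple

COMIMAGE_FLAGS_ILONLY = 0x00000001   # module contains only IL code
METHODDEF_TABLE = 0x06               # high byte of a MethodDef metadata token
CLR_HEADER_SIZE = 72

# Leading fields; four further 8-byte directory entries follow and are skipped here.
_FIELDS = struct.Struct("<IHHIIIIIIII")


class ClrHeader(NamedTuple):
    size: int
    major_runtime_version: int
    minor_runtime_version: int
    metadata_rva: int
    metadata_size: int
    flags: int
    entry_point_token: int
    resources_rva: int
    resources_size: int
    strong_name_rva: int
    strong_name_size: int

    @property
    def il_only(self) -> bool:
        return bool(self.flags & COMIMAGE_FLAGS_ILONLY)

    @property
    def entry_point_is_methoddef(self) -> bool:
        return self.entry_point_token >> 24 == METHODDEF_TABLE


def parse_clr_header(raw: bytes) -> ClrHeader:
    """Decode the leading fields of a CLR header from its raw bytes."""
    if len(raw) < CLR_HEADER_SIZE:
        raise ValueError("CLR header is truncated")
    header = ClrHeader(*_FIELDS.unpack_from(raw, 0))
    if header.size != CLR_HEADER_SIZE:
        raise ValueError(f"unexpected header size: {header.size}")
    return header


# Made-up sample: an IL-only module whose entry point is MethodDef row 1.
sample = _FIELDS.pack(72, 2, 5, 0x2068, 0x0400, COMIMAGE_FLAGS_ILONLY,
                      0x06000001, 0, 0, 0, 0) + bytes(32)
header = parse_clr_header(sample)
print(header.il_only, header.entry_point_is_methoddef, hex(header.metadata_rva))
```

A metadata token packs a table number in its top byte and a row number in the lower three bytes, which is why the entry point check only looks at the top byte.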

Metadata : Every managed module contains metadata tables, of which there are two main types: those that describe the types and members defined in your source code, and those that describe the types and members referenced by your source code.

Intermediate Language (IL) Code : This is the code that was produced by the compiler as it compiled the source code. IL is later compiled by the CLR into native CPU instructions.

Most compilers of the past produced code targeted to a specific CPU architecture, such as x86, IA64, Alpha, or PowerPC. All CLR-compliant compilers produce intermediate language (IL) code instead. IL code is sometimes referred to as managed code, because its lifetime and execution are managed by the CLR.

In brief, metadata is simply a set of data tables that describe what is defined in the module, such as types and their members. The metadata is always embedded in the same EXE/DLL as the code, making it impossible to separate the two. Since the metadata and code are produced by the compiler at the same time and are bound into the resulting managed module, the metadata and the IL code it describes are never out of sync with one another.

Metadata has many uses. Here are some of them:

  • Metadata removes the need for header and library files when compiling, because all the information about the referenced types/members is contained in one file along with the IL that implements those types/members. Compilers can read metadata directly from managed modules.
  • Visual Studio uses metadata to help you write code. Its IntelliSense feature parses metadata to tell you what methods a type offers and what parameters each method expects.
  • The CLR code verification process uses metadata to ensure that your code performs only “safe” operations. Verification is discussed shortly.
  • Metadata allows an object’s fields to be serialized into a memory block, remoted to another machine, and then deserialized, recreating the object and its state on the remote machine.
  • Metadata allows the garbage collector to track the lifetime of objects. For any object, the garbage collector can determine the type of the object, and from the metadata it knows which fields within that object refer to other objects.
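The last point in the list above can be made concrete with a small model. In the sketch below, a dictionary plays the role of the metadata tables by recording which fields of each type hold object references, and a marking pass uses it to find every object reachable from a root. The type names, the `HeapObject` class, and the `mark_reachable` function are invented for illustration; the real garbage collector works on raw memory and is far more involved.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical "metadata": for each type, the names of its reference-typed fields.
REFERENCE_FIELDS: Dict[str, List[str]] = {
    "Order": ["customer", "next"],   # the numeric field "total" is not listed
    "Customer": [],                  # holds no references to other objects
}


class HeapObject:
    """A stand-in for an object on the managed heap."""

    def __init__(self, type_name: str, **fields: object) -> None:
        self.type_name = type_name
        self.fields = fields


def mark_reachable(roots: Iterable[Optional[HeapObject]]) -> Set[int]:
    """Return the ids of all objects reachable from the roots."""
    marked: Set[int] = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj is None or id(obj) in marked:
            continue
        marked.add(id(obj))
        # The metadata tells the collector which fields to follow.
        for field in REFERENCE_FIELDS[obj.type_name]:
            pending.append(obj.fields.get(field))
    return marked


alice = HeapObject("Customer", name="Alice")
second = HeapObject("Order", customer=alice, next=None, total=5)
first = HeapObject("Order", customer=None, next=second, total=3)
orphan = HeapObject("Customer", name="Bob")   # nothing refers to this object

live = mark_reachable([first])
print(len(live), id(orphan) in live)
```

Objects whose ids are absent from the returned set, such as `orphan` here, are the ones a collector would be free to reclaim.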

The CLR doesn’t actually work with modules; it works with assemblies. An assembly is an abstract concept, which can be difficult to grasp at first. First, an assembly is a logical grouping of one or more managed modules or resource files. Second, an assembly is the smallest unit of reuse, security, and versioning. Depending on the choices you make with your compilers or tools, you can produce a single-file assembly or you can produce a multi-file assembly.

Loading the Common Language Runtime

When you build an EXE assembly, the compiler/linker emits some special information into the resulting assembly’s PE File header and the file’s .text section. When the EXE file is invoked, this special information causes the CLR to load and initialize. Then the CLR locates the entry point method for the application and lets the application start executing. 

How a managed EXE loads and initializes the CLR.

When the compiler/linker creates an executable assembly, the following 6-byte x86 stub function is emitted into the .text section of the PE file: JMP _CorExeMain. The _CorExeMain function is imported from the Microsoft MSCorEE.dll dynamic-link library, and therefore MSCorEE.dll is referenced in the import (.idata) section of the assembly file. (MSCorEE.dll stands for Microsoft Component Object Runtime Execution Engine.) When the managed EXE file is invoked, Windows treats it just like any normal (unmanaged) EXE file: the Windows loader loads the file and examines the .idata section to see that MSCorEE.dll should be loaded into the process’s address space. Then, the loader obtains the address of the _CorExeMain function inside MSCorEE.dll and fixes up the stub function’s JMP instruction in the managed EXE file. The primary thread for the process begins executing this x86 stub function, which immediately jumps to _CorExeMain in MSCorEE.dll. _CorExeMain initializes the CLR and then looks at the CLR header for the executable assembly to determine what managed entry point method should execute. The IL code for the method is then compiled into native CPU instructions, after which the CLR jumps to the native code (using the process’s primary thread). At this point, the managed application code is running.

The situation is similar for a managed DLL. When building a managed DLL, the compiler/linker emits a similar 6-byte x86 stub function for a DLL assembly in the .text section of the PE file: JMP _CorDllMain. The _CorDllMain function is also imported from MSCorEE.dll, causing the .idata section for the DLL to reference MSCorEE.dll. When Windows loads the DLL, it automatically loads MSCorEE.dll (if it isn’t already loaded), obtains the address of the _CorDllMain function, and fixes up the 6-byte x86 JMP stub in the managed DLL. The thread that called LoadLibrary to load the managed DLL then jumps to the x86 stub in the managed DLL assembly, which immediately jumps to _CorDllMain in MSCorEE.dll. _CorDllMain initializes the CLR (if it hasn’t already been initialized for the process) and then returns so that the application can continue executing as normal.
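The text does not spell out the bytes of the stub. A 6-byte x86 jump through an import table slot is commonly encoded as the opcode bytes FF 25 followed by the 32-bit address of the slot, and the Python sketch below assumes that form. The function names and all addresses are made up, and a dictionary stands in for process memory.

```python
import struct
from typing import Dict

JMP_INDIRECT = b"\xFF\x25"   # x86: JMP DWORD PTR [absolute 32-bit address]


def encode_stub(import_slot_address: int) -> bytes:
    """Build the 6-byte stub that jumps through the given import table slot."""
    return JMP_INDIRECT + struct.pack("<I", import_slot_address)


def decode_stub(stub: bytes) -> int:
    """Return the import slot address referenced by a 6-byte stub."""
    if len(stub) != 6 or stub[:2] != JMP_INDIRECT:
        raise ValueError("not an FF 25 indirect jump stub")
    return struct.unpack("<I", stub[2:])[0]


def resolve_jump(stub: bytes, memory: Dict[int, int]) -> int:
    """Follow the stub the way the CPU would: read the target from the slot."""
    return memory[decode_stub(stub)]


# Made-up addresses: the slot lives in the EXE, the target inside MSCorEE.dll.
SLOT = 0x00402000
COR_EXE_MAIN = 0x79000000

stub = encode_stub(SLOT)
memory = {SLOT: COR_EXE_MAIN}   # models the loader filling in the slot
print(stub.hex(), hex(resolve_jump(stub, memory)))
```

Under this assumption the stub bytes themselves stay the same from run to run; what the loader supplies is the address stored in the slot, which is how one stub can reach MSCorEE.dll wherever it happens to be loaded.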

Side by Side Execution 

We have shipped several versions of the .NET Framework: 1.0, 1.1, 2.0, 3.0, and 3.5, with 4.0 on the horizon. They all install side by side, meaning that one application may be using the 2.0 CLR while another is using the 4.0 CLR at the same time. Within a single process there can be only one CLR. (4.0 introduces a new feature, in-process side by side (Inproc SxS), which is the ability to run multiple versions of the CLR in a single process.) Once a CLR is loaded into a process, it cannot be unloaded.

So which CLR will my app use?

It depends on which versions of the .NET Framework are installed and on which version your app was built with.

The component that actually determines which CLR to load is mscoree.dll, which resides in %windir%\system32. When you install a version of the .NET Framework, the installer replaces mscoree.dll if the existing copy is older than the one it carries, and leaves it alone if the existing copy is newer. So %windir%\system32 always holds the latest mscoree.dll, even if the corresponding .NET Framework has been uninstalled. For this reason, mscoree.dll has to maintain very strict compatibility.

Because we always have the latest mscoree.dll, the following discussion is based on the newest .NET Framework you have ever installed.

If only 1.0 is installed, then the 1.0 CLR will always be used. The 1.0 mscoree.dll is not side-by-side aware.

If 1.1 is installed, then the CLR you built with will be loaded. If you built your app with 1.0, then the 1.0 CLR will be loaded. If you built your app with 1.1, then the 1.1 CLR will be loaded. If the required CLR is not available on your machine, mscoree.dll will bring up a dialog and quit. This is so that your app won’t run under a CLR that you did not test it with.

The thinking shifts in 2.0. In 2.0, mscoree.dll will try to use the CLR you built with first. If that CLR cannot be found, mscoree.dll will load the 2.0 CLR to run your app. But if the CLR you built with is newer than (the currently installed) 2.0, mscoree.dll will bring up the same dialog and quit. The latter behavior is frequently seen in internal testing.

For apps built with an interim release, mscoree.dll maps the build to the closest officially released CLR. So 1.0 beta 2 and later will use the 1.0 CLR, and the 1.1 beta will use the 1.1 CLR.

Of course, you can use a config file to override the default behavior.
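The default rules above can be summarized as a small decision function. This Python sketch only restates the rules as described in this post; the `choose_clr` name, the version tuples, and the `ClrNotAvailable` exception (standing in for the dialog) are hypothetical. Config-file overrides and the mapping of interim releases are deliberately not modeled.

```python
from typing import Set, Tuple

Version = Tuple[int, int]   # (major, minor), for example (1, 1)
KNOWN_SHIMS = {(1, 0), (1, 1), (2, 0)}


class ClrNotAvailable(Exception):
    """Stands in for the dialog that mscoree.dll shows before quitting."""


def choose_clr(shim: Version, built_with: Version, installed: Set[Version]) -> Version:
    """Pick the CLR to load for an app under the default (no config file) rules.

    shim is the version of the newest mscoree.dll ever installed on the machine.
    """
    if shim not in KNOWN_SHIMS:
        raise ValueError(f"shim version {shim} is not covered by this sketch")
    if shim == (1, 0):
        # The 1.0 shim is not side-by-side aware: it always uses the 1.0 CLR.
        return (1, 0)
    if built_with in installed:
        # Both the 1.1 and 2.0 shims prefer the CLR the app was built with.
        return built_with
    if shim == (1, 1):
        # The 1.1 shim refuses to run the app under an untested CLR.
        raise ClrNotAvailable(f"CLR {built_with} is required but not installed")
    # 2.0 shim: fall back to the 2.0 CLR unless the app needs something newer.
    if built_with > (2, 0) or (2, 0) not in installed:
        raise ClrNotAvailable(f"CLR {built_with} is required but not installed")
    return (2, 0)


# An app built with 1.1 on a machine that only has the 2.0 CLR left.
print(choose_clr(shim=(2, 0), built_with=(1, 1), installed={(2, 0)}))
```

The same app on a machine whose newest shim is 1.1, with only the 1.0 CLR installed, would raise `ClrNotAvailable` instead, which mirrors the difference between the 1.1 and 2.0 policies.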

Microsoft has released the source code of a CLR (2.0) implementation of the CLI (Common Language Infrastructure), the ECMA standard that describes the core of the .NET Framework world. The Shared Source CLI goes beyond the printed specification of the ECMA standards, providing a working implementation for CLI developers to explore and understand.

Developers interested in the internal workings of the .NET Framework can explore this implementation of the CLI to see how garbage collection works, how JIT compilation and verification are handled, how security protocols are implemented, and how the frameworks and virtual object systems are organized.

Jump to Next Part : .Net CLR Internals (CLR modules in detail..) 
