Monday, August 16, 2004
Basics of .Net
This blog is about some discussion points on various .Net features. These are mostly basic things one should be aware of.
Productivity is the number one feature of .Net; speed and memory conservation are secondary. That means the CLR does more work for you, so your software design methods have to change accordingly.
The first thing to note is that a .Net assembly is much more than an executable. It has rich metadata describing various aspects of the assembly. Attributes are an easy way to add your own information to the assembly. Using reflection, other assemblies can read your metadata. Also note that all the classes in your assembly are fully described, including the names of private members.
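As a rough sketch of the attribute idea, the snippet below attaches a piece of custom information to a class and reads it back through reflection. The names ReviewedByAttribute, Invoice and AttributeDemo are made up for illustration; only the Attribute and reflection calls come from the framework.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute, invented for this example.
[AttributeUsage(AttributeTargets.Class)]
public sealed class ReviewedByAttribute : Attribute
{
    private readonly string reviewer;
    public ReviewedByAttribute(string reviewer) { this.reviewer = reviewer; }
    public string Reviewer { get { return reviewer; } }
}

// Hypothetical class carrying the custom metadata.
[ReviewedBy("alice")]
public class Invoice
{
    private decimal total; // private, but its name still ends up in the metadata
    public decimal Total { get { return total; } }
}

public static class AttributeDemo
{
    // Returns the reviewer recorded on a type, or an empty string if there is none.
    public static string ReviewerOf(Type type)
    {
        object[] found = type.GetCustomAttributes(typeof(ReviewedByAttribute), false);
        return found.Length == 0 ? string.Empty : ((ReviewedByAttribute)found[0]).Reviewer;
    }

    // Lists the names of the private instance fields declared by a type.
    public static string[] PrivateFieldNames(Type type)
    {
        FieldInfo[] fields = type.GetFields(
            BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.DeclaredOnly);
        string[] names = new string[fields.Length];
        for (int i = 0; i < fields.Length; i++) names[i] = fields[i].Name;
        return names;
    }
}
```

Calling AttributeDemo.ReviewerOf(typeof(Invoice)) reads the value stored by the attribute, and AttributeDemo.PrivateFieldNames(typeof(Invoice)) shows that the private field name is visible to any code that can load the assembly.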
The execution environment is dynamic. Don't hesitate to allocate memory, and don't worry about memory fragmentation; the CLR does excellent memory management. Memory usage of a .Net program is generally high, and there is a lot of memory copying involved. For a C programmer this is scary, but just get over it.
When you develop a .Net application, your goal is to develop a reliable and secure application. Memory overhead and speed of execution should be your secondary goals. If they are your primary goals, better stick with C/C++.
Managed Code
A .Net assembly (in intermediate language) needs to be converted to native (x86) code by the Common Language Runtime (CLR) before it gets executed. The CLR does the compilation from intermediate language to native code, so it knows exactly what your assembly is doing. Even after the compilation is done, the CLR doesn't give complete control of the processor to your assembly. The CLR is the running program; your assembly is just a set of library calls for the CLR. Your assembly is managed by the framework, and that's why it's called managed code.
Contrast this with native code, which has full control of the processor and can do anything it wants (limited, of course, by OS rights management).
Let me put it this way, managed code is digital and native code is analog. It is easy to manipulate digital data but not analog signals.
Common Type System
A .Net assembly is just a collection of types: classes, interfaces and so on. Any class with a static Main method can be marked as the entry point for the assembly. While an entry point is typically how an assembly starts execution, it doesn't have to be that way. Some other assembly can load your assembly and use one of the types you include in it, bypassing your entry point completely. So it is safe to think of your assembly as just a collection of classes, interfaces, value types and other basic types, for which you merely suggest an entry point.
For any instance of your class, the type information can be easily obtained using reflection.
All the types (classes, interfaces) you define in your C# program are fully described in your assembly; only a few things are lost during compilation from C# to assembly. A .Net assembly is not a binary in the traditional sense. Unless you use some obfuscation technique, your assembly can be decompiled back into quite readable source code.
This metadata is a fundamental feature of .Net. Because your classes are fully described in the assembly, another assembly can learn about your types dynamically and can invoke your class's methods dynamically. Contrast this with calling a function in a native code DLL: the caller has to know the function signature during development, because it cannot be learnt dynamically.
Each assembly has metadata describing itself. It includes a digital certificate, version number, company name and other things. By including a version number in each assembly, DLL hell is avoided, even though version hell is introduced.
Reflection
Reflection is the mechanism for discovering information about types at run time, and it is a widely used feature in .Net. The CLR learns about your assembly from the same metadata that reflection exposes.
The types in an assembly are fully described using the Type class. With this class you can learn about a class's members, methods, attributes and so on. You can call a class's method or change the value of members using reflection.
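Here is a minimal sketch of what that looks like in practice. Counter and ReflectionDemo are hypothetical names used only for this example; the code changes a private member and calls a method purely by name.

```csharp
using System;
using System.Reflection;

// Hypothetical class to inspect; nothing about it is special.
public class Counter
{
    private int count;
    public int Increment(int step) { count += step; return count; }
}

public static class ReflectionDemo
{
    public static int Run()
    {
        // Create the instance without naming the constructor in code.
        object counter = Activator.CreateInstance(typeof(Counter));
        Type type = counter.GetType();

        // Change the value of a private member, looked up by name.
        FieldInfo field = type.GetField("count", BindingFlags.NonPublic | BindingFlags.Instance);
        field.SetValue(counter, 40);

        // Call a public method, also looked up by name.
        MethodInfo method = type.GetMethod("Increment");
        return (int)method.Invoke(counter, new object[] { 2 });
    }
}
```

ReflectionDemo.Run() sets the private field to 40 and then invokes Increment(2), so the value it returns reflects both steps.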
Reflection is not an esoteric feature; it is an integral part of the CLR. Don't be shy about using it.
IO Streams
To do any IO in .Net you should be comfortable with the streams framework. Using streams is well documented in MSDN. But the most important thing to know about streams is their pluggable model of doing IO. Another stream can be attached above or below your stream, enabling the user to build a chain of streams. A pluggable stream doesn't necessarily know that another stream is attached above or below it. Not all streams are pluggable; some streams can only be at the bottom of the chain.
For example, to build a compressed, encrypted network data stream: obtain a network stream from a connected socket, create an encryption stream and attach it on top of the network stream, then create a compression stream and attach it on top of that. The end result is a single stream that you can pass to your network application. When the application writes data to this stream, the data is compressed, encrypted, and sent to the other end reliably. All of this happens without the network application doing any extra work for compression or encryption.
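The chain described above can be sketched roughly as follows. To keep the example self-contained, a MemoryStream stands in for the network stream, which is an assumption of this sketch. GZipStream and Aes are from later framework versions than the one this post was written against, and StreamChainDemo is a made-up name.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;
using System.Text;

public static class StreamChainDemo
{
    // Builds the chain for writing: compression on top, encryption below it,
    // and "bottom" (the network stream in the text) at the bottom.
    public static Stream WrapForWriting(Stream bottom, SymmetricAlgorithm cipher)
    {
        Stream encrypted = new CryptoStream(bottom, cipher.CreateEncryptor(), CryptoStreamMode.Write);
        return new GZipStream(encrypted, CompressionMode.Compress);
    }

    // Builds the mirror-image chain for the receiving side.
    public static Stream WrapForReading(Stream bottom, SymmetricAlgorithm cipher)
    {
        Stream decrypted = new CryptoStream(bottom, cipher.CreateDecryptor(), CryptoStreamMode.Read);
        return new GZipStream(decrypted, CompressionMode.Decompress);
    }

    // Sends a message through the write chain and reads it back through the read chain.
    public static string RoundTrip(string message)
    {
        using (Aes cipher = Aes.Create()) // random key and IV, shared by both ends here
        {
            MemoryStream wire = new MemoryStream(); // stand-in for a NetworkStream
            using (Stream top = WrapForWriting(wire, cipher))
            {
                // The application only sees one ordinary stream.
                byte[] payload = Encoding.UTF8.GetBytes(message);
                top.Write(payload, 0, payload.Length);
            } // disposing the top stream flushes and closes the whole chain

            MemoryStream received = new MemoryStream(wire.ToArray());
            using (Stream top = WrapForReading(received, cipher))
            using (StreamReader reader = new StreamReader(top, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

The code that writes the payload never mentions compression or encryption; it just writes to the stream it was handed, which is the point of the pluggable model.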
Environment
Every .Net installation is meant to co-exist with other installations. By default, .Net files are installed in Windows\Microsoft.Net\Framework\ under its own version-dependent directory.
Compilers for C#, VB.NET and VJ are installed by default.
The global assembly cache (GAC) contains the system .Net assemblies.
There are no header files in .Net. The C# compiler doesn't compile each C# file separately; it takes all the source files together and compiles them as a single unit.
Garbage collection
In a .Net environment, dynamic memory allocation is very common. Compared to native code, memory allocation in .Net is generally faster. So when programming in .Net, don't worry about memory allocation overhead.
Every class instance in .Net is accessed through a reference, which is essentially a pointer managed by the CLR. All these instances are tracked by the CLR, and when an instance is no longer referenced, its memory is reclaimed. This process is called garbage collection (GC).
GC is done in a function that's called when memory is running low. It frees memory that has no reference to it. For example, if a variable pointing to a class instance is no longer in scope and nothing else refers to that instance, there is no reference to it, so it can be freed.
GC may change the memory address of a class instance. So don't count on the memory address of a class instance being the same throughout the application's execution.
GC is a time-consuming process, and managed threads are stopped while GC is running. Keep this in mind if you are designing a real-time system.
Arrays and Structures
The order of members in a class is not guaranteed to be preserved in memory. For example, if you have defined a byte array in between integers in your class, the CLR may put both integers together in memory and keep the byte array reference afterwards.
All arrays have their length included with the data. This length cannot be changed after the array has been created. It is just the capacity of the array, not the number of valid elements in it. For example, if you have a network buffer declared as a byte array, you need a separate variable to keep track of the actual number of bytes in the array.
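A small sketch of the buffer case, with a MemoryStream standing in for the network source (an assumption made to keep the example self-contained; BufferDemo is a made-up name):

```csharp
using System;
using System.IO;

public static class BufferDemo
{
    // Returns { capacity of the buffer, number of valid bytes in it }.
    public static int[] Run()
    {
        byte[] buffer = new byte[1024];                           // capacity is fixed at 1024
        Stream source = new MemoryStream(new byte[] { 1, 2, 3 }); // only 3 bytes available

        // Read reports how many bytes it actually placed in the buffer.
        int received = source.Read(buffer, 0, buffer.Length);

        // buffer.Length is still 1024; only the first "received" bytes are meaningful.
        return new int[] { buffer.Length, received };
    }
}
```

Any later code that processes the buffer should loop up to the received count, not up to buffer.Length.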
Asynchronous Execution
Asynchronous execution is another widely used feature in .Net. This feature enables the calling thread to continue execution while the called function is still executing.
In asynchronous execution the called function is split into two parts, the begin method and the end method.
To use asynchronous execution, you pass a callback function (delegate) to the begin method along with a context value (any object). The begin method returns an IAsyncResult instance. When the called function is complete, it invokes the delegate and passes it the IAsyncResult instance, which carries the context value. The callback then has to call the end method with that IAsyncResult instance. The purpose of the IAsyncResult instance is to link the begin and end methods.
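As an illustration of the begin/end pairing, here is a sketch using Stream.BeginRead and Stream.EndRead. AsyncDemo and ReadState are hypothetical names, and a MemoryStream is used as the source only to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Threading;

public static class AsyncDemo
{
    // The context value handed to the begin method and recovered in the callback.
    private sealed class ReadState
    {
        public readonly Stream Source;
        public readonly ManualResetEvent Done = new ManualResetEvent(false);
        public int BytesRead;
        public ReadState(Stream source) { Source = source; }
    }

    public static int ReadAsynchronously(byte[] data)
    {
        ReadState state = new ReadState(new MemoryStream(data));
        byte[] buffer = new byte[data.Length];

        // Begin method: takes the callback delegate and the context value.
        // It returns the same IAsyncResult that the callback later receives.
        state.Source.BeginRead(buffer, 0, buffer.Length, new AsyncCallback(OnReadComplete), state);

        // The calling thread is free to do other work here.

        state.Done.WaitOne(); // for the demo, wait until the callback has run
        return state.BytesRead;
    }

    private static void OnReadComplete(IAsyncResult result)
    {
        // The context value travels inside the IAsyncResult.
        ReadState state = (ReadState)result.AsyncState;

        // End method: completes the operation and returns its result.
        state.BytesRead = state.Source.EndRead(result);
        state.Done.Set();
    }
}
```

The ManualResetEvent is only there so the sketch can return a value; a real caller would typically carry on and let the callback handle the result.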
Code Access Security
Since .Net is a managed code environment, the CLR can identify and enforce access control. The CLR defines an extensive set of rights that the user can configure, and applies them against a set of evidence the CLR obtains from the assembly.
Code Access Security (CAS) is still evolving; maybe in the Longhorn time frame it will be fully appreciated.
CodeDOM
The CodeDOM classes are used to generate code dynamically. Microsoft doesn't document this feature well, but MSDN has a working sample to start with.
With CodeDOM you can define your methods and members and dynamically generate code in any of the .Net languages. By default, code generators for C# and VB.Net are included. CodeDOM is difficult to get used to, and the amount of work involved in generating even a simple class is quite high. But once you get used to it, it can be very useful.
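To give a feel for the amount of work involved, here is a sketch that generates C# source for a one-method class. The names Generated, Greeter and Greet are invented for the example. It assumes the System.CodeDom types are available, which is the case on the .Net Framework; newer runtimes need the separate System.CodeDom package.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

public static class CodeDomDemo
{
    // Builds a code tree for "class Greeter { string Greet() }" and renders it as C#.
    public static string GenerateGreeterSource()
    {
        CodeCompileUnit unit = new CodeCompileUnit();
        CodeNamespace ns = new CodeNamespace("Generated");
        unit.Namespaces.Add(ns);

        CodeTypeDeclaration greeter = new CodeTypeDeclaration("Greeter");
        greeter.IsClass = true;
        ns.Types.Add(greeter);

        // public string Greet() { return "hello"; }
        CodeMemberMethod greet = new CodeMemberMethod();
        greet.Name = "Greet";
        greet.Attributes = MemberAttributes.Public | MemberAttributes.Final;
        greet.ReturnType = new CodeTypeReference(typeof(string));
        greet.Statements.Add(
            new CodeMethodReturnStatement(new CodePrimitiveExpression("hello")));
        greeter.Members.Add(greet);

        // Swapping in a different provider would emit another language from the same tree.
        using (CodeDomProvider provider = new CSharpCodeProvider())
        using (StringWriter writer = new StringWriter())
        {
            provider.GenerateCodeFromCompileUnit(unit, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }
}
```

The returned string is ordinary C# source text that can be written to a file or handed to a compiler.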