Generics in C#


  • Use generic types to maximize code reuse, type safety, and performance.
  • The most common use of generics is to create collection classes.
  • The .NET Framework class library contains several new generic collection classes in the System.Collections.Generic namespace. These should be used whenever possible instead of classes such as ArrayList in the System.Collections namespace.
  • You can create your own generic interfaces, classes, methods, events and delegates.
  • Generic classes may be constrained to enable access to methods on particular data types.
  • Information on the types that are used in a generic data type may be obtained at run-time by using reflection.

Why we need generics
The limitations of using non-generic collection classes can be demonstrated by writing a short program that uses the ArrayList collection class from the .NET Framework class library. ArrayList is a highly convenient collection class that can be used without modification to store any reference or value type.
// The .NET Framework 1.1 way to create a list:
System.Collections.ArrayList list1 = new System.Collections.ArrayList();
list1.Add(3);
list1.Add(105);

System.Collections.ArrayList list2 = new System.Collections.ArrayList();
list2.Add("It is raining in Redmond.");
list2.Add("It is snowing in the mountains.");
But this convenience comes at a cost. Any reference or value type that is added to an ArrayList is implicitly upcast to Object. If the items are value types, they must be boxed when they are added to the list, and unboxed when they are retrieved. Both the casting and the boxing and unboxing operations decrease performance; the effect of boxing and unboxing can be very significant in scenarios where you must iterate over large collections.
The other limitation is lack of compile-time type checking; because an ArrayList casts everything to Object, there is no way at compile-time to prevent client code from doing something such as this:
System.Collections.ArrayList list = new System.Collections.ArrayList();
// Add an integer to the list.
list.Add(3);
// Add a string to the list. This will compile, but may cause an error later.
list.Add("It is raining in Redmond.");

int t = 0;
// This causes an InvalidCastException to be returned. 
foreach (int x in list)
{
    t += x;
}
// The .NET Framework 2.0 way to create a list
List<int> list1 = new List<int>();

// No boxing, no casting:
list1.Add(3);

// Compile-time error: 
// list1.Add("It is raining in Redmond.");

GenericList<float> list1 = new GenericList<float>();
GenericList<ExampleClass> list2 = new GenericList<ExampleClass>();
GenericList<ExampleStruct> list3 = new GenericList<ExampleStruct>();
Generics are the most powerful feature of C# 2.0. Generics allow you to define
type-safe data structures, without committing to actual data types. This results in a significant performance boost and higher quality code, because you get to reuse data processing algorithms without duplicating type-specific code. In concept, generics are similar to C++ templates, but are drastically different in implementation and capabilities. 
Generics were added to version 2.0 of the C# language and the common language runtime (CLR). Generics introduce to the .NET Framework the concept of type parameters, which make it possible to design classes and methods that defer the specification of one or more types until the class or method is declared and instantiated by client code. For example, by using a generic type parameter T you can write a single class that other client code can use without incurring the cost or risk of runtime casts or boxing operations,


public class Stack
{
   object[] m_Items; 
   public void Push(object item)
   {...}
   public object Pop()
   {...}
}


However, there are two problems with Object-based solutions. The first issue is performance. When using value types, you have to box them in order to push and store them, and unbox the value types when popping them off the stack. Boxing and unboxing incurs a significant performance penalty in their own right, but it also increases the pressure on the managed heap, resulting in more garbage collections, which is not great for performance either. Even when using reference types instead of value types, there is still a performance penalty because you have to cast from an Object to the actual type you interact with and incur the casting cost:

The second (and often more severe) problem with the Object-based solution is type safety. Because the compiler lets you cast anything to and from Object, you lose compile-time type safety. 

Writing type-specific data structures is a tedious, repetitive, and error-prone task. When fixing a defect in the data structure, you have to fix it not just in one place, but in as many places as there are type-specific duplicates of what is essentially the same data structure. 


Generics allow you to define type-safe classes without compromising type safety, performance, or productivity.



public class Stack<T>
{
   T[] m_Items; 
   public void Push(T item)
   {...}
   public T Pop()
   {...}
}
Stack<int> stack = new Stack<int>();
stack.Push(1);
stack.Push(2);
int number = stack.Pop();

When using a generic stack, you have to instruct the compiler which type to use instead of the generic type parameter T, both when declaring the variable and when instantiating it:

The advantage of this programming model is that the internal algorithms and data manipulation remain the same while the actual data type can change based on the way the client uses your server code.


Generics Benefits

Generics in .NET let you reuse code and the effort you put into implementing it. The types and internal data can change without causing code bloat, regardless of whether you are using value or reference types. You can develop, test, and deploy your code once, reuse it with any type, including future types, all with full compiler support and type safety. Because the generic code does not force the boxing and unboxing of value types, or the down casting of reference types, performance is greatly improved. With value types there is typically a 200 percent performance gain, and with reference types you can expect up to a 100 percent performance gain in accessing the type (of course, the application as a whole may or may not experience any performance improvements). 
The default value for reference types is null, and the default value for value types (such as integers, enum, and structures) is a zero whitewash (filling the structure with zeros). Consequently, if the stack is constructed with strings, the Pop() methods return null when the stack is empty, and when the stack is constructed with integers, the Pop() methods return zero when the stack is empty.

Multiple Generic Types

A single type can define multiple generic-type parameters. For example, consider the generic linked list shown in Code block 3.
Code block 3. Generic linked list
class Node<K,T>
{
   public K Key;
   public T Item;
   public Node<K,T> NextNode;
   public Node()
   {
      Key      = default(K);
      Item     = defualt(T);
      NextNode = null;
   }
   public Node(K key,T item,Node<K,T> nextNode)
   {
      Key      = key;
      Item     = item;
      NextNode = nextNode;
   }
}

public class LinkedList<K,T>
{
   Node<K,T> m_Head;
   public LinkedList()
   {
      m_Head = new Node<K,T>();
   }
   public void AddHead(K key,T item)
   {
      Node<K,T> newNode = new Node<K,T>(key,item,m_Head.NextNode);
      m_Head.NextNode = newNode;
   }
}

Generic type aliasing

Sometimes is it useful to alias a particular combination of specific types. You can do that through the using statement, as shown in Code block 4. Note that the scope of aliasing is the scope of the file, so you have to repeat aliasing across the project files in the same way you would with using namespaces.
Code block 4. Generic type aliasing
using List = LinkedList<int,string>;

class ListClient
{
   static void Main(string[] args)
   {
      List list = new List();
      list.AddHead(123,"AAA");
   }
}

Generic Constraints

With C# generics, the compiler compiles the generic code into IL independent of any type arguments that the clients will use. As a result, the generic code could try to use methods, properties, or members of the generic type parameters that are incompatible with the specific type arguments the client uses. This is unacceptable because it amounts to lack of type safety. In C# you need to instruct the compiler which constraints the client-specified types must obey in order for them to be used instead of the generic type parameters. There are three types of constraints. A derivation constraint indicates to the compiler that the generic type parameter derives from a base type such an interface or a particular base class. A default constructor constraint indicates to the compiler that the generic type parameter exposes a default public constructor (a public constructor with no parameters). A reference/value type constraint constrains the generic type parameter to be a reference or a value type. A generic type can employ multiple constraints, and you even get IntelliSense reflecting the constraints when using the generic type parameter, such as suggesting methods or members from the base type.
It is important to note that although constraints are optional, they are often essential when developing a generic type. Without them, the compiler takes the more conservative, type-safe approach and only allows access to Object-level functionality in your generic type parameters. Constraints are part of the generic type metadata so that the client-side compiler can take advantage of them as well. The client-side compiler only allows the client developer to use types that comply with the constraints, thus enforcing type safety.
if(current.Key == key)
The line above will not compile because the compiler does not know whether K (or the actual type supplied by the client) supports the == operator. For example, structs by default do not provide such an implementation. You could try to overcome the == operator limitation by using the IComparable interface:
public interface IComparable 
{
   int CompareTo(object obj);
}

Derivation Constraints

In C# 2.0 you use the where reserved keyword to define a constraint. Use the where keyword on the generic type parameter followed by a derivation colon to indicate to the compiler that the generic type parameter implements a particular interface. For example, here is the derivation constraint required to implement the Find() method of LinkedList:
public class LinkedList<K,T> where K : IComparable
{
   T Find(K key)
   {
      Node<K,T> current = m_Head;
      while(current.NextNode != null)
      {
         if(current.Key.CompareTo(key) == 0)
            
            break;
         else      
            
            current = current.NextNode;
      }
      return current.Item; 
   }
   //Rest of the implementation 
}


You will also get IntelliSense support on the methods of interfaces you constrain.
When the client declares a variable of type LinkedList providing a type argument for the list's key, the client-side compiler will insist that the key type is derived fromIComparable and will refuse to build the client code otherwise.
Note that even though the constraint allows you to use IComparable, it does not eliminate the boxing penalty when the key used is a value type, such as an integer. To overcome this, the System.Collections.Generic namespace defines the generic interface IComparable<T>:
public interface IComparable<T> 
{
   int CompareTo(T other);
   bool Equals(T other);
}
You can constrain the key type parameter to support IComparable<T> with the key's type as the type parameter, and by doing so you gain not only type safety but also eliminate the boxing of value types when used as keys:
public class LinkedList<K,T> where K : IComparable<K>
{...}

No comments:

Post a Comment