Problem
There are many sophisticated data structures in.NET. Unfortunately, some of them are really similar, and I’m never sure when to use one over the other. Most of my C# and VB books mention them to some extent, but they never go into great depth regarding them.
What’s the difference between an Array, an ArrayList, a List, a Hashtable, a Dictionary, and a SortedList or a SortedDictionary?
Which ones can be enumerated (IList can conduct foreach loops)? Which ones (IDict) use key/value pairs?
What about memory usage? What is the rate of insertion? What is the retrieval speed?
Is there anything else worth mentioning in terms of data structures?
I’m still looking for further information on memory use and performance (Big-O notation)
Asked by Pretzel
Solution #1
To give you an idea , what I’m thinking about right now:
I frequently use List and Dictionary; once you start using them strongly typed with generics, it’s difficult to go back to the non-generic versions.
There are a variety of different data structures available, like KeyValuePair, which may be used to accomplish some interesting things, and SortedDictionary, which can also be handy.
Answered by Sam Schutte
Solution #2
Use generics whenever possible. Among them are:
Answered by Adam Tegen
Solution #3
To begin, IEnumerable is implemented by all collections in.NET.
Second, because generics were added to the framework in version 2.0, many of the collections are duplicates.
So, while the generic collections are likely to include additional features, for the most part:
Arrays are a fixed-size collection in which the value stored at a specific index can be changed.
SortedDictionary is an IDictionaryT,K> with keys that are sorted. SortedList is an IDictionaryT,K> that has been sorted using an IComparer.
As a result, the IDictionary implementations (those that support KeyValuePairs) are as follows:
Hashset is another new collection in.NET 3.5. It’s a set-based collection.
Furthermore, the LinkedList is a standard implementation of a linked-list (the List is an array-list for faster retrieval).
Answered by 2 revs, 2 users 98%
Solution #4
Here are some general pointers:
Answered by blackwing
Solution #5
Arrays are the “old school” collection, according to one user (yes, arrays are considered a collection though not part of System.Collections). But, in compared to other collections, such as the ones you listed in your title (here, ArrayList and List(Of T)), what makes arrays “old school”? Let’s start with the fundamentals and look at arrays.
To begin, arrays are “mechanisms that allow you to consider numerous [logically connected] items as a single collection” in Microsoft.NET (see linked article). What exactly does that imply? Individual members (elements) are stored in memory sequentially, one after the other, using a starting address. We can readily access the sequentially stored elements beginning at that address by using the array.
Beyond that and contrary to programming 101 common conceptions, Arrays really can be quite complex:
Arrays can have a single dimension, a multiple dimension, or a jagged dimension (jagged arrays are worth reading about). Arrays are not dynamic in and of themselves: once started, an array of size n reserves enough space to hold n objects. The array’s number of elements cannot be increased or decreased. _array is a variable. Because Int32() = New Int32(100) leaves enough room on the memory block for 100 Int32 primitive type objects, the array can hold 100 of them (in this case, the array is initialized to contain 0s). This block’s address is returned to _array.
According to the article, all arrays must be zero-based according to the Common Language Specification (CLS). Non-zero-based arrays are supported by.NET arrays, but they are less frequent. Because of the “commonness” of zero-based arrays, Microsoft has spent a lot of time optimizing their performance; as a result, single dimension, zero-based (SZs) arrays are “special” – and the best implementation of an array (as opposed to multidimensional, etc.) – because SZs have specific intermediate language instructions for manipulating them.
An crucial element of the Array puzzle to understand is that arrays are always supplied by reference (as a memory address). Bounds checking can be deactivated on arrays while they still do it (thus throwing an error).
The most significant disadvantage of arrays is that they are not resizable. They have a predetermined capacity. ArrayList and List(Of T) are two new terms in our history:
The ArrayList (together with List(Of T) – albeit there are some important distinctions to be made here, which will be addressed later) is likely best thought of as the next generation of collections (in the broad sense). IList (a descendent of ‘ICollection’) is the interface that ArrayList inherits from. ArrayLists are inherently larger than Lists, requiring greater overhead.
IList allows the implementation to consider ArrayLists as fixed-sized lists (like Arrays); but, aside from the additional functionality provided by ArrayLists, there are no real benefits to utilizing fixed-size ArrayLists, as ArrayLists (over Arrays) are significantly slower in this scenario.
ArrayLists cannot be jagged, according to my reading: “Using multidimensional arrays as elements… is not supported.” Another nail in the coffin of ArrayLists has been driven home. ArrayLists are likewise not “typed,” implying that an ArrayList is nothing more than a dynamic Array of Objects: Object[]. When implementing ArrayLists, this necessitates a lot of implicit and explicit boxing and unboxing, which adds to their overhead.
Unsubstantiated thought: I believe I recall reading or hearing from one of my professors that ArrayLists are the bloated conceptual child of the attempt to move from Arrays to List-type Collections, i.e., while they were once a great improvement over Arrays, they are no longer the best option as collections have progressed.
The memory use difference is significant enough that a List(Of Int32) used 56 percent less memory than an ArrayList containing the same primitive type (8 MB vs. 19 MB in the above gentleman’s linked demonstration: again, linked here) – however this is a consequence amplified by the 64-bit computer. This distinction indicates two points: Because of the inner workings of a 64-bit system, a boxed Int32-type “object” (ArrayList) is substantially larger than a pure Int32 primitive type (List); second (2), the difference is exponential.
So, what’s the difference between the two, and what exactly is a List(Of T)? A List(Of T) is defined by MSDN as “a highly typed list of objects that can be accessed by index.” The “highly typed” part is crucial: a List(Of T)’recognizes’ types and stores objects as their type. As a result, an Int32 is kept as an Int32 rather than an Object. The problems generated by boxing and unpacking are no longer an issue.
This distinction, according to MSDN, only applies when storing primitive types rather than reference types. In addition, the change is noticeable on a wide scale: there are over 500 pieces in all. What’s more, the MSDN documentation says, “It is to your advantage to utilize the type-specific version of the List(Of T) class instead of the ArrayList class…”
List(Of T) is essentially an improved version of ArrayList. It’s ArrayList’s “generic equivalent.” It is not guaranteed to be sorted until it is sorted, just like ArrayList (go figure). List(Of T) also offers some other features.
Answered by Thomas
Post is based on https://stackoverflow.com/questions/128636/net-data-structures-arraylist-list-hashtable-dictionary-sortedlist-sorted