Coder Perfect

In C#, how do you distinguish between a multidimensional array and an array of arrays?

Problem

In C#, how can you tell the difference between multidimensional arrays double[,] and array-of-arrays double[][]?

If there is a distinction, what is the ideal application for each?

Asked by ecleel

Solution #1

Arrays of arrays (jagged arrays) are faster and more effective than multi-dimensional arrays. The syntax of multidimensional arrays is more aesthetically pleasing.

If you develop some simple code with jagged and multidimensional arrays, you’ll see that the storage and retrieval from jagged (or single dimensional) arrays are simple IL instructions, whereas the equivalent actions for multidimensional arrays are method invocations, which are always slower.

Consider the procedures below:

static void SetElementAt(int[][] array, int i, int j, int value)
{
    array[i][j] = value;
}

static void SetElementAt(int[,] array, int i, int j, int value)
{
    array[i, j] = value;
}

The following will be their IL:

.method private hidebysig static void  SetElementAt(int32[][] 'array',
                                                    int32 i,
                                                    int32 j,
                                                    int32 'value') cil managed
{
  // Code size       7 (0x7)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldarg.1
  IL_0002:  ldelem.ref
  IL_0003:  ldarg.2
  IL_0004:  ldarg.3
  IL_0005:  stelem.i4
  IL_0006:  ret
} // end of method Program::SetElementAt

.method private hidebysig static void  SetElementAt(int32[0...,0...] 'array',
                                                    int32 i,
                                                    int32 j,
                                                    int32 'value') cil managed
{
  // Code size       10 (0xa)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldarg.1
  IL_0002:  ldarg.2
  IL_0003:  ldarg.3
  IL_0004:  call       instance void int32[0...,0...]::Set(int32,
                                                           int32,
                                                           int32)
  IL_0009:  ret
} // end of method Program::SetElementAt

You can simply execute operations like row swapping and row resizing while utilizing jagged arrays. Although multidimensional arrays may be safer in some instances, Microsoft FxCop recommends that you utilize jagged arrays instead of multidimensional arrays when analyzing your applications.

Answered by okutane

Solution #2

A multidimensional array provides a lovely linear memory architecture, whereas a jagged array adds numerous levels of indirection.

var jagged = new int[10][5] var jagged = new int[10][5] var jagged = new int[10][5] var jagged = new int[10][5] var jagged = new int[10][5] works such as these: Look up the element at index 3 (an array) and the element at position 6 in that array (which is a value). In this situation, there is an additional look up for each dimension (this is an expensive memory access pattern).

In memory, a multidimensional array is set out linearly, and the actual value is calculated by multiplying the indexes together. The Length property of a multidimensional array, given the array var mult = new int[10,30], returns the total number of items, i.e. 10 * 30 = 300.

A jagged array’s Rank property is always 1, whereas a multidimensional array’s Rank property can be any value. To get the length of each dimension, use the GetLength method of any array. In this example, the multidimensional array is called mult. GetLength(1) returns a value of 30.

The indexing of the multidimensional array is more efficient. For example, given the multidimensional array mult[1,7] = 30 * 1 + 7 = 37, find the entry at index 37. Because only one memory location is involved, which is the array’s base address, this is a superior memory access pattern.

As a result, a multidimensional array allocates a continuous memory block, whereas a jagged array, such as jagged[1,], does not have to be square. It is not necessary for length to match jaggedness[2]. The length of a multidimensional array, which is true for any multidimensional array.

Multidimensional arrays should be faster in terms of performance. They should be a lot faster, but they aren’t because of a terrible CLR implementation.

 23.084  16.634  15.215  15.489  14.407  13.691  14.695  14.398  14.551  14.252 
 25.782  27.484  25.711  20.844  19.607  20.349  25.861  26.214  19.677  20.171 
  5.050   5.085   6.412   5.225   5.100   5.751   6.650   5.222   6.770   5.305 

The first row exhibits jagged array timings, the second multidimensional arrays, and the third, well, that’s the way it should be. The program is shown below; please note that it was tested in mono. (The windows timings are drastically different, owing to differences in CLR implementation.)

On Windows, the jagged arrays’ timings are significantly better, roughly matching my own perception of what a multidimensional array lookup should be like (see ‘Single()’). Unfortunately, the Windows JIT-compiler is a jerk, which makes performance discussions tough because there are so many inconsistencies.

These are the timings I got on Windows, same deal, first row are jagged arrays, second multidimensional, and third my own multidimensional implementation, note how much slower this is on Windows than mono.

  8.438   2.004   8.439   4.362   4.936   4.533   4.751   4.776   4.635   5.864
  7.414  13.196  11.940  11.832  11.675  11.811  11.812  12.964  11.885  11.751
 11.355  10.788  10.527  10.541  10.745  10.723  10.651  10.930  10.639  10.595

Source code:

using System;
using System.Diagnostics;
static class ArrayPref
{
    const string Format = "{0,7:0.000} ";
    static void Main()
    {
        Jagged();
        Multi();
        Single();
    }

    static void Jagged()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var jagged = new int[dim][][];
            for(var i = 0; i < dim; i++)
            {
                jagged[i] = new int[dim][];
                for(var j = 0; j < dim; j++)
                {
                    jagged[i][j] = new int[dim];
                    for(var k = 0; k < dim; k++)
                    {
                        jagged[i][j][k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }

    static void Multi()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var multi = new int[dim,dim,dim];
            for(var i = 0; i < dim; i++)
            {
                for(var j = 0; j < dim; j++)
                {
                    for(var k = 0; k < dim; k++)
                    {
                        multi[i,j,k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }

    static void Single()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var single = new int[dim*dim*dim];
            for(var i = 0; i < dim; i++)
            {
                for(var j = 0; j < dim; j++)
                {
                    for(var k = 0; k < dim; k++)
                    {
                        single[i*dim*dim+j*dim+k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }
}

Answered by John Leidegren

Solution #3

Simply speaking, a multidimensional array is analogous to a table in a database management system. The Array of Arrays (jagged array) allows you to have each element hold another variable-length array of the same type.

A multi-dimensional array can be used if the data structure resembles that of a table (fixed rows/columns). The elements of a jagged array are fixed, and each element can hold an array of variable length.

E.g. Psuedocode:

int[,] data = new int[2,2];
data[0,0] = 1;
data[0,1] = 2;
data[1,0] = 3;
data[1,1] = 4;

Consider the following as a 2×2 table:

int[][] jagged = new int[3][]; 
jagged[0] = new int[4] {  1,  2,  3,  4 }; 
jagged[1] = new int[2] { 11, 12 }; 
jagged[2] = new int[3] { 21, 22, 23 }; 

Consider the following with each row having a different number of columns:

Answered by shahkalpesh

Solution #4

Update .NET 6:

With the release of.NET 6, I thought it was appropriate to explore this subject. I recreated the test code in new.NET and ran it, ensuring that each component took at least a second to complete. The benchmark was performed on an AMD Ryzen 5600x processor.

Results? It’s a difficult situation. Single arrays appear to be the most performant for small and large arrays (25x25x25 & > 200x200x200), with Jagged arrays being the fastest in the middle. Unfortunately, multi-dimensional appears to be the slowest option based on my testing. At best, you’ll be half as fast as the fastest alternative. But! Because jagged arrays can take much longer to start than single-dimensional arrays, it depends on what you need the arrays for. On the 503 cube, the initialization took nearly three times as long as single-dimensional arrays. Multi-dimensional processing was just slightly slower than single-dimensional processing.

What’s the bottom line? Benchmark your code on the system it will execute on if you need it to be speedy. The relative performance of each method can be completely altered by CPU architecture.

Numbers!

Method name         Ticks/Iteration     Scaled to the best
Array size 1x1x1 (10,000,000 iterations):
Jagged:             0.15                4.28
Single:             0.035               1
Multi-dimensional:  0.77                22

Array size 10x10x10 (25,000 iterations):
Jagged:             15                  1.67
Single:             9                   1
Multi-dimensional:  56                  6.2

Array size 25x25x25 (25,000 iterations):
Jagged:             157                 1.3
Single:             120                 1
Multi-dimensional:  667                 5.56

Array size 50x50x50 (10,000 iterations):
Jagged:             1,140               1
Single:             2,440               2.14
Multi-dimensional:  5,210               4.57

Array size 100x100x100 (10,000 iterations):
Jagged:             9,800               1
Single:             19,800              2
Multi-dimensional:  41,700              4.25

Array size 200x200x200 (1,000 iterations):
Jagged:             161,622             1
Single:             175,507             1.086
Multi-dimensional:  351,275             2.17

Array size 500x500x500 (100 iterations):
Jagged:             4,057.413           1.5
Single:             2,709,301           1
Multi-dimensional:  5,359,393           1.98

Don’t believe me? Check it out for yourself.

Note that while the constant size appears to give jagged arrays an advantage, it isn’t significant enough to affect the order in my tests. In some cases, using size from user input resulted in a 7% drop in performance for jagged arrays, no change for single arrays, and a very minor difference (1% or less) for multi-dimensional arrays. It’s most noticeable in the middle, where jagged arrays dominate.

    using System.Diagnostics;

const string Format = "{0,7:0.000} ";
const int TotalPasses = 25000;
const int Size = 50;
Stopwatch timer = new();

var functionList = new List<Action> { Jagged, Single, SingleStandard, Multi };

Console.WriteLine("{0,5}{1,20}{2,20}{3,20}{4,20}", "Run", "Ticks", "ms", "Ticks/Instance", "ms/Instance");

foreach (var item in functionList)
{
    var warmup = Test(item);
    var run = Test(item);

    Console.WriteLine($"{item.Method.Name}:");
    PrintResult("warmup", warmup);
    PrintResult("run", run);
    Console.WriteLine();
}

static void PrintResult(string name, long ticks)
{
    Console.WriteLine("{0,10}{1,20}{2,20}{3,20}{4,20}", name, ticks, string.Format(Format, (decimal)ticks / TimeSpan.TicksPerMillisecond), (decimal)ticks / TotalPasses, (decimal)ticks / TotalPasses / TimeSpan.TicksPerMillisecond);
}

long Test(Action func)
{
    timer.Restart();
    func();
    timer.Stop();
    return timer.ElapsedTicks;
}

static void Jagged()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var jagged = new int[Size][][];
        for (var i = 0; i < Size; i++)
        {
            jagged[i] = new int[Size][];
            for (var j = 0; j < Size; j++)
            {
                jagged[i][j] = new int[Size];
                for (var k = 0; k < Size; k++)
                {
                    jagged[i][j][k] = i * j * k;
                }
            }
        }
    }
}

static void Multi()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var multi = new int[Size, Size, Size];
        for (var i = 0; i < Size; i++)
        {
            for (var j = 0; j < Size; j++)
            {
                for (var k = 0; k < Size; k++)
                {
                    multi[i, j, k] = i * j * k;
                }
            }
        }
    }
}

static void Single()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var single = new int[Size * Size * Size];
        for (var i = 0; i < Size; i++)
        {
            int iOffset = i * Size * Size;
            for (var j = 0; j < Size; j++)
            {
                var jOffset = iOffset + j * Size;
                for (var k = 0; k < Size; k++)
                {
                    single[jOffset + k] = i * j * k;
                }
            }
        }
    }
}

static void SingleStandard()
{
    for (var passes = 0; passes < TotalPasses; passes++)
    {
        var single = new int[Size * Size * Size];
        for (var i = 0; i < Size; i++)
        {
            for (var j = 0; j < Size; j++)
            {
                for (var k = 0; k < Size; k++)
                {
                    single[i * Size * Size + j * Size + k] = i * j * k;
                }
            }
        }
    }
}

Lesson learned: Include the CPU in all benchmarks since it matters. Is it true this time? I’m not sure, but I believe it did.

Original answer:

I’d like to update this because multi-dimensional arrays are faster in.NET Core than jagged arrays. On.NET Core 2.0 preview 2, I ran John Leidegren’s tests and got the following results. I adjusted the dimension value to reduce the visibility of any background app affects.

Debug (code optimalization disabled)
Running jagged 
187.232 200.585 219.927 227.765 225.334 222.745 224.036 222.396 219.912 222.737 

Running multi-dimensional  
130.732 151.398 131.763 129.740 129.572 159.948 145.464 131.930 133.117 129.342 

Running single-dimensional  
 91.153 145.657 111.974  96.436 100.015  97.640  94.581 139.658 108.326  92.931 


Release (code optimalization enabled)
Running jagged 
108.503 95.409 128.187 121.877 119.295 118.201 102.321 116.393 125.499 116.459 

Running multi-dimensional 
 62.292  60.627  60.611  60.883  61.167  60.923  62.083  60.932  61.444  62.974 

Running single-dimensional 
 34.974  33.901  34.088  34.659  34.064  34.735  34.919  34.694  35.006  34.796 

This is what I discovered when I looked into disassemblies.

executing jagged[i][j][k] = I * j * k; required 34 instructions

multi[i, j, k] = I * j * k; execution required 11 instructions

I * j * k; single[i * dim * dim + j * dim + k] = I * j * k; There are 23 instructions to follow.

I couldn’t figure out why single-dimensional arrays were still faster than multi-dimensional arrays, but I’m guessing it has something to do with CPU optimization.

Answered by adsamcik

Solution #5

Preface: This comment is intended to respond to okutane’s response, but I’m unable to submit it because of SO’s illogical reputation system.

Because of the method calls, you claim that one is slower than the other. This is incorrect. Because of more intricate bounds-checking techniques, one is slower than the other. You can readily check this by looking at the built assembly rather than the IL. Accessing an element (through pointer in edx) contained in a two-dimensional array pointed to by ecx with indices stored in eax and edx, for example, looks like this on my 4.5 install:

sub eax,[ecx+10]
cmp eax,[ecx+08]
jae oops //jump to throw out of bounds exception
sub edx,[ecx+14]
cmp edx,[ecx+0C]
jae oops //jump to throw out of bounds exception
imul eax,[ecx+0C]
add eax,edx
lea edx,[ecx+eax*4+18]

There is no overhead from method calls in this case. Due to the potential of non-zero indexes, which is a feature not available with jagged arrays, bounds testing is quite complicated. The code very much resolves to (x*y max+y)*sizeof(ptr)+sizeof(array header) if we eliminate the sub,cmp, and jmps for the non-zero situations. This calculation is approximately as fast as anything else for random access to an element (one multiply could be substituted with a shift, because that’s why we chose bytes to be powers of two bits in the first place).

Another complication is that there are plenty of cases where a modern compiler will optimize away the nested bounds-checking for element access while iterating over a single-dimension array. As a result, the code simply advances an index pointer over the array’s contiguous memory. Naive iteration over multi-dimensional arrays generally involves an extra layer of nested logic, so a compiler is less likely to optimize the operation. So, even though the bounds-checking overhead of accessing a single element amortizes out to constant runtime with respect to array dimensions and sizes, a simple test-case to measure the difference may take many times longer to execute.

Answered by Eglin

Post is based on https://stackoverflow.com/questions/597720/what-are-the-differences-between-a-multidimensional-array-and-an-array-of-arrays