Problem
In C#, how can you tell the difference between multidimensional arrays double[,] and array-of-arrays double[][]?
If there is a distinction, what is the ideal application for each?
Asked by ecleel
Solution #1
Arrays of arrays (jagged arrays) are faster and more effective than multi-dimensional arrays. The syntax of multidimensional arrays is more aesthetically pleasing.
If you develop some simple code with jagged and multidimensional arrays, you’ll see that the storage and retrieval from jagged (or single dimensional) arrays are simple IL instructions, whereas the equivalent actions for multidimensional arrays are method invocations, which are always slower.
Consider the procedures below:
static void SetElementAt(int[][] array, int i, int j, int value)
{
array[i][j] = value;
}
static void SetElementAt(int[,] array, int i, int j, int value)
{
array[i, j] = value;
}
The following will be their IL:
.method private hidebysig static void SetElementAt(int32[][] 'array',
int32 i,
int32 j,
int32 'value') cil managed
{
// Code size 7 (0x7)
.maxstack 8
IL_0000: ldarg.0
IL_0001: ldarg.1
IL_0002: ldelem.ref
IL_0003: ldarg.2
IL_0004: ldarg.3
IL_0005: stelem.i4
IL_0006: ret
} // end of method Program::SetElementAt
.method private hidebysig static void SetElementAt(int32[0...,0...] 'array',
int32 i,
int32 j,
int32 'value') cil managed
{
// Code size 10 (0xa)
.maxstack 8
IL_0000: ldarg.0
IL_0001: ldarg.1
IL_0002: ldarg.2
IL_0003: ldarg.3
IL_0004: call instance void int32[0...,0...]::Set(int32,
int32,
int32)
IL_0009: ret
} // end of method Program::SetElementAt
You can simply execute operations like row swapping and row resizing while utilizing jagged arrays. Although multidimensional arrays may be safer in some instances, Microsoft FxCop recommends that you utilize jagged arrays instead of multidimensional arrays when analyzing your applications.
Answered by okutane
Solution #2
A multidimensional array provides a lovely linear memory architecture, whereas a jagged array adds numerous levels of indirection.
var jagged = new int[10][5] var jagged = new int[10][5] var jagged = new int[10][5] var jagged = new int[10][5] var jagged = new int[10][5] works such as these: Look up the element at index 3 (an array) and the element at position 6 in that array (which is a value). In this situation, there is an additional look up for each dimension (this is an expensive memory access pattern).
In memory, a multidimensional array is set out linearly, and the actual value is calculated by multiplying the indexes together. The Length property of a multidimensional array, given the array var mult = new int[10,30], returns the total number of items, i.e. 10 * 30 = 300.
A jagged array’s Rank property is always 1, whereas a multidimensional array’s Rank property can be any value. To get the length of each dimension, use the GetLength method of any array. In this example, the multidimensional array is called mult. GetLength(1) returns a value of 30.
The indexing of the multidimensional array is more efficient. For example, given the multidimensional array mult[1,7] = 30 * 1 + 7 = 37, find the entry at index 37. Because only one memory location is involved, which is the array’s base address, this is a superior memory access pattern.
As a result, a multidimensional array allocates a continuous memory block, whereas a jagged array, such as jagged[1,], does not have to be square. It is not necessary for length to match jaggedness[2]. The length of a multidimensional array, which is true for any multidimensional array.
Multidimensional arrays should be faster in terms of performance. They should be a lot faster, but they aren’t because of a terrible CLR implementation.
23.084 16.634 15.215 15.489 14.407 13.691 14.695 14.398 14.551 14.252
25.782 27.484 25.711 20.844 19.607 20.349 25.861 26.214 19.677 20.171
5.050 5.085 6.412 5.225 5.100 5.751 6.650 5.222 6.770 5.305
The first row exhibits jagged array timings, the second multidimensional arrays, and the third, well, that’s the way it should be. The program is shown below; please note that it was tested in mono. (The windows timings are drastically different, owing to differences in CLR implementation.)
On Windows, the jagged arrays’ timings are significantly better, roughly matching my own perception of what a multidimensional array lookup should be like (see ‘Single()’). Unfortunately, the Windows JIT-compiler is a jerk, which makes performance discussions tough because there are so many inconsistencies.
These are the timings I got on Windows, same deal, first row are jagged arrays, second multidimensional, and third my own multidimensional implementation, note how much slower this is on Windows than mono.
8.438 2.004 8.439 4.362 4.936 4.533 4.751 4.776 4.635 5.864
7.414 13.196 11.940 11.832 11.675 11.811 11.812 12.964 11.885 11.751
11.355 10.788 10.527 10.541 10.745 10.723 10.651 10.930 10.639 10.595
Source code:
using System;
using System.Diagnostics;
static class ArrayPref
{
const string Format = "{0,7:0.000} ";
static void Main()
{
Jagged();
Multi();
Single();
}
static void Jagged()
{
const int dim = 100;
for(var passes = 0; passes < 10; passes++)
{
var timer = new Stopwatch();
timer.Start();
var jagged = new int[dim][][];
for(var i = 0; i < dim; i++)
{
jagged[i] = new int[dim][];
for(var j = 0; j < dim; j++)
{
jagged[i][j] = new int[dim];
for(var k = 0; k < dim; k++)
{
jagged[i][j][k] = i * j * k;
}
}
}
timer.Stop();
Console.Write(Format,
(double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
}
Console.WriteLine();
}
static void Multi()
{
const int dim = 100;
for(var passes = 0; passes < 10; passes++)
{
var timer = new Stopwatch();
timer.Start();
var multi = new int[dim,dim,dim];
for(var i = 0; i < dim; i++)
{
for(var j = 0; j < dim; j++)
{
for(var k = 0; k < dim; k++)
{
multi[i,j,k] = i * j * k;
}
}
}
timer.Stop();
Console.Write(Format,
(double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
}
Console.WriteLine();
}
static void Single()
{
const int dim = 100;
for(var passes = 0; passes < 10; passes++)
{
var timer = new Stopwatch();
timer.Start();
var single = new int[dim*dim*dim];
for(var i = 0; i < dim; i++)
{
for(var j = 0; j < dim; j++)
{
for(var k = 0; k < dim; k++)
{
single[i*dim*dim+j*dim+k] = i * j * k;
}
}
}
timer.Stop();
Console.Write(Format,
(double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
}
Console.WriteLine();
}
}
Answered by John Leidegren
Solution #3
Simply speaking, a multidimensional array is analogous to a table in a database management system. The Array of Arrays (jagged array) allows you to have each element hold another variable-length array of the same type.
A multi-dimensional array can be used if the data structure resembles that of a table (fixed rows/columns). The elements of a jagged array are fixed, and each element can hold an array of variable length.
E.g. Psuedocode:
int[,] data = new int[2,2];
data[0,0] = 1;
data[0,1] = 2;
data[1,0] = 3;
data[1,1] = 4;
Consider the following as a 2×2 table:
int[][] jagged = new int[3][];
jagged[0] = new int[4] { 1, 2, 3, 4 };
jagged[1] = new int[2] { 11, 12 };
jagged[2] = new int[3] { 21, 22, 23 };
Consider the following with each row having a different number of columns:
Answered by shahkalpesh
Solution #4
Update .NET 6:
With the release of.NET 6, I thought it was appropriate to explore this subject. I recreated the test code in new.NET and ran it, ensuring that each component took at least a second to complete. The benchmark was performed on an AMD Ryzen 5600x processor.
Results? It’s a difficult situation. Single arrays appear to be the most performant for small and large arrays (25x25x25 & > 200x200x200), with Jagged arrays being the fastest in the middle. Unfortunately, multi-dimensional appears to be the slowest option based on my testing. At best, you’ll be half as fast as the fastest alternative. But! Because jagged arrays can take much longer to start than single-dimensional arrays, it depends on what you need the arrays for. On the 503 cube, the initialization took nearly three times as long as single-dimensional arrays. Multi-dimensional processing was just slightly slower than single-dimensional processing.
What’s the bottom line? Benchmark your code on the system it will execute on if you need it to be speedy. The relative performance of each method can be completely altered by CPU architecture.
Numbers!
Method name Ticks/Iteration Scaled to the best
Array size 1x1x1 (10,000,000 iterations):
Jagged: 0.15 4.28
Single: 0.035 1
Multi-dimensional: 0.77 22
Array size 10x10x10 (25,000 iterations):
Jagged: 15 1.67
Single: 9 1
Multi-dimensional: 56 6.2
Array size 25x25x25 (25,000 iterations):
Jagged: 157 1.3
Single: 120 1
Multi-dimensional: 667 5.56
Array size 50x50x50 (10,000 iterations):
Jagged: 1,140 1
Single: 2,440 2.14
Multi-dimensional: 5,210 4.57
Array size 100x100x100 (10,000 iterations):
Jagged: 9,800 1
Single: 19,800 2
Multi-dimensional: 41,700 4.25
Array size 200x200x200 (1,000 iterations):
Jagged: 161,622 1
Single: 175,507 1.086
Multi-dimensional: 351,275 2.17
Array size 500x500x500 (100 iterations):
Jagged: 4,057.413 1.5
Single: 2,709,301 1
Multi-dimensional: 5,359,393 1.98
Don’t believe me? Check it out for yourself.
Note that while the constant size appears to give jagged arrays an advantage, it isn’t significant enough to affect the order in my tests. In some cases, using size from user input resulted in a 7% drop in performance for jagged arrays, no change for single arrays, and a very minor difference (1% or less) for multi-dimensional arrays. It’s most noticeable in the middle, where jagged arrays dominate.
using System.Diagnostics;
const string Format = "{0,7:0.000} ";
const int TotalPasses = 25000;
const int Size = 50;
Stopwatch timer = new();
var functionList = new List<Action> { Jagged, Single, SingleStandard, Multi };
Console.WriteLine("{0,5}{1,20}{2,20}{3,20}{4,20}", "Run", "Ticks", "ms", "Ticks/Instance", "ms/Instance");
foreach (var item in functionList)
{
var warmup = Test(item);
var run = Test(item);
Console.WriteLine($"{item.Method.Name}:");
PrintResult("warmup", warmup);
PrintResult("run", run);
Console.WriteLine();
}
static void PrintResult(string name, long ticks)
{
Console.WriteLine("{0,10}{1,20}{2,20}{3,20}{4,20}", name, ticks, string.Format(Format, (decimal)ticks / TimeSpan.TicksPerMillisecond), (decimal)ticks / TotalPasses, (decimal)ticks / TotalPasses / TimeSpan.TicksPerMillisecond);
}
long Test(Action func)
{
timer.Restart();
func();
timer.Stop();
return timer.ElapsedTicks;
}
static void Jagged()
{
for (var passes = 0; passes < TotalPasses; passes++)
{
var jagged = new int[Size][][];
for (var i = 0; i < Size; i++)
{
jagged[i] = new int[Size][];
for (var j = 0; j < Size; j++)
{
jagged[i][j] = new int[Size];
for (var k = 0; k < Size; k++)
{
jagged[i][j][k] = i * j * k;
}
}
}
}
}
static void Multi()
{
for (var passes = 0; passes < TotalPasses; passes++)
{
var multi = new int[Size, Size, Size];
for (var i = 0; i < Size; i++)
{
for (var j = 0; j < Size; j++)
{
for (var k = 0; k < Size; k++)
{
multi[i, j, k] = i * j * k;
}
}
}
}
}
static void Single()
{
for (var passes = 0; passes < TotalPasses; passes++)
{
var single = new int[Size * Size * Size];
for (var i = 0; i < Size; i++)
{
int iOffset = i * Size * Size;
for (var j = 0; j < Size; j++)
{
var jOffset = iOffset + j * Size;
for (var k = 0; k < Size; k++)
{
single[jOffset + k] = i * j * k;
}
}
}
}
}
static void SingleStandard()
{
for (var passes = 0; passes < TotalPasses; passes++)
{
var single = new int[Size * Size * Size];
for (var i = 0; i < Size; i++)
{
for (var j = 0; j < Size; j++)
{
for (var k = 0; k < Size; k++)
{
single[i * Size * Size + j * Size + k] = i * j * k;
}
}
}
}
}
Lesson learned: Include the CPU in all benchmarks since it matters. Is it true this time? I’m not sure, but I believe it did.
Original answer:
I’d like to update this because multi-dimensional arrays are faster in.NET Core than jagged arrays. On.NET Core 2.0 preview 2, I ran John Leidegren’s tests and got the following results. I adjusted the dimension value to reduce the visibility of any background app affects.
Debug (code optimalization disabled)
Running jagged
187.232 200.585 219.927 227.765 225.334 222.745 224.036 222.396 219.912 222.737
Running multi-dimensional
130.732 151.398 131.763 129.740 129.572 159.948 145.464 131.930 133.117 129.342
Running single-dimensional
91.153 145.657 111.974 96.436 100.015 97.640 94.581 139.658 108.326 92.931
Release (code optimalization enabled)
Running jagged
108.503 95.409 128.187 121.877 119.295 118.201 102.321 116.393 125.499 116.459
Running multi-dimensional
62.292 60.627 60.611 60.883 61.167 60.923 62.083 60.932 61.444 62.974
Running single-dimensional
34.974 33.901 34.088 34.659 34.064 34.735 34.919 34.694 35.006 34.796
This is what I discovered when I looked into disassemblies.
executing jagged[i][j][k] = I * j * k; required 34 instructions
multi[i, j, k] = I * j * k; execution required 11 instructions
I * j * k; single[i * dim * dim + j * dim + k] = I * j * k; There are 23 instructions to follow.
I couldn’t figure out why single-dimensional arrays were still faster than multi-dimensional arrays, but I’m guessing it has something to do with CPU optimization.
Answered by adsamcik
Solution #5
Preface: This comment is intended to respond to okutane’s response, but I’m unable to submit it because of SO’s illogical reputation system.
Because of the method calls, you claim that one is slower than the other. This is incorrect. Because of more intricate bounds-checking techniques, one is slower than the other. You can readily check this by looking at the built assembly rather than the IL. Accessing an element (through pointer in edx) contained in a two-dimensional array pointed to by ecx with indices stored in eax and edx, for example, looks like this on my 4.5 install:
sub eax,[ecx+10]
cmp eax,[ecx+08]
jae oops //jump to throw out of bounds exception
sub edx,[ecx+14]
cmp edx,[ecx+0C]
jae oops //jump to throw out of bounds exception
imul eax,[ecx+0C]
add eax,edx
lea edx,[ecx+eax*4+18]
There is no overhead from method calls in this case. Due to the potential of non-zero indexes, which is a feature not available with jagged arrays, bounds testing is quite complicated. The code very much resolves to (x*y max+y)*sizeof(ptr)+sizeof(array header) if we eliminate the sub,cmp, and jmps for the non-zero situations. This calculation is approximately as fast as anything else for random access to an element (one multiply could be substituted with a shift, because that’s why we chose bytes to be powers of two bits in the first place).
Another complication is that there are plenty of cases where a modern compiler will optimize away the nested bounds-checking for element access while iterating over a single-dimension array. As a result, the code simply advances an index pointer over the array’s contiguous memory. Naive iteration over multi-dimensional arrays generally involves an extra layer of nested logic, so a compiler is less likely to optimize the operation. So, even though the bounds-checking overhead of accessing a single element amortizes out to constant runtime with respect to array dimensions and sizes, a simple test-case to measure the difference may take many times longer to execute.
Answered by Eglin
Post is based on https://stackoverflow.com/questions/597720/what-are-the-differences-between-a-multidimensional-array-and-an-array-of-arrays