Delphi: How to use Dynamic Arrays

Articles
Members Online:
-Article/Tip Search
-News Group
Member Area
-Top 10 NEW!!
-Forums Upgraded!!
-Indexes NEW!!
Employment
Contacts
Embarcadero
Embarcadero Community
JEDI
Links
How to use Dynamic Arrays
17-Jul-03
Category
Algorithm
Language
Delphi 4.x
Views
200
User Rating
No Votes
# Votes
Replies
Publisher:
DSP, Administrator
Reference URL:
DKB
			Author: Aleksandr Gofen

Dynamic Arrays overview 

Answer:

The Long and Winding Way of the Dynamic Array

Dynamic arrays were introduced to Object Pascal in Delphi 4. It was, however, not 
the first attempt of the Pascal/Delphi team to evolve the static array of Wirth's 
original Pascal.

Before going any further, let's first clarify some terminology. The terms "static" 
and "dynamic" are now applied at least four ways:

For arrays whose boundaries may vary (dynamic), versus arrays with constant 
boundaries (static). 
For the methods of assigning memory to the variables: Either their relative 
addresses are known at compile time (static), or the addresses are assigned by the 
system at run time (dynamic). Correspondingly, there are two methods of memory 
allocation: on the stack (static), or on the heap (dynamic). 
There are two methods referring to variables: either directly by their names 
(static), with one-to-one correspondence between a variable and its instance in 
memory; or indirectly via pointers (dynamic) where a variable and its instance are 
not the same. 
Methods in the class declaration may be either static or virtual/dynamic. 

This article follows the evolution of the "dynamic" concept with regard to arrays 
only. We will analyze its particularities for all array types appearing in 
Borland's Object Pascal. In addition to the standard arrays (static), there are now 
three more types in the family of arrays: open, variant, and dynamic. Why so many? 
Although it's beyond the scope of this article, we should also add to this list all 
the different types of strings in Delphi 4, which are a family of special character 
arrays whose diversity is even larger.

The Origin

The idea of dynamic arrays has a long history, beginning as early as ALGOL-60. In 
ALGOL-60, the syntax of array type looks similar to that of Pascal. For example, a 
declaration:

Real array A[M1: N1, M2: N2, M3: N3]

defines a 3D array of Real numbers. Only constants (integer numbers) are allowed as 
the array boundaries in the outermost block - just as in Pascal; in this case, the 
array is called static. But in the inner blocks of ALGOL-60, unlike Pascal, the 
boundaries may be variables (see Figure 1).

1   begin
2     Real array A[1: 100, 0..10]; { Static array in ALGOL-60. }
3     Integer M1, N1, M2, N2; { Variables. }
4     { M1, N1, M2, N2 have to be defined. }
5     ...
6     begin
7       Real array B[M1: N1, M2: N2]; { Dynamic array in ALGOL-60. }
8       Real array C[1: 3, 1: 4]; { "Static" array in ALGOL-60. }
9       ...
10    end;
11    ...
12  end

Figure 1: Variable boundaries in the inner blocks of ALGOL-60.

In the inner blocks, the array boundaries could be any arithmetic expression with 
the only requirement that numeric values were assigned to all variables in the 
boundaries before entering the block. Therefore, beginning with ALGOL-60, the 
concept of a dynamic array means that its boundaries may vary during run time, 
while for static arrays only explicit numbers are allowed in the boundary 
expressions.

For arrays that are local variables (like arrays B and C in Figure 1), the memory 
was allocated when entering the block and released when leaving it. Because 
declarations of the variables local in a block could appear only in the beginning 
of that block, no complications with redefining dynamic arrays occurred, and 
obviously such arrays could not be kept till the next entrance in this block. 
Nevertheless, in ALGOL-60, one could use the specifier own for any local variable 
including dynamic arrays, which meant that, although the visibility of the 
concerned variables obeyed the rule of scope, their values were kept available 
after re-entering the block. According to the Revised Report on ALGOL-60, this 
persistency in case of dynamic arrays ought to be followed also for the subset of 
indexes that are valid for the current and previous versions of the own array, 
although specific compilers could limit or simplify that behavior.

Note: Regarding compile-time versus run-time assignment of memory to the variables, 
it was typically run-time assignment in ALGOL-60 versus compile time in Pascal, 
although both use the stack.

The Long and Winding Road

In Wirth's Pascal, programmers could derive an unlimited number of new types from 
the basic types, but, for some reasons, the basic array type strictly required only 
constant boundaries and, therefore, was allowed to deal only with static arrays. 
Wirth's motivation probably was to have very fast and efficient one-pass compilers, 
for which all but indirect variables (i.e. except instances of pointer types) are 
compile-time variables providing the most efficient access. As a drawback, there 
was no way to overcome the static nature of the arrays, e.g. to develop procedures 
that deal with vectors, matrixes, and other structures of arbitrary sizes. We had 
to hard-code all array sizes as the maximum possible numeric values in the const 
section at least once.

For one-dimensional arrays, with a constant low boundary Low1 and variable boundary 
High1, we could resort to a trick involving indirect variables (pointers):

13  type
14    TMaxArray = array[Low1..MaxInteger] of AnyType;
15    PArray = ^TMaxArray;


and later allocate memory to the instance of PArray according to the actual value 
of High1:
16  
17  GetMem(PArray, (High1 - Low1 + 1) * SizeOf(AnyType));


Then, we can use index expressions like PArray^[k]; we can omit ^ because of the 
undocumented syntax feature of Delphi. Unfortunately, for multi-dimensional arrays, 
this approach with indirect variables doesn't work as simply with just one pointer. 
Another drawback is that we have to deal with pointers instead of direct variables 
(as we are responsible for allocating and de-allocating memory and other possible 
confusions connected with indirect access).

Open Arrays

The open array, introduced in Delphi 1, was the first extension of Pascal's concept 
of static arrays, but it wasn't really a type like the others that could be used to 
declare variables. Instead, it was an intrinsic type, applicable only to formal 
parameters in procedures and functions. If a formal parameter looks like this:

FormalArr: array of TSomeType;

then the actual parameter may be either a one-dimensional array of type TSomeType, 
just one variable, or the so-called open array constructor [Expr1, Expr2, ..., 
ExprN] - all of type TSomeType. The latter is a nice feature not available in 
ALGOL-60, otherwise the behavior of the open array as a formal parameter suffers 
two serious drawbacks. First, only one-dimensional arrays as actual parameters are 
allowed. Second, whatever the low and high boundaries of the actual array are, 
they're always mapped to the zero-based formal open array. The same is true for the 
open array constructor. The expressions are numbered starting with 0, which looks a 
little confusing. This zero-based indexing of the open arrays isn't consistent with 
the more convenient Pascal array boundaries of low..high type, but, for some 
reason, Borland still adheres to this principle.

The open arrays enabled us to overcome a serious limitation of static arrays, and 
allowed us to deal comfortably with one-dimensional arrays. Thus, procedures for 
simple vectors of any size like scalar product, average, maximum, minimum, etc. 
were no longer a problem.

The variant open array formal parameter looks like:

FormalVarArr: array of const ;

and is intended only to transfer the open array constructors containing expressions 
of different types as an actual parameter - an extension of the similar feature for 
open arrays and the predecessor of the more general idea of the type variant.

Variant Arrays

The Variant type, introduced by Delphi 2, was a very powerful extension of Pascal 
intended for different purposes. We're going to discuss it here only with regard to 
the concept of dynamic arrays. A variable declared as Variant may represent a 
multi-dimensional dynamic array, but you need a special non-Pascal statement to 
specify the dimensions and the type of elements:

18  var
19    vArr: Variant;
20    {...}
21  begin
22    {...}
23    vArr := VarArrayCreate([Low1, High1, ..., LowN, HighN],
24      ElementType);


where the boundaries may be variables and ElementType belongs to the fixed list of 
basic Pascal types denoted by the identifiers of the format varXXXX, for example 
varInteger, varDouble. After that we can consider vArr as an N-dimensional index 
variable vArr[i1,i2,...,iN]. This implementation is the closest to the notion of 
dynamic arrays as it appeared in ALGOL-60; it really made possible the 
multi-dimensional rectangular arrays with the variable boundaries of low..high 
type. Unfortunately, access to elements of variant arrays is at least 10 times 
slower than to static arrays; a simple benchmark that transposes a big matrix (e.g. 
A[i,j]:=A[j,i], i=1,...,N; j = 1,...,i-1) demonstrates it well. Also, in terms of 
memory consumption, any variant variable requires a 16-byte overhead. Although it 
doesn't seem too much if one variable represents a big array, it's something to 
keep in mind in case of many non-array variants. 

The re-dimensioning of variant arrays is possible within the same block, but only 
for the last (right-most) dimension; the special function, varArrayRedim, does the 
job.

As to the efficiency of access to the elements, it may be improved to the level as 
fast as that of static arrays via the special procedure varArrayLock. It returns a 
pointer that is assignment compatible with pointers to any static array, but 
meaningful only if that static type corresponds exactly to the dimensions specified 
in the varArrayCreate. For example, for vArr, the corresponding static array type 
must be:

TvArrStat = array[LoN..HiN, ..., Lo1..Hi1] of ElementType;

with the dimensions specified in the order inverse to that in the varArrayCreate 
(why?!) and all LoN, HiN, ... Lo1, Hi1 being constants equal to the current values 
of the corresponding variable boundaries. Then, providing the declaration:

25  var
26    vArrLock: ^TvArrStat;


the statement:

vArrLock := varArrayLock(vArr)

allows us to use the index variable vArrLock^[iN,...,i1] (or simply 
vArrLock[iN,...,i1]) with the access speed as quick as static arrays. We increased 
speed, but, to deal with variable boundaries, we must explicitly declare as many 
different static array types as we are going to have in run time. For example, we 
may need to prepare in advance several type declarations:

27  type
28    T200x200 = array[1..200; 1..200] of Real;
29    T150x150 = array[1..150; 1..150] of Real;
30    {...}


and then specify the variable dimensions in the varArrayCreate according to one of 
these types - not a very convenient technique.

The interesting feature of variant arrays is that the ElementType may be variant, 
too:
31  
32  vArr := VarArrayCreate([Low1, High1, varVariant)


In particular, it means that individual elements vArr[k] may be defined as a 
variant array again with any number of dimensions of any size:

vArr[k] := VarArrayCreate([kLow, kHigh, varDouble)

This creates an illusion as though we can treat vArr as a two-dimensional, 
non-rectangular array. (The examples of non-rectangular arrays of two dimensions 
are triangle matrixes, or matrixes with just a few stripes. For three dimensions, 
it may be an integer grid of points inside a pyramid.) Unfortunately, the variant 
array of variant arrays doesn't work like the similar construction of the standard 
Pascal arrays. Providing the declaration of vArr given previously, the code 
compiles for the index variable like vArr[i,j], but stops with a run-time error 
(because vArr is created as one-dimensional). Surprisingly, vArr[i][j] - that 
should be the synonym in Pascal - shows different behavior: It even produces a 
syntax error if it appears in the left side of the assignment statement; a := 
vArr[i][j] compiles and runs correctly, while vArr[i][j] := b doesn't, resulting in 
a syntax error.

So we see that although the variant array type allows some functionality of the 
ALGOL-60's dynamic arrays, the variant arrays are far more complex, slow, and not 
consistent with the Pascal's concept of arrays both in syntax and semantics.

Dynamic Arrays of Delphi 4

Finally, here is the latest attempt to implement the dynamic arrays (covered only 
on two pages of the Object Pascal Language Guide!). The declaration of 
one-dimensional dynamic arrays looks like this:

type
  TDynArray1 = array of baseType;

boundaries [...] must be omitted, where the baseType may be also a static array 
type or a dynamic array type again. This allows the declaration of the 
multi-dimensional "mixture" as well as "purely" dynamic arrays. For example, the 
declaration for three dimensions takes the following form:

type
  TDynArray3 = array of array of array of baseType;

This syntax allows us to declare a certain number of dimensions, but not their 
sizes (which require special consideration). Thus, if the baseType is of the 
non-array type, a variable:

var
  A, B: TDynArray3

may be used with up to three indexes, e.g. A[i,j,k]. Otherwise, if the baseType = 
array[1..100;1..200] of Double, this variable may appear with up to five indexes 
A[k1,k2,k3,k4,k5].

After a dynamic array variable is declared, it still cannot be used unless the 
special statement SetLength specifies the sizes of all dimensions and allocates the 
required memory. This shows the important difference between the static and dynamic 
arrays. The latter are - but only partially behave like - hidden pointers, i.e. a 
dynamic array variable is not strictly associated with its memory image, the 
instance, but rather separates from it.

Thus, the above-mentioned variable A (without indexes), or A[i], or A[i,j] (with 
number of indexes less than the declared number) are all hidden pointers. As such, 
at certain moments they may point nowhere, or more than one hidden pointer may 
point to the same instance. For example, after the assignment A := B, both A and B 
point to the same instance of B, so that any change to the elements of A affects B; 
this contradicts the usual meaning of the assignment statement. While the instance 
of A (if it exists) seems to be lost because it's pointed to by nothing, it doesn't 
cause a memory leak, which is prevented by the so-called reference count technique 
implemented for the dynamic arrays. For that reason, two consecutive statements - 
SetLength(A, ...) and SetLength(A, ...) - don't cause the loss of the piece of 
memory allocated in the first statement (leak) - unlike the similar situation, say, 
for classes. The sequence: 

33  X := TAnyClass.Create;
34  X := TAnyClass.Create;
35  
36  //is a mistake. Also, the assignment:
37  
38  A := nil

  
actually signals to the system that the memory (instance) must be freed, which is 
never the case for classes or pointers.

And even if:

A[i1, i2, i3] = B[i1, i2, i3]

for all indexes, it never means that the conditions:

A = B or A[i1] = B[i1] or A[i1, i2] = B[i1, i2]

are True, because these partially-indexed variables point to different locations.

In terms of persistency, while leaving and re-entering a block, the dynamic array 
variables behave like all other local (static) variables: leaving the block, the 
variables and their instances are freed automatically. For local variables of the 
types class and pointer it's wrong to leave the block without freeing all the 
instances of all such variables - the reason why such variables must be declared, 
for example, global. The user should nil a dynamic array only if it's important to 
free the memory before leaving the block. Hence, dynamic arrays are much safer than 
classes and pointers.

Thus, both the syntax and semantics of dynamic arrays differ from those of static 
arrays. Two system procedures, SetLength and Copy, previously intended to deal with 
strings, are applied now also for dynamic arrays. To define the sizes of dimensions 
- and the allowed index space - we must use the system procedure SetLength(A, 
Length1,...) with a non-fixed number of the integer parameters Length1,..., 
LengthN. At least one of them is always required to specify the size of the 
left-most dimension. If the number of dimensions is more than 1, the remaining 
sizes may be specified either in the same SetLength statement, or later in other 
such statements individually for each sub-array element. The former method defines 
rectangular arrays, similar to those known in ALGOL-60 or standard Pascal (but with 
mandatory zero low indexes), while the latter enables the so-called non-rectangular 
arrays.

For example, providing:

39  var
40    A: TDynArray3
41  
42  //the single statement:
43  
44  SetLength(A, N1, N2, N3)


defines the rectangular array with the index field [0..N1-1; 0..N2-1; 0..N3-1], 
while the statement:

SetLength(A, N1)

defines the size for the first dimension as N1 and correspondingly the index field 
for the first index as [0.. N1-1]. This postpones the definition of the 2 other 
dimensions for each A[k] individually. Figure 2 defines two types of triangle 
matrixes.

45  var
46    A, B: array of array of Double;
47    N, i: Integer;
48  begin
49    { Defining N. }
50    SetLength(A, N);
51    SetLength(B, N);
52    for i := 0 to N - 1 do
53    begin
54      { Lower-left triangle matrix; index field 0<=i<=N-1, 0<=j<=i }
55      SetLength(A[i], i + 1);
56      { Upper-left triangle matrix; index field 0<=i<=N-1, 0<=j<=N-i-1 }
57      SetLength(B[i], N - i);
58    end;
59    { ...}
60  end;

Figure 2: Two types of triangle matrixes.

Unfortunately, because of the limitation imposed by zero-based indexing, dynamic 
arrays don't allow us to define the lower- and upper-right triangle matrixes, 
matrixes with several diagonal stripes this way. Figure 3 shows some examples of 
three-dimensional dynamic arrays of a 3- and 4-lateral pyramid-type.

61  var
62    C, D: array of array of array of Double;
63    N, i, j: Integer;
64  begin
65    { Defining N }
66    SetLength(C, N);
67    SetLength(D, N);
68    for i := 0 to N - 1 do
69    begin
70      { 4-lateral pyramid; index field  0<=i<=N-1, 0<=j,k<=i }
71      SetLength(C[i], i + 1, i + 1);
72      { 3-lateral pyramid }
73      SetLength(D[i], i + 1);
74      for j := 0 to i do
75        { index field  0<=i<=N-1, 0<=j<=i, 0<=k <=j }
76        SetLength(D[i, j], j + 1)
77    end;
78    {...}
79  end;

Figure 3: 3D dynamic arrays of a 3- and 4-lateral pyramid type.

As to the speed of access to elements of dynamic arrays, it's almost as high as for 
static arrays, at least one- and two-dimensional ones, as the benchmark with matrix 
transposing proves. For static arrays, the memory location of each element in a 
multi-dimensional array is known as soon as the index expression computes. Thus, 
for a static element, such as A[k1,k2,k3], the relative location may look like this:

N2N3 * k1 + N3 * k2 + k3

Instead, for dynamic arrays to access an element of K-dimensional array, the code 
must sequentially de-reference K pointers to the respective one-dimensional 
sub-arrays. Both approaches seem compatible.

Language Barrier

The evolution of dynamic arrays in Borland Pascal/Delphi wasn't straightforward. 
With regard to the functionality, the multi-dimensional rectangular variant arrays 
are the closest to the concept of dynamic arrays as they first appeared in 
ALGOL-60, but variant arrays are 10 times slower than static ones, and they differ 
in syntax. In addition, for the concept of a variant array of variant arrays, both 
syntax and semantics remain not quite clear.

The dynamic arrays in Delphi 4 exceed the arrays of ALGOL-60, at least in that they 
can be non-rectangular and still be as fast as static arrays in Pascal. 
Unfortunately, this notion reveals several language inconsistencies:

Why the mandatory zero-based indexing when the static and variant arrays don't 
require it? The low..high indexing in standard Pascal is an important feature, and 
can be very helpful in many applications. Also, the most natural and safe method of 
numbering in structures like vectors and matrixes is to number the elements in a 
vector corresponding to the index field: 1..N, not 0..N-1. 
Why the special syntax in the declaration that omits the boundaries [ ]? As a 
result, the separate statement, SetLength, is later always required. True, 
SetLength allows to re-define the dimensions several times within a block, if it's 
really needed, but for the more typical case when the dimensions and the sizes are 
declared once at the beginning of the block, the standard form array[low..high] is 
better, because it is consistent with the syntax. The compiler knows by itself if 
the boundary expression is constant or variable, therefore it could implement the 
static or dynamic models according to the situation. 
The assignment statements with incomplete indexes for dynamic array variables such 
as A := B have a quite unusual meaning: Any change in an element A[k] affects also 
B[k] because A and B refer to the same instance. This behavior contradicts the 
standard meaning of the assignment statement and is dangerous, so that incomplete 
indexes in assignment statements for dynamic arrays shouldn't be allowed. 
The behavior of dynamic arrays as hidden pointers is better and safer than the 
behavior of classes (also hidden pointers) or pointers. Why not improve the 
behavior of classes and pointers to the level of dynamic arrays so that all 
indirect reference mechanisms in the language follow uniform rules? 
Logically, the Delphi type class must be simply an indirect reference version of 
the type object introduced in Turbo-Pascal 5.5. The only difference should be in 
the method of memory allocation: for the class on the heap, and for the object on 
the stack. This is safer and doesn't involve the dangerous separation of variables 
from their instances. Delphi 4 and all previous versions support the type object 
for backward compatibility, but there are still certain differences between syntax 
and semantics of both types. 

Conclusion

The fact that now there are as many as four different array types with quite 
different and inconsistent syntax and semantics in Borland Pascal/Delphi doesn't 
seem to be a good thing. Too many - and not always good - new features have been 
added to standard Pascal, which makes it cluttered, hectic, and less safe. It looks 
like the language doesn't evolve according to a well-developed fundamental plan; 
rather, it's trying to cope somehow with all different and inconsistent features of 
the very complex operating environment.

Delphi is still an unparalleled software development tool, but it's getting more 
and more complex, while its documentation and Help system lag behind. Even the 
Object Pascal Language Guide, the fundamental document of the language, is neither 
complete nor clear, or strict and formal enough as one should expect from a 
document of this type. This bears no resemblance to that high standard of 
documentation that Borland was proud of in the era of Borland Turbo Pascal. Back to 
the future?

I am very thankful to Dr Manfred Mackeben for his patience in reading and improving 
this text and for many valuable notes.
Vote: How useful do you find this Article/Tip?
Bad											Excellent
	1	2	3	4	5	6	7	8	9	10
Advertisement
Share this page
Advertisement