STK++ 0.9.13
Arrays Tutorial 1 : Introduction to STK++ Arrays and their usages

The containers/arrays you use in order to store and process the data in your application greatly influence the speed and the memory usage of your application.

STK++ offers a large choice of containers/arrays, together with methods that you can use in conjunction with them.

There are mainly two kinds of arrays you can use with STK++:

  • the Array2D family of classes, which are the classes defined in the oldest versions of STK++,
  • the CArray family of classes, which were introduced in version 0.4 of the STK++ library.

Before explaining the usage and differences between the different arrays, we first introduce some vocabulary. The terminology used in STK++ project for the arrays is the following:

  • An array is often called a matrix.
  • When a matrix has 1 column, it is called a column-vector, often abbreviated to vector,
  • when a matrix has 1 row, it is called a row-vector, often abbreviated to point.

The word point is borrowed from the statistical vocabulary where a row of a data array is often named a point.

The Array2D classes are very flexible if you need to quickly add, insert, remove or resize rows or columns in your container. On the other hand, the storage scheme of the CArray classes allows you to use them easily with other linear algebra libraries (e.g. Lapack, Blas, ...).
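To make the interoperability argument concrete, here is a minimal sketch of the kind of layout Fortran-style Blas/Lapack routines expect: one contiguous, column-major buffer plus a leading dimension. ColMajorMatrix is a hypothetical helper written for this illustration; it is not an STK++ class and does not claim to reproduce how CArray stores its data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of STK++): a dense matrix stored in a single
// contiguous column-major buffer.
struct ColMajorMatrix
{
  std::size_t rows, cols;
  std::vector<double> data;   // element (i,j) is stored at data[i + j*rows]

  ColMajorMatrix(std::size_t r, std::size_t c, double v = 0.)
    : rows(r), cols(c), data(r * c, v) {}

  // row index first, column index second
  double& operator()(std::size_t i, std::size_t j)
  {
    assert(i < rows && j < cols);
    return data[i + j * rows];
  }
  // raw pointer and leading dimension, the two things a Blas-style
  // routine typically needs in order to work on the buffer in place
  double* ptr() { return data.data(); }
  std::size_t ld() const { return rows; }
};
```

Because the whole matrix lives in one buffer, it can be handed to an external routine without copying. A container made of independently allocated rows or columns would first have to be packed into such a buffer.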

Introductory Example

Let us give you an introductory example:

Example:
#include "STKpp.h"
using namespace STK;
int main(int argc, char** argv)
{
  // create a matrix of Real of dynamic size (3,4)
  Array2D<Real> a(3, 4); a << 1.,2.,3.,4.
                            , 1.,2.,3.,4.
                            , 1.,1.,1.,1.;
  std::cout << "a=\n" << a;
  // create a vector of Real of dynamic size 3 with all coefficients equal to 0
  Array2DVector<Real> b(3, 0);
  b[2] = 1.;
  std::cout << "b=\n" << b;
  // create an uninitialized CArray of Real of fixed size (3,4)
  CArray<Real, 3, 4> c;
  // create a CArray of Real of dynamic size (3,4) with initial value -2.
  CArray<Real> d(3, 4, -2.);
  // compute c = -a - d + 3
  c= -a - d + 3.; c(2,2) = 5.;
  std::cout << "c=\n" << c;
  // create an uninitialized CVector of Real of fixed size 3
  CArrayVector<Real, 3> e;
  e = -2.; e[1] = 5.;
  std::cout << "e=\n" << e;
  // create an uninitialized CPoint of Real of fixed size 3
  CArrayPoint<Real, 3> f;
  f = -2.; f[1] = 5.;
  std::cout << "f= " << f;
  return 0;
}
Output:
a=
1 2 3 4
1 2 3 4
1 1 1 1
b=
0
0
1
c=
4 3 2 1
4 3 2 1
4 4 5 4
e=
-2
 5
-2
f= -2 5 -2

Accessors

The primary coefficient accessors and mutators in STK++ are the overloaded parenthesis operators. For matrices, the row index is always passed first.

The operator[] is also overloaded for index-based access in vectors, but keep in mind that C++ doesn't allow operator[] to take more than one argument. The operator[] is thus restricted to vectors/points/diagonal matrices. For vectors, just pass one index in a bracket.

Indexing starts at 0 by default. This behavior can be modified by defining the STKBASEARRAYS macro at compile time with the directive -DSTKBASEARRAYS=1. Enabling this macro gives 1-based arrays, as in FORTRAN. If you want to write code that is independent of the first index, use the beginCols(), beginRows(), endCols(), lastIdxCols(), endRows() and lastIdxRows() methods of the arrays, and the begin(), end() and lastIdx() methods of the row-vectors, column-vectors, square matrices and diagonal matrices.

For example

ArrayXX t(5, 5); // array of size 5x5
for (int i=t.beginRows(); i<t.endRows(); i++)
{
  PointX r(t.row(i), true); // create a reference on the i-th row of t
  // fill the i-th row of t with the number i
  // for (int j=r.begin(); j<r.end(); j++) { r[j] = i;}
  r= i;
}
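The benefit of looping from begin() to end() can be shown without STK++ at all. The sketch below uses a hypothetical BasedVector class whose first index is chosen at construction; it only mimics the begin()/end()/lastIdx() convention described above and is not the STK++ implementation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a vector whose first index is configurable.
class BasedVector
{
  public:
    BasedVector(int first, int size, double v = 0.)
      : first_(first), data_(size, v) {}
    int begin() const { return first_; }                                   // first valid index
    int end() const { return first_ + static_cast<int>(data_.size()); }   // one past the last index
    int lastIdx() const { return end() - 1; }                              // last valid index
    double& operator[](int i)
    {
      assert(i >= begin() && i < end());
      return data_[i - first_];
    }
  private:
    int first_;
    std::vector<double> data_;
};

// Sums the coefficients without assuming that the first index is 0 or 1.
inline double sum(BasedVector& v)
{
  double s = 0.;
  for (int i = v.begin(); i < v.end(); ++i) { s += v[i]; }
  return s;
}
```

The same sum function gives the same result for BasedVector(0, 3, 2.) and BasedVector(1, 3, 2.). A loop hard-coded as `for (int i=0; i<3; i++)` would be correct only for the first of the two.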

Expression template

Assume that c, a and d are arrays of the same size, and consider the line of code

c= -a - d + 3.;

This is an expression that involves matrix operations. All such expressions are encoded in an expression template and are completely inlined at compile time. This means that no temporary arrays are created when these expressions are evaluated.
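The general technique can be sketched in a few lines of standalone C++. This is an illustration of expression templates in general, not the STK++ implementation: the names Expr, Neg, Diff, AddScalar and Vec are all hypothetical. Each operator returns a lightweight node that only stores references to its operands, and the coefficients are computed in a single loop when the expression is assigned.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: lets the operators below accept any expression node.
template<class Derived>
struct Expr
{
  const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// Node for -e
template<class E>
struct Neg : Expr<Neg<E> >
{
  const E& e;
  explicit Neg(const E& e_) : e(e_) {}
  double operator[](std::size_t i) const { return -e[i]; }
  std::size_t size() const { return e.size(); }
};

// Node for l - r
template<class L, class R>
struct Diff : Expr<Diff<L, R> >
{
  const L& l; const R& r;
  Diff(const L& l_, const R& r_) : l(l_), r(r_) {}
  double operator[](std::size_t i) const { return l[i] - r[i]; }
  std::size_t size() const { return l.size(); }
};

// Node for e + s, with s a scalar
template<class E>
struct AddScalar : Expr<AddScalar<E> >
{
  const E& e; double s;
  AddScalar(const E& e_, double s_) : e(e_), s(s_) {}
  double operator[](std::size_t i) const { return e[i] + s; }
  std::size_t size() const { return e.size(); }
};

// The only type that owns data.
class Vec : public Expr<Vec>
{
  public:
    explicit Vec(std::size_t n, double v = 0.) : data_(n, v) {}
    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return data_.size(); }
    // The whole expression is evaluated here, coefficient by coefficient.
    template<class E>
    Vec& operator=(const Expr<E>& rhs)
    {
      const E& e = rhs.self();
      assert(e.size() == size());
      for (std::size_t i = 0; i < size(); ++i) { data_[i] = e[i]; }
      return *this;
    }
  private:
    std::vector<double> data_;
};

template<class E>
Neg<E> operator-(const Expr<E>& e) { return Neg<E>(e.self()); }

template<class L, class R>
Diff<L, R> operator-(const Expr<L>& l, const Expr<R>& r)
{ return Diff<L, R>(l.self(), r.self()); }

template<class E>
AddScalar<E> operator+(const Expr<E>& e, double s)
{ return AddScalar<E>(e.self(), s); }
```

With these definitions, `c = -a - d + 3.;` builds an object of type AddScalar<Diff<Neg<Vec>, Vec> > and then runs one loop over the coefficients; no intermediate Vec is allocated. The nodes hold references, so in this sketch an expression must be assigned in the same statement that builds it.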

The constructors

The Array2D family has only one mandatory template parameter: the type of the data that will be stored. The CArray family, on the other hand, has four template parameters:

  • the type of the data that will be stored,
  • the number of rows (UnknownSize if it is not known at compile time),
  • the number of columns (UnknownSize if it is not known at compile time),
  • the orientation storage scheme of the data (by row or by column).

Only the first parameter is mandatory.

Example:
#include "STKpp.h"
using namespace STK;
int main(int argc, char** argv)
{
  // Array2D constructors
  Array2DDiagonal<Real> a(3); a= 1.; // same as Array2DDiagonal<Real> a(3, 1.);
  Array2DSquare<Real> b(3); b = 0.; // same as Array2DSquare<Real> b(3, 0.);
  Array2DUpperTriangular<Real> c(3, 3); c= 2.; // same as Array2DUpperTriangular<Real> c(3, 3, 2.);
  Array2DLowerTriangular<Real> d(3, 3); d = -2.;// same as Array2DLowerTriangular<Real> d(3, 3, -2.);
  Array2DVector<Real> e(3); e= 5.; // same as Array2DVector<Real> e(3, 5.);
  Array2DPoint<Real> f(3); f = 6.; // same as Array2DPoint<Real> f(3, 6.);
  // CArray constructors
  CArraySquare<Real> g(3); g= 0.; // same as CArraySquare<Real> g(3, 0.);
  CArrayVector<Real> j(3); j= 5.; // same as CArrayVector<Real> j(3, 5.);
  CArrayPoint<Real> k(3); k = 6.; // same as CArrayPoint<Real> k(3, 6.);
  return 0;
}

Accessing rows/columns/parts of an array

You can easily access rows, columns and sub-parts of STK++ arrays. Here is an example:

Example:
#include "STKpp.h"
using namespace STK;
int main(int argc, char** argv)
{
  VectorX b(3, 0);
  std::cout << "b=\n" << b;
  b.sub(Range(2)) = 1.;
  std::cout << "b=\n" << b << "\n";
  ArrayXX a(3, 4); a << 1.,2.,3.,4.
                      , 1.,2.,3.,4.
                      , 1.,1.,1.,1.;
  std::cout << "a=\n" << a;
  a.col(1) = b;
  a.row(1, Range(2,2)) = 0.;
  a.sub(Range(1,2), Range(2,2)) = 9.;
  std::cout << "a=\n" << a;
  return 0;
}
Output:
b=
0
0
0
b=
1
1
0

a=
1 2 3 4
1 2 3 4
1 1 1 1
a=
1 1 3 4
1 1 9 9
1 0 9 9

Applied to a vector/point/diagonal matrix, the method sub requires only one parameter: the Range of the data we want to access/mutate.

In the general case, you can use the following methods in order to access/mutate columns/rows/parts of an array:

// access to the ith row
a.row(i);
// access to the elements [first, first+size-1] of the ith row
a.row(i, Range(first, size));
// access to the jth column
a.col(j);
// access to the elements [first, first+size-1] of the jth column
a.col(j, Range(first, size));
// access to the sub-array formed by the ranges I,J.
a.sub(I, J);

For the Array2D classes (deriving from the IArray2D class), the operator () has been overloaded, and you can also access a row/column/sub-part of the array using

Range I(3,2), J(4,2);
// access to the row i, in the range J = 4:5
a(i,J);
// access to the column j in the Range I=3:4
a(I,j);
// access to a subarray in the range (3:4, 4:5)
a(I,J);
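The (first, size) convention used by Range in the snippets above is easy to get wrong when coming from (first, last) conventions. The small sketch below spells it out; IdxRange is a hypothetical struct written for this illustration, not the STK++ Range class, and its one-argument constructor assumes the default 0-based indexing.

```cpp
#include <cassert>

// Hypothetical stand-in for the (first, size) convention.
struct IdxRange
{
  int first, size;
  // the first `size_` indexes, assuming indexing starts at 0
  explicit IdxRange(int size_) : first(0), size(size_) { assert(size_ >= 0); }
  // `size_` consecutive indexes starting at `first_`
  IdxRange(int first_, int size_) : first(first_), size(size_) { assert(size_ >= 0); }
  int begin() const { return first; }              // first index of the range
  int end() const { return first + size; }         // one past the last index
  int lastIdx() const { return first + size - 1; } // last index of the range
  bool contains(int i) const { return i >= begin() && i < end(); }
};
```

With this convention IdxRange(3, 2) covers the indexes 3 and 4, which matches the comment "J = 4:5" for a range built as (4, 2).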

Using references and move

In some cases, you may want to keep access to some part of an array in order to work on it. For this purpose, it is possible to create a reference array, that is, an array that wraps (part of) another array.

Example:
#include "STKpp.h"
using namespace STK;
int main(int argc, char** argv)
{
  ArraySquareX a(4); a << 1.,2.,3.,4.
                        , 1.,2.,3.,4.
                        , 1.,1.,1.,1.
                        , 1.,1.,1.,1.;
  // create a reference
  ArrayXX b(a.sub(Range(2), Range(3)), true);
  b = -1.;
  std::cout << "Modified a=\n" << a << "\n";
  // m takes ownership of the result returned by Stat::mean
  PointX m;
  m.move(Stat::mean(a));
  std::cout << "m= " << m;
  return 0;
}
Output:
Modified a=
-1 -1 -1 4
-1 -1 -1 4
 1  1  1 1
 1  1  1 1

m= 0 0 0 2.5

The Stat::mean function returns an Array2D by value. In order to avoid a useless copy, we use the move function. The following piece of code

m.move(b);

performs the following operations:

  • if m contains data, its memory is released,
  • m becomes the owner of the data contained by b,
  • b becomes a reference.
Note
If b was a reference, then m is also a reference.
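The three steps and the note can be sketched with a small standalone class. Buffer is hypothetical and much simpler than the STK++ containers: it only tracks a pointer, a size and a flag telling whether the object owns its memory or is a reference on memory owned elsewhere.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical buffer: either the owner of its memory or a reference on it.
class Buffer
{
  public:
    Buffer() : p_(nullptr), n_(0), ref_(false) {}
    explicit Buffer(std::size_t n, double v = 0.)
      : p_(new double[n]), n_(n), ref_(false)
    { for (std::size_t i = 0; i < n; ++i) { p_[i] = v; } }
    // copying is disabled to keep the ownership rules of the sketch simple
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    ~Buffer() { release(); }

    bool isRef() const { return ref_; }
    std::size_t size() const { return n_; }
    double& operator[](std::size_t i) { assert(i < n_); return p_[i]; }

    // Take over the data of src; src keeps access to the data but no longer owns it.
    void move(Buffer& src)
    {
      if (this == &src) return;
      release();           // 1. if we contain data, the memory is released
      p_ = src.p_;         // 2. we point to the data of src...
      n_ = src.n_;
      ref_ = src.ref_;     //    ...and own it, unless src was itself a reference
      src.ref_ = true;     // 3. src becomes a reference
    }
  private:
    // only an owner frees the memory
    void release() { if (!ref_) { delete[] p_; } p_ = nullptr; n_ = 0; }
    double* p_;
    std::size_t n_;
    bool ref_;
};
```

After `m.move(b);` both objects see the same coefficients, but only m frees the memory. As with any reference, b must not be used once the owner has been destroyed.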

Many more examples can be found in the test files.