Arrays and Indices

Arrays are fundamental data structures in algorithms and scientific programming. In Python, they are mainly handled with NumPy, which provides optimized support for multidimensional arrays (often called tensors).

This section explains how to interpret indices in 1D, 2D and 3D arrays and their display conventions, with mathematical and Python examples, as well as a few subtleties to keep in mind.

1D arrays: vectors and spaces

A one-dimensional array is called a vector. It can be written as a row or as a column. Each element is identified by a single index \((i)\) giving its position, counted from 0.

Mathematically, a row vector (\(V \in \mathbb{R}^{1 \times n}\)) is written as:

\[V = \begin{bmatrix} v_0 & v_1 & v_2 & \dots & v_{n-1} \end{bmatrix}\]

A column vector (\(V \in \mathbb{R}^{n \times 1}\)) is written as:

\[\begin{split}V = \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ \vdots \\ v_{n-1} \end{bmatrix}\end{split}\]

In Python with numpy:

import numpy as np

vecteur = np.array([1, 2, 3, 4])
print(vecteur.shape)  # (4,)
print(vecteur)        # Displayed as a row: [1 2 3 4]

vecteur_colonne = vecteur[:, np.newaxis]  # Reshape into a column
print(vecteur_colonne.shape)              # (4, 1)
print(vecteur_colonne)                    # Displayed as a column, one element per line

Important

Subtlety

In NumPy, an array with shape \((n,)\) has a single axis: it is neither a row nor a column. Unlike an explicit row vector with shape \((1, n)\) or column vector with shape \((n, 1)\), it does not always behave like a matrix. For example, transposing it leaves its shape unchanged.
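
The difference between shape \((n,)\) and the explicit row and column forms can be checked directly. The following is a minimal sketch; the names `v`, `row` and `col` are chosen for illustration only.

```python
import numpy as np

v = np.array([1, 2, 3, 4])   # shape (4,): a single axis
row = v[np.newaxis, :]       # shape (1, 4): explicit row vector
col = v[:, np.newaxis]       # shape (4, 1): explicit column vector

print(v.T.shape)            # (4,): transposing a 1D array changes nothing
print(row.T.shape)          # (4, 1): transposing a row gives a column
print((v @ v).shape)        # (): 1D @ 1D is a scalar dot product
print((col @ row).shape)    # (4, 4): column @ row is an outer product
print((row + col).shape)    # (4, 4): broadcasting expands both operands
```

When matrix semantics matter, adding the missing axis explicitly with `np.newaxis` (or `reshape`) avoids relying on these implicit rules.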

2D arrays: matrices

A 2D array is a matrix (\(M \in \mathbb{R}^{m \times n}\)), where each element is referenced by a pair of indices \((i, j)\), representing the row and the column, respectively.

Mathematically, a matrix \(M\) of size \(m \times n\) is written as:

\[\begin{split}M = \begin{bmatrix} m_{0,0} & m_{0,1} & \dots & m_{0,n-1} \\ m_{1,0} & m_{1,1} & \dots & m_{1,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ m_{m-1,0} & m_{m-1,1} & \dots & m_{m-1,n-1} \end{bmatrix}\end{split}\]

In Python with numpy:

import numpy as np

matrice = np.array([[1, 2, 3], [4, 5, 6]])
print(matrice)        # Display the matrix
print(matrice.shape)  # (2, 3) -> 2 rows, 3 columns

Important

Warning: coordinates (x, y) vs indices (i, j)

In computing:

  • Index i represents the row (the Y axis).

  • Index j represents the column (the X axis).

  • If we talk about Cartesian coordinates \((x, y)\), the order is reversed: \(x\) corresponds to columns, and \(y\) to rows.

  • Therefore, a point \((x, y)\) is located at index \((y, x)\) in an array.

Example:

import numpy as np

matrice = np.array([[1, 2, 3], [4, 5, 6]])
x, y = 2, 1             # Cartesian coordinates: column 2, row 1
valeur = matrice[y, x]  # Point (x, y) maps to NumPy index (y, x)
print(valeur)           # matrice[1, 2] -> 6
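
The same swap applies when writing values. As a minimal sketch, the helper `set_point` below is hypothetical (it is not part of NumPy) and simply hides the \((x, y)\) to \((y, x)\) conversion:

```python
import numpy as np

image = np.zeros((2, 3), dtype=int)  # 2 rows (height), 3 columns (width)
height, width = image.shape          # shape is (rows, columns), i.e. (y, x) order

def set_point(array, x, y, value):
    """Hypothetical helper: write `value` at Cartesian point (x, y)."""
    array[y, x] = value              # row index first, column index second

set_point(image, x=2, y=0, value=7)  # rightmost column, top row
print(image)
```

Note that `x` must stay below `width` and `y` below `height`; swapping them by mistake on a non-square array typically raises an `IndexError`.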

Memory order (C vs Fortran)

The way an array is stored in memory can impact performance:

  • C order (row-major): elements of a row are contiguous in memory (default in numpy).

  • Fortran order (column-major): elements of a column are contiguous in memory (default in Fortran, MATLAB, R, and Julia).

Check:

import numpy as np

A = np.array([[1, 2], [3, 4]], order='C')  # Row-major (C-contiguous)
B = np.array([[1, 2], [3, 4]], order='F')  # Column-major (F-contiguous)

print(f"A (Row-major):\n{A}")
print(f"Memory storage: {A.ravel(order='K')}")  # Elements in actual memory order
print(f"C-contiguous: {A.flags['C_CONTIGUOUS']}, F-contiguous: {A.flags['F_CONTIGUOUS']}")

print(f"B (Column-major):\n{B}")
print(f"Memory storage: {B.ravel(order='K')}")  # Elements in actual memory order
print(f"C-contiguous: {B.flags['C_CONTIGUOUS']}, F-contiguous: {B.flags['F_CONTIGUOUS']}")
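
Another way to see the layout is the `strides` attribute, which gives the number of bytes to skip when moving one step along each axis. This sketch fixes the dtype to `int64` (an assumption made here so that the byte counts are the same on every platform):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]], dtype=np.int64, order='C')
B = np.array([[1, 2], [3, 4]], dtype=np.int64, order='F')

# strides = bytes to step along (rows, columns); an int64 takes 8 bytes
print(A.strides)  # (16, 8): the next column is the closest neighbor
print(B.strides)  # (8, 16): the next row is the closest neighbor

# Same values, same indexing; only the memory layout differs
print(np.array_equal(A, B))  # True
```

Indexing with `[i, j]` is identical in both cases; the memory order only affects which traversal direction touches contiguous memory.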

3D arrays: tensors and interpretation

A 3D array adds a depth axis and represents a tensor (\(T \in \mathbb{R}^{p \times m \times n}\)), where each element is referenced by a triplet of indices \((i, j, k)\). It can be seen as a stack of \(p\) matrices of size \(m \times n\):

\[\begin{split}T = \begin{bmatrix} M_0 \\ M_1 \\ \vdots \\ M_{p-1} \end{bmatrix}\end{split}\]

or explicitly:

\[T[i, j, k] \quad \text{where } i, j, k \text{ are respectively the depth, row, and column indices}\]

In Python:

import numpy as np

tenseur = np.zeros((3, 4, 5))  # 3 planes, 4 rows, 5 columns
print(tenseur.shape)  # (3, 4, 5)
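
The "stack of matrices" view and the (depth, row, column) index order can be illustrated as follows. This is a sketch with made-up data: each plane is filled with its own depth index so that the results are easy to read.

```python
import numpy as np

# Three 4x5 matrices with distinct values (illustrative data)
planes = [np.full((4, 5), fill_value=i) for i in range(3)]
T = np.stack(planes, axis=0)   # new leading axis: depth

print(T.shape)         # (3, 4, 5)
print(T[1].shape)      # (4, 5): plane i = 1
print(T[1, 2].shape)   # (5,): row j = 2 of that plane
print(T[1, 2, 3])      # 1: a single element of plane 1
print(T[:, 2, 3])      # [0 1 2]: the same (row, column) across all planes
```

Each index supplied removes one axis, from left to right: depth first, then row, then column.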

Images and conventions

  1. RGB images (depth = 3)
    • An RGB image is often stored with shape \((height, width, 3)\), where the last dimension represents the Red, Green, Blue channels.

    import numpy as np
    
    image_rgb = np.random.randint(0, 256, (100, 200, 3), dtype=np.uint8)
    print(image_rgb.shape)  # (100, 200, 3)
    
  2. Multi-channel images (TIFF, hyperspectral)
    • Some images (for example multi-page TIFF files) follow the convention \((plane, Y, X)\), where:
      • Plane = index of the slice (e.g., one slice of a volumetric image)

      • Y = height (rows)

      • X = width (columns)

    import tifffile
    
    img_tiff = tifffile.imread("image.tiff")
    print(img_tiff.shape)  # e.g. (number of planes, height, width)
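
Since the two conventions place the channel or plane axis at opposite ends, it is sometimes necessary to convert between them. A minimal sketch using `np.moveaxis` on synthetic data (the names `image_hwc` and `image_chw` are illustrative):

```python
import numpy as np

image_hwc = np.zeros((100, 200, 3), dtype=np.uint8)  # (height, width, channel)
image_hwc[..., 0] = 255                              # fill the red channel

image_chw = np.moveaxis(image_hwc, -1, 0)   # (channel, height, width)
print(image_chw.shape)                      # (3, 100, 200)

red = image_chw[0]                          # one 2D plane per channel
print(red.shape, red.min())                 # (100, 200) 255
```

`np.moveaxis` returns a view where possible, so the conversion itself does not copy pixel data.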
    

Visualizing a tensor in perspective

To better understand a 3D array, one can write a matrix notation in perspective, simulating depth:

\[\begin{split}T = \begin{bmatrix} \begin{bmatrix} t_{0,0,0} & t_{0,0,1} & \dots & t_{0,0,n-1} \\ t_{0,1,0} & t_{0,1,1} & \dots & t_{0,1,n-1} \\ \end{bmatrix}, \quad \begin{bmatrix} t_{1,0,0} & t_{1,0,1} & \dots & t_{1,0,n-1} \\ t_{1,1,0} & t_{1,1,1} & \dots & t_{1,1,n-1} \\ \end{bmatrix}, \dots \end{bmatrix}\end{split}\]

This makes it possible to mentally visualize each matrix plane separately.
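
In code, the same plane-by-plane view can be obtained by iterating over the first axis. A short sketch on a small tensor with illustrative values:

```python
import numpy as np

T = np.arange(2 * 2 * 3).reshape(2, 2, 3)  # 2 planes, 2 rows, 3 columns

for i, plane in enumerate(T):   # iterating walks the first (depth) axis
    print(f"Plane {i}:")
    print(plane)
```

Printing `T` directly gives a similar result: NumPy separates the planes with a blank line.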

Conclusion

Arrays are powerful structures, but it is essential to clearly understand index ordering depending on the context (mathematics, NumPy, images, etc.).