Shapes – an introduction

Photo by Pietro Jeng from Pexels

We’ve all had TensorFlow complain about shapes at some point when starting out. In this first post, I’ll make shapes more intuitive by going into what shapes are and which data structures have a shape, and by putting everything into words.

The shape of an n-dimensional array lists the size of each of its dimensions. A scalar has shape () since it has no dimensions. An array of length n has shape (n), which in practice is written (n,). The trailing comma is only there because of Python tuple syntax: (n) is just the number n in parentheses, whereas (n,) is a one-element tuple. A matrix then has shape (m,n), a cube (l,m,n), and so on.

To put this into words, let’s take an n-dimensional array of shape (2,3,4). We can read this as follows: we have two matrices, each of which has three rows, and each of the rows has length 4.
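To make that reading concrete, here is a small sketch using a plain nested Python list. The name arr and the values in it are made up for illustration; only the nesting matters.

```python
# A hand-built nested list that happens to be rectangular, with shape (2, 3, 4).
# The values are arbitrary placeholders.
arr = [
    [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],            # first matrix
    [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]],  # second matrix
]

matrices = len(arr)          # how many matrices we have
rows = len(arr[0])           # how many rows each matrix has
row_length = len(arr[0][0])  # how long each row is

shape = (matrices, rows, row_length)
print(shape)         # (2, 3, 4)

# Indexing follows the same order: matrix, then row, then position in the row.
print(arr[1][2][3])  # 23
```

Reading the shape left to right is the same as peeling off one level of brackets at a time.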

We humans have trouble thinking in more than three dimensions, but we can still try to make sense of n dimensions:

  • (1080,1920) could be a grey-scale full-HD picture: 1080*1920 pixels, each with a single luminosity value
  • (1080,1920,3) could be an RGB full-HD picture: each pixel has 3 values, one for each primary colour
  • (900,1080,1920,3) could be 30*30 = 900 frames, i.e. 30 seconds of full-HD video @ 30fps, a youtubehaiku
  • (1000,900,1080,1920,3) could be any of our embarrassingly large meme-video folders
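A shape also tells us how many individual values the array holds: it is simply the product of all the dimensions. The following is a rough sketch of that arithmetic for the examples above; the labels and the helper name num_values are my own and not part of any library.

```python
import math

# The example shapes from the list above, with made-up labels.
shapes = {
    "grey-scale picture": (1080, 1920),
    "RGB picture": (1080, 1920, 3),
    "30 s of video": (900, 1080, 1920, 3),
    "video folder": (1000, 900, 1080, 1920, 3),
}

def num_values(shape):
    """Number of values in an array of the given shape (product of its dimensions)."""
    return math.prod(shape)

for name, shape in shapes.items():
    print(f"{name}: {len(shape)} dimensions, {num_values(shape):,} values")
```

Note that math.prod of the empty shape () is 1, which fits the idea that a scalar holds exactly one value.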

Python & TensorFlow

A standard Python list has no shape, since nothing requires it to be rectangular. Take the following Python structure:

mylist = [
    [1,2,3,4,5],
    [1,2,3,4,5,6]
]

The shape of mylist is then ambiguous; it could be either (2,5) or (2,6). Data structures that have a shape, such as NumPy ndarrays or TensorFlow tensors, enforce rectangularity.
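To illustrate what "enforcing rectangularity" could mean, here is a minimal sketch in plain Python. The helper list_shape is hypothetical and written only for this post; it is not how NumPy or TensorFlow work internally, just one way to express the same rule: a shape exists only if all sub-lists at the same depth have the same shape.

```python
def list_shape(data):
    """Return the shape of a nested list as a tuple, or raise ValueError if it is ragged."""
    if not isinstance(data, list):
        return ()    # a scalar has no dimensions
    if not data:
        return (0,)  # an empty list has one dimension of size 0
    inner_shapes = [list_shape(item) for item in data]
    # Every element must have the same shape, otherwise the list is not rectangular.
    if any(s != inner_shapes[0] for s in inner_shapes):
        raise ValueError(f"ragged list, inner shapes differ: {inner_shapes}")
    return (len(data),) + inner_shapes[0]

print(list_shape([[1, 2], [3, 4], [5, 6]]))  # (3, 2)

mylist = [
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5, 6],
]
try:
    list_shape(mylist)
except ValueError as err:
    print("no shape:", err)
```

The second call fails because the two rows have shapes (5,) and (6,), which is exactly the ambiguity described above.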

import tensorflow as tf

print(tf.constant(1).shape) # ()
print(tf.constant([1]).shape) # (1,)
print(tf.constant([1,2,3]).shape) # (3,)
print(tf.constant([[1],[2],[3]]).shape) # (3,1)
print(tf.constant([[1,2],[3,4],[5,6]]).shape) # (3,2)
print(tf.constant([[[1,2],[3,4],[5,6]]]).shape) # (1,3,2)
print(tf.constant([[[[1,2],[3,4],[5,6]]]]).shape) # (1,1,3,2)

Reshaping

Most frameworks offer useful tools to easily reshape our n-dimensional arrays or tensors. In the following snippet, I declare a tensor equal to [1,2,3,4] and reshape it in two different ways:

import tensorflow as tf

c = tf.constant([1,2,3,4])
print(c) # tf.Tensor([1 2 3 4], shape=(4,), dtype=int32)
print(tf.reshape(c, (1,4))) # tf.Tensor([[1 2 3 4]], shape=(1, 4), dtype=int32)
print(tf.reshape(c, (4,1))) # tf.Tensor([[1][2][3][4]], shape=(4, 1), dtype=int32)

Sometimes we need to reshape a tensor without knowing all of its dimensions. We can still perform the reshape by using -1 as one of the dimensions. This tells TensorFlow to work that dimension out for itself from the total number of elements and the other dimensions. At most one dimension can be set to -1, since the expression would otherwise be ambiguous.

import tensorflow as tf

c = tf.constant([1,2,3,4])
print(c) # tf.Tensor([1 2 3 4], shape=(4,), dtype=int32)
print(tf.reshape(c, (-1,2))) # tf.Tensor([[1 2][3 4]], shape=(2, 2), dtype=int32)
print(tf.reshape(c, (-1,1,1,1,1))) # tf.Tensor([[[[[1]]]] [[[[2]]]] [[[[3]]]] [[[[4]]]]], shape=(4, 1, 1, 1, 1), dtype=int32)
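The arithmetic behind -1 is simple enough to write out by hand. The following is only a sketch of the idea in plain Python, under the assumption that the missing dimension is the total number of elements divided by the product of the known dimensions; resolve_shape is a name I made up, not a TensorFlow function.

```python
import math

def resolve_shape(num_elements, shape):
    """Replace a single -1 in shape with the size that makes the element counts match."""
    unknown = [i for i, dim in enumerate(shape) if dim == -1]
    if len(unknown) > 1:
        raise ValueError("at most one dimension can be -1")

    # Product of all the dimensions we do know.
    known = math.prod(dim for dim in shape if dim != -1)

    if not unknown:
        if known != num_elements:
            raise ValueError(f"cannot reshape {num_elements} elements into {shape}")
        return tuple(shape)

    # The missing dimension must divide the element count evenly.
    if known == 0 or num_elements % known != 0:
        raise ValueError(f"cannot reshape {num_elements} elements into {shape}")

    resolved = list(shape)
    resolved[unknown[0]] = num_elements // known
    return tuple(resolved)

print(resolve_shape(4, (-1, 2)))           # (2, 2)
print(resolve_shape(4, (-1, 1, 1, 1, 1)))  # (4, 1, 1, 1, 1)
```

With two -1 entries there would be many valid answers, for example (1,4) and (2,2) for four elements, which is why only one is allowed.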

Thanks for reading! I hope this post helped make shapes more intuitive if you were confused about them, as many are when starting out. If you’re curious to dive in further, my next post, A Closer Look At Shapes, takes a more investigative approach to how shapes are used by TensorFlow.
