Tensor Dimensions
There are some important concepts that you will need to understand in order to work with AI programming. A good starting point is to understand
tensors and their associated dimensioning. This knowledge is critical for training, deploying and running inference with AI models.
What is a tensor?
From a purely mathematical standpoint a tensor is an n-dimensional array of numbers, with n being any non-negative integer. The commonly
used terminology differs a little in that the term "tensor" is typically reserved for arrays with 3 or more dimensions.
- A 0-dimensional array is usually referred to as a scalar (a single number)
- A 1-dimensional array is usually referred to as a vector (e.g. [x, y, z])
- A 2-dimensional array is usually referred to as a matrix, e.g.:
[[1, 2, 3],
 [4, 5, 6],
 [7, 8, 9]]
- A higher dimensional array is usually referred to as a tensor
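Before moving to any library, the four cases above can be sketched as plain Python nested lists — no packages needed:

```python
# The four cases above as plain Python nested lists
scalar = 5                      # 0-dimensional: a single number
vector = [1, 2, 3]              # 1-dimensional: a list of numbers
matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]            # 2-dimensional: a list of lists
tensor = [[[1, 2], [3, 4]],
          [[5, 6], [7, 8]]]     # 3-dimensional: a list of lists of lists

# Indexing goes one level deeper per dimension
print(matrix[1][2])     # 6
print(tensor[1][0][1])  # 6
```

Each extra dimension simply adds one more level of nesting, and one more index is needed to reach a single value.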
When working with programming packages such as PyTorch, the tensor data type can be used for an array of any size. Let's have a look at what a tensor
looks like in PyTorch:
import torch
# Define a 1-dimensional tensor that contains 3 values
my_tensor = torch.tensor([1, 2, 3])
# Print the tensor object
print(my_tensor)
# Print the tensor data type
print(my_tensor.dtype)
>> tensor([1, 2, 3])
torch.int64
We have successfully created a PyTorch tensor of type torch.int64. Generally when we are dealing with AI models we will be using
floating point numbers rather than integers. This can be achieved in PyTorch by defining a float tensor: torch.float and
torch.float32 both denote the same 32-bit floating point dtype (torch.FloatTensor is the older tensor class associated with it).
For more information on the available data types see the PyTorch documentation.
# Define a 1-dimensional tensor that contains 5 values
my_tensor = torch.tensor([1, 2, 3, 4.0, 5.5], dtype=torch.float)
# Print the tensor object
print(my_tensor)
# Print the tensor data type
print(my_tensor.dtype)
>> tensor([1.0000, 2.0000, 3.0000, 4.0000, 5.5000])
torch.float32
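As a quick sanity check (assuming a standard PyTorch install), torch.float and torch.float32 compare equal, and an integer tensor can be converted to float after creation with .to():

```python
import torch

# torch.float is simply an alias for torch.float32
print(torch.float == torch.float32)  # True

# An integer tensor can be cast to float after creation with .to()
int_tensor = torch.tensor([1, 2, 3])
float_tensor = int_tensor.to(torch.float32)
print(float_tensor.dtype)  # torch.float32
```

Casting with .to() returns a new tensor; the original int_tensor keeps its torch.int64 dtype.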
What is a Dimension?
Let's start by thinking about the difference between 1-dimensional, 2-dimensional and 3-dimensional spaces. A 1-dimensional space is
a space where any point can be represented by a single number, e.g. 25. A 2-dimensional space is a space where each point is referenced
by a pair of numbers, e.g. [25, 32]. A 3-dimensional space is a space where each point is referenced by 3 numbers, e.g. [25, 32, 45]. Note
that in every case only a 1-dimensional array of numbers is needed to reference a point in the n-dimensional space.
The question is, what do these numbers represent? They can actually represent whatever we want them to. If we want to define map coordinates, for example,
then the values would represent the coordinates in each direction on the map. If we had a 2-dimensional map (like Google Earth) we could define
a 1-d array containing 2 values to represent X and Y coordinates, or latitude and longitude. If we had a 3-dimensional map (like a computer game or rendering
software such as Blender) we could define a 1-dimensional array containing 3 numbers which represent X, Y and Z locations.
But what if we don't want to represent coordinates? Well that's fine, as we can define the tensor to represent whatever we like. For example we could
define a 1-dimensional array with 2 values to reference a paragraph in a book ([5, 3] could be page 5, paragraph 3).
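Both readings can be sketched with ordinary 1-dimensional tensors; the values below are arbitrary illustrations:

```python
import torch

# A 1-d tensor with 3 values read as X, Y, Z coordinates (hypothetical point)
position = torch.tensor([25.0, 32.0, 45.0])

# The same kind of 1-d tensor read as [page, paragraph] in a book
book_location = torch.tensor([5, 3])

print(position.shape)       # torch.Size([3])
print(book_location.shape)  # torch.Size([2])
```

Structurally both are just 1-d tensors; the meaning of the numbers is entirely up to us.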
Bonus: What is the 4th Dimension?
I'll start off by saying this question is not as scary as it sounds. People will sometimes say "The 4th dimension is time, right?". Technically that
is only one possible answer of many, because we can define the dimensions to represent anything we like. You could indeed represent
time as a 4th dimension alongside the 3 spatial dimensions, and the resulting coordinates would be of the form [x, y, z, t]. This representation is used
in some areas of physics, however it's not the only option.
Another possibility for a 4th dimension is a 4th spatial dimension ([x, y, z, w]). What would that look like and how does it differ from having
time as a 4th dimension? Firstly, if you treat time as a 4th dimension you would simply have a 3-d coordinate [x, y, z] with a 1-dimensional time value
appended. This is an effective way to represent a location and a time.
Sounds simple enough, how about a 4th spatial dimension? Well that one is a bit trickier to imagine! We can picture a 2-dimensional surface as a flat
surface like a piece of paper or a photograph. We can also picture a 3-dimensional surface like a room or an object. How could we possibly define a 4th
dimension? Well the important thing to remember is what we are trying to represent. In this case we are trying to create tensors that represent
spatial coordinates.
This is made difficult by the fact we live in a 3-d world and have no way to picture dimensions above 3. What we need to do is separate the idea of dimensions
from the idea of spatial dimensions. It's perfectly valid to define a 4th spatial dimension even though we haven't observed one in existence. The math will
still work out and remember it's just a theoretical construct. Dimensions higher than 3 are indeed used often in machine learning and AI.
Coming back to the [x, y, z, t] and [x, y, z, w] comparison, one important thing to note is the idea of orthogonality. Put simply, the word orthogonal
means perpendicular to. For example in a 2-d space there are 90 degrees between the x and y axes, just like in 3-d where there are 90 degrees between x, y and z.
This is difficult to picture in 4 dimensions, however there are indeed 90 degrees between the x, y, z and w dimensions in a theoretical 4-d world. Compare this to
[x, y, z, t], where you have 90 degrees between x, y and z but those spatial dimensions have no such relationship to t. The time dimension is essentially a straight line that
moves independently of the other dimensions.
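The earlier claim that "the math still works out" in higher dimensions can be checked directly: the Euclidean distance formula extends from 2-d or 3-d to [x, y, z, w] without modification. A minimal sketch with arbitrary points:

```python
import torch

# Two hypothetical points in a 4-d space [x, y, z, w]
a = torch.tensor([1.0, 2.0, 3.0, 4.0])
b = torch.tensor([1.0, 2.0, 3.0, 6.0])

# Euclidean distance: square root of the sum of squared differences,
# exactly the same formula as in 2-d or 3-d
distance = ((a - b) ** 2).sum().sqrt()
print(distance.item())  # 2.0
```

Nothing in the formula cares how many dimensions there are, which is why tensors with far more than 4 dimensions are routine in machine learning.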
Working With Tensors
Now that we understand what tensors are, we need to understand how to work with them in PyTorch. The first thing worth learning is the .shape
attribute, which returns the size of every dimension in the tensor.
# Define a 2-dimensional tensor that contains 6 values
my_tensor = torch.rand((2,3), dtype=torch.float)
# Print the tensor object
print(my_tensor)
# Print the tensor shape
print(my_tensor.shape)
# Print the size of the first dimension
print("First dimension size:", my_tensor.shape[0])
>> tensor([[0.1812, 0.4027, 0.9661],
[0.5826, 0.2785, 0.8558]])
torch.Size([2, 3])
First dimension size: 2
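Alongside .shape, the .ndim attribute ties back to the scalar/vector/matrix naming from earlier — it reports how many dimensions a tensor has, while .numel() reports the total number of values. A short sketch:

```python
import torch

# .ndim gives the number of dimensions; .shape gives the size of each one
scalar = torch.tensor(7)
vector = torch.tensor([1, 2, 3])
matrix = torch.rand((2, 3))

print(scalar.ndim, vector.ndim, matrix.ndim)  # 0 1 2
print(matrix.numel())  # 6 -- total number of values across all dimensions
```

So a scalar is a 0-dimensional tensor, a vector is 1-dimensional, and a matrix is 2-dimensional, matching the terminology from the start of this article.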