PyTorch
PyTorch is one of the most popular AI programming libraries at the moment. It's a free and open source Python library which supports a large array of AI model components. PyTorch is fast and efficient thanks to its C++ backend and GPU acceleration via NVIDIA CUDA or AMD ROCm. Installation on Linux is straightforward and can be done with a single command. The only difficulty is making sure that you install the version that is compatible with your OS and your version of CUDA/ROCm.
The installation page on the PyTorch website has you covered for Linux, Windows and Mac. You will need to either choose a package manager such as Conda or Pip, or install from source. If you wish to use Conda, see my Anaconda Setup notes.
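Once the install has finished, a quick sanity check helps confirm that the right build ended up on the machine. The snippet below is only a minimal sketch: it prints the installed version, asks PyTorch whether it can see a GPU, and runs one tiny tensor operation on whichever device is available. The variable names are my own and the tensor is just a placeholder.

```python
import torch

# Version string of the installed build.
print("PyTorch version:", torch.__version__)

# True when PyTorch can see a usable GPU. ROCm builds also report
# through the torch.cuda namespace.
gpu_available = torch.cuda.is_available()
print("GPU available:", gpu_available)

# Fall back to the CPU when no GPU is visible.
device = torch.device("cuda" if gpu_available else "cpu")

# Tiny placeholder computation to confirm tensors work on that device.
x = torch.ones(2, 3, device=device)
total = x.sum().item()
print("Sum of a 2x3 tensor of ones:", total)
```

If "GPU available" prints False on a machine that has a GPU, the most likely cause is a mismatch between the installed PyTorch build and the CUDA/ROCm version.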
Sample code for an MLP Classifier Model
This code defines a binary classifier model using the PyTorch Neural Network (nn) module. A binary classifier is a network which takes N inputs and produces a single output between 0 and 1, which gives the probability that the inputs represent a particular class. For example, the inputs could be the grades for assignments and exams in a university course, and the output would represent the probability of the student passing the course.
The layers of the neural network and the activation function are defined and wrapped in a class. The class inherits from the nn.Module class in order to provide the desired neural network functionality. The __init__ method takes 2 parameters (besides self) which specify the number of inputs and the number of hidden neurons. The first line in the __init__ method calls super().__init__(), which performs the initialisation of the nn.Module base class. Then we define the fc1 and fc2 fully-connected layers and pass in the number of inputs and outputs for each layer. Finally the activation function is defined, which in this case is the ReLU function; it adds a non-linearity to the neural network (see activation functions).
The forward method defines the behavior of a forward pass through the neural network. Forward passes are used both when training the network and when performing inference on a trained network. The method takes in the inputs (typically denoted as 'x') and runs them through the fully-connected layers and the activation function. The output of the forward method is the prediction, which in this case is a single floating-point number. This will later be converted to a probability using Sigmoid.
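The class and forward pass described above can be sketched roughly as follows. Treat this as an illustration under a few assumptions rather than the definitive version: the class name BinaryClassifier, the parameter names input_size and hidden_size, the attribute name relu, the layer sizes and the example grades are all my own choices. Only fc1, fc2, ReLU and the final Sigmoid step come from the description.

```python
import torch
import torch.nn as nn


class BinaryClassifier(nn.Module):  # hypothetical class name
    def __init__(self, input_size, hidden_size):
        # Initialise the nn.Module base class first.
        super().__init__()
        # Fully-connected layers: inputs -> hidden neurons -> one output.
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, 1)
        # ReLU adds the non-linearity between the two layers.
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.fc1(x))
        # Raw score (logit), not yet a probability.
        return self.fc2(x)


# Assumed sizes: 4 grades per student, 8 hidden neurons.
model = BinaryClassifier(input_size=4, hidden_size=8)

# Made-up grades for one student, scaled to the range 0..1.
grades = torch.tensor([0.9, 0.7, 0.8, 0.6])

logit = model(grades)               # calls forward() under the hood
probability = torch.sigmoid(logit)  # squash the raw score into (0, 1)
print(logit.shape, probability.item())
```

Note that the model is called directly as model(grades) rather than model.forward(grades); nn.Module routes the call to forward. The weights here are untrained, so the printed probability is meaningless until the network has been trained.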
Usually training of neural networks is performed in batches rather than on single inputs. This means that PyTorch nn modules and their associated methods can take in either a single sample or a batch of samples. Batching simply adds an additional dimension to the beginning of the input tensor. For example, if performing predictions for an entire class of students, the input tensor passed to the forward method could be of dimension [students, input_size] and the output would be of size [students, 1].
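To make the batching point concrete, here is a small sketch that pushes both a whole batch and a single sample through the same network and prints the resulting shapes. To keep it self-contained I use an nn.Sequential stand-in with the same layer shapes as the classifier described above. The sizes (30 students, 4 inputs, 8 hidden neurons) and the random grades are assumptions for illustration only.

```python
import torch
import torch.nn as nn

students, input_size, hidden_size = 30, 4, 8  # assumed sizes

# Compact stand-in with the same layer shapes as the classifier:
# inputs -> hidden neurons -> ReLU -> single output.
model = nn.Sequential(
    nn.Linear(input_size, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, 1),
)

batch = torch.rand(students, input_size)  # random placeholder grades
single = batch[0]                         # one student, shape [input_size]

# No gradients are needed when we only want predictions.
with torch.no_grad():
    batch_out = model(batch)
    single_out = model(single)

print(batch_out.shape)   # torch.Size([30, 1])
print(single_out.shape)  # torch.Size([1])
```

The leading batch dimension passes straight through: nn.Linear only acts on the last dimension, so nothing in the model has to change between the single-sample and batched cases.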