Getting Started With PyTorch

Edna Figueira Fernandes
4 min read · May 11, 2020


In this blog post, I am going to build a neural network using the nn module from PyTorch on the MNIST dataset of handwritten digits. This is a simple dataset that can get you started in image classification. The dataset has 60,000 training images; each image is 28x28 pixels and the labels are the actual digits.

I start by importing the necessary libraries:

import numpy as np 
import torch
import matplotlib.pyplot as plt
%matplotlib inline
from torchvision import datasets, transforms
from torch import nn
from torch import optim
import torch.nn.functional as F

Next, I define my transformation pipeline. The pipeline has only two transforms: “transforms.ToTensor()”, which converts the images into tensors, and “transforms.Normalize”. For normalization, I am using a mean of 0.5 and a standard deviation of 0.5. Essentially, the mean is subtracted from each image and the result is divided by the standard deviation, which normalizes the pixel values to the range [-1, 1].

I am downloading the data and storing it in the variable trainset; this variable therefore contains all 60,000 images. Next, I am loading the data in batches of 64 and storing it in the variable trainloader. Furthermore, I am setting shuffle equal to True, so the data is reshuffled at the start of every epoch and each batch of 64 images is drawn in a different order. This is important because it helps the network generalize instead of just learning from one particular ordering of the examples.

transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])

trainset = datasets.MNIST('MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

# turning the object trainloader into an iterator
dataiter = iter(trainloader)
images, labels = next(dataiter)
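
Before moving on, it can help to check what one batch looks like. This quick sanity check is not part of the original walkthrough, but for MNIST loaded this way each batch should contain 64 single-channel 28x28 images:

# shapes of one batch drawn from trainloader
print(images.shape)   # expected: torch.Size([64, 1, 28, 28])
print(labels.shape)   # expected: torch.Size([64])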

In Figure 1, I am visualizing one of the images in the dataset.

plt.imshow(images[1].numpy().squeeze())
Figure 1: Image[1] from trainloader

Building the neural network

PyTorch has a module called nn that provides a powerful way to define network architectures. The network used for this project is fully-connected, meaning that each unit in one layer is connected to every unit in the next layer.

In fully-connected networks, the input to the network needs to be a 1D vector. In this case, the images are 2D arrays (28x28 pixels). To turn them into 1D vectors, we flatten them, which results in a shape of (1, 784) per image, where 784 is 28 times 28, the number of pixels.
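
As a quick sketch of what this looks like in code (the same operation appears again later in the training loop), a whole batch can be flattened with view, keeping the batch dimension and collapsing the rest:

# flatten a batch of images: (64, 1, 28, 28) -> (64, 784)
flat_images = images.view(images.shape[0], -1)
print(flat_images.shape)   # torch.Size([64, 784])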

While defining the classifier, I am passing “nn.Module” in the class definition, meaning that the classifier inherits from this class. Calling “super().__init__()” runs nn.Module’s own initialization, giving the classifier access to all of its methods and attributes.

The network below has 784 inputs, a hidden layer with 128 units, a second hidden layer with 64 units, and 10 outputs. After each hidden layer, the ReLU activation function is applied. As explained by Artem Oppermann, activation functions add non-linearity to the network, which helps the model learn more complex relationships and identify complex patterns in the data. After the output layer, the log-softmax function is applied; exponentiating its output gives the probability the model assigns to each digit given the image.

# Building the network using the nn Module from torch
class classifier(nn.Module):
    def __init__(self):
        super().__init__()

        # defining hidden layers, output layer, ReLU activation and log-softmax output
        self.hidden1 = nn.Linear(784, 128)
        self.hidden2 = nn.Linear(128, 64)
        self.output = nn.Linear(64, 10)
        self.relu = nn.ReLU()
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, x):
        # pass the input tensor through each of the operations
        x = self.hidden1(x)
        x = self.relu(x)
        x = self.hidden2(x)
        x = self.relu(x)
        x = self.output(x)
        x = self.softmax(x)

        return x

To estimate the loss, I am using the negative log-likelihood loss (nn.NLLLoss), which pairs with the log-softmax output of the network. The loss measures how far the network’s predictions are from the true labels of the images. Next, I am defining an optimizer; the optimizer defines how the model’s weights are updated using the gradients.

To start training the model, first I am making a forward pass through the network, from the inputs all the way to the output. Once the network reaches the output, it calculates the loss and then makes a backward pass, from the outputs back to the inputs, calculating the gradients based on the loss computed in the forward pass. Finally, the optimizer updates the weights before the next forward pass. In this case, the network repeats this process 5 times (epochs = 5).

# creating the model
model = classifier()

criterion = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

epochs = 5
for epoch in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        # Flattening
        images = images.view(images.shape[0], -1)
        optimizer.zero_grad()

        output = model(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    else:
        print('Training loss: ', running_loss/len(trainloader))

Now that I am done training the model, I am going to use an image from the batch to test it.

I am choosing an image from the batch and passing it through the trained model (Figure 2). Once the network makes its prediction, the probability of the image being each particular digit is determined (Figure 3).

# choosing one image
img = images[1].view(1, 784)

# turn off gradients
with torch.no_grad():
    logits = model(img)
ps = F.softmax(logits, dim=1)

plt.imshow(img.resize_(1, 28, 28).numpy().squeeze())
Figure 2: Image that is being used for testing.
ps = ps.data.numpy().squeeze()
plt.barh(np.arange(10), ps)
plt.yticks(np.arange(10))
plt.show()
Figure 3: Predicted probability for each digit; label 6 has the highest probability.

The model estimates a very high probability that the image shown in Figure 2 is the number 6!
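
If you prefer to read the prediction off programmatically rather than from the bar chart, a small sketch like the following works, since ps is already a NumPy array holding the 10 class probabilities (the printed digit depends on which image was drawn from the batch):

# the predicted digit is the class with the highest probability
predicted_digit = ps.argmax()
print('Predicted digit:', predicted_digit)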

References

www.udacity.com
