- Emily T. Burak

# PyTorch Introduction: A Gentle Introduction to Torch 101 Pt.1/?

Updated: Apr 7, 2021

## PyTorch -- a leading light in Deep Learning

Today I am writing about __PyTorch__, an open-source Machine Learning framework for Python that is very powerful for Deep Learning. I will touch on what PyTorch is at a high level and what makes it exciting, then speak to its benefits. Finally, I will give an example of the distinctive style of PyTorch code.

### What is PyTorch?

PyTorch (often called, and imported as, `torch`) is a great framework with a large ecosystem for Machine Learning tasks, and it shines especially in Deep Learning areas such as Computer Vision. PyTorch is built around the *tensor*, a multi-dimensional matrix containing elements of a single data type, which provides a convenient structure for storing, manipulating, and operating on data. Tensors work in concert with a Dynamic Computational Graph for Deep Learning: the graph of operations is built as the code runs, in contrast to libraries that define a static graph up front.
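To make that a little more concrete, here is a minimal sketch of both ideas. It is my own illustration rather than anything from the PyTorch docs, and the names `t`, `w`, and `y` are just placeholders I picked. It shows a tensor carrying a single data type, and a graph that gets recorded while ordinary Python control flow runs:

```python
import torch

# A tensor holds elements of a single data type.
t = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
print(t.dtype)   # torch.float32
print(t.shape)   # torch.Size([2, 2])

# The graph is recorded as operations execute, so a plain Python `if`
# decides which operations end up in it.
w = torch.tensor(3.0, requires_grad=True)
if w > 0:
    y = w ** 2   # this branch runs, so the recorded graph is y = w^2
else:
    y = -w

y.backward()     # differentiate through the graph that was just built
print(w.grad)    # tensor(6.), since dy/dw = 2 * w = 6
```

Nothing here declares the graph ahead of time; whichever branch executes is the one autograd differentiates.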

PyTorch is favored in academia and research, though popular interest has surged with the __Practical Deep Learning for Coders__ course and the PyTorch-based __fastai__ library, which give many people their introduction to, well, practical Deep Learning and to PyTorch as well. I have been through some of the course content on a casual level and it is very interesting stuff. It uses the more abstract fastai library with a top-down teaching style that dives headfirst into implementing concepts. A common frustration among Deep Learning learners (who learn about Machine Learners; there's a lot of learning here...) is being taught bottom-up, quite often starting with the lowly but key tensor, as seen in the still-terrific __DataCamp__ courses that use PyTorch. Just throwing some resources at you!

PyTorch's syntax is heavily rooted in Object-Oriented Programming (OOP), in contrast to the more procedural or declarative approaches of some other ML and Deep Learning frameworks. OOP makes it great for scalability and a natural fit for those like me who come from a Software Engineering background that leans heavily on OOP. Finally, it has a large ecosystem, including libraries for tasks from vision to graph neural networks, as well as integration with scikit-learn, another popular ML framework.

### Benefits of PyTorch

I will point to two main benefits of PyTorch, both of which I've already touched on and want to expand on here. The first is its Object-Oriented Programming design. The second is its popularity in academia.

Object-Oriented Programming, while often difficult to grasp at first, is a powerful way of structuring code and modeling, well, objects. In PyTorch's case, the object is often a neural network itself, a complex thing that I think it makes intuitive sense to model with OOP. As above, so below, as they say, and while no model is perfect, some are very useful, with OOP and tensors providing a powerfully useful combination in PyTorch.
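As a rough sketch of what "a neural network as an object" looks like, here is a tiny model written as a subclass of `torch.nn.Module`. The class name `CubicModel` and its `coeffs` attribute are hypothetical names of my own; only `nn.Module`, `nn.Parameter`, and the `forward` convention come from PyTorch itself:

```python
import torch


class CubicModel(torch.nn.Module):
    """Hypothetical example: y = a + b*x + c*x^2 + d*x^3."""

    def __init__(self):
        super().__init__()
        # Wrapping a tensor in nn.Parameter registers it as learnable state.
        self.coeffs = torch.nn.Parameter(torch.zeros(4))

    def forward(self, x):
        # Unpack the four learnable coefficients and evaluate the polynomial.
        a, b, c, d = self.coeffs
        return a + b * x + c * x ** 2 + d * x ** 3


model = CubicModel()
x = torch.linspace(-1.0, 1.0, 5)
y_pred = model(x)                      # calling the object runs forward()
print(y_pred.shape)                    # torch.Size([5])
print(len(list(model.parameters())))   # 1
```

The state (the coefficients) and the behavior (the forward pass) live together on one object, which is the pattern most PyTorch models follow.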

PyTorch's popularity in research means it is worth knowing if you want to digest research papers and keep up with the ever-expanding ML field, especially the blazing-fast development of Deep Learning within it. It also means that the papers and accompanying code available through __arXiv__ and __Papers With Code__ make for great learning resources and material to adapt for your own work.

### A PyTorch example

I am going to wrap this blog post up with __some sample PyTorch code, taken from the excellent documentation__, presented without much comment. As mentioned, it utilizes tensors and an OOP paradigm. Read it over, presuming you're unfamiliar with torch, and look out for the next entry in this series where I will once again appreciate you coming here for information on Deep Learning:

```python
# -*- coding: utf-8 -*-
import torch
import math

# Create Tensors to hold input and outputs.
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# For this example, the output y is a linear function of (x, x^2, x^3), so
# we can consider it as a linear layer neural network. Let's prepare the
# tensor (x, x^2, x^3).
p = torch.tensor([1, 2, 3])
xx = x.unsqueeze(-1).pow(p)

# In the above code, x.unsqueeze(-1) has shape (2000, 1), and p has shape
# (3,), for this case, broadcasting semantics will apply to obtain a tensor
# of shape (2000, 3)

# Use the nn package to define our model as a sequence of layers. nn.Sequential
# is a Module which contains other Modules, and applies them in sequence to
# produce its output. The Linear Module computes output from input using a
# linear function, and holds internal Tensors for its weight and bias.
# The Flatten layer flattens the output of the linear layer to a 1D tensor,
# to match the shape of `y`.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 1),
    torch.nn.Flatten(0, 1)
)

# The nn package also contains definitions of popular loss functions; in this
# case we will use Mean Squared Error (MSE) as our loss function.
loss_fn = torch.nn.MSELoss(reduction='sum')

learning_rate = 1e-6
for t in range(2000):

    # Forward pass: compute predicted y by passing x to the model. Module objects
    # override the __call__ operator so you can call them like functions. When
    # doing so you pass a Tensor of input data to the Module and it produces
    # a Tensor of output data.
    y_pred = model(xx)

    # Compute and print loss. We pass Tensors containing the predicted and true
    # values of y, and the loss function returns a Tensor containing the
    # loss.
    loss = loss_fn(y_pred, y)
    if t % 100 == 99:
        print(t, loss.item())

    # Zero the gradients before running the backward pass.
    model.zero_grad()

    # Backward pass: compute gradient of the loss with respect to all the learnable
    # parameters of the model. Internally, the parameters of each Module are stored
    # in Tensors with requires_grad=True, so this call will compute gradients for
    # all learnable parameters in the model.
    loss.backward()

    # Update the weights using gradient descent. Each parameter is a Tensor, so
    # we can access its gradients like we did before.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

# You can access the first layer of `model` like accessing the first item of a list
linear_layer = model[0]

# For linear layer, its parameters are stored as `weight` and `bias`.
print(f'Result: y = {linear_layer.bias.item()} + {linear_layer.weight[:, 0].item()} x + {linear_layer.weight[:, 1].item()} x^2 + {linear_layer.weight[:, 2].item()} x^3')
```
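One step in the example above that is easy to gloss over is the broadcasting in `x.unsqueeze(-1).pow(p)`. Here is a minimal sketch for checking the shapes yourself. It assumes a 5-point input instead of 2000, purely to keep things small, and the intermediate name `col` is my own:

```python
import math
import torch

x = torch.linspace(-math.pi, math.pi, 5)   # shape (5,)
p = torch.tensor([1, 2, 3])                # shape (3,)

col = x.unsqueeze(-1)                      # adds a trailing axis: shape (5, 1)
xx = col.pow(p)                            # (5, 1) with (3,) broadcasts to (5, 3)

print(col.shape)   # torch.Size([5, 1])
print(xx.shape)    # torch.Size([5, 3])

# Column k of xx holds x raised to the power k + 1.
print(torch.allclose(xx[:, 1], x ** 2))    # True
```

Each row of `xx` is one input point expanded into its first three powers, which is what lets a single `Linear(3, 1)` layer fit a cubic polynomial.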

Pretty wild, huh? Join me next time on my -- our -- adventure into the world of PyTorch!

__[Next post here, re: Tensors!]__


*(Photo by **Linus Sandvide** on **Unsplash**)*