03 02 Neural Networks


Neural Networks

Dr. Martin “Doc” Carlisle


Neural Networks

Neural Networks are a machine learning technique fashioned after a mathematical model of a brain neuron
A “neuron”

• Inputs (red in the slide’s figure) are multiplied by weights
• The weighted sum then passes through an “activation function” (orange in the figure)
Activation function
• One example is the sigmoid
– f(x) = 1 / (1 + e^(-x))
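
A minimal sketch of the sigmoid in plain Python (the function name and test values are mine, for illustration):

import math

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x)); squashes any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))   # 0.5, the midpoint
print(sigmoid(5.0))   # ~0.993, large inputs saturate toward 1
print(sigmoid(-5.0))  # ~0.007, large negative inputs saturate toward 0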
Creating a “network”
• A multi-layer architecture of neurons

• Possibly many hidden layers
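
One way to sketch such a network in PyTorch (layer sizes are arbitrary, chosen only for illustration):

import torch.nn as nn

# 2 inputs -> a hidden layer of 2 neurons -> 1 output, sigmoid activations throughout
net = nn.Sequential(
    nn.Linear(2, 2),   # input layer to hidden layer
    nn.Sigmoid(),
    nn.Linear(2, 1),   # hidden layer to output layer
    nn.Sigmoid(),
)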


At a neuron
• Weights for each input

• h1 = sigmoid(w11*x1 + w12*x2)
• h2 = sigmoid(w21*x1 + w22*x2)
• o1 = sigmoid(wo1*h1 + wo2*h2)
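
The same computation written out by hand (input and weight values are made up for illustration):

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

x1, x2 = 0.5, -1.0       # inputs
w11, w12 = 0.1, 0.4      # weights into h1
w21, w22 = -0.3, 0.2     # weights into h2
wo1, wo2 = 0.7, -0.5     # weights into o1

h1 = sigmoid(w11 * x1 + w12 * x2)
h2 = sigmoid(w21 * x1 + w22 * x2)
o1 = sigmoid(wo1 * h1 + wo2 * h2)
print(o1)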
Output layer
• We can have a neuron for each category, and we want 1 for yes, 0 for no
• Now we can go backwards with partial derivatives to update weights using the chain rule
Updating weights
• We then update each weight with a “learning rate”: w = w - learning_rate * dL/dw
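
One hand-rolled update step, just to make the rule concrete (all numbers are made up):

learning_rate = 0.1

w = 0.7                        # current weight
grad = 0.25                    # dL/dw, as computed by backpropagation

w = w - learning_rate * grad   # step downhill against the gradient
print(w)                       # 0.675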
Good and bad news

• Good news! We don’t need to know anything about partial derivatives – PyTorch will do this all for us
• Bad news! Picking a good architecture and learning rate can be hard (and it is a huge search space)
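
A minimal sketch of PyTorch computing a partial derivative for us via autograd:

import torch

w = torch.tensor(0.7, requires_grad=True)   # the weight we want to tune
x = torch.tensor(2.0)
target = torch.tensor(1.0)

loss = (w * x - target) ** 2   # squared error
loss.backward()                # autograd applies the chain rule for us
print(w.grad)                  # dL/dw = 2 * (w*x - target) * x = 1.6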
Speeding things up with a GPU

• GPUs offer data-parallelism


– This can make NN operations *much* faster
• BUT
– You have to explicitly put things on the GPU
– Copying to/from the GPU is expensive
PyTorch

Useful for doing neural networks on a GPU

Tensor = a NumPy ndarray, but one that can live on a GPU

Device = the place a tensor lives (CPU, or CUDA [GPU])

WARNING!
• Copying an array to/from the GPU is expensive!
• You have to think about where you want
computation to happen!
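
A sketch of keeping computation on one device and copying back only at the end (sizes are arbitrary):

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.randn(1000, 1000, device=device)   # created directly on the GPU, if one is available
b = torch.randn(1000, 1000, device=device)

c = a @ b           # the matrix multiply runs where the tensors live
result = c.cpu()    # copy back to the CPU once, at the end; transfers are expensive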
Jupyter Notebook

Go to Pytorch_Intro.ipynb on Google Colab

https://colab.research.google.com/
Sample NN

• See Colab notebook for First_NN


Convolutional Neural Nets
• Aim for shift, scale and distortion invariance
– Local receptive fields
– Shared weights
– Sub-sampling
Local receptive fields
• Extract elementary visual features
– Oriented edges
– End points
– Corners
• Also, avoid explosion of weights! (224x224 image with 3 colors has
150,528 features)
Shared weights
• Detect same features at all possible locations in the input
Subsampling
• Reduces sensitivity to distortion
• Detects features by their relative placement rather than their exact position
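
A sketch of these ideas in PyTorch: one small bank of 5x5 filters is shared across the whole image, and max pooling does the sub-sampling (layer sizes are illustrative, loosely in the spirit of LeNet-5):

import torch
import torch.nn as nn

x = torch.randn(1, 3, 224, 224)         # a batch of one 224x224 RGB image

conv = nn.Conv2d(3, 6, kernel_size=5)   # 6 local receptive fields, weights shared everywhere
pool = nn.MaxPool2d(2)                  # sub-sampling: halve each spatial dimension

y = pool(conv(x))
print(y.shape)                          # torch.Size([1, 6, 110, 110])

# Weight sharing avoids the explosion of weights noted above:
print(sum(p.numel() for p in conv.parameters()))                   # 456
print(sum(p.numel() for p in nn.Linear(150528, 6).parameters()))   # 903174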
Jupyter Notebook for CNN

• https://github.com/erykml/medium_articles/blob/master/Computer%20Vision/lenet5_pytorch.ipynb
• lenet5_pytorch.ipynb (Google Colab and Jupyter Notebook)
• Note the speed difference
• Note the use of torch.no_grad() (see the sketch after this list)
• conda install torchvision -c pytorch
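
A minimal sketch of what torch.no_grad() buys at evaluation time (the model here is a stand-in, not LeNet-5 itself):

import torch
import torch.nn as nn

model = nn.Linear(10, 2)      # stand-in for the trained network
inputs = torch.randn(4, 10)

model.eval()
with torch.no_grad():         # skip building the autograd graph
    outputs = model(inputs)   # faster and lighter on memory when we only predict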
Train/Validate/Test
• Ideally, we split the data into 3 parts (see the sketch after this list)
• Train
– Use this to train the model and update weights
• Validate
– Use this to prevent over-training
– Don’t train, but evaluate hyper-parameters (learning rate, # of
epochs, etc.)
• Test
– Use only on final model run
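
A minimal sketch of such a split with torch.utils.data (the dataset and proportions are made up):

import torch
from torch.utils.data import TensorDataset, random_split

# A toy dataset of 1,000 labeled examples
data = TensorDataset(torch.randn(1000, 10), torch.randint(0, 2, (1000,)))

# 70% train / 15% validate / 15% test
train_set, val_set, test_set = random_split(data, [700, 150, 150])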
Recurrent Neural Network
• Connections follow a temporal sequence
• Useful for applications where context helps with prediction
– Handwriting recognition (unlike zip codes, we have a good sense of what comes after “afte”)
– Speech recognition
Long short-term memory (LSTM)

• “Forgets” part of the previously stored memory and adds new data
• Cell gets cell and hidden state from previous timestep,
input from current timestep
• Output is part of hidden state
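
A sketch of that plumbing with PyTorch’s nn.LSTM (all sizes are arbitrary):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16)

x = torch.randn(5, 1, 8)     # 5 timesteps, batch of 1, 8 features per step
h0 = torch.zeros(1, 1, 16)   # hidden state from the "previous" timestep
c0 = torch.zeros(1, 1, 16)   # cell state from the "previous" timestep

output, (hn, cn) = lstm(x, (h0, c0))
print(output.shape)          # torch.Size([5, 1, 16]), the hidden state at every step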
LSTM Example

• Password generator
– conditional-char-rnn.ipynb
