03 02 Neural Networks
• h1 = sigmoid(w11*x1 + w12*x2)
• h2 = sigmoid(w21*x1 + w22*x2)
• o1 = sigmoid(wo1*h1 + wo2*h2)
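As a concrete sketch of this forward pass (the weight names mirror the slide; the numeric values are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up weights for a 2-input, 2-hidden, 1-output network
w11, w12 = 0.5, -0.3   # into hidden neuron h1
w21, w22 = 0.8,  0.2   # into hidden neuron h2
wo1, wo2 = 1.0, -1.0   # into output neuron o1

x1, x2 = 0.7, 0.1      # an example input

h1 = sigmoid(w11 * x1 + w12 * x2)
h2 = sigmoid(w21 * x1 + w22 * x2)
o1 = sigmoid(wo1 * h1 + wo2 * h2)
print(o1)
```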
Output layer
• We can have one output neuron per category, with a target of 1 for yes and 0 for no
• Now we can work backwards with partial derivatives, using the chain rule to update the weights (backpropagation); a worked sketch follows
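A worked sketch of those chain-rule derivatives for the tiny network above, assuming a squared-error loss (the weights, input, and target are made up; variable names mirror the slide):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass with made-up weights, input, and target y = 1
w11, w12, w21, w22, wo1, wo2 = 0.5, -0.3, 0.8, 0.2, 1.0, -1.0
x1, x2, y = 0.7, 0.1, 1.0
h1 = sigmoid(w11 * x1 + w12 * x2)
h2 = sigmoid(w21 * x1 + w22 * x2)
o1 = sigmoid(wo1 * h1 + wo2 * h2)
loss = 0.5 * (o1 - y) ** 2

# Backward pass: chain rule, layer by layer
# (sigmoid'(z) = s * (1 - s) where s = sigmoid(z))
d_o1 = (o1 - y) * o1 * (1 - o1)        # dL/d(output pre-activation)
d_wo1, d_wo2 = d_o1 * h1, d_o1 * h2    # output-weight gradients
d_h1 = d_o1 * wo1 * h1 * (1 - h1)      # push the error back to hidden layer
d_h2 = d_o1 * wo2 * h2 * (1 - h2)
d_w11, d_w12 = d_h1 * x1, d_h1 * x2    # hidden-weight gradients
d_w21, d_w22 = d_h2 * x1, d_h2 * x2
```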
Updating weights
• We then step each weight opposite its gradient, scaled by a “learning rate”: w = w - lr * dL/dw (sketched below)
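In practice a framework applies the chain rule for us; a minimal PyTorch sketch of one gradient-descent step (the learning rate 0.1 is an arbitrary value for illustration):

```python
import torch

# The same tiny 2-2-1 network, with weights that track gradients
W_hidden = torch.randn(2, 2, requires_grad=True)
w_out = torch.randn(2, requires_grad=True)

x = torch.tensor([0.7, 0.1])
y = torch.tensor(1.0)

h = torch.sigmoid(W_hidden @ x)
o = torch.sigmoid(w_out @ h)
loss = 0.5 * (o - y) ** 2
loss.backward()                 # autograd applies the chain rule

lr = 0.1                        # the "learning rate"
with torch.no_grad():           # don't track the update itself
    W_hidden -= lr * W_hidden.grad
    w_out -= lr * w_out.grad
    W_hidden.grad.zero_()       # clear gradients for the next step
    w_out.grad.zero_()
```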
Good and bad news
• Bad news! Picking a good architecture and learning rate can be hard (and the search space is huge); one blunt remedy is sketched below
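One blunt but common way to cope is a brute-force sweep over a small grid of hyper-parameters. A sketch, where train_and_score is a hypothetical stand-in for a real training loop that returns validation accuracy:

```python
import random

def train_and_score(lr, hidden):
    # Hypothetical stand-in: train a model with these hyper-parameters
    # and return validation accuracy. Faked here for illustration.
    random.seed(hash((lr, hidden)))
    return random.random()

best = None
for lr in [1.0, 0.1, 0.01, 0.001]:
    for hidden in [16, 64, 256]:
        score = train_and_score(lr=lr, hidden=hidden)
        if best is None or score > best[0]:
            best = (score, lr, hidden)
print("best (val accuracy, lr, hidden):", best)
```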
Speeding things up with a GPU
WARNING!
• Copying an array to/from the GPU is expensive!
• You have to think about where you want computation to happen!
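A minimal PyTorch sketch of keeping the computation in one place: pay the host-to-GPU copy once, chain the work on the device, and copy back only at the end.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

# One expensive host -> GPU copy per tensor...
a, b = a.to(device), b.to(device)

# ...then all the work stays on the device, with no round trips
c = (a @ b).relu()
c = c @ b

result = c.cpu()   # one copy back at the very end
```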
Jupyter Notebook
https://colab.research.google.com/
Sample NN
• https://github.com/erykml/medium_articles/blob/master/Computer%20Vision/lenet5_pytorch.ipynb
• lenet5_pytorch.ipynb (runs in Google Colab or a local Jupyter Notebook)
• Note speed difference
• Note the use of torch.no_grad() (sketched after this list)
• conda install torchvision -c pytorch
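On that torch.no_grad() point: wrapping evaluation in it turns off gradient tracking, which saves memory and time when we are only predicting. A minimal sketch (the model and loader here are stand-ins, not the notebook's LeNet-5):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in model
loader = [(torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,)))]  # fake batch

model.eval()
correct = total = 0
with torch.no_grad():               # no gradients needed for inference
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
print(f"accuracy: {correct / total:.2%}")
```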
Train/Validate/Test
• Ideally, we split the data into 3 parts (a split sketch follows this list)
• Train
– Use this to train the model and update weights
• Validate
– Use this to prevent over-training
– Don’t train on it; use it to evaluate hyper-parameters (learning rate, # of epochs, etc.)
• Test
– Use only on final model run
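A minimal sketch of a 3-way split with torch.utils.data.random_split; the 80/10/10 proportions and the toy dataset are arbitrary choices for illustration:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset: 1000 examples, 10 features, binary labels
data = TensorDataset(torch.randn(1000, 10), torch.randint(0, 2, (1000,)))

n = len(data)
n_train, n_val = int(0.8 * n), int(0.1 * n)
n_test = n - n_train - n_val        # remainder, so the sizes always add up

train_set, val_set, test_set = random_split(
    data, [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(42),   # reproducible split
)
print(len(train_set), len(val_set), len(test_set))  # 800 100 100
```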
Recurrent Neural Network
• Connections follow a temporal sequence: earlier steps feed into later ones
• Useful for applications where context helps prediction (a minimal sketch follows this list)
– Handwriting recognition (unlike zip codes, we have a good sense of what comes after “afte”)
– Speech recognition
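A minimal sketch of the idea: nn.RNN carries a hidden state across time steps, so each output can depend on everything that came before it (the sizes are arbitrary for illustration):

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(1, 5, 8)    # batch of 1, sequence of 5 steps, 8 features each
out, h_n = rnn(x)           # out: output at every step; h_n: final hidden state

# out[:, t, :] depends on inputs 0..t -- that is the temporal context
print(out.shape)            # torch.Size([1, 5, 16])
print(h_n.shape)            # torch.Size([1, 1, 16])
```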
Long short-term memory (LSTM)
• Password generator
– conditional-char-rnn.ipynb
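A hedged sketch of the character-level idea behind that notebook (a generic illustration, not the notebook's actual code; the model here is untrained, so its output is gibberish until trained):

```python
import torch
import torch.nn as nn

chars = "abcdefghijklmnopqrstuvwxyz0123456789"   # toy character set
V = len(chars)

class CharLSTM(nn.Module):
    def __init__(self, vocab, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)    # scores for the next character

    def forward(self, idx, state=None):
        out, state = self.lstm(self.embed(idx), state)
        return self.head(out), state

# Sampling loop: feed each generated character back in as the next input
model = CharLSTM(V)
idx = torch.randint(0, V, (1, 1))               # random starting character
state, generated = None, []
with torch.no_grad():
    for _ in range(12):
        logits, state = model(idx, state)
        probs = logits[0, -1].softmax(dim=0)
        idx = torch.multinomial(probs, 1).view(1, 1)
        generated.append(chars[idx.item()])
print("".join(generated))
```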