CS5242 Assignment 2
The backward pass will receive upstream derivatives and the cache object, and will return gradients with
respect to the inputs and weights, like this:
return dx, dw
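For instance, a hypothetical bias-free affine layer written in this style (purely an illustration of the API, not one of the assignment's graded functions):

import numpy as np

def affine_forward(x, w):
    # Forward pass: compute the output and cache what backward will need.
    out = x.dot(w)
    cache = (x, w)
    return out, cache

def affine_backward(dout, cache):
    # Backward pass: receive the upstream derivatives and the cache, and
    # return gradients with respect to the input and the weights.
    x, w = cache
    dx = dout.dot(w.T)
    dw = x.T.dot(dout)
    return dx, dw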
After implementing a bunch of layers this way, we will be able to easily combine them to build classifiers
with different architectures.
2 Submission details
Since we have not restricted the usage of other programming languages, our submission format will need to
be in output text form (similar to the previous assignment). For each question, we will provide the input
arguments and you have to provide a text file containing the corresponding output, to a certain precision.
This IPython notebook serves to:
• explain the questions
• explain the function APIs
• provide helper functions to piece functions together and check your code
• provide helper functions to load and save arrays as csv files for submission
Hence, we strongly encourage you to use Python for this assignment as you will only need to code the
relevant parts and it will reduce your workload significantly. For non-Python users, some of the cells here
are for illustration purpose, you do not have to replicate the demos.
The input files will be in the input_files folder, and your output files should go into the output_files folder. Similar to assignment 1, use np.float32 if you are using Python and use at least 16 significant figures for your outputs. For Python users, if you use the accompanying printing functions with np.float32 variables, you should be fine.
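For illustration, one way to meet the precision requirement in plain NumPy (the file name here is hypothetical, and the provided helper functions may format differently):

import numpy as np

arr = np.linspace(0, 1, num=6, dtype=np.float32)
# fmt='%.16e' writes 17 significant digits (one before the decimal point,
# sixteen after), which satisfies the 16-significant-figure requirement.
np.savetxt('./output_files/example_out.csv', arr, fmt='%.16e', delimiter=',')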
In [ ]: # A bit of setup
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
from code_base.classifiers.cnn import *
from code_base.data_utils import get_CIFAR2_data
from code_base.gradient_check import eval_numerical_gradient_array, eval_numerical_gradient
from code_base.layers import *
from code_base.solver import Solver

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

data = get_CIFAR2_data()
for k, v in data.items():
    print('%s: ' % k, v.shape)
The input consists of N data points, each with C channels, height H and width W. We convolve each input with F different filters, where each filter spans all C channels and has height HH and width WW.
Input:
• x: Input data of shape (N, C, H, W)
• w: Filter weights of shape (F, C, HH, WW)
• b: Biases, of shape (F,)
• conv_param: A dictionary with the following keys:
  – ‘stride’: The number of pixels between adjacent receptive fields in the horizontal and vertical directions.
  – ‘pad’: The number of pixels that will be used to zero-pad the input in each x-y direction. We will use the same definition as in lecture notes 3b, slide 13 (i.e. the same padding on both sides). Hence p=2 means a 1-pixel border of padding with zeros.
WARNING: Please implement the matrix product method of convolution as shown in Lecture notes
4, slide 38. The naive version of implementing a sliding window will be too slow when you try to train the
whole CNN in later sections.
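To make the matrix-product idea concrete, here is a minimal im2col-style sketch. It is an illustration under the definitions above (pad is the total zero-padding per spatial dimension, so p = pad // 2 zeros go on each side), not the reference implementation:

import numpy as np

def conv_forward_im2col(x, w, b, conv_param):
    # x: (N, C, H, W), w: (F, C, HH, WW), b: (F,)
    N, C, H, W = x.shape
    F, _, HH, WW = w.shape
    stride, pad = conv_param['stride'], conv_param['pad']
    p = pad // 2
    H_out = (H + pad - HH) // stride + 1
    W_out = (W + pad - WW) // stride + 1
    x_padded = np.pad(x, ((0, 0), (0, 0), (p, p), (p, p)), mode='constant')

    # im2col: unroll every receptive field into one column.
    cols = np.zeros((C * HH * WW, N * H_out * W_out))
    idx = 0
    for n in range(N):
        for i in range(H_out):
            for j in range(W_out):
                hs, ws = i * stride, j * stride
                cols[:, idx] = x_padded[n, :, hs:hs+HH, ws:ws+WW].ravel()
                idx += 1

    # A single matrix product replaces the sliding window.
    res = w.reshape(F, -1).dot(cols) + b.reshape(F, 1)
    out = res.reshape(F, N, H_out, W_out).transpose(1, 0, 2, 3)
    cache = (x, w, b, conv_param)
    return out, cache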
You can test your implementation by running the following:
In [ ]: x_shape = (2, 3, 4, 4)
w_shape = (3, 3, 4, 4)
x = np.linspace(-0.1, 0.5, num=np.prod(x_shape)).reshape(x_shape)
w = np.linspace(-0.2, 0.3, num=np.prod(w_shape)).reshape(w_shape)
b = np.linspace(-0.1, 0.2, num=3)
FOR SUBMISSION: Submit the corresponding output from your forward convolution for the given input arguments. Load the files conv_forward_in_x.csv, conv_forward_in_w.csv and conv_forward_in_b.csv; they contain the input arguments for x, w and b respectively, flattened to 1D arrays in C-style, row-major order (see numpy.ravel for details: https://docs.scipy.org/doc/numpy/reference/generated/numpy.ravel.html).
For Python users, you can use the code below to load and reshape the arrays to feed into your conv_forward function. Code is also provided to flatten the array and save your output to a csv file. For users of other programming languages, you have to submit the output file conv_forward_out.csv, which contains the flattened output of conv_forward. The array must be flattened in row-major order or else our automated scripts will mark your outputs as incorrect.
In [ ]: x_shape = (2, 3, 6, 6)
w_shape = (3, 3, 4, 4)
x = np.loadtxt('./input_files/conv_forward_in_x.csv', delimiter=',')
x = x.reshape(x_shape)
w = np.loadtxt('./input_files/conv_forward_in_w.csv', delimiter=',')
w = w.reshape(w_shape)
b = np.loadtxt('./input_files/conv_forward_in_b.csv', delimiter=',')
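The remaining step is to run conv_forward and save the flattened result; a hedged completion of the cell above (the conv_param values below are placeholders, not the question's actual settings):

conv_param = {'stride': 1, 'pad': 2}  # placeholder values only
out, _ = conv_forward(x, w, b, conv_param)
np.savetxt('./output_files/conv_forward_out.csv', out.ravel(), delimiter=',')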
def imshow_noax(img, normalize=True):
    """Tiny helper to show images as uint8 and remove axis labels."""
    if normalize:
        img_max, img_min = np.max(img), np.min(img)
        img = 255.0 * (img - img_min) / (img_max - img_min)
    plt.imshow(img.astype('uint8'))
    plt.gca().axis('off')
# Show the original images and the results of the conv operation
plt.subplot(2, 3, 1)
imshow_noax(puppy, normalize=False)
plt.title('Original image')
plt.subplot(2, 3, 2)
imshow_noax(out[0, 0])
plt.title('Grayscale')
plt.subplot(2, 3, 3)
imshow_noax(out[0, 1])
plt.title('Edges')
plt.subplot(2, 3, 4)
imshow_noax(kitten_cropped, normalize=False)
plt.subplot(2, 3, 5)
imshow_noax(out[1, 0])
plt.subplot(2, 3, 6)
imshow_noax(out[1, 1])
plt.show()
The conv_param dictionary in the cache uses the same keys as in the forward pass:
• ‘stride’: The number of pixels between adjacent receptive fields in the horizontal and vertical directions.
• ‘pad’: The number of pixels that will be used to zero-pad the input in each x-y direction. We will use the same definition as in lecture notes 3b, slide 13 (i.e. the same padding on both sides).
For Python users, you can use the code below to load and reshape the arrays. Note that the code runs conv_forward first and saves the relevant arrays in cache for conv_backward. Code is also provided to flatten and save your output to a csv file. For users of other programming languages, you have to submit the output files conv_backward_out_dx.csv, conv_backward_out_dw.csv and conv_backward_out_db.csv, which contain the flattened outputs of conv_backward. The arrays must be flattened in row-major order or else our automated scripts will mark your outputs as incorrect.
In [ ]: x_shape = (4, 3, 5, 5)
w_shape = (2, 3, 3, 3)
dout_shape = (4, 2, 5, 5)
x = np.loadtxt('./input_files/conv_backward_in_x.csv')
x = x.reshape(x_shape)
w = np.loadtxt('./input_files/conv_backward_in_w.csv')
w = w.reshape(w_shape)
b = np.loadtxt('./input_files/conv_backward_in_b.csv')
dout = np.loadtxt('./input_files/conv_backward_in_dout.csv')
dout = dout.reshape(dout_shape)

# Forward pass to build the cache, then the backward pass.
# stride=1, pad=2 is inferred from the shapes: (5 + 2 - 3)/1 + 1 = 5.
conv_param = {'stride': 1, 'pad': 2}
out, cache = conv_forward(x, w, b, conv_param)
dx, dw, db = conv_backward(dout, cache)

np.savetxt('./output_files/conv_backward_out_dx.csv', dx.ravel())
np.savetxt('./output_files/conv_backward_out_dw.csv', dw.ravel())
np.savetxt('./output_files/conv_backward_out_db.csv', db.ravel())
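If you want to sanity-check the gradient math itself, here is a naive loop-based sketch of the backward pass (an illustration only; it pairs with the forward sketch above and is far too slow for training):

def conv_backward_naive_sketch(dout, cache):
    x, w, b, conv_param = cache
    N, C, H, W = x.shape
    F, _, HH, WW = w.shape
    stride, pad = conv_param['stride'], conv_param['pad']
    p = pad // 2
    _, _, H_out, W_out = dout.shape
    x_padded = np.pad(x, ((0, 0), (0, 0), (p, p), (p, p)), mode='constant')
    dx_padded = np.zeros_like(x_padded)
    dw = np.zeros_like(w)
    db = dout.sum(axis=(0, 2, 3))  # each bias touches every output pixel of its filter
    for n in range(N):
        for f in range(F):
            for i in range(H_out):
                for j in range(W_out):
                    hs, ws = i * stride, j * stride
                    # out[n,f,i,j] = sum(field * w[f]) + b[f], so dout flows
                    # back into both the input field and the filter.
                    dx_padded[n, :, hs:hs+HH, ws:ws+WW] += w[f] * dout[n, f, i, j]
                    dw[f] += x_padded[n, :, hs:hs+HH, ws:ws+WW] * dout[n, f, i, j]
    dx = dx_padded[:, :, p:p+H, p:p+W]  # strip the zero padding
    return dx, dw, db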
In [ ]: x_shape = (2, 3, 4, 4)
x = np.linspace(-0.3, 0.4, num=np.prod(x_shape)).reshape(x_shape)
pool_param = {'pool_width': 2, 'pool_height': 2, 'stride': 2}
correct_out = np.array([[[[-0.26315789, -0.24842105],
[-0.20421053, -0.18947368]],
[[-0.14526316, -0.13052632],
[-0.08631579, -0.07157895]],
[[-0.02736842, -0.01263158],
[ 0.03157895, 0.04631579]]],
[[[ 0.09052632, 0.10526316],
[ 0.14947368, 0.16421053]],
[[ 0.20842105, 0.22315789],
[ 0.26736842, 0.28210526]],
[[ 0.32631579, 0.34105263],
[ 0.38526316, 0.4 ]]]])
FOR SUBMISSION: Submit the corresponding output from your forward maxpool for the given input arguments.
Inputs:
• x: Input data, of shape (N, C, H, W)
• pool_param: dictionary with the following keys:
  – ‘pool_height’: The height of each pooling region
  – ‘pool_width’: The width of each pooling region
  – ‘stride’: The distance between adjacent pooling regions
In [ ]: x_shape = (3, 3, 8, 8)
pool_param = {'pool_width': 2, 'pool_height': 2, 'stride': 2}
x = np.loadtxt('./input_files/maxpool_forward_in_x.csv')
x = x.reshape(x_shape)
# (Run your maxpool forward and backward passes here to obtain dx before
#  saving; the intermediate calls are left to your implementation.)
np.savetxt('./output_files/maxpool_backward_out.csv', dx.ravel())
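For intuition, here is a minimal sketch of the forward pooling step under these parameters (the function name and return convention follow the layer API used throughout, but this is an illustration, not the graded implementation):

def max_pool_forward_sketch(x, pool_param):
    N, C, H, W = x.shape
    ph, pw = pool_param['pool_height'], pool_param['pool_width']
    stride = pool_param['stride']
    H_out = (H - ph) // stride + 1
    W_out = (W - pw) // stride + 1
    out = np.zeros((N, C, H_out, W_out))
    for i in range(H_out):
        for j in range(W_out):
            window = x[:, :, i*stride:i*stride+ph, j*stride:j*stride+pw]
            out[:, :, i, j] = window.max(axis=(2, 3))  # max over each window
    return out, (x, pool_param)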
9 Convolutional “sandwich” layers
Here we introduce the concept of “sandwich” layers that combine multiple operations into commonly used patterns. In the file code_base/layer_utils.py you will find sandwich layers that implement a few commonly used patterns for convolutional networks. With a modular design, it is very convenient to combine layers according to your network architecture.
The following code tests the sandwich layers conv_relu_pool_forward, conv_relu_pool_backward, conv_relu_forward and conv_relu_backward.
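The composition pattern looks roughly like this (a sketch of the idea; see code_base/layer_utils.py for the actual implementations, and note that relu_forward/relu_backward are assumed to come from code_base.layers):

def conv_relu_forward(x, w, b, conv_param):
    # Chain the two forward passes, keeping both caches for backward.
    a, conv_cache = conv_forward(x, w, b, conv_param)
    out, relu_cache = relu_forward(a)
    return out, (conv_cache, relu_cache)

def conv_relu_backward(dout, cache):
    # Unwind in reverse order: ReLU first, then the convolution.
    conv_cache, relu_cache = cache
    da = relu_backward(dout, relu_cache)
    dx, dw, db = conv_backward(da, conv_cache)
    return dx, dw, db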
print('Testing conv_relu_pool')
print('dx error: ', rel_error(dx_num, dx))
print('dw error: ', rel_error(dw_num, dw))
print('db error: ', rel_error(db_num, db))

print('Testing conv_relu:')
print('dx error: ', rel_error(dx_num, dx))
print('dw error: ', rel_error(dw_num, dw))
print('db error: ', rel_error(db_num, db))
10 Three-layer ConvNet
Now that you have implemented all the necessary layers, we can put them together into a simple convolutional
network.
Open the file code_base/classifiers/cnn.py and complete the implementation of the ThreeLayerConvNet class. Run the following cells to help you debug:
In [ ]: model = ThreeLayerConvNet()
N = 50
X = np.random.randn(N, 3, 32, 32)
y = np.random.randint(10, size=N)
model.reg = 0.5
loss, grads = model.loss(X, y)
print('Initial loss (with regularization): ', loss)
12 Gradient check
After the loss looks reasonable, use numeric gradient checking to make sure that your backward pass is
correct. When you use numeric gradient checking you should use a small amount of artifical data and a
small number of neurons at each layer. Note: correct implementations may still have relative errors up to
1e-2.
In [ ]: num_inputs = 2
input_dim = (3, 16, 16)
reg = 0.0
num_classes = 10
np.random.seed(231)
X = np.random.randn(num_inputs, *input_dim)
y = np.random.randint(num_classes, size=num_inputs)
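One way the check might then proceed (a sketch: the ThreeLayerConvNet constructor arguments shown here are assumptions about its API, and rel_error is the helper used elsewhere in this notebook):

model = ThreeLayerConvNet(num_filters=3, filter_size=3,
                          input_dim=input_dim, hidden_dim=7,
                          num_classes=num_classes, reg=reg)
loss, grads = model.loss(X, y)
for param_name in sorted(grads):
    f = lambda _: model.loss(X, y)[0]  # loss as a function of one parameter array
    param_grad_num = eval_numerical_gradient(f, model.params[param_name],
                                             verbose=False, h=1e-6)
    print('%s max relative error: %e'
          % (param_name, rel_error(param_grad_num, grads[param_name])))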
13 Solver
Following a modular design, for this assignment we have split the logic for training models into a separate
class. Open the file code base/solver.py and read through it to familiarize yourself with the API. We have
provided the functions for the various optimization techniques such as sgd and Adam.
14 Overfit small data
A nice trick is to train your model with just a few training samples to check that your code is working. You
should be able to overfit small datasets, which will result in very high training accuracy and comparatively
low validation accuracy.
In [ ]: np.random.seed(231)
num_train = 100
small_data = {
    'X_train': data['X_train'][:num_train],
    'y_train': data['y_train'][:num_train],
    'X_val': data['X_val'],
    'y_val': data['y_val'],
}
model = ThreeLayerConvNet(weight_scale=1e-2)
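A sketch of how the Solver might then be driven (update_rule and optim_config are assumed Solver arguments here, and the hyperparameter values are illustrative, not prescribed):

solver = Solver(model, small_data,
                num_epochs=15, batch_size=50,
                update_rule='adam',
                optim_config={'learning_rate': 1e-3},
                verbose=True, print_every=1)
solver.train()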
Plotting the loss, training accuracy, and validation accuracy should show clear overfitting:
In [ ]: plt.subplot(2, 1, 1)
plt.plot(solver.loss_history, 'o')
plt.xlabel('iteration')
plt.ylabel('loss')

plt.subplot(2, 1, 2)
plt.plot(solver.train_acc_history, '-o')
plt.plot(solver.val_acc_history, '-o')
plt.legend(['train', 'val'], loc='upper left')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.show()
In [ ]: solver = Solver(model, data,
                verbose=True, print_every=20)
solver.train()
16 Visualize Filters
You can visualize the first-layer convolutional filters from the trained network by running the following:
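One minimal way to do this, assuming the first-layer weights are stored as model.params['W1'] with shape (F, C, HH, WW) (an assumption about the parameter naming in cnn.py):

W1 = model.params['W1']
F = W1.shape[0]
cols = int(np.ceil(np.sqrt(F)))
for f in range(F):
    plt.subplot(cols, cols, f + 1)
    img = W1[f].transpose(1, 2, 0)  # to (HH, WW, C) for imshow
    img = 255.0 * (img - img.min()) / (img.max() - img.min())
    plt.imshow(img.astype('uint8'))
    plt.axis('off')
plt.show()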
17 Dropout
Dropout [1] is a technique for regularizing neural networks by randomly setting some features to zero during
the forward pass. In this exercise you will implement a dropout layer and modify your fully-connected
network to optionally use dropout.
[1] Geoffrey E. Hinton et al., “Improving neural networks by preventing co-adaptation of feature detectors”, arXiv 2012
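Since the cell below treats larger p as more dropout, p is the probability of dropping a unit. Here is a minimal sketch of inverted dropout under that convention (whether the assignment's reference implementation uses inverted dropout is an assumption):

def dropout_forward_sketch(x, dropout_param):
    p, mode = dropout_param['p'], dropout_param['mode']
    if mode == 'train':
        # Drop each unit with probability p; rescale the survivors so the
        # expected activation matches test time (inverted dropout).
        mask = (np.random.rand(*x.shape) >= p) / (1.0 - p)
        out = x * mask
    else:
        mask = None
        out = x  # no test-time scaling needed with inverted dropout
    return out, (dropout_param, mask)

def dropout_backward_sketch(dout, cache):
    dropout_param, mask = cache
    # Gradient flows only through the units that were kept.
    return dout * mask if dropout_param['mode'] == 'train' else dout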
In [ ]: x = np.loadtxt('./input_files/dropout_forward_in_x.csv')

# Larger p means more dropout
p = 0.3
out_train, _ = dropout_forward(x, {'mode': 'train', 'p': p})
out_test, _ = dropout_forward(x, {'mode': 'test', 'p': p})
np.savetxt('./output_files/dropout_forward_out_train.csv', out_train)
np.savetxt('./output_files/dropout_forward_out_test.csv', out_test)
FOR SUBMISSION: Submit the corresponding output from your backward dropout for the given
input arguments.
In [ ]: dout = np.loadtxt('./input_files/dropout_backward_in_dout.csv')
x = np.loadtxt('./input_files/dropout_backward_in_x.csv')
dropout_param = {'mode': 'train', 'p': 0.8}
out, cache = dropout_forward(x, dropout_param)
dx_train = dropout_backward(dout, cache)
np.savetxt('./output_files/dropout_backward_out_train.csv', dx_train)
3) A short report (1-2 pages) in PDF titled report.pdf, explaining the logic (expressed using mathematical expressions) behind coding each function and the findings from training your best net.