redes-neuronales-desde-0
Neural Network Description
I have implemented a neural network from scratch using Python and Jupyter Notebook. The neural network consists of multiple layers, including input, hidden, and output layers. Each neuron in the network applies weights to the input data, followed by an activation function to produce the output. The model architecture includes fully connected layers with ReLU activation functions for the hidden layers and a softmax activation function for the output layer in the case of classification tasks. I have used stochastic gradient descent (SGD) as the optimization algorithm and categorical cross-entropy as the loss function during training.
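As an illustration of the output-layer components mentioned above, here is a minimal NumPy sketch of softmax and categorical cross-entropy (the helper names are illustrative only and are not part of the class implemented later in this notebook):

[ ]: import numpy as np

def softmax(z):
    # subtract the row-wise maximum for numerical stability
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def categorical_cross_entropy(y_true, y_prob, eps=1e-12):
    # y_true is one-hot encoded; average the negative log-likelihood over samples
    return -np.mean(np.sum(y_true * np.log(y_prob + eps), axis=1))

y_true = np.array([[0, 1, 0], [1, 0, 0]])
logits = np.array([[0.2, 1.5, -0.3], [2.0, 0.1, 0.1]])
print(categorical_cross_entropy(y_true, softmax(logits)))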
For training the neural network, I used a labeled dataset consisting of feature vectors and corresponding target labels. The dataset was split into training and validation sets to monitor the model's performance and prevent overfitting. The training process involved iterating over the dataset for multiple epochs, adjusting the model's weights using backpropagation, and evaluating the model's performance on the validation set.
After training the neural network, I evaluated its performance on a separate test dataset to assess its generalization ability. The evaluation metrics included accuracy, precision, recall, and F1-score, depending on the specific task.
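These metrics are straightforward to compute with scikit-learn. A small illustrative sketch, with toy labels standing in for the network's actual predictions:

[ ]: import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# toy labels and predictions; in the notebook these come from the trained network
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])

print(accuracy_score(y_true, y_pred))   # fraction of correct predictions
print(precision_score(y_true, y_pred))  # TP / (TP + FP)
print(recall_score(y_true, y_pred))     # TP / (TP + FN)
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall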
[7]: import matplotlib.image as mpimg
import matplotlib.pyplot as plt

image = mpimg.imread('Architecture-of-multilayer-artificial-neural-network-with-error-backpropagation.png')
plt.imshow(image)
plt.axis('off')
plt.show()
Forward Pass in Neural Network
The forward pass is a fundamental step in the operation of a neural network. It is the process by which input data is fed through the network's layers to generate an output prediction.
During the forward pass, each layer of the neural network performs two main operations:
1. Linear Transformation: The input data is multiplied by a weight matrix and added to a bias vector. This step computes the weighted sum of inputs to each neuron in the layer.
2. Activation Function: The result of the linear transformation is passed through a non-linear activation function (such as sigmoid, tanh, or ReLU) to produce the neuron's output.
The output of one layer serves as the input to the next layer, and this process continues until the data passes through all the layers of the network. The final output generated by the output layer represents the prediction made by the neural network for the given input.
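A minimal sketch of such a forward pass (illustrative only; the helper names here are not the ones used in the mlp class below):

[ ]: import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_pass(X, weights, biases):
    # each layer: linear transformation followed by the activation function
    a = X
    for W, b in zip(weights, biases):
        z = a @ W + b      # weighted sum of inputs plus bias
        a = sigmoid(z)     # non-linear activation
    return a               # output of the final layer is the prediction

rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(4, 1))]  # 3 inputs -> 4 hidden -> 1 output
biases = [np.zeros(4), np.zeros(1)]
print(forward_pass(rng.normal(size=(2, 3)), weights, biases))  # predictions for 2 samples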
[9]: image = mpimg.imread("Screenshot 2024-02-19 145834.png")
plt.imshow(image)
plt.axis('off')
plt.show()
Loss Function in Neural Networks
In neural networks, the loss function measures the disparity between predicted and actual values during training. The mean squared error (MSE) is a common loss function, expressed as:

$$\mathrm{MSE} = \mathbb{E}\left[(y - f(x))^2\right]$$

Here, $y$ represents the actual target value, and $f(x)$ is the predicted output for input $x$. The goal is to minimize this function, achieved through techniques like backpropagation and gradient descent, to enhance predictive accuracy.
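As a quick sketch, the MSE of a batch of predictions can be computed directly with NumPy:

[ ]: import numpy as np

def mse(y, y_hat):
    # average squared difference between targets and predictions
    return np.mean((y - y_hat) ** 2)

y = np.array([1.0, 0.0, 1.0])
y_hat = np.array([0.9, 0.2, 0.7])
print(mse(y, y_hat))  # ~0.047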
Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is a widely used optimization algorithm in deep learning and machine learning. It is employed to optimize model parameters and is often more computationally efficient than full-batch gradient descent when dealing with large datasets.

SGD approximates the gradient using individual training examples (input, target output) and updates the model parameters after each estimate. Because each gradient is computed from a single example (or a small random subset of the data) rather than the entire dataset, every update is cheap and the training process is faster.

One of the main advantages of SGD is its ability to reduce the risk of getting stuck in local minima. Random sampling makes the gradient updates more diverse and noisy, which often helps the optimization escape poor local minima and approach the global minimum faster.
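A minimal sketch of the per-example update rule, using a simple linear model in place of the network (all names here are illustrative):

[ ]: import numpy as np

def sgd_step(w, grad, lr=0.01):
    # move the parameters a small step against the gradient
    return w - lr * grad

rng = np.random.default_rng(0)
w = rng.normal(size=3)
for _ in range(100):
    x, y = rng.normal(size=3), 1.0    # one (input, target) pair at a time
    grad = 2 * (x @ w - y) * x        # gradient of (x @ w - y)**2 w.r.t. w
    w = sgd_step(w, grad, lr=0.05)
print(w)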
[10]: image = mpimg.imread("Başlıksız.png")
plt.imshow(image)
plt.axis('off')
plt.show()
Backpropagation
Backpropagation is a key algorithm or training artifcial neural networks. It allows the network to
update its parameters in order to minimize the error between the predicted output and the actual
output.
The process o backpropagation involves computing the gradient o the loss unction with respect
to each parameter in the network. This gradient is then used to update the parameters using an
optimization algorithm such as stochastic gradient descent.
Backpropagation works by propagating the error backwards through the network, hence the name.
It calculates the contribution o each parameter to the overall error, allowing the network to adjust
its weights and biases accordingly.
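The following is a compact sketch of a single backpropagation step for a network with one hidden layer, sigmoid activations, and squared-error loss (illustrative only; the mlp class implemented below organizes the same computation through its own helper methods):

[ ]: import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.integers(0, 2, size=(8, 1)).astype(float)
W1, b1 = rng.normal(size=(3, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
lr, N = 0.1, len(X)

# forward pass
a1 = sigmoid(X @ W1 + b1)
a2 = sigmoid(a1 @ W2 + b2)
loss = 0.5 * np.mean((a2 - y) ** 2)

# backward pass: propagate the error from the output layer back to the hidden layer
d2 = (a2 - y) / N * a2 * (1 - a2)   # dLoss/dz2 for L = 0.5 * mean((a2 - y)^2)
d1 = (d2 @ W2.T) * a1 * (1 - a1)    # dLoss/dz1, chain rule through W2 and the activation
W2 -= lr * (a1.T @ d2); b2 -= lr * d2.sum(axis=0, keepdims=True)
W1 -= lr * (X.T @ d1);  b1 -= lr * d1.sum(axis=0, keepdims=True)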
[13]: image = mpimg.imread("Screenshot 2024-02-19 153123.png")
plt.figure(figsize=(16,16))
plt.imshow(image)
plt.axis('off')
plt.show()
[14]: image = mpimg.imread("Screenshot 2024-02-19 153300.png")
plt.figure(figsize=(16,16))
plt.imshow(image)
plt.axis('off')
plt.show()
1 Multilayer Neural Networks
[3]: from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix

class mlp:
    # Multilayer perceptron trained with backpropagation; the constructor arguments
    # match the calls used below, e.g. mlp(layer1=[30, 30], output_size=1,
    # learning_rate=0.05, activation='tanh', epoch=50, random_state=42)
    def __init__(self, layer1, output_size, learning_rate, activation, epoch, random_state):
        self.layer1 = layer1              # hidden layer sizes
        self.activation = activation      # activation name, e.g. 'tanh' or 'relu'
        self.learning_rate = learning_rate
        self.random_state = random_state
        self.output_size = output_size
        self.epoch = epoch
    # fragment of sigmoid_derivative (derivative of the chosen activation function)
        return 1 - np.power(self.sigmoid(x), 2)
    elif self.activation == 'relu':
        return np.where(x > 0, 1, 0)

    # fragment of w_initilaized (weight and bias initialization)
            all_bias.append(np.zeros([layers[i], 1]))
        return all_weight, all_bias

    # fragment of sm_calculator (sensitivities propagated backwards through the layers)
        else:
            b = np.zeros([output['a_output'][i].shape[1], output['a_output'][i].shape[1]])
            np.fill_diagonal(b, self.sigmoid_derivative(output['a_output'][i-1]))
            s_m[i] = np.dot(np.dot(b, self.W[i+1]), s_m[i+1])
        self.s_m = s_m
        return s_m
    def forward(self, layer1, X, output_size, matrix):
        # forward pass: store every layer's net input and activation
        neurons_output = {'net_input': [], 'a_output': []}
        W = matrix[0]
        b = matrix[1]
        self.W = W
        self.b = b
        for i in range(len(layer1) + 1):
            if i == 0:
                n = np.dot(X, self.W[0]) + self.b[0].T
                neurons_output['net_input'].append(n)
                a_h = self.sigmoid(n)
                neurons_output['a_output'].append(a_h)
            else:
                n = np.dot(a_h, self.W[i]) + self.b[i].T
                neurons_output['net_input'].append(n)
                a_h = self.sigmoid(n)
                neurons_output['a_output'].append(a_h)
        return neurons_output
    # fragment of backward (gradient step on the weights and biases of each layer)
            grad = sum(multip) / X.shape[0]
            self.W[i] = self.W[i] - self.learning_rate * grad.T
            self.b[i] = self.b[i] - self.learning_rate * np.mean(sm[i], axis=1).reshape(-1, 1)
        else:
            multip = [np.dot(sm[i][:, p].reshape(-1, 1), output['a_output'][i-1][p, :].reshape(-1, 1).T)
                      for p in range(output['a_output'][0].shape[0])]
            grad = sum(multip) / X.shape[0]
            self.W[i] = self.W[i] - self.learning_rate * grad.T
            self.b[i] = self.b[i] - self.learning_rate * np.mean(sm[i], axis=1).reshape(-1, 1)
        self.grad = grad
        return self.W, self.b
    def fit(self, X, y):
        # batch training: forward pass, sensitivities, weight update, repeated each epoch
        self.matrix = self.w_initilaized(X.shape[1], self.output_size, self.layer1)
        losses = []
        for i in range(self.epoch):
            output1 = self.forward(self.layer1, X, self.output_size, self.matrix)
            sm = self.sm_calculator(self.layer1, X, y, self.output_size, output1)
            self.matrix = self.backward(self.layer1, X, y, self.output_size, output1, sm)
            loss = np.mean((y - output1['a_output'][-1])**2)
            if i % 10 == 0:
                print(f'epoch {i} error{loss}')
            losses.append(loss)
        self.error = losses
        return output1['a_output'][-1]
    def predict(self, X):
        pred = self.forward(self.layer1, X, self.output_size, self.matrix)
        pred = pred['a_output'][-1]
        return np.where(pred > 0.5, 1, 0)

    def predict_prob(self, X):
        pred = self.forward(self.layer1, X, self.output_size, self.matrix)
        pred = pred['a_output'][-1]
        return pred
    def online_fit(self, X_, y_):
        # online (incremental) training: the weights are updated after every single example
        self.matrix = self.w_initilaized(X_.shape[1], self.output_size, self.layer1)
        all_losses = []
        for i in range(self.epoch):
            losses = []
            for X, y in zip(X_, y_):
                X = X.reshape(1, -1)
                y = y.reshape(1, -1)
                output1 = self.forward(self.layer1, X, self.output_size, self.matrix)
                sm = self.sm_calculator(self.layer1, X, y, self.output_size, output1)
                self.matrix = self.backward(self.layer1, X, y, self.output_size, output1, sm)
                loss = (y - output1['a_output'][-1])**2
                losses.append(loss)
            if i % 10 == 0:
                print(f'epoch {i} error', sum(losses)/len(losses))
            all_losses.append(sum(losses)/len(losses))
        self.error = all_losses
        return output1['a_output'][-1]
scl=MinMaxScaler()
scl.fit(X_train)
X_train_s=scl.transform(X_train)
X_test_s=scl.transform(X_test)
epoch 0 error1.8918892501815365
epoch 10 error0.4509385127922095
epoch 20 error0.42724271534455616
epoch 30 error0.41210622118762924
epoch 40 error0.40033610696739735
epoch 50 error0.38787977685346187
epoch 60 error0.4165918141142754
epoch 70 error0.43673985560917705
epoch 80 error0.43978566128469
epoch 90 error0.4357359954420347
epoch 100 error0.4287875568610628
epoch 110 error0.4210103779751217
epoch 120 error0.41871515760189
epoch 130 error0.42444669828764603
epoch 140 error0.43496231241852684
epoch 150 error0.44360661403823914
epoch 160 error0.44786980186990744
epoch 170 error0.44951215394187194
epoch 180 error0.4495585747392262
epoch 190 error0.4485685657006962
epoch 200 error0.44694778134302027
epoch 210 error0.44490208502343725
epoch 220 error0.4426077215755462
epoch 230 error0.4402444345908341
epoch 240 error0.43792251408958227
epoch 250 error0.435756775968689
epoch 260 error0.4337173594348666
epoch 270 error0.43161945335246527
epoch 280 error0.429515024607559
epoch 290 error0.4275416529663787
[228]: train_pred=ann.predict_prob(X_train_s)
train_pred_last=np.where(train_pred>0.5,1,0)
test_pred=ann.predict_prob(X_test_s)
test_pred_last=np.where(test_pred>0.5,1,0)
0.9252747252747253
0.956140350877193
[233]: from sklearn.metrics import confusion_matrix
#confusion_matrix=confusion_matrix(y_test, test_pred_last)
confusion_matrix=confusion_matrix(y_train, train_pred_last)
fig, ax = plt.subplots()
im = ax.imshow(confusion_matrix, interpolation='nearest', cmap=plt.cm.Blues)
ax.set_xticks(np.arange(len(class_names)))
ax.set_yticks(np.arange(len(class_names)))
ax.set_xticklabels(class_names, rotation=45, ha='right')
ax.set_yticklabels(class_names)
for i in range(len(class_names)):
    for j in range(len(class_names)):
        text = ax.text(j, i, confusion_matrix[i, j],
                       ha='center', va='center', fontsize=8)
fig.colorbar(im)
plt.title("Confusion Matrix")
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.show()
3 Online (Incremental) Learning
[235]: ann=mlp(layer1=[30,30], output_size=1, learning_rate=0.05, activation='tanh', epoch=50, random_state=42)
[237]: train_pred=ann.predict_prob(X_train_s)
train_pred_last=np.where(train_pred>0.5,1,0)
test_pred=ann.predict_prob(X_test_s)
test_pred_last=np.where(test_pred>0.5,1,0)
0.9362637362637363
0.9473684210526315
plt.xlabel('Epoch')
plt.ylabel('Error')
plt.show()
[240]: from sklearn.metrics import confusion_matrix
confusion_matrix=confusion_matrix(y_test, test_pred_last)
#confusion_matrix=confusion_matrix(y_train, train_pred_last)
fig, ax = plt.subplots()
im = ax.imshow(confusion_matrix, interpolation='nearest', cmap=plt.cm.Blues)
ax.set_xticks(np.arange(len(class_names)))
ax.set_yticks(np.arange(len(class_names)))
ax.set_xticklabels(class_names, rotation=45, ha='right')
ax.set_yticklabels(class_names)
for i in range(len(class_names)):
    for j in range(len(class_names)):
        text = ax.text(j, i, confusion_matrix[i, j],
                       ha='center', va='center', fontsize=8)
fig.colorbar(im)
plt.title("Confusion Matrix")
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.show()
scl=MinMaxScaler()
scl.fit(X_train)
X_train_s=scl.transform(X_train)
X_test_s=scl.transform(X_test)
epoch 0 error1.3237539217647336
epoch 10 error1.2957911346204896
epoch 20 error1.2607761296416244
epoch 30 error1.216191972861665
epoch 40 error1.1591489237642119
epoch 50 error1.0888419750277023
epoch 60 error1.0100398241341388
epoch 70 error0.9316724601375149
epoch 80 error0.8609179523186069
epoch 90 error0.8008322793452802
epoch 100 error0.7516183000189318
epoch 110 error0.7121526346121857
epoch 120 error0.6808782460183588
epoch 130 error0.6562400168169608
epoch 140 error0.6368676172448628
epoch 150 error0.6216276632959881
epoch 160 error0.6096147134318352
epoch 170 error0.6001188782759205
epoch 180 error0.5925886839850463
epoch 190 error0.5865971413508803
epoch 200 error0.5818135676021279
epoch 210 error0.5779812798389294
epoch 220 error0.5749003697305514
epoch 230 error0.5724145606140236
epoch 240 error0.5704012229919797
epoch 250 error0.5687637871766422
epoch 260 error0.567425959867396
epoch 270 error0.5663272954662575
epoch 280 error0.5654197867690168
epoch 290 error0.5646652260291023
[288]: train_pred=ann.predict_prob(X_train_s)
train_pred_last=np.where(train_pred>0.5,1,0)
test_pred=ann.predict_prob(X_test_s)
test_pred_last=np.where(test_pred>0.5,1,0)
[289]: print(accuracy_score(y_train, train_pred_last))
print(accuracy_score(y_test, test_pred_last))
0.9
1.0
fig, ax = plt.subplots()
im = ax.imshow(confusion_matrix, interpolation='nearest', cmap=plt.cm.Blues)
ax.set_xticks(np.arange(len(class_names)))
ax.set_yticks(np.arange(len(class_names)))
ax.set_xticklabels(class_names, rotation=45, ha='right')
ax.set_yticklabels(class_names)
for i in range(len(class_names)):
    for j in range(len(class_names)):
        text = ax.text(j, i, confusion_matrix[i, j],
                       ha='center', va='center', fontsize=8)
fig.colorbar(im)
plt.title("Confusion Matrix")
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.show()
5 Online Learning On Iris Dataset
[292]: ann=mlp(layer1=[4,10], output_size=1, learning_rate=0.001, activation='tanh', epoch=300, random_state=42)
[294]: train_pred=ann.predict_prob(X_train_s)
train_pred_last=np.where(train_pred>0.5,1,0)
test_pred=ann.predict_prob(X_test_s)
test_pred_last=np.where(test_pred>0.5,1,0)
0.9625
1.0
[296]: plt.plot(np.arange(1, len(ann.error)+1).reshape(-1,1), np.concatenate(ann.error))
plt.xlabel('Epoch')
plt.ylabel('Error')
plt.show()