
strach-mlp-git

February 19, 2024

[5]: import matplotlib.pyplot as plt


import matplotlib.image as mpimg

Neural Network Description
I have implemented a neural network from scratch using Python and Jupyter Notebook. The neural network consists of multiple layers, including input, hidden, and output layers. Each neuron in the network applies weights to the input data, followed by an activation function, to produce the output. The model architecture includes fully connected layers with ReLU activation functions for the hidden layers and a softmax activation function for the output layer in the case of classification tasks. I have used stochastic gradient descent (SGD) as the optimization algorithm and categorical cross-entropy as the loss function during training.
For training the neural network, I used a labeled dataset consisting of feature vectors and corresponding target labels. The dataset was split into training and validation sets to monitor the model's performance and prevent overfitting. The training process involved iterating over the dataset for multiple epochs, adjusting the model's weights using backpropagation, and evaluating the model's performance on the validation set.
After training the neural network, I evaluated its performance on a separate test dataset to assess its generalization ability. The evaluation metrics included accuracy, precision, recall, and F1-score, depending on the specific task.
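As an illustration, here is a minimal sketch of the kind of forward computation and loss described above. The layer sizes are hypothetical, and this sketch uses the ReLU/softmax/cross-entropy setup from the description rather than the implementation developed later in this notebook (which uses sigmoid or tanh activations and mean squared error).

import numpy as np

# Hypothetical sizes: 4 input features, one hidden layer of 8 units, 3 output classes.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def relu(z):
    return np.maximum(z, 0)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))   # subtract the row max for numerical stability
    return e / e.sum(axis=1, keepdims=True)

def forward(X):
    h = relu(X @ W1 + b1)          # hidden layer: weighted sum plus ReLU
    return softmax(h @ W2 + b2)    # output layer: weighted sum plus softmax

def categorical_cross_entropy(y_onehot, p):
    return -np.mean(np.sum(y_onehot * np.log(p + 1e-12), axis=1))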
[7]: image = mpimg.imread('Architecture-of-multilayer-artificial-neural-network-with-error-backpropagation.png')
plt.imshow(image)
plt.axis('off')
plt.show()

Forward Pass in Neural Network
The forward pass is a fundamental step in the operation of a neural network. It is the process by which input data is fed through the network's layers to generate an output prediction.
During the forward pass, each layer of the neural network performs two main operations:
1. Linear Transformation: The input data is multiplied by a weight matrix and added to a bias vector. This step computes the weighted sum of inputs to each neuron in the layer.
2. Activation Function: The result of the linear transformation is passed through a non-linear activation function (for example sigmoid, tanh, or ReLU), which allows the network to model non-linear relationships.
The output of one layer serves as the input to the next layer, and this process continues until the data has passed through all the layers of the network. The final output generated by the output layer represents the prediction made by the neural network for the given input.
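As a minimal sketch (assuming the weights are kept as one matrix and one bias vector per layer, similar to the mlp class defined later in this notebook), the whole forward pass is a loop that repeats these two operations layer by layer:

import numpy as np

def forward_pass(X, weights, biases, activation=np.tanh):
    # weights[i] has shape (n_inputs, n_neurons); biases[i] has shape (n_neurons,)
    a = X
    for W, b in zip(weights, biases):
        n = a @ W + b        # 1. linear transformation: weighted sum plus bias
        a = activation(n)    # 2. activation function applied element-wise
    return a                 # the output of the last layer is the prediction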
[9]: image = mpimg.imread("Screenshot 2024-02-19 145834.png")
plt.imshow(image)
plt.axis('off')
plt.show()

Loss Function in Neural Networks
In neural networks, the loss function measures the disparity between predicted and actual values during training. The mean squared error (MSE) is a common loss function, expressed as:

$L = \mathbb{E}\big[(y - f(x))^2\big]$

Here, $y$ represents the actual target value, and $f(x)$ is the predicted output for input $x$. The goal is to minimize this function, achieved through techniques like backpropagation and gradient descent, to enhance predictive accuracy.
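For example, the MSE of a small batch of predictions can be computed directly with NumPy (the values below are made up for illustration):

import numpy as np

y_true = np.array([1.0, 0.0, 1.0, 1.0])   # actual target values y
y_pred = np.array([0.9, 0.2, 0.7, 0.4])   # predicted outputs f(x)

mse = np.mean((y_true - y_pred) ** 2)     # average of the squared differences
print(mse)                                # 0.125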
Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is a widely used optimization algorithm in deep learning and machine learning. It is employed to optimize model parameters and is often more computationally efficient when dealing with large datasets.
SGD approximates the gradient using each training example (input, target output) and updates the model parameters. This means that the gradient is computed from a random subset of the data rather than the entire dataset, resulting in a faster training process.
One of the main advantages of SGD is its ability to reduce the risk of getting stuck in local minima. Random sampling allows for a more diverse and unpredictable update of the gradient, which often helps reach the global minimum faster.
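A minimal sketch of this per-example update is shown below; grad_loss is a hypothetical placeholder for a function that returns the gradient of the loss for a single example, and the online_fit method implemented later in this notebook follows the same pattern.

import numpy as np

def sgd_epoch(w, X, y, grad_loss, learning_rate=0.01, seed=0):
    # one pass over the data, visiting the training examples in random order
    rng = np.random.default_rng(seed)
    for idx in rng.permutation(len(X)):
        g = grad_loss(w, X[idx], y[idx])   # gradient estimated from a single example
        w = w - learning_rate * g          # step against the gradient
    return w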
[10]: image = mpimg.imread("Başlıksız.png")
plt.imshow(image)
plt.axis('off')
plt.show()

Backpropagation
Backpropagation is a key algorithm for training artificial neural networks. It allows the network to update its parameters in order to minimize the error between the predicted output and the actual output.
The process of backpropagation involves computing the gradient of the loss function with respect to each parameter in the network. This gradient is then used to update the parameters using an optimization algorithm such as stochastic gradient descent.
Backpropagation works by propagating the error backwards through the network, hence the name. It calculates the contribution of each parameter to the overall error, allowing the network to adjust its weights and biases accordingly.
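As an illustration, here is a minimal sketch of one backpropagation step for a two-layer sigmoid network with MSE loss. It is a simplified example, not the exact sensitivity recursion implemented in the mlp class below.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def backprop_step(X, y, W1, b1, W2, b2, lr=0.1):
    # forward pass, keeping the intermediate activations
    a1 = sigmoid(X @ W1 + b1)                  # hidden layer output
    a2 = sigmoid(a1 @ W2 + b2)                 # network output

    # backward pass: propagate the error from the output layer towards the input
    delta2 = (a2 - y) * a2 * (1 - a2)          # output-layer error (MSE constant folded into lr)
    delta1 = (delta2 @ W2.T) * a1 * (1 - a1)   # chain rule through W2 and the hidden activation

    # gradient descent update of every weight and bias
    W2 -= lr * a1.T @ delta2 / len(X)
    b2 -= lr * delta2.mean(axis=0)
    W1 -= lr * X.T @ delta1 / len(X)
    b1 -= lr * delta1.mean(axis=0)
    return W1, b1, W2, b2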
[13]: image = mpimg.imread("Screenshot 2024-02-19 153123.png")
plt.figure(figsize=(16,16))
plt.imshow(image)
plt.axis('off')
plt.show()

[14]: image = mpimg.imread("Screenshot 2024-02-19 153300.png")
plt.figure(figsize=(16,16))
plt.imshow(image)
plt.axis('off')
plt.show()

[15]: image = mpimg.imread("Screenshot 2024-02-19 153428.png")
plt.figure(figsize=(16,16))
plt.imshow(image)
plt.axis('off')
plt.show()

1 Multilayer Neural Networks
[3]: from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix

[4]: class mlp:
    """Multilayer perceptron trained from scratch, with full-batch (fit) and online (online_fit) training."""

    def __init__(self, output_size=1, layer1=[6, 4], activation='sigmoid',
                 learning_rate=0.1, random_state=42, epoch=100):
        self.layer1 = layer1                # hidden layer sizes, e.g. [6, 4]
        self.activation = activation        # 'sigmoid', 'tanh' or 'relu'
        self.learning_rate = learning_rate
        self.random_state = random_state
        self.output_size = output_size
        self.epoch = epoch

    def sigmoid(self, x):
        # activation function selected by self.activation
        if self.activation == 'sigmoid':
            return 1 / (1 + np.exp(-x))
        elif self.activation == 'tanh':
            return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))
        elif self.activation == 'relu':
            return np.maximum(x, 0)

    def sigmoid_derivative(self, x):
        # derivative used in backpropagation; x is the layer's activation output
        if self.activation == 'sigmoid':
            return x * (1 - x)
        elif self.activation == 'tanh':
            return 1 - np.power(self.sigmoid(x), 2)
        elif self.activation == 'relu':
            return np.where(x > 0, 1, 0)

    def w_initilaized(self, input_size, output_size, layer):
        # random normal weights and zero biases for every layer of the network
        all_weight = []
        all_bias = []
        layers = layer.copy()
        layers.insert(0, input_size)
        layers.append(output_size)
        for i, number in enumerate(layers):
            if i == 0:
                continue
            rgen = np.random.RandomState(self.random_state)
            all_weight.append(rgen.normal(loc=0, scale=1, size=[layers[i-1], layers[i]]))
            all_bias.append(np.zeros([layers[i], 1]))
        return all_weight, all_bias

    def sm_calculator(self, layer1, X, y, output_size, output):
        # backpropagate the layer sensitivities s_m from the output layer towards the input
        s_m = list(range(len(output['a_output'])))
        for i in range(len(output['a_output']) - 1, -1, -1):
            if i == (len(output['a_output']) - 1):
                s_m[i] = (-2 * self.sigmoid_derivative(output['a_output'][i])
                          * (y.reshape(-1, self.output_size) - output['a_output'][i])).T
            else:
                b = np.zeros([output['a_output'][i].shape[1], output['a_output'][i].shape[1]])
                np.fill_diagonal(b, self.sigmoid_derivative(output['a_output'][i-1]))
                s_m[i] = np.dot(np.dot(b, self.W[i+1]), s_m[i+1])
        self.s_m = s_m
        return s_m

    def forward(self, layer1, X, output_size, matrix):
        # propagate X through every layer, storing net inputs and activations
        neurons_output = {'net_input': [], 'a_output': []}
        W = matrix[0]
        b = matrix[1]
        self.W = W
        self.b = b
        for i in range(len(layer1) + 1):
            if i == 0:
                n = np.dot(X, self.W[0]) + self.b[0].T
                neurons_output['net_input'].append(n)
                a_h = self.sigmoid(n)
                neurons_output['a_output'].append(a_h)
            else:
                n = np.dot(a_h, self.W[i]) + self.b[i].T
                neurons_output['net_input'].append(n)
                a_h = self.sigmoid(n)
                neurons_output['a_output'].append(a_h)
        return neurons_output

    def backward(self, layer1, X, y, output_size, output, sm):
        # gradient descent update of every weight matrix and bias vector
        for i in range(len(layer1) + 1):
            if i == 0:
                multip = [np.dot(sm[i][:, p].reshape(-1, 1), X[p, :].reshape(-1, 1).T)
                          for p in range(output['a_output'][0].shape[0])]
                grad = sum(multip) / X.shape[0]
                self.W[i] = self.W[i] - self.learning_rate * grad.T
                self.b[i] = self.b[i] - self.learning_rate * np.mean(sm[i], axis=1).reshape(-1, 1)
            else:
                multip = [np.dot(sm[i][:, p].reshape(-1, 1), output['a_output'][i-1][p, :].reshape(-1, 1).T)
                          for p in range(output['a_output'][0].shape[0])]
                grad = sum(multip) / X.shape[0]
                self.W[i] = self.W[i] - self.learning_rate * grad.T
                self.b[i] = self.b[i] - self.learning_rate * np.mean(sm[i], axis=1).reshape(-1, 1)
        self.grad = grad
        return self.W, self.b

    def fit(self, X, y):
        # full-batch training for self.epoch epochs
        self.matrix = self.w_initilaized(X.shape[1], self.output_size, self.layer1)
        losses = []
        for i in range(self.epoch):
            output1 = self.forward(self.layer1, X, self.output_size, self.matrix)
            sm = self.sm_calculator(self.layer1, X, y, self.output_size, output1)
            self.matrix = self.backward(self.layer1, X, y, self.output_size, output1, sm)
            loss = np.mean((y - output1['a_output'][-1])**2)
            if i % 10 == 0:
                print(f'epoch {i} error{loss}')
            losses.append(loss)
        self.error = losses
        return output1['a_output'][-1]

    def predict(self, X):
        pred = self.forward(self.layer1, X, self.output_size, self.matrix)
        pred = pred['a_output'][-1]
        return np.where(pred > 0.5, 1, 0)

    def predict_prob(self, X):
        pred = self.forward(self.layer1, X, self.output_size, self.matrix)
        pred = pred['a_output'][-1]
        return pred

    def online_fit(self, X_, y_):
        # online (incremental) training: the weights are updated after every single example
        self.matrix = self.w_initilaized(X_.shape[1], self.output_size, self.layer1)
        all_losses = []
        for i in range(self.epoch):
            losses = []
            for X, y in zip(X_, y_):
                X = X.reshape(1, -1)
                y = y.reshape(1, -1)
                output1 = self.forward(self.layer1, X, self.output_size, self.matrix)
                sm = self.sm_calculator(self.layer1, X, y, self.output_size, output1)
                self.matrix = self.backward(self.layer1, X, y, self.output_size, output1, sm)
                loss = (y - output1['a_output'][-1])**2
                losses.append(loss)
            if i % 10 == 0:
                print(f'epoch {i} error', sum(losses) / len(losses))
            all_losses.append(sum(losses) / len(losses))
        self.error = all_losses
        return output1['a_output'][-1]

2 Check on Breast Cancer Dataset


[144]: breast=load_breast_cancer()
X=breast.data
y=breast.target

[145]: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=62, stratify=y)

scl=MinMaxScaler()
scl.fit(X_train)
X_train_s=scl.transform(X_train)
X_test_s=scl.transform(X_test)

[226]: ann = mlp(layer1=[30,100,4], output_size=1, learning_rate=0.05, activation='tanh', epoch=300, random_state=42)

[227]: dd=ann.fit(X_train_s, y_train)

epoch 0 error1.8918892501815365
epoch 10 error0.4509385127922095
epoch 20 error0.42724271534455616
epoch 30 error0.41210622118762924
epoch 40 error0.40033610696739735
epoch 50 error0.38787977685346187
epoch 60 error0.4165918141142754
epoch 70 error0.43673985560917705
epoch 80 error0.43978566128469
epoch 90 error0.4357359954420347
epoch 100 error0.4287875568610628
epoch 110 error0.4210103779751217
epoch 120 error0.41871515760189
epoch 130 error0.42444669828764603
epoch 140 error0.43496231241852684
epoch 150 error0.44360661403823914
epoch 160 error0.44786980186990744
epoch 170 error0.44951215394187194
epoch 180 error0.4495585747392262
epoch 190 error0.4485685657006962
epoch 200 error0.44694778134302027
epoch 210 error0.44490208502343725
epoch 220 error0.4426077215755462
epoch 230 error0.4402444345908341
epoch 240 error0.43792251408958227
epoch 250 error0.435756775968689
epoch 260 error0.4337173594348666
epoch 270 error0.43161945335246527
epoch 280 error0.429515024607559
epoch 290 error0.4275416529663787

[228]: train_pred=ann.predict_prob(X_train_s)
train_pred_last=np.where(train_pred>0.5,1,0)
test_pred=ann.predict_prob(X_test_s)
test_pred_last=np.where(test_pred>0.5,1,0)

[230]: print(accuracy_score(y_train, train_pred_last))


print(accuracy_score(y_test, test_pred_last))

0.9252747252747253
0.956140350877193

[231]: plt.plot(np.arange(1, len(ann.error)+1).reshape(-1,1), ann.error)


plt.xlabel('Epoch')
plt.ylabel('Error')
plt.show()

[233]: from sklearn.metrics import confusion_matrix
#confusion_matrix=confusion_matrix(y_test, test_pred_last)
confusion_matrix = confusion_matrix(y_train, train_pred_last)

class_names = ["Negative", "Positive"]

# Create the heatmap
fig, ax = plt.subplots()
im = ax.imshow(confusion_matrix, interpolation='nearest', cmap=plt.cm.Blues)

# Add class labels
ax.set_xticks(np.arange(len(class_names)))
ax.set_yticks(np.arange(len(class_names)))
ax.set_xticklabels(class_names, rotation=45, ha='right')
ax.set_yticklabels(class_names)

# Add text labels with values
for i in range(len(class_names)):
    for j in range(len(class_names)):
        text = ax.text(j, i, confusion_matrix[i, j],
                       ha='center', va='center',
                       fontsize=8)

# Add colorbar
fig.colorbar(im)

# Show the plot
plt.title("Confusion Matrix")
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.show()

[234]: from sklearn.metrics import confusion_matrix
confusion_matrix = confusion_matrix(y_test, test_pred_last)
class_names = ["Negative", "Positive"]

fig, ax = plt.subplots()
im = ax.imshow(confusion_matrix, interpolation='nearest', cmap=plt.cm.Blues)

ax.set_xticks(np.arange(len(class_names)))
ax.set_yticks(np.arange(len(class_names)))
ax.set_xticklabels(class_names, rotation=45, ha='right')
ax.set_yticklabels(class_names)

for i in range(len(class_names)):
    for j in range(len(class_names)):
        text = ax.text(j, i, confusion_matrix[i, j],
                       ha='center', va='center',
                       fontsize=8)

fig.colorbar(im)

plt.title("Confusion Matrix")
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.show()

3 Online (Incremental) Learning
[235]: ann = mlp(layer1=[30,30], output_size=1, learning_rate=0.05, activation='tanh', epoch=50, random_state=42)

[236]: dd=ann.online_fit(X_train_s, y_train)

epoch 0 error [[0.31244563]]


epoch 10 error [[0.09677472]]
epoch 20 error [[0.08516914]]
epoch 30 error [[0.07555927]]
epoch 40 error [[0.06786119]]

[237]: train_pred=ann.predict_prob(X_train_s)
train_pred_last=np.where(train_pred>0.5,1,0)
test_pred=ann.predict_prob(X_test_s)
test_pred_last=np.where(test_pred>0.5,1,0)

[238]: print(accuracy_score(y_train, train_pred_last))


print(accuracy_score(y_test, test_pred_last))

0.9362637362637363
0.9473684210526315

[239]: plt.plot(np.arange(1, len(ann.error)+1).reshape(-1,1), np.concatenate(ann.error))

plt.xlabel('Epoch')
plt.ylabel('Error')
plt.show()

[240]: from sklearn.metrics import confusion_matrix
confusion_matrix = confusion_matrix(y_test, test_pred_last)
#confusion_matrix=confusion_matrix(y_train, train_pred_last)

class_names = ["Negative", "Positive"]

fig, ax = plt.subplots()
im = ax.imshow(confusion_matrix, interpolation='nearest', cmap=plt.cm.Blues)

ax.set_xticks(np.arange(len(class_names)))
ax.set_yticks(np.arange(len(class_names)))
ax.set_xticklabels(class_names, rotation=45, ha='right')
ax.set_yticklabels(class_names)

for i in range(len(class_names)):
    for j in range(len(class_names)):
        text = ax.text(j, i, confusion_matrix[i, j],
                       ha='center', va='center',
                       fontsize=8)

fig.colorbar(im)

plt.title("Confusion Matrix")
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.show()

4 Multilayer Neural Network On Iris dataset


[284]: iris=load_iris()
X=iris.data
y=iris.target
X=X[0:100]
y=y[0:100]

[285]: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=62, stratify=y)

scl=MinMaxScaler()
scl.fit(X_train)
X_train_s=scl.transform(X_train)
X_test_s=scl.transform(X_test)

[286]: ann = mlp(layer1=[4,10], output_size=1, learning_rate=0.001, activation='tanh', epoch=300, random_state=42)

[287]: dd=ann.fit(X_train_s, y_train)

epoch 0 error1.3237539217647336
epoch 10 error1.2957911346204896
epoch 20 error1.2607761296416244
epoch 30 error1.216191972861665
epoch 40 error1.1591489237642119
epoch 50 error1.0888419750277023
epoch 60 error1.0100398241341388
epoch 70 error0.9316724601375149
epoch 80 error0.8609179523186069
epoch 90 error0.8008322793452802
epoch 100 error0.7516183000189318
epoch 110 error0.7121526346121857
epoch 120 error0.6808782460183588
epoch 130 error0.6562400168169608
epoch 140 error0.6368676172448628
epoch 150 error0.6216276632959881
epoch 160 error0.6096147134318352
epoch 170 error0.6001188782759205
epoch 180 error0.5925886839850463
epoch 190 error0.5865971413508803
epoch 200 error0.5818135676021279
epoch 210 error0.5779812798389294
epoch 220 error0.5749003697305514
epoch 230 error0.5724145606140236
epoch 240 error0.5704012229919797
epoch 250 error0.5687637871766422
epoch 260 error0.567425959867396
epoch 270 error0.5663272954662575
epoch 280 error0.5654197867690168
epoch 290 error0.5646652260291023

[288]: train_pred=ann.predict_prob(X_train_s)
train_pred_last=np.where(train_pred>0.5,1,0)
test_pred=ann.predict_prob(X_test_s)
test_pred_last=np.where(test_pred>0.5,1,0)

[289]: print(accuracy_score(y_train, train_pred_last))
print(accuracy_score(y_test, test_pred_last))

0.9
1.0

[290]: plt.plot(np.arange(1, len(ann.error)+1).reshape(-1,1), ann.error)


plt.xlabel('Epoch')
plt.ylabel('Error')
plt.show()

[291]: from sklearn.metrics import confusion_matrix
confusion_matrix = confusion_matrix(y_test, test_pred_last)
#confusion_matrix=confusion_matrix(y_train, train_pred_last)

class_names = ["Negative", "Positive"]

fig, ax = plt.subplots()
im = ax.imshow(confusion_matrix, interpolation='nearest', cmap=plt.cm.Blues)

ax.set_xticks(np.arange(len(class_names)))
ax.set_yticks(np.arange(len(class_names)))
ax.set_xticklabels(class_names, rotation=45, ha='right')
ax.set_yticklabels(class_names)

for i in range(len(class_names)):
    for j in range(len(class_names)):
        text = ax.text(j, i, confusion_matrix[i, j],
                       ha='center', va='center',
                       fontsize=8)

fig.colorbar(im)

plt.title("Confusion Matrix")
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.show()

5 Online Learning On Iris Dataset
[292]: ann = mlp(layer1=[4,10], output_size=1, learning_rate=0.001, activation='tanh', epoch=300, random_state=42)

[293]: dd=ann.online_fit(X_train_s, y_train)

epoch 0 error [[0.34699495]]


epoch 10 error [[0.10236811]]
epoch 20 error [[0.09699291]]
epoch 30 error [[0.09276354]]
epoch 40 error [[0.08941041]]
epoch 50 error [[0.08667527]]
epoch 60 error [[0.08434989]]
epoch 70 error [[0.08228259]]
epoch 80 error [[0.08036903]]
epoch 90 error [[0.07853962]]
epoch 100 error [[0.07674882]]
epoch 110 error [[0.07496732]]
epoch 120 error [[0.07317675]]
epoch 130 error [[0.07136626]]
epoch 140 error [[0.06953025]]
epoch 150 error [[0.06766693]]
epoch 160 error [[0.06577728]]
epoch 170 error [[0.06386443]]
epoch 180 error [[0.06193308]]
epoch 190 error [[0.05998912]]
epoch 200 error [[0.05803922]]
epoch 210 error [[0.05609052]]
epoch 220 error [[0.05415033]]
epoch 230 error [[0.05222588]]
epoch 240 error [[0.05032403]]
epoch 250 error [[0.04845113]]
epoch 260 error [[0.04661287]]
epoch 270 error [[0.04481414]]
epoch 280 error [[0.04305904]]
epoch 290 error [[0.0413508]]

[294]: train_pred=ann.predict_prob(X_train_s)
train_pred_last=np.where(train_pred>0.5,1,0)
test_pred=ann.predict_prob(X_test_s)
test_pred_last=np.where(test_pred>0.5,1,0)

[295]: print(accuracy_score(y_train, train_pred_last))


print(accuracy_score(y_test, test_pred_last))

0.9625
1.0

[296]: plt.plot(np.arange(1, len(ann.error)+1).reshape(-1,1), np.concatenate(ann.error))

plt.xlabel('Epoch')
plt.ylabel('Error')
plt.show()

5.1 References
Hagan, M. T., Demuth, H. B., & Beale, M. (1997). Neural network design. PWS Publishing Co.
