FDS Lab


Experiment-III

3. Study of Python Libraries for ML Applications such as Pandas and Matplotlib

Machine learning is the science of programming computers so that they can learn from
different types of data. According to Arthur Samuel's definition, machine learning is the
"field of study that gives computers the ability to learn without being explicitly
programmed". Machine learning is used to solve many different kinds of real-life problems.

In the early days, users performed machine learning tasks by manually coding all the
algorithms and applying the mathematical and statistical formulas themselves. This process
was time-consuming, inefficient, and tiresome compared to using Python libraries,
frameworks, and modules.

Today, Python is among the most popular and productive languages for machine learning. It
has displaced many other languages because of its vast collection of libraries, which make
the work easier and simpler.

In this tutorial, we will discuss the best Python libraries used for machine learning:

o NumPy
o SciPy
o Scikit-learn
o Theano
o TensorFlow
o Keras
o PyTorch
o Pandas
o Matplotlib

NumPy

NumPy is one of the most popular libraries in Python. It is used for processing large
multi-dimensional arrays and matrices, with a large collection of high-level mathematical
functions. It is mainly used for fundamental scientific computation in machine learning,
and it is widely used for linear algebra, Fourier transforms, and random-number
generation. High-end libraries such as TensorFlow use NumPy internally for the
manipulation of tensors.

Example:

import numpy as nup

# Create two arrays of rank 2
K = nup.array([[2, 4], [6, 8]])
R = nup.array([[1, 3], [5, 7]])

# Create two arrays of rank 1
P = nup.array([10, 12])
S = nup.array([9, 11])

# Print the inner product of the two vectors
print("Inner product of vectors: ", nup.dot(P, S), "\n")

# Print the matrix-vector product
print("Matrix and Vector product: ", nup.dot(K, P), "\n")

# Print the matrix-matrix product
print("Matrix and matrix product: ", nup.dot(K, R))

Output:

Inner product of vectors:  222

Matrix and Vector product:  [ 68 156]

Matrix and matrix product:  [[22 34]
 [46 74]]
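
The paragraph above also credits NumPy with Fourier-transform and random-number capabilities. Here is a minimal sketch of both, using the standard nup.fft and nup.random modules; the signal values and the seed are arbitrary illustrative choices, not part of the original example:

import numpy as nup

# Fourier transform of a small example signal
signal = nup.array([1.0, 2.0, 1.0, 0.0])
print("FFT of signal:", nup.fft.fft(signal))

# Reproducible random numbers from a seeded generator
rng = nup.random.default_rng(seed=42)
print("Random 2x2 matrix:\n", rng.random((2, 2)))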

SciPy

SciPy is a popular library among machine learning developers, as it contains numerous
modules for performing optimization, linear algebra, integration, and statistics. The SciPy
library is distinct from the SciPy stack: the library is one of the core packages that make
up the stack. The SciPy library can also be used for image manipulation tasks.

Example 1:

from scipy import signal as sg
import numpy as nup

K = nup.arange(45).reshape(9, 5)
domain_1 = nup.identity(3)

# Print the original array, then 'KK' as a separator, then the filtered array
print(K, end='KK')
print(sg.order_filter(K, domain_1, 1))

Output:

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]
 [25 26 27 28 29]
 [30 31 32 33 34]
 [35 36 37 38 39]
 [40 41 42 43 44]]KK[[ 0.  1.  2.  3.  0.]
 [ 5.  6.  7.  8.  3.]
 [10. 11. 12. 13.  8.]
 [15. 16. 17. 18. 13.]
 [20. 21. 22. 23. 18.]
 [25. 26. 27. 28. 23.]
 [30. 31. 32. 33. 28.]
 [35. 36. 37. 38. 33.]
 [ 0. 35. 36. 37. 38.]]

Example 2:

from scipy.signal import chirp as cp
import matplotlib.pyplot as plot
import numpy as nup

t_T = nup.linspace(3, 10, 300)
w_W = cp(t_T, f0=4, f1=2, t1=5, method='linear')
plot.plot(t_T, w_W)
plot.title("Linear Chirp")
plot.xlabel('Time in Seconds')
plot.show()

Output:

The script displays a Matplotlib figure titled "Linear Chirp", plotting the chirp waveform against time in seconds.

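The two examples above exercise scipy.signal; the opening paragraph also mentions optimization. Here is a minimal sketch of that capability using scipy.optimize.minimize_scalar, with an arbitrary quadratic objective chosen purely for illustration:

from scipy import optimize

# A simple convex objective: f(x) = (x - 3)^2 + 1, minimized at x = 3
def objective(x):
    return (x - 3) ** 2 + 1

result = optimize.minimize_scalar(objective)
print("Minimum found at x =", result.x)   # approximately 3.0
print("Objective value:", result.fun)     # approximately 1.0
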
Scikit-learn


Scikit-learn is a Python library used for classical machine learning algorithms. It is
built on top of two basic Python libraries, NumPy and SciPy. Scikit-learn is popular among
machine learning developers because it supports both supervised and unsupervised learning
algorithms; the example below demonstrates a supervised classifier, and a short unsupervised
clustering sketch follows it. This library can also be used for data analysis and data mining.

Example:

from sklearn import datasets as ds
from sklearn import metrics as mt
from sklearn.tree import DecisionTreeClassifier as dtc

# Load the iris dataset
dataset_1 = ds.load_iris()

# Fit a CART model to the data
model_1 = dtc()
model_1.fit(dataset_1.data, dataset_1.target)
print(model_1)

# Make predictions
expected_1 = dataset_1.target
predicted_1 = model_1.predict(dataset_1.data)

# Summarize the fit of the model
print(mt.classification_report(expected_1, predicted_1))
print(mt.confusion_matrix(expected_1, predicted_1))

Output:

DecisionTreeClassifier()
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        50
           1       1.00      1.00      1.00        50
           2       1.00      1.00      1.00        50

    accuracy                           1.00       150
   macro avg       1.00      1.00      1.00       150
weighted avg       1.00      1.00      1.00       150

[[50  0  0]
 [ 0 50  0]
 [ 0  0 50]]
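
The example above covers the supervised side. As a hedged sketch of the unsupervised side, the snippet below clusters the same iris measurements with scikit-learn's KMeans; the choice of three clusters and the fixed random_state are illustrative assumptions, not part of the original example:

from sklearn import datasets as ds
from sklearn.cluster import KMeans

# Load the iris measurements (labels are ignored: clustering is unsupervised)
dataset_1 = ds.load_iris()

# Group the samples into three clusters
kmeans_1 = KMeans(n_clusters=3, random_state=0, n_init=10)
kmeans_1.fit(dataset_1.data)

# Each sample is assigned a cluster index from 0 to 2
print("Cluster labels of first 10 samples:", kmeans_1.labels_[:10])
print("Cluster centers:\n", kmeans_1.cluster_centers_)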

Theano

Theano is a well-known Python library used for defining, evaluating, and optimizing
mathematical expressions, including expressions that involve multi-dimensional arrays.

It achieves efficiency by optimizing the utilization of the CPU and GPU. Since machine
learning is largely mathematics and statistics, Theano makes it easy for users to perform
such mathematical operations.


It is extensively used for unit testing and self-verification to detect and diagnose
different types of errors. Theano is a powerful library that can be used in large-scale,
computationally intensive scientific projects, yet it remains simple and approachable
enough for individuals to use in their own projects.

Example:

import theano as th
import theano.tensor as Tt

k = Tt.dmatrix('k')
r = 1 / (1 + Tt.exp(-k))            # element-wise logistic (sigmoid) function
logistic_1 = th.function([k], r)    # compile the symbolic expression
print(logistic_1([[0, 1], [-1, -2]]))

Output:

array([[0.5       , 0.73105858],
       [0.26894142, 0.11920292]])

TensorFlow

TensorFlow is an open-source Python library used for high-performance numerical
computation. It is a popular library developed by the Google Brain team. TensorFlow is a
framework for defining and running computations that involve tensors. It can be used for
training and running deep neural networks, which in turn can be used to develop several
artificial intelligence applications.

Example:

import tensorflow as tsf

# Initialize two constant tensors
K_1 = tsf.constant([2, 4, 6, 8])
K_2 = tsf.constant([1, 3, 5, 7])

# Multiply element-wise (TensorFlow 2.x executes eagerly,
# so the Session object of TensorFlow 1.x is no longer needed)
result = tsf.multiply(K_1, K_2)

# Print the result as a NumPy array
print(result.numpy())

Output:
[ 2 12 30 56]
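
The paragraph above notes that TensorFlow can train deep neural networks. The core mechanism behind training is automatic differentiation; here is a minimal sketch using tf.GradientTape on a single scalar variable, where the loss function is an arbitrary illustration rather than part of the original example:

import tensorflow as tsf

# A trainable scalar variable
w = tsf.Variable(3.0)

# Record operations so gradients can be computed
with tsf.GradientTape() as tape:
    loss = (w - 1.0) ** 2   # minimized at w = 1

# d(loss)/dw = 2 * (w - 1) = 4.0 at w = 3
grad = tape.gradient(loss, w)
print(grad.numpy())   # 4.0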

Keras

Keras is a high-level neural-network API that can run on top of the TensorFlow, CNTK, and
Theano libraries. It is a very famous Python library among machine learning developers. It
runs without a glitch on both the CPU and the GPU. Keras makes it really easy and simple
for machine learning beginners to design a neural network, and it is also used for fast
prototyping.

Example:

import numpy as nup
from tensorflow import keras as ks
from tensorflow.keras import layers as ls

number_classes = 10
input_shapes = (28, 28, 1)

# Import the data, and split it between train and test sets
(x_1_train, y_1_train), (x_2_test, y_2_test) = ks.datasets.mnist.load_data()

# Scale images to the [0, 1] range
x_1_train = x_1_train.astype("float32") / 255
x_2_test = x_2_test.astype("float32") / 255

# Make sure the images have shape (28, 28, 1)
x_1_train = nup.expand_dims(x_1_train, -1)
x_2_test = nup.expand_dims(x_2_test, -1)
print("x_train shape:", x_1_train.shape)
print(x_1_train.shape[0], "Training samples")
print(x_2_test.shape[0], "Testing samples")

# Convert class vectors to binary class matrices
y_1_train = ks.utils.to_categorical(y_1_train, number_classes)
y_2_test = ks.utils.to_categorical(y_2_test, number_classes)

model_1 = ks.Sequential(
    [
        ks.Input(shape=input_shapes),
        ls.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        ls.MaxPooling2D(pool_size=(2, 2)),
        ls.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        ls.MaxPooling2D(pool_size=(2, 2)),
        ls.Flatten(),
        ls.Dropout(0.5),
        ls.Dense(number_classes, activation="softmax"),
    ]
)

model_1.summary()
Output:

x_train shape: (60000, 28, 28, 1)
60000 Training samples
10000 Testing samples
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 26, 26, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 11, 11, 64) 18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 1600) 0
_________________________________________________________________
dropout (Dropout) (None, 1600) 0
_________________________________________________________________
dense (Dense) (None, 10) 16010
=================================================================
Total params: 34,826
Trainable params: 34,826
Non-trainable params: 0
_________________________________________________________________

PyTorch

PyTorch is also an open-source Python library for machine learning, based on the Torch
library, which is implemented in C. It has numerous tools and libraries for computer
vision, Natural Language Processing (NLP), and many other machine learning programs. It
also allows users to perform computations on tensors with GPU acceleration.

Example:

import torch as tch

d_type = tch.float
device_1 = tch.device("cpu")
# Use device_1 = tch.device("cuda:0") for GPU

# N_1 is batch size; D_in_1 is input dimension;
# H_1 is hidden dimension; D_out_1 is output dimension.
N_1 = 62
D_in_1 = 1000
H_1 = 110
D_out_1 = 11

# Create random input and output data
K = tch.randn(N_1, D_in_1, device=device_1, dtype=d_type)
R = tch.randn(N_1, D_out_1, device=device_1, dtype=d_type)

# Randomly initialize the weights
K_1 = tch.randn(D_in_1, H_1, device=device_1, dtype=d_type)
K_2 = tch.randn(H_1, D_out_1, device=device_1, dtype=d_type)

learning_rate_1 = 1e-6
for Q in range(500):
    # Forward pass: compute predicted y
    h_1 = K.mm(K_1)
    h_relu_1 = h_1.clamp(min=0)
    y_pred_1 = h_relu_1.mm(K_2)

    # Compute and print the loss
    loss = (y_pred_1 - R).pow(2).sum().item()
    print(Q, loss)

    # Backprop to compute gradients of K_1 and K_2 with respect to the loss
    grad_y_pred = 2.0 * (y_pred_1 - R)
    grad_K_2 = h_relu_1.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(K_2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h_1 < 0] = 0
    grad_K_1 = K.t().mm(grad_h)

    # Update the weights using gradient descent
    K_1 -= learning_rate_1 * grad_K_1
    K_2 -= learning_rate_1 * grad_K_2

Output:

0 35089116.0
1 33087792.0
2 42227192.0
3 56113208.0
4 61125684.0
5 45541204.0
6 21011108.0
7 6972017.0
8 2523046.5
9 1342124.5
10 950067.5625
11 753290.25
12 620475.875
13 519006.71875
14 437975.9375
15 372063.125
16 317840.8125
17 272874.46875
18 235348.421875
.
.
.
497 7.426088268402964e-05
498 7.348413055296987e-05
499 7.258950790856034e-05
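
For comparison, here is a minimal sketch of the same two-layer network written with PyTorch's higher-level torch.nn and torch.optim modules instead of hand-written backpropagation; the Sequential architecture and SGD optimizer mirror the manual version above but are our own illustrative choices:

import torch as tch
import torch.nn as nn

# Same dimensions as in the manual version
N_1, D_in_1, H_1, D_out_1 = 62, 1000, 110, 11

K = tch.randn(N_1, D_in_1)
R = tch.randn(N_1, D_out_1)

# Two linear layers with a ReLU in between
model_2 = nn.Sequential(
    nn.Linear(D_in_1, H_1),
    nn.ReLU(),
    nn.Linear(H_1, D_out_1),
)
loss_fn = nn.MSELoss(reduction='sum')
optimizer = tch.optim.SGD(model_2.parameters(), lr=1e-6)

for Q in range(500):
    y_pred_1 = model_2(K)          # forward pass
    loss = loss_fn(y_pred_1, R)    # compute the loss
    optimizer.zero_grad()          # clear old gradients
    loss.backward()                # backpropagate
    optimizer.step()               # gradient-descent update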

Pandas

Pandas is a Python library that is mainly used for data analysis. Users have to prepare a
dataset before using it to train a machine learning model, and Pandas makes this easy for
developers, as it was developed specifically for data extraction and preparation. It
provides high-level data structures and a wide variety of tools for analysing data in
detail (a short analysis sketch follows the example below).

Example:
import pandas as pad

data_1 = {
    "Countries": ["Bhutan", "Cape Verde", "Chad", "Estonia", "Guinea",
                  "Kenya", "Libya", "Mexico"],
    "capital": ["Thimphu", "Praia", "N'Djamena", "Tallinn", "Conakry",
                "Nairobi", "Tripoli", "Mexico City"],
    "Currency": ["Ngultrum", "Cape Verdean escudo", "CFA Franc",
                 "Estonia Kroon; Euro", "Guinean franc", "Kenya shilling",
                 "Libyan dinar", "Mexican peso"],
    "population": [20.4, 143.5, 12.52, 135.7, 52.98, 76.21, 34.28, 54.32],
}

data_1_table = pad.DataFrame(data_1)
print(data_1_table)

Output:

    Countries      capital             Currency  population
0      Bhutan      Thimphu             Ngultrum       20.40
1  Cape Verde        Praia  Cape Verdean escudo      143.50
2        Chad    N'Djamena            CFA Franc       12.52
3     Estonia      Tallinn  Estonia Kroon; Euro      135.70
4      Guinea      Conakry        Guinean franc       52.98
5       Kenya      Nairobi       Kenya shilling       76.21
6       Libya      Tripoli         Libyan dinar       34.28
7      Mexico  Mexico City         Mexican peso       54.32
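
As a brief sketch of the analysis tools mentioned above, the following lines summarize, filter, and sort the DataFrame built in the example; the 100-unit population threshold is an arbitrary illustration:

# Summary statistics for the numeric column
print(data_1_table["population"].describe())

# Select only the rows whose population exceeds 100
print(data_1_table[data_1_table["population"] > 100])

# Sort the table by population in descending order
print(data_1_table.sort_values("population", ascending=False))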

Experiment-VI

6. Implementation of Decision tree using sklearn and its parameter tuning

# Python program to implement the decision tree algorithm and plot the tree

# Importing the required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import metrics
import seaborn as sns
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import tree

# Loading the dataset
iris = load_iris()

# Converting the data to a pandas dataframe
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)

# Creating a separate column for the target variable of the iris dataset
data['Species'] = iris.target

# Replacing the categories of the target variable with the actual species names
target = np.unique(iris.target)
target_n = np.unique(iris.target_names)
target_dict = dict(zip(target, target_n))
data['Species'] = data['Species'].replace(target_dict)

# Separating the independent and dependent variables of the dataset
x = data.drop(columns="Species")
y = data["Species"]
names_features = x.columns
target_labels = y.unique()

# Splitting the dataset into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=93)

# Importing the Decision Tree classifier class from sklearn
from sklearn.tree import DecisionTreeClassifier

# Creating an instance of the classifier class
dtc = DecisionTreeClassifier(max_depth=3, random_state=93)

# Fitting the training dataset to the model
dtc.fit(x_train, y_train)

# Plotting the decision tree
plt.figure(figsize=(30, 10), facecolor='b')
Tree = tree.plot_tree(dtc, feature_names=names_features,
                      class_names=target_labels, rounded=True,
                      filled=True, fontsize=14)
plt.show()

y_pred = dtc.predict(x_test)

# Finding the confusion matrix
confusion_matrix = metrics.confusion_matrix(y_test, y_pred)
matrix = pd.DataFrame(confusion_matrix)

# Create the figure before taking its axes, so the heatmap is drawn on it
sns.set(font_scale=1.3)
plt.figure(figsize=(10, 7))
axis = plt.axes()

# Plotting the heatmap
sns.heatmap(matrix, annot=True, fmt="g", ax=axis, cmap="magma")
axis.set_title('Confusion Matrix')
axis.set_xlabel("Predicted Values", fontsize=10)
axis.set_xticklabels(list(target_labels))
axis.set_ylabel("True Labels", fontsize=10)
axis.set_yticklabels(list(target_labels), rotation=0)
plt.show()

Output:

The program displays two Matplotlib figures: a plot of the fitted decision tree (depth 3) and a heatmap of the confusion matrix on the test set.

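The experiment title calls for parameter tuning. As a hedged sketch of one common approach, the snippet below reuses the training split and classifier class from the program above and searches a small hyperparameter grid with scikit-learn's GridSearchCV; the grid values and the 5-fold cross-validation are illustrative assumptions:

from sklearn.model_selection import GridSearchCV

# Candidate hyperparameter values to try (illustrative choices)
param_grid = {
    "max_depth": [2, 3, 4, 5],
    "min_samples_split": [2, 5, 10],
    "criterion": ["gini", "entropy"],
}

# Exhaustive search with 5-fold cross-validation on the training split
grid = GridSearchCV(DecisionTreeClassifier(random_state=93),
                    param_grid, cv=5)
grid.fit(x_train, y_train)

print("Best parameters:", grid.best_params_)
print("Best cross-validation score:", grid.best_score_)
print("Test accuracy:", grid.score(x_test, y_test))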