ML Lab Manual
INSTITUTE
(Approved by AICTE and Affiliated to JNTUH) (Accredited by NAAC with ‘A’ Grade)
Parvathapur, Uppal, Medipally (M), Medchal (D), Telangana, Hyderabad-500098
Course Objective: The objective of this lab is to give an overview of the various machine
learning techniques and to enable students to demonstrate them using Python.
Course Outcomes: After completing the course, the student will be able to:
List of Experiments
1. The probability that it is Friday and that a student is absent is 3%. Since there are 5
school days in a week, the probability that it is Friday is 20%. What is the
probability that a student is absent given that today is Friday? Apply Bayes' rule
in Python to get the result. (Ans: 15%)
2. Extract the data from a database using Python.
3. Implement k-nearest neighbours classification using Python.
4. Given the following data, which specify classifications for nine combinations of
VAR1 and VAR2, predict a classification for a case where VAR1=0.906 and
VAR2=0.606, using the result of k-means clustering with 3 means (i.e., 3 centroids).
5. The following training examples map descriptions of individuals onto high, medium and
low credit-worthiness. Input attributes are (from left to right) income, recreation, job,
status, age-group, home-owner. Find the unconditional probability of 'golf' and the
conditional probability of 'single' given 'medRisk' in the dataset.
6. Implement linear regression using Python.
7. Implement the Naïve Bayes theorem to classify English text.
8. Implement an algorithm to demonstrate the significance of the genetic algorithm.
9. Implement the finite words classification system using the back-propagation algorithm.
EXPERIMENT : 1
The probability that it is Friday and that a student is absent is 3%. Since there are 5 school
days in a week, the probability that it is Friday is 20%. What is the probability that a
student is absent given that today is Friday? Apply Bayes' rule in Python to get the result.
(Ans: 15%)
Aim : The probability that it is Friday and that a student is absent is 3%. Since there are 5 school
days in a week, the probability that it is Friday is 20%. What is the probability that a student is
absent given that today is Friday? Apply Bayes' rule in Python to get the result.
Explanation:
F : Friday
A : Absent
P(F) = 20% = 0.20
and
P(A ∩ F) = 3% = 0.03
Then,
P(A ∣ F) = P(A ∩ F) / P(F) = 0.03 / 0.20 = 0.15
The probability that a student is absent given that today is Friday is 15%.
PROGRAM :
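A minimal sketch of this calculation, assuming only the two probabilities stated in the aim. The function and variable names are illustrative and are not taken from the original listing:

```python
def conditional_probability(p_joint, p_condition):
    """Return P(A | F) = P(A and F) / P(F)."""
    if p_condition <= 0:
        raise ValueError("P(F) must be positive")
    return p_joint / p_condition


p_friday = 0.20             # P(F): today is Friday
p_absent_and_friday = 0.03  # P(A and F): it is Friday and the student is absent

p_absent_given_friday = conditional_probability(p_absent_and_friday, p_friday)
print("P(A | F) = {:.0%}".format(p_absent_given_friday))
```

The guard against a zero denominator is a design choice for the sketch; Bayes' rule is undefined when the conditioning event has probability zero.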
OUTPUT:
EXPERIMENT : 2
Extract the data from a database using Python
Explanation:
===> First, you need to create a table (students) in a MySQL database (SampleDB).
---> Open a command prompt and execute the following command to enter the MySQL
prompt.
--> mysql -u root -p
Then execute the following commands at the MySQL prompt to create the table in the
database.
--> create database SampleDB;
--> use SampleDB;
--> CREATE TABLE students (sid VARCHAR(10),sname VARCHAR(10),age int);
--> INSERT INTO students VALUES('s521','Jhon Bob',23);
--> INSERT INTO students VALUES('s522','Dilly',22);
--> INSERT INTO students VALUES('s523','Kenney',25);
--> INSERT INTO students VALUES('s524','Herny',26);
===> Next, open a command prompt and execute the following command to install the
mysql.connector package, which connects Python to the MySQL database.
PROGRAM:
import mysql.connector

# Create the connection object
myconn = mysql.connector.connect(host="localhost", user="root", passwd="", database="SampleDB")
# Create the cursor object
cur = myconn.cursor()
# Execute the query
cur.execute("select * from students")
# Fetch all rows from the cursor object
result = cur.fetchall()
print("Student Details are :")
# Print each row of the result
for x in result:
    print(x)
# Commit the transaction
myconn.commit()
# Close the connection
myconn.close()
OUTPUT:
C:\xampp\mysql\bin>mysql -u root -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.4.18-MariaDB mariadb.org binary distribution
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> create database SampleDB;
Query OK, 1 row affected (0.046 sec)
MariaDB [(none)]> use SampleDB;
Database changed
MariaDB [SampleDB]> CREATE TABLE students (sid VARCHAR(10), sname VARCHAR(10),age int);
Query OK, 0 rows affected (0.285 sec)
MariaDB [SampleDB]> INSERT INTO students VALUES ('s521', 'Jhon Bob', 23);
Query OK, 1 row affected (0.066 sec)
MariaDB [SampleDB]> INSERT INTO students VALUES ('s522', 'Dilly',22);
Query OK, 1 row affected (0.061 sec)
MariaDB [SampleDB]> INSERT INTO students VALUES ('s523', 'Kenney',25);
Query OK, 1 row affected (0.029 sec)
MariaDB [SampleDB]> INSERT INTO students VALUES ('s524', 'Herny',26);
Query OK, 1 row affected (0.040 sec)
MariaDB [SampleDB]>
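When no MySQL server is available, the same connect, cursor, execute and fetch sequence can be tried with Python's built-in sqlite3 module. The sketch below is a stand-in and not the method used in this experiment: it assumes an in-memory SQLite database in place of SampleDB and recreates the students table with the four rows inserted above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL SampleDB.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE students (sid VARCHAR(10), sname VARCHAR(10), age INT)")
rows_to_insert = [
    ("s521", "Jhon Bob", 23),
    ("s522", "Dilly", 22),
    ("s523", "Kenney", 25),
    ("s524", "Herny", 26),
]
# Parameterised insert; sqlite3 uses "?" as its placeholder.
cur.executemany("INSERT INTO students VALUES (?, ?, ?)", rows_to_insert)
conn.commit()

# Same query and fetch pattern as the MySQL program.
cur.execute("SELECT * FROM students")
result = cur.fetchall()
print("Student Details are :")
for row in result:
    print(row)

conn.close()
```

Both drivers follow the Python DB-API, which is why only the connect call and the placeholder style differ.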
EXPERIMENT : 3
Implement k-nearest neighbours classification using Python
Explanation:
===> To run this program you need to install the sklearn module.
===> Open a command prompt and execute the following command to install the sklearn
module.
---> pip install scikit-learn
In this program, we use the iris dataset, which is split into a training set (70%) and a test
set (30%).
PROGRAM:
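import random
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Loading data
data_iris = load_iris()
# To get list of target names
label_target = data_iris.target_names
print()
print("Sample Data from Iris Dataset")
print("*"*30)
# to display the sample data from the iris dataset
for i in range(10):
    rn = random.randint(0,120)
    print(data_iris.data[rn],"===>",label_target[data_iris.target[rn]])
# to split the data into training (70%) and test (30%) sets
# (the split and the prompts below are reconstructed from the Explanation and the OUTPUT)
X_train, X_test, y_train, y_test = train_test_split(data_iris.data, data_iris.target, test_size=0.3)
print("The Training dataset length:", len(X_train))
print("The Testing dataset length:", len(X_test))
try:
    nn = int(input("Enter number of neighbors :"))
    knn = KNeighborsClassifier(n_neighbors=nn)
    # to fit the training data into the model
    knn.fit(X_train, y_train)
    # to display the score
    print("The Score is :",knn.score(X_test, y_test))
    # To get test data from the user
    test_data = input("Enter Test Data :").split(",")
    for i in range(len(test_data)):
        test_data[i] = float(test_data[i])
    print()
    v = knn.predict([test_data])
    print("Predicted output is :",label_target[v])
except ValueError:
    print("Please supply valid input......")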
OUTPUT:
D:\Machine Learning\Lab>python Week3.py

Sample Data from Iris Dataset
******************************
[6.8 2.8 4.8 1.4] ===> versicolor
[5.8 2.7 3.9 1.2] ===> versicolor
[6.6 2.9 4.6 1.3] ===> versicolor
[5.1 3.8 1.6 0.2] ===> setosa
[4.9 2.5 4.5 1.7] ===> virginica
[7.3 2.9 6.3 1.8] ===> virginica
[4.9 3.1 1.5 0.1] ===> setosa
[4.5 2.3 1.3 0.3] ===> setosa
[7.6 3.  6.6 2.1] ===> virginica
[6.6 3.  4.4 1.4] ===> versicolor
The Training dataset length: 105
The Testing dataset length: 45
Enter number of neighbors :10
The Score is : 0.9777777777777777
Enter Test Data :6.2,2.6,3.4,0.6

Predicted output is : ['versicolor']
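To see what the classifier does internally, the sketch below implements the k-nearest-neighbours vote in plain Python. It is an illustration only: the function name `knn_predict`, the labels and the six training points are made up for this example and are not part of the iris dataset or of scikit-learn.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set (not real iris measurements).
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(train_points, train_labels, (1.1, 1.0), k=3))
print(knn_predict(train_points, train_labels, (2.9, 3.1), k=3))
```

An odd k is a common choice for two-class problems because it avoids tied votes.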
EXPERIMENT : 4
Given the following data, which specify classifications for nine combinations of VAR1 and
VAR2, predict a classification for a case where VAR1=0.906 and VAR2=0.606, using the
result of k-means clustering with 3 means (i.e., 3 centroids)
Aim: Given the following data, which specify classifications for nine combinations of VAR1 and
VAR2, predict a classification for a case where VAR1=0.906 and VAR2=0.606, using the result
of k-means clustering with 3 means (i.e., 3 centroids)
Explanation:
===> To run this program you need to install the sklearn module.
===> Open a command prompt and execute the following command to install the sklearn
module.
---> pip install scikit-learn
Finally, you need to predict the class for VAR1=0.906 and VAR2=0.606.
PROGRAM:
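A minimal sketch of the approach, written in plain Python so that each k-means step is visible; scikit-learn's `sklearn.cluster.KMeans` could be used instead. Everything here is an assumption made for illustration: the helper names, seeding the centroids with the first three points, and above all the nine data rows, which are PLACEHOLDERS and must be replaced with the VAR1, VAR2 and class values from the problem's table.

```python
import math
from collections import Counter


def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm, seeded with the first k points so runs are repeatable."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(max_iterations):
        # Assignment step: group every point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def predict_class(points, labels, query, k=3):
    """Majority class among the training points that share the query's cluster."""
    centroids = kmeans(points, k)
    target = nearest_centroid(query, centroids)
    votes = Counter(
        label for p, label in zip(points, labels)
        if nearest_centroid(p, centroids) == target
    )
    return votes.most_common(1)[0][0]


# PLACEHOLDER rows (VAR1, VAR2) and classes: not the data from the problem statement.
points = [(0.1, 0.1), (1.0, 1.0), (2.0, 0.1), (0.2, 0.1), (1.1, 0.9),
          (2.1, 0.2), (0.1, 0.2), (0.9, 1.1), (1.9, 0.0)]
labels = [0, 1, 0, 0, 1, 1, 0, 0, 0]
query = (0.906, 0.606)

predicted = predict_class(points, labels, query, k=3)
print("Predicted class for VAR1=0.906, VAR2=0.606:", predicted)
```

k-means itself is unsupervised, so the class is assigned afterwards by a majority vote inside the query's cluster; other tie-breaking or labelling rules are possible.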
OUTPUT:
EXPERIMENT : 5
The following training examples map descriptions of individuals onto high, medium and
low credit-worthiness. Input attributes are (from left to right) income, recreation, job,
status, age-group, home-owner. Find the unconditional probability of 'golf' and the
conditional probability of 'single' given 'medRisk' in the dataset
Aim: The following training examples map descriptions of individuals onto high, medium and
low credit-worthiness.
Input attributes are (from left to right) income, recreation, job, status, age-group, home-owner.
Find the unconditional probability of 'golf' and the conditional probability of 'single' given
'medRisk' in the dataset
Explanation:
---> S : single
---> MR : medRisk
P(golf) = The number of golf records / total number of records
= 4 / 10 = 0.4
P(S ∩ MR) = The number of medRisk records with single status / total number of records
= 2 / 10 = 0.2
and
P(MR) = The number of medRisk records / total number of records
= 3 / 10 = 0.3
Then,
P(S ∣ MR) = P(S ∩ MR) / P(MR) = 0.2 / 0.3
= 0.66666
PROGRAM:
total_Records = 10
# unconditional probability of 'golf'
numGolfRecords = 4
unConditionalprobGolf = numGolfRecords / total_Records
print("Unconditional probability of golf: = {}".format(unConditionalprobGolf))
# conditional probability of 'single' given 'medRisk'
numMedRiskSingle = 2
numMedRisk = 3
probMedRiskSingle = numMedRiskSingle / total_Records
probMedRisk = numMedRisk / total_Records
conditionalProb = probMedRiskSingle / probMedRisk
print("Conditional probability of single given medRisk: = {}".format(conditionalProb))
OUTPUT:
EXPERIMENT : 7
Implement Naïve Bayes theorem to classify the English text
Explanation:
===> To run this program you need to install the pandas module.
---> The pandas module is used to read csv files.
===> To install it, open a command prompt and execute the following command.
---> pip install pandas
Then you need to install the sklearn module.
===> Open a command prompt and execute the following command to install the sklearn module.
---> pip install scikit-learn
PROGRAM:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
# Assumption: Statements_data.csv has two columns and no header row,
# the statement text and a pos/neg label. Adjust the names and the mapping to your file.
msg = pd.read_csv('Statements_data.csv', names=['Message', 'Label'])
print("The Total instances in the Dataset: ", msg.shape[0])
msg['labelnum'] = msg.Label.map({'pos': 1, 'neg': 0})
# to split the data into training and test sets (default split: 25% test)
Xtrain, Xtest, Ytrain, Ytest = train_test_split(msg.Message, msg.labelnum)
# to convert the statements into word-count vectors
count_vect = CountVectorizer()
Xtrain_dims = count_vect.fit_transform(Xtrain)
Xtest_dims = count_vect.transform(Xtest)
df = pd.DataFrame(Xtrain_dims.toarray(),columns=count_vect.get_feature_names_out())
clf = MultinomialNB()
# to fit the train data into model
clf.fit(Xtrain_dims, Ytrain)
# to predict the test data
prediction = clf.predict(Xtest_dims)
print('******** Accuracy Metrics *********')
print('Accuracy : ', accuracy_score(Ytest, prediction))
print('Recall : ', recall_score(Ytest, prediction))
print('Precision : ',precision_score(Ytest, prediction))
print('Confusion Matrix : \n', confusion_matrix(Ytest, prediction))
print(10*"-")
# to predict the input statement
test_stmt = [input("Enter any statement to predict :")]
test_dims = count_vect.transform(test_stmt)
pred = clf.predict(test_dims)
for stmt,lbl in zip(test_stmt,pred):
    if lbl == 1:
        print("Statement is Positive")
    else:
        print("Statement is Negative")
Statements_data.csv
OUTPUT:
D:\Machine Learning\Lab>python Week7.py
The Total instances in the Dataset:
******** Accuracy Metrics *********
Accuracy : 0.6
Recall : 1.0
Precision : 0.6
Confusion Matrix :
[[0 2]
 [0 3]]
Enter any statement to predict :I hate juice
Statement is Negative
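The reported metrics can be cross-checked by hand from the confusion matrix. The sketch below assumes scikit-learn's layout, in which rows are true labels and columns are predicted labels, so a two-class matrix reads [[TN, FP], [FN, TP]]. The variable names are illustrative.

```python
# Confusion matrix from the run above: rows = true label, columns = predicted label.
confusion = [[0, 2],
             [0, 3]]
tn, fp = confusion[0]
fn, tp = confusion[1]

accuracy = (tp + tn) / (tp + tn + fp + fn)  # correct predictions / all predictions
recall = tp / (tp + fn)                     # share of actual positives that were found
precision = tp / (tp + fp)                  # share of predicted positives that were right

print("Accuracy :", accuracy)
print("Recall   :", recall)
print("Precision:", precision)
```

Here every test statement was predicted positive, which is why recall is perfect while accuracy and precision both equal the share of true positives in the test set.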
EXPERIMENT : 8
Implement an algorithm to demonstrate the significance of genetic algorithm
DESCRIPTION:
A fitness score is given to each individual, which shows the ability of an individual to “compete”.
Individuals with an optimal (or near-optimal) fitness score are sought. The GA maintains a population of n
individuals (chromosomes/solutions) along with their fitness scores. Individuals with better fitness scores are
given more chances to reproduce than others: they are selected to mate and produce better offspring by
combining the chromosomes of the parents. The population size is static, so room has to be created for new
arrivals. Some individuals therefore die and are replaced by new arrivals, eventually creating a new
generation once all the mating opportunities of the old population are exhausted. It is hoped that over successive
generations better solutions will arrive while the least fit die. Each new generation has on average more “better genes”
than the individuals (solutions) of previous generations, so each new generation has better “partial solutions” than
previous generations. Once the offspring produced show no significant difference from the offspring produced by
previous populations, the population has converged. The algorithm is then said to have converged to a set of
solutions for the problem.
OPERATORS OF GENETIC ALGORITHM:
Once the initial generation is created, the algorithm evolves the generation using the following operators:
1) Selection Operator: The idea is to give preference to the individuals with good fitness
scores and allow them to pass their genes to successive generations.
2) Crossover Operator: This represents mating between individuals. Two individuals are
selected using the selection operator and crossover sites are chosen randomly. Then the genes at
these crossover sites are exchanged, thus creating a completely new individual (offspring).
3) Mutation Operator: The key idea is to insert random genes in the offspring to maintain the
diversity in the population and avoid premature convergence.
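The crossover and mutation operators can be illustrated on short lists of numbers. This is a sketch under assumptions: the helper names `single_point_crossover` and `mutate`, the parent values and the fixed random seed are chosen for illustration only.

```python
import random


def single_point_crossover(parent1, parent2, point):
    """Offspring takes the genes before `point` from parent1 and the rest from parent2."""
    return parent1[:point] + parent2[point:]


def mutate(chromosome, gene_index, rng):
    """Return a copy with a random value in [-1, 1] added to one gene."""
    mutated = list(chromosome)
    mutated[gene_index] += rng.uniform(-1.0, 1.0)
    return mutated


parent1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
parent2 = [-1.0, -2.0, -3.0, -4.0, -5.0, -6.0]

# First half of the genes from parent1, second half from parent2.
offspring = single_point_crossover(parent1, parent2, point=3)
print("Offspring:", offspring)

rng = random.Random(0)  # fixed seed so repeated runs give the same mutation
mutant = mutate(offspring, gene_index=2, rng=rng)
print("Mutant   :", mutant)
```

Only one gene changes under mutation, which keeps most of the inherited solution intact while still introducing variation.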
ALGORITHM:
1) Randomly initialize populations p
2) Determine fitness of population
3) Until convergence repeat:
a) Select parents from population
b) Crossover and generate new population
c) Perform mutation on new population
d) Calculate fitness for new population
USES OF GENETIC ALGORITHM:
They are robust.
They provide optimisation over a large state space.
Unlike traditional AI, they do not break on a slight change in input or the presence of noise.
PROGRAM:
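import numpy

def cal_pop_fitness(equation_inputs, pop):
    # Calculating the fitness value of each solution in the current population.
    # The fitness is the sum of products between each input and its corresponding weight.
    fitness = numpy.sum(pop*equation_inputs, axis=1)
    return fitness

def select_mating_pool(pop, fitness, num_parents):
    # Selecting the best individuals in the current generation as parents of the next generation.
    parents = numpy.empty((num_parents, pop.shape[1]))
    for parent_num in range(num_parents):
        max_fitness_idx = numpy.where(fitness == numpy.max(fitness))
        max_fitness_idx = max_fitness_idx[0][0]
        parents[parent_num, :] = pop[max_fitness_idx, :]
        # Mark the chosen individual so that it is not selected again.
        fitness[max_fitness_idx] = -99999999999
    return parents

def crossover(parents, offspring_size):
    offspring = numpy.empty(offspring_size)
    # The point at which crossover takes place between two parents. Usually, it is at the center.
    crossover_point = offspring_size[1] // 2
    for k in range(offspring_size[0]):
        # Indices of the first and second parent to mate.
        parent1_idx = k % parents.shape[0]
        parent2_idx = (k+1) % parents.shape[0]
        # The first half of the genes comes from the first parent, the second half from the second.
        offspring[k, 0:crossover_point] = parents[parent1_idx, 0:crossover_point]
        offspring[k, crossover_point:] = parents[parent2_idx, crossover_point:]
    return offspring

def mutation(offspring_crossover, num_mutations=1):
    mutations_counter = offspring_crossover.shape[1] // num_mutations
    # Mutation changes num_mutations genes in each offspring. The changes are random.
    for idx in range(offspring_crossover.shape[0]):
        gene_idx = mutations_counter - 1
        for mutation_num in range(num_mutations):
            # The random value to be added to the gene (a scalar).
            random_value = numpy.random.uniform(-1.0, 1.0)
            offspring_crossover[idx, gene_idx] = offspring_crossover[idx, gene_idx] + random_value
            gene_idx = gene_idx + mutations_counter
    return offspring_crossover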
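import numpy
"""
The target is to maximize this equation ASAP:
y = w1x1+w2x2+w3x3+w4x4+w5x5+w6x6
where (x1,x2,x3,x4,x5,x6)=(4,-2,3.5,5,-11,-4.7)
What are the best values for the 6 weights w1 to w6?
"""
# Inputs of the equation.
equation_inputs = [4, -2, 3.5, 5, -11, -4.7]
# Number of the weights we are looking to optimize.
num_weights = len(equation_inputs)
"""
Genetic algorithm parameters:
Mating pool size
Population size
"""
sol_per_pop = 8
num_parents_mating = 4
# Defining the population size.
pop_size = (sol_per_pop, num_weights)
# Creating the initial population.
new_population = numpy.random.uniform(low=-4.0, high=4.0, size=pop_size)
print(new_population)
"""
new_population[0,:]=[2.4,0.7,8,-2,5,1.1]
new_population[1,:]=[-0.4,2.7,5,-1,7,0.1]
new_population[2,:]=[-1,2,2,-3,2,0.9]
new_population[3,:]=[4,7,12,6.1,1.4,-4]
new_population[4,:]=[3.1,4,0,2.4,4.8,0]
new_population[5,:]=[-2,3,-7,6,3,3]
"""
best_outputs = []
num_generations = 1000
for generation in range(num_generations):
    print("Generation:", generation)
    # Measuring the fitness of each chromosome in the population.
    fitness = cal_pop_fitness(equation_inputs, new_population)
    print("Fitness")
    print(fitness)
    best_outputs.append(numpy.max(numpy.sum(new_population*equation_inputs, axis=1)))
    # The best result in the current iteration.
    print("Best result:", numpy.max(numpy.sum(new_population*equation_inputs, axis=1)))
    # Selecting the best parents in the population for mating.
    parents = select_mating_pool(new_population, fitness, num_parents_mating)
    print("Parents")
    print(parents)
    # Generating next generation using crossover.
    offspring_crossover = crossover(parents, offspring_size=(pop_size[0]-parents.shape[0], num_weights))
    print("Crossover")
    print(offspring_crossover)
    # Adding some variations to the offspring using mutation.
    # num_mutations=2 is inferred from the OUTPUT, where two genes change per offspring.
    offspring_mutation = mutation(offspring_crossover, num_mutations=2)
    print("Mutation")
    print(offspring_mutation)
    # Creating the new population based on the parents and offspring.
    new_population[0:parents.shape[0], :] = parents
    new_population[parents.shape[0]:, :] = offspring_mutation
# Getting the best solution after all generations have finished.
# First, the fitness is calculated for each solution in the final generation.
fitness = cal_pop_fitness(equation_inputs, new_population)
# Then find the index of the solution with the best fitness.
best_match_idx = numpy.where(fitness == numpy.max(fitness))
print("Best solution:", new_population[best_match_idx, :])
print("Best solution fitness:", fitness[best_match_idx])
# Plotting the best fitness of each generation.
import matplotlib.pyplot
matplotlib.pyplot.plot(best_outputs)
matplotlib.pyplot.xlabel("Iteration")
matplotlib.pyplot.ylabel("Fitness")
matplotlib.pyplot.show()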
OUTPUT:
[[ 0.58204141  2.32880696 -2.95130209  2.57056953  3.33055238 -0.58167871]
 [-1.65052225  3.52263842 -2.46577305 -1.7005396  -3.80480202  0.29677167]
 [ 2.6239874  -2.01548549 -1.72292295  3.61090243 -1.25604726 -2.32647264]
 [-3.45167393  2.85771825  3.74655682 -2.01790626  0.25750106 -3.12923247]
 [ 2.86026334 -0.4306777  -3.26297956  1.74863348 -1.93705571 -3.18855672]
 [-1.70012089  0.98685104 -1.91192072  3.91873942 -0.09354385  1.43038667]
 [ 0.31769009 -0.87290809  3.75249785  2.57657993  0.58883082  2.83231871]
 [ 3.83314926  0.33838112 -2.49509594 -1.50763174  3.99440509 -0.03037715]]
Generation: 0
Fitness
[-33.70834413   9.67772594  51.30214363  -4.62383365  45.91897711
  -1.56604606   9.24418172 -45.41084308]
Best result: 51.302143629097614
Parents
[[ 2.6239874  -2.01548549 -1.72292295  3.61090243 -1.25604726 -2.32647264]
 [ 2.86026334 -0.4306777  -3.26297956  1.74863348 -1.93705571 -3.18855672]
 [-1.65052225  3.52263842 -2.46577305 -1.7005396  -3.80480202  0.29677167]
 [ 0.31769009 -0.87290809  3.75249785  2.57657993  0.58883082  2.83231871]]
Crossover
[[ 2.6239874  -2.01548549 -1.72292295  1.74863348 -1.93705571 -3.18855672]
 [ 2.86026334 -0.4306777  -3.26297956 -1.7005396  -3.80480202  0.29677167]
 [-1.65052225  3.52263842 -2.46577305  2.57657993  0.58883082  2.83231871]
 [ 0.31769009 -0.87290809  3.75249785  3.61090243 -1.25604726 -2.32647264]]
Mutation
[[ 2.6239874  -2.01548549 -1.67896632  1.74863348 -1.93705571 -3.97789372]
 [ 2.86026334 -0.4306777  -3.12878279 -1.7005396  -3.80480202 -0.15430324]
 [-1.65052225  3.52263842 -3.37669601  2.57657993  0.58883082  2.25153466]
 [ 0.31769009 -0.87290809  2.93428907  3.61090243 -1.25604726 -2.71597954]]
.
.
Generation: 999
Fitness
[2554.3935562  2551.72360738 2549.40583954 2549.29931629 2552.24225166
 2550.45506206 2547.1299512  2551.22467397]
Best result: 2554.3935561987346
Parents
[[ 3.17690088e-01 -8.72908094e-01  2.67689952e+02  1.74863348e+00 -1.93705571e+00 -3.37108802e+02]
 [ 3.17690088e-01 -8.72908094e-01  2.67638232e+02  1.74863348e+00 -1.93705571e+00 -3.36689592e+02]
 [ 3.17690088e-01 -8.72908094e-01  2.67254110e+02  1.74863348e+00 -1.93705571e+00 -3.36865291e+02]
 [ 3.17690088e-01 -8.72908094e-01  2.67370854e+02  1.74863348e+00 -1.93705571e+00 -3.36672197e+02]]
Crossover
[[ 3.17690088e-01 -8.72908094e-01  2.67689952e+02  1.74863348e+00 -1.93705571e+00 -3.36689592e+02]
 [ 3.17690088e-01 -8.72908094e-01  2.67638232e+02  1.74863348e+00 -1.93705571e+00 -3.36865291e+02]
 [ 3.17690088e-01 -8.72908094e-01  2.67254110e+02  1.74863348e+00 -1.93705571e+00 -3.36672197e+02]
 [ 3.17690088e-01 -8.72908094e-01  2.67370854e+02  1.74863348e+00 -1.93705571e+00 -3.37108802e+02]]
Mutation
[[ 3.17690088e-01 -8.72908094e-01  2.68382875e+02  1.74863348e+00 -1.93705571e+00 -3.36222272e+02]
 [ 3.17690088e-01 -8.72908094e-01  2.68456819e+02  1.74863348e+00 -1.93705571e+00 -3.37417363e+02]
 [ 3.17690088e-01 -8.72908094e-01  2.67606746e+02  1.74863348e+00 -1.93705571e+00 -3.36866918e+02]
 [ 3.17690088e-01 -8.72908094e-01  2.67051753e+02  1.74863348e+00 -1.93705571e+00 -3.37331663e+02]]
Best solution: [[[ 3.17690088e-01 -8.72908094e-01  2.68456819e+02  1.74863348e+00 -1.93705571e+00 -3.37417363e+02]]]
Best solution fitness: [2558.52782726]
EXPERIMENT NO: 9
Implement the finite words classification system using Back-propagation algorithm
Explanation:
===> To run this program you need to install the pandas module.
===> To install it, open a command prompt and execute the following command.
---> pip install pandas
===> Open a command prompt and execute the following command to install the sklearn module.
---> pip install scikit-learn
===> Open a command prompt and execute the following command to install the sklearn-neuralnetwork
module.
PROGRAM:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
Statements_data.csv
OUTPUT:
D:\Machine Learning\Lab>python Week9.py
The Total instances in the Dataset: 18
******** Accuracy Metrics *********
Accuracy : 0.4
Recall : 0.6666666666666666
Precision : 0.5
Confusion Matrix :
[[0 2]
 [1 2]]
Enter any statement to predict :I love biryani
Statement is Positive
D:\Machine Learning\Lab>python Week9.py
The Total instances in the Dataset:
******** Accuracy Metrics *********
Accuracy : 0.8
Recall : 0.6666666666666666
Precision : 1.0
Confusion Matrix :
[[2 0]
 [1 2]]
Enter any statement to predict :i do not like summer
Statement is Negative