
Ex no: 8

Data and Text Clustering Using K-Means Clustering

Date:

Aim:

To write a Python program for data and text clustering using K-Means clustering.

Algorithm:

Step 1: Import all the necessary libraries.

Step 2: Load the dataset.

Step 3: Select the number K to decide the number of clusters (a common heuristic for choosing K is sketched after these steps).

Step 4: Select K random points as the initial centroids.

Step 5: Assign each data point to its closest centroid, forming the K clusters.

Step 6: Recompute the centroid of each cluster as the mean of its assigned points.

Step 7: Repeat Step 5, i.e., reassign each data point to the new closest centroid.

Step 8: If any reassignment occurred, go to Step 6; otherwise FINISH.
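
Step 3 assumes K is known in advance. A common heuristic for choosing it is the elbow method: fit K-Means for a range of K values and plot the within-cluster sum of squares (exposed as inertia_ in scikit-learn), looking for the bend in the curve. A minimal sketch, assuming the feature matrix x prepared in the implementation below:

from sklearn.cluster import KMeans
import matplotlib.pyplot as mtp

wcss = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, init='k-means++', random_state=42)
    km.fit(x)                  # x: feature matrix from the implementation section
    wcss.append(km.inertia_)   # within-cluster sum of squared distances

mtp.plot(range(1, 11), wcss, marker='o')
mtp.xlabel('Number of clusters K')
mtp.ylabel('WCSS')
mtp.show()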

Implementation:

import numpy as np

import matplotlib.pyplot as mtp

import pandas as pd

!pip install scikit-learn

from sklearn.cluster import KMeans

dataset = pd.read_csv('/content/50_Startups (1).csv')

dataset.head()



dataset.tail()

x = dataset.iloc[:, [1, 2]].values   # use two numeric columns as the clustering features

kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)

y_predict = kmeans.fit_predict(x)   # fit the model and return a cluster label per row

mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1], s=100, c='blue', label='Cluster 1')  # first cluster

mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1], s=100, c='green', label='Cluster 2')  # second cluster

mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1], s=100, c='red', label='Cluster 3')  # third cluster

mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1], s=100, c='cyan', label='Cluster 4')  # fourth cluster

mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1], s=100, c='magenta', label='Cluster 5')  # fifth cluster

mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='yellow', label='Centroid')

mtp.title('K-Means Clusters')

mtp.xlabel('Marketing spend')

mtp.ylabel('Administration')

mtp.legend()

mtp.show()
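
The aim also covers text clustering, which the code above does not reach: only numeric columns are clustered. A minimal sketch of text clustering with the same KMeans estimator, assuming a small hypothetical list docs of documents, first vectorizes the text with TF-IDF:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = ["profit rose on strong sales", "sales growth beat forecasts",
        "new advertising campaign launched", "marketing spend increased"]   # hypothetical sample
tfidf = TfidfVectorizer(stop_words='english')
X_text = tfidf.fit_transform(docs)       # sparse document-term matrix
labels = KMeans(n_clusters=2, random_state=42).fit_predict(X_text)
print(labels)                            # one cluster id per document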



Result:
Thus, the Python program for data and text clustering using K-Means clustering has been verified and executed successfully.



Ex no: 9

Data and Text Clustering Using Gaussian Mixture Models

Date:

Aim:

To write a Python program for data and text clustering using Gaussian Mixture Models.

Algorithm:

Step 1: Load the dataset.

Step 2: Split the dataset into a training set and a test set.

Step 3: Fit a simple Gaussian mixture model.

Step 4: Initialize the means, the covariance matrices and the mixing coefficients with random values.

Step 5: Compute the responsibilities c_k for all k (the E-step; see the sketch after these steps).

Step 6: Re-estimate all the parameters using the current c_k values (the M-step).

Step 7: Compute the log-likelihood and test a convergence criterion.
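
Step 5's c_k values are the E-step responsibilities: the posterior probability that a point was generated by component k under the current parameters. A minimal sketch of this computation, with illustrative names (pis, mus, sigmas stand for the current mixing coefficients, means and covariances):

import numpy as np
from scipy.stats import multivariate_normal

def responsibilities(X, pis, mus, sigmas):
    # weighted density of every component at every point, shape (N, K)
    dens = np.array([pi * multivariate_normal.pdf(X, mean=mu, cov=sig)
                     for pi, mu, sig in zip(pis, mus, sigmas)]).T
    return dens / dens.sum(axis=1, keepdims=True)   # normalize across components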

Implementation:


import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

from pandas import DataFrame

from sklearn.preprocessing import StandardScaler, normalize

from sklearn.decomposition import PCA

from sklearn.mixture import GaussianMixture

from sklearn.metrics import silhouette_score

from sklearn.model_selection import train_test_split

from sklearn import metrics

raw_df = pd.read_csv('/content/supermarket_sales - Sheet1.csv')

raw_df = raw_df.drop('Invoice ID', axis = 1)   # drop the identifier column

raw_df.fillna(method='ffill', inplace=True)    # forward-fill missing values



raw_df.head(2)

OUTPUT:

scaler = StandardScaler()

numerical_features = raw_df.select_dtypes(include=['number']).columns

numerical_df = raw_df[numerical_features]

scaled_df = scaler.fit_transform(numerical_df)   # standardize the numeric columns

raw_df[numerical_features] = scaled_df

pca = PCA(n_components = 2)

X_principal = pca.fit_transform(scaled_df)   # reduce the standardized features to 2 components

X_principal = pd.DataFrame(X_principal)

X_principal.columns = ['P1', 'P2']

X_principal.head(2)

print("717821F219 J Jebershon Vetha Singh ")

gmm=GaussianMixture(n_components = 3)

gmm.fit(X_principal)



plt.scatter(X_principal['P1'], X_principal['P2'], c=gmm.predict(X_principal), cmap=plt.cm.winter, alpha=0.6)   # color points by the fitted GMM's cluster assignment

plt.show()

def SelBest(arr: np.ndarray, X: int) -> np.ndarray:
    # return the X smallest values of arr
    dx = np.argsort(arr)[:X]
    return arr[dx]

n_clusters = np.arange(2, 8)

sils = []

sils_err = []

iterations = 20

for n in n_clusters:
    tmp_sil = []
    for _ in range(iterations):
        gmm = GaussianMixture(n_components=n, n_init=2).fit(X_principal)
        labels = gmm.predict(X_principal)
        sil = metrics.silhouette_score(X_principal, labels, metric='euclidean')
        tmp_sil.append(sil)
    val = np.mean(SelBest(np.array(tmp_sil), int(iterations / 5)))   # aggregate scores via SelBest
    err = np.std(tmp_sil)
    sils.append(val)
    sils_err.append(err)

plt.errorbar(n_clusters, sils, yerr=sils_err)

plt.xticks(n_clusters)

plt.xlabel("N. of clusters")

plt.ylabel("Score")

plt.show()

OUTPUT:
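
Silhouette is one selection criterion; a fitted GaussianMixture also exposes the Bayesian Information Criterion through its bic method, which penalizes extra components. A short alternative sketch, reusing n_clusters and X_principal from the loop above:

for n in n_clusters:
    bic = GaussianMixture(n_components=n, n_init=2).fit(X_principal).bic(X_principal)
    print("K = %d, BIC = %.1f" % (n, bic))   # lower BIC is better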

Result:
Thus, the Python program for data and text clustering using a Gaussian mixture model has been verified and executed successfully.



Ex. No: 10

Dimensionality Reduction Using Image Processing Applications

Date:

Aim:

To write a Python program for dimensionality reduction algorithms using image processing applications.

Algorithm:

BEGIN:

Step 1: Load the dataset.

Step 2: Compute the means of the variables.

Step 3: Calculate the covariance matrix of the mean-centered data from the ordered pairs (xi, yi).

Step 4: Compute the eigenvalues and eigenvectors of the covariance matrix and keep the leading components.

Step 5: Derive the new dataset by projecting the data onto the selected components (a sketch follows).

END
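
The steps above are PCA itself. A minimal numpy sketch of them, assuming a generic (n, d) data matrix named data (the implementation below instead uses sklearn's PCA on flattened images):

import numpy as np

centered = data - data.mean(axis=0)             # Step 2: subtract the variable means
C = np.cov(centered, rowvar=False)              # Step 3: covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)            # Step 4: eigendecomposition
order = np.argsort(eigvals)[::-1]               # rank directions by variance explained
W = eigvecs[:, order[:2]]                       # keep the top-2 principal directions
reduced = centered @ W                          # Step 5: project onto the new basis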

Implementation:

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split

from sklearn.neural_network import MLPClassifier

from sklearn.metrics import accuracy_score, confusion_matrix

from sklearn.decomposition import PCA

X=np.load('/content/X.npy')

Y=np.load('/content/Y.npy')

X.shape

plt.imshow(X[0])



np.argmax(Y[0])   # decode the one-hot label of the first image (the displayed output was 9)

X_flat = np.array(X).reshape((2062, 64*64))

X_train, X_test, y_train, y_test = train_test_split(X_flat, Y, test_size=0.3, random_state=42)

clf = MLPClassifier(solver='adam', alpha=1e-5, hidden_layer_sizes=(20, 20, 20), random_state=1)

clf.fit(X_train, y_train)

y_hat = clf.predict(X_test)

print("accuracy: " + str(accuracy_score(y_test, y_hat)))

pca_dims = PCA()



pca_dims.fit(X_train)

cumsum = np.cumsum(pca_dims.explained_variance_ratio_)

d = np.argmax(cumsum >= 0.95) + 1   # smallest number of components retaining 95% of the variance

print(d)

pca = PCA(n_components=d)

X_reduced = pca.fit_transform(X_train)

X_recovered = pca.inverse_transform(X_reduced)

print("reduced shape: " + str(X_reduced.shape))

print("recovered shape: " + str(X_recovered.shape))

f = plt.figure()

f.add_subplot(1,2, 1)

plt.title("original")

plt.imshow(X_train[0].reshape((64,64)))

f.add_subplot(1,2, 2)

plt.title("PCA compressed")

plt.imshow(X_recovered[0].reshape((64,64)))

plt.show(block=True)



clf_reduced = MLPClassifier(solver='adam', alpha=1e-5, hidden_layer_sizes=(20, 20, 20))

clf_reduced.fit(X_reduced, y_train)

X_test_reduced = pca.transform(X_test)

y_hat_reduced = clf_reduced.predict(X_test_reduced)

print("accuracy: " + str(accuracy_score(y_test, y_hat_reduced)))

Result:
Thus, the Python program for dimensionality reduction algorithms using image processing applications has been completed and verified successfully.



Ex. No: 11

Implementation of Sampling Methods

Date:

Aim:

To write a Python program for the implementation of sampling methods.

Algorithm:

BEGIN:

Step 1: Import all the libraries.

Step 2: Set up the model by identifying the dependent variable.

Step 3: Specify the probability distributions of the independent variables.

Step 4: Run iterative simulations, generating enough possible values for the independent variables (see the sketch after these steps).

Step 5: Visualize the results in a plot.

END
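
The steps describe a Monte Carlo simulation. A concrete sketch under assumed distributions (a hypothetical profit model; all numbers are illustrative):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 100000
price = rng.normal(10.0, 1.5, n)     # assumed distribution of unit price
demand = rng.normal(500, 80, n)      # assumed distribution of demand
profit = price * demand - 4000       # dependent variable, with an assumed fixed cost

plt.hist(profit, bins=100)
plt.xlabel('Profit')
plt.ylabel('Frequency')
plt.title('Simulated profit distribution')
plt.show()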

Implementation:

%matplotlib inline

from __future__ import print_function

import numpy as np

import matplotlib.pyplot as plt

import numpy.linalg as LA

def multivariate_normal(X, mu=np.array([[0, 0]]), sig=np.array([[1, 0.8], [0.8, 1]])):
    # normalizing constant of a 2-D Gaussian: (2*pi)^(d/2) * sqrt(det(sig)) with d = 2
    norm_const = 2 * np.pi * np.sqrt(LA.det(sig))
    sig_inv = LA.inv(sig)
    X = X[:, None, :] - mu[None, :, :]
    return np.exp(-np.matmul(np.matmul(X, np.expand_dims(sig_inv, 0)), X.transpose(0, 2, 1)) / 2) / norm_const

x = np.linspace(-3, 3, 1000)

X = np.array(np.meshgrid(x, x)).transpose(1, 2, 0)

X = np.reshape(X, [X.shape[0] * X.shape[1], -1])



z = multivariate_normal(X)

plt.imshow(z.squeeze().reshape([x.shape[0], -1]), extent=[-3, 3, -3, 3], cmap='hot', origin='lower')   # extent matched to the grid so the contours align

plt.contour(x, x, z.squeeze().reshape([x.shape[0], -1]), cmap='cool')

plt.title('True Bivariate Distribution')

plt.xlabel('$x_1$')

import numpy as np

import matplotlib.pyplot as plt

x0 = [0, 0]

xt = x0

b = 0.8

samples = []

# Gibbs sampling: alternately draw each coordinate from its conditional distribution;
# for a unit bivariate normal with correlation b, the conditional std is sqrt(1 - b^2)
for i in range(100000):
    x1_t = np.random.normal(b * xt[1], np.sqrt(1 - b * b))
    x2_t = np.random.normal(b * x1_t, np.sqrt(1 - b * b))
    xt = [x1_t, x2_t]
    samples.append(xt)

burn_in = 1000

samples = np.array(samples[burn_in:])   # discard the burn-in portion

im, x_, y_ = np.histogram2d(samples[:, 0], samples[:, 1], bins=100)

plt.imshow(im.T, extent=[x_[0], x_[-1], y_[0], y_[-1]], cmap='hot', origin='lower', interpolation='nearest')   # transpose the counts and take the extent from the histogram bins

plt.title('Empirical Bivariate Distribution')

plt.xlabel('$x_1$')

plt.ylabel('$x_2$')

plt.show()

%matplotlib inline

from __future__ import print_function

import numpy as np

import matplotlib.pyplot as plt

P = lambda x: 3 * np.exp(-x*x/2) + np.exp(-(x - 4)**2/2)

Z = 10.0261955464

x_vals = np.linspace(-10, 10, 1000)

y_vals = P(x_vals)
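
Z is the normalizing constant of the unnormalized density P(x): analytically the two Gaussian bumps contribute 3*sqrt(2*pi) + sqrt(2*pi) ≈ 10.027, and a quick trapezoid-rule check over the plotted grid reproduces the constant used above:

print(np.trapz(y_vals, x_vals))   # ≈ 10.026, the area under P(x)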

plt.figure(1)

plt.plot(x_vals, y_vals, 'r', label='P(x)')



plt.legend(loc='upper right', shadow=True)

plt.show()

f_x = lambda x: x

g_x = lambda x: np.sin(x)

true_expected_fx = 10.02686647165

true_expected_gx = -1.15088010640

a, b = -4, 8

uniform_prob = 1./(b - a)

plt.figure(2)

plt.plot(x_vals, y_vals, 'r', label='P(x)')

plt.plot(x_vals, f_x(x_vals), 'b', label='x')

plt.plot([-10, a, a, b, b, 10], [0, 0, uniform_prob, uniform_prob, 0, 0], 'g', label='Q(x)')

plt.plot(x_vals, np.sin(x_vals), label='sin(x)')

plt.xlim(-4, 10)

plt.ylim(-1, 3.5)

plt.legend(loc='upper right', shadow=True)

plt.show()



expected_f_x = 0.

expected_g_x = 0.

n_samples = 100000

den = 0.

# self-normalized importance sampling with a uniform proposal Q on [a, b]
for i in range(n_samples):
    sample = np.random.uniform(a, b)
    importance = P(sample) / uniform_prob   # unnormalized weight P(x) / Q(x)
    den += importance
    expected_f_x += importance * f_x(sample)
    expected_g_x += importance * g_x(sample)

expected_f_x /= den

expected_g_x /= den

expected_f_x *= Z

expected_g_x *= Z

print('E[f(x)] = %.5f, Error = %.5f' % (expected_f_x, abs(expected_f_x - true_expected_fx)))

print('E[g(x)] = %.5f, Error = %.5f' % (expected_g_x, abs(expected_g_x - true_expected_gx)))



Result:
Thus, the Python program for the implementation of sampling methods has been completed and verified successfully.



Ex. No: 12

Application of Hidden Markov Model for Weather Prediction

Date:

Aim:

To write a Python program implementing the application of a Hidden Markov Model for weather prediction.

Algorithm:

BEGIN:

Step 1: Define the state space and the observation space (a generative sketch follows this algorithm).

Step 2: Define the initial state distribution.

Step 3: Define the state transition probabilities and observation likelihoods.

Step 4: Train the model.

Step 5: Decode the most likely sequence of hidden states.

Step 6: Evaluate the model.

END
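
Steps 1-3 define the generative model. A minimal sketch that samples a synthetic weather/observation sequence from the same matrices used in the implementation below, assuming the emission rows correspond to no-umbrella/umbrella and the columns to states:

import numpy as np

rng = np.random.default_rng(0)
P_transition = np.array([[0.75, 0.15, 0.10],
                         [0.25, 0.55, 0.20],
                         [0.30, 0.30, 0.40]])
P_emission = np.array([[0.75, 0.15, 0.65],    # assumed P(no umbrella | state)
                       [0.25, 0.85, 0.35]])   # assumed P(umbrella | state)
P_init = np.array([0.65, 0.20, 0.15])

states, observations = ['Sunny', 'Rainy', 'Foggy'], ['no umbrella', 'umbrella']
s = rng.choice(3, p=P_init)                    # draw the initial hidden state
for t in range(5):
    o = rng.choice(2, p=P_emission[:, s])      # draw an observation given the state
    print(states[s], '->', observations[o])
    s = rng.choice(3, p=P_transition[s])       # transition to the next hidden state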

Implementation:

import numpy as np

P_transition = np.array([[0.75, 0.15, 0.10],
                         [0.25, 0.55, 0.20],
                         [0.30, 0.30, 0.40]])

P_emission = np.array([[0.75, 0.15, 0.65],
                       [0.25, 0.85, 0.35]])

P_init = [0.65, 0.20, 0.15]

T = 3

hidden_states = np.zeros((T,), dtype=np.int32)

probs = np.zeros((T, 3))

for t in range(T):
    if t == 0:
        probs[t, :] = P_init * P_transition[1, :]
    else:
        probs[t, :] = np.max(probs[t - 1, :, None] * P_transition, axis=0)
    hidden_states[t] = np.argmax(probs[t, :])

state_names = ['Sunny', 'Rainy', 'Foggy']

forecast = [state_names[s] for s in hidden_states]

print("717821F219 J Jebershon Vetha Singh ")

print(forecast)

P_transition = np.array([[0.75, 0.15, 0.10],
                         [0.25, 0.55, 0.20],
                         [0.30, 0.30, 0.40]])

P_emission = np.array([[0.75, 0.15, 0.65],
                       [0.25, 0.85, 0.35],
                       [0.0, 0.0, 0.0]])

P_init = np.array([0.65, 0.20, 0.15])

P_hidden_init = P_transition[1, :]

alpha = np.zeros((3,))

for i in range(3):
    alpha[i] = P_emission[i, 0] * P_init[i]   # initialize with the first observation

for t in range(1, 3):
    alpha = np.dot(alpha, P_transition) * P_emission[:, t]

P_evidence = np.sum(alpha)

posteriors = alpha / P_evidence

state_names = ['Sunny', 'Rainy', 'Foggy']

for i in range(3):
    print("Probability of being in state %s: %.4f" % (state_names[i], posteriors[i]))

most_likely_weather = state_names[np.argmax(posteriors)]

print("The most likely weather is %s." % most_likely_weather)



E = ["no umbrella", "umbrella", "umbrella", "no umbrella", "umbrella", "no umbrella"]

alpha = np.zeros((3,))

for i in range(3):

alpha[i] = P_emission[i, 0] * P_init[i] # Indented this line to be part of the 'for' loop

for t in range(1, len(E)):

alpha = np.dot(alpha, P_transition) * P_emission[:, t % 2]

final_state = np.argmax(alpha)

state_names = ['Sunny', 'Rainy', 'Foggy']

most_likely_weather = state_names[final_state]

print("Final state:", state_names[final_state])

print("Most likely weather:", most_likely_weather)

P_emission = np.array([[0.6, 0.4, 0.0, 0.0],
                       [0.0, 0.0, 0.5, 0.5],
                       [0.3, 0.3, 0.2, 0.2]])

P_transition = np.array([[0.7, 0.2, 0.1],
                         [0.2, 0.5, 0.3],
                         [0.3, 0.3, 0.4]])

E = [0, 1, 2, 2, 1, 0]

T = len(E)

trellis = np.zeros((T, 3))

backpointers = np.zeros((T-1, 3), dtype=np.int64)

for i in range(3):
    trellis[0, i] = P_emission[i, E[0]] * P_transition[i, 0]   # initialize the first column of the trellis

for t in range(1, T):
    for j in range(3):
        prob_transitions = trellis[t - 1, :] * P_transition[:, j]
        trellis[t, j] = P_emission[j, E[t]] * np.max(prob_transitions)
        backpointers[t - 1, j] = np.argmax(prob_transitions)

# trace the best path backwards through the stored backpointers
hidden_states = [np.argmax(trellis[-1])]

for i in range(T-2, -1, -1):
    hidden_states.append(backpointers[i, hidden_states[-1]])

hidden_states.reverse()

print("The most likely sequence of hidden states is:", hidden_states)

Result:
Thus, the Python program implementing the application of a Hidden Markov Model for weather prediction has been completed and verified successfully.
