ML Algorithms Python

The document provides an overview of four machine learning algorithms implemented in Python: Decision Tree for data classification, K-Means for data clustering, Linear Regression for predicting continuous values, and Logistic Regression for binary classification. Each section includes a theoretical explanation, workflow steps, and corresponding Python code examples. The focus is on practical implementation using the scikit-learn library.

Machine Learning Algorithms in Python

1. Data Classification (Decision Tree)

Overview:
Data classification assigns one of a set of predefined labels to each data point based on its features.
We use a decision tree classifier, which repeatedly splits the data on feature-value thresholds.

Theory:
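Decision Trees choose each split to reduce an impurity measure, typically entropy or Gini impurity (scikit-learn's default).
Formula: Entropy = $-\sum_i p_i \log_2 p_i$, where p_i is the proportion of samples in class i.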
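To make the two impurity measures concrete, here is a minimal sketch that computes them for a list of class labels. The helper names (`class_proportions`, `entropy`, `gini`) and the toy label lists are illustrative assumptions, not part of scikit-learn.

```python
import numpy as np

def class_proportions(labels):
    """Return the fraction of samples in each class present in labels."""
    _, counts = np.unique(labels, return_counts=True)
    return counts / counts.sum()

def entropy(labels):
    """Shannon entropy in bits: sum(p_i * log2(1 / p_i)) = -sum(p_i * log2(p_i))."""
    p = class_proportions(labels)
    return float(np.sum(p * np.log2(1.0 / p)))

def gini(labels):
    """Gini impurity: 1 - sum(p_i ** 2)."""
    p = class_proportions(labels)
    return float(1.0 - np.sum(p ** 2))

pure = [0, 0, 0, 0]    # hypothetical node containing a single class
mixed = [0, 0, 1, 1]   # hypothetical node split evenly between two classes

print("pure node: ", entropy(pure), gini(pure))
print("mixed node:", entropy(mixed), gini(mixed))
```

A pure node scores zero on both measures, and an even two-class mix gives the two-class maximum (1 bit of entropy, Gini 0.5). A tree prefers splits whose child nodes are closer to pure.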

Workflow:
1. Load and preprocess the dataset.
2. Train the Decision Tree model.
3. Make predictions on test data.
4. Evaluate model performance.

Python Code:
```python
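from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

# Load the Iris dataset: 150 samples, 4 features, 3 classes.
iris = load_iris()
X, y = iris.data, iris.target

# Hold out 20% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the tree; random_state makes the fitted tree reproducible.
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# Predict labels for the held-out samples.
y_pred = clf.predict(X_test)

# Accuracy is the fraction of test samples classified correctly.
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)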
```
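To see the feature-value splits a fitted tree actually uses, the rules can be printed as text. This is a minimal sketch: the `max_depth=2` limit and the variable names `shallow` and `rules` are illustrative choices to keep the output short, not settings from the example above.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A deliberately shallow tree (illustrative) so the printed rules stay readable.
shallow = DecisionTreeClassifier(max_depth=2, random_state=42)
shallow.fit(iris.data, iris.target)

# export_text renders each split as a "feature <= threshold" rule.
rules = export_text(shallow, feature_names=list(iris.feature_names))
print(rules)
print("Impurity criterion:", shallow.criterion)
```

Each indented line in the output is one threshold test, and each leaf line reports the predicted class.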

2. Data Clustering (K-Means)

Overview:
Clustering groups similar data points without predefined labels.
We use K-Means clustering, which partitions data into K clusters.

Theory:
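K-Means alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. Neither step can increase the Within-Cluster Sum of Squares (WCSS), so the algorithm converges to a local minimum of WCSS, not necessarily the global optimum.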
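As a rough illustration of what WCSS measures, the sketch below computes it by hand as the sum of squared distances from each point to its assigned centroid, then compares it with the `inertia_` attribute that scikit-learn's `KMeans` reports. The dataset parameters mirror the ones used later in this section; the variable names are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)
kmeans = KMeans(n_clusters=4, random_state=42, n_init=10).fit(X)

# Look up the centroid assigned to each sample: shape (n_samples, n_features).
assigned_centers = kmeans.cluster_centers_[kmeans.labels_]

# WCSS: squared Euclidean distance from each point to its centroid, summed.
wcss = float(np.sum((X - assigned_centers) ** 2))

print("WCSS computed by hand:", wcss)
print("KMeans.inertia_:      ", kmeans.inertia_)
```

The two numbers should agree up to floating-point error.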

Workflow:
1. Generate data.
2. Apply K-Means clustering.
3. Use the Elbow Method to find optimal K.
4. Visualize the clusters.

Python Code:
```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Generate 300 two-dimensional points scattered around 4 centers.
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Fit K-Means and get the cluster index assigned to each point.
kmeans = KMeans(n_clusters=4, random_state=42, n_init=10)
y_kmeans = kmeans.fit_predict(X)

# Plot the points colored by cluster, with the centroids as red X markers.
plt.scatter(X[:, 0], X[:, 1], c=y_kmeans, cmap='viridis', alpha=0.6)
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='red', marker='X')
plt.show()
```
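The workflow lists the Elbow Method, which the code above does not show. One possible sketch fits K-Means for a range of K values and plots the resulting WCSS. The candidate range of 1 to 10 and the names `k_values` and `wcss` are illustrative assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

k_values = list(range(1, 11))  # candidate cluster counts (illustrative range)
wcss = []
for k in k_values:
    km = KMeans(n_clusters=k, random_state=42, n_init=10).fit(X)
    wcss.append(km.inertia_)  # inertia_ is the WCSS of the fitted model

# The "elbow" is the K after which adding clusters yields little further drop.
plt.plot(k_values, wcss, marker='o')
plt.xlabel('Number of clusters K')
plt.ylabel('WCSS (inertia)')
plt.title('Elbow Method')
plt.show()
```

WCSS generally keeps falling as K grows, so the method looks for the bend in the curve, not the minimum. Reading the elbow is a judgment call; for data generated around 4 centers, the bend would be expected near K = 4.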

3. Linear Regression

Overview:
Linear regression predicts continuous values using a linear equation.

Theory:
Y = mX + c, where m is the slope and c is the intercept.
The model chooses m and c to minimize the Mean Squared Error (MSE) between predicted and actual values.

Workflow:
1. Generate dataset.
2. Train the Linear Regression model.
3. Make predictions.
4. Evaluate using MSE and the R² score.
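For the single-feature model Y = mX + c described under Theory, the MSE-minimizing slope and intercept have a closed form, and both evaluation metrics can be computed directly from their definitions. The sketch below uses NumPy only, on synthetic data; the seed and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)           # seed chosen only for reproducibility
x = rng.uniform(0, 10, size=100)
y = 2.5 * x + rng.normal(0, 2, size=100)  # true slope 2.5, true intercept 0

# Closed-form least-squares estimates for Y = mX + c.
x_mean, y_mean = x.mean(), y.mean()
m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
c = y_mean - m * x_mean

# MSE: mean squared residual. R²: 1 - (residual sum of squares / total sum of squares).
y_hat = m * x + c
mse = np.mean((y - y_hat) ** 2)
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y_mean) ** 2)

print(f"slope m = {m:.3f}, intercept c = {c:.3f}")
print(f"MSE = {mse:.3f}, R² = {r2:.3f}")
```

The estimated slope should land near the true value of 2.5. Unlike the scikit-learn workflow, this sketch fits and evaluates on the same data, so the scores will be slightly optimistic compared with a held-out test set.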

Python Code:
```python
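from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import numpy as np

# Seed NumPy so the synthetic data is reproducible.
np.random.seed(42)

# Synthetic data: y = 2.5 * X plus Gaussian noise (the true intercept is 0).
X = np.random.rand(100, 1) * 10
y = 2.5 * X + np.random.randn(100, 1) * 2
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate on the held-out data with both metrics named in the workflow.
print("MSE:", mean_squared_error(y_test, y_pred))
print("R² Score:", model.score(X_test, y_test))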
```

4. Logistic Regression

Overview:
Logistic regression is used for binary classification.

Theory:
Uses the sigmoid function to map inputs to probabilities.
Formula: P(Y=1) = 1 / (1 + e^-(mX + c))
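A small numeric sketch of the formula: the sigmoid squashes the linear score mX + c into the range (0, 1), and a 0.5 threshold turns that probability into a class label. The values of `m`, `c`, and the inputs are hypothetical, chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

m, c = 1.5, -3.0               # hypothetical slope and intercept
X = np.array([0.0, 2.0, 4.0])  # hypothetical single-feature inputs

probs = sigmoid(m * X + c)            # P(Y=1) for each input
labels = (probs >= 0.5).astype(int)   # predict class 1 when P(Y=1) >= 0.5

print("P(Y=1):", probs)
print("Predicted labels:", labels)
```

The input where mX + c = 0 (here X = 2.0) sits exactly on the decision boundary with probability 0.5.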

Workflow:
1. Generate classification dataset.
2. Train Logistic Regression model.
3. Make predictions.
4. Evaluate using accuracy and confusion matrix.

Python Code:
```python
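from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# With n_features=2, both features must be informative: the defaults
# (n_informative=2, n_redundant=2) need at least 4 features and raise a ValueError.
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

log_reg = LogisticRegression()
log_reg.fit(X_train, y_train)
y_pred = log_reg.predict(X_test)

# Evaluate on the held-out data with both metrics named in the workflow.
print("Accuracy:", log_reg.score(X_test, y_test))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))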
```
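To connect the fitted model back to the sigmoid formula, the sketch below rebuilds the class-1 probabilities from the learned coefficients and compares them with `predict_proba`. It assumes the binary case with scikit-learn's default `LogisticRegression` settings; the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
log_reg = LogisticRegression().fit(X, y)

# Linear score for each sample: one learned weight per feature plus the intercept.
z = X @ log_reg.coef_.ravel() + log_reg.intercept_[0]

# Apply the sigmoid by hand and compare with the library's class-1 probability.
manual = 1.0 / (1.0 + np.exp(-z))
library = log_reg.predict_proba(X)[:, 1]

print("Manual and library probabilities match:", np.allclose(manual, library))
```

With two features, the single slope m in the formula becomes one weight per feature, stored in `coef_`, and c corresponds to `intercept_`.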
