Disaster

Uploaded by

akhilmetha756


Micro Project

S. No. | Program | Page no | Signature
1 | Write a python program to calculate the Variance, Standard Deviation, Skewness and Kurtosis. | |

Macro Project

S. No. | Program | Page no | Signature
1 | Develop a machine learning model to detect fraudulent credit card transactions. Explore anomaly detection techniques and evaluate model performance. | |

Mini Skill Project

S. No. | Program | Page no | Signature
1 | Fake News Detection- fake news is sometimes transmitted through the internet by some unauthorised sources, which creates issues for the targeted person and it makes them panic and leads to even violence. Dataset: fake-news kaggle. | |

MICRO PROJECT

AIM :- Write a Python program to calculate the Variance, Standard Deviation, Skewness and Kurtosis.

DESCRIPTION :-

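VARIANCE

• Definition: Variance measures the spread of data points around the mean. It indicates how far the values in the data set deviate from the average value.
• Formula:

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2 \quad \text{(population)}, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \quad \text{(sample)}$$

where:

• μ is the population mean, x̄ is the sample mean.

• n is the number of data points.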
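
As a minimal sketch of the two variance formulas, the snippet below computes both by hand using only the standard library. The data values are the example dataset used in the program later in this project; the variable names are illustrative.

```python
import statistics

data = [12, 15, 14, 10, 8, 13, 14, 16, 18, 21]
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean
ss = sum((x - mean) ** 2 for x in data)

population_variance = ss / n        # divide by n
sample_variance = ss / (n - 1)      # divide by n - 1 (Bessel's correction)

print("Population variance:", population_variance)
print("Sample variance:", sample_variance)

# Cross-check against the standard library implementations
print(statistics.pvariance(data), statistics.variance(data))
```

The sample variance is always the larger of the two, and the gap shrinks as n grows.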

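STANDARD DEVIATION:

• Description:
Standard deviation is the square root of variance, measuring how much the values deviate from the mean, in the same unit as the data.
• Formula:

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2} \quad \text{(population)}, \qquad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} \quad \text{(sample)}$$

• A lower standard deviation means the data points tend to be closer to the mean, whereas a higher standard deviation indicates more spread.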
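
The point that standard deviation is "in the same unit as the data" can be checked with a small sketch. The heights below are made-up values for illustration only: converting metres to centimetres multiplies the standard deviation by 100 but the variance by 100² = 10,000.

```python
import math
import statistics

# Hypothetical heights, first in metres and then in centimetres
heights_m = [1.60, 1.72, 1.68, 1.81, 1.75]
heights_cm = [h * 100 for h in heights_m]

std_m = statistics.stdev(heights_m)
std_cm = statistics.stdev(heights_cm)
var_m = statistics.variance(heights_m)
var_cm = statistics.variance(heights_cm)

print("std ratio:", std_cm / std_m)   # scales like the data (about 100)
print("var ratio:", var_cm / var_m)   # scales like the data squared (about 10000)
```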

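SKEWNESS

• Definition: Skewness measures the asymmetry of the distribution of data around its mean. It indicates whether the data is skewed to the left (negatively skewed) or to the right (positively skewed).
• Formula (the moment-based form that scipy.stats.skew computes by default):

$$g_1 = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^3}{\left(\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2\right)^{3/2}}$$

where:

• x̄ is the mean.
• n is the number of data points.

• A negative skewness value indicates left-skewed data (a longer left tail, with most values concentrated on the right), while a positive skewness value indicates right-skewed data (a longer right tail, with most values concentrated on the left). A skewness value close to 0 indicates a roughly symmetric distribution.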
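
The sign convention can be illustrated with a from-scratch sketch of moment-based skewness (third central moment divided by the second central moment raised to the power 3/2). The helper names and the toy lists are assumptions made for this example.

```python
def central_moment(values, k):
    """k-th central moment using the population (divide-by-n) convention."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n

def skewness(values):
    """Moment-based skewness: m3 / m2**1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]          # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]    # mirror image: long left tail

print("symmetric:", skewness(symmetric))
print("right-tailed:", skewness(right_tailed))
print("left-tailed:", skewness(left_tailed))
```

Mirroring the data flips the sign of the skewness without changing its magnitude.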

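KURTOSIS

• Definition: Kurtosis measures the "tailedness" of the distribution, indicating how much of the data is in the tails compared to a normal distribution. It helps describe the shape of the distribution.
• Formula (excess kurtosis, the form that scipy.stats.kurtosis computes by default):

$$g_2 = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^4}{\left(\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2\right)^{2}} - 3$$

• A positive excess kurtosis (leptokurtic) means that the distribution has heavier tails than a normal distribution, while a negative excess kurtosis (platykurtic) means it has lighter tails. A normal distribution has an excess kurtosis of 0.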
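
A from-scratch sketch of excess kurtosis (fourth central moment divided by the squared second central moment, minus 3) shows both signs. The helper names and the two toy lists are assumptions made for this example.

```python
def central_moment(values, k):
    """k-th central moment using the population (divide-by-n) convention."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n

def excess_kurtosis(values):
    """Excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

flat = [1, 2, 3, 4, 5]                              # evenly spread, light tails
heavy_tailed = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]    # mostly central, two extremes

print("flat:", excess_kurtosis(flat))                  # negative (platykurtic)
print("heavy-tailed:", excess_kurtosis(heavy_tailed))  # positive (leptokurtic)
```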
PROGRAM:-

import numpy as np
from scipy.stats import skew, kurtosis

# Function to calculate Variance, Standard Deviation, Skewness, and Kurtosis
def calculate_statistics(data):
    variance = np.var(data, ddof=1)  # ddof=1 for sample variance
    std_dev = np.std(data, ddof=1)  # ddof=1 for sample standard deviation
    skewness = skew(data)  # moment-based (biased) estimate by default
    kurt = kurtosis(data)  # excess (Fisher) kurtosis by default
    return variance, std_dev, skewness, kurt

# Example dataset
data = [12, 15, 14, 10, 8, 13, 14, 16, 18, 21]

# Calculating statistics
variance, std_dev, skewness, kurt = calculate_statistics(data)

print(f"Variance: {variance}")
print(f"Standard Deviation: {std_dev}")
print(f"Skewness: {skewness}")
print(f"Kurtosis: {kurt}")

OUTPUT:
MACRO PROJECT

AIM :- Develop a machine learning model to detect fraudulent credit card transactions. Explore anomaly detection techniques and evaluate model performance.

DESCRIPTION :-
Fraud detection in credit card transactions is a critical application of machine learning, aiming to minimize financial losses and ensure secure transactions. The primary challenge lies in the imbalanced nature of the data: fraudulent transactions are rare compared to legitimate ones. This necessitates techniques that can effectively detect anomalies or patterns indicative of fraud.

Data Preprocessing

1. Imbalanced Dataset: Fraudulent transactions typically make up less than 1% of the total data. Handling this imbalance is crucial.
   o Techniques such as undersampling the majority class or oversampling the minority class with synthetic samples (e.g., SMOTE) can be employed.
2. Feature Scaling and Encoding: Most datasets require normalization/scaling (e.g., MinMaxScaler or StandardScaler) and encoding of categorical features for effective model training.
3. Feature Selection/Engineering: Identify important features such as transaction amount, time, or customer behavior patterns.
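
As one illustration of handling imbalance, the sketch below performs random undersampling of the majority class using only the standard library. The function name, the toy data, and the assumption that label 1 is the rarer class are all made up for this example; it is not the approach used in the program later in this project.

```python
import random

def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows (label 0) until both classes are the same size.

    Assumes label 1 is the minority class.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    kept_majority = rng.sample(majority, len(minority))
    keep = sorted(minority + kept_majority)
    return [rows[i] for i in keep], [labels[i] for i in keep]

# Toy data: 1000 transactions, 10 of them labelled as fraud (made-up values)
labels = [1 if i % 100 == 0 else 0 for i in range(1000)]
rows = [[float(i)] for i in range(1000)]

balanced_rows, balanced_labels = undersample(rows, labels)
print(len(balanced_rows), sum(balanced_labels))
```

Undersampling discards most legitimate transactions, so it trades information for balance; oversampling methods such as SMOTE keep all rows but add synthetic ones.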

Machine Learning Techniques

1. Supervised Learning Models:
   o Logistic Regression: Interpretable model for binary classification.
   o Random Forest and Gradient Boosting (e.g., XGBoost): Capable of handling imbalanced data with class weights.
   o Neural Networks: Capture complex patterns but require significant data and resources.
   o Support Vector Machines (SVM): Effective for smaller, high-dimensional datasets.

2. Anomaly Detection Techniques:
   o Unsupervised Models (useful when labels are scarce or unreliable):
      • Autoencoders: Neural networks that learn to reconstruct data; higher reconstruction error indicates potential fraud.
      • Isolation Forest: Detects anomalies by isolating observations.
      • Clustering Algorithms (e.g., DBSCAN, K-Means): Identify clusters of legitimate transactions; outliers are flagged as anomalies.
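
The class weights mentioned for the supervised models can be sketched as follows. The formula, n_samples / (n_classes * class_count), is the heuristic scikit-learn applies when class_weight="balanced" is requested; the function name and toy labels here are illustrative assumptions.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Toy labels: 990 legitimate (0) and 10 fraudulent (1) transactions
labels = [0] * 990 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, each misclassified fraud costs the model roughly as much as a hundred misclassified legitimate transactions, which counteracts the imbalance during training.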
Evaluation Metrics

Given the class imbalance, conventional accuracy is not a reliable metric. Instead, the following are used:

• Precision: The share of flagged transactions that are actually fraud; high precision means few false positives.
• Recall (Sensitivity): The share of actual frauds that the model identifies.
• F1 Score: The harmonic mean of precision and recall.
• AUC-ROC Curve: Evaluates the trade-off between true positive and false positive rates.
• Confusion Matrix: Shows the counts of true positives, true negatives, false positives, and false negatives.
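
A small sketch shows why accuracy misleads on imbalanced data. The labels are made-up: a "model" that never flags fraud still scores very high accuracy while catching nothing.

```python
# Toy labels: 1000 transactions, 2 of them fraudulent (made-up numbers)
y_true = [1] * 2 + [0] * 998
y_pred = [0] * 1000   # a trivial "model" that always predicts non-fraud

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = tp / (tp + fn)

print("Accuracy:", accuracy)  # high, despite the model being useless
print("Recall:", recall)      # zero: no fraud was caught
```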

Steps to Develop the Model

1. Exploratory Data Analysis (EDA):
   o Understand the data distribution, identify missing values, and visualize class imbalance.
2. Data Preprocessing:
   o Handle missing data, normalize/scale features, and balance classes.
3. Model Training:
   o Train multiple models using supervised and unsupervised techniques.
4. Hyperparameter Tuning:
   o Optimize model parameters using grid search, random search, or Bayesian optimization.
5. Evaluation:
   o Evaluate models using the metrics listed above, focusing on minimizing false negatives (missed frauds).

Anomaly Detection Approach

In anomaly detection:

• Models are trained on legitimate transaction data (normal data).
• Fraudulent transactions are flagged as anomalies based on the deviation from learned patterns.
• Suitable algorithms:
   o One-Class SVM: Learns a boundary for normal data.
   o Autoencoders: Reconstruct transactions; high reconstruction errors signify anomalies.
   o Isolation Forest: Efficiently isolates anomalies by random partitioning.
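
The idea of learning "normal" behaviour and flagging deviations can be illustrated with a deliberately simple stand-in: a z-score rule on a single feature. This is not one of the algorithms listed above; the amounts, the threshold of 3, and the function name are assumptions chosen only to show the train-on-normal, flag-on-deviation pattern.

```python
import statistics

# "Training" data: amounts from legitimate transactions only (made-up values)
normal_amounts = [20.0, 35.5, 18.2, 42.0, 27.3, 31.1, 24.8, 38.6]
mean = statistics.mean(normal_amounts)
std = statistics.pstdev(normal_amounts)

def is_anomaly(amount, threshold=3.0):
    """Flag an amount that lies more than `threshold` standard deviations from the normal mean."""
    return abs(amount - mean) / std > threshold

new_amounts = [29.0, 950.0]
flags = [is_anomaly(a) for a in new_amounts]
print(flags)
```

One-Class SVM, autoencoders, and Isolation Forest follow the same pattern but learn far richer descriptions of normal data across many features.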
PROGRAM:-

# Importing all the necessary libraries

import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import IsolationForest
from sklearn.metrics import (classification_report, confusion_matrix, accuracy_score,
                             precision_score, recall_score, f1_score)
import seaborn as sns
import matplotlib.pyplot as plt

Step 1: Load the Dataset

data = pd.read_csv('creditcard.csv')
data

Step 2: Preprocess the Data

In the creditcard.csv dataset, the Time column records the seconds elapsed between each transaction and the first transaction in the dataset. It is dropped here on the assumption that it carries little useful signal for detecting fraud; removing it reduces noise and the risk of overfitting.

data = data.drop(['Time'], axis=1)

• StandardScaler standardizes each feature to have a mean of 0 and a standard deviation of 1.
• fit_transform computes the scaling parameters and applies them to every column except the target variable (Class).
• Isolation Forest splits on one feature at a time, so it is largely insensitive to feature scale; scaling is kept here so the same preprocessed data can be reused with scale-sensitive models.

Normalize the Features

scaler = StandardScaler()
data_scaled = scaler.fit_transform(data.drop('Class', axis=1)) # Assume 'Class' is the target label
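
As a minimal sketch of what standardization does to one column, the snippet below subtracts the mean and divides by the population standard deviation (the divide-by-n convention, which is what StandardScaler uses). The function name and the amounts are illustrative.

```python
import statistics

def standardize(column):
    """Return the column rescaled to mean 0 and (population) standard deviation 1."""
    mean = statistics.mean(column)
    std = statistics.pstdev(column)
    return [(x - mean) / std for x in column]

amounts = [10.0, 20.0, 30.0, 40.0, 50.0]   # made-up transaction amounts
scaled = standardize(amounts)

print(scaled)
print("mean:", statistics.mean(scaled), "std:", statistics.pstdev(scaled))
```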
Step 3: Anomaly Detection Using Isolation Forest
Isolation Forest detects anomalies by recursively partitioning the data with random splits; points that can be isolated in fewer splits receive a more anomalous score. The contamination=0.01 parameter sets the score threshold so that about 1% of transactions are flagged as outliers (treated here as fraud). The model is trained using model.fit(data_scaled).

model = IsolationForest(contamination=0.01) # 1% expected fraudulence

model.fit(data_scaled)
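
The effect of the contamination setting can be checked against the confusion-matrix counts reported at the end of this project. The sketch below uses only those four counts; the variable names are illustrative.

```python
# Counts taken from the confusion matrix reported at the end of this project
tp, tn, fp, fn = 305, 281771, 2544, 187

total = tp + tn + fp + fn     # all transactions
actual_fraud = tp + fn        # transactions labelled as fraud in the data
flagged = tp + fp             # transactions the model flagged as outliers

fraud_rate = actual_fraud / total
flagged_rate = flagged / total
best_possible_precision = actual_fraud / flagged

print(f"Actual fraud rate: {fraud_rate:.4%}")
print(f"Flagged rate:      {flagged_rate:.4%}")
print(f"Precision ceiling: {best_possible_precision:.3f}")
```

The flagged share is close to the 1% contamination setting, while the actual fraud share is several times smaller. Even if every fraud were among the flagged transactions, most flags would still be false positives, which suggests a lower contamination value may be worth trying.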

Step 4: Predictions
• The predict method of IsolationForest assigns a label of -1 to outliers (fraudulent transactions) and 1 to normal points (non-fraudulent transactions).
• Since the target variable in the dataset uses 1 for fraud and 0 for non-fraud, the predictions need to be mapped:
   o -1 (outliers) -> 1 (fraud)
   o 1 (normal) -> 0 (non-fraud)

predictions = model.predict(data_scaled)
predictions = [1 if pred == -1 else 0 for pred in predictions]

Step 5: Evaluate Model Performance

Accuracy
• Accuracy is the proportion of correct predictions (both fraud and non-fraud) out of all predictions. It is useful for general performance but can be misleading on imbalanced datasets.
Precision
• Precision measures how many of the predicted fraud cases are actually fraud. It is important when false positives (non-fraud labeled as fraud) are costly.
Recall
• Recall indicates how many of the actual fraud cases were correctly identified. It is crucial when missing fraud cases (false negatives) is more costly.
F1-Score
• F1-Score is the harmonic mean of precision and recall. It is useful when both false positives and false negatives matter, especially on imbalanced datasets.

# True labels from the original DataFrame (1 = fraud, 0 = non-fraud)
labels = data['Class']

# Accuracy
accuracy = accuracy_score(labels, predictions)
print("\nAccuracy:", accuracy)

# Precision
precision = precision_score(labels, predictions)
print("Precision:", precision)
# Recall
recall = recall_score(labels, predictions)
print("Recall:", recall)

# F1-Score
f1 = f1_score(labels, predictions)
print("F1-Score:", f1)

Classification Report
The classification report provides additional metrics like precision, recall, and F1-score for each
class (fraud and non-fraud). This gives more insight into how well the model performs on each
class.

print("\nClassification Report:")
print(classification_report(labels, predictions))

Step 6: Plotting the Confusion Matrix

• Confusion matrix visualization helps show how the model is performing.
• A heatmap displays the confusion matrix, with the actual labels on the y-axis and predicted labels on the x-axis. The color intensity indicates the number of instances in each category.

# Build the confusion matrix (rows = actual labels, columns = predicted labels)
conf_matrix = confusion_matrix(labels, predictions)

plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Non-Fraud', 'Fraud'],
            yticklabels=['Non-Fraud', 'Fraud'])
plt.title('Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('Actual Label')
plt.show()
True Positives (TP): 305 transactions were correctly identified as fraudulent.
True Negatives (TN): 281771 transactions were correctly identified as non-fraudulent.
False Positives (FP): 2544 transactions were incorrectly identified as fraudulent (Type I error).
False Negatives (FN): 187 transactions were incorrectly identified as non-fraudulent (Type II
error).
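
The four counts above are enough to recompute the headline metrics by hand. The sketch below applies the standard definitions; only the counts come from this report.

```python
# Confusion-matrix counts reported above
tp, tn, fp, fn = 305, 281771, 2544, 187

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)    # flagged transactions that were really fraud
recall = tp / (tp + fn)       # frauds that were actually caught
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy:  {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"F1-Score:  {f1:.4f}")
```

Accuracy comes out at about 0.99, yet precision is only about 0.11 and recall about 0.62: roughly nine in ten flagged transactions are legitimate, and more than a third of frauds are missed. This is the accuracy pitfall on imbalanced data described in the evaluation section.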
MINI PROJECT

AIM :- Fake News Detection: fake news is sometimes spread over the internet by unauthorised sources, which causes problems for the targeted person, creates panic, and can even lead to violence. Dataset: fake-news (Kaggle).

DESCRIPTION :-

Project Overview:
The project aims to develop a machine learning model for fake news detection using textual data.
The increasing prevalence of misinformation on the internet has made it critical to identify and flag
fake news articles. This model leverages Natural Language Processing (NLP) techniques, machine
learning algorithms, and feature extraction methods to classify news articles as either real or fake.

Description:
With the rise of social media and online news platforms, the spread of fake news has become a
major issue. Fake news can cause harm by spreading misinformation, creating panic, or inciting
violence. The goal of this project is to build an automatic system to classify news articles as real or
fake based on their content. The workflow involves cleaning and preprocessing the data,
visualizing the distribution and common words in fake news, extracting relevant features from the
text, training a machine learning model, evaluating its performance, and saving the model for
future predictions.

Steps Involved:
1. Data Collection and Understanding:
   o A dataset containing news articles, typically labeled as either "real" or "fake", is loaded into the system.
   o The dataset is then analyzed to check the distribution of labels, missing values, and any anomalies. Basic statistics are explored, and the data is cleaned for the next steps.
2. Data Preprocessing:
   o Text preprocessing is a critical part of any NLP task. Raw text data can contain noise, such as URLs, special characters, and unnecessary whitespace. Cleaning the text helps the model focus on important features, like the core vocabulary.
   o Steps include:
      • Removing URLs using regular expressions.
      • Removing non-alphabetic characters.
      • Converting text to lowercase for uniformity.
      • Stripping unnecessary spaces from the text.
3. Exploratory Data Analysis (EDA):
   o Visualization: Before training the model, it is helpful to understand the dataset better.
   o Class distributions (real vs. fake news) are visualized.
   o A word cloud of the most frequent words in fake news helps to identify common patterns and potentially misleading vocabulary that could be indicators of fake news.
4. Feature Extraction using TF-IDF:
   o TF-IDF (Term Frequency-Inverse Document Frequency) is used to convert raw text data into numerical vectors that can be processed by machine learning algorithms.
   o This method captures the importance of words in a document while penalizing words that appear frequently across many documents, making it suitable for text classification tasks.
5. Model Training:
   o A Logistic Regression classifier is trained on the preprocessed and vectorized text data. Logistic Regression is a simple and efficient algorithm for binary classification tasks.
   o The dataset is split into training and test sets to evaluate the model's performance on unseen data.
6. Model Evaluation:
   o The trained model is evaluated using common metrics like:
      • Accuracy: The proportion of correct predictions.
      • Precision, Recall, and F1-score: These metrics are especially useful in imbalanced datasets (such as news articles where fake news may be less common than real news).
      • Confusion Matrix: This matrix helps visualize how well the model distinguishes between real and fake news by showing the counts of true positives, true negatives, false positives, and false negatives.
7. Saving the Model:
   o Once the model is trained and evaluated, it is saved using the joblib library so it can be reloaded and used to make predictions on new, unseen data without retraining.
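
The TF-IDF idea in step 4 can be sketched from scratch with the textbook definition: term frequency multiplied by log(N / document frequency). The three documents are made up for illustration. Note that scikit-learn's TfidfVectorizer uses a smoothed IDF and normalizes each vector, so its numbers differ from this simplified version.

```python
import math
from collections import Counter

# Tiny made-up corpus
docs = [
    "breaking news shocking secret revealed",
    "government announces new budget",
    "shocking secret about the budget",
]
tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

# Document frequency: in how many documents each term appears
df = Counter(term for tokens in tokenized for term in set(tokens))

def tfidf(tokens):
    """Textbook TF-IDF: (count / document length) * log(N / document frequency)."""
    counts = Counter(tokens)
    return {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in counts.items()}

vectors = [tfidf(tokens) for tokens in tokenized]
for term, score in sorted(vectors[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {score:.4f}")
```

In the first document, "breaking" appears in no other document and scores higher than "shocking", which is shared with the third document. That is the penalty for common words described in step 4.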
Technology Stack and Libraries Used
Tech Stack:
1. Python: The programming language used for the entire project. Python is well-suited for
machine learning tasks due to its simplicity and the wide availability of libraries and
frameworks.
2. Jupyter Notebook or Google Colab: Used for writing, testing, and visualizing the code
and results interactively.
Libraries Used:
1. pandas:
   o Purpose: Data manipulation and analysis.
   o Usage: It is used to load and preprocess the dataset, clean missing data, and analyze the structure of the dataset.
2. numpy:
   o Purpose: Numerical computing.
   o Usage: It is used for working with arrays and matrices, particularly in conjunction with machine learning algorithms.
3. re (Regular Expression):
   o Purpose: Text processing.
   o Usage: Regular expressions are used to clean the text data, such as removing URLs or special characters.
4. spaCy (Optional but useful):
   o Purpose: Natural Language Processing (NLP).
   o Usage: For advanced NLP tasks like tokenization, lemmatization, and part-of-speech tagging. While not explicitly used in the provided code, it can be helpful for more sophisticated preprocessing.
5. matplotlib and seaborn:
   o Purpose: Data visualization.
   o Usage: Used for visualizing data distributions and model performance.
6. wordcloud:
   o Purpose: Text visualization.
   o Usage: It generates a visual representation (word cloud) of the most frequent words in a text corpus.
7. scikit-learn (sklearn):
   o Purpose: Machine learning.
   o Usage: It provides tools for feature extraction, model training, evaluation, and tuning.
PROGRAM:
#Step 1: Load Libraries and Dataset
import pandas as pd
import numpy as np
import re
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix
from sklearn.pipeline import Pipeline

# Load the dataset

data = pd.read_csv("train.csv", encoding="utf-8")

# Display basic information

print(data.head())
print(data.info())
print(data['label'].value_counts()) # Check class distribution
# Step 2: Data Preprocessing
# Text cleaning function
def preprocess_text(text):
    text = re.sub(r"http\S+", "", text)  # Remove URLs
    text = re.sub(r"[^a-zA-Z\s]", "", text)  # Remove non-alphabetic characters
    text = text.lower()  # Convert to lowercase
    text = text.strip()  # Remove leading/trailing spaces
    return text

# Apply preprocessing
data['text'] = data['text'].fillna('').apply(preprocess_text)

# Check the cleaned text

print(data['text'].head())

# Step 3: Exploratory Data Analysis (EDA)

# Visualize class distribution
sns.countplot(x='label', data=data)  # pass the column by keyword; recent seaborn versions require this
plt.title("Class Distribution: Real vs Fake")
plt.show()

# Generate WordCloud for Fake News

fake_news = " ".join(data[data['label'] == 1]['text'])
wc_fake = WordCloud(width=800, height=400, background_color='black').generate(fake_news)

plt.figure(figsize=(10, 5))
plt.imshow(wc_fake, interpolation='bilinear')
plt.axis("off")
plt.title("Most Common Words in Fake News")
plt.show()
# Step 4: Feature Extraction (TF-IDF Vectorization)
# Extract features and labels
X = data['text']
y = data['label']

# Split data into training and test sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# TF-IDF Vectorization
vectorizer = TfidfVectorizer(max_features=5000, stop_words='english')
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)
print("TF-IDF Matrix Shape:", X_train_tfidf.shape)

# Step 5: Model Training

# Train Logistic Regression Model
model = LogisticRegression()
model.fit(X_train_tfidf, y_train)

# Make predictions
y_pred = model.predict(X_test_tfidf)
# Step 6: Model Evaluation
# Evaluate performance
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

print("Classification Report:\n", classification_report(y_test, y_pred))

# Confusion Matrix
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=['Real', 'Fake'],
yticklabels=['Real', 'Fake'])
plt.title("Confusion Matrix")
plt.ylabel("True Label")
plt.xlabel("Predicted Label")
plt.show()
# Step 7: Save the Model
import joblib

# Save the trained model and vectorizer

joblib.dump(model, 'fake_news_model.pkl')
joblib.dump(vectorizer, 'tfidf_vectorizer.pkl')

print("Model and vectorizer saved!")
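
A possible way to reuse a saved model is sketched below. It bundles the vectorizer and classifier into a single scikit-learn Pipeline (imported in Step 1 but not otherwise used in this program), saves it with joblib, and reloads it for prediction. The tiny corpus, its labels, and the file name fake_news_pipeline.pkl are assumptions made only so the sketch runs on its own.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Tiny made-up corpus, only to make the sketch runnable (1 = fake, 0 = real)
texts = [
    "shocking secret cure doctors hate",
    "you will not believe this miracle trick",
    "parliament passes annual budget bill",
    "city council approves new road repairs",
]
labels = [1, 1, 0, 0]

# One object holds both steps, so raw text goes in and a label comes out
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression()),
])
pipeline.fit(texts, labels)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fake_news_pipeline.pkl")  # hypothetical file name
    joblib.dump(pipeline, path)
    restored = joblib.load(path)

new_texts = ["miracle cure revealed"]
restored_pred = restored.predict(new_texts)
print(restored_pred)
```

Saving one pipeline avoids the risk of loading a model together with a mismatched vectorizer. With the two separate files saved above, new text must be passed through the loaded vectorizer's transform method before calling the model's predict method.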
