This repository contains all the key projects, research work, and implementations I developed during my M.Sc. in Applied Statistics and Data Science at Jahangirnagar University, Dhaka, Bangladesh.

kowshik14/MSc-Projects

Master's Project Portfolio

Welcome to my Master's Project Portfolio repository! This repository contains all the key projects, research work, and implementations I developed during my M.Sc. in Applied Statistics and Data Science at Jahangirnagar University, Dhaka, Bangladesh.

About Me

I am Kowshik Sankar Roy, a passionate Data Scientist with a background in Electronics and Communication Engineering and an M.Sc. in Applied Statistics and Data Science. My work applies machine learning, deep learning, and statistical modeling to real-world problems. The projects I developed during my master's are collected in this repository.

Feel free to connect with me for collaborations or discussions!

Projects

  • Description: Developed a hybrid model named TweetGuard that combines Transformer and Bi-LSTM architectures for detecting fake news in large-scale tweets.
  • Key Contributions:
    • Designed a robust text preprocessing pipeline using BERTweet tokenization.
    • Conducted ablation studies to evaluate the performance of individual model components.
    • Demonstrated superior accuracy on the TruthSeeker dataset.
  • Description: Developed a deep learning model that utilizes PCA for dimensionality reduction and Bayesian optimization for hyperparameter tuning to detect fraud in supply chain analytics.
  • Key Contributions:
    • Achieved a 94.71% fraud detection rate with 99.42% overall accuracy on the DataCo dataset.
    • Implemented SMOTE to handle class imbalance.
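
The SMOTE step mentioned above can be illustrated with a minimal, standard-library sketch. It is not the project's code: the function name `smote`, the toy `fraud` points, and the choice of `k=3` are assumptions made for illustration. The idea is that each synthetic sample is placed at a random position on the line segment between a minority-class point and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical toy minority class (e.g. fraudulent orders in two features)
fraud = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(fraud, n_new=6)
```

Because every synthetic point is an interpolation between two real minority points, oversampling of this kind should be applied to the training split only, so that no synthetic neighbours of test samples leak into training.
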
  • Description: Created a hybrid recommendation system that leverages Deep Collaborative Filtering and TF-IDF Content-Based Filtering to provide personalized anime recommendations.
  • Key Contributions:
    • Overcame the cold-start problem through hybrid filtering.
    • Enhanced user experience by providing accurate, diverse recommendations.
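
How the two filtering signals are combined is not described here, so the following is only a sketch under stated assumptions: a hand-rolled TF-IDF with cosine similarity for the content side, and a hypothetical `hybrid_score` function that blends a collaborative score with content similarity using a weight `alpha`. When no collaborative score exists (a cold-start item), it falls back to content similarity alone. The synopses are made-up toy data.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using tf * smoothed idf."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def hybrid_score(content_sim, collab_score=None, alpha=0.5):
    """Blend both signals; use content similarity alone for cold-start items."""
    if collab_score is None:
        return content_sim
    return alpha * collab_score + (1 - alpha) * content_sim


# Hypothetical synopsis keywords for three titles
synopses = [
    "space pirates action adventure",
    "space opera action drama",
    "high school romance comedy",
]
vecs = tfidf_vectors(synopses)
sim_to_first = [cosine(vecs[0], v) for v in vecs]
```

In this toy data the second synopsis shares terms with the first and the third shares none, so a new title with no ratings can still be ranked by content alone.
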
  • Description: Conducted an in-depth analysis of time series data using ARIMA models, achieving stationarity through differencing and transformations. Used stationarity tests (ADF and KPSS), model selection criteria (AIC and BIC), and forecast accuracy measures to select models and produce robust predictions.
  • Key Contributions:
    • Achieved stationarity in time series data through systematic differencing and transformations.
    • Employed statistical tests (ADF, KPSS) for rigorous validation of stationarity.
    • Selected optimal ARIMA models using AIC, BIC, and forecasting accuracy measures.
    • Generated accurate forecasts, contributing to improved decision-making in time series analysis.
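
Two of the steps above, differencing and information-criterion model selection, can be sketched in plain Python. The helper names (`difference`, `aic`, `bic`) and the toy series are assumptions for illustration, not the project's code. The criteria follow the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters, n the number of observations, and L the maximized likelihood.

```python
import math


def difference(series, lag=1):
    """Return the lag-differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; penalizes parameters more as n_obs grows."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# A linear trend becomes constant after one difference (d = 1)
trend = [3.0 + 2.0 * t for t in range(8)]
first_diff = difference(trend)

# A quadratic trend needs two differences (d = 2)
quadratic = [float(t * t) for t in range(8)]
second_diff = difference(difference(quadratic))
```

In practice, the ADF and KPSS tests would be run on the differenced series with a statistics library to confirm stationarity; this overview does not say which library the project used.
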
  • Description: Performed unsupervised clustering on grocery firm customer data to identify distinct segments for targeted product development and personalized marketing strategies. Used PCA for dimensionality reduction and compared K-Means, Hierarchical Clustering, and DBSCAN for optimal segmentation.
  • Key Contributions:
    • Applied dimensionality reduction techniques like PCA to streamline the data for efficient analysis.
    • Performed comparative analysis using K-Means, Hierarchical (Agglomerative) Clustering, and DBSCAN algorithms to identify optimal customer clusters.
    • Evaluated the performance of each algorithm based on clustering quality metrics such as silhouette score and Davies-Bouldin index.
    • Visualized the clusters and patterns using Matplotlib and Seaborn to provide business insights.
    • Provided actionable recommendations on product development and personalized marketing strategies based on the segmentation results.
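
The silhouette score used to compare the clustering algorithms can be written out in a few lines. This is a minimal sketch with Euclidean distance and hypothetical toy data, not the project's evaluation code. It assumes at least two clusters are present. For each point, `a` is the mean distance to the other members of its own cluster and `b` is the smallest mean distance to any other cluster; the coefficient is (b - a) / max(a, b).

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical customers described by two scaled features
customers = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
good_score = silhouette_score(customers, [0, 0, 0, 1, 1, 1])
bad_score = silhouette_score(customers, [0, 1, 0, 1, 0, 1])
```

Scores close to 1 indicate compact, well-separated clusters, so the labelling that respects the two visible groups scores higher than the interleaved one. The Davies-Bouldin index works in the opposite direction: lower values indicate better separation.
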

Technologies Used

  • Languages and Platforms: Python, R, SPSS, Hadoop
  • Frameworks: TensorFlow, PyTorch, Scikit-learn
  • Tools: Google Colab, Git, Jupyter Notebooks, RStudio
  • Techniques: Machine Learning, Deep Learning, Data Mining, Statistical Tests, Time Series Analysis, Big Data, Regression, Clustering

Acknowledgments

I would like to express my gratitude to my academic advisors, peers, and collaborators who contributed to the success of these projects. Special thanks to Jahangirnagar University for providing the platform to carry out this research.

License

This repository is licensed under the MIT License. See the LICENSE file for more details.
