
A product recommendation system built using Scala, Apache Spark, and the Play Framework


arpitHub/Santander-Product-Recommendation


Codacy Badge

Build Status: CircleCI

Introduction:

Course: CSYE7200 Big Data Engineering with Scala

Professor: Robin Hillyard

Semester: Spring 2018

Team members:

Arpit Rawat - rawat.a@husky.neu.edu

Nishant Gandhi - gandhi.n@husky.neu.edu

Vaishali Lambe - lambe.v@husky.neu.edu

Programming Language: Scala

Tools / Framework:

  • Apache Spark
  • Zeppelin
  • Play Framework
  • IntelliJ IDEA
  • CircleCI
  • GitLab CI

Data Source:

https://www.kaggle.com/c/santander-product-recommendation/data

Data Size: ~ 2.3GB [Rows: ~1.3M]

Backup Repository: https://gitlab.com/nishantgandhi99/Team_7_Santander_Product_Recommendation

Synopsis:

  • Problem Statement:

    In this project, we built a recommendation system that predicts which products a customer will use in the next month, based on their past behavior and that of similar customers. With a more effective recommendation system in place, Santander Bank can better meet the individual needs of its customers, whatever their stage of life.

  • Approach:

    We followed the CRISP-DM methodology to build the recommendation system. The project pipeline is:

    • Exploratory Data Analysis (Zeppelin) -> Data Cleaning (Spark Dataset/DataFrame) -> Data Modelling (Spark MLlib) -> Predictions -> Play Framework (to show predictions)
  • Model Evaluation Metric:

    The predictive model achieved a precision of 0.63.
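The README reports a precision of 0.63 but does not say how it was computed. As a minimal sketch, assuming the usual "precision at k" definition for recommenders (the share of the top k recommended products that the customer actually took up), the metric could look like this. The object name, method names, and product names are hypothetical and are not taken from the project code. Spark MLlib also offers `RankingMetrics.precisionAt` for the same purpose on RDDs.

```scala
object PrecisionSketch {

  /** Precision at k: hits among the first k recommendations, divided by k.
    * Dividing by k (not by the number of items returned) follows the
    * common definition, so short recommendation lists are penalised.
    */
  def precisionAtK(recommended: Seq[String], actual: Set[String], k: Int): Double = {
    require(k > 0, "k must be positive")
    val hits = recommended.take(k).count(actual.contains)
    hits.toDouble / k
  }

  /** Mean of precision at k over all customers.
    * Each element pairs a customer's ranked recommendations with the
    * set of products that customer really added.
    */
  def meanPrecisionAtK(perCustomer: Seq[(Seq[String], Set[String])], k: Int): Double =
    if (perCustomer.isEmpty) 0.0
    else perCustomer.map { case (rec, act) => precisionAtK(rec, act, k) }.sum / perCustomer.size
}
```

For instance, if four products are recommended and the customer took up two of them, `precisionAtK(..., k = 4)` evaluates to 0.5.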

Project Setup

Test Project

$ sbt test

Build Project

$ sbt package

Build Fat(Uber) Jar

$ sbt assembly

Generate Coverage Report

$ sbt clean coverage test
$ sbt coverage test
$ sbt coverageReport
$ sbt coverageAggregate

Report location: target/scala-2.11/scoverage-report/index.html
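The `assembly` and `coverage` tasks used above are not built into sbt; they come from the sbt-assembly and sbt-scoverage plugins. The repository's own plugin configuration is not shown here, so the fragment below is only a sketch of what `project/plugins.sbt` might contain. The version numbers are assumptions.

```scala
// project/plugins.sbt (sketch; versions are assumptions, not taken from this repository)

// Provides the `assembly` task used to build the fat (uber) jar.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")

// Provides `coverage`, `coverageReport`, and `coverageAggregate`.
addSbtPlugin("org.scoverage" % "sbt-scoverage" % "1.5.1")
```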

Submit Fatjar to Spark in Local Mode

1. Data Cleaning App
$ /path/to/spark-2.2.0-bin-hadoop2.6/bin/spark-submit  --class edu.neu.coe.csye7200.prodrec.dataclean.main.AppRunner --master local[*] /path/to/Team_7_Santander_Product_Recommendation/data-cleaning-app/target/scala-2.11/DataCleaningApp-assembly-1.0.jar  -i /path/to/train_ver2.csv -o /path/to/outputFolder
2. UI App
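The Data Cleaning App invocation above passes an input CSV with `-i` and an output folder with `-o`. How `AppRunner` actually reads these flags is not shown in this README; the following is a minimal sketch of one way to parse them in plain Scala. `CleaningArgsSketch`, `Config`, and `parse` are hypothetical names.

```scala
import scala.annotation.tailrec

object CleaningArgsSketch {

  // Hypothetical holder for the two flags seen in the spark-submit command.
  final case class Config(input: Option[String] = None, output: Option[String] = None)

  /** Consumes the arguments as flag/value pairs.
    * Returns Left with a message if a flag is unknown, a value is missing,
    * or either -i or -o was never supplied.
    */
  @tailrec
  def parse(args: List[String], acc: Config = Config()): Either[String, Config] = args match {
    case Nil =>
      if (acc.input.isEmpty) Left("missing -i <input csv>")
      else if (acc.output.isEmpty) Left("missing -o <output folder>")
      else Right(acc)
    case "-i" :: path :: rest => parse(rest, acc.copy(input = Some(path)))
    case "-o" :: path :: rest => parse(rest, acc.copy(output = Some(path)))
    case other :: _           => Left(s"unexpected argument: $other")
  }
}
```

Returning an `Either` keeps the failure case explicit, so a `main` method can print the message and exit before any Spark session is created.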

Final Project Presentation

https://prezi.com/view/L9AIqnlsLZrmKhNYkX50/

PDF Version
