Sankhya Data Science Course

Sankhya Analytical Research

Decision making with statistical intelligence

Diploma in Data Science

Private and confidential


Course Features

• Course design based on 12+ years industry experience

• Exposure to real world analytics

• Hands-on working on R and Python

• Business case examples from domains such as BFSI, FMCG, Retail, HR, Telecom, Pharmaceutical, etc.

• Course guided by highly skilled and experienced trainers

Training Program Framework

Focus Areas

A data scientist represents an evolution from the data analyst role.

A strong foundation in Statistics remains a formal requirement.

A data scientist’s skill set:

• Analytics skills
• Domain knowledge
• Programming skills
• Communication skills

Course Structure

• Classroom training: 40 hours
• Assignments: 60 hours
• Project work: 80 hours
• Evaluation: 20 hours
• Total: 200 hours

Module 0: Introduction

• Introduction to Data Science
• Business Analytics success stories
• Exposure to tools like R, RStudio, MS Excel, MySQL
• Scope and opportunities
• Fundamentals of Statistics

Module 1: Exploratory Data Analysis

Topics:
• Data Management
– Import, sort, merge, aggregate, subset, derive
– Introduction to MySQL
• Descriptive Statistics
– Central tendency
– Variation
– Shape
• Visualisation
– Bar charts/histograms
– Box-whisker plot
– Contour plot
– Motion chart

Relevance:
• Critical for successful analytics implementation
• Good data management helps to
– assess quality of data
– improve the quality of data
– make data analysis ready
• Provides data insights
• Guides towards solving business research problems using advanced analytics

Duration: 4 hrs

Module 2: Statistical Inference

Topics:
• Distribution Theory and Hypothesis Testing
– Discrete distributions
– Continuous distributions
– Parametric tests
– Non-parametric tests
– Analysis of Variance
– Analysis of Covariance

Relevance:
• Powerful tool for testing a researcher’s claim in a planned experiment
• Wide application in clinical, market and social research
• Marketing campaigns can be designed and tested before full-fledged implementation

Duration: 4 hrs

Module 3: Predictive Modelling - Fundamentals

Topics:
• Basics of modelling
– Modelling framework
– Best practices
• Multiple Linear Regression
– Mathematical model
– Validating assumptions
– Residual analysis
– Multicollinearity problem
– Out of sample validation

Relevance:
• Growing area in Risk Management & Marketing
• Cross selling/up-selling can be done scientifically
• Financial institutions can predict ‘bad’ customers
• Huge scope in e-commerce business

Duration: 8 hrs

Module 4: Predictive Modelling - Advanced

Topics:
• Categorical Response Variable
– Binary logistic regression
– Multinomial logistic regression
– Ordinal logistic regression
– Poisson regression (modelling count response variable)
– Cox regression

Relevance:
• Categorical response variables are frequently encountered in real-world scenarios
• Most widely used class of predictive modelling
• Response to offer: Yes/No
• Brand preference: iPhone/Samsung/Sony

Duration: 6 hrs

Module 5: Time Series Analysis

Topics:
• Time Series Modelling
– AR models
– MA models
– ARIMA
– ARCH
– GARCH
– Time Series regression
– Exponential smoothing

Relevance:
• Set of models for forecasting sales, financial indices and economic indices
• Inflation rate and GDP are predicted using time series modelling
• Nifty/Sensex future values can be estimated
• Complex financial models are developed using ARCH & GARCH

Duration: 4 hrs

Module 6: Unsupervised Multivariate Methods

Topics:
• Segmentation and data reduction
– k-means clustering (algorithm and selection of best cluster solution)
– Principal component analysis
– Principal component regression
– Factor analysis
– Multidimensional scaling

Relevance:
• Provide exploratory segments of customers, stores & agents
• PCA/Factor Analysis are powerful techniques for dimension reduction and scoring models
• PCA is used to resolve the multicollinearity problem in regression models

Duration: 4 hrs

Module 7: Machine Learning

Topics:
• Machine Learning Algorithms
– Naïve Bayes
– Support vector machines
– Decision tree
– Random Forest algorithm
– Neural networks
– Association rules
– Introduction to KNIME and RATTLE

Relevance:
• New generation algorithms
• Multiple methods can be used to decide the best predictive model
• Discover hidden patterns which may not be revealed by classical methods

Duration: 6 hrs

Module 8: Text Mining

Topics:
• Data Management using Python
• Exploring Pandas library
• Connecting R with Python to perform Statistical Analysis
• Sentiment analysis of Facebook and Twitter data (Text Mining)

Relevance:
• Volume & velocity of the data are humungous
• Platform for analytics implementation in cloud environment
• Combines unstructured data with structured data

Duration: 4 hrs

Assessment
• Assignments: 25 marks
• Individual presentation and Viva: 25 marks
• Project work: 50 marks
• Module tests: 200 marks
• Total: 300 marks

Eligibility Criteria

• Must have studied Mathematics at 10+2 level (or during Graduation years)

• Final Year Graduation/Graduate/Post Graduate/Ph.D. students from any background

• Currently Pursuing Post Graduation/Ph.D from any background

• Working Professionals from any background

Course Duration: 6 months (can be completed in 3 months)

About Sankhya

• We are a knowledge process outsourcing firm based in Mumbai and London, providing training and consulting in the field of Advanced Analytics

• Our vision is to empower our Data Science students to leverage the power
of data and make data-driven decisions in business and research

• Our promoter team has a cumulative experience of 40+ years

• We offer a high level of expertise in Statistics coupled with proficiency in R, SAS and SPSS

• Our clientele includes reputed organisations across industry verticals: BFSI, FMCG and Pharmaceutical

Sankhya’s Legacy

• 50+ clients
• 100+ projects
• 50+ corporate training programs
• 12 research papers
• PGDDS at St. Xavier’s College, Mumbai

Our Esteemed Clientele

Vinayak Deshpande
• M.Sc. Statistics, Mumbai University (1997), First Rank
• Professor, Ramnarain Ruia College (1997-2004)

• Founder, Managing Director: Sankhya Analytical Research Pvt. Ltd. (2004)


• Director: Medicounts Life Sciences Pvt. Ltd. (2013)
• Principal Advisor: Direxions Marketing Solutions Pvt. Ltd. (2013)

• Provided services to reputed clients in India, USA and Korea in BFSI, Retail, FMCG and Pharma domains

• Published papers on biostatistics, marketing analytics and econometrics

• Member, Board of Studies, St. Xavier’s College

• Worked as a member of the Independent Ethics Committee for 2 clinical trials

• Recent research papers:
1) Evaluating Principal Component Regression in case of severe Multicollinearity.
2) Comparison of predictive modeling methods: Naïve Bayes, Logistic Regression and Support Vector Machines

Sheetal Joshi
• M.A. Economics, Mumbai University (2004)

• Director: Sankhya Analytical Research Pvt. Ltd. (2005-2010, 2013 onwards)

– Instrumental in providing data-driven solutions in Marketing, Market Research, Risk Management and Clinical Research for clients across the BFSI sector in India and USA
– Highly experienced in designing and delivering corporate training programs in SAS/R/Business Analytics
– Customized the clinical data management tool OpenClinica for a global clinical trial

• Started Market Management department at Bajaj Allianz General Insurance

– Responsible for initiatives in Customer Lifetime Value Management, Brand Performance Measurement and Net Promoter Score
– Developed platform for Lead Management System, created single customer view and campaign management (2011-12)

Sankhya Analytical Research

www.SankhyaAnalytics.com
