Welcome to IST 380! Data Science Programming
Data Science Programming
"When the course was over, I knew it was a good thing." - New York Times Review of Courses
an advocate of concrete computing - and HMC's mascot
About myself
Contact Information: dodds@cs.hmc.edu, 909-607-0867
Office Hours: Friday mornings, 9-11 am, or set up a time...
HMC Beckman B111
TMI?
What is it?
Data Science Venn Diagram
Hmmm… where am I on this diagram?
Data?!
• Neighbor's name
• A place they consider home
Is "Data Science" important or just trendy?
Data Science concerns
Hmmm… the companies are expanding as fast as the data!
There's certainly a lot of it!
Data, data everywhere…
[Figure: data volumes on a logarithmic scale; labeled values: 120 PB, 1 Exabyte, 5 EB, 800 EB]
wisdom
knowledge
information
data
Big Data?
What? Why?
Data Science
Programming Data Rules
Make3d
Andrew Ng ~ Computers and Thought award, 2009
Learning to Powerslide
Stanford's Autonomous Vehicles project (Thrun et al.)
Learning ground from obstacles
classification, segmentation
Motivation
Recommender Systems
predicting movie ratings
Netflix Prize
(I don't know this guy) Bob Bell, winner of the "Netflix prize"
Broad background:
Final project ~ open-ended with datasets of your choice
Programming: R
www.r-project.org/
Homepage
Go to the course page http://www.cs.hmc.edu/~dodds/IST380/
1 week + 1 day…
Homework Assignments
~ 2-5 problems/week, ~ 100 points; extra credit, often
Due Tuesday of the following week by 11:59 pm.
Assignment 1 due Tuesday, February 5.
statistical modeling
Weeks 6-10: "Machine Learning"
support vector machines (SVMs)
nearest neighbors (NN)
random forests
k-means algorithm
No breaks?!
Final project
• the last ~4 weeks will work towards a larger, final project
• there will be a short design phase and a short final presentation
• choose your own problem to study (I'll have some suggestions, too.)
• I'd encourage you to connect R and our Data Science techniques
to other datasets or projects that you use/need/like, etc.
Academic Honesty
This course operates under CGU's (and all of the Claremont schools')
Academic Honesty policies…
• Your work must be your own. This must be true for the whole
team, if you're working in a pair.
Getting started!
6 * 7
rnorm(10)
x <- 380
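As a rough sketch of what these three first expressions do at the R prompt, the snippet below runs each one with a comment. The call to set.seed and the variable name samples are additions for illustration (they are not on the slide); they only make the random draws repeatable and give them a name.

```r
# Arithmetic: R evaluates the expression and prints the result
6 * 7                  # prints [1] 42

# rnorm(10) draws 10 random values from a standard normal distribution.
# set.seed() is an illustrative addition so the draws are repeatable.
set.seed(380)
samples <- rnorm(10)
length(samples)        # prints [1] 10

# <- is R's assignment operator; typing the name prints its value
x <- 380
x                      # prints [1] 380
```

Without set.seed, each call to rnorm(10) returns a different set of ten values.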
5) You can submit again – all copies are saved… troubles? email me!
This webserver can be spacey -- I should know!
Creating a vector?
Printing?
Comments?
R types
c ~ concatenate
virginica setosa
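The questions above (creating a vector, printing, comments, R types, and c for concatenate) can be sketched in a few lines of R. The variable names v and w are hypothetical, and the use of the built-in iris dataset is an assumption based on the species names "virginica" and "setosa" on the slide.

```r
# Comments start with a hash mark and run to the end of the line

# c() concatenates its arguments into a vector
v <- c(3, 8, 0)
w <- c(v, 42)          # concatenating a vector and a value gives a longer vector

# Printing: typing a name at the prompt prints it; print() does so explicitly
print(w)               # prints [1]  3  8  0 42

# R types: class() reports the type of a value
class(w)               # "numeric"
class("setosa")        # "character"
class(TRUE)            # "logical"

# Assumption: the species names refer to R's built-in iris dataset,
# whose Species column is a factor with three levels
levels(iris$Species)   # "setosa" "versicolor" "virginica"
```

A factor such as iris$Species stores categorical labels, which is why class(iris$Species) reports "factor" rather than "character".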
Today's lab:
smaller details
machine specifics
www.acm.org/education/curric_vols/CC2005_Final_Report2.pdf
CS vs. IS and IT?
Week 10: classes vs. objects
Week 11: methods and data
Week 12: inheritance
Week 13: final projects
Week 14: final projects
Week 15: final exam