Introduction To Data and Statistics With R
HELLO!
I am Elijah Appiah from
Ghana.
I am an Economist by
profession.
I love everything about R!
Lecture Series
Introduction to Data and Statistics
Foundations of Probability
Inferential Statistics
Lesson Goal
Introduce statistics as a science of
understanding and analyzing data
and making data-based decisions.
Statistics
Statistics - Data
Data
Numeric (Quantitative)
Discrete – counts
e.g. number of cylinders of a vehicle
Continuous – measurements that can take any value within an interval
e.g. height, weight
Categorical (Qualitative)
Nominal – names, labels, categories with no natural order
e.g. gender, countries
Ordinal – categories with a natural order
e.g. Likert scales
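The four data types above can be sketched in R as follows. The values are made up purely for illustration, and representing nominal and ordinal data as factors is one common convention, not the only option.

```r
# Hypothetical example values, chosen only to illustrate each data type.
cylinders <- c(4L, 6L, 8L, 4L)                # discrete: integer counts
height_cm <- c(172.5, 160.2, 181.0, 168.4)    # continuous: measurements
country   <- factor(c("Ghana", "Kenya", "Ghana", "Togo"))  # nominal: no order

# Ordinal: a factor whose levels carry an explicit order (Likert-style scale).
agreement <- factor(c("Agree", "Neutral", "Disagree", "Agree"),
                    levels = c("Disagree", "Neutral", "Agree"),
                    ordered = TRUE)

is.integer(cylinders)    # is this stored as whole-number counts?
is.numeric(height_cm)    # is this numeric?
is.ordered(country)      # nominal factors are not ordered
is.ordered(agreement)    # ordinal factors are ordered

# Comparisons such as ">" are only meaningful for ordered factors.
agreement[1] > agreement[2]
```

Declaring the level order explicitly matters: without `levels =`, R sorts levels alphabetically, which would not match the intended scale.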
[Figure: a normal distribution and a right-skewed distribution]
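The two shapes can be simulated in R to see how they differ. This is a minimal sketch: the sample size, the seed, and the choice of an exponential distribution as the right-skewed example are all assumptions made for illustration.

```r
set.seed(1)  # fixed seed so the simulation is reproducible

# Simulated data (not from the lecture): one symmetric, one right-skewed.
normal_data  <- rnorm(1000, mean = 0, sd = 1)
right_skewed <- rexp(1000, rate = 1)

# In a symmetric distribution the mean and median are close together;
# in a right-skewed one the long right tail pulls the mean above the median.
c(mean = mean(normal_data),  median = median(normal_data))
c(mean = mean(right_skewed), median = median(right_skewed))

# Side-by-side histograms of the two shapes.
old_par <- par(mfrow = c(1, 2))
hist(normal_data,  main = "Normal",       xlab = "Value")
hist(right_skewed, main = "Right Skewed", xlab = "Value")
par(old_par)
```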
Population vs Sample
Population – entire group you want
to draw conclusions about
e.g. income of all countries in the world
Sample – specific group from the
population used for inference
e.g. income of countries in Africa
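The relationship between a population and a sample can be sketched with simulated data. Everything here is hypothetical: the incomes are generated, not real, and a simple random sample is assumed because it is the easiest case to show with `sample()`.

```r
set.seed(42)  # fixed seed so the draw is reproducible

# Hypothetical population: simulated incomes for 195 countries (not real data).
population <- rlnorm(195, meanlog = 9, sdlog = 1)

# Simple random sample of 30 countries, drawn without replacement.
sample_income <- sample(population, size = 30)

# The sample mean is an estimate of the population mean.
mean(population)
mean(sample_income)
```

A sample chosen by region rather than at random may not represent the whole population, which is worth keeping in mind when drawing inferences from it.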
Measures of Spread
Data Variability
[Figure: two distributions, one with Mean = 0, SD = 1 and one with Mean = 0, SD = 2]
Measures of Spread
Range: maximum – minimum, i.e. max(x) – min(x)
Note: range(x) returns the minimum and the maximum, not their difference
Variance: var(x)
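The measures of spread can be computed in base R as sketched below. The vector `x` is a small made-up example; `sd()` and `IQR()` are included alongside the range and variance because they are the other base R measures of spread.

```r
x <- c(1, 2, 3, 4, 5, 6)  # small illustrative data set

range(x)           # returns two values: the minimum and the maximum
max(x) - min(x)    # the range as a single number
diff(range(x))     # same result, computed from range()

var(x)             # sample variance (divides by n - 1)
sd(x)              # standard deviation, the square root of the variance
IQR(x)             # interquartile range: 75th minus 25th percentile
```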
Robust Statistics
A measure least affected by extreme
values
Robust Statistics
Robust measures of Center & Spread
Example:
Data              Mean     Median
1,2,3,4,5,6       3.5      3.5
1,2,3,4,5,1000    169.17   3.5
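The example can be reproduced in R with the same two data sets. The variable names `x` and `y` are chosen here for illustration.

```r
x <- c(1, 2, 3, 4, 5, 6)      # no extreme values
y <- c(1, 2, 3, 4, 5, 1000)   # same data with one extreme value

# Replacing a single value with an outlier changes the mean a great deal
# but leaves the median untouched.
mean(x)
median(x)
round(mean(y), 2)
median(y)
```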
Robust Statistics
Median is a more robust statistic of
center than the mean.
Robust Statistics
Robust statistics like the median and
IQR are most useful for describing
skewed distributions.
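The same idea applies to spread. As a minimal sketch using two small made-up vectors, the standard deviation reacts strongly to one extreme value while the IQR does not.

```r
x <- c(1, 2, 3, 4, 5, 6)      # no extreme values
y <- c(1, 2, 3, 4, 5, 1000)   # one extreme value

# Standard deviation: inflated by the outlier.
sd(x)
sd(y)

# Interquartile range: depends only on the middle half of the data,
# so the outlier in y does not change it.
IQR(x)
IQR(y)
```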
Data Transformation
Rescaling data
Logarithmic Transformation
Square Root Transformation
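These transformations can be sketched in base R. The income values are hypothetical, and reading "rescaling" as standardization (z-scores) or min-max scaling is an assumption; the lecture may intend a different rescaling.

```r
# Hypothetical right-skewed data (not from the lecture).
income <- c(500, 1200, 3000, 8000, 45000, 120000)

# Logarithmic transformation: compresses large values, needs positive data.
log_income <- log(income)

# Square root transformation: a milder compression, needs non-negative data.
sqrt_income <- sqrt(income)

# Rescaling, option 1: standardize to mean 0 and standard deviation 1.
z_income <- as.numeric(scale(income))

# Rescaling, option 2: min-max scaling to the interval [0, 1].
minmax_income <- (income - min(income)) / (max(income) - min(income))

summary(income)
summary(log_income)
```

The log and square root transformations change the shape of the distribution, while the two rescaling options only change its location and scale.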
Exploring Categories
Data
titanic {ggmosaic}
Passengers and crew on board the Titanic
Description
A dataset containing some demographics and survival of people
on board the Titanic
Variables: Class (1st, 2nd, 3rd, crew); Sex (Male, Female);
Age (Child, Adult); Survived (Yes, No)
Exploring Categories
One Categorical Variable
Frequency Table
Bar Plots
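A frequency table and bar plot for one categorical variable might look like the sketch below. As an assumption to keep the example free of extra packages, it uses base R's built-in `Titanic` table (same variables, already cross-tabulated) as a stand-in for the `titanic` data frame from ggmosaic.

```r
# Assumption: base R's `Titanic` table stands in for ggmosaic's `titanic`.
# Summing over the other dimensions gives the frequency table for Class.
class_counts <- margin.table(Titanic, margin = 1)
class_counts

# Relative frequencies (proportions) for the same variable.
prop.table(class_counts)

# Bar plot of the frequency table.
barplot(class_counts,
        main = "People on board the Titanic by class",
        xlab = "Class", ylab = "Number of people")
```

With a data frame of individual records, such as the one in ggmosaic, `table()` applied to the column would be the usual way to build the same kind of frequency table.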
Exploring Categories
Two Categorical Variables
Contingency Table
Stacked Bar Plots
Clustered Bar Plots
Mosaic Plots
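The four displays for two categorical variables can be sketched in base R. As an assumption, the example again uses base R's built-in `Titanic` table instead of the ggmosaic data frame, and base graphics instead of ggplot2.

```r
# Assumption: base R's `Titanic` table stands in for ggmosaic's `titanic`.
# Contingency table of Class (dimension 1) by Survived (dimension 4).
class_by_survival <- margin.table(Titanic, margin = c(1, 4))
class_by_survival

# Row proportions: share of each class that did or did not survive.
prop.table(class_by_survival, margin = 1)

# barplot() draws one bar per column, so transpose to get one bar per class.
counts <- t(class_by_survival)

# Stacked bar plot: survival outcomes stacked within each class.
barplot(counts, legend.text = TRUE,
        main = "Survival by class (stacked)", xlab = "Class")

# Clustered bar plot: survival outcomes side by side within each class.
barplot(counts, beside = TRUE, legend.text = TRUE,
        main = "Survival by class (clustered)", xlab = "Class")

# Mosaic plot: tile areas are proportional to the cell counts.
mosaicplot(class_by_survival, main = "Survival by class", color = TRUE)
```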
Exploring Categories
One Numerical and One Categorical
Box Plot
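A box plot compares a numerical variable across the levels of a categorical one. The Titanic data has no numerical variable, so this sketch assumes base R's built-in `mtcars` data instead, with fuel economy (`mpg`) as the numerical variable and number of cylinders (`cyl`) as the categorical one.

```r
# Assumption: `mtcars` is used for illustration; it is not the lecture's data.
# One box per cylinder group, showing median, quartiles, and outliers.
boxplot(mpg ~ cyl, data = mtcars,
        main = "Fuel economy by number of cylinders",
        xlab = "Number of cylinders", ylab = "Miles per gallon")

# The medians drawn inside the boxes can also be computed directly.
group_medians <- tapply(mtcars$mpg, mtcars$cyl, median)
group_medians
```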
Any questions?
Reach me anytime!
Email
eappiah.uew@gmail.com
LinkedIn
https://www.linkedin.com/in/appiah-elijah-383231123/