ST1009 Week2
4/17/2020
IN DESCRIPTIVE STATISTICS…
Collect data
e.g. Survey
Organise and Present data
e.g. Tables and graphs
Analyse data
e.g. Sample mean = (Σ X_i) / n
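As a small illustration of the "Analyse data" step, the sample mean can be computed as the sum of the observations divided by their count. This is only a sketch: the function name `sample_mean` and the five observations are hypothetical and do not come from the slides.

```python
def sample_mean(values):
    """Return the sum of the observations divided by how many there are."""
    return sum(values) / len(values)


# Hypothetical survey responses (illustrative only, not from the slides).
observations = [4, 8, 6, 5, 7]
mean = sample_mean(observations)
print(f"Sample mean = {mean}")
```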
COLLECT DATA
Primary (Data Collection): Observation, Survey, Experimentation
Secondary (Data Compilation): Print or Electronic
TABULAR FORM
ONE WAY TABLE – FREQUENCY TABLE
No. of students admitted to the University in an academic year (2019/2020), according to district.

District     No. of students admitted
Colombo
Gampaha
…
Total
TABULAR FORM
CROSS TABULATIONS – TWO-WAY TABLE
Distribution of ethnicity in different age groups
Distribution of gender of the employees across employment categories
Count
                Employment Category
Gender     Clerical   Custodial   Manager   Total
Female     206        0           10        216
Male       157        27          74        258
Total      363        27          84        474
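The margins of a two-way table are just row and column sums of the cell counts. The sketch below rebuilds the totals of the gender by employment category table from its six cells; the variable names are illustrative choices, not part of the slides.

```python
# Cell counts copied from the gender by employment category table.
counts = {
    "Female": {"Clerical": 206, "Custodial": 0, "Manager": 10},
    "Male": {"Clerical": 157, "Custodial": 27, "Manager": 74},
}
categories = ["Clerical", "Custodial", "Manager"]

# Row totals: one total per gender.
row_totals = {gender: sum(row.values()) for gender, row in counts.items()}

# Column totals: one total per employment category.
col_totals = {c: sum(row[c] for row in counts.values()) for c in categories}

# The grand total is the sum of either margin.
grand_total = sum(row_totals.values())

print(row_totals, col_totals, grand_total)
```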
PICTURE FORMS
BAR CHARTS
The bar chart makes comparisons by means of parallel bars
whose lengths are proportional to the values represented
[Figure: bar chart; x-axis: Age groups (Children, Young adults, Middle Aged adults, Senior citizens)]
[Figure: bar charts of Percentage by Mode of travel (Private vehicle, Rented vehicle, Tourist coach, Hired Jeep, Other)]
LINE CHART
This is useful in particular to emphasize the changes in some variable
occurring during an interval of time.
[Figure: line chart, "No. of students selected to Universities by Year of AL Exam"; x-axis: Year (1993 to 2003); y-axis: Number of students (0 to 16,000)]
PIE CHART
This is useful in showing the breakdown of a total into its component parts. Component parts are expressed as percentages of the total and are represented by segments of a circle whose sizes are proportional to the percentages.

Total export figures in 1993
Type          Rs.('000)   %
Agriculture   14554       57.99
Industrial    8821        35.15
Mineral       1132        4.51
Other         589         2.35
Total         25096       100.00

[Figure: pie chart of the 1993 export figures by type]
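The percentages in the export table, and the segment angles a pie chart would use, can be reproduced from the Rs.('000) figures. This is a minimal sketch; the angle calculation (percentage share of 360 degrees) is the usual construction and is stated here as an assumption, since the slides only say the segments are proportional to the percentages.

```python
# Export figures in Rs.('000), copied from the 1993 table.
exports = {"Agriculture": 14554, "Industrial": 8821, "Mineral": 1132, "Other": 589}

total = sum(exports.values())

# Each component as a percentage of the total, rounded to 2 decimals.
percentages = {k: round(100 * v / total, 2) for k, v in exports.items()}

# Assumed construction: segment angle = share of the full 360-degree circle.
angles = {k: 360 * v / total for k, v in exports.items()}

for k in exports:
    print(f"{k:<12} {percentages[k]:6.2f}%  {angles[k]:6.1f} degrees")
```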
SCATTER DIAGRAM
Scatter diagrams are used to examine the relationship between two continuous variables.
Example: Student marks for two subjects
[Figure: scatter diagram, "Relationship between subject-1 & subject-2"; x-axis: subject-2 (0 to 100); y-axis: subject-1 (0 to 120)]
IMPORTANT!
EXAMPLES…
EXAMPLE…
The following are the ethnicities of the 200 persons in the sample:
Ethnicity    No. of persons
Sinhalese    155
Tamil        25
Muslims      15
Other        5
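One way to prepare these counts for a bar chart or pie chart is to convert them to percentages of the sample. The sketch below does this for the four ethnicity counts; the variable names are illustrative.

```python
# Counts copied from the ethnicity example (sample of 200 persons).
ethnicity_counts = {"Sinhalese": 155, "Tamil": 25, "Muslims": 15, "Other": 5}

sample_size = sum(ethnicity_counts.values())

# Relative frequency of each category, expressed as a percentage.
relative_freq = {k: 100 * v / sample_size for k, v in ethnicity_counts.items()}

for k, pct in relative_freq.items():
    print(f"{k:<10} {pct:5.1f}%")
```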
FREQUENCY DISTRIBUTION
STEPS…
When all intervals have the same width, the following rule
may be used to find the required class interval width:
W = (L - S) / K
where:
W = Class width, L = Largest value,
S = Smallest value, K = No. of classes
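The rule W = (L - S) / K can be tried on concrete numbers. The sketch below takes L = 25.3 and S = 18.1 (the extremes of the ten student ages used in this deck's frequency-table example) and assumes K = 4 classes. Rounding the result up to a convenient whole number is also an assumption, not a rule stated on the slides.

```python
import math


def class_width(largest, smallest, num_classes):
    """W = (L - S) / K for equal-width class intervals."""
    return (largest - smallest) / num_classes


L_value = 25.3  # largest student age in the deck's example
S_value = 18.1  # smallest student age in the deck's example
K = 4           # assumed number of classes

W = class_width(L_value, S_value, K)

# Assumption: round up to a convenient whole-number width.
W_rounded = math.ceil(W)

print(f"W = {W:.2f}, rounded up to {W_rounded}")
```

With these inputs the raw width is about 1.8, and rounding up gives a width of 2.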
EXAMPLE
Suppose the ages of a sample of 10 students are:
20.9, 18.1, 18.5, 21.3, 19.4, 25.3, 22.0, 23.1, 23.9, and 22.5
EXAMPLE
Class Interval Frequency Rel. Freq.
18 ≤ x < 20 3 30%
20 ≤ x < 22 2 20%
22 ≤ x < 24 4 40%
24 ≤ x < 26 1 10%
Total 10 100%
Note that the relative frequencies must sum to 1.00 or 100%. Here, we see that 40% of all students are at least 22 years old but younger than 24 years old.
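The frequency table for the ten student ages can be rebuilt by counting how many values fall in each interval of the form lower ≤ x < upper. This is a minimal sketch using plain loops; the names `edges`, `freq` and `rel_freq` are illustrative.

```python
# The ten student ages from the example.
ages = [20.9, 18.1, 18.5, 21.3, 19.4, 25.3, 22.0, 23.1, 23.9, 22.5]

# Class boundaries: 18 <= x < 20, 20 <= x < 22, 22 <= x < 24, 24 <= x < 26.
edges = [18, 20, 22, 24, 26]

freq = [0] * (len(edges) - 1)
for x in ages:
    for i in range(len(edges) - 1):
        # Lower boundary is included, upper boundary is excluded.
        if edges[i] <= x < edges[i + 1]:
            freq[i] += 1
            break

# Relative frequency as a percentage of the sample size.
rel_freq = [100 * f / len(ages) for f in freq]

for i, (f, r) in enumerate(zip(freq, rel_freq)):
    print(f"{edges[i]} <= x < {edges[i + 1]}: {f}  ({r:.0f}%)")
```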
EXAMPLE
CLASS LIMITS
True classes are those classes such that the upper true limit of
a class is the same as the lower true limit of the next class.
A true class limit is obtained by adding the upper class limit of
one class interval to the lower class limit of the next higher
class interval and dividing by two.
Example:
Stated Limits          True Limits
Rs.600 ≤ x ≤ 799       Rs.599.50 ≤ x ≤ 799.50
Rs.800 ≤ x ≤ 999       Rs.799.50 ≤ x ≤ 999.50
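The true limits in this example can be derived from the stated limits by averaging adjacent limits. The sketch below assumes the gap between consecutive classes (here 1 rupee, from 799 to 800) also applies below the first class and above the last one, which is how the outer limits 599.50 and 999.50 arise; that assumption and the function name are not from the slides.

```python
def true_boundary(upper_limit, next_lower_limit):
    """Average of one class's upper limit and the next class's lower limit."""
    return (upper_limit + next_lower_limit) / 2


# Stated class limits in rupees, copied from the example.
stated = [(600, 799), (800, 999)]

# Boundary shared by the two classes.
shared = true_boundary(stated[0][1], stated[1][0])

# Assumption: the same gap between classes holds at both ends of the table.
gap = stated[1][0] - stated[0][1]
true_limits = [(lo - gap / 2, hi + gap / 2) for lo, hi in stated]

print(shared, true_limits)
```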
CUMULATIVE FREQUENCY
The total frequency of all the values less than the upper class boundary
of a given class interval is called the cumulative frequency up to and
including that class interval.
Cumulative frequency of a class interval = frequency of the class
interval + frequencies of preceding class intervals.
The cumulative frequencies for the previous problem (the frequency table of student ages) are: 3, 5, 9, and 10.
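Cumulative frequencies are running totals of the class frequencies. A short sketch using the frequencies 3, 2, 4 and 1 from the student ages table, with `itertools.accumulate` from the Python standard library:

```python
from itertools import accumulate

# Class frequencies from the student ages frequency table.
frequencies = [3, 2, 4, 1]

# Each cumulative frequency = this class's frequency + all preceding ones.
cumulative = list(accumulate(frequencies))

print(cumulative)
```

The last cumulative frequency always equals the total number of observations.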
HISTOGRAM
EXAMPLE
Class Intervals   Mid Point   Width   U.C.B.   Freq.   Freq. Density   Cumul. Freq.
126-136           131         10      136      3       0.3             3
136-146           141         10      146      5       0.5             8
146-156           151         10      156      5       0.5             13
156-166           161         10      166      5       0.5             18
166-176           171         10      176      2       0.2             20
(U.C.B. = upper class boundary)
[Figure: histogram of the class intervals in the table; y-axis: Frequency (0 to 6)]
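The derived columns of the table (mid point, width, frequency density and cumulative frequency) follow directly from the class boundaries and frequencies. The sketch below recomputes them, taking frequency density as frequency divided by class width, which matches the values shown in the table; the dictionary keys are illustrative names.

```python
# (lower boundary, upper boundary, frequency) for each class in the table.
classes = [(126, 136, 3), (136, 146, 5), (146, 156, 5), (156, 166, 5), (166, 176, 2)]

rows = []
running_total = 0
for lower, upper, freq in classes:
    width = upper - lower
    running_total += freq
    rows.append({
        "mid_point": (lower + upper) / 2,  # centre of the class
        "width": width,
        "ucb": upper,                      # upper class boundary
        "freq": freq,
        "density": freq / width,           # frequency per unit of width
        "cum_freq": running_total,         # running total of frequencies
    })

for r in rows:
    print(r)
```

The mid points and densities are the kind of values a frequency polygon is drawn from, and the upper class boundaries paired with the cumulative frequencies are the points of an ogive.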
FREQUENCY POLYGON
EXAMPLE
[Figure: Frequency Polygon; x-axis: class mid points (121 to 181); y-axis: Frequency Density (0 to 0.6)]
OGIVE
[Figure: Ogive; x-axis: Upper class boundaries (126 to 176); y-axis: Cumulative frequency (0 to 25)]
FREQUENCY CURVE
When the number of intervals gets large, the frequency polygon
consists of a large number of line segments and approaches a
smooth curve known as a frequency curve, i.e. the frequency
curve is obtained by smoothing the frequency polygon.
The frequency curve is useful for getting an idea of the shape
of the frequency distribution.
SMOOTHING OF DISTRIBUTION
2. Kurtosis:
Kurtosis is a measure of whether the data are peaked or flat relative to a
normal distribution.
Data sets with high kurtosis tend to have a distinct peak near the mean,
decline rather rapidly, and have heavy tails.
Data sets with low kurtosis tend to have a flat top near the mean rather
than a sharp peak.
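To make the peaked versus flat contrast concrete, one common convention is the moment-based excess kurtosis, m4 / m2² - 3, which is 0 for a normal distribution. The slides do not give a formula, so this choice is an assumption, and statistical packages may apply sample-size corrections that give slightly different values. Both data sets below are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (0 for a normal curve)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


# Hypothetical data: values spread evenly, giving a flat top.
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Hypothetical data: most values at the mean, with two far-out values.
peaked = [1, 5, 5, 5, 5, 5, 5, 5, 9]

print(f"flat:   {excess_kurtosis(flat):.2f}")
print(f"peaked: {excess_kurtosis(peaked):.2f}")
```

Under this convention the flat data set gives a negative value and the peaked, heavy-tailed one gives a positive value.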
EXAMPLE
A department store has its own credit card accounts. The department
randomly selects 40 accounts and records the number of days within
which the bill is paid:
16 9 5 8 6 10 16 4 11 4
3 19 21 16 15 24 45 11 8 19
37 59 14 72 3 22 10 6 14 11
20 9 16 6 75 21 7 15 12 10
EXAMPLE