Types of Data & Descriptive Statistics
DESCRIPTIVE STATISTICS
OVERVIEW
Operational definitions
Descriptive statistics
Categorical variables
Involves placing observations into categories
Do our participants cry during EastEnders – YES or NO?
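[Diagram: operationalising a construct gives a definition, a type of variable and a type of data. Categorical variables yield nominal data; measured variables yield ordinal or interval/ratio data.]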
TYPES OF DATA
Nominal data
Names and categorises observations
1 = did cry, 2 = did not cry
Ordinal data
Orders observations using some kind of scale
Rank these programmes from most to least depressing
Interval data
Measures observations in terms of units of numerical difference
Score on a depression scale
Ratio data
Same as interval level, but with a true, non-arbitrary zero-point
Number of comfort chocolates eaten
TYPES OF DATA
Ordinal
Less informative
Mary cried > Bruce
Nominal
Least informative
Mary and Bruce both cried
However, information is lost as we reduce data to a lower level.
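The loss of information can be illustrated with a short Python sketch. The minutes of crying below are invented for illustration (the slides give no interval/ratio values for Mary and Bruce), and the variable names are hypothetical.

```python
# Hypothetical ratio-level data: minutes spent crying (invented values).
minutes_cried = {"Mary": 10, "Bruce": 4}

# Ratio level: we know exactly how much more Mary cried than Bruce.
difference = minutes_cried["Mary"] - minutes_cried["Bruce"]

# Ordinal level: only the order survives (Mary cried > Bruce).
ranking = sorted(minutes_cried, key=minutes_cried.get, reverse=True)

# Nominal level: only the category survives (both cried).
cried = {name: "did cry" if m > 0 else "did not cry"
         for name, m in minutes_cried.items()}

print(difference)  # 6
print(ranking)     # ['Mary', 'Bruce']
print(cried)       # {'Mary': 'did cry', 'Bruce': 'did cry'}
```

Each step down discards detail that cannot be recovered from the reduced data alone.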
AN EXAMPLE
Construct
Alertness while driving
Operational definition
Reaction time (RT) in a driving simulator
Type of variable
Measured
How else might we operationalise this variable?
Type of data
Interval/ratio
DESCRIPTIVE STATISTICS
Central tendency
Indicates the average value in the data set
Mean, median, mode
Dispersion
Indicates the spread of scores – how are they distributed?
Range, interquartile range, standard deviation
CENTRAL TENDENCY: MEAN
\bar{x} = \frac{\sum x}{N}
\bar{x} = mean of x in the sample
\sum x = sum of the scores
N = number of values in the data set
Advantages
Sensitive: Takes value of each data point into account
Disadvantages
Can be distorted by rogue data points (extreme values)
Type of variable
Measured
Type of data
Interval/ratio
AN EXAMPLE: MEAN
\bar{x} = \frac{\sum x}{N} = \frac{80 + 65 + 53 + 44 + 39 + 51 + 77 + 35 + 56 + 61}{10} = \frac{561}{10} = 56.1
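The same calculation can be checked in a few lines of Python. This is only a sketch of the arithmetic above; the variable names are illustrative.

```python
# The ten scores from the worked example.
scores = [80, 65, 53, 44, 39, 51, 77, 35, 56, 61]

total = sum(scores)   # sum of the scores
n = len(scores)       # number of values in the data set
mean = total / n      # mean = sum of scores / N

print(total, n, mean)  # 561 10 56.1
```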
CENTRAL TENDENCY: MEDIAN & MODE
Median
Central value when the data set is arranged sequentially
Mode
The most commonly occurring value in the data set
CENTRAL TENDENCY: MEDIAN
Advantages
Not distorted by rogue data points (extreme values)
Disadvantages
Less sensitive: Does not take value of each data point into account
Type of variable
Measured
Type of data
Ordinal (or interval/ratio when the mean is likely to be biased)
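A quick sketch of why the median is preferred when the mean is likely to be biased. It reuses the ten scores from the mean example; the rogue value of 500 is a hypothetical recording error, not part of the original data.

```python
from statistics import mean, median

# The ten scores from the mean example.
scores = [80, 65, 53, 44, 39, 51, 77, 35, 56, 61]

# Hypothetical rogue data point: the top score of 80 is mis-recorded as 500.
with_rogue = [500 if s == 80 else s for s in scores]

print(mean(scores), median(scores))          # 56.1 54.5
print(mean(with_rogue), median(with_rogue))  # 98.1 54.5
```

One extreme value shifts the mean by 42 points, while the median does not move at all.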
AN EXAMPLE: MEDIAN
Median = 3
Median position = (N + 1) / 2
= (9 + 1) / 2 = the 5th value, which is 3
AN EXAMPLE: MEDIAN
Median = 3
Median position = (N + 1) / 2
= (10 + 1) / 2 = the 5.5th value, i.e. halfway between the 5th and 6th values, which is 3
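The position rule can be sketched in Python as follows. The two data sets are hypothetical, chosen only so that the median is 3 for both N = 9 and N = 10, and `median_by_position` is an illustrative helper rather than a standard function.

```python
from statistics import median

def median_by_position(values):
    """Return the median using the (N + 1) / 2 position rule."""
    ordered = sorted(values)           # arrange the data sequentially
    n = len(ordered)
    position = (n + 1) / 2             # 1-based position of the median
    if n % 2 == 1:
        # Odd N: the position is a whole number, so take that value.
        return ordered[int(position) - 1]
    # Even N: the position ends in .5, so average the two middle values.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2

# Hypothetical data sets (not the ones shown on the slides).
nine_scores = [1, 2, 2, 3, 3, 3, 4, 5, 5]
ten_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

print(median_by_position(nine_scores))  # 3   (the 5th value)
print(median_by_position(ten_scores))   # 3.0 (mean of the 5th and 6th values)
```

Python's built-in `statistics.median` gives the same results for both data sets.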
CENTRAL TENDENCY: MODE
Advantages
Not distorted by rogue data points (extreme values)
Disadvantages
Less sensitive: Does not take value of each data point into account
Type of variable
Categorical
Type of data
Nominal
AN EXAMPLE: MODE
Mode
2 (moderately drunk)
Modal frequency
4 (we were moderately drunk 4 times out of 10)
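A minimal sketch of finding the mode and the modal frequency in Python. The ten ratings are hypothetical (the original data are not reproduced in the slides); they are chosen so that category 2 occurs 4 times out of 10.

```python
from collections import Counter

# Hypothetical nominal codes for ten occasions (2 = moderately drunk).
ratings = [1, 2, 2, 3, 1, 2, 3, 2, 1, 3]

counts = Counter(ratings)
mode, modal_frequency = counts.most_common(1)[0]

print(mode)             # 2
print(modal_frequency)  # 4
```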
DISPERSION
[Figure: two groups, each with mean age = 35 years]
Low variability (scores clustered around mean)
High variability (scores widely spread)
DISPERSION: THE RANGE
Range = (highest score - lowest score) + 1
Low-variability group: (36 - 34) + 1 = 3
High-variability group: (69 - 1) + 1 = 69
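The two range calculations can be reproduced with a short Python sketch. The highest and lowest ages come from the calculations above; the middle value of 35 in the first group is an assumption, and `inclusive_range` is an illustrative helper name.

```python
def inclusive_range(values):
    """Range as used here: (highest - lowest) + 1."""
    return (max(values) - min(values)) + 1

clustered_ages = [34, 35, 36]   # middle value assumed
spread_ages = [1, 35, 69]

print(inclusive_range(clustered_ages))  # 3
print(inclusive_range(spread_ages))     # 69
```

Both groups have the same mean age of 35, but the range shows how differently the scores are spread.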
DISPERSION: THE RANGE
Interquartile range
Middle 50% of scores
i.e. difference between the upper quartile (75th percentile) and the lower quartile (25th percentile)
Semi-interquartile range
Interquartile range / 2
Semi-interquartile range = 4 / 2 = 2
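One way to compute these values in Python is sketched below. The seven scores are hypothetical, chosen so that the interquartile range is 4 as in the example above. Note that textbooks and software use slightly different quartile conventions; this sketch uses the default method of `statistics.quantiles`.

```python
from statistics import quantiles

# Hypothetical scores (not the data from the slides).
scores = [1, 3, 4, 5, 6, 7, 9]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(scores, n=4)

iqr = q3 - q1        # interquartile range: spread of the middle 50%
semi_iqr = iqr / 2   # semi-interquartile range

print(q1, q3)    # 3.0 7.0
print(iqr)       # 4.0
print(semi_iqr)  # 2.0
```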
DISPERSION: DEVIATION
Deviation
Difference between an individual score and the mean
Mean deviation
The average absolute difference between each individual score and the mean (signs are ignored, otherwise the deviations would sum to zero)
In your practical reports we will ask you to present the mean value
and the standard deviation for each condition of your experiment
DISPERSION: DEVIATION
d = x - \bar{x}
Oscar’s deviation = 1 – 35 = -34
Boris’ deviation = 35 – 35 = 0
Percy’s deviation = 69 – 35 = 34
DISPERSION: MEAN DEVIATION
d = x - \bar{x}
Mean deviation = \frac{\sum |d|}{N} = (34 + 0 + 34) / 3 = 22.67
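The deviations and the mean deviation for Oscar, Boris and Percy can be checked with a short Python sketch. The ages are taken from the example; the variable names are illustrative.

```python
# Ages from the example.
ages = {"Oscar": 1, "Boris": 35, "Percy": 69}

mean_age = sum(ages.values()) / len(ages)

# Deviation: each individual score minus the mean.
deviations = {name: age - mean_age for name, age in ages.items()}

# The signed deviations cancel out, so the mean deviation uses absolute values.
signed_sum = sum(deviations.values())
mean_deviation = sum(abs(d) for d in deviations.values()) / len(deviations)

print(mean_age)                  # 35.0
print(deviations)                # {'Oscar': -34.0, 'Boris': 0.0, 'Percy': 34.0}
print(signed_sum)                # 0.0
print(round(mean_deviation, 2))  # 22.67
```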
DISPERSION: STANDARD DEVIATION
s = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}
Standard deviation = \sqrt{\frac{(-34)^2 + 0^2 + 34^2}{2}} = \sqrt{\frac{2312}{2}} = \sqrt{1156} = 34
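As a check on the standard deviation of the three ages in the example (1, 35 and 69), the sketch below computes it step by step with N - 1 in the denominator and compares the result with Python's `statistics.stdev`, which also uses N - 1. Variable names are illustrative.

```python
import math
from statistics import stdev

ages = [1, 35, 69]   # Oscar, Boris, Percy
n = len(ages)
mean_age = sum(ages) / n

# Square each deviation so that negative and positive values do not cancel.
sum_of_squares = sum((x - mean_age) ** 2 for x in ages)

# Divide by N - 1 (sample standard deviation), then take the square root.
s = math.sqrt(sum_of_squares / (n - 1))

print(sum_of_squares)  # 2312.0
print(s)               # 34.0
print(stdev(ages))     # 34.0
```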
DON’T PANIC!
Just remember which stat to use when and what they mean
Operational definition
Categorical variable
Measured variable
Nominal data
Ordinal data
Interval/ratio data
Central tendency
Mean
Median
Mode
Dispersion
Range
Standard deviation
READING