Lecture 1
Statistics
Statistics is a field of study concerned with:
• the collection, organization, summarization, and analysis of data; and
• the drawing of inferences about a body of data when only a part of the data
is observed.
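A minimal Python sketch of the two activities named above: summarizing the data that were observed, then drawing an approximate inference about the larger body of data. The blood pressure readings are hypothetical, and the interval uses the normal approximation (z = 1.96), which is an assumption made for illustration; with a sample this small, a t-based interval would be somewhat wider.

```python
import math
import statistics

# Hypothetical sample: systolic blood pressure (mmHg) for 10 patients.
sbp_sample = [118, 122, 130, 125, 140, 135, 128, 119, 132, 121]

# Descriptive step: summarize the data that were actually observed.
n = len(sbp_sample)
sample_mean = statistics.mean(sbp_sample)
sample_sd = statistics.stdev(sbp_sample)  # sample standard deviation (n - 1)

# Inferential step: approximate 95% confidence interval for the mean of the
# population the sample came from (normal approximation, an assumption here).
standard_error = sample_sd / math.sqrt(n)
ci_low = sample_mean - 1.96 * standard_error
ci_high = sample_mean + 1.96 * standard_error

print(f"n = {n}, mean = {sample_mean:.1f}, SD = {sample_sd:.1f}")
print(f"Approximate 95% CI for the mean: {ci_low:.1f} to {ci_high:.1f}")
```

The first half only describes the ten patients; the second half makes a statement about patients who were not measured, which is where the uncertainty comes in.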
Data
• The raw material of statistics is data.
• Having a dataset, whether small or large, is one thing; its value lies in how well
you can extract relevant information from it for descriptive or inferential purposes.
The two kinds of data (numbers)
• Measurement (e.g. a nurse weighs a patient or takes a patient’s temperature)
• Counting (e.g. the number of patients attending a health facility, or the number of
births, deaths, or discharges)
Sources of Data
• Routinely kept records (case files), e.g. a profile of cervical cancer patients in UBTH
from 2005 to 2019.
• Surveys (questionnaires), e.g. KAP studies, students’ adoption of virtual classrooms in an
Igwe ayanvwen community, or the use of Latin maxims among practicing Nigerian lawyers
outside the court.
• Experiments (laboratory, clinical trials, etc.)
• External sources (databases: CBN, NDHS, UNICEF, USAID, etc.)
Variable
• If, as we observe a characteristic, we find that it takes on different values in different
persons, places, or things, we label the characteristic a variable. We do this for the
simple reason that the characteristic is not the same when observed in different
possessors of it. Examples: blood pressure (SBP/DBP), heart rate (HR), height, weight, and BMI.
Types of variables
• Quantitative variables can be measured in the usual sense, e.g. age, weight, height.
They convey information regarding amount.
• Qualitative variables cannot be measured in the usual sense, e.g. outcome of a
diagnosis, ethnic group, gender. They convey information regarding attributes.
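The practical consequence of this distinction is that the two types are summarized differently. The sketch below uses hypothetical patient records (all field names and values are made up for illustration): quantitative variables are summarized by amount, qualitative variables by counting how often each attribute occurs.

```python
import statistics
from collections import Counter

# Hypothetical patient records mixing both types of variable.
patients = [
    {"age": 34, "weight_kg": 70.5, "gender": "female", "diagnosis": "positive"},
    {"age": 51, "weight_kg": 82.0, "gender": "male", "diagnosis": "negative"},
    {"age": 27, "weight_kg": 64.3, "gender": "female", "diagnosis": "negative"},
    {"age": 45, "weight_kg": 77.2, "gender": "male", "diagnosis": "positive"},
    {"age": 38, "weight_kg": 69.0, "gender": "female", "diagnosis": "negative"},
]

# Quantitative variables convey amount, so a mean is meaningful.
mean_age = statistics.mean(p["age"] for p in patients)
mean_weight = statistics.mean(p["weight_kg"] for p in patients)

# Qualitative variables convey attributes, so we count categories instead.
gender_counts = Counter(p["gender"] for p in patients)
diagnosis_counts = Counter(p["diagnosis"] for p in patients)

print(f"Mean age: {mean_age}, mean weight: {mean_weight:.1f} kg")
print("Gender:", dict(gender_counts))
print("Diagnosis:", dict(diagnosis_counts))
```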
Other classification
• Dependent variable (also called outcome variable or response variable)
Random variable
• Whenever the values obtained arise as a result of chance factors, so that
they cannot be exactly predicted in advance, the variable is called a random
variable. An example of a random variable is adult height. When a child is
born, we cannot predict exactly his or her height at maturity. Attained adult
height is the result of numerous genetic and environmental factors.
Types of Random Variables
Discrete Random Variable
• The number of daily admissions to a general hospital is a discrete random variable
since the number of admissions each day must be represented by a whole number,
such as 0, 1, 2, or 3. The number of admissions on a given day cannot be a number
such as 1.5, 2.997, or 3.333. Another example is the number of decayed, missing, or
filled teeth per child.
Continuous Random Variable
• A continuous random variable does not possess the gaps or interruptions
characteristic of a discrete random variable. Examples include various measurements that
can be made on individuals such as height, weight, and skull circumference.
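The contrast can be made concrete with a small simulation. This is only a sketch: the choice of a uniform distribution for admissions and a normal distribution (mean 170 cm, SD 10 cm) for height is an assumption made for illustration, not a claim about real hospital or population data.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Discrete random variable: daily admissions can only be whole numbers.
daily_admissions = [rng.randint(0, 6) for _ in range(7)]

# Continuous random variable: adult height (cm) can take any value in a range.
adult_heights_cm = [rng.gauss(170.0, 10.0) for _ in range(7)]

print("Admissions per day:", daily_admissions)
print("Heights (cm):", [round(h, 2) for h in adult_heights_cm])

# Every admission count is an integer; heights fall between whole numbers.
print(all(isinstance(a, int) for a in daily_admissions))
```

Neither list can be predicted exactly in advance, which is what makes both variables random; the difference lies only in which values are possible.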
Measurement Scale in SPSS
Choice of Test Statistics
Parametric/Non-parametric Statistics
• Parametric: assumes the data follow a normal distribution
• Non-parametric: does not assume the data follow a normal distribution
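One crude way to screen whether data look roughly symmetric, as a normal distribution would be, is to compute the sample skewness. The sketch below is an illustration only: the helper `sample_skewness` and the length-of-stay values are hypothetical, and skewness near zero is merely consistent with normality, not proof of it. It is not a formal normality test.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (0 for symmetric data)."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical lengths of hospital stay (days).
symmetric_stay_days = [4, 5, 6, 7, 8]
skewed_stay_days = [1, 1, 1, 2, 10]  # one long stay pulls the tail right

print(f"Symmetric data skewness: {sample_skewness(symmetric_stay_days):.2f}")
print(f"Skewed data skewness:    {sample_skewness(skewed_stay_days):.2f}")
```

Strongly skewed data such as the second list are the kind of case where a non-parametric method is usually considered.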
One outcome variable and one explanatory variable

Type of outcome variable | Type of explanatory variable | Number of levels of categorical variables | Statistic
Categorical | Categorical | Both variables are binary/have two or more levels | Chi-square; Odds ratio/Relative Risk; Logistic regression (Binary/Multinomial)
Categorical | Continuous | Categorical variable is binary | ROC curve; Survival analyses
Categorical | Continuous | Categorical variable is multi-level and ordered | Spearman's correlation coefficient
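For the categorical-by-categorical case, the chi-square statistic and the odds ratio can both be computed directly from a 2x2 table. The sketch below uses a hypothetical table (the counts and the exposure/disease labels are made up) and applies no continuity correction. The p-value line relies on the fact that, with 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(x / 2)).

```python
import math

# Hypothetical 2x2 table.
# Rows: exposed, not exposed. Columns: disease, no disease.
table = [[20, 10],
         [10, 20]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Chi-square: sum over cells of (observed - expected)^2 / expected.
chi_square = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (observed - expected) ** 2 / expected

# A 2x2 table has 1 degree of freedom, so the upper-tail probability
# has a closed form (no continuity correction applied here).
p_value = math.erfc(math.sqrt(chi_square / 2))

# Odds ratio: cross-product ratio (a * d) / (b * c).
odds_ratio = (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])

print(f"Chi-square = {chi_square:.2f}, p = {p_value:.4f}")
print(f"Odds ratio = {odds_ratio:.1f}")
```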
Choosing a statistic for one or more outcome variables and more than one explanatory variable

Type of outcome variable(s) | Type of explanatory variable | Number of levels of categorical variables | Statistic
Continuous - only one outcome | Both continuous and categorical | Categorical variables are binary | Multiple regression
Continuous - only one outcome | Categorical | At least one of the explanatory variables has three or more categories | Two-way ANOVA
Continuous - only one outcome | Both continuous and categorical | One categorical variable has two or more levels | ANCOVA
Continuous - outcome measured more than once | Both continuous and categorical | Categorical variables are binary | Repeated measures ANOVA
No outcome variable | Both continuous and categorical | Categorical variables can have two or more levels | Factor analysis
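The table above amounts to a lookup from study design to statistic, which can be sketched as a small dictionary. The key labels and the function name `choose_statistic` are hypothetical shorthand introduced here, and the sketch assumes the second and third rows of the table share the "continuous, only one outcome" entry of the first row.

```python
# Keys: (outcome type, explanatory variable type, levels of categorical variables)
STATISTIC_BY_DESIGN = {
    ("continuous, one outcome", "continuous and categorical", "binary"):
        "Multiple regression",
    ("continuous, one outcome", "categorical", "three or more categories"):
        "Two-way ANOVA",
    ("continuous, one outcome", "continuous and categorical", "two or more levels"):
        "ANCOVA",
    ("continuous, repeated outcome", "continuous and categorical", "binary"):
        "Repeated measures ANOVA",
    ("no outcome", "continuous and categorical", "two or more levels"):
        "Factor analysis",
}


def choose_statistic(outcome, explanatory, levels):
    """Return the statistic the table suggests, or a fallback message."""
    key = (outcome, explanatory, levels)
    return STATISTIC_BY_DESIGN.get(key, "not covered by this table")


print(choose_statistic("continuous, one outcome", "categorical",
                       "three or more categories"))
print(choose_statistic("no outcome", "continuous and categorical",
                       "two or more levels"))
```

A lookup like this only narrows the options; the assumptions of the chosen statistic still need to be checked against the data.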
Thank you for listening
Questions