Biostatistics: Descriptive Analysis
Descriptive analysis
NEILY ZAKIYAH, PHD., APT
Descriptive statistics
To describe and summarize data
Larger datasets: try to summarize the data with numbers and figures
Descriptive statistics
• Type of variable
• Distribution of variable
• Summary statistics
• Figures
Classification of variables
Continuous:
• Interval: temperature, calendar years, intelligence scale
Qualitative (categorical):
• Ordinal: better/same/worse, dyspnea degree 2, 3, 4
• Nominal: gender, yes/no, blood group
Note: Some ambiguities in classifying a type of variable may arise in some cases.
Distribution of variable
Normal and skewed distribution
Measurement of central tendency:
• Normal distribution: mean
• Skewed distribution: median
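To illustrate why the median is preferred for skewed data, here is a minimal Python sketch. The sample values (lengths of hospital stay in days) are hypothetical and chosen only to show a right-skewed distribution; they do not come from the lecture.

```python
import statistics

# HYPOTHETICAL right-skewed sample: length of hospital stay in days.
# One long stay (30 days) pulls the mean upward but barely affects the median.
stay_days = [2, 3, 3, 4, 4, 5, 6, 30]

mean_stay = statistics.mean(stay_days)      # sensitive to the extreme value
median_stay = statistics.median(stay_days)  # middle of the ordered values

print("mean:", mean_stay)
print("median:", median_stay)
```

In this sketch the mean is well above most of the observations, while the median stays close to the bulk of the data.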
Differences between groups
Purpose: to make statements about a population in a study
Table footnotes:
a If data are censored.
b The Kruskal-Wallis test is used for comparing ordinal or non-Normal variables for more than two groups, and is a generalisation of the Mann-Whitney U test.
c Analysis of variance is a general technique, and one version (one way analysis of variance) is used to compare Normally distributed variables for more than two groups, and is the parametric equivalent of the Kruskal-Wallis test.
d If the outcome variable is the dependent variable, then provided the residuals (the differences between the observed values and the predicted responses from regression) are plausibly Normally distributed, the distribution of the independent variable is not important.
e There are a number of more advanced techniques, such as Poisson regression, for dealing with these situations. However, they require certain assumptions and it is often easier to either dichotomise the outcome variable or treat it as continuous.
© MJ Campbell 2016, S Shantikumar 2016
Choice of statistical test for paired or matched observations
The effectiveness of influenza vaccination is assessed in a study in which 150 patients aged 65 years
or older receive the influenza vaccine and another 150 patients in the same age group
receive placebo. The study shows that the rate of influenza is lower in the vaccinated
group than in the placebo group. How can we determine whether
the difference is statistically significant?
• The chi-square test is used to assess whether the difference in influenza rate between the two patient
groups is statistically significant.
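$\chi^2 = \sum \frac{(O - E)^2}{E}$, with $df = (r - 1)(c - 1)$
O: observed frequency
E: expected frequency
df: degrees of freedom (r: number of rows, c: number of columns in the table)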
Example: Chi-square (χ²)
Influenza vaccination trial in SPSS
Chi-square test on a 2x2 table via Crosstabs (Analyze – Descriptive Statistics – Crosstabs)
[SPSS Crosstabs output, annotated with the observed chi-square and the P-value]
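The chi-square calculation can also be sketched outside SPSS. The Python example below applies the Pearson formula (without continuity correction) to a 2x2 table. The counts are hypothetical, because the lecture gives only the group sizes (150 per group) and not the number of influenza cases; the function name chi_square is likewise just an illustrative choice.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# HYPOTHETICAL counts (not from the lecture): [influenza, no influenza]
observed = [
    [15, 135],  # vaccinated group, n = 150
    [30, 120],  # placebo group, n = 150
]

chi2, df = chi_square(observed)
# For df = 1 only, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.4f}")
```

With these made-up counts the difference would be significant at the 5% level. SPSS additionally reports a continuity-corrected value for 2x2 tables, which this sketch does not compute.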
Example:
In a study, anxiety scores are observed in 50 patients aged 10 to 15 years with divorced
parents and 50 patients with non-divorced parents. The results show that the average anxiety
score of patients with divorced parents is higher than that of patients with non-divorced parents. How
can we determine whether the difference in average anxiety score is statistically significant?
The anxiety scores are assumed to be normally distributed.
• The independent t-test is used to assess whether the difference in average anxiety score between the
two patient groups is statistically significant.
Example: Independent t-test
Anxiety score in SPSS
(Analyze – Compare Means – Independent-Samples T Test)
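As a rough illustration of what the independent t-test computes, the Python sketch below derives the t statistic with a pooled variance (that is, assuming equal variances in both groups). The anxiety scores are hypothetical and much smaller than the 50 patients per group in the example; the function name independent_t is an illustrative choice.

```python
import math
import statistics

def independent_t(sample_a, sample_b):
    """Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error
    return t, df

# HYPOTHETICAL anxiety scores (not from the lecture)
divorced = [5, 6, 7, 8, 9]
non_divorced = [3, 4, 5, 6, 7]

t_stat, df = independent_t(divorced, non_divorced)

# Two-sided critical value for df = 8 at alpha = 0.05, from standard t tables
T_CRITICAL_DF8 = 2.306
significant = abs(t_stat) > T_CRITICAL_DF8

print(f"t = {t_stat:.3f}, df = {df}, significant at 5%: {significant}")
```

With these made-up scores the mean difference of 2 points does not reach significance, which shows that a visible difference in averages is not enough on its own. SPSS reports the exact P-value instead of a comparison with a critical value.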
Example:
In a population-based cohort study, Apgar scores at five and 10 minutes are observed in infants born before
37 completed weeks of gestation. The Apgar score has been used worldwide as an index of
early neonatal condition. In the descriptive analysis, the Apgar score at five minutes is observed to
be different from the Apgar score at 10 minutes. How can we determine whether the difference is
statistically significant?
The Apgar scores are assumed not to be normally distributed.
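The slide text does not name the test for this example. One common choice for paired observations that are not normally distributed is the Wilcoxon signed-rank test, and the Python sketch below computes its rank sums under that assumption. The Apgar scores are hypothetical, and the function name signed_rank_sums is an illustrative choice.

```python
def signed_rank_sums(first, second):
    """Return (W_plus, W_minus, n) for paired samples, n = non-zero differences."""
    # Paired differences; pairs with no change carry no sign and are dropped.
    diffs = [b - a for a, b in zip(first, second) if b != a]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j + 1) / 2  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, len(ordered)

# HYPOTHETICAL Apgar scores for eight infants (not from the lecture)
apgar_5min = [7, 8, 6, 9, 7, 5, 8, 6]
apgar_10min = [9, 9, 8, 9, 8, 8, 7, 7]

w_plus, w_minus, n = signed_rank_sums(apgar_5min, apgar_10min)
print(f"W+ = {w_plus}, W- = {w_minus}, n = {n}")
```

The smaller of the two rank sums is the test statistic. It is compared with tabulated critical values or, for larger samples, with a normal approximation, which statistical software does automatically.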
Exercise
In a retrospective cohort study, 3431 inhabitants of city A and city B who had histamine
challenge test data from 1964 to 1972 were followed for approximately 30 years. The
objective of the study was to assess the association between histamine airway hyper-responsiveness
and mortality from chronic obstructive pulmonary disease.
Baseline characteristics of the study population on gender, age, height, smoking habits, and
several respiratory symptoms were collected. Descriptive analyses were performed to assess
whether there were statistically significant differences in baseline characteristics between male
and female participants.
Please determine which test is appropriate for each characteristic, based on the information
in the table below.
Characteristics of the participants of the study, for men and women separately
Characteristic              Men            Women            P-value   Test
Height (cm), mean (sd)      177.1 (6.7)    163.9 (5.9)      0.000     Independent t-test
Smoking habits                                              0.000     Chi square
• Never smokers, N (%)      252 (24.2%)    791 (75.8%)
• Ex-smokers, N (%)         444 (68.2%)    207 (31.8%)
• Current smokers, N (%)    1017 (64.0%)   572 (36.0%)
FEV1 (cl), mean (sd)        350.1 (79.7)   268.1.0 (53.0)   0.000     Independent t-test
Weight (kg), mean (sd)      79.5 (10.1)    68.9 (11.0)      0.004     Independent t-test
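As a check on the smoking-habits row, the Python sketch below recomputes the Pearson chi-square from the counts in the table. It assumes that the first count in each row refers to men and the second to women, which matches the table title and the reported percentages. Variable names are illustrative.

```python
import math

# Observed counts from the exercise table: (men, women) per smoking category
smoking = {
    "Never smokers": (252, 791),
    "Ex-smokers": (444, 207),
    "Current smokers": (1017, 572),
}

col_totals = [sum(col) for col in zip(*smoking.values())]
n = sum(col_totals)

chi2 = 0.0
pct_men = {}
for category, counts in smoking.items():
    row_total = sum(counts)
    # Share of men within each smoking category (a row percentage)
    pct_men[category] = round(100 * counts[0] / row_total, 1)
    for observed, col_total in zip(counts, col_totals):
        expected = row_total * col_total / n
        chi2 += (observed - expected) ** 2 / expected

df = (len(smoking) - 1) * (len(col_totals) - 1)
# For df = 2 only, the upper-tail probability equals exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)

print(pct_men)
print(f"chi-square = {chi2:.1f}, df = {df}, p < 0.001: {p_value < 0.001}")
```

The recomputed percentages match the table, which indicates that they are row percentages (the share of men and women within each smoking category). The resulting P-value is far below 0.001, consistent with the 0.000 shown in the table; such a value is usually reported as p < 0.001.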