FORMULAS
2. Frequency tables
Value   Frequency (fi)   Percentage (%)   Cumulative frequency   Cumulative percentage
X1      f1               d1               f1                     d1
X2      f2               d2               f1+f2                  d1+d2
...     ...              ...              ...                    ...
Xk      fk               dk               f1+f2+...+fk           d1+d2+...+dk
Total   n                100
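The table layout above can be sketched in Python as follows. This is a minimal illustration, not part of the original sheet: the function name `frequency_table` and the sample data are hypothetical, and percentages are computed as 100 * fi / n.

```python
from collections import Counter


def frequency_table(data):
    """Return rows of (value, frequency, percentage,
    cumulative frequency, cumulative percentage)."""
    counts = Counter(data)
    n = len(data)
    rows = []
    cum_f = 0
    for value in sorted(counts):
        f = counts[value]
        cum_f += f  # running total f1 + f2 + ... + fi
        rows.append((value, f, 100 * f / n, cum_f, 100 * cum_f / n))
    return rows


# Hypothetical data, chosen only for illustration.
rows = frequency_table([1, 1, 2, 2, 2, 3, 3, 3, 3, 3])
for row in rows:
    print(row)
```

The last row always ends with a cumulative frequency of n and a cumulative percentage of 100, matching the Total line of the table.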
- Weighted mean:
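A minimal sketch of a weighted mean, assuming the usual definition (sum of value times weight, divided by the sum of the weights). With a frequency table, the frequencies fi would play the role of the weights. The function name and the numbers are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


# Hypothetical values and weights: (1*1 + 2*1 + 3*2) / 4
result = weighted_mean([1, 2, 3], [1, 1, 2])
print(result)
```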
3.2. Mode: is the value that occurs with the greatest frequency.
For unbinned data: the mode is the value with the highest frequency.
For equal-width binned data: the bin containing the mode is the bin
with the highest frequency. The mode is then calculated by this
formula:
For unequal-width binned data: the bin containing the mode is
determined not by frequency but by distribution density
(distribution density = frequency / width). The bin with the highest
density contains the mode.
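A short sketch of selecting the modal bin by density when bin widths differ. The tuple layout (lower, upper, frequency) and the numbers are hypothetical; only the rule density = frequency / width comes from the text.

```python
def modal_bin_by_density(bins):
    """bins: list of (lower, upper, frequency).
    Returns the bin with the highest frequency / width."""
    return max(bins, key=lambda b: b[2] / (b[1] - b[0]))


# Hypothetical bins: densities are 2.0, 3.0 and 1.5.
bins = [(0, 10, 20), (10, 15, 15), (15, 35, 30)]
modal = modal_bin_by_density(bins)
print(modal)
```

In this example the bin (15, 35) has the highest frequency, but the bin (10, 15) has the highest density and is therefore the modal bin.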
3.3. Median: is the value in the middle when the data are arranged in
ascending order (smallest value to largest value). The median divides the
data into 2 parts, with each part having one-half of the observations
(50%).
3.4. Quartiles: divide the sorted data set in ascending order into four
parts, with each part having one-fourth of the observations (25%).
If n+1 is divisible by 4:
If n+1 is NOT divisible by 4:
EX: We have the following numbers: 1800; 1900; 2000; 2100; 2200;
2500; 2700; 2800.
3.5. Percentiles: In a dataset, the pth percentile divides the data into two
parts: approximately p% of the observations are less than the pth
percentile, and approximately (100 – p)% of the observations are greater
than the pth percentile.
$Q_{p\%} = X_{p\% \cdot (n+1)}$
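The position rule p%(n+1) can be sketched as below. With p = 25, 50, 75 it gives the quartiles, applied here to the eight numbers from the quartile example. Two points are assumptions rather than statements from the sheet: linear interpolation between neighbouring values when the position is not a whole number, and clamping to the smallest or largest value when the position falls outside 1..n. The function name is hypothetical.

```python
def percentile(sorted_data, p):
    """Value at 1-based position p%(n+1) in ascending data."""
    n = len(sorted_data)
    pos = p * (n + 1) / 100      # position p%(n+1)
    k = int(pos)                 # whole part of the position
    frac = pos - k               # fractional part
    if k < 1:
        return sorted_data[0]    # assumption: clamp below
    if k >= n:
        return sorted_data[-1]   # assumption: clamp above
    # Assumption: interpolate linearly between X_k and X_(k+1).
    return sorted_data[k - 1] + frac * (sorted_data[k] - sorted_data[k - 1])


data = [1800, 1900, 2000, 2100, 2200, 2500, 2700, 2800]
q1 = percentile(data, 25)   # position 2.25
q2 = percentile(data, 50)   # position 4.5 (the median)
q3 = percentile(data, 75)   # position 6.75
print(q1, q2, q3)
```

Here n + 1 = 9 is not divisible by 4, so all three positions are fractional and each quartile lies between two observations.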
4. Numerical measures of dispersion (variability)
4.1. Range: is the difference between the maximum and the minimum
value of the data set.
$R = X_{\max} - X_{\min}$
4.2. Interquartile Range: is the difference between the third quartile, Q3,
and the first quartile, Q1. It is the range for the middle 50% of the data. It
overcomes the dependency on extreme values (or outliers).
$R_Q = Q_3 - Q_1$
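A brief sketch of the range and interquartile range using Python's standard library. It assumes quartiles are located with the (n+1) position rule, which corresponds to `method="exclusive"` in `statistics.quantiles`. The data are the eight numbers from the quartile example.

```python
import statistics

data = [1800, 1900, 2000, 2100, 2200, 2500, 2700, 2800]

# Range: maximum minus minimum.
data_range = max(data) - min(data)

# Quartiles with the (n+1) position rule (assumption: "exclusive" method).
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

print(data_range, iqr)
```

The range depends only on the two extreme values, while the interquartile range ignores the lowest and highest quarters of the data.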
4.3. Variance: is the average of the squared deviations from the
mean. Equivalently, it is the square of the standard deviation.
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation: $S = \sqrt{S^2}$
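A minimal sketch of variance and standard deviation with Python's `statistics` module, assuming the usual definitions: the population variance divides the sum of squared deviations by N, and the sample variance divides it by n - 1. The data are hypothetical.

```python
import statistics

# Hypothetical data with mean 5 and sum of squared deviations 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(data)   # 32 / 8
pop_sd = statistics.pstdev(data)       # square root of pop_var
sample_var = statistics.variance(data) # 32 / 7
sample_sd = statistics.stdev(data)     # square root of sample_var

print(pop_var, pop_sd, sample_var, sample_sd)
```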
Chebyshev's theorem: at least $\left(1 - \frac{1}{m^2}\right) \cdot 100\%$ of the data
values must be within m standard deviations of the mean,
where m is any value greater than 1.
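The bound can be evaluated for a few values of m with a short sketch. The function name is hypothetical.

```python
def chebyshev_min_percent(m):
    """Minimum percentage of values within m standard deviations
    of the mean, for any m > 1."""
    if m <= 1:
        raise ValueError("m must be greater than 1")
    return (1 - 1 / m ** 2) * 100


for m in (2, 3, 4):
    print(m, chebyshev_min_percent(m))
```

For m = 2 the bound is 75%, so at least three quarters of the values lie within two standard deviations of the mean, whatever the shape of the distribution.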
7. Z-scores: are measures of relative location that show how far a
particular value is from the mean, measured in standard deviations.
Z (population) = $\frac{X - \mu}{\sigma}$        Z (sample) = $\frac{X - \bar{X}}{S}$
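A short sketch of the z-score calculation, treating the hypothetical data below as a whole population (so the population mean and standard deviation are used). The function name is an assumption for illustration.

```python
import statistics


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Hypothetical population with mean 5 and standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mu = statistics.mean(data)
sigma = statistics.pstdev(data)

z = z_score(9, mu, sigma)
print(z)
```

A z-score of 2 means the value 9 lies two standard deviations above the mean. For a sample, the same function would be called with the sample mean and the sample standard deviation S.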
3. Estimating the mean difference between 2 paired samples
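A hedged sketch of one standard way to estimate the mean difference between paired samples: compute the differences, then form the interval mean difference ± t * S_d / sqrt(n). This is the usual textbook interval and is an assumption about what this section covers. The critical value `t_crit` is passed in by the caller (looked up in a t table with n - 1 degrees of freedom); the function name and data are hypothetical.

```python
import math
import statistics


def paired_mean_difference_ci(before, after, t_crit):
    """Confidence interval for the mean of the paired differences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)   # mean difference
    s_d = statistics.stdev(diffs)    # sample standard deviation of differences
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin


# Hypothetical paired data; t_crit = 3.182 is the two-sided 95% value
# for 3 degrees of freedom.
low, high = paired_mean_difference_ci([10, 12, 9, 11], [12, 14, 10, 13], 3.182)
print(low, high)
```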
Rejection rules:
(similar when comparing t values)
Rejection rules:
Rejection rules: the same as when comparing Z or t values for one sample
Post-hoc one-way ANOVA (Tukey test):
Rejecting H0 if:
2. Forecasting using the Arithmetic Progression Method
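One plausible reading of this method, given here as an assumption: the series is treated as growing by a constant amount per period, so the forecast is the last observed value plus the average absolute increase times the number of periods ahead. The function name and data are hypothetical.

```python
def forecast_average_increase(series, steps_ahead):
    """Forecast = last value + average absolute increase * steps_ahead."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Average increase per period: (y_n - y_1) / (n - 1).
    avg_increase = (series[-1] - series[0]) / (len(series) - 1)
    return series[-1] + avg_increase * steps_ahead


# Hypothetical series: average increase is (130 - 100) / 3 = 10.
forecast = forecast_average_increase([100, 110, 125, 130], 2)
print(forecast)
```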
5. Forecasting using Linear Trend Regression
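A minimal sketch of a linear trend fitted by ordinary least squares, assuming the trend has the form y = a + b*t with time coded as t = 1, 2, ..., n. The coding of t, the function name, and the data are assumptions for illustration.

```python
def fit_linear_trend(y):
    """Least-squares intercept a and slope b for y = a + b*t, t = 1..n."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    t = range(1, n + 1)
    sum_t = sum(t)
    sum_y = sum(y)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    sum_tt = sum(ti * ti for ti in t)
    b = (n * sum_ty - sum_t * sum_y) / (n * sum_tt - sum_t ** 2)
    a = sum_y / n - b * sum_t / n
    return a, b


# Hypothetical series that grows by exactly 2 per period.
a, b = fit_linear_trend([10, 12, 14, 16])
forecast_t5 = a + b * 5   # forecast for the next period, t = 5
print(a, b, forecast_t5)
```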
7. Forecasting using Exponential Trend Equation
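A hedged sketch of an exponential trend, assuming the form y = a * b^t with t = 1, 2, ..., n and fitting it by least squares on the logarithms (ln y = ln a + t * ln b). The exact form and fitting method used by this sheet may differ; the function name and data are hypothetical.

```python
import math


def fit_exponential_trend(y):
    """Fit y = a * b**t (t = 1..n) by least squares on ln(y)."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    logs = [math.log(v) for v in y]   # requires all y > 0
    t = range(1, n + 1)
    sum_t = sum(t)
    sum_l = sum(logs)
    sum_tl = sum(ti * li for ti, li in zip(t, logs))
    sum_tt = sum(ti * ti for ti in t)
    slope = (n * sum_tl - sum_t * sum_l) / (n * sum_tt - sum_t ** 2)
    intercept = sum_l / n - slope * sum_t / n
    return math.exp(intercept), math.exp(slope)


# Hypothetical series that doubles every period.
a, b = fit_exponential_trend([2, 4, 8, 16])
forecast_t5 = a * b ** 5
print(a, b, forecast_t5)
```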
11. Simple index number
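A simple index number is commonly computed as the current-period value divided by the base-period value, times 100. A minimal sketch under that assumption, with a hypothetical function name and prices:

```python
def simple_index(current, base):
    """Current-period value as a percentage of the base-period value."""
    if base == 0:
        raise ValueError("base-period value must not be zero")
    return 100 * current / base


# Hypothetical prices: 3.0 now versus 2.5 in the base period.
index = simple_index(3.0, 2.5)
print(index)
```

An index of 120 means the current value is 20% above the base-period value.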
13. Weighted aggregate price index: when the quantity of usage is the measure
of importance.
14. Weighted aggregate quantity index: when the price is the measure of
importance.
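Both weighted aggregate indexes share one structure: the weighted sum in the current period divided by the weighted sum in the base period, times 100. The sketch below assumes fixed weights per item (quantities of usage for the price index, prices for the quantity index). The function name and all numbers are hypothetical.

```python
def weighted_aggregate_index(current, base, weights):
    """100 * sum(current_i * w_i) / sum(base_i * w_i)."""
    if not (len(current) == len(base) == len(weights)):
        raise ValueError("all inputs must have the same length")
    numerator = sum(c * w for c, w in zip(current, weights))
    denominator = sum(b * w for b, w in zip(base, weights))
    if denominator == 0:
        raise ValueError("weighted base-period sum must not be zero")
    return 100.0 * numerator / denominator


# Price index: prices change, quantities of usage are the weights.
price_index = weighted_aggregate_index(
    current=[3.0, 6.0], base=[2.0, 5.0], weights=[10, 4])

# Quantity index: quantities change, prices are the weights.
quantity_index = weighted_aggregate_index(
    current=[12, 5], base=[10, 4], weights=[2.0, 5.0])

print(price_index, quantity_index)
```

Whether the weights come from the base period or the current period is a separate choice that this sketch leaves to the caller.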