Faculty of Engineering Universiti Pertahanan Nasional Malaysia Mini Project
MINI PROJECT
Semester II 2019/2020
Programme : 2ZK50A
INTRODUCTION
a) Histogram
b) Descriptive Statistic of New Cases of Portugal and Thailand
c) Regression of New Cases of Portugal and Thailand
d) Hypothesis Testing For New Cases of Portugal and Thailand
   T-test Two Sample Assuming Equal Variances (2 tail)
   T-test Two Sample Assuming Equal Variances (1 tail)
   F-test Two Sample For Variances (Variance Test)
e) Anova (Single Factor)
a) Histogram
b) Descriptive Statistic of Death for Portugal and Thailand
c) Regression for Death of Portugal and Thailand
d) Hypothesis Testing for Death of Portugal and Thailand
   T-test Two Sample Assuming Equal Variances (2 tail)
   T-test Two Sample Assuming Equal Variances (1 tail)
   F-test Two Sample For Variances (Variance Test)
e) Anova (Single Factor)
a) Histogram
b) Descriptive Statistic of Recoveries For Portugal and Thailand
c) Regression For Recoveries of Portugal and Thailand
d) Hypothesis Testing for Recoveries of Portugal and Thailand
   T-test Two Sample Assuming Equal Variances (2 tail)
   T-test Two Sample Assuming Equal Variances (1 tail)
   F-test Two Sample For Variances (Variance Test)
e) Anova (Single Factor)
1.0 Introduction
Coronaviruses (CoV) are a large family of viruses that cause illness ranging from the
common cold to more severe diseases such as Middle East Respiratory Syndrome
(MERS-CoV) and Severe Acute Respiratory Syndrome (SARS-CoV). A novel
coronavirus (nCoV) is a new strain that has not been previously identified in humans.
Coronaviruses are zoonotic, meaning they are transmitted between animals and
people. Detailed investigations found that SARS-CoV was transmitted from civet
cats to humans and MERS-CoV from dromedary camels to humans. Several known
coronaviruses are circulating in animals that have not yet infected humans. Common
signs of infection include respiratory symptoms, fever, cough, shortness of breath
and breathing difficulties. In more severe cases, infection can cause pneumonia,
severe acute respiratory syndrome, kidney failure and even death. Standard
recommendations to prevent the spread of infection include regular hand washing,
covering the mouth and nose when coughing and sneezing, thoroughly cooking meat and
eggs, and avoiding close contact with anyone showing symptoms of respiratory illness
such as coughing and sneezing.
Our team of 4 members carried out online research and a survey on the Covid-19
pandemic. The data we studied run from 22 January 2020 until the end of April 2020,
and the two countries we chose were Portugal and Thailand. We collected data on the
following aspects of the Covid-19 pandemic in Portugal and Thailand :
We chose these two countries because we believe that both of them can provide all
the information that we need. We were also instructed to analyse the collected data
for each country using Microsoft Excel. The analysis covers descriptive statistics,
hypothesis testing, charts, anova and regression.
2.0 RESULTS AND DISCUSSION
Figure 2: Histogram of daily new cases in Thailand
B) DESCRIPTIVE STATISTIC OF NEW CASES OF PORTUGAL AND
THAILAND
a) The data collected for Portugal are treated as approximately normally distributed
because the skewness and kurtosis readings lie close to the range -1 to 1.
b) The data collected for Thailand are treated as approximately normally distributed
because the skewness and kurtosis readings lie between -3 and 3.
c) The mean of new cases in Portugal is higher than the mean of new cases in
Thailand. This indicates that Portugal recorded more new cases per day on average
than Thailand.
We plotted the points on a scatter plot and added a trendline to show the linearity
of the regression chart. We also added the equation (Y=mX+c) derived from the chart
and the R square value of the chart. Based on the scatter plot, we can conclude that :
a) 16.78 is the gradient of the trendline.
b) 1582.4 is the y-intercept of the regression chart.
c) It has weak correlation.
The R square value is confirmed by the summary output as follows :
SUMMARY OUTPUT FOR PORTUGAL (Constant to zero)
Regression Statistics
Multiple R 0.767292607
R Square 0.588737944
Adjusted R Square 0.578737944
Standard Error 6561.791185
Observations 101
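The gradient, y-intercept and R square that a linear trendline reports come from an
ordinary least-squares fit. The following is a minimal Python sketch of that
calculation, assuming paired lists of x and y values. The function name fit_line and
the sample numbers are illustrative assumptions, not the project's data or Excel's
own code.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + c; returns (m, c, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx                 # gradient of the trendline
    c = mean_y - m * mean_x       # y-intercept
    r_squared = (sxy * sxy) / (sxx * syy)
    return m, c, r_squared


# Hypothetical values for illustration only (not taken from the report).
daily = [0, 2, 5, 9, 14]
cumulative = [0, 2, 7, 16, 30]
m, c, r2 = fit_line(daily, cumulative)
print(f"Y = {m:.4f}X + {c:.4f}, R square = {r2:.4f}")
```

With an intercept in the model, Multiple R is the square root of R square, which is
one quick way to cross-check a summary output against its chart.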
We plotted the points on a scatter plot and added a trendline to show the linearity
of the regression chart. We also added the equation (Y=mX+c) derived from the chart
and the R square value of the chart. Based on the scatter plot, we can conclude that :
a) 9.9421 is the gradient of the trendline.
b) 611.46 is the y-intercept of the regression chart.
c) It has weak correlation.
The R square value is confirmed by the summary output as follows :
SUMMARY OUTPUT FOR THAILAND
Regression Statistics
Multiple R 0.369937428
R Square 0.136853701
Adjusted R Square 0.128135051
Standard Error 1077.567271
Observations 101
Table 7 : Descriptive statistics used for the t-Test and F-Test
t-Test : Two-Sample Assuming Equal Variances (Two tail)
Portugal Thailand
Mean 247.970297 29.24752475
Variance 106150.8891 1843.928119
Observations 101 101
Pooled Variance 53997.40861
Hypothesized Mean Difference 0
df 200
t Stat 6.688877338
P(T<=t) one-tail 1.10162E-10
t Critical one-tail 1.652508101
P(T<=t) two-tail 2.20323E-10 (<0.05)
t Critical two-tail 1.971896224
Table 8 : t-Test two sample assuming equal variances (two tail) for Portugal and
Thailand
Graph 1 : Two tail graph for the t-Test, with rejection regions of area a/2 beyond
t = -1.972 and t = 1.972 and the test statistic at t = 6.689
Conclusion : The t-Stat value (6.689) is greater than the t-Critical value (1.972), so
it lies in the rejection region and Ho is rejected. The value of P(T<=t) is also lower
than 0.05, thus there is a difference in the mean of new cases between Portugal and
Thailand.
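The pooled variance and t Stat in Table 8 can be reproduced from the means, variances
and observation counts alone. The sketch below is a minimal Python check under the
equal-variance assumption. The function name pooled_t is our own illustrative choice,
and the critical value is copied from Table 8 rather than computed.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Two-sample t statistic assuming equal variances (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2 - hypothesized_diff) / std_error
    return pooled_var, t_stat, df


# Summary statistics for new cases, taken from Table 8 (Portugal, then Thailand).
pooled_var, t_stat, df = pooled_t(247.970297, 106150.8891, 101,
                                  29.24752475, 1843.928119, 101)

T_CRIT_TWO_TAIL = 1.971896224  # copied from Table 8, not computed here
reject_h0 = abs(t_stat) > T_CRIT_TWO_TAIL
print(pooled_var, t_stat, df, reject_h0)
```

The two tail decision compares the absolute value of t Stat with the two tail
critical value, which is why the sign of the difference does not matter here.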
t-Test: Two-Sample Assuming Equal Variances (One tail)
Thailand Portugal
Mean 29.24752475 247.970297
Variance 1843.928119 106150.8891
Observations 101 101
Pooled Variance 53997.40861
Hypothesized Mean Difference 0
df 200
t Stat -6.688877338
P(T<=t) one-tail 1.10162E-10 (<0.05)
t Critical one-tail 1.652508101
P(T<=t) two-tail 2.20323E-10
t Critical two-tail 1.971896224
Table 9 : t-Test two sample assuming equal variances (one tail) for Portugal and
Thailand
Graph 2 : One tail graph for the t-Test, with the rejection region to the left of
t = -1.653 and the test statistic at t = -6.689
Conclusion : As the value of t-Stat (-6.689) is less than the critical value (-1.653),
it lies in the rejection region, and the value of P(T<=t) is lower than 0.05, thus Ho
is rejected. This shows that the mean of new cases of covid-19 in Thailand is less
than the mean of new cases of covid-19 in Portugal.
Portugal Thailand
Mean 247.970297 29.24752475
Variance 106150.8891 1843.928119
Observations 101 101
df 100 100
F 57.56780214
P(F<=f) one-tail 9.13477E-61 (<0.05)
F Critical one-tail 1.391719552
Table 10 : F-Test two sample for variances for Portugal and Thailand
Graph 3 : One tail graph for the F-Test, with critical value F = 1.392 and the test
statistic at F = 57.568
Conclusion : As the value of F (57.568) is greater than the value of F-Critical
(1.392), it falls in the rejection region, and the value of P(F<=f) is less than 0.05,
thus Ho is rejected. We can conclude that there is a significant difference in the
variance of new cases of covid-19 in Portugal and Thailand.
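The F value in an F-Test two sample for variances is simply the ratio of the two
sample variances. The sketch below is a minimal Python check using the variances
reported in this document. The function name f_statistic is an illustrative
assumption, and the critical value is copied from the table rather than computed.

```python
def f_statistic(var_numerator, var_denominator):
    """F statistic for comparing two sample variances."""
    return var_numerator / var_denominator


# Variances copied from the report's tables (Portugal over Thailand).
f_new_cases = f_statistic(106150.8891, 1843.928119)
f_deaths = f_statistic(168.4063366, 0.931881188)

F_CRIT_ONE_TAIL = 1.391719552  # copied from the F-Test tables (df 100 and 100)
for label, f_value in (("new cases", f_new_cases), ("deaths", f_deaths)):
    print(label, round(f_value, 4), "reject Ho:", f_value > F_CRIT_ONE_TAIL)
```

Ho (equal variances) is rejected when F is greater than the one tail critical value,
not when it is smaller.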
We divided the number of new cases for Portugal and Thailand by month, because this
makes it easier to compare the monthly means for each country. Before carrying out
the anova (single factor) analysis, we stated two hypotheses :
Hypothesis Null: There is no difference in the mean number of new cases for
Portugal in February, March and April (µF=µM=µA).
Hypothesis Alternative: At least one monthly mean differs from the others (µF, µM
and µA are not all equal).
SUMMARY
Groups      Count   Sum      Average     Variance
February    29      0        0           0
March       31      7443     240.0968    92279.62
April       30      17602    586.7333    66979.17
ANOVA
Source of Variation   SS           df   MS         F          P-value    F crit
Between Groups        5145253.48   2    2572627    47.51194   1.13E-14   3.101296
Within Groups         4710784.58   87   54146.95
Total                 9856038.06   89
Table 11 : Anova (single factor) for the number of new cases for Portugal
Conclusion : As the P-value is less than 0.05 and the value of F (47.512) is larger
than the value of F-Critical (3.101), F lies in the rejection region, thus Ho is
rejected. This means that at least one monthly mean is different.
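The ANOVA rows of Table 11 follow from the SUMMARY rows. The sketch below is a
minimal Python reconstruction that takes each group's count, average and variance
and returns the sums of squares and F. The function name one_way_anova is an
illustrative assumption, and small differences from Table 11 are expected because
the variances are rounded.

```python
def one_way_anova(groups):
    """Single-factor ANOVA from (count, mean, sample_variance) per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups: each sample variance scaled back to a sum of squares.
    ss_within = sum((n - 1) * var for n, _, var in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_value


# Portugal new cases by month, from the SUMMARY rows of Table 11.
portugal_new_cases = [
    (29, 0.0, 0.0),               # February
    (31, 7443 / 31, 92279.62),    # March
    (30, 17602 / 30, 66979.17),   # April
]
ss_b, ss_w, df_b, df_w, f_value = one_way_anova(portugal_new_cases)
print(ss_b, ss_w, df_b, df_w, f_value)
```

F is then compared with F crit (3.101296 in Table 11), and Ho is rejected when F is
the larger of the two.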
Hypothesis Null: There is no difference in the mean number of new cases of
Thailand in February, March and April. (µF=µM=µA)
Hypothesis Alternative: At least one monthly mean differs from the others. (µF, µM
and µA are not all equal)
SUMMARY
Groups      Count   Sum     Average     Variance
February    29      42      1.448276    39.68473
March       31      1609    51.90323    3182.624
April       30      1303    43.43333    1173.564
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        43209.91   2    21604.95   14.38971   3.99E-06   3.101296
Within Groups         130623.2   87   1501.417
Total                 173833.2   89
Table 12 : Anova (single factor) for the number of new cases for Thailand
Conclusion : As the P-value is less than 0.05 and the value of F (14.390) is larger
than the value of F-Critical (3.101), F lies in the rejection region, thus Ho is
rejected. This means that at least one monthly mean is different.
THIPAN A/L NATHAN
[Chart : Daily Death. Y-axis : Number of Death (0 to 40). X-axis : Date (22/01/2020
to 29/04/2020).]
Figure 5 : Histogram of the number of deaths in Portugal.
[Chart : Daily Death. Y-axis : Number of Death (0 to 4.5). X-axis : Date (21/01/2020
to 28/04/2020).]
Figure 6 : Histogram of the number of deaths in Thailand
Number Of Death
Descriptive Statistic Thailand
Mean 0.524752475
Standard Error 0.096054914
Median 0
Mode 0
Standard Deviation 0.965339934
Sample Variance 0.931881188
Kurtosis 2.385299635
Skewness 1.83385477
Range 4
Minimum 0
Maximum 4
Sum 53
Count 101
Table 14 : Descriptive statistics of the number of deaths for Thailand
d) The data collected for Portugal are treated as approximately normally distributed
because the skewness and kurtosis readings lie close to the range -1 to 1.
e) For Thailand, the kurtosis (2.385) lies between -3 and 3, but the skewness (1.834)
lies outside the range -1 to 1, so the data are skewed to the right and only roughly
follow a normal distribution.
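The skewness and kurtosis readings used for these normality checks can be computed
directly from the raw daily counts. The following is a minimal Python sketch,
assuming the sample-adjusted formulas behind Excel's SKEW and KURT functions. The
function name and the sample list are illustrative assumptions, not the project's
data.

```python
import math


def sample_skew_kurt(values):
    """Sample skewness and excess kurtosis (SKEW/KURT-style formulas)."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (divides by n - 1).
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    z3 = sum(((v - mean) / s) ** 3 for v in values)
    z4 = sum(((v - mean) / s) ** 4 for v in values)
    skew = n / ((n - 1) * (n - 2)) * z3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))) * z4 \
        - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return skew, kurt


# Hypothetical symmetric sample for illustration only.
skew, kurt = sample_skew_kurt([1, 2, 3, 4, 5])
print(skew, kurt)
```

A perfectly symmetric sample gives a skewness of 0, while a long right tail, as in
daily counts that are mostly 0, pushes the skewness above 1.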
[Chart : Portugal. Y-axis : Cumulative (0 to 1200). X-axis : Number of death (0 to
40).]
Figure 7 : Regression chart for the death cases of Portugal
We plotted the points on a scatter plot and added a trendline to show the linearity
of the regression chart. We added the equation (Y=mX+c) derived from the chart and
the R square value of the chart. Based on the scatter plot, we can conclude that :
a) 18.887 is the gradient of the trendline.
b) 1.3504 is the y-intercept of the regression chart.
c) It has weak correlation.
The R square value is confirmed by the summary output as follows :
R Square 0.649142327
Adjusted R Square 0.64559831
Standard Error 181.1042216
Observations 101
Table 16 : Summary output (constant to zero) for the number of deaths of Portugal
[Chart : Thailand, with linear trendline. Y-axis up to 60. X-axis : Number of death
(0 to 4.5).]
Figure 8 : Regression chart for the number of deaths of Thailand
We plotted the points on a scatter plot and added a trendline to show the linearity
of the regression chart. We added the equation (Y=mX+c) derived from the chart and
the R square value of the chart. Based on the scatter plot, we can conclude that :
a) 9.2339 is the gradient of the trendline.
b) 7.5605 is the y-intercept of the regression chart.
c) It has weak correlation.
The R square value is confirmed by the summary output as follows :
Table 18 : Summary output (constant to zero) for the number of deaths of Thailand
D) HYPOTHESIS TESTING FOR DEATH OF PORTUGAL AND
THAILAND
Number Of Death
Descriptive Statistic   Portugal       Thailand
Mean                    9.792079208    0.524752475
Standard Error          1.29127436     0.096054914
Median                  0              0
Mode                    0              0
Standard Deviation      12.97714671    0.965339934
Sample Variance         168.4063366    0.931881188
Kurtosis                -1.05444266    2.385299635
Skewness                0.80065278     1.83385477
Range                   37             4
Minimum                 0              0
Maximum                 37             4
Sum                     989            53
Count                   101            101
Table 19 : Descriptive statistics used to obtain the t-Test and F-Test
t-Test: Two-Sample Assuming Equal Variances (Two tail)
Portugal Thailand
Mean 9.792079208 0.524752475
Variance 168.4063366 0.931881188
Observations 101 101
Pooled Variance 84.66910891
Hypothesized Mean Difference 0
df 200
t Stat 7.157109627
P(T<=t) one-tail 7.6554E-12
t Critical one-tail 1.652508101
P(T<=t) two-tail 1.53108E-11 (<0.05)
t Critical two-tail 1.971896224
Table 20 : t-Test two sample assuming equal variances for the number of deaths of
Portugal and Thailand
Graph 4 : Two tail graph for the t-Test, with rejection regions of area a/2 beyond
t = -1.972 and t = 1.972 and the test statistic at t = 7.157
Conclusion : The value of t-Stat (7.157) is greater than the value of t-Critical
(1.972), so it falls in the rejection region as shown in Graph 4, while the value of
P(T<=t) is less than 0.05. Thus we can conclude that Ho is rejected. It means that
there is a difference between the mean number of deaths of Portugal and Thailand.
t-Test: Two-Sample Assuming Equal Variances (One tail)
Portugal Thailand
Mean 9.792079208 0.524752475
Variance 168.4063366 0.931881188
Observations 101 101
Pooled Variance 84.66910891
Hypothesized Mean Difference 9
df 200
t Stat 0.206455085
P(T<=t) one-tail 0.418322734 (>0.05)
t Critical one-tail 1.652508101
P(T<=t) two-tail 0.836645468
t Critical two-tail 1.971896224
Table 21 : t-Test two sample assuming equal variances (one tail) for the number of
deaths for Portugal and Thailand
Graph 5 : One tail graph for the t-Test, with critical value t = 1.653 and the test
statistic at t = 0.206
Conclusion : The value of t-Stat (0.206) is not in the rejection region, as shown in
Graph 5, and the value of P(T<=t) is larger than 0.05, as shown in Table 21, so the
t-Test fails to reject Ho. Because the hypothesized mean difference in Table 21 is 9,
this means there is no evidence that the mean number of deaths in Portugal exceeds
the mean in Thailand by more than 9.
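The one tail result in Table 21 depends on the hypothesized mean difference of 9,
which is subtracted from the observed difference before dividing by the standard
error. The sketch below is a minimal Python check of that t Stat. The function name
is an illustrative assumption, and the pooled variance and critical value are copied
from Table 21.

```python
import math


def t_with_hypothesized_diff(mean_a, mean_b, pooled_var, n_a, n_b, diff):
    """Pooled t statistic for Ho: mean_a - mean_b = diff."""
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b - diff) / std_error


# Deaths: Portugal against Thailand, hypothesized difference 9 (Table 21).
t_deaths = t_with_hypothesized_diff(9.792079208, 0.524752475,
                                    84.66910891, 101, 101, 9)

T_CRIT_ONE_TAIL = 1.652508101  # copied from Table 21
reject_h0 = t_deaths > T_CRIT_ONE_TAIL
print(round(t_deaths, 6), reject_h0)
```

With a hypothesized difference of 0 the same data give t Stat = 7.157, so the choice
of hypothesized difference changes what failing to reject Ho means.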
F-Test Two-Sample for Variances (Variance Test)
Portugal Thailand
Mean 9.792079208 0.524752475
Variance 168.4063366 0.931881188
Observations 101 101
df 100 100
F 180.7165321
P(F<=f) one-tail 4.1494E-85 (<0.05)
F Critical one-tail 1.391719552
[Graph : one tail graph for the F-Test, with critical value F = 1.392 and the test
statistic at F = 180.717]
We divided the number of deaths for Portugal and Thailand by month, because this
makes it easier to compare the monthly means for each country. Before carrying out
the anova (single factor) analysis, we stated two hypotheses :
Hypothesis Null: There is no difference in the mean number of deaths of Portugal
in February, March and April. (µF=µM=µA)
Hypothesis Alternative: At least one monthly mean differs from the others. (µF, µM
and µA are not all equal)
SUMMARY
Groups      Count   Sum    Average     Variance
February    29      0      0           0
March       31      160    5.16129     60.93978
April       30      829    27.63333    33.20575
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        12865.83   2    6432.914   200.5129   2.64E-33   3.101296
Within Groups         2791.16    87   32.0823
Total                 15656.99   89
Table 23 : Anova (single factor) for the number of deaths of Portugal
Conclusion : The value of F (200.513) is greater than the value of F-critical (3.101),
so F falls in the rejection area. The value of P is also less than 0.05, thus Ho is
rejected. This shows that at least one monthly mean is different.
Hypothesis Null: There is no difference in the mean number of deaths of Thailand
in February, March and April. (µF=µM=µA)
Hypothesis Alternative: At least one monthly mean differs from the others. (µF, µM
and µA are not all equal)
SUMMARY
Groups      Count   Sum   Average     Variance
February    29      0     0           0
March       31      9     0.290323    0.47957
April       30      44    1.466667    1.36092
ANOVA
Source of Variation   SS         df   MS         F          P-value   F crit
Between Groups        35.93513   2    17.96756   29.02635   2.2E-10   3.101296
Within Groups         53.85376   87   0.619009
Total                 89.78889   89
Table 24 : Anova (single factor) for the number of deaths of Thailand
Conclusion : Table 24 shows that the value of F is greater than the value of
F-critical and the value of P is less than the significance level of 0.05, so Ho is
rejected. This means that at least one monthly mean is different.
ATIF
A) HISTOGRAM CHART OF NUMBER OF RECOVERIES FOR
PORTUGAL AND THAILAND.
[Chart : Number of Recoveries. Y-axis : Number of Recovery (0 to 350). X-axis : Date
(22/01/2020 to 29/04/2020).]
[Chart : Recoveries. Y-axis up to 300. X-axis : Date (21/01/2020 to 28/04/2020).]
Standard Deviation 47.35791673
Sample Variance 2242.772277
Kurtosis 4.140355595
Skewness 1.995964156
Range 244
Minimum 0
Maximum 244
Sum 2678
Count 101
[Chart : Portugal, with linear trendline f(x) = 5.41475632289625 x + 105.306783619016,
R² = 0.338058907541487. Y-axis : Cumulative (0 to 1600). X-axis : Number of recovered
(0 to 350).]
Figure 11 : Regression chart for the number of recoveries in Portugal
We plotted the points on a scatter plot and added a trendline to show the linearity
of the regression chart. We added the equation (Y=mX+c) derived from the chart and
the R square value of the chart. Based on the scatter plot, we can conclude that :
a) 5.4148 is the gradient of the trendline.
b) 105.31 is the y-intercept of the regression chart.
The R square value is confirmed by the summary output as follows :
Table 27 : Summary output for the number of recoveries in Portugal
Table 28 : Summary output (constant to zero) for the number of recoveries in Portugal
[Chart : Thailand, with linear trendline f(x) = 10.7544994260992 x + 209.301490464418,
R² = 0.37442549975286. Y-axis : Cumulative (0 to 3000). X-axis : Number of recovered
(0 to 300).]
Figure 12 : Regression chart for the number of recoveries in Thailand
We plotted the points on a scatter plot and added a trendline to show the linearity
of the regression chart. We added the equation (Y=mX+c) derived from the chart and
the R square value of the chart. Based on the scatter plot, we can conclude that :
a) 10.754 is the gradient of the trendline.
b) 209.3 is the y-intercept of the regression chart.
This equation and the R square value are confirmed by the summary output as follows :
Table 29 : Summary output for the number of recoveries in Thailand
Table 30 : Summary output (constant to zero) for the number of recoveries in Thailand
Number Of Recovered
Descriptive Statistic Number Of Recovered in Portugal Thailand
Portugal Thailand
Mean 15.03960396 26.51485149
Variance 1793.458416 2242.772277
Observations 101 101
Pooled Variance 2018.115347
Hypothesized Mean Difference 0
df 200
t Stat -1.81524295
P(T<=t) one-tail 0.035492228
t Critical one-tail 1.652508101
P(T<=t) two-tail 0.070984457 (>0.05)
t Critical two-tail 1.971896224
Table 32 : t-Test obtained for the number of recoveries for Portugal and Thailand
Graph 7 : Two tail graph for the t-Test, with rejection regions of area a/2 beyond
t = -1.972 and t = 1.972; the t Stat in Table 32 is -1.815
Conclusion : Referring to Table 32, the two tail value of P(T<=t) is greater than
0.05 and the value of t-Stat is not in the rejection region, as shown in Graph 7.
Thus we fail to reject Ho. So there is no significant difference in the mean of
recoveries between Portugal and Thailand.
Thailand Portugal
Mean 26.51485149 15.03960396
Variance 2242.772277 1793.458416
Observations 101 101
Pooled Variance 2018.115347
Hypothesized Mean Difference 11
df 200
t Stat 0.07517831
P(T<=t) one-tail 0.470073992 (>0.05)
t Critical one-tail 1.652508101
P(T<=t) two-tail 0.940147984
t Critical two-tail 1.971896224
Graph 8 : One tail graph for the t-Test, with critical value t = 1.653 and the test
statistic at t = 0.075
Conclusion : Referring to Table 33, the value of P(T<=t) is greater than 0.05 and the
value of t-Stat is not in the rejection region, as shown in Graph 8. This shows that
we fail to reject Ho. Because the hypothesized mean difference is 11, there is no
evidence that the mean number of recoveries in Thailand exceeds the mean in Portugal
by more than 11.
Thailand Portugal
Mean 26.51485149 15.03960396
Variance 2242.772277 1793.458416
Observations 101 101
df 100 100
F 1.250529289
P(F<=f) one-tail 0.132663515 (>0.05)
F Critical one-tail 1.391719552
Graph 9 : One tail graph for the F-Test, with critical value F = 1.392 and the test
statistic at F = 1.251
Conclusion : Table 34 shows that the value of P(F<=f) is greater than 0.05 and Graph
9 shows that the value of F-Stat is not in the rejection area. Thus we fail to reject
Ho, which means there is no significant difference in the variance of recoveries of
Portugal and Thailand.
We divided the number of recoveries for Portugal and Thailand by month, because this
makes it easier to compare the monthly means for each country. Before carrying out
the anova (single factor) analysis, we stated two hypotheses :
Hypothesis Null: There is no difference in the mean number of recovered for
Portugal in February, March and April. (µF=µM=µA)
Hypothesis Alternative: At least one monthly mean differs from the others. (µF, µM
and µA are not all equal)
SUMMARY
Groups      Count   Sum     Average     Variance
February    29      0       0           0
March       31      43      1.387097    17.71183
April       30      1476    49.2        4447.614
ANOVA
Source of Variation   SS         df   MS         F         P-value   F crit
Between Groups        47041.5    2    23520.75   15.8001   1.4E-06   3.101296
Within Groups         129512.2   87   1488.645
Total                 176553.7   89
SUMMARY
Groups      Count   Sum     Average     Variance
February    29      19      0.655172    2.948276
March       31      314     10.12903    928.5828
April       30      2345    78.16667    2787.937
ANOVA
Source of Variation   SS         df   MS         F          P-value   F crit
Between Groups        106808.4   2    53404.21   42.70758   1.2E-13   3.101296
Within Groups         108790.2   87   1250.462
Total                 215598.6   89
Conclusion : Table 36 shows that the value of F (42.708) is greater than the value of
F-critical (3.101), so it falls in the rejection area. The value of P is also less
than 0.05, so Ho is rejected. Thus, at least one monthly mean is different.
AIMAN
IN PORTUGAL AND THAILAND
A) HISTOGRAM CHART OF NEW CASES OF MALE IN PORTUGAL AND
THAILAND
Figure 12: Histogram of confirmed cases for male in Portugal
[Chart : Y-axis : Number of cases (0 to 100). X-axis : Age (groups 00-09 to 80+).]
Figure 13: Histogram of confirmed cases for male in Thailand
[Chart : Histogram of Confirmed Cases for Female. Y-axis up to 3000. X-axis : Age
(groups 00-09 to 80+).]
Figure 14: Histogram of confirmed cases for female in Portugal
[Chart : Y-axis : Number of cases (0 to 100). X-axis : Age (groups 00-09 to 80+).]
Figure 15: Histogram of confirmed cases for female in Thailand
i) MALE
PORTUGAL THAILAND
Mean 1142.111111 39.555556
Standard Error 178.0611977 12.461174
Median 1283 34
Mode #N/A #N/A
Standard Deviation 534.1835931 37.383523
Sample Variance 285352.1111 1397.5278
Kurtosis -0.046109336 -0.533096
Skewness -1.079292505 0.6726338
Range 1471 108
Minimum 199 0
Maximum 1670 108
Sum 10279 356
Table 37: Descriptive statistics of daily new cases of male
ii) FEMALE
PORTUGAL THAILAND
Mean 1656.777778 39.55556
Standard Error 307.0456796 12.46117
Median 1675 34
Mode #N/A #N/A
Standard Deviation 921.1370389 37.38352
Sample Variance 848493.4444 1397.528
Kurtosis -1.089717246 -0.5331
Skewness -0.479641169 0.672634
Range 2465 108
Minimum 212 0
Maximum 2677 108
Sum 14911 356
Table 38: Descriptive statistics of daily new cases of female
a) The data collected for male are treated as approximately normally distributed
because the skewness and kurtosis readings lie close to the range -1 to 1.
b) The data collected for female are treated as approximately normally distributed
because the skewness and kurtosis readings lie between -3 and 3.
C) REGRESSION OF NEW CASES OF AGE AND SEX PORTUGAL AND
THAILAND
[Chart : Confirmed Cases for Male, with linear trendline
f(x) = 114.133333333333 x + 571.444444444444, R² = 0.342377468149487. Y-axis :
Confirmed Cases (0 to 1600). X-axis : Age (0 to 10).]
Figure 16: Regression chart for the confirmed cases of male in Portugal
We plotted the points on a scatter plot and added a trendline to show the linearity
of the regression chart. We also added the equation (Y=mX+c) derived from the chart
and the R square value of the chart. Based on the scatter plot, we can conclude that :
a) 114.13 is the gradient of the trendline.
b) 571.44 is the y-intercept of the regression chart.
The R square value is confirmed by the summary output as follows :
Regression Statistics
Multiple R 0.993841204
R Square 0.98772034
Adjusted R Square 0.844863197
Standard Error 30.40320628
Observations 8
[Chart : linear trendline f(x) = 206.9 x + 622.277777777778,
R² = 0.378384862136694. Y-axis : Confirmed Cases (0 to 2500). X-axis : Age (0 to 10).]
Figure 17 : Regression chart for the confirmed cases of female in Portugal
We plotted the points on a scatter plot and added a trendline to show the linearity
of the regression chart. We also added the equation (Y=mX+c) derived from the chart
and the R square value of the chart. Based on the scatter plot, we can conclude that :
a) 206.9 is the gradient of the trendline.
b) 622.28 is the y-intercept of the regression chart.
The R square value is confirmed by the summary output as follows :
Table 41 : Summary output for male in Portugal
[Chart : MALE, with linear trendline f(x) = 5.86315789473684 x,
R² = 0.387829025497002. Y-axis : Number of cases (0 to 100). X-axis : Age (0 to 10).]
Figure 18: Regression chart for the confirmed cases of male in Thailand
We plotted the points on a scatter plot and added a trendline to show the linearity
of the regression chart. We also added the equation (Y=mX+c) derived from the chart
and the R square value of the chart. Based on the scatter plot, we can conclude that :
a) 5.863 is the gradient of the trendline.
b) -4.186 is the y-intercept of the regression chart.
The R square value is confirmed by the summary output as follows :
Multiple R 0.971751477
R Square 0.944300934
Adjusted R Square 0.935017756
Standard Error 17.10378962
Observations 8
[Chart : Female, with linear trendline f(x) = 5.86315789473684 x,
R² = 0.387829025497002. Y-axis : Number of cases (0 to 100). X-axis : Age (0 to 10).]
The R square value is confirmed by the summary output as follows :
Portugal Thailand
Mean 55.88888889 39.55555556
Variance 9906.611111 1397.527778
Observations 9 9
Pooled Variance 5652.069444
Hypothesized Mean Difference 0
df 16
t Stat 0.460868831
P(T<=t) one-tail 0.325547603
t Critical one-tail 1.745883676
P(T<=t) two-tail 0.651095206
t Critical two-tail 2.119905299
Table 47 : t-Test two sample assuming equal variances for the number of confirmed
cases of male in Portugal and Thailand
Graph 10 : Two tail graph for the t-Test, with rejection regions of area a/2 beyond
t = -2.120 and t = 2.120 and the test statistic at t = 0.461
Conclusion : The value of t-Stat (0.461) is less than the value of t-Critical (2.120),
so it does not fall in the rejection region, as shown in Graph 10, while the value of
P(T<=t) is more than 0.05. Thus we fail to reject Ho. It means that there is no
significant difference between the mean number of confirmed cases of male in
Portugal and Thailand.
t-Test: Two-Sample Assuming Equal Variances (One tail)
confirmed cases for males in Thailand > 0)
Portugal Thailand
Mean 55.88888889 39.55555556
Variance 9906.611111 1397.527778
Observations 9 9
Pooled Variance 5652.069444
Hypothesized Mean Difference 0
df 16
t Stat 0.460868831
P(T<=t) one-tail 0.325547603
t Critical one-tail 1.745883676
P(T<=t) two-tail 0.651095206
t Critical two-tail 2.119905299
Table 48 : t-Test two sample assuming equal variances (one tail) for the number of
confirmed cases for male in Portugal and Thailand
Graph 11 : One tail graph obtained from the t-Test, marked at t = -2.120, t = 0 and
t = 2.120, with the test statistic at t = 0.461
Conclusion : The value of t-Stat (0.461) is not in the rejection region, as shown in
Graph 11, and the value of P(T<=t) is larger than 0.05, as shown in Table 48, so the
t-Test fails to reject Ho. It means that there is no significant difference in the
mean of confirmed cases of male in Portugal and Thailand.
F-Test Two-Sample for Variances (Variance Test)
Portugal Thailand
Mean 55.88888889 39.55555556
Variance 9906.611111 1397.527778
Observations 9 9
df 8 8
F 7.088668482
P(F<=f) one-tail 0.005991451
F Critical one-tail 3.438101233
Table 49 : Result obtained for the F-Test
[Graph : one tail graph for the F-Test, with critical value F = 3.438 and the test
statistic at F = 7.08867]
Portugal Thailand
Mean 57.77777778 25.66666667
Variance 16434.19444 951.5
Observations 9 9
Pooled Variance 8692.847222
Hypothesized Mean Difference 0
df 16
t Stat 0.730601512
P(T<=t) one-tail 0.237792867
t Critical one-tail 1.745883676
P(T<=t) two-tail 0.475585735
t Critical two-tail 2.119905299
Table 50 : t-Test two sample assuming equal variances for the number of confirmed
cases of female in Portugal and Thailand
Graph 13 : Two tail graph for the t-Test, with rejection regions of area a/2 beyond
t = -2.120 and t = 2.120 and the test statistic at t = 0.731
Conclusion : The value of t-Stat (0.731) is less than the value of t-Critical (2.120),
so it does not fall in the rejection region, as shown in Graph 13, while the value of
P(T<=t) is more than 0.05. Thus we fail to reject Ho. It means that there is no
significant difference between the mean number of confirmed cases of female in
Portugal and Thailand.
t-Test: Two-Sample Assuming Equal Variances (One tail)
Portugal Thailand
Mean 57.77777778 25.66666667
Variance 16434.19444 951.5
Observations 9 9
Pooled Variance 8692.847222
Hypothesized Mean Difference 0
df 16
t Stat 0.730601512
P(T<=t) one-tail 0.237792867
t Critical one-tail 1.745883676
P(T<=t) two-tail 0.475585735
t Critical two-tail 2.119905299
Table 51 : t-Test two sample assuming equal variances (one tail) for the number of
confirmed cases for female in Portugal and Thailand
Graph 14: One tail graph obtained from the t-Test, with critical value t = 1.746 and
the test statistic at t = 0.731
Conclusion : The value of t-Stat (0.731) is not in the rejection region, as shown in
Graph 14, and the value of P(T<=t) is larger than 0.05, as shown in Table 51, so the
t-Test fails to reject Ho. It means that there is no significant difference in the
mean of confirmed cases of female in Portugal and Thailand.
Portugal Thailand
Mean 57.77777778 25.66666667
Variance 16434.19444 951.5
Observations 9 9
df 8 8
F 17.27188066
P(F<=f) one-tail 0.000274611
F Critical one-tail 3.438101233
[Graph : one tail graph for the F-Test, with critical value F = 3.438 and the test
statistic at F = 17.271]
We divided the number of confirmed cases for Portugal and Thailand by sex, because
this makes it easier to compare the means for the two countries. Before carrying out
the anova (single factor) analysis, we stated two hypotheses :
Hypothesis Null: There is no difference in the mean number of confirmed cases
for male in Portugal and Thailand. (µP=µT)
Hypothesis Alternative: The two means are different. (µP≠µT)
SUMMARY
Groups      Count   Sum   Average    Variance
PORTUGAL    9       503   55.88889   9906.611111
THAILAND    9       356   39.55556   1397.527778
ANOVA
Source of Variation   SS            df   MS         F            P-value    F crit
Between Groups        1200.5        1    1200.5     0.21240008   0.651095   4.49399848
Within Groups         90433.11111   16   5652.069
Total                 91633.61111   17
Table 44: Anova for the number of confirmed cases for males
Conclusion : As we see in Table 44, the value of F (0.212) is less than the value of
F-critical (4.494), so it does not fall in the rejection region. The value of P (0.651)
is more than 0.05, so we fail to reject Ho. This shows that there is no significant
difference between the mean number of confirmed cases for males in Portugal and Thailand.
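The anova values for males can be rebuilt from the SUMMARY figures (count, sum and variance of each group). The following is a minimal Python sketch of single-factor anova from summary statistics; the function name one_way_anova and the tuple layout are ours.

```python
def one_way_anova(groups):
    """groups: list of (count, mean, variance) tuples.
    Returns (SS between, SS within, df between, df within, F)."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * var for n, _, var in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

# Confirmed cases for males: (count, average, variance) from the SUMMARY table
portugal = (9, 503 / 9, 9906.611111)
thailand = (9, 356 / 9, 1397.527778)
ss_b, ss_w, df_b, df_w, f_stat = one_way_anova([portugal, thailand])
print(f"SS between = {ss_b:.1f}, SS within = {ss_w:.3f}, F = {f_stat:.5f}")
```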
Hypothesis Null: There is no difference in the mean number of confirmed cases for
females in Portugal and Thailand. (µP=µT)
Hypothesis Alternative: The means are different. (µP≠µT)
SUMMARY
Groups      Count   Sum   Average    Variance
PORTUGAL    9       520   57.77778   16434.19
THAILAND    9       231   25.66667   951.5
ANOVA
Source of Variation   SS            df   MS         F          P-value       F crit
Between Groups        4640.055556   1    4640.056   0.533779   0.475585735   4.493998
Within Groups         139085.5556   16   8692.847
Total                 143725.6111   17
Table 53: Anova for the number of confirmed cases for females
Conclusion : Table 53 shows that the value of F (0.534) is less than the value of
F-critical (4.494), so it does not fall in the rejection region. The value of P (0.476)
is also more than 0.05, so we fail to reject Ho. Thus, there is no significant difference
between the mean number of confirmed cases for females in Portugal and Thailand.
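With only two groups, single-factor anova and the two-tailed equal-variance t-Test are equivalent: F equals the square of t Stat, and the anova P-value (0.4756) matches P(T<=t) two-tail. A short Python check with values copied from the female tables is sketched here; the variable names are ours.

```python
t_stat = 0.730601512  # t Stat from the t-Test for females
f_stat = 0.533779     # F from the anova for females

# With one numerator degree of freedom, F should equal t squared
# (up to the rounding of the printed values).
t_squared = t_stat ** 2
consistent = abs(t_squared - f_stat) < 1e-5
print(f"t^2 = {t_squared:.6f}, F = {f_stat:.6f}, consistent: {consistent}")
```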
[Histogram: Death Cases for Male; x-axis: Age (00-09 to 80+); y-axis: Death (up to 350)]
Figure 20: Histogram of deaths for males in Portugal
[Histogram: x-axis: AGE (00-09 to 80+); y-axis: DEATH (up to 10)]
Figure 21: Histogram of deaths for males in Thailand
[Histogram: Death Cases for Female; x-axis: Age (00-09 to 80+); y-axis: Death (up to 450)]
Figure 22: Histogram of deaths for females in Portugal
[Histogram: x-axis: AGE (00-09 to 80+); y-axis: DEATH]
Figure 23: Histogram of deaths for females in Thailand
iii) MALE
PORTUGAL THAILAND
Mean 55.88888889 4.555555556
Standard Error 33.17732008 1.302893267
Median 5 4
Mode 0 0
Standard Deviation 99.53196025 3.9086798
Sample Variance 9906.611111 15.27777778
Kurtosis 5.13381968 -1.16985124
Skewness 2.241846921 0.268466772
Range 299 11
Minimum 0 0
Maximum 299 11
Sum 503 41
iv) FEMALE
PORTUGAL THAILAND
Mean 57.77777778 1.555555556
Standard Error 42.73197404 0.689426314
Median 5 1
Mode 0 1
Standard Deviation 128.1959221 2.068278941
Sample Variance 16434.19444 4.277777778
Kurtosis 7.861939408 1.925210226
Skewness 2.768841661 1.642443615
Range 392 6
Minimum 0 0
Maximum 392 6
Sum 520 14
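Table 55: Descriptive statistics of deaths for females
a) The data collected for males in Thailand are approximately normally distributed,
because the skewness (0.268) and kurtosis (-1.170) are close to the range -1 to 1.
The data for males in Portugal are not, because the skewness (2.242) and kurtosis
(5.134) are well above 1.
b) The data collected for females in Thailand can be taken as approximately normally
distributed, because the skewness (1.642) and kurtosis (1.925) are between -3 and 3.
The data for females in Portugal are not, because the kurtosis (7.862) is above 3.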
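The skewness and kurtosis readings can be checked outside Excel. The following is a minimal Python sketch, assuming the Descriptive Statistics tool uses the same sample formulas as the SKEW and KURT worksheet functions; the function names are ours and the example list is hypothetical, not the project's death counts.

```python
import math

def sample_skewness(data):
    """Sample skewness, using the formula documented for Excel's SKEW."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

def sample_kurtosis(data):
    """Sample excess kurtosis, using the formula documented for Excel's KURT."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    fourth = sum(((x - mean) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

# Hypothetical illustration data, NOT the project's death counts
example = [1, 2, 3, 4, 5]
print(sample_skewness(example), sample_kurtosis(example))
```

Values near 0 for both readings are consistent with a normal distribution; a large positive skewness indicates a long right tail, as when most age groups have few deaths and one group has many.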
[Scatter plot with trendline: Death Cases for Male; x-axis: Age (0 to 10); y-axis: Death]
Figure 24: Regression chart for the death of males in Portugal
We plotted the points using a scatter plot and added a trendline to show the
linearity of the regression chart. We also added the equation (Y=mX+c) that can be
derived from the chart and the R square value of the chart. Based on the scatter
plot, we can conclude that:
a) 28.167 is the gradient of the trendline.
b) -84.944 is the y-intercept of the regression chart.
The R square value is confirmed by the summary output as follows:
Adjusted R Square 0.844863
Standard Error 30.40321
Observations 8
Table 57: Summary output (constant to zero) for males in Portugal
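The gradient, y-intercept and R square that a trendline reports come from an ordinary least-squares fit of Y=mX+c. The following is a minimal Python sketch of that calculation; the function name linear_fit is ours and the sample points are hypothetical, not the project's data. It fits a line with an intercept, so it does not reproduce a fit with the constant forced to zero.

```python
def linear_fit(xs, ys):
    """Least-squares fit of y = m*x + c. Returns (m, c, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    ss_res = sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return m, c, 1 - ss_res / ss_tot

# Hypothetical points lying near y = 2x + 1, NOT the project's data
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
m, c, r2 = linear_fit(xs, ys)
print(f"gradient = {m:.3f}, y-intercept = {c:.3f}, R square = {r2:.4f}")
```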
[Scatter plot with trendline: Death Case for Female; x-axis: Age (0 to 10); y-axis: Death]
Figure 25: Regression chart for the death of females in Portugal
We plotted the points using a scatter plot and added a trendline to show the
linearity of the regression chart. We also added the equation (Y=mX+c) that can be
derived from the chart and the R square value of the chart. Based on the scatter
plot, we can conclude that:
a) 31.45 is the gradient of the trendline.
b) -99.472 is the y-intercept of the regression chart.
The R square value is confirmed by the summary output as follows:
SUMMARY OUTPUT FOR FEMALE IN PORTUGAL (Constant to
zero)
Regression Statistics
Multiple R 0.99591
R Square 0.991837
Adjusted R Square 0.84898
Standard Error 24.78793
Observations 8
[Scatter plot with trendline: x-axis: AGE (0 to 10); y-axis: DEATH; series: Male,
Linear (Male); trendline f(x) = 0.929824561403509 x, R² = 0.79742235848521]
Figure 26: Regression chart for the death of males in Thailand
We plotted the points using a scatter plot and added a trendline to show the
linearity of the regression chart. We also added the equation (Y=mX+c) that can be
derived from the chart and the R square value of the chart. Based on the scatter
plot, we can conclude that:
a) 0.9298 is the gradient of the trendline.
b) -0.54362 is the y-intercept of the regression chart.
The R square value is confirmed by the summary output as follows:
SUMMARY OUTPUT FOR MALE IN THAILAND
Regression Statistics
Multiple R 0.972913
R Square 0.94656
Adjusted R Square 0.937654
Standard Error 1.39622
Observations 8
[Scatter plot with trendline: x-axis: AGE (0 to 10); y-axis: DEATH; series: Female,
Linear (Female); trendline f(x) = 0.31578947368421 x, R² = 0.507518796992481]
The R square value is confirmed by the summary output as follows:
Hypothesis Null : There is no difference in the mean of death of covid-19 victims for
males in Portugal and Thailand. (µP = µT ) (Mean of death for males in Portugal =
Mean of death for males in Thailand)
Hypothesis Alternative : There is a difference in the mean of death of covid-19 victims
for males in Portugal and Thailand. (µP ≠ µT ) (Mean of death for males in Portugal
≠ Mean of death for males in Thailand)
Portugal Thailand
Mean 4.555555556 55.8888889
Variance 15.27777778 9906.61111
Observations 9 9
Pooled Variance 4960.944444
Hypothesized Mean
Difference 0
df 16
t Stat -1.54605002
P(T<=t) one-tail 0.070820593
t Critical one-tail 1.745883676
P(T<=t) two-tail 0.141641186
t Critical two-tail 2.119905299
Table 63: t-Test (two-sample assuming equal variances) for the number of deaths
for males in Portugal and Thailand
[Graph 16: two-tailed t distribution with rejection regions of α/2 beyond t = -2.120
and t = 2.120; t Stat = -1.546]
Conclusion : The absolute value of t-Stat (1.546) is less than the value of t-Critical
(2.120), so it does not fall in the rejection region, as shown in Graph 16, and the
value of P(T<=t) two-tail (0.142) is more than 0.05. Thus we fail to reject Ho. It
means that there is no significant difference between the mean number of deaths for
males in Portugal and Thailand.
t-Test: Two-Sample Assuming Equal Variances (One tail)
Hypothesis Null : There is no difference in the mean of death of covid-19 victims for
males in Portugal and Thailand. (µP = µT ) (Mean of death for males in Portugal =
Mean of death for males in Thailand)
Hypothesis Alternative : The mean of death for male covid-19 victims in Thailand
is greater than the mean of death for male covid-19 victims in Portugal. (µT - µP >
0) (Mean of death for males in Thailand - Mean of death for males in Portugal > 0)
Thailand Portugal
Mean 55.88888889 4.55555556
Variance 9906.611111 15.2777778
Observations 9 9
Pooled Variance 4960.944444
Hypothesized Mean
Difference 0
df 16
t Stat 1.546050022
P(T<=t) one-tail 0.070820593
t Critical one-tail 1.745883676
P(T<=t) two-tail 0.141641186
t Critical two-tail 2.119905299
Table 64: t-Test (two-sample assuming equal variances, one tail) for the number of
deaths for males in Portugal and Thailand
[Graph 17: one-tailed t distribution obtained from the t-Test; rejection region beyond
t = 1.746; t Stat = 1.546]
Conclusion : The value of t-Stat (1.546) is not in the rejection region, as shown in
Graph 17, and the value of P(T<=t) one-tail (0.071) is larger than 0.05, as shown in
Table 64. Thus we fail to reject Ho. So there is not enough evidence that the mean of
death for male covid-19 victims in Thailand is greater than the mean of death for
male covid-19 victims in Portugal.
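The decision rule used in the two-tail and one-tail conclusions can be written down explicitly. The following is a minimal Python sketch; the function name reject_null and the tail labels are ours, and the t Stat and t Critical values are copied from Table 63 and Table 64.

```python
def reject_null(t_stat, t_critical, tail="two"):
    """Decide whether t_stat falls in the rejection region.
    tail: "two", "right" (Ha: difference > 0) or "left" (Ha: difference < 0)."""
    if tail == "two":
        return abs(t_stat) > t_critical
    if tail == "right":
        return t_stat > t_critical
    if tail == "left":
        return t_stat < -t_critical
    raise ValueError("tail must be 'two', 'right' or 'left'")

# Death of males, values from Table 63 (two tail) and Table 64 (one tail)
two_tail_decision = reject_null(-1.54605002, 2.119905299, tail="two")
one_tail_decision = reject_null(1.546050022, 1.745883676, tail="right")
print(two_tail_decision, one_tail_decision)
```

A result of False means the test fails to reject Ho; it does not show that the alternative hypothesis is true.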
F-Test Two-Sample for Variances (Variance Test)
Portugal Thailand
Mean 4.555555556 55.8888889
Variance 15.27777778 9906.61111
Observations 9 9
df 8 8
F 0.00154218
P(F<=f) one-tail 1.96031E-10
F Critical one-tail 0.290858219
Table 65: Result obtained for the F-Test
[Graph: F distribution; F Critical one-tail = 0.2909; F = 0.001542]
Hypothesis Null : There is no difference in the mean of death of covid-19 victims for
females in Portugal and Thailand. (µP = µT ) (Mean of death for females in Portugal
= Mean of death for females in Thailand)
Hypothesis Alternative : There is a difference in the mean of death of covid-19 victims
for females in Portugal and Thailand. (µP ≠ µT ) (Mean of death for females in
Portugal ≠ Mean of death for females in Thailand)
Portugal Thailand
Mean 55.88889 1.555556
Variance 9906.611 4.277778
Observations 9 9
Pooled Variance 4955.444
Hypothesized Mean
Difference 0
df 16
t Stat 1.637311
P(T<=t) one-tail 0.06054
t Critical one-tail 1.745884
P(T<=t) two-tail 0.12108
t Critical two-tail 2.119905
Table 66: t-Test (two-sample assuming equal variances) for the number of deaths
for females in Portugal and Thailand
[Graph 19: two-tailed t distribution with rejection regions of α/2 beyond t = -2.120
and t = 2.120; t Stat = 1.637]
Conclusion : The absolute value of t-Stat (1.637) is less than the value of t-Critical
(2.120), so it does not fall in the rejection region, as shown in Graph 19, and the
value of P(T<=t) two-tail (0.121) is more than 0.05. Thus we fail to reject Ho. It
means that there is no significant difference between the mean number of deaths for
females in Portugal and Thailand.
t-Test: Two-Sample Assuming Equal Variances (One tail)
Thailand Portugal
Mean 1.555556 55.88889
Variance 4.277778 9906.611
Observations 9 9
Pooled Variance 4955.444
Hypothesized Mean
Difference 0
df 16
t Stat -1.63731
P(T<=t) one-tail 0.06054
t Critical one-tail 1.745884
P(T<=t) two-tail 0.12108
t Critical two-tail 2.119905
Table 67: t-Test (two-sample assuming equal variances, one tail) for the number of
deaths for females in Portugal and Thailand
[Graph 20: one-tailed t distribution obtained from the t-Test; rejection region below
t = -1.746; t Stat = -1.637]
Conclusion : The value of t-Stat (-1.637) is not in the rejection region, as shown in
Graph 20, and the value of P(T<=t) one-tail (0.061) is larger than 0.05, as shown in
Table 67. Thus we fail to reject Ho. So there is not enough evidence of a difference
between the mean of death for female covid-19 victims in Thailand and the mean of
death for female covid-19 victims in Portugal.
Portugal Thailand
Mean 55.88889 1.555556
Variance 9906.611 4.277778
Observations 9 9
df 8 8
F 2315.831
P(F<=f) one-tail 1.21E-12
F Critical one-tail 3.438101
[Graph: F distribution; F Critical one-tail = 3.438; F = 2315.83]
We have divided the death data for Portugal and Thailand by sex, because this makes
it easier to compare the mean number of deaths for Portugal and Thailand within each
sex. Before carrying out the anova (single factor) analysis, we stated two
hypotheses, as follows:
Hypothesis Null: There is no difference in the mean number of deaths for males in
Portugal and Thailand. (µP=µT)
Hypothesis Alternative: The means are different. (µP≠µT)
SUMMARY
Groups      Count   Sum   Average    Variance
PORTUGAL    9       41    4.555556   15.27778
THAILAND    9       503   55.88889   9906.611
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        11858      1    11858      2.390271   0.141641   4.493998
Within Groups         79375.11   16   4960.944
Total                 91233.11   17
Table 69: Anova for the number of deaths for males
Conclusion : As we see in Table 69, the value of F (2.390) is less than the value of
F-critical (4.494), so it does not fall in the rejection region. The value of P (0.142)
is more than 0.05, so we fail to reject Ho. This shows that there is no significant
difference between the mean number of deaths for males in Portugal and Thailand.
Hypothesis Null: There is no difference in the mean number of deaths for females in
Portugal and Thailand. (µP=µT)
Hypothesis Alternative: The means are different. (µP≠µT)
SUMMARY
Groups      Count   Sum   Average    Variance
PORTUGAL    9       503   55.88889   9906.611
THAILAND    9       14    1.555556   4.277778
ANOVA
Source of Variation   SS         df   MS         F          P-value   F crit
Between Groups        13284.5    1    13284.5    2.680789   0.12108   4.493998
Within Groups         79287.11   16   4955.444
Total                 92571.61   17
Conclusion : The anova table above shows that the value of F (2.681) is less than the
value of F-critical (4.494), so it does not fall in the rejection region. The value of
P (0.121) is also more than 0.05, so we fail to reject Ho. Thus, there is no significant
difference between the mean number of deaths for females in Portugal and Thailand.
CONCLUSION
By carrying out this research, we learned more about the Covid-19 pandemic in
Portugal and Thailand. Moreover, we gained knowledge in predicting from and
analysing the data that we collected from February until April. For example, we
learned how to carry out the t-Test, which is used to test the means of the data.
We also learned how to tabulate the collected data systematically using Microsoft
Excel, which will be very useful for engineering students when they prepare their
final year project reports.
REFERENCES
1. https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Portugal
2. https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_Portugal
3. https://www.worldometers.info/coronavirus/
4. https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports/
5. https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Thailand
6. https://en.wikipedia.org/wiki/Template:COVID-19_pandemic_data/Thailand_medical_case