A2 ModelosCOVID19Mariel22 Lerma12314
Website:
https://repository.uaeh.edu.mx/revistas/index.php/MJMR/about
https://repository.uaeh.edu.mx/revistas/index.php./MJMR/issue/archive
Mexican Journal of Medical Research ICSa
Biannual Publication, Vol. 10, No. 20 (2022) 44-50 ISSN: 2007-5235
Abstract:
Since the start of the COVID-19 pandemic, the world has experienced a high incidence of infections in short periods of time, giving rise
to waves of contagion caused by the different variants of SARS-CoV-2. Health services and their personnel have been overwhelmed,
especially in the poorest countries. After two years, the pandemic continues and, according to experts, it is here to stay,
which highlights the importance of vaccines and of methods for detecting the disease in order to curb the number of infections, keep
the pandemic from spreading and thus keep the virus from mutating further. Detection tests have been scarce and expensive for most of
the population, so alternatives to laboratory methods could be a decisive factor in allowing people to self-isolate before infecting
others. One of the most effective alternatives has been the statistical prediction of a patient's COVID-19 diagnosis
from certain variables. This article identifies that the most common prediction models were developed from logistic
regression and machine learning, and that they have predicted COVID-19 test results with high success rates. The most important
predictor variables in the models developed in various regions of the world are identified, and the opportunities, limitations
and perspectives of this prediction method are discussed.
Keywords:
COVID-19, individual prediction, prediction models, symptoms, detection
Resumen (English translation):
Since the start of the COVID-19 pandemic, the world has seen large numbers of infections in short periods of time, giving rise
to waves of contagion caused by the different variants of SARS-CoV-2. Health services and their personnel have been
overwhelmed, especially in the poorest countries. After two years, the pandemic continues and, according to
experts, it is here to stay in a stationary form, so vaccines and disease detection methods are more important today than ever
for curbing the number of infections and keeping the pandemic from spreading and the virus from mutating further.
Detection tests have proved scarce and expensive for most of the population, so alternatives to
laboratory methods could be a decisive factor in allowing people to self-isolate before infecting others.
One of the most effective methods has been the statistical prediction of a patient's COVID-19 diagnosis from
certain variables. This article identifies that the most common prediction models were developed from
logistic regression and artificial intelligence. The aim of this work is to demonstrate the high success rates with which these
alternatives to laboratory tests predict COVID-19 test results, to show that they are reliable alternatives,
and to apply them to the population as a method of controlling the COVID-19 pandemic. The most important predictor variables
in the models developed in various regions of the world were identified, and the opportunities, limitations and
perspectives of this prediction method are discussed.
Keywords:
COVID-19, individual prediction, prediction models, symptoms, detection.
________________________________________________________________________________________________________________________________________________________________________
INTRODUCTION
COVID-19 is caused by the SARS-CoV-2 virus, which began to spread as a pandemic in 2020, affecting every country in the
world and causing almost 6 million deaths worldwide up to this year [1,2]. The disease has brought a series of further problems,
from economic to health-related, and has revealed how little is still known about viruses: although there are already vaccines
that are up to 98% effective in preventing the development of severe COVID-19, there are still no drugs that cure the disease [3].
However, vaccination with one or more doses, although it does not prevent infection, does significantly lower the chances of
being hospitalized or dying [4]. Hence the importance of continuing to vaccinate as many people as possible. Moreover, although
the number of infections has fallen by a large percentage, there is still a large number of infected people worldwide [2].
Due to the large number of infections, it is vitally important that detection tests for COVID-19 infection continue to be
performed, so that people can isolate themselves and, by avoiding contact with others, prevent contagion. This applies especially
to the latest variant of interest, which caused the most recent wave of infections and whose symptoms resemble those of the
common cold, making it more likely that people will become infected without knowing it [5].
All this without mentioning the long-term sequelae, which are still being studied and which the vast majority of recovered
patients have reported; they range from headaches and psychological problems to multiple organ failure (varying between
patients and depending on many factors) [6].
For all these reasons, it is of the utmost importance to prevent more people from becoming infected, which increasing the
number of tests performed would help to achieve. However, because tests and the medical personnel who apply them have been
scarce, and because costs are high, new detection methods have been developed. Among these, statistical methods stand out,
such as logistic regression models, which use certain variables to satisfactorily predict the result of an individual
COVID-19 test [7,8].
Because the models for individualized prediction of COVID-19 reported so far are heterogeneous in their composition and
performance, as well as in their applicability to the time and region in which they were developed, the objective of this work
was to analyze the characteristics of the individualized COVID-19 prediction models that have been reported and to identify
their main predictor variables, performance, limitations and strengths [9].

VARIANTS OF INTEREST OF THE SARS-COV-2 VIRUS
SARS-CoV-2 has mutated constantly since its emergence through genetic changes, also called genetic mutations, that arise when
its genome is replicated. Because a large number of people are still infected and continue to infect others, the virus keeps
acquiring new mutations, which the World Health Organization (WHO) classifies as variants of interest or not [2,10]. So far,
three variants of interest have been identified because they can cause serious health problems in humans and, in the most
serious cases, death [11]. These are the original strain, delta and omicron, each of which has caused millions of infections
and thousands of deaths around the world [10].
Although cases have decreased drastically thanks to vaccination in many countries, the least developed countries still lag
significantly behind, so the number of infections worldwide continues to increase [12].

WAVES OF CONTAGION IN THE PANDEMIC
Since the pandemic began in China in 2019, four waves of contagion have been detected worldwide, each standing out for the
number of infections registered up to that point [13]. According to information provided by the WHO and the Centers for
Disease Control (CDC) of the USA, the first wave was recorded in July-August 2020, the second in January 2021, the third in
August 2021, and the fourth and most recent in January 2022 [2,14]. Consequently, it is of the utmost importance to maintain
sanitary measures and to apply a greater number of reliable screening tests in order to avoid or mitigate the effects of a
possible fifth wave, since health systems are on the verge of collapse and other ailments have been neglected because of the
resources allocated to combating the pandemic.
New models for predicting future waves or peaks of contagion are currently being developed, thanks to the data collected
since the virus emerged [15].

COVID-19 DIAGNOSTIC STRATEGIES
Millions of screening tests for COVID-19 have already been applied. The most reliable has proved to be the PCR test, with
approximately 90-98% effectiveness in detecting the presence of the virus, followed by the antigen test, with approximately
82-90% detection efficiency [16,17]. These tests are followed by laboratory studies, which are generally also used to determine
the severity of the disease in patients and which were initially also used for diagnosis when a laboratory test could not be
accessed [18]. However, given the high demand for tests, their scarcity and cost, the time they take, and the shortage of
medical personnel to apply them because of the high risk of contagion, other types of detection methods have been developed,
including statistical methods that try to predict as accurately as possible the COVID-19 outcome a patient would have. One of
the most effective methods in this category has proved to be mathematical modeling through logistic regression and artificial
intelligence (AI), since these use different independent variables. Although the models developed vary, most of their
independent variables include the most frequent symptoms in infected patients, exposure to the virus, place of residence, and
the patient's previous medical conditions [19]. These variables were analyzed separately and then in combinations, to determine
what the predictor variables for the result of the COVID-19 test would be [20-26].
In other words, several previously known independent variables are used to predict the dependent variable. This approach has
shown a high success rate in predicting the test result, comparable even to the results obtained with the most widely used
laboratory tests, but with shorter response time, less exposure for both patients and staff, and lower cost.

STATISTICAL PREDICTION MODELS FOR COVID-19
There are innovative models for the individualized prediction of the COVID-19 test result that were developed with statistical
methods and that require neither laboratory results among their variables (except for the COVID-19 diagnostic test used as a
gold standard for comparing the outcome) nor a physical examination of patients [4].
The models that stand out are those based on logistic regression and machine learning, as these are the most widely used
methods and reported the best results in the works reviewed. As mentioned above, their key characteristic is that they do not
need laboratory studies to make their predictions [20-26].
The methods vary in population type, sampling method, sample, data collection method, statistical analysis, variables used,
and results obtained [20-26]. A comparative analysis of the models was carried out, considering their main characteristics,
such as the model solution method, the dependent and independent variables, the population studied, the predictor variables,
and the final results.

MODEL OVERVIEW AND ANALYSIS
The models analyzed showed similar structures in terms of research design (Table 1 and Table 2). The information was collected
through questionnaires, interviews or databases, with the independent variables determined beforehand from research showing
which were the risk factors for a positive COVID-19 test. Many similar predictor variables were found, such as the main
symptoms presented by previously confirmed cases (sore throat, fever, cough, headache, changes in smell or taste, difficulty
breathing), exposure to the virus through contact with people known to be infected, smoking, recent travel (in the last two
weeks) and comorbidities [20,21,23,24,26]. Others were less common, such as gender (women), race (African-Americans),
demographic data (whether or not participants live in metropolitan or densely populated areas) and level of physical
activity [22,25]. Another key point, considered in only one study, is the psychological problems derived from COVID-19
(anxiety, depression, insomnia), which could also act as predictive factors [27]. These variables were determined for the
population and the subsequent sample, whose only general requirement was having taken a COVID-19 test, either PCR or
rapid [20-26]. Access to patient data before and after the test made it possible to perform a validation phase [23,25].
After identifying the characteristics of their sample, the most frequent variables in these patients were analyzed in order
to determine which were the predictor variables for the models, and to combine them so that predictions could be made with
greater accuracy [28]. Some studies used one or more predictive or statistical methods, the most common being logistic
regression (Table 1) and AI (Table 2) [20-26].

MODEL COMPARISONS
The main variations found in the models, in addition to the type of analysis performed, were the independent variables
selected for the predictions (Table 1 and Table 2). A large number of independent variables were reported across the studies,
and they varied with the time at which each analysis was carried out. It can be assumed that more or less information about
the disease was available at each point, so each group of researchers chose its own variables. In most models the independent
variables were similar, but in others they were not. This could explain the difference in the performance of the models,
which overall ranged between 70 and 87% success in predicting a positive COVID-19 test [20-26].
Most of the models were built with samples taken over short periods of time, about one month, the longest being 4
months [20-26]. In most cases, the investigations were conducted in the United States of America, and all in 2020 (Table 1
and Table 2).

ADVANTAGES AND DISADVANTAGES OF MODELS
The models found and analyzed showed effective results (Table 1 and Table 2). However, their diagnostic value is lower than
that of laboratory results, since there are variables that cannot be controlled, such as the large number of symptoms that
COVID-19 presents, which also vary from person to person depending on patient characteristics [29]. External factors that make
the results less uniform, such as demographic characteristics and the fact that a large percentage of the population travels
for daily activities or work, may also affect the predictive value of the models [20-26]. Another factor to consider is that,
although research has identified exposure to the virus through contact with an infected patient as a highly valuable
predictor, many people who may have been exposed do not know it, because their contacts had not been tested or did not have
the general symptoms [30].
Some studies mention as a limitation the lack of recorded follow-up of patients, since subsequent information from them could
be used to reinforce the prediction systems [20-24,26].
Despite all of the above, the models analyzed performed well in determining the predictor variables and, as a result, achieved
final results above 70% in predicting a positive COVID-19 test, some even exceeding 87% [20-26]. With this performance,
patients assessed with this type of method could have high certainty in the results, because these percentages are comparable
to those of the rapid COVID-19 test, one of the most widely used. Another great advantage of these methods is their cost,
since during the pandemic the price of tests increased, raising the number of people who did not take the test [31]. In
addition, a shortage of tests has been reported due to
Table 1. Comparison of current individualized test prediction models for Covid-19, part 1
Research name/characteristics | Smell and taste symptom-based predictive model for Covid-19 diagnosis [20] | Individualizing risk prediction for positive coronavirus disease 2019 testing [21] | Beyond predicting the number of infections: predicting who is likely to be covid negative or positive [22] | Development of an individualized risk prediction model for Covid-19 using electronic health record data [23]
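The logistic regression models compared in Table 1 share one general form: the probability of a positive test is obtained by passing a weighted sum of predictor variables through the logistic function. The following is a minimal sketch of that form only. The predictor names, the intercept and every coefficient are hypothetical placeholders chosen for illustration; they are not values reported by any of the reviewed studies.

```python
import math

# Hypothetical values for illustration only; NOT taken from the reviewed studies.
INTERCEPT = -2.0
COEFFICIENTS = {
    "fever": 0.9,
    "cough": 0.6,
    "loss_of_smell_or_taste": 1.8,
    "contact_with_confirmed_case": 1.5,
}


def predict_positive_probability(patient):
    """Return the logistic-model probability of a positive test.

    `patient` maps a predictor name to 1 (present) or 0 (absent);
    predictors that are not listed are treated as absent.
    """
    linear_score = INTERCEPT + sum(
        weight * patient.get(name, 0) for name, weight in COEFFICIENTS.items()
    )
    # Logistic (sigmoid) function: maps any real score into (0, 1).
    return 1.0 / (1.0 + math.exp(-linear_score))


def classify(patient, threshold=0.5):
    """Predict a positive test when the probability reaches the threshold."""
    return predict_positive_probability(patient) >= threshold


if __name__ == "__main__":
    example = {"fever": 1, "cough": 1, "contact_with_confirmed_case": 1}
    print(round(predict_positive_probability(example), 3), classify(example))
```

In a real study the coefficients would be estimated from labeled patient data, and the decision threshold would be chosen according to the balance of sensitivity and specificity that the application requires.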
Table 2. Comparison of current individualized test prediction models for Covid-19, part 2
Name of the research/characteristics | Machine learning-based prediction of Covid-19 diagnosis based on symptoms [24] | Prediction of individual Covid-19 diagnosis using baseline demographics and lab data [25] | Screening for Covid-19: patient factors predicting positive PCR(a) test [26]
Country-region | Israel | USA(c) | USA(c)
Objective | Covid-19 positive test prediction | Covid-19 positive test prediction | Covid-19 positive test prediction
Statistical method | Machine learning | Machine learning, baseline data | Logistic regression
Study design | Prospective | Cohort | Retrospective
Population | Patients in Israel tested for Covid-19 | Patients from hospitals in the New York metropolitan area | Patients admitted to a clinic in Rochester, Minnesota
Sampling method | Not mentioned | Stratified sub-populations | Not mentioned
Sample size | 51,831 individuals tested (4,769 confirmed positive for Covid-19) | 31,739 adults without a health system | 48 positive and 98 negative patients from the Rochester, Minnesota clinic
Method of obtaining the information | Public information reported by the Israeli Ministry of Health | Clinic databases | Questionnaire applied to patients by a nurse
Dependent variable | PCR(a) test and nasopharyngeal test | PCR(a) test | PCR(a) test result
Independent variables | Cough, fever, contact with infected people, sex, age over 60, headache and breathing problems | Demographics, common co-morbidities and laboratory tests, calcium levels, temperature, age, blood tests, smokers, oxygen saturation | Fever, sneezing, respiratory problems, co-morbidities, travel and exposure to the virus
Confounding variables | Headache, shortness of breath and cough | Temperature and blood tests | Fever, chills
Significant predictors | Sex, age over 60 years, exposure to the virus and appearance of at least 5 clinical symptoms | Common co-morbidities and laboratory tests, calcium levels, temperature, age, blood tests, smokers, oxygen saturation | Exposure to the virus and travel to metropolitan areas
Model solution method | Decision tree | Decision tree, random forest, XGBoost multi-tree and logistic regression | Multivariable
Study period | March-April 2020 | April-June 2020 | March 2020
Validation stage | Performed with 1000 repetitions (ROC(b)) | Performed (ROC(b)) | Not performed
Results | 87.30% sensitivity, 71.98% specificity, or 85.76% sensitivity and 79.18% specificity | Random forest: 79.10% accuracy, multi-tree XGBoost: 77.66% accuracy, logistic regression: 79.05% accuracy and single-tree XGBoost: 79.37% accuracy | Contact with confirmed cases increases the odds of a positive test by 17 times (95% CI(d) 4.6-88.4), and recent travel increases the odds of a positive test by 4.7 times (95% CI 1.9-12.7)
(a) PCR: polymerase chain reaction; (b) ROC: receiver operating characteristic; (c) USA: United States of America; (d) CI: confidence interval.
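The Results row of Table 2 reports sensitivity, specificity, accuracy, and odds ratios with 95% confidence intervals. As a reminder of how such quantities are derived from a 2x2 table of counts, here is a minimal sketch. The function names are illustrative, the counts in the usage example are hypothetical rather than data from any reviewed study, and the interval uses the standard Woolf (log odds ratio) approximation, which is an assumption and may differ from the method each study actually applied.

```python
import math


def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives detected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def odds_ratio_with_ci(exposed_pos, exposed_neg, unexposed_pos, unexposed_neg, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf method).

    All four counts must be greater than zero.
    """
    odds_ratio = (exposed_pos * unexposed_neg) / (exposed_neg * unexposed_pos)
    standard_error = math.sqrt(
        1 / exposed_pos + 1 / exposed_neg + 1 / unexposed_pos + 1 / unexposed_neg
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * standard_error)
    upper = math.exp(log_or + z * standard_error)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(diagnostic_metrics(tp=80, fp=50, tn=150, fn=20))
    print(odds_ratio_with_ci(20, 10, 30, 90))
```

A wide interval, such as the 4.6-88.4 reported for contact with confirmed cases in study [26], is what this approximation tends to produce when some cells of the 2x2 table contain few patients.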
[4] He X, Hong W, Pan X, Lu G, Wei X. SARS-CoV-2 Omicron variant: Characteristics and prevention. MedCom. 2021;2(4):838–845.
[5] Yong SJ. Long COVID or post-COVID-19 syndrome: putative pathophysiology, risk factors, and treatments. J. Infect. Dis. 2021;53(10):737–754.
[6] Kabir MA, Ahmed R, Iqbal S, Chowdhury R, Paulmurugan R, Demirci U, et al. Diagnosis for COVID-19: current status and future prospects. Expert. Rev. Mol. Diagn. 2021;21(3):269–288.
[7] Wong R. COVID-19 testing and diagnosis: A comparison of current approaches. Malays. J. Pathol. 2021;43(1):3–8.
[8] Al-Najjar D, Al-Najjar H, Al-Rousan N. Evaluation of the prediction of CoVID-19 recovered and unrecovered cases using symptoms and patient's meta data based on support vector machine, neural network, CHAID and QUEST Models. Eur. Rev. Med. Pharmacol. Sci. 2021;25(17):5556–5560.
[10] Hemmer CJ, Löbermann M, Reisinger EC. COVID-19: Epidemiologie und Mutationen: Ein Update [COVID-19: epidemiology and mutations: An update]. Radiologe. 2021;61(10):880–887.
[11] Wang R, Hozumi Y, Yin C, Wey GW. Mutations on COVID-19 diagnostic targets. Genom. 2020;112(6):5204–5213.
[12] Solís-Arce JS, Warren SS, Meriggi NF, Scacco A, McMurry N, Voors, et al. COVID-19 vaccine acceptance and hesitancy in low- and middle-income countries. Nat. Med. 2021;27(8):1385–1394.
[13] Bose-O'Reilly S, Daanen H, Deering K, Gerrett N, Huynen M, Lee J, et al. COVID-19 and heat waves: New challenges for healthcare systems. Environ. Res. 2021;198:111153.
[20] Roland LT, Gurrola JG, Loftus PA, Cheung SW, Chang JL. Smell and taste symptom-based predictive model for COVID-19 diagnosis. Int. Forum. Allergi. Rhinol. 2020;10(7):832–838.
[22] Zhang SX, Sun S, Afshar-Jahanshahi A, Wang Y, Nazarian-Madavani A, Li J, et al. Beyond Predicting the Number of Infections: Predicting Who is Likely to Be COVID Negative or Positive. Risk. Manag. Healthc. Policy. 2020;13:2811–2818.
[23] Mamidi T, Tran-Nguyen TK, Melvin RL, Worthey EA. Development of An Individualized Risk Prediction Model for COVID-19 Using Electronic Health Record Data. Fron. Big. Data. 2021;4:675882.
[24] Zoabi Y, Deri-Rozov S, Shomron N. Machine learning-based prediction of COVID-19 diagnosis based on symptoms. N. P. J. Digit. Med. 2021;4(1):3.
[25] Zhang J, Jun T, Frank J, Nirenberg S, Kovatch P, Huang KL. Prediction of individual COVID-19 diagnosis using baseline demographics and lab data. Sci. Rep. 2021;11(1):13913.
[26] Challener DW, Challener GJ, Glow-Lee VJ, Fida M, Shah AS, O'Horo JC. Screening for COVID-19: Patient factors predicting positive PCR test. Infect. Control. Hosp. Epidemiol. 2020;41(8):968–969.
[27] Galindo-Vázquez O, Ramírez-Orozco M, Costas-Muñiz R, Mendoza-Contreras LA, Calderillo-Ruíz G, Meneses-García A. Symptoms of anxiety, depression and self-care behaviors during the COVID-19 pandemic in the general population. Gac. Med. Mex. 2021;156(4):298–305.
[28] Bender R, Grouven U. Ordinal logistic regression in medical research. Clin. Med. (Lond.). 1997;31(5):546–551.
[29] Iser B, Sliva I, Raymundo VT, Poleto MB, Schuelter-Trevisol F, Bobinski F. Suspected COVID-19 case definition: a narrative review of the most frequent signs and symptoms among confirmed cases. Epidemiol. Serv. Saude. 2020;29(3):1-10