Prediction of post-COVID-19 evolution in patients using big data tools

DOI:

https://doi.org/10.33936/rehuso.v8i2.5911

Keywords:

COVID-19, prediction, data mining, Orange Data Mining, prevention

Abstract

This research aims to predict the post-COVID-19 evolution of patients at the Portoviejo General Hospital (IESS) by identifying patterns that may recur in the spread of future cases of the disease. The study was descriptive and retrospective, with a quantitative approach based on documentary analysis: information was extracted from the hospital's database for the period 2020-2022. The data were analyzed with Orange Data Mining, an open-source tool that offers a wide range of data analysis and machine learning methods. Of the 18,316 patients in total, a purposive sample of 3,678 was used, because these records contained the data required for the analysis. Among the main results, the people most likely to have COVID-19 are between 63 and 70 years of age; the most exposed sex is male; and the most common conditions recorded among those affected are respiratory failure and chronic kidney disease. These findings help predict which patients may be more likely to contract the disease. In conclusion, applying data mining tools facilitates predicting the future evolution of diseases such as the one analyzed, which supports health authorities in making decisions on the prevention and control of the pandemic.
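The sample selection and frequency analysis described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' Orange Data Mining workflow: the field names (`age`, `sex`, `diagnosis`), the ten-year age bands, and the toy records are all assumptions made for illustration, since the hospital database schema is not given here. Only the two patient counts (18,316 and 3,678) come from the abstract.

```python
from collections import Counter

# Hypothetical field names; the real database schema is not described in the source.
REQUIRED_FIELDS = ("age", "sex", "diagnosis")


def complete_records(records):
    """Keep only records in which every required field is present and non-empty."""
    return [
        r for r in records
        if all(r.get(field) not in (None, "") for field in REQUIRED_FIELDS)
    ]


def age_band(age, width=10):
    """Map an age to a fixed-width band label such as '60-69' (assumed binning)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def sample_fraction(sample_size, population_size):
    """Share of the population that ends up in the analysed sample."""
    return sample_size / population_size


# Invented toy records, for illustration only (not study data).
records = [
    {"age": 65, "sex": "M", "diagnosis": "respiratory failure"},
    {"age": 68, "sex": "M", "diagnosis": "chronic kidney disease"},
    {"age": 41, "sex": "F", "diagnosis": ""},                       # incomplete
    {"age": None, "sex": "F", "diagnosis": "respiratory failure"},  # incomplete
    {"age": 72, "sex": "F", "diagnosis": "respiratory failure"},
]

sample = complete_records(records)
by_band = Counter(age_band(r["age"]) for r in sample)
by_sex = Counter(r["sex"] for r in sample)
by_diagnosis = Counter(r["diagnosis"] for r in sample)

print(by_band.most_common(), by_sex.most_common(), by_diagnosis.most_common())

# Counts reported in the abstract: 3,678 of 18,316 patients were analysed.
print(f"{sample_fraction(3678, 18316):.1%}")
```

With the figures from the abstract, the analysed sample corresponds to roughly one fifth of the patients in the database, which is worth keeping in mind when generalizing the reported age, sex, and condition patterns.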

Published

2023-07-01