Viral pandemics are a serious threat. COVID-19 is not the first,
and it won’t be the last. But, like never before, we are collecting
and sharing what we learn about the virus. Hundreds of research teams
around the world are combining their efforts to collect data and
develop solutions. We want to shine a light on their work and show
how machine learning is helping us to fight COVID-19.
1. Identifying who is most at risk from COVID-19
Machine learning has proven to be invaluable in predicting risks in
many spheres. With medical risk specifically, machine learning is
potentially interesting in three key ways.
Predicting the risk of infection: What is the risk of a specific
individual or group getting COVID-19? Early statistics suggest that
important risk factors that determine how likely an individual is to
contract COVID-19 include age, pre-existing conditions, general hygiene
habits, social habits, number of human interactions, frequency of
interactions, location and climate, and socio-economic status. DeCaprio
et al. have used machine learning to build an initial Vulnerability
Index for COVID-19. Prevention measures such as wearing masks, washing
hands, and social distancing are all likely to influence overall risk as well.
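To make the idea of a risk index concrete, here is a minimal sketch of how a logistic model turns a handful of risk factors into a score between 0 and 1. The feature names, weights, and bias are hypothetical values chosen for illustration; they do not come from DeCaprio et al. or from any real data, and a real model would learn them from patient records.

```python
import math

# Hypothetical weights, for illustration only. A real model would learn
# these from data instead of having them set by hand.
WEIGHTS = {
    "age_over_65": 1.2,
    "pre_existing_condition": 0.9,
    "daily_contacts_over_10": 0.6,
}
BIAS = -2.0


def risk_score(features):
    """Map binary risk factors (0 or 1) to a score in the open interval (0, 1).

    Missing factors are treated as 0 (absent).
    """
    z = BIAS + sum(w * float(features.get(name, 0)) for name, w in WEIGHTS.items())
    # The logistic (sigmoid) function squashes any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    low = risk_score({})
    high = risk_score({"age_over_65": 1, "pre_existing_condition": 1})
    print(f"no risk factors: {low:.2f}, two risk factors: {high:.2f}")
```

The score is only as good as the weights behind it, which is why collecting and sharing data matters so much for this kind of model.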
Predicting who is at risk of developing a severe case: What is the risk
of a specific individual or group developing severe COVID-19 symptoms
or complications that would require hospitalization or intensive care?
Once a person or group has become infected, we need to predict how likely
they are to develop complications or require advanced medical care.
Researchers writing in the journal Computers, Materials & Continua showed
that machine learning could potentially predict the likelihood of a patient
developing acute respiratory distress syndrome (ARDS), as well as the risk
of mortality, just by looking at the initial symptoms.
Predicting treatment outcome: What is the risk that a specific treatment
will be ineffective for a certain individual or group, and how likely are
they to die? An extension of severity prediction is predicting the treatment’s
outcome, which is often literally a matter of predicting life and death.
If we can predict the outcomes of specific treatment methods, then doctors
can treat patients more effectively. Using machine learning to personalize
treatment plans is not specific to COVID-19, and machine learning has
previously been used to predict treatment outcomes for patients with epilepsy,
as just one example.
2. Screening patients and diagnosing COVID-19
When a new pandemic hits, diagnosing individuals is challenging. Testing
on a large scale is difficult and tests are likely to be expensive,
especially in the beginning. Anyone who has any symptoms of COVID-19 is
likely to be very concerned that they have contracted the disease, even
if the same symptoms are indicative of many other, potentially milder
illnesses.
Instead of taking medical samples from each patient and waiting for slow,
expensive lab reports to come back, a simpler, faster, and cheaper test
(even if it’s less accurate) would be useful in gathering data on a larger
scale. This data could be used for further research, as well as for screening
and triaging patients.
When it comes to using machine learning to help diagnose COVID-19, promising
research areas include:
Using face scans to identify symptoms, such as whether or not the patient has
a fever,
Using wearable technology such as smart watches to look for tell-tale patterns
in a patient’s resting heart rate,
Using machine learning-powered chatbots to screen patients based on self-reported
symptoms.
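The wearable idea from the list above can be sketched with a very simple rule: compare recent resting heart rate readings against a person's own baseline and flag days that sit unusually far above it. This is a plain statistical threshold standing in for a trained model, and the function name, the threshold of 2 standard deviations, and the sample readings are all assumptions made for illustration.

```python
from statistics import mean, stdev


def flag_elevated_days(baseline, recent, threshold=2.0):
    """Return the indices of days in `recent` whose resting heart rate is
    more than `threshold` standard deviations above the baseline mean.

    `baseline` needs at least two readings so a standard deviation exists.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: flag anything above it.
        return [i for i, bpm in enumerate(recent) if bpm > mu]
    return [i for i, bpm in enumerate(recent) if (bpm - mu) / sigma > threshold]


if __name__ == "__main__":
    # Made-up readings in beats per minute.
    baseline_week = [58, 60, 59, 61, 60, 59, 60]
    this_week = [60, 61, 66, 68]
    print(flag_elevated_days(baseline_week, this_week))
```

A flagged day is not a diagnosis. It is a cheap signal that could help decide who should be tested first.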
3. Predicting the spread of infectious disease using social networks
In the middle of a pandemic, when we’re trying to develop strategies to
actively work against it, we first need to know where we are. We need to
answer questions like “How many people are infected?” and “Where are
these people?” Unfortunately, pandemics, especially those caused by viruses,
are difficult and expensive to keep track of.
Usually, the government answers these questions, together with the health
system. For example, every day (or week) the responsible agency counts and
publishes the number of new patients diagnosed with the disease. But one
of the problems here is that there might be a big gap (in time and space)
between contracting the disease, developing the first symptoms, and testing
positive.
Luckily, we live in a digital world. A farmer who is starting to develop
symptoms might live in a small town with no nearby hospitals capable of performing
the test. But this same farmer might still be able to access social networks
and immediately leave hints about his health and the spread of the disease —
hints that only a machine learning model can learn to process at scale.
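As a rough sketch of what processing these hints could look like, the snippet below counts, per region, how many posts mention a symptom. Plain keyword matching is used here as a stand-in for a learned text classifier, and the keyword list, the function name, and the sample posts are hypothetical.

```python
from collections import Counter

# Hypothetical keyword list. A real system would use a trained text
# classifier instead of fixed keywords.
SYMPTOM_TERMS = {"fever", "cough", "chills"}


def symptom_mentions_by_region(posts):
    """Count posts per region that mention at least one symptom term.

    `posts` is an iterable of (region, text) pairs.
    """
    counts = Counter()
    for region, text in posts:
        # Lowercase each word and strip surrounding punctuation.
        words = {word.strip(".,!?;:\"'").lower() for word in text.split()}
        if words & SYMPTOM_TERMS:
            counts[region] += 1
    return counts


if __name__ == "__main__":
    sample_posts = [
        ("north", "Bad cough today."),
        ("north", "Lovely weather for a walk"),
        ("south", "Fever and chills since last night!"),
    ]
    print(symptom_mentions_by_region(sample_posts))
```

Tracking how these counts change over time in each region gives an early, if noisy, picture of where the disease may be spreading.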
4. Predicting the risk of new pandemics
Accurately predicting whether a strain of influenza is going to make
a zoonotic leap (jumping from one species to another) can help doctors
and other medical professionals anticipate potential pandemics and prepare
for them.
As one example, Influenza A exists primarily in the avian population,
but it has the potential to jump to human hosts. Researchers working
on Influenza A isolated 67,940 protein sequences from a database. They
filtered these sequences so that the dataset included only those influenza
strains with complete sequences of 11 influenza proteins.
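The filtering step described above can be sketched as follows. The source only says that 11 proteins were required, so the protein names below are an assumption, as are the function name and the record layout of (strain, protein, sequence) tuples.

```python
from collections import defaultdict

# Assumed names for the 11 influenza A proteins. The article gives only
# the count, not the names.
REQUIRED_PROTEINS = frozenset(
    {"PB2", "PB1", "PB1-F2", "PA", "HA", "NP", "NA", "M1", "M2", "NS1", "NS2"}
)


def strains_with_complete_proteome(records):
    """Keep only strains with a non-empty sequence for every required protein.

    `records` is an iterable of (strain_id, protein_name, sequence) tuples.
    Returns {strain_id: {protein_name: sequence}} for the complete strains.
    """
    by_strain = defaultdict(dict)
    for strain_id, protein, sequence in records:
        if protein in REQUIRED_PROTEINS and sequence:
            by_strain[strain_id][protein] = sequence
    return {
        strain: proteins
        for strain, proteins in by_strain.items()
        if REQUIRED_PROTEINS.issubset(proteins)
    }
```

Dropping incomplete strains shrinks the dataset, but it means every remaining strain can be described by the same set of features, which most machine learning models require.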
With machine learning, the researchers were then able to identify potentially
zoonotic strains of influenza with high levels of accuracy. More work needs
to be done to establish prediction models for direct transmission, but
knowing which strains of influenza are likely to make a leap is an important
first step in preparing for the next pandemic.
5. Identifying hosts in the natural world
A zoonotic pandemic, like the one we are experiencing with the novel
coronavirus, is a pandemic caused by an infectious disease that originates
in a different species (such as bats) and spreads to humans. Viruses such
as those that cause Ebola, AIDS, or COVID-19 can survive unnoticed in the
natural world for a long time, waiting for the next mutation and the next
opportunity to infect us. They hide in animals, called reservoir hosts, that
are unaffected by the disease.
Knowing which animals are the reservoir hosts is vital in fighting a pandemic:
once we’ve found them, we can develop strategies to control the spread of the
disease and prevent more outbreaks from happening. The classical approach
to finding reservoir hosts can take years of research, and there are still
many orphan viruses that haven’t been matched to an animal host.
Thanks to huge advances in technology, Whole-Genome Sequencing (WGS, the process
of determining an organism’s complete DNA sequence) has become cheap and fast.
Research has shown that machine learning models can use genome sequencing data
together with expert knowledge to pinpoint the species that most likely acted
as hosts for the disease. By looking at a small subset of species, we can
dramatically speed up the process of finding these pathogens in the wild.
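One simplified way to picture this is to describe each viral genome by how often short subsequences (k-mers) occur in it, then rank candidate hosts by how similar a new virus looks to viruses already linked to each host. This is a toy similarity ranking, not the method used in the research, and it leaves out the expert knowledge mentioned above. The function names and the toy sequences are made up for illustration.

```python
import math
from collections import Counter


def kmer_profile(sequence, k=3):
    """Return normalized k-mer frequencies for a nucleotide sequence."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidate_hosts(query_genome, known_genomes, k=3):
    """Rank host labels by how closely the query genome's k-mer profile
    matches that of a virus already linked to each host.

    `known_genomes` maps a host label to one viral genome sequence.
    Returns (host, similarity) pairs, most similar first.
    """
    query = kmer_profile(query_genome, k)
    scores = {
        host: cosine_similarity(query, kmer_profile(genome, k))
        for host, genome in known_genomes.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy sequences, far shorter than any real genome.
    known = {"bat": "ATATATATGT", "rodent": "GCGCGCGCGC"}
    print(rank_candidate_hosts("ATATATATAT", known))
```

The top few hosts in such a ranking are the small subset of species that field researchers would sample first.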
Machine learning is an important tool in fighting the current pandemic. If we
take this opportunity to collect data, pool our knowledge, and combine our skills,
we can save many lives — both now and in the future.
Markus Schmitt, How to fight COVID-19 with machine learning, Towards Data Science,
DeCaprio D, Gartner J, Burgess T, et al. Building a COVID-19 Vulnerability Index.
arXiv preprint arXiv:2003.07347, 2020.
Provided by the IKCEST Disaster Risk Reduction Knowledge Service System