
How Data Science Can Make Pandemic Predictions Better


Our client, an international pharmaceutical company, was looking to review commonly used simulation models for predicting outbreaks of infectious diseases.


The number of viruses and of the infections they cause is increasing. The way humans live today, combined with global warming and pollution, contributes to the rise of viruses. Viral infections are now more widespread than ever.

While research on the next drugs and vaccines is needed, attention must also go to distribution systems, in particular to which countries should receive vaccines as a priority to prevent an outbreak. The challenge is more pressing than ever.


When we first met our client, they already had a system for simulating the spread of viruses based on a compartmental model. In other words, a population was divided into different groups, each with distinctive characteristics, e.g. age and transmission rate. For each population segment, they were able to estimate how fast a virus would spread.
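To make the idea of compartments concrete, here is a minimal sketch of a compartmental model with one group per population segment. It is not the client's system: the SIR (susceptible, infected, recovered) structure, the `Group` class, the `beta` and `gamma` rates and the example numbers are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """One population segment (hypothetical structure, e.g. an age band)."""
    name: str
    susceptible: float
    infected: float
    recovered: float
    beta: float  # assumed transmission rate per day for this group


def step(groups, gamma=0.1, dt=1.0):
    """Advance every group by one Euler step of a simple SIR model.

    gamma is an assumed recovery rate per day, shared by all groups.
    """
    total = sum(g.susceptible + g.infected + g.recovered for g in groups)
    infected_fraction = sum(g.infected for g in groups) / total
    updated = []
    for g in groups:
        # New infections depend on the group's own transmission rate and
        # on how much of the whole population is currently infected.
        new_infections = min(
            g.susceptible, g.beta * g.susceptible * infected_fraction * dt
        )
        new_recoveries = gamma * g.infected * dt
        updated.append(
            Group(
                g.name,
                g.susceptible - new_infections,
                g.infected + new_infections - new_recoveries,
                g.recovered + new_recoveries,
                g.beta,
            )
        )
    return updated


def simulate(groups, days, gamma=0.1):
    """Return the list of group states for day 0 up to and including `days`."""
    history = [groups]
    for _ in range(days):
        groups = step(groups, gamma)
        history.append(groups)
    return history


if __name__ == "__main__":
    # Illustrative numbers only, not real epidemiological data.
    start = [
        Group("children", 990.0, 10.0, 0.0, beta=0.5),
        Group("adults", 1000.0, 0.0, 0.0, beta=0.2),
    ]
    for g in simulate(start, days=30)[-1]:
        print(f"{g.name}: S={g.susceptible:.0f} I={g.infected:.0f} R={g.recovered:.0f}")
```

Each group is only a set of aggregate counts here, which keeps the model cheap to run but means everyone inside a group behaves identically.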

Though their model was quite advanced, we felt it could be enhanced further. We broadened the characteristics applied to the population by using a larger number of parameters, and improved the performance of the simulation model so that it could simulate outbreaks over several hundred years.

In doing so, we moved from a system of compartments to a multi-agent-based modelling system. The proposed framework models individual people, each with their own behaviour, health states and ability to spread the disease. In total, we took more than 80 parameters into account to model each individual, including family situation, number of children and schools.
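A toy version of an agent-based simulation might look like the sketch below. It is only an illustration of the general technique, not the framework built for the client: it uses three attributes per person instead of more than 80, and the `Person` class, the household and school structure, the infection probabilities and the seven-day infectious period are all assumptions.

```python
import random
from collections import Counter
from dataclasses import dataclass
from typing import Optional

N_SCHOOLS = 5  # assumed number of schools in the toy population


@dataclass
class Person:
    """A single agent (hypothetical attributes, far fewer than a real model)."""
    household: int
    school: Optional[int]  # None for people who do not attend a school
    state: str = "S"       # "S" susceptible, "I" infectious, "R" recovered
    days_infected: int = 0


def build_population(n_households, rng):
    """Create households with two adults and zero to three school children."""
    people = []
    for h in range(n_households):
        people.append(Person(h, None))
        people.append(Person(h, None))
        for _ in range(rng.randint(0, 3)):
            people.append(Person(h, rng.randrange(N_SCHOOLS)))
    return people


def step(people, rng, p_household=0.1, p_school=0.02, infectious_days=7):
    """Simulate one day: contacts at home and at school, then recovery."""
    infectious = [p for p in people if p.state == "I"]
    at_home = Counter(p.household for p in infectious)
    at_school = Counter(p.school for p in infectious if p.school is not None)

    newly_infected = []
    for p in people:
        if p.state != "S":
            continue
        k_home = at_home[p.household]
        k_school = at_school[p.school] if p.school is not None else 0
        # Probability of escaping infection from every infectious contact.
        escape = (1 - p_household) ** k_home * (1 - p_school) ** k_school
        if rng.random() > escape:
            newly_infected.append(p)

    # Progress existing infections before applying the new ones, so that
    # people infected today only start counting days from tomorrow.
    for p in infectious:
        p.days_infected += 1
        if p.days_infected >= infectious_days:
            p.state = "R"
    for p in newly_infected:
        p.state = "I"


def run(days=50, n_households=200, seed=0):
    """Seed one infection and return the number of people in each state."""
    rng = random.Random(seed)
    people = build_population(n_households, rng)
    people[0].state = "I"
    for _ in range(days):
        step(people, rng)
    return Counter(p.state for p in people)


if __name__ == "__main__":
    print(run())
```

The rules per person are simple, but because infections travel through shared households and schools, the population-level outbreak emerges from individual contacts rather than from a group-level rate.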

While this approach has great benefits, it requires extensive knowledge of the epidemic's background. So we worked hand in hand with our client to define how the population and the virus's mode of transmission would be modelled. We also used demographic data to validate the population model.

Adopting such a system pays off because it allows complex dynamics to be modelled from simple rules. The structures that eventually lead to an epidemic in the simulation should correspond to those seen in reality. This allows different scenarios to be calculated directly, yields high-quality results and makes it possible to examine widely disputed and unknown effects.

In the end, our client has a model that can help them predict outbreaks of the virus and convince governments to invest in vaccination.