A virus simulation, visualized

Chi-Loong, Updated 1st May 2020


Since the Covid19 pandemic upended billions of lives around the world, we have had some really gorgeous simulations / visualizations to explain how an infection spreads.

Here are some links to posts and stories that you might have seen:

Stuck at home in Singapore under a lockdown (which was just extended by another month), I wanted a project to occupy my time.

And I had a different take on simulating and visualizing a virus outbreak in a town.

Instead of having people move about randomly, I felt that giving them a routine home-to-work-to-home cycle would be more realistic. Was there a way I could show this, and what insights could we draw?

So, for fun, I coded up my own model that people can play with and explore.

Warning Caveats

Please note that my toy model is just that: A fun thought experiment.

It is highly simplified, and the simulation rules do not capture real life in all its complexity.

You simplify things to communicate ideas, but this is in no way a real epidemiological model.

I do not have a background in epidemiology, nor am I a doctor. So please defer to the experts, like we all should. If I have gotten anything wrong, it is entirely my own fault. Do let me know if you spot any errors!

What I do is visualization work to communicate stories and ideas. And I see code as a tool and way to share my ideas.

Baseline Model

Here’s our very simplified base toy model.

Based on this model, let’s try running the simulation and see how it works!

Please run these simulations / visualizations in a modern browser. Although they work on mobile, they look a lot better on desktops and larger tablets, because a bigger canvas allows the town simulation to be bigger.

Simulation Parameters

Number of people
Number of homes
Number of workplaces
Number of days to run simulation
Infection chance
Simulation speed

Simulation Results


Our model here is rather simple, and it is just about infection. It does not take into account any resistance or anything like that. But even in this barebones state the visualization can help with intuitions about viral spread.
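As a rough sketch only, here is what one day of a home-to-work-to-home routine could look like in code. The names and the exact infection rule (each infected person sharing a place gives every healthy person there an independent chance of infection) are assumptions for illustration, not the code behind the visualization.

```javascript
// Hypothetical sketch: all names and rules here are illustrative assumptions.

// Create people with a fixed home and workplace; person 0 starts infected.
function createTown(numPeople, numHomes, numWorkplaces, rand) {
  const people = [];
  for (let i = 0; i < numPeople; i++) {
    people.push({
      home: Math.floor(rand() * numHomes),
      work: Math.floor(rand() * numWorkplaces),
      infected: i === 0,
    });
  }
  return people;
}

// Everyone gathers at one kind of place ("home" or "work").
function mixAt(people, placeKey, infectionChance, rand) {
  // Snapshot the number of carriers per place before anyone new is infected,
  // so infections do not cascade within the same mixing step.
  const carriersAt = new Map();
  for (const p of people) {
    if (p.infected) {
      carriersAt.set(p[placeKey], (carriersAt.get(p[placeKey]) || 0) + 1);
    }
  }
  for (const p of people) {
    if (p.infected) continue;
    const carriers = carriersAt.get(p[placeKey]) || 0;
    // Assumed rule: one independent infection roll per carrier in the same place.
    for (let k = 0; k < carriers; k++) {
      if (rand() < infectionChance) {
        p.infected = true;
        break;
      }
    }
  }
}

// One day of routine: go to work, then go home.
function simulateDay(people, infectionChance, rand) {
  mixAt(people, "work", infectionChance, rand);
  mixAt(people, "home", infectionChance, rand);
}

const people = createTown(150, 50, 30, Math.random);
for (let day = 0; day < 3; day++) simulateDay(people, 0.02, Math.random);
console.log(people.filter((p) => p.infected).length, "infected after 3 days");
```

Passing the random function in as `rand` is a deliberate choice in this sketch: it lets the same code run with `Math.random` or with a seeded generator.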

You might notice a few things if you run the simulation a few times:

The last point may be common-sense intuition, but it becomes clear when visualized.

For example, in Singapore our current lockdown situation is due to Covid19 spread in migrant workers who were squeezed into a few dormitories. For them it is extremely difficult to isolate, and thus when the virus started to circulate, it exploded.

Another example: a town with one main employer, like a meat-packing factory. If you get one case in town, it is likely to spread.

You can see these intuitions better by putting more people in fewer homes and workplaces when you run the simulation. One red dot quickly becomes a bunch of angry red dots at chokepoints.
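A bit of arithmetic shows why crowding matters. Assuming (as a simplification, not necessarily the rule used in the simulation) that one carrier gives each other person in the same place an independent chance p of infection, the chance that the carrier infects at least one of them is 1 - (1 - p)^(n - 1), where n is the number of people in that place. The function name below is made up for this sketch.

```javascript
// Assumption: a single carrier infects each other occupant independently
// with probability p during one gathering.
function chanceOfAnySpread(placeSize, p) {
  const healthy = placeSize - 1; // everyone except the one carrier
  return 1 - Math.pow(1 - p, healthy);
}

// The same 150 people squeezed into fewer and fewer homes.
for (const homes of [50, 25, 10]) {
  const perHome = 150 / homes;
  const chance = chanceOfAnySpread(perHome, 0.02);
  console.log(`${homes} homes, ${perHome} per home: ${chance.toFixed(3)}`);
}
```

The chance of onward spread from a single case rises with every extra person sharing the space, which is the chokepoint effect in numbers.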

Simulation vs real-life

Each simulation run gets outcomes based on chance.

If you get lucky in your run, you might not get that much infection. Conversely, you could be unlucky and see the whole town get sick before the simulation time even runs out.

Actually, this is somewhat similar to real life: luck does play a part in whom and where an epidemic strikes, and in how it grows and dies in any particular place.

In real life, we only live in this one timeline, and we only see the case results after infections have occurred. If we got lucky, we may think that the virus is not as serious or contagious.

The trouble is that whilst your town (or you individually) may not be as affected, the overall trend is that many other towns (and other people) would have gotten an unlucky dice roll.

Our trouble as humans is that we tend to see only localized events, and mistake them for the entire trend.

Our cognitive biases make it hard to fight long-term, complex challenges like Covid19 (which looks more and more likely to be one) or climate change, because we evolved to pay attention to short-term, immediate problems.

In any case, in simulation land, because it is a thought experiment, we can do something that cannot be done in real life: we can rerun the simulation thousands of times, and plot the results to see the most probable outcomes.

If we keep everything the same except for changing one parameter, we can get useful conclusions.

Figure: two histograms of final infection counts, one for 30 workplaces and one for 20 workplaces.

Parameters: a town of 150 people, 50 homes, 3 days of simulation time, 2% infection rate.

Everything is the same except for number of workplaces.

Each scenario was simulated 10,000 times.
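For readers who want to try the "rerun it many times, change one parameter" idea themselves, here is a minimal sketch. It uses a stripped-down toy model whose names and infection rule are assumptions for illustration; it is not the code that produced the charts, so its numbers will not match them exactly.

```javascript
// Hypothetical toy model: returns how many people are infected after `days`
// of a work-then-home routine.
function runOnce({ people, homes, workplaces, days, chance }, rand = Math.random) {
  const town = Array.from({ length: people }, (_, i) => ({
    home: Math.floor(rand() * homes),
    work: Math.floor(rand() * workplaces),
    infected: i === 0, // assumed: a single initial case
  }));
  for (let day = 0; day < days; day++) {
    for (const key of ["work", "home"]) {
      // Count carriers per place before anyone new is infected.
      const carriers = new Map();
      for (const p of town) {
        if (p.infected) carriers.set(p[key], (carriers.get(p[key]) || 0) + 1);
      }
      for (const p of town) {
        const n = carriers.get(p[key]) || 0;
        // Assumed rule: independent exposure to each carrier in the same place.
        if (!p.infected && rand() < 1 - Math.pow(1 - chance, n)) p.infected = true;
      }
    }
  }
  return town.filter((p) => p.infected).length;
}

// Repeat one scenario many times and return the mean final infection count.
function meanInfections(params, runs) {
  let total = 0;
  for (let i = 0; i < runs; i++) total += runOnce(params);
  return total / runs;
}

// Everything stays the same except the number of workplaces.
const base = { people: 150, homes: 50, days: 3, chance: 0.02 };
console.log("30 workplaces:", meanInfections({ ...base, workplaces: 30 }, 2000));
console.log("20 workplaces:", meanInfections({ ...base, workplaces: 20 }, 2000));
```

Because this sketch draws from `Math.random`, the two means will differ slightly on every run; only the gap between the scenarios should be stable.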

I considered making this interactive for readers, but dropped the idea as it is compute-heavy and takes time to run.

What you are looking at is the number of infections at the end of 3 days, plotted as two histograms (bar charts), after 10,000 simulations of each scenario. The only change is the number of workplaces.

You can see that the average number of infections is lower in the 30 workplaces scenario than in the 20 workplaces one, even without calculating the exact mean / median / variance.
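If you do want the exact numbers, they are easy to compute from the list of final infection counts. A small sketch, with made-up function names and made-up example data (these are not results from the charts):

```javascript
// Mean, median and (population) variance of a list of infection counts.
function summarize(counts) {
  const n = counts.length;
  const mean = counts.reduce((a, b) => a + b, 0) / n;
  const sorted = [...counts].sort((a, b) => a - b);
  const mid = Math.floor(n / 2);
  const median = n % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const variance = counts.reduce((acc, x) => acc + (x - mean) ** 2, 0) / n;
  return { mean, median, variance };
}

// Tally how many runs ended with each infection count (the histogram bars).
function histogram(counts) {
  const bins = new Map();
  for (const c of counts) bins.set(c, (bins.get(c) || 0) + 1);
  return new Map([...bins].sort((a, b) => a[0] - b[0]));
}

// Made-up final counts from seven imaginary runs, for illustration only.
const exampleCounts = [1, 1, 2, 2, 2, 3, 5];
console.log(summarize(exampleCounts));
console.log(histogram(exampleCounts));
```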

Statistics and reproducibility

Feel free to skip this section if you're not into statistics or code as it delves into some fun technical ideas.

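A quick detour. If you add up many independent random variables and plot how the totals are distributed, you will tend to get a nice bell curve, thanks to the Central Limit Theorem. Just like here! (The related Law of Large Numbers is about the average over many runs settling towards the expected value.)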
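You can see the bell curve emerge with a few lines of code. This sketch (the function names are made up) adds up 12 uniform random numbers at a time; each total lies between 0 and 12, and the totals cluster around 6 in a bell shape.

```javascript
// Sum `n` independent uniform random numbers in [0, 1).
function sumOfUniforms(n, rand = Math.random) {
  let total = 0;
  for (let i = 0; i < n; i++) total += rand();
  return total;
}

// Draw many such sums.
function sampleSums(n, samples) {
  return Array.from({ length: samples }, () => sumOfUniforms(n));
}

// Bucket the sums by their integer part and print a text histogram.
const sums = sampleSums(12, 20000);
const buckets = new Array(12).fill(0);
for (const s of sums) buckets[Math.min(11, Math.floor(s))]++;
buckets.forEach((count, i) => {
  console.log(String(i).padStart(2), "#".repeat(Math.round(count / 200)));
});
```

A single uniform random number is flat, not bell-shaped; it is the adding up of many of them that produces the curve.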

This, in essence, is why data polling is such a powerful tool and why it works to tell us about the sampled population.

But hold on, you might ask. If I run your simulation with randomized data, I will get a different result, right?

Correct. This is why, for reproducibility, all these charts are generated with a specific pseudorandom number generator (PRNG) algorithm seeded with a specific hash.

The default JavaScript Math.random() cannot be seeded, so it does not allow this. The random numbers generated here are based on the SFC2 algorithm (here is a great list).
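To show what seeding buys you, here is a minimal sketch using mulberry32, a small and widely shared seedable generator. It is a stand-in for illustration and not the algorithm used for the charts; the seed value 12345 is arbitrary.

```javascript
// mulberry32: a tiny seedable PRNG returning numbers in [0, 1).
// Used here only as an example of a seeded generator.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Two generators built from the same seed produce the same sequence,
// so a simulation driven by them can be replayed exactly.
const randA = mulberry32(12345);
const randB = mulberry32(12345);
for (let i = 0; i < 3; i++) {
  console.log(randA() === randB()); // true each time
}
```

Anywhere a simulation would call `Math.random()`, it can call a function like `randA()` instead and become repeatable.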

In any case, if you have an interest in reproducing these results, you can check out and play with the code in the GitHub repository.

Model Exploration

Now that we have a baseline model to work with, let’s explore some other scenarios.

This article is a work in progress and I will add more models over time.

For example, how do quarantine breakers affect infection rates? Under a lockdown, how does the percentage of essential businesses affect infection?

What if people have different types of schedules? Or what if social activities are allowed, and people can mingle at social places after social distancing takes effect? Can we model this?