A super model for the team of 5 million

The nation's Covid-19 response has relied heavily on contagion modelling. Dion O'Neale is leading work to build a unique model that mirrors the myriad connections of the entire population of New Zealand.

Applied mathematician Dion O'Neale leads a team building a mathematical model that mirrors the interactions of Aotearoa's "team of 5 million".

Through the power of political rhetoric, Aotearoa New Zealand’s population is now the ‘team of five million’. But you’d have to question any actual selection process that brings together Cuba Mall hipsters, West Coast miners and Queen St property speculators with High Country graziers. New Zealanders are a diverse bunch.

Through a balance of good management and luck, the ‘team’ has performed well against the Covid pandemic to date. However, policy makers need to know which levers to pull if New Zealand does experience a major outbreak.

Courtesy of Te Pūnaha Matatini, the Centre of Research Excellence for Complex Systems, at the University of Auckland, Aotearoa has a working contagion model. Now the centre is working on a new model to help New Zealand work out how best to live in a world where Covid-19 is likely to persist for years to come.

Covid-19 is no exception to most things in life: one size does not fit all. As the vaccine rollout begins, how it works in practice, and what risks New Zealanders face, will differ depending on where they live and the communities they belong to.

This new model has been on the minds of Dr Dion O’Neale and his colleagues at Te Pūnaha Matatini since the first lockdown last year. They are building ‘an interaction network model of the entire population of Aotearoa merged with a stochastic, dynamic contagion model’.

In plain speak, the team are bringing together multiple sets of data collected for the census and from other public sources, and overlaying them with a Covid contagion model developed to represent both disease progression and interventions such as contact tracing, testing, and the announcement and timing of Alert Levels.

The contagion model proved its worth in the August 2020 Auckland lockdown. O’Neale says, “We were asked about the likely impact for a number of scenarios. For instance, if three people are detected with Covid and the initial case was two weeks prior, how many people would you expect to remain undetected in the community?

“We were able to model the risk Covid might spread outside Auckland and were able to predict the probable outcome one fortnight later if we moved to Level 3.” The model proved spot on, offering politicians a degree of confidence on which to base their decisions.
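
The kind of scenario question O’Neale describes can be illustrated with a toy stochastic branching process. The sketch below is not the Te Pūnaha Matatini model. The reproduction number, detection probability and number of generations are invented for illustration, the function names are hypothetical, and detection is assumed not to stop a case from passing the virus on. It simulates many outbreaks from a single seed case, keeps only the runs in which exactly three cases were detected, and looks at how many infections went undetected in those runs.

```python
import math
import random
import statistics


def poisson(rng, mean):
    """Draw a Poisson integer with Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_outbreak(rng, r_eff=1.5, p_detect=0.3, generations=3):
    """Simulate one outbreak from a single seed case.

    Returns (detected, undetected) counts over the given number of
    generations. All parameter values are illustrative assumptions.
    """
    detected = undetected = 0
    current = 1  # the seed case
    for _ in range(generations):
        for _ in range(current):
            if rng.random() < p_detect:
                detected += 1
            else:
                undetected += 1
        # Each current case infects a Poisson-distributed number of others.
        current = sum(poisson(rng, r_eff) for _ in range(current))
    return detected, undetected


def undetected_given_detected(n_detected=3, runs=5000, seed=1):
    """Undetected counts from the simulated outbreaks that match the observed detections."""
    rng = random.Random(seed)
    results = (simulate_outbreak(rng) for _ in range(runs))
    return [undetected for detected, undetected in results if detected == n_detected]


matches = undetected_given_detected()
print(f"{len(matches)} matching runs, median undetected: {statistics.median(matches)}")
```

Conditioning on what has been observed, rather than reporting a single average, is the point of the exercise: the spread of the matching runs gives a range of plausible answers, not just one number.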

O’Neale and team are in the process of testing and perfecting a giant edifice built from data that mimics and mirrors the potentially trillions of possible connections and interactions happening in the team of five million. Combined with the contagion model, this super model of Aotearoa aims to deliver more granular and fine-tuned predictions as Aotearoa adapts to life with Covid.

The team is building the super model from data held in the Integrated Data Infrastructure (IDI) hosted by Statistics New Zealand. The IDI is a giant research database that holds information from the census, government agencies and from some non-governmental organisations. The IDI enables the data to be linked or integrated to answer research questions.

Researchers gain permission to access the IDI through a series of strict protocols, the main one being that all personal information that could lead to identification is ‘de-identified’ and all results are about groups, not individuals. So any use of the IDI should not raise concerns about the arrival of Big Brother.

The super model combines data from the Integrated Data Infrastructure with existing contagion models.

O’Neale and the team do not run the contagion model with the actual data sets. Instead they create a series of distributions of certain characteristics within the IDI. An example might be the geographic distribution of people over 60 who live in households with more than one person or the number of people on low incomes who also have more than one job.

They check the accuracy of each class of information before transferring the distributions out of the IDI to their model, which runs on a supercomputer hosted at the University of Auckland as part of the New Zealand eScience Infrastructure (NeSI). It’s like creating blurred mirror images of the team of 5 million, each one slightly different from the next but all with the same overall patterns.
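
The ‘blurred mirror image’ idea can be sketched in a few lines. The example below is an illustration only: the area names, categories and counts are invented, and the real pipeline works from checked IDI outputs, not a hand-written table. It samples synthetic individuals from an aggregated distribution, so each synthetic population differs in detail while following the same overall pattern.

```python
import random
from collections import Counter

# Hypothetical aggregated counts: (area, age band, household type) -> people.
# These numbers are invented for illustration; they are not IDI outputs.
AGGREGATE_COUNTS = {
    ("Area A", "under 60", "lives alone"): 120,
    ("Area A", "under 60", "shared household"): 540,
    ("Area A", "60 plus", "lives alone"): 90,
    ("Area A", "60 plus", "shared household"): 150,
    ("Area B", "under 60", "shared household"): 300,
    ("Area B", "60 plus", "shared household"): 60,
}


def sample_population(counts, size, rng):
    """Draw `size` synthetic people whose attributes follow the aggregated distribution."""
    categories = list(counts)
    weights = [counts[category] for category in categories]
    return rng.choices(categories, weights=weights, k=size)


def make_ensemble(counts, size, n_copies, seed=0):
    """Build several synthetic populations: same overall pattern, different details."""
    rng = random.Random(seed)
    return [sample_population(counts, size, rng) for _ in range(n_copies)]


ensemble = make_ensemble(AGGREGATE_COUNTS, size=1000, n_copies=3)
for population in ensemble:
    tally = Counter(population)
    # The count varies a little between copies but stays near the same share.
    print(tally[("Area A", "under 60", "shared household")])
```

No synthetic person corresponds to a real one; only the group-level proportions carry over.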

By discipline O’Neale is an applied mathematician who for the past decade has worked on complex networks. “Think of it as the collection of dots and lines that show how things interact. If you map this network you can ask questions about how information might spread and find out how many hops it might take to go from one person to any other arbitrary person.”
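
The ‘how many hops’ question has a standard answer: a breadth-first search over the network. The sketch below uses a made-up six-person contact network with hypothetical names, purely to show the idea.

```python
from collections import deque


def hops_between(contacts, start, target):
    """Return the fewest hops from start to target, or None if no path exists."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        for neighbour in contacts.get(person, ()):
            if neighbour == target:
                return distance + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


# A toy network: each person maps to the people they are in contact with.
CONTACTS = {
    "Aroha": ["Ben", "Chen"],
    "Ben": ["Aroha", "Dani"],
    "Chen": ["Aroha"],
    "Dani": ["Ben", "Eli"],
    "Eli": ["Dani"],
    "Fetu": [],
}

print(hops_between(CONTACTS, "Aroha", "Eli"))   # 3 (Aroha -> Ben -> Dani -> Eli)
print(hops_between(CONTACTS, "Aroha", "Fetu"))  # None (Fetu has no contacts)
```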

Our world can be seen as a series of complex networks, from international air travel to the plumbing and electricity networks under city streets, from the weather and ocean currents to the flowering plants and insects that pollinate them. The most obvious complex networks are our online interactions on Facebook or Twitter. As a category, complex networks lie at the heart of biology, physics and the human brain. The maths to describe and model these networks comes from other disciplines, from materials science to advanced physics.

“The distributions we take from the IDI are really what in physics we would call our ‘observables’. We create the networks from what we know are the right number of people, the right number of different schools and workplaces and so on. By making these ensembles of networks we can then observe how these correlations play out and then run these through the contagion model.”

Essentially the team of 5 million is a complex network. There are too many individual parts and connections to ever track each one in complete detail, but the mathematics allows the model to run as a simulation of real life, a working model that describes the team of 5 million’s daily routines based on where we live, work or go to school.
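
A heavily simplified sketch of that approach follows. It is an assumption-laden toy, not the team’s code: people are assigned to households and workplaces or schools uniformly at random, where the real model matches distributions drawn from the IDI, and the transmission probability and infectious period are invented. It builds an ensemble of networks from shared group membership, runs a simple stochastic contagion on each, and collects the outbreak sizes.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_network(n_people, n_households, n_workplaces, rng):
    """Link every pair of people who share a household or a workplace/school."""
    groups = defaultdict(list)
    for person in range(n_people):
        groups[("home", rng.randrange(n_households))].append(person)
        groups[("work", rng.randrange(n_workplaces))].append(person)
    neighbours = {person: set() for person in range(n_people)}
    for members in groups.values():
        for a, b in combinations(members, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def run_contagion(neighbours, rng, p_transmit=0.05, infectious_days=5, days=30):
    """Spread from one random seed case; returns the number of people ever infected."""
    seed_case = rng.choice(list(neighbours))
    days_left = {seed_case: infectious_days}  # people who are currently infectious
    ever_infected = {seed_case}
    for _ in range(days):
        newly_infected = set()
        for person in days_left:
            for contact in neighbours[person]:
                if contact not in ever_infected and rng.random() < p_transmit:
                    newly_infected.add(contact)
        # Age existing infections by one day, then add today's new cases.
        days_left = {p: d - 1 for p, d in days_left.items() if d > 1}
        for person in newly_infected:
            days_left[person] = infectious_days
        ever_infected |= newly_infected
    return len(ever_infected)


def ensemble_outbreak_sizes(n_networks=20, seed=0):
    """Run the contagion once on each of several independently built networks."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_networks):
        network = build_network(n_people=500, n_households=180, n_workplaces=40, rng=rng)
        sizes.append(run_contagion(network, rng))
    return sizes


print(sorted(ensemble_outbreak_sizes()))
```

Because both the networks and the contagion are random, the output is a spread of outbreak sizes. Questions about risk are answered from that spread, not from any single run.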

The supermodel team: (from left) James Gilmore, Steven Turnbull, Dion O'Neale, Oliver Maclaren, Emily Harvey, Frankie Patten-Elliott and David Wu.

The maths originated in materials science and physics but has been applied to the epidemiology of disease for at least a decade, most notably when researchers successfully modelled the spread of major Ebola outbreaks in Africa.

Like a lot of research, there is an element of serendipity behind the project. In early 2020, O’Neale was supervising doctoral students at Te Pūnaha Matatini who were studying education and employment networks, curious about how knowledge flows.

“We realised that we had an awful lot of information about two of the big contexts for how society interacts, at school and at work.” A visiting post-doc was trapped in New Zealand in the first lockdown and jumped in to help integrate the education and employment networks with dwelling data.

This work is easy to describe in a few paragraphs, but that belies its inherent complexity. Even with a small population of only about 5 million, the number of possible connections runs to around 25 trillion, or 25,000,000,000,000. This year the Health Research Council has acknowledged the work’s importance with a grant of $1m.
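
The scale of that figure follows from simple counting. As a rough check, assuming a round population of 5 million, the sketch below counts possible pairings two ways: 5 million multiplied by 5 million, which gives the 25 trillion quoted, and distinct pairs of people with each possible link counted once, which gives about half that.

```python
from math import comb

population = 5_000_000  # round figure assumed for illustration

ordered_pairs = population * population  # every (person, person) combination
distinct_pairs = comb(population, 2)     # each possible link counted once

print(f"{ordered_pairs:,}")   # 25,000,000,000,000
print(f"{distinct_pairs:,}")  # 12,499,997,500,000
```

Either way the count is in the trillions, far too many links to record individually.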

A big driver for O’Neale is how the model can be used to address equity issues related to both the potential burden of disease and the intervention measures, such as lockdowns, that are needed for disease control. A barrier to addressing some of these questions is the need to plug important gaps in public information, arising in particular from the 2018 census, the first to go digital. The 2018 census received completed forms from only 83.4 per cent of the population (down from 92.2 per cent in 2013), with much worse response rates from Māori and Pacific peoples and for all 15 to 29-year-olds.

Worrying Census gaps

For quantitative researchers who need to work with population data, the 2013 and 2018 Censuses were disasters. O’Neale singles out specific and worrying gaps, so fundamental that Aotearoa lacks the information it needs to respond effectively to the pandemic.

One example is household data: almost 360,000 people and 14 per cent of the Māori and Pacific population could not be linked to a household, and 21 per cent of young Māori and Pacific males do not appear to live anywhere at all.

“Researchers warned that the new approach was going to leave big gaps and this would be terrible for underserved communities in particular, regional Māori, Pacific communities, and the youngest and oldest members of those communities,” he says.

O’Neale warns that if the next census does not deliver, New Zealanders will experience the loss of a generation’s crucial data, information that needs to be accurate to deliver health, social and economic services effectively.

Much of the project will seek to address gaps in the data. “We can’t do our own census, it’s too expensive and too slow and it isn’t our job. But we can work with people and ask them about their experience with the census and work out how to try and better represent those who were under-served by the census process.”

A key member of the team is Dr Andrew Sporle (Te Rarawa), a founding member of Te Mana Raraunga, the Māori Data Sovereignty Network, who will work with them on practical ways to mind those gaps. “We will have to make some assumptions. A high percentage of young males don’t appear to have a link to a dwelling, but we don’t think they are homeless and can assign them to a dwelling in a community,” says O’Neale.

One of the benefits of the super model is being able to take a closer look at the impact of Covid on specific communities. The team has estimated that Māori and Pacific communities could experience fatality rates twice that of Pākehā in a poorly controlled community outbreak. People in precarious work are more likely to go to work when sick and to have more than one job, and people in overcrowded housing are more likely to have pre-existing health issues that will make Covid far worse. All of these factors are over-represented among Māori and Pacific people.


“We want our work to go to the people who want to use it. We want other people, not just the ministries and the Government, to be asking us questions. We will take the time to go and explain what the model can and can’t do and talk with communities and ask them what they would like to know, what would be useful for them.”

The pandemic is a global black swan event. But in many respects, says O’Neale, New Zealand has advantages beyond the widest moat in the world. For data scientists, New Zealand is special. Very few countries have a data resource as comprehensive as New Zealand’s IDI. Australia has been building an individual-level national contagion model for many years, but with the IDI and some very long days, New Zealand was able to get a working contagion model up within months.

Once operational, the super model for the team of 5 million will be a one-of-a-kind tool to guide social and economic policy in what is likely to be a long war against the pandemic.

“We want to use this to address how we can come through the pandemic without leaving some of us to bear a disproportionate level of burden and we just don’t want to be talking with ministries. We really want to answer questions from communities that would make a difference for them,” says O’Neale.

Popularity is not symmetrical.

The friendship paradox

The strategy for the vaccine roll-out is to focus first on those most vulnerable. O’Neale has no issue with this, but as the roll-out extends to the wider population, what then becomes the most effective strategy?

Theoretically, a network model can determine how many of us might be super spreaders, i.e. those of us who are popular and likely to have a lot of connections at work and in personal life. A possible strategy would be for vaccinators to try to identify super-connected individuals and vaccinate them first. But the IDI is completely anonymised. How could you identify the people who are most connected and therefore likely to be super spreaders?

One answer lies in the friendship paradox, a phenomenon first observed by sociologists. Put simply, as you might have always fretted, your friends have more friends than you do. The paradox is that we tend to assume a symmetrical balance, in which our friends have about the same number of friends as we do. This is not the case.

On Facebook, your friends will on average have more friends than you have. In real life your sexual partners will on average have had more partners than you have had. The friendship paradox is due to the structure of social networks: people with many connections show up in many other people’s circles, so they are over-represented among ‘friends’. If you are popular, then many of those you are friends with will be less popular. Friendship turns out to be asymmetrical.

So O’Neale says one viable vaccine strategy for the general population would be for the vaccine teams to go to people and ask them to name a friend. By ‘friend’ he means a regular, sufficiently close epidemiological contact that they could infect or be infected by. Friends in other countries don’t count. They would then vaccinate that ‘friend’ first.

“That process is preferentially going to find the people who are more connected and therefore are the people who if they contract Covid pose a greater risk to the community.” This is unlikely to happen, but hypothetically it would be a very effective way to prioritise vaccination.
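
The effect is easy to check on a simulated network. The sketch below is a toy under stated assumptions: the network is grown by a simple ‘newcomers prefer well-connected people’ rule, not taken from real contact data, and the function names are hypothetical. It compares the average number of contacts of people picked at random with that of people nominated as a friend by someone picked at random.

```python
import random
import statistics


def grow_network(n_people, links_per_newcomer, rng):
    """Grow a network in which newcomers tend to befriend well-connected people."""
    friends = {0: {1}, 1: {0}}
    endpoints = [0, 1]  # each person appears once per friendship they have
    for newcomer in range(2, n_people):
        chosen = set()
        while len(chosen) < min(links_per_newcomer, newcomer):
            chosen.add(rng.choice(endpoints))
        friends[newcomer] = set()
        for friend in chosen:
            friends[newcomer].add(friend)
            friends[friend].add(newcomer)
            endpoints.extend([newcomer, friend])
    return friends


def compare_strategies(friends, n_doses, rng):
    """Mean contacts of randomly picked people versus people nominated as a friend."""
    people = list(friends)
    random_picks = [rng.choice(people) for _ in range(n_doses)]
    nominated = [
        rng.choice(sorted(friends[rng.choice(people)])) for _ in range(n_doses)
    ]
    mean_random = statistics.mean(len(friends[p]) for p in random_picks)
    mean_nominated = statistics.mean(len(friends[p]) for p in nominated)
    return mean_random, mean_nominated


rng = random.Random(42)
network = grow_network(2000, 2, rng)
mean_random, mean_nominated = compare_strategies(network, 500, rng)
print(f"random picks: {mean_random:.1f} contacts on average")
print(f"nominated friends: {mean_nominated:.1f} contacts on average")
```

Nominated friends come out better connected because a person with many contacts can be named by many different people, so asking for a friend samples people in proportion to how connected they are.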

Story by Gilbert Wong

Main researcher portrait by Billy Wong

Mātātaki | The Challenge is a series from the University of Auckland about how researchers are tackling the world's biggest challenges. To republish this article please contact: gilbert.wong@auckland.ac.nz