What do second Covid waves look like? The evidence from Norway and Sweden

Imagine a country in which every region has the same herd immunity threshold for Covid. Now imagine this country witnesses a Covid wave, but not every region suffers the same level of infections. This means some regions finish the first wave with more immunity than others.

If the hardest-hit regions have made substantial progress towards the herd immunity threshold, then in the second wave you’d expect them to suffer less than the country on average. In a scatterplot you’d see a negative correlation: a high value on the horizontal axis (first-wave infections) would tend to go with a low value on the vertical axis (second-wave infections).
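To make the expected pattern concrete, here is a minimal sketch of how such a correlation could be computed. The five regional figures are invented purely for illustration (they are not real data), and `pearson_r` is just a helper name chosen for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical infections per 1,000 people in five regions (invented numbers).
first_wave = [5, 10, 20, 40, 80]
second_wave = [60, 55, 45, 30, 10]

r = pearson_r(first_wave, second_wave)
print(f"r = {r:.2f}")  # negative here: the harder-hit regions do better later
```

A value of r near -1 is the textbook herd-immunity picture, a value near 0 is the shapeless cloud, and a positive value means the same regions are hit hard in both waves.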

A study of the evolution of the Covid epidemic in Lombardy finds support for the idea that areas with the most first-wave infections fared better in the second wave. Figure 3 in the paper shows the kind of scatterplot I mentioned in the paragraph above.

On the other hand, analyses of many countries and regions with major Covid outbreaks have found either no correlation (a shapeless cloud) or a positive correlation. This Medium post discusses the case of Spain as of September 2020, and uses such a weak positive correlation to argue against the idea that herd immunity was playing a significant role in the epidemic.

(Note that if the virus is transmitted more easily in cold months, one should expect a rise in infections in those months even if community immunity is very high. The mere fact that infections increase between July and December does not imply anything about herd immunity.)

There are at least two complications with the negative-correlation idea:

- The herd immunity threshold may differ between regions. Hence, a region may end up with more (or fewer) infections than average in both waves.

- We don’t know the number of infections.

Building on the second point, the most commonly used proxies for the true number of infections are tests for active infection, antibody tests, and fatalities (both “Covid deaths” and all-cause mortality). But none of these is perfect. Tests for active infection are a bad metric for the first months of the epidemic because testing was not widespread. Antibody surveys have not been carried out regularly in most countries and might not capture all infections. There is uncertainty as to the cause of many deaths, hence the difficulty in defining “Covid fatalities”. And all-cause mortality figures can take a long time to be published. That’s why the authors of the Lombardy study had to compare first-wave deaths with second-wave cases.

The peculiarities of data collection by governments can also pose an obstacle. In Spain’s case I’m not aware of a breakdown of Covid fatalities over time by province. Instead data is disaggregated by region, or Comunidad Autónoma as we call it. Some of these have a greater population and area than entire countries like Ireland. Ideally, an analysis shouldn’t reduce such big regions to a single dot in a scatterplot.

So what data can we use?

In this post I decided to focus on two countries that have published statistics consistently since the Covid outbreak started: Norway and Sweden. Both provide data on how many Covid fatalities took place each week in every county. For Sweden I’ll also use all-cause mortality. As for Denmark, the country only has five regions and I couldn’t find a breakdown of deaths over time for smaller jurisdictions.

Norway is an outlier among Western nations because of the low impact Covid has had. Sweden by contrast has a death toll in line with the European average. Both countries had a first wave in spring 2020 and a second wave in autumn and winter. For this analysis I pick the last day of week 35, which is to say August 30th, as the end of the first wave. Can we learn anything by comparing the evolution of the epidemic?
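The week-ending dates quoted in this post can be checked with a few lines of Python. This sketch assumes ISO 8601 week numbering (weeks run Monday to Sunday), which is consistent with the dates given here; `iso_week_end` is a helper name made up for the example.

```python
from datetime import date

def iso_week_end(year, week):
    """Return the Sunday that closes the given ISO 8601 week."""
    return date.fromisocalendar(year, week, 7)

print(iso_week_end(2020, 35))  # 2020-08-30, end of the first wave in this post
print(iso_week_end(2020, 53))  # 2021-01-03, last week of 2020
print(iso_week_end(2021, 1))   # 2021-01-10
print(iso_week_end(2021, 3))   # 2021-01-24
```

Note that `date.fromisocalendar` requires Python 3.8 or later.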

In the following plots every dot represents a county. In Norway’s case, one could approximately know how many fatalities a county would have in the second wave just by knowing how many it had in the first. R² is just north of 0.7.

But for Sweden there is almost no correlation. First-wave fatalities only explain 9 to 10% of the variance in second-wave deaths.
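For readers who want to reproduce this kind of figure, R² for a plot like these is the share of the variance in second-wave deaths that a straight line through the dots accounts for. Below is a sketch of the standard calculation, not a copy of the code in the folder linked at the end of the post. The county totals are invented, and whether to use raw counts or deaths per capita is a choice the sketch leaves open.

```python
def r_squared(xs, ys):
    """R² of an ordinary least-squares line fitted to predict ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variance left unexplained by the fitted line vs. total variance in ys.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical county totals (invented, not the Norwegian or Swedish data).
first_wave_deaths = [2, 5, 9, 14, 30]
second_wave_deaths = [4, 6, 15, 12, 41]
print(f"R² = {r_squared(first_wave_deaths, second_wave_deaths):.2f}")
```

With a single predictor, R² equals the square of the Pearson correlation, so it says how tightly the dots follow a line but not whether the line slopes up or down.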

How many people are dying due to Covid in Sweden?

There is some evidence that Sweden may be over-counting Covid fatalities in the second wave. Since October, all-cause mortality has been well below the level one would expect if every Covid fatality came on top of the usual number of deaths. That’s why the purple line (total mortality) is now under the orange one (normal mortality + Covid deaths). In other words: non-Covid deaths since October have been lower than normal. The plot also suggests there was under-counting in the first wave, but to a lesser degree.
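The comparison between the two lines boils down to simple arithmetic. Here is a sketch with invented weekly figures; the function names are hypothetical and the numbers are not Swedish data.

```python
def implied_non_covid_deaths(all_cause, covid):
    """Deaths left over once reported Covid deaths are subtracted."""
    return all_cause - covid

def reporting_gap(all_cause, covid, baseline):
    """Gap between 'normal mortality + Covid deaths' and actual deaths.

    Positive: the sum exceeds actual deaths (possible over-counting).
    Negative: actual deaths exceed the sum (possible under-counting).
    """
    return (baseline + covid) - all_cause

# Invented figures for a single week, for illustration only.
week = {"all_cause": 2000, "covid": 400, "baseline": 1800}

gap = reporting_gap(week["all_cause"], week["covid"], week["baseline"])
non_covid = implied_non_covid_deaths(week["all_cause"], week["covid"])
print(gap)        # 200
print(non_covid)  # 1600, below the baseline of 1800
```

In this made-up week the orange line (1800 + 400 = 2200) sits above the purple one (2000), which is the second-wave situation described above.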

The phenomenon is not uniform across Sweden. In Stockholm and nearby counties all-cause mortality had almost returned to normal levels by week 1–2021 (ending on January 10th). This part of the country suffered a large spring wave.

In southern Sweden, by contrast, all-cause mortality is still well above normal, and the gap between expected and actual mortality is proportionately smaller. Comparing South with East, the former had a smaller spring wave but a larger autumn and winter wave. And the pattern is more noticeable if one looks at all-cause mortality, rather than the official Covid toll.

So, does the picture change if instead of the official number of Covid deaths one looks at excess mortality? For convenience I simply took the 2015–2019 average as the “normal” level of mortality, but I accept the result may change with a different baseline. Some people have argued for a lower baseline to account for gradually improving health outcomes, and indeed Sweden’s public health agency appears to calculate a “normal” mortality level that’s slightly below the 2015–2019 average.
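As a rough sketch of the calculation, the baseline for each week is the average of that week’s deaths across the reference years, and excess mortality for a wave is the sum of observed minus baseline over the wave’s weeks. The example below uses a tiny invented dataset (four “weeks”, two reference years) so the arithmetic can be followed by hand; the function names are hypothetical.

```python
def weekly_baseline(history):
    """Average deaths per week across the reference years.

    history maps year -> list of weekly deaths; all lists have the same length.
    """
    years = list(history.values())
    n_weeks = len(years[0])
    return [sum(y[w] for y in years) / len(years) for w in range(n_weeks)]

def excess_deaths(observed, baseline, start_week, end_week):
    """Sum of (observed - baseline) over weeks start_week..end_week.

    Weeks are 1-indexed and the range is inclusive.
    """
    return sum(observed[w - 1] - baseline[w - 1]
               for w in range(start_week, end_week + 1))

# Invented numbers, for illustration only.
history = {2018: [100, 110, 90, 100], 2019: [120, 90, 110, 100]}
baseline = weekly_baseline(history)  # [110.0, 100.0, 100.0, 100.0]
observed = [150, 130, 95, 120]

print(excess_deaths(observed, baseline, 1, 2))  # 70.0 ("first wave")
print(excess_deaths(observed, baseline, 3, 4))  # 15.0 ("second wave")
```

With real data one would pass the 2015–2019 weekly series as `history` and split 2020 at week 35. How to treat week 53, which most years do not have, is a choice the sketch leaves open.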

All-cause mortality data for recent weeks is incomplete. To be on the safe side I stopped at the last week of 2020, which actually finishes on January 3rd of 2021. This is the result.

Across Sweden’s counties, the correlation between first- and second-wave excess mortality is just below zero.

It is worth mentioning that this correlation has been declining as the second wave has progressed. Going back to the official Covid toll, as of January 24th R² was 0.096. But with January 3rd as the endpoint instead, to make it comparable with the excess mortality chart, R² was 0.14.

Thus, the definition of Covid death matters for this kind of plot. A mild but positive correlation can disappear just by looking at excess mortality rather than the official Covid death toll.

Tentative conclusions

So long as a country’s population remains mostly susceptible to Covid, one should expect that the regions with the highest number of infections and deaths in the early months will also have more than average in later stages. Thus a scatterplot comparing first- and second-wave infections or deaths will not show a shapeless cloud. Rather, there will be a positive correlation. I believe this happens because many of the factors that led a region to have more deaths at the beginning of an epidemic will still be present in a second or third wave. I’m thinking of issues like the proportion of people living in nursing homes, share of the population with vitamin D deficiency, etc. As group immunity builds up, the correlation becomes weaker, perhaps going negative at some point.

I’m just a guy writing in his free time and obviously this blog post hasn’t been peer-reviewed, so take it with a grain of salt. If anyone wants to further check these conclusions, it would be useful to see if the Norway pattern also shows up in other countries with few infections, like Finland and South Korea.

Data

All data and code are in this Google Drive folder.

For Norway, I haven’t found a dashboard with the actual data in Excel or similar form. So I use the weekly report. In particular, here are the PDFs for week 35–2020 and week 3–2021.

In Sweden’s case this is the main portal for Covid data. The daily update can be found through this link. I used data updated on January 29th.

Sweden’s all-cause mortality is published on this website, and the file itself is here. I used the January 25th update.