Applying mixture model methods to SARS-CoV-2 serosurvey data from Geneva

Bouman JA, Kadelka S, Stringhini S, Pennacchio F, Meyer B, Yerly S, Kaiser L, Guessous I, Azman AS, Bonhoeffer S, Regoes RR

Serosurveys are an important tool to estimate the true extent of the current SARS-CoV-2 pandemic. So far, most serosurvey data have been analyzed with cutoff-based methods, which dichotomize individual measurements into seropositive or seronegative based on a predefined cutoff. However, mixture model methods can extract additional information from the same serosurvey data. Such methods refrain from dichotomizing individual values and instead use the full distribution of the serological measurements from pre-pandemic and COVID-19 controls to estimate the cumulative incidence. This study presents an application of mixture model methods to SARS-CoV-2 serosurvey data from the SEROCoV-POP study, collected in April and May 2020 in Geneva (2766 individuals). Besides estimating the total cumulative incidence in these data (8.1%, 95% CI: 6.8%-9.9%), we applied extended mixture model methods to estimate an indirect indicator of disease severity: the fraction of cases with a distribution of antibody levels similar to that of hospitalized COVID-19 patients. This fraction is 51.2% (95% CI: 15.2%-79.5%) across the full serosurvey, but differs between three age classes: 21.4% (95% CI: 0%-59.6%) for individuals between 5 and 40 years old, 60.2% (95% CI: 21.5%-100%) for individuals between 41 and 65 years old, and 100% (95% CI: 20.1%-100%) for individuals between 66 and 90 years old. Additionally, we find a mismatch between the inferred negative distribution of the serosurvey and the validation data of pre-pandemic controls. Overall, this study illustrates that mixture model methods can provide additional insights from serosurvey data.
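
To illustrate the general idea behind a mixture model estimate of cumulative incidence, the following is a minimal sketch and not the implementation used in this study. It assumes that the antibody distributions of negative and positive controls are known and Gaussian, and it uses simulated data; the function names (`normal_pdf`, `estimate_cumulative_incidence`), the distribution parameters, and the sample size are all hypothetical. The mixing weight of the positive component, estimated here with a simple expectation-maximization loop, plays the role of the cumulative incidence.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def estimate_cumulative_incidence(values, neg_density, pos_density,
                                  n_iter=500, tol=1e-10):
    """Maximum-likelihood mixing weight of the positive component.

    The component densities are held fixed (as if taken from control
    data); only the mixing weight is updated by EM.
    """
    f_neg = neg_density(values)
    f_pos = pos_density(values)
    pi = 0.5  # starting guess for the fraction of positives
    for _ in range(n_iter):
        # E-step: probability that each measurement comes from a positive
        resp = pi * f_pos / (pi * f_pos + (1.0 - pi) * f_neg)
        # M-step: the new mixing weight is the mean responsibility
        new_pi = float(resp.mean())
        converged = abs(new_pi - pi) < tol
        pi = new_pi
        if converged:
            break
    return pi


# Simulated serosurvey (hypothetical parameters, not the Geneva data)
rng = np.random.default_rng(0)
n, true_pi = 5000, 0.08
is_pos = rng.random(n) < true_pi
values = np.where(is_pos,
                  rng.normal(2.0, 1.0, n),   # assumed positive distribution
                  rng.normal(0.0, 0.5, n))   # assumed negative distribution

estimate = estimate_cumulative_incidence(
    values,
    neg_density=lambda x: normal_pdf(x, 0.0, 0.5),
    pos_density=lambda x: normal_pdf(x, 2.0, 1.0),
)
print(f"estimated cumulative incidence: {estimate:.3f}")
```

No individual measurement is classified as positive or negative in this sketch; every value contributes to the estimate through its likelihood under both component distributions, which is the contrast with cutoff-based methods drawn above.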