Analyzing COVID-19 Data with Mixed Molecular and Serological Test Results

Jun 15, 2020

Updated July 12, 2020.

Knowing the daily growth of pathogen infections and its trend is crucial for measuring the effectiveness of moderation and control measures. In the United States of America, COVID-19 test results come from two test types: molecular and serological. Using the two types together without distinction introduces a graphing and analysis error. Estimating the current daily growth rate and the spread of infection within our population requires placing both test-type results on a timeline that reflects how far each type sits from the symptom onset date.

This image illustrates the problem with graphical representations of mixed test-type results when they are graphed against the test date instead of the report date.

Figure 1: Where are we? Common COVID-19 data representation against the estimated chronologically corrected representation.

As can easily be seen in the above illustration (Fig. 1), the same data set yields two very different graphs. This publication explains why the second graph is closer to reality and how it can be recreated.

Let’s start with the following Proposed Timeline of Testing for SARS-CoV-2.

Figure 2: Sethuraman N et al. Interpreting Diagnostic Tests for SARS-CoV-2. JAMA 2020.

The proposed timeline (Fig. 2) identifies detection peaks for polymerase chain reaction (PCR) tests and antibody detection tests. The third to fourth week after symptom onset is the peak window for IgM and IgG seroconversion, two to three weeks after the peak window for PCR detection. An infection detected from antibodies is therefore at least two weeks further from symptom onset than an infection detected with a PCR test. Representing both test types on the same timeline consequently introduces a graphing error: a false representation of the actual daily new case history, caused by the misalignment to symptom onset. A proper historical representation of positive tests from the two test types requires an alignment factor.

Knowing the two-to-three-week timeline mismatch is not enough to determine how the two test types represent the actual chronology. For that, we need a system: the Unique Case System. The Unique Case System prioritizes PCR tests for infection confirmation and allows only one positive result per person, regardless of the number or type of tests administered.

If a person gets a positive antibody test result in the Unique Case System, the result is assigned to the probable case group. If the same person gets a positive PCR test result afterward, that case is removed from the probable cases and moved to the confirmed cases. If the PCR test comes back negative, the person stays in the probable cases list because the test missed the PCR-Likely Positive detection window. The end result is that the Unique Case System divides the two test-type results between two distinct regions: the PCR-Likely Positive and the PCR-Likely Negative regions.
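The classification rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not the code behind the figures: the `TestResult` record, the `"PCR"` and `"antibody"` labels, and the `classify_unique_cases` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TestResult:
    person_id: str
    test_type: str   # "PCR" or "antibody" (hypothetical labels)
    positive: bool
    test_date: date


def classify_unique_cases(results):
    """Return {person_id: "confirmed" | "probable"}, one case per person."""
    cases = {}
    for r in sorted(results, key=lambda r: r.test_date):
        if not r.positive:
            # A negative result never changes an existing status, so a
            # probable case stays probable after a negative PCR test.
            continue
        if r.test_type == "PCR":
            # A positive PCR always wins: the case is (or becomes) confirmed.
            cases[r.person_id] = "confirmed"
        elif r.test_type == "antibody":
            # A positive antibody test only creates a probable case when the
            # person has no case yet; it never downgrades a confirmed one.
            cases.setdefault(r.person_id, "probable")
    return cases
```

People with only negative results never appear in the output, and a person with several positive results is still counted once.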

Figure 3: Salim Rezaie et al.; COVID-19: Screening, Testing, PUI, and Returning to Work.

The division (Fig. 3) produced by the Unique Case System effectively eliminates the overlap between the two test types. With this system, the patient is confidently in either the PCR-Likely Positive region or the PCR-Likely Negative region (3.1), the latter being approximately three weeks away (3.3) from the PCR detection peak (3.2).

The goal is to align PCR tests and antibody detection tests to the symptom onset date. A simple form of alignment is to use the detection peak of each test type to estimate the test alignment factor. The preferred method is to use curve data for each test type, but a simpler approximation can be to use “Week 1” (3.2) for the PCR detection peak, “Week 4” (3.3) for the IgM and IgM+IgG antibody detection results, and “Week 6” (3.4) for the IgG antibody detection results. These are the peaks used in the Fig. 1 representation of Estimated Daily New cases.

Using those detection peaks as alignment peaks, the test date alignment factor is three weeks (-21 days) for IgM and IgM+IgG positive test results and five weeks (-35 days) for IgG positive test results. Applying these alignment factors to the antibody test results places their test dates on the same scale as the PCR test dates, relative to the symptom onset date.
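As a minimal sketch, the alignment factors can be applied as a simple date shift. The dictionary keys and the `align_test_date` function are hypothetical names for this example; the offsets are the -21 and -35 day factors stated above, and the PCR offset of 0 assumes PCR test dates are left unshifted.

```python
from datetime import date, timedelta

# Alignment factors in days, relative to the PCR detection peak.
# Keys are hypothetical labels for the result types named in the text.
ALIGNMENT_DAYS = {
    "PCR": 0,        # assumption: PCR dates serve as the reference scale
    "IgM": -21,
    "IgM+IgG": -21,
    "IgG": -35,
}


def align_test_date(test_date, result_type):
    """Shift a positive test's date by the alignment factor for its type."""
    return test_date + timedelta(days=ALIGNMENT_DAYS[result_type])


# Example: an IgG-positive test taken on May 24 is plotted on April 19.
print(align_test_date(date(2020, 5, 24), "IgG"))
```

The preferred method mentioned earlier, using the full detection curve of each test type, would replace these fixed offsets with a distribution of offsets.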

Figure 4: Inflection testing of alignment factors.

Inspection of the PCR and antibody test curves should show alignment at inflection points (Fig. 4), which validates the alignment factors. This alignment can be evaluated and quantified with various mathematical methods. The aligned curves will not match perfectly. In the case of the curves presented in Figure 4, antibody tests represent about 75% of all unique cases, and COVID-19 moderation and control measures triggered antibody testing surges as each economic sector opened, hence the disproportionate spikes in results and the expected fluctuations.
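One possible way to quantify the alignment, offered here only as an illustration and not as the method used for Figure 4, is a lagged Pearson correlation: shift the unaligned antibody series earlier by a candidate number of days and check which shift correlates best with the PCR series. The function names and the toy daily counts below are assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def best_lag(pcr, antibody, max_lag):
    """Return (lag, scores): the shift in days that best matches the curves.

    `pcr` and `antibody` are daily positive counts starting on the same day.
    A lag of k compares PCR day t with antibody day t + k.
    """
    scores = {}
    for lag in range(max_lag + 1):
        shifted = antibody[lag:]
        n = min(len(pcr), len(shifted))
        scores[lag] = pearson(pcr[:n], shifted[:n])
    return max(scores, key=scores.get), scores


# Toy data: the antibody curve repeats the PCR shape three days later.
pcr = [1, 3, 7, 12, 7, 3, 1, 1, 1, 1]
antibody = [2, 2, 2, 2, 6, 14, 24, 14, 6, 2]
lag, scores = best_lag(pcr, antibody, max_lag=5)
print(lag)
```

With real data, a best lag close to the alignment factor would support that factor, while testing surges like the ones described above would lower the correlation without necessarily moving the best lag.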

The corrected and aligned representation can provide relevant surveying data, especially an approximate symptomatic-to-asymptomatic ratio. Because every positive antibody test result remaining in the Unique Case System lacks a positive PCR result, these results can be assumed asymptomatic, albeit with a calculable margin of error. The point with the largest gap between PCR and antibody test results can represent an estimated symptomatic-to-asymptomatic ratio. In the region graphed in Figure 4, the average percentage of asymptomatic infected people is 81%.

Figure 5: Estimated total new cases with 81% asymptomatic average.

Figure 5 presents the estimated total of infected people using the calculated estimate of 81% from the Puerto Rico Department of Health data. With an estimated asymptomatic curve of 6,250 unique cases and 1,477 total confirmed cases, an estimated 7,727 total infections have occurred since the day of the first confirmed infection.
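The arithmetic behind these totals can be checked directly. The variable names are chosen for this example; the counts are the ones quoted above.

```python
estimated_asymptomatic = 6250   # unique cases on the estimated asymptomatic curve
confirmed = 1477                # total confirmed (PCR-positive) cases

total_infections = estimated_asymptomatic + confirmed
asymptomatic_share = estimated_asymptomatic / total_infections

print(total_infections)                  # 7727
print(round(asymptomatic_share * 100))   # 81
```

The resulting share of roughly 81% is consistent with the average asymptomatic percentage reported for the region graphed in Figure 4.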

Update: July 12, 2020

Up-to-date data has been incorporated into the model and is presented in the following figure as a stacked-bar plot. The most recent dates, shown without estimated probable cases, represent the period not yet covered by the current survey data. The model continues to show alignment between serology tests and molecular tests, as can be seen in the visual patterns. To review the model you can access the original data report at this link.

Figure 6: Mixed molecular and serological test results with time-corrected probable cases.

This model has shown alignment since May 24, the original release date, and is available for anyone interested in either applying it or validating it. Anyone interested in downloading the data used in this publication can contact the author at https://bit.ly/tecnocato.

Disclaimer: This is a working document. It represents research in progress and has not yet undergone peer review. The analysis presented here is limited to the data provided by Departamento de Salud de Puerto Rico and has not yet been evaluated with another compatible database. Therefore, the conclusions presented here are limited to that dataset and are subject to its errors.

Written by Israel Meléndez, II
