ABSTRACT
Wastewater-based epidemiology (WBE) has emerged as a valuable tool for monitoring coronavirus disease 2019 (COVID-19), especially as the frequency of clinical testing diminishes. Beyond COVID-19, the tool's versatility extends to addressing various public health concerns, including antibiotic resistance and drug consumption. However, the complexity of sewage systems introduces noise when measuring chemical tracer concentrations, potentially compromising their applicability for modeling. In our study, we detail the approach adopted to determine the concentration of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) ribonucleic acid (RNA) in wastewater from the Ponte a Niccheri wastewater treatment plant in Tuscany (Italy), which serves N = 13,935 inhabitants. The unique characteristics of this wastewater system, including mandatory pretreatment in septic tanks with extended retention times, the presence of a hospital for COVID-19 patients, and mixed sewage networks, posed additional challenges. Nevertheless, our results highlight a robust and significant correlation between our measurements and the number of infections within the wastewater treatment plant's catchment area at the time of sampling. A simple linear model also shows promising results in estimating the number of infected people within the area.
HIGHLIGHTS
Strong and significant correlation found between concentration of nucleic acids in wastewater and clinical test data.
Presence of a small-sized catchment (13,935 people served).
The presence of mandatory septic tanks in civil buildings does not impact results.
The presence of hospitals with intensive care units within the catchment area does not impact results.
Adopting a polishing pipeline using linear interpolation and locally weighted scatterplot smoothing gives promising results.
INTRODUCTION
In 1951, Moore et al. successfully detected Salmonella in sewage systems (Moore 1951). This work unveiled the latent potential of wastewater, not just as a transmission medium, but as a sentinel to evaluate community health. Paired with epidemiological modeling, this method provides an unobtrusive, real-time perspective on a plethora of societal health concerns. For instance, the quest for eradicating polio found a valuable ally in wastewater surveillance. By detecting poliovirus circulation in wastewater, even in the absence of clinical cases, researchers could instigate a rapid response and targeted vaccination drives, profoundly impacting polio eradication efforts (Yang et al. 2013). More recently, during the spread of the COVID-19 pandemic, wastewater-based epidemiology (WBE) has been employed extensively as a monitoring tool for SARS-CoV-2 and its associated disease trends among populations (Medema et al. 2020a; Lundy et al. 2021). It has also shown promise as an early warning system (Medema et al. 2020b; Randazzo et al. 2020; Ahmed et al. 2021; Bibby et al. 2021). The robust performance of these methodologies for surveillance purposes has even led the World Health Organization (WHO) to insert environmental surveillance as part of its revised MOSAIC framework. Part of this interest derives from the fact that massive screening in a pandemic setting has proven extremely difficult, whereas a wastewater-based approach could implement efficient long-term surveillance in large areas, allowing resources for clinical tests to be spent more efficiently and only where needed (Gagliano et al. 2023). In this respect, WBE seems to be the most promising approach for both long-term surveillance and areas where clinical testing may prove too expensive or logistically infeasible (Hart & Halden 2020).
Moreover, the European Parliament is now discussing the possibility of making the concentration measurements that this approach relies upon mandatory in every town with a population of 1,000 or more. Therefore, it is important to verify the feasibility of the WBE approach for smaller plants such as the one we analyze here, where noise sources may affect the signal more strongly.
Peak | Time distance (days) | Height difference | Slope difference |
---|---|---|---|
1 | −7 | 0 | 1 |
2 | 21 | −0.08 | 1 |
3 | −14 | 0.02 | 1 |
In the case of Italy, the pandemic hit hard starting in January 2020, and on February 24th, a SARS-CoV-2-positive case was reported in Florence. While the WBE approach seemed promising, the specificity of the Tuscany Region sewage system and therefore wastewater (Sguanci et al. 2019) could have significantly impacted the signal quality. The most important characteristic that interferes with the viral signal is the mandatory presence of pretreatments before entering the sewer system. Depending on local regulations and the implementation period, there could be two or three septic tanks or even older one-chamber systems that are still present. The total capacity has to be at least 225 l for each equivalent inhabitant, with an absolute minimum of 3,000 l. Therefore, the hydraulic retention time could range from a few hours to several days in a typical anaerobic environment where the viral signal is heavily degraded. At the time of writing, there are still important gaps in the literature concerning the pathways followed by SARS-CoV-2 from infected individuals to the wastewater treatment plant (Foladori et al. 2020). For example, most papers that link wastewater viral signals and clinical cases do not analyze sewage systems where this kind of pretreatment is widespread. The few that consider the impact of septic tanks, like Li et al. (2023), do so by extracting the data directly from the tank and not from a wastewater treatment plant, as in this paper.
Other factors that could potentially influence the viral signal in wastewater are as follows:
the use of combined sewage networks;
a typical highly diluted wastewater due to the drainage of the pre-existing hydrographic networks in urban areas; and
the presence of hydro-demanding small and medium enterprises connected to the sewers.
These challenges were further compounded by Tuscany's consistent presence of tourists, attracted by its world-renowned art cities and coastal regions. These sources of noise may hamper the signal's capability to accurately describe pandemic trends. Recognizing the complexities presented by these factors, the Tuscany Region embarked on a collaboration with the Universities of Florence and Pisa. This joint initiative aimed to craft a robust COVID-19 surveillance system tailored to the region's unique context, placing special emphasis on the potential of WBE as an early warning mechanism.
In this work, we detail the methodologies and strategies adopted to optimize data from the inlet of Ponte a Niccheri wastewater treatment plant (N = 13,935 people served), whose catchment is completely situated within the confines of the province of Florence, the major city within Tuscany. This process involved aligning wastewater data with the number of COVID-19-infected people living in the catchment area by devising a workflow to reduce the impact of noise sources on genomic copy concentration measurements within wastewater. Our findings illuminate both the potential and challenges embedded within the sewage network as an epidemiological resource, with an emphasis on its value for pandemic monitoring and epidemiological model enhancement. A strong and significant correlation was found between wastewater data and clinical tests within the tested catchment area, with remarkably similar pandemic trends. We think our results clearly show the feasibility of a WBE approach to COVID-19 trend monitoring even in relatively small and very complex sewer systems such as those typically found in Tuscany.
MATERIALS AND METHODS
Monitored wastewater
The expected residential connected population (N = 13,935) was computed by taking the Italian National Institute of Statistics (ISTAT) population estimates for Firenze and Bagno a Ripoli and assuming a homogeneous density throughout the city area. Specifically, the served area was reconstructed by applying a 200 m buffer to the sewage network pipes.
Sewage water samples were extracted approximately weekly. Deviations from this pattern mostly occurred within the first few months, during which the extraction system was first implemented. Each extraction is composed of 24 consecutive hourly wastewater samplings, starting at 9 a.m. Right after sampling, each aliquot was stored at 4 °C. A flow meter was also present at the WWTP inlet, making it possible to define a daily average wastewater flow value used in the subsequent data analysis. Data were collected from 22/06/2020 to 16/10/2021.
Viral concentration measurement
The 24-h composite wastewater samples were removed from the automatic sampler, kept refrigerated at 4 °C, and transferred to the laboratory within 48 h.
For 23 samples, viral nucleic acid was concentrated by polyethylene glycol (PEG) centrifugation. In detail, after pretreatment of 30 min at 56 °C for the inactivation of infectious viral particles, 45 mL of sample was centrifuged at 4,500 × g for 30 min at 4 °C. About 40 mL of the supernatant was recovered, and 4 g of PEG 8000 (Fisher Scientific, Geel, Belgium) and 0.9 g of NaCl (Merck KGaA, Darmstadt, Germany) were added. After the components were completely dissolved, tubes were centrifuged at 12,000 × g for 2 h at 4 °C. The pellet was resuspended in 200 μL of ASL stool lysis buffer (Qiagen GmbH, Hilden, Germany).
Nucleic acid extraction was performed using the STARMag 96 × 4 Universal Cartridge Kit (Seegene Inc., Seoul, Republic of Korea) in Microlab NIMBUS (Hamilton Company, Reno, NV, USA), and the eluate (100 μL) was stored at −80 °C. Twelve additional samples were concentrated using the Zymo Environ™ Water RNA Kit (Zymo Research, Irvine, CA, USA) following the manufacturer's instructions. In detail, 5 mL of the sample was directly subjected to RNA enrichment and purification to get a final amount of 15 μL of eluate, which is ready for downstream molecular analysis.
The quantitative detection of SARS-CoV-2 RNA was performed with the reverse transcription real-time polymerase chain reaction (qRT-PCR) developed by the Italian National Institute of Health (ISS) (La Rosa et al. 2021) in a CFX96 thermal cycler (Bio-Rad Laboratories Inc., Hercules, CA, USA). The calibration curve was obtained by serially diluting 1:10 DNA from a cultured wild-type SARS-CoV-2 RNA (provided by the ISS itself) in TE buffer pH 8.0 (Thermo Fisher Scientific Inc., Waltham, MA, USA) from 1 × 10^5 down to 1 × 10^1 copies/μL. The AgPath-ID™ One-Step RT-PCR Reagents kit (Thermo Fisher Scientific) was used for the PCR mix. The thermal protocol consisted of 30 min at 50 °C, 10 min at 95 °C, then 15 s at 95 °C, and 45 s at 60 °C for 45 cycles. Samples were considered positive when a signal was detected at a cycle threshold (Cq) < 40.
The procedure is described in further detail in Morecchiato et al. (2024).
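As an illustration of how a calibration curve of this kind is typically used, the sketch below fits Cq against the base-10 logarithm of the standard concentrations and inverts the fit for an unknown sample. The function names and the Cq readings are hypothetical and chosen only for the example; converting the result into copies per liter of wastewater would additionally require the concentration and elution volumes, which are not modeled here.

```python
import math

def fit_standard_curve(copies_per_ul, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies_per_ul]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve: template copies/uL for a measured Cq."""
    return 10 ** ((cq - intercept) / slope)

# Dilution series as in the text; the Cq readings are made up for illustration.
standards = [1e5, 1e4, 1e3, 1e2, 1e1]
example_cq = [20.0, 23.3, 26.6, 29.9, 33.2]
slope, intercept = fit_standard_curve(standards, example_cq)
estimate = copies_from_cq(28.0, slope, intercept)  # copies/uL for Cq = 28
```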
COVID clinical test data
Anonymized COVID-19 clinical test data compatible with the time of wastewater extractions were granted to us by the Health Department of the Tuscany Region. During the monitored period, testing (e.g. nasopharyngeal swab) was mandatory (Italian Ministry of Health 2022) for:
people with symptoms associated with the illness and close contacts of those registered as positive for SARS-CoV-2;
people with symptoms associated with the illness who travelled to an area where local transmission was present within the previous 14 days;
people with illness-associated symptoms who needed hospitalization.
Tests were carried out by medical professionals and their results were communicated to the local health department. COVID-19 patients, both symptomatic and asymptomatic, were required to quarantine until they tested negative.
We received data for all the COVID-19 tests taken both by people residing within the sewer-served area during their quarantine period and by hospital patients. This was possible because each test was geolocated to either the person's residence address or the hospital, depending on whether they were hospitalized at the time. We also received the test results, specifically whether they were positive or negative. We considered a positive test as the beginning of the infection, and the following negative test as its end. To account for delays both from infection to the positive clinical test and from recovery to the negative test, we carried out a sensitivity analysis by applying different values for a static delay (from a day to a week) and noticed that it did not significantly impact results (data not shown).
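The counting rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the episode representation, and the choice to shift both dates earlier by the same static delay are assumptions made for the example.

```python
from datetime import date, timedelta

def active_infections(episodes, day, delay_days=0):
    """Count people considered infected on `day`.

    Each episode is (positive_test_date, negative_test_date or None).
    The positive test opens the infection and the following negative test
    closes it; None means no negative test has been recorded yet.
    `delay_days` shifts both dates earlier by a static delay (assumption).
    """
    shift = timedelta(days=delay_days)
    count = 0
    for start, end in episodes:
        started = start - shift <= day
        ended = end is not None and day >= end - shift
        if started and not ended:
            count += 1
    return count

# Hypothetical episodes for three residents of the catchment.
episodes = [
    (date(2020, 10, 1), date(2020, 10, 15)),
    (date(2020, 10, 10), date(2020, 10, 20)),
    (date(2020, 10, 18), None),
]
no_delay = active_infections(episodes, date(2020, 10, 12))
with_delay = active_infections(episodes, date(2020, 10, 12), delay_days=3)
```

Repeating the count for several values of `delay_days` (for instance 1 to 7) reproduces the structure of the sensitivity analysis mentioned in the text.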
Wastewater data polishing
During the monitored period, we found measurements in which no SARS-CoV-2 genome was detected even though, according to clinical test data, the number of infected people residing within the catchment area was always more than zero. There may be multiple plausible causes for this behavior, contributing in different ways. The extraction tool is incapable of detecting SARS-CoV-2 nucleic acids at concentrations lower than 500 copies per liter, which may be the result of severe signal degradation within the sewer network and/or wastewater dilution due to parasitic waters. Therefore, samples called negative for SARS-CoV-2 were reported as missing data. In these instances, we used linear interpolation to substitute the original null measure. To assess the impact of introducing linearly interpolated measurements, we repeated the process using the lower limit of detection in place of those values, finding that changes were not significant (results not shown).
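A minimal sketch of the two gap-filling strategies described above is given below, with non-detects encoded as `None`. The function names are hypothetical, and the handling of gaps at the start or end of the series (holding the nearest detected value) is an assumption, since the text does not say how such cases were treated.

```python
def interpolate_non_detects(days, conc):
    """Fill None entries by linear interpolation in time.

    `days` must be sorted sample times (e.g. days since the first sample).
    Leading or trailing gaps take the nearest detected value (assumption).
    """
    known = [(d, c) for d, c in zip(days, conc) if c is not None]
    if not known:
        raise ValueError("no detected values to interpolate from")
    filled = []
    for d, c in zip(days, conc):
        if c is not None:
            filled.append(c)
            continue
        before = [p for p in known if p[0] < d]
        after = [p for p in known if p[0] > d]
        if before and after:
            (d0, c0), (d1, c1) = before[-1], after[0]
            filled.append(c0 + (c1 - c0) * (d - d0) / (d1 - d0))
        else:
            filled.append((before[-1] if before else after[0])[1])
    return filled

def substitute_detection_limit(conc, limit=500.0):
    """Alternative used as a check: replace non-detects with the detection limit."""
    return [limit if c is None else c for c in conc]

# Hypothetical weekly series in copies per liter.
days = [0, 7, 14, 21]
conc = [1000.0, None, 3000.0, None]
interpolated = interpolate_non_detects(days, conc)
```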
Subsequently, we tried to reduce the impact of dilution by normalizing with flow rate data. To do so, we multiplied the genomic copy concentration by the dilution factor (the ratio between the measured average daily flow rate and the minimum average daily flow rate measured during the extractions). This procedure should account for the dilution caused by atmospheric agents and, in general, parasitic waters.
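The normalization described above amounts to a single multiplication per sample, sketched below. The function name and the sample values are hypothetical; the dilution factor follows the definition in the text (daily average flow divided by the minimum daily average flow over all extractions).

```python
def normalize_by_flow(conc, daily_flow):
    """Scale each concentration by its dilution factor.

    dilution factor = daily average flow / minimum daily average flow,
    so the least diluted sample is left unchanged.
    """
    min_flow = min(daily_flow)
    return [c * q / min_flow for c, q in zip(conc, daily_flow)]

# Hypothetical concentrations (copies/L) and daily average flows (m^3/day).
conc = [1000.0, 2000.0, 1500.0]
flow = [5000.0, 10000.0, 7500.0]
normalized = normalize_by_flow(conc, flow)
```

With these illustrative inputs, the sample taken on the day with twice the minimum flow is scaled by a factor of two, while the sample taken at minimum flow is unchanged.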
Following Huisman et al. (2022), we also show that the results obtained can be further improved by applying a LOWESS (LOcally WEighted Scatterplot Smoothing) algorithm with first-order polynomials and tricubic weights to smooth the time series. The smoothing parameter was set to the value that produces the best correlation, excluding values for which the p-value would be above the 0.05 significance threshold or wastewater trends appeared to be over-smoothed.
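To make the smoothing step concrete, the sketch below implements a bare-bones LOWESS with first-order local fits and tricubic weights in plain Python. It is an illustration under stated assumptions rather than the implementation used in the study: the name `lowess` and the `frac` parameter (the fraction of points in each local window, playing the role of the smoothing parameter) are choices made here, and no robustness iterations are performed.

```python
def lowess(x, y, frac=0.5):
    """Locally weighted linear regression with tricube weights.

    For every x0 in x, fit a weighted straight line to the k nearest
    points (k = frac * n) and evaluate it at x0.
    """
    n = len(x)
    k = min(n, max(2, int(round(frac * n))))
    smoothed = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = max(dists[k - 1], 1e-12)  # bandwidth: distance to k-th nearest point
        # Tricube weight: (1 - u^3)^3 for u < 1, and 0 outside the window.
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x0 - mx))
    return smoothed

# Hypothetical weekly series: sampling day and flow-normalized concentration.
days = [0, 7, 14, 21, 28, 35, 42, 49]
signal = [1200.0, 900.0, 2500.0, 2100.0, 4000.0, 3300.0, 2000.0, 2400.0]
trend = lowess(days, signal, frac=0.6)
```

Larger values of `frac` give a smoother curve; the text selects the value by maximizing correlation with clinical data while rejecting over-smoothed or non-significant fits.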
It is important to stress that, while smoothing improves correlation and is strongly advised, pandemic trends detected with wastewater were remarkably time-wise similar to clinical data reports even before applying the LOWESS algorithm.
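The comparison between the wastewater and clinical series can be sketched as a correlation coefficient with an accompanying significance check. The text does not state which coefficient or test was used, so the Pearson coefficient and the permutation test below are assumptions chosen for illustration, as are the function names and the data.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical polished wastewater signal and number of infected residents.
wastewater = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1, 8.8, 10.3]
infected = [10, 19, 32, 41, 48, 62, 70, 79, 93, 98]
r = pearson_r(wastewater, infected)
p = permutation_p_value(wastewater, infected)
```

Note that a permutation test ignores the autocorrelation of a time series, so on real data it would tend to overstate significance; it is used here only to keep the example free of external dependencies.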
RESULTS AND DISCUSSION
Wastewater correlation and trend analysis
Further considerations can be made by observing the specifics of the two time series, as summarized in Table 1 prior to smoothing and Table 2 after smoothing. We notice that the peaks are noticeably close in time, with a 2-week delay at most. Moreover, their height is similar, with an average 0.03 difference. These similarities certainly give a positive outlook toward wastewater's potential for pandemic surveillance. Finally, we can also see that both the ascending and descending trends are steeper for sewage water. These abrupt changes suggest that trying to predict pandemic trends using wastewater data only may prove difficult. Nevertheless, pandemic surveillance could also benefit from these sharp increases/decreases, allowing us to easily define a threshold beyond which we could consider the number of infected dwellers to be concerning.
Peak | Time distance (days) | Height difference | Slope difference |
---|---|---|---|
1 | 0 | 0.2 | −1 |
2 | 13 | −0.3 | 1 |
3 | −14 | −0.3 | 1 |
Hospital's presence effect
CONCLUSIONS
In this work, we created a pipeline to polish SARS-CoV-2 genomic concentration data collected within the Ponte a Niccheri WWTP in Tuscany, spanning more than 1 year. Our results show that our framework is robust even in this particularly tough environment, characterized by the small size of the catchment, mandatory septic tanks for civil buildings, a mixed sewerage network that collects water from rain and both civil and industrial sources, as well as the presence of a hospital with an intensive care unit (ICU) for COVID-19 patients. While these characteristics are specific to the catchment area studied here, we argue that they do not limit the generalizability of our methodology, which could achieve even better results if those adverse conditions were lifted.
We show the presence of a strong and significant correlation between polished wastewater time series and COVID-19 clinical tests, whether considering the full catchment area, the hospital only, or the area outside the hospital only.
This proves the feasibility of the WBE approach for surveillance purposes even in those settings where the noise level is much higher than what is usually found in literature. Moreover, we can conclude that the presence of an ICU in the catchment area does not require the hospital to be sampled separately, and the samples obtained at the WWTP level are still representative of the area.
These remarkably positive results lead us to regard WBE as a feasible and promising approach to pandemic surveillance in Tuscany and other areas where signal noise is high, for example, due to the presence of septic tanks before the discharge of wastewater into the sewers and the generalized use of combined sewer systems. The importance of such a tool becomes clear when observing the declining number of clinical tests carried out in the area during 2023: the sewage network could be, in this instance, a more reliable data source than traditional tests for a significant period. In such conditions, WBE would be the most cost-effective approach for pandemic surveillance. Moreover, while the method was produced and tested on SARS-CoV-2 data only, it could be adapted to study different health issues such as drug consumption, antibiotic resistance, or other illnesses. The flexibility of this tool makes it extremely suitable for a One Health approach.
Our framework could benefit from further validation, not only in the Tuscan environment but also in less noisy settings. Additionally, it could be interesting to study the pipeline's sensitivity to the size of the catchment area, analyzing data from both larger WWTPs and more localized sampling campaigns to monitor small communities (e.g. schools, hospitals, and prisons). Finally, the tested procedure could also be adjusted, when data are available, to monitor other health issues. In this case, the end goal could be constructing a single framework capable of carrying out wastewater surveillance on multiple different aspects of public health. Moreover, given the positive results of this pilot test despite the harsh conditions exacerbating noise issues, we think there could be value in creating epidemiological models (e.g. compartmental models) that rely on wastewater data and comparing their predictions with those of models trained with clinical data only.
The authors thank Piergiuseppe Calà (Regione Toscana), Alessandro Cappelli (Autorità Idrica della Toscana), Fabrizio Mancuso (Ingegnerie Toscane srl), Simone Caffaz (Publiacqua SpA), Tiziana Pielggi (University of Florence), and Emanuele Massaro (European Commission Joint Research Center) for the general support of the research. Co-funded by the European Union – NextGenerationEU – National Recovery and Resilience Plan, Mission 4, Component 2 – Investment 1.5 – THE – Tuscany Health Ecosystem – ECS00000017 – CUP B83C22003920001.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.