ABSTRACT
The anaerobic membrane bioreactor (AnMBR) is a promising technology for not only water reclamation but also virus removal; however, the virus removal efficiency of AnMBR has not been fully investigated. Additionally, estimating the removal efficiency requires datasets of virus concentration in influent and effluent, but such monitoring is difficult in practical operation because the virus quantification process is generally time-consuming and requires specialized equipment and trained personnel. Therefore, in this study, we aimed to identify the key, monitorable variables in AnMBR and establish data-driven models using the selected variables to predict virus removal efficiency. We monitored the operational and environmental conditions of an AnMBR in Sendai, Japan and measured virus concentration once a week for six months. Spearman's rank correlation analysis revealed that the pH values of influent and mixed liquor suspended solids (MLSS) were strongly correlated with the log reduction value of pepper mild mottle virus, indicating that electrostatic interactions played a dominant role in AnMBR virus removal. Among the candidate models, the random forest model using selected variables including influent and MLSS pH outperformed the others. This study has demonstrated the potential of AnMBR as a viable option for municipal wastewater reclamation with high microbial safety.
HIGHLIGHTS
Virus removal efficiency in the AnMBR was monitored.
A data-driven model to predict virus removal efficiency was built.
The random forest model had the highest prediction performance.
The pH values of influent and MLSS strongly correlated with virus removal efficiency.
INTRODUCTION
In recent years, the field of wastewater reclamation has undergone a fundamental paradigm shift. With rapidly developing technology and increasing concerns over environmental sustainability, researchers have moved beyond viewing wastewater as a societal burden to recognizing the potential of recovering embedded resources as a critical goal in the new era of sustainability (van Loosdrecht & Brdjanovic 2014). Municipal wastewater, in particular, is a valuable resource that contains a diverse array of materials, including water, energy, and nutrient-rich fertilizers (Kehrein et al. 2020), making it an attractive target for resource recovery. However, conventional wastewater treatment technologies were not designed with resource recovery in mind, and as a result, they are inherently inefficient for extracting the residual value from municipal wastewater while simultaneously producing high-quality effluent. Therefore, there is a need for novel solutions that can optimize resource recovery while ensuring the production of high-quality effluent.
The anaerobic membrane bioreactor (AnMBR) is a promising technology for water purification, which operates in an anaerobic environment in the stirred tank and utilizes a membrane module to separate the slow-growing anaerobic microbes from the filtrate. The AnMBR offers several advantages over conventional wastewater treatment processes, such as the production of high-quality effluent, low land requirements, and the ability to recover energy from the organic compounds present in wastewater. Optimized operational strategies have been reported to achieve net energy and carbon neutrality (Cookney et al. 2016; Robles et al. 2021; Zhang et al. 2022b). In comparison to conventional plants, which consume between 0.26 and 0.89 kWh of electricity per cubic meter of municipal wastewater treated (Singh et al. 2016; Maktabifard et al. 2018), the AnMBR exhibits significant advantages in terms of long-term sustainability. With regard to effluent utilization, coupling the AnMBR with downstream processes such as anammox to remove nitrogen is an attractive option (Rong et al. 2022). Additionally, the nutrient-rich effluent can be applied in agricultural irrigation as a liquid fertilizer (Harb & Hong 2017; Wu & Kim 2020), which reduces the total environmental footprint by offsetting the energy consumption and carbon emissions involved in fertilizer production (Zhang et al. 2022b). Consequently, the AnMBR has emerged as a new-generation wastewater treatment solution, and recent research has been increasingly devoted to exploring its potential.
The utilization of AnMBR holds significant potential benefits, but it is also accompanied by microbial safety concerns. This is particularly relevant to agricultural irrigation, where the reclaimed water may come into direct human contact. It is therefore necessary to undertake additional efforts to ensure the safe usage of AnMBR effluent. The membrane utilized in the AnMBR exhibits the ability to completely prevent the passage of larger microorganisms such as bacteria and protozoa (Cecconet et al. 2019; Zhu et al. 2021); however, the presence of viruses is a matter of greater concern for two distinct reasons. Firstly, municipal wastewater is known to harbor a diverse range of waterborne viruses that can pose an infection risk if not adequately removed (Ogorzaly et al. 2010; Gibson 2014; Corpuz et al. 2020). Secondly, due to their small size, virus particles are more likely than other pathogens to evade membrane rejection and ultimately end up in the effluent. Despite these concerns, the virus removal efficiency of the AnMBR has been reported in only a limited number of studies, in which only lab- or bench-scale reactors were employed (Wong et al. 2009; Fox & Stuckey 2015; Zhang et al. 2022a). Moreover, while the efficiency and mechanisms involved were discussed, the consistency and dependability of virus removal in long-term operation, a fundamental aspect of the multi-barrier system concept (Sano et al. 2016), remain to be fully investigated.
In the context of microbial safety, the multi-barrier system framework is designed to achieve the desired pathogen reduction by implementing multiple virus reduction measures that work in concert (Mohr et al. 2020). This approach ensures that even if one measure fails, the overall safety of the system is maintained by the compensation of other measures. To implement this framework, the virus removal efficiency of each component needs to be assessed and certified. Moreover, contemporary sanitation frameworks such as the WHO's Guidelines for the Safe Use of Wastewater, Excreta, and Greywater recommend that microbial safety be monitored continuously rather than through discrete testing (World Health Organization 2006). However, this creates additional complexity for virus quantification, as commonly used detection methods, such as RT-qPCR and plaque assays, can be time-consuming and costly, in addition to requiring specialized equipment and trained personnel (Zhu et al. 2021). This makes it financially impractical to implement frequent virus testing in daily wastewater treatment plant (WWTP) operations. Furthermore, the turnaround time required for experiments means that by the time the target virus reduction rate is determined, the effluent may already have been released. To address this challenge, a soft-sensor approach can be used to infer hard-to-measure variables (objective variables) from easy-to-measure variables (explanatory variables) via models. This approach is becoming increasingly popular in the water field, and some applications have been reported (Haimi et al. 2013; Han et al. 2018; Thürlimann et al. 2018; Reynaert et al. 2023). When the desired variable is the result of both physicochemical and biological processes, such as membrane fouling, it may be difficult to establish a mechanistic model, and data-driven modeling approaches that can learn the input–output relationship directly from the data may be a more practical alternative (Haimi et al. 2013). Virus removal in the AnMBR is a typical example of such an objective variable (Zhu et al. 2021), yet little has been done to investigate the feasibility of applying a soft-sensor approach. Madaeni & Kurdian (2011) explored the potential of this approach by utilizing a combination of fuzzy modeling and the genetic algorithm to model virus removal in a lab-scale membrane filtration module. However, to date, no study has been conducted on a functioning AnMBR, let alone a pilot-scale system that can better simulate real-world implementation.
The present study aims to investigate the feasibility of using a soft-sensor approach for real-time monitoring of virus removal efficiency in the AnMBR. A pilot-scale AnMBR treating municipal wastewater was used as the testing platform. The concentrations of two indigenous viruses, pepper mild mottle virus (PMMoV) and norovirus GII (NoV GII), were determined using RT-qPCR, and operational variables obtained from reactor operators were analyzed to identify potential effects on virus removal efficiency. Finally, data-driven models based on machine learning algorithms were fitted to the data to verify whether virus removal efficiency can be predicted from the operational conditions of the AnMBR via the soft-sensor approach.
METHODS
AnMBR overview
A pilot-scale submerged AnMBR plant situated in Sendai, Japan, was used as the experimental platform in this study. The reactor, which has a total volume of 5,000 L, is equipped with a hollow fiber polyvinylidene fluoride (PVDF) membrane with a pore size of 0.4 μm and a total area of 72 m². The influent is pre-screened municipal wastewater from a local municipal WWTP that serves a population of around 150,000. Prior to sampling, the pilot-scale plant had undergone continuous operation for about 500 days and had reached a stable phase. Additional information on the configuration, operational strategies, and performance of this AnMBR plant can be found in other studies (Rong et al. 2021, 2022).
Sample collection and concentration
Influent and effluent samples were collected from the AnMBR plant over a period of 149 days, from September 6, 2020 to February 1, 2021. The sampling frequency was set to once a week, with exceptions due to holidays and reactor maintenance. To account for potential differences in virus concentration, the volumes of influent and effluent samples collected were 40 mL and 1 L, respectively. Upon collection, all samples were kept on ice and transported to the laboratory for processing on the same day.
Different methods were used to concentrate influent and effluent samples due to their respective volumes. The influent samples were concentrated using a previously described polyethylene glycol (PEG) precipitation method. Specifically, 40 mL of the influent sample was mixed with 3.2 g of PEG 6000, 0.92 g of NaCl, and 100 μL of pre-prepared murine norovirus (MNV) strain S7-PP3 stock suspension for recovery calculation. The mixture was then stirred overnight at 4 °C using a magnetic stirrer, after which it was centrifuged at 8,000 × g for 30 min at 4 °C. The supernatant was removed, and the pellet was resuspended in 1 mL of MilliQ water, before being further centrifuged at 9,000 × g for 10 min at 4 °C. The supernatant was collected, and its volume was measured for subsequent analysis.
Effluent samples, in contrast, were concentrated by a double membrane filtration assay (Haramoto et al. 2004). Specifically, 1 L of effluent was mixed with 10 mL of 2.5 M MgCl₂ and 1 mL of MNV stock suspension, after which it was concentrated by suction filtration using a 0.45 μm membrane (HAWP-090-00, Millipore). Following filtration, the membrane was washed with 200 mL of 0.5 mM H₂SO₄. Then, 10 mL of 1 M NaOH was added to elute the adsorbed virus particles into 10 mL of TE buffer with 100 mM H₂SO₄, and the final volume was about 20 mL. The filtrate was then subjected to a secondary filtration using CentriPrep YM-50 at 2,500 rpm for 10 min. However, two effluent samples were instead concentrated using Amicon Ultra-15 (MWCO 30 kDa), which offers similar performance, for 15 min at 2,500 × g, because CentriPrep YM-50 was discontinued during the sampling period. The volume of the final filtrate was recorded to calculate recovery efficiency.
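The recovery calculation referred to above can be sketched as follows. This is a minimal illustration, assuming that recovery efficiency is the ratio of MNV copies found in the final concentrate to MNV copies spiked into the sample; the function name, argument names, and example values are hypothetical and are not taken from the study.

```python
def recovery_efficiency(conc_in_concentrate, concentrate_volume_ml,
                        spike_conc, spike_volume_ml):
    """Percentage of the spiked process-control virus recovered.

    conc_in_concentrate: copies/mL measured in the final concentrate
    concentrate_volume_ml: recorded volume of the final concentrate (mL)
    spike_conc: copies/mL of the spiked stock suspension
    spike_volume_ml: volume of stock suspension added to the sample (mL)
    """
    spiked_copies = spike_conc * spike_volume_ml
    if spiked_copies <= 0:
        raise ValueError("spiked copies must be positive")
    recovered_copies = conc_in_concentrate * concentrate_volume_ml
    return 100.0 * recovered_copies / spiked_copies


# Hypothetical influent sample: 100 uL (0.1 mL) of stock spiked,
# 1.0 mL of concentrate recovered.
example = recovery_efficiency(3.2e5, 1.0, 1.0e7, 0.1)
```

Recording the final volume matters because the measured concentration alone does not give the number of recovered copies.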
RNA extraction, cDNA synthesis, and RT-qPCR
The RNA extraction and cDNA synthesis were carried out using the QIAamp Viral RNA Mini Kit (Qiagen) and the iScript™ cDNA Synthesis Kit (Bio-rad), respectively. The reverse transcription-quantitative polymerase chain reaction (RT-qPCR) experiments were conducted using the SsoAdvanced Universal Probes Supermix (Bio-rad) and a CFX Connect system (Bio-rad). Further information on the RT-qPCR procedure, such as the primer/probe sequences for each target virus, mix formulation, and thermal cycling parameters, can be found in Tables S1 and S2. All cDNA samples were analyzed three times in the RT-qPCR, and the average Ct value was used for subsequent analyses.
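As an illustration of how an averaged Ct value could be turned into a concentration, the sketch below assumes a linear standard curve of the form Ct = slope × log10(copies) + intercept. The standard curve, its parameters, and the replicate values are assumptions made for this example; the actual quantification settings are those given in Tables S1 and S2.

```python
from statistics import mean


def log10_copies_from_ct(ct_replicates, slope, intercept):
    """Convert replicate Ct values to log10 copies per reaction.

    Assumes a linear standard curve: Ct = slope * log10(copies) + intercept.
    The replicate Ct values are averaged first, mirroring the triplicate
    measurement described in the text.
    """
    if slope == 0:
        raise ValueError("slope of the standard curve must be non-zero")
    mean_ct = mean(ct_replicates)
    return (mean_ct - intercept) / slope


# Hypothetical triplicate and hypothetical standard-curve parameters.
example = log10_copies_from_ct([24.6, 24.8, 24.7], slope=-3.32, intercept=38.0)
```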
Prediction model for virus removal efficiency
The operation record of the AnMBR during the study period was acquired from reactor operators. The record includes a total of 33 variables (Supplementary Table S3), ranging from operation strategy parameters set by the operators (e.g., hydraulic retention time (HRT)) to passively collected variables that reflect the environmental and operational status (e.g., mixed liquor suspended solids (MLSS) and pH). The methodologies employed to obtain these variables are described in previous studies (Rong et al. 2021, 2022).
A variable screening and selection step was first conducted by calculating each variable's Spearman's rank correlation coefficient with the LRV. We used four options for selecting inputs for modeling: (option 1) only variables found to be strongly correlated with the LRV (∣ρ∣ ≥ 0.7 and p-value ≤0.05) were used as the inputs; (option 2) all variables that showed at least a weak correlation with the LRV (∣ρ∣ ≥ 0.3) were used (Reac_temp, GasClean_freq, EPS_PS, GasClean_flux, MLVSS, TMP, H2S, Inf_ph and MLSS_ph: see Supplementary Table S3); (option 3) a selection of nine variables empirically determined using prior knowledge about the virus removal process were used (HRT, trans-membrane pressure (TMP), GasClean_freq, Inf_ph, MLSS_ph, MLSS, MLVSS, MLSS_TPROT, and MLSS_TPS: see Supplementary Table S3); and (option 4) all variables were used. The goal was to identify the variable selection strategy that achieves high prediction accuracy while using as few inputs as possible. Two data-driven regression models (artificial neural network (ANN) and random forest (RF)) were tested due to their increasing popularity and proven performance in soft-sensor development tasks regarding biological wastewater treatment (Haimi et al. 2013). Briefly, ANN features a network structure and a set of artificial neurons arranged in three types of connected layers: an input layer, an output layer, and one or more hidden layers in between. By tuning the weights between the neurons, the output layer can be nonlinearly connected to the input layer. This gives ANN the ability to establish a complex relationship between the input and output without prior knowledge. On the other hand, RF features a large collection of decision trees, and each tree is individually constructed using sub-samples. Using bootstrap aggregation, the outcome from RF reflects the collective outcome from all decision trees, hence the ‘forest’ in the name.
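The correlation-based screening described above can be sketched as follows. This is a minimal, dependency-free illustration, not the code used in the study (which was run in R). Spearman's ρ is computed as the Pearson correlation of average ranks, and only the ∣ρ∣ threshold is applied; the p-value criterion of option 1 is omitted here for brevity. The variable names in the usage note reuse those from Supplementary Table S3, but all numbers are invented.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a constant variable has no defined correlation
    return cov / (var_x * var_y) ** 0.5


def screen_variables(records, lrv, threshold):
    """Return {name: rho} for variables whose |rho| with the LRV >= threshold."""
    selected = {}
    for name, values in records.items():
        rho = spearman_rho(values, lrv)
        if abs(rho) >= threshold:
            selected[name] = rho
    return selected
```

For example, with `lrv = [2.3, 2.9, 3.1, 3.6, 4.2, 4.6]` and `records = {"Inf_ph": [6.9, 7.0, 7.1, 7.2, 7.4, 7.5], "TMP": [10, 14, 9, 13, 11, 12]}`, calling `screen_variables(records, lrv, 0.7)` keeps only `Inf_ph`, because the invented TMP series is only weakly rank-correlated with the LRV.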
To initiate the modeling process, a grid search was conducted to determine the optimal configuration for each model under each input set. To introduce randomness into model training and to prevent prediction performance from being overstated due to overfitting to the training dataset, repeated random sub-sampling cross-validation was employed for both models. During each iteration of model training, 80% of the dataset was randomly selected for training the model. The trained model was then used to perform predictions on the remaining 20% of the data, and the predictions were compared with actual values. The training process was repeated 500 times to ensure that the data were adequately covered across the random splits. In parallel, a linear regression model (LM) was also run for comparison. Two metrics were used to evaluate prediction performance: root-mean-square error (RMSE) and mean absolute percentage error (MAPE) between the predicted and actual LRV. All statistical analysis and modeling were conducted in R version 4.2.2 (R Core Team 2018). ANN and RF models were constructed using the ‘neuralnet’ and ‘randomForest’ packages, respectively.
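The validation scheme above (80/20 random splits repeated 500 times, scored by RMSE and MAPE) can be sketched as follows. This is an illustrative outline rather than the study's R code: a one-variable least-squares line stands in for the ANN, RF, and LM models, and the function names, the `seed` argument, and the averaging of the metrics over repeats are assumptions made for the example.

```python
import math
import random


def rmse(actual, predicted):
    """Root-mean-square error between paired actual and predicted values."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error (%); actual values must be non-zero."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


def fit_line(x, y):
    """Stand-in model: least-squares fit of y = a + b*x, returned as a function."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return lambda xi: a + b * xi


def repeated_subsampling_cv(x, y, fit, n_repeats=500, train_fraction=0.8, seed=0):
    """Repeated random sub-sampling cross-validation.

    Each repeat draws a fresh random train/test split, fits on the training
    part, and scores on the held-out part. Returns (mean RMSE, mean MAPE).
    """
    rng = random.Random(seed)
    n = len(x)
    n_train = int(round(train_fraction * n))
    rmses, mapes = [], []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        model = fit([x[i] for i in train], [y[i] for i in train])
        predicted = [model(x[i]) for i in test]
        actual = [y[i] for i in test]
        rmses.append(rmse(actual, predicted))
        mapes.append(mape(actual, predicted))
    return sum(rmses) / n_repeats, sum(mapes) / n_repeats
```

Because the held-out 20% changes on every repeat, the averaged metrics describe performance on data the model did not see during fitting rather than its fit to the training set.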
RESULTS
AnMBR operation condition and virus removal performance
The recovery efficiency of MNV ranged from 18.37 to 47.96% (mean: 32.06%) in influent samples. In effluent samples, the range was 0.14–4.39% (mean: 1.61%), which was lower than that reported in the original literature, although the difference in the virus targets may have caused the disparity (Haramoto et al. 2004). Regarding virus quantification results, after accounting for the volume conversion involved in quantification, PMMoV had a consistent and significant presence in influent samples with an average concentration of 5.70 log copies/mL (SD: 0.27). This concentration range is consistent with previous studies (Kitajima et al. 2014). In effluent samples, the concentration of PMMoV dropped to an average of 2.61 log copies/mL (SD: 0.48). This resulted in the LRV ranging from 2.25 to 4.61, which is also in line with other AnMBR virus removal studies (Haimi et al. 2013; Han et al. 2018; Thürlimann et al. 2018). On the other hand, the concentration of NoV GII in influent samples showed an upward trend during the study period, from not detected in September 2020 to around 4 log copies/mL during the wintertime. Despite its presence in influent samples, all effluent samples were negative for NoV GII, suggesting a high removal efficiency by the AnMBR. Because the actual LRV could not be calculated for NoV GII, the LRV modeling was established for PMMoV only.
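For reference, the LRV arithmetic behind these figures can be sketched as follows, using the standard definition LRV = log10(influent concentration) − log10(effluent concentration). The function names are hypothetical. The last function illustrates why no actual LRV exists when the effluent is a non-detect: only a lower bound can be stated, and the detection limit it depends on is an assumed input, not a value reported in the study.

```python
import math


def lrv_from_logs(log_influent, log_effluent):
    """LRV when both concentrations are already in log10 copies/mL."""
    return log_influent - log_effluent


def lrv_from_concentrations(influent, effluent):
    """LRV from raw concentrations (copies/mL); both must be positive."""
    if influent <= 0 or effluent <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(influent) - math.log10(effluent)


def lrv_lower_bound(influent, detection_limit):
    """Lower bound on the LRV when the effluent is below the detection limit.

    The true effluent concentration is unknown but not higher than the
    detection limit, so the true LRV is at least this value.
    """
    return lrv_from_concentrations(influent, detection_limit)


# Mean PMMoV values reported above: 5.70 and 2.61 log copies/mL.
mean_lrv = lrv_from_logs(5.70, 2.61)
```

With the reported means, the difference is about 3.09 log, which lies within the reported LRV range of 2.25 to 4.61.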
LRV prediction modeling
DISCUSSION
To our knowledge, this is the first study to report the virus removal performance of a pilot-scale AnMBR treating real municipal wastewater. Furthermore, to test the feasibility of putting it under real-time monitoring, which is advocated by modern sanitation frameworks, we applied a soft-sensor approach that connects the LRVs with some operational variables monitored in reactor operation. Overall, the pilot-scale AnMBR had a stable virus removal efficiency on par with previous studies. We also found the virus removal efficiency to be strongly correlated with the pH values of both MLSS and influent. The modeling result showed that it was feasible to circumvent the mechanistic understanding of the virus removal process and predict the LRV using only the variables monitored during operation. This validation supports the soft-sensor approach.
The AnMBR is now making the transition from the proof-of-concept phase and is currently being tested under various conditions in terms of reactor configuration and influent characteristics. Many technical challenges, such as start-up-phase optimization, fouling control, and operating strategy, are being tackled by a growing number of researchers. However, for this technology to make its way into real-world implementation, other practical factors must be taken into consideration, one of which is microbial safety evaluation. As mentioned earlier, considering that AnMBR effluent has been proposed for agricultural irrigation use (Prieto et al. 2013; Scarascia et al. 2021), its virus removal efficiency largely determines whether tertiary treatment is needed, and if so, to what extent. However, there is a significant lack of knowledge about it, both in terms of the number of available studies and the conditions (reactor configuration, virus type, operational condition, etc.) covered in them. We intended to address this issue by conducting the virus removal experiment in a pilot-scale AnMBR and introducing a modeling framework that could be adopted by future studies.
The occurrence of norovirus showed a wintertime seasonality in influent samples, which is consistent with past literature on both norovirus wastewater monitoring and clinical testing (Katayama et al. 2008; Nordgren et al. 2009; Ahmed et al. 2013; Prevost et al. 2015). The fact that no effluent sample tested positive for NoV GII suggests that the AnMBR can effectively and reliably remove viruses from municipal wastewater to a low level. This statement is also supported by a recent virus removal study featuring a bench-scale AnMBR (Wu et al. 2022).
Conceptually, the reactor may be further optimized to yield a higher virus reduction rate (e.g., by using a membrane with a smaller pore size or maintaining a thicker biofilm), but the improved virus removal efficiency may come at a cost (higher membrane purchase and maintenance costs, and higher electricity consumption due to increased TMP). Therefore, a balance needs to be found in actual operation. Also worth mentioning is that spiking bacteriophages for fouling control and reactor performance enhancement was discussed in recent studies (Scarascia et al. 2021; Aydin et al. 2022). If this approach comes to fruition, the reactor will need to remove not only indigenous viruses but also bacteriophages, demanding an even higher and more reliable virus removal capability.
As for modeling, while several studies attempted to model virus removal performance in other membrane-related systems via either mechanistic, statistical, or data-driven modeling (Wu et al. 2010; Madaeni & Kurdian 2011; Rathore et al. 2014), limited progress has been made regarding prediction. The complexity of the AnMBR system, especially as it features both physical and biological components, greatly hinders the development of mechanistic models. The data-driven modeling approach was hence chosen in this study because it does not require prior knowledge about the underlying mechanisms and learns directly from the data.
Due to the relatively stable operation of the AnMBR plant employed in this study, the sampling interval was chosen to be 1 week to better capture long-term operational variability. From September 2020 to February 2021, the recorded air temperature ranged from around 2.0 to 31.0 °C. This led to a series of performance changes, with the most noticeable being the change in the operating temperature. Although the operating temperature was closely associated with the overall performance of AnMBR (Rong et al. 2022), a significant effect on the PMMoV LRV was not observed, indicating that the microbial safety of the AnMBR may be consistent throughout the operating temperature range.
The experimental results have several implications. Firstly, the fact that the virus removal performance was strongly correlated with the pH of influent and MLSS indicates that electrostatic interactions play an indispensable role in the virus removal process, which was also identified in our review article and other previous studies but not investigated in detail (Matsushita et al. 2013; Armanious et al. 2016; Purnell et al. 2016; Miura et al. 2018). Since TMP largely indicates the permeability of a membrane module and was found to be significantly correlated with virus removal efficiency in a previous study (Zhang et al. 2022a), it was originally considered to be a crucial factor, but a strong correlation between TMP and LRV was not found in this study. The addition of TMP to the input set did not improve prediction accuracy either. Considering the aforementioned correlation between influent/MLSS pH and PMMoV LRV, a plausible explanation is that while membrane permeability largely determines virus removal efficiency in the start-up stage, once the reactor enters stable operation, the electrostatic interaction-driven virus adsorption onto either the suspended or attached biomass becomes the dominant factor in additional removal. From the perspective of operating strategy, this also means that a temporary drop in TMP, which can result from membrane cleaning or sludge discharge, may not greatly affect the virus removal performance. In this study, a positive correlation was found between LRV and influent/MLSS pH, meaning that slightly increasing the pH of influent or MLSS may result in better virus removal efficiency. However, whether this affects the activity of the microorganisms in the reactor requires further investigation. Further studies are needed to investigate whether a balance between virus removal efficiency and reactor performance can be maintained.
Two limitations of this study are worth mentioning. Firstly, since NoV GII was reduced to below the detection limit in all effluent samples, its effluent concentration was left-censored and its actual LRV could not be calculated; hence, the modeling was performed on the PMMoV LRV instead. However, the removal of PMMoV may not well represent that of human pathogens due to different characteristics, for instance, the electrostatic potential and the affinity for MLSS. Nevertheless, we should point out that the absence of NoV GII in all effluent samples suggests this pilot-scale AnMBR achieved high removal of NoV GII throughout the study period, adding credibility to the claim about its microbial safety. Therefore, PMMoV removal efficiency may serve as a conservative indicator, and in further study, other potential indicator viruses such as coliphages and crAssphages should be applied for modeling. Secondly, the measured virus concentration, and by extension the LRV, can be greatly influenced by the virus recovery efficiency. Although we employed tested and proven virus concentration methods, some variation in recovery efficiency still exists, which may affect the robustness and reliability of modeling.
CONCLUSIONS
In the present study, we established a data-driven model to predict virus removal efficiency in the AnMBR. We demonstrated that the RF model based on the nine variables that showed at least a weak correlation with the virus log reduction value had the best prediction performance. Furthermore, among the candidate variables, the pH values of influent and MLSS, which might be responsible for virus adsorption on biomass via electrostatic potential, played a critical role in virus removal in the AnMBR. As the interest in AnMBR technology continues to grow, follow-up research needs to keep pace, and that rightly includes the assessment and verification of its microbial safety. The competent virus removal capability shown by the pilot-scale AnMBR in stable operation and the prediction performance of soft sensors have highlighted the potential of this technology. We hope that the findings made here will help expand the role that the AnMBR plays in the future grand scheme of wastewater reclamation and reuse.
ACKNOWLEDGEMENTS
This study was supported by the Japan Science and Technology Agency (JST) through the Strategic International Collaborative Research Program (SICORP) (grant no. JPMJSC18H6), JSPS Bilateral Program (grant no. 120237402), and the National Key Research and Development Program of China (grant no. 2017YFE0127300).
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.