How much data is required for a robust and reliable wastewater characterization?

Yang, Cheng; Barrott, Wendy; Busch, Andrea; Mehrotra, Anna; Madden, Jane; Daigger, Glen T.

doi:10.2166/wst.2019.233

Abstract

Water resource recovery facility (WRRF) modeling requires robust and reliable characterization of the wastewater to be treated. Poor characterization can lead to unreliable model predictions, which can have significant economic consequences when models are used to make important facility upgrade/expansion and operational decisions. Current wastewater characterization practice often involves a limited number of relatively short-duration intensive campaigns. On-going work at the Great Lakes Water Authority (GLWA) WRRF, serving 3.1 million residents in Southeast Michigan, provided an opportunity to conduct more detailed wastewater characterization over an annual cycle. The collection system includes a significant combined sewer component, and the WRRF provides primary and secondary treatment (high purity oxygen activated sludge) and phosphorus removal via ferric chloride addition. Detailed wastewater fractionation was conducted weekly over a one-year period. Daily conventional secondary influent and process operational data from that same period were used to evaluate the effect of various wastewater characterization strategies on the bioreactor mixed liquor volatile suspended solids (MLVSS) concentration calculated using the International Water Association (IWA) Activated Sludge Model Number 1 (ASM1) with minor modifications. An adaptive strategy is recommended, consisting of a series of short-duration characterization campaigns that are used to assess model fit for its intended purpose and are continued until a robust and reliable model is obtained. Periods of unusual plant influent and/or operational conditions should be identified, and data from these periods potentially excluded from the analysis. Sufficient data should also be collected to identify periods when poor model structure, rather than wastewater characterization, leads to poor fit of the model to actual data.

activated sludge, campaigns, characterization, models, reliable, robust, wastewater

INTRODUCTION

Process modeling based on the International Water Association (IWA) Activated Sludge Models (ASMs) has become the standard technique for the design of Water Resource Recovery Facilities (WRRFs) (Henze et al. 2000; Phillips et al. 2009; Rieger et al. 2012; Hauduc et al. 2013). These models depend on a detailed characterization of the influent wastewater that goes beyond the simple lumped parameters, such as total five-day biochemical oxygen demand (BOD₅) and chemical oxygen demand (COD), typically collected for plant operation. Robust and valid characterization is essential for process modeling, as inaccurate wastewater composition inputs can lead to significant modeling errors (Rieger et al. 2012). The profound effect of wastewater characterization on modeling outputs has been demonstrated many times (Petersen et al. 2002; Phillips et al. 2009; Choubert et al. 2013), and includes the following:

Sludge production is influenced by the estimated inert particulate COD.
Oxygen demand is influenced by the estimated total biodegradable COD.
Anoxic denitrification rate and anaerobic phosphorus release are influenced by the estimated readily biodegradable COD.
Effluent COD is influenced by the estimated inert soluble COD.

In practice, wastewater characterization is conducted mainly via two methods: (1) physical-chemical and (2) respirometric. STOWA (Roeleveld & van Loosdrecht 2002) proposed simple and easy to implement guidelines based on physical-chemical methods. WERF (Melcer 2004) provided a state-of-the-art and frequently used method for measuring key influent wastewater characteristics and kinetic/stoichiometric parameters covering both methods. BIOMATH (Vanrolleghem et al. 2003) developed a protocol for activated sludge model calibration, with influent wastewater characterized by the respirometric method. Recent attempts at integrated characterization suggested a combination of both methods (Lu et al. 2010). These various methods were compared by Gillot & Choubert (2010) and Fall et al. (2011), where significant gaps were found in results.

Despite lack of agreement on the best characterization method, the choice should fit the purpose for which the model is being developed. Due to its time-consuming and labor-intensive nature, wastewater characterization is often conducted intensively within one or a limited number of short duration campaigns. While these data allow a simulation model to be set up, concerns exist when the model is to be used to simulate future performance. For example, ‘Are sufficient data collected to robustly characterize the wastewater on a long-term basis?’ and ‘Do wastewater characteristics vary on a seasonal or more random basis?’ Non-representative wastewater characterizations can lead to significant cost implications when model results are used to make decisions on facility upgrades/expansions and operation.

On-going work at the Great Lakes Water Authority (GLWA) Water Resource Recovery Facility (WRRF) in Southeast Michigan provided an opportunity to conduct detailed wastewater characterization over an annual cycle. Building on this long-term data set, an assessment of variations in wastewater characteristics and impacts of different strategies for wastewater characterization campaigns was conducted.

This paper evaluates alternative wastewater characterization campaign designs, mainly focusing on campaign size and timing. Following physical-chemical guidelines provided by WERF (Melcer 2004), detailed wastewater fractionation and characterization was conducted every week for a one-year period. Characterization results were fed into a standard ASM1 model, modified as described below, and different practical campaign strategies were evaluated. Based on these investigations, suggestions for obtaining robust and reliable wastewater characterization estimates through campaign design are proposed. Bioreactor mixed liquor volatile suspended solids (MLVSS) concentration, which responds in a straightforward fashion to process operating conditions and the relative fractions of biodegradable and non-biodegradable particulate matter in the influent wastewater, was used as the modeled response variable and compared to actual daily values. GLWA uses the high purity oxygen (HPO) activated sludge process operated with an average 2.3-day solids residence time (SRT), making MLVSS concentration responsive to variations in wastewater characteristics.

MATERIALS AND METHODS

Description of the plant

The GLWA WRRF is a 3,560,000 m³/day (940 MGD) peak flow (secondary treatment) facility serving 3.1 million residents in Southeast Michigan. The liquid process treatment train consists of influent pumping and preliminary treatment (screening and grit removal), conventional primary treatment with ferric chloride addition for phosphorus removal, HPO activated sludge, and effluent disinfection. Flows above 3,560,000 m³/day and up to 4,500,000 m³/day receive primary treatment with ferric chloride addition. Secondary treatment requirements apply, along with seasonally varied monthly effluent total phosphorus (TP) limits of 0.7 mg-P/L (October to March) and 0.6 mg-P/L (April to September). The plant routinely meets all discharge standards. Solids are thickened, dewatered, and either subject to drying or incineration and landfill.

Wastewater fractionation

Flow-proportioned 24-hour composite samples are collected daily for influent wastewater, secondary influent (primary effluent) and secondary effluent by GLWA WRRF staff. There are actually three separate influent streams to the GLWA facility, and each is sampled separately. While a combined primary effluent stream is conveyed to secondary treatment, it passes through two different pumping stations to secondary treatment, and each secondary influent stream is sampled separately. Return activated sludge (RAS) is combined and conveyed to the HPO bioreactors, resulting in a ‘single’ biological population, but two separate sets of secondary clarifiers exist and each set is sampled separately. Detailed wastewater fractionation was conducted weekly on all seven streams on samples collected on random weekdays over the period from October 19, 2017 to October 17, 2018. Wastewater fractionation generally followed the physical-chemical guidelines provided by WERF (Melcer 2004), and consisted of stepwise filtration through the standard glass fiber filter (1.2 μm nominal pore size) and an 0.45 μm membrane filter. Filtrate through the glass fiber filter (1.2 μm) was defined as the sum of soluble and colloidal COD (SCCOD). Filtrate through the 0.45 μm membrane filter was defined as soluble COD (SCOD). The difference between these two filtrates was defined as colloidal COD (CCOD). Particulate COD (PCOD) was defined as the difference between the total COD and SCCOD. COD and BOD₅ analyses were conducted by GLWA staff according to Standard Methods (APHA 2017).
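
The filtration-based definitions above reduce to two subtractions. The following is a minimal sketch of that arithmetic, not the authors' code; the function name and the sample concentrations are hypothetical and chosen only for illustration.

```python
def filtration_fractions(total_cod, gf_filtrate_cod, membrane_filtrate_cod):
    """Split total COD (mg/L) into particulate, colloidal and soluble parts.

    gf_filtrate_cod: COD of the 1.2 um glass fiber filtrate (SCCOD).
    membrane_filtrate_cod: COD of the 0.45 um membrane filtrate (SCOD).
    """
    sccod = gf_filtrate_cod
    scod = membrane_filtrate_cod
    ccod = sccod - scod        # colloidal COD, by difference (can be negative)
    pcod = total_cod - sccod   # particulate COD, by difference
    return {"PCOD": pcod, "CCOD": ccod, "SCOD": scod}


# Hypothetical sample values in mg COD/L (not measured data)
sample = filtration_fractions(total_cod=160.0,
                              gf_filtrate_cod=62.0,
                              membrane_filtrate_cod=58.0)
```

Because the colloidal and particulate components are both computed by difference, the three parts always sum to the measured total COD, and any measurement error in the two filtrates ends up in the colloidal term.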

Flocculation and filtration (Mamais et al. 1993; Roeleveld & van Loosdrecht 2002) is more generally applied to determine the soluble fraction of wastewater. Previous work (Yan et al. 2018) had indicated that, for this wastewater, there was no significant difference in COD and BOD₅ between the 0.45 μm membrane filtrate and the results obtained with flocculation and filtration per the WERF protocol. Note that ferric chloride is added prior to the primary clarifiers for phosphate removal, and this may, to a certain extent, flocculate the colloidal organic matter present in the influent wastewater. An independent wastewater characterization effort, conducted during this period in connection with an on-going master planning effort (Mehrotra 2018), reached similar conclusions. That effort comprised six days of COD characterization at the GLWA WRRF following standard physical-chemical guidelines (APHA 2017), and its results generally support the use of simple membrane filtration, rather than the more complicated flocculation and filtration procedure, to characterize soluble organic constituents for this wastewater. Secondary influent (primary effluent) data were used in this study for modeling purposes. Omitting flocculation and filtration of the samples collected from the several locations each week also made the long duration of the sampling program practical.

Mapping measured wastewater fractions into model inputs

Required IWA ASM inputs include readily biodegradable COD (S_s), slowly biodegradable COD (X_s), soluble inert COD (S_I) and particulate inert COD (X_I) (Henze et al. 2000), which were calculated as fractions of total COD. As discussed below, colloidal COD was found to be insignificant for this wastewater and, therefore, was incorporated into the particulate COD fraction. The soluble inert COD (S_I) was determined directly as the measured membrane-filtered COD of the secondary effluent (SCOD_nb). The readily biodegradable COD (S_s or SCOD_bio) was calculated as the difference between the total soluble COD (SCOD) and SCOD_nb. The total biodegradable COD (SCOD_bio + PCOD_bio) was determined from the measured BOD₅ following STOWA guidelines (Roeleveld & van Loosdrecht 2002), using a biodegradable COD/BOD₅ ratio of 1.73 mg COD/mg BOD₅. The slowly biodegradable COD (X_s or PCOD_bio) was determined as the difference between the total biodegradable COD and SCOD_bio. The remaining COD (PCOD_nb) was then the particulate inert COD (X_I). A manual reconciliation process, including a mass balance check, specific ratio checks and a non-negativity check (Rieger et al. 2010), was applied to the four wastewater component data, and records with apparent abnormalities were omitted. The reconciled COD concentrations were converted into fractions and then fed into the model.
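
The mapping described above can be written as a short sequence of differences. This is a minimal sketch under stated assumptions: the function name, argument names and sample concentrations are hypothetical, and only the non-negativity part of the reconciliation is shown (the mass balance holds by construction because X_I is computed as the remainder).

```python
BCOD_PER_BOD5 = 1.73  # mg biodegradable COD per mg BOD5, ratio given in the text


def asm_fractions(total_cod, scod, effluent_scod, bod5):
    """Map measured concentrations (mg/L) to ASM1 input fractions of total COD.

    scod: 0.45 um filtrate COD of the secondary influent.
    effluent_scod: 0.45 um filtrate COD of the secondary effluent (SCOD_nb).
    """
    s_i = effluent_scod                    # soluble inert COD
    s_s = scod - s_i                       # readily biodegradable COD
    bcod = BCOD_PER_BOD5 * bod5            # total biodegradable COD
    x_s = bcod - s_s                       # slowly biodegradable COD
    x_i = total_cod - s_s - s_i - x_s      # particulate inert COD (remainder)
    conc = {"S_s": s_s, "S_I": s_i, "X_s": x_s, "X_I": x_i}
    if min(conc.values()) < 0:
        raise ValueError("negative component; record needs reconciliation")
    return {name: value / total_cod for name, value in conc.items()}


# Hypothetical single-day record in mg/L (not measured data)
fractions = asm_fractions(total_cod=160.0, scod=58.0,
                          effluent_scod=24.0, bod5=45.0)
```

Since colloidal COD is not split out, it is carried implicitly in the particulate components, which matches the treatment described in the text.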

Biological process modeling

HPO process bioreactor MLVSS concentrations were calculated using a standard IWA ASM1 (Henze et al. 2000), modified as described below and implemented in MATLAB®, with measured secondary influent total COD and the fractions determined as above as inputs. Secondary influent was used in the model for two reasons. First, it represents the direct input to the secondary treatment process and, consequently, the impacts of upstream treatment on wastewater constituents need not be included in the model. Second, GLWA measures secondary influent total COD daily, so a several-year database was available for extensive evaluation of model performance based on various approaches for analyzing the fractionation results, as described below. Two long-term data sets were used for modeling and model evaluation. Daily data for the period of 19 October 2017 to 17 October 2018, corresponding to the year over which detailed wastewater fractionation occurred, were used as the model training set. Daily data from 18 October 2013 to 17 October 2017 were used for model evaluation and verification.

A simplified model based on a single completely-mixed bioreactor was used to compute the MLVSS, the model response variable that was compared to the measured MLVSS concentration. This simplified model facilitated process modeling and data analysis (an approximately 50-fold reduction in run time). A more complete model of the entire liquid treatment process had previously been developed in SUMO (Dynamita). Comparison of the results from the two models demonstrated that use of the simplified bioreactor configuration did not materially affect MLVSS predictions. Further details of the model used include the following:

Biochemical processes included growth, decay and hydrolysis. Because biomass prediction was the main objective of this study, only these highly biomass-related reactions were considered.
Heterotrophic biomass was used to estimate the overall biomass. As is typical for HPO processes used for secondary treatment due to the relatively low SRT (average = 2.3 days) and the reduced bioreactor pH due to the retention of CO₂ in solution, nitrification does not occur in the full-scale system.
Since it is an HPO process, where oxygen is not limiting, oxygen limiting terms in reaction rate expressions were not included.
Standard stoichiometric and kinetic parameters and temperature correction factors from the literature (Grady et al. 2011; Hauduc et al. 2011; Alikhani et al. 2017) were used, as summarized in Table 1.

Table 1

Stoichiometric and kinetic parameter values and temperature correction factors used in model

Type	Symbol	Parameter	Unit	Value (20°C)	Factor θ^a
Kinetics	μ_H	Maximum specific growth rate of heterotrophs	d⁻¹	6	1.072
	K_s	Substrate half saturation for heterotrophs	mg COD L⁻¹	20	1.03
	f_D’	Fraction of biomass contributing to endogenous residue	g COD g COD⁻¹	0.08	1
	b_L	Aerobic decay coefficient for heterotrophs	d⁻¹	0.63	1.03
	k_h	Hydrolysis rate coefficient	d⁻¹	2.2	1.03
	K_x	Hydrolysis half saturation coefficient	g COD g VSS⁻¹	0.15	1
Stoichiometries	Y_H	Yield of heterotrophs on substrate	g COD g VSS⁻¹	0.67	1
Partitioning coefficients	i_VSS,B	COD/VSS ratio of biomass	g COD g VSS⁻¹	1.42	1
	i_VSS,XI	COD/VSS ratio of particulate inert	g COD g VSS⁻¹	1.5	1
	i_VSS,Xs	COD/VSS ratio of particulate substrate	g COD g VSS⁻¹	1.8	1
	i_VSS,XD	COD/VSS ratio of biomass debris	g COD g VSS⁻¹	1.3	1

^aTemperature dependent parameter: P(T) = P₂₀θ^(T−20).
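
The temperature correction in the footnote is a one-line calculation. The sketch below applies it to the maximum specific growth rate from Table 1; the function name and the 12 °C example temperature are hypothetical.

```python
def temperature_corrected(p20, theta, temp_c):
    """Return P(T) = P20 * theta**(T - 20) for a temperature T in deg C."""
    return p20 * theta ** (temp_c - 20.0)


# Maximum specific growth rate (6 1/d at 20 C, theta = 1.072) at an
# illustrative mixed liquor temperature of 12 C
mu_h_12 = temperature_corrected(6.0, 1.072, 12.0)
```

Parameters with θ = 1 in Table 1 are unaffected by temperature, so only the growth, half-saturation, decay and hydrolysis rate terms change with the bioreactor temperature.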

Standard checks on the data, such as the mass balance over the secondary clarifier, were performed for the entire data set and confirmed the integrity of the data for its intended use (data not shown).
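
To make the simplified single-bioreactor model concrete, the following is a rough sketch of growth, decay and hydrolysis in one completely mixed reactor with SRT control, using the Table 1 values at 20 °C. It is not the authors' MATLAB implementation. Several points are assumptions of this sketch: decay is written in lysis-regrowth form (decayed biomass returns to X_s except for the debris fraction), the yield and K_x are treated as COD-based ratios, the hydraulic retention time of 0.1 d and the initial state are hypothetical, and integration uses a simple explicit Euler scheme.

```python
# Table 1 values at 20 C. Treating Y_H and K_X as COD-based ratios is an
# assumption made for this sketch.
MU_H, K_S, F_D, B_L = 6.0, 20.0, 0.08, 0.63
K_H, K_X, Y_H = 2.2, 0.15, 0.67
I_B, I_XI, I_XS, I_XD = 1.42, 1.5, 1.8, 1.3  # g COD per g VSS


def simulate_mlvss(cod_in, fractions, hrt_d, srt_d, days=40.0, dt=0.0005):
    """Euler integration of one mixed bioreactor; returns (MLVSS, state).

    cod_in is influent total COD in mg/L; state values are in mg COD/L.
    Oxygen limitation and nitrification are omitted, as in the text.
    """
    s_in = fractions["S_s"] * cod_in
    xs_in = fractions["X_s"] * cod_in
    xi_in = fractions["X_I"] * cod_in
    # Hypothetical initial state; the run is long enough to forget it.
    s, xs, xh, xd, xi = s_in, xs_in, 500.0, 0.0, xi_in
    for _ in range(round(days / dt)):
        growth = MU_H * s / (K_S + s) * xh
        decay = B_L * xh
        ratio = xs / xh
        hydrolysis = K_H * ratio / (K_X + ratio) * xh
        # Solubles leave with the flow (HRT); particulates leave via wasting (SRT).
        ds = (s_in - s) / hrt_d - growth / Y_H + hydrolysis
        dxs = xs_in / hrt_d - xs / srt_d - hydrolysis + (1.0 - F_D) * decay
        dxh = growth - decay - xh / srt_d
        dxd = F_D * decay - xd / srt_d
        dxi = xi_in / hrt_d - xi / srt_d
        s, xs, xh, xd, xi = (s + dt * ds, xs + dt * dxs, xh + dt * dxh,
                             xd + dt * dxd, xi + dt * dxi)
    state = {"S_s": s, "X_s": xs, "X_H": xh, "X_D": xd, "X_I": xi}
    mlvss = xh / I_B + xi / I_XI + xs / I_XS + xd / I_XD
    return mlvss, state


mlvss, state = simulate_mlvss(
    cod_in=159.0,                                   # mean from Table 2
    fractions={"S_s": 0.22, "X_s": 0.28, "X_I": 0.35},
    hrt_d=0.1,                                      # hypothetical HRT, days
    srt_d=2.3,
)
```

In this formulation the inert particulate COD simply accumulates to X_I,in × SRT/HRT, which is why the predicted MLVSS is so sensitive to the estimated X_I fraction and to the COD/VSS ratios assigned to each particulate component.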

Model performance evaluation

Mean and standard deviation values were calculated for model predictions and actual MLVSS data, and the root mean square error (RMSE) between model predictions and actual MLVSS concentrations was calculated to evaluate model performance. The evaluation focused particularly on instances where model predictions differed noticeably from measured values, as these suggest periods of lack of fit for the model. Two types of deviations were defined, namely outliers and spikes. Outliers were individual model predictions that deviated from the corresponding actual value by more than ±3 standard deviations of the actual MLVSS (corresponding to a probability of occurrence of 0.3% based on the assumption of a normal distribution). Spikes were deviations exceeding ±2 standard deviations of the actual MLVSS (corresponding to a probability of 4.6%) (Taylor 1997).
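
These performance measures can be computed directly from paired daily series. The sketch below is illustrative only: the function name and the five-day series are hypothetical, and using the sample (n − 1) standard deviation of the actual MLVSS is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def fit_statistics(predicted, actual):
    """RMSE plus the fractions of days that are spikes (>2 STD) and outliers (>3 STD)."""
    if len(predicted) != len(actual):
        raise ValueError("series must have the same length")
    n = len(actual)
    errors = [p - a for p, a in zip(predicted, actual)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    std_actual = statistics.stdev(actual)  # assumption: sample standard deviation
    spikes = sum(abs(e) > 2 * std_actual for e in errors)
    outliers = sum(abs(e) > 3 * std_actual for e in errors)
    return {"rmse": rmse,
            "spike_fraction": spikes / n,
            "outlier_fraction": outliers / n}


# Hypothetical daily MLVSS values in mg VSS/L (not plant data)
actual = [1300.0, 1250.0, 1350.0, 1400.0, 1200.0]
predicted = [1310.0, 1240.0, 1500.0, 1800.0, 1190.0]
stats = fit_statistics(predicted, actual)
```

Counted this way, every outlier is also a spike, which is consistent with reporting the shares of days exceeding one, two and three standard deviations as nested percentages.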

Practical campaign strategies evaluation

Three averaging strategies (yearly, quarterly and monthly) were applied to convert the measured fractionation data into model inputs, which were then fed into the model to predict the bioreactor MLVSS concentration. This approach was applied first to the period over which detailed wastewater fractionation was conducted (19 October 2017 to 17 October 2018). To further evaluate the general applicability of the fractionation data and averaging strategies, the fractions from the three averaging strategies were also applied to the preceding four years of data and the resulting bioreactor MLVSS concentrations were calculated. In addition, the average fractions from each single month were used to represent the whole year to evaluate the performance of shorter characterization campaigns.
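
The three averaging strategies amount to grouping the weekly fraction measurements by period and looking up the period average for each simulated day. The sketch below shows one way to do this; the function names and sample values are hypothetical, and grouping by calendar quarter and calendar month (regardless of year) is an assumption, since the text does not state how period boundaries were drawn.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def period_key(day, strategy):
    """Return the averaging period a date belongs to under a strategy."""
    if strategy == "yearly":
        return "all"
    if strategy == "quarterly":
        return (day.month - 1) // 3 + 1   # assumption: calendar quarters
    if strategy == "monthly":
        return day.month
    raise ValueError(f"unknown strategy: {strategy}")


def averaged_fractions(samples, strategy):
    """samples: list of (date, value) pairs for one COD fraction."""
    groups = defaultdict(list)
    for day, value in samples:
        groups[period_key(day, strategy)].append(value)
    return {key: mean(values) for key, values in groups.items()}


def fraction_for_day(day, averages, strategy):
    """Look up the averaged fraction to use as model input on a given day."""
    return averages[period_key(day, strategy)]


# Hypothetical X_I fractions from four sampling days (not measured data)
samples = [(date(2018, 1, 10), 0.30), (date(2018, 1, 24), 0.40),
           (date(2018, 2, 7), 0.50), (date(2018, 5, 2), 0.20)]
quarterly = averaged_fractions(samples, "quarterly")
x_i_march = fraction_for_day(date(2016, 3, 15), quarterly, "quarterly")
```

Keying on month or quarter rather than on the full date is what allows the campaign-year averages to be reused for days in the preceding four years.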

Potential indicators of bad campaign days

Using the yearly-average model for the training data set, individual days were divided into two categories: spikes (≥2 STD) and non-spikes. Differences in important conventional plant influent wastewater and operational features between these two groups of days were investigated. Unpaired two-sample t-tests were conducted on those features to detect statistically significant differences in mean values. Significantly different features can potentially serve as a flag for a bad campaign day.
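
The comparison of spike and non-spike days can be illustrated with an unpaired two-sample t statistic. The sketch below uses Welch's unequal-variance form, which is an assumption because the text does not say whether variances were pooled; the function name and SRT values are hypothetical. Converting the statistic to a p-value requires the t distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two unpaired samples."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na   # squared standard error of mean a
    vb = variance(sample_b) / nb   # squared standard error of mean b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical SRT values in days (not plant data)
srt_spike_days = [2.0, 2.2, 1.8, 2.1, 1.9]
srt_other_days = [2.6, 2.4, 2.5, 2.7, 2.3]
t_stat, dof = welch_t(srt_spike_days, srt_other_days)
```

A negative statistic here means the first group (spike days) has the lower mean, which is the direction reported for SRT in the results.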

Campaign size evaluation

Random sampling without replacement was conducted for different sample sizes from the year-long campaign data to determine the effect of sample size on wastewater characteristic estimates. Estimates of COD fractions gained from different sample sizes were averaged and fed into the model for simulation. Fifty iterations were conducted for each sample size. Maximum and mean values for averages of year-long predicted MLVSS, RMSE, number of outliers and number of spikes were calculated for each sample size.
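
The resampling step can be sketched as repeated draws without replacement from the campaign measurements, averaging each draw before it is passed to the model. The function name, the fixed seed and the placeholder data below are hypothetical; only the 50 iterations per sample size and the 43 sampling days come from the text and Table 4.

```python
import random
from statistics import mean


def subsample_means(values, sample_size, iterations=50, seed=0):
    """Average `sample_size` values drawn without replacement, `iterations` times."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return [mean(rng.sample(values, sample_size)) for _ in range(iterations)]


# Placeholder X_I fractions for 43 sampling days (not measured data)
x_i_fractions = [0.20 + 0.01 * (i % 7) for i in range(43)]
means_n10 = subsample_means(x_i_fractions, sample_size=10)
spread_n10 = max(means_n10) - min(means_n10)
```

Each averaged fraction set would then drive one year-long simulation, so the spread of the resulting MLVSS statistics across the 50 draws indicates how much a campaign of that size can be trusted.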

RESULTS AND DISCUSSION

Determination of model input values based on measured fractionation data

Raw secondary influent total COD and concentration fraction data for the year over which these data were collected are presented in Figure 1. The total COD concentration varied significantly (158 ± 40 mg/L, ranging from 87.5 to 259 mg/L) throughout the year, primarily as a result of dilution during wet weather periods, because the GLWA WRRF serves a combined sewer system. Particulate components were the most variable, covering a range of 27–217 mg/L, while the soluble component fluctuated over a range of 21–104 mg/L. The colloidal component was generally smaller than the particulate and soluble components, and some negative values were recorded. This can arise because the colloidal component is calculated as the difference between the measured glass fiber and 0.45 μm filter filtrates. Since any measurement is subject to random errors, a measured value for the 0.45 μm filtrate that is randomly higher than the true value, combined with a measured value for the glass fiber filtrate that is randomly lower than the true value, can result in the calculation of a negative value. The uncertainties (standard deviations) for total COD and glass fiber filtered COD were 40 and 27 mg COD/L, respectively, and the maximum absolute value of the colloidal component was 56 mg/L, supporting the interpretation that the colloidal concentration was subject to measurement error. Analysis of the secondary influent wastewater characterization data collected by Mehrotra (2018) during this same period suggested that the colloidal fraction is not statistically significant. Thus, it appears likely that the concentration of colloidal COD in the secondary influent (primary effluent) may be small enough that it cannot be accurately measured for this wastewater. Inspection of the data presented in Figure 2 also suggests that colloidal COD is a small fraction of the total COD and that it can, perhaps, be neglected as long as it is incorporated into another COD fraction.

Figure 1

Variation of secondary influent (primary effluent) COD concentration fractions based on filtration procedure applied throughout the campaign year.

Figure 2

COD model input values as a fraction of total COD for the campaign year. Components: readily biodegradable COD (S_s), slowly biodegradable COD (X_s), soluble inert COD (S_I) and particulate inert COD (X_I).

Based on the observations above, a one-sample t-test was conducted with the null hypothesis that the mean value of the colloidal COD is equal to zero. At the 95% confidence level, the analysis failed to reject the null hypothesis (p-value = 0.13). In addition, ordinary least squares linear regression analysis was conducted for the relationship between colloidal COD and total COD. Results showed that: (1) neither the slope nor the intercept was significant; (2) the goodness of fit, R², was 0.022; (3) the p-value of the ANOVA test comparing this linear fit with no fit was 0.32. These results all indicate that the colloidal component is sufficiently small that it cannot be measured for this wastewater with this technique. Consequently, this fraction was incorporated into particulate components, as is the typical approach when ASM1 is applied.
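
For readers who want to reproduce this kind of check, a one-sample t statistic for whether mean colloidal COD differs from zero can be sketched as follows. The function name and the five values are hypothetical; the reported p-value of 0.13 comes from the authors' full data set and is not reproduced here, and a p-value would additionally require the t distribution with n − 1 degrees of freedom.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(values)
    standard_error = stdev(values) / math.sqrt(n)
    return (mean(values) - mu0) / standard_error, n - 1


# Hypothetical colloidal COD values in mg/L, including negative differences
ccod = [4.0, -2.0, 6.0, -4.0, 1.0]
t_ccod, dof_ccod = one_sample_t(ccod)
```

A small absolute t value relative to the critical value for the given degrees of freedom means the data are consistent with a mean of zero, which is the situation described for the colloidal fraction.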

Figure 2 summarizes the reconciled input fractions for each day of the campaign year. There was no obvious pattern throughout the campaign year, and particulate COD (both biodegradable and non-biodegradable) varied more than soluble COD components. Table 2 provides both raw influent wastewater and primary effluent characteristics, as determined in this study, compared to recent literature values. The results for this wastewater are within the range of those obtained with other wastewaters, suggesting that it may be generally representative of domestic wastewater from a large metropolitan area.

Table 2

Wastewater COD fractions from this study compared with literature values

Source	Total COD, mg/L	S_s, %	S_I, %	X_s, %	X_I, %	Biomass X_H, %
Primary effluent characteristics
This study^a	159 ± 41	22 ± 9	15 ± 7	28 ± 13	35 ± 15	–
Fall et al. (2011) ^a	492	36	5	35	24	–
Siegrist et al. (1995) ^b	250	10	8	58	24	–
Henze (1992)		29	3	43	11	14
Raw influent characteristics
This study^a	280 ± 85	20 ± 10	9 ± 5	35 ± 15	36 ± 20	–
Mehrotra (2018) ^a	290	15	9	24	52	–
Lu et al. (2010)	540	8–10	1–4	27–40	14–36	23–46
Zhou et al. (2008)	176–220	19.5–27.8	8.4–12.8	16.1–37.3	13.9–33.4	14.7–18.9
Roeleveld & van Loosdrecht (2002) ^a	241–827	9–42	3–10	10–48	23–50	–
Kappeler & Gujer (1992)	250–430	7–11	12–20	53–60	8–10	7–15
Henze (1992)	400	27	15	40	17	–

^aThe fractions were measured and calculated purely with a physical-chemical method.

^bThe fractions of Siegrist et al. (1995) were calibrated using literature-based estimates used in modeling (not directly measured).

Comparison of model results with actual data

The three different methods for averaging the fractionation data were evaluated using the campaign year as the training set, and the preceding four years as the validation set, as described above. Figure 3 compares predicted and measured MLVSS concentrations for the three methods for the training set, while Table 3 summarizes performance statistics for the training and validation data sets. While variations occur between model-predicted and actual MLVSS values, the model-predicted and measured MLVSS concentrations are generally of the same order of magnitude for all three averaging methods. This is significant as the modeling approach does not include a mechanism to directly calibrate the model results to measured values. Model stoichiometric and kinetic parameters are standard values taken from the literature, as discussed above and summarized in Table 1, and wastewater influent values are based on measured influent values, as described previously. As noted in Table 3, actual average MLVSS concentrations compare quite well with model values, irrespective of the averaging method used. Importantly, this suggests that the wastewater characterization method used, along with the use of relatively standard stoichiometric and kinetic coefficients, can lead to a reasonable model to begin with.

Table 3

Simulation results for the three different fractionation averaging methods for the training and testing data sets

Set	Average method	Mean, mg VSS/L	STD, mg VSS/L	RMSE, mg VSS/L	>1 STD	>2 STD	>3 STD
Training (2017/10/18 − 2018/10/17) Size: 365	Actual	1,311.6	170.9
	Yearly	1,419.3	216.5	229.4	38.9%	14.0%	4.4%
	Quarterly	1,411.2	213.7	243.2	45.4%	15.6%	4.9%
	Monthly	1,413.5	361.7	343.5	55.6%	27.4%	10.7%
Testing (2013/10/18 − 2017/10/17) Size: 1461	Actual	1,165.9	185.9
	Yearly	1,119.7	281.0	256.3	44.0%	16.2%	3.0%
	Quarterly	1,112.8	288.1	185.9	47.8%	17.2%	3.8%
	Monthly	1,106.9	332.5	312.2	56.0%	24.4%	6.0%

Figure 3

Simulation results for the three different fractionation averaging methods for the training data set. (a) Yearly average; (b) quarterly average; (c) monthly average.

Visual inspection of the data presented in Figure 3 indicates a noticeable lack of fit from early February to late March. Model predictions consistently exceed actual values, and the deviations exceed the 10% criterion often applied to indicate model lack of fit (Rieger et al. 2012). Inspection of the individual data during this period indicated that this arose because of the nature of the model used. As indicated in Table 1, values for the COD/VSS ratio of influent particulate inert material (i_VSS,XI) and influent particulate substrate (i_VSS,Xs) of 1.5 and 1.8 g COD g VSS⁻¹ are used, while the actual measured value for the influent particulate matter throughout the year was 1.9 ± 0.7, ranging from 0.7 to 4.1 g COD g VSS⁻¹. The ratio for February to April was 2.5 ± 1.0. In fact, use of higher values in the model for this period resulted in near elimination of this lack of fit. By adjusting i_VSS,Xs from 1.8 to 2.3 and i_VSS,XI from 1.5 to 2.0, the February spike was eliminated, but the resulting model underestimated the actual MLVSS for March. Overall, the mean predicted MLVSS was improved to 1,361.4 mg/L, along with small improvements in RMSE and standard deviation (221.3 and 204.5 mg/L, less than 8%). From a modeling perspective, the lack of fit during the February to March period did not occur due to variations in wastewater characteristics, but rather because of poor model structure, as the COD to VSS ratio for these individual model components was not formulated as a wastewater characteristic but as a model parameter. Interestingly, the months of February and March represent a distinct operating period when influent flows tend to be somewhat higher and periods of precipitation occur (this is a combined system, as described above). This unusual operating period may explain why the COD to VSS ratio is higher during this period. The impact of unusual operating conditions is addressed in additional detail below.

From a modeling perspective, a priori knowledge concerning this failure of model structure would be required if the model is to be used to predict future performance. From a practical perspective, however, extreme COD/VSS values (around 0.7 or 4.1, for example) can be used to eliminate those days from the data set, as they are likely measurement errors.

The results summarized in Table 4 address a different question; that is, whether there were better and worse times to conduct fractionation studies. It differs from the monthly analysis summarized in Table 3 and illustrated in Figure 3, in that the fractionation results for a single month are used to model the entire year. The results indicate that some time periods are better than others.

The poorest results occur when characterization data from February are used, as might be expected from the results presented immediately above. The difference between the mean predicted and actual MLVSS increases to 46% of the actual value, the RMSE is nearly triple the value for the yearly average results presented in Table 3, the percentage of predictions exceeding one STD increased to 96.2%, and 58.6% exceeded three STD. On the other hand, the fractionation data from certain months, such as March and October to December, generally performed better in terms of mean values, RMSE, and the percentage exceeding two and three STD (spikes and outliers), as summarized in Table 4. Note that the number of fractionation measurements was not the main contributor to improved performance, as a larger sample size did not guarantee good performance (August and February) and a smaller sample size did not necessarily diminish performance (April). It is noted that the period of October to December generally represents a period of lower plant influent flow.

Table 4

Simulation results using fractionation data from an individual month to represent the whole year

Month	Mean (mg VSS/L)	STD (mg VSS/L)	RMSE (mg VSS/L)	> 1 STD	> 2 STD	> 3 STD	Sample size
January	947.9	145.7	400.4	86.6%	60.3%	16.7%	5
February	1,915.9	292.2	656.2	96.2%	8.8%	58.6%	4
March	1,424.2	217.2	230.0	40.4%	12.9%	4.4%	3
April	1,068.5	163.2	297.8	72.3%	28.5%	4.7%	2
May	1,525.5	232.7	302.2	57.5%	24.1%	8.2%	2
June	1,441.8	219.9	242.7	40.8%	15.9%	5.8%	4
July	1,512.2	230.7	292.4	54.2%	22.5%	7.4%	4
August	1,651.8	251.9	409.7	81.1%	43.0%	20.0%	5
September	1,494.5	228.0	279.1	51.0%	20.5%	6.3%	3
October	1,337.4	204.0	195.8	34.2%	8.2%	2.5%	4
November	1,342.8	205.1	198.8	32.6%	8.2%	2.2%	4
December	1,304.2	199.5	194.1	35.6%	8.8%	1.6%	3
Actual	1,311.6	170.9					43
Yearly average	1,361.3	326.0	294.4	53%	23%	8%	43
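
The statistics reported in Table 4 can be reproduced in outline with a few lines of Python. The following is a minimal sketch, not the authors' code: the function name is hypothetical, the example values are made up, and it assumes that the deviation thresholds are multiples of the sample standard deviation of the measured MLVSS, which the text does not state explicitly.

```python
import math
import statistics


def performance_summary(predicted, actual):
    """Summarize model fit for paired daily MLVSS values (mg VSS/L)."""
    if len(predicted) != len(actual) or len(actual) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    errors = [p - a for p, a in zip(predicted, actual)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # Assumption: thresholds are multiples of the sample standard
    # deviation of the measured series.
    std_actual = statistics.stdev(actual)

    def share_exceeding(k):
        # Fraction of days whose absolute error exceeds k * STD.
        return sum(abs(e) > k * std_actual for e in errors) / len(errors)

    return {
        "mean_predicted": statistics.fmean(predicted),
        "rmse": rmse,
        "gt_1_std": share_exceeding(1),
        "gt_2_std": share_exceeding(2),
        "gt_3_std": share_exceeding(3),
    }


# Made-up illustrative values, not data from the study.
actual = [1300.0, 1250.0, 1400.0, 1350.0, 1200.0]
predicted = [1320.0, 1500.0, 1390.0, 1300.0, 1210.0]
summary = performance_summary(predicted, actual)
```

Under this definition the share exceeding three STD can never be larger than the share exceeding two STD, which gives a quick consistency check on any tabulated row.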

A further analysis of the potential reasons for deviations was conducted by comparing operating conditions on days when spikes (difference between modeled and actual MLVSS ≥2 STD) occurred with those on days when spikes did not occur. The hypothesis test results indicate that, at the 95% confidence level, spikes tended to occur on days with lower SRT and MLSS, higher secondary influent BOD₅, COD, TSS and VSS concentrations, and higher secondary effluent TSS concentration. In short, efforts should be made to conduct fractionation campaigns during periods of relatively normal influent flow, loading, and operation, and results from periods where these factors are somewhat abnormal should be carefully screened and reviewed.
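
The text does not say which hypothesis test was applied. As one possible illustration only, the sketch below compares a single operating variable between spike days and non-spike days with a two-sided permutation test on the difference in means. The function name, the SRT values, and the choice of test are all assumptions made for this example.

```python
import random
import statistics


def permutation_p_value(spike_days, normal_days, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(spike_days) - statistics.fmean(normal_days))
    pooled = list(spike_days) + list(normal_days)
    n_spike = len(spike_days)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Reassign the group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[:n_spike]) - statistics.fmean(pooled[n_spike:])
        )
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Made-up SRT values (days) for illustration only.
srt_spike = [1.1, 1.3, 1.2, 1.0, 1.4, 1.2]
srt_normal = [1.9, 2.1, 2.0, 2.2, 1.8, 2.0, 2.3, 1.9]
p = permutation_p_value(srt_spike, srt_normal)
significant = p < 0.05  # corresponds to the 95% level used in the text
```

A permutation test is used here because it needs no distributional assumption and no external library. A t-test or rank-based test would be an equally plausible reading of the original analysis.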

Impacts of sample size on wastewater characteristic estimation and model performance

Increased sample size can improve estimated fractionation, but with diminishing returns, as presented in Figure 4. Fifty iterations were implemented for each sample size, with the designated number of characterization records drawn at random without replacement. Averaged fractions were fed into the model for simulation, and performance was evaluated. To minimize the error introduced by chance in sampling, both the maximum and average values of the model performance statistics among the 50 iterations were calculated. Average values reflect the overall performance of each sample size, while maximum values indicate the robustness, meaning that the result is not significantly influenced by individual characterizations. As indicated in Figure 4, there is a point of diminishing return. As expected, the desired 'elbow point' is controlled by the maximum criterion to achieve robust estimates of wastewater characteristics, making 20 the desired sample size in this instance. A preliminary analysis regressing the inert particulate fraction on the total COD with bootstrap sampling reached a similar conclusion (data not shown).

Figure 4

Elbow plots to determine sample size. Each sample size was iterated 50 times, and then maximum and mean values for each model assessment parameter were extracted to represent each sample size. The model evaluation parameters used include maximum and average values for (a) mean of predicted MLVSS; (b) RMSE of predicted MLVSS; (c) days with different deviations.
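
The resampling procedure behind Figure 4 can be outlined as follows. This is a hedged sketch rather than the authors' implementation: `resample_by_size` and `evaluate` are hypothetical names, the records are synthetic, and `evaluate` is a simple stand-in for a full model run that would return a statistic such as RMSE.

```python
import random
import statistics


def resample_by_size(records, sizes, evaluate, iterations=50, seed=0):
    """Return {size: (mean_score, max_score)} over repeated random draws."""
    rng = random.Random(seed)
    results = {}
    for size in sizes:
        scores = []
        for _ in range(iterations):
            subset = rng.sample(records, size)  # draw without replacement
            averaged = statistics.fmean(subset)  # averaged fraction
            scores.append(evaluate(averaged))
        # Mean reflects overall performance; max reflects robustness.
        results[size] = (statistics.fmean(scores), max(scores))
    return results


# Synthetic stand-in for 43 measurements of one wastewater fraction.
data_rng = random.Random(1)
records = [data_rng.gauss(0.15, 0.05) for _ in range(43)]
full_mean = statistics.fmean(records)


def evaluate(averaged_fraction):
    # Stand-in for a model run: distance from the full-data estimate.
    return abs(averaged_fraction - full_mean)


table = resample_by_size(records, sizes=[2, 5, 10, 20, 43], evaluate=evaluate)
```

Plotting the mean and maximum scores against sample size gives the kind of elbow plot described in the caption. Reading the elbow from the maximum curve is the more conservative choice, since it bounds the worst draw rather than the typical one.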

Implications for wastewater characterization campaigns

These results provide guidance on the number of individual measurements that can result in a robust assessment of wastewater characteristics. The analysis summarized in Figure 4 suggests that around 20 measurements represent a reasonable balance between achieving a robust assessment without an excessive number of measurements. The results presented in Table 3 also support the conclusion that ‘more is better’ (yearly average compared to quarterly and monthly) when assessing wastewater characteristics and their impact on model performance. This result conflicts, however, with those presented in Table 4, which indicated that even a small number of measurements conducted at ‘the right time’ (March and October to December in this case) can result in better characterization of the wastewater relative to model performance. This presents a conundrum for planning wastewater characterization campaigns, as it is not possible to know, a priori, what the ‘right time’ is. Certainly, periods that are recognized to generally represent unusual conditions can be avoided, but it may not be possible to predict the ideal time. This suggests that an adaptive approach to wastewater characterization may be needed. It may consist of multiple sampling events, each of relatively short duration, with the results carefully evaluated after each event for consistency in model predictions as well as the occurrence of unusual influent or operating conditions. Sampling periods continue until a consistent set of results is achieved. Using this approach, sampling can be terminated when a sufficient number of measurements are obtained during periods of normal operation so that a robust assessment of wastewater characteristics is achieved. Issues related to model structure, as occurred in this instance during February and March of 2018, can also be identified with this approach and addressed appropriately given the objective of the modeling exercise. 
Use of this approach makes it unnecessary to specify initially the number of measurements required to achieve a robust assessment of wastewater characteristics, as the methodology itself will determine this. A robust budget is needed to account for unforeseen conditions. Given the significant economic impact of poor wastewater characterization in many instances, unnecessarily limiting the wastewater characterization budget may not be a wise use of funds as the economic impact of poor decisions may be orders of magnitude greater than the cost of additional testing.
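
The adaptive approach described above amounts to a loop with a stopping rule. The sketch below is one way to express it, under stated assumptions: all names are hypothetical, `predict_mlvss` stands in for a full model run, the screening rule for unusual conditions is invented for the example, and the stopping rule (relative change in predicted MLVSS between successive campaigns below a tolerance) is an assumption, since the text does not define a specific consistency criterion.

```python
import statistics


def adaptive_campaign(campaigns, predict_mlvss, is_unusual,
                      tolerance=0.05, max_campaigns=10):
    """Pool short campaigns until the model prediction stops changing."""
    pooled = []
    previous = None
    count = 0
    for count, batch in enumerate(campaigns, start=1):
        # Screen out measurements taken under unusual conditions.
        pooled.extend(m for m in batch if not is_unusual(m))
        if pooled:
            current = predict_mlvss(statistics.fmean(pooled))
            # Assumed stopping rule: relative change below the tolerance.
            if (previous is not None
                    and abs(current - previous) <= tolerance * abs(previous)):
                return {"converged": True, "campaigns_used": count,
                        "n_measurements": len(pooled),
                        "predicted_mlvss": current}
            previous = current
        if count >= max_campaigns:
            break
    return {"converged": False, "campaigns_used": count,
            "n_measurements": len(pooled), "predicted_mlvss": previous}


def predict_mlvss(fraction):
    # Hypothetical linear stand-in for a model run (mg VSS/L).
    return 10000.0 * fraction


def is_unusual(measurement):
    # Hypothetical screening rule for the example only.
    return measurement > 0.25


# Made-up campaign results; each inner list is one short campaign.
campaigns = [[0.10, 0.12], [0.18, 0.30, 0.16], [0.14, 0.15], [0.13, 0.15]]
result = adaptive_campaign(campaigns, predict_mlvss, is_unusual)
```

In this made-up example the third campaign changes the prediction by about 1%, so sampling stops after six retained measurements and the fourth campaign is never needed.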

The system considered and the model application used in this work represent perhaps one of the simplest cases, but one with potentially significant economic impacts. Accurate prediction of the MLVSS concentration translates directly into the required bioreactor and secondary clarifier sizes, which represent a major capital expense for any suspended growth biological treatment system. The colloidal organic matter fraction of the biological process influent wastewater was found to be negligible in this instance, and the dissolved fraction could be characterized based on membrane filtration rather than flocculation and filtration. Note that GLWA serves a large and diverse metropolitan area, and that a significant portion of the collection system consists of combined sewers, leading to significant variations in influent flows, both seasonal and daily, and significant temperature variations given its location in the northern USA. In spite of these factors, it was found that one set of wastewater characteristics applied over the entire year. Thus, while the precise numerical results determined for this application may not generally apply, the adaptive approach to wastewater characterization and model calibration described here may be more generally applicable.

CONCLUSIONS

An extended wastewater fractionation study conducted at the GLWA WRRF provided the basis to evaluate alternative wastewater characterization campaign designs. An ideal campaign results in a robust characterization of the wastewater while managing the time and resources required to achieve this result. Wastewater characterization must, of course, be viewed in the context of the objectives of the modeling exercise and the potential impacts of improper model development. The following conclusions can be offered based on this study:

1. The characteristics of this wastewater, which originates from a large and diverse metropolitan area, did not vary on a seasonal basis when assessed by predicted versus actual bioreactor MLVSS concentration. This occurred in spite of significant daily and seasonal variations in influent wastewater flow, attributable to the substantial combined sewer component of the collection system, and in spite of seasonal temperature variations.
2. Sampling during periods of normal and stable plant operation results in the most reliable estimates of wastewater characteristics. Increasing the number of samples can partially overcome the adverse impacts of occasional periods of unusual plant operation on sampling results, but the best results will be obtained by avoiding, when possible, sampling during unusual operating periods.
3. For this application, around 20 samples randomly distributed over an annual cycle were found to represent a good trade-off between further increasing the number of samples and the gain in precision in the estimation of wastewater characteristics.
4. An adaptive approach to wastewater characteristics measurement consisting of multiple measurement campaigns, each of limited duration, may provide the best results. Sufficient resources need to be devoted to the campaign to allow for enough sampling events to ensure that a reliable and robust assessment of wastewater characteristics is achieved.
5. Attention should be paid to the potential for periods of poor model structure, including numerical values of key parameters, when assessing results. Some redundancy in measured parameters (COD, BOD₅, TSS, VSS) can facilitate identification of such periods.

ACKNOWLEDGEMENTS

The assistance of the personnel of the GLWA WRRF and the GLWA Analytical laboratory is acknowledged and is greatly appreciated by the authors of this paper. This work would not have been possible without their diligence in processing the collected samples and conducting the analyses.

REFERENCES

Alikhani J., Takacs I., Al-Omari A., Murthy S. & Massoudieh A. 2017 Evaluation of the information content of long-term wastewater characteristics data in relation to activated sludge model parameters. Water Science and Technology 75 (6), 1370–1389.

APHA (American Water Works Association, American Public Health Association & Water Environment Federation) 2017 Standard Methods for the Examination of Water and Wastewater, 23rd edn. APHA, Washington, DC, USA. Accessed at www.standardmethods.org.

Choubert J.-M., Rieger L., Shaw A., Copp J., Spérandio M., Sørensen K., Rönner-Holm S., Morgenroth E., Melcer H. & Gillot S. 2013 Rethinking wastewater characterisation methods for activated sludge systems-a position paper. Water Science and Technology 67 (11), 2363.

Fall C., Flores N. A., Espinoza M. A., Vazquez G., Loaiza-Navia J., van Loosdrecht M. C. M. & Hooijmans C. M. 2011 Divergence between respirometry and physicochemical methods in the fractionation of the chemical oxygen demand in municipal wastewater. Water Environment Research 83 (2), 162–172.

Gillot S. & Choubert J. M. 2010 Biodegradable organic matter in domestic wastewaters: comparison of selected fractionation techniques. Water Science and Technology 62 (3), 630–639.

Grady C. L. Jr, Daigger G. T., Love N. G. & Filipe C. D. 2011 Biological Wastewater Treatment. CRC Press, New York, NY, USA.

Hauduc H., Rieger L., Ohtsuki T., Shaw A., Takács I., Winkler S., Héduit A., Vanrolleghem P. A. & Gillot S. 2011 Activated sludge modelling: development and potential use of a practical applications database. Water Science and Technology 63 (10), 2164–2182.

Hauduc H., Rieger L., Oehmen A., van Loosdrecht M. C. M., Comeau Y., Héduit A., Vanrolleghem P. A. & Gillot S. 2013 Critical review of activated sludge modeling: state of process knowledge, modeling concepts, and limitations. Biotechnology and Bioengineering 110 (1), 24–46.

Henze M. 1992 Characterization of wastewater for modelling of activated sludge processes. Water Science and Technology 25 (6), 1–15.

Henze M., Gujer W., Mino T. & van Loosdrecht M. C. 2000 Activated Sludge Models ASM1, ASM2, ASM2d and ASM3. IWA Publishing, London, UK.

Kappeler J. & Gujer W. 1992 Estimation of kinetic parameters of heterotrophic biomass under aerobic conditions and characterization of wastewater for activated sludge modelling. Water Science and Technology 25, 125–139.

Lu P., Zhang X. & Zhang D. 2010 An integrated system for wastewater COD characterization and a case study. Water Science and Technology 62 (4), 866–874.

Mamais D., Jenkins D. & Pitt P. 1993 A rapid physical-chemical method for the determination of readily biodegradable soluble COD in municipal wastewater. Water Research 27 (1), 195–197.

Mehrotra A. 2018 BioWin Modeling Report for GLWA Master Wastewater Plan. Report, CDM Smith, Boston, MA, USA.

Melcer H. 2004 Methods for wastewater characterization in activated sludge modelling. IWA Publishing, London, UK.

Petersen B., Vanrolleghem P., Gernaey K. & Henze M. 2002 Evaluation of an ASM1 model calibration procedure on a municipal–industrial wastewater treatment plant. Journal of Hydroinformatics 4, 15–38.

Phillips H. M., Sahlstedt K. E., Frank K., Bratby J., Brennan W., Rogowski S., Pier D., Anderson W., Mulas M., Copp J. B. & Shirodkar N. 2009 Wastewater treatment modelling in practice: a collaborative discussion of the state of the art. Water Science and Technology 59 (4), 695–704.

Rieger L., Takács I., Villez K., Siegrist H., Lessard P., Vanrolleghem P. A. & Comeau Y. 2010 Data reconciliation for wastewater treatment plant simulation studies – planning for high-quality data and typical sources of errors. Water Environment Research 82 (5), 426–433.

Rieger L., Gillot S., Langergraber G., Ohtsuki T., Shaw A., Takacs I. & Winkler S. 2012 Guidelines for Using Activated Sludge Models. IWA Publishing, London, UK.

Roeleveld P. J. & van Loosdrecht M. C. M. 2002 Experience with guidelines for wastewater characterisation in the Netherlands. Water Science and Technology 45 (6), 77–87.

Siegrist H., Krebs P., Bühler R., Purtschert I., Rock C. & Rufer R. 1995 Denitrification in secondary clarifiers. Water Science and Technology 31 (2), 205–214.

Taylor J. 1997 Introduction to Error Analysis, the Study of Uncertainties in Physical Measurements, 2nd edn. University Science Books, Sausalito, California, USA.

Vanrolleghem P. A., Insel G., Petersen B., Pauw D. D., Nopens I., Dovermann H., Weijers S. & Gernaey K. 2003 A Comprehensive Model Calibration Procedure for Activated Sludge Models. In: Proceedings of the Water Environment Federation, WEFTEC 2003.

Yan J., Yang C., Tian Z. & Daigger G. T. 2018 Characterizing the Performance and Operational Characteristics of the Bioreactors at the Detroit, MI, Water Resource Recovery Facility: May, 2017- March 2018 Results. Report, University of Michigan, Ann Arbor, MI, USA.

Zhou Z., Wu Z., Wang Z., Tang S. & Gu G. 2008 COD fractionation and parameter estimation for combined sewers by respirometric tests. Journal of Chemical Technology & Biotechnology: International Research in Process, Environmental & Clean Technology 83 (12), 1596–1601.
