Abstract
River water quality degradation is a risk to human health. Hence, many water quality models have been developed to predict the future states of water bodies and to understand how current water treatment systems will respond to future pollution loads and climatic drivers. A Japanese river was evaluated with the River Water Quality Model No. 1 (RWQM1), and parameter sensitivity and identifiability analyses were performed on the model output using a parameter sensitivity ranking, the collinearity index, and a Fisher Information Matrix (FIM)-derived criterion. Among the RWQM1 kinetic parameters, those related to hydrolysis and to the growth of aerobic heterotrophs and first-stage nitrifiers were the most influential. Reactive soluble organic substances in untreated gray water, together with the prevalence ratio of the most advanced on-site treatment facility, contributed strongly to the variability of the model output. A remediation analysis revealed that, in terms of the decrease in BOD concentration in the stream, a 20% increment in the prevalence ratio of the most advanced on-site treatment facility was almost equivalent to a 70% decrease in the effluent concentration from the on-site treatment facility producing the highest pollutant load. This study provides baseline data to assist policy implementation for the management of effluents from on-site treatment facilities.
HIGHLIGHTS
The impact of wastewater discharge from on-site treatment facilities was studied.
Sensitivity-identifiability analyses provided effluent management perspectives.
A prevalence ratio of the most advanced on-site treatment facility was the most influential parameter.
Easily soluble organic matter in untreated gray water should be prioritized as a wastewater constituent to be treated in the studied area.
INTRODUCTION
Today, the world's most crucial river water quality problem is posed by areas not connected to centralized sewer systems (Yates et al. 2019). In many countries, most houses in rural or isolated areas are not connected to centralized sewer systems. These houses rely on on-site domestic wastewater treatment facilities such as septic tank systems to treat household sewage (Withers et al. 2011). One of the challenges of on-site domestic wastewater treatment facilities is that they are not inspected as routinely as centralized wastewater treatment plants. This suggests that underperforming systems, or ones that have exceeded their functional lifespan, could be sources of pollutant loading. Withers et al. (2011) pointed out that the semi-continuous release of nutrients can considerably impact local stream ecosystem health, particularly in rivers with low dilution effects. Jarvie et al. (2006) showed that septic tank effluents posed more significant risks to river eutrophication than diffuse agricultural pollution. Gill & Mockler (2016) demonstrated that annual nutrient emissions from septic tank systems could reach 22% for P and 13% for N in small catchments, indicating that they could be significant sources of nutrient loading. Based on high temporal resolution monitoring data, it is apparent that even small clusters of poorly performing septic tank systems could become a significant source of nutrients in the spring and summer seasons, when rural watersheds are ecologically active (Macintosh et al. 2011). Sato et al. (2013) demonstrated that approximately 90% of the untreated wastewater in low-income countries is discharged into the surrounding water bodies, such as rivers, wetlands, lakes, and streams. These findings agree with the World Health Organization (WHO) fact sheet, which indicates that at least two billion people worldwide drink from fecal-contaminated water sources. The WHO fact sheet further shows that microbiologically contaminated drinking water causes 485,000 deaths each year, which are linked to the transmission of diseases such as diarrhea (WHO 2022).
United Nations Sustainable Development Goal 6 addresses global water quality and sanitation challenges. It emphasizes eliminating dumping and minimizing the release of hazardous chemicals and materials, reducing pollution, halving the proportion of untreated wastewater, and substantially increasing recycling and safe reuse. Several integrated water quality monitoring programs and approaches have been developed over the past years to achieve this goal. Water quality modeling has proven effective in predicting the future states of water bodies and in understanding how current water treatment systems will respond to future pollution loads and climatic drivers. Water quality models have been used for risk assessment, for identifying and quantifying the sources of water quality constituents, and for exploring potential outcomes of climate, hydrology, and management scenarios.
However, complex nonlinear environmental simulation models have many parameters. It is often not clear which parameter subset should be estimated from observed data. Hence, an identifiability analysis is an effective tool to obtain insights into reasonable parameter subsets that describe observed data adequately (Soares et al. 2020). A river often receives wastewater from treatment facilities at different treatment levels. The sensitivity and identifiability analyses can reveal which parameters or variables are most influential on the river water quality. This could provide practical guides to determine the priority of managing effluents from treatment facilities and characterize biological kinetic parameters that dominate the conversion processes of pollutants in a river. Hence, the main objective of this study was to investigate the most influential water quality variables in wastewater discharged from on-site treatment facilities and kinetic parameters included in a river water quality model. Our main goal was to generate baseline data that may provide a clue in identifying critical elements in the water quality models for improving river basin management.
METHODS
Test case






Maps of Japan and the Honjo City, and schematic diagram of a river reach of the Motokoyama River (maps taken from Google Maps). The river flows towards Kujyo.
Model description



















































Conceptual diagram of the tank-in-series model. W indicates pollutant load, Q flow rate, and C concentration.
Sensitivity measures





























Collinearity index











This index represents the degree of approximate linear dependence of the sensitivity functions of the parameters, which correspond to the columns of the matrix. The index is equal to unity if the sensitivity functions are orthogonal, while it goes to infinity if these functions are exactly linearly dependent. A high collinearity index indicates that changes in model results induced by small changes in a parameter can be compensated by changes in other parameters in K. A low value of the index is preferable in a study where available data are limited, and identifiability problems start to arise when the index is between 10 and 15 (Omlin et al. 2001).
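As a minimal sketch of how such an index can be computed, the Python snippet below assumes the commonly used definition in which the index equals one over the square root of the smallest eigenvalue of S̃ᵀS̃, where S̃ holds the sensitivity columns scaled to unit length. The defining equation is not reproduced in this text, so that form, the function name `collinearity_index`, and the toy matrices are assumptions made for illustration only.

```python
import numpy as np

def collinearity_index(S):
    """Collinearity index of the columns of a sensitivity matrix S (n_obs x n_par).

    Assumed form: 1 / sqrt(smallest eigenvalue of S~^T S~), where S~ has
    columns normalized to unit Euclidean length.
    """
    S = np.asarray(S, dtype=float)
    # Normalize each column (one column per parameter) to unit length
    S_tilde = S / np.linalg.norm(S, axis=0)
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order
    lam_min = np.linalg.eigvalsh(S_tilde.T @ S_tilde)[0]
    # Exactly dependent columns give a zero eigenvalue, i.e. an infinite index
    if lam_min <= 0.0:
        return float("inf")
    return float(1.0 / np.sqrt(lam_min))

# Hypothetical toy sensitivity matrices (3 observations x 2 parameters)
orthogonal = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
nearly_dependent = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.01]])

gamma_orth = collinearity_index(orthogonal)
gamma_dep = collinearity_index(nearly_dependent)
print(gamma_orth, gamma_dep)
```

The orthogonal pair yields an index of unity, whereas the nearly dependent pair yields a value far above the 10 to 15 range quoted from Omlin et al. (2001).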
FIM-based criterion











Procedures for the parameter identifiability analysis


Statistical analysis
Two-way analysis of variance (ANOVA) was used to assess variability in the mean concentrations of water quality variables observed at Kujyo. The time period (1–3) and month (January to December) were used as the independent categorical variables. All statistical analyses were performed using Python with the statsmodels library version 0.14.0 (Seabold & Perktold 2010). All tests were considered statistically significant at p < 0.05.
RESULTS
Comparisons of observed and simulation results
















Simulation results (presented with solid lines) of monthly concentration variations for (a–c) BOD, (d–f) , (g–i) , (j–l) SS, and (m–o) DO in time periods 1, 2 and 3, compared with monitored data obtained at Kujyo (presented with box-and-whisker plots). Large error bars indicate large variations in the concentrations of each water quality variable. Circles in the figures indicate outliers, which were either larger or smaller than a certain threshold. The thresholds coincide with the data closest to, but within, limits computed from Q3, Q1, and IQR, which stand for the upper quartile, lower quartile, and interquartile range, respectively. Note that the vertical scale differs between water quality variables, but the same scale is maintained for the same water quality variable.
Screening for important parameters












Parameter importance ranking using the sensitivity measure for (a) RWQM1 kinetic parameters and (b) water quality components and prevalence ratios.
Identifiability of parameter subsets

















Collinearity index in each set size for (a) RWQM1 kinetic parameters, and (b) water quality components and prevalence ratios
Set size* | Time period 1 (2005–2010): A | Time period 1 (2005–2010): B | Time period 2 (2011–2014): A | Time period 2 (2011–2014): B | Time period 3 (2015–2019): A | Time period 3 (2015–2019): B
---|---|---|---|---|---|---
2 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00
3 | 2.919 | 2.353 | 3.093 | 2.159 | 2.127 | 2.068
4 | 2.924 | 592.0 | 3.093 | 587.0 | 2.137 | 220.9
5 | 13.394 | – | 12.585 | – | 15.374 | –
6 | 100.209 | – | 95.121 | – | 99.143 | –
*The initial subset (set size 1) consists of the most influential parameter evaluated with the sensitivity measure. One parameter was added each time the set size increased.
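How a column of such a table can be produced is sketched below in Python. The sketch assumes that parameters are added in order of decreasing sensitivity and that the index takes the form 1/sqrt(smallest eigenvalue of S̃ᵀS̃) with unit-length columns; the selection rule is not spelled out in this text, and the function names, the toy matrix, and the threshold of 10 (the lower end of the 10 to 15 range) are hypothetical.

```python
import numpy as np

def collinearity_index(S):
    """Assumed form: 1 / sqrt(smallest eigenvalue of S~^T S~), unit-length columns."""
    S_tilde = S / np.linalg.norm(S, axis=0)
    lam_min = np.linalg.eigvalsh(S_tilde.T @ S_tilde)[0]
    return float("inf") if lam_min <= 0.0 else float(1.0 / np.sqrt(lam_min))

def grow_subset(S, ranking, threshold=10.0):
    """Add parameters in ranking order and record the index at each set size."""
    history = []
    for size in range(2, len(ranking) + 1):
        gamma = collinearity_index(S[:, ranking[:size]])
        history.append((size, gamma, gamma < threshold))
    return history

# Hypothetical sensitivity matrix: 4 observations x 3 parameters.
# The third column is almost a linear combination of the first two.
S = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.01],
              [0.0, 0.0, 0.0]])
ranking = [0, 1, 2]  # assumed order of decreasing sensitivity

history = grow_subset(S, ranking)
for size, gamma, ok in history:
    print(f"set size {size}: index = {gamma:.1f}, below threshold: {ok}")
```

In this toy case the index stays at unity for the first two parameters and jumps once the nearly dependent third parameter is added, which mirrors the kind of jump seen between set sizes 3 and 4 in column B.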
Change of BOD concentrations for the 15 cases including the base case. Specifics for each condition are: Base = default , default , 43%; 1 = decreased by 10%; 2 = decreased by 40%; 3 = decreased by 70%; 4 = increased by 10%; 5 = increased by 50%; 6 = increased by 100%; 7 = decreased by 10%; 8 = decreased by 40%; 9 = decreased by 70%; 10 = increased by 10%; 11 = increased by 50%; 12 = increased by 100%; 13 = 53%; 14 = 63%.
DISCUSSION
Sensitivity and identifiability analyses in combination with RWQM1 were used as tools to investigate the effect of wastewater discharge on river water quality. RWQM1 was originally developed partly to evaluate the impact of wastewater treatment plant operation and control (Reichert et al. 2001), which means that effluents from wastewater treatment facilities can be integrated smoothly into river water quality modeling because the model components are interchangeable. Sensitivity and identifiability analyses were proposed in the context of river water quality modeling two decades ago (Reichert & Vanrolleghem 2001); however, they are still used in various fields such as agriculture (Coudron et al. 2021) and biochemistry (Gábor et al. 2017). These analytical techniques are useful because they facilitate understanding of the model's intrinsic behavior, principally evaluated from the magnitude of output variations in response to parameter changes (Soares et al. 2020). In this study, the sensitivity measures capture all of the sensitivities, as shown in Equation (19), regardless of differences in sensitivity among the target water quality variables (e.g., BOD and DO), and are therefore affected by the water quality variable most relevant to each parameter or input. It was possible to identify the governing model kinetic parameters, water quality components, and newly introduced parameters in the currently applied model. An important insight gained through the present analyses is that the model behavior could be explained largely by hydrolysis and by the growth of aerobic heterotrophs and first-stage nitrifiers, since the kinetic parameters relevant to those processes were highly sensitive. Hydrolysis leads to a strong increase in degradable organic matter (DOM), while higher heterotrophic growth facilitates DOM consumption. Those microbial activities could have affected the dynamic variations of BOD.
Nitrification processes did not seem to contribute much to decreasing these concentration levels. The high concentrations of observed in this river might be partly due to the influx of nitrate-contaminated groundwater (Inagaki et al. 2020). plays a role in the consumption of and hence influences the decrease in which was observed in a period from May to August, particularly for time period 1. concentrations increased in December and remained high in winter. This could be due to a lower activity of stage 1 nitrifiers as the water temperature decreased. Parameters associated with on-site wastewater treatment facilities affect the accumulation of pollutants in the river. Easily soluble organic substances included in untreated gray water were found to be the most important among water quality components. This result was consistent with the estimated BOD load quantities from point and nonpoint sources covered in this study. For example, in the direct discharge area, the BOD loads of TJ and KU (2.02 × 10⁴ and 9.84 × 10³ g day⁻¹, respectively) were one to two orders of magnitude higher than that of GJ (9.52 × 10² g day⁻¹). Lowering effluent concentrations from those on-site treatment facilities was found to be important from the standpoint of pollution control. However, TJ and KU do not treat gray water, which is released into the stream without any treatment. Achieving a decrease in effluent concentrations is therefore practically impossible, and a viable option is to increase the prevalence ratio of GJ. As can be seen in Figure 6, the 20% increment of the prevalence ratio is almost equivalent to the 70% decrease in the effluent concentration from TJ. This finding is aligned with a coherent policy encouraging the renewal of TJ or KU to GJ, and it showcases how the findings obtained via those analyses can be put into practice in water quality governance and environmental policy. It is evident that the prevalence ratios of GJ strongly contributed to the variability in the model outputs. The sensitivities of model parameters varied with the chronological change of the three datasets; however, the top parameters included in the sensitivity rankings were somewhat similar for each time period. The differences in terms of the model structure were attributed only to changes in model inputs (e.g., inflow at Chubu, meteorological forcing data) that varied depending on time periods. However, actual increases in sensitivity values were observed, which shows that the river condition had become more sensitive to the discharge as the water quality improved. The statistical analysis also suggested that the mean concentrations of water quality variables (except for SS) differed significantly between the three time periods. The efforts to mitigate pollution loading should be continued by encouraging the renewal of TJ or KU to GJ as well as by maintaining facilities properly.
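For reference, the BOD loads quoted above for the direct discharge area can be compared as simple ratios. The symbol W for pollutant load follows the tank-in-series diagram; the subscripts are introduced here only for illustration.

```latex
\frac{W_{\mathrm{TJ}}}{W_{\mathrm{GJ}}}
  = \frac{2.02\times10^{4}\ \mathrm{g\,day^{-1}}}{9.52\times10^{2}\ \mathrm{g\,day^{-1}}}
  \approx 21.2,
\qquad
\frac{W_{\mathrm{KU}}}{W_{\mathrm{GJ}}}
  = \frac{9.84\times10^{3}\ \mathrm{g\,day^{-1}}}{9.52\times10^{2}\ \mathrm{g\,day^{-1}}}
  \approx 10.3
```

That is, the TJ and KU loads are roughly 21 and 10 times the GJ load, respectively.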
A subset of parameters is identifiable if the model output is sensitive enough to small variations of all parameters in the subset. The collinearity index and the FIM-derived criterion were used to define an identifiable parameter subset, or the maximum number of subset dimensions. Combining different indices leads to better judgments than relying on a single index (Freni et al. 2009). The identifiable subset included parameters that appear in Figure 5 (except for ), which indicates that the high sensitivities of these parameters and inputs stemmed from different causes (i.e., different impacts of parameters and inputs on different water quality variables), as they were linearly independent. With regard to water quality components, soluble organic substances included in untreated gray water had a great impact on the simulation. Reduction of untreated gray water would be a priority measure to effectively reduce the impact of discharge from those domestic wastewater facilities on this river. Identifiable parameter subsets did not change over the three time periods. This occurred because the sampling points and frequencies, as well as the monitored variables, were not changed (Freni & Mannina 2012).
Although our analyses were tailored to the specific conditions of the test case, the procedures taken in this study produced relevant information on a situation where a river receives wastewater from treatment facilities at different treatment levels, and they can likely be replicated to identify dominant water quality variables in other similar systems. The same technique is applicable to other sites where domestic wastewater is discharged and could be used to assess the effectiveness of implemented regulatory measures. Expansion of the model to cover other important water quality variables, such as heavy metals and pathogens, should also be considered. The approach used in this study still has some limitations. The best available values were used for the water quality of effluents from on-site treatment facilities as well as for the prevalence ratios of GJ. In reality, they were not necessarily constant. However, the variations of those model input values were considered very small, which supported the assumption that they were constant. Also, the current sensitivity analysis is categorized as a local method, which relies on a linear approximation of the model around the expected value of each parameter. A global sensitivity analysis method explores a wider input space of parameter values (Saltelli et al. 2010) and could be worth implementing to reveal more in-depth interrelationships among parameters. The current study focused on newly introduced model parameters. As mentioned earlier, their values were determined from data collected for this specific region and were expected to reflect the real situation, so the local method was still considered a good option.
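To illustrate what a local method involves, the sketch below approximates sensitivities by central finite differences around nominal parameter values and ranks parameters by the root-mean-square of each sensitivity column. The first-order decay model is a hypothetical stand-in for RWQM1, and the scaling of each derivative by its parameter value, the parameter values, and all function names are assumptions rather than the settings used in this study.

```python
import numpy as np

def toy_model(theta, t):
    """Hypothetical first-order BOD decay, C(t) = C0 * exp(-k * t); not RWQM1."""
    c0, k = theta
    return c0 * np.exp(-k * t)

def local_sensitivities(model, theta, t, rel_step=1e-6):
    """Scaled sensitivities s_ij = theta_j * dy_i/dtheta_j via central differences."""
    theta = np.asarray(theta, dtype=float)
    S = np.empty((t.size, theta.size))
    for j in range(theta.size):
        h = rel_step * abs(theta[j])
        up, down = theta.copy(), theta.copy()
        up[j] += h
        down[j] -= h
        # Linear approximation of the model response around the nominal value
        S[:, j] = theta[j] * (model(up, t) - model(down, t)) / (2.0 * h)
    return S

def rms_sensitivity(S):
    """Root-mean-square of each sensitivity column, used here to rank parameters."""
    return np.sqrt(np.mean(S**2, axis=0))

t = np.linspace(0.0, 5.0, 11)    # time in days (assumed)
theta0 = np.array([10.0, 0.3])   # C0 in mg/L and k in 1/day (assumed)

S = local_sensitivities(toy_model, theta0, t)
measure = rms_sensitivity(S)
ranking = np.argsort(measure)[::-1]  # parameter indices, most influential first
print(measure, ranking)
```

Because every derivative is evaluated at a single nominal parameter vector, the ranking holds only near that point; a global method would instead sample the parameter space.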
CONCLUSION
Sensitivity and identifiability analyses can effectively reveal which parameters or variables are most influential on model outputs when a large and complex model is involved. This study covered an urban river that has been subject to long-term monitoring campaigns since the environmental program was implemented. This river receives partially treated wastewater from on-site wastewater treatment facilities, which makes it a good test case for evaluating the impact of wastewater discharge on river water quality. Each of the three time periods was considered as either an in-program or a post-program period. The model behavior could be explained largely by hydrolysis and by the growth of aerobic heterotrophs and first-stage nitrifiers, since the kinetic parameters relevant to those processes were highly sensitive. Easily soluble organic substances included in untreated gray water were found to be the most important among water quality components. Also, the prevalence ratios of the most advanced on-site treatment facility type strongly contributed to the variability in the model outputs. These analytical results were in turn used for a remediation analysis: in terms of the decrease in BOD concentration in the stream, the 20% increment of the prevalence ratio was almost equivalent to the 70% decrease in the effluent concentration from the on-site treatment facility with the highest pollutant load. This finding indicated that renewal to the most advanced on-site treatment facility is effective. This study provided quantitative evidence that can assist in the implementation of policy for the management of effluents from on-site treatment facilities.
ACKNOWLEDGEMENTS
The authors would like to thank students of Dr Sakakibara's research group for their assistance in the early stage of model development, especially Mr Jun Yasuoka, Mr Takahede Hasegawa, and Mr Atsushi Miyata. The observed data in the field were kindly provided by the Honjo City Office. The views expressed in this article are those of the authors and do not necessarily reflect the views or policies of the Japanese government and local municipalities. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.