Two main issues regarding stormwater quality models have been investigated: (i) the effect of calibration dataset size and characteristics on calibration and validation results; (ii) the optimal split of available data into calibration and validation subsets. Data from 13 catchments were used for three pollutants: BOD, COD and SS. Three multiple regression models were calibrated and validated. Using several data sets and several models makes it possible to identify general trends. The main finding is that multiple regression models are sensitive to the calibration data: calibrating with few data leads to poor predictions despite good calibration results. It was also found that randomly splitting the available data into halves for calibration and validation is not optimal: more data should be allocated to calibration. The proportion of data to be used for validation increases with the number of available data (N) and reaches about 35% for N around 55 measured events.
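The kind of split experiment described above can be illustrated with a minimal sketch. It is not the authors' procedure or one of their three models: the data are synthetic, the three explanatory variables are unnamed placeholders, and the function names (`fit_linear`, `rmse`, `split_errors`) and the error measure (RMSE) are assumptions chosen for illustration. The sketch fits an ordinary least squares regression on a random calibration subset, evaluates it on the remaining events, and repeats this for several validation fractions.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def rmse(X, y, coef):
    """Root mean square error of the fitted model on the events (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    return float(np.sqrt(np.mean((A @ coef - y) ** 2)))


def split_errors(X, y, validation_fraction, n_repeats=200, seed=0):
    """Mean calibration and validation RMSE over repeated random splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_val = int(round(validation_fraction * n))
    cal_err, val_err = [], []
    for _ in range(n_repeats):
        idx = rng.permutation(n)
        val, cal = idx[:n_val], idx[n_val:]
        coef = fit_linear(X[cal], y[cal])
        cal_err.append(rmse(X[cal], y[cal], coef))
        val_err.append(rmse(X[val], y[val], coef))
    return float(np.mean(cal_err)), float(np.mean(val_err))


# Synthetic stand-in for N = 55 measured events (assumption, not real data):
# three placeholder explanatory variables and a noisy linear response.
rng = np.random.default_rng(42)
n_events = 55
X = rng.uniform(0.0, 1.0, size=(n_events, 3))
y = 50.0 + X @ np.array([30.0, -10.0, 20.0]) + rng.normal(0.0, 8.0, n_events)

# Compare several validation fractions, including the 50/50 split.
results = {f: split_errors(X, y, f) for f in (0.2, 0.35, 0.5, 0.8)}
for f, (cal, val) in results.items():
    print(f"validation fraction {f:.2f}: "
          f"calibration RMSE {cal:.2f}, validation RMSE {val:.2f}")
```

With only a small calibration subset, the calibration error tends to look optimistic while the validation error grows, which is the pattern the abstract reports. The sketch does not attempt to reproduce the 35% figure, which depends on the authors' data and criteria.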
Calibration and validation of multiple regression models for stormwater quality prediction: data partitioning, effect of dataset size and characteristics
M. Mourad, J.-L. Bertrand-Krajewski, G. Chebbo; Calibration and validation of multiple regression models for stormwater quality prediction: data partitioning, effect of dataset size and characteristics. Water Sci Technol 1 August 2005; 52 (3): 45–52. doi: https://doi.org/10.2166/wst.2005.0060