The success of any forecasting model depends heavily on, among other things, reliable historical data. Data are needed to calibrate, fine-tune and verify any simulation model. However, data are very often contaminated with noise of different levels originating from different sources. This study proposes a scheme that extracts the most representative data from a raw data set. The Subtractive Clustering Method (SCM) and a Micro Genetic Algorithm (mGA) were used for this purpose. SCM (a) removes outliers and (b) discards unnecessary or superfluous points, while mGA, a search engine, determines the optimal values of SCM's parameter set. The scheme was demonstrated on: (1) Bangladesh water level forecasting with Neural Network and Fuzzy Logic and (2) forecasting of two chaotic river flow series (Wabash River at Mt. Carmel and Mississippi River at Vicksburg) with the phase space prediction method. The scheme significantly reduced the data set, and forecasting models trained on the reduced set yielded prediction accuracy equal to or higher than that of models trained with the whole original data set. The resulting fuzzy logic model, for example, has fewer rules, which are easier for humans to interpret. In phase space prediction of chaotic time series, which is known to require a long data record, a data reduction of up to 40% has almost no effect on the prediction accuracy.
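As a rough illustration of the clustering step named above, the following is a minimal sketch of subtractive clustering in its commonly described form: each point receives a density "potential", the highest-potential point becomes a center, and the potential around it is then subtracted. It is not the authors' implementation. The function name, the parameters `r_a`, `squash` and `stop_ratio`, their default values, and the single-threshold stopping rule are assumptions made for illustration; the abstract does not state which SCM parameters the mGA tunes or how the selected centers map to the retained data.

```python
import numpy as np

def subtractive_clustering(X, r_a=0.5, squash=1.5, stop_ratio=0.15):
    """Return indices of data points chosen as cluster centers.

    Hypothetical parameters (not taken from the paper):
      r_a        - neighbourhood radius in normalized units
      squash     - factor widening the radius used when subtracting potential
      stop_ratio - stop once the best remaining potential falls below
                   stop_ratio * (potential of the first center)
    """
    X = np.asarray(X, dtype=float)

    # Scale every dimension to [0, 1] so one radius applies to all of them.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Z = (X - X.min(axis=0)) / span

    # Pairwise squared Euclidean distances.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)

    alpha = 4.0 / r_a ** 2
    beta = 4.0 / (squash * r_a) ** 2

    # Potential of a point: how densely it is surrounded by other points.
    potential = np.exp(-alpha * d2).sum(axis=1)
    first_peak = potential.max()

    centers = []
    while True:
        k = int(np.argmax(potential))
        peak = potential[k]
        if peak < stop_ratio * first_peak:
            break
        centers.append(k)
        # Subtract potential near the new center so that close neighbours
        # (superfluous points) are unlikely to be picked again.
        potential = potential - peak * np.exp(-beta * d2[k])
    return centers
```

In this sketch, an isolated point keeps a low potential and is never selected when `stop_ratio` is high enough relative to the cluster sizes, which loosely corresponds to outlier removal; points close to an already chosen center lose their potential, which loosely corresponds to discarding superfluous points. A search procedure such as the mGA could then be wrapped around a function like this to choose the parameter values.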
Research Article
October 01 2005
Derivation of effective and efficient data set with subtractive clustering method and genetic algorithm
C. D. Doan;
1Civil Engineering Department, National University of Singapore, Singapore, 117576
S. Y. Liong;
1Civil Engineering Department, National University of Singapore, Singapore, 117576
Tel: +65 6874 3081, Fax: +65 6779 1635; E-mail: [email protected]
Dulakshi S. K. Karunasinghe
1Civil Engineering Department, National University of Singapore, Singapore, 117576
Journal of Hydroinformatics (2005) 7 (4): 219–233.
Citation
C. D. Doan, S. Y. Liong, Dulakshi S. K. Karunasinghe; Derivation of effective and efficient data set with subtractive clustering method and genetic algorithm. Journal of Hydroinformatics 1 October 2005; 7 (4): 219–233. doi: https://doi.org/10.2166/hydro.2005.0020