This paper describes the results of experiments with artificial neural networks (ANNs) and genetic programming (GP) applied to some problems of data mining. It is shown how these subsymbolic methods can discover usable relations in measured and experimental data with little or no a priori knowledge of the characteristics of the governing physical process. On the one hand, the ANN does not explicitly identify the form of a model; instead, this form is implicit in the ANN, being encoded within the distribution of weights. However, in cases where the exact form of the empirical relation is not considered as important as the ability of the formula to map the experimental data accurately, the ANN provides a very efficient approach. Furthermore, it is demonstrated how numerical schemes, and thus partial differential equations, may be derived directly from data by interpreting the weight distribution within a trained ANN. On the other hand, the evolutionary force in GP is directed towards the creation of models that take a symbolic form. The resulting symbolic expressions are generally less accurate than the ANN in mapping the experimental data; however, these expressions may sometimes be more easily examined to provide insight into the processes that created the data. An example is used to demonstrate how GP can generate a wide variety of formulae, of which some may provide genuine insight while others may be quite useless.
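The idea of reading a numerical scheme from trained weights can be illustrated with a minimal sketch. It is not the procedure used in the paper: it assumes a single linear neuron without bias, synthetic data generated by the first-order upwind scheme for linear advection, and a Courant number of 0.5. The names `make_samples`, `train_linear_neuron` and `COURANT` are hypothetical and chosen only for this illustration.

```python
import random

def make_samples(courant, n_samples, seed=0):
    """Generate (inputs, target) pairs from the first-order upwind scheme.

    Assumed data generator (not taken from the paper):
        u_j^{n+1} = (1 - C) * u_j^n + C * u_{j-1}^n
    Inputs are (u_{j-1}, u_j, u_{j+1}) at time level n.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        u_left, u_mid, u_right = (rng.uniform(-1.0, 1.0) for _ in range(3))
        u_next = (1.0 - courant) * u_mid + courant * u_left
        samples.append(((u_left, u_mid, u_right), u_next))
    return samples

def train_linear_neuron(samples, learning_rate=0.1, epochs=20):
    """Train one linear neuron (no bias) with the least-mean-squares rule."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for inputs, target in samples:
            prediction = sum(w * x for w, x in zip(weights, inputs))
            error = target - prediction
            # Gradient step on the squared error for this sample.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
    return weights

COURANT = 0.5  # assumed value of C = c * dt / dx
weights = train_linear_neuron(make_samples(COURANT, 500))
print("weights on (u[j-1], u[j], u[j+1]):", [round(w, 4) for w in weights])
```

In this sketch the trained weights approach (C, 1 - C, 0), which is the stencil of the upwind scheme. That stencil is in turn a discretisation of the linear advection equation ∂u/∂t + c ∂u/∂x = 0 with C = cΔt/Δx, so the weight distribution can be interpreted first as a numerical scheme and then as a partial differential equation.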
Research Article|January 01 2000
Subsymbolic methods for data mining in hydraulic engineering
Journal of Hydroinformatics (2000) 2 (1): 3-13.
Anthony W. Minns; Subsymbolic methods for data mining in hydraulic engineering. Journal of Hydroinformatics 1 January 2000; 2 (1): 3–13. doi: https://doi.org/10.2166/hydro.2000.0002