Present-day instrumentation networks already provide immense quantities of data, very little of which yields insight into the basic physical processes occurring in the measured medium. That is to say, the data by itself contributes little to the knowledge of such processes. Data mining and knowledge discovery aim to change this situation by providing technologies that greatly facilitate the mining of data for knowledge. In this new setting, the role of the human expert is to provide domain knowledge, interpret models suggested by the computer, and devise further experiments that will provide even better data coverage. Clearly, there is an enormous amount of existing knowledge and understanding of physical processes that should not simply be thrown away. Consequently, we strongly believe that the most appropriate way forward is to combine the best of the two approaches: the theory-driven, understanding-rich approach and the data-driven discovery process. This paper describes a particular knowledge discovery algorithm, Genetic Programming (GP). An augmented version of GP, dimensionally aware GP, which is arguably more useful in the process of scientific discovery, is also described in detail. Finally, the paper concludes with an application of dimensionally aware GP to the induction of an empirical relationship describing the additional resistance to flow induced by flexible vegetation.
