With the increasing complexity of hydrologic problems, data collection and data analysis are often carried out in distributed heterogeneous systems. It is therefore critical for users to be able to determine the origin of data and its trustworthiness. Provenance describes the information life cycle of data products and has been recognised as one of the most promising methods for improving data transparency. However, because of the complexity of the information life cycle involved, querying provenance information is a challenge: it may be generated by distributed systems that use different vocabularies and conventions, and it may involve knowledge from multiple domains. In this paper, we present a semantic knowledge management framework that tracks and integrates provenance information across distributed heterogeneous systems. It is underpinned by the Integrated Knowledge model, which describes the domain knowledge and the provenance information involved in the information life cycle of a particular data product. We evaluate the proposed framework in the context of two real-world water information systems.