Multi-objective Markov decision processes (MOMDPs) provide an effective modeling framework for decision-making problems involving water systems. The traditional approach is to define many single-objective problems (resulting from different combinations of the objectives), each solvable by standard optimization. This paper presents an approach based on reinforcement learning (RL) that can learn the operating policies for all combinations of objectives in a single training process. The key idea is to enlarge the approximation of the action-value function, which is performed by single-objective RL over the state-action space, to the space of the objectives' weights. The batch-mode nature of the algorithm allows for enriching the training dataset without further interaction with the controlled system. The approach is demonstrated on a numerical test case study and evaluated on a real-world application, the Hoa Binh reservoir, Vietnam. Experimental results on the test case show that the proposed approach (multi-objective fitted Q-iteration; MOFQI) becomes computationally preferable over the repeated application of its single-objective version (fitted Q-iteration; FQI) when evaluating more than five weight combinations. In the Hoa Binh case study, the operating policies computed with MOFQI and FQI have comparable efficiency, while MOFQI provides a continuous approximation of the Pareto frontier with no additional computing costs.
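The abstract's key idea (augmenting a tree-based action-value regressor with the objectives' weight vector, and enriching the batch dataset by replicating transitions under sampled weights) can be illustrated with a minimal sketch. This is not the authors' code: it assumes a dataset of one-step transitions (s, a, r, s') with vector-valued rewards and a finite action set, uses scikit-learn's ExtraTreesRegressor as the tree-based approximator, and all function and variable names are illustrative.

```python
# Minimal MOFQI-style sketch: fitted Q-iteration over the augmented input (s, a, w).
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor  # tree-based regressor, as in tree-based FQI

def mofqi(transitions, actions, n_weights=20, gamma=0.99, n_iterations=50):
    """Learn a single Q(s, a, w) valid for any convex combination w of the objectives."""
    s, a, r, s_next = transitions            # arrays of shape (N, ds), (N, 1), (N, no), (N, ds)
    n, n_obj = r.shape

    # Batch-mode enrichment: replicate each transition under several weight vectors
    # sampled from the simplex, with no further interaction with the controlled system.
    w = np.random.dirichlet(np.ones(n_obj), size=n_weights)
    S  = np.repeat(s, n_weights, axis=0)
    A  = np.repeat(a, n_weights, axis=0)
    Sn = np.repeat(s_next, n_weights, axis=0)
    W  = np.tile(w, (n, 1))
    R  = np.sum(np.repeat(r, n_weights, axis=0) * W, axis=1)   # scalarised reward r . w

    X = np.hstack([S, A, W])                                    # regressor input (s, a, w)
    q = ExtraTreesRegressor(n_estimators=50).fit(X, R)          # Q_1 = immediate reward

    for _ in range(n_iterations - 1):
        # Bellman backup: target = r . w + gamma * max_a' Q(s', a', w)
        q_next = np.column_stack([
            q.predict(np.hstack([Sn, np.full((len(Sn), 1), act), W])) for act in actions
        ])
        y = R + gamma * q_next.max(axis=1)
        q = ExtraTreesRegressor(n_estimators=50).fit(X, y)

    return q  # greedy policy for any weights w: argmax_a q.predict([s, a, w])
```

In this sketch, evaluating a new weight combination only requires querying the learned regressor with a different w, which mirrors the abstract's claim that MOFQI covers all weight combinations in a single training process.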
Research Article | 2 January 2013
Tree-based fitted Q-iteration for multi-objective Markov decision processes in water resource management
F. Pianosi, A. Castelletti and M. Restelli
Dipartimento di Elettronica e Informazione, Politecnico di Milano, Piazza L. da Vinci 32, I-20133 Milano, Italy
E-mail: [email protected]
Journal of Hydroinformatics (2013) 15 (2): 258–270.
Article history
Received: 31 October 2011
Accepted: 5 July 2012
Citation
F. Pianosi, A. Castelletti, M. Restelli; Tree-based fitted Q-iteration for multi-objective Markov decision processes in water resource management. Journal of Hydroinformatics 1 April 2013; 15 (2): 258–270. doi: https://doi.org/10.2166/hydro.2013.169