In this article we present a novel nested dynamic programming (nDP) algorithm for multipurpose reservoir optimization with additional decision variables related to different water users. The nDP algorithm is built from two algorithms: (1) dynamic programming (DP) and (2) nested optimization algorithm implemented with Simplex and quadratic Knapsack methods. The novel idea is to include a nested optimization algorithm into the DP transition that reduces the initial problem dimension and alleviates the DP's curse of dimensionality. The nDP can solve multi-objective optimization problems, without significantly increasing the algorithm complexity and the computational expenses. Computationally, the nDP can handle dense and irregular variable discretization; it is coded in Java as a prototype application and has been successfully tested with eight objectives at the Knezevo reservoir in the Republic of Macedonia. The article presents a discussion on comparison of nDP with other DP methods and highlights the advantages of nDP.

## INTRODUCTION

Historically, the two most widely used methods for optimal reservoir operation have been dynamic programming (DP) and stochastic dynamic programming (SDP). These two methods suffer from the so-called ‘dual curse’ which prevents them to be used in reasonably complex water systems. The first one is the ‘curse of dimensionality’ that denotes an exponential growth of the computational complexity with the state–decision space dimension (Bellman 1957). The second one is the ‘curse of modelling’ that requires an explicit model of each component of the water system (Bertsekas & Tsitsiklis 1995) to anticipate the effect of each system's transition. The literature offers various strategies to overcome the curse of dimensionality such as successive approximations, incremental DP, and differential DP (Jacobson & Mayne 1970; Bellman 1986; Anvari *et al.* 2013; Li *et al.* 2013). The application of various DP and SDP methods in optimal reservoir operation are reviewed in Yeh (1985) and for multi-reservoir systems in Labadie (2004). One of the additional issues arises is when assessment of the constraints and objective functions requires running complex models (for example, river, a model to calculate water levels or flows downstream). One of the possible solutions is the use of fast approximate surrogate models (e.g., artificial neural networks replicating the behaviour of a hydrodynamic river model (Solomatine & Torres 1996)) – this reduces the total running time but still for real-life problems the use of DP can be prohibitively long.

This paper addresses the problem of optimal reservoir operation concerning multiple objectives that are related to: (1) reservoir releases to satisfy several downstream users competing for water with dynamically varying demands; (2) deviations from the target minimum and maximum reservoir water levels; and (3) hydropower production that is a combination of the reservoir water level and the reservoir releases.

Addressing such a problem with classical DP requires a reasonably high level of discretization of the reservoir storage volume, which in combination with the required releases discretization for meeting the demands of downstream users leads to computationally expensive formulations and causes the curse of dimensionality.

We present an alternative approach, in which at each transition of the classical DP an additional optimization algorithm is run to identify the optimal releases (allocations) to individual users. Because this second optimization algorithm is ‘nested’ inside the classical DP algorithm we name this method ‘nested dynamic programming (nDP)’. Depending on the way we formulate the objective function related to deficit in the allocation problem in the nested optimization, two methods are implemented: (1) Simplex for linear allocation problems and (2) quadratic Knapsack method in the case of non-linear problems.

The presented nDP algorithm was coded as a prototype application in Java using Eclipse. The DP part of the nDP was coded by using the example from the book by Loucks & Van Beek (2005). The nested optimization algorithms, Simplex and quadratic Knapsack methods are developed as separate modules. The nDP algorithm was tested at the multipurpose reservoir Knezevo of the Zletovica hydro-system located in the Republic of Macedonia, with the purpose of urban water supply, agriculture, ensuring ecological flow and generation of hydropower.

The article is organized in six sections. After the Introduction, the second section presents the problem, formulation of variables, objective functions and the nDP algorithm. The third section introduces the case study, which is followed by a section presenting the implementation of the nDP for this case. Results and discussion follow, then a section including the comparisons of the nDP with others DP methods, which is followed by conclusions.

## THE PROBLEM AND THE NDP ALGORITHM

### nDP framework

*n*competing users, and this multiplies the total number of decision variables. This problem, if posed in the DP set-up using the Bellman equation (Bellman 1957) (for stages

*t*=

*T*−1,

*T*−2, …1)

where *s _{t}* is the state vector representing discrete reservoir storage volume at the beginning of the period

*t*;

*T*is the number of stages in the sequential decision process; is the state value function; is the actions or decision variables vector during the period

*t*; is the reward from period

*t*when the current state is

*s*, the action

_{t}*a*is executed and the resulting state is

_{t}*s*

_{t}_{+1}.

*q*is the reservoir inflow,

_{t}*e*are evaporation losses and

_{t}*r*is the total reservoir release.

_{t}The nDP contains a nested optimization algorithm inside the DP algorithm that optimally allocates the total reservoir release *r _{t}* to different users corresponding to their demands

*d*, as shown in Figure 1.

_{it}Assuming a backward-moving DP, at each nDP transition, the reservoir beginning and end states are known. From the mass balance Equation (2), the release *r _{t}* can be calculated and then nested optimization algorithm is run to identify allocation of

*r*between

_{t}*n*users. The inputs of the nested optimal allocation algorithm are the reservoir release, the users' demands and their relative importance, while the decision variables are the water volumes given to users (called subsequently the ‘users' releases') for satisfying their demands. The overall objective function may include terms related to users' releases, deviations from target reservoir levels, or both releases and levels (e.g., hydropower). At the end of the transition, all decision variables are estimated and the reward is calculated.

The nDP pseudo-code is presented below:

Discretize storage

*s*and_{t}*s*_{t}_{+1}in*m*intervals, i.e.,*s*(_{i,t}*i*= 1, 2, …,*m*),*s*_{j}_{,t+1}(*j*= 1, 2, …,*m*).Set time at

*t*=*T*− 1.Set reservoir level

*i*= 1(for time step*t*).Set reservoir level

*j*= 1(for time step*t*+ 1).Calculate the first group of the objective functions (related to deviations from target reservoir levels).

Calculate the total release

*r*using Equation (2)._{t}If total release

*r*is bigger or equal to total demand then return 0, and go to step 9._{t}Execute the nested optimization algorithm to allocate the total release to all users {

*r*_{1t},*r*_{2t}, …*r*} in order to meet their individual demands._{nt}Calculate the second group of the objective functions (related to users' releases).

Using the reservoir levels and the user releases, calculate the third group of the objective functions (related to hydropower production).

Combine the objective functions from steps 5, 8 and 9 into the aggregate objective function

*V*(*s*)._{t}*j*=*j*+ 1.If

*j*≤*m*, go to step 5.Select the optimal actions (decision variables) {

*a*_{1t},*a*_{2t}, …*a*}_{nt}_{opt}, which consist of the optimal transition {*s*_{t}_{+1}}_{opt}and the users' releases {*r*_{1t},*r*_{2t}…*r*}_{nt}_{opt}that give minimal value of*V*(*s*)._{t}*i*=*i*+ 1.If

*i*≤*m*, go to step 4.If

*t*= 0, stop.*t*=*t*− 1.Go to step 3.

The optimal allocation algorithm is incorporated (nested) in the DP method (step 7) and directly updates the state value function *V*(*s _{t}*) at each time step consequently changing the optimal reservoir policy and solving the multi-objective optimization problem.

The action vector {*a*_{1t}, *a*_{2t}, … *a _{nt}*} consists of the transition state

*s*

_{t}_{+1}, and the users' releases {

*r*

_{1t},

*r*

_{2t}, …

*r*}. The corresponding total release

_{nt}*r*, at each transition is calculated from Equation (3). The

_{t}*e*evaporation losses can be calculated at each transition using the reservoir area that is a function of the reservoir storage volume

_{t}*s*, and other factors.

_{t}### Approaches to nested optimization of step-wise resource allocation

Depending on the problem formulation various methods can be used to optimally allocate the total reservoir release *r _{t}* between

*n*water users. Optimality could be understood in different ways and this is reflected in the objective function. We have chosen to use two methods: Simplex method in the case of linear problem and quadratic Knapsack algorithm for quadratic problem. Each water user is described with its demand

*d*and corresponding weight

_{it}*W*at time step

_{it}*t*. For the nested optimal allocation the following variables are relevant:

*d*

_{1t},

*d*

_{2t}, …

*d*are users' demands;

_{nt}*W*

_{1t},

*W*

_{2t}, …

*W*are the corresponding demands' weights;

_{nt}*r*is the reservoir release;

_{t}*r*

_{1t},

*r*

_{2t}, …

*r*are the users' releases,

_{nt}*v*is the release discretization value.

*r*can satisfy the aggregated demand of all users in Equation (3), step 7 in the nDP pseudo-code. If the release

_{t}*r*can satisfy the aggregated demand of all users, then the optimal allocation is not performed since all the releases can be set to their demands.

_{t}#### Simplex method

#### Quadratic Knapsack method

*r*is assumed to be discretized in

_{t}*v*levels; this value is set at the beginning and stays the same over nDP execution. The quadratic Knapsack objective function is to minimize with the same constraints previously described in Equations (5)–(7).

## CASE STUDY

The hydro system Zletovica is located in the eastern part of the Republic of Macedonia and it uses the water resources of the river Zletovica and its tributaries. It is composed of the reservoir Knezevo, several water distribution canals used for delivering water to downstream users and associated infrastructure structures, as shown in the schematic representation in Figure 2. The reservoir Knezevo is a multipurpose reservoir, and its main objective is to provide drinking water to several towns and populated areas in the region, as well as to provide environmental flows in Zletovica, water for agriculture and hydropower (in this order of decreasing priority).

There are five towns in this region (Kratovo, Probishtip, Zletovo, Shtip and Sveti Nikole) and two large agricultural irrigation regions named upper and lower zone. The town Kratovo is not modelled because the intake for this town is upstream from the Knezevo reservoir and only a small water quantity is required. The region is characterized with mountainous topography and the system is designed to include several small hydropower plants, most of which are located downstream of the reservoir Knezevo as derivational (run of river) power plants that utilize the natural head differences created by the topography. The hydropower system is still in development and according to the feasibility study report (GIM 2010) the plan is to build eight hydropower plants.

The GIM (2008) report contains a detailed hydrological model of the Zletovo river basin with four river flow measurement points on the river Zletovica. The monthly time series data of the flow measurement points from the year 1951–1990 are used in this research. There is a significant tributary inflow between the Knezevo reservoir and the last branching point. First, the tributary inflow is used to satisfy the water demand objectives, and if additional water quantities are needed then they are released from the reservoir.

## APPLICATION OF NDP IN THE CASE STUDY

### Formulation of the optimization problem

*O*and six decision variables. Each objective is described by its target value and its corresponding weight at each time step. In this study, we aggregate all objectives into one objective function being the weighted sum of squared deviations over the entire time horizon; referring to the Bellman Equation (1) the reward function has the following form: where

_{i}*s*is the reservoir storage volume at time step

_{t}*t*,

*w*is the objective weight for a given objective

_{it}*i*and time step

*t*, and

*D*is the difference between the target value and decision variable for a given objective

_{it}*i*and time step

*t*,

*c*is the balance coefficient explained below in this paper.

_{i}*l*and max

_{t}*l*are the recreation-related (minimum) and flood-related (maximum) water level targets.

_{t}*d*

_{3t},

*d*

_{4t},

*d*

_{5t},

*d*

_{6t},

*d*

_{7t}. The water deficits are calculated using Equation (12) where

*r*is the release (decision variable) for a given objective (

_{it}*i*) and time step (

*t*).

*w*

_{8t}as the hydropower energy production weight and

*D*

_{8t}is calculated from where

*d*

_{8t}is the hydropower target demand and

*p*is the hydropower production.

_{t}The action vector *a _{t}* consist of six actions or decision variables: s

_{t+}_{1},

*r*

_{3t},

*r*

_{4t},

*r*

_{5t},

*r*

_{6t},

*r*

_{7t}which are the next optimal reservoir state and water user releases at each time step. Using these decision variables, it is possible to calculate all other variables and objective functions.

*r*goes through the turbines of HEC0. The reservoir release

_{t}*r*is compared to the generator maximum water capacity HEC0

_{t}_{max}, as in Equation (14).

*r*

_{0t}is the reservoir release quantity that goes over the turbines of HEC0. The energy generated by HEC0 (MWh) is The gen0 is the coefficient that includes total efficiency and it is taken from the GIM (2008) report;

*l*and

_{t}*l*

_{t}_{+1}are reservoir water levels at time

*t*and

*t*+ 1. The others HEC1–HEC3 and HEC4 are calculated similarly, with one difference – for these derivational HECs the generated energy does not depend on the reservoir water levels directly, but only on the releases from the reservoir, which (together with corresponding tributary flows) constitute the total flow for each HEC. The actual head for these HECs is fixed (by topographical differences in elevation). The calculation of the hydropower production of HEC1–3 and HEC4 is somewhat more complex, so these details are not presented here as they are beyond the focus of this article. All hydropower plants together produce the total energy

*p*.

_{t}Since in Equation (9) various objective function types are aggregated into one, their values need to be adjusted so they are comparable in order of magnitude. The approach used was to assign the appropriate coefficients (*c _{i}* in Equation (9)). A similar approach to balancing was taken in other previous research studies, e.g., Pianosi & Soncini-Sessa (2009) and Rieker (2010).

The nested algorithm is selected in the beginning of optimization. The nDP is constructed in a way that first the tributary flow (downstream of the Knezevo reservoir) is optimally divided between users' demands using the previously selected nested optimization algorithm. Afterwards, the remaining, still unsatisfied user demands are met from the reservoir releases by applying the optimization as described above.

### Optimization algorithm settings

The nDP algorithm was tested on the Zletovica hydro system using 40-year monthly data from 1951 to 1990, which resulted in 480 time steps. The reservoir operation volume is 23 × 10^{6} m^{3} which was discretized in 115 equal levels (200 × 10^{3} m^{3} each). The target for the recreation reservoir level was set at 1,035 [amsl], for the flood control reservoir level at 1,058 [amsl] and for the hydropower production at 1.5 GW-h per month. The targets for the consumptive users were set to their predetermined monthly demands.

To test and evaluate the proposed nDP algorithm and to compare results using the two nested optimization methods, Simplex and quadratic Knapsack, four different scenarios were considered, as shown in Table 1. They are based on two sets of weights for objective function and two formulations for the deficits in the nested optimization problem: (a) linear (hence the Simplex method, Equation (4)) named L_{1} and L_{2}; and (b) quadratic (the quadratic Knapsack method, Equation (8)) named Q_{1} and Q_{2}. In the quadratic Knapsack the discretization of allocation was set to 50. The higher weights are set for urban water supply objectives (*w*_{3} and *w*_{5}) followed by ecological flow, irrigation, reservoir water level objectives and hydropower. The second set of weights gives higher priority to one of the urban water supply objectives (*O*_{5}) at the expense of the two irrigation objectives (*O*_{4} and *O*_{6}).

Scenarios | w_{1} | w_{2} | w_{3} | w_{4} | w_{5} | w_{6} | w_{7} | w_{8} |
---|---|---|---|---|---|---|---|---|

L_{1} and Q_{1} | 0.05 | 0.05 | 0.25 | 0.1 | 0.25 | 0.1 | 0.16 | 0.04 |

L_{2} and Q_{2} | 0.05 | 0.05 | 0.25 | 0.075 | 0.3 | 0.075 | 0.16 | 0.04 |

Scenarios | w_{1} | w_{2} | w_{3} | w_{4} | w_{5} | w_{6} | w_{7} | w_{8} |
---|---|---|---|---|---|---|---|---|

L_{1} and Q_{1} | 0.05 | 0.05 | 0.25 | 0.1 | 0.25 | 0.1 | 0.16 | 0.04 |

L_{2} and Q_{2} | 0.05 | 0.05 | 0.25 | 0.075 | 0.3 | 0.075 | 0.16 | 0.04 |

## RESULTS AND DISCUSSION

^{4}–5, respectively. For all deviations/deficits, the sum over the entire horizon is calculated using Equation (16)

Depending on the formulation that is used in the nested optimization, the obtained differences in results can be observed in Figures 3^{4}–5. The quadratic formulation in Q_{1} and Q_{2} scenarios influence the *D*_{1} and *D*_{2} deviations, producing better results compared with the L_{1} and L_{2} scenarios, respectively, as can be seen in Figure 3. The *D*_{3} and *D*_{5} deficits in the L_{1} and L_{2} scenarios are at the expense of *D*_{4} and *D*_{6} (agriculture) that have higher deficits, as shown in Figure 4. When comparing L_{1} and L_{2} with Q_{1} and Q_{2} scenarios, it can be seen that *D*_{3} and *D*_{5} have less deficits when a linear formulation is used. The quadratic formulation (quadratic Knapsack) makes a more balanced distribution between the water demand users, which is due to its objective function, shown in Equation (8). The quadratic formulation objective function alleviates the *D*_{6} peak deficits that occur in L_{1} and L_{2} scenarios, at the expense of *D*_{3}, *D*_{4} and *D*_{5}. The quadratic formulation has better results concerning *D*_{8} hydropower deficits, as shown in Figure 5.

These differences in results when using linear or quadratic formulations in the nested algorithm come from the properties of the objective functions as described in Equations (4) and (8). Linear objective functions have the known characteristic that optimal solutions are always corner-point feasible solutions, which means that when the objective function is formulated as a weighted sum of separate linear objectives, the optimal solution will favour objectives (‘users' in our case) with higher weights. Therefore, *D*_{5} and *D*_{3} (which are with the highest weights) show smaller deficits for L_{1} and L_{2} than with Q_{1} and Q_{2}. The quadratic formulation allows for a more balanced solution overall (with better trade-offs) and in fact alleviates huge deficits obtained with linear formulation for users with lower weights. However, linear formulation performs faster, and depending on the problem, targets and the ‘desired’ optimal reservoir operation, the modeller can choose between the linear and quadratic formulation. It is also important to note that the proposed solution is general in the sense that it allows for inclusion of other nested algorithms, and it is not restricted to the two proposed and tested here.

The *w*_{5} (corresponding to the urban water supply) is the highest weight in the L_{2} and the Q_{2} scenarios shown in Table 1, and consequently, it is the most satisfied water demand user, as shown in Figure 4. The *D*_{3} deficit in the L_{1} scenario is zero, but it has a significant increase due to the *w*_{5} increase in the L_{2} scenario, although *w*_{3} weight is unchanged, as shown in Figure 4. The *D*_{4} and *D*_{6} deficits are slightly increased in L_{2}, compared to the L_{1} scenario. The *D*_{3} and *D*_{5} deficits are lower in Q_{2} compared to the Q_{1} scenario. These deficit reductions are achieved at the expense of *D*_{4} and *D*_{6} deficits. When comparing L_{1}, L_{2}, Q_{1} and Q_{2} scenarios considering *D*_{1}, *D*_{2}, *D*_{3}, *D*_{7} and *D*_{8} deficits and their weights, it is obvious that the results are different even without changes of the corresponding weights, as shown in Figures 4^{5}–6. This happens because the change in one or more weights, in this case *w*_{4}, *w*_{5} and *w*_{6}, modifies the main objective function and, consequently, produces different results. The presented scenarios demonstrate that by changing the weights in accordance with the user preferences it is possible to create different optimal reservoir operations.

As it is difficult to present the 40 years nDP optimization results' graphs, a 3-year (1985–1987) sample was selected from the Q_{2} scenario to show some characteristic results. Figure 6 shows the variations of reservoir inflow, tributaries inflow, reservoir volume and reservoir outflow. Figure 7 shows the recreation and flood target’ deviations. Figure 8 shows water users' demands. Figure 9 shows water users' deficits and Figure 10 shows hydropower deficits. This 3-year period is a combination of two dry and one wet year in between them.

This 3-year period is characterized by the significant reservoir and tributary inflows in the spring months (from February until May) due to high precipitation and snowmelt, which is relatively small in the other periods, as shown in Figure 6. The reservoir releases are rising between June and October because of the increased demand of the agricultural users, as shown in Figures 6 and 8. The town users (*d*_{3t} and *d*_{5t}) and the ecology (*d*_{7t}) demands are relatively constant, while the agriculture users' (*d*_{4t} and *d*_{6t}) demands are variable reaching a maximum in the summer months, as shown in Figure 8. The reservoir inflow and the tributary inflow in 1986 are larger and more widely distributed between January and June compared to 1985 and 1987 when there are high reservoir inflows in April and May, as shown in Figure 6. The difference between 1986 and the other 2 years, 1985 and 1987, is shown in the reservoir release shown in Figure 6, the water users' deficits in Figure 9 and hydropower deficits shown in Figure 10. The majority of users' deficits occur from April until October in 1985 and 1987, as shown in Figure 9. The deficits *D*_{4} and *D*_{6} (agriculture) are larger than *D*_{3} and *D*_{5} (towns) deficits in 1985 and 1987, because of the smaller weights *w*_{4} and *w*_{6} and their larger demand, as shown in Figures 8 and 9. If the linear formulation results were presented using the *L*_{1} or the *L*_{2} scenarios, the difference between the agriculture and the town's deficits would be even larger.

The reservoir satisfies (almost) all objectives in 1986, as shown in Figures 9 and 10. There are reservoir flood deviations (*D*_{2}) in May 1985 and 1987 because of peak reservoir inflows, as shown in Figures 6 and 7. On the other side, there are recreational level deviations (*D*_{1}) in October and December 1985. This coincides with the reservoir volume shown in Figure 6, which is lowest in periods between November and January. The last months of 1987 suggest that probably the next year is very dry. The reservoir stores additional volume in winter months (October–December 1987) at the expense of water demand users and the hydropower, as shown in Figures 6, 9 and 10.

The nDP algorithm was tested with a variable reservoir volume discretization using the Q_{2} scenario settings. The reservoir storage volume was discretized: (1) from 0 to 5 × 10^{6} m^{3} and from 22 × 10^{6} to 23 × 10^{6} m^{3} in intervals of 200 × 10^{3} m^{3}; and (2) the rest was discretized in intervals of 150 × 10^{3} m^{3} (resulting in the 144 discretization levels). The higher discretization has directly increased the computational resources and the running time, but it has improved the results (1–2%). This test shows that the nDP algorithm allows for higher levels of reservoir level discretization without prohibitive computational time, and for each problem, a balance between the desired level of accuracy and the computational time and resources can be determined. In addition, the nDP algorithm was tested with variable objective weights at different time steps.

In designing the optimal reservoir operation, one needs to be aware of the objectives' interdependence, in our case, users' releases, reservoir level and hydropower. The hydropower production is a function of the total users' releases (reservoir release) and the reservoir level. The hydropower weight influences both the total users' releases and the reservoir level. An additional scenario was executed based on the Q_{2} scenario settings, where the highest priority (weight) was set to the hydropower. The scenario result showed that the reservoir was filling to the highest possible level and afterwards made significant releases, increasing the hydropower production. The increased hydropower production directly influenced the users' releases and the reservoir levels. This shows that although the main objective function is composed of a sum of weighted objectives, there is additional complexity because of the objectives' interdependencies, which means that changing the objective weight influences other dependent objectives and the overall results. The desired optimal reservoir operation can be designed by tuning the weights.

The nDP makes an exhaustive search over all possible states and actions, executing the nested optimization at each transition, and making a significant number of calculations. For the presented set-up, the scenario execution time was under 5 minutes on a standard desktop computer. As expected, the execution time of the Q_{1} and the Q_{2} scenario was longer because of the more complex calculation in the quadratic Knapsack than in the Simplex optimization algorithm. The minor obstacle for the nDP implementation was the memory requirements that in our case were solved by increasing the maximum memory allocation pool for the Java Virtual Machine.

## DISCUSSION ON COMPARISON OF NDP WITH OTHER DP ALGORITHMS

### nDP compared with a classical DP algorithm

The presented case study has five water demand objectives. The classical DP approach would model these objectives as five different releases. Since the DP algorithm is an exhaustive search, all possible states and actions need to be evaluated. Let us assume that the discretization is the same and that the five releases are discretized on 10 × 10^{3} m^{3}.

The DP algorithm makes a transition between all states. Let us consider only one transition from full to empty reservoir. In this simple calculation, the possible combinations of five releases are below 2,300^{5}. There are 480 time steps and over 3,000 transition at each time step. A simple assessment shows that the total number of DP transitions is around 10^{23}, which makes this problem computationally unsolvable. For the applications considered, the usage of classical DP becomes impossible when there are several decision variables.

### nDP compared with an aggregated water demand DP algorithm

To decrease the optimization problem dimension and to allow for DP implementation a simplified sub-optimal approach can be taken: the water demand objectives can be combined into a single aggregated objective so that at the first stage the standard DP optimization can be employed. The resulting releases are quasi-optimal since only the aggregated demand is optimized rather than the individual ones. At the second stage, these quasi-optimal reservoir releases are distributed between the different water demand users for each time step according to their importance. Such an approach can be called the aggregated water demand (AWD) DP algorithm. It would be good to compare AWD DP with nDP but using the case considered in this paper it appeared to be difficult.

The new optimization problem now has four objectives, (1) recreation level target, (2) flood level target, (3) (aggregated) water demand user and (4) hydropower, and it can be solved by AWD DP. In the first stage, the DP algorithm is executed with the four objectives and the total optimal reservoir releases identified. In the second stage, these releases are be distributed at each time step between the five users, forming allocations , , , and .

Let us assume that hydropower is a priority, and only the agriculture-related allocations and are used for producing hydropower. In this case, the AWD approach cannot be applied because it is difficult to create an objective function in DP that will represent the hydropower weight when all water demand users are aggregated (since hydropower is generated only by two out of four allocations). The solution for this problem would be to divide the water demand users and their allocations into two groups, (a) , and (b) , , , and in this way to represent the hydropower in the objective function. This will introduce the second decision variable in DP that considerably increases the dimensionality of the problem. If we make this case more complicated and assume that the water demand user is very important compared to the other users, and needs to be represented as a separate decision variable in DP, then the application of the classical AWD would become even more computationally demanding.

An alternative is to use the nested approach, as introduced in this paper, which makes it easy to introduce a large number of separate objective functions (e.g., hydropower or *d*_{3t} prioritization as discussed above).

## CONCLUSIONS

This article presented a novel nested approach to solving a DP problem that allows for several objective functions at each time step. The use of the nDP algorithm was implemented in the Zletovica river basin case study with eight objectives and six decision variables. The nDP algorithm has the following advantages:

It effectively alleviates the curse of dimensionality of a standard DP algorithm.

It allows for considering all objective functions at each time step and hence allows for deeper optimization if compared to the sub-optimal approach where all water demands are aggregated (AWD DP).

Computationally, nDP runs fast on standard personal computers. In the presented case study, optimization for one scenario required no more than 5 minutes on a standard desktop PC.

The algorithm allows for employing dense and variable discretization of the reservoir volume and release.

It supports using a variable (time-dependent) weight at each time step for every objective function.

It provides a framework for including many objectives in the nested optimization algorithm without a significant change in the source code or an increase in the computational expenses.

Different optimization algorithms can be used in the nesting for water allocation; however, since nested optimization has to be repeated multiple times (for each transition of DP) the algorithm used for this purpose needs to be fast. Thus, in the presented nDP framework we used Simplex and quadratic Knapsack algorithms which showed excellent performance.

The nDP framework can support complex non-linear objective functions.

The presented nesting approach can be applied not only in the deterministic DP context, but also in SDP, reinforcement learning and other optimization algorithms.

One of the limitations of the presented study is the fully deterministic formulation, without explicit consideration of uncertainties, so characteristic of the reservoir operation. Therefore, further research will be focussed on implementing the nDP approach in a SDP setting. In addition, it is planned to test the nested optimization approach in the problems solved by reinforcement learning.