Does payment by results work? Lessons from a multi-country WASH programme

Payment by results (PbR) for financing public services has attracted increasing interest over recent years in the water, sanitation, and hygiene (WASH) sector. PbR is attractive to funders as a mechanism because it focuses attention on results rather than inputs, and because it transfers a proportion of risk to suppliers. This paper reviews the experience of the UK Department for International Development (DFID) funded WASH Results Programme (WRP), which used PbR, drawing on a process evaluation and on the experience of the first author in commissioning the programme and of the second author in evaluating it. The WRP met its targets for people reached with first-time access to water and sanitation and generated high-quality programme data. The PbR mechanism provided strong incentives to the suppliers to improve their monitoring systems. However, the suppliers tended to use tried and tested approaches, with limited innovation. It is critical to consider certain key elements in the design of PbR programmes, including the proportion of funding that uses PbR and the proportions of PbR that focus on outputs and on outcomes.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).
doi: 10.2166/washdev.2020.039
Guy Howard (corresponding author), Department of Civil Engineering, University of Bristol, University Walk, Bristol BS8 1TR, UK. E-mail: guy.howard@bristol.ac.uk
Zach White, Water, Sanitation, and Hygiene Team, Oxford Policy Management, Clarendon House, Level 3, 52 Cornmarket Street, Oxford, OX1 3HJ, UK

through the Department for International Development (DFID). By the end of 2013, 71% of centrally issued DFID contracts had a performance-based element. In 2014, DFID published its strategy on PbR, which included commitments to make PbR its 'business as usual' approach in contracts with its suppliers, rather than the exception (DFID b).
increases in delivery costs in some cases; no evidence of extensive innovation and that suppliers were risk-averse; and limited evidence of 'gaming' the PbR system.
PbR programmes can deploy downside and upside incentives to the suppliers contracted. Downside incentives mean that failure to meet expected targets results in a loss of payment. Upside incentives reward suppliers with additional payments should the desired results be exceeded. Clist () concluded that a key element for the success of PbR is the ability and willingness of the funder to withhold payment, especially in relation to non-governmental organisations (NGOs). This focus on downside incentives, however, tends to make suppliers risk-averse and to stifle innovation.
This paper reviews the experience of one WASH programme that used PbR, the WASH Results Programme (WRP), to examine how PbR has worked in practice and the lessons learnt. The paper examines the key question of whether the programme delivered the expected results for the funder of the programme. The paper does not assess whether PbR worked better than other approaches to funding WASH. A fair comparison would require data from comparable programmes with the same set of target results, similar geographies and similar timeframes, and such data were not collected.

THE WASH RESULTS PROGRAMME
The WRP is a £111 million programme running from 2013 to 2022. It was originally designed to meet a UK Government target for numbers of people gaining access to WASH between 2010 and 2015, but was extended in 2016 to support the achievement of greater numbers of people gaining first-time access to WASH between 2015 and 2020.
The WRP is delivered through three supplier contracts: the South Asia WASH results programme (SAWRP) consortium, led by Plan International; the Sustainable WASH in Fragile Contexts (SWIFT) consortium, led by Oxfam; and the SNV Sustainable Sanitation and Hygiene for All (SSH4A) programme. The contracts include two distinct sets of results associated with two phases of activity: an output phase, focused on ensuring first-time access to sanitation and/or water supply supported with hygiene education, and a subsequent outcome phase, focused on maintaining use of the sanitation and/or water supplies constructed and maintaining good hygiene practice. Table 1 summarises the suppliers' programmes and the number of people expected to be reached. The WRP achieved, and in most cases exceeded, the results targets established. At the end of the output phase under the original contracts, the WRP projects had provided over 1 million people with first-time access to water supply and over 4 million people with first-time access to sanitation, and had reached over 10 million people with hygiene messages (DFID, ). The outcome phase focused on continued use of services as shown in Table 1.
At the start of the programme, an independent team of sector experts was contracted to verify that the results claimed by suppliers had been achieved and that payments could therefore be made. Verification was primarily based on systems appraisals, which assessed whether supplier monitoring and data systems were robust and reliable and were therefore producing credible data, with limited spot-checks in the field. An autonomous team undertook both process and impact evaluations.
The WRP used only downside incentives, operating on a sliding scale down to 70% of the target, after which full payment would be lost. The WRP was designed as '100% PbR', that is, all payments were contingent on the achievement of a pre-defined set of results. However, payments were not solely linked to output- and outcome-level 'results': interim results were set within each phase to allow payments to be made throughout the programme.
Many early payment triggers were essential programme activities and inputs (e.g. training workshops).
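The downside-only sliding scale described above can be illustrated with a minimal sketch. The source states only that the scale ran down to 70% of the target, below which full payment was lost; the pro rata relationship between achievement and payment, the function name `pbr_payment`, and its parameters are assumptions made for illustration and are not taken from the WRP contracts.

```python
def pbr_payment(achieved: float, target: float, full_payment: float,
                floor: float = 0.70) -> float:
    """Payment due for one result under a downside-only sliding scale.

    Assumption (not stated in the source): between the floor and the
    target, payment is pro rata to the share of the target achieved.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    share = achieved / target
    if share < floor:
        # Below the floor (70% of target by default) the payment is lost.
        return 0.0
    # Downside incentives only: exceeding the target earns no bonus,
    # so the share is capped at 1.0.
    return full_payment * min(share, 1.0)
```

Under these assumptions, a supplier that exceeds its target is paid the same as one that meets it exactly, while one that falls just short of 70% receives nothing. This asymmetry is one way to see why a downside-only design may encourage the risk-averse behaviour discussed later in the paper.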

METHODS
This paper draws on data collected through a process evaluation of the WRP (ePact ) and complements this with an analysis of the experience of the first author in the commissioning and management of the programme. We draw these two forms of evidence together through a critical analysis, with greater weight given to empirical data than to expert experience. The analysis of the evaluation findings and expert opinion is structured primarily to address the needs of the funder, as the data collection was designed to meet this purpose.
The data collected through the evaluation included: key informant interviews with DFID management and the global management of the three supplier consortia; interviews with programme management in the 11 WRP countries; and case studies in 4 of the 11 countries, which included interviews with field-level staff, government counterparts, and beneficiaries. The primary data were collected over the course of the evaluation, supplemented by a literature review, a review of programme documentation, and reviews of the programme annual reviews and business cases.
This evaluation drew on elements of contribution analysis and realist evaluation to assess the degree to which the PbR modality influenced implementation. This paper focuses on one of the core hypotheses explored in the evaluation, namely that the introduction of a PbR modality helped to achieve intended outputs and outcomes. This assertion is based on three related propositions: 1. the programme and its PbR modality allowed flexibility of implementation approach within the sub-programmes, which helped to achieve output and outcome objectives; 2. stronger monitoring systems as a result of the PbR modality increased the likelihood of achieving intended outputs and outcomes; and 3. the results-oriented problem-solving promoted under the PbR modality increased the likelihood of achieving intended outputs and outcomes.
The analysis presented here does not cover the full breadth of the evaluation but provides a more detailed assessment of these three critical propositions and explores key issues of interest to funders seeking to use PbR. This includes the degree to which the PbR modality helped to achieve DFID's stated market-shaping objectives.
There are limitations to the study presented here, most importantly that the data collected and analysed do not permit a comparative assessment of PbR performance, for the reasons noted above, and that a detailed value for money analysis was not possible. In addition, the focus on the needs of the funder means that the experience of, and impact on, suppliers and communities is not fully captured.

RESULTS
Did the PbR modality allow the flexibility of implementation approach within the sub-programmes, which in turn helped to achieve output and outcome objectives?
While there was flexibility in how suppliers met their targets, there is little evidence of innovation. There were several important reasons for this, including that many of the design decisions were made by partners before the PbR modality was fully understood. Furthermore, in the face of high downside delivery risks, the partners generally adopted tried and tested approaches in contexts familiar to them, as these could predictably deliver results. The main flexibility that supported the achievement of the targets was the ability of the suppliers to use multiple projects to deliver results, meaning that shortfalls in one country could be offset by achievements in other countries.
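The offsetting effect described above can be shown with a small worked example. All country names and figures below are hypothetical and are not WRP data, and the sketch assumes that results are assessed against the contract-level total rather than country by country.

```python
# Hypothetical targets and achievements (people reached); not WRP data.
targets = {"Country A": 100_000, "Country B": 150_000, "Country C": 50_000}
achieved = {"Country A": 80_000, "Country B": 175_000, "Country C": 50_000}


def country_shortfalls(targets: dict, achieved: dict) -> dict:
    """Return the shortfall for each country that missed its own target."""
    return {
        country: targets[country] - achieved[country]
        for country in targets
        if achieved[country] < targets[country]
    }


def portfolio_share(targets: dict, achieved: dict) -> float:
    """Share of the combined target achieved across all countries."""
    return sum(achieved.values()) / sum(targets.values())


shortfalls = country_shortfalls(targets, achieved)
share = portfolio_share(targets, achieved)
print(shortfalls)
print(f"Portfolio achievement: {share:.1%}")
```

In this illustration one country misses its own target, yet the combined result still meets the contract-level target because another country over-delivers.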
The evaluation found that the removal of financial and activity reporting requirements did result in programme managers being able to more flexibly manage programmes, but that this flexibility tended to stay at the higher levels of programme management. There were two principal reasons for this. First, many consortium leads were unwilling to transfer that level of risk to partners as they did not believe they had the ability to pre-finance activities. Secondly, there was a perceived need to tightly manage partners and field teams to ensure results were delivered, and complete financial autonomy at that lower level was seen as too risky.
Did stronger monitoring systems as a result of the PbR modality increase the likelihood of achieving intended outputs and outcomes?
The systems-based approach to verification under the WRP gave the suppliers significant incentives to improve their monitoring systems. The suppliers made substantial investments in the reliability and robustness of these systems, which led to more robust data.
Did the use of the PbR modality contribute to DFID market-shaping objectives?
One of DFID's key aims in using PbR for the WRP was to build the supplier base and to attract new market entrants.
The WRP did identify NGOs capable of managing WASH programmes at a scale comparable to the United Nations Children's Fund, which provides one of the best benchmarks for larger-scale donor-funded WASH programmes.
However, given the limited number of NGOs able to demonstrate the ability to manage such large projects, market-shaping was limited. All the successful lead suppliers were organisations that either had significant WASH programmes in the proposed countries, allowing them to better manage financial risks, or were established consortia. No private sector organisation led a successful bid under the WRP, and, overall, the private sector response to this opportunity was more limited than had been anticipated by DFID.
One of the challenges in commissioning PbR projects is the availability of reliable price benchmarks for tender evaluation, which was a problem at the onset of the WRP.
As the suppliers worked across a wide range of contexts, it was challenging to assess what were reasonable variations in price caused by the delivery location, the particular groups targeted, and other qualitative factors.

DISCUSSION
The lessons from the WRP raise some interesting questions about the application of PbR in WASH programmes, and the use of PbR more broadly. The WRP experience suggests that the greatest value of PbR was in focusing supplier attention on specific aspects of programme implementation.

Flexibility in implementation
A key theorised benefit of PbR is that it allows for greater innovation (DFID a, b

Stronger monitoring and verification
System-based verification reinforced incentives for suppliers to improve their monitoring but potentially came at the cost of securing 'hard' independent data. This was a trade-off that was supported by the spot-checking of results by the verifiers to provide confidence that monitoring systems were reliable and robust. The use of spot-checks allowed issues of concern to be raised and discussed at payment decisions meetings, and in some cases resulted in deferred payments until such time as suppliers could provide reliable data. The alternative of verification through primary independent data collection would have significantly raised costs to the funder and would have reduced the incentives for suppliers to make their own reporting more robust.
The limited inception phase created issues for the efficiency of the monitoring. A longer inception phase would have allowed the more complex elements of verification of systems and in particular outcomes to be addressed.
Although the suppliers underestimated the additional costs of improving monitoring systems, the authors' view is that, for the WRP, the benefits of better-quality data from improved monitoring systems and strong technical verification justified those costs. However, these benefits were largely confined to improvements within the NGO partners, and there was limited impact on government monitoring practices and systems in the countries in which the suppliers were working.

Impact on outputs and outcomes
The WRP experience confirms the findings of Clist () that PbR works best when it is focused on issues of shared concern between supplier and funder and that selecting the right performance measures is critical to the success of a PbR programme. However, linking payment to the early 'results' packages, which were composed of activities, was inefficient to verify and did little to improve programming. This suggests that future programmes using PbR should consider using a mixed model of grant and PbR funding, applying the former to essential inputs and the latter to the delivery of target output results. Identifying strong performance measures at the outcome level proved more difficult than identifying those at the output level, partly because outcome measures were not agreed at the outset of the contracts. In future PbR programmes, greater attention should be given to resolving these questions early in design, and before programming starts.
NGOs willing to bid for such large contracts would do so in a context in which they would be moving into a country without pre-existing programmes. This suggests that non-geographically specific programmes may benefit from a wider market, but programmes focused on individual countries may not. It is noteworthy that no private sector organisation led a successful bid under the WRP, and, overall, the private sector response to this opportunity was more limited than had been anticipated by DFID.

CONCLUSION
The WRP has shown that PbR can be an effective financing instrument for rural basic WASH projects. It has delivered significant results at the output level and shown that these can be sustained at high levels for at least 2 years post-implementation. PbR was effective in promoting investment in monitoring systems by suppliers, and independent verification increased the quality and reliability of the results. As with previous uses of PbR, it is essential to agree strong performance measures that address issues that both suppliers and funders care about. The experience under the WRP highlights that the PbR modality did not stimulate innovation and, in fact, promoted risk-averse behaviour by suppliers. The WRP experience also indicates that future WASH programmes using PbR will need to incorporate an inception phase to agree performance measures and processes of verification, and to obtain clarity on outcomes. It is also recommended that future programmes apply PbR only to a portion of total funding, reserved for incentivising specific aspects of programming, and that essential project inputs are funded through grant mechanisms. There is a need for further evaluations of PbR in practice to establish where, how, and in what ways this modality can be most effectively used, and, in the coming years, to synthesise the emerging evidence from the PbR programmes that have finished and been evaluated.
The question of whether PbR should be used in preference to other modalities remains unanswered and will depend on the context of the programme design. Where there are clear targets and no expectation of innovation, PbR is well suited. Where objectives are more complex or where innovation is desired, other financing mechanisms may be more appropriate.