Effects of water and health on primary school enrolment and absenteeism in Indonesia.

Clean water provision remains a serious problem in low- and middle-income countries. In 2017, about 30% of the world population relied on unimproved water sources located outside of the dwellings. Often women and children are occupied in fetching water. This situation increases the prevalence of water-related diseases such as diarrhoea and reduces children's study time. School attendance may decrease due to the combined effects of diarrhoea and time spent on fetching water. We investigate the effects on school absenteeism and primary school enrolment in Indonesia, using a panel data set for 259 districts over the period 1994-2014. Districts with higher diarrhoea prevalence are found to have lower school enrolment (B: -0.202, sig p < 0.01) and higher school absenteeism (B: 2.334, sig p < 0.001). Districts where more households have access to private water facilities have higher school enrolment (B: 0.025, sig p < 0.01) and lower school absenteeism (B: -0.027, sig p < 0.01). More use of piped and bottled water in a district is associated with a lower diarrhoea prevalence (B: -0.004, sig p < 0.05). Policy-makers should take the benefits of improved water supply into account when making cost-benefit analyses regarding investments in water infrastructure.


INTRODUCTION
well as the link between water quality and diarrhoea, it seems important to incorporate diarrhoea prevalence in the analysis when studying the effects of clean water access on school attendance.
The current study aims to investigate both the direct effect of clean water access on school attendance as well as its indirect effect that runs through diarrhoea prevalence.
We consider the case of Indonesia, where water access and diarrhoea as well as school attendance are still major problems. The provision of piped water is limited (covering 20% of the population) and unevenly distributed over the country. In relation to this, Indonesia needs to cope with diarrhoea, which is the third major cause of child mortality (UNICEF ). Lastly, even though primary school enrolment in Indonesia reached 95% in 2014, this figure does not capture the true reality of school participation as about 12.5% of the students have been absent from school due to illnesses (Statistics Indonesia ). Providing a more comprehensive understanding of the relationship between access to clean water and school attendance might support policy-makers in finding more effective interventions.

A THEORETICAL FRAMEWORK OF DIRECT AND INDIRECT EFFECTS OF WATER ON SCHOOL ATTENDANCE
The demand for primary education can originate from policy measures, which oblige children to attend school in order to acquire the basic skills needed for economic participation (Checchi ). In Indonesia, the 1990 basic education law regulates 9 years of compulsory education for all children aged 7-15 years. However, this top-down approach is not sufficient to explain the variation in primary school attendance. Education can also be seen as an investment in human capital made by individuals to acquire future benefits (Eide & Showalter ; Todaro & Smith ).
Based on this view, school attendance depends on both the benefits and costs of education accrued by the individual or the population. The costs of education include the direct costs of tuition fees, books, and transport costs, and the indirect or opportunity costs of forgone income (Eide & Showalter ; Todaro & Smith ). In low-income countries, parents who decide on their children's education tend to be more myopic than the ones in high-income countries and base their choices more on current costs than on future gains, resulting in a lower demand for education (Checchi ). Other factors influencing the optimal demand for education are talent (i.e. personal intelligence or family background), current and expected future gains (such as employment conditions and expectation of higher returns in the future), the initial level of human capital, and the available resources for education in the region.
Based on these views, there are at least two channels through which water access influences school attendance.
First, households with no access to water in the dwelling have to transport the water into the house from outside, a job usually carried out by women and children. This water-fetching role can be a burden on the children, who lose energy and have less time to study (Dreibelbis et al. ; Nauges & Strand ). In such a situation, the parents might consider the current costs of sending their children to school as being higher than expected future benefits.
Second, the lack of access to clean water increases the probability [...] Azor-Martínez et al. ). Thus, sub-optimal water conditions may force children to be absent from school (Dreibelbis et al. ) or, at worst, to drop out of school completely (Nauges & Strand ).

Susenas data set
This study uses a district panel data set constructed on the basis of the annual Indonesia National Socio-Economic Survey (Susenas) data sets from 1994 to 2014. Susenas was initiated in the period of 1963-1964 and is composed of two questionnaires. The first is a core questionnaire which contains household characteristics and household members' information on age, sex, education, health, and working activities. The core questionnaire is fielded to about 200,000-286,000 households and 0.7-1.1 million household members. With this sample size, the core data are representative at national, provincial, and district level. The survey is supplemented by a module questionnaire that collects additional information on consumption, expenditure, socio-cultural characteristics and education, as well as on health and housing. The module is fielded to about 65,000 households and representative at the national and provincial levels.
The Susenas sampling frame is prepared in accordance with the most recent population census; for example, the sampling frame for the Susenas rounds of the 2000s originated from the 2000 population census. The Susenas sample is selected using a three-phase sampling design with two strata (urban and rural areas) in each district. In the first phase, a number of census blocks are selected from the sampling frame of census blocks using the probability proportional to size (PPS) method, where size is the number of households in each census block in the sampling frame. In the second phase, one segment group is selected from each census block with the PPS method. Household listing is then conducted for all selected segment groups. Lastly, in each enumeration area, 16 households are selected by systematic linear sampling.
Given this sampling procedure, Susenas is the only annual socio-economic data source that is representative at the district level in Indonesia.
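The selection steps described above can be illustrated with a minimal sketch. It is not the Statistics Indonesia procedure itself: the block names, household counts, and sample sizes below are hypothetical, and the segment-group phase is omitted. It only shows how systematic PPS selection of census blocks and systematic linear selection of 16 households might work in principle.

```python
import random

def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes: dict mapping unit id -> size measure (here, households per block).
    A unit larger than the sampling interval could be selected more than once.
    """
    total = sum(sizes.values())
    interval = total / n
    start = rng.random() * interval          # random start in [0, interval)
    targets = [start + k * interval for k in range(n)]
    selected = []
    cumulative = 0.0
    t = 0
    for unit, size in sizes.items():
        cumulative += size
        # Every target that falls inside this unit's size range selects it.
        while t < n and targets[t] < cumulative:
            selected.append(unit)
            t += 1
    return selected

def systematic_linear(units, n, rng):
    """Linear systematic sample of n units from an ordered list (len >= n)."""
    step = len(units) / n
    start = rng.random() * step
    last = len(units) - 1
    return [units[min(int(start + k * step), last)] for k in range(n)]

rng = random.Random(2014)
# Hypothetical frame: 50 census blocks with 80-300 households each.
blocks = {f"block_{i}": rng.randint(80, 300) for i in range(50)}
chosen_blocks = pps_systematic(blocks, 5, rng)
# Hypothetical household listing of 120 households in one enumeration area.
households = list(range(1, 121))
sample = systematic_linear(households, 16, rng)
```

Larger blocks cover a wider stretch of the cumulative size scale and are therefore more likely to contain one of the equally spaced selection points, which is what makes the selection proportional to size.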
The annual cross-sectional data from the core questionnaire are used to construct a balanced district panel of 259 of the 291 Indonesian districts for the period 1994-2014. The 32 districts that were not included are located in the provinces of Aceh, Sulawesi Tenggara, Maluku, Papua and Papua Barat. These districts were not surveyed in one or more years due to conflicts in the regions. In the current study, the district code is based on the 1994 geographical definition of the districts, which was taken as the basic definition to accommodate the splitting up of districts during the period. Thus, the split districts were merged into their original parent district. Details of the procedure for generating homogeneous districts over time can be obtained from the authors. All district-level variables were aggregated from the individual and household data.

Long-term trend of piped water coverage, diarrhoea prevalence, and school attendance
This section presents a descriptive analysis of the long-term trend in the main variables of this analysis, namely primary school attendance, clean water access, and diarrhoea prevalence among school-aged children in 259 districts in Indonesia over the period 1994-2014. Primary school attendance is measured by two variables, namely (1) school enrolment, which is the percentage of children aged 7-15 enrolled in primary school, and (2) school absenteeism, which is the percentage of primary school students who, during the past month, were absent from school due to health complaints for at least one day.
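The aggregation to parent districts and the absenteeism definition given above might be sketched as follows. The district codes and the code-to-parent mapping are hypothetical, and the sketch is unweighted, whereas estimates from a survey such as Susenas would normally apply sampling weights.

```python
from collections import defaultdict

# Hypothetical mapping from later district codes to their 1994 parent district.
PARENT_1994 = {"A": "A", "A_split": "A", "B": "B"}

def district_absenteeism(records):
    """Aggregate student records to (parent district, year) percentages.

    records: iterable of (district_code, year, absent) tuples, where absent is
    True if the student missed at least one school day in the past month
    because of a health complaint.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [absent students, all students]
    for code, year, absent in records:
        key = (PARENT_1994[code], year)   # merge split districts into the parent
        counts[key][0] += int(absent)
        counts[key][1] += 1
    return {key: 100.0 * a / n for key, (a, n) in counts.items()}

records = [
    ("A", 2014, True), ("A", 2014, False),
    ("A_split", 2014, False), ("A_split", 2014, False),
    ("B", 2014, True),
]
panel = district_absenteeism(records)
```

In this toy input, the students of `A_split` are counted in parent district `A`, so one absent student out of four gives an absenteeism rate of 25% for `A` in 2014.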
Good quality water is captured by the percentage of households with access to piped and bottled water. Nearby [...]
To get more insight into actual school attendance, Figure 3 shows the absenteeism rate due to health complaints in the districts as well as its average by island and nationally. Nationally, on average, 12.9% of the students were absent (for at least one day) from school due to health complaints. In 1994, about 9.4% of students missed school days, and there was hardly any improvement over the following two decades: in 2014, as many as 12.5% of the students were absent from school.
In 1994, about 50% of the households had access to a private source of drinking water (Figure 5). This proportion increased steadily over time and reached 71% in 2014.
Meanwhile, the rest of the population has to obtain drinking water from a shared or public source. This means that these households need to walk from home to the source and queue there to get the water. Having a private source of drinking water is more common on the islands of Sumatera and Java. This is related to the relatively high coverage of piped and bottled water and urbanization on these two islands. The lowest percentages of households with a private drinking water source are found on the Bali and NT islands. Here only 28% of households had access to a private water source in 1994 and 45% in 2014.
Figure 6 shows that diarrhoea prevalence among school-aged children fluctuates at a quite low level (less than 2%) over the years and between regions. Diarrhoea is most [...]

Independent variables
Following the conceptual framework explained previously, we include a set of control variables in our models. This section explains the theoretical foundation of each of these control variables. Control [...]
Moreover, gender differences in parents' education level reflect the effect of cultural conditions in the region and the degree of the mother's position in decision-making processes. A more educated mother might have more power within the household, and this increases the chances of the children, especially girls, attending school (Glick & Sahn ). She might also be better able to use the facilities for the benefit of her children. Jalan & Ravallion (), for example, found a larger health gain from piped water for children with better-educated mothers. This effect is even significant among poor households, probably because an educated mother may have a better understanding of how to get and treat clean water. She will also be more likely to practice better hygiene in the family, specifically for her children. Consequently, a lower prevalence of diarrhoea is found (Mangyo [...]

METHOD
Our model is based on the assumption that school attendance is a linear function of access to a nearby water source, diarrhoea prevalence, control variables, a time trend, and district-level fixed effects. We also experimented with quadratic terms, but these proved to be insignificant.
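As an illustration, the static specification described above can be written as follows. The notation is an assumption introduced for this sketch and is not necessarily that of the original equations.

```latex
S_{it} = \beta_1 W_{it} + \beta_2 D_{it} + \gamma' X_{it} + \delta\, t + \alpha_i + \varepsilon_{it}
```

Here $S_{it}$ is the school attendance measure (enrolment or absenteeism) in district $i$ and year $t$, $W_{it}$ is access to a nearby water source, $D_{it}$ is diarrhoea prevalence, $X_{it}$ is the vector of control variables, $\delta\, t$ is the time trend, $\alpha_i$ is the district-level fixed effect, and $\varepsilon_{it}$ is the error term.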
Moreover, we assume that diarrhoea prevalence is determined [...]
Fixed effects models are used because we are interested in how changes in educational enrolment and absenteeism are related to changes in the quality of water, the location of water sources, and diarrhoea prevalence. Fixed effects models have the advantage over alternatives such as random effects models that they make it possible to study these relations while completely controlling for all (measured and unmeasured) regional characteristics that are stable over time. One could also argue that districts are not random drawings from a population but 'one of a kind', in which case fixed effects are preferred over random effects (Verbeek (), p. 351).
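The way fixed effects remove stable district characteristics can be illustrated with a minimal sketch of the within estimator for a single regressor. The data are invented and the function is a simplified illustration, not the estimation routine used in the study, which includes several regressors and a time trend.

```python
from collections import defaultdict

def within_slope(panel):
    """Fixed effects (within) slope for a single regressor.

    panel: list of (district, x, y) observations. Each variable is demeaned
    by district, which removes any time-constant district effect; the slope
    is then the OLS slope computed on the demeaned data.
    """
    groups = defaultdict(list)
    for district, x, y in panel:
        groups[district].append((x, y))
    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx

# Invented data: both districts share the slope 2 but have different intercepts.
toy = [
    ("A", 1, 12), ("A", 2, 14), ("A", 3, 16),      # y = 10 + 2x
    ("B", 11, -8), ("B", 12, -6), ("B", 13, -4),   # y = -30 + 2x
]
slope = within_slope(toy)
```

Because only within-district variation is used, the different intercepts of the two invented districts do not affect the estimated slope.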
Two versions of fixed effects models are estimated: a static and a dynamic one. The static one implicitly assumes that adjustments take at most one period, in this case a year. In many cases that is not realistic. We therefore also estimate a dynamic fixed effects model. This model assumes that [...] Table 1). The estimated coefficient of the percentage of households with a private water facility is not significant.

Static panel analysis
In addition, the model with the interaction terms (model D2) reveals that the protective effect of piped and bottled water on diarrhoea prevalence becomes stronger over time.
This stronger effect might be caused by an increase in the quality of water used by households over the period (sig p < 0.01). This quality change is driven by an increase in the share of bottled water in the mixture of piped and bottled water over the period (see Figure 4). All in all, these results support the existence of an indirect effect of water quality on school attendance (models A1, E1, and E2).
School absenteeism in a district is positively related to diarrhoea prevalence and negatively related to the percentage of households with a private water facility (Table 2, A1). The estimated coefficients are 2.334 (sig p < 0.001) and -0.027 (sig p < 0.01), respectively. The first coefficient means that a 1 percentage point reduction in diarrhoea prevalence (7-15 years) in a given district would reduce school absenteeism of primary school students in that district by 2.3 percentage points. The latter coefficient indicates that an increase of 10 percentage points in the percentage of households with a private water facility in the district is associated with a reduction of primary school absenteeism by 0.27 percentage points. The fact that diarrhoea prevalence is positively related to school absenteeism and that good quality water is negatively related to diarrhoea provides evidence for the existence of an indirect effect of water quality on the reduction of school absenteeism. No interaction effects with the time trend are found to be significant, which indicates that the effects do not change over time.
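The arithmetic behind these interpretations can be reproduced with a short sketch using the coefficients reported in the text. Approximating the indirect effect as the product of two coefficients is an assumption made here for illustration; it is not a calculation reported by the authors.

```python
# Coefficients as reported in the text (static fixed effects models).
B_DIARRHOEA_ON_ABSENTEEISM = 2.334        # model A1
B_PRIVATE_WATER_ON_ABSENTEEISM = -0.027   # model A1
B_GOOD_WATER_ON_DIARRHOEA = -0.004        # piped and bottled water, diarrhoea model

def predicted_change(coefficient, delta):
    """Change in the outcome implied by a linear coefficient for a change delta."""
    return coefficient * delta

# Direct association: 10 more points of households with a private water facility.
direct = predicted_change(B_PRIVATE_WATER_ON_ABSENTEEISM, 10)

# Assumption: the indirect path (good quality water -> diarrhoea -> absenteeism)
# is approximated by multiplying the two coefficients along the path.
indirect_per_point = B_GOOD_WATER_ON_DIARRHOEA * B_DIARRHOEA_ON_ABSENTEEISM
indirect = predicted_change(indirect_per_point, 10)
```

Under these assumptions, a 10-point rise in private water access corresponds to roughly 0.27 points less absenteeism, and a 10-point rise in piped and bottled water use corresponds to roughly 0.09 points less absenteeism through lower diarrhoea prevalence.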
Lastly, school enrolment is significantly related to diarrhoea prevalence and the percentage of households with a private water facility ( [...] that the effect is rather small, but this effect is actually more than half of the average annual change of primary school enrolment in Indonesia (0.42, see Table 1). In addition, given the latest primary school enrolment in Indonesia, which reached 35 million children (Indonesia MoEC ), the 0.25% increase in primary school enrolment is equal to an additional primary school enrolment of 88,000 children. This effect is still important, given that the districts in Indonesia have already reached, on average, a relatively high rate of primary enrolment (90%, see Table 1) and, hence, might be facing the last mile problem. The coefficient [...]

Notes: The dependent variable in models D1 and D2 is diarrhoea prevalence (7-15 years; %), in model A1 it is students absent from school (%), and in models E1 and E2 it is primary school enrolment (%). Control variables in models D1 and D2 are households with improved sanitation (%), health facility coverage (%), food expenditure (000,000 IDR), parents' education (years), relative difference of mothers' and fathers' education years, and households living in an urban area (%). Control variables in models A1, E1, and E2 are food expenditure (000,000 IDR), parents' education (years), relative difference of mothers' and fathers' education years, and households living in an urban area (%). All models are estimated using (static) panel fixed effects. Full results are listed in the Appendix, Table A-1 (available with the online version of this paper). Significance levels: 0.1% ***, 1% **, and 5% *.

Lastly, for all models, the error correction coefficient is negative and highly significant, indicating the existence of convergence effects towards the long-run equilibrium.

DISCUSSION
The current study explores the direct and indirect effects of access to good quality water and a nearby water source on primary school absenteeism and school enrolment. School absenteeism is included because many enrolled pupils appear to be absent from school due to illnesses, child labour, and household responsibilities (Neuzil et al. [...]
The results also point to the relevance of the indirect effect. The percentage of households using piped and bottled water, our proxy for access to good quality water, is associated with lower diarrhoea prevalence in the region, which in turn is associated with lower absenteeism and higher primary school enrolment. This finding is a confirmation [...]

[...] parents' education (years), relative difference of mothers' and fathers' education years, and households living in an urban area (%). Control variables in models A2, E3 and E4 are both the level and the difference of food expenditure (000,000 IDR), parents' education (years), relative difference of mothers' and fathers' education years, and households living in an urban area (%). All models are estimated using the DFE method. Full results are listed in the Appendix, Table A [...]

This study has some limitations, many of which are related to the data used. We constructed time series at the district level from the Susenas data set. As such, this was already quite an exercise as, in some cases, the boundaries of the districts changed over time. Moreover, it is the only possibility to generate time series. However, ideally, one would like to have longitudinal data at the household level, as at the district level, one can only investigate the relations for the average household. Consequently, the conclusions are very general. In practice, households within a district differ, and it is of interest how these differences between the households interact with their environment. The present data do not allow for such an analysis over time.
Another disadvantage relates to the way fetching water is measured. In the current study, we use a variable indicating whether households have a private water facility. If not, we assume that children have to fetch water outside of the premises. We do not know the distance between this source and their parents' house. Moreover, we do not know who is fetching the water. Adding this information to the data set would help in formulating more specific policy recommendations.

CONCLUDING REMARKS
The current study extends the understanding of the impact of water on schooling in two ways. Firstly, while previous studies have focused on the direct effect, our study provides evidence of both direct and indirect effects of water on schooling. Secondly, we study the effect of water access on education outcomes as measured by both school absenteeism and school enrolment. Our results suggest that access to safe water on the premises can potentially improve education outcomes at the district level. The availability of good-quality and nearby water facilities is negatively related to school absenteeism and positively related to school enrolment. This result highlights the importance of safe water provision for children's health and schooling. Consequently, policy-makers should take this benefit into account when identifying and quantifying the potential benefits of developing water infrastructure in developing countries.