This paper investigates the link between School Water, Sanitation, and Hygiene (SWASH) infrastructure and school attendance and retention in Mainland Tanzania. This was made possible by the definition of an algorithm that allowed us to link information about children in the Household Budget Survey (HBS 2018) with administrative data about schools in the Education Management Information System (EMIS). The study finds strong evidence of a link between the availability of gender-segregated toilets and girls' retention in school. The availability of water all year round also plays an important role. Given the strong returns to education in Tanzania, even small improvements in school attendance and retention are likely to generate large positive economic returns that outweigh the comparatively modest cost of investing in SWASH.

  • The main novelty lies in the development of an algorithm that allows us to link specific schools in the ministry's administrative data with the household survey, thus enabling analysis of the relation between school WASH infrastructure and the educational outcomes of children.

  • Given similar conditions, Tanzanian girls are less likely than boys to drop out of primary schools. However, girls' natural educational advantage is largely erased in schools that lack adequate SWASH infrastructure.

  • Simulations suggest that gender-segregated toilets, in particular, can add up to 1.34 years to girls' potential years of schooling in rural areas and 0.87 years in urban areas.

  • For boys, a combination of factors appears to be at play, including the availability of septic tanks and the availability of water all year round. Together, the various SWASH elements account for up to 0.81 years of boys' potential years of schooling in rural areas, and 0.85 years in urban areas.

Tanzania has made notable progress in expanding access to primary education, and efforts have been made to increase secondary school enrolment as well. Nonetheless, the country continues to grapple with high school drop-out rates, which disrupt the educational trajectories of countless children and teenagers. According to UNESCO's Global Education Monitoring Report 2020, Tanzania had an estimated primary school drop-out rate of 11% in 2019, while the lower-secondary school drop-out rate stood at 32% in the same year (UNESCO 2020).

One particularly concerning aspect of school drop-out in Tanzania is its disproportionate impact on teenage girls. Gender disparities persist at various levels of the education system, and girls are often more vulnerable to dropping out, especially during adolescence. Factors such as early marriage, teenage pregnancy, and household responsibilities can hinder girls' educational progress. A study by the Tanzania National Bureau of Statistics and the Ministry of Health, Community Development, Gender, Elderly and Children in 2017 reported that 27% of Tanzanian girls aged 15–19 were married or in union, with some regions reporting even higher rates (DHS 2017).

Amid these challenges, access to School Water, Sanitation, and Hygiene (SWASH) facilities emerges as a pivotal but often overlooked factor influencing school attendance and retention, particularly for girls. SWASH infrastructure includes the provision of clean and safe drinking water, proper sanitation facilities, and hygiene within school environments.

There is growing evidence that inadequate WASH facilities limit school enrolment and attendance, lead to early drop-out, and affect performance and completion of education (see Bowen et al. 2007; O'Reilly et al. 2008; Blanton et al. 2010). For girls, the absence of gender-segregated toilets and facilities for menstrual hygiene management can be particularly disruptive, often leading to discomfort, embarrassment, and even missed school days during menstruation (see Sommer 2010; Crankshaw et al. 2020). A lack of sanitation facilities can deter girls from attending school, especially when they fear the absence of privacy and hygiene (see Abrahams et al. 2006). This, in turn, contributes to higher drop-out rates among adolescent girls (see Njuguna et al. 2009; McPhedran et al. 2010; Freeman et al. 2012).

While anecdotal evidence indicates that SWASH infrastructure is important for improving school attendance and retention, there is a notable research gap in quantifying the direct relationship between SWASH access and educational outcomes in Tanzania, especially for teenage girls (see Antwi-Agyei et al. 2017). Empirical studies that have looked specifically at this relationship in other countries have not been able to conclusively show a link between SWASH and attendance or retention (see Trinies et al. 2016; Sclar et al. 2017; Chard et al. 2019; McMichael 2019).

The study presented in this article draws upon data from Tanzania's Education Management Information System (EMIS) and the Household Budget Survey (HBS) conducted in 2018. It explores the relationship between SWASH infrastructure and children's effective years of schooling, providing empirical evidence that access to SWASH facilities significantly influences the duration of a child's education.

Section 1 presents the data, methods, and definitions used in the study. Section 2 lays out the basic descriptive statistics showing the relation between key SWASH indicators and school attendance. Section 3 presents the econometric models used to explore the link between SWASH infrastructure and education. Section 4 presents the results, and Section 5 discusses the implications and costs of addressing the issues identified in the paper.

Data

One of the key contributions of this study is that we were able to combine two data sources that rarely interact, allowing us to cross-check the relationship between SWASH facilities and attendance rates. The two data sources used are the HBS 2018 and the EMIS 2020 database.

The HBS 2018 is a household survey covering 9,552 households in Mainland Tanzania, including 16,840 children of school age (for this study, we included all persons aged 6 to 20 years to capture those who completed school late). The HBS is used to identify school-aged children who are currently in or out of school and to describe their personal characteristics (age, sex, and disability status), as well as their household's demographic and socio-economic characteristics, as relevant to understanding the factors that may lead them to not attend or permanently drop out of school.

The EMIS is an administrative census database describing school characteristics, including SWASH facilities of every school in Mainland Tanzania in some detail. The database includes 18,152 schools, 24 of which were excluded from the study due to missing or suspected incomplete information (schools with fewer than 10 pupils reported were excluded from the sample). The EMIS for Mainland Tanzania does not contain information on sanitation or enrolment numbers for secondary schools, only information on water access and quality.

As the HBS does not contain unique school identifiers, the HBS and EMIS data had to be matched based on the ward names, school level (primary and secondary), and school ownership (private and government). At the secondary level, there is often only one private and/or public school per ward, meaning that a one-to-one matching between the HBS and EMIS datasets was possible in most cases.

At the primary level, however, a one-to-one matching could not be performed in most cases, as there is typically more than one public/private primary school per ward. In these cases, a weighted average of school characteristics per ward/ownership was constructed based on the number of children attending each school. The idea is that children in the HBS are more likely to attend one of the larger schools in the EMIS. Therefore, the characteristics of these schools are given a higher weight in the matching. Let us say, for instance, that a given ward contains two public primary schools. The first has 1,000 pupils and 10 toilets. The second has 100 pupils and 10 toilets. A child in the HBS known to attend a public primary school in this ward will be said to face a pupil/toilet ratio of 91.8, reflecting a 91% likelihood that s/he is in the school with a ratio of 100 pupils per toilet and a 9% likelihood that s/he is in the school with a more favourable ratio of 10 pupils per toilet.
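The weighting in the two-school example above can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' implementation: the `School` class and the function name are illustrative, and it assumes every school reports at least one toilet.

```python
from dataclasses import dataclass


@dataclass
class School:
    pupils: int
    toilets: int


def weighted_pupil_toilet_ratio(schools):
    """Enrolment-weighted mean of each school's pupils-per-toilet ratio."""
    total_pupils = sum(s.pupils for s in schools)
    return sum(
        (s.pupils / total_pupils) * (s.pupils / s.toilets)  # weight * school ratio
        for s in schools
    )


# The ward from the text: two public primary schools with 10 toilets each.
ward = [School(pupils=1000, toilets=10), School(pupils=100, toilets=10)]
ratio = weighted_pupil_toilet_ratio(ward)
print(round(ratio, 1))  # 91.8
```

The larger school receives a weight of 1,000/1,100 (about 91%), which is why the ward-level figure sits close to its ratio of 100 pupils per toilet.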

Another challenge faced in combining the two datasets lay in the fact that the HBS does not contain information on the school last attended by children who are currently out of school, so it is not possible to directly observe what school characteristics may have led children to drop out. Yet, the study has a particular interest in these children, as we seek to find out the reasons that lead children to drop out of or not attend school at all. For out-of-school children (4,438 children aged 6–20 in the HBS), we had to ‘guess’ what school the child was most likely to have attended before dropping out, based on the school reported by similar children who are still in school, using the following hierarchy:

  • 1. School attended by siblings in the same age group (± 2.5 years)

  • 2. School attended by other household members of school age.

  • 3. Mode or most common characteristics (ownership/level) of schools attended by children in the ward who are in the same age group.

  • 4. Mode or most common characteristics (ownership/level) of schools attended by children in the district who are in the same age group.

  • 5. Mode or most common characteristics (ownership/level) of schools attended by children in the region who are in the same age group.

For children who could not be directly matched to a specific school through steps 1 and 2, the school level and ownership were imputed through steps 3–5. Then, a weighted matching was applied following the same procedure as for in-school children above.
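The fallback hierarchy above can be sketched as an ordered list of rules, where the first rule that finds any in-school peers determines the imputed school. The sketch below rests on several assumptions that are not in the paper: the `Child` record and its field names (including the hypothetical `sibship` identifier for siblings), the use of a ±2.5-year window for "same age group" in steps 3–5, and taking the most common school type when several peers qualify in steps 1 and 2.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Tuple

SchoolType = Tuple[str, str]  # (ownership, level), e.g. ("public", "primary")


@dataclass
class Child:
    household: str
    sibship: str   # hypothetical identifier shared by siblings
    ward: str
    district: str
    region: str
    age: float
    school: Optional[SchoolType] = None  # None for out-of-school children


def same_age_group(a, b, window=2.5):
    # Assumption: the same +/- 2.5-year window is used at every step.
    return abs(a.age - b.age) <= window


def most_common_school(peers):
    counts = Counter(p.school for p in peers)
    return counts.most_common(1)[0][0] if counts else None


def impute_school(child, in_school):
    """Return (step, school type) from the first rule in the hierarchy that applies."""
    rules = [
        (1, lambda p: p.sibship == child.sibship and same_age_group(p, child)),
        (2, lambda p: p.household == child.household),
        (3, lambda p: p.ward == child.ward and same_age_group(p, child)),
        (4, lambda p: p.district == child.district and same_age_group(p, child)),
        (5, lambda p: p.region == child.region and same_age_group(p, child)),
    ]
    for step, rule in rules:
        peers = [p for p in in_school if p.school is not None and rule(p)]
        if peers:
            return step, most_common_school(peers)
    return None, None  # corresponds to 'no match'


in_school = [
    Child("h1", "s1", "w1", "d1", "r1", 11, ("public", "primary")),
    Child("h2", "s2", "w1", "d1", "r1", 15, ("public", "secondary")),
    Child("h3", "s3", "w1", "d1", "r1", 14, ("public", "secondary")),
]
dropout_with_sibling = Child("h1", "s1", "w1", "d1", "r1", 13)
dropout_without_peers_at_home = Child("h9", "s9", "w1", "d1", "r1", 16)

print(impute_school(dropout_with_sibling, in_school))
print(impute_school(dropout_without_peers_at_home, in_school))
```

In this toy data the first child is resolved at step 1 through a sibling two years younger, while the second falls through to step 3 and inherits the most common school type among ward peers of similar age.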

Table 1 summarizes the matching results for children aged 6–20, depending on the type of matching achieved and the type and level of school attended. Children who could not be matched to any school (marked as ‘no match’ below) were not included in the analysis.

Table 1

Number of children aged 6–20 years in HBS matched to EMIS schools, by type of matching

| Type of match | In school, public, primary | In school, public, secondary | In school, private, primary | In school, private, secondary | Out-of-school, public, primary | Out-of-school, public, secondary | Out-of-school, private, primary | Out-of-school, private, secondary | Total |
|---|---|---|---|---|---|---|---|---|---|
| No match | 1,428 | 451 | 188 | 125 | 726 | 163 | 146 | 57 | 3,284 |
| Weighted match | 5,143 | 118 | | | 2,425 | 44 | | | 7,736 |
| Unique match | 2,844 | 1,034 | 154 | 77 | 1,183 | 421 | 78 | 29 | 5,820 |
| TOTAL | 9,415 | 1,603 | 346 | 202 | 4,334 | 628 | 226 | 86 | 16,840 |

Source: Author's calculations based on EMIS 2022, and HBS 2017/2018.

The following indicators were used to match EMIS and HBS databases:

  • Area of residence

  • Province

  • Level of education (primary/secondary)

  • Type of school (public/private)

The following variables from the EMIS database were used in the analysis:

  • Boys/girls per male/female toilet (number of boys/girls in school, and number of male/female toilets). It is not known if and how schools without gender-separated toilets were recorded in the EMIS, as it only contains numbers for girls' and boys' toilets separately.

  • The school has improved toilets (flush, modern pit latrine, or cleanable pit with slab).

  • There is no septic tank in the school.

  • The school has separate teacher toilets. Male and female teacher toilets were recorded separately in the EMIS, and it is not known if and how schools without gender-separated teacher toilets were recorded. Due to the limited number of observations for this variable, information on male and female teacher toilets was pooled for this analysis into a single gender-neutral variable called ‘number of separate teacher toilets’.

  • The school has improved water sources (boreholes, deep wells, rainwater, and tap water).

  • The school has access to water all year round.

All variables describing the child and household are from the HBS.

Analysis

Three types of analysis were undertaken for this study. First, simple descriptive statistics were used to provide an overview of the relationship between SWASH and schooling outcomes. Second, an econometric model was used to estimate the relation between these variables, controlling for possible confounding factors. Finally, microsimulations were carried out to project the impact of various SWASH scenarios on schooling outcomes, based on estimated parameters. Before explaining how each of these was carried out, we must define what is meant by effective years of education in this study.

Effective years of education

There are at least two different types of attendance issues that we wanted to capture with our model, namely (1) children who have permanently dropped out of school, and (2) children who are still enrolled in school, but temporarily not attending due to illness or for other reasons. To capture both issues in a single dependent variable for the modelling exercise, we used the notion of ‘effective years of education’.

Effective years of education correspond to actual years of education completed by the respondent, adjusted for days of schooling missed by those who are still in school. If, for instance, a girl misses 2 days of school each month during her period, that reduces the actual education she receives by 10% (2 out of roughly 20 school days) compared to someone who attended every day in the month.

The indicator measures the level attained by the pupil, rather than the number of years spent in school. A child having, say, reached grade 2 after having repeated grade 1 several times will thus be considered to have completed one effective year of education, even if s/he may have spent several years in school.

Effective years of education also consider educational outcomes, so that a person who is completely illiterate (not able to read or write a full sentence in English, Kiswahili, or a local language) is considered to have less than one (0.5) effective year of education, regardless of how many years s/he reports having spent in school. This is based on the assumption that a pupil should have learned to read a full sentence by the end of the first year of primary.
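The three rules described in this subsection (grade attained rather than years spent, an attendance adjustment for children still in school, and a 0.5 cap for illiterate children) can be combined as in the sketch below. The function and its argument names are illustrative. In particular, applying the current share of days missed as a simple multiplier on all completed grades is an assumption made here for illustration; the text does not spell out the exact adjustment formula.

```python
def effective_years(grades_completed, literate, in_school, share_days_missed=0.0):
    """Illustrative 'effective years of education' for one child.

    grades_completed  -- highest level attained (repeated years are not counted)
    literate          -- can read or write a full sentence in any language
    in_school         -- currently enrolled
    share_days_missed -- fraction of school days missed (assumed multiplier)
    """
    if grades_completed <= 0:
        return 0.0   # never completed a grade
    if not literate:
        return 0.5   # illiterate children are capped below one year
    if in_school:
        return grades_completed * (1.0 - share_days_missed)
    return float(grades_completed)


# A girl still in school who completed 6 grades and misses 2 of 20 school days a month.
girl = effective_years(6, literate=True, in_school=True, share_days_missed=2 / 20)

# A child who reached grade 2 after repeating grade 1: one completed grade.
repeater = effective_years(1, literate=True, in_school=True)

# A child reporting 4 years in school who cannot read a full sentence.
illiterate = effective_years(4, literate=False, in_school=False)

print(round(girl, 2), repeater, illiterate)
```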

Figure 1 shows the distribution of effective years of education for boys and girls aged 6–20. As is visible in the figure, a large number of children (5,779) are assessed to have less than one effective year of schooling. Of these, 1,287 have never been to school, 2,439 have only been to preschool/nursery and the rest have been to school but are unable to read or write a full sentence in any language.
Figure 1

Effective years of education (children aged 6–20), by sex. Source: Author's calculations based on HBS 2017/2018. Obs.: 16,840.


Descriptive statistics

For the descriptive analysis, the primary schools identified through the EMIS database have been divided into two groups: (1) schools with fewer than 50 girls per female toilet and fewer than 50 boys per male toilet; (2) schools that did not meet this pupil/toilet threshold for girls, for boys, or for both. The chosen threshold does not represent any normative standard for how many toilets are considered appropriate¹ but allows us to divide the sample into sufficiently large groups to compare children in schools with fewer vs. more toilets per pupil. In the HBS 2017/18 dataset, 10,787 children aged 6–16 years who had ever attended school could be matched to primary schools in the EMIS dataset using the algorithm described above. Of these, about one quarter (2,662) were matched to schools that met the minimum criteria for the number of male/female toilets per pupil.

The EMIS database does not contain information on toilets for secondary schools. Instead, we compared schools where water was available all year round with schools that were not connected to water or were only connected for part of the year. The HBS 2017/18 sample contained 1,648 children aged 12–20 years who had ever attended school and who could be matched to secondary schools in the EMIS database for which information on water access was available. Of these, 1,024 respondents could be matched to schools that had water all year round. The other children could either be matched to schools that did not have water all year or could not be matched with certainty to a school known to have water all year. These are included in the ‘intermittent water’ group in Table 2. Because of the small sample size, there is more year-to-year variation in the reported attendance/completion rate for secondary school children. For this reason, dashed second-order polynomial trendlines were added to the graphs to show the general trend across the age group 12–20 years.

Table 2

Number of boys and girls included in the descriptive analysis

| | Primary (children aged 6–16): 50+ per toilet | Primary: <50 per toilet | Primary: Total | Secondary (children aged 15–20): Intermittent water | Secondary: Water all year | Secondary: Total |
|---|---|---|---|---|---|---|
| GIRLS: Obs. | 4,032 | 1,343 | 5,375 | 284 | 519 | 803 |
| GIRLS: of which unique matches (%) | 29.2% | 33.7% | 30.3% | 73.6% | 100.0% | 90.7% |
| GIRLS: of which public schools (%) | 97.2% | 93.8% | 96.3% | 100.0% | 91.9% | 94.8% |
| BOYS: Obs. | 4,093 | 1,319 | 5,412 | 340 | 505 | 845 |
| BOYS: of which unique matches (%) | 28.9% | 36.5% | 30.7% | 75.0% | 100.0% | 89.9% |
| BOYS: of which public schools (%) | 97.2% | 93.8% | 96.4% | 100.0% | 88.5% | 93.1% |

Source: Author's calculations based on EMIS 2022, and HBS 2017/2018.

Table 2 provides an overview of the samples used in the descriptive analysis.

Econometric model

To ascertain with greater confidence the role of the above-identified factors in determining the school completion rate, we used a multivariate regression analysis to control for confounding factors. Indeed, it is possible that the availability of WASH facilities simply reflects the fact that those schools have better funding, are better equipped in general, and are attended by girls from better socio-economic backgrounds, who have more encouragement and support from home to complete their studies. For instance, Table 4 shows, unsurprisingly, that private schools tend to be significantly better equipped than public ones (e.g. 24.5 vs. 74.7 girls/toilet), and have children from significantly richer families (173,454 vs. 76,804 Tsh per adult-equivalent/month), who tend to stay longer in school (5.36 vs. 3.67 years). It is therefore crucial to control for all these differences, so as not to spuriously conclude that it is the higher frequency of toilets in private schools, rather than all their other advantages, that causes private-school children to stay longer in school.

Table 3

Summary of SWASH parameters under different scenarios

| Indicator | Current | Full investment | No heavy infrastructure | No separate toilets | No SWASH |
|---|---|---|---|---|---|
| Boys per male toilet | 76.7 | 25 | 25 | No boys' toilets | No boys' toilets |
| Girls per female toilet | 73.2 | 20 | 20 | No girls' toilets | No girls' toilets |
| % of improved toilets | 79.5% | 100% | 100% | 100% | 0% |
| % with no septic tank | 55.6% | 0% | 100% | 100% | 100% |
| % separate teacher toilet | 57.5% | 100% | 100% | 0% | 0% |
| % improved water source | 63.9% | 100% | 100% | 100% | 0% |
| % with water all year | 54.6% | 100% | 0% | 0% | 0% |

Source: EMIS 2022, and HBS 2017/2018 (leftmost column, obs. 6779). Parameters determined by author (other columns).

Table 4

Summary of key indicators used in the analysis (means, children aged 6–20)

| Indicator | National | Sex: Male | Sex: Female | Areaᵇ: Rural | Areaᵇ: Urban | Levelᵇ: Primary | Levelᵇ: Second. | Typeᵇ: Public | Typeᵇ: Private |
|---|---|---|---|---|---|---|---|---|---|
| Effective years of educ. | 3.77 | 3.65 | 3.89 | 3.26 | 4.99 | 2.70 | 9.09 | 3.67 | 5.36 |
| % days missed | 14.1% | 14.1% | 14.1% | 14.0% | 14.4% | 14.2% | 13.6% | 14.1% | 13.8% |
| Consumption² | 82,779 | 80,447 | 85,072 | 70,975 | 111,168 | 78,592 | 103,707 | 76,804 | 173,454 |
| Age (years) | 12.3 | 12.3 | 12.3 | 12.1 | 12.8 | 11.4 | 16.5 | 12.2 | 13.5 |
| Distance to school (km) | 28.2 | 27.6 | 28.8 | 28.7 | 26.9 | 27.5 | 31.6 | 28.5 | 24.6 |
| Boys per male toiletᵃ | 76.7 | 77.0 | 76.5 | 75.3 | 81.4 | 76.7 | – | 78.2 | 26.7 |
| Girls per female toiletᵃ | 73.2 | 73.4 | 73.1 | 72.9 | 74.4 | 73.2 | – | 74.7 | 24.5 |
| % of improved toiletsᵃ | 79.5% | 80.1% | 78.9% | 81.7% | 73.6% | 92.4% | – | 79.9% | 68.9% |
| % with no septic tankᵃ | 55.6% | 56.0% | 55.1% | 62.1% | 37.9% | 64.5% | – | 57.4% | 4.7% |
| % separate teacher toiletᵃ | 57.5% | 57.5% | 57.5% | 55.7% | 62.5% | 66.8% | – | 57.1% | 69.7% |
| % improved water sourceᵃ | 63.9% | 63.4% | 64.5% | 55.7% | 86.3% | 62.0% | 76.0% | 62.8% | 95.1% |
| % with water all yearᵃ | 54.6% | 53.9% | 55.3% | 47.2% | 74.9% | 52.5% | 67.8% | 53.0% | 98.9% |
| % with severe disabilities | 2.3% | 2.3% | 2.3% | 2.4% | 2.0% | 2.5% | 1.6% | 2.3% | 2.1% |
| Years of educ. (father) | 2.24 | 2.25 | 2.23 | 2.06 | 2.70 | 2.18 | 2.55 | 2.20 | 2.84 |
| Years of educ. (mother) | 2.43 | 2.47 | 2.39 | 2.28 | 2.74 | 2.33 | 2.85 | 2.41 | 2.70 |

Source: Author's calculations based on EMIS 2022, and HBS 2017/2018.

ᵃ Indicators from the EMIS database.

ᵇ Indicators that are available in both the HBS 2017/18 and EMIS 2022 databases. These were used to match the two datasets.

The base model is an ordinary least squares (OLS) regression model, where the dependent variable is the ‘effective years of schooling’, as defined above. The variable is noted $E_{is}$ for child $i$ in the HBS dataset attending school $s$ in the EMIS dataset. On the right-hand side of the equation, we have the vector, $W_s$, of variables describing the school's WASH facilities, as well as a number of variables, $X_{is}$, controlling for relevant differences between children: age, household income, area of residence, province, type of school (public/private), distance to school, parents' education and the child's level of disability:

$$E_{is} = \alpha + \beta' W_s + \gamma' (W_s \times L_s) + \delta' X_{is} + \varepsilon_{is}$$

The interaction term $W_s \times L_s$ captures whether we are dealing with a primary or secondary school, with $L_s$ indicating the school level. This distinction is only relevant for the two SWASH indicators for which information is available both at the primary and secondary level, namely access to improved water sources and access to water all year round. Separate models were run for girls and boys, to capture the different mechanisms that might be at play for each sex. In both cases, the base model is restricted to children aged 6–20 years who have ever attended school, not including pre-primary school.

For robustness, a second version of the model is also estimated for all children, including those who have never attended primary school (0 years of effective education). As shown in Figure 1, a large number of children have less than one year of effective schooling, and the model is unable to distinguish between children who dropped out in their first year of school, those who never started school, and those who attended but did not learn anything because they missed too many lessons or received poor quality education. In technical terms, the dependent variable is censored at its lower bound, so the model cannot capture variation within this group.

To account for this challenge, we use two alternative regression models, namely (1) a Tobit regression model for censored distributions with a lower limit of zero, and (2) a Heckman two-step selection model, accounting for the fact that some children never start school in the first place, due to non-random factors such as household income and parents' education. The factors considered in the selection regression are: household consumption, child's age, parents' education level, distance to school, area of residence, and province.
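For reference, the textbook forms of these two estimators can be written as follows. The notation is introduced here for illustration and is not taken from the paper: $E^{*}_{is}$ is a latent (uncensored) measure of effective schooling, $x_{is}$ stacks all regressors of the outcome equation, $z_{i}$ stacks the selection variables listed above, and $d_{i}$ indicates whether child $i$ ever started school.

```latex
% Tobit with censoring from below at zero
\begin{aligned}
E^{*}_{is} &= x_{is}'\theta + \varepsilon_{is}, \qquad \varepsilon_{is} \sim \mathcal{N}(0, \sigma^{2}),\\
E_{is} &= \max\left(0,\; E^{*}_{is}\right).
\end{aligned}

% Heckman two-step selection model
\begin{aligned}
d_{i} &= \mathbf{1}\left[\, z_{i}'\pi + u_{i} > 0 \,\right], \qquad u_{i} \sim \mathcal{N}(0, 1),\\
\mathbb{E}\left[E_{is} \mid d_{i} = 1\right] &= x_{is}'\theta + \rho\,\sigma\,\lambda\!\left(z_{i}'\pi\right),
\qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)}.
\end{aligned}
```

In the two-step procedure, a probit of $d_{i}$ on $z_{i}$ is estimated first, and the fitted inverse Mills ratio $\lambda(z_{i}'\hat{\pi})$ is then added as a regressor in the outcome equation for children who started school.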

Whether it is relevant to control for the factors affecting the decision to start school depends in part on whether we think that the availability of SWASH infrastructure is exogenous or endogenous to that decision. In the first case, we assume that households first decide whether to send children to school based purely on the factors listed in the selection model (such as income and distance to school). Once in school, pupils, together with their parents, decide whether to stay in school based, amongst other things, on the quality of SWASH infrastructure. In this scenario, it is relevant to separate the two processes and to correct for the bias introduced by the fact that children attending school are a non-random sub-sample of all children of school age: a sample with higher income and a shorter distance to school.

If, on the other hand, the decision to send a child to school already takes into account the quality of SWASH infrastructure, a single equation might suffice to capture both processes. This would, for instance, be the case if a school was known to have such poor SWASH facilities that some parents choose not to send their children to school in that catchment area.

Microsimulations

In the final step of this study, the parameters estimated through the multivariate regression analysis are used to project the possible impacts of different SWASH scenarios on future schooling outcomes, based on observable factors. This is done by simulating the effective years of schooling that children are expected to attain by age 20 under various scenarios. The simulations take the form of partial equilibrium micro-simulations, where the value of the various SWASH assets listed above is replaced by the target value to predict what the value of the dependent variable (effective years of schooling) would be under each scenario. The coefficients used are those from the full OLS model, which covers all children aged 6–20, and takes into account both the likelihood of starting school and the likelihood of dropping out of school before school completion. The following scenarios are considered:

  • Current investment: Investment is sufficient to maintain SWASH infrastructure at current levels, by maintaining, repairing, and replacing current infrastructure. However, no new investments are made to expand or improve the current state of SWASH facilities.

  • Full investment: All SWASH infrastructure is upgraded to meet international standards.

  • No heavy infrastructure: This scenario is the same as the full investment scenario, except for the fact that issues requiring investments in heavy infrastructure, such as septic tanks and water pipes or reservoirs, are excluded. Heavy infrastructure tends to be the most expensive but may be a precondition for other investments to have an effect, which is the reason for looking at these two groups separately.

  • No separate toilets: Heavy infrastructure is excluded, and no investments are made in building or maintaining boys' toilets, girls' toilets, or teachers' toilets.

  • No SWASH: No new investments are made in improving or maintaining SWASH infrastructure, and all current facilities are expected to decay, leaving schools with no access to water or sanitation facilities of any kind.

Table 3 summarizes the parameters used under the various scenarios. For all scenarios except the one called ‘current’, the stated parameters are applied uniformly across all schools, meaning that all schools are expected to achieve the stated level of SWASH infrastructure. In the ‘current’ scenario, the actual values for SWASH infrastructure reported in the EMIS database are used, and only the child's age is allowed to vary to estimate the expected years of effective schooling that will be achieved at age 20 under current conditions. This means that schools have different SWASH infrastructure under the ‘current’ scenario and that the values listed in the left-most column of Table 3 refer to weighted averages across all schools.
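The mechanics of such a simulation can be sketched in a few lines of Python. The scenario targets below are the girls' values from Table 3 for two of the scenarios. Everything else is an assumption for illustration: the variable names, the restriction to a handful of regressors, and above all the coefficients, which are arbitrary placeholders and not estimates from this study.

```python
# Hypothetical coefficients: placeholders for illustration only,
# NOT the estimates obtained in the paper.
COEFS = {
    "intercept": -2.0,
    "age": 0.45,
    "girls_per_toilet": -0.004,
    "improved_toilets": 0.10,
    "no_septic_tank": -0.15,
    "separate_teacher_toilet": 0.05,
    "improved_water": 0.08,
    "water_all_year": 0.20,
}

# Target values taken from Table 3 (shares expressed as 0/1 indicators).
SCENARIOS = {
    "full_investment": {
        "girls_per_toilet": 20, "improved_toilets": 1, "no_septic_tank": 0,
        "separate_teacher_toilet": 1, "improved_water": 1, "water_all_year": 1,
    },
    "no_heavy_infrastructure": {
        "girls_per_toilet": 20, "improved_toilets": 1, "no_septic_tank": 1,
        "separate_teacher_toilet": 1, "improved_water": 1, "water_all_year": 0,
    },
}


def predict(child, coefs=COEFS):
    """Linear prediction of effective years of schooling."""
    return coefs["intercept"] + sum(
        coefs[name] * value for name, value in child.items() if name in coefs
    )


def simulate(child, scenario=None, target_age=20):
    """Predict schooling at target_age; scenario=None keeps current SWASH values."""
    simulated = dict(child, age=target_age)      # copy, so the input is not modified
    if scenario is not None:
        simulated.update(SCENARIOS[scenario])    # overwrite SWASH values with targets
    return predict(simulated)


girl = {
    "age": 12, "girls_per_toilet": 73.2, "improved_toilets": 1, "no_septic_tank": 1,
    "separate_teacher_toilet": 0, "improved_water": 1, "water_all_year": 0,
}
current = simulate(girl)
full = simulate(girl, "full_investment")
print(round(full - current, 2))  # simulated gain in effective years at age 20
```

Averaging such child-level differences over the matched sample would give a scenario-level effect under these assumptions. The ‘no separate toilets’ and ‘no SWASH’ scenarios are left out because the text does not say how the absence of toilets is encoded in the pupils-per-toilet variables.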

Limitations

The main limitations of this study arise from the necessity to use various secondary data sources instead of collecting primary data or conducting a dedicated experiment. Specifically, the following limitations may affect the external validity of the findings:

  • Only one-third of children in the HBS could be uniquely matched to schools in the EMIS. Although no systematic patterns were observed between matched and unmatched children, strong assumptions had to be made to link children to listed schools. This issue is particularly acute for primary-level children, who mostly required probabilistic weighting for matching, and for out-of-school children, who were matched based on siblings' or neighbours' characteristics when possible.

  • Available HBS indicators capture attendance and attainment but lack detailed information on educational outcomes. Some children were unable to read or write despite completing several years of education. These children were assigned less than 1 year of effective education in this study due to the absence of a more precise method to capture differences in the quality of education.

  • While efforts were made to control for potential confounding factors influencing children's decisions to stay in school, our ability to control for such factors was limited by the data available in the HBS and EMIS datasets. In particular, the HBS does not include information on the quality of instruction or other unobservable school-related variables that may impact educational trajectories. Future studies should aim to control for these factors and explore how they interact with observable SWASH infrastructure.

Descriptive statistics

This section uses simple descriptive statistics to describe the key relations between SWASH infrastructure and child schooling, which will be analysed in more depth below. Table 4 starts by presenting the summary statistics for the key indicators used in the analysis. The indicators that come from the EMIS database have been marked with a superscript ‘a’. All numbers refer to averages for children aged 6–20 years in the HBS 2017/18 who could be matched with the EMIS database. They may therefore differ slightly from administrative figures drawn directly from the EMIS database, which includes more schools, and are unweighted.

The descriptive statistics section uses the variables describing boys/girls per male/female toilet (from the EMIS), as well as whether the school has water all year round. The econometric analysis and micro-simulations section uses all the variables listed in Table 4. All sections use information on area of residence, level of education, and child age, as well as type of school (public/private) to match children to schools.

Figure 2 provides a first descriptive attempt at quantifying the importance of SWASH for girls' schooling. It shows, for each age from 6 to 16, the percentage of children ever enrolled in school who are still in school or have completed primary (grade 7). The analysis shows that girls are more likely than boys to complete primary in schools with fewer pupils per toilet (graph B), but not in those with insufficient facilities (graph A).
Figure 2

Percent of children aged 6–16 years ever enrolled in school who are still in school or have completed primary (grade 7). (a) More than 50 boys per male toilet or more than 50 girls per female toilet. (b) Less than 50 boys per male toilet and less than 50 girls per female toilet. Source: Author's calculations based on EMIS 2022, and HBS 2017/2018.

Figure 3

Percent of children aged 12–20 years ever enrolled in school who are still in school or have completed secondary (form IV). (a) No water or water for only part of the year. (b) Water all year round. Source: Author's calculations based on EMIS 2022, and HBS 2017/2018.


In schools with improved SWASH facilities (graph B), the attendance gap between boys and girls starts building from age 12 and increases gradually over the years, as boys drop out of school at a higher rate than girls. By age 16, close to a quarter (23.1%) of once-enrolled boys have dropped out of school without completing grade 7 of primary school, whereas 84.4% of girls aged 16 are still enrolled or have completed primary school. The difference in school attendance/completion between boys and girls is statistically significant at the 5% level for the group of children aged 12–16 as a whole.

When looking at children who were matched to schools with insufficient toilets (graph A), on the other hand, the female advantage completely disappears: by age 16, 79% of girls are still in school or have completed grade 7. This is 5 percentage points less than for girls matched to schools that meet the minimum threshold of toilets per pupil, and it is statistically indistinguishable from the attendance/completion rate for boys of the same age (77.9%).

At the secondary level, similar patterns can be observed, although it was not possible to specifically study the role of toilets, as the EMIS database does not contain information on toilets for secondary schools (see Figure 3).

At this age, there is no visible female advantage, as girls are burdened by many additional disadvantages that are well documented in the literature, such as teenage pregnancies, early marriage and household chores (see Shahidul & Karim 2015 and Rosenberg et al. 2015). However, the gender gap in secondary completion is relatively moderate for schools that have access to water all year round (graph B). In these schools, 82.9% of ever-enrolled females aged 20 are still enrolled in secondary schools or have already completed form IV. This is 7 percentage points less than for males of the same age (89.8%) matched to these same schools. The difference is not statistically significant.

In schools that do not have access to water all year round (graph A), only 68.7% of ever-enrolled females aged 20 are still enrolled or have completed secondary school. This is almost 13 percentage points lower than for males of the same age (81.5%). The difference between girls and boys matched to these schools is statistically significant at the 15% level for the group aged 15–20.
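The gender gaps quoted in this section can be recomputed directly from the reported rates. The short Python sketch below is only an illustration of that arithmetic, not part of the original analysis. The group labels are hypothetical names, and the boys' rate for adequately equipped primary schools is assumed to be the complement of the 23.1% drop-out share quoted above.

```python
# Shares (in %) still enrolled or having completed the level, as quoted in the text.
# Assumption: boys in graph B of Figure 2 = 100 - 23.1 (complement of the drop-out share).
rates = {
    "primary_age16_adequate_toilets": {"girls": 84.4, "boys": 100 - 23.1},
    "primary_age16_insufficient_toilets": {"girls": 79.0, "boys": 77.9},
    "secondary_age20_water_all_year": {"girls": 82.9, "boys": 89.8},
    "secondary_age20_no_year_round_water": {"girls": 68.7, "boys": 81.5},
}


def gender_gap(group):
    """Girls' rate minus boys' rate, in percentage points (positive = female advantage)."""
    return round(group["girls"] - group["boys"], 1)


gaps = {label: gender_gap(group) for label, group in rates.items()}

for label, gap in gaps.items():
    print(f"{label}: {gap:+.1f} percentage points")
```

A positive value indicates a female advantage. Under these inputs the advantage is visible only in primary schools that meet the toilet threshold, and the female deficit at secondary level is roughly twice as large where water is not available all year round.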

Regression analysis

Table 5 below presents the multivariate regression results for the indicators of interest, related to SWASH infrastructure (see Supplementary material for full regression results). The regression results indicate that the ratio of girls per toilet is the factor most strongly associated with girls' effective years of education. Based on these regression results, a 0.1 point increase in the ratio of toilets/girl, which would correspond to a decrease in the number of girls per toilet from the current 73.2 to 63.2, could lead to girls staying in school between 1.2 and 1.3 more (effective) years, on average. The result is statistically significant at the 5% level for both versions of the OLS model and at 10% for the Heckman model. The fact that coefficients are similar across models suggests that the availability of SWASH infrastructure in general, and girls' toilets in particular, might already be part of the factors that households consider when deciding whether to send girls to school at all.
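The range of 1.2 to 1.3 additional years follows from multiplying the girls' toilets coefficients reported in Table 5 by the 0.1 point change in the ratio. The sketch below reproduces that step for the three models in which the coefficient is statistically significant. It assumes a simple linear reading of the coefficients, and the variable names are illustrative.

```python
# Coefficients on "Girls toilets" for the girls' models, copied from Table 5
# (effective years of education per unit change in the toilets/girl ratio).
girls_toilet_coef = {
    "OLS (all girls)": 12.839,
    "OLS (ever attended)": 13.319,
    "Heckman": 12.331,
}

delta_ratio = 0.1  # change in the toilets/girl ratio discussed in the text

# Linear approximation: implied change in effective years = coefficient * change in ratio.
implied_gain = {
    model: round(coef * delta_ratio, 2) for model, coef in girls_toilet_coef.items()
}

for model, gain in implied_gain.items():
    print(f"{model}: {gain:.2f} additional effective years")
```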

Table 5

Coefficients on variables of interest (dependent variable: effective years of education)

| Type | Indicator | Girls: OLS (all girls) | Girls: OLS (ever attended) | Girls: Heckman | Girls: Tobit | Boys: OLS (all boys) | Boys: OLS (ever attended) | Boys: Heckman | Boys: Tobit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Access (Primary) | Improved toilets | 0.186 | 0.268 | 0.258 | 0.326 | 0.377 | 0.01 | 0.062 | 0.589** |
| | Improved water | −0.121 | −0.09 | −0.07 | −0.145 | −0.4** | −0.403** | −0.303** | −0.486** |
| Type of toilet (Primary) | Boys toilets | −9.005 | −10.406 | −8.816 | −7.871 | −3.97 | −4.755 | −5.189 | −4.414 |
| | Girls toilets | 12.839** | 13.319** | 12.331* | 12.009 | 4.898 | 8.497 | 8.531 | 4.196 |
| | Teachers toilets | 0.212* | 0.119 | 0.131 | 0.281* | 0.384*** | 0.317** | 0.319** | 0.492*** |
| Infrastructure (Primary) | No septic tank | 0.051 | 0.084 | 0.087 | 0.041 | 0.035 | 0.08 | 0.074 | −0.052 |
| | Water all year | 0.158 | 0.072 | 0.062 | 0.169 | 0.363*** | 0.267* | 0.209 | 0.434*** |
| Secondary | Improved water | 0.388 | 0.073 | 0.039 | 0.431 | 0.519** | 0.186 | −0.02 | 0.604** |
| | Water all year | −0.046 | −0.182 | −0.176 | −0.063 | 0.165 | 0.155 | 0.115 | 0.087 |
| | Obs. | 6675 | 5118 | 7185 | 6775 | 6777 | 4964 | 7226 | 6777 |
| | R²/F-stat. | 0.6067 | 0.7181 | 154.2 | 153.3 | 0.5702 | 0.6661 | 116.7 | 157.5 |

Source: Author's calculations based on EMIS 2022, and HBS 2017/2018. Statistical significance: * = 10%, ** = 5%, *** = 1%.

For boys, there is a wider range of factors that seem to be important. At the primary level, the most important factor appears to be the availability of separate teacher toilets, which is associated with up to 0.49 additional years of schooling for boys. This result is significant at the 5% level or more in all models. Further qualitative analysis would be required to assess whether teachers' toilets are important in their own right, or are a marker for some other feature that might be critical for boys' decision to stay in school.

Access to water all year round is associated with between 0.21 and 0.43 additional years of effective education at the primary level. This factor appears to be more important than whether the water source is improved or not, which has a negative coefficient at the primary level. This might, for instance, be the case if water is primarily used for sanitation (flushing) and hand washing, rather than drinking. Indeed, access to improved toilets has a positive coefficient in all models, although it is only statistically significant at the 5% level in one case (the Tobit model).

At the secondary level, access to an improved water source comes out as positive and significant in two of the models (Tobit and full OLS for all boys), and as having no effect or a statistically insignificant effect in the other two (Heckman and censored OLS, excluding boys who never attended school). One possible interpretation of this result could be that access to improved water affects the decision on whether or not to start school, but is not decisive for continued enrolment of boys who are already in secondary school.

A similar pattern appears to be visible for girls (0.4 additional years for Tobit and full OLS vs. 0.04–0.07 for Heckman and censored OLS), although none of the coefficients are statistically significant in that case. One would expect the availability of proper and separate sanitation facilities to be more important for girls at the secondary level (see Crankshaw et al. 2020). Unfortunately, we are not able to test this hypothesis, as the EMIS database does not include data on sanitation at the secondary level.

Microsimulations

This section looks at the implications of investing in SWASH infrastructure, based on the coefficients estimated in the previous section. Table 6 shows the results of the microsimulations under the various scenarios. Under the full investment scenario, boys are expected to complete 8.15 effective years of schooling by age 20 in urban areas and 6.4 years in rural areas. This is respectively 0.42 and 0.32 more effective years of schooling than they would be expected to achieve under the current state of SWASH infrastructure.

Table 6

Predicted average years of effective schooling at age 20, by investment scenarios

| Sex | Scenario | Rural (years) | Rural (diff. from full investment) | Urban (years) | Urban (diff. from full investment) |
| --- | --- | --- | --- | --- | --- |
| Male | Current SWASH | 6.08 | −0.32 | 7.73 | −0.42 |
| | Full investment | 6.40 | | 8.15 | |
| | No heavy infrastructure | 6.06 | −0.34 | 7.77 | −0.38 |
| | No separate toilets | 5.59 | −0.81 | 7.30 | −0.85 |
| | No SWASH | 5.56 | −0.84 | 7.16 | −0.99 |
| Female | Current SWASH | 6.42 | −0.33 | 7.86 | −0.37 |
| | Full investment | 6.75 | | 8.23 | |
| | No heavy infrastructure | 6.65 | −0.10 | 8.13 | −0.10 |
| | No separate toilets | 5.41 | −1.34 | 7.36 | −0.87 |
| | No SWASH | 5.29 | −1.46 | 7.08 | −1.15 |

Source: Author's calculations based on EMIS 2022, and HBS 2017/18. Obs.: 6,675 (girls), 6,777 (boys).

Expected achievements for girls are higher than for boys, especially in rural areas. This reflects the female advantage that was identified in Figure 2, which appears to be dependent on, or at least strongly correlated with, the availability of SWASH infrastructure. For girls, full investments are expected to improve effective schooling by 0.37 and 0.33 years in urban and rural areas, respectively.

The simulations confirm the importance of SWASH infrastructure to enable the ‘female advantage’. Indeed, under the ‘no SWASH’ scenario, girls are expected to achieve fewer years of effective schooling than boys by age 20, especially in rural areas (0.27 and 0.08 years less than boys in rural and urban areas, respectively).

The simulations also confirm that the availability of separate toilets for girls is an essential determinant of girls' schooling. In the scenario without separate girls' toilets, girls would be expected to achieve 1.34 and 0.87 fewer years of effective schooling in rural and urban areas, respectively, than under the full investment scenario, almost all of which is attributable to the lack of gender-separated toilets. For boys, the loss of schooling under this scenario would be smaller (−0.81 and −0.85 years in rural and urban areas, respectively), with a greater share being attributable to non-gender-specific SWASH infrastructure, such as septic tanks and availability of year-round water (−0.34/−0.38 years in rural/urban areas).
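The differences discussed in the preceding paragraphs can be checked against the predicted years reported in Table 6. The following sketch recomputes each scenario's shortfall relative to full investment, as well as the gender gap under the ‘no SWASH’ scenario. The dictionary keys and function names are hypothetical labels chosen for this illustration; the figures themselves are copied from Table 6.

```python
# Predicted effective years of schooling at age 20, copied from Table 6.
table6 = {
    ("male", "rural"): {"current": 6.08, "full": 6.40, "no_heavy": 6.06,
                        "no_separate": 5.59, "no_swash": 5.56},
    ("male", "urban"): {"current": 7.73, "full": 8.15, "no_heavy": 7.77,
                        "no_separate": 7.30, "no_swash": 7.16},
    ("female", "rural"): {"current": 6.42, "full": 6.75, "no_heavy": 6.65,
                          "no_separate": 5.41, "no_swash": 5.29},
    ("female", "urban"): {"current": 7.86, "full": 8.23, "no_heavy": 8.13,
                          "no_separate": 7.36, "no_swash": 7.08},
}


def diff_vs_full(sex, area, scenario):
    """Years of effective schooling relative to the full investment scenario."""
    years = table6[(sex, area)]
    return round(years[scenario] - years["full"], 2)


def female_minus_male(area, scenario):
    """Gender gap in predicted years (negative = girls achieve fewer years than boys)."""
    return round(table6[("female", area)][scenario] - table6[("male", area)][scenario], 2)


print(diff_vs_full("female", "rural", "no_separate"))  # loss without separate toilets
print(female_minus_male("rural", "no_swash"))          # gap under the 'no SWASH' scenario
```

Comparing `female_minus_male(area, "full")` with `female_minus_male(area, "no_swash")` shows the sign of the gender gap reversing once SWASH infrastructure is removed, which is the pattern described in the text.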

It would be beyond the scope of this paper to make a detailed estimate of the possible returns on investments in SWASH infrastructure in Tanzania. What can be said, based on existing studies, is that returns to education are known to be substantial in Tanzania, with each additional year of schooling raising expected earnings by an estimated 7% (Nikolov & Jimi 2018). At 2021 levels of GDP per capita (Tsh 2.6 million p.c.), this represents around Tsh 182,000 per person per year. This means that, with its 0.38 additional years of effective education, the full investment scenario could raise expected earnings by as much as Tsh 70,000 per person per year, compared to the current situation.

By contrast, an assessment carried out by UNICEF in 2021 estimated the additional investments required to achieve international SWASH standards, corresponding to the full investment scenario above, at Tsh 338 billion per year in 2021 prices for Mainland Tanzania (UNICEF 2021). That represents slightly under Tsh 18,000 per child aged 6–20 or 0.2% of Tanzania's 2021 GDP. In other words, it is likely that the investments in SWASH would be very profitable from a macroeconomic point of view and that the additional cost could be recovered in a matter of months thanks to increased productivity once children entered the labour market.
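The back-of-envelope comparison between returns and costs can be written out explicitly. The sketch below uses only the figures quoted in the text; it is a rough illustration rather than a cost-benefit analysis. It assumes that earnings scale with GDP per capita, takes Tsh 18,000 as an upper bound for the cost per child, and ignores discounting and the delay before children enter the labour market.

```python
# Figures quoted in the text (2021 prices).
gdp_per_capita_tsh = 2_600_000        # GDP per capita, Tsh
return_per_year_of_schooling = 0.07   # earnings gain per additional year (Nikolov & Jimi 2018)
extra_effective_years = 0.38          # gain under the full investment scenario
cost_per_child_tsh = 18_000           # upper bound: "slightly under Tsh 18,000" per child

# Earnings gain associated with one additional year of schooling.
gain_per_school_year = gdp_per_capita_tsh * return_per_year_of_schooling

# Annual earnings gain associated with the full investment scenario.
annual_gain = gain_per_school_year * extra_effective_years

# Months of additional earnings needed to match one year of per-child cost (no discounting).
payback_months = cost_per_child_tsh / annual_gain * 12

print(f"Gain per additional year of schooling: Tsh {gain_per_school_year:,.0f}")
print(f"Annual gain under full investment:      Tsh {annual_gain:,.0f}")
print(f"Simple payback period:                  {payback_months:.1f} months")
```

Under these simplifying assumptions the payback period comes out at a little over three months of additional earnings per year of per-child investment, which is consistent with the "matter of months" stated above.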

This article sheds light on the critical importance of SWASH infrastructure in the context of education in Mainland Tanzania. The study addresses a gap in research by providing empirical evidence linking access to SWASH facilities to the duration of a child's education, with a particular focus on gender disparities.

The findings suggest that the availability of separate toilets for girls plays an important role in determining girls' schooling outcomes. When such facilities are lacking, girls are less likely to enrol and more likely to drop out of school, leading to a disruption in their educational journeys. This underscores the importance of addressing gender-specific needs within school infrastructure, such as menstrual hygiene management, to ensure that girls can fully engage in their education. Furthermore, the study indicates that access to improved water sources and water availability throughout the year may play a significant role in students' educational outcomes.

The results suggest that investing in SWASH infrastructure can lead to substantial improvements in effective years of schooling, thus potentially increasing future earning potential. The economic implications of these findings are noteworthy. While the investment required to achieve international SWASH standards might seem substantial, it pales in comparison to the potential economic returns estimated in this paper, due to increased productivity.

Additional research, particularly longitudinal and experimental studies, is needed to validate the causal relationships identified in this cross-sectional analysis and to address some of the limitations identified in the methodology section. Furthermore, qualitative research would be essential to explore the underlying mechanisms that explain the connections between different types of SWASH facilities and school retention rates for both boys and girls.

This research was funded by UNICEF. I would like to thank John Mfungo and other staff from the UNICEF Tanzania country office for their support and inputs on earlier drafts. I am also very grateful to the anonymous reviewers for exceptionally thorough and constructive review and advice, which improved the article.

Data cannot be made publicly available; readers should contact the corresponding author for details.

Sebastian Silva-Leander works as a consultant for the World Bank.

1. There are no internationally set standards for number of pupils per toilet. However, thresholds of 50 boys and 25 girls per gender-specific toilet have been cited in the relevant literature (see Adams et al. 2009).

2. Real monthly total food and non-food consumption per adult-equivalent per month in TSh.

Abrahams N., Mathews S. & Ramela P. (2006) Intersections of ‘sanitation, sexual coercion and girls’ safety in schools’, Tropical Medicine & International Health, 11, 751–756.

Adams J., Bartram J., Chartier Y. & Sims J. (2009) Water, Sanitation and Hygiene Standards for Schools in Low-Cost Settings. Geneva, Switzerland: World Health Organization.

Antwi-Agyei P., Mwakitalima A., Seleman A., Tenu F., Kuiwite T., Kiberiti S. & Roma E. (2017) Water, sanitation and hygiene (WASH) in schools: results from a process evaluation of the national sanitation campaign in Tanzania, Journal of Water, Sanitation and Hygiene for Development, 7 (1), 140–150.

Blanton E., Ombeki S., Oluoch G., Mwaki A., Wannemuehler K. & Quick R. (2010) Evaluation of the role of school children in the promotion of point-of-use water treatment and handwashing in schools and households–Nyanza Province, western Kenya, 2007, American Journal of Tropical Medicine and Hygiene, 82, 664–671.

Bowen A., Ma H., Ou J., Billhimer W., Long T., Mintz E., Hoekstra R. M. & Luby S. (2007) A cluster-randomized controlled trial evaluating the effect of a handwashing-promotion program in Chinese primary schools, American Journal of Tropical Medicine and Hygiene, 76 (6), 1166–1173.

DHS (2017) Tanzania Demographic and Health Survey and Malaria Indicator Survey 2015–2016. Dodoma, Tanzania: Tanzania National Bureau of Statistics (NBS) and Ministry of Health, Community Development, Gender, Elderly, and Children (MoHCDGEC).

(2018).

Freeman M. C., Greene L. E., Dreibelbis R., Saboori S., Muga R., Brumback B. & Rheingans R. (2012) Assessing the impact of a school-based water treatment, hygiene and sanitation programme on pupil absence in Nyanza Province, Kenya: a cluster-randomized trial, Tropical Medicine & International Health, 17 (3), 380–391.

McMichael C. (2019) Water, sanitation and hygiene (WASH) in schools in low-income countries: a review of evidence of impact, International Journal of Environmental Research and Public Health, 16 (3), 359. https://doi.org/10.3390/ijerph16030359.

McPhedran K., Pearson J. & Cairncross S. (2010) ‘The impact of school sanitation on girls school attendance in the Dowa District of Malawi’, Paper Presented at: 2nd All Africa Environmental Health Congress 2010.

Nikolov P. & Jimi N. (2018) ‘What factors drive individual misperceptions of the returns to schooling in Tanzania? Some lessons for education policy’, Binghamton University ORB Working Paper, Economics Series 2018. Mimeo.

Njuguna V., Karanja B., Thuranira M., Shordt K., Snel M., Cairncross S., Biran A. & Schmidt W. P. (2009) The Sustainability and Impact of School Sanitation, Water and Hygiene Education in Kenya. New York, US: London School of Hygiene and Tropical Medicine and UNICEF.

O'Reilly C. E., Freeman M. C., Ravani M., Migele J., Mwaki A., Ayalo M., Ombeki S., Hoekstra R. M. & Quick R. (2008) The impact of a school-based safe water and hygiene programme on knowledge and practices of students and their parents: Nyanza Province, western Kenya, 2006, Epidemiology & Infection, 136, 80–91.

Rosenberg M., Pettifor A., Miller W. C., Thirumurthy H., Emch M., Afolabi S. A. & Tollman S. (2015) Relationship between school dropout and teen pregnancy among rural South African young women, International Journal of Epidemiology, 44 (3), 928–936.

Sclar G. D., Garn J. V., Penakalapati G., Alexander K. T., Krauss J., Freeman M. C., Boisson S., Medlicott K. O. & Clasen T. (2017) Effects of sanitation on cognitive development and school absence: a systematic review, International Journal of Hygiene and Environmental Health, 220 (6), 917–927. https://doi.org/10.1016/j.ijheh.2017.06.010.

Shahidul S. M. & Karim A. H. M. Z. (2015) Factors contributing to school dropout among the girls: a review of literature, European Journal of Research and Reflection in Educational Sciences, 3 (2), 25–36.

Trinies V., Garn J. V., Chang H. H. & Freeman M. C. (2016) The impact of a school-based water, sanitation, and hygiene program on absenteeism, diarrhea, and respiratory infection: a matched-control trial in Mali, The American Journal of Tropical Medicine and Hygiene, 94 (6), 1418–1425. https://doi.org/10.4269/ajtmh.15-0757.

United Nations Educational, Scientific and Cultural Organization (UNESCO) (2020) Global Education Monitoring Report 2020: Inclusion and Education: All Means All. 92310038. Paris, France: UNESCO.

UNICEF (2021) A Costed Plan of Action and Investment Case for Implementation of School Water, Sanitation and Hygiene (SWASH) Services. New York, USA: UNICEF Tanzania Mimeograph.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).