Abstract

The paper presents the development and implementation of a geo-spatial model for mapping populations’ access to specified types of water and sanitation services in Nigeria. The analysis uses geo-referenced, population-representative data from the National Water and Sanitation Survey 2015, along with relevant geo-spatial covariates. The model generates predictions for levels of access to seven indicators of water and sanitation services across Nigeria at a resolution of 1 × 1 km. Overall, the findings suggest a sharp urban–rural divide in terms of access to improved water, basic water, and improved water on premises, a low availability of piped water on premises and of sewerage systems throughout the country, a high concentration of improved sanitation in select states, and low rates of nationwide open defecation, with a few pockets of high rates of open defecation in the central and southern non-coastal regions. Predictions promise to hone the targeting of policies meant to improve access to basic services in various regions of the country.

This article has been made Open Access thanks to the generous support of a global network of libraries as part of the Knowledge Unlatched Select initiative.

INTRODUCTION

Until now, efforts to measure access to water and sanitation around the world have provided a certain level of aggregation at the subnational level, such as for particular government districts, but rarely do we encounter high-resolution maps for entire countries. Using survey data to map particular indicators is difficult for a number of reasons. First, the actual location of the surveyed establishment is usually unavailable. Second, due to cost constraints, and to ensure representativeness, surveys typically use cluster-based sampling techniques, which make the distribution of observations uneven across a given area. The absence of reliable, granular, evenly distributed, geo-referenced data makes it difficult to accurately compare water, sanitation, and hygiene (WASH) access across a country, or to identify those areas in greatest need of investment.

The poor provision of safe, accessible water and sanitation services in Nigeria has commensurate public health and economic impacts. Evidence from Nigeria has shown that those sectors of the population with the worst water, sanitation, and hygiene conditions are also the ones most at risk of contracting diseases due to inadequate health (Andres et al. 2017). Approximately 73 percent of the Global Burden of Disease (GBD) enteric burden estimated for the country is associated with inadequate WASH, and this burden is disproportionately borne by poorer children and those in vulnerable geographic areas (Andres et al. 2017). (The GBD is a common measure for estimating the health burden and risk factors of diseases.)

A recent, nationwide multi-sector assessment — the 2015 National Water and Sanitation Survey (NWSS) — undertaken by the Federal Ministry of Water Resources (FMWR) of Nigeria, with support from the World Bank, provides uniquely detailed information on access to WASH in the country. The NWSS consists of a nationally representative household survey, of 201,842 households, covering access to safe water and sanitation, a national spatial inventory of 89,721 water points and 5,100 water schemes, and a survey on access to WASH services in over 50,000 public facilities, including health and educational centers (see Andres et al. (2017) for more information on the NWSS).

The model presented here makes use of the NWSS household survey, as well as the surveys on water points and water schemes, all of which include geo-locational data. (All surveyed households and water service points were geo-referenced in the surveys to provide latitude and longitude coordinates. Water schemes were also geo-referenced using their centroid location, although it should be noted that in many cases, these schemes occupy a significant area and so, the use of a single central location is a potentially crude approximation of their true spatial extent and coverage.) These data present an unprecedented opportunity to use geo-spatial models to analyze, at a detailed level, the geographical characteristics of access to safe water and sanitation across the country.

In sectors outside WASH, many household and facility surveys now include geo-locational information (e.g., the latitude and longitude of survey clusters, recorded via a Global Positioning System device at the time of the survey, or linked to spatial administrative boundary data). Spatial, statistical modeling approaches are being developed by exploiting this locational information to generate mapped surfaces of indicators of interest at increasingly fine spatial scales, and with greater precision than was previously possible. Central to many of these approaches is a body of theory known as model-based geo-statistics (MBG) (Diggle et al. 1998; Diggle & Ribeiro 2007). MBG has been successfully applied to point-located survey data to create a wide range of maps, including, for example, mapping malaria prevalence (Gething et al. 2011, 2012) and poverty (World Bank 2016).

The use of MBG approaches to generating interpolated surfaces is perhaps most established in the field of infectious diseases (Gemperli et al. 2004; Noor et al. 2009, 2012, 2013; Gething et al. 2012; Gosoniu et al. 2010, 2012; Reid et al. 2010; Riedel et al. 2010; Elyazar et al. 2011, 2012; Giardina et al. 2012; Raso et al. 2012; Bennett et al. 2013). In that context, geolocated data on disease prevalence are a direct analogue of the water and sanitation indicators addressed in the current work – both simply describe the proportion of the population meeting a given criterion at a survey location.

The availability of open-access, high quality, standardized and geolocated data on a wide range of social and demographic population indicators via initiatives such as the Demographic and Health Survey (DHS) Program funded by the United States Agency for International Development (USAID) and the United Nations Children's Fund (UNICEF) Multiple Indicator Cluster Survey (MICS) program has, in recent years, led to the application of MBG approaches to a much wider set of indicators (DHS Spatial Interpolation Working Group 2014; Gething & Molini 2015; Burgert-Brucker et al. 2016). These have included health outcomes (stunting in children and anemia in women); access to health interventions (insecticide-treated bed nets, contraception, childhood vaccinations, attended births, and antenatal care); literacy rates; tobacco use, etc.

In recent years, these methods have also been developed and applied in the context of mapping poverty rates. This includes the use of World Bank Living Standards Measurement Survey data, among other sources, to create high resolution poverty maps in Sierra Leone (Gething & Rosas 2015a), Tanzania (Gething & Rosas 2015b), Democratic Republic of Congo (Gething & Adoho 2015), Afghanistan (Gething & Pop 2015) and Nigeria (Gething & Molini 2015).

The availability of the NWSS 2015 data makes it possible to extend the MBG approach to mapping local populations’ access to water and sanitation services, and their proximity to the nearest functioning water source, in Nigeria. The high level of granularity resolved in the mapped outputs can improve our understanding of inequalities in access levels between and within the different regions of the country.

DATA

National Water and Sanitation Survey 2015

Data on access to WASH variables come from the 2015 NWSS household survey. The household survey was conducted by the FMWR, which interviewed 201,842 households across 36 states in Nigeria (Figure 1). (The NWSS surveyed an average of 22 randomly selected households in each of the 8,800 wards in Nigeria. See a more detailed description in Andres et al. (2017).) The survey asked questions relating to respondents’ access to water and sanitation services, and their use of water and sanitation infrastructure. It also included questions on household expenditure, health, and hygiene.

Figure 1

Map showing geo-positioned data from the 2015 National Water and Sanitation Survey on surveyed households (left) and water service points and schemes (right).

From the NWSS household survey, we were able to construct seven access to WASH indicators, informed by the Sustainable Development Goals (SDGs) (WHO/UNICEF 2015). These indicators are: (1) access to improved water, (2) access to basic water, (3) access to improved water on premises, (4) access to piped water on premises, (5) lack of access to fixed-point sanitation (also known as open defecation), (6) access to improved sanitation, and (7) access to sewerage connection, with definitions as follows:

  • (1)

    Improved water sources are those which, by the nature of their construction and when properly used, are adequately protected from outside contamination, particularly fecal matter. Such sources include piped water to yards/plots, public taps or standpipes, tube wells or boreholes, protected springs, and rainwater.

  • (2)

    Basic water satisfies the requirements of ‘improved water’ and additionally requires a round-trip collection time of no more than 30 minutes.

  • (3)

    Improved water on premises fulfills the same requirements as basic water, but further implies that the water is available directly on household premises. (The global SDG indicator for water is defined as the ‘percentage of population using safely managed drinking water services,’ and covers those improved drinking water sources that are (1) located on premises, (2) available when needed, and (3) compliant with fecal and priority chemical standards. Unfortunately, at the time the FMWR commissioned data collection for the NWSS, this SDG indicator had not yet been defined, so we did not include access to safely managed water in the MBG model.)

  • (4)

    Piped water on premises fulfills the same requirements as improved water on premises, but is provided through pipes.

  • (5)

    Fixed-point sanitation involves a pit or other containment structure, regardless of the quality of the structure or whether it is hygienically maintained. While it includes both improved and unimproved facilities, it stands in contrast to open defecation, which is defined as not having access to any type of toilet.

  • (6)

    An unshared improved sanitation facility, an indicator of improved sanitation, is one that hygienically separates human excreta from human contact and is not shared with any other household. (The global SDG indicator for sanitation, ‘percentage of population using safely managed sanitation services,’ implies the use of an improved sanitation facility that is not shared with other households, and where excreta are safely disposed of on site or transported and treated offsite. Unfortunately, at the time the FMWR commissioned data collection for the NWSS, this indicator had not yet been defined, so data about excreta disposal or treatment were not collected.)

  • (7)

    Sewerage implies that an improved sanitation facility is connected to a sewer system.

Geo-spatial covariates and population data

In addition to the NWSS's outcome data on the indicators of interest, a second category of data used for analysis was a suite of geo-spatial covariates that may be correlated with the indicators of interest, and thus partially explain observed spatial variation, allowing for more accurate predictions across each map. Geo-spatial covariates are gridded spatial data: each grid cell (or pixel) contains the value of a particular property. An initial set of spatial covariates was identified as potentially useful predictors of water and sanitation access levels, based on previous attempts to predict poverty in Nigeria (Gething & Molini 2015). This set of covariates is presented in Figure 2 and consists of (1) a vegetation index, (2) aridity, (3) land-surface temperature, (4) brightness of nighttime lights, and (5) estimated travel time to the nearest functioning water source. The spatial covariates may be described as follows:

  • (1)

    Vegetation index (Figure 2(a)): NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) (http://modis.gsfc.nasa.gov/) generates high-resolution satellite imagery on various measures of environmental conditions. This includes the enhanced vegetation index (EVI), which measures reflectance in the green and red parts of the visible spectrum to provide a relative measure of the density of photosynthesizing vegetation in each pixel. These data were preprocessed to provide average values for the year 2015 in each 1 × 1 km pixel.

  • (2)

    Aridity (Figure 2(b)): The Consultative Group for International Agricultural Research (CGIAR) Consortium maintains high-resolution global raster climate data related to evapotranspiration processes and a rainfall deficit for potential vegetative growth. These are based on data from the WorldClim project (Hijmans et al. 2005) and ultimately from weather station data interpolated using covariates such as altitude (http://csi.cgiar.org/Aridity/) (Trabucco & Zomer 2009).

  • (3)

    Land surface temperature (Figure 2(c)): NASA's MODIS also generates high-resolution satellite imagery on land surface temperature.

  • (4)

    Brightness of nighttime lights (Figure 2(d)): This information comes from the Defense Meteorological Satellite Program Operational Linescan System's (DMSP OLS's) annual composite satellite data for nighttime lighting in 2009 (https://ngdc.noaa.gov/eog/). These data allow regions to be differentiated by the density of their population and also the degree of the electrification of their dwellings, commercial and industrial premises, and infrastructure.

  • (5)

    Estimated travel time to nearest functioning water source (Figure 2(e)): This covariate was created for the current study by first creating a ‘friction surface’ that estimates the time required to traverse each 1 × 1 km pixel across Nigeria. This varies according to the type of land cover, topography, and the layout of the road and the wider transport network across the country. The friction surface was then used in a least-cost path algorithm to estimate the likely travel time from the center of each 1 × 1 km pixel to the nearest functioning improved water source (such as a well, bore hole, or pump). The latitude and longitude, as well as the level of functionality, of every such water point and water scheme in Nigeria were recorded as part of the NWSS 2015.

  • (6)

    A final category of data used in the analysis was a gridded map of estimated population density across Nigeria (Figure 2(f)) constructed from satellite-derived settlement maps and available census data as part of the AfriPop project (www.afripop.org) (Linard et al. 2012). An alternative population grid, from the Global Rural Urban Mapping Project (http://sedac.ciesin.columbia.edu/data/set/grump-v1-population-density) (GRUMPv1, 2011), was also investigated. These gridded population surfaces were not used as covariates but were used to calculate population-weighted mean and count estimates for the various modeled indicators.
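
The travel-time covariate in item (5) can be illustrated with a minimal sketch. It assumes a small hypothetical friction grid (minutes needed to cross each pixel), 4-connected movement, and a step cost equal to the mean friction of the two pixels involved; none of these choices is taken from the study, and the function name `travel_time_to_nearest_source` is illustrative only. A multi-source Dijkstra search then gives the least-cost time from every pixel to its nearest source.

```python
import heapq

def travel_time_to_nearest_source(friction, sources):
    """Least-cost travel time from every pixel to its nearest source pixel.

    friction[r][c]: assumed minutes needed to cross pixel (r, c).
    sources: list of (row, col) pixels holding a functioning water point.
    Moving between 4-connected neighbours costs the mean of both frictions.
    """
    rows, cols = len(friction), len(friction[0])
    best = [[float("inf")] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        best[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        t, r, c = heapq.heappop(heap)
        if t > best[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                step = 0.5 * (friction[r][c] + friction[nr][nc])
                if t + step < best[nr][nc]:
                    best[nr][nc] = t + step
                    heapq.heappush(heap, (t + step, nr, nc))
    return best

# Hypothetical 3 x 3 friction surface with one water point in the top-left pixel.
friction = [
    [10.0, 10.0, 30.0],
    [10.0, 60.0, 30.0],
    [10.0, 10.0, 10.0],
]
times = travel_time_to_nearest_source(friction, sources=[(0, 0)])
```

In this toy grid the slow central pixel is bypassed: the bottom-right pixel is reached along the left and bottom edges rather than through the center.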

Figure 2

Geo-spatial covariates and ancillary data included in the analysis. (a) Mean EVI imagery derived from NASA's MODIS. (b) Aridity, derived from weather station data and maintained by the CGIAR Consortium. (c) Mean land surface temperature from NASA's MODIS. (d) Imagery of nighttime lights in Nigeria in 2009 maintained by NOAA. (e) Estimated travel time to nearest functioning water service point, as identified in the NWSS 2015. (f) Population density layer for Nigeria in 2011 maintained by the AfriPop project.

Defining and implementing a standardized grid format

The geo-spatial data sources described above were obtained in a variety of spatial resolutions and geographic extents. The land-sea templates inevitably varied, so the precise definition of coastlines, and the inclusion or exclusion of small islands and peninsulas, was not consistent. These factors precluded the direct use of these data in a single spatial model. To overcome these incompatibilities and generate a fully standardized suite of input grids on an identically defined geographic template, a processing chain with the following stages was developed. First, each input data source was re-projected, where necessary, using a standardized equirectangular Plate Carrée projection under the World Geodetic System 1984 coordinate system. Second, where input grids were defined at differing spatial resolutions, they were re-sampled to 1 × 1 km. Third, grids were either extended or clipped to match a standardized extent. Fourth, a bespoke algorithm was developed that compared each rectified and re-sampled grid to a ‘master’ land-sea template for Nigeria and used a simple interpolation and/or clipping procedure to align new grids to this master template, thus ensuring that the entire coastline was perfectly consistent on a pixel-by-pixel basis.
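
The resampling and template-alignment stages of this chain can be sketched as follows. This is a simplified illustration, not the bespoke algorithm used in the study: it assumes nearest-neighbour resampling, represents missing values as `None`, and fills land pixels that lack data with the mean of their neighbouring values. The function names and the toy grids are hypothetical.

```python
def resample_nearest(grid, out_rows, out_cols):
    """Nearest-neighbour resampling of a 2-D grid to a new shape."""
    in_rows, in_cols = len(grid), len(grid[0])
    return [
        [grid[r * in_rows // out_rows][c * in_cols // out_cols]
         for c in range(out_cols)]
        for r in range(out_rows)
    ]

def align_to_master(grid, master_land):
    """Clip a grid to a master land-sea template and fill gaps on land.

    master_land[r][c] is True for land pixels. Sea pixels become None;
    land pixels without data take the mean of neighbouring input values.
    """
    rows, cols = len(master_land), len(master_land[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not master_land[r][c]:
                continue  # sea in the master template: clipped
            if grid[r][c] is not None:
                out[r][c] = grid[r][c]
                continue
            vals = [
                grid[nr][nc]
                for nr in range(max(0, r - 1), min(rows, r + 2))
                for nc in range(max(0, c - 1), min(cols, c + 2))
                if grid[nr][nc] is not None
            ]
            out[r][c] = sum(vals) / len(vals) if vals else None
    return out

# Hypothetical coarse covariate (2 x 2) and master template (4 x 4).
coarse = [[1.0, None], [3.0, 4.0]]
master_land = [
    [True, True, True, False],
    [True, True, True, False],
    [True, True, True, True],
    [False, True, True, True],
]
fine = resample_nearest(coarse, 4, 4)
aligned = align_to_master(fine, master_land)
```

After alignment, every grid has values exactly where the master template has land and nowhere else, which is the property the processing chain is designed to guarantee.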

METHODOLOGY

Model-based geo-statistics

The predictive approach used in this study to generate fine-scale maps of each water and sanitation indicator across Nigeria was based on a body of statistical theory known as MBG. In an MBG framework, the observed variation in cluster-level indicator values is explained by a combination of the following four components:

  • (1)

    A sampling error, which can often be large given the small sample sizes of individual clusters, is represented using a standard sampling model (e.g., a binomial model where cluster-level data consist of the number of households with access out of the total number sampled).

  • (2)

    Some non-sampling variation can often be explained using fixed effects, whereby a multivariate regression relationship is defined by linking the dependent indicator variable with a suite of geo-spatial covariates.

  • (3)

    An additional non-sampling error not explained by the fixed effects is usually spatially auto-correlated, and this is represented using a random effect component. A spatial multivariate normal distribution known as a Gaussian Process is employed, parameterized by a spatial covariance function.

  • (4)

    Finally, any remaining variation not captured by these components is represented using a simple Gaussian noise term, equivalent to that employed in a standard spatial linear model.

The full model output is, for every pixel on the mapped surface, a posterior distribution for the predicted indicator, representing a complete model of the uncertainty around the estimated value. These can be summarized using a point estimate (such as the posterior mean) to generate a mapped surface of the indicator value. This methodology can produce estimates for smaller spatial units than other methodologies, such as small area estimation (Blankespoor & van der Weide 2017).

Formal description of the model structure

MBG models are a class of generalized linear mixed model, with an approximation of a multivariate normal random field (i.e., a Gaussian Process) used as a spatially auto-correlated random effect term. Each indicator $P(x_i)$ (the proportion of individuals with access to the specified water/sanitation services) at each location $x_i$ in Nigeria for the year 2015 was modeled as a transformation $g(\cdot)$ of a spatially structured field $f(x_i)$ superimposed with additional random variation $\epsilon(x_i)$. The count of individuals with access, $N_i^{+}$, from the total sample of $N_i$ in each survey cluster was modeled as a conditionally independent binomial variate, given the unobserved underlying value $P(x_i)$. The spatial component was represented by a stationary Gaussian process $f(x_i)$, with mean $\mu$ and covariance $C$. The unstructured component $\epsilon(x_i)$ was represented as Gaussian with a zero mean and variance $V$. Both the inference and prediction stages were coded using the Integrated Nested Laplace Approximation (INLA) framework, primarily in the R programming language.

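The mean component, $\mu$, was modeled as a linear function of the $n$ geo-spatial covariates, $\mu = \beta^{T} k(x)$, where $k(x) = (1, k_1(x), \ldots, k_n(x))$ was a vector consisting of a constant and the covariates indexed by spatial location $x$, and $\beta = (\beta_0, \beta_1, \ldots, \beta_n)$ was a corresponding vector of the regression coefficients. Each covariate was converted to z-scores before analysis. Covariance between spatial locations was modeled using a Matérn covariance function, given here in its standard parameterization:

$$C(d) = \sigma^{2}\, \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{\sqrt{2\nu}\, d}{\rho} \right)^{\nu} K_{\nu}\!\left( \frac{\sqrt{2\nu}\, d}{\rho} \right)$$

where $d$ is the geographical separation between two points; $\sigma$, $\nu$, and $\rho$ are parameters of the covariance function defining, respectively, its amplitude, degree of differentiability, and scale; $K_{\nu}$ is the modified Bessel function of the second kind of order $\nu$; and $\Gamma$ is the gamma function.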
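
As a small numerical illustration of a Matérn covariance, the sketch below evaluates its closed forms for half-integer degrees of differentiability (0.5, 1.5, and 2.5), which avoid the Bessel function altogether. It assumes the common parameterization in which distance is scaled as sqrt(2ν)·d/ρ; the scaling convention, the parameter values, and the three cluster locations are assumptions made for illustration, not values from the fitted models.

```python
import math

def matern(d, sigma, rho, nu):
    """Matern covariance at separation d for nu in {0.5, 1.5, 2.5}.

    Assumes the parameterization with scaled distance s = sqrt(2*nu)*d/rho,
    amplitude sigma, and scale rho.
    """
    if d == 0:
        return sigma ** 2
    s = math.sqrt(2.0 * nu) * d / rho
    if nu == 0.5:
        poly = 1.0                      # exponential covariance
    elif nu == 1.5:
        poly = 1.0 + s
    elif nu == 2.5:
        poly = 1.0 + s + s * s / 3.0
    else:
        raise ValueError("closed form implemented only for nu = 0.5, 1.5, 2.5")
    return sigma ** 2 * poly * math.exp(-s)

def covariance_matrix(points, sigma, rho, nu):
    """Covariance between every pair of (x, y) locations."""
    return [
        [matern(math.dist(p, q), sigma, rho, nu) for q in points]
        for p in points
    ]

# Three hypothetical cluster locations (km) lying 5 km apart along a line.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
K = covariance_matrix(points, sigma=2.0, rho=10.0, nu=1.5)
```

The diagonal of `K` equals the squared amplitude, and covariance decays with separation, so the two clusters 10 km apart are less correlated than neighbouring clusters 5 km apart.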

Incorporation of covariates

In a standard non-spatial generalized linear model regression approach, it is necessary to undertake a formal covariate selection procedure to maximize the ultimate predictive accuracy of the model. Including too few informative covariates means that explanatory power is lost, but the inclusion of too many may result in the high-dimensional multivariate model overfitting the data, explaining noise rather than signal and, ultimately, reducing predictive accuracy. Because full geo-statistical models are extremely time-consuming to fit, a common practice has been to use simpler non-spatial models to determine the optimum covariate selection for subsequent inclusion in the full spatial modeling framework. Techniques such as stepwise variable selection are often used, whereby a covariate set is built up by progressively adding new candidate covariates to a model (forward selection) or subtracting them from an initial inclusive set (backward selection), and deciding to keep or discard each new covariate based on its impact on the model fit. These techniques are, however, known to be sensitive to the order in which variables are added or removed, and therefore risk generating arbitrary final selections.

In this study, a more novel approach has been implemented: the use of ‘regularization’ embedded within the geo-statistical model itself. In intuitive terms, this allows a large suite of candidate covariates to be entered into the main model while achieving two things. First, it allows the model to accept a small amount of bias in exchange for a large reduction in variance (a trade-off between bias and variance), greatly improving out-of-sample predictive capacity. Second, the regularizer shrinks the coefficients of the covariates, which means that the effects of collinearity are minimized, making the model more stable and robust. In formal terms, a Gaussian process prior was imposed on the likelihood, allowing regularization of the posterior mean: 
$$y = f(x) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, V)$$

$$f(x) \sim \mathcal{GP}(\mu, C)$$

$$p(f \mid y) \propto \mathcal{N}(y \mid f, V)\, \mathcal{GP}(f \mid \mu, C)$$

Here, $\mathcal{N}$ is the Gaussian probability distribution function; $\mathcal{GP}$ is the Gaussian process function; $y$ is the response; $\mu$ and $C$ are the mean and covariance functions, as defined earlier; and $\epsilon$ is the noise or error. The regularization is not just the $L_2$ distance used in conventional ridge regression but the Mahalanobis distance, which accounts for the elliptical skew due to the covariance function, thereby including all correlated effects in the regularizer. In addition to the conceptual benefits afforded by the Gaussian process prior, the possible inclusion of a priori non-linear transformations of the fixed effects was explored. However, these non-linear transformations did not lead to significant improvements over the non-transformed parsimonious model, and so the latter was retained. Model complexity was measured using the Deviance Information Criterion.
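
The link to ridge regression can be made explicit with a short derivation. It is a sketch under a simplifying assumption: a Gaussian likelihood with noise variance $V$ at the observed locations, rather than the binomial likelihood used for the survey counts. The bold vectors and the identity matrix $I$ are notation introduced here for illustration.

```latex
% Penalized least-squares form of the Gaussian-process posterior mean.
% The second term is the squared Mahalanobis distance of f from the prior mean.
\hat{\mathbf{f}}
  = \arg\min_{\mathbf{f}}
    \left\{ \frac{1}{V}\,\lVert \mathbf{y} - \mathbf{f} \rVert^{2}
      + (\mathbf{f} - \boldsymbol{\mu})^{T} C^{-1} (\mathbf{f} - \boldsymbol{\mu}) \right\}

% Setting the gradient to zero gives
\left( C^{-1} + \tfrac{1}{V} I \right)(\hat{\mathbf{f}} - \boldsymbol{\mu})
  = \tfrac{1}{V}\,(\mathbf{y} - \boldsymbol{\mu})

% and therefore
\hat{\mathbf{f}} = \boldsymbol{\mu} + C\,(C + V I)^{-1}(\mathbf{y} - \boldsymbol{\mu})
```

If $C$ were proportional to the identity matrix, the penalty would reduce to an ordinary squared Euclidean distance, which is the ridge case; a spatially structured $C$ instead penalizes departures from the prior mean according to how locations co-vary.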

Model implementation and output

Bayesian inference was implemented using the INLA algorithm to generate approximations of the marginal posterior distributions of the outcome variable at each location on a regular 1 × 1 km spatial grid across Nigeria and of the unobserved parameters of the mean, covariance function, and Gaussian random noise component. At each location, the posterior distribution was summarized using the posterior mean as a point estimate, and maps were generated of each of these metrics in ArcGIS 10.4.

Aggregation at the level of individual states and local government areas (access rate and count)

The MBG models generate predicted maps of each indicator at a 1 × 1 km resolution. While these provide the most fine-grained picture of variation in water and sanitation access across the country, it is also useful to summarize these patterns at higher levels of aggregation corresponding to the administrative unit levels at which program planning, implementation, and decision-making are carried out. For each indicator, therefore, various aggregate versions were calculated at both the level of the state (1st subnational unit) and local government area (2nd subnational unit), as follows:

  • (1)

    Mean indicator rates: These are calculated as population-weighted means of the indicator predictions across all pixels within each administrative unit and provide the best estimate of the percentage of the population within each unit that meets the criterion of each indicator (e.g., the percentage of people with access to basic water in state x).

  • (2)

    Indicator rate quintiles: Mapping the mean indicator rates allows for a comparison of the absolute level of access across administrative units. Also of interest is the relative level of access, and this is best visualized by identifying the quintile within which each administrative unit lies relative to others across the country.

  • (3)

    Indicator count: This is the sum of the population in each administrative unit that meets the criterion for the indicator. Since this metric is primarily used to help target underserved populations, a count was calculated for that fraction of the population without access to water/sanitation services (e.g., the count of people that do not have access to basic water in state x).
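
The three aggregate metrics above can be sketched in a few lines. The example assumes each pixel is supplied as a tuple of administrative unit, population, and predicted access rate; the unit names and numbers are hypothetical, and the quintile rule (rank-based, equal-sized groups) is one simple choice rather than the study's documented procedure.

```python
def aggregate(pixels):
    """Population-weighted mean rate and count without access per unit.

    pixels: iterable of (unit, population, predicted_rate) tuples, with
    predicted_rate expressed as a proportion between zero and one.
    """
    totals = {}
    for unit, pop, rate in pixels:
        t = totals.setdefault(unit, {"pop": 0.0, "with_access": 0.0})
        t["pop"] += pop
        t["with_access"] += pop * rate
    summary = {}
    for unit, t in totals.items():
        if t["pop"] == 0:
            continue  # no population: rate undefined
        summary[unit] = {
            "mean_rate": t["with_access"] / t["pop"],
            "count_without": t["pop"] - t["with_access"],
        }
    return summary

def quintiles(summary):
    """Assign each unit a quintile (1 = lowest access) by ranked mean rate."""
    ranked = sorted(summary, key=lambda u: summary[u]["mean_rate"])
    n = len(ranked)
    return {u: 1 + (i * 5) // n for i, u in enumerate(ranked)}

# Hypothetical pixels: (unit, population, predicted access rate).
pixels = [
    ("A", 100, 0.5), ("A", 300, 0.9),
    ("B", 200, 0.25),
    ("C", 100, 0.6),
    ("D", 50, 0.1),
    ("E", 10, 1.0),
]
summary = aggregate(pixels)
ranks = quintiles(summary)
```

Unit A illustrates the weighting: its two pixels have rates of 0.5 and 0.9, but because the second pixel holds three times the population, the unit-level rate is 0.8 rather than the unweighted 0.7.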

RESULTS

Model coefficients

Table 1 shows fitted coefficients for each of the fixed effects (covariates) used in the model for each water and sanitation indicator. Since these are Bayesian models, each parameter is estimated as a full posterior distribution and is summarized here via the 50th (median), 2.5th, and 97.5th percentiles. The magnitude, direction, and significance of fitted coefficients varied considerably across the different indicators. In some cases, the observed relationships matched prior expectations: for example, that access to basic and improved water was inversely correlated to an increase in travel time to the nearest water point or scheme, or that areas that were more lit up at night (thus more urban) were associated with higher access to sewerage connections and piped water on premises, and lower rates of open defecation. Others were less intuitive: for example, that improved sanitation rates were higher in areas that were less bright at night. It should be noted that although many covariates contributed in a statistically significant way to the final model fits, their interpretation is not as straightforward as in a non-spatial model, because much of the variation in observed indicator values is accounted for via the random effect component.
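
The significance rule applied to these posterior summaries is simply whether the 95% credible interval excludes zero. A minimal sketch follows, using three nighttime-lights intervals copied from Table 1; the function and variable names are illustrative.

```python
def interval_excludes_zero(lower, upper):
    """True when a credible interval lies entirely above or below zero."""
    return lower > 0 or upper < 0

# (2.5th, 97.5th) percentiles for the nighttime-lights coefficient, Table 1.
ntl_intervals = {
    "Basic water": (-0.064, 0.024),
    "Piped water on premises": (0.012, 0.072),
    "Improved sanitation": (-0.096, -0.012),
}
flags = {
    name: interval_excludes_zero(lower, upper)
    for name, (lower, upper) in ntl_intervals.items()
}
```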

Table 1

Parameter estimates for fixed effects (covariates)

| Indicator | Percentile | EVI | Aridity | LST | NTL | Time to waterpoint |
|---|---|---|---|---|---|---|
| Basic water | 2.5th | −0.699 | −2.290 | −4.184 | −0.064 | **−0.033** |
| | 50th | −0.041 | −0.765 | −2.066 | −0.020 | **−0.028** |
| | 97.5th | 0.615 | 0.758 | 0.051 | 0.024 | **−0.022** |
| Improved water | 2.5th | **−1.597** | −2.226 | −4.409 | −0.084 | **−0.050** |
| | 50th | **−0.869** | −0.506 | −2.061 | −0.036 | **−0.044** |
| | 97.5th | **−0.143** | 1.209 | 0.282 | 0.011 | **−0.039** |
| Improved water on premises | 2.5th | −0.353 | −2.181 | **−4.438** | **−0.104** | −0.005 |
| | 50th | 0.248 | −0.782 | **−2.497** | **−0.064** | 0.000 |
| | 97.5th | 0.849 | 0.611 | **−0.559** | **−0.024** | 0.005 |
| Piped water on premises | 2.5th | **1.251** | −1.699 | −2.541 | **0.012** | **0.014** |
| | 50th | **1.723** | −0.618 | −1.055 | **0.042** | **0.017** |
| | 97.5th | **2.197** | 0.456 | 0.432 | **0.072** | **0.021** |
| Open defecation | 2.5th | **4.083** | **−3.472** | −1.945 | **0.151** | **0.014** |
| | 50th | **4.728** | **−1.899** | 0.121 | **0.191** | **0.018** |
| | 97.5th | **5.373** | **−0.324** | 2.187 | **0.231** | **0.023** |
| Improved sanitation | 2.5th | −0.628 | −1.857 | −3.760 | **−0.096** | −0.005 |
| | 50th | 0.012 | −0.343 | −1.692 | **−0.054** | 0.000 |
| | 97.5th | 0.652 | 1.168 | 0.377 | **−0.012** | 0.005 |
| Sewerage connection | 2.5th | **1.908** | **−3.598** | **−5.114** | **0.013** | **0.012** |
| | 50th | **2.320** | **−2.650** | **−3.803** | **0.039** | **0.015** |
| | 97.5th | **2.731** | **−1.697** | **−2.489** | **0.064** | **0.018** |

Note: EVI, enhanced vegetation index; LST, land surface temperature; NTL, brightness of nighttime lights. In a Bayesian model, each coefficient is fitted as a probability distribution function, and this is summarized here by the median and 95% credible interval range. Coefficients statistically different from zero (‘significant’ with 95% confidence) are in bold.

Model validation

The predictive performance of the model for each indicator is assessed via out-of-sample cross-validation. A fourfold hold-out procedure was implemented whereby 25% of the data points were randomly withdrawn from the data set, the model was run in full using the remaining 75% of data, and the predicted values at the locations of the hold-out data were compared with their observed values. This was repeated four times without replacement such that every data point was held out once across the four validation runs. Standard validation statistics were computed as measures of model precision (mean absolute error), accuracy (mean square error), and linear association (correlation) between observed and predicted values.
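
The hold-out procedure and the three validation statistics can be sketched as follows. Because fitting the full geo-statistical model is beyond a short example, the sketch substitutes a deliberately trivial stand-in predictor (the mean of the training data); the helper names, the random seed, and the toy observations are all assumptions made for illustration.

```python
import math
import random

def fourfold_indices(n, seed=0):
    """Randomly partition indices 0..n-1 into four disjoint hold-out folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[k::4] for k in range(4)]

def validation_stats(observed, predicted):
    """Correlation, mean absolute error, and mean squared error."""
    n = len(observed)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    corr = cov / (so * sp) if so > 0 and sp > 0 else float("nan")
    return {"correlation": corr, "mae": mae, "mse": mse}

def cross_validate(values, fit, predict, seed=0):
    """Hold out each fold once, refit on the rest, and pool the predictions."""
    predicted = [None] * len(values)
    for held_out in fourfold_indices(len(values), seed):
        held = set(held_out)
        train = [i for i in range(len(values)) if i not in held]
        model = fit(train, values)
        for i in held_out:
            predicted[i] = predict(model, i)
    return validation_stats(values, predicted)

# Hypothetical cluster-level access proportions and a stand-in "model".
observed = [0.10, 0.80, 0.35, 0.60, 0.95, 0.20, 0.55, 0.70]
stats = cross_validate(
    observed,
    fit=lambda train, values: sum(values[i] for i in train) / len(train),
    predict=lambda model, i: model,
)
```

Swapping the two lambdas for a real fitting and prediction routine would leave the fold construction and the statistics unchanged, which is the point of separating them.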

Table 2 displays validation statistics from the fourfold out-of-sample validation procedure implemented for each predicted variable. The correlation between observed and predicted values was generally very high, exceeding 0.8 (on a scale from zero to one) for most indicators. The two exceptions were piped water on premises and sewerage connection, and here, the lower correlations can be attributed to the almost universally low observed values of these indicators, meaning that correlations were being assessed within a very small range. The estimated level of access to piped water on premises is 7% nationally in Nigeria, with state-level values ranging from 2% to 17%, and access to sewerage is 8% nationally, with state-level values ranging from 3% to 13%. (See Figures 8 and 9 and Table 3 for additional details.) Mean absolute errors, which measure the overall precision of the model (and are expressed here on the same scale as the variables themselves, i.e., a proportion between zero and one), again suggested good model performance: the average difference between observed and predicted values at each location was below 0.2 for every indicator, ranging from 0.076 to 0.185. The most precise predictions were for piped water on premises and sewerage connection, again reflecting the lack of variability in the observed data. Mean square errors, which capture overall model performance (both bias and variance), were also small, exceeding 0.05 for only one variable: improved water.

Table 2

Validation statistics summarizing performance of geo-statistical models predicting each water and sanitation variable

Variable | Correlation | Mean absolute error | Mean squared error
Basic water | 0.816 | 0.172 | 0.047
Improved water | 0.830 | 0.185 | 0.054
Improved water on premises | 0.808 | 0.142 | 0.035
Piped water on premises | 0.516 | 0.085 | 0.014
Improved sanitation | 0.815 | 0.150 | 0.039
Open defecation | 0.865 | 0.152 | 0.043
Sewerage connection | 0.241 | 0.076 | 0.009
Table 3

Indicator estimates by state and by urban/rural

Part 1
Columns: State; Population (n); then a count (n) and a percentage (%) for each of Improved source <30 min trip; Basic water; Improved sanitation; Improved water on premises.
Abia 3,840,896 1,351,488 64.81 2,048,858 46.66 2,855,037 25.67 2,737,903 28.72 
Abuja 1,639,987 767,845 53.18 587,736 64.16 1,169,365 28.70 1,074,901 34.46 
Adamawa 3,658,159 2,827,271 22.71 2,433,765 33.47 2,814,972 23.05 2,647,044 27.64 
AkwaIbom 4,833,660 2,606,639 46.07 3,082,322 36.23 4,162,963 13.88 4,025,965 16.71 
Anambra 5,589,839 2,148,965 61.56 3,472,210 37.88 4,051,635 27.52 4,248,990 23.99 
Bauchi 5,999,689 3,507,797 41.53 2,984,992 50.25 3,847,020 35.88 4,441,138 25.98 
Bayelsa 2,108,063 1,828,337 13.27 1,757,046 16.65 1,955,495 7.24 1,835,565 12.93 
Benue 5,482,011 3,498,834 36.18 4,047,631 26.17 3,946,299 28.01 4,414,458 19.47 
Borno 5,293,266 3,878,433 26.73 3,922,942 25.89 3,945,115 25.47 4,268,556 19.36 
Cross River 3,781,326 1,959,147 48.19 2,977,041 21.27 2,944,224 22.14 3,306,035 12.57 
Delta 5,210,434 3,254,215 37.54 2,703,836 48.11 4,365,551 16.22 3,755,680 27.92 
Ebonyi 2,648,901 1,303,248 50.80 1,629,913 38.47 2,298,823 13.22 2,042,658 22.89 
Edo 4,378,990 1,844,181 57.89 2,373,184 45.81 2,957,512 32.46 3,276,099 25.19 
Ekiti 3,006,775 1,010,414 66.40 2,136,236 28.95 2,306,856 23.28 2,403,850 20.05 
Enugu 4,231,441 1,916,443 54.71 3,914,032 7.50 3,870,547 8.53 3,958,482 6.45 
Gombe 3,137,075 1,976,600 36.99 1,978,444 36.93 2,088,755 33.42 2,646,840 15.63 
Imo 4,921,887 2,980,604 39.44 2,160,599 56.10 2,795,948 43.19 2,735,405 44.42 
Jigawa 5,795,239 2,087,141 63.99 1,660,060 71.35 3,663,095 36.79 2,552,402 55.96 
Kaduna 7,967,703 4,277,828 46.31 5,097,663 36.02 3,443,272 56.78 5,494,387 31.04 
Kano 12,580,898 3,924,758 68.80 7,935,198 36.93 7,683,898 38.92 9,112,765 27.57 
Katsina 7,536,593 3,436,192 54.41 4,567,210 39.40 3,917,539 48.02 5,189,303 31.15 
Kebbi 4,251,959 2,686,814 36.81 3,170,446 25.44 3,507,521 17.51 3,344,803 21.33 
Kogi 4,343,938 2,637,986 39.27 3,385,183 22.07 3,770,918 13.19 3,583,883 17.50 
Kwara 3,262,565 1,462,084 55.19 1,413,829 56.67 2,842,007 12.89 2,354,845 27.82 
Lagos 13,934,343 766,926 94.50 3,437,142 75.33 11,565,994 17.00 5,571,554 60.02 
Nassarawa 2,437,915 1,670,835 31.46 1,656,771 32.04 1,892,722 22.36 1,946,132 20.17 
Niger 5,065,664 3,286,659 35.12 3,585,811 29.21 3,622,252 28.49 4,052,999 19.99 
Ogun 4,679,294 1,638,458 64.98 2,029,481 56.63 3,984,053 14.86 2,830,465 39.51 
Ondo 4,119,647 1,799,562 56.32 2,754,668 33.13 3,495,091 15.16 3,328,290 19.21 
Osun 5,068,879 1,041,894 79.45 2,538,799 49.91 4,492,865 11.36 4,006,432 20.96 
Oyo 7,669,908 2,472,980 67.76 3,106,245 59.50 6,372,009 16.92 5,237,276 31.72 
Plateau 4,159,606 2,529,773 39.18 3,022,713 27.33 3,299,094 20.69 3,403,034 18.19 
Rivers 6,017,768 3,430,158 43.00 3,727,603 38.06 4,779,166 20.58 4,152,837 30.99 
Sokoto 4,729,577 2,510,713 46.91 3,528,244 25.40 4,082,103 13.69 3,716,723 21.42 
Taraba 2,958,207 2,270,747 23.24 2,181,099 26.27 2,353,195 20.45 2,546,058 13.93 
Yobe 3,044,649 1,857,390 38.99 1,770,576 41.85 2,321,527 23.75 2,351,669 22.76 
RURAL 91,658,217 71,450,352 22.05 61,948,160 32.41 67,976,395 25.84 71,248,951 22.27 
URBAN 91,932,232 15,653,077 82.97 45,688,487 50.30 68,960,984 24.99 60,399,736 34.30 
Part 2
Columns: State; Population (n); then a count (n) and a percentage (%) for each of Improved water; Open defecation; Piped water on premises; Sewerage connection.
Abia 3,840,896 924,175 75.94 196,399 5.11 3,706,913 3.49 3,537,670 7.89 
Abuja 1,639,987 397,114 75.79 487,510 29.73 1,368,546 16.55 1,432,688 12.64 
Adamawa 3,658,159 1,493,279 59.18 550,298 15.04 3,450,194 5.68 3,478,581 4.91 
AkwaIbom 4,833,660 2,012,590 58.36 186,175 3.85 4,713,592 2.48 4,673,963 3.30 
Anambra 5,589,839 1,711,445 69.38 523,788 9.37 5,475,650 2.04 5,380,931 3.74 
Bauchi 5,999,689 2,151,822 64.13 642,561 10.71 5,495,242 8.41 5,587,634 6.87 
Bayelsa 2,108,063 1,616,014 23.34 1,119,712 53.12 1,932,511 8.33 1,954,199 7.30 
Benue 5,482,011 3,821,075 30.30 2,201,039 40.15 4,985,516 9.06 4,949,584 9.71 
Borno 5,293,266 2,555,930 51.71 719,795 13.60 4,838,521 8.59 4,973,159 6.05 
Cross River 3,781,326 2,153,682 43.04 940,015 24.86 3,524,623 6.79 3,563,054 5.77 
Delta 5,210,434 2,187,529 58.02 1,673,257 32.11 4,639,413 10.96 4,807,651 7.73 
Ebonyi 2,648,901 1,057,714 60.07 882,353 33.31 2,535,632 4.28 2,504,235 5.46 
Edo 4,378,990 1,601,692 63.42 881,596 20.13 4,183,435 4.47 4,199,222 4.11 
Ekiti 3,006,775 1,602,580 46.70 1,082,125 35.99 2,836,546 5.66 2,847,225 5.31 
Enugu 4,231,441 3,130,921 26.01 1,545,134 36.52 4,100,783 3.09 4,079,381 3.59 
Gombe 3,137,075 1,648,427 47.45 719,888 22.95 2,835,106 9.63 2,867,789 8.58 
Imo 4,921,887 873,131 82.26 191,722 3.90 4,598,833 6.56 4,535,264 7.86 
Jigawa 5,795,239 616,745 89.36 508,761 8.78 4,935,285 14.84 5,543,890 4.34 
Kaduna 7,967,703 4,495,431 43.58 721,395 9.05 7,495,158 5.93 7,582,625 4.83 
Kano 12,580,898 5,973,517 52.52 300,057 2.39 11,928,095 5.19 12,250,127 2.63 
Katsina 7,536,593 2,839,625 62.32 398,590 5.29 7,025,578 6.78 7,239,944 3.94 
Kebbi 4,251,959 2,855,584 32.84 448,620 10.55 3,892,675 8.45 4,025,989 5.31 
Kogi 4,343,938 2,718,331 37.42 2,605,261 59.97 4,079,095 6.10 4,007,323 7.75 
Kwara 3,262,565 982,367 69.89 2,053,973 62.96 3,034,276 7.00 3,055,417 6.35 
Lagos 13,934,343 1,948,836 86.01 677,842 4.86 12,807,309 8.09 12,142,628 12.86 
Nassarawa 2,437,915 1,551,334 36.37 1,075,636 44.12 2,148,639 11.87 2,149,999 11.81 
Niger 5,065,664 2,872,740 43.29 1,430,140 28.23 4,643,945 8.33 4,700,769 7.20 
Ogun 4,679,294 1,119,021 76.09 858,435 18.35 4,348,930 7.06 4,318,968 7.70 
Ondo 4,119,647 2,307,629 43.98 1,928,455 46.81 3,881,808 5.77 3,908,442 5.13 
Osun 5,068,879 2,008,634 60.37 1,921,318 37.90 4,881,081 3.70 4,877,802 3.77 
Oyo 7,669,908 1,910,123 75.10 3,506,850 45.72 7,320,657 4.55 7,098,407 7.45 
Plateau 4,159,606 2,828,185 32.01 2,221,863 53.42 3,737,818 10.14 3,752,225 9.79 
Rivers 6,017,768 2,537,487 57.83 1,435,171 23.85 5,557,364 7.65 5,479,400 8.95 
Sokoto 4,729,577 2,578,576 45.48 495,551 10.48 4,449,031 5.93 4,598,303 2.78 
Taraba 2,958,207 1,841,355 37.75 1,271,366 42.98 2,649,075 10.45 2,645,036 10.59 
Yobe 3,044,649 893,937 70.64 470,247 15.45 2,702,438 11.24 2,828,441 7.10 
Zamfara 4,189,360 2,099,910 49.88 164,852 3.94 3,825,951 8.67 4,051,761 3.28 
RURAL 91,658,217 49,455,262 46.04 28,895,132 31.52 84,084,608 8.26 84,981,299 7.28 
URBAN 91,932,232 28,477,562 69.02 10,142,615 11.03 86,494,995 5.91 86,662,768 5.73 

Counts relate to the number of people without access to the service, whereas percentages describe the fraction with access. The only exception is open defecation where both the count and percentage relate to those practicing open defecation.
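The relationship between the two kinds of column can be checked directly against the table. The sketch below uses the Abia row of Part 2; the function names are illustrative only.

```python
def percent_with_access(n_without, population):
    """Percentage with access, given the count of people WITHOUT access."""
    return 100.0 * (1.0 - n_without / population)

def percent_practicing(n_practicing, population):
    """Open defecation: count and percentage both refer to those practicing it."""
    return 100.0 * n_practicing / population

# Abia, Table 3 Part 2: population 3,840,896
abia_population = 3_840_896
print(round(percent_with_access(924_175, abia_population), 2))  # improved water -> 75.94
print(round(percent_practicing(196_399, abia_population), 2))   # open defecation -> 5.11
```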

Model uncertainty

While the out-of-sample validation procedure provides an external check on the model's predictive performance and fit, the framework also provides an internal, model-based estimate of the uncertainty associated with the prediction in every pixel. This estimate reveals which parts of each map are more or less certain, as driven by local heterogeneities in the indicator data and the density of data points. Figure 3 presents uncertainty levels for the water indicators. The estimates for improved water, basic water, and improved water on premises show high levels of confidence in densely populated areas; where population numbers are low, the precision of the estimates is also low. From a policy perspective, this suggests that WASH interventions targeted at the most densely populated areas will also benefit from the greatest certainty. For piped water on premises, the estimates have a high level of certainty across a large proportion of the territory. The results for the sanitation indicators in Figure 4 are similar to those for water. For indicators with relatively widespread coverage, such as open defecation and improved sanitation, uncertainty is again low in areas with high population density. For access to sewerage, which averages only 5.6% across the nation, a high level of confidence is seen nationwide. (Table 3 provides a full tabulation of estimates at the state level, including the estimated count and the percentage of population for each indicator.)
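The uncertainty measure mapped in Figures 3 and 4, the width of the 95% interval of the posterior predictive distribution in each pixel, can be computed from posterior draws as in the following sketch. It assumes the draws are available as an array with one row per draw and one column per pixel; that layout and the function name are assumptions made for illustration.

```python
import numpy as np

def interval_width(samples, level=0.95):
    """Width of the central `level` interval per pixel.

    samples: array of shape (n_draws, n_pixels) holding posterior predictive
    draws of the indicator, in percent (0-100).
    """
    tail = 100.0 * (1.0 - level) / 2.0
    lower, upper = np.percentile(samples, [tail, 100.0 - tail], axis=0)
    return upper - lower

# Two toy pixels: one whose draws span 0-100 evenly, one with no spread at all.
draws = np.column_stack([np.linspace(0.0, 100.0, 101), np.full(101, 50.0)])
print(interval_width(draws))  # wide interval for the first pixel, zero width for the second
```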

Figure 3

Map showing uncertainty associated with modeled 1 × 1 km pixel level predictions of the percentage of population with access to four water service indicators. Uncertainty is quantified using the width of the posterior predictive distribution for each pixel (measured on the same scale as the indicator itself: a percentage between 0 and 100%). This is the range of values within which there is a 95% probability that the true indicator value lies, and thus, wide intervals are more uncertain and narrow intervals less uncertain. (a) Improved water. (b) Basic water. (c) Improved water on premises. (d) Piped water on premises.

Figure 4

Map showing uncertainty associated with modeled 1 × 1 km pixel level predictions of the percentage of population with different sanitation access indicators. Uncertainty is quantified using the width of the posterior predictive distribution for each pixel (measured on the same scale as the indicator itself: a percentage between 0 and 100%). This is the range of values within which there is a 95% probability that the true indicator value lies, and thus wide intervals are more uncertain and narrow intervals less uncertain. (a) Sewerage connection. (b) Improved sanitation. (c) Open defecation.

Geo-spatial modeling of basic indicators

In Figures 5–11, the results of the geo-statistical modeling exercise are presented for the seven water and sanitation indicators listed earlier. Each figure comprises three maps: (1) a detailed pixel-level map showing the predicted percentage of the population, in each 1 × 1 km pixel, with access to the indicator in question; (2) the equivalent percentage estimates aggregated at the state level; and (3) a state-level count of the population without access to the indicator (for open defecation, the count of those practicing it). Table 3 provides a full tabulation of the state-level estimates, giving the estimated count and the percentage of population for each indicator.
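One way to move from the pixel-level map to the two state-level summaries is a population-weighted aggregation, sketched below. The paper does not spell out its aggregation step, so this is an assumption: the sketch supposes a gridded population layer aligned with the prediction grid and an integer state code for every pixel, and all names are hypothetical.

```python
import numpy as np

def aggregate_to_states(pixel_pct, pixel_pop, state_id, n_states):
    """Population-weighted state summaries from pixel-level predictions.

    pixel_pct: predicted percentage with access in each pixel (0-100)
    pixel_pop: population count in each pixel
    state_id:  integer state code (0 .. n_states-1) of each pixel
    Returns (percentage with access, count without access), one value per state.
    """
    pixel_pct = np.asarray(pixel_pct, dtype=float).ravel()
    pixel_pop = np.asarray(pixel_pop, dtype=float).ravel()
    state_id = np.asarray(state_id, dtype=int).ravel()
    population = np.bincount(state_id, weights=pixel_pop, minlength=n_states)
    with_access = np.bincount(
        state_id, weights=pixel_pop * pixel_pct / 100.0, minlength=n_states
    )
    # States with no population are reported as NaN rather than dividing by zero.
    pct = np.divide(
        100.0 * with_access, population,
        out=np.full(n_states, np.nan), where=population > 0,
    )
    return pct, population - with_access
```

Weighting by population rather than averaging pixels equally matters because sparsely populated pixels would otherwise dominate the state-level figure.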

Figure 5

(main) Map showing modeled 1 × 1 km pixel level predictions of the percentage of population using improved water. Also shown are state-level estimates of (a) the percentage of people with improved water and (b) the number of people without improved water.

Figure 6

(main) Map showing modeled 1 × 1 km pixel level predictions of the percentage of population using basic water. Also shown are state-level estimates of (a) the percentage of people with basic water and (b) the number of people without basic water.

Figure 7

(main) Map showing modeled 1 × 1 km pixel level predictions of the percentage of population with improved water on the premises. Also shown are state-level estimates of (a) the percentage of people with improved water on premises and (b) the number of people without improved water on premises.

Figure 8

(main) Map showing modeled 1 × 1 km pixel level predictions of the percentage of the population with piped water on the premises. Also shown are state-level estimates of (a) the percentage of people with piped water on premises and (b) the number of people without piped water on premises.

Figure 9

(main) Map showing modeled 1 × 1 km pixel level predictions of the percentage of population with a sewerage connection. Also shown are state-level estimates of (a) the percentage of people with a sewerage connection and (b) the number of people without a sewerage connection.

Figure 10

(main) Map showing modeled 1 × 1 km pixel level predictions of the percentage of the population with improved sanitation. Also shown are state-level estimates of (a) the percentage of people with improved sanitation and (b) the number of people without improved sanitation.

Figure 11

(main) Map showing modeled 1 × 1 km pixel level predictions of the percentage of population not practicing open defecation. Also shown are state-level estimates of (a) the percentage of people not practicing open defecation and (b) the number of people practicing open defecation.

Figure 5 maps the share of population using improved water. The 1 × 1 km pixel maps reveal pronounced spatial heterogeneity, even across relatively short distances. This is partly due to urban–rural gradients: urban areas tend to have high rates of access to improved water, and rates drop off rapidly outside city limits. At the state level, rates range from just 23% (in Bayelsa) to 89% (in Jigawa). The largest concentrations of population without access to improved water are found in Kano (6.0 million), Kaduna (4.5 million), and Benue (3.8 million). Figures 6 and 7 map the share of population using basic water and improved water on premises, respectively. Unsurprisingly, estimated rates are lower for both indicators than for improved water, given their more stringent requirements. Both maps have a similar urban–rural pattern characterized by higher rates of access within and around the major urban centers (especially Lagos and Imo to the south and Kano to the north). These higher urban rates extend far less into the surrounding rural areas for basic water and improved water on premises than for improved water, leading to a more focal, concentrated urban effect.

At the state level, Enugu has the lowest rates of access to both basic water and improved water on premises (7.5% and 6%, respectively), while Lagos has the highest (75% and 60%, respectively). (Explaining why low or high access to WASH services is concentrated in some of these states requires additional information that is outside the scope of this paper. Geographic concentrations of poverty may be one driver: high levels of poverty may underlie the extremely low levels of access to improved and basic water in Enugu state. See Chapter 4 of Andres et al. (2017) for the overlap between variations in access to WASH services and poverty levels.) Interestingly, despite having the highest rates of access, the large urban states also have the largest number of people without access. The two largest populations without basic water are in Kano (8 million) and Kaduna (5 million); the largest without improved water on premises are in Kano (9 million) and Lagos (5.5 million). Figures 8 and 9 map the populations with piped water on premises and with a sewerage connection, respectively. Very few Nigerians have access to either: the maps show almost uniform, very low rates nationwide other than in a handful of pockets with some access. Even in the states with the highest access rates, only 17% of the population has piped water on premises (Abuja) and only 13% has a sewerage connection (Lagos). Only seven states have rates of 10% or more for piped water (Abuja, Plateau, Taraba, Delta, Yobe, Nasarawa, and Jigawa) and just four states have rates of 10% or more for sewerage connections (Lagos, Abuja, Nasarawa, and Taraba).
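The counts of states at or above the 10% threshold can be reproduced from Table 3 (Part 2). The sketch below transcribes the state-level percentages for piped water on premises and sewerage connection, with state names spelled as in the table; the helper function is illustrative only.

```python
# State: (piped water on premises %, sewerage connection %), from Table 3, Part 2.
ACCESS_PCT = {
    "Abia": (3.49, 7.89), "Abuja": (16.55, 12.64), "Adamawa": (5.68, 4.91),
    "AkwaIbom": (2.48, 3.30), "Anambra": (2.04, 3.74), "Bauchi": (8.41, 6.87),
    "Bayelsa": (8.33, 7.30), "Benue": (9.06, 9.71), "Borno": (8.59, 6.05),
    "Cross River": (6.79, 5.77), "Delta": (10.96, 7.73), "Ebonyi": (4.28, 5.46),
    "Edo": (4.47, 4.11), "Ekiti": (5.66, 5.31), "Enugu": (3.09, 3.59),
    "Gombe": (9.63, 8.58), "Imo": (6.56, 7.86), "Jigawa": (14.84, 4.34),
    "Kaduna": (5.93, 4.83), "Kano": (5.19, 2.63), "Katsina": (6.78, 3.94),
    "Kebbi": (8.45, 5.31), "Kogi": (6.10, 7.75), "Kwara": (7.00, 6.35),
    "Lagos": (8.09, 12.86), "Nassarawa": (11.87, 11.81), "Niger": (8.33, 7.20),
    "Ogun": (7.06, 7.70), "Ondo": (5.77, 5.13), "Osun": (3.70, 3.77),
    "Oyo": (4.55, 7.45), "Plateau": (10.14, 9.79), "Rivers": (7.65, 8.95),
    "Sokoto": (5.93, 2.78), "Taraba": (10.45, 10.59), "Yobe": (11.24, 7.10),
    "Zamfara": (8.67, 3.28),
}

def states_at_or_above(threshold, column):
    """Names of states whose percentage in `column` (0 = piped, 1 = sewerage) meets the threshold."""
    return sorted(name for name, pct in ACCESS_PCT.items() if pct[column] >= threshold)

print(states_at_or_above(10.0, 0))  # piped water on premises
print(states_at_or_above(10.0, 1))  # sewerage connection
```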

Figure 10 maps the share of the population using an improved sanitation facility. Here, the spatial pattern is rather different from the others: while rates are predominantly low throughout much of the country, the pixel-level map shows areas of much higher access across the states of Kaduna and Niger and parts of Kano and Jigawa. Interestingly, these well-served areas are not well identified in the state-level aggregate maps, highlighting the importance of looking at variations at a local-level resolution. At the state level, rates vary from 7% in Bayelsa to 57% in Kaduna. The largest absolute populations without access are found in Lagos, with 12 million without access, or around 83% of the state population, and Kano, with 8 million without access, or around 61% of the state population. (See Table 3 for additional details.) Comparing these results with Figure 9, which shows the predicted level of access to sewerage, the main difference is that improved sanitation reaches high levels of access in some areas, whereas access to sewerage is very low across all regions of Nigeria.

Finally, Figure 11 maps the share of population practicing open defecation. This is perhaps the indicator that displays the most polarization across the country: around one-third of states display very high rates of open defecation, especially in the central and southern areas, excluding the coastal regions. The remainder of the country to the north displays very low rates. At the state level, the highest rate is found in Kwara, where 63% of the population practices open defecation, while the practice is least prevalent in Kano, at just 2%.

CONCLUSION

To design targeted policies, access to geographically specific information is crucial. However, this information is usually derived from representative surveys, whose sampling techniques are meant to save on costs while ensuring the representativeness of the population but permit only a limited degree of disaggregation, so inferences cannot be extended to areas that were not sampled. Geo-spatial models can help address these limitations by generating predictions for areas where information is lacking. In this paper, we implement a model-based geostatistical (MBG) approach to predict access to specified water and sanitation services in Nigeria. Using information on households, water points, and water schemes gathered as part of the NWSS 2015, as well as an array of geo-spatial covariates, we generate layers of information for seven key indicators of access to WASH, at a spatial resolution of 1 × 1 km.

Overall, the findings suggest a sharp urban–rural divide in terms of access to improved water, basic water, and improved water on premises, a low availability of piped water on premises and of sewerage systems throughout the country, a high concentration of improved sanitation in select states, and low rates of nationwide open defecation, with a few pockets of high rates of open defecation in the central and southern non-coastal regions.

The availability of these spatially detailed estimates provides a new trove of important information to support the targeting of programs advancing water and sanitation access in Nigeria, and offers more detailed, granular estimates for tracking progress toward the SDGs.

AUTHOR CONTRIBUTION

Senior authorship is not assigned. The findings, interpretations, conclusions, and any remaining errors in this paper are entirely those of the authors. They do not necessarily represent the view of the World Bank, its executive directors, or the countries they represent.

REFERENCES

Andres
L.
,
Duret
M.
,
Mantovani
P.
,
Molini
V.
&
Ort
R.
2017
A Wake Up Call: Nigeria Water Supply, Sanitation, and Hygiene Poverty Diagnostic
.
World Bank
,
Washington, DC
,
USA
.
Bennett
A.
,
Kazembe
L.
,
Mathanga
D. P.
,
Kinyoki
D.
,
Ali
D.
&
Snow
R. W.
2013
Mapping malaria transmission intensity in Malawi, 2000–2010
.
Am. J. Trop. Med. Hyg.
89
,
840
9
.
Blankespoor
B.
&
van der Weide
R.
2017
Mapping Access to Water and Sanitation Using Small Area Estimation Methods: With Applications to Bangladesh and Nigeria
.
Unpublished manuscript
.
Burgert-Brucker
C. R.
,
Dontamsetti
T.
,
Mashall
A.
&
Gething
P. W.
2016
Guidance for Use of the DHS Program Modeled Map Surfaces
.
DHS Spatial Analysis Reports No. 14
.
ICF International
,
Rockville, Maryland, USA.
Center for international Earth Science Information Network – CIESIN – Columbia University, International Food Policy Research Institute – IFPRI, The World Bank, and Centro Internacional de Agricultura Tropical – CIAT
2011
Global Rural-Urban Mapping Project, Version 1 (GRUMPv1): Population Density Grid
.
NASA Socioeconomic Data and Applications Center
,
Palisades, NY
.
DHS Spatial Interpolation Working Group
2014
Spatial Interpolation with Demographic and Health Survey Data: Key Considerations
.
DHS Spatial Analysis Reports No. 9
.
ICF International
,
Rockville, Maryland, USA.
Diggle
P. J.
,
Ribeiro
P. J.
2007
Model-based geostatistics
. From:
Springer Series in Statistics
(
Bickel
P.
,
Diggle
P.
,
Fienberg
S.
,
Gather
U.
,
Olkin
I.
&
Zeger
S.
, eds).
Springer
,
New York
,
USA
.
Diggle
P. J.
,
Tawn
J. A.
&
Moyeed
R. A.
1998
Model-based geostatistics
.
Appl. Stat.
47
(
3
),
299
326
.
Elyazar
I. R. F.
,
Gething
P. W.
,
Patil
A. P.
,
Rogayah
H.
,
Kusriastuti
R.
&
Wismarini
D. M.
2011
Plasmodium falciparum malaria endemicity in Indonesia in 2010
.
PLoS One
6
,
e21315
.
Elyazar
I. R. F.
,
Gething
P. W.
,
Patil
A .P.
,
Rogayah
H.
,
Sariwati
E.
&
Palupi
N. W.
2012
Plasmodium vivax malaria endemicity in Indonesia in 2010
.
PLoS One
7
,
e37325
.
Gemperli
A.
,
Vounatsou
P.
,
Kleinschmidt
I.
,
Bagayoko
M.
,
Lengeler
C.
&
Smith
T.
2004
Spatial patterns of infant mortality in Mali: the effect of malaria endemicity
.
Am. J. Epidemiol.
159
(
1
),
64
72
.
Gething
P.W.
&
Adoho
F.
2015
Developing a Poverty Map for Democratic Republic of Congo
.
Report prepared for the World Bank
,
Washington, DC, USA
.
Gething
P. W.
&
Molini
V.
2015
Developing an Updated Poverty Map for Nigeria
.
Report prepared for the World Bank
,
Washington, DC
,
USA
.
Gething P. W. & Pop L. B. 2015 Developing a High Resolution Poverty Map for Afghanistan in 2011/12. Report prepared for the World Bank, Washington, DC, USA.
Gething P. W. & Rosas N. 2015a Developing a High Resolution Poverty Map for Tanzania. Report prepared for the World Bank, Washington, DC, USA.
Gething P. W. & Rosas N. 2015b Developing a Poverty Map for Targeting of Social Safety Net Programs in Sierra Leone. Report prepared for the World Bank, Washington, DC, USA.
Gething P. W., Patil A., Smith D. L., Guerra C. A., Elyazar I. R. F. & Johnston G. 2011 A new world malaria map: Plasmodium falciparum endemicity in 2010. Malar. J. 10, 378.
Gething P. W., Elyazar I. R. F., Moyes C. M., Smith D. L., Battle K. E. & Guerra C. A. 2012 A long neglected world malaria map: Plasmodium vivax endemicity in 2010. PLoS Negl. Trop. Dis. 6 (9), e1814.
Hijmans R. J., Cameron S. E., Parra J. L., Jones P. G. & Jarvis A. 2005 Very high resolution interpolated climate surfaces for global land areas. Int. J. Climatol. 25 (15), 1965–1978.
Linard C., Gilbert M., Snow R. W., Noor A. M. & Tatem A. J. 2012 Population distribution, settlement patterns and accessibility across Africa in 2010. PLoS ONE 7 (2), e31743. doi:10.1371/journal.pone.0031743.
NASA EOSDIS Land Processes DAAC, USGS Earth Resources Observation and Science (EROS) Center, Sioux Falls, South Dakota. Available at: https://lpdaac.usgs.gov (Accessed April 15, 2017).
Noor A. M., Gething P. W., Alegana V. A., Patil A. P., Hay S. I. & Muchiri E. 2009 The risks of malaria infection in Kenya in 2009. BMC Infect. Dis. 9, 180.
Noor A. M., El Mardi K. A., Abdelgader T. M., Patil A. P., Amine A. & Bakhiet S. 2012 Malaria risk mapping for control in the republic of Sudan. Am. J. Trop. Med. Hyg. 87, 1012–1021.
Noor A. M., Uusiku P., Kamwi R. N., Katokele S., Ntomwa B. & Alegana V. A. 2013 The receptive versus current risks of Plasmodium falciparum transmission in Northern Namibia: implications for elimination. BMC Infect. Dis. 13, 184.
Raso G., Schur N., Utzinger J., Koudou B. G., Tchicaya E. S. & Rohner F. 2012 Mapping malaria risk among children in Côte d'Ivoire using Bayesian geo-statistical models. Malar. J. 11, 160.
Reid H., Haque U., Clements A. C. A., Tatem A. J., Vallely A. & Ahmed S. M. 2010 Mapping malaria risk in Bangladesh using Bayesian geostatistical models. Am. J. Trop. Med. Hyg. 83, 861–867.
Trabucco A. & Zomer R. J. 2009 Global Aridity Index (Global-Aridity) and Global Potential Evapo-Transpiration (Global-PET) Geospatial Database. CGIAR Consortium for Spatial Information. Published online, available from the CGIAR-CSI GeoPortal at: http://www.csi.cgiar.org/.
WHO/UNICEF 2015 Update and MDG Assessment. WHO Press, World Health Organization, Geneva, Switzerland, p. 90. http://doi.org/10.1007/s13398-014-0173-7.2
World Bank 2016 Poverty Reduction in Nigeria in the Last Decade.