ABSTRACT
Whilst much research focusses on challenges related to achieving SDG 6.1 (universal and equitable access to safe and affordable drinking water), less attention has been paid to the challenges of safe transport, storage and use of collected water. In particular, there are relatively few high-quality datasets quantifying the number and volume of water containers used by households for such purposes. This paper reports results from the application of machine learning (ML) techniques to a database of images of domestic water storage collected during 2022 as part of an initiative to improve water supply in southern Bangladesh. Because the number of different water container types was relatively small, it was possible to train an ML algorithm to identify water containers and estimate water storage with greater than 90% accuracy. These results have allowed the rapid creation of a unique high-quality, high-resolution dataset describing water storage quantitatively in a study community. This dataset includes data quantifying the number of vessels as well as their individual and aggregated water storage volumes. The paper discusses policy implications for the study location specifically before concluding with suggestions for the inclusion of this sort of analysis in ongoing studies of household and community scale water insecurity.
HIGHLIGHTS
Machine learning can speed up the processing of images of water storage taken during conventional household water access questionnaires.
Accurate quantification of available water storage can assist in water services programme planning.
Machine learning-assisted analysis of over 800 images collected during household research in Bangladesh in 2022 showed that the average storage per person is only 2.5 L.
INTRODUCTION
Variation in size and shape of commonly used domestic water containers (Source: Staddon personal collection).
Water, sanitation and hygiene (WASH) researchers frequently use survey questionnaires to elicit data about the type, volume and quality of water storage infrastructure, including containers, at the household scale. The ability to triangulate survey-derived qualitative data with data generated through machine learning (ML) analysis of collected images offers a number of important advantages. First, there are obvious issues with self-reported data on topics like water storage, including respondents' inability to recall or express storage accurately in terms of numbers of containers and their volumes. Second, because WASH research often involves lengthy questionnaires covering a variety of subtopics, any opportunity to automate data analysis in a subtopic area could help reduce overall survey length and the resulting burden on respondent households. Third, useful information may be generated from comparing self-reported water storage with ML-derived estimates, leading to a better understanding of cultural and other biases inherent in conventional WASH research. Finally, it may be possible for ML analysis to generate data about related topics of interest, including the cleanliness of containers, the prevalence of appropriate capping/closing of openings and the appropriateness of storage locations – all important factors in water quality management.
ML-driven object detection has found wide application in other fields but has as yet been little applied to WASH research. In autonomous driving, for instance, object detection systems play a critical role in identifying pedestrians (Wu et al. 2018; Shahbaz & Jo 2021), vehicles (Diwan et al. 2023) and traffic signs, ensuring safer navigation (Wang et al. 2023a). In the retail sector, they facilitate better inventory management and enable cashier-less checkout experiences (Wang et al. 2023b). In healthcare (Wu et al. 2018), object detection aids medical image analysis, from detecting tumours in radiology scans to tracking cells in microscopy images (Chowdhury et al. 2023), and in water management, ML-based object detection has been used to track qualitative and quantitative changes in surface water features using satellite imagery (Kashtan & Hnatushenko 2024; Liu & Li 2024).
This paper introduces an artificial intelligence-powered algorithm situated at the intersection of computer vision and ML, developed specifically to assess the number and volume of water containers from photographs collected as part of conventional WASH research. The primary technical goal is to use automated object detection to correctly identify commonly used water storage containers, such as bottles, buckets and jugs. These data can be used to quickly assess the adequacy of domestic storage volumes in humanitarian and development contexts and to evaluate the impact of specific programme interventions, such as the distribution of water containers.
The subsequent sections of the paper are organized as follows:
Section 2 presents the methodology and discusses key aspects of the ML algorithm used, offering quantitative and qualitative insights into its performance.
Section 3 presents data on model accuracy in terms of precision and recall statistics.
Section 4 discusses the results of applying this algorithm to a collection of images of water storage collected in 2022 in Bangladesh.
The final section of the paper discusses some areas for future development that will be of interest to WASH researchers.
METHODOLOGY USED FOR THE PILOT
In 2022, the UK-based charity Groundwater Relief, the University of the West of England and the Asian University for Women in Bangladesh led a study of water infrastructure needs in Teknaf Upazila, located in the southernmost part of Bangladesh. In recent years, this part of Bangladesh has received more than one million Rohingya refugees fleeing government-backed persecution in neighbouring Burma (Myanmar), which has placed great strain on underdeveloped water services infrastructure. This work built on earlier studies undertaken by the project team (e.g., Akhter et al. 2020; Rafa et al. 2020) but specifically sought to capture data about the relative water insecurity experiences of host communities and refugee households across wet and dry seasons. During the household survey data collection phase, more than 800 photographs of respondent households' water fetching and storage containers were collected. These images were confirmed to contain no identifying data and were made available for model creation.
The trial algorithm employs YOLO, or 'You Only Look Once' (Jocher et al. 2022), as its object detection algorithm because of its proven efficiency and efficacy. Imagine being able to look at a photo and immediately point out the objects in it, like cars, dogs or people (water containers in this study), and doing so as fast as flipping through a large stack of pictures. This is essentially what YOLO does: it helps computers recognize and locate objects in images and videos very quickly and efficiently. YOLO is optimized for fast inference, performing object bounding and labelling in a single pass (Fan & Song 2024; Liang et al. 2024). Additionally, later versions of YOLO offer a PyTorch implementation with automatic data augmentation, hyperparameter tuning and pre-trained model support, making them highly accessible for both researchers and developers.
Imagine you have an N×N grid overlay on your photo, like a Noughts and Crosses board but with considerably more squares. YOLO splits each image into a grid, and each square of the grid is responsible for detecting objects that fall within it. This helps the system quickly identify different objects in various parts of the image. In each square of the grid, YOLO predicts two things: the probability of an object being present and the precise location of the object (by drawing a box around it). For example, if there is a water bottle in the photo, YOLO will highlight the water bottle and tell you exactly where it is. YOLO does not just detect objects; it also classifies them. This means it recognizes what each object is – whether it is a dog, a bicycle, or a water bottle. It does this by using a database of known objects (derived during model training) and matching what it sees to what it has learned. While our YOLO-based object detection model shows promising results, we acknowledge the necessity for further validation with larger and more diverse datasets before advocating for widespread practical implementation.
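For readers who wish to see what this looks like in practice, the following minimal Python sketch (illustrative only, not the exact code used in this study) loads a YOLOv5 model through PyTorch Hub and lists the objects detected in a single photograph. The weights file water_containers.pt and the image file name are hypothetical placeholders.

```python
import torch

# Load a YOLOv5 detector through PyTorch Hub (Jocher et al. 2022).
# 'water_containers.pt' is a hypothetical weights file produced by training on
# annotated container images; replace with 'yolov5s' for the generic pre-trained model.
model = torch.hub.load('ultralytics/yolov5', 'custom', path='water_containers.pt')

# Run detection on one household photograph (file name is a placeholder).
results = model('household_photo.jpg')

# Each detection carries a bounding box, a confidence score and a predicted class name.
detections = results.pandas().xyxy[0]  # columns: xmin, ymin, xmax, ymax, confidence, class, name
for _, det in detections.iterrows():
    print(f"{det['name']}: confidence {det['confidence']:.2f}, "
          f"box ({det['xmin']:.0f}, {det['ymin']:.0f})-({det['xmax']:.0f}, {det['ymax']:.0f})")
```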
For this project, the algorithm was trained on a Windows 11 PC equipped with a Core i9 3 GHz CPU, 32 GB of RAM, and a 24 GB Nvidia RTX 4090 GPU. Implementation and training were carried out using the PyTorch deep learning library (Paszke et al. 2019), with support from the Ultralytics codebase (Jocher et al. 2022). The process is straightforward to set up and can be divided into two steps: (1) annotating the dataset with the respective objects, and (2) training an ML model on the annotated dataset. A small dataset of, say, 100–150 images can be annotated in 2–3 h, and a model can then be trained on it in roughly 1–2 h. In other words, this sort of system can be set up in only 4–5 h by someone comfortable with running desk-based computer models; significant coding expertise is not required.
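The two steps described above can be sketched as follows. YOLOv5 expects one plain-text annotation file per image, each line giving a class index and a normalised bounding box, together with a small dataset description file; training is then a single command run from the Ultralytics YOLOv5 repository. The paths, file names and exact class labels below are illustrative assumptions based on the container classes in Table 1, not the project's actual configuration.

```python
from pathlib import Path

# Step 1 - annotation. Each image gets a matching .txt file with one line per object:
#   <class index> <x_centre> <y_centre> <width> <height>, all normalised to the range 0-1.
Path("labels").mkdir(exist_ok=True)
Path("labels/household_001.txt").write_text("2 0.48 0.61 0.22 0.35\n")  # e.g. one kholash near the centre

# A small dataset YAML then points at the image folders and lists the classes;
# the class labels here mirror those in Table 1.
Path("containers.yaml").write_text(
    "train: images/train\n"
    "val: images/val\n"
    "nc: 8\n"
    "names: ['Bottle 0.5', 'Bottle 1', 'Kholash 5', 'Bucket 5', 'Bucket 10', 'Jug 5', 'Jug 10', 'Drum 20']\n"
)

# Step 2 - training. From a clone of the Ultralytics YOLOv5 repository, a command such as
#   python train.py --img 640 --batch 16 --epochs 100 --data containers.yaml --weights yolov5s.pt
# fine-tunes the small (s) variant on the annotated images; the resulting weights can then be
# loaded for inference as in the earlier sketch.
```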
Evaluation was conducted by constructing a confusion matrix and computing precision (P, the ratio of true positives to all predicted positives), recall (R, the ratio of true positives to true positives plus false negatives), and mean average precision (mAP) at an intersection over union (IoU) threshold of 0.5.
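For readers unfamiliar with these metrics, the short sketch below (with invented numbers) shows how they are calculated: a predicted box counts as a true positive when its IoU with a ground-truth box of the same class is at least 0.5, and precision and recall then follow from the true positive, false positive and false negative counts.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# A prediction matching a ground-truth box of the same class with IoU >= 0.5 counts
# as a true positive at the mAP@0.5 threshold reported in Table 1.
print(round(iou((10, 10, 60, 60), (20, 15, 70, 65)), 2))  # 0.56: these two boxes would match
print(precision_recall(tp=87, fp=12, fn=15))              # illustrative counts only
```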
HOW GOOD WERE THE MODELS AT IDENTIFYING WATER CONTAINERS?
Of the five ML models assessed, YOLOv5s performed best across all types of water storage containers, achieving an overall precision of 87.7%. Table 1 presents a quantitative analysis of the five different models tested. Notably, the 'Kholash 5' container type achieved the highest precision at nearly 99%. This outcome may be attributed to the fact that the kholash is the most prevalent water container type within the dataset, meaning that the ML model had much more training data for this object type; it is, moreover, a very distinctive water container shape (Figure 1, middle panel). Conversely, the 'Bucket 5' type of container (5 L cylindrical bucket) was correctly identified only 69.7% of the time, with a higher number of false negatives leading also to a lower recall score. Images including a standard scale object (usually a 500 mL or 1,000 mL PET plastic water bottle) allowed higher rates of precision across all object classes, and both precision and recall for these bottles were very good across all models tested.
Table 1 | Precision (P), recall (R) and mean average precision (mAP) @ 0.5 intersection over union (IoU) statistics for the five YOLO models tested
Class | Instances | YOLOv5n |  |  | YOLOv5s |  |  | YOLOv5m |  |  | YOLOv5l |  |  | YOLOv5x |  |
 |  | P | R | mAP | P | R | mAP | P | R | mAP | P | R | mAP | P | R | mAP
All | 1,191 | 0.838 | 0.787 | 0.836 | 0.877 | 0.851 | 0.867 | 0.859 | 0.809 | 0.86 | 0.876 | 0.814 | 0.864 | 0.849 | 0.805 | 0.8 |
Bottle 0.5 | 70 | 0.849 | 0.886 | 0.896 | 0.898 | 0.843 | 0.95 | 0.819 | 0.9 | 0.92 | 0.94 | 0.9 | 0.945 | 0.855 | 0.871 | 0.91 |
Drum 20 | 107 | 0.888 | 0.813 | 0.882 | 0.899 | 0.911 | 0.91 | 0.929 | 0.86 | 0.91 | 0.92 | 0.86 | 0.92 | 0.883 | 0.85 | 0.888 |
Kholash 5 | 107 | 0.962 | 0.956 | 0.976 | 0.98 | 0.967 | 0.981 | 0.986 | 0.962 | 0.98 | 0.981 | 0.962 | 0.98 | 0.978 | 0.954 | 0.982 |
Bottle 1 | 367 | 0.924 | 0.787 | 0.876 | 0.902 | 0.898 | 0.92 | 0.885 | 0.806 | 0.874 | 0.903 | 0.861 | 0.903 | 0.86 | 0.815 | 0.795 |
Bucket 5 | 108 | 0.629 | 0.423 | 0.511 | 0.697 | 0.635 | 0.59 | 0.652 | 0.51 | 0.581 | 0.703 | 0.433 | 0.562 | 0.667 | 0.462 | 0.445 |
Jug 5 | 104 | 0.753 | 0.753 | 0.806 | 0.855 | 0.801 | 0.88 | 0.892 | 0.788 | 0.873 | 0.853 | 0.753 | 0.834 | 0.855 | 0.767 | 0.58 |
Jug 10 | 55 | 0.941 | 0.873 | 0.931 | 0.935 | 0.927 | 0.947 | 0.923 | 0.873 | 0.931 | 0.943 | 0.909 | 0.946 | 0.924 | 0.891 | 0.833 |
Bucket 10 | 234 | 0.761 | 0.803 | 0.813 | 0.847 | 0.829 | 0.824 | 0.785 | 0.774 | 0.809 | 0.768 | 0.833 | 0.819 | 0.767 | 0.833 | 0.804 |
The bold values show the best performance of the given metrics in a particular ‘class’.
Although not uniformly accurate across all container types, YOLOv5s performed well enough to support estimation of domestic water storage capacities, to which we turn in the next section.
RESULTS OF PILOT: WATER STORAGE IN TEKNAF UPAZILA, BANGLADESH
The kholash was by far the most prevalent water container type identified, possessed by 66% of households. Several features may help explain its popularity:
- Kholashes are usually made of aluminium, making them more robust than plastic vessels, though also more expensive;
- The design usually allows for a tight-fitting plate-type lid, helping to prevent recontamination of collected water, and
- The vessel is better suited than other vessel types for carrying on the head or the hip, possibly reducing the likelihood of musculoskeletal injuries (Geere et al. 2010; Venkataramanan et al. 2020).
Though far less prevalent, the next three most common water containers are the 500 mL bottle (n = 118), the 10 L bucket (n = 109) and the 1 L bottle (n = 95), which suggests that many water containers are in fact reused containers originally obtained for other purposes. Especially for the 500 mL and 1 L plastic bottles, this may carry greater health risks related to their construction from PET plastic and the difficulty of properly cleaning them (Ioannidou et al. 2016; Staddon & Brewis 2024).
Next, we looked at the mix of water containers used by households to fetch and store water. As noted previously, the kholash is the most prevalent water container, possessed by 66% of households. The most common combination of water containers is one 5 L kholash and one 10 L bucket, which may well represent the maximum carrying capacity for a single person.1
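To illustrate how detected containers translate into the storage estimates reported here and in the conclusion, the sketch below aggregates container counts for a single (invented) household into total and per-person storage and compares the latter with the WHO figure of 15 L per person per day. The nominal volumes simply follow the class labels in Table 1.

```python
# Nominal volumes (litres) implied by the container classes in Table 1.
CONTAINER_VOLUMES_L = {
    'Bottle 0.5': 0.5, 'Bottle 1': 1.0, 'Kholash 5': 5.0, 'Bucket 5': 5.0,
    'Bucket 10': 10.0, 'Jug 5': 5.0, 'Jug 10': 10.0, 'Drum 20': 20.0,
}

WHO_MINIMUM_L_PER_PERSON_PER_DAY = 15.0

def household_storage(container_counts, household_size):
    """Total and per-person storage (litres) for one household, plus a WHO sufficiency flag."""
    total_l = sum(CONTAINER_VOLUMES_L[name] * n for name, n in container_counts.items())
    per_person_l = total_l / household_size
    return total_l, per_person_l, per_person_l >= WHO_MINIMUM_L_PER_PERSON_PER_DAY

# Invented example: the common combination of one kholash and one 10 L bucket
# in a household of five people.
print(household_storage({'Kholash 5': 1, 'Bucket 10': 1}, household_size=5))
# -> (15.0, 3.0, False): well below the WHO benchmark of 15 L per person per day.
```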
Frequency of households with specified volumes of per-person water storage.
CONCLUSION: UTILITY OF ML FOR ASSESSING DOMESTIC WATER STORAGE
In the absence of a piped-to-home water supply, a key pinch point in achieving SDG 6.1 (universal and equitable access to safe and affordable drinking water) is the availability of containers that can safely hold a sufficient volume of water for domestic needs. This paper demonstrates a novel method of extracting useful water storage data from photographs taken during conventional household WASH surveys. Using a readily available ML system, YOLOv5s, we were able to extract information about quantities of water stored and storage vessel types with a high degree of accuracy and reliability. With these data in hand, it was straightforward to calculate water sufficiency relative to reported household size and compare this against the WHO standard of 15 L per person per day. We were also able to further contextualize this storage information with collected data on the number and length of water-fetching journeys per day. We suggest that such ML-based techniques could, with only a small amount of training, become a useful tool in water security research and humanitarian practice.
From a technical perspective, we note that even with relatively few training images (fewer than 700), the ML algorithm demonstrated impressive precision and recall, despite complications related to poor lighting, shading, occlusion of target objects by other objects, angles of view and so on. Notwithstanding these difficulties, we achieved 90% accuracy or better for all but one object class. Further training of the algorithm, with either real-world images or images specially constructed to be challenging, should further improve object detection accuracy.
Our dataset is limited in size as this is a proof-of-concept paper. However, we have employed techniques such as data augmentation/cross-validation to improve generalizability. Additionally, we are exploring the integration of larger datasets and alternative approaches (e.g., ensemble methods) to enhance model robustness in future work.
We also expect that other models, such as Microsoft's Florence-2, may be able to automate the image analysis process more efficiently, including offering summary narratives with desired statistics for each image analysed, as well as for entire sets of images.
While we have made progress in using ML techniques to describe water storage quantitatively, this approach may have only limited relevance to water quality assessment. Specifically, while it may be possible to identify containers that are excessively dirty or stored on dirt floors, these characteristics are only indirectly linked to the quality of the stored water.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.
1. The average water fetching time (including waiting) was 30 min per trip, with most households making 1–2 trips per day.