ABSTRACT
Efficient sensor operation and extended network lifetime are critical for effective water quality monitoring using Wireless Sensor Networks (WSNs). Traditional models often neglect the importance of information value, leading to redundant data transmission from low-value sensors and inefficient energy consumption. This study proposes a novel information-centric algorithm that employs Minimum Redundancy, Maximum Information (MIRI) principles to prioritize data collection from high-information sensors. By dynamically assessing the information value at each round, the algorithm strategically selects Cluster Heads (CHs) based on their ability to provide valuable insights while conserving energy. Simulation results indicate that the proposed model achieves an average residual energy of 65% after 1,500 rounds, compared to only 35% in conventional models. Additionally, the algorithm extends individual sensor lifetimes by up to 40%. These findings highlight the effectiveness of an information-centric approach in optimizing WSN performance, thereby facilitating improved efficiency in environmental monitoring applications.
HIGHLIGHTS
This research presents a novel information-centric algorithm for optimizing energy in water quality monitoring WSNs.
The algorithm utilizes minimum redundancy maximum information (MIRI) principles to prioritize high-value data.
It enhances network lifetime by reducing redundant data transmission from low-information sensors.
Simulation results demonstrate significant improvements in energy efficiency and data accuracy.
INTRODUCTION
Water security is a critical issue that many countries around the world are coping with today (Saputra et al. 2023). The challenges are complex, including inadequate access to safe water, inefficient water management practices, increasing levels of water pollution, and escalating water demand driven by population growth and urbanization (Sukri et al. 2023). Ensuring water security is essential for achieving Sustainable Development Goal 6 (SDG 6), which aims to ensure availability and sustainable management of water and sanitation for all (Javaid et al. 2013). This goal is fundamental to human health, environmental sustainability, and economic prosperity. Addressing water security challenges requires innovative and sustainable solutions that encompass water conservation, infrastructure development, and improved water governance to ensure equitable access to safe and sufficient water resources.
Continuous evaluation and assessment of water quality are crucial for ensuring the safety and health of water resources (Abdolabadi et al. 2016). Rapid economic growth has increased pollutant discharge, and emerging challenges such as the spread of diseases influenced by environmental conditions have heightened concern over water quality (Magd et al. 2023). Addressing these concerns requires assuring society of the safety and health of water resources (United Nations Department of Economic and Social Affairs 2022). Traditionally, water quality monitoring involves field sampling and laboratory analysis, which are challenging, time-consuming, and costly, and may not provide real-time data (Adu-Manu et al. 2020). Contemporary approaches, however, draw on technological advances to develop autonomous monitoring devices capable of in situ measurement (Silva et al. 2022).
Water quality networks (WQNs) provide accurate and real-time information about the condition of water (Dasig 2019). Such information is vital for devising informed strategies regarding water management, pollution control, and safe water supply (Kumari et al. 2020). These networks allow changes in water quality to be detected promptly so that potential hazards can be avoided (Liu & Wu 2022). WQNs typically involve a network of sensors installed at specific locations within a water system (Saputra et al. 2023). These sensors collect data at regular intervals and transmit the data to a central system (Pule et al. 2017). The data can then be analyzed to assess the physical, chemical, and biological characteristics of water, identify trends, and detect any deviations from established standards or guidelines (Miller et al. 2023). With the advent of new technologies, many researchers and organizations have employed wireless sensor networks (WSNs) and the Internet of Things (IoT) to facilitate effective and remote water quality monitoring (Wu et al. 2016; Ahmedi et al. 2018; Saravanan et al. 2018; Gurusamy & Diriba 2022; Ali et al. 2023; de Camargo et al. 2023). WSNs are self-configuring and self-coordinating networks that typically consist of sensor nodes equipped with energy sources, sensing mechanisms, data storage units, and transmitters (Zhao et al. 2018a). WSNs have become an essential tool in many areas, including environmental monitoring, agriculture, and industrial automation (Ullo & Sinha 2020). Several studies have employed WSNs to monitor diverse water quality indicators such as pH, temperature, turbidity, total dissolved solids, redox, and dissolved oxygen (Khetre & Hate 2013; Sridharan 2014). For instance, Adu-Manu et al. (2020) employed wireless sensor nodes for real-time water quality monitoring, using energy-efficient data transmission and solar panels for node longevity. Conducted in Ghana's Greater Accra Region, the study used smart water sensors to measure physical and chemical parameters at the Weija intake, a vital water source. Results revealed varying levels of pH, conductivity, calcium, temperature, fluoride, and oxygen content, impacting plant and aquatic life.
The widespread implementation of WSNs for environmental monitoring requires careful consideration of several factors, including energy consumption, scalability, and interruptions during data collection (Zulkifli et al. 2022). Designing such networks involves placing sensor nodes for optimal coverage, managing energy efficiency, limiting transmission distances to reduce the energy spent on data transmission, designing communication protocols that accommodate limited bandwidth, employing data aggregation methods, and establishing scalable architectures (Wu et al. 2016; Evangelakos et al. 2022; Gurusamy & Diriba 2022). Among these issues, energy consumption is recognized as one of the most critical factors in WSN design (Gherbi et al. 2017). The sensor nodes' limited energy capacity necessitates energy-efficient protocols and algorithms that minimize power consumption during communication, data processing and aggregation, and data transmission (Rawat & Chauhan 2021). Efficient utilization of energy resources helps prolong the network's lifespan and improves its overall performance. Many studies have assessed how data aggregation and routing algorithms improve the energy efficiency of these networks (Smaragdakis et al. 2004; Nguyen et al. 2017; Shahraki et al. 2017; Abdulsalam et al. 2018; Rajathi 2023).
Clustering-based models aim to improve energy efficiency in WSNs by reducing the energy consumption of individual sensor nodes. By organizing the network into clusters, energy can be conserved by selecting cluster heads (CHs) that perform data aggregation and communication tasks, while other nodes operate in a low-power sleep mode (Sharma et al. 2019). One popular clustering algorithm is low-energy adaptive clustering hierarchy (LEACH) (Heinzelman et al. 2000). LEACH utilizes a randomized rotation of CHs to evenly distribute energy consumption among nodes and extend the network lifespan. Various improvements over the original LEACH protocol address issues such as non-uniform CH distribution and energy consumption. These enhancements include C-LEACH, which employs centralized selection of CHs based on energy and location information (Heinzelman et al. 2002); MODLEACH, which optimizes energy usage by replacing a CH only when its energy falls below a threshold and by employing differentiated signal amplification for various communication types (Mahmood et al. 2013); TL-LEACH (two-level LEACH), which addresses uneven energy distribution through secondary and primary CHs (Zhixiang & Bensheng 2007; Peng et al. 2015); PEGASIS (power-efficient gathering in sensor information systems), which uses a chain structure for energy-efficient data gathering and dissemination (Lindsey & Raghavendra 2002; Arora et al. 2016); DFCA (distributed fault-tolerant clustering algorithm), which employs gateways and backup nodes for energy-efficient data transmission (Azharuddin et al. 2013); and MLRC (multi-level route-aware clustering), which employs a route-aware approach to establish communication paths among sensor nodes (Sabet & Naji 2016).
Fuzzy-based approaches aim to optimize network performance in WSNs by improving the selection of CHs and mitigating uncertainties arising from environmental factors and overlapping parameters (Guo & Zhang 2014). LEACH-FL is an advanced version of LEACH that employs fuzzy logic with input variables such as residual energy, distance to the Base Station (BS), and node degree to calculate the CH probability (Ran et al. 2010). The multi-objective fuzzy clustering algorithm (MOFCA) addresses hot spot and energy depletion challenges, with nodes selecting temporary CHs based on fuzzy inputs (Sert et al. 2015). An adaptive multi-clustering algorithm using fuzzy logic (MCFL) utilizes three clustering algorithms guided by fuzzy logic and inputs for CH selection, adapting based on CH energy levels and thresholds (Mirzaie & Mazinani 2017). Lastly, a distributed fuzzy logic-based unequal clustering approach and routing algorithm (DFCR) encompasses information sharing, cluster formation, virtual backbone establishment, and data routing, with fuzzy logic-driven decisions for CH competency, timer-based radius calculations, member selection, and virtual backbone classification (Mazumdar & Om 2018).
Metaheuristic approaches are also utilized to address the complexity of clustering, with the aim of efficiently selecting CHs from a vast solution space (Al Aghbari et al. 2020; Del-Valle-Soto et al. 2023; Abdolabadi & Khosravian 2025). GP-LEACH and HS-LEACH are examples of how genetic algorithms and harmony search enhance CH selection (Mohammad et al. 2012). Sharmin et al. (2023) utilized a hybrid algorithm (HPSO-ILEACH) for CH selection to enhance the efficiency and lifespan of WSNs. Their results indicate that the hybrid algorithm boosts the network's lifespan and controls average energy consumption. Diakhate et al. (2023) implemented firefly algorithm optimization for optimal CH selection and evaluated performance using the number of dead nodes and the data packets received by the BS. Their results demonstrate the efficacy of the hierarchical clustering approach. Jabbar et al. (2023) introduced a routing strategy known as FLH-P, which integrates fuzzy logic with the hybrid energy-efficient distributed (HEED) algorithm to improve both the longevity of the network and the energy levels of individual nodes. The FLH-P approach employs HEED to establish clusters and subsequently combines fuzzy inference with the LEACH algorithm to take into account factors such as residual energy, minimal hops, and node traffic. Results showed the effectiveness of FLH-P in reducing energy consumption and prolonging the network's operational lifespan. Bharany et al. (2023) proposed a clustering procedure for underwater wireless sensor networks (UWSNs) using the glowworm swarm optimization algorithm to improve energy efficiency.
Reviewing the literature reveals that while classical, fuzzy-based, metaheuristic, and hybrid approaches take into consideration clustering-based macro and micro parameters, as well as methodology based parameters such as residual energy, distance to the BS, node degree, and the chance of becoming a CH (Fanian & Rafsanjani 2019), there is a lack of attention to the value of information gained by the sensor nodes. As sensors consume energy to collect data from the environment, it is essential to ensure that the data are relevant and provide useful insights for the intended applications. Therefore, information gain plays a crucial role in determining the effectiveness of the network. Studies demonstrate the diverse applications of entropy theory in water systems, ranging from network evaluation and design to understanding the information content and organization of hydrological data (Shi et al. 2018; Chen et al. 2022). The entropy theory measures the amount of information in a random variable or a set of variables (Keum et al. 2017). The application of entropy theory in water systems has gained significant attention in the literature. Ruddell & Kumar (2009) introduced the concept of ecohydrologic process networks and identified the information content and organization of these networks using entropy theory. They discussed the potential of entropy-based measures in understanding the interactions and information flow within ecohydrologic systems. Li et al. (2012) proposed an approach that introduces a criterion called maximum information minimum redundancy (MIMR), based on the entropy theory. MIMR aims to optimize station placement by maximizing joint entropy among selected stations while considering transinformation both within and outside the chosen stations.
In this paper, we aim to introduce a novel information-centric algorithm designed specifically for WSNs deployed in water quality monitoring applications. This approach departs from conventional methods that treat all sensors equally. Our approach prioritizes data from sensors with high information value. By utilizing entropy theory and MIRI principles, the algorithm identifies these crucial sensors and optimizes energy consumption through strategic data collection. This focus on information value ensures the network gathers the most critical data while extending sensor lifetimes and network longevity. This research contributes to the field of WSN optimization by demonstrating the effectiveness of an information-centric approach in improving network performance and efficiency in resource-constrained environments.
The structure of this paper is as follows: Section 2 reviews the basic entropy for monitoring network selection, MIRI principles to identify high information sensors, the network model including the network setup, the clustering protocol, and CH selection algorithms. Section 3 presents a numerical example to demonstrate the proposed model. Section 4 discusses the results obtained from the simulations and highlights the benefits of the information-centric approach. Then, it compares the performance of our proposed algorithm against the existing model. Finally, Section 5 summarizes the key findings and emphasizes the potential of information-centric techniques for enhancing WSN efficiency and longevity.
METHODS
Basic entropy for monitoring network selection
In the entropy theory, essential information measures encompass marginal entropy, joint entropy, transinformation, and total correlation. These metrics give insights about the amount of information retained by individual random variables, the collective information conveyed by multiple variables, the extent to which knowledge of one variable can infer information about another, and the redundant information shared among multiple variables (Wang et al. 2018).
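As a minimal sketch of how these four measures could be estimated from sensor readings, the following assumes that each sensor's series has already been discretized into bins (the binning scheme is not specified here) and uses base-2 logarithms. The function names are illustrative and not part of the proposed model.

```python
import math
from collections import Counter


def entropy(*series):
    """Joint Shannon entropy (bits) of one or more equally long discrete series.

    With a single series this is the marginal entropy.
    """
    n = len(series[0])
    counts = Counter(zip(*series))  # frequency of each joint state
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def transinformation(x, y):
    """Information shared by two series: T(X, Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def total_correlation(*series):
    """Redundancy among several series: sum of marginals minus joint entropy."""
    return sum(entropy(s) for s in series) - entropy(*series)


# Toy binned readings from three hypothetical sensors.
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]   # independent of a
c = [0, 0, 1, 1]   # exact copy of a, fully redundant

h_a = entropy(a)                       # marginal entropy of sensor a
t_ab = transinformation(a, b)          # no shared information
t_ac = transinformation(a, c)          # c repeats everything a reports
c_abc = total_correlation(a, b, c)     # redundancy within the set
```

In this toy case the copy `c` adds one bit of redundancy and no new information, which is the kind of sensor an information-centric design would deprioritize.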


where ⟨·⟩ is the merging operator.

The parameters λ1 and λ2 represent the trade-off weights between information and redundancy in the network design and are constrained by λ1 + λ2 = 1. To maximize the information content, λ1 should be assigned a larger value than λ2; prior studies suggest 0.8 and 0.2, respectively (Li et al. 2012; Wang et al. 2018). In this process, monitoring stations with higher uncertainty, and hence higher information content, are usually prioritized. Moreover, the selected stations should share as much information as possible with the unselected stations.
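For reference, the maximum information minimum redundancy criterion of Li et al. (2012) is commonly written in the general form below. The notation is an assumption made for this sketch and may differ from the symbols of the original equations: S_1, ..., S_k index the selected stations, F_1, ..., F_m the unselected ones, H is joint entropy, T is transinformation, and C is total correlation.

```latex
\max \; \lambda_1 \left[ H\!\left(X_{S_1}, \ldots, X_{S_k}\right)
  + \sum_{i=1}^{m} T\!\left(\langle X_{S_1}, \ldots, X_{S_k} \rangle ;\, X_{F_i}\right) \right]
  - \lambda_2 \, C\!\left(X_{S_1}, \ldots, X_{S_k}\right),
\qquad \lambda_1 + \lambda_2 = 1
```

The first bracket rewards the joint information of the selected set and the information it shares with unselected stations, while the last term penalizes redundancy within the selected set.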
Network model
The network model structure. The network model setup process involves determining the optimal number of clusters, clustering sensor nodes based on energy consumption, implementing the entropy method to prioritize nodes, selecting CHs based on distance, energy, and information value, and employing energy division and dormancy strategies to enhance efficiency. The left figure is inspired by Zhao et al. (2018b).
Network setup
The model consists of N sensor nodes that are uniformly distributed in a circular area with a diameter of W. The BS is at the center of the network and has no energy restriction. This model considers the energy consumed by the sensor nodes during transmission, reception, and idle states.
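The energy consumed to transmit a b-bit message over a distance d follows the first-order radio model: $E_{TX}(b, d) = b E_{elec} + b \varepsilon_{fs} d^2$ for $d < d_0$, and $E_{TX}(b, d) = b E_{elec} + b \varepsilon_{mp} d^4$ for $d \ge d_0$. Here, $E_{elec}$ represents the energy consumed by the transceiver per bit, $\varepsilon_{fs}$ and $\varepsilon_{mp}$ are the transmitter amplifier coefficients in the free-space and multipath models, and $d_0$ is the crossover distance given by $d_0 = \sqrt{\varepsilon_{fs} / \varepsilon_{mp}}$. The energy consumed to receive the message, $E_{RX}$, is quantified by $E_{RX}(b) = b E_{elec}$. Therefore, the total energy is the sum of the transmission and reception terms, $E_{TX}(b, d) + E_{RX}(b)$.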
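A short numerical sketch of this energy model is given below, assuming the standard first-order radio model and the parameter values listed in the simulation parameters table. The function names are illustrative only.

```python
import math

# Parameter values from the simulation parameters table, converted to SI units.
E_ELEC = 50e-9        # transceiver energy, J/bit
EPS_FS = 10e-12       # free-space amplifier coefficient, J/bit/m^2
EPS_MP = 0.0013e-12   # multipath amplifier coefficient, J/bit/m^4

# Crossover distance between the two amplifier models (about 87.7 m here).
D0 = math.sqrt(EPS_FS / EPS_MP)


def tx_energy(bits, d):
    """Energy (J) to transmit `bits` over distance `d` (m)."""
    if d < D0:
        return bits * E_ELEC + bits * EPS_FS * d ** 2   # free-space loss
    return bits * E_ELEC + bits * EPS_MP * d ** 4       # multipath loss


def rx_energy(bits):
    """Energy (J) to receive `bits`."""
    return bits * E_ELEC


# A 4,000-bit message sent over 50 m costs roughly 0.3 mJ to transmit
# and 0.2 mJ to receive.
e_tx = tx_energy(4000, 50.0)
e_rx = rx_energy(4000)
```

With a 100 m monitoring area and the BS at the center, every node is within 50 m of the BS, so under these assumptions direct transmissions to the BS stay in the free-space regime.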
Optimal number of clusters


From the average energy consumed by a cluster, the energy consumed by all clusters in the region in one round, Esum, is obtained. Hence, the optimal number of clusters can be calculated by taking the derivative of Esum with respect to k and setting it to zero.
Clustering protocol
Hierarchical clustering is one of the most widely used methods. At first, each node is treated as its own cluster, and the pairwise distances between nodes are calculated to form the distance matrix. The closest clusters are then merged step by step until the optimal number of clusters, k, is reached. While simplicity is one of the main advantages of this method, it is time-consuming, especially for high-dimensional datasets. The K-means clustering algorithm offers a way to improve efficiency.
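To illustrate the K-means step, the following is a minimal sketch of Lloyd's algorithm on two-dimensional node coordinates. It is not the hybrid procedure of the proposed model: the random initialization, the fixed seed, and the function name are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # k distinct nodes as initial centroids
    labels = None
    for _ in range(iters):
        # Assignment step: each node joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:        # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                 # an empty cluster keeps its centroid
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels


# Two well-separated groups of hypothetical node positions (metres).
nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
         (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(nodes, k=2)
```

Each iteration costs on the order of N times k distance evaluations, which is why K-means scales better than building and repeatedly scanning a full pairwise distance matrix.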




CH selection
Our work builds upon the foundation laid by Zhao et al. (2018b), particularly their two-pronged CH selection strategy for general and large clusters. We extend this strategy by incorporating additional considerations that prioritize network longevity over data integrity. Specifically, we deviate from selecting CHs based solely on their proximity to the cluster center, distance from the BS, and residual energy. Instead, we introduce a modified selection process that considers the node with the highest information value (NHI) and prioritizes CHs located closer to NHI. This ensures that critical information is gathered and transmitted efficiently while also reducing the load on CHs.
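The selection idea can be sketched as a weighted score, shown below. The selection equations are not reproduced here, so the score form, the weights, and the choice to exclude the NHI itself from CH candidacy (to spare its energy for sensing) are all assumptions made for illustration.

```python
import math


def select_cluster_head(nodes, bs, w_energy=0.4, w_bs=0.2, w_nhi=0.4):
    """Return the index of the chosen CH within one cluster.

    nodes: list of dicts with 'pos' (x, y), 'energy' (J), 'info' (information value).
    bs:    (x, y) position of the base station.
    The weights are hypothetical and only illustrate the trade-off.
    """
    # Node with the highest information value (NHI) in this cluster.
    nhi_idx = max(range(len(nodes)), key=lambda i: nodes[i]["info"])
    nhi_pos = nodes[nhi_idx]["pos"]

    # Assumption: the NHI is not a candidate unless it is the only node.
    candidates = [i for i in range(len(nodes)) if i != nhi_idx] or [nhi_idx]

    # Normalizers so the three terms are comparable (guard against zero).
    e_max = max(nodes[i]["energy"] for i in candidates) or 1.0
    d_bs_max = max(math.dist(nodes[i]["pos"], bs) for i in candidates) or 1.0
    d_nhi_max = max(math.dist(nodes[i]["pos"], nhi_pos) for i in candidates) or 1.0

    def score(i):
        n = nodes[i]
        return (w_energy * n["energy"] / e_max
                - w_bs * math.dist(n["pos"], bs) / d_bs_max
                - w_nhi * math.dist(n["pos"], nhi_pos) / d_nhi_max)

    return max(candidates, key=score)


# Hypothetical three-node cluster; the second node is the NHI.
cluster = [
    {"pos": (10.0, 0.0), "energy": 0.4, "info": 0.2},
    {"pos": (20.0, 0.0), "energy": 0.4, "info": 0.9},
    {"pos": (40.0, 0.0), "energy": 0.1, "info": 0.1},
]
ch = select_cluster_head(cluster, bs=(0.0, 0.0))
```

In this example the first node is chosen: it has ample residual energy and sits close to both the NHI and the BS, so the NHI only needs a short, low-energy transmission to deliver its data.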




To further prolong the network lifetime, the protocol implements a node dormancy mechanism that selectively puts nodes with low energy and long distances from CHs into dormancy. This mechanism is activated only after the first node death and involves several steps. Firstly, dormancy factors (Sdor) are calculated for all cluster member nodes. The smaller the Sdor value for a node, the higher the probability of it becoming dormant. Next, the node dormancy ratio (R) is determined as a function of n, the number of live nodes.
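The final step of the dormancy mechanism can be sketched as follows. The expressions for Sdor and R are not reproduced here, so both are treated as given inputs, and the probabilistic choice is simplified to a deterministic one: the fraction R of members with the smallest Sdor go dormant. The function name is illustrative.

```python
import math


def choose_dormant(s_dor, ratio):
    """Pick the nodes to put into dormancy for the next round.

    s_dor: dict mapping node id -> dormancy factor (smaller means more likely dormant).
    ratio: dormancy ratio R, between 0 and 1.
    """
    count = math.floor(ratio * len(s_dor))   # how many members may sleep
    ranked = sorted(s_dor, key=s_dor.get)    # ascending dormancy factor
    return set(ranked[:count])


# Hypothetical dormancy factors for four cluster members.
factors = {"n1": 0.9, "n2": 0.1, "n3": 0.5, "n4": 0.3}
sleeping = choose_dormant(factors, ratio=0.5)
```

Here the two members with the lowest factors, `n2` and `n4`, are put to sleep while the others keep reporting.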
NUMERICAL EXAMPLE
Simulation parameters (Zhao et al. 2018b)
Parameter | Value |
---|---|
Eelec | 50 nJ/bit |
EDA | 5 nJ/bit/message |
εfs | 10 pJ/bit/m² |
εmp | 0.0013 pJ/bit/m⁴ |
Diameter of monitoring area, D | 100 m |
Initial number of nodes, N | 30 |
Size of message, b | 4,000 bits |
Initial energy | 0.4 J |
(a) The distribution of 30 sensor nodes in a 100-unit diameter network. The red star indicates the BS. (b) Radar plot of predefined random data distribution for each sensor node.
RESULTS AND DISCUSSION
The marginal entropy, mutual information, and the total correlation for each sensor.

(a) The position of selected sensors with high information value in the monitoring area (red sensor nodes). (b) The cluster assignments at the first iteration.
Network lifetime
Alive node of the proposed model compared with the conventional model.
Zhao et al. (2018b) provided a comprehensive analysis of network lifetime across various protocols. They highlight that while SEP (stable election protocol) builds upon LEACH by considering initial energy, its performance in homogeneous networks is similar to that of LEACH, as all nodes have the same initial energy. In addition, the number of surviving nodes over time for both SEP and LEACH remains closely aligned, indicating limited advantages. However, the advantages of DEEC (distributed energy-efficient clustering) become apparent with continued iterations, showing a notable extension of network lifetime compared with LEACH: 8.93 and 12.37% in two homogeneous networks, respectively.
The key takeaway from Zhao et al.'s findings is that their protocol effectively addresses energy distribution among nodes, yet it still does not account for the value of information collected. The protocols they evaluated (LEACH, SEP, and DEEC) tend to elect CHs without considering energy levels adequately, which can lead to inefficient energy usage and premature node failure. In contrast, the information-centric approach prioritizes sensors based on both their energy levels and their ability to provide high-value data.
(a) The energy consumption of each node by implementing the conventional algorithm (red lines just show selected sensor nodes with high information value). (b) Applying the proposed model indicates that selected sensors have the highest lifetime.
Residual network energy
Residual energy under both conventional and proposed models. The proposed model demonstrates significantly higher residual energy compared with the conventional model.
In the initial rounds (up to about 800 rounds), both models exhibit similar energy consumption rates, as the network is in its early operational phase and nodes are functioning without significant disruption. After 800 rounds, the conventional model begins to show accelerated energy depletion. This behavior is likely due to uneven energy consumption across nodes, which leads to the premature depletion of nodes closer to the BS or those burdened with high communication loads. In contrast, the proposed model maintains a more gradual energy depletion rate, extending the network's operational lifespan. At 1,200 rounds, the residual energy of the proposed model is visibly higher than that of the conventional model. This indicates that the proposed model is more energy-efficient during this period, benefiting from its adaptive strategies that prioritize nodes with high information value and balance the workload among CHs. Beyond 1,400 rounds, the conventional model nearly depletes its energy reserves, resulting in network failure. Meanwhile, the proposed model continues to function, with notable residual energy remaining at this stage. The proposed model therefore sustains the WSN's operation well beyond the point where the conventional model fails, extending the network's lifetime by approximately 40% compared with the conventional model. This improvement is crucial for applications requiring prolonged and uninterrupted data collection, such as environmental monitoring.
CONCLUSIONS
In this research, we developed a WSN model for monitoring dissolved oxygen (DO) levels. The model builds on an energy-balanced protocol and introduces a method for determining CHs that considers the information value of sensor nodes. The process involves determining the optimal number of clusters, clustering sensor nodes using a hybrid algorithm, and implementing the entropy method to identify significant nodes. CHs are selected based on distance, residual energy, and information value. Energy division and dormancy strategies are employed to enhance energy efficiency, with network energy consumption updated accordingly.
A comparison is made between the existing protocol by Zhao et al. (2018b) and our proposed protocol incorporating an information value approach, optimizing energy consumption and extending sensor lifetime. The optimization of the MIRI problem results in selecting an optimal set of sensors with maximum total information content and minimum redundant information. Multivariate entropy, mutual information, and total correlation values are provided. Two scenarios are analyzed, comparing the proposed and existing models in terms of live nodes, network energy consumption, and residual energy analysis. The network's lifetime is a critical metric for evaluating the proposed protocol's success, with the survival of nodes directly impacting network longevity. Performance evaluation shows that the proposed model maintains a higher number of alive nodes compared with the conventional model, with nodes with high information content experiencing minimal energy loss and extended lifetime. In simulations, the proposed model maintained 22 alive nodes at round 1,258, while the conventional model had none. This translates to a 50% survival rate at round 1,274, highlighting the effectiveness of the information-centric approach.
The proposed model outperforms the conventional algorithm in terms of network longevity. Residual energy, an essential metric for network lifetime, is highlighted, with the proposed algorithm achieving higher residual energy levels compared with the conventional approach. This indicates greater energy retention in nodes and potential for a longer network lifetime. The superior residual energy of the proposed model suggests enhanced energy efficiency and network longevity.
Finally, it should be noted that the proposed model has not been tested in scenarios where the network topology changes dynamically, such as in situations of sensor failures or environmental disruptions, making this a compelling topic for future research. Addressing these challenges by integrating robust disruption management strategies into WSNs could significantly enhance their resilience and adaptability. Future research should focus on developing algorithms capable of proactively reconfiguring network topologies in real time, ensuring continuous and high-quality data collection even under adverse conditions. Additionally, exploring scalable solutions for larger networks and diverse environmental settings will be critical as WSN applications expand. Such advancements would not only optimize WSN performance but also enhance their reliability in real-world applications like environmental monitoring, disaster response, and public health initiatives.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.
SEP: stable election protocol (Smaragdakis et al. 2004).
DEEC: distributed energy-efficient clustering (Javaid et al. 2013).