Population movement, such as commuting, can affect water supply pressure and efficiency in modern cities. However, there is a gap in the research concerning the relationship between water use and population mobility, which is of great significance for urban sustainable development. In this study, we analyzed the spatial–temporal dynamics of the population and its underlying mechanisms using multi-source geospatial big data, including Baidu heat maps (BHMs), land use parcels, and points of interest. Combined with water consumption, sewage volume, and river depth data, the impact of population dynamics on water use was investigated. The results showed clear differences in population dynamics between weekdays and weekends, with a weekday-to-weekend ratio of 1.11 for the total population. Spatially, the population was concentrated mainly in areas associated with enterprises, industries, shopping, and leisure activities during the daytime, and primarily around residential areas at nighttime. Moreover, the population had a significant impact on water use, with the two sharing common periods of 24 h and 7 days, and both water consumption and wastewater production were observed to be proportional to the population density. This study can offer valuable implications for urban water resource allocation strategies.

  • Analysis of spatiotemporal population distribution and mobility based on the Baidu heat map.

  • Population dynamics mechanisms related to land use.

  • A novel idea exploring the impact of population dynamics on water use.

  • Valuable implications for optimizing and controlling water supply and wastewater treatment systems.

Rapid urbanization worldwide has led to a significant influx of people into cities, presenting new issues in urban planning and the environment. These issues include unbalanced social development, water scarcity, traffic congestion, and air pollution (Bao et al. 2023). In particular, water scarcity has been exacerbated by the increasing water demand resulting from urban population growth, which is compounded by irrational water resource utilization (Sadeghi et al. 2023). Within a city, an unevenly distributed population produces spatially varying water demand, so highly populated areas face tremendous pressure on water supply. For the efficient management of water resources, it is crucial to investigate the relationship between population distribution and water usage patterns, as it informs the planning of water supply and supports sustainable city development (Bakchan et al. 2022).

Several factors can affect water consumption, such as population, climate, seasonality, economy, and water price (Rasifaghihi et al. 2020; Wang et al. 2020; Yu et al. 2023). Among them, the importance of the population has been emphasized by previous studies (Atinkpahoun et al. 2018). Water consumption is affected not only by the total population but also by its dynamic behavior. For example, Yu et al. (2023) suggested that water consumption and wastewater treatment plant (WWTP) discharges in a certain area are closely related to the total population of the area, which varies over time due to human movement. Smolak et al. (2020) considered the impact of total population on water consumption. Some previous studies reported the differences in urban water consumption before and after the coronavirus (COVID-19) pandemic (Kalbusch et al. 2020; Bakchan et al. 2022), while others have compared the differences in population aggregation patterns (Jia et al. 2020; Zeng et al. 2021; Zhang et al. 2022). These studies indirectly indicated that changes in population behavior had a certain impact on water use. In addition, studies of water use patterns in residential areas have shown that water use and wastewater discharge are related to the commuting and lifestyle habits of the population (Atinkpahoun et al. 2018). It has been reported that in residential areas, water consumption can peak just before office hours, decline in the afternoon, and then increase again during the evening (Kavya et al. 2023). The mechanism linking population dynamics and water use is a subject worthy of further research, with significant implications for optimizing and controlling water supply and wastewater treatment systems in high-population-density cities.

Nevertheless, although the relationship between the total population and water use has been demonstrated, there is still a gap in the precise quantitative analysis of the impact of population dynamics on water use. Similarly, population dynamics are not sufficiently taken into account in water use predictions. These shortcomings can lead to an increase in the operating costs and energy consumption of water supply and WWTP systems, and may even lead to the pollution of the water environment if sewage cannot be treated effectively. The gap is primarily owing to the difficulty of characterizing the dynamics of population activity. In recent years, the rapid development of mobile devices and the abundance of location-based services (LBS) big data have enabled researchers to analyze human mobility patterns with finer temporal resolution (Gu et al. 2018). This also makes it possible to study the impact of population mobility on urban water consumption and WWTP emissions. The Baidu heat map (BHM) is one of the most popular dynamic LBS data sources and has been applied extensively in studies of the behavioral characteristics and spatiotemporal distribution of urban residents due to its good accessibility (Li et al. 2019; Bao et al. 2023). Previous studies have shown that the spatiotemporal dynamic distribution of populations is influenced by residents' behaviors and is subject to specific spatiotemporal rules, and most behaviors are predictable (Song et al. 2010; Zhang et al. 2023). For example, residents frequently commute among fixed locations such as residences and workplaces (Kung et al. 2014), and they may prefer to go to leisure places or rest at home on weekends. The spatiotemporal changes in population density are influenced by differences in population activities and spatial functions (Shi et al. 2020), which can be analyzed by combining geospatial data, such as point of interest (POI) data. With these multi-source big data and water use data, the relationship between water use and population dynamics can be explored.

Based on these contexts, this study aims to analyze the influence of spatiotemporal dynamic changes in the population on urban water usage using the northern part of the Haidian District in Beijing, which is one of the biggest cities in China, as the study area. Using BHM data, we analyzed the temporal and spatial distribution characteristics of the population on weekdays and weekends. Subsequently, combined with POI data, we investigated the underlying mechanisms of population dynamics by analyzing the relationship between population and land use. Then, water consumption and sewage volume data were used to analyze the relationship between water use and population dynamics. The main contribution of this study to the current knowledge is the innovative empirical analysis of the relationships between spatiotemporal population dynamics and urban water usage using multi-source geospatial big data to provide valuable insights for the optimal allocation of urban water resources.

The remainder of this article is organized as follows. The section ‘Materials and Methods’ introduces the study area, data sources, and the employed methods. The section ‘Results’ sequentially analyzes the spatiotemporal dynamics of the population, the relationship between population density and land use, and the relationship between water use and population. The section ‘Discussion’ discusses urban water usage under the influence of population dynamics. Finally, the conclusions are given in the last section.

Study area

This study focuses on the northern part of Haidian District in Beijing, China (Figure 1). The region comprises four towns (FTs), namely Shangzhuang Town (SZ), Xibeiwang Town (XBW), Wenquan Town (WQ), and Sujiatuo Town (SJT), with a total area of approximately 207 km². According to the seventh national population census, the resident population of the region is 384,000. The region is located on the periphery of Beijing's core area, with most of it situated between the Fifth Ring Road and the Sixth Ring Road, exhibiting diverse land use types and urban functional zones. XBW and WQ are characterized by a significant presence of enterprises, industrial parks, and commercial residential areas, while SZ and SJT have more scenic spots and villages. The differences between the four regions make this area representative for analyzing population dynamics. Moreover, the water supply and drainage systems in the region are relatively independent, making it suitable for analyzing patterns of water use and wastewater discharge. On the one hand, the industrial sector in this region primarily consists of high-tech industries, including renowned companies such as Huawei, Baidu, and Tencent. This sector has a relatively high water use efficiency. On the other hand, there is a mismatch between water demand and actual water resource planning. For example, some of the sewage treatment facilities have insufficient operational capacity, and some water supply plants are under significant water supply pressure. Therefore, studying the water use patterns in this region is of significant importance for water resource management.
Figure 1

The geographical location and overall situation of the study area.

Data and preprocessing

BHM data

Baidu is the largest search engine company in China with a massive user base and influence, holding a 66% market share in the Chinese search engine market. As of 2022, over 90% of internet users in China are Baidu users. In 2011, Baidu launched a big data visualization product called BHM (Li et al. 2019). This product utilizes location information from users accessing Baidu services such as Baidu Maps, Baidu Search, and Baidu Weather (Lyu & Zhang 2019). A heat value is calculated from the flow of people in different areas, and the result is visualized on Baidu Maps. The BHM is updated every 15 min, allowing for a real-time representation of crowd heat in specific areas and providing dynamic population distribution information (Zhang et al. 2020). Due to these advantages, it has been widely used for urban research.

This study collected the original BHM data for 14 days, from February 21, 2022, to March 6, 2022. The data were collected at hourly intervals, resulting in a total of 336 BHM images. The acquired raw data underwent image processing, georeferencing, and projection conversion, and were saved in the GeoTIFF format with RGB (Red, Green, Blue) bands. The raw data depict variations in population concentration across regions using eight colors: black, blue, light blue, cyan, green, yellow, orange, and red, in order of ascending population density. To perform quantitative analysis on the regional heat data, we assigned integer values from 0 to 7 to these eight colors in ascending order, representing the heat values. After geographical coordinate information was incorporated, the processed data were stored in the GeoTIFF format. Subsequently, the vector boundary file of the study area was used to clip the rasters, yielding the heat map data of the study area. The heat value of each raster pixel was then identified and stored in a separate database. It should be noted that the collected data raise no personal privacy issues.
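The color-to-heat-value assignment described above can be illustrated with a minimal sketch. The reference RGB triplets in `PALETTE` are hypothetical (the exact BHM palette is not given here), and nearest-color matching is only one plausible way to classify pixels.

```python
# Hypothetical reference colors, ordered from lowest to highest heat value.
# The actual BHM palette is not specified in the text; these are assumptions.
PALETTE = [
    (0, 0, 0),      # black      -> 0
    (0, 0, 255),    # blue       -> 1
    (0, 128, 255),  # light blue -> 2
    (0, 255, 255),  # cyan       -> 3
    (0, 255, 0),    # green      -> 4
    (255, 255, 0),  # yellow     -> 5
    (255, 165, 0),  # orange     -> 6
    (255, 0, 0),    # red        -> 7
]


def heat_value(rgb):
    """Return the heat value (0-7) of the palette color nearest to an RGB pixel."""
    def squared_distance(ref):
        return sum((a - b) ** 2 for a, b in zip(rgb, ref))
    return min(range(len(PALETTE)), key=lambda i: squared_distance(PALETTE[i]))


def classify(image):
    """Convert a 2-D list of RGB pixels into a 2-D list of heat values."""
    return [[heat_value(pixel) for pixel in row] for row in image]


if __name__ == "__main__":
    sample = [[(0, 0, 0), (250, 10, 5)], [(10, 250, 240), (255, 170, 0)]]
    print(classify(sample))
```

Matching by squared RGB distance tolerates small color shifts introduced by image compression, which is why it is used here instead of exact equality.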

Land use data

In this study, we used POI data and land use parcel data to assess land use patterns. POI data refer to specific locations of various geographic entities and typically include attributes such as the name, address, coordinates, and category. POI categories resemble land use categories and effectively depict people's preferences and social functions. The density of different POI types can serve as an indicator of land use and functional zoning (Wu et al. 2018).

In our study, we obtained 15,532 POIs in 2022, covering 23 major categories (e.g., enterprises, commercial houses, tourist attractions, finance, and insurance services). Referring to the Chinese Land Use Classification Criteria (GB/T21010-2017) and considering the characteristics of the study area (Bao et al. 2023), we identified the main POI categories that were most relevant to human behavior patterns. These categories were further classified into commercial (CPOI), residential (RPOI), shopping (BPOI), tourist attraction (TPOI), education (EPOI), science (SPOI), and government (GPOI) categories. Specifically, the CPOI includes enterprises, finance, and insurance services; the RPOI represents residential areas; the BPOI signifies shopping services; the TPOI denotes tourist attractions; the SPOI includes science and culture services; the EPOI comprises a variety of schools; and the GPOI represents government agencies and social groups.

Land use parcels are delineated functional zoning areas based on satellite image data, which are represented as polygonal vector files with location and functional attributes. Similarly, the land use parcel categories were reclassified into office, industrial, residential, village, education, recreation, and transportation. The distribution of land use parcels is shown in Figures 1 and 2.
Figure 2

Several reclassified types of land use parcels.

Water quantity data

As daily water consumption data were not accessible, the annual water consumption of each water user in 2021 was obtained instead. For developed regions with a high wastewater collection ratio, however, the daily wastewater volume received by WWTPs on dry days can provide a measurable indicator of water consumption. Therefore, under the assumption of negligible leakage in the sewage network, the daily sewage volume was used as a proxy for daily water consumption.

The daily sewage volume data recorded by the flow meters at the Yongfeng WWTP from November 2021 to March 2022 were collected. The Yongfeng WWTP is located in XBW and primarily handles wastewater from the town. These data provide valuable insights into the patterns of wastewater generation, the capacity requirements of the wastewater treatment infrastructure, the need for planning expansions or upgrades, and the overall efficiency of the system. The sewage volume can vary with factors such as population size, industrial activities, and water usage patterns. Considering its high population density and significant population mobility, XBW was selected as a representative area to analyze the relationship between population mobility and water usage.

In addition, since the treated water from the Yongfeng WWTP is discharged into the Nansha River, we installed a water level meter in the river at the outfall of the Yongfeng WWTP to monitor water level changes. The monitoring frequency was set at 5-min intervals, and the data were collected from November 26, 2021 to February 9, 2022. During this period, there was almost no precipitation in Beijing, and variations in river water levels were primarily influenced by the volume of recycled water discharged from upstream sewage plants after treatment.

The TPI and the PDI

Various studies have proposed some population indices to predict the population of a sub-district based on the BHM (Li et al. 2019). Taking into account these existing studies and the characteristics of the study area, we developed two demographic indicators, namely the Total Population Index (TPI) and the Population Density Index (PDI), using spatial statistical methods. The formulas for these indicators are shown in Equations (1) and (2).
$$\mathrm{TPI}_t = \sum_{i} H_i \, b_i \tag{1}$$

$$\mathrm{PDI}_t = \frac{\sum_{i} H_i \, b_i}{\sum_{i} b_i} \tag{2}$$

where $\mathrm{TPI}_t$ is the sum of heat values of a spatial unit at time $t$, which serves as an indicator of the total population; $\mathrm{PDI}_t$ is the average heat value in a spatial unit at time $t$, reflecting the population density; $H_i$ represents the heat value of color $i$; and $b_i$ denotes the number of pixels of color $i$.
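As a worked illustration of the two indices, the sketch below computes the TPI and PDI for one spatial unit at one time step from per-color pixel counts. The function name `tpi_pdi` and the sample counts are illustrative assumptions, not part of the original workflow.

```python
def tpi_pdi(pixel_counts):
    """Compute (TPI, PDI) for one spatial unit at one time step.

    pixel_counts[i] is b_i, the number of pixels whose heat value H_i equals i
    (i = 0..7, following the eight-color scale).
    """
    total_pixels = sum(pixel_counts)
    tpi = sum(heat * count for heat, count in enumerate(pixel_counts))
    # PDI is the mean heat value per pixel; an empty unit is given 0.0 here.
    pdi = tpi / total_pixels if total_pixels else 0.0
    return tpi, pdi


if __name__ == "__main__":
    counts = [10, 5, 3, 2, 0, 0, 0, 0]  # illustrative pixel counts
    print(tpi_pdi(counts))
```

With these sample counts, the TPI is 0·10 + 1·5 + 2·3 + 3·2 = 17 and the PDI is 17/20 = 0.85.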

Spatial autocorrelation analysis

This study computed the PDI for 200 m × 200 m spatial units during the active period of 8:00–23:00 and then carried out spatial autocorrelation analysis to investigate the spatial clustering pattern of the population. Moran's I index (Moran 1948) is an index for spatial autocorrelation analysis that is capable of evaluating the dependency and heterogeneity of a certain variable in space. It can be categorized into global Moran's I and local Moran's I according to its application manner.

The global Moran's I was used to describe the overall distribution of attribute values (i.e. PDI in this study) within the study area and to examine the presence of spatial clustering (Wei et al. 2021). The formula for calculating the global Moran's I is shown in Equation (3).
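$$I = \frac{n \sum_{i=1}^{n}\sum_{j=1}^{n} W_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}{\left(\sum_{i=1}^{n}\sum_{j=1}^{n} W_{ij}\right)\sum_{i=1}^{n}(x_i-\bar{x})^{2}} \tag{3}$$

where $n$ is the number of spatial units; $x_i$ and $x_j$ are the attribute values of the $i$th and $j$th spatial units; $\bar{x}$ is the mean attribute value across all spatial units; and $W_{ij}$ is the spatial weight.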
The local Moran's I, on the other hand, was used to analyze the spatial autocorrelation characteristics of each unit, reflecting the spatial distribution of clustering patterns (Wei et al. 2021). The formula for calculating the local Moran's I is shown in Equation (4).
$$I_i = \frac{n\,(x_i-\bar{x})\sum_{j=1}^{n} W_{ij}\,(x_j-\bar{x})}{\sum_{k=1}^{n}(x_k-\bar{x})^{2}} \tag{4}$$

where the symbols have the same meanings as those in Equation (3).

Both indicators have values ranging from −1 to 1. Under a certain level of significance, a positive value of the global Moran's I indicates positive spatial autocorrelation, while a negative value suggests negative spatial autocorrelation. When the value approaches 0, it means the absence of spatial autocorrelation. The local Moran's I values can be used to distinguish four spatial types. A positive value indicates High–High (HH) clustering or Low–Low (LL) clustering, where the local average is either higher or lower than the overall average. Conversely, a negative value suggests High–Low (HL) clustering or Low–High (LH) clustering, where high values are surrounded by low values or vice versa (Zeng et al. 2021).
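To make the two statistics concrete, the following minimal sketch computes the global and local Moran's I for values on a regular grid. It assumes binary rook-contiguity weights (cells sharing an edge are neighbors), which is an assumption made for illustration; the weighting scheme actually used in the study is not stated here.

```python
def rook_neighbors(rows, cols):
    """Binary rook-contiguity weights: cells that share an edge are neighbors."""
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            neighbors[(r, c)] = [
                (i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols
            ]
    return neighbors


def morans_i(grid):
    """Return (global_I, local_I) for a 2-D list of values.

    local_I maps each cell (row, col) to its local Moran's I.
    The grid must not be constant, otherwise the variance term is zero.
    """
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(map(sum, grid)) / n
    dev = {(r, c): grid[r][c] - mean for r in range(rows) for c in range(cols)}
    neighbors = rook_neighbors(rows, cols)
    sum_sq = sum(d * d for d in dev.values())          # sum of squared deviations
    weight_sum = sum(len(v) for v in neighbors.values())  # sum of all W_ij
    # Spatial lag of the deviations: sum_j W_ij * (x_j - mean)
    lag = {cell: sum(dev[j] for j in neighbors[cell]) for cell in dev}
    global_i = (n / weight_sum) * sum(dev[k] * lag[k] for k in dev) / sum_sq
    local_i = {k: n * dev[k] * lag[k] / sum_sq for k in dev}
    return global_i, local_i


if __name__ == "__main__":
    clustered = [[0, 0, 1, 1]] * 4
    checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
    print(morans_i(clustered)[0], morans_i(checkerboard)[0])
```

A cell whose deviation and spatial lag are both positive corresponds to HH clustering, both negative to LL, and opposite signs to HL or LH; significance testing (for example by permutation) is omitted from this sketch.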

Kernel density estimation

Kernel density estimation (KDE) is a statistical method that reconstructs the probability density of the spatial distribution of points and lines by considering their current locations (Anderson 2009). It analyzes the proximity and arrangement of the selected points and lines to create a continuous density surface, revealing the relative likelihood of finding additional points or lines in different areas. Since KDE takes into account the location effect of the first law of geography, it is often used to deal with datasets with spatial uncertainty and is regarded as superior to other density expression methods. We used KDE to calculate the density of the POI for each functional category on a 200 m × 200 m grid to reflect various land use types. As a result, the scale and format of the processed POI density data can be identical to the PDI data, thereby enabling further analysis. The calculation formula of the KDE method is shown in Equation (5).
$$f(x) = \frac{1}{nh^{2}} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right) \tag{5}$$

where $f(x)$ represents the point density of the grid at location $x$; $K$ is the kernel function; $h$ is the bandwidth that defines the degree of smoothing; $n$ denotes the number of estimated points; $i$ indexes each estimated point; and $X_i$ represents the location of each input point.
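The KDE calculation can be sketched as follows for two-dimensional point data. The Gaussian kernel, the bandwidth, and the grid parameters are assumptions chosen for illustration; the kernel and bandwidth used in the study are not specified here.

```python
import math


def gaussian_kernel(u_squared):
    """Two-dimensional Gaussian kernel evaluated at a squared scaled distance."""
    return math.exp(-0.5 * u_squared) / (2.0 * math.pi)


def kde(x, y, points, h):
    """Kernel density at (x, y): (1 / (n * h^2)) * sum_i K((x - X_i) / h)."""
    n = len(points)
    total = sum(
        gaussian_kernel(((x - px) ** 2 + (y - py) ** 2) / h ** 2)
        for px, py in points
    )
    return total / (n * h ** 2)


def kde_grid(points, h, x0, y0, cell, nx, ny):
    """Evaluate the density at the centers of an nx-by-ny grid of square cells."""
    return [
        [kde(x0 + (c + 0.5) * cell, y0 + (r + 0.5) * cell, points, h)
         for c in range(nx)]
        for r in range(ny)
    ]


if __name__ == "__main__":
    pois = [(100.0, 100.0), (120.0, 90.0), (700.0, 650.0)]  # illustrative POIs (m)
    surface = kde_grid(pois, h=200.0, x0=0.0, y0=0.0, cell=200.0, nx=4, ny=4)
    for row in surface:
        print(["%.2e" % v for v in row])
```

Evaluating the density at the centers of 200 m cells gives a raster on the same grid as the PDI, which is what allows the two layers to be compared cell by cell.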

Random forest

Random forest (RF) is a tree-based machine learning model that incorporates the principles of ensemble learning and randomness (Breiman 2001). It operates by constructing multiple decision trees by randomly selecting m sub-samples and n sub-features from the original dataset. Each tree is trained independently on different subsets of the data and random subsets of the features. The final prediction is made by combining the predictions of individual trees, typically through averaging or voting. The randomness makes the model more robust and less prone to overfitting. The randomness in RF does not only come from randomly selecting a subset of data samples, i.e., bagging, but also from the feature randomness that enhances the diversity among the trees. By leveraging the strength of multiple trees and the inherent randomness, RF can effectively capture complex relationships and interactions in the data. It is particularly suitable for handling nonlinear and high-dimensional datasets (Bao et al. 2023). Recognizing the good performance of the RF model, we used it to capture the intricate and nonlinear relationship between population distribution and land use types across different periods.

In addition, to evaluate the accuracy of the RF model, we used mean absolute error (MAE), root mean squared error (RMSE), and coefficient of determination (R²) to assess its performance. MAE measures the average absolute errors between predicted values and actual values. RMSE represents the square root of the average of the squared differences between predicted and actual values. The coefficient of determination, often denoted as R², represents the proportion of the variance in the dependent variable that is predictable from the independent variables, indicating the extent to which the model can explain the variations in the observations. The better the model's performance, the closer the MAE and RMSE are to 0, and the closer R² is to 1.
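The three accuracy measures can be written out directly from their definitions. The sketch below is a plain-Python illustration; the function name `regression_metrics` and the sample values are assumptions, and it does not reproduce the RF model itself.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)        # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]   # illustrative PDI values
    predicted = [2.0, 2.0, 3.0, 3.0]  # illustrative model output
    print(regression_metrics(observed, predicted))
```

For these sample values, the MAE is 0.5, the RMSE is about 0.707, and R² is 0.6.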

Linear regression model and continuous wavelet analysis

The relationship between population dynamics and daily sewage volume was analyzed by adopting a linear regression model based on ordinary least squares. The sewage volume was taken as the dependent variable and the TPI as the independent variable to construct the regression model.
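A simple ordinary least squares fit with one predictor can be sketched as below, with the TPI as the independent variable and the sewage volume as the dependent variable. The function name `ols_fit` and the numbers are illustrative assumptions, not measured data.

```python
def ols_fit(x, y):
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


if __name__ == "__main__":
    daily_tpi = [1.0, 2.0, 3.0, 4.0]     # illustrative daily TPI values
    sewage = [3.1, 4.9, 7.2, 8.8]        # illustrative daily sewage volumes
    print(ols_fit(daily_tpi, sewage))
```

The slope estimates how much the daily sewage volume changes per unit of TPI, and R² summarizes how much of the day-to-day variation the population index explains.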

The continuous wavelet transform (CWT) was employed to analyze the periodic components of the water level series from the river downstream of the Yongfeng WWTP outfall. Wavelet analysis is a powerful mathematical tool developed by Morlet (1982) for signal processing. It is based on mathematical functions called ‘wavelets’, which are localized in both the time and frequency domains. Unlike traditional Fourier analysis, which offers a fixed resolution across all frequencies, wavelet analysis allows for variable resolution, making it particularly useful for analyzing signals whose frequency content changes rapidly over time. Since the introduction of the Morlet wavelet (Grossmann & Morlet 1984), wavelet analysis has become more popular in geophysics and meteorology (Yi & Shu 2012; Hermida et al. 2015). In this study, the Morlet wavelet was chosen as the wavelet basis function ψ0(η), as it provides a good balance between time and frequency resolution. The Morlet wavelet basis function is represented as follows:
$$\psi_0(\eta) = \pi^{-1/4}\, e^{i\omega_0\eta}\, e^{-\eta^{2}/2} \tag{6}$$

where $\psi_0(\eta)$ is the wavelet function, $i$ is the imaginary unit, and $\eta$ and $\omega_0$ represent the dimensionless time and frequency factors, respectively. To meet the admissibility condition of the wavelet, we set $\omega_0 = 6$. After performing the CWT on the given signal $x_n$, the formula for the wavelet coefficients is as follows:
$$W_n^X(s) = \sqrt{\frac{\delta t}{s}} \sum_{n'=1}^{N} x_{n'}\, \psi_0^{*}\!\left[(n'-n)\frac{\delta t}{s}\right] \tag{7}$$

where $W_n^X(s)$ represents the wavelet coefficient, which is a complex number with real and imaginary parts; $s$ is the scale of the wavelet; $\delta t$ is the time step; $n$ is the local time index; $n'$ is the shifted time index of the series $x_n$; $N$ is the number of samples; and $*$ denotes the complex conjugate. In this study, the real part of the wavelet coefficients represents the amplitude and periodic position of water level fluctuations at different time scales. Wavelet power, the squared modulus of the wavelet coefficients, was tested against a white-noise background to assess the significance of the identified periods in the time–frequency domain. The dominant periods of water level fluctuation were taken as the peaks of the average wavelet power spectrum that passed the test at the 5% significance level.
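A minimal sketch of a Morlet-based CWT is given below to show how a dominant period can be read from the time-averaged wavelet power. It uses a direct summation in plain Python, a synthetic hourly series, and a hand-picked set of candidate periods, all of which are assumptions for illustration; the white-noise significance test is omitted.

```python
import cmath
import math

OMEGA0 = 6.0  # dimensionless frequency of the Morlet wavelet


def morlet(eta, omega0=OMEGA0):
    """Morlet mother wavelet: pi^(-1/4) * exp(i*omega0*eta) * exp(-eta^2 / 2)."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * eta) * math.exp(-0.5 * eta * eta)


def fourier_period(scale, omega0=OMEGA0):
    """Fourier period equivalent to a Morlet wavelet scale."""
    return 4.0 * math.pi * scale / (omega0 + math.sqrt(2.0 + omega0 ** 2))


def cwt_power(signal, scales, dt=1.0):
    """Return power[k][n] = |W_n(scales[k])|^2 using direct O(N^2) summation."""
    mean = sum(signal) / len(signal)
    x = [v - mean for v in signal]  # remove the mean before transforming
    power = []
    for s in scales:
        norm = math.sqrt(dt / s)
        row = []
        for n in range(len(x)):
            w = norm * sum(
                value * morlet((k - n) * dt / s).conjugate()
                for k, value in enumerate(x)
            )
            row.append(abs(w) ** 2)
        power.append(row)
    return power


if __name__ == "__main__":
    # Synthetic hourly series with a 24 h cycle (illustrative, not measured data).
    series = [math.sin(2.0 * math.pi * t / 24.0) for t in range(240)]
    periods = [6.0, 12.0, 24.0, 48.0]
    scales = [p / fourier_period(1.0) for p in periods]
    spectrum = cwt_power(series, scales)
    mean_power = [sum(row) / len(row) for row in spectrum]
    print(periods[mean_power.index(max(mean_power))])
```

Direct summation keeps the example dependency-free but scales quadratically with series length; for long records such as 5-min water level data, an FFT-based implementation would be the practical choice.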

Temporal characteristics of population aggregation

As the characteristics of functional zoning are different among the four towns, the temporal variation of the population distribution in each town was analyzed separately (Figure 3). The population quantity and density of different regions showed significant differences over time. The population and population density were highest in XBW, followed by WQ with a relatively small population but the second highest concentration. On the other hand, SZ and SJT had smaller populations and lower population densities. This difference can be attributed to the presence of dense industrial and commercial residential areas in XBW and WQ, while SZ and SJT have more tourist attractions, green spaces, and squares.
Figure 3

Temporal variation in population in each region. (a) Hourly average of the TPI, (b) hourly average of the PDI, (c) daily average of the TPI by week, and (d) daily average of the PDI by week.

In terms of the variation mode, the populations exhibited daily periodicity. The period between 8:00 and 23:00 demonstrated the highest activity level, as indicated by a high heat value, suggesting increased movement and concentration. On the contrary, during the period of 0:00–7:00, the active population decreased, and there was a decrease in the positioning frequency as individuals were primarily at rest, resulting in a low heat value. Furthermore, significant population fluctuations were observed during commuting hours (7:00–9:00 and 17:00–20:00). These findings align with our understanding of human activity patterns during work and sleep times.

Between 8:00 and 23:00, the average TPI ratio of weekdays to weekends was calculated at different spatial scales. The ratio was 1.11 for the entire study area (the FT area) and 1.04 for all of Beijing, indicating that the total population of the FT area was higher on weekdays than on weekends. Within the region, the ratio was 1.12 for WQ, 1.19 for XBW, 0.96 for SZ, and 1.02 for SJT. In addition, the PDI on weekdays was also higher than on weekends, with WQ and XBW exhibiting particularly noticeable differences (Figure 4). On weekdays, the population density in each zone followed a ‘growth-decrease-stability-growth-decrease’ pattern, displaying more significant fluctuations during commuting periods. On weekends, the population density demonstrated a ‘decrease-stability-increase’ trend, with a later wake-up time and slower growth of the PDI after 8:00. In contrast to weekdays, the activity patterns of people on weekends were delayed, resulting in insignificant fluctuations during morning and evening commuting hours and reduced traffic pressure. Specifically, the population of XBW and WQ was larger on weekdays than on weekends, especially in XBW, which functioned as a concentrated work area, attracting individuals from outside the region for work. Conversely, the population of SZ and SJT was larger on weekends than on weekdays, reflecting a focus on livability and leisure activities. During weekdays, some individuals residing in these two areas commuted elsewhere for work, indicating an outflow of people.
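The weekday-to-weekend ratio is a ratio of two averages over the active period. The sketch below assumes hourly records of the form (day of week, hour, TPI) with Monday as 0, and treats 8:00 through 23:00 as inclusive hours; both conventions and the sample values are assumptions for illustration.

```python
def weekday_weekend_ratio(records, start_hour=8, end_hour=23):
    """Ratio of mean weekday TPI to mean weekend TPI within the active period.

    records: iterable of (day_of_week, hour, tpi), with day_of_week 0 = Monday
    through 6 = Sunday. Hours outside [start_hour, end_hour] are ignored.
    """
    weekday_values, weekend_values = [], []
    for day_of_week, hour, tpi in records:
        if start_hour <= hour <= end_hour:
            target = weekday_values if day_of_week < 5 else weekend_values
            target.append(tpi)
    weekday_mean = sum(weekday_values) / len(weekday_values)
    weekend_mean = sum(weekend_values) / len(weekend_values)
    return weekday_mean / weekend_mean


if __name__ == "__main__":
    sample = [
        (0, 3, 5.0),     # Monday 03:00, outside the active period
        (0, 8, 110.0),   # Monday 08:00
        (0, 9, 110.0),   # Monday 09:00
        (5, 8, 100.0),   # Saturday 08:00
        (5, 9, 100.0),   # Saturday 09:00
    ]
    print(weekday_weekend_ratio(sample))
```

A ratio above 1 indicates a net weekday inflow (as reported for XBW and WQ), while a ratio below 1 indicates a larger weekend population.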
Figure 4

Changes in the PDI of each zone on the working day and off day. (a) FT, (b) SJT, (c) SZ, (d) XBW, and (e) WQ.

Spatial distribution characteristics of population aggregation

The temporal changes in the spatial distribution of population density during weekdays and weekends are shown in Figure 5. The BHMs were overlaid and averaged during the active periods for weekdays and weekends (Figure 6). The spatial locations of population aggregation on weekdays and weekends overlapped to a high degree, but the speed of population gathering and dispersion was higher on weekdays. Specifically, two main areas showed high population aggregation on weekdays: Yongfeng Industrial Park in XBW and Zhongguancun Environmental Protection and Technology Park (ZE-Park) in WQ. Yongfeng Industrial Park houses numerous enterprises and industries, such as Zhongguancun Integrated Circuit Design Park, with surrounding commercial residential buildings like Yongfeng Jiayuan. ZE-Park, which is located in the northern part of WQ Town, accommodates various companies and research institutes, including Huawei Beijing Research Institute, with several villages and residential areas to the south of ZE-Park. Generally, on weekdays, individuals gather in the industrial and enterprise zones for work, with some enterprises operating until late hours, which leads to employees getting off work late. Notably, during the periods of 8:00–10:00 and 17:00–19:00, high population concentration along the Sixth Ring Road was observed, indicating high traffic flow and high population mobility during these two commuting periods. It is worth noting that Beijing often encounters traffic congestion during peak commuting hours on weekdays.
Figure 5

The spatial distribution of heat maps over time. (a) weekday and (b) weekend.

Figure 6

The average population aggregation during the active period (8:00–23:00). (a) Weekday, (b) weekend, (c) the ratio of heat value between weekday and weekend, and (d) the ratio of land use parcels' TPI between weekday and weekend.

Compared to the weekdays, weekends exhibited lower peak population density and generally lower levels of population aggregation, with a more dispersed population distribution. The high aggregation areas on weekends primarily comprised commercial residential communities, such as Yongwang Home and Tujing Jiayuan in XBW Town, Dong Xiaoying Village and Cuibei Jiayuan in SZ Town, Tongzeyuan in SJT Town, and Baijiatong District in WQ Town. The secondary level of population aggregation areas consisted of smaller residential areas and villages. Notably, the villages in Beijing are urbanized villages with high housing density and a diverse population of residents. Along the Sixth Ring Road, there was a noticeable increase in population from 16:00 to 18:00, most likely due to recreational activities or evening dining.

To quantitatively analyze the population differences between weekdays and weekends, we computed the ratio of heat values during the active periods for weekdays and weekends in spatial terms (Figure 6(c)). Additionally, we derived the ratio of TPI values for different land use plots (Figure 6(d)). According to the figure, Yongfeng Industrial Park and ZE-Park emerged as the main areas with ratios greater than 1, confirming the higher population density on weekdays in industrial and enterprise zones. The ratios for industry and office land parcels also exceeded 1, further supporting the observation. In contrast, residential areas and villages exhibited ratios below 1 during the daytime, indicating a larger proportion of people resting at home during weekends compared to weekdays.

Subsequently, the PDI for 200 m × 200 m spatial units during the active period was calculated. The global Moran's I values for the PDI on weekdays and weekends were 0.887 and 0.875, respectively, indicating a spatially clustered pattern of population aggregation that was slightly more pronounced on weekdays than on weekends. The results of the local Moran's I analysis are presented in Figure 7. The spatial clustering of the population was predominantly of the HH type, with a significant concentration of people in XBW and WQ in particular. The spatial autocorrelation patterns on weekdays and weekends were largely consistent, and the discrepancies were mainly in locations such as high schools, banks, and government agencies, where students or employees are present on weekdays but not during weekends due to the absence of classes or work.
Figure 7

The local Moran's I of the PDI during the active period. (a) Weekday and (b) weekend.


Relationship between population distribution and land use

The KDE method was employed to calculate POI density for seven functional categories on a 200 m × 200 m grid to represent various land use types. To examine human activity patterns, we divided the hourly heat maps from 0:00 to 23:00 into four time periods, i.e., 0:00–7:00 (nighttime), 8:00–12:00 (morning), 13:00–17:00 (afternoon), and 18:00–23:00 (evening), and averaged the PDI for each period separately on weekdays and weekends. We then constructed an RF model to analyze the relationship between land use types and population changes, using the densities of the various POI types as the independent variables and the average PDI as the dependent variable. From the fitted RF model, we extracted the feature importance of each predictor variable, which indicates its contribution to the target variable, and used it to analyze the relationship between population distribution and land use during the different periods on weekdays and weekends. The feature importance results are shown in Figure 8, and the model's fitting accuracy is given in Figure A.1. An MAE below 0.25, an RMSE below 0.4, and an R² above 0.6 indicated good performance of the RF model.
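
The workflow in this paragraph can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the study's implementation: the data are synthetic, the train/test split and hyperparameters are arbitrary choices, and `feature_importances_` is scikit-learn's impurity-based importance, which may not be the measure the authors used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

poi_names = ["CPOI", "RPOI", "BPOI", "TPOI", "EPOI", "SPOI", "GPOI"]

# Synthetic stand-in data: 500 grid cells, 7 POI kernel densities per cell.
rng = np.random.default_rng(0)
X = rng.random((500, len(poi_names)))
# Hypothetical average PDI dominated by the first column (CPOI).
y = 2.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0.0, 0.05, 500)

# Hold out 30% of the cells to evaluate the fit (an assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Accuracy metrics of the kind reported for the RF model.
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
r2 = r2_score(y_test, y_pred)

# Impurity-based importance of each POI type; the values sum to 1.
importance = dict(zip(poi_names, model.feature_importances_))
top_feature = max(importance, key=importance.get)
print(top_feature, round(mae, 3), round(rmse, 3), round(r2, 3))
```

In practice one such model would be fitted for each of the four time periods on weekdays and weekends, and the resulting importance vectors compared across periods.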
Figure 8

Results of RF regression models. (a) Weekdays and (b) weekends. The CPOI represents the kernel density of commercial POIs, while the RPOI stands for residential, BPOI for shopping, TPOI for tourist attraction, EPOI for education, SPOI for science, and GPOI for government.


The RF regression achieved satisfactory fitting accuracy for all four time periods on both weekdays and weekends. On weekdays, during 0:00–7:00, the population density distribution was mainly influenced by residential land use. During 8:00–12:00 and 13:00–17:00, commercial land use played the dominant role, indicating that a large number of people were concentrated in the enterprise, finance, and insurance functional zones. During 18:00–23:00, shopping and commercial land use were the main influencing factors, suggesting intensive shopping and working activities in this period. On weekends, a similar pattern was observed during 0:00–7:00, when residential land use strongly influenced the population density distribution. During the remaining three time periods on weekends, shopping and residential land use were the main influencing factors.

In general, the population distribution at nighttime (0:00–7:00) was mainly influenced by residential land use on both weekdays and weekends. During the weekday daytime (8:00–17:00), working was the main activity, while during the weekend daytime, shopping and leisure activities became the primary focus. In the evening (18:00–23:00), most people engaged in shopping and leisure activities.

Relationship between population and water use

We mapped the locations of water meters together with the corresponding water consumption of water users in 2021 (Figure 9). It is worth noting that a water meter can represent either a single water user or multiple water users, so meter locations may overlap with or deviate from actual water user positions. The distribution of annual water use was generally consistent with population density. As the bar graphs in Figure 9 show, the annual water consumption of the four towns was also generally consistent with the TPI and the 7th Census population.
Figure 9

Locations of water users and their water consumption in 2021.

Taking XBW Town, which exhibits prominent commuting characteristics, as a representative area, we investigated the relationship between the TPI and the volume of sewage discharge within the area (Figure 10). In general, the population and sewage discharge followed similar trends, and the relationship between them was approximately linear. Specifically, on weekdays, when the population was concentrated, water use increased and a larger volume of wastewater was discharged. Toward the end of the week, the water volume tended to decrease, indicating that factors such as the separation of jobs and housing had a real impact on regional water use and sewage discharge. However, the water quantity data and the population changes did not align perfectly in detail; for example, the water volume dropped suddenly on some weekdays. We speculate that this may reflect a problem with the data, or differences in the water use characteristics of different populations and the uncertainty of water use. This study therefore only preliminarily explored the qualitative patterns between job-housing separation and sewage characteristics using BHM data. More detailed data will be needed to support accurate estimation of the relationship between the population and water usage.
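
A linear relationship of the kind shown in Figure 10(b) can be checked with an ordinary least-squares fit. The sketch below uses hypothetical daily values in arbitrary units, not data from the study, and the helper name `fit_line` is an assumption made for illustration.

```python
import numpy as np

def fit_line(population_index, sewage_volume):
    """Least-squares line sewage = slope * population + intercept, plus Pearson r."""
    x = np.asarray(population_index, dtype=float)
    y = np.asarray(sewage_volume, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r

# Hypothetical daily values (arbitrary units), constructed as 2 * tpi + 1.
tpi = [100.0, 110.0, 120.0, 130.0, 140.0]
sewage = [201.0, 221.0, 241.0, 261.0, 281.0]
slope, intercept, r = fit_line(tpi, sewage)
print(slope, intercept, r)
```

With real data, days on which the sewage volume drops suddenly would show up as large residuals from the fitted line, which is one simple way to flag the anomalies mentioned above.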
Figure 10

The relationship between the TPI and sewage volume. (a) trends of population and sewage discharge and (b) relationship between population and sewage discharge.


Population dynamics and mechanisms

As described in the ‘Results’ section, a high degree of population aggregation and differentiation was observed within the study area, both temporally and spatially. The changes in population quantity and density differed between weekdays and weekends. This phenomenon can be attributed to variations in resident activity patterns and to urban spatial functional differentiation. On weekdays, most people are engaged in work activities, resulting in a higher population density in workplaces, such as enterprise and finance districts, and in higher heat map values along the main roads during commuting times. On weekends, people engage in leisure and rest activities, such as shopping, dining out, visiting scenic spots, and spending time at home. This shift in activities leads to higher population density in residential areas, shopping centers, and recreational zones, which is consistent with previous studies (Wang et al. 2011; Li et al. 2019; Bao et al. 2023). Additionally, we investigated the characteristics and differences of sub-regions. Within the region, XBW Town and WQ Town are characterized by commerce, finance, and industrial functions, leading to a higher population density on weekdays than on weekends. Conversely, SZ Town and SJT Town are more oriented to residential and leisure functions, resulting in a slightly higher concentration of people on weekends than on weekdays.

In the RF analysis, a higher feature importance indicates that a land use type has more influence on population density during that time period, suggesting a closer association between residents and the activities related to this land use type. The RF feature importance results show that on weekdays, people generally follow a pattern of ‘rest-work-leisure-rest’ throughout the day, while on weekends, the activity pattern shifts to ‘rest-leisure-rest’. Similar results were found in previous studies (Bao et al. 2023). In addition, the variation in the intensity of the influencing factors reflects not only the daily activity patterns of residents but also the heterogeneity of urban functional zoning. At night, residential land significantly influences urban population distribution, whereas during the day, the distribution is primarily affected by commercial land (on weekdays) and shopping land (on weekends). Unlike previous studies in other cities (Feng et al. 2019; Li et al. 2019), enterprises have a larger influence during weekday daytime than educational and scientific institutions. This can be explained by differences in urban functional zoning between cities. Specifically, compared with the regions examined in those studies (Feng et al. 2019; Li et al. 2019), this study area has a higher density of enterprises and financial institutions, while educational institutions are relatively rare. The spatial and temporal distribution of the population is the result of the purpose-oriented activities of residents, which are affected by the distribution of the functional zones. The combined effect of institutional change, the market economy, planning, and regulation has diversified urban space and functional zoning. Therefore, the interaction between residents' activity purposes and spatial function differences leads to the spatial and temporal variation of population distribution.

The impact of population dynamics on water use

The analysis of water usage and population changes revealed similar trends and a positive linear correlation between population evolution and sewage volume changes. Intensive population agglomeration on weekdays corresponds to high water usage and sewage generation, whereas on weekends the population decreases and sewage discharge decreases accordingly. Moreover, the water usage of each water user in 2021 aligns with the distribution of the heat value and the census population. To further verify the observed water usage patterns, we examined the water level data from the river downstream of the Yongfeng WWTP outlet (Figure 11(a)); the wavelet power spectrum and the average power curve of the water level time series derived from the CWT analysis are shown in Figure 11(b) and 11(c), respectively. The analysis was conducted during the dry season. For some water-scarce cities, treated wastewater even becomes one of the main sources of water for urban rivers in dry seasons (Luthy et al. 2015). The average wavelet power spectrum showed three notable periods of 12 h, 24 h, and 7 days. Among them, the 24-h period was the most obvious and dominant. The power of the 12-h period was weaker and less variable than that of the 24-h period but could still be clearly identified over much of the data. Both periods can be attributed to the influence of the WWTP discharge, as demonstrated by similar patterns in other studies (Hubbard et al. 2016; Eppehimer et al. 2021). These diurnal and semidiurnal patterns can be explained by human activity patterns. Generally, people are more active during the day and rest at night. Since water usage and discharge are directly related to population activities, the inflow and outflow of sewage exhibit a similar pattern, leading to the dominance of the 24-h period. The 12-h period can be attributed to daily activity patterns in the population (Matos et al. 2013; Wang et al. 2023). Typically, people bathe, flush, and clean mainly in the morning and evening, and these activities are the main sources of domestic wastewater, especially in residential areas. During the day, people go to work or engage in other activities that do not necessitate the use of water, resulting in a slight decrease in water use and hence a semidiurnal pattern. Therefore, affected by human activity patterns, municipal WWTP influent and effluent usually show both diurnal and semidiurnal variations.
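
The idea of recovering 12-h, 24-h, and 7-day periods from a water level series can be illustrated with a simplified sketch. Note that it uses a plain FFT periodogram rather than the CWT applied in the study, so it gives only the time-averaged power and no information on how the periods vary over time. The series is synthetic, and the amplitudes, noise level, and function name are assumptions.

```python
import numpy as np

def dominant_periods(series, sample_interval_h=1.0, n_peaks=3):
    """Return the n_peaks periods (hours) with the largest FFT power, strongest first."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=sample_interval_h)
    # Rank the non-zero frequency bins by power, highest first.
    order = np.argsort(power[1:])[::-1][:n_peaks] + 1
    return [float(1.0 / freqs[i]) for i in order]

# Synthetic hourly water level over 28 days with 24 h, 7 day, and 12 h cycles.
t = np.arange(28 * 24)
rng = np.random.default_rng(0)
level = (1.0 * np.sin(2 * np.pi * t / 24)
         + 0.6 * np.sin(2 * np.pi * t / 168)
         + 0.4 * np.sin(2 * np.pi * t / 12)
         + rng.normal(0.0, 0.05, t.size))
periods = dominant_periods(level)
print([round(p) for p in periods])
```

The record length is chosen so that each cycle falls exactly on an FFT bin; with real data, spectral leakage would make a proper peak search or the wavelet approach used in the study preferable.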
Figure 11

Characteristics of the water level downstream of the discharge outlet. (a) The temporal variation of the water level, (b) the wavelet power spectrum of the water level time series, and (c) the average power curve of the water level time series.


In addition, a 7-day period is evident. According to the previous analysis, the population in XBW Town is 19% higher on weekdays than on weekends. Hence, the weekly variation in the water-using population resulted in the 7-day periodicity of wastewater discharges. Combined with Figure 10, it can be concluded that the pronounced separation of jobs and residences in metropolitan areas can have a significant impact on urban water use and wastewater discharges. This is consistent with previous literature (Atinkpahoun et al. 2018). We therefore suggest that it may be feasible to adjust the water supply mode and the wastewater treatment operation mode in line with the activity and migration patterns of the population, making them more efficient and less energy-consuming and thus ensuring a more cost-effective operation of the city.

Limitations and implications

It should be noted that the measurement of water consumption and wastewater production is a complex and systematic task that may also involve factors such as spatial geographic features, types of water use, and water resource management. The current research still has some limitations. Limited by data sources, this study used only 14 days of BHM data, together with daily sewage data and annual water consumption data. On the one hand, although the BHM data represent a significant number of Baidu users, they are based on sampled data and may contain uncertainties. They may also fail to capture individuals without smartphones, such as children and the elderly, or those who do not use Baidu-related products; these omissions could cause some deviations in the results. On the other hand, the length and accuracy of the water volume data can still be improved in the future for a more incisive analysis. Therefore, the analysis presented here provides preliminary results on water usage and sewage discharge under the influence of population changes, and there is still a long way to go before population-related data can provide precise technical guidance for water resource supply and demand management.

Despite its preliminary nature, this study demonstrates the potential of population dynamics for predicting water consumption and sewage discharge, optimizing water supply infrastructure, and improving water resources management. These findings are important for society's sustainable development and urban well-being. Some important issues remain to be addressed by future studies. Big data with higher spatial resolution may further improve the knowledge of water use mechanisms and help in urban water consumption prediction. Advanced machine learning methods, such as the relevance vector machine tuned with improved Manta-Ray foraging optimization (RVM-IMRFO), the hybrid adaptive neuro-fuzzy inference system coupled with new hybrid heuristic algorithm techniques (ANFIS-WCAMFO), and spatiotemporal attention long short-term memory (STA-LSTM), have developed rapidly and could be employed in the modeling and prediction of water and wastewater time series. Furthermore, constructing population estimation models using multi-source big data, clarifying the water use quotas of different population groups during various periods, and further assessing actual water use efficiency are of great significance in promoting the efficient use of water resources. These efforts will improve the accuracy of evaluation results and provide useful references for urban planning and water resource management strategies.

In this study, we analyzed the spatiotemporal dynamics and influencing mechanisms of population using BHM, land use parcels, and POI data, and studied the impact of population dynamics on water use using data on water consumption, sewage volume, and river depth. The main findings of this article include:

  • i. From a temporal perspective, we observed different patterns of population change between weekdays and weekends. On weekdays, two additional minor peaks occur during commuting hours, distinguishing them from weekends. In addition, the activity time of residents on weekends tends to be slightly delayed compared with weekdays.

  • ii. Spatially, the populations display H–H clustering on both weekdays and weekends, with significant overlap in spatial distribution. The population density is generally higher on weekdays than on weekends. Specifically, on weekdays, high population concentrations are mainly observed in industrial parks and enterprise districts, while on weekends, they tend to be more concentrated in residential areas. Moreover, different sub-regions have different population dynamic characteristics.

  • iii. In terms of the underlying mechanisms of population dynamics, during the daytime, population distribution is primarily influenced by work-related areas on weekdays, while shopping and recreation areas have a predominant impact on weekends. During nighttime, residential areas play a major role in population distribution. The interaction between residents' activity purposes and spatial function differences leads to the spatial and temporal variations of population distribution.

  • iv. Population density and water use are positively related, and both show diurnal and weekly patterns. The daily routines and commuting activities of residents, together with the distribution of urban functional areas, jointly lead to variations in urban water consumption and wastewater discharge.

By comprehensively analyzing the interaction between population dynamics and water use, our study contributes to a better understanding of urban water management and provides valuable insights for sustainable resource allocation and planning. While we have demonstrated the influence of changing population dynamics on urban water use, our study still has some limitations stemming from data acquisition and the scale of the study. To further advance the understanding of the complex mechanisms underlying water usage in urban areas, future work should integrate more multi-source big data and higher-resolution water quantity data. This will enable a more in-depth exploration of these dynamics, ultimately helping to enhance the operational efficiency of water supply and wastewater treatment systems.

We thank the Beijing Water Science and Technology Institute for providing the water quantity data.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Atinkpahoun C. N., Le N. D., Pontvianne S., Poirot H., Leclerc J.-P., Pons M.-N. & Soclo H. H. 2018 Population mobility and urban wastewater dynamics. Science of the Total Environment 622, 1431–1437.

Bakchan A., Roy A. & Faust K. M. 2022 Impacts of COVID-19 social distancing policies on water demand: A population dynamics perspective. Journal of Environmental Management 302, 113949.

Bao W., Gong A., Zhang T., Zhao Y., Li B. & Chen S. 2023 Mapping population distribution with high spatiotemporal resolution in Beijing using Baidu heat map data. Remote Sensing 15, 458.

Breiman L. 2001 Random forests. Machine Learning 45, 5–32.

Eppehimer D. E., Enger B. J., Ebenal A. E., Rocha E. P. & Bogan M. T. 2021 Daily flow intermittence in an effluent-dependent river: Impacts of flow duration and recession rate on fish stranding. River Research and Applications 37, 1376–1385.

Feng D. Y., Tu L. L. & Sun Z. W. 2019 Research on population spatiotemporal aggregation characteristics of a small city: A case study on Shehong County based on Baidu Heat Maps. Sustainability 11, 6276.

Grossmann A. & Morlet J. 1984 Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM Journal on Mathematical Analysis 15, 723–736.

Gu J. F., Xu P., Pang Z. H., Chen Y. B., Ji Y. & Chen Z. 2018 Extracting typical occupancy data of different buildings from mobile positioning data. Energy and Buildings 180, 135–145.

Hermida L., Lopez L., Merino A., Berthet C., Garcia-Ortega E., Sanchez J. L. & Dessens J. 2015 Hailfall in southwest France: Relationship with precipitation, trends and wavelet analysis. Atmospheric Research 156, 174–188.

Hubbard L. E., Keefe S. H., Kolpin D. W., Barber L. B., Duris J. W., Hutchinson K. J. & Bradley P. M. 2016 Understanding the hydrologic impacts of wastewater treatment plant discharge to shallow groundwater: Before and after plant shutdown. Environmental Science: Water Research & Technology 2, 864–874.

Jia J. S. S., Lu X., Yuan Y., Xu G., Jia J. M. & Christakis N. A. 2020 Population flow drives spatio-temporal distribution of COVID-19 in China. Nature 582, 389.

Kalbusch A., Henning E., Brikalski M. P., De Luca F. V. & Konrath A. C. 2020 Impact of coronavirus (COVID-19) spread-prevention actions on urban water consumption. Resources Conservation and Recycling 163, 105098.

Kavya M., Mathew A., Shekar P. R. & Sarwesh P. 2023 Short term water demand forecast modelling using artificial intelligence for smart water management. Sustainable Cities and Society 95, 104610.

Luthy R. G., Sedlak D. L., Plumlee M. H., Austin D. & Resh V. H. 2015 Wastewater-effluent-dominated streams as ecosystem-management tools in a drier climate. Frontiers in Ecology and the Environment 13, 477–485.

Lyu F. N. & Zhang L. 2019 Using multi-source big data to understand the factors affecting urban park use in Wuhan. Urban Forestry & Urban Greening 43, 126367.

Matos C., Teixeira C. A., Duarte A. & Bentes I. 2013 Domestic water uses: Characterization of daily cycles in the north region of Portugal. Science of the Total Environment 458, 444–450.

Moran P. A. P. 1948 The interpretation of statistical maps. Journal of the Royal Statistical Society Series B-Statistical Methodology 10, 243–251.

Morlet J. 1982 Sampling theory and wave-propagation. Geophysics 47, 489–489.

Rasifaghihi N., Li S. S. & Haghighat F. 2020 Forecast of urban water consumption under the impact of climate change. Sustainable Cities and Society 52, 101848.

Sadeghi B., Borazjani M. A., Mardani M., Ziaee S. & Mohammadi H. 2023 Systemic management of water resources with environmental and climate change considerations. Water Resources Management 37, 2543–2574.

Shi P. P., Xiao Y. H. & Zhan Q. M. 2020 A study on spatial and temporal aggregation patterns of urban population in Wuhan City based on Baidu Heat Map and POI data. International Review for Spatial Planning and Sustainable Development 8, 101–121.

Smolak K., Kasieczka B., Fialkiewicz W., Rohm W., Siła-Nowicka K. & Kopańczyk K. 2020 Applying human mobility and water consumption data for short-term water demand forecasting using classical and machine learning models. Urban Water Journal 17, 32–42.

Song C. M., Qu Z. H., Blumm N. & Barabasi A. L. 2010 Limits of predictability in human mobility. Science 327, 1018–1021.

Wang H., Bracciano D. & Asefa T. 2020 Evaluation of water saving potential for short-term water demand management. Water Resources Management 34, 3317–3330.

Wang Q., Yu J., Zheng Y., Yao X., Yue Q. & Xu S. 2023 Hydraulic simulation of an urban river affected by treated effluent based on signal processing theory and physically based models. Journal of Hydrology: Regional Studies 49, 101518.

Wei J. X., Lei Y. L., Yao H. J., Ge J. P., Wu S. M. & Liu L. N. 2021 Estimation and influencing factors of agricultural water efficiency in the Yellow River Basin, China. Journal of Cleaner Production 308, 127249.

Yu J., Tian Y., Jing H., Sun T., Wang X., Andrews C. B. & Zheng C. 2023 Predicting regional wastewater treatment plant discharges using machine learning and population migration big data. ACS ES&T Water 3, 1314–1328.

Zeng P., Sun Z. Y., Chen Y. Q., Qiao Z. & Cai L. W. 2021 COVID-19: A comparative study of population aggregation patterns in the Central Urban Area of Tianjin, China. International Journal of Environmental Research and Public Health 18, 2135.

Zhang S. M., Zhang W. S., Wang Y., Zhao X. Y., Song P. H., Tian G. H. & Mayer A. L. 2020 Comparing human activity density and green space supply using the Baidu Heat map in Zhengzhou, China. Sustainability 12, 7075.

Zhang G. Y., Poslad S., Fan Y. L. & Rui X. P. 2022 Quantitative spatiotemporal impact of dynamic population density changes on the COVID-19 pandemic in China's mainland. Geo-Spatial Information Science 26, 642–663.

Zhang C., Zhao K. & Chen M. 2023 Beyond the limits of predictability in human mobility prediction: context-transition predictability. IEEE Transactions on Knowledge and Data Engineering 35, 4514–4526.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).

Supplementary data