Household water consumption plays an important role in addressing water shortage and achieving sustainable water development. To identify, assess, and analyze the impact of family structure on household water consumption, this study develops a mathematical statistical method to conduct multi-scenario simulations of average annual household water consumption based on data from the 2016 China Family Panel Studies (CFPS). The Kolmogorov–Smirnov test and the two-independent-sample t-test were used to identify the best-fitting distribution, and the probability distribution and expected value of average annual household water consumption were then derived from its probability density function. The results showed that the Birnbaum–Saunders distribution was the optimal distribution; that families comprising one and two generations were dominant in terms of water consumption; and that water-saving households were far fewer than households with high levels of water consumption. The findings of this study have valuable implications for water governance and domestic water planning.

• This study develops a mathematical statistical method to conduct multi-scenario simulations of average annual household water consumption based on data from the 2016 China Family Panel Studies (CFPS).

• The mathematical statistical method was used to compare the degree of fit between the samples of water consumption of various families and the distribution of various functions.

Water shortage is a common challenge worldwide. The sixth goal of Sustainable Development Goals (SDGs) is to ensure the availability and sustainable management of water and sanitation. Water shortage is caused not only by natural factors but also by human and social factors (Corbella & Pujol 2009). Domestic water accounts for less than 10% of all human water consumption. Domestic water includes urban and rural domestic water. Specifically, it includes water for residential, public, and livestock purposes. There are significant regional differences in global water consumption, and people from different countries have diverse levels of water consumption. China has one of the highest water consumption levels in the world. According to the 2019 China Water Resources Bulletin, China's total water consumption was 602.12 billion m3, of which 87.17 billion m3 was domestic water, accounting for 14.4% of the total water consumption. In 2019, China's population was 1.398 billion, compared with the global population of 7.673 billion, accounting for 18.2% of the global population. China's per capita total water consumption is 431 m3. The urban per capita domestic water consumption was 225 L/day, and the per capita domestic water consumption of rural residents was 89 L/day. This substantial water consumption level requires a detailed longitudinal analysis of household water consumption. Therefore, examining household water consumption in China is of great significance not only for China but also for global sustainable water development.

There is extensive literature on household water consumption regarding individual consumption preferences, willingness to pay, water-saving cognition, and institutional culture (Basu et al. 2017; Vieira et al. 2018; Garcia et al. 2019; Liao et al. 2021). The characteristics and influencing factors of household water consumption are hot topics in water governance and domestic water planning. Existing studies analyze household water consumption mainly from the supply and demand sides. From the perspective of the water demand side, consumers' attitudes toward water savings are closely related to the degree of sustainable water development. Social and demographic characteristics (Jorgensen et al. 2010), family characteristics, and water resource utilization and protection (Syme et al. 2004) are the key factors that have impacted the changes in household water consumption. Willis et al. (2011) used econometrics, questionnaire surveys, factor analysis, and cluster analysis to explore the relationship between water-saving attitudes and household water consumption based on 132 independent households on the Gold Coast of Australia and found that residents with positive attitudes toward water saving had significantly less water consumption than those with negative attitudes. Fielding et al. (2012) collected the water consumption data of 1,008 households in Australia and found that factors such as population, social psychology, behavior, occupancy rate, and infrastructure determine household water consumption. Studies have shown that demographic factors are a key element that affects household water consumption; families with a strong water-saving culture and habits consume less water. Salman found that there is a significant correlation between water consumption and variables such as household type and age, which is of great significance to the management of regional water resources. 
Dudkiewicz & Laska (2019) used the input–output analysis method to quantitatively analyze the life cycle water consumption of Chinese households from 2002 to 2015 and found that demographic changes can reduce the household water consumption of rural families but increase the household water consumption of urban families. Shan et al. (2015) analyzed three major factors pertinent to the behavior of domestic water consumers: end-use behaviors, socio-demographic and property characteristics, and psychosocial constructs. In addition, the impact of behavior on household water consumption has received increasing attention in recent years (Cary 2008; Shahangian et al. 2022).

The management and study of the water supply side have always been key to understanding household water consumption. The main influencing factors include climate change and meteorology (Slavíková et al. 2013), water pricing, and water policy (Jorgensen et al. 2010). For example, the Chinese government has regarded price as one of the important tools of water resource supervision for more than two decades. The sharp rise in water prices in China has improved water use efficiency. However, the implementation of water resource taxes has shortcomings such as unclear responsibilities, low collection rates, and poor governance capabilities (Olmstead et al. 2007). In 2013, China's National Development and Reform Commission pointed out that, although the wealthiest 5% of families were willing to pay three times or more for basic water consumption, about 80% of lower-income families were unwilling to do so. Therefore, if the Chinese central government adopted reform measures to curb heavy water use through higher prices, the water consumption of most families would no longer meet their basic needs. Jorgensen pointed out that the changes in water policy by the water authorities in England and Wales have clearly deviated from social equity. The new charging policy is unfair to low-income families, and the charging strategy does not fully consider the differences in supply and demand caused by social and geographic disparities (Jorgensen et al. 2014). Keshavarzi et al. (2006) quantitatively analyzed the impact of price and non-price factors on residential water demand using household survey data from 10 countries and found that household water-saving behaviors and average water prices were complementary. Martins & Fortunato (2005) analyzed panel data obtained from a 72-month survey of five communities in Portugal and established that household size was positively correlated with residential water demand.
Although it has weak elasticity, price plays a role in water demand management. Zhang et al. (2017) used difference-in-differences models to evaluate China's water price reforms and found that the policy reform reduced annual residential water demand by 3–4% in the short run and by 5% in the long run. Lam (2010) found that consumers' subjectivity to water saving has a positive effect on alleviating water use in arid areas, and further asserted that household income and educational background have significant but inconsistent effects on water use.

At present, scholars generally study how consumers affect domestic water consumption by analyzing family members at the individual level. Clearly, individual differences within families do have a certain impact on household water consumption. In practice, however, most families are composed of multiple people. The consumption of household resources, including water and energy, is the result of collective living, and household water consumption is not simply the sum of individual water consumption (Ren et al. 2016; Hu et al. 2020). In addition, the analysis of household water consumption needs to consider the size of the family and the consumption differences between age groups. Existing studies also indicate that there are scale effects, intergenerational effects, and marginal effects on household consumption of resources such as electricity (Hu et al. 2020; Wu et al. 2021). It is reasonable to assume that the same pattern applies to water consumption. Therefore, family structure, namely, the combination of population size and number of generations, has a crucial impact on household water consumption.

When a family consumes water resources, it is necessary to consider not only the size of the family, but also the needs of people of different ages to achieve a better balance of household water consumption. It is important to consider the preferences and needs of various family members in daily household water consumption, especially when different generations are living together, and it is particularly important to consider the difference between generations. Therefore, studying the simulation and prediction of the impact of family structure on household water consumption is crucial for an effective analysis of the changes in household water consumption. Finding the optimal household structure can provide theoretical support and guide the rational formulation of household water control standards and policies.

Currently, there is limited literature that uses mathematical statistical methods to examine the impact of family structure on household water consumption. Innovative statistical and machine learning methods have been introduced in recent years to analyze household water consumption (Duerr et al. 2018; Dimauro et al. 2022). Since household water consumption has different characteristics, finding a comprehensive and precise model for simulation and prediction that includes the main factors is challenging (Fontdecaba et al. 2013; Chenoweth et al. 2016). The novelty of this study is threefold: it calculates the annual household water consumption of different family types from the perspective of household structure based on the China Family Panel Studies (CFPS) data; it obtains the optimal distribution through mathematical function simulations; and it estimates the probability distribution and expected value of the average annual household water consumption of different family structures through probability density functions. This study provides a demographic profile of household water consumption and preliminarily reveals the correlation between family structure and water consumption by using statistical analysis and mathematical model fitting. The data were first cleaned and statistically analyzed, candidate models were then compared using an iterative approach, and finally the Birnbaum–Saunders model was selected as the optimal model to fit the probability distribution of household water consumption by family structure. The results of this study help to fully understand the impact of family structure on household water consumption, thereby providing a basis for precise policy implementation. A further contribution of this study is that it conducts an extensive literature review of household water consumption and identifies an optimal mathematical model based on a detailed database.

### Data source and processing

The research data were obtained from the 2016 CFPS. The CFPS is an open database provided by the Institute of Social Science Survey of Peking University, China. The CFPS is a national longitudinal survey of Chinese communities, families, and individuals. The CFPS is designed to collect individual-, family-, and community-level longitudinal data in contemporary China. The studies focus on the economic and non-economic well-being of the Chinese population, with a wealth of information covering topics such as economic activities, education outcomes, family dynamics and relationships, migration, and health. The CFPS has successfully interviewed almost 15,000 families and almost 30,000 individuals within these families, with an approximate response rate of 79%. All members over the age of 9 in a sampled household are interviewed. These individuals constitute the core respondents of CFPS.

To reduce the effects of contingency and ensure the validity and reliability of the results, the sample was preliminarily processed as follows. Household members who were not living at home and residents with no economic ties to the family were excluded. In addition, the members of many families in China work in other cities all year round and return home only during holidays; such families also need to be excluded as outliers. Because these families could not be identified directly in the sample, a simple calculation was used. As noted in the introduction, the per capita domestic water consumption of rural residents in China is 89 L/day. For a rural family with only one person, this minimum standard implies an annual water consumption of about 32.5 m3 (89 L/day × 365 days). Families with an annual water consumption of less than 30 m3 were therefore assumed not to live locally all year round and were excluded from the sample. After the aforementioned data processing, 5,321 families were retained in the sample. The sample data showed that the numbers of families with one to four generations were 1,648, 2,025, 1,262, and 386, respectively. Among the samples, families with two generations accounted for the largest proportion, followed by families with one generation. Families with four generations were the least common (Table 1).
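
The exclusion rule above is simple arithmetic, and a minimal sketch may make it concrete. The function names and the example list of household values below are hypothetical and serve only as illustration; only the 89 L/day figure and the 30 m3 cutoff come from the text.

```python
LITRES_PER_M3 = 1000
DAYS_PER_YEAR = 365


def annual_m3(litres_per_day: float) -> float:
    """Convert a daily per capita use in litres into an annual volume in m3."""
    return litres_per_day * DAYS_PER_YEAR / LITRES_PER_M3


def keep_resident_households(consumption_m3, threshold_m3=30.0):
    """Keep households whose annual consumption is at least the threshold."""
    return [c for c in consumption_m3 if c >= threshold_m3]


# 89 L/day is the rural per capita figure quoted in the text.
single_person_minimum = annual_m3(89)  # roughly 32.5 m3 per year

# Hypothetical annual consumption values (m3), for illustration only.
example_households = [12.0, 30.0, 45.5, 183.25, 28.9]
retained = keep_resident_households(example_households)
```

The cutoff of 30 m3 sits slightly below the single-person estimate, so a household is dropped only when its recorded use is below what one rural resident would be expected to consume in a year.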

Table 1

Sample distribution of families with different numbers of generations

| Number of generations in family | Sample number | Proportion of total sample (%) |
|---|---|---|
| 1 | 1,648 | 30.97 |
| 2 | 2,025 | 38.06 |
| 3 | 1,262 | 23.72 |
| 4 | 386 | 7.25 |
| Total | 5,321 | 100 |

The data covered most of mainland China (Table 2), with a total of 24 provincial-level regions. The sample distribution was relatively uniform. Nineteen regions had a sample size of more than 100, and in eight regions the proportion of the sample size exceeded 5%. Gansu Province had the largest sample size (641), accounting for 12.05% of the total sample. Henan Province had the second largest sample with 604 families, accounting for 11.35% of the total sample. The sample numbers in Guangdong, Hebei, Liaoning, Shandong, and Shanghai were 423, 344, 585, 273, and 279, and they accounted for 7.95, 6.46, 10.99, 5.13, and 5.24% of the total sample, respectively. The smallest samples came from Sichuan (42 families, 0.79% of the total sample) and Beijing (44 families, 0.83%). The remaining regions together accounted for about 39% of the total.

Table 2

Regional distributions of the samples

| Region | Sample number | Proportion of total sample (%) |
|---|---|---|
| Anhui | 109 | 2.05 |
| Beijing | 44 | 0.83 |
| Fujian | 59 | 1.11 |
| Gansu | 641 | 12.05 |
| Guangdong | 423 | 7.95 |
| Guangxi | 111 | 2.09 |
| Guizhou | 197 | 3.70 |
| Hebei | 344 | 6.46 |
| Henan | 604 | 11.35 |
| Heilongjiang | 173 | 3.25 |
| Hubei | 106 | 1.99 |
| Hunan | 157 | 2.95 |
| Jilin | 119 | 2.24 |
| Jiangsu | 98 | 1.84 |
| Jiangxi | 108 | 2.03 |
| Liaoning | 585 | 10.99 |
| Shandong | 273 | 5.13 |
| Shanxi | 120 | 2.26 |
| Shaanxi | 376 | 7.07 |
| Shanghai | 279 | 5.24 |
| Sichuan | 42 | 0.79 |
| Tianjin | 155 | 2.91 |
| Yunnan | 122 | 2.29 |
| Zhejiang | 76 | 1.43 |
| Others | 109 | 2.05 |
| Total | 5,321 | 100 |

### Calculation of household water consumption

Since the CFPS records only household water expenditure and not household water consumption directly, household water consumption had to be calculated. In this study, the annual household water consumption of families with one to four generations was calculated from the household water expenditure reported in the 2016 CFPS, together with the tiered water prices and the tier volume caps in each province of China. Tiered pricing for household water has been fully implemented in China since 2015. Let P denote the annual household water expenditure; P1, P2, and P3 the tier 1, tier 2, and tier 3 water prices, respectively; and T1, T2, and T3 the corresponding caps on tiered water consumption. The household water consumption in each of the three tiers, represented by C1, C2, and C3, respectively, and the annual household water consumption T were calculated as follows:
(1)
(2)
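
As an illustration of the conversion from expenditure to volume, the following minimal sketch inverts an increasing block tariff. It is not a reproduction of Equations (1) and (2): the function names, the assumption that the third tier is open-ended, and the prices and caps in `PRICES` and `CAPS` are all hypothetical.

```python
def tiered_cost(volume, prices, caps):
    """Annual bill for `volume` m3 under an increasing block tariff.

    prices: price per m3 in each tier, e.g. (P1, P2, P3)
    caps:   upper volume limit of every tier except the last, e.g. (T1, T2)
    """
    cost, lower = 0.0, 0.0
    for price, upper in zip(prices, list(caps) + [float("inf")]):
        if volume <= lower:
            break
        cost += price * (min(volume, upper) - lower)
        lower = upper
    return cost


def consumption_from_expenditure(expenditure, prices, caps):
    """Invert the tariff: recover the annual volume (m3) from the annual bill."""
    volume, lower, remaining = 0.0, 0.0, expenditure
    for price, upper in zip(prices, list(caps) + [float("inf")]):
        tier_full_cost = price * (upper - lower)
        if remaining <= tier_full_cost:
            # The bill runs out inside this tier.
            return volume + remaining / price
        remaining -= tier_full_cost
        volume = lower = upper
    return volume


# Hypothetical tariff: prices in yuan per m3, caps in m3 per year.
PRICES = (2.0, 3.0, 6.0)
CAPS = (180.0, 260.0)

bill = tiered_cost(200.0, PRICES, CAPS)                    # 2*180 + 3*20
volume = consumption_from_expenditure(bill, PRICES, CAPS)  # back to 200 m3
```

Because the bill is a strictly increasing function of volume when all prices are positive, each expenditure maps to exactly one volume, which is what makes the CFPS expenditure data usable as a proxy for consumption.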

Table 3 presents the descriptive statistics of the household residential water consumption.

Table 3

Descriptive statistics of household residential water consumption

| Statistic | Families with one generation | Families with two generations | Families with three generations | Families with four generations |
|---|---|---|---|---|
| Average | 183.25 | 191.02 | 209.09 | 178.08 |
| Standard deviation | 135.69 | 147.35 | 174.41 | 151.54 |
| Maximum | 1,926.00 | 2,372.93 | 1,442.43 | 1,488.04 |
| Minimum | 30.00 | 30.00 | 32.97 | 32.97 |
| Range | 1,896.00 | 2,342.93 | 1,409.47 | 1,455.08 |
| Median | 153.19 | 158.35 | 171.43 | 132.26 |
| Mode | 125.00 | 102.13 | 137.14 | 34.78 |
| Coefficient of variation | 0.74 | 0.77 | 0.83 | 0.85 |
| Kurtosis | 24.94 | 38.89 | 11.75 | 20.43 |
| Skewness | 3.21 | 4.13 | 2.75 | 3.44 |

Unit: m3/year.

The mean annual household water consumption ranged from 178.08 to 209.09 m3. Among the samples, four-generation households had the lowest water consumption per household, and three-generation households had the highest. The standard deviation ranged from 135.69 to 174.41 m3, and the median from 132.26 to 171.43 m3. Why do four-generation households have lower water consumption in the samples? Considering the social, economic, and demographic context of China, the possible reasons are as follows. First, there are very few families with four generations. According to China's sixth census, only 18% of all Chinese families have three generations or more, and families with four generations account for 7.25% of the samples in this study, which exceeds the national average. Second, in terms of population structure, four-generation families generally include very old people and very young children, who consume less water, while families with three generations or fewer consist mainly of young adults, who consume more water. Finally, family structure has a diminishing marginal effect on water use, and four generations is a critical inflection point. The kurtosis coefficients of all four categories of families were greater than zero, indicating that the distribution of annual household water consumption was more sharply peaked, with heavier tails, than the normal distribution. The skewness coefficients were all greater than zero: the peak of the frequency distribution was shifted to the left and the long tail extended to the right, which indicates a positively skewed distribution. The coefficients of variation of household water consumption of families with one to four generations were all less than one, indicating that the sample data were relatively concentrated and representative (Figure 1).
Figure 1

Characteristics of household water consumption.
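
For readers who wish to reproduce shape statistics of the kind reported in Table 3, the sketch below computes them with the Python standard library. It assumes moment-based definitions of skewness and excess kurtosis; the estimator actually used in the study is not stated, so small numerical differences are possible. The sample values are hypothetical.

```python
from statistics import mean, median, pstdev, stdev


def describe(values):
    """Descriptive statistics for a list of annual consumption values (m3)."""
    n = len(values)
    m = mean(values)
    sd = stdev(values)       # sample standard deviation
    sd_pop = pstdev(values)  # population standard deviation, used for moments
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return {
        "mean": m,
        "sd": sd,
        "median": median(values),
        "range": max(values) - min(values),
        "cv": sd / m,                               # coefficient of variation
        "skewness": m3 / sd_pop ** 3,               # > 0: long right tail
        "excess_kurtosis": m4 / sd_pop ** 4 - 3.0,  # > 0: heavier tails than normal
    }


# Hypothetical right-skewed sample: most households are moderate users,
# one household is a very heavy user.
sample = [30, 100, 150, 160, 200, 1500]
summary = describe(sample)
```

A single heavy user is enough to push both skewness and excess kurtosis above zero, which mirrors the pattern described for all four family types.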


### Building the fitting model

Simulation is the basis of forecasting the trend of water use and formulating water resource management policy scientifically. Through multi-function fitting of the sample data of water consumption, we can find an optimal model, which can predict water consumption and structure according to the characteristics of population and family in China, providing reference for the Chinese government's decision-making, and helping to promote water governance and sustainable water development. In order to test whether household water consumption data conform to a certain statistical distribution, and to further explore the correlation between household structure and water consumption and the optimal structure distribution of household structure, the Kolmogorov–Smirnov test was carried out on samples of household water consumption. The Kolmogorov–Smirnov test is a useful nonparametric goodness of fit test used to check whether a set of samples follow a probability distribution (Saeed et al. 2021; Wee et al. 2021). The consistency test function was as follows:
(3)
where Ci represents the household water consumption of families with different numbers of generations, assumed to follow a given distribution; F is the cumulative distribution function of that distribution; Iα is the confidence interval; and ρ is the significance probability of the Kolmogorov–Smirnov test, ρ ∈ [0, 1]. When ρ ≥ 0.05 occurred frequently, the simulated distribution was accepted often: there was no significant difference between the distribution of the actual sample data and the simulated distribution, and the degree of fit was high. When ρ < 0.05 occurred frequently, the rejection frequency was high: there were significant differences between the distribution of the actual sample data and the simulated distribution, and the degree of fit was low. Eight distributions were tested: the Birnbaum–Saunders, Burr, Gamma, Generalized Extreme Value, inverse Gaussian, Log-Logistic, lognormal, and t Location-Scale distributions. Finally, the distributions with more rejections were excluded, and the three candidate distributions with the highest numbers of acceptances and the highest degrees of fit were retained, namely, the Birnbaum–Saunders, lognormal, and inverse Gaussian distributions.
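
A one-sample Kolmogorov–Smirnov check of this kind can be sketched as follows. This is an illustrative implementation, not the authors' MATLAB procedure: the p-value uses the asymptotic Kolmogorov series with a common small-sample correction, and a lognormal distribution with hypothetical parameters stands in for the candidate distribution.

```python
import math
import random
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and the model `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def kolmogorov_p_value(d, n, terms=100):
    """Approximate p-value of a one-sample K-S statistic d with n points."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:  # the series is numerically 1 in this region
        return 1.0
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def lognormal_cdf(mu, sigma):
    """Return the CDF of a lognormal distribution as a function of x > 0."""
    normal = NormalDist(mu, sigma)
    return lambda x: normal.cdf(math.log(x))


# Hypothetical data: 500 draws from a lognormal with mu=5.0, sigma=0.6.
rng = random.Random(1)
data = [rng.lognormvariate(5.0, 0.6) for _ in range(500)]

p_true = kolmogorov_p_value(ks_statistic(data, lognormal_cdf(5.0, 0.6)), len(data))
p_wrong = kolmogorov_p_value(ks_statistic(data, lognormal_cdf(5.5, 0.6)), len(data))
accepted = p_true >= 0.05  # acceptance rule used in the text
```

A candidate whose parameters are far from those that generated the data yields a p-value well below 0.05 and would count as a rejection under the rule described above.
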
The optimization model of Birnbaum–Saunders distribution is:
(4)
The optimization model of lognormal distribution is:
(5)
The optimization model of inverse Gaussian distribution is:
(6)
Taking the optimization model of the Birnbaum–Saunders distribution as an example, the model combines the cumulative distribution function of the Birnbaum–Saunders distribution with the Kolmogorov–Smirnov test applied to the sample values Ci of household water consumption. The distribution has a shape parameter and a scale parameter, and Φ(x) is the distribution function of the standard normal distribution. For a simple random sample of size N drawn from a population following the B–S distribution, the shape and scale parameters can be determined from the observed values using the maximum likelihood method. The maximum likelihood function is expressed as follows:
(7)
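
The following sketch shows one standard way to obtain maximum likelihood estimates for the two-parameter Birnbaum–Saunders distribution. It uses the textbook parameterization with shape `alpha` and scale `beta`, which is assumed here and may differ in notation from Equations (4) and (7). For a given sample, the scale estimate is the root of a one-dimensional equation in terms of the arithmetic mean `s` and the harmonic mean `r`, and the shape estimate then follows in closed form. All names and the simulated data are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, harmonic_mean

_STD_NORMAL = NormalDist()


def bs_cdf(t, alpha, beta):
    """Birnbaum-Saunders CDF with shape alpha and scale beta, for t > 0."""
    z = (math.sqrt(t / beta) - math.sqrt(beta / t)) / alpha
    return _STD_NORMAL.cdf(z)


def bs_sample(alpha, beta, rng=random):
    """Draw one B-S variate by transforming a standard normal draw."""
    half = alpha * rng.gauss(0.0, 1.0) / 2.0
    return beta * (half + math.sqrt(half * half + 1.0)) ** 2


def bs_mle(data, tol=1e-10):
    """Maximum likelihood estimates (alpha, beta) for positive data."""
    s = fmean(data)          # arithmetic mean
    r = harmonic_mean(data)  # harmonic mean

    def g(beta):
        # Likelihood equation for beta; K(beta) is the harmonic mean of beta + t_i.
        k = 1.0 / fmean([1.0 / (beta + t) for t in data])
        return beta * beta - beta * (2.0 * r + k) + r * (s + k)

    lo, hi = r, s  # g(r) >= 0 and g(s) <= 0, so the root lies in [r, s]
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    alpha = math.sqrt(max(0.0, s / beta + beta / r - 2.0))
    return alpha, beta


# Hypothetical check: simulate from known parameters and recover them.
rng = random.Random(42)
simulated = [bs_sample(0.7, 160.0, rng) for _ in range(5000)]
alpha_hat, beta_hat = bs_mle(simulated)
```

Bisection is used because the likelihood equation changes sign exactly once between the harmonic and arithmetic means, so no starting value or derivative is needed.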

### Characteristic of household water consumption

According to the preliminary statistical analysis of the 2016 CFPS data, the histograms of household water consumption of families with one to four generations were high on the left and low on the right; that is, the peak of the frequency distribution was shifted to the left while the long tail extended to the right, showing a positively skewed (right-skewed) distribution (Figure 1). MATLAB was used to perform function fitting on the sample data. The closer the shape of the fitted curve was to the histogram of household water consumption, the higher the degree of fit. Four to five types of distributions with high degrees of fit were selected for each family type. For all the sample data, eight major distributions were used: the Birnbaum–Saunders, Burr, Gamma, Generalized Extreme Value, inverse Gaussian, Log-Logistic, lognormal, and t Location-Scale distributions. Because the sample data differed between family types, the selected models partly overlapped and partly differed. For example, the Gamma distribution fitted the data from families with four generations less well than it fitted the data of the other three family types. Here, the Gamma distribution was not removed because it was used only as an example. The optimal fitting model was then selected. Finally, the probability density distribution and cumulative probability distribution of household water consumption for families with one to four generations were obtained (Figures 2 and 3).
Figure 2

The probability density distribution of household water consumption for families with 1–4 generations.

Figure 3

The cumulative probability density distribution of household water consumption for families with 1–4 generations.


The Birnbaum–Saunders distribution, inverse Gaussian distribution, and lognormal distribution were chosen as the candidate distributions after comparing the degree of fit of various distributions of household water consumption.

### Analysis of distribution simulation

To obtain the optimal distribution, further optimization of the three candidate distributions was required. The specific optimization method is as follows:

First, the Monte Carlo method was used to perform repeated simulations based on the sample data of household water consumption. Then, the consistency between the sample data and the simulated data was tested using the two-sample Kolmogorov–Smirnov test and the two-independent-sample t-test. Five groups of Kolmogorov–Smirnov tests were conducted based on the data of household water consumption and the three candidate distributions; each group of simulations was performed 100 times. The large number of simulation cycles helped to ensure the stability and reliability of the calculation results. Table 4 shows the number of simulations with P ≥ 0.05. In each simulation, if P ≥ 0.05, the simulated distribution was accepted. Consequently, the degree of acceptance of a candidate distribution depended on the number of simulations with P ≥ 0.05. For example, the first entry in Table 4 indicates that 95 of the 100 simulations in the first simulation cycle yielded P ≥ 0.05 when the Birnbaum–Saunders distribution was applied to the sample data of families with one generation. Based on the principle that the higher the number of acceptances, the better the simulation result, the numbers of acceptances of the three candidate distributions differed very little and the degree of fit was high in all cases.
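
The acceptance-counting procedure can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the two-sample Kolmogorov–Smirnov p-value uses the asymptotic series, the observed data are simulated, and lognormal generators with hypothetical parameters stand in for the fitted candidate distributions.

```python
import math
import random


def ks_2samp_p(a, b, terms=100):
    """Approximate p-value of the two-sample Kolmogorov-Smirnov test."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))  # gap between the two empirical CDFs
    ne = n * m / (n + m)                # effective sample size
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:
        return 1.0
    q = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, terms + 1))
    return min(1.0, max(0.0, q))


def count_acceptances(sample, simulate, runs=100, level=0.05):
    """Count simulated samples that the K-S test cannot tell from `sample`."""
    accepted = 0
    for _ in range(runs):
        simulated = [simulate() for _ in range(len(sample))]
        if ks_2samp_p(sample, simulated) >= level:
            accepted += 1
    return accepted


# Hypothetical stand-in for observed household consumption (m3/year).
rng = random.Random(2016)
observed = [rng.lognormvariate(5.0, 0.6) for _ in range(300)]

# One cycle of 100 simulations for a well-matched and a poorly matched candidate.
good = count_acceptances(observed, lambda: rng.lognormvariate(5.0, 0.6))
bad = count_acceptances(observed, lambda: rng.lognormvariate(5.4, 0.6))
```

Each call to `count_acceptances` corresponds to one cell of Table 4; repeating the call gives the successive simulation cycles.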

Table 4

Number of acceptances for all three candidate distributions

| Family types | Distribution functions | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Families with one generation | B–S | 95 | 94 | 97 | 97 | 94 | 91 | 93 | 95 | 94 | 97 |
| | I–G | 93 | 91 | 89 | 97 | 91 | 87 | 89 | 93 | 96 | 90 |
| | Lognormal | 90 | 91 | 92 | 94 | 94 | 90 | 88 | 90 | 91 | 90 |
| Families with two generations | B–S | 94 | 94 | 89 | 87 | 96 | 94 | 95 | 91 | 92 | 93 |
| | I–G | 93 | 86 | 88 | 93 | 93 | 89 | 91 | 87 | 89 | 89 |
| | Lognormal | 93 | 90 | 93 | 88 | 89 | 95 | 91 | 91 | 94 | 91 |
| Families with three generations | B–S | 88 | 97 | 96 | 92 | 93 | 93 | 95 | 93 | 91 | 96 |
| | I–G | 94 | 95 | 90 | 88 | 94 | 89 | 92 | 95 | 91 | 97 |
| | Lognormal | 94 | 94 | 89 | 93 | 99 | 95 | 93 | 92 | 95 | 91 |
| Families with four generations | B–S | 100 | 97 | 95 | 99 | 98 | 98 | 93 | 97 | 97 | 94 |
| | I–G | 96 | 93 | 95 | 94 | 100 | 97 | 94 | 99 | 98 | 98 |
| | Lognormal | 95 | 98 | 97 | 94 | 96 | 100 | 97 | 98 | 94 | 99 |

For the three candidate distributions, the two-independent-sample t-test was used to compare the fits to the sample data for families with one to four generations. Table 5 shows the two-tailed significance coefficient, i.e., the P value, for each pair of candidate distributions. Here, PBI refers to the two-tailed significance coefficient of the independent t-test between the Birnbaum–Saunders distribution and the inverse Gaussian distribution, PBL to that between the Birnbaum–Saunders distribution and the lognormal distribution, and PIL to that between the inverse Gaussian distribution and the lognormal distribution. If the P value was less than 0.05, there was a significant difference between the two distributions. If the P value was less than 0.01, the difference was considered extremely significant. If the P value was greater than 0.05, there was no significant difference between the two distributions. Table 5 shows that, for the household water consumption samples of families with one generation, there was a significant difference between the Birnbaum–Saunders distribution and the inverse Gaussian distribution. The average number of simulated acceptances of the Birnbaum–Saunders distribution was 86.18, which was higher than that of the inverse Gaussian distribution (83.36). Thus, the Birnbaum–Saunders distribution was superior to the inverse Gaussian distribution. Because the value of PBL was 0.00 (less than 0.01), there was an extremely significant statistical difference between the Birnbaum–Saunders distribution and the lognormal distribution. As the average number of acceptances for the Birnbaum–Saunders distribution was higher than that for the lognormal distribution, the Birnbaum–Saunders distribution was regarded as superior to the lognormal distribution. For families with two generations, PBI was less than 0.05 and the number of simulated acceptances of the Birnbaum–Saunders distribution was higher than that of the inverse Gaussian distribution, which indicates that the Birnbaum–Saunders distribution was superior to the inverse Gaussian distribution. For families with three or four generations, the P values were all greater than 0.05, indicating that there was no significant difference between the three candidate distributions, and all three could be considered suitable. Based on the simulations for families with one or two generations, it can be concluded that the Birnbaum–Saunders distribution has the highest degree of fit for the distribution pattern of household water consumption across families with different numbers of generations. Thus, the Birnbaum–Saunders distribution was the optimal distribution in this study.
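
The pairwise comparison can be illustrated with the acceptance counts of one-generation families from Table 4. The sketch computes the pooled two-sample t statistic, which is an assumption, since the text does not say whether equal variances were assumed (with equal group sizes the Welch statistic is identical). The result is compared with the two-tailed 5% critical value for 18 degrees of freedom instead of computing an exact P value. The means here are taken directly over the ten cycles in Table 4 and need not coincide with the averages reported in Table 5.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Student's two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(a) - mean(b)) / se, df


# Acceptance counts for families with one generation, copied from Table 4.
bs_counts = [95, 94, 97, 97, 94, 91, 93, 95, 94, 97]  # Birnbaum-Saunders
ig_counts = [93, 91, 89, 97, 91, 87, 89, 93, 96, 90]  # inverse Gaussian

t_stat, df = pooled_t(bs_counts, ig_counts)

T_CRIT_18 = 2.101  # two-tailed 5% critical value of Student's t with 18 d.f.
significant = abs(t_stat) > T_CRIT_18
```

A statistic above the critical value, together with the higher mean acceptance count for the Birnbaum–Saunders distribution, points in the same direction as the PBI value reported for one-generation families.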

Table 5

The results of double independent t-test for the three candidate distributions

| | Families with one generation | Families with two generations | Families with three generations | Families with four generations |
| --- | --- | --- | --- | --- |
| Number of acceptances in the B–S distribution | 86.18 | 84.27 | 85.18 | 88.36 |
| Number of acceptances in the I–G distribution | 83.36 | 81.82 | 84.36 | 88.00 |
| Number of acceptances in the lognormal distribution | 82.82 | 83.36 | 85.27 | 88.36 |
| PBI | 0.02 | 0.04 | 0.49 | 0.70 |
| PBL | 0.00 | 0.39 | 0.94 | 1.00 |
| PIL | 0.61 | 0.13 | 0.44 | 0.69 |
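
The selection rule applied to Table 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numbers are copied from Table 5, while the dictionary layout and the names `significance` and `pairwise_winners` are hypothetical and not taken from the study.

```python
# Values copied from Table 5 (columns: one, two, three, four generations).
# The data layout and function names are illustrative assumptions.
ACCEPTANCES = {
    "B-S": [86.18, 84.27, 85.18, 88.36],
    "I-G": [83.36, 81.82, 84.36, 88.00],
    "lognormal": [82.82, 83.36, 85.27, 88.36],
}
P_VALUES = {
    ("B-S", "I-G"): [0.02, 0.04, 0.49, 0.70],        # PBI
    ("B-S", "lognormal"): [0.00, 0.39, 0.94, 1.00],  # PBL
    ("I-G", "lognormal"): [0.61, 0.13, 0.44, 0.69],  # PIL
}


def significance(p):
    """Classify a two-tailed P value with the thresholds used in the text."""
    if p < 0.01:
        return "extremely significant"
    if p < 0.05:
        return "significant"
    return "not significant"


def pairwise_winners(generation_index):
    """For one family type (0 = one generation), return the preferred
    distribution of each pair, or None when the pair does not differ."""
    winners = {}
    for (a, b), p_values in P_VALUES.items():
        if p_values[generation_index] < 0.05:
            acc_a = ACCEPTANCES[a][generation_index]
            acc_b = ACCEPTANCES[b][generation_index]
            # A significant difference: prefer the higher acceptance count.
            winners[(a, b)] = a if acc_a > acc_b else b
        else:
            winners[(a, b)] = None
    return winners


for index in range(4):
    print(index + 1, "generation(s):", pairwise_winners(index))
```

Under this rule the B–S distribution is preferred in every pair that differs significantly, and no pair differs for three- or four-generation families, which matches the reasoning in the text.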

Currently, the B–S distribution is widely used in reliability statistical analysis. Its characteristics are as follows: (a) it arises from the modelling of fatigue processes; (b) its density function is skewed to the right, which is consistent with the descriptive statistics of the household water consumption sample data and further illustrates the advantages of the B–S distribution in this context; (c) its failure probability function has an inverted bathtub shape; and (d) its scale parameter is its median.
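
Properties (b) and (d) can be checked numerically from the standard textbook form of the B–S cumulative distribution function. The sketch below assumes a shape parameter `alpha` and a scale parameter `beta`; the numerical values are hypothetical and are not the parameters fitted in this study.

```python
import math


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_cdf(x, alpha, beta):
    """Textbook B-S CDF: Phi((sqrt(x/beta) - sqrt(beta/x)) / alpha)."""
    if x <= 0:
        return 0.0
    z = (math.sqrt(x / beta) - math.sqrt(beta / x)) / alpha
    return std_normal_cdf(z)


def bs_mean(alpha, beta):
    """Closed-form mean of the B-S distribution: beta * (1 + alpha^2 / 2)."""
    return beta * (1.0 + alpha ** 2 / 2.0)


alpha, beta = 0.8, 150.0  # hypothetical values, for illustration only

# (d) Half of the probability mass lies below beta, so beta is the median.
print("F(beta) =", bs_cdf(beta, alpha, beta))

# (b) The mean exceeds the median, as expected for a right-skewed density.
print("mean > median:", bs_mean(alpha, beta) > beta)
```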

### Probability distribution and expectation of household water consumption

The probability distribution of annual household water consumption in families of different generations was calculated according to the characteristics of household water consumption and the probability density function of the Birnbaum–Saunders distribution (Figure 4). A close and inverted U-shaped relationship between household water consumption and the number of generations in the families was observed.

Figure 4. The distributions of probability of annual household water consumption for families with different generations.

When the number of generations in a family increased from one to three, household water consumption increased gradually. The highest household water consumption (215.16 t) was observed in families with three generations. When the number of generations increased further to four, household water consumption decreased sharply, and families with four generations had the lowest household water consumption (173.37 t). Thus, families with three generations can be regarded as high-consumption families, while families with four generations can be regarded as water-saving families. The household water consumption of families with one and two generations lies between that of the other two types of families.

The probability of high levels of water consumption rises from families with one generation to families with two generations, and then falls from families with two generations to families with four generations. The probability of high levels of water consumption was the highest in families with two generations (38%) and the lowest in families with four generations (0.06). It can be inferred that families in China currently tend towards households with fewer generations or a single generation living together.

Using the selected optimal B–S distribution and the household water consumption data for families with one to four generations, the expected values of the average water consumption per household and per generation were obtained for each number of generations. The expected value is the average outcome of repeated random trials under the same conditions, so it is equivalent to the average value. To a certain extent, this value can represent the household water consumption of each intergenerational family type. The equations are as follows:
(9)
(10)
where $f(x)$ is the probability density function of the B–S distribution, and $\varphi(\cdot)$ is the standard normal density function, namely $\varphi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^{2}/2}$. Here, x represents the sample data of household water consumption for families with one to four generations. The integrals of $x f(x)$ are the mathematical expectations of household water consumption, and Figure 5 shows the calculated results. The average water consumption per generation decreased gradually as the number of generations increased and reached its lowest value in families with four generations. The average water consumption per household increased gradually from families with one generation to families with three generations, where it reached its highest value. When the number of generations increased from three to four, the average household water consumption decreased significantly and reached its lowest value. The expected household water consumption of families with one to four generations is consistent with the probability distribution, indicating that the fitted Birnbaum–Saunders distribution is reasonable and feasible for describing the effect of the number of generations on household water consumption.

Figure 5. The expectations of household water consumption per household and per generation in families with different generations.
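
The expectation described above is the integral of x times the density. As a minimal sketch, the snippet below integrates the standard textbook B–S density numerically and compares the result with the known closed form β(1 + α²/2). The parameter values, the function names, and the truncation of the integral at 50β are assumptions made for illustration; they are not the fitted values or the procedure of this study.

```python
import math


def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def bs_pdf(x, alpha, beta):
    """Textbook B-S density: phi(z(x)) * dz/dx with
    z(x) = (sqrt(x/beta) - sqrt(beta/x)) / alpha."""
    if x <= 0:
        return 0.0
    z = (math.sqrt(x / beta) - math.sqrt(beta / x)) / alpha
    dz_dx = (x + beta) / (2.0 * alpha * math.sqrt(beta) * x ** 1.5)
    return std_normal_pdf(z) * dz_dx


def simpson(func, lower, upper, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (upper - lower) / n
    total = func(lower) + func(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lower + i * h)
    return total * h / 3.0


def bs_expectation(alpha, beta):
    """Numerical E[X]; the upper limit 50 * beta is an assumed cut-off
    that is adequate for moderate alpha (around 1 or below)."""
    return simpson(lambda x: x * bs_pdf(x, alpha, beta), 0.0, 50.0 * beta)


alpha, beta = 0.8, 150.0  # hypothetical values, for illustration only
print("numerical expectation:", bs_expectation(alpha, beta))
print("closed form          :", beta * (1.0 + alpha ** 2 / 2.0))
```

Agreement between the two printed values is a quick sanity check that the density and the integration limits are set up consistently before the same routine is applied to fitted parameters.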

In general, families with one and two generations were dominant in terms of household water consumption, with a cumulative probability of about 67.5%. However, the number of water-saving households was far less than that of households with high levels of water consumption. In conclusion, it seems that the current dominant family structure and household lifestyle are not conducive to water saving.

### Reliability test

To further verify the reliability of the fit of the simulation model, KS tests were conducted on the sample data and the optimized distribution for 30 groups and five cycles. Based on the Birnbaum–Saunders distribution, boxplots of the simulation data were obtained for the different family types. Figure 6 shows the boxplots of the sample data and the five simulation cycles, including the lower edge value, lower quartile, median, upper quartile, upper edge value, and significance level. The positions and heights of the lower edge value, lower quartile, median, upper quartile, and upper edge value of household water consumption in families with different generations were basically the same, and the values were also relatively close, which indicated that the results derived from the fitted distribution were relatively accurate. To analyze the differences between the sample data and the simulation data, the lower edge value, lower quartile, median, upper quartile, upper edge value, and significance level of the sample data of household water consumption for families with one to four generations were calculated and compared with the boxplots of the simulation data. Families with one generation are taken as an example here; the same analysis method applies to the other family types. The lower edge values of the simulation data and the sample data of household water consumption were all zero, indicating that there was no difference. The lower quartile of the sample data was 87.59, while that of the simulation data lay in the interval [85.79, 92.63]. The median of the sample data was 153.19, while the median interval of the simulation data was [143.25, 179.03]. The upper quartile of the sample data was 237.45, while the upper quartile interval of the simulation data was [221.15, 242.54]. The upper edge value was between [322.30, 420.29], and the lower edge value was between [18.19, 24.13].
The value of p is the significance probability of the Kolmogorov–Smirnov test, with p ∈ [0, 1]. When p ≥ 0.05, there are no significant differences between the distribution derived from the simulation and the overall distribution from which the samples were drawn, showing a high degree of fit and a good fitting effect. The value of p in all five simulation groups was greater than zero, indicating that the distribution was highly consistent with the overall distribution from which the sample data were derived. The distribution boxplots of household water consumption for different family types were analyzed, and it was found that the interval lengths of the lower quartile, upper quartile, median, lower marginal value, and upper marginal value of each simulated household water consumption were small and close to the sample data. When p > 0, the optimal Birnbaum–Saunders distribution is effective and feasible.

Figure 6. Boxplots for amount of household water consumption in families with different generations based on Birnbaum–Saunders distribution.
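
A reliability check of this kind can be sketched with a two-sample Kolmogorov–Smirnov test over five simulation cycles. The snippet below is an assumption-laden illustration rather than the study's procedure: the "sample" is itself drawn from a B–S distribution because the CFPS data are not reproduced here, the parameters and sample sizes are hypothetical, and the p value uses the standard asymptotic Kolmogorov series, which is only an approximation for finite samples.

```python
import bisect
import math
import random


def bs_sample(rng, alpha, beta, size):
    """Draw B-S variates via X = beta * (w + sqrt(w^2 + 1))^2,
    where w = alpha * Z / 2 and Z is standard normal."""
    draws = []
    for _ in range(size):
        w = 0.5 * alpha * rng.gauss(0.0, 1.0)
        draws.append(beta * (w + math.sqrt(w * w + 1.0)) ** 2)
    return draws


def ks_two_sample(a, b):
    """Return (D, approximate p) for the two-sample KS test."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    # The largest gap between the two empirical CDFs occurs at a data point.
    for value in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, value) / n
        cdf_b = bisect.bisect_right(b_sorted, value) / m
        d = max(d, abs(cdf_a - cdf_b))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return d, 1.0  # the asymptotic p value is 1 to machine precision
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))


rng = random.Random(2016)
alpha, beta = 0.8, 150.0                    # hypothetical parameters
sample = bs_sample(rng, alpha, beta, 300)   # stand-in for the survey sample

for cycle in range(1, 6):
    simulated = bs_sample(rng, alpha, beta, 300)
    d, p = ks_two_sample(sample, simulated)
    print(f"cycle {cycle}: D = {d:.3f}, p = {p:.3f}, accepted = {p >= 0.05}")
```

A cycle is counted as accepted when p ≥ 0.05, following the threshold stated in the text.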

Based on the 2016 CFPS data, this study classified the sample families according to the number of generations in a family. Then, a mathematical statistical method was used to compare the degree of fit between the samples of water consumption of various families and the distribution of various functions. Three candidate distributions were obtained. Finally, the optimal distribution was obtained by further comparison based on the test results. The probability distribution of the average annual household water consumption of the four types of families and the expected value of the household average and generation average were calculated using the probability density function of the optimal distribution.

The results of this study demonstrated that the Birnbaum–Saunders distribution is the optimal distribution for families with one to four generations. The average household water consumption of three-generation families was the largest, and that of four-generation families was the smallest. The number of water-saving households (four-generation families) was far less than that of households with high levels of water consumption (three-generation families). Families with one to two generations dominated household water consumption, and their water consumption accounted for about 67.5% of the total water consumption.

This study showed that family structure is an important factor influencing household water consumption, and that the Birnbaum–Saunders distribution provides the best fit for the relationship between family structure and household water consumption. The prediction of household water consumption is complicated. This study approaches it from the perspective of household structure and highlights the correlation between family structure and water consumption through statistical analysis and numerical simulation. Policymakers can use these predictions to devise effective policies that intervene in household water consumption and promote sustainable water development. For example, the results of this study indicate that the number of water-saving households was far less than that of households with high levels of water consumption. Therefore, the Chinese government needs to strengthen education and policy incentives for household water saving. In addition, the results show that the highest household water consumption was observed in families with three generations in China today. The Chinese government needs to pay attention to this phenomenon, which is related to many factors, including the impact of the one-child policy, the challenge of ageing, and the influence of China's traditional culture of the 'big family'. This study has some limitations that should be addressed in future research. First, the data in this study are from China. Whether the research conclusions are applicable to other countries, especially developing countries, needs to be verified with data from those countries. Second, the methods in this study are statistical analysis and mathematical modeling, which need to be verified by other methods. Finally, policy interventions are important factors that need to be evaluated further, such as the future impact of China's newly introduced three-child policy.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

All co-authors deeply mourn the passing of the first author Ms. Mei Wang. May her soul rest in peace and light forever.


## Author notes

deceased

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).