The study data comprise 290 groundwater samples collected from household wells in the coastal plain area of Vinh Linh and Gio Linh districts. The chemical composition of these samples was dominated by three constituents: calcium carbonate (CaCO3), calcium (Ca), and carbon dioxide (CO2). Other physico-chemical components (such as ammonia, magnesium, and iron oxide) were also present, but their contents were not significant in these samples. The three input variables are Ca, CO2, and CaCO3, which were measured at the 290 household wells of the two districts. Their statistical characteristics are given in Table 1: the standard deviation (St Dev), mean, minimum, maximum, and skewness (skew), all computed from the observations. The mean and standard deviation were 1.30 mg/l and 4.23 mg/l for CaCO3, 6.05 mg/l and 16.1 mg/l for Ca, and 0.79 mg/l and 2.42 mg/l for CO2, respectively. The skewness of a normal distribution is zero, and any symmetric data should have a skewness near zero. Negative skewness values indicate left-skewed data, and positive values indicate right-skewed data (Sahu et al. 2003; Brys et al. 2004). Here the skewness ranges from 1.56 to 2.06 (a dimensionless quantity), so all three variables are right-skewed; this level of skewness could be considered acceptable for prediction with these models.

The 290 input data patterns were randomly divided into two parts. The first part, about 70% of the entire dataset, was used for the training phase. The second part, the remaining 30% or so, was used for the test phase.

The methodology of this study is illustrated by the diagram in Figure 3 and can be summarized in three stages. First, the collected dataset is preprocessed and examined statistically, and the data are divided into training and testing sets. Second, the FFBB-PB, MARS, and DTR models are fitted to the training samples to obtain the best network parameters. Finally, the performance of the algorithms is compared using accuracy metrics in order to identify the most suitable forecasting model for the study.
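The random 70/30 partition described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text does not state how the random selection was implemented, so the function name `split_train_test`, the fixed seed, and the placeholder records are all assumptions.

```python
import random


def split_train_test(samples, train_fraction=0.7, seed=42):
    """Randomly partition samples into a training and a test subset."""
    indices = list(range(len(samples)))
    # A fixed seed is an assumption made here so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(indices)
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test


# Placeholder records standing in for the 290 well samples (not real data).
samples = [{"well_id": i} for i in range(290)]
train, test = split_train_test(samples)
print(len(train), len(test))  # 203 87
```

With 290 samples, a 70% training fraction gives 203 training patterns and 87 test patterns, which matches the "about 70%" and "about 30%" wording in the text.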
Table 1. Statistical characteristics of physico-chemical components data

Item | St Dev (mg/l) | Mean (mg/l) | Min (mg/l) | Max (mg/l) | Skew |
---|---|---|---|---|---|
CaCO3 | 4.23 | 1.30 | 0 | 25.80 | 2.06 |
Ca | 16.1 | 6.05 | 0 | 87.55 | 1.57 |
CO2 | 2.42 | 0.79 | 0 | 12 | 1.56 |
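The summary statistics reported in Table 1 could be computed along the following lines. This is a hedged sketch only: the source does not say which skewness estimator or which standard-deviation convention was used, so the moment-based (Fisher-Pearson) skewness and the sample standard deviation below are assumptions, and the concentration values are made up for illustration.

```python
import statistics


def skewness(values):
    """Moment-based (Fisher-Pearson) skewness: m3 / m2**1.5."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data: treat as symmetric
    return m3 / m2 ** 1.5


def describe(values):
    """Return the same five characteristics that Table 1 lists."""
    return {
        "st_dev": statistics.stdev(values),  # sample standard deviation (assumed)
        "mean": statistics.fmean(values),
        "min": min(values),
        "max": max(values),
        "skew": skewness(values),
    }


# Hypothetical concentrations in mg/l (not the study's measurements).
example = [0.0, 0.0, 0.5, 1.0, 1.5, 2.0, 12.0]
summary = describe(example)
print(summary["skew"] > 0)  # True: a long right tail gives positive skewness
```

A dataset with many small values and a few large ones, like the hypothetical example, yields a positive skewness, which is the same pattern as the positive values in Table 1.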
Figure 3. Flowchart of the experimental steps conducted in this study.