## Abstract

This study developed a neuro-fuzzy model with a Matlab Graphical User Interface (GUI) for calculating the biocoagulant quantity needed for turbid water clarification. A neuro-fuzzy network (NFN) was developed for each of three turbidity levels (low, medium and high). Experimental turbid water bioclarification data were used to model the NFN in the Matlab environment through a subtractive-clustering neuro-fuzzy function. The network consisted of four inputs (untreated water turbidity, untreated water pH, settling time and treated water turbidity) and Mango Kernel Coagulant (MKC) dosage as the output variable. The NFN architectures that produced the minimum percentage error were used to develop and implement the biocoagulant dosage calculator GUI. Results from the NFN-GUI calculator were compared with the experimental data; the dosage predictions had Root Mean Square Error (RMSE) and correlation coefficient ranges of 0.01–0.10 and 0.93–0.99 respectively. The high correlation coefficients indicate that the NFN-GUI calculator closely matches the traditional jar-test results. Therefore, the Matlab-based calculator template is able to predict the biocoagulant quantity needed in a community water bioclarification treatment unit.

## INTRODUCTION

Rapid urbanization in Nigeria has contributed to an increasingly large volume of particulate pollutants in the dam water that feeds municipal water treatment plants. Dissolved particles and colloids cannot be removed by traditional solid–liquid separation operations such as conventional sand and activated-carbon filtration techniques (Errais *et al.* 2010; Kinyua *et al.* 2016). Water treatment processes such as reverse osmosis, electrodialysis and membrane separation technology can remove the pollutants from untreated water, but they have disadvantages such as high installation, operation and maintenance costs (Rezvanpour *et al.* 2009). The Water Clarification (WC) treatment method is a preferable technique for eliminating colloids and suspended fine impurities from untreated water because it is cost-effective, easy to operate and less energy-demanding (Menkiti & Ezemagu 2015; Robert & Robert 2015).

WC is one of the major solid–liquid separation operations in drinking water treatment plants (WTP). Typical WC in WTP comprises coagulation, flocculation and settling processes; it has been widely used for turbidity removal and separation of organics/inorganics in both drinking water and wastewater treatment operations (Zainal-Abideen *et al.* 2012; Roli & Pradeep 2016). Coagulation is the dosing of an optimum quantity of coagulant into turbid water samples in order to destabilize the stable dissolved and colloidal particles therein, while flocculation enhances collision of the destabilized particles in order to form larger particles (Bello *et al.* 2014b). Coagulation and flocculation are characterized by fast and slow mixing respectively, while settling occurs under gravity. Wang *et al.* (2007), Sellami *et al.* (2014) and Lessoued *et al.* (2017) reported that pH, untreated water turbidity, coagulant dosage and nature, temperature and mixing rate are the operating variables that affect the coagulation–flocculation/settling process.

According to Deng & Lin (2017), determination of an optimum coagulant dosage in turbid water clarification is greatly considered necessary in an attempt to produce satisfactory clarified water qualities. Analysis of optimal coagulant dosage has found a wide range of applications in economic water operations and management as it maximizes manpower, water adaptation and guides the operators to control the high cost of coagulant. Traditional methods of determining coagulant dose rely greatly on manual calculation; these include the jar-test technique and automatic control by a Streaming Current Detector (SCD) (Dubey *et al.* 2017).

However, drawbacks associated with jar testing are the necessity to perform manual intervention, and that the tests are time-consuming and less adaptive to changes in raw water quality in real time. The SCD process also measures the net residual charge surrounding turbidity and colloidal particles in water (Lind 1994). These instruments require a set point to be entered which represents an optimum water-quality standard. Streaming-current values above the set point indicate an excess of coagulant, while values below the set point indicate insufficient coagulant for full WC to occur. Consequent upon this, a jar-test experiment must be conducted to determine the set point if raw water source characteristics are altered. In addition, the disadvantages associated with the SCD are its operation cost and its lack of adaptation to all types of raw water quality.

Due to the limitations found from the foregoing, it is therefore obvious that optimal modelling of coagulant dosage is desirable to overcome these inadequacies. Modelling of turbid water clarification is important because it enables the operator to understand, analyze, simulate and even predict the behavior of the system (Amir *et al.* 2016). Due to the complex and nonlinear nature of the WC process, the development of analytical models is a challenging and time-consuming process and also lacks accuracy in simulation. Menkiti & Ezemagu (2015) fitted turbid water bioclarification data into a theoretical model for the estimation of certain kinetic parameters for the process. However, the model lacks competency to describe the relationships between the coagulant dosage and independent variables. Black box modelling techniques which require no information about the mechanism of the process can be used to model the coagulant dosage needed for turbid water clarification. Various applications of these models have been reported for the prediction of synthetic coagulant dosing without a coagulant calculator template for the operator (Olanrewaju *et al.* 2012; Bello *et al.* 2014a).

Previous research in the water and wastewater treatment industries shows that utilization of synthetic coagulants is detrimental to human health and the environment (Menkiti *et al.* 2012). Recent studies have focused on the application of plant seeds containing polymeric substances, such as *Moringa oleifera*, *mucuna*, *Telfairia occidentalis* and other leguminous seeds, for turbid water bioclarification (Ali *et al.* 2010; Antov *et al.* 2010; Rachdi *et al.* 2017). These seeds are economically and medicinally valuable. Mango kernel, however, is a common seed in Nigeria that is readily available in large quantity and has no economic value attached to it, and hence was considered as a biocoagulant in this work. In the literature there is still a paucity of studies on the development of a biocoagulant dosage soft calculator template for determining Mango Kernel Coagulant (MKC) dosage to enhance the turbid water bioclarification process, and this is the gap our study aimed to fill.

## MATERIALS AND METHOD

### Experimental procedure and data collection

#### Data collection

Water samples for the experimental study were collected from the Ogbomoso Waterworks Treatment Plant (OWTP) water dam. The samples were classified into low, medium and high water turbidity levels. The statistics of each water turbidity level are presented in column 2 of Tables 1, 2 and 3 respectively. The samples were collected over 5 months (March–July 2014). Turbidity, pH, conductivity, temperature and total dissolved solids of the untreated water samples were measured using a 2100P Portable Turbidimeter and a Hanna instrument (HI98129). The characteristics of the river water samples are presented in Table 4.

Table 1. Statistics of the low turbidity water data

Statistical parameter | Raw water turbidity (NTU) | Raw water pH | Settling time (min) | Treated water turbidity (%) | Coagulant dosage (mg/l) |
---|---|---|---|---|---|
Min | 0.5 | 9.01 | 30 | 68.89 | 0.5 |
Max | 29.68 | 12 | 45 | 94.39 | 2.5 |
Mean | 16.13 | 11.02 | 32.649 | 79.3 | 1.50036 |
Standard deviation | 7.459 | 0.553 | 1.522 | 7.501 | 2.5 |
Number of data | 140 | 140 | 140 | 140 | 140 |

Table 2. Statistics of the medium turbidity water data

Statistical parameter | Raw water turbidity (NTU) | Raw water pH | Settling time (min) | Treated water turbidity (%) | Coagulant dosage (mg/l) |
---|---|---|---|---|---|
Min | 31.27 | 9.09 | 30 | 67.14 | 3.5 |
Max | 119.95 | 12 | 45 | 73.71 | 5.5 |
Mean | 74.96 | 11.054 | 32.47 | 67.16 | 3.57 |
Standard deviation | 25.99 | 0.575 | 1.45 | 3.40 | 1.13 |
Number of data | 140 | 140 | 140 | 140 | 140 |

Table 3. Statistics of the high turbidity water data

Statistical parameter | Raw water turbidity (NTU) | Raw water pH | Settling time (min) | Treated water turbidity (%) | Coagulant dosage (mg/l) |
---|---|---|---|---|---|
Min | 250.2 | 9.1 | 30 | 45.98 | 6.5 |
Max | 268.8 | 11.98 | 45 | 52.05 | 9.5 |
Mean | 260.4 | 10.95 | 32.4 | 46.06 | 6.75 |
Standard deviation | 5.4 | 0.94 | 1.45 | 2.5 | 1.08 |
Number of data | 140 | 140 | 140 | 140 | 140 |

Table 4. Characteristics of the river water samples

Parameter | Value |
---|---|
Turbidity | 0.5–268.89 NTU |
pH | 9.1–12 |
Total dissolved solids | 156–274 mg/l |
Electrical conductivity | 329–389 μ/m |
Temperature | 27–29 °C |

#### Batch bioclarification jar-test experiment

Here, $N_0$ is the initial (untreated) water turbidity and $N_t$ is the treated water turbidity at time $t$.

#### Neuro-fuzzy modeling

Among various combinations of techniques in soft computing, fuzzy logic and neuro-computing have the highest application leading to the neuro-fuzzy system (Jang 1993). Fuzzy modeling accounts for the hidden imprecision in data and performs accurate input–output mapping using fuzzy logic and rules (Bello *et al.* 2014a). This process implies fuzzification of the input variables through the Membership Function (MF), a curve that maps the input values to membership grades within the interval of 0 and 1. Fuzzy conditional statements are the building blocks of the Fuzzy Inference System (FIS); and they are useful for explaining the imprecise manners of human reasoning necessary to make decisions in fuzzy and imprecise scenarios or conditions.

In this study, the Takagi–Sugeno (TS) fuzzy rule was used; TS has fuzzy sets only in the antecedent part while the consequent part is expressed as a constant, linear or non-fuzzy equation of the input variables. The FIS is the nucleus part of fuzzy models that generates results based on the following steps (Jang 1993): find the membership grade of each linguistic value on the antecedent part by comparing the input variables with the MF; determine the firing strength or weight of each fuzzy rule by aggregating the membership grades on the antecedent parts; compute the qualified consequent of each fuzzy rule as a function of the firing strength; aggregate the qualified consequent to generate a single-valued output.
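The four inference steps listed above can be sketched in Python. This is a minimal illustration and not the Matlab implementation used in the study: it assumes Gaussian membership functions, a product operator for combining membership grades, and linear (first-order) consequents. The function names `gaussian_mf` and `ts_infer` and all parameter values are hypothetical.

```python
import numpy as np


def gaussian_mf(x, centre, sigma):
    """Membership grade in (0, 1] of x for a Gaussian fuzzy set (assumed MF shape)."""
    return np.exp(-0.5 * ((x - centre) / sigma) ** 2)


def ts_infer(x, centres, sigmas, coeffs):
    """Takagi-Sugeno inference for a single input vector.

    x       : (n_inputs,) input values
    centres : (n_rules, n_inputs) MF centres, one row per rule
    sigmas  : (n_rules, n_inputs) MF widths
    coeffs  : (n_rules, n_inputs + 1) linear consequents; last column is the constant
    """
    x = np.asarray(x, dtype=float)
    centres = np.asarray(centres, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    coeffs = np.asarray(coeffs, dtype=float)

    # Step 1: membership grade of every input in every rule's antecedent.
    grades = gaussian_mf(x, centres, sigmas)
    # Step 2: firing strength of each rule (product of its membership grades).
    firing = grades.prod(axis=1)
    # Step 3: consequent of each rule, to be weighted by its firing strength.
    rule_out = coeffs[:, :-1] @ x + coeffs[:, -1]
    # Step 4: aggregate into one crisp output (weighted average).
    return float((firing * rule_out).sum() / firing.sum())


# Two rules with constant consequents 1 and 3; the input is equidistant from both.
dose = ts_infer([1.0, 1.0],
                centres=[[0.0, 0.0], [2.0, 2.0]],
                sigmas=[[1.0, 1.0], [1.0, 1.0]],
                coeffs=[[0.0, 0.0, 1.0], [0.0, 0.0, 3.0]])
```

Because the two rules fire equally in this toy case, the output is the plain average of the two consequents. In a data-driven model the centres, widths and consequent coefficients would come from clustering and least squares estimation rather than being set by hand.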

#### Fuzzy model identification and fuzzy subtractive clustering

The fuzzy model can be estimated from the input and output data using an appropriate model identification algorithm. The algorithm for fuzzy identification includes the following procedures (Bello *et al.* 2014b): (i) fuzzy clustering; (ii) determination of appropriate cluster radius; (iii) determination of the consequent parts of the fuzzy rules by the least squares parameter estimation technique. The subtractive clustering technique was used to form clusters in the data and convert them into fuzzy rules. This clustering method is one of the automated data-driven-based methods for constructing the primary fuzzy models, proposed by Chiu (1994). It is a fast, one-pass algorithm for estimating the number of clusters and the cluster centres in a set of data. The number of rules and antecedent membership functions were achieved from this method by considering the centre of each cluster as a fuzzy rule.

Consider a collection of $m$ data points $\{x_1, \dots, x_m\}$ in an $N$-dimensional space (such as a unit hypercube). The algorithm assumes each data point to be a potential cluster centre and calculates a measure of potential for each of them according to Equation (1), following Chiu (1994):

$$P_i = \sum_{j=1}^{m} \exp\left(-\alpha \lVert x_i - x_j \rVert^2\right) \qquad (1)$$

where $\alpha = 4/r_a^2$ defines the neighbourhood radius for each cluster centre; $r_a$ is a positive constant called the cluster radius, $\lVert\cdot\rVert$ denotes the Euclidean distance, and $P_i$ is the density measure of data point $x_i$. A data point is considered to be a cluster centre when more data points are closer to it. Thus, after calculating the potential for each vector, the data point with the highest potential (density measure) is selected as the first cluster centre. Let $x_1^*$ be the centre of the first group and $P_1^*$ its potential. Then, the potential for each $x_i$ is reduced according to Equation (2):

$$P_i \leftarrow P_i - P_1^* \exp\left(-\beta \lVert x_i - x_1^* \rVert^2\right) \qquad (2)$$

where $\beta = 4/r_b^2$ and $r_b$ represents the radius of the neighbourhood for which considerable potential reduction will happen; $r_b$ is a positive constant that results in a measurable reduction in the density measures of neighbouring data points in order to avoid closely spaced cluster centres. Using Equation (2), the revised density measure of each point is obtained; the data point with the highest remaining density measure is assigned as the next cluster centre. The process is repeated until an adequate number of cluster centres has been generated. The cluster centres are representations of the system to be modelled and exhibit certain similar characteristics. The cluster information obtained is then used for determining the initial number of rules and the antecedent MF used for identifying the FIS. For every unique input vector, a membership degree greater than zero is computed for each fuzzy set, and therefore every rule in the rule base fires. This leads to the possibility of generating a small number of rules for describing the accurate relationship between input and output data (Lohani *et al.* 2006).
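The subtractive clustering procedure described above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the Matlab routine used in the study: the data are scaled to a unit hypercube, the squash radius is taken as `rb_factor * ra`, and a simplified stopping rule is used (stop once the highest remaining potential falls below `stop_ratio` times the first peak). The name `subtractive_clustering` and all default values are hypothetical.

```python
import numpy as np


def subtractive_clustering(X, ra=0.5, rb_factor=1.5, stop_ratio=0.15, max_centres=None):
    """Return cluster centres (rows of X) found by a simplified subtractive clustering."""
    X = np.asarray(X, dtype=float)
    # Scale every dimension to [0, 1] so that one radius applies to all inputs.
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    Z = (X - lo) / span

    alpha = 4.0 / ra ** 2                    # neighbourhood constant of the potential
    beta = 4.0 / (rb_factor * ra) ** 2       # neighbourhood constant of the reduction
    # Squared Euclidean distance between every pair of points.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    # Initial potential (density measure) of every point.
    potential = np.exp(-alpha * d2).sum(axis=1)

    first_peak = potential.max()
    limit = max_centres or len(Z)
    chosen = []
    while len(chosen) < limit:
        k = int(potential.argmax())
        peak = potential[k]
        if peak < stop_ratio * first_peak:   # simplified stopping rule (assumption)
            break
        chosen.append(k)
        # Reduce the potential of points near the new centre.
        potential = potential - peak * np.exp(-beta * d2[k])
    return X[chosen]


# Two tight, well-separated groups of points should give two centres.
group = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05]])
centres = subtractive_clustering(np.vstack([group, group + 10.0]), ra=0.5)
```

A smaller `ra` makes the potential more local and tends to produce more centres, and therefore more fuzzy rules, which is consistent with the longer run times the text reports at small radii.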

#### Cluster radius determination

To determine the appropriate $r_a$, cluster validity analysis is carried out by running the clustering algorithm for several values of $r_a$, starting from a small value and moving to a large value, with different initializations. The validity measure is calculated for each run, and the cluster radius which minimizes the validity measure is selected as the appropriate cluster radius. In this study, the prediction error between the true data $y$ and the predicted output $\hat{y}$ is used as the validity measure.
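The radius search can be sketched as a simple sweep. The snippet below is a hedged illustration only: it assumes the prediction error is measured as RMSE on a checking set, and it assumes a hypothetical `fit(X, y, radius)` interface that returns a `predict(X)` callable. Neither the interface nor the helper name `select_cluster_radius` comes from the study.

```python
import numpy as np


def select_cluster_radius(radii, fit, X_train, y_train, X_check, y_check):
    """Return (best_radius, best_error) over the candidate radii.

    fit(X, y, radius) is a hypothetical model builder returning predict(X).
    The validity measure is assumed to be the RMSE on the checking data.
    """
    y_check = np.asarray(y_check, dtype=float)
    best_radius, best_error = None, np.inf
    for radius in radii:
        predict = fit(X_train, y_train, radius)
        residual = y_check - np.asarray(predict(X_check), dtype=float)
        error = float(np.sqrt(np.mean(residual ** 2)))
        if error < best_error:
            best_radius, best_error = radius, error
    return best_radius, best_error


# Toy stand-in model: it always predicts the radius itself, so the error is
# smallest for the radius closest to the checking targets.
def toy_fit(X, y, radius):
    return lambda X_new: np.full(len(X_new), radius)


best_radius, best_error = select_cluster_radius(
    [0.3, 0.9, 1.0], toy_fit,
    X_train=[[0.0]], y_train=[0.9],
    X_check=[[0.0], [1.0]], y_check=[0.9, 0.9],
)
```

In practice `fit` would build and train the neuro-fuzzy model for the given radius; the toy model only demonstrates the selection logic.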

#### Consequent parameters estimation

#### User interface calculator development

The Matlab Graphical User Interface (GUI) development environment provides a set of tools for creating GUIs through a user interface laying out procedure and programming codes within the environment. The layout procedure was used to create GUI components and also created menus, context menus, and the size of the GUI. Consequently, Matlab automatically generated an M-file that controls the behavior of the GUI. Programming codes were written within the M-file, which initializes the user interface, as well as incorporating NFN models into the interface. The codes contain partly a framework for the interface callbacks – the routine functions that execute in response to instruction from input parameters. The GUI in this study allows users to process turbid water clarification input data and to calculate coagulant dosage immediately after data acquisition. The methodology used for NFN-coagulant dosage GUI development is depicted in Figure 1: data preparation, data training and checking, ANFIS setting (specification of cluster radius and epoch number), model evaluation, GUI testing, numeric output and comparative analysis.

#### Neuro-fuzzy network performance metrics

The performance of the NFN was evaluated using the correlation coefficient (*R*) and the RMSE.
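For reference, the two metrics can be computed as in the short Python sketch below. It assumes the standard definitions: RMSE as the square root of the mean squared difference between measured and predicted dosage, and the correlation coefficient as the Pearson coefficient. The function names are hypothetical, and the study may have used a different form of the correlation measure.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def correlation_coefficient(y_true, y_pred):
    """Pearson correlation coefficient R (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    dt = y_true - y_true.mean()
    dp = y_pred - y_pred.mean()
    return float((dt * dp).sum() / np.sqrt((dt ** 2).sum() * (dp ** 2).sum()))
```

An RMSE close to zero and a correlation coefficient close to 1 both indicate that the predicted dosages track the measured ones closely.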

## RESULTS AND DISCUSSION

### NFN development and validation

The behaviour and performance of the NFN depend mainly on the number of training iterations (epoch number) and the value assigned to the radius of influence in data clustering (Araromi 2011; Bello *et al.* 2014a). Figure 2(a) and 2(b) show the MF of the network for input 1 in low turbidity water clarification. At a low radius of influence such as 0.3 (Figure 2(a)), many clusters were formed, which increased the computational time and made simulation under these settings computationally cumbersome. In contrast, at a high radius of influence such as 0.9 (Figure 2(b)), fewer clusters were formed, which reduced the computational time of the NFN simulation. Therefore, the subsequent NFN simulations (low, medium and high turbidity water clarification) were carried out at high values of the radius of influence.

The NFN simulation results are plotted in Figures 3 and 4; the figures show the behaviour of the model performance indexes (RMSE and *R*^{2}) with varying radii of influence. Figure 3 shows that as the radius of influence in the NFN increased, the RMSE rose and fell in an oscillating manner, with the minimum RMSE obtained at a radius of influence of 0.94. Figure 4 shows the effect of the radius of neighbourhood influence on the correlation coefficient of the NFN. As the radius of influence increased, *R*^{2} likewise rose and fell in an oscillating manner, with the maximum *R*^{2} obtained at a radius of influence of 0.94. For radii of influence from 0.3 to 0.7, the RMSE of the NFN model ranged from 2.2 to 2.5, which is relatively high, with a corresponding *R*^{2} value of 0.490. The RMSE and *R*^{2} results at these conditions indicated poor predictions by the NFN (figures for these are not shown).

Bio-clarification data obtained from jar-test experiments for medium turbidity water treatment were used for training and checking at different numbers of iterations, as shown in Table 5. NFN simulation was performed at radii of neighbourhood influence in the range 0.3–1.0 for various training iterations. The MF of the NFN for medium turbidity water treatment at a radius of 0.3 and an iteration number of 50 generated many clusters. This simulation took about 15 minutes to complete the iterations and gave poor predictive results. All the other simulation runs were completed in less than 40 seconds, with good predictive results as shown in Table 5. The reliability and adequacy of the network were evaluated based on the correlation coefficient, which exceeded 0.97 at every radius of influence except 0.3 and 0.88, indicating good model estimation. The correlation coefficient shows the relationship between the NFN predicted and actual values. A correlation coefficient of 1 indicates a perfect match of the model (Olanrewaju *et al.* 2012).

Table 5. NFN simulation results for medium turbidity water treatment

Radius of influence | Epoch number | R-square value | RMSE |
---|---|---|---|
0.3 | 200 | 0.163 | 1.139 |
0.84 | 100 | 0.97684 | 0.0632 |
0.86 | 100 | 0.97331 | 0.0672 |
0.88 | 150 | 0.49771 | 0.5803 |
0.9 | 150 | 0.98489 | 0.0505 |
0.92 | 200 | 0.98696 | 0.047 |
0.94 | 50 | 0.98258 | 0.0542 |
0.96 | 100 | 0.9748 | 0.0652 |
0.98 | 150 | 0.98385 | 0.0522 |
1 | 200 | 0.98813 | 0.0448 |

Meanwhile, the minimum RMSE of 0.0448 implies that the error is negligible, which suggests that high accuracy was achieved in predicting the biocoagulant dosage for medium turbidity water. Table 6 shows the biocoagulant dosage NFN simulation for high turbidity water treatment at different epochs and radii of influence. Good prediction of dosage for high turbidity water treatment was obtained at a radius of influence of 0.9 and an epoch number of 150, with *R*^{2} of 0.9602 and RMSE of 0.03058, as shown in Table 6.

Table 6. NFN simulation results for high turbidity water treatment

Radius of influence | Epoch number | R-square value | RMSE |
---|---|---|---|
0.82 | 50 | 0.57576 | 0.6907 |
0.84 | 100 | 0.53167 | 0.7325 |
0.86 | 100 | 0.46107 | 0.8237 |
0.88 | 150 | 0.63276 | 0.6366 |
0.9 | 150 | 0.9602 | 0.03058 |
0.92 | 200 | 0.711 | 0.6474 |
0.94 | 50 | 0.57199 | 0.6964 |
0.96 | 100 | 0.57202 | 0.6995 |
0.98 | 150 | 0.64229 | 0.6233 |
1 | 200 | 0.64176 | 0.6238 |

The Matlab-based NFN-GUI calculator, as shown in Figure 4, is a user-friendly interface. It takes untreated water turbidity, untreated water pH, settling time and treated water turbidity as input parameters for calculating the output (biocoagulant dosage). The input data, obtained from the jar-test experiments, were fed into the calculator and used to calculate the coagulant dosage needed for clarification of low, medium and high turbidity water. The developed coagulant dosage calculator (ANFIS-GUI) provides a simple user interface for modelling the coagulant dose during river water clarification. To the authors' knowledge, no such GUI has previously been developed for calculating coagulant dosage, and it can be used by end users in any field to calculate the coagulant quantity needed for turbid water clarification. The GUI shown in Figure 4 is the completed version of the coagulant calculator ANFIS-GUI, which has been fully developed on the Matlab platform to calculate coagulant dosage for a corresponding treated water turbidity. Because users do not have to write program code for the process, the system lowers the barrier to simulation for researchers. The calculator has been calibrated and implemented in available Matlab software versions and the results are satisfactory. However, future work is still needed to refine the underlying models with new data obtained from different water treatment plants, and a mobile application version of the GUI is also necessary for proper deployment of the outputs of this research project. The developed calculator is data-driven; thus, it can accommodate other input–output WC data for estimating coagulant dosage.
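One way such a calculator could route a request to the model for the matching turbidity level is sketched below in Python. This is an assumption made for illustration, not a description of the authors' Matlab code: the turbidity ranges are taken from Tables 1–3, while the names `turbidity_level` and `calculate_dosage` and the `models` mapping of level names to prediction callables are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Raw water turbidity ranges (NTU) of the three data sets, from Tables 1-3.
TURBIDITY_RANGES: Dict[str, Tuple[float, float]] = {
    "low": (0.5, 29.68),
    "medium": (31.27, 119.95),
    "high": (250.2, 268.8),
}

# A model maps (turbidity, pH, settling time, treated turbidity) to a dosage.
Model = Callable[[float, float, float, float], float]


def turbidity_level(turbidity_ntu: float) -> str:
    """Return the level whose experimental range contains the given turbidity."""
    for level, (lowest, highest) in TURBIDITY_RANGES.items():
        if lowest <= turbidity_ntu <= highest:
            return level
    raise ValueError(f"turbidity {turbidity_ntu} NTU is outside the experimental ranges")


def calculate_dosage(models: Dict[str, Model], turbidity_ntu: float, ph: float,
                     settling_time_min: float, treated_turbidity_pct: float) -> float:
    """Pick the level-specific model and return its predicted dosage."""
    model = models[turbidity_level(turbidity_ntu)]
    return model(turbidity_ntu, ph, settling_time_min, treated_turbidity_pct)


# Placeholder models returning fixed values; trained NFN models would go here.
stub_models: Dict[str, Model] = {
    "low": lambda t, ph, st, tt: 1.5,
    "medium": lambda t, ph, st, tt: 3.5,
    "high": lambda t, ph, st, tt: 6.5,
}
example = calculate_dosage(stub_models, 74.96, 11.0, 35.0, 70.0)
```

Rejecting inputs outside the experimental ranges reflects the fact that a data-driven model is only reliable within the range of the data it was trained on.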

The first ten turbidity values for low, medium and high turbidity water were used to validate the developed NFN-calculator. The calculator was set and operated at untreated water pH 9, desired turbidity 98% and settling time 35 minutes, as shown in the input keys of Figure 5. These settings were chosen so that the calculator outputs could be compared with the experimental data. However, the calculator is flexible enough to accommodate other data within the range of the experimental data shown in Tables 1, 2 and 3 respectively. Figure 6(a)–6(c) compare the NFN-calculator and jar-test apparatus data for low, medium and high turbidity water. The outputs of the coagulant dosage calculator for all turbidity levels are very close to the experimental data. Correlation coefficients between the experimental data and the Matlab GUI calculator for low, medium and high turbidity water are 0.9776, 0.984 and 0.938 respectively. These values are very close to 1; the closer the correlation value is to unity, the closer the calculator outputs are to the experimental data.

The GUI results obtained from this study were compared with previous investigations (Sivarao *et al.* 2009; Bertone *et al.* 2016, 2018). In terms of *R*^{2}, the present investigation compares favourably with most existing works: this study gave *R*^{2} values in the range 0.93–0.99, whereas the earlier works reported lower values, except Sivarao *et al.* (2009), which gave a better *R*^{2} than the current research. Therefore, the Matlab GUI soft calculator for estimating the coagulant dosage needed for turbid water clarification is reliable and dependable. The agreement shown in Figure 6 means that the developed calculator can be used to estimate the quantity of coagulant needed for turbid water treatment in a community water clarifier without carrying out daily jar-test experiments.

## CONCLUSION

A user-friendly Matlab-based calculator for biocoagulant dosage has been developed and implemented successfully. The calculator provides a rapid method to process raw water characteristic data and obtain coagulant dosage information for turbid water clarification. It is also useful for real-time monitoring and control of the coagulant quantity needed for turbid water clarification operations. This work also demonstrated the advantages of the NFN-GUI dosing interface over jar-test experiments. It could therefore help to reduce operational cost, reduce human error in dosing and ensure the quality of the treatment process. Thus, this study will support daily operation and optimization of turbid water bioclarification for community water supply.

## REFERENCES

*PhD Thesis*

**47**(3), 3985–3991

**47**(3), 370–376

Hybrid Modelling and Fuzzy Control of Reactive Distillation Process