Permanent monitoring of environmental issues demands efficient, accurate, and user-friendly pollutant prediction methods, particularly from operating variables. In this research, the efficiency of multiple polynomial regression in predicting the adsorption capacity of caffeine (*q*) by multi-walled carbon nanotubes (MWCNTs) in batch-mode experiments was investigated. The MWCNTs were characterized by scanning electron microscopy, Fourier transform infrared spectroscopy and point of zero charge. The results confirmed that the MWCNTs have a high capacity to take up caffeine from wastewater. Five parameters, namely pH, reaction time (t), adsorbent mass (M), temperature (T) and initial pollutant concentration (C), were selected as model inputs and *q* as the output. The results indicated that the multiple polynomial regression employing C, M and t was the best model (normalized root mean square error = 0.0916 and R^{2} = 0.996). The sensitivity analysis indicated that the predicted *q* is most sensitive to C, followed by M and t. The results indicated that pH and temperature have no significant effect on the adsorption capacity of caffeine in batch-mode experiments. The model slightly overestimated *q*. This study demonstrated that multiple polynomial regression can be an accurate and faster alternative to the more difficult and time-consuming models available for *q* prediction.

## INTRODUCTION

Caffeine (CFN), with chemical formula C_{8}H_{10}N_{4}O_{2}, is an alkaloid belonging to the methylxanthine group and is naturally present in foods including tea, coffee, kola nuts and cacao beans. Caffeine acts not only as a central nervous system stimulant in humans but also as a natural pesticide for the plant (Pavel *et al.* 2003). Caffeine is released into the aquatic environment and has been found in surface water, ground water and wastewater effluents in large concentrations (∼10 g L^{−1}) (Glassmeyer *et al.* 2005). Because caffeine is difficult to metabolize and is commonly found in ground water and surface water, it has been proposed as a chemical indicator of environmental pollution. For instance, zebra fish embryos could not survive when the dose of caffeine in water was greater than 300 mg L^{−1} (Chen *et al.* 2008). The literature also indicates that this substance is toxic to most aquatic organisms (Pollack *et al.* 2009). Caffeine can damage soil fertility, hinder seed germination and lower the growth of seedlings (Pollack *et al.* 2009). Thus, elimination of excess caffeine from different water sources is necessary.

Different chemical and physical technologies have been employed to eliminate this substance from water, including coagulation/flocculation, ion exchange, oxidation/reduction, membrane separation, biological treatment and adsorption (Ma *et al.* 2012). Traditional procedures applied for caffeine elimination are either expensive or involve the use of toxic organic materials. Moreover, as traditional wastewater treatments do not degrade caffeine, it is important to look for alternatives (Álvarez-Torrellas *et al.* 2016). Adsorption is one of the best wastewater treatments because of its simplicity, ease of operation, high efficiency and suitability for small-scale household units; thus, adsorption techniques are commonly applied. Adsorbents must have suitable characteristics including high adsorption capacity, surface area, selectivity, lifetime and capacity for regeneration (Arshadi *et al.* 2014).

Nowadays, a great deal of interest is focused on the use of nano-structured materials such as carbon nanotubes (CNTs) as adsorbents to eliminate harmful and toxic organic pollutants from aqueous media. CNTs, which were described by Iijima (1991), are among the most commonly investigated carbon nano-materials and can act as good adsorbents for the removal of toxic pollutants owing to their layered and hollow structure and their large surface area (Tan *et al.* 2012). CNT materials can be divided into three groups: functionalized CNTs, single-wall CNTs and multi-wall CNTs (MWCNTs) (Yu *et al.* 2014). MWCNTs have been successfully applied in the removal of many pollutants from wastewater, such as methyl red (Ghaedi & Kokhdan 2012), methyl orange (Hosseini *et al.* 2011), Eriochrome Cyanine R (Ghaedi *et al.* 2011), and pharmaceuticals and personal care products (Wang *et al.* 2016). However, examples of the use of MWCNTs for caffeine removal are rather scarce in the literature.

The main factors affecting the adsorption reaction, including pH, reaction time (t), adsorbent mass (M), temperature (T) and initial pollutant concentration (C), have been investigated and analyzed using the equations governing the adsorption process, such as adsorption isotherm and/or adsorption kinetic models. In these equations only one factor is varied while the others are held constant. It is better to use an equation that encompasses all factors affecting the adsorption process, that is, to investigate the effects of all factors simultaneously. Several researchers have employed artificial intelligence models to predict the adsorption efficiency of pollutants from aqueous media. These models include artificial neural networks (Yetilmezsoy & Demirel 2008; Gamze Turan *et al.* 2011a, 2011b; Amiri *et al.* 2013a), adaptive neural-based fuzzy inference systems (Amiri *et al.* 2013b), wavelet neural networks and support vector regression (Mousavi *et al.* 2012). However, in many prediction problems, including adsorption processes, polynomial regression has performed at least as well as, if not better than, the artificial intelligence models. The main purpose of this research was to model caffeine removal from aqueous media by MWCNTs using multiple polynomial regression with interaction effects.

## MATERIALS AND METHODS

### Adsorbate and adsorbent

Caffeine (i.e., 1,3,7-trimethylpurine-2,6-dione) of analytical purity was purchased from Sigma–Aldrich Co. (Germany) and applied in the experiments directly without any further purification. Suitable concentrations of caffeine solutions were prepared by diluting a stock solution with deionized water. MWCNTs were supplied by Nanocyl Co. (Belgium) and used as the adsorbent. All other chemicals were purchased from Merck (Germany). Some of the physical characteristics of the MWCNTs are given in Table 1.

| Items | MWCNTs |
|---|---|
| Diameter | 10–20 nm |
| Length | 30 μm |
| Purity | >95 wt% |
| Ash | <1.5 wt% |
| Specific surface area (m^{2} g^{−1}) | 200 |
| Density (g cm^{−3}) | 2.1 |


### Characterization techniques

The pH of the solution was adjusted with 0.1 M HCl/NaOH and measured with a pH meter (Metrohm, 827 pH Lab). The zero point charge (pH_{ZPC}) of the MWCNTs was measured with the solid addition method (Balistrieri & Murray 1981). Nitrogen (99.999%) adsorption tests were carried out at −196 °C using a volumetric apparatus (Quantachrome NOVA automated gas sorption analyzer). The concentrations of caffeine solutions were obtained using a UNICO-2100 UV-Vis spectrophotometer at the wavelength of maximum absorbance, *λ*_{max} = 275 nm. The functional groups present in the MWCNTs were investigated using Fourier transform infrared (FTIR) spectroscopy. FTIR spectra were recorded on KBr pellets using a Jasco FT/IR-680 plus spectrophotometer. The size and surface morphology of the MWCNTs were determined using a scanning electron microscope (SEM) (MIRA3TESCAN-XMU).

### Experimental procedure

The bottles containing the mixture of caffeine solution with MWCNTs were shaken at 120 rpm using a SEBD001 rotary orbital shaker. In the final stage of the batch experiments, the solutions were filtered using filter paper No. 42. The concentration of residual caffeine in the remaining solution was determined using a UV/Vis spectrophotometer. The adsorption capacity (*q*) (mg g^{−1}) of caffeine using MWCNTs was computed using Equation (1):

$$
q = \frac{(C_0 - C_e)\,V}{m} \qquad (1)
$$

where *C*_{0} and *C*_{e} are the initial concentration of caffeine in solution and the concentration at equilibrium (mg L^{−1}), respectively, *V* is the volume of solution (L) and *m* is the adsorbent mass (g).
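As a worked illustration of the adsorption capacity calculation, the short Python sketch below assumes the standard batch mass balance, q = (C0 − Ce)·V/m, with the symbols defined above. The function name `adsorption_capacity` and the numerical values are hypothetical and are not taken from the experiments.

```python
def adsorption_capacity(c0_mg_per_l, ce_mg_per_l, volume_l, mass_g):
    """Return q (mg/g) from the batch mass balance q = (C0 - Ce) * V / m."""
    if mass_g <= 0:
        raise ValueError("adsorbent mass must be positive")
    return (c0_mg_per_l - ce_mg_per_l) * volume_l / mass_g


# Hypothetical example: 50 mL of a 20 mg/L solution, 5 mg/L left at
# equilibrium, and 0.01 g of adsorbent.
q = adsorption_capacity(c0_mg_per_l=20.0, ce_mg_per_l=5.0, volume_l=0.05, mass_g=0.01)
print(f"q = {q:.1f} mg/g")
```

Keeping the units explicit in the argument names (mg L^{−1}, L, g) makes it harder to mix up, for example, millilitres and litres when entering batch data.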

### Multiple polynomial regression with interaction effects

Statistical procedures such as multiple polynomial regression are among the best tools for studying relationships between a response and independent variables when sample sizes are small (Razi & Athappilly 2005). The combinations of factor levels were analyzed by multiple polynomial regression using Minitab 17 software. Multiple polynomial regression is a procedure applied to model the relationship between one or more independent parameters and a response variable. To create this regression model, it was necessary to determine the critical parameters, given that the best model must have the fewest necessary parameters and maximum accuracy. In this research, *q* was the response variable and t, C, M, pH and T were continuous predictors. To implement the multiple polynomial regression, the measured data set of 114 caffeine samples was divided into two parts: the first group (86 observations, 75% of the data) was randomly selected for building the model, and the second group (28 observations, 25% of the data) was used for testing the model.
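The random 86/28 partition described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure used in Minitab; the function name `split_indices` and the fixed seed are assumptions made here for reproducibility.

```python
import random


def split_indices(n_total=114, n_train=86, seed=0):
    """Randomly partition sample indices into training and testing sets."""
    indices = list(range(n_total))
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the split reproducible
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


train_idx, test_idx = split_indices()
print(len(train_idx), len(test_idx))
```

Splitting by index rather than by value keeps each observation's t, C, M, pH, T and *q* together, so the same rows can be pulled from the data table for fitting and for testing.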

### Goodness of fitted model

… (*R*^{2}), normalized root mean square error (NRMSE) and mean residual error (MRE). These indices are described by Willmott *et al.* (1985); they are computed from the measured and model-estimated amounts of caffeine adsorbed (mg g^{−1}), the means of the measured and predicted values, and the number of measurements, *n*. A lower NRMSE value and a higher *R*^{2} value indicate better agreement between the measured and estimated caffeine adsorption data. An NRMSE of less than 10% indicates an accurate model, and the closer the NRMSE is to zero, the higher the precision of the results. The prediction is considered poor if the NRMSE is higher than 30%, fair if the NRMSE is higher than 20% and lower than 30%, good if the NRMSE is higher than 10% and lower than 20%, and excellent if the NRMSE is lower than 10% (Jamieson *et al.* 1991). MRE is a measure of estimation bias, with positive and negative MRE values demonstrating overestimation and underestimation, respectively.
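The following Python sketch shows one way these indices could be computed. The exact formulas are assumptions made here, since they are not reproduced in the text above: *R*^{2} is taken as the squared Pearson correlation between measured and predicted values, NRMSE as the root mean square error divided by the mean of the measured values, and MRE as the mean of (predicted − measured). The function names and the example data are hypothetical.

```python
import math


def fit_indices(measured, predicted):
    """Return (R2, NRMSE, MRE) for paired measured/predicted values."""
    n = len(measured)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    # R2 taken as the squared Pearson correlation (assumed form).
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov ** 2 / (var_m * var_p)
    # NRMSE taken as RMSE divided by the mean of the measured values (assumed form).
    rmse = math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)
    nrmse = rmse / mean_m
    # MRE taken as the mean of (predicted - measured); positive means overestimation.
    mre = sum(p - m for m, p in zip(measured, predicted)) / n
    return r2, nrmse, mre


def rate_nrmse(nrmse):
    """Map NRMSE (as a fraction) to the four performance classes quoted in the text."""
    if nrmse < 0.10:
        return "excellent"
    if nrmse < 0.20:
        return "good"
    if nrmse < 0.30:
        return "fair"
    return "poor"


# Hypothetical data: predictions 5% above the measurements.
q_m = [10.0, 20.0, 30.0, 40.0]
q_p = [1.05 * v for v in q_m]
r2, nrmse, mre = fit_indices(q_m, q_p)
print(round(r2, 4), round(nrmse, 4), round(mre, 4), rate_nrmse(nrmse))
```

In this toy case the predictions are an exact multiple of the measurements, so the correlation is perfect while MRE is positive, which illustrates why a bias measure is reported alongside *R*^{2}.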

## RESULTS AND DISCUSSION

### Characterization of MWCNTs

… μm in length. The well-developed porous structure can be observed. Figure 1(a) clearly shows the MWCNTs, but after adsorption (Figure 1(b)) the surface is tarnished, suggesting that a thin layer of caffeine covers the MWCNTs without changing their morphology.

… cm^{−1} is the most intense band. This band is related to the presence of C═C in the aromatic rings of caffeine molecules and corresponds to the stretching vibration of C═O bonds.

… cm^{−1} is characteristic of the alcohol O-H stretch and amine N-H stretch groups present on the adsorbent surface. In the FTIR spectra of the MWCNTs, there is a small peak at 2,927.8 cm^{−1} assigned to the carboxylic acid O-H stretch and alkyl C-H stretch groups. These peaks show minor chemical reactions between the caffeine molecules and the MWCNT surfaces. The peak at 1,402.18 cm^{−1} is characteristic of aromatic C-H bending. The physical properties of the MWCNTs show that the surface area and mean pore diameter are 200 m^{2} g^{−1} and 3.46 nm, respectively. The large specific surface area and average pore size of the MWCNTs increase the accessibility of the active sites of the adsorbent in contact with contaminated water.

… pH_{ZPC} of the MWCNTs were measured. The results of the pH_{ZPC} determination are shown in Figure 4. The pH_{ZPC} of the MWCNTs was estimated to be about 3.82, which indicates that the surface of the adsorbent is negatively charged when the solution pH is above this value. The pH of the solution determines not only the predominant species in the solution but also the net charge on carbonaceous materials. This aspect has been studied by Ayranci *et al.* (2005) for the adsorption of phthalic acid and its esters onto activated carbon cloth.

Several studies have sought to elucidate the mechanism of adsorption of many molecules on different adsorbents. Those publications reveal that adsorption of organic molecules from dilute aqueous solutions on carbon materials is a complex interplay between electrostatic and non-electrostatic interactions, and that both interactions depend on the characteristics of the adsorbent and adsorbate, as well as on the chemical properties of the solution (Moreno-Castilla 2004). As a similar carbon material, MWCNTs can be considered effective in removing organic contaminants.

### Multiple polynomial regression study

#### Training the model
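
For orientation, a full third-order polynomial model with all pairwise interactions in the five predictors can be written in the generic form below. The notation is an assumption made here ($x_1,\dots,x_5$ standing for t, C, M, pH and T, $\beta$ for the regression coefficients and $\varepsilon$ for the error term); it is a sketch of the model class, not the fitted equation.

$$
q = \beta_0 + \sum_{i=1}^{5}\beta_i x_i + \sum_{i=1}^{5}\beta_{ii} x_i^{2} + \sum_{i=1}^{5}\beta_{iii} x_i^{3} + \sum_{i=1}^{4}\sum_{j=i+1}^{5}\beta_{ij} x_i x_j + \varepsilon
$$

Written this way, the model has 5 linear, 5 quadratic, 5 cubic and 10 interaction terms in addition to the intercept.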

As can be seen in Equation (5), this model contains linear effects (t, C, M, pH, T), quadratic effects (t^{2}, C^{2}, M^{2}, pH^{2}, T^{2}), cubic effects (t^{3}, C^{3}, M^{3}, pH^{3}, T^{3}), and interaction effects (t*C, t*M, t*pH, t*T, C*M, C*pH, C*T, M*pH, M*T, pH*T). However, not all of these effects are significant, and the non-effective parameters were removed step by step. The final model is parsimonious and retains maximum accuracy. Notes 1 and 2 were used to determine the final model:

Note 1 – In polynomial regression, when a cubic effect is significant, the corresponding linear and quadratic effects must be retained in the model (whether significant or not). Likewise, when a quadratic effect is significant, the corresponding linear effect must be retained in the model (whether significant or not).

Note 2 – In regression with interaction effects, when an interaction effect is significant, both of its linear effects must be retained in the model (whether significant or not).
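The selection rule in Notes 1 and 2 can be sketched as follows. This Python example is an illustration under stated assumptions, not the Minitab procedure: a term is treated as protected whenever any higher-order term containing it is still in the model, the significance threshold is taken as 0.05, and the helper names (`parse_term`, `removal_candidates`, `next_term_to_remove`) are hypothetical. The *P*-values are those reported for the first run in this section.

```python
def parse_term(term):
    """Turn 't*C' or 'pH^3' into a {variable: power} mapping."""
    powers = {}
    for factor in term.split("*"):
        name, _, exp = factor.partition("^")
        powers[name] = powers.get(name, 0) + (int(exp) if exp else 1)
    return powers


def is_lower_order_of(a, b):
    """True if term a is a strict lower-order component of term b."""
    pa, pb = parse_term(a), parse_term(b)
    return pa != pb and all(pb.get(var, 0) >= k for var, k in pa.items())


def removal_candidates(pvalues, alpha=0.05):
    """Non-significant terms that no higher-order term in the model depends on."""
    terms = list(pvalues)
    return {
        term: p
        for term, p in pvalues.items()
        if p > alpha and not any(is_lower_order_of(term, other) for other in terms)
    }


def next_term_to_remove(pvalues, alpha=0.05):
    """Candidate with the largest P-value, or None if nothing can be removed."""
    candidates = removal_candidates(pvalues, alpha)
    return max(candidates, key=candidates.get) if candidates else None


# P-values of the first run, as reported in this section.
first_run = {
    "t": 0.000, "C": 0.000, "M": 0.000, "pH": 0.972, "T": 0.994,
    "t^2": 0.000, "C^2": 0.146, "M^2": 0.057, "pH^2": 0.982, "T^2": 0.989,
    "t*C": 0.000, "t^3": 0.000, "C^3": 0.026, "M^3": 0.425, "pH^3": 0.991,
}
print(next_term_to_remove(first_run))
```

Note that T has the largest *P*-value overall (0.994) but is not eligible for removal in this sketch, because T^{2} is still in the model; the rule therefore selects pH^{3} first, in line with the step described in the text.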

… (*P*-value > 0.05). By considering Notes 1 and 2, the factors given in Table 3 were candidates for removal. Then the least significant term (highest *P*-value) in the first run (pH^{3}) was removed and the operation was run again. The results of the second step for multiple polynomial regression with interaction effects are given in Table 4. The factors given in Table 5 were the candidates for removal. The process was continued step by step until only significant factors remained. Table 6 shows the results of the last run. As such, the cubic effects of t and C (t^{3}, C^{3}), the quadratic effects of t, C and M (t^{2}, C^{2} and M^{2}), the linear effects of t, C and M and the interaction effect of t and C (t*C) remained in the model. Therefore, the final regression equation was as follows:

| Source | *P*-value |
|---|---|
| t | 0.000 |
| C | 0.000 |
| M | 0.000 |
| pH | 0.972 |
| T | 0.994 |
| t^{2} | 0.000 |
| C^{2} | 0.146 |
| M^{2} | 0.057 |
| pH^{2} | 0.982 |
| T^{2} | 0.989 |
| t*C | 0.000 |
| t^{3} | 0.000 |
| C^{3} | 0.026 |
| M^{3} | 0.425 |
| pH^{3} | 0.991 |


| Source | *P*-value |
|---|---|
| T^{2} | 0.989 |
| C^{3} | 0.026 |
| M^{3} | 0.425 |
| pH^{3} | 0.991 |


| Source | *P*-value |
|---|---|
| t | 0.000 |
| C | 0.000 |
| M | 0.000 |
| pH | 0.918 |
| T | 0.994 |
| t^{2} | 0.000 |
| C^{2} | 0.143 |
| M^{2} | 0.056 |
| pH^{2} | 0.914 |
| T^{2} | 0.989 |
| t*C | 0.000 |
| t^{3} | 0.000 |
| C^{3} | 0.026 |
| M^{3} | 0.422 |


| Source | *P*-value |
|---|---|
| pH^{2} | 0.914 |
| C^{3} | 0.026 |
| M^{3} | 0.422 |
| T^{2} | 0.989 |


| Source | *P*-value |
|---|---|
| t | 0.000 |
| C | 0.000 |
| M | 0.000 |
| t^{2} | 0.000 |
| C^{2} | 0.093 |
| M^{2} | 0.000 |
| t*C | 0.000 |
| t^{3} | 0.000 |
| C^{3} | 0.013 |


The R^{2}, R^{2}_{adj} and R^{2}_{pred} values were 0.998, 0.9979 and 0.9973, respectively, and the RMSE of Equation (6) was 0.5987. Figure 5 shows the residual plots of the final model, which are used to test the goodness of fit in multiple polynomial regression. Inspection of the residual plots indicates that the usual least squares assumptions are met. Figure 5(a) shows the histogram of the residuals; its almost symmetric, bell-shaped form suggests that the normality assumption is likely to hold. Figure 5(b) shows the residuals versus fits, indicating that the residuals have a nearly constant variance. Figure 5(c) shows the normal probability plot of the residuals; the plot is nearly linear, indicating that the error terms are normally distributed. Figure 5(d) shows the residuals versus order of data, suggesting that the residuals are uncorrelated with each other.
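As a quick plausibility check on the reported adjusted coefficient of determination, the sketch below applies the usual textbook relation R^{2}_{adj} = 1 − (1 − R^{2})(n − 1)/(n − p − 1), taking n = 86 training observations and p = 9 retained terms. The function name `adjusted_r2` is hypothetical, and because the quoted R^{2} of 0.998 is a rounded value, the result can only be expected to agree approximately with the reported 0.9979.

```python
def adjusted_r2(r2, n_obs, n_terms):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_terms - 1)


# 86 training observations and 9 retained terms; R2 = 0.998 is the rounded
# value quoted in the text.
r2_adj = adjusted_r2(r2=0.998, n_obs=86, n_terms=9)
print(f"{r2_adj:.4f}")
```

The small gap between R^{2} and R^{2}_{adj} reflects the modest number of terms relative to the number of training observations.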

#### Testing the model

… *q* (mg g^{−1}), was calculated. Linear regression was applied between the measured adsorption capacity (*q*_{m}) and the predicted adsorption capacity (*q*_{p}) of caffeine. The comparison of *q*_{p} from multiple polynomial regression with *q*_{m} is presented in Figure 6. Also, the confidence limits at the 5% level, based on the distribution of points around the fitted line, indicate excellent reliability in predicting the adsorption capacity of caffeine. As Table 7 shows, the results of the various tests indicated that Equation (6), with inputs of t, C and M, performed best in predicting *q*_{p}.

| Model | Input | MRE | R^{2} | NRMSE | Performance | Equation |
|---|---|---|---|---|---|---|
| Equation (6) (best model) | t, C and M | 0.035 | 0.9961 | 0.0916 | Excellent | q_{p} = 1.05q_{m} |
| Structure1 | C and M | 0.069 | 0.975 | 0.1544 | Good | q_{p} = 1.07q_{m} |
| Structure2 | C and t | 0.18 | 0.929 | 0.2171 | Fair | q_{p} = 1.09q_{m} |
| Structure3 | M and t | 0.814 | 0.4873 | 0.4731 | Poor | q_{p} = 1.25q_{m} |


Results for the test data set show that Equation (6) slightly overestimated *q* (MRE was positive). The R^{2} and NRMSE of Equation (6) were 0.9961 and 0.0916, respectively, showing the excellent performance of multiple polynomial regression in predicting *q*_{p}. As mentioned in the above section, pH and T were removed from the final model; therefore, pH and temperature have no significant effect on the adsorption capacity of caffeine in batch-mode experiments, and so the number of experiments could be reduced based on the results of the models. In this respect, three structures were built from the three independent variables (t, C and M). The statistical indices and performance of the three structures, as compared with the best model (Equation (6)), are presented in Table 7.

For all linear regressions in Table 7, the intercepts are not significant at the 5% level and are considered zero. Based on Table 7, the *q*_{p} values of Equation (6), Structure1, Structure2 and Structure3 were higher than the measured adsorption capacity of caffeine, where the ratios q_{Eq(6)}/q_{m}, q_{Structure1}/q_{m}, q_{Structure2}/q_{m} and q_{Structure3}/q_{m} were 1.05, 1.07, 1.09 and 1.25, respectively. Structure1 showed good performance, with an NRMSE of 0.1544. The performance of Structure2 was fair, with an NRMSE of 0.2171, but Structure3 gave a poor prediction of the adsorption capacity of caffeine. To determine the sensitivity of the model to each independent variable, the change in NRMSE relative to the best model was examined after eliminating each operating variable from the model. As shown in Table 7, the greatest increase in NRMSE was due to C, followed by M, and then t. The most influential variable is the initial concentration of caffeine: the NRMSE rose markedly (from 0.0916 to 0.4731) when this parameter was removed from Equation (6). Similar outcomes have been demonstrated in prior studies (Jain *et al.* 2009; Mousavi *et al.* 2012). In fact, greater removal efficiency of caffeine was observed at lower caffeine concentrations because more unoccupied active sites were accessible. However, the removal efficiency of caffeine decreased as the initial caffeine concentration was raised, owing to saturation of the exchangeable sites of the MWCNTs. Therefore, the removal of caffeine by MWCNTs depends most strongly on C.

The second most influential parameter is the mass of adsorbent: the NRMSE rose from 0.0916 to 0.2171 when it was removed (Structure2). Similarly, the third most influential parameter is contact time: the NRMSE rose from 0.0916 to 0.1544 when it was removed (Structure1). The weaker effect of contact time can be attributed to the short equilibrium time. In fact, equilibrium is reached after 5 min for caffeine (data not presented): the removal efficiency increases with time in the first 5 min, after which the adsorption curve reaches equilibrium. pH and temperature were not significant variables and were dropped from the final model, which confirms the applicability of MWCNTs for the treatment of surface and ground water over a broad range of these parameters. This study demonstrated that multiple polynomial regression can be an accurate and faster alternative to the more difficult and time-consuming models available for *q* prediction.
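The sensitivity ranking can be reproduced from the NRMSE values in Table 7 with the short Python sketch below. Each variable is paired with the Table 7 structure that omits it (Structure3 omits C, Structure2 omits M, Structure1 omits t); the function name `rank_by_nrmse_increase` is hypothetical.

```python
# NRMSE values from Table 7 (test data set).
nrmse_full = 0.0916  # Equation (6): inputs t, C and M
nrmse_without = {
    "C": 0.4731,  # Structure3: inputs M and t
    "M": 0.2171,  # Structure2: inputs C and t
    "t": 0.1544,  # Structure1: inputs C and M
}


def rank_by_nrmse_increase(baseline, reduced):
    """Order variables by how much NRMSE rises when each one is left out."""
    increase = {var: value - baseline for var, value in reduced.items()}
    return sorted(increase.items(), key=lambda item: item[1], reverse=True)


for var, delta in rank_by_nrmse_increase(nrmse_full, nrmse_without):
    print(f"{var}: +{delta:.4f}")
```

Ranking by the increase in NRMSE rather than by its absolute value makes the comparison explicit against the best model, which is the basis of the ordering C, then M, then t.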

## CONCLUSION

In this research, the adsorption capacity of caffeine by MWCNTs was studied using multiple polynomial regression. The influences of C, M, t, T and pH on *q* (mg g^{−1}) were studied. The MWCNTs were characterized by SEM, FTIR spectroscopy and point of zero charge. The results confirmed that the MWCNTs have a high capacity to take up caffeine from wastewater. The results show that multiple polynomial regression can give an excellent fit to the observed adsorption capacity. Extending the input variables from only ‘C + M + t’ to ‘C + M + t + pH + T’ did not show a significant effect on the removal efficiency of caffeine from aqueous media and consequently did not affect the model. The results also indicated that C was more important in *q* prediction than M and t. The suggested technique is easy to operate, accurate and rapid, and needs less computational time.