In this paper, the use of nonlinear nearest trajectory based on phase space reconstruction along with several data-driven methods, including two types of perceptron artificial neural networks with Levenberg–Marquardt (ANN-LM) and particle swarm optimization learning algorithms (ANN-PSO), adaptive neuro-fuzzy inference system (ANFIS) and gene expression programming for forecasting suspended sediment concentration (SSC) dynamics in streamflow is studied. The nonlinearity of the series was tested using the method of surrogate data at 0.01 significance level as well as correlation exponent method. The proper time delay is calculated using the average mutual information function. Obtained results of different models are compared using root mean square error (RMSE), Pearson's correlation coefficient (PCC) and Nash–Sutcliffe efficiency with logarithmic values (Eln). Of the applied nonlinear methods, ANFIS generates a slightly better fit under whole daily SSC values (the least amount of RMSE = 10.5 mg/l), whereas ANN-PSO shows superiority based on the Eln criterion (the highest amount of Eln = 0.885). According to the non-parametric Mann–Whitney test, all data-driven models represent the same forecasted results and are significantly superior to the nearest trajectory-based model at the 99% confidence level.
INTRODUCTION
Proper estimation of sediment concentration including bed load and suspended sediment is a critical task in hydrology and hydraulic engineering projects. Determining sediment concentration has implications for dam design, sediment erosion to waterways, and watershed management (Cigizoglu & Alp 2006). Suspended sediment concentration (SSC) is known as the ratio of the mass of sediment in a water–sediment mixture to the volume of the mixture. The assessment and prediction of the volume of transport SSC are vital in river and hydraulic engineering (Kisi 2008).
There are a variety of nonlinear data-driven simulating and predicting/forecasting techniques for hydrological time series data sets including methods based on the concept of trajectories of system dynamics such as chaotic prediction methods (Yang et al. 2011) and data-driven models including artificial neural networks (ANNs), neuro-fuzzy models, and gene expression programming (GEP) (Zounemat-Kermani & Teshnehlab 2008; Shiri & Kisi 2011; Khatibi et al. 2012). Several studies can be found using artificial intelligence-based models, such as ANNs and adaptive neuro-fuzzy inference systems (ANFISs) for SSC modeling with promising results (Rajaee et al. 2009, 2011; Melesse et al. 2011; Kakaei Lafdani et al. 2013; Zounemat-Kermani et al. 2016).
In recent years, the use of phase space reconstruction in hydrology has received considerable attention for analyzing and forecasting hydrological time series data. They generate predictions/forecasts by finding local portions of the time series that closely resemble a portion of the points immediately preceding the point to be predicted. Nonetheless, related studies for SSC prediction based on nearest trajectory strategy and local prediction methods are scarce. Sivakumar & Wallender (2005) investigated the predictability of river flow and suspended sediment transport in the Mississippi River. Their approach was based on the reconstruction of a single-dimensional series within a multi-dimensional system state for representing the underlying dynamics as well as the use of a local approximation technique for the forecasting process. They claimed that prediction accuracy was encouraging for 1-day-ahead results. Shang et al. (2009) employed the system state embedding-based weight predictor algorithm to make a short-term forecasting of sediment transport phenomena. They reported that the predicted values were in reasonable agreement with observed values within 15 days, but they appeared much less accurate beyond the limit of 20 days. Yang et al. (2011) proposed a new chaos Bayesian optimal prediction method by choosing optimal parameters in the local linear prediction method and improving the prediction accuracy with Bayesian theory. The prediction results indicated that the applied method can choose the optimal parameters adaptively in the prediction process of monthly discharge. Adenan & Noorani (2014) applied the local approximation method (LPM) for prediction of river flow at three measurement stations in Peninsular Malaysia. Results showed that the predicted results were close to agreement for all stations with a high correlation coefficient. Albostan & Önöz (2015) investigated the chaotic behavior of the river basin. For that purpose, the daily discharge data of four gauge stations were examined by using nonlinear data analysis methods. The LPM was also applied to evaluate the performance of the model. They reported that the predicted values were in good agreement with the observations by having high values of the correlation coefficient.
The present study attempts to detect the dynamical behaviors of the SSC processes (in the San Joaquin River, California) at daily temporal scale time series using a phase space reconstruction theory based on the time delay embedding method. In addition, the ability of nearest trajectory strategy – based on phase space reconstruction – to provide daily forecasting of SSC as well as several data-driven models (ANN-LM, ANN-PSO, ANFIS and GEP) is investigated. The outline of the paper is as follows. The methodology including nonlinear methods is described immediately below. The study area (San Joaquin River) and statistical analyses over the data are then presented. This is followed by the application of simulating models and their results. Next, the findings are discussed and remarks and directions for future studies given, and finally, conclusions are drawn.
METHODOLOGY
In this section, the methods based on nearest trajectory strategy, including reconstruction of the system state and local prediction method (LPM) and method of surrogate as well as other data-driven methods are briefly described.
Nearest trajectory strategy: system state reconstruction and local prediction method
All nonlinear local and global models have parameters that must be selected before the model can be constructed or tested. In this study, a local zeroth-order model is constructed using the nearest points on the k nearest neighboring trajectory segments. This forecasting technique is based on the fact that nearby trajectories either converge or do not diverge fast enough for small sample steps in the system state. This model evaluates the mapping function among the points which are near the forecast points. The time series identifies the sub-phase space by selecting k nearest adjacent (neighbor) points within the d dimensional space (Sannasiraj et al. 2004).
The radius of the neighborhoods r should be large enough in order to cover the noise extent, but still smaller than a typical curvature radius. For each point yt in the time series, the nearest neighbor yj for each point is sought in an m-dimensional space (Hegger et al. 1999). In this study, the initial distance (radius) was chosen as 1/1,000.
Method of surrogate data
Surrogate data are random data sets which have the same autocorrelation structure as the dynamical real time series. In other words, surrogates have the same spectral properties and the same statistical distribution as the original data. The correspondence between surrogates and the original data, in terms of the relative discrepancy, either approaches to zero as the number of iteration increases or converges to a finite value acting as an uncertainty threshold (Millan et al. 2009). This method serves as a null hypothesis (e.g., the SSC time series was generated by a stochastic/linear process), which consists of a candidate stochastic/linear process (Theiler et al. 1992). The objective of the hypothesis is to reject the null hypothesis by comparing the original and random data sets for testing the nonlinearity. In case of the existence of significant difference, the assumption of stochasticity of the time series will be rejected. In general, two approaches can be applied to generate surrogate data, adding noise to the same spectral signal from the original time series or generating a random series by shuffling the sequence. The first technique is used in the present study for the generation of surrogate data (Ng et al. 2007).
Perceptron neural networks (ANN-LM and ANN-PSO) and ANFIS models
Gaussian-type functions were chosen as fuzzy set membership functions. More details of the applied ANN and ANFIS models are presented in Table 1.
ANN . | . | ANFIS . | . |
---|---|---|---|
Learning rate (LM) | 0.01 | Membership function | Gaussian |
Momentum (LM) | 0.85 | And/Or method | Product/Maximum |
Population size (PSO) | 100 | Aggregate/Implication method | Maximum/Product |
Output layer's activation function | Pure linear | Defuzzification method | Wtavera |
Hidden layer's activation function | Tangent sigmoid |
ANN . | . | ANFIS . | . |
---|---|---|---|
Learning rate (LM) | 0.01 | Membership function | Gaussian |
Momentum (LM) | 0.85 | And/Or method | Product/Maximum |
Population size (PSO) | 100 | Aggregate/Implication method | Maximum/Product |
Output layer's activation function | Pure linear | Defuzzification method | Wtavera |
Hidden layer's activation function | Tangent sigmoid |
aNote: Returns the weighted average of the x-values, with the y-values used as weights.
Gene expression programing
GEP is a genotype/phenotype GA. It can evolve programs of different sizes and shapes, encoded in linear fixed-length chromosomes. Similar to GAs, GEP uses populations of individuals, selects them according to fitness, and introduces genetic variation by using one or more genetic operators (Ferreira 2006). The GEP initializes a population consisting of random chromosomes; the fitness of each chromosome is evaluated according to a target value. Each chromosome is composed of multiple genes with each gene encoding a smaller sub-program. Every gene encodes a mathematical expression expressed as an expression tree. In GEP, the unconstrained operation of important genetic operators, such as mutation, transposition, and recombination are permitted as a result of structural and functional organization of the linear chromosomes (Hashmi et al. 2011; Shiri & Kisi 2011).
Function set . | Symbol . | Weight . | Function set . | Symbol . | Weight . |
---|---|---|---|---|---|
Addition | 4 | Logarithm(x, y) | Log2 | 1 | |
Subtraction | 4 | X to the power2 | X2 | 1 | |
Multiplication | 4 | X to the power3 | X3 | 1 | |
Division | 2 | Cubic root | 1 | ||
Square root | 1 | Gaussian | Gau | 1 | |
Exponential | Exp | 1 | Sine | sin | 1 |
Natural log | Ln | 1 | Cosines | cos | 1 |
Power | Pow | 1 | Arctangent | Atan | 1 |
General | Genetic operators | ||||
Chromosomes | 60 | Mutation rate | 0.05 | ||
Genes | 5 | Inversion rate | 0.1 | ||
Head size | 10 | IS transposition rate | 0.1 | ||
Linking function | Addition | RIS transposition rate | 0.1 | ||
Fitness function | RMSE | 1-point recombination rate | 0.2 | ||
2-point recombination rate | 0.3 | ||||
Gene recombination rate | 0.1 | ||||
Gene transposition rate | 0.1 |
Function set . | Symbol . | Weight . | Function set . | Symbol . | Weight . |
---|---|---|---|---|---|
Addition | 4 | Logarithm(x, y) | Log2 | 1 | |
Subtraction | 4 | X to the power2 | X2 | 1 | |
Multiplication | 4 | X to the power3 | X3 | 1 | |
Division | 2 | Cubic root | 1 | ||
Square root | 1 | Gaussian | Gau | 1 | |
Exponential | Exp | 1 | Sine | sin | 1 |
Natural log | Ln | 1 | Cosines | cos | 1 |
Power | Pow | 1 | Arctangent | Atan | 1 |
General | Genetic operators | ||||
Chromosomes | 60 | Mutation rate | 0.05 | ||
Genes | 5 | Inversion rate | 0.1 | ||
Head size | 10 | IS transposition rate | 0.1 | ||
Linking function | Addition | RIS transposition rate | 0.1 | ||
Fitness function | RMSE | 1-point recombination rate | 0.2 | ||
2-point recombination rate | 0.3 | ||||
Gene recombination rate | 0.1 | ||||
Gene transposition rate | 0.1 |
STUDY AREA AND DATABASE
Study area
Data and statistical analysis
In this study the periods for training (October 1st 1964–September 30th 2007; about 90% of the total data) and testing sets (October 1st 2007–September 30th 2012; the remaining 10% of the total data) were selected based on the existing data of 48 water years. The statistical parameters of the entire data as well as the training data and testing data sets are presented in Table 3.
Statistical parameters . | Training set . | Testing set . | Total data . |
---|---|---|---|
Number of data | 16,021 (90%) | 1,780 (10%) | 17,801 (100%) |
Xmean | 79.96 | 51.00 | 76.94 |
Sx | 49.83 | 30.13 | 48.96 |
Cv | 0.62 | 0.59 | 0.64 |
Xmax | 1,590 | 441 | 1,590 |
Xmin | 6 | 12 | 6 |
Csx | 5.07 | 4.74 | 5.00 |
r1 | 0.89 | 0.92 | 0.88 |
r2 | 0.75 | 0.80 | 0.76 |
Statistical parameters . | Training set . | Testing set . | Total data . |
---|---|---|---|
Number of data | 16,021 (90%) | 1,780 (10%) | 17,801 (100%) |
Xmean | 79.96 | 51.00 | 76.94 |
Sx | 49.83 | 30.13 | 48.96 |
Cv | 0.62 | 0.59 | 0.64 |
Xmax | 1,590 | 441 | 1,590 |
Xmin | 6 | 12 | 6 |
Csx | 5.07 | 4.74 | 5.00 |
r1 | 0.89 | 0.92 | 0.88 |
r2 | 0.75 | 0.80 | 0.76 |
aNote: Xmean is the average of the data; Xmax is the maximum value of the data; Xmin is the minimum value of the data; Sx is the standard deviation; Cv is the coefficient of variation; Csx is the skewness; and rn is autocorrelation of SSC n-day lag.
APPLICATION AND RESULTS
Evaluation of results
The efficiency criterion (E) proposed by Nash & Sutcliffe (1970) is defined as one minus the sum of the absolute squared differences between the predicted and observed values normalized by the variance of the observed values during the period under investigation. To reduce the problem of the squared differences and the resulting sensitivity to extreme values, the Nash–Sutcliffe efficiency E is often calculated with logarithmic values of So and Sp. The Eln gains the advantage of the logarithmic transformation of the SSC values. Thereafter, the peaks are flattened and the low SSCs are kept more or less at the same level. As a result, the influence of the low SSC values is increased in comparison to the SSC peaks (Krause et al. 2005).
Data preprocessing
Test of nonlinearity
Another alternative way to detect possible nonlinearity in a time series is by applying amplitude-adjusted Fourier-transformed (AAFT) surrogate data set method (Theiler et al. 1992). In this respect, the behavior of the correlation exponent can be investigated for determining the presence of chaos in a time series, such that, for deterministic processes the value of the correlation exponent saturates after a certain value of embedding dimension (Zounemat-Kermani 2016). As illustrated in Figure 6(b), the relationship between correlation dimension and embedding dimension of the original SSC data and for its surrogate data, shows that in the case of surrogate data – unlike the original SSC series – the correlation exponents do not reach a saturation level. This issue also implies nonlinearity of the original SSC time series.
Results of the nearest trajectory strategy model
In order to forecast the 1-day ahead of SSC using the nearest trajectory strategy (LPM) the time series analysis (TISEAN) package was used. Initially, the delay time was estimated by the AMI function using time lags of 1–100 days (as shown in Figure 4) to reconstruct the original system state (τ= 66). In this paper, the method used for the determination of the embedding dimension is based on a calculation of the minimum RMSE of the forecasted results by increasing the embedding dimension to achieve the optimal dimension (Shang et al. 2009). Therefore, embedding dimensions 1 to 8 are used for the system state reconstruction.
Model (embedding dimension) . | RMSE (mg/L) . | PCC . |
---|---|---|
LPM(1) | 11.332 | 0.924 |
LPM(2) | 11.055 | 0.932 |
LPM(3) | 11.582 | 0.928 |
LPM(4) | 13.539 | 0.917 |
LPM(5) | 14.843 | 0.903 |
LPM(6) | 15.397 | 0.895 |
LPM(7) | 16.063 | 0.886 |
LPM(8) | 16.391 | 0.870 |
Model (embedding dimension) . | RMSE (mg/L) . | PCC . |
---|---|---|
LPM(1) | 11.332 | 0.924 |
LPM(2) | 11.055 | 0.932 |
LPM(3) | 11.582 | 0.928 |
LPM(4) | 13.539 | 0.917 |
LPM(5) | 14.843 | 0.903 |
LPM(6) | 15.397 | 0.895 |
LPM(7) | 16.063 | 0.886 |
LPM(8) | 16.391 | 0.870 |
Delay time (day) . | RMSE (mg/L) . | PCC . |
---|---|---|
10 | 11.558 | 0.927 |
20 | 11.562 | 0.928 |
30 | 11.124 | 0.930 |
40 | 11.065 | 0.931 |
50 | 11.067 | 0.931 |
60 | 11.049 | 0.933 |
70 | 11.098 | 0.931 |
80 | 11.469 | 0.922 |
90 | 11.598 | 0.918 |
100 | 11.587 | 0.919 |
Delay time (day) . | RMSE (mg/L) . | PCC . |
---|---|---|
10 | 11.558 | 0.927 |
20 | 11.562 | 0.928 |
30 | 11.124 | 0.930 |
40 | 11.065 | 0.931 |
50 | 11.067 | 0.931 |
60 | 11.049 | 0.933 |
70 | 11.098 | 0.931 |
80 | 11.469 | 0.922 |
90 | 11.598 | 0.918 |
100 | 11.587 | 0.919 |
Results of the data-driven methods
Five input combinations (see Table 6) and a single output (the 1-day ahead suspended sediment load (St + 1)) were established for ANN-LM, ANN-PSO, ANFIS, and GEP models based on the PACF analysis (see Figure 5(a)).
Input-Neurons-Output . | Input-Rules number-Output . | Gene No.-Chromosomes-Head size . | Input(s) . | Output . |
---|---|---|---|---|
ANN(1-4-1) | ANFIS(1-2-1) | GEP(3-30-6) | St | St+1 |
ANN(2-5-1) | ANFIS(2-4-1) | GEP(4-40-7) | St, St−1 | St+1 |
ANN(3-7-1) | ANFIS(3-8-1) | GEP(4-50-8) | St, St−1, St−2 | St+1 |
ANN(4-8-1) | ANFIS(4-16-1) | GEP(5-55-9) | St, St−1, St−2, St−3 | St+1 |
ANN(5-10-1) | ANFIS(5-32-1) | GEP(5-60-10) | St, St−1, St−2, St−3, St−4 | St+1 |
Input-Neurons-Output . | Input-Rules number-Output . | Gene No.-Chromosomes-Head size . | Input(s) . | Output . |
---|---|---|---|---|
ANN(1-4-1) | ANFIS(1-2-1) | GEP(3-30-6) | St | St+1 |
ANN(2-5-1) | ANFIS(2-4-1) | GEP(4-40-7) | St, St−1 | St+1 |
ANN(3-7-1) | ANFIS(3-8-1) | GEP(4-50-8) | St, St−1, St−2 | St+1 |
ANN(4-8-1) | ANFIS(4-16-1) | GEP(5-55-9) | St, St−1, St−2, St−3 | St+1 |
ANN(5-10-1) | ANFIS(5-32-1) | GEP(5-60-10) | St, St−1, St−2, St−3, St−4 | St+1 |
In order to designate the appropriate number of neurons in the ANN's hidden layer, a trial and error procedure was used and the model with the lowest MSE for the testing set was considered as the representative ANN structure.
The RMSE and PCC statistics for the optimum ANNs, ANFIS, and GEP structures in the test period (Table 7) clearly show that the ANFIS(5-32-1) model is superior to the other models (lowest RMSE = 10.513 mg/L and highest PCC = 0.939) followed by GEP(5-55-9) model (RMSE = 10.731 mg/L and PCC = 0.936).
Model . | Model inputs . | RMSE (mg/L) . | PCC . |
---|---|---|---|
ANN-LM(4-8-1) | St, St−1, St−2, St−3 | 11.031 | 0.933 |
ANN-PSO(5-10-1) | St, St−1, St−2, St−3, St−4 | 10.945 | 0.935 |
ANFIS(5-32-1) | St, St−1, St−2, St−3, St−4 | 10.513 | 0.939 |
GEP(5-55-9) | St, St−1, St−2, St−3 | 10.731 | 0.936 |
Model . | Model inputs . | RMSE (mg/L) . | PCC . |
---|---|---|---|
ANN-LM(4-8-1) | St, St−1, St−2, St−3 | 11.031 | 0.933 |
ANN-PSO(5-10-1) | St, St−1, St−2, St−3, St−4 | 10.945 | 0.935 |
ANFIS(5-32-1) | St, St−1, St−2, St−3, St−4 | 10.513 | 0.939 |
GEP(5-55-9) | St, St−1, St−2, St−3 | 10.731 | 0.936 |
In this study, all the GEP models consisted of 16 functions (see Table 2 for further details) selected among all those available. This selection was based on the most comprehensive operators and their relevance to the nature of the problem. According to the PACF analysis, the terminal set contained one to five independent lagged historical variables similar to other forecasting models. Subsequently, each possible solution of the SSC prediction problem was randomly generated by combining some of the introduced functions and some terminals until the highest possible coefficient of determination was achieved. To arrive at the global optimal solution and to avoid getting trapped in a local minimum, all the genetic operators described in Table 2 were used, with the same appointed values employed for all the GEP models.
DISCUSSION
A comparison of RMSE and PCC values (Tables 4 and 7) indicates that the ANFIS model outperformed all of the other SSC forecasting models. To achieve a better perspective of the overall results, the RMSE of the best forecasting models as well as their Er values and Nash–Sutcliffe efficiency with logarithmic values (Eln) are tabulated in Table 8.
Model . | RMSE (mg/L) . | Er (mg/L) . | Eln . | Enhancement in RMSE efficiency (%) . | Enhancement in Eln efficiency (%) . |
---|---|---|---|---|---|
ANFIS(5-32-1) | 10.513 | −1.04 | 0.883 | 4.9 | 1.1 |
GEP(5-55-9) | 10.731 | −1.02 | 0.882 | 2.9 | 1.0 |
ANN-PSO(5-10-1) | 10.945 | −1.05 | 0.885 | 0.3 | 1.4 |
ANN-LM(4-8-1) | 11.031 | −1.16 | 0.884 | 0.2 | 1.3 |
LPM(2) | 11.055 | −1.43 | 0.873 | Base | Base |
Model . | RMSE (mg/L) . | Er (mg/L) . | Eln . | Enhancement in RMSE efficiency (%) . | Enhancement in Eln efficiency (%) . |
---|---|---|---|---|---|
ANFIS(5-32-1) | 10.513 | −1.04 | 0.883 | 4.9 | 1.1 |
GEP(5-55-9) | 10.731 | −1.02 | 0.882 | 2.9 | 1.0 |
ANN-PSO(5-10-1) | 10.945 | −1.05 | 0.885 | 0.3 | 1.4 |
ANN-LM(4-8-1) | 11.031 | −1.16 | 0.884 | 0.2 | 1.3 |
LPM(2) | 11.055 | −1.43 | 0.873 | Base | Base |
The negative values of Er for each model indicates that all models under-forecasted SSC to a greater or lesser degree.
Based on the findings of the study, not much difference can be seen among the forecasted models, especially for the base SSC values (see Table 7, Enhancement in Eln efficiency column). Thereafter, to have a better judgment about models’ performance, two statistical tests were conducted (Table 9). First, the Anderson–Darling test was applied for checking each series normality. As all the series were found to be non-normal data (P-value < 0.005), the non-parametric Mann–Whitney test was conducted for checking the difference between the original and each model's forecasted results. According to the Mann–Whitney nonparametric test, that two samples come from the same population, there is no significant difference between the original time series and data-driven models, whereas there is a significant difference for the local prediction model considering confidence level = 99%. In other words, it can be said that all the data-driven models performed satisfactorily with no statistical superiority, but they clearly outperformed the LPM model. However, considering confidence level equal to 95%, it can be seen from Table 8 that the ANFIS and GEP models are significantly superior to the other models.
Model . | Anderson–Darling normality test . | Normality . | Mann–Whitney test . | Significantly different (α=0.01) . | Significantly different (α=0.05) . |
---|---|---|---|---|---|
Original | P-value < 0.005 | No | – | – | – |
ANFIS | P-value < 0.005 | No | Significant at 0.166 | No | No |
GEP | P-value < 0.005 | No | Significant at 0.096 | No | No |
ANN-PSO | P-value < 0.005 | No | Significant at 0.037 | No | Yes |
ANN-LM | P-value < 0.005 | No | Significant at 0.031 | No | Yes |
LPM(2) | P-value < 0.005 | No | Significant at 0.002 | Yes | Yes |
Model . | Anderson–Darling normality test . | Normality . | Mann–Whitney test . | Significantly different (α=0.01) . | Significantly different (α=0.05) . |
---|---|---|---|---|---|
Original | P-value < 0.005 | No | – | – | – |
ANFIS | P-value < 0.005 | No | Significant at 0.166 | No | No |
GEP | P-value < 0.005 | No | Significant at 0.096 | No | No |
ANN-PSO | P-value < 0.005 | No | Significant at 0.037 | No | Yes |
ANN-LM | P-value < 0.005 | No | Significant at 0.031 | No | Yes |
LPM(2) | P-value < 0.005 | No | Significant at 0.002 | Yes | Yes |
The outcomes of this study imply that all the mentioned practical models are useful tools for forecasting 1-day ahead sediment concentration (PCC >0.9 and Eln >0.85). However, according to the results of the Mann–Whitney test, it is recommended to apply nonlinear data-driven models rather than LPM. One of the main limitations of the implemented LPM method can be stated as its restriction to choosing the proper time delay and embedding dimensions (which are challenging issues) for constructing the phase space. Therefore, it is recommended to check the possibility of embedding the SSC dynamic system based on a differential phase space (rather than time delay phase space) and continuously examine the results of the forecasting model. In order to embed a chaotic time series data set into a differential phase space, unlike the time delay phase space theorem, the time scale parameter (rather than time lag) is used. Another difference between the two theorems is that during the differential phase space reconstruction, the noise in the chaotic time series has been reduced (Xu 2009).
CONCLUSIONS
A preliminary attempt was made on the evaluation of nearest trajectory strategy using local prediction method (LPM) and several data-driven models (ANN-PSO, ANN-LM, ANFIS and GEP) for the daily forecasting of SSC in Vernalis station located on the San Joaquin River, California. Forecasts based on the nearest trajectory strategy (LPM method) were made by reconstructing the SSC series in the system state using a delay time theorem. Results suggested that the dynamics of the underlying system could be embedded in a two-dimensional system. Although the LPM model provided reasonable results (RMSE= 11.06 mg/L and PCC= 0.93), it was inferior to data-driven models based on the Mann–Whitney statistical test (significantly different at the 99% confidence level). Future studies could involve applying other techniques such as conceptual methods in forecasting SSC in comparison with data-driven methods.