Modelling streamflow and sediment yield using Soil and Water Assessment Tool: a case study of Lidder watershed in Kashmir Himalayas, India

The combination of heavy winter snowfall, intense monsoon rainfall, and mountainous topography exposes the Lidder watershed to serious erosion and flooding problems. Few attempts have been made to examine the Lidder watershed in depth for precise estimation of sub-basin level runoff and erosion. In this study, the Soil and Water Assessment Tool (SWAT) was calibrated using the Sequential Uncertainty Fitting algorithm (SUFI-2) for modelling streamflow and sediment yield of the Lidder watershed. Daily runoff and sediment data from 2003-2013 were used; data from 2003-2008 were used for calibration and data from 2009-2013 for validation. Model performance was evaluated using several statistical measures, which showed good results and indicated excellent potential of the SWAT model to simulate streamflow and sediment yield for both calibration and validation periods. The average annual upland sediment yield from the watershed was approximately 853.96 Mg/ha for an average surface runoff of 394.15 mm/year. This study identifies the vulnerable areas of the Lidder watershed, which can be examined by decision-makers for effective management and planning. Further, the calibrated model can be applied to other watersheds with similar characteristics to inform strategies for the management of watershed processes.


INTRODUCTION
Kashmir Valley is prone to natural hazards, mostly because of its geographic, geological, and tectonic setting. The vulnerability of Kashmir Valley to floods and soil erosion is a serious issue with adverse impacts on economic as well as environmental stability (Bhat et al. 2019). Thorough investigation and effective management of watersheds in this region are needed to limit the deterioration of its arable lands and to keep hydraulic structures functioning smoothly. An enormous amount of capital is spent on dredging rivers in Kashmir to restore their carrying capacity. It is of utmost importance to address watershed degradation at the grassroots level and to identify the causes and areas responsible for the siltation of rivers in Kashmir. The Lidder river is a major contributor to the Jhelum in terms of both runoff and sediment yield and has a significant impact on the geomorphology of Kashmir Valley (Mir & Jeelani 2015); surprisingly, no thorough investigations have been carried out to analyze the hydrological behavior of this watershed. The estimation of runoff and soil loss at the sub-basin level in this watershed is an important requirement for ecological planning, design of hydraulic structures, and flood protection works.
Frequent floods caused by excessive precipitation, together with improper management of land resources, have degraded the watersheds of Kashmir Valley, which therefore need continuous monitoring and comprehensive assessment. Although several studies have been performed on the morphometry and hydrology of some micro-watersheds in Kashmir (Mir et al. 2016; Shah et al. 2017), assessment of critical watersheds like the Lidder Valley has been entirely neglected. Assessment of hydrological variables like surface runoff and soil erosion remains a challenging task for researchers and decision-makers across the globe.
Quantification of streamflow and soil loss by conventional methods is tedious and uneconomical. Hydrological modelling techniques, however, have advanced considerably over the years and have become an indispensable tool for representing environmental complexity and assessing various hydrological processes. These models are governed by conservation laws and other physical equations that describe the spatial and temporal variation of a hydrological system using input data such as climatic conditions, land use, topography, and soil characteristics. Streamflow and sediment yield analysis forms the basis on which water managers make decisions consistent with effective management of watershed resources. Various hydrological models, such as the Système Hydrologique Européen Transport model (SHETRAN) (Ewen et al. 2000), the Agricultural Non-Point Source Model (AGNPS) (Young et al. 1989), the Areal Non-Point Source Watershed Environment Response Simulation Model (ANSWERS) (Beasley et al. 1980), the Kinematic Runoff and Erosion Model (KINEROS) (Smith et al. 1995), the Topography Based Hydrological Model (TOPMODEL) (Beven & Kirkby 1979), the Rangeland Hydrology and Erosion Model (RHEM) (Nearing et al. 2011), the Stanford Watershed Model (SWM) (Crawford & Burges 2004), the Sedimentology and Distributed Modeling Technique (SEDIMOT) (Barfield et al. 2010), the Watershed Erosion Simulation Program (WESP) (Santos et al. 2003), the Soil and Water Assessment Tool (SWAT) (Arnold et al. 1998), and the Water Erosion Prediction Project (WEPP) (Flanagan et al. 2001), have been used to solve various watershed problems all over the world. A comprehensive review of watershed models has revealed that the Soil and Water Assessment Tool performs well in complex watersheds because of its adaptability to input requirements and long-term computations (Gull & Shah 2020).
SWAT has been applied across different watersheds worldwide for a variety of applications (Khelifa et al. 2017; Liu & Jiang 2019; Zhang et al. 2019). The results of these studies are promising and confirm the versatility of SWAT in evaluating the hydrological processes of a watershed. The SWAT model has also been used to evaluate the effect of land-use changes on soil erosion and the spatial distribution of sediment for proper planning and management of watersheds (Anand et al. 2018; Bhattacharya et al. 2020).
Geomorphic studies of some watersheds using various geospatial techniques have revealed that Kashmir Valley is highly susceptible to environmental hazards such as erosion, landslides, and floods because of its complex topography and geography (Gull et al. 2017; Rather et al. 2017; Meraj et al. 2018). The Lidder watershed, being one of the biggest watersheds of Kashmir Valley, needs continuous monitoring and proper assessment for effective management of its land and water resources. The aim of this study is to assess the efficiency and reliability of the SWAT model by simulating the streamflow and sediment yield of the Lidder watershed and to analyze the spatial distribution of sediment yield for the preparation of a watershed prioritization map. This study could be a significant contribution towards understanding the hydrological behavior of the mountainous ecosystem of Kashmir. It may also serve as a useful reference for decision-makers and researchers working on similar environments elsewhere.

Study area
Lidder watershed lies between the Pir Panjal range in the South and South-East, the North Kashmir range in the North-East and Zanskar range in the South-West and is located between the latitudes 33°40′ N and 34°20′ N and longitudes 75°00′ E and 75°30′ E (Figure 1). The geographical elevation of the watershed varies between a minimum of 1,425 meters and a maximum of 5,187 meters above mean sea level with a mean elevation of 3,169 meters and a standard deviation of 872 meters. It has an area of 1,097 km² and possesses distinctive climatic characteristics because of its geophysical setting, being enclosed on all sides by high mountain ranges. Lidder river flows through the middle of the watershed and its source lies in the Kolhoi glacier. The river flows southwards up to Pahalgam where it is joined by the major tributary which has its source from Sheshnag lake. Lidder river then mostly flows westwards until it joins Jhelum at the Mirgund area of Khanbal.
The study watershed receives most of its precipitation in winter and spring in the form of snow and rainfall, respectively. Because of its complex topography, minimum and maximum temperatures vary strongly across the area throughout the year. The land use of the watershed is dominated by mixed forests covering 46.61% of the total watershed area, followed by barren lands covering 20.46% of the total area. Soil cover is dominated by loamy soils/lithosols covering 73.28% of the total watershed area. Morphometrically, the stream network of the Lidder watershed consists of first- to seventh-order streams as shown in

Soil and Water Assessment Tool
The Soil and Water Assessment Tool is a long-term, continuous, physically based, semi-distributed hydrological model widely used as a watershed-scale simulation tool to address different watershed questions. SWAT was developed by the United States Department of Agriculture (USDA), Agricultural Research Service (ARS) to evaluate the impact of watershed management practices on various hydrological processes like runoff, sediment yield, crop growth, and nutrient and pesticide transport. It has proven to be an efficient tool for evaluating the impact of changes in land use and management practices in complex watersheds because of its flexibility to incorporate different cropping stages, soil characteristics, topography, land-use practices, and climatic conditions (Yu et al. 2018; Du et al. 2019; Hosseini & Khaleghi 2020). For precise simulation of hydrological processes, SWAT divides a watershed into multiple sub-basins through which streams are routed. These sub-basins are further divided into hydrologically homogeneous units known as Hydrologic Response Units (HRUs), each of which is a unique combination of land use, soil, and slope characteristics. The hydrological cycle is analyzed and simulated based on the following daily water balance Equation (1).
$$S_f = S_o + \sum_{i=1}^{t} \left( P - Q - E - W_s - R_f \right) \qquad (1)$$

where $S_f$ = water content at final stage (mm); $S_o$ = water content at initial stage (mm); $P$ = rainfall on day i (mm); $Q$ = surface runoff on day i (mm); $E$ = evapotranspiration on day i (mm); $W_s$ = water entering the vadose zone on day i (mm); $R_f$ = return flow on day i (mm); $t$ = time period (days).

SWAT offers two computational methods for estimation of surface runoff, viz. the Soil Conservation Service (SCS) Curve Number (CN) method (SCS 1972) and the Green and Ampt method (Green & Ampt 1911). In this study, the SCS Curve Number method was adopted because sub-daily data, which the more data-intensive Green and Ampt method requires, were not available. The variable storage method was used to route the flow in channels, while the modified Penman-Monteith method (Monteith 1965) was applied to determine potential evapotranspiration. Soil loss was estimated using the modified universal soil loss equation (MUSLE) given below (Arnold et al. 1998):

$$T_{sed} = 11.8 \left( Q_S \cdot q_p \cdot A_{hru} \right)^{0.56} K_{usle} \, C_{usle} \, P_{usle} \, LS_{usle} \, C_{frg} \qquad (2)$$

where $T_{sed}$ = sediment yield (metric tons), $Q_S$ = surface runoff volume (mm per hectare), $q_p$ = peak runoff rate (m³/s), $A_{hru}$ = area of hydrologic response unit (hectares), $K_{usle}$ = soil erodibility factor, $C_{usle}$ = cover and management factor, $P_{usle}$ = support practice factor, $LS_{usle}$ = topographic factor, $C_{frg}$ = coarse fragment factor.
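The two relations described above can be illustrated with a minimal Python sketch. It assumes the standard SWAT forms of the daily water balance and of MUSLE (coefficient 11.8 and exponent 0.56, as given in the SWAT theoretical documentation); the function names and all numerical inputs are hypothetical and are not taken from this study.

```python
def water_balance(s_o, daily_terms):
    """Return the final soil water content S_f (mm).

    s_o: initial water content (mm).
    daily_terms: iterable of (P, Q, E, Ws, Rf) tuples, one per day, in mm.
    """
    s_f = s_o
    for p, q, e, w_s, r_f in daily_terms:
        # Rainfall adds water; runoff, evapotranspiration, percolation to
        # the vadose zone and return flow remove it.
        s_f += p - q - e - w_s - r_f
    return s_f


def musle_sediment_yield(q_s, q_p, a_hru, k_usle, c_usle, p_usle, ls_usle, c_frg):
    """Return the MUSLE sediment yield T_sed (metric tons) for one HRU-day."""
    runoff_term = (q_s * q_p * a_hru) ** 0.56
    return 11.8 * runoff_term * k_usle * c_usle * p_usle * ls_usle * c_frg


# Illustrative inputs only (assumed values, not study data).
days = [(10.0, 2.0, 3.0, 1.0, 0.5), (0.0, 0.0, 2.0, 0.5, 0.5)]
print(water_balance(100.0, days))
print(musle_sediment_yield(5.0, 2.0, 50.0, 0.3, 0.2, 1.0, 1.5, 0.9))
```

Because the runoff term is raised to a power below one, sediment yield in this form grows less than proportionally with runoff volume and peak rate, while it scales linearly with each USLE factor.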

Input data requirements
The first task is the collection of data, which comprises both satellite and field observations. Inconsistent or missing climatic data need to be fixed and brought into a form that is acceptable to the SWAT model. For missing data, a value of -99.00 should be used; this value instructs SWAT to generate data for that particular day. Similarly, satellite imagery has to be geo-referenced and projected to the correct zone. A digital elevation model (DEM), which is a three-dimensional representation of the terrain elevation of an area, is needed at good resolution in order to derive the slope and drainage of the area. Land use maps and soil maps, which are the basic inputs of the Soil and Water Assessment Tool, have to be prepared and processed into a usable form. The DEM, soil map, and land use map must all be reprojected into a Universal Transverse Mercator (UTM) projection system for SWAT to read them correctly; the required zone for our study area was found to be Zone 43 North. Soil and land use maps were downloaded from different online sources and then processed in ArcGIS software to bring them into a usable form. Weather data were downloaded from the global weather website and screened using Notepad++ software to remove errors and bring consistency to the dataset. The input data used in this study and their sources are summarized in Table 1.
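Two of the preparation steps above lend themselves to a short illustration: flagging missing weather values with -99.0 and finding the UTM zone of the study area. The sketch below is an assumption-laden example rather than the workflow actually used in the study (which relied on ArcGIS and Notepad++); the function names are hypothetical, and the zone formula is the standard six-degree rule, which ignores the special cases around Norway and Svalbard.

```python
MISSING_FLAG = -99.0  # value SWAT interprets as "generate data for this day"


def flag_missing(values):
    """Replace missing observations (None) with the SWAT missing-data flag."""
    return [MISSING_FLAG if v is None else float(v) for v in values]


def utm_zone(lon_deg, lat_deg):
    """Return (zone number, hemisphere) for a longitude/latitude in degrees."""
    zone = min(int((lon_deg + 180.0) // 6) + 1, 60)
    hemisphere = "N" if lat_deg >= 0 else "S"
    return zone, hemisphere


def wgs84_utm_epsg(lon_deg, lat_deg):
    """Return the EPSG code of the WGS 84 / UTM zone containing the point."""
    zone, hemisphere = utm_zone(lon_deg, lat_deg)
    return (32600 if hemisphere == "N" else 32700) + zone


# Approximate centre of the Lidder watershed (from the stated extent).
print(utm_zone(75.25, 34.0))
print(wgs84_utm_epsg(75.25, 34.0))
print(flag_missing([1.2, None, 0]))
```

For the stated extent (75°00′ E to 75°30′ E), the six-degree rule returns zone 43 in the Northern Hemisphere, in agreement with the zone reported above.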

Model setup
After preparation of all inputs like soil maps, land use maps, slope, and meteorological data, the SWAT model is set up and run to obtain the necessary hydrological parameters, which further need to be compared with observed data for calibration and validation of the model. The typical flowchart illustrating the working of the SWAT model is shown in Figure 3.

Watershed delineation
The initial step in SWAT simulation is watershed delineation. A DEM of 10 m resolution was projected as shown in Figure 4. SWAT allows the stream network to be either predefined or derived from the DEM; for DEM-based stream definition, it calculates flow direction and flow accumulation. The modeler also chooses the threshold critical source area, i.e. the minimum drainage area required to form a stream. A threshold critical source area of 100 hectares was selected for our study area, and the stream network was created accordingly. After the outlet and inlet definitions are given, SWAT automatically creates the outlet points, which can be edited manually by the modeler as required. A total of 12 outlet points were selected, including the whole watershed outlet, to delineate the Lidder watershed into 12 sub-basins as shown in Figure 5(a). Sub-basin parameters were calculated before carrying out HRU analysis.
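To make the role of the threshold critical source area concrete, the following sketch converts the 100 ha threshold into a number of 10 m DEM cells and applies it to a flow-accumulation grid. It is a simplified illustration, assuming that flow accumulation is expressed as a count of upstream cells; the function names and the sample accumulation values are hypothetical.

```python
def threshold_cells(area_ha, cell_size_m):
    """Number of DEM cells equivalent to a drainage area given in hectares."""
    area_m2 = area_ha * 10_000  # 1 ha = 10,000 m^2
    return round(area_m2 / (cell_size_m ** 2))


def stream_mask(flow_accumulation, min_cells):
    """Mark cells whose upstream cell count reaches the threshold as stream."""
    return [acc >= min_cells for acc in flow_accumulation]


n_cells = threshold_cells(100, 10)  # 100 ha on a 10 m grid
print(n_cells)
print(stream_mask([5, 10_000, 20_000], n_cells))
```

A smaller threshold area yields a denser stream network and more candidate outlet points, whereas a larger one yields a coarser network.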

HRU analysis
SWAT divides a sub-basin into hydrologically homogeneous units called Hydrologic Response Units (HRUs). SWAT reclassifies land use and soil data according to its own defined classes, as shown in Figure 5(b) and 5(c). The slope data were also reclassified by SWAT on the basis of user-defined slope classes, as shown in Figure 5(d). User-defined threshold values are then applied to derive the HRUs. Threshold values of 10, 10, and 15% for land use, soil, and slope, respectively, were used as recommended by Ricci et al. (2018). This means that a land use, soil, or slope class occupying less than the specified percentage of the area is not modelled separately; its area is merged into the remaining classes. The Lidder watershed was delineated into 130 HRUs.
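The effect of an HRU threshold can be sketched as follows. The example assumes that classes below the threshold are dropped and that their area is redistributed proportionally among the remaining classes so that the total area is preserved; this is a simplified single-level illustration (the class names and areas are hypothetical), not the exact multi-level procedure implemented in the SWAT interface.

```python
def apply_threshold(area_by_class, threshold_pct):
    """Drop classes below threshold_pct of total area and rescale the rest."""
    total = sum(area_by_class.values())
    kept = {
        name: area
        for name, area in area_by_class.items()
        if 100.0 * area / total >= threshold_pct
    }
    if not kept:
        # If every class is below the threshold, keep only the dominant one.
        dominant = max(area_by_class, key=area_by_class.get)
        kept = {dominant: area_by_class[dominant]}
    scale = total / sum(kept.values())
    return {name: area * scale for name, area in kept.items()}


# Hypothetical land use areas (ha) within one sub-basin.
land_use = {"forest": 60.0, "barren": 35.0, "water": 5.0}
print(apply_threshold(land_use, 10.0))
```

With a 10% threshold, the 5% class disappears and the two remaining classes grow proportionally, which reduces the number of HRUs while conserving sub-basin area.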

Writing input tables
The data was configured for the weather station, which includes rainfall, temperature, solar radiation, wind speed and relative humidity. The SWAT input files were written and the database was updated.

SWAT setup and run
Before running the simulation, the SWAT run is set up to enter the simulation period. The model was run for different hydrological processes and a SWAT check was performed to obtain the graphical representation of different hydrological processes such as runoff and sediment yield. The outputs were obtained to perform   Figure 6(a) and 6(b), respectively.

Performance evaluation
Different statistical tools have been used by researchers to check the accuracy and credibility of outputs given by watershed models (Moriasi et al. 2007). In the current study, the results of calibration and validation periods were evaluated by several goodness-of-fit coefficients: the coefficient of determination (R²), the Index of Agreement (D), and the Nash-Sutcliffe coefficient (NSE). The coefficient of determination (R²) shows the strength of the linear relationship between measured and predicted values of a quantity. Its value ranges between 0 and 1, where 0 indicates no linear relationship between measured and predicted values and 1 indicates a perfect linear relationship. A value of R² above 0.5 is considered acceptable (Aawar & Khare 2020).
Index of Agreement (D) is a normalized measure of the degree of model prediction error proposed by Willmott (1981). It is based on the ratio of the mean square error to the potential error and lies between 0 and 1, where 1 indicates a perfect match and 0 indicates no agreement at all (Lee et al. 2018).
Nash-Sutcliffe coefficient (NSE) is a normalized statistic that shows the relative magnitude of the residual variance compared to the variance of the observed dataset. It ranges between -∞ and 1. An efficiency of 0 indicates that the model prediction is only as accurate as the mean of the observed dataset. An efficiency of less than 0 indicates that the observed mean is a better predictor than the model, while a value of 1 corresponds to zero estimation error variance, i.e. a perfect match (Sowah et al. 2020).
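The index of agreement (D), coefficient of determination (R²), and Nash-Sutcliffe coefficient (NSE) are calculated from Equations (3)-(5), as given below:

$$D = 1 - \frac{\sum_{i=1}^{n} (O_i - S_i)^2}{\sum_{i=1}^{n} \left( |S_i - O_m| + |O_i - O_m| \right)^2} \qquad (3)$$

$$R^2 = \left[ \frac{\sum_{i=1}^{n} (O_i - O_m)(S_i - S_m)}{\sqrt{\sum_{i=1}^{n} (O_i - O_m)^2} \, \sqrt{\sum_{i=1}^{n} (S_i - S_m)^2}} \right]^2 \qquad (4)$$

$$NSE = 1 - \frac{\sum_{i=1}^{n} (O_i - S_i)^2}{\sum_{i=1}^{n} (O_i - O_m)^2} \qquad (5)$$

where $O_i$ is the observed data, $S_i$ is the simulated data, $O_m$ is the mean observed data, $S_m$ is the mean simulated data, and $n$ is the number of paired values.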
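The three statistics can be computed directly from paired observed and simulated series. The sketch below assumes the standard definitions of Willmott's index of agreement, the squared Pearson correlation, and the Nash-Sutcliffe efficiency; the function names and sample values are hypothetical.

```python
from statistics import mean


def index_of_agreement(obs, sim):
    """Willmott's index of agreement D (0 to 1)."""
    o_m = mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - o_m) + abs(o - o_m)) ** 2 for o, s in zip(obs, sim))
    return 1.0 - num / den


def r_squared(obs, sim):
    """Coefficient of determination R^2 (squared Pearson correlation)."""
    o_m, s_m = mean(obs), mean(sim)
    cov = sum((o - o_m) * (s - s_m) for o, s in zip(obs, sim))
    var_o = sum((o - o_m) ** 2 for o in obs)
    var_s = sum((s - s_m) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency NSE (-infinity to 1)."""
    o_m = mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - o_m) ** 2 for o in obs)
    return 1.0 - num / den


# Illustrative series only (not study data): simulation doubles every flow.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 4.0, 6.0, 8.0]
print(r_squared(observed, simulated), nash_sutcliffe(observed, simulated))
```

The example series shows why R² should not be used alone: a simulation that is perfectly correlated with the observations but biased still attains R² = 1, whereas NSE penalizes the bias and becomes negative.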

Sensitivity analysis
Identifying the model parameters that most strongly affect specific model outputs, such as streamflow and sediment yield, is the basic step towards a better understanding of SWAT. A set of parameters was chosen based on the relevant literature (Brouziyne et al. 2017; Guo & Su 2019; Karki et al. 2020) and SWAT documentation (Neitsch et al. 2005). The responsiveness and impact of model parameters on model outputs were ascertained by sensitivity analysis. Latin Hypercube One-factor-At-a-Time (LH-OAT) analysis was carried out to reveal the dominant parameters of the Soil and Water Assessment Tool and their relative sensitivity (Sahoo et al. 2019). During parameter investigation, the model runs (m + 1) × n times, where m is the number of parameters under evaluation and n is the number of Latin Hypercube loops. Each parameter was varied in turn at every Latin Hypercube sample point, and the parameters producing the largest changes in model output were labelled as the most sensitive (Leta et al. 2017). During sensitivity analysis, 18 parameters were found most sensitive for streamflow simulation, among which the baseflow alpha factor (ALPHA_BF) was ranked at the top of the list, followed by the soil evaporation compensation factor (ESCO). For sediment yield simulation, 7 parameters were found to have a significant impact on sediment yield output and were labelled as the most sensitive parameters, with the channel cover factor (CH_COV) ranked as the most sensitive parameter followed by the channel erodibility factor (CH_EROD). The most sensitive parameters, along with their range and fitted values for flow simulation and sediment simulation, are summarized in Table 2. The results reveal that the efficiency of the SWAT model in predicting streamflow in the study watershed is mainly controlled by the groundwater flow response to changes in recharge and by the soil depth used to meet soil evaporation demands. Similarly, sediment dynamics in the Lidder watershed are mainly controlled by the vegetation cover of the channel and the resistance of channel soils to erosion.
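The structure of an LH-OAT experiment and its run count of (m + 1) × n can be sketched as below. This is a simplified illustration and not the implementation used by SWAT or SWAT-CUP: each parameter is perturbed from the same Latin Hypercube base point by a fixed fraction of its range, and the parameter ranges, perturbation fraction, and function name are assumptions made for the example.

```python
import random


def lh_oat_design(ranges, n_loops, fraction=0.05, seed=0):
    """Return a list of parameter sets: n_loops * (m + 1) model runs."""
    rng = random.Random(seed)
    names = list(ranges)
    # Latin Hypercube: each parameter range is cut into n_loops strata and
    # every stratum is used exactly once, in an independently shuffled order.
    strata = {}
    for name in names:
        order = list(range(n_loops))
        rng.shuffle(order)
        strata[name] = order
    runs = []
    for j in range(n_loops):
        base = {}
        for name in names:
            lo, hi = ranges[name]
            u = (strata[name][j] + rng.random()) / n_loops
            base[name] = lo + u * (hi - lo)
        runs.append(dict(base))  # one run at the base point
        for name in names:  # then m runs, changing one parameter at a time
            lo, hi = ranges[name]
            step = fraction * (hi - lo)
            changed = dict(base)
            # Step upward unless that would leave the allowed range.
            if base[name] + step <= hi:
                changed[name] = base[name] + step
            else:
                changed[name] = base[name] - step
            runs.append(changed)
    return runs


# Hypothetical ranges for three of the parameters named in the text.
ranges = {"ALPHA_BF": (0.0, 1.0), "ESCO": (0.0, 1.0), "CH_COV": (0.0, 1.0)}
design = lh_oat_design(ranges, n_loops=5)
print(len(design))
```

The sensitivity of a parameter would then be judged from the change in model output between each base run and the run in which only that parameter was perturbed, averaged over the n loops.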

Calibration and validation
The complexity of a catchment and the uncertainty in hydrological modelling parameters, inputs, and observed data oblige a modeler to test the hydrological model through calibration and validation. In this study, the SUFI-2 algorithm of the SWAT Calibration and Uncertainty Programs package (SWAT-CUP) was used to calibrate and validate the model. The agreement between simulated and observed values of streamflow and sediment yield was assessed using qualitative and quantitative measures, with parameters optimized automatically by SWAT-CUP within the recommended ranges. The final value of each model parameter that gave the best model performance during calibration was used for validation of the model with no further changes. A split-sample procedure was applied using daily streamflow and sediment yield measured at the outlet of the watershed: the period 2003-2008 was used for calibration and 2009-2013 for validation. The period 2000-2002 was used as a warm-up period to reduce the effect of unknown initial conditions.
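The split-sample procedure amounts to assigning each daily record to a warm-up, calibration, or validation period by its year. A minimal sketch is given below; the period boundaries come from the text, whereas the function names and the sample records are hypothetical.

```python
from datetime import date

# Period boundaries (inclusive years) as stated in the text.
PERIODS = {
    "warm-up": (2000, 2002),
    "calibration": (2003, 2008),
    "validation": (2009, 2013),
}


def period_of(day):
    """Return the name of the period containing the date, or None."""
    for name, (first_year, last_year) in PERIODS.items():
        if first_year <= day.year <= last_year:
            return name
    return None


def split_series(records):
    """Group (date, value) records by period, dropping out-of-range dates."""
    groups = {name: [] for name in PERIODS}
    for day, value in records:
        name = period_of(day)
        if name is not None:
            groups[name].append((day, value))
    return groups


# Illustrative records only (assumed discharge values).
sample = [(date(2001, 6, 1), 12.0), (date(2008, 12, 31), 8.5), (date(2009, 1, 1), 8.1)]
print({name: len(rows) for name, rows in split_series(sample).items()})
```

Warm-up records are kept in their own group so that they can be run through the model but excluded when the goodness-of-fit statistics are computed.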
To check the performance of the model by visual assessment, daily values of observed and simulated data were plotted for both streamflow and sediment yield, as shown in Figures 7 and 8, respectively.
Visual appraisal of the streamflow hydrograph shows that the model underestimated the peak flows during calibration, as well as validation periods. This could be ascribed to the limited available climatic data and significant variations in the spatial distribution of precipitation in the Lidder watershed. Another conceivable explanation behind this deviation may be attributed to the fact that soils on hilly areas with high hydraulic conductivity and available water holding capacity retain a greater amount of precipitation during high rainfall events, which is later released to the baseflow. Similarly, glaciers retain a huge amount of snow, which melts during summers resulting in increased discharge of the Lidder river. Sediment peaks were also underestimated, which could be attributed to the vulnerability of glacial sediment transported to streams during the snowmelt process. Nonetheless, correlation of observed and simulated values of both streamflow and sediment yield showed outstanding results for calibration, as well as validation periods.
The observed and simulated values were plotted against one another to determine the goodness-of-fit criteria of coefficient of determination (R²), Nash-Sutcliffe coefficient (NSE), and Index of Agreement (D) for both streamflow and sediment yield. Statistical comparison of streamflow results reveals that the value of Index of Agreement (D) was 0.958 for calibration and 0.953 for validation period indicating excellent model fit. Coefficient of determination (R²) values were found to be 0.9454 and 0.9408 while Nash-Sutcliffe coefficient values were found to be 0.876 and 0.862 for calibration and validation periods respectively (Figure 9(a) and 9(b)).
The goodness of fit results between observed and simulated values of sediment yield showed excellent agreement with D = 0.797, R² = 0.888, NSE = 0.591 for the calibration period and D = 0.768, R² = 0.874, NSE = 0.541 for validation period (Figure 10(a) and 10(b)).

Spatial distribution of soil erosion
Watershed prioritization allows for improved resource planning to combat the study area's soil erosion problem. Prioritization also helps reduce administrative bias in the adoption of watershed management plans. SWAT sediment yield analysis was carried out to identify the areas most susceptible to erosion among the 12 contributing sub-basins of the Lidder watershed. Based on the average daily sediment yield from these sub-basins, a watershed prioritization map was prepared to classify the entire watershed into soil loss severity zones (Figure 11). Overall, 35% of the Lidder watershed has been identified as having a high to extremely high erosion potential. On average, the Lidder watershed generates 1.72 t/ha of sediment every day. With an average daily sediment loss of 6.17 t/ha, Sub-basin number 1 is the most vulnerable to soil erosion. It was observed that much more sediment is generated from zones with steep slopes than from sub-basins with flat topography. This could be attributed to the fact that these sub-basins have a distinctive land use, consisting mostly of barren lands and snow-covered mountains that are highly susceptible to erosion.
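The prioritization step can be illustrated by ranking sub-basins by their average daily sediment yield and assigning each one a severity class. In the sketch below, the class breaks and labels are assumptions chosen for illustration (the thresholds used for Figure 11 are not stated in the text), and only the yield of Sub-basin 1 is taken from the study; the other yields are hypothetical.

```python
from bisect import bisect_right

# Assumed class breaks in t/ha/day; not the thresholds used in the study.
BREAKS = [1.0, 2.0, 4.0]
LABELS = ["low", "moderate", "high", "extremely high"]


def severity(yield_t_ha_day):
    """Return the severity class for an average daily sediment yield."""
    return LABELS[bisect_right(BREAKS, yield_t_ha_day)]


def prioritize(yield_by_subbasin):
    """Rank sub-basins from highest to lowest yield with their class."""
    ranked = sorted(yield_by_subbasin.items(), key=lambda item: item[1], reverse=True)
    return [(sub, y, severity(y)) for sub, y in ranked]


# Sub-basin 1 uses the reported 6.17 t/ha/day; the others are hypothetical.
yields = {1: 6.17, 5: 1.5, 9: 0.4}
for row in prioritize(yields):
    print(row)
```

Sorting before classification gives decision-makers an ordered list in which conservation measures can be targeted at the highest-ranked sub-basins first.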

CONCLUSIONS
For simulation of hydrological processes like streamflow and sediment yield, effective model calibration is the basic step towards obtaining reliable and precise results. For good modelling practice, it is important to consider the structural complexity of the model and its sensitive parameters. In the current study, the Soil and Water Assessment Tool (SWAT) was calibrated to minimize errors and improve the simulation of streamflow and sediment yield in the Lidder watershed of the Kashmir Himalayas. The SWAT model (2012 version) was integrated with ArcGIS (version 10.4) for efficient use of spatial data and to provide a user-friendly editing environment. The model was calibrated using SUFI-2, a widely used algorithm that estimates the sensitivity and uncertainty of a hydrological model, to obtain reliable model predictions. Sensitivity analysis revealed that the baseflow alpha factor (ALPHA_BF) is the most sensitive parameter for flow simulation and the channel cover factor (CH_COV) is the most sensitive parameter for sediment simulation. Model performance was tested using time series plots and statistical evaluation measures, namely the Index of Agreement (D), coefficient of determination (R²), and Nash-Sutcliffe coefficient (NSE). The calibration and validation results showed that the model performs very well in simulating daily streamflow and sediment yield. However, some uncertainties were seen in modelling the peak flows and sediment yield, which may be associated with uncertainty in the input parameters, as well as limitations of the Curve Number method and the MUSLE equation used by SWAT to predict watershed processes. In general, the SWAT model has proved to be an effective tool for prediction of streamflow and sediment yield of the Lidder watershed.
Soil erosion analysis showed that the upland sub-basins contribute significantly to the total sediment yield from the watershed, which may be primarily attributed to the susceptibility of steep slopes, barren lands, and snow-covered areas to aggravated soil loss. The lowland sub-basins were classified as low-severity zones with very low erosion, which may be ascribed to the flat topography and forest cover of these regions.
This study can be used for further analysis, including environmental and land use changes and their impact on the hydrological behavior of a watershed. To alleviate the frequent floods and incessant sedimentation of the Lidder river, the study watershed needs effective conservation measures and extensive afforestation in the highly erosion-prone areas for future sustainable use and infrastructure development. The main limitation of this study is its use of the Curve Number approach, which does not provide an accurate estimate of runoff on a day with several storms because soil moisture and the associated runoff curve number vary from storm to storm. Sub-daily climatic data are required to accurately forecast runoff on such days. A key suggestion is that more attention should be paid to input data, because the model requires accurate and continuous meteorological and hydrological records. More sediment and streamflow measuring stations should therefore be established by government authorities. Other hydrological models can also be applied to the study region and compared with the results of this study.

DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its supplementary information.