Estimation of sediment thickness by solving Poisson's equation with bedrock outcrops as boundary conditions

Sediment thickness and bedrock topography are vital for the terrestrial hydrosphere. In this study, we estimated sediment thickness by using information from digital elevation models, geological maps, and public databases. We discuss two different approaches: First, the horizontal distances to the nearest bedrock outcrop were used as a secondary function in kriging and cokriging. Second, we applied Poisson's equation to estimate the local trend of the sediment thickness, with bedrock outcrops used as boundary conditions. Differences between point observations and the parabolic surface from Poisson's equation were minimized by inverse modelling, and ordinary kriging was applied to the residuals. The two approaches were evaluated with data from Øvre Eiker, Norway. Estimates derived from Poisson's equation gave the smallest mean absolute error, and larger soil depths were reproduced better if the local trend was included in the estimation procedure. An independent cross-validation showed the best accuracy and precision for kriging on the residuals from Poisson's equation. Solutions of Poisson's equation are sensitive to the boundary conditions, which in this case were the locations of the bedrock outcrops. Bedrock outcrops are available for direct observation; hence, the quality of the estimates can be improved by updating input from high-resolution mapping.


INTRODUCTION
Questions have been put forward as to whether current conceptual models in hydrology are able to predict the impact of environmental change on the terrestrial hydrosphere (Beven ; Wagner ; Gupta et al. ). A fundamental part of the terrestrial environment is the interface between the atmosphere and the unweathered bedrock, labelled the Critical Zone (CZ) in Earth Science (NRC ; Giardino & Houser ; Richardson ). Water saturation in the CZ is a key variable for rainfall-runoff processes (Dunne & Black ; Dunne ). Evaporation and transpiration from this domain are important for the terrestrial flux of moisture to the atmosphere (Trenberth ; Bierkens & van den Hurk ). A major part of the CZ is the unconsolidated material above the bedrock, for simplicity called the sediment thickness in this paper. The spatial distribution of the sediment thickness and the bedrock topography is of interest in many different applications. In this article, we do not focus on processes in the CZ; rather, we address different approaches to estimate the geometry of the sediment thickness and the bedrock topography by utilizing publicly available databases. Before looking more closely into the estimation problem, it is worth reviewing briefly some studies that focus on the role of bedrock topography and sediment thickness in rainfall-runoff processes. Graham et al. () and Graham & McDonnell () used field experiments and modelling results to demonstrate the importance of bedrock topography and permeability for threshold responses in runoff events. Above a threshold, the runoff increased dramatically for a given rainfall intensity. In such cases, a major fraction of the subsurface flow took place at the interface between the sediments and the less permeable bedrock. Hopp & McDonnell () used a synthetic dataset to show the importance of surface topography, soil depth, and bedrock topography for simulating the rainfall-runoff response.
They concluded that the subsurface connectivity of saturated areas explained the runoff characteristics. Gabrielli et al. () studied the relation between runoff and groundwater in sediments and bedrock in two hillslope catchments in New Zealand and Oregon, USA. In both hillslopes, the bedrock influenced the streamflow, but rapid subsurface stormflow was apparently most important if the bedrock had low permeability compared with the soil. Based on a large dataset, Tromp-van Meerveld & McDonnell (a) showed that pipe flow, matrix flow, and total flow had a threshold response to precipitation. Total flow increased two orders of magnitude if precipitation increased to a specified threshold. From their analysis, they concluded that the bedrock topography determined the upstream contributing area, not the surface area as usually assumed. In a following study, Tromp-van Meerveld & McDonnell (b) showed that micro depressions in the bedrock surface were filled before they contributed to subsurface runoff. Freer et al. () reported a similar control of bedrock topography on subsurface flow pathways.

Methods for mapping and/or estimation of sediment thickness or bedrock topography are determined by the purpose of the study. Test drilling usually yields the ultimate point precision, but for resource mapping or hazard assessments, geophysical prospecting methods might give the required precision. In the case of water balance, integrated grid estimates might be sufficient. A thorough review of geotechnical and geophysical methods is beyond the scope of this study, but a short summary of the most relevant estimation approaches is appropriate. Rempe & Dietrich () presented a process-based model for the thickness of the weathered zone under hillslopes. Clarke et al. () applied an Artificial Neural Network (ANN) to estimate glacial bedrock topography. They evaluated the method by using synthetic input data where the (synthetic) ice thickness was calculated by solving the shallow ice equations. The spatial reproduction of the bedrock topography had a maximum misfit of more than 400 m with a root-mean-square error of 70 m.
The relative error of ice volume varied from −45 to +220%. Mey et al. () applied ANN in the same way as Clarke et al. (). They used the digital elevation model (DEM) to estimate the sediment thickness in the Rhone valley and reported an average estimation error in sediment fill thickness of less than 27%. The ANN method assumes a 'sameness' in the training data and the output data. The assumption implies that the geomorphic signature of the bedrock topography is similar above and below the elevation where the bedrock is covered by sediments or ice. The ANN approach is, therefore, sensitive to the statistical homogeneity of the training dataset. In the glacier applications, moreover, the unknown domain is small compared with the surface that is exposed to observations. The opposite is more common in geo-applications: the unknown domain is usually large compared with the exposed bedrock.
One way to overcome this problem is to optimize the source term in Poisson's equation by inverse modelling. The residuals between observed sediment thickness and the optimal parabolic function can be modelled as a stochastic function in space.
The bedrock topography B(u) (m a.m.s.l.) is given by

B(u) = T(u) − D̂(u), (2)

where T(u) (m a.m.s.l.) is the terrain elevation and D̂(u) (m) is the estimate of the sediment thickness. The challenge is to find the D̂(u) that minimizes the estimation uncertainty ϵ(u) = D(u) − D̂(u).

Using L(u) in the estimation procedure was inspired by an intuitive procedure among drill operators: Based on previous experience from a given area, it is often possible to judge the sediment thickness D at a location u by simple visual inspection of the horizontal distance L to the nearest bedrock outcrop (Figure 1). The statistical relation between D(u) and L(u) was used in cokriging and ordinary kriging to evaluate the effect on the estimation variance.
The quality of publicly available data is interesting to explore because the volume of information increases every year. In this study, D(u) was taken from the Norwegian well database GRANADA (NGU a). Data from mainland Norway (denoted the national or large scale in the following) were used to estimate covariance and semivariogram parameters. Model evaluation included the continuous ranked probability score (Gneiting & Raftery ). Before presenting the methods and the results, it is necessary to describe the data material we applied to evaluate the estimation procedures.

DATA MATERIAL
As indicated above, we employed three sources of information for the current study: (i) point observations of sediment thickness, (ii) digital Quaternary maps (DQMs), and (iii) a DEM of the surface.
In addition, we also had access to geological maps of the bedrock for the study area. Even though the locations of the different bedrock types were not utilized in the current study, a brief description of the most prominent geological structures in the study area is interesting as background information.

Local geology
The study area Øvre Eiker (9 km × 7.1 km) is located about 60 km west of Oslo (Figure 2). Øvre Eiker is part of the Vestfold graben, which is one of the three main graben structures in the Oslo Rift. Large parts of the Vestfold graben comprise Permian volcanic and intrusive rocks. More information on the bedrock geology can be found in Andersen et al. () and NGU (b).

Quaternary maps and surface elevation
DQMs (NGU c) were used to identify locations where the bedrock was exposed to the atmosphere or covered with a thin and patchy layer of organic matter R(u) ¼ 1 (37% of the study area). For simplicity, the area where R(u) ¼ 1 is referred to as exposed bedrock in this article.
The remaining area, where R(u) = 0 (63%), was covered by unconsolidated material (Figure 3). Marine deposits are the most frequent sediment type and cover about 27% of the study area. The marine limit was levelled to 194 m above present mean sea level (Bargel ). More than 70% of the surface area covered by sediments (R(u) = 0) was below the marine limit, while more than 75% of the exposed bedrock (R(u) = 1) was located above this altitude (Figure 3).
Till deposits are the second most frequent surface sediment, covering about 13% of the study area.
Most of the tills are deposited as a relatively thin layer above the bedrock, and more than 95% of the exposed tills are located above the marine limit. For simplicity, we also included front moraine in the till category for the present study, which implies that the sediment thickness may be larger in some locations. Glaciofluvial sediments are deposited below 200 m a.m.s.l. and fluvial sediments below 120 m a.m.s.l. (Figure 3). About 8% of the study area is covered by colluvial or weathered material. Peat and swamps are most frequent at higher altitudes and cover less than 2% of the area. A minor fraction of the area (0.14%) is anthropogenic material.

METHODS
The statistical methods we applied for this study were based on Gaussian theory, which is well documented in textbooks (Isaaks & Srivastava ; Journel & Huijbregts ). The equations we apply will, therefore, not be reproduced here, but some more details are given in the Supplementary Appendix.

Experimental semivariograms
Ordinary kriging (OK), cokriging (CK), and bedrock kriging (BK) of sediment thickness in the study area were based on semivariogram and cross-semivariogram parameters derived from the national GRANADA database.
The estimation parameters should, therefore, be considered as average values for mainland Norway (Kitterød ). These estimation parameters are not stationary in space, and local estimation parameters should, in principle, give better results. In this case, the derivation of local parameters was not feasible due to the scarcity of point observations in the study area. For this reason, we preferred to perform local kriging with average large-scale semivariogram model parameters instead of less reliable local model parameters. Before the modelling took place, the GRANADA data required some pre-processing.
Boreholes and wells are usually located in populated areas, which means that most of the GRANADA recordings were from urban areas, while fewer recordings were from rural areas. If such data are used to calculate average properties, uneven sampling will affect the statistics because oversampled domains will dominate. To suppress such clustering effects, each observation received a weight calculated as a function of the distance to other boreholes.
Boreholes located close to each other were given less weight than boreholes in sparsely sampled areas.
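The distance-based declustering can be sketched as follows. This is a minimal illustration in Python/NumPy; the neighbourhood-count weighting and the function name are our own assumptions, not the exact weighting scheme applied to the GRANADA data:

```python
import numpy as np

def declustering_weights(coords, radius):
    """Down-weight clustered boreholes: each observation gets a weight
    inversely proportional to the number of neighbours within `radius`.
    coords : (n, 2) array-like of easting/northing in metres."""
    coords = np.asarray(coords, dtype=float)
    # pairwise distances (fine for the modest size of a well database)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    neighbours = (d < radius).sum(axis=1)  # count includes the point itself
    w = 1.0 / neighbours
    return w / w.sum()                     # normalise so the weights sum to one
```

With a cluster of three nearby boreholes and one isolated borehole, the isolated observation receives three times the weight of each clustered one, which is the intended suppression of oversampled domains.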

Ordinary kriging and cokriging
Expected sediment thickness D̂(u) and estimation variance σ²_D(u) were obtained by solving the ordinary kriging (OK) and the cokriging (CK) equations. The OK and CK equations were expressed in matrix notation and solved simultaneously (cf. Supplementary Appendix A).
Several methods exist for including secondary information in the estimation procedure (Deutsch & Journel ; Goovaerts et al. ). Here, we suggest a method called bedrock kriging (BK), which is less formal than cokriging.
The BK method is a two-step OK procedure: first, OK was done on Z(u_i) = ln(D(u_i)), and second on the relation

κ(u_i) = ln[D(u_i)/L(u_i)], (3)

where D (m) is the sediment thickness, L (m) is the horizontal distance to the bedrock outcrop, and u_i are the point observations. The motivation for using the simple relation in Equation (3) is that L(u) is given a priori in every point of interest from the DQMs. See Supplementary Appendix A3 and A4 for more details.
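As a sketch of how the secondary variables can be derived from a gridded indicator map R(u), the following Python/NumPy fragment computes the horizontal distance L(u) to the nearest outcrop cell by brute force, together with the ratio κ = ln(D/L); the grid, cell size, and function names are illustrative assumptions (a production DQM grid would call for a proper distance transform):

```python
import numpy as np

def outcrop_distance(R, cell_size=1.0):
    """Horizontal distance L(u) from every grid cell to the nearest
    exposed-bedrock cell (R == 1); R must contain at least one outcrop."""
    R = np.asarray(R)
    rows, cols = np.nonzero(R == 1)          # outcrop cell indices
    ii, jj = np.indices(R.shape)
    # squared distance from every cell to every outcrop cell, then minimum
    d2 = (ii[..., None] - rows) ** 2 + (jj[..., None] - cols) ** 2
    return np.sqrt(d2.min(axis=-1)) * cell_size

def kappa(D, L, eps=1e-9):
    """Secondary variable kappa = ln(D / L) used in bedrock kriging;
    eps guards against log(0) at outcrop cells."""
    return np.log(np.maximum(D, eps) / np.maximum(L, eps))
```

On a two-by-three grid with one outcrop cell and a 10 m cell size, the covered cells get Euclidean distances of 10, 20, and 10√2 m, matching the geometry directly.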

Poisson's equation
Inverse modelling of Poisson's equation was applied to fit a two-dimensional parabolic surface χ(u) to the observations. A general version of Poisson's equation is written as

Δχ(u) = Ξ(u), u ∈ Ω, (4)

where Δ denotes the Laplace operator, Ξ(u) is the source function, and Ω spans the total domain of observations (e.g. mainland Norway).
Three types of boundary conditions (Dirichlet, Neumann, or Cauchy) can be applied to solve Equation (4). Here, Dirichlet conditions χ(u) = 0 were imposed at the bedrock outcrops. The optimal solution was interpreted as the local trend of the sediment thickness D(u|Ξ_o), and the local trend of the bedrock topography then follows from Equation (2). In the case study presented below, we fitted χ(u) for u ∈ Ω₀, where Ω₀ is the Øvre Eiker study area. Since Ω₀ ≪ Ω, we approximated the right-hand side of Equation (4) by a constant:

Δχ(u) = Ξ, u ∈ Ω₀. (5)

In this context, inverse modelling of χ(u) means finding the Ξ that minimizes the difference between point observations χ_obs and the simulated χ_sim, where χ_sim is the solution of Equation (5):

Ξ_o = arg min_Ξ Σ_i |χ_obs(u_i) − χ_sim(u_i)|. (6)

The optimal Ξ_o yields a parabolic function with minimum deviation from the observations. The advantage is that the bedrock morphology is honoured directly by using the outcrop locations as boundary conditions. In this way, the complete geometry of the bedrock outcrop is included in the estimation procedure, not only the shortest distance to the bedrock outcrop as in cokriging and bedrock kriging.
We applied an explicit finite difference numerical scheme of second order to solve Equation (5). The explicit scheme made the boundary conditions simple to implement, and the matrix multiplication in MATLAB gave fast convergence to stable solutions. A formal evaluation of the numerical scheme was beyond the scope of this study, but no numerical instability was revealed, and high numerical precision was achieved at a feasible cost of CPU time.
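The solver and the incremental stepping of Ξ can be sketched as below. This is not the authors' MATLAB implementation; it is a simple Jacobi iteration in Python/NumPy, assuming Dirichlet conditions χ = 0 at outcrop cells and along the grid frame, with hypothetical function names:

```python
import numpy as np

def solve_poisson(R, xi, cell=1.0, n_iter=5000):
    """Jacobi iteration for Laplacian(chi) = xi on a regular grid,
    with the Dirichlet condition chi = 0 wherever R == 1 (exposed
    bedrock) and along the outer frame of the grid."""
    chi = np.zeros(np.shape(R))
    fixed = np.asarray(R) == 1
    fixed[0, :] = fixed[-1, :] = fixed[:, 0] = fixed[:, -1] = True
    for _ in range(n_iter):
        # average of the four neighbours; wrap-around values from np.roll
        # only ever land on fixed cells, which are reset to zero below
        avg = 0.25 * (np.roll(chi, 1, 0) + np.roll(chi, -1, 0)
                      + np.roll(chi, 1, 1) + np.roll(chi, -1, 1))
        chi = np.where(fixed, 0.0, avg - 0.25 * xi * cell ** 2)
    return chi

def fit_xi(R, obs_idx, obs_val, xi_grid, cell=1.0):
    """Inverse modelling by incremental stepping: return the xi from
    xi_grid whose solution has the smallest mean absolute error."""
    best = None
    for xi in xi_grid:
        chi = solve_poisson(R, xi, cell)
        mae = np.mean([abs(chi[i, j] - v) for (i, j), v in zip(obs_idx, obs_val)])
        if best is None or mae < best[1]:
            best = (xi, mae)
    return best
```

Note that with a positive source term and zero boundaries the solution is negative inside the domain; the sign convention for Ξ determines whether χ is read directly as a thickness trend.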
To obtain the conditional mean and estimation variance, we did ordinary kriging of the residuals between the observations and the optimal solution of Poisson's equation:

X(u_i) = D_obs(u_i) − D(u_i|Ξ_o), i = 1, …, n, (7)

where D_obs(u_i) is the point observation, D(u_i|Ξ_o) is the result of inverse modelling (Equation (6)), and n is the number of local observations. Ordinary kriging of the residuals from the optimal solution of Poisson's equation is called Poisson kriging (PK) in the following. In the PK procedure, we also included the boundary locations as conditional observations: D(u_k) = 0, where u_k are the locations of the bedrock outcrops.
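For completeness, a compact ordinary-kriging sketch of the kind that could be applied to the residuals X(u_i) is shown below (Python/NumPy). An exponential covariance model is assumed; the parameter values, the nugget treatment, and the function names are illustrative, not the fitted values of Table 1:

```python
import numpy as np

def exp_cov(h, sill, rng):
    """Exponential covariance; rng is the practical range (95% decay)."""
    return sill * np.exp(-3.0 * h / rng)

def ordinary_krige(xy_obs, z_obs, xy_est, sill, rng, nugget=0.0):
    """Ordinary kriging at one target location: returns the estimate
    and the kriging variance."""
    xy_obs = np.asarray(xy_obs, dtype=float)
    z_obs = np.asarray(z_obs, dtype=float)
    n = len(z_obs)
    d = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=2)
    C = exp_cov(d, sill, rng) + nugget * np.eye(n)
    # augment the system with the unbiasedness (Lagrange) row and column
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = C
    A[n, n] = 0.0
    d0 = np.linalg.norm(xy_obs - np.asarray(xy_est, dtype=float), axis=1)
    b = np.append(exp_cov(d0, sill, rng), 1.0)
    lam = np.linalg.solve(A, b)
    z_hat = lam[:n] @ z_obs
    var = sill + nugget - lam @ b
    return z_hat, var
```

At an observation location the routine reproduces the observed value with zero kriging variance, and the variance grows toward the sill far from all data; in a PK-style workflow the outcrop locations would simply enter as additional conditioning points.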

Evaluation criteria
To keep the model evaluation as simple and consistent as possible, we present the uncertainties in terms of probability density functions by calculating percentiles. The modelling results can be expressed as

B_m(u; ξ_p | A), (8)

where the subscript m denotes the estimation model (OK is ordinary kriging, CK is cokriging, BK is bedrock kriging, and PK is Poisson kriging), u is the estimation location, ξ_p is the percentile, and A denotes all available information.
Three criteria were employed for evaluation of the modelled uncertainty: the mean absolute error M_AE, the accuracy A_C, and the precision P_C (Goovaerts et al. ; Kitterød ). The mean absolute error M_AE is the difference between the observed variable B_obs(u_i) and the median of the estimated variable B_m(u_i; ξ_0.5):

M_AE = (1/n) Σ_{i=1}^{n} |B_obs(u_i) − B_m(u_i; ξ_0.5)|, (9)

where n is the number of observations used in cross-validation.
The accuracy A_C denotes whether the observation falls within a predefined percentile interval around ξ_0.5; for the current case study, we used [ξ_0.25, ξ_0.75]. If the observed value is within the predefined percentile interval, then ϑ = 1; if it is outside, then ϑ = 0. Thus:

A_C = (1/n) Σ_{i=1}^{n} ϑ(u_i). (10)

By this definition, A_C includes the model uncertainty, which means that in locations with less information and thus larger uncertainty, the model can be accurate (i.e. the estimates are between the given percentile intervals) but not very precise, because high uncertainties yield large percentile intervals. An additional criterion for precision is, therefore, necessary.
For this purpose, precision was defined by taking the difference between the upper and lower percentiles of the accurate estimates and summing the inverse differences. By this definition, the modelled precision P_C reads:

P_C = (1/k) Σ_{j=1}^{k} [B_m(u_j; ξ_0.75) − B_m(u_j; ξ_0.25)]⁻¹, (11)

where k is the number of estimates within the predefined accuracy criteria.
In addition to the criteria above, we calculated a simpler probability score of the observations:

P_sc[D_obs(u_j)] = F(u_j)[D_obs(u_j)], (12)

where u_j is the location of the cross-validated observation and F(u_j) is the conditional cumulative distribution function at u_j. The semivariogram models correspond to case F, Tables 3 and 4, in Kitterød (). Estimated model parameters are given in Table 1.
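The four evaluation criteria are straightforward to compute once the percentiles and conditional cdfs are available; a minimal Python sketch (the function names are ours) could read:

```python
import numpy as np

def mae(obs, median_est):
    """Mean absolute error between observations and median estimates (Eq. 9)."""
    return float(np.mean(np.abs(np.asarray(obs) - np.asarray(median_est))))

def accuracy(obs, q_lo, q_hi):
    """Fraction of observations inside [xi_0.25, xi_0.75] (Eq. 10), plus
    the indicator vector theta reused by the precision criterion."""
    obs, q_lo, q_hi = map(np.asarray, (obs, q_lo, q_hi))
    theta = (obs >= q_lo) & (obs <= q_hi)
    return float(theta.mean()), theta

def precision(q_lo, q_hi, theta):
    """Mean inverse interquartile width over the accurate estimates (Eq. 11)."""
    width = np.asarray(q_hi)[theta] - np.asarray(q_lo)[theta]
    return float(np.mean(1.0 / width))

def prob_score(obs, cdfs):
    """Probability score F(D_obs) per location (Eq. 12); values near 0.5
    indicate an unbiased conditional distribution."""
    return [F(x) for F, x in zip(cdfs, obs)]
```

Note that the precision sum runs only over the accurate estimates, so a method can trade accuracy against precision exactly as discussed above.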

Inverse modelling of Poisson's equation
Poisson's equation was applied for trend analysis of the local bedrock topography. This was done by inverse modelling of the constant parameter Ξ in Equation (5), which means that we searched for the Ξ that minimizes the mean absolute error M_AE (Equation (9)). Since only one parameter was involved, this was obtained by simple incremental stepping of the Ξ factor. The sensitivity of Ξ was evaluated by successively leaving one observation out.

The primary cross-validation results in Figure 8 show that, in general, the OK results overestimated observations smaller than the lumped mean sediment thickness and underestimated observations larger than the lumped mean. Most of the OK estimates were within about one standard deviation of the lumped mean observation. The CK and BK estimates gave slightly better results, but the best result was obtained from the PK estimates (i.e. kriging on residuals from Poisson's equation, Equation (7)). The optimal solution of Poisson's equation alone (Equations (5) and (6)) gave significant deviations from the observations (black squares in Figure 8).
A summary of the cross-validation results by leaving one observation out is given in Table 2. (In Table 1, mn indicates either normal-score or lognormal-transformed sediment thickness D, or the horizontal distance to the bedrock outcrop L; κ = ln[D/L]; and X denotes the residuals between Poisson's equation and the observations (Equation (7)). The practical range factor β = log(0.05) for all models.) The PK estimates had the highest accuracy: 86% of the cross-validated observations were within the predefined percentiles. The PK precision was a bit lower than for the CK estimates, but the precision is calculated only for the accurate estimates, which amounted to 61% for the CK estimates.
The mean probability score P_sc for the cross-validated observations was close to 0.5 for all methods, but this criterion must be judged together with its standard deviation, which was smallest for the PK estimates. The mean values of the CRPS behave like M_AE, with the smallest values for the PK estimates. The standard deviation of the CRPS values was also smallest for the PK estimates.
Even though a simple linear regression is not a formal evaluation method, the slope coefficients (Sc in Table 2) show an average underestimation, with slopes between 0.23 and 0.31 for the OK and BK methods and 0.37 for CK. For the PK method, the slope was 0.89, i.e. the regression model gave on average estimates that were 11% less than the observations. The correlation coefficients (R in Table 2) cannot be interpreted as a 'success factor', but they indicate the deviation between a linear regression line and the estimates. In this case, the PK method had estimates closest to the regression line (R = 0.99).

Jackknife cross-validation
Conditional cdfs were calculated in all grid points within the study area for the jackknife cross-validation. These cdfs were obtained by using the parameters listed in Table 1. Poisson's equation was solved by using the optimal load parameter Ξ_o = 5.5 × 10⁻⁵, where all boreholes recorded before 2010 were used for the inverse modelling (Figure 6(b)).
All estimates were conditioned on local observations of the sediment thickness D(u_i). Before presenting the jackknife cross-validation results, some characteristic features in Figure 9 should be noted.
The BK estimates reflected the spatial structure of the secondary variable L where L was large and where the distance to the nearest boreholes was larger than the range (Figure 9(e)). The CK results also showed the spatial structure of the secondary variable L, but less prominently than the BK estimates in areas that were distant from the borehole observations (Figure 9(c)). This can be seen in the west-east cross-section that goes through the two boreholes located south of the study area (Figure 10). In general, the uncertainty in the BK estimates was greater than in the CK estimates. The CK results, however, show greater estimation variances than the BK estimates for locations that were close to the bedrock outcrops but distant from the nearest D observation. The V-shape of the estimates was evident in the CK and the BK estimates, while the PK estimates, on the other hand, had the characteristic U-shape as expected (Figure 9(f)). In the leave-one-out cross-validation, 86% of the PK estimates were accurate (Table 2). In the jackknife cross-validation, this number was reduced to 80% for PK and to about 25% for BK (i.e. 35% over-estimations and 40% under-estimations).

DISCUSSION
The cross-validation results demonstrate that the assumption of second-order stationarity of sediment thickness or bedrock topography is not valid. This conclusion was supported by the relatively low performance of the ordinary kriging cases (OK*, CK*, OK0, and BK0 in Table 2). Including the local trend from the optimal solution of Poisson's equation (PK in Table 2) reduced the estimation uncertainty.
All evaluation criteria applied in this study were consistent with this result. Hence, the PK method was superior to the other methods.
In this study, we explored the use of public data to estimate sediment thickness and bedrock topography. We modelled the estimation uncertainty of the bedrock topography (Equation (2)) by using input data from the GRANADA database and digital information from the local study area (Øvre Eiker, Figure 2). The purpose was to minimize the uncertainty ϵ(u) by using DQMs, DEMs, and point observations of sediment thickness D(u_i). The methods we applied relied on multi-Gaussian statistics, which is a simplification of reality. We also assumed that the information in the DQM could be simplified to a spatial indicator function R(u), where R(u) = 1 in areas where the bedrock was exposed (D ≃ 0 m), and R(u) = 0 in areas covered by deposits (D > 0 m). Below, we discuss briefly some implications of the statistical and geological simplifications we made, but before doing so, it is necessary to comment on some general problems of using public data sources like the GRANADA database for geostatistical modelling.

Clustering and bias of empirical data
Like most geo- and environmental databases, the GRANADA database contains clustered observations. This is due to the fact that spatial data are usually sampled for a specific purpose rather than for uniform spatial coverage.

Cross-validation results
Cross-validation of observations from the study area is summarized above. (In the corresponding figure, the histograms (yellow) are derived from declustered observations; the brown curve (pdf-fit) and the purple curve (cdf-fit) were fitted to an optimal Gaussian pdf.) The scatterplots in Figure 4 indicate a physical relation: if L is large, there are no small D values, and vice versa.
The cumulative histograms of P_sc (Equation (12)) were calculated for all methods.

Public data are always affected by uncertainties. The areal information of the DQMs is not always consistent with independent point information. In at least one case, the bedrock surface was registered 7 m below the ground in the GRANADA database, while according to the DQM, the borehole was located on exposed bedrock. Thus, geographical uncertainties should be included as part of the estimation procedure, and not hidden in the results.

Statistical simplifications
Modelling of the bedrock topography was first done by conditioning on normal-score transformed data and, second, by using the lognormal transform. The normal score transform ranks the observations from minimum to maximum and finds the corresponding entries in a standard normal pdf (Deutsch & Journel ). The lognormal back-transform inflates the estimation variance (cf. Supplementary Equation (A14)). This drawback explains why the lognormal modelling results had low precision. In the present case study, the modelling results based on the normal score transform did not differ very much from the lognormal-transformed results. One reason is that even though the univariate pdf by definition is perfectly Gaussian, the same is not automatically true for the bivariate pdf. This problem is visualized in Figure 14, where the bivariate normal score transforms of D and L were compared with a synthetic bivariate Gaussian variable with the covariance structure given in Table 1 for separation distance h = 0 m. As can be seen, the Gaussian character is roughly preserved, but the bivariate pdf is noisy.
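The rank-based normal score transform can be illustrated with a few lines of Python; `statistics.NormalDist` supplies the standard-normal quantile function, and the Hazen plotting position is one common choice (an assumption here, not necessarily the variant used in this study):

```python
import numpy as np
from statistics import NormalDist

def normal_score(z):
    """Rank-based normal score transform: map each observation to the
    standard-normal quantile of its Hazen plotting position (rank+0.5)/n."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    ranks = np.argsort(np.argsort(z))   # 0 .. n-1 (ties broken by order)
    p = (ranks + 0.5) / n               # Hazen plotting positions in (0, 1)
    nd = NormalDist()                   # standard normal, mean 0, sd 1
    return np.array([nd.inv_cdf(pi) for pi in p])
```

The back-transform interpolates the empirical quantiles at the estimated Gaussian percentiles; as noted above, the univariate scores are Gaussian by construction, while the bivariate distribution need not be.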
It should be underlined that the percentiles of sediment thickness and bedrock topography yield a smoothed version of reality (Figure 10). Thus, the percentiles do not give a realistic picture of the bedrock topography itself. The percentiles are, however, used to visualize the estimation uncertainty of the bedrock topography. If the physical structure of the bedrock topography or the sediment thickness plays an important role, then equally probable realizations should be generated. This subject was beyond the scope of the present paper, and it is, therefore, left for further studies.
Multi-Gaussian methods maximize entropy, which implies that the connectivity of extreme values is underestimated (Gómez-Hernández & Wen ). Subsurface canyons, for example, which are extremes with respect to topography, will not be captured well by multi-Gaussian modelling. This is a drawback in areas with glacial erosion.
Glacial flow may give rise to significant differences in topography over short distances perpendicular to the flow direction, and the longitudinal extent of the valleys may, therefore, be significant. In geological terms, it means that steep and narrow valleys cannot be simulated unconditionally by multi-Gaussian methods.

Geological simplifications
Geological maps hold vast amounts of information that might be utilized to estimate the bedrock topography. In this study, however, the only information we used was the indicator function R(u), where R(u) = 1 in locations where the bedrock was exposed to the atmosphere, and else R(u) = 0. Another relevant question is the relation between the geological setting (i.e. the sediment type) and the bedrock topography. It is common knowledge that the glacial history of the area explains the main character of the landscape. The weight of the glacier suppressed the continental crust, and the isostatic rebound after the melting explains why marine deposits cover the bedrock in the lower part of the study area. Glacial transgression and retreat explain the deposits of the area, but surface mapping of Quaternary deposits does not usually include the thickness of the sediments.
Some sediment types (in specific areas) usually have significant thickness (like marine sediments), while other kinds of deposits are usually patchy and sparse (like colluvial or weathered material). Thus, a relevant project for further research is to match the GRANADA recordings with DQMs to find statistical relations between sediment category and sediment thickness. This kind of information might be utilized for future (soft) conditioning of the bedrock topography. Regarding sediment category and sediment thickness D within the study area, it can be noted that boreholes located in fluvial deposits have the largest sediment thickness D. If there exists a statistically significant relation between sediment category and D, then modelling of D might be improved by using sediment category as an auxiliary function.

CONCLUSIONS
With this paper, we wanted to draw attention to modelling of sediment thickness and the associated bedrock topography (Equation (2)). We explored the opportunities of combining information from the national well database (GRANADA), DQMs, and DEMs to reduce the estimation uncertainty.

Figure 14 | Synthetic multi-Gaussian probability density function (a); normal-score transformed sediment thickness D and the horizontal distance to the nearest bedrock outcrop L (b); and the difference between the synthetic and the empirical pdf (c). The synthetic pdf was generated with zero mean and a covariance function equal to the empirical data for separation distance h = 0 (cf. Table 1).
Several MATLAB scripts were tailored to process the dataset for this study, including declustering of the national data, normal score transformation, semivariogram calculations, and routines for inverse modelling of Poisson's equation. In this case, the small-scale variance (the nugget) was large compared with the total variance (the sill), which explains why the mean absolute error and the estimation variance were large. The large-scale national semivariogram model for the sediment thickness showed a correlation length of about 2 km, and the cross-semivariogram between the sediment thickness and the horizontal distance to the bedrock outcrop had a correlation length of more than 5.5 km (Table 1). In the Poisson approach, Dirichlet conditions were used at the bedrock outcrops; a Neumann condition might be applied by using the gradient of the exposed bedrock topography, but this approach was not evaluated in the current study.
Local trend analysis improves the mean absolute error, but the independent (jackknife) cross-validation shows that the estimation uncertainty remains large. Because the number of recorded boreholes, wells, and geotechnical test drillings increases every year, the conditional estimation uncertainties will be reduced. Point databases (such as GRANADA and/or NADAG (NGU d)) in combination with line and/or areal estimates (e.g. from geophysical surveys) will provide plentiful information in coming years. Thus, an important task for the future is to develop simple and robust estimation procedures that can benefit from the continuously increasing amount of information.

AUTHOR CONTRIBUTIONS
The main part of the work was done by N.O.K. with significant contributions from É.L.