Supervisor 8.5 – Cross Validation


Cross-validation is a technique for validating variograms and optimising estimation parameters. It works much like a dress rehearsal for estimation: although it cannot guarantee or predict the results of the final estimate, it gives an indication of the validity of the variograms and the parameters used for estimation. While the method is most often used to validate variograms and kriging estimation parameters, cross-validation can also be used to assess the effects of using different estimation techniques.

During cross-validation, each composite sample is successively removed from the dataset, and the grade at that sample location is estimated using the remaining dataset, variograms and estimation parameters. The process therefore provides a true (composite) grade and an estimated grade for each data location. The estimated value can then be compared to the true value and assessed in terms of the difference (error), usually using a scatter plot, as shown in Figure 1. In addition to error statistics, calculating a least squares regression line and correlation coefficient allows conditional bias in the estimate to be assessed.
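The leave-one-out loop described above can be sketched in a few lines of Python. This is a minimal illustration only, not Supervisor's implementation: it assumes 2D coordinates, an isotropic spherical variogram, ordinary kriging, and no search neighbourhood (every remaining composite is used for each estimate). The function names `spherical`, `solve`, `ordinary_krige` and `cross_validate` are hypothetical.

```python
import math


def spherical(h, nugget, sill, rng):
    """Spherical variogram value at lag h; sill is the total sill, gamma(0) = 0."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_krige(target, pts, vals, gamma):
    """Ordinary kriging estimate at target from all supplied samples."""
    n = len(pts)
    # Kriging system in variogram form, with a final row and column
    # that force the weights to sum to one.
    a = [[gamma(math.dist(pts[i], pts[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.dist(target, p)) for p in pts] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * v for w, v in zip(weights, vals))


def cross_validate(pts, vals, gamma):
    """Return a (true, estimated) pair for every sample location."""
    pairs = []
    for i in range(len(pts)):
        rest_pts = pts[:i] + pts[i + 1:]      # remove sample i
        rest_vals = vals[:i] + vals[i + 1:]
        pairs.append((vals[i], ordinary_krige(pts[i], rest_pts, rest_vals, gamma)))
    return pairs


# Illustrative variogram parameters only (nugget 0.1, sill 1.0, range 10).
gamma = lambda h: spherical(h, 0.1, 1.0, 10.0)
pairs = cross_validate([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], [1.0, 2.0, 3.0], gamma)
```

Each returned pair corresponds to one point on the scatter plot. A practical implementation would restrict each estimate to a search neighbourhood rather than use every remaining composite.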

Figure 1 Cross-validation scatter plot and statistics
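Summary statistics of the kind that accompany a cross-validation scatter plot can be computed from the true and estimated pairs roughly as follows. This sketch assumes the common convention of plotting the estimate on the x-axis and regressing the true grade on the estimated grade, so that a slope well below 1 hints at conditional bias; the function name `cv_statistics` and the dictionary keys are hypothetical.

```python
import math


def cv_statistics(pairs):
    """Summarise (true, estimated) pairs from cross-validation."""
    n = len(pairs)
    true = [t for t, _ in pairs]
    est = [e for _, e in pairs]
    mean_true = sum(true) / n
    mean_est = sum(est) / n
    # Error is taken as estimate minus true value (sign convention assumed).
    errors = [e - t for t, e in pairs]
    sxx = sum((e - mean_est) ** 2 for e in est)
    syy = sum((t - mean_true) ** 2 for t in true)
    sxy = sum((e - mean_est) * (t - mean_true) for t, e in pairs)
    slope = sxy / sxx                        # least squares: true on estimate
    intercept = mean_true - slope * mean_est
    return {
        "mean_error": sum(errors) / n,
        "mean_abs_error": sum(abs(x) for x in errors) / n,
        "slope": slope,
        "intercept": intercept,
        "correlation": sxy / math.sqrt(sxx * syy),
    }


# Made-up pairs in which every estimate is exactly half the true grade.
stats = cv_statistics([(2.0, 1.0), (4.0, 2.0), (6.0, 3.0)])
```

In this made-up example the correlation is perfect, yet the slope of 2 and the negative mean error show that the estimates are biased, which is why the regression line is inspected alongside the correlation coefficient.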

This technique is particularly useful for assessing whether error is being introduced into estimates by the variograms and parameters used for estimation. It is very effective for comparing estimation results from different variogram models, especially where widely spaced data allow different interpretations. It can also be used to identify domains with variogram modelling errors and to provide a measure of the expected error associated with estimation.

Estimation results can be highly influenced by proximity to data, so it is useful to compare results from well-informed areas with those from sparsely informed areas. For the same reason, it is useful to compare the error obtained when composites from the same drillhole as the estimation point are used in the estimate with the error obtained when those composites are omitted. The technique can therefore be used to help understand, explain and account for any variations seen in the final estimate as a result of data spacing. The effects of data clustering, variability and other estimation parameters, including the estimation technique, can also be compared and assessed using cross-validation in an attempt to minimise conditional bias and error in the final estimate.
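The same-drillhole comparison can be illustrated with a small sketch. To keep it short, the estimator here is inverse distance weighting, used purely as a stand-in for kriging; the sample layout `(x, y, hole_id, grade)`, the function names and the data are all made up for illustration.

```python
import math


def idw(target, samples, power=2.0):
    """Inverse distance weighted estimate; samples are (x, y, hole_id, grade)."""
    num = den = 0.0
    for x, y, _, grade in samples:
        d = math.dist(target, (x, y))
        if d == 0.0:
            return grade
        w = 1.0 / d ** power
        num += w * grade
        den += w
    return num / den


def mean_abs_error(samples, exclude_same_hole):
    """Leave-one-out mean absolute error, optionally omitting same-hole samples."""
    errors = []
    for i, (x, y, hole, grade) in enumerate(samples):
        others = [s for j, s in enumerate(samples)
                  if j != i and not (exclude_same_hole and s[2] == hole)]
        if not others:
            continue  # nothing left to estimate from
        errors.append(abs(idw((x, y), others) - grade))
    return sum(errors) / len(errors) if errors else float("nan")


# Two hypothetical drillholes, each with two adjacent composites.
samples = [
    (0.0, 0.0, "A", 1.0), (0.0, 1.0, "A", 1.0),
    (100.0, 0.0, "B", 5.0), (100.0, 1.0, "B", 5.0),
]
with_same_hole = mean_abs_error(samples, exclude_same_hole=False)
without_same_hole = mean_abs_error(samples, exclude_same_hole=True)
```

With the neighbouring composite from the same hole available, the error is close to zero; once it is omitted, the estimate relies on the distant hole and the error rises sharply. The gap between the two figures gives a feel for how much an apparently good cross-validation result owes to closely spaced downhole data.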

The new cross-validation tool in Supervisor is easy to insert and quick to implement. A cross-validation scatter plot is inserted directly below the modelled variograms. Estimation parameters for kriging are entered in a cross-validation properties tab. A default search neighbourhood is established based on the modelled variogram angles and ranges; however, the user can easily modify it if necessary. The minimum and maximum number of samples are specified, and the user also has the option of setting a maximum number of samples per drillhole or of omitting samples from the same hole as the estimation point. Supervisor automatically calculates the correlation coefficient and mean difference statistics and plots a linear regression line on the graph. In addition, points above and below the 1:1 line are colour coded so that a bias can be seen more readily. Supervisor's copy and paste function allows the user to quickly duplicate the cross-validation graph for different variogram models, or for the same variogram model with different search parameters. Figure 2 shows the layout of the cross-validation tool in Supervisor.

Figure 2 Layout of cross-validation in Supervisor
