# Using exploratory data analysis to ensure you are laying the foundations for a robust resource estimate

A sound geological interpretation is the cornerstone of a good resource estimate. For resource estimation, the underlying assumption is that the data being used comes from a stationary domain. This means that:

• the data is from a single statistical population;
• the mean and variance are consistent throughout the domain;
• the domain is geologically homogeneous;
• the domain has a single orientation of grade continuity.

Statistical analysis of geological data, commonly referred to as exploratory data analysis, can be used to validate this assumption of stationarity and to determine an appropriate estimation technique. It is a critical step in the resource estimation process and is carried out to describe the characteristics of the data and hence of the grade population being estimated.

The following tips will help ensure that you are laying the foundations for a robust resource estimate.

1. Mixed populations can be obscured in a histogram where the statistical populations overlap. They are typically more evident on a probability plot, where mixed populations show up as inflection points.
2. If working with multiple elements, make sure that the domains are validated for all attributes. Scatterplots and correlation coefficients are useful tools for examining relationships. Whilst the scatterplot directly compares paired data and provides a visual indication of the correlation between attributes, the correlation coefficient quantifies the relationship. A correlation coefficient of 1 indicates a perfect positive linear correlation, 0 indicates no linear correlation and -1 a perfect negative linear correlation. It’s important to look at both results to obtain a proper understanding of the relationships between attributes, as outliers can strongly influence the correlation coefficient.
3. Where you have multiple domains, a box and whisker plot provides a quick visual aid for determining which domains form part of the same statistical population. If geologically appropriate, the domains could be combined for variography and estimation. Quantile-quantile (QQ) plots can also be generated between domains or data types for further analysis of the statistical similarity of the populations.
4. Clustering is caused by irregular sampling of a volume through “Directors’ Holes”, fan drilling or infill drilling. Clustering results in extra samples (usually high grades) in the dataset used for statistical analysis. To remove any bias due to clustering, declustering can be carried out on the data. Whilst declustering is not necessary for ordinary kriging estimation, it may be necessary when using other methods such as inverse distance.
Declustering can also be an important preparation step for validation of the estimate, where the model grades are compared with the drillhole grades; for variography, because declustering can change the mean and variance, which affects the variogram; and for simulation, where the sample histogram must be honoured.

Snowden’s Supervisor geostatistical software provides powerful graphing tools to allow quick and easy statistical analysis of geological data with multiple windows, graph overlaying and 3D viewing of the data. Graphs such as histograms, probability plots and mean and variance plots, along with quantile-quantile plots and box and whisker plots, can all be generated for multiple domains and variables with ease. A correlation matrix allows the user to assess the relationship between variables, with scatterplots easily displayed by simply clicking on the desired variables in the matrix. Clustered data is not a problem, as Supervisor also includes a simple-to-use declustering tool, which uses cell-weighted declustering to remove the bias caused by clustering on the population statistics and distribution.
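To make the idea of cell-weighted declustering concrete, the following is a minimal Python sketch. It is not Supervisor's implementation. It assumes a 2D grid of square cells with a single fixed cell size and origin, and the function names, coordinates and grades are all hypothetical, chosen for illustration only. Each occupied cell receives the same total weight, which is shared equally among the samples falling inside it, so closely spaced samples are down-weighted.

```python
from collections import Counter
from math import floor


def cell_decluster_weights(coords, cell_size):
    """Return one declustering weight per sample (weights average 1.0).

    coords: list of (x, y) sample locations (assumed 2D layout).
    cell_size: edge length of the square declustering cells.
    """
    # Assign each sample to the grid cell that contains it.
    cells = [(floor(x / cell_size), floor(y / cell_size)) for x, y in coords]
    counts = Counter(cells)
    n_samples = len(coords)
    n_occupied = len(counts)
    # Every occupied cell gets equal total weight, split among its samples.
    return [n_samples / (n_occupied * counts[c]) for c in cells]


def weighted_mean(values, weights):
    """Mean of values using the supplied weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Illustrative data only: three widely spaced holes in low-grade material
# plus a cluster of three holes drilled into a high-grade zone.
coords = [(5, 5), (15, 5), (5, 15), (15, 15), (16, 16), (17, 14)]
grades = [1.0, 1.2, 0.8, 5.0, 5.5, 4.5]

weights = cell_decluster_weights(coords, cell_size=10)
naive_mean = sum(grades) / len(grades)
declustered_mean = weighted_mean(grades, weights)

print(f"naive mean:       {naive_mean:.2f}")
print(f"declustered mean: {declustered_mean:.2f}")
```

In this toy dataset the three clustered high-grade samples share one cell, so the declustered mean is lower than the naive mean. In practice the result is sensitive to the cell size and grid origin, so a range of cell sizes is normally tested rather than relying on a single choice as this sketch does.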