This is the first in a series of posts examining the Geostatistical Analyst Toolbox in ArcGIS Pro. The Geostatistical Analyst package provides tools to create a continuous surface (or a map) from discrete measured points. I have used these methods a lot in my work as a geologist. After collecting soil samples in the field, the next step is to create a map showing where the high and low values are. The same tools can also be used to create a map of the water table from monitoring wells or to interpolate elevation between point measurements.
There are several major toolsets within the Geostatistical Analyst package, including interpolation, sampling network design, simulation, utilities, and working with geostatistical layers. The tool I will be looking at today is the Radial Basis Function, which is part of the interpolation toolset.
Radial Basis Function
The radial basis function (RBF) is a deterministic interpolation technique. It is an exact interpolator: the surface passes through every measured sample point, so the predicted value at each sample location is identical to the measured value. The ArcGIS Pro help files have a good description of how an RBF works. The radial basis function in ArcGIS Pro has five options for how it interpolates between points. The interpolation used by the RBF can be viewed as a type of artificial neural network. The values predicted by the RBF can be above or below the values observed in the sample set (this is in contrast to the Inverse Distance Weighted function, where predictions cannot fall outside the minimum and maximum of the sample set). This technique is best used where the data change smoothly over long distances. If the values change sharply over short distances, it will introduce more error into the predicted surface.
There are five different options to change how the surface is created between points:
- Thin-plate spline
- Spline with tension
- Completely regularized spline
- Multiquadric function
- Inverse multiquadric function
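To make the "exact interpolation" idea concrete, here is a minimal pure-Python sketch of RBF interpolation with a multiquadric kernel. It is an illustration under stated assumptions, not how ArcGIS Pro implements the tool: the smoothing parameter `c`, the function names and the omission of any bias term are my own simplifications.

```python
import math

def multiquadric(r, c=1.0):
    """Multiquadric kernel sqrt(r^2 + c^2); c is an assumed smoothing parameter."""
    return math.sqrt(r * r + c * c)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_rbf(points, values, kernel=multiquadric):
    """Find one weight per sample so the surface reproduces every sample value."""
    a = [[kernel(math.dist(p, q)) for q in points] for p in points]
    return solve(a, values)

def predict_rbf(points, weights, target, kernel=multiquadric):
    """Predicted value at target: weighted sum of kernels centred on the samples."""
    return sum(w * kernel(math.dist(target, p)) for w, p in zip(weights, points))

# Hypothetical sample locations and measurements
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
values = [1.0, 2.0, 2.0, 3.0, 10.0]
weights = fit_rbf(points, values)
for p, v in zip(points, values):
    print(p, v, round(predict_rbf(points, weights, p), 6))
```

Predicting at the sample locations returns the measured values (up to floating-point error), which is what makes the method exact. Swapping in a different kernel function changes how the surface behaves between the points.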
Creating an RBF Interpolation Layer
To get to the RBF tool, open the analysis toolbox, search for Radial Basis, and click to load the tool. This tool has quite a few options, and many of them are common to the interpolation tools in the Geostatistical Analyst Toolbox. The input feature contains the points with associated measurements. The Z value field is the value that will be used in creating the surface. There are two options for output: you may select either an output geostatistical layer or an output raster. Detailed information on geostatistical layers can be found here. They are similar to other layers but can only be created by the Geostatistical Analyst interpolation methods. They store additional data including the source of the data, the symbology, and the model parameters used in the interpolation. I recommend creating a geostatistical layer rather than a raster, as it retains this important information and I will be using it in later steps for cross-validation.
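The same tool can also be run from Python. The sketch below is how I would expect the call to look, assuming the `arcpy.ga.RadialBasisFunctions` geoprocessing function and the keyword names as I recall them from the Esri documentation. Check them against the help for your version before relying on this. The layer name `soil_samples`, the field `ZINC` and the helper functions are hypothetical.

```python
# Kernel keywords as I recall them from the Esri tool reference (verify for your version)
RBF_KERNELS = {
    "THIN_PLATE_SPLINE",
    "SPLINE_WITH_TENSION",
    "COMPLETELY_REGULARIZED_SPLINE",
    "MULTIQUADRIC_FUNCTION",
    "INVERSE_MULTIQUADRIC_FUNCTION",
}

def build_rbf_args(in_features, z_field, out_ga_layer,
                   kernel="COMPLETELY_REGULARIZED_SPLINE"):
    """Collect the tool parameters, rejecting unknown kernel names early."""
    if kernel not in RBF_KERNELS:
        raise ValueError("Unknown radial basis function: " + kernel)
    return {
        "in_features": in_features,        # point layer holding the samples
        "z_field": z_field,                # field with the measured values
        "out_ga_layer": out_ga_layer,      # output geostatistical layer
        "radial_basis_functions": kernel,  # which of the five kernels to use
    }

def run_rbf(args):
    """Run the tool; arcpy is imported here because it only exists inside ArcGIS Pro."""
    import arcpy
    arcpy.CheckOutExtension("GeoStats")  # Geostatistical Analyst licence
    return arcpy.ga.RadialBasisFunctions(**args)

# Hypothetical layer and field names
args = build_rbf_args("soil_samples", "ZINC", "rbf_thin_plate", "THIN_PLATE_SPLINE")
print(args)
# run_rbf(args)  # uncomment inside an ArcGIS Pro Python environment
```

Keeping the parameters in a dictionary makes it easy to loop over the five kernels and produce one geostatistical layer per kernel for comparison.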
There are four different methods for the search neighbourhood. A search neighbourhood defines how far away, and in which directions, the tool looks for measured values when making a prediction. As you move away from a point, measurements that are further away have lower spatial autocorrelation with that point. By limiting the values used for a prediction, you ensure that it is based on closer and more relevant points. The options for search neighbourhood are:
- One sector
- Ellipse with four sectors
- Ellipse with four sectors and a 45-degree offset
- Eight sectors
The search neighbourhood collects a number of points between the minimum and maximum neighbours. If there are not enough points to satisfy the minimum number of neighbours, the tool will not be able to estimate that location. If you selected an ellipse with sectors, the search neighbourhood will collect the specified maximum and minimum number of points from each of the sectors. By dividing the ellipse into sectors, you ensure that you are taking samples from every direction around the estimation point. You can also add a search direction if there is a general directional trend in the data. More detailed information on search neighbourhoods in ArcGIS Pro can be found here.
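The sector idea can be sketched in a few lines of Python. This is a simplified illustration rather than the ArcGIS Pro algorithm: it uses a circle instead of an ellipse, applies the minimum to the total number of neighbours rather than per sector, and all function and parameter names are hypothetical.

```python
import math

def sector_neighbours(points, target, radius, n_sectors=4,
                      max_per_sector=3, min_total=2, offset_deg=0.0):
    """Return indices of the nearest points in each sector, or None if too few."""
    width = 360.0 / n_sectors
    sectors = [[] for _ in range(n_sectors)]
    for idx, (x, y) in enumerate(points):
        dx, dy = x - target[0], y - target[1]
        d = math.hypot(dx, dy)
        if d > radius:  # outside the search circle
            continue
        # Direction from the target to the point, rotated by the sector offset
        angle = (math.degrees(math.atan2(dy, dx)) - offset_deg) % 360.0
        sectors[int(angle // width) % n_sectors].append((d, idx))
    chosen = []
    for sec in sectors:
        # Keep only the closest points in this sector
        chosen.extend(idx for _, idx in sorted(sec)[:max_per_sector])
    return chosen if len(chosen) >= min_total else None

# Hypothetical sample locations around an estimation point at the origin
pts = [(1, 1), (2, 2), (-1, 1), (-1, -1), (1, -1), (10, 10)]
print(sector_neighbours(pts, (0, 0), radius=5, max_per_sector=1))
# → [0, 2, 3, 4]
```

With one neighbour per sector, the second point in the north-east quadrant is dropped in favour of a point from each of the other directions, and the distant point at (10, 10) is ignored entirely. Passing `offset_deg=45.0` mimics the 45-degree offset option.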
After the tool has completed you are provided with a coloured interpolation layer. I ran the tool for each of the five different radial basis functions to produce the following results.
You can see how the functions treated high points and low points differently and how the smoothness of the interpolation changes. Overall, the general pattern of highs and lows is very similar, but there are key differences. It also highlights how the predicted values match the measured values exactly at the sample points.
Using the Cross-validation Tool in ArcGIS Pro
The next step in a complete analysis is to cross-validate your model. The point of cross-validation is to give you an idea of how well your model performed. The cross-validation process goes through each point in the dataset in turn. It removes that point, then uses the model (the radial basis function in this example) with the remaining points to predict the value at the removed location. The predicted value for that point is then compared to the measured value to get an idea of the difference between the measured and modelled data.
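The leave-one-out loop described above can be sketched as follows. To keep the example short and self-contained, I use a simple inverse distance weighted predictor as a stand-in for the radial basis function, so the numbers will not match what ArcGIS Pro reports. The function names and sample data are hypothetical, and the error is taken as predicted minus measured.

```python
import math

def idw_predict(train_pts, train_vals, target, power=2.0):
    """Stand-in predictor: inverse distance weighted average of the training values."""
    num = den = 0.0
    for p, v in zip(train_pts, train_vals):
        d = math.dist(p, target)
        if d == 0:
            return v  # target coincides with a sample
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def leave_one_out(points, values, predict=idw_predict):
    """Remove each point in turn and predict it from all the others."""
    results = []
    for i in range(len(points)):
        train_pts = points[:i] + points[i + 1:]
        train_vals = values[:i] + values[i + 1:]
        predicted = predict(train_pts, train_vals, points[i])
        # (measured, predicted, error) with error = predicted - measured
        results.append((values[i], predicted, predicted - values[i]))
    return results

# Hypothetical samples along a line, with a high value in the middle
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0)]
values = [1.0, 3.0, 5.0]
for measured, predicted, error in leave_one_out(points, values):
    print(measured, round(predicted, 3), round(error, 3))
```

Any predictor with the same `(train_pts, train_vals, target)` signature can be passed in through the `predict` argument, so the same loop works for comparing different interpolation methods.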
To load the cross-validation tool, search for it under the analysis tools. The dialogue is simple: you have to pick a geostatistical layer, and you have the option to produce an output file containing the original values, the predicted values and the error.
For a radial basis function geostatistical layer, the cross-validation tool gives you the mean error and the root mean square error. The mean error is the average difference between the measured and predicted values; because positive and negative errors cancel out, it mainly tells you whether the model is biased high or low. The root mean square error indicates how closely the model predicted the measured values. The closer both values are to 0, the better your model is.
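Both statistics are easy to compute yourself from the output file of measured and predicted values. Here is a minimal sketch, assuming the error is defined as predicted minus measured. The function name and numbers are made up for illustration.

```python
import math

def cross_validation_stats(measured, predicted):
    """Return (mean error, root mean square error) for paired values."""
    errors = [p - m for m, p in zip(measured, predicted)]
    mean_error = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mean_error, rmse

# Hypothetical measured and predicted values
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 2.0, 6.0]
mean_error, rmse = cross_validation_stats(measured, predicted)
print(mean_error, round(rmse, 4))
# → 0.5 1.2247
```

In this example the individual errors are 1, 0, -1 and 2. The mean error of 0.5 hides most of that spread because the first and third errors cancel, which is why the root mean square error is the more useful number for comparing models.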
Here is a table comparing the mean error and root mean square error for the five radial basis functions I used.
| Method | Thin Plate Spline | Inverse Multiquadric | Multiquadric | Spline with Tension | Completely Regularized Spline |
The thin plate spline had the lowest root mean square error, although it was very close to the results produced by the multiquadric and spline with tension methods.
If you produced the output file containing each point and its predicted value you can make a plot showing the measured vs. predicted values. This gives you an idea of how close the values are and you can see if your model is working better at the higher or lower range of the estimated values.
From this plot we can see that the very high values are not being predicted by the model. This makes sense as these values are much higher than the surrounding values and change over a short distance. This type of sample is not ideal for predicting using the radial basis function.
In this tutorial we went over using the radial basis function in ArcGIS Pro. The radial basis function is part of the Geostatistical Analyst Toolbox. We used this tool to create a predictive surface of soil sample values. There are a lot of different settings available to make sure you are using an analysis which matches your data. The search neighbourhood settings allow you to configure the number and location of points used in a prediction. By changing the radial basis function you change how the interpolation surface is created. Finally, by cross-validating your model you can get an idea of how well it works. This process removes each point in the dataset in turn and predicts it using the rest of the dataset.