For example, station B has data from 1970-1974 and 1979-1980. So the data from 1975-1978 are missing!
But if I now calculate the average at this station and interpolate it with the data that includes a data set (A) for all 10 years, I will distort my model! Is there a simple mathematical solution for weighting the data?
Is it a solution to multiply the B Average Value with 0.7, so the weighted average value will be 3.5?
不,这不是一个解决方案。这样做可以有效地将0值分配给没有数据的站,这是错误的做法。
如果你想填补缺失的数据,你必须应用数据插值技术。这种技术只有在采样点之间存在相关性时才有效。
为了填补这些空白,你必须做额外的计算。你使用哪种技术取决于你想要的结果,或者什么是现在被认为是数据适用领域的最新技术状态。< / p >
Until the mid 1990s to early 2000s, when estimating values of grade for ore reserves, using block modelling methods, inverse distance weighting methods were used. Initially, $\frac{1}{d}$ was used as this was a relatively quick calculation to perform. It was later discovered that inverse distance squared $(\frac{1}{d^2})$ gave better results. In some instances, inverse distance to the power $n$, $(\frac{1}{d^n})$, where $n$ generally ranged between 2 and 2.5 was also used. Generally, however, inverse distance squared became industry "standard" for a period of time.
The benefit of inverse distance squared weighting compared to simple inverse distance weighting was it gave less weighting to more distance samples. This is based on the logic that samples that are close to one another are more likely to have been deposited under more similar conditions and thus would more likely be related to one another.
Since the early 2000s inverse distance weighting methods have been replaced by kriging. One of the main differences between kriging and all forms of inverse distance weighting is with kriging, the sum of the weights equals 1. With the other methods, there is no limit, or specification to what the sum of the weight should be.
对于长达一年的数据差距,任何不考虑环境强迫(即模型)的插值都是不充分的比较。例如,在A中包含但在B中不包含的极端年份将极大地影响您的比较。我宁愿掩盖A数据集,以匹配B中可用的年份,或者使用基于过程的模型,用环境强迫数据输出所需的同位素数据来填充缺失的年份