On the cause of the non-Gaussian distribution of residuals in geomagnetism
Abstract
To describe errors in the data, Gaussian distributions naturally come to mind. In many practical instances, indeed, Gaussian distributions are appropriate. In the broad field of geomagnetism, however, it has repeatedly been noted that residuals between data and models often display much sharper distributions, sometimes better described by a Laplace distribution. In this study, we make the case that such non-Gaussian behaviours are very likely the result of what is known as mixture of distributions in the statistical literature. Mixtures arise as soon as the data do not follow a common distribution or are not properly normalized, the resulting global distribution being a mix of the various distributions followed by subsets of the data, or even individual datum. We provide examples of the way such mixtures can lead to distributions that are much sharper than Gaussian distributions and discuss the reasons why such mixtures are likely the cause of the non-Gaussian distributions observed in geomagnetism. We also show that when properly selecting subdata sets based on geophysical criteria, statistical mixture can sometimes be avoided and much more Gaussian behaviours recovered. We conclude with some general recommendations and point out that although statistical mixture always tends to sharpen the resulting distribution, it does not necessarily lead to a Laplacian distribution. This needs to be taken into account when dealing with such non-Gaussian distributions.
Origin : Publisher files allowed on an open archive