Let’s say that we make measurements of a large group of people. Such measurements might include height, weight, IQ, blood pressure, credit score, hair length, preference, personality traits, etc. You can imagine obtaining a mass of data about people like this where each measurement is taken to lie on a continuous scale. Typically the distribution of the population along each one of these measurements will be a bell curve. Most people have average height for example. The interesting fact is that the more measurements you take, the less likely it is that you will find anyone who is simultaneously average along all the dimensions that you consider. All of us are abnormal if you consider enough personal attributes.
This brings us to the shell property of high dimensional spaces.
Let’s consider a normal (Gaussian) distribution in D-dimensions. In 1D it is obvious that all the probability bulk is in the middle, near zero. In 2D the peak is also in the middle. One might imagine that for any number of dimensions this would continue to hold, but this is false. The shell property of high dimensional spaces shows that the probability mass of a D-dimensional Gaussian distribution where D>>3 is all concentrated in a thin shell at a distance of sqrt(D) away from the origin, and the larger the value of D, the thinner that shell becomes. This is because the volume of the shell grows exponentially with D compared with the volume around the origin, and so with large D there is essentially zero probability that a point will end up near the center: Mr Average does not exist. Continue reading