This is part 3 of my series of posts on the statistics of financial markets. Part 1 is here.

In previous posts, I have found that working in log prices makes sense and that the double exponential distribution is a good fit to price change data. In this post, I will look at correlations over time in price changes.

Let’s ask a simple question: Does yesterday’s price change predict today’s price change? To analyze this, I made a scatter plot with each price change on the x-axis and the very next price change on the y-axis. I did this for the bitcoin price, and also for some NASDAQ stocks (here, I show Microsoft).

Looking at the red scatter plot for the bitcoin case, it’s clear that one price change is not very predictive of the next. The points look almost uncorrelated. However, the correlation coefficient (r value) is actually 0.06 for 30-minute intervals, meaning that future price changes are slightly dependent on previous changes. I found that longer time intervals between data samples for bitcoin gave larger r values, with the 1-hour interval giving a coefficient of 0.1. The daily stock movements, however, had much larger r values, with Microsoft having a coefficient of 0.31.
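As a minimal sketch of how such an r value could be computed: take the log prices, difference them to get price changes, and correlate each change with the one that follows it. The function name `lag1_correlation` and the use of NumPy are my assumptions for illustration; this is not necessarily how the numbers above were produced.

```python
import numpy as np

def lag1_correlation(prices):
    """Pearson correlation between each log price change and the next one.

    `prices` is a hypothetical 1-D sequence of positive prices sampled
    at a fixed interval (e.g. every 30 minutes, or daily).
    """
    log_prices = np.log(np.asarray(prices, dtype=float))
    changes = np.diff(log_prices)   # change over each interval
    x = changes[:-1]                # "this" change (scatter plot x-axis)
    y = changes[1:]                 # the very next change (y-axis)
    return np.corrcoef(x, y)[0, 1]
```

Resampling the same price series at a longer interval before calling the function is one way to compare r values across time scales.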

To visualize the dependence, I divided the x-axis into histogram bins and plotted, as the green lines, the mean of the y-axis data points in each bin. Near the center the mean slopes upward, which means that if the price is going up then it is likely to keep going up, and vice versa. (The zigzag green lines outside the center are an artifact of the limited number of samples in those regions.)
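The binned mean can be sketched as follows. This assumes equal-width bins along the x-axis; the bin count, the function name `binned_mean`, and the choice to return per-bin sample counts are all mine. The counts are useful for spotting the sparsely populated outer bins where the mean becomes noisy.

```python
import numpy as np

def binned_mean(x, y, n_bins=20):
    """Mean of y within equal-width bins of x.

    Returns (bin centers, mean of y per bin, sample count per bin).
    Bins with no samples get a mean of NaN.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # digitize gives 1..n_bins for interior points and n_bins + 1 for the
    # maximum; shift to 0-based and clip so the maximum lands in the last bin.
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    sums = np.bincount(idx, weights=y, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    means = np.full(n_bins, np.nan)
    nonempty = counts > 0
    means[nonempty] = sums[nonempty] / counts[nonempty]
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, means, counts
```

Here `x` would be the array of price changes and `y` the array of next price changes, both of the same length.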

Since we have established that there is momentum in the market, this suggests a naive trading algorithm: We monitor the stock once per day. If it went up in the last 24-hour period and we are not in the market, we buy in. If it went down in the period and we are in the market, we sell out. This means we get the gain from multiple consecutive rises (apart from the first rise, which is what triggers the purchase), but don’t take the hit from multiple consecutive falls (apart from the first fall, which is what triggers the sale).
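The rule above can be sketched as a small backtest that ignores all trading costs. Assumptions of mine, not stated in the rule: we start out of the market, trades execute at the same price used to observe the change, and an unchanged price leaves the position as it is. The function name `momentum_backtest` is hypothetical.

```python
import numpy as np

def momentum_backtest(prices):
    """Naive momentum rule on a series of once-per-period prices.

    Returns (total log return earned while in the market, number of trades).
    Costs are ignored.
    """
    changes = np.diff(np.log(np.asarray(prices, dtype=float)))
    in_market = False
    total_log_return = 0.0
    n_trades = 0
    for change in changes:
        # We only earn this period's change if we already held the asset.
        if in_market:
            total_log_return += change
        # Decide the position for the next period.
        if change > 0 and not in_market:
            in_market = True    # buy after a rise
            n_trades += 1
        elif change < 0 and in_market:
            in_market = False   # sell after a fall
            n_trades += 1
    return total_log_return, n_trades
```

Comparing the result with the buy-and-hold log return, `np.log(prices[-1] / prices[0])`, shows whether the rule would have helped on a given series before costs.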

But does this work? The quick answer is that it would if it were not for trade commissions and bid-ask spreads. In fact, it’s probably the case that commissions and spreads are the reason that these correlations remain in the data, and if trading were free, these statistics would go away as everyone tried to use this algorithm.
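To get a feel for how costs eat into the gain, here is a rough sketch under a simple cost model that I am assuming for illustration: every trade loses a fixed fraction of the position (commission plus roughly half the bid-ask spread), so each trade adds log(1 - cost) to the log return. Real fee structures may differ, and the function names are hypothetical.

```python
import math

def net_log_return(gross_log_return, n_trades, cost_fraction):
    """Log return after paying a proportional cost on every trade."""
    return gross_log_return + n_trades * math.log(1.0 - cost_fraction)

def break_even_cost(gross_log_return, n_trades):
    """Cost fraction per trade at which the net log return is exactly zero."""
    return 1.0 - math.exp(-gross_log_return / n_trades)
```

Because the algorithm trades every time the direction of the price flips, the number of trades is large, so the break-even cost per trade can be very small even when the gross return looks attractive.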

This is a work in progress; I’ll be updating it.