# Rescaled Range Analysis: A Method for Detecting Persistence, Randomness, or Mean Reversion in Financial Markets

*Editor’s note: Thanks to the diligence of Armin Grueneich this post has been amended to reflect the addition of step #5, below, in the calculation of the rescaled range.*

Rescaled range analysis is a statistical technique designed to assess the nature and magnitude of variability in data over time. In investing, rescaled range analysis has been used to detect and evaluate the amount of persistence, randomness, or mean reversion in financial market time series data. Insight of this kind into financial data naturally suggests investment strategies.

Originally invented for the field of hydrology by Harold Edwin Hurst, the technique was developed to predict Nile River flooding in advance of the construction of the Aswan High Dam. The dam needed to fulfill multiple and divergent purposes: serving as a store of water to protect farmers downriver against drought, and protecting those same farmers from the typical annual flooding. Rainfall levels in Central Africa were seemingly random each year, yet the Nile River flows seemed to show autocorrelation. That is, the river’s flow in one time period seemed to influence its flow in subsequent periods. Hurst needed to see whether there was a hidden long-term trend (statistically known as a *long-memory process*) in the Nile River data that might guide him in building a better dam for Egypt.

Does this sound familiar? A time series of varying levels that is seemingly random, but in which a hidden long-term trend is suspected. Not surprisingly, rescaled range analysis had its moment in the financial analysis sun in the mid-1990s, when chaos theory as applied to financial markets was a hot topic. Chaos theory is a branch of science that studies the interconnectedness of events that otherwise, on the surface, seem random.

Closely associated with rescaled range analysis is the Hurst exponent, indicated by *H*, also known as the “index of dependence” or the “index of long-range dependence.” A Hurst exponent ranges between 0 and 1, and measures three types of trends in a time series: persistence, randomness, or mean reversion.

- If a time series is persistent, with *H* > 0.5, then a future data point is likely to be like a data point preceding it. So an equity with *H* of 0.77 that has been up for the past week is more likely to be up next week as well, because its Hurst exponent is greater than 0.5.
- If the Hurst exponent of a time series is *H* < 0.5, then it is likely to reverse trend over the time frame considered. Thus, an equity with *H* = 0.26 that was up last month is more likely than chance to be down next month.
- Time series that have Hurst exponents near 0.5 display a random (i.e., stochastic) process, in which knowing one data point does not provide insight into predicting future data points in the series.
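The three cases above can be sketched as a small helper. This is only an illustration: the function name `classify_hurst` and the `tolerance` that decides what counts as "near 0.5" are assumptions, since the post does not give a cutoff.

```python
def classify_hurst(h: float, tolerance: float = 0.05) -> str:
    """Label a Hurst exponent using the three regimes described above.

    `tolerance` is a hypothetical cutoff for "near 0.5"; the post
    does not specify one.
    """
    if not 0.0 <= h <= 1.0:
        raise ValueError("the Hurst exponent is expected to lie between 0 and 1")
    if abs(h - 0.5) <= tolerance:
        return "random"
    return "persistent" if h > 0.5 else "mean-reverting"


# The three values mentioned in this post.
for h in (0.77, 0.26, 0.49):
    print(h, classify_hurst(h))
```

A wider or narrower `tolerance` changes how many series get labeled random, so treat the labels as a reading aid rather than a statistical test.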

So what are the steps to conducting a rescaled range analysis and to estimating the Hurst exponent? As an instructional example, please reference the spreadsheet of the rescaled range analysis of daily return data for the S&P 500 Index from 3 January 1950 through 15 November 2012.

**Rescaled Range Analysis Steps**

1. **Choose your time series.** Do you want to analyze fluctuations in the yield curve? West Texas sweet crude? Apple (AAPL) or Google (GOOG) stock? Or the Dow Jones Industrial Average (DJIA)? Here I am going to select the S&P 500’s daily returns.

2. **Choose your ranges.** Rescaled range analysis requires the analyst to examine the data over multiple lengths of time (i.e., ranges), and the choice of lengths is arbitrary. In the example of the S&P 500, there are 15,821 daily returns. So I chose the following ranges, dividing the series by successive powers of two:

**a.** Size of range is the entire data series = one range of 15,821 daily returns.

**b.** Size of each range is 1/2 of the entire data series = 15,821 ÷ 2 = two ranges of either 7,911 or 7,910 daily returns.

**c.** Size of each range is 1/4 of the entire data series = 15,821 ÷ 4 = four ranges of either 3,956 or 3,955 daily returns.

**d.** Size of each range is 1/8 of the entire data series = 15,821 ÷ 8 = eight ranges of either 1,978 or 1,977 daily returns.

**e.** Size of each range is 1/16 of the entire data series = 15,821 ÷ 16 = sixteen ranges of either 989 or 988 daily returns.

**f.** Size of each range is 1/32 of the entire data series = 15,821 ÷ 32 = thirty-two ranges of either 495 or 494 daily returns.
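As a rough sketch of Step 2, the splitting can be done in a few lines of Python. The function name `split_into_ranges` is hypothetical, the data is a placeholder of the same length as the S&P 500 example, and giving the extra element to the earliest ranges is an assumption (the spreadsheet may distribute the remainder differently).

```python
def split_into_ranges(series, parts):
    """Split `series` into `parts` contiguous ranges of near-equal size.

    When the length does not divide evenly, the first few ranges get
    one extra element (an assumption, not taken from the spreadsheet).
    """
    base, extra = divmod(len(series), parts)
    ranges, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        ranges.append(series[start:start + size])
        start += size
    return ranges


# Placeholder data with the same length as the S&P 500 example.
returns = list(range(15_821))
for parts in (1, 2, 4, 8, 16, 32):
    sizes = sorted({len(r) for r in split_into_ranges(returns, parts)}, reverse=True)
    print(parts, sizes)
```

The printed sizes should match the range sizes listed in items a through f, and the six splits together produce 63 ranges.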

3. **Calculate the mean for each range.** For each of the ranges, calculate a mean per the formula below.

$$m_s = \frac{1}{n}\sum_{i=1}^{n} X_i$$

*Note: In the above example of the S&P 500 there are 1 + 2 + 4 + 8 + 16 + 32 = 63 means calculated, one for each range.*

Where:

*s* = series (Series 1 is the whole data series for the S&P 500, or 15,821 daily returns; Series 5 is 16 ranges of either 989 or 988 daily returns.)

*n* = the size of the range for which you are calculating the mean

*X* = the value of one element in the range

4. **Create a series of deviations for each range.** Create another time series of deviations using the mean for each range. *Note: In the case of the S&P 500, there will be six new sets of “deviations from the mean” ranges, given the six categories of ranges chosen in Step 2 above (i.e., ranges a, b, c, d, e, and f).*

$$Y_t = X_t - m_s, \quad t = 1, 2, \ldots, n$$

Where:

*Y* = the new time series adjusted for deviations from the mean

*X* = the value of one element in the range

*m_s* = the mean for the range in series *s*, calculated in Step 3 above

5. **Create a series which is the running total of the deviations from the mean.** Now that you have a series of deviations from the mean for each range, you need to calculate a running total for each range’s deviations from the mean.

$$y_t = \sum_{i=1}^{t} Y_i, \quad t = 1, 2, \ldots, n$$

Where:

*y* = the running total of the deviations from the mean for each series

*Y* = the time series adjusted for deviations from the mean
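Steps 3 through 5 can be sketched together for a single range. The function name `running_total_of_deviations` and the sample values are made up for illustration; they are not taken from the spreadsheet.

```python
from itertools import accumulate


def running_total_of_deviations(values):
    """Steps 3 to 5 for one range: mean, deviations, running total."""
    m = sum(values) / len(values)          # Step 3: mean of the range
    deviations = [x - m for x in values]   # Step 4: Y = X - m
    return list(accumulate(deviations))    # Step 5: y = cumulative sum of Y


# Made-up returns for illustration only (not S&P 500 data).
sample = [0.01, -0.02, 0.03, 0.00, -0.01]
print(running_total_of_deviations(sample))
```

Because the deviations are measured from the range’s own mean, the last value of the running total is zero up to floating-point rounding, which makes a handy sanity check.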

6. **Calculate the widest difference in the running total of deviations.** Find both the maximum and minimum values in the running total of deviations (from Step 5) for each range. Take the difference between the maximum and minimum in order to calculate the widest difference. *Note: For the S&P 500 example, there are 63 calculations, one for each of the 63 ranges.*

$$R = \max(y_1, \ldots, y_n) - \min(y_1, \ldots, y_n)$$

Where:

*R* = the widest spread in each range

*y* = the value of one element in the running total of deviations from the mean (Step 5)

7. **Calculate the standard deviation for each range.** *Note: There will be 63 standard deviations, one for each range.*

8. **Calculate the rescaled range for each range in the time series.** This step creates a new measure for each range in the time series that shows how wide the range is, measured in standard deviations.

$$R/S = \frac{R}{\sigma}$$

Where:

*R*/*S* = the rescaled range for each range in the time series

*R* = the range created in Step 6 above

*σ* = the standard deviation for the range under consideration (Step 7)
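Putting Steps 3 through 8 together for one range might look like the sketch below. Two assumptions are worth flagging: the spread *R* is taken over the running total from Step 5 (the standard formulation of rescaled range analysis), and *σ* is computed as the population standard deviation, since the post does not say whether the population or sample version is used. The function name and sample data are hypothetical.

```python
from itertools import accumulate
from statistics import pstdev


def rescaled_range(values):
    """Return R/S for a single range (Steps 3 through 8).

    Note: a range of identical values has zero standard deviation,
    so this sketch would raise ZeroDivisionError for it.
    """
    m = sum(values) / len(values)                      # Step 3
    running = list(accumulate(x - m for x in values))  # Steps 4 and 5
    r = max(running) - min(running)                    # Step 6: widest spread
    s = pstdev(values)                                 # Step 7: population std (assumption)
    return r / s                                       # Step 8


# Made-up returns for illustration only (not S&P 500 data).
sample = [0.01, -0.02, 0.03, 0.00, -0.01]
print(round(rescaled_range(sample), 4))
```

Swapping `pstdev` for `statistics.stdev` gives the sample standard deviation instead; for ranges with hundreds of returns the difference is small.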

9. **Average the rescaled range values for each region to summarize each range.** For each region, average the rescaled range (*R*/*S*) values. Using the S&P 500 data as an example, we have the following *R*/*S* values for each of the four ranges of ~3,955 in size:

“Range 1/4”, part 1, *R*/*S*: 83.04

“Range 1/4”, part 2, *R*/*S*: 63.51

“Range 1/4”, part 3, *R*/*S*: 84.16

“Range 1/4”, part 4, *R*/*S*: 88.09

Average of the four *R*/*S* values for “Range 1/4” = (83.04 + 63.51 + 84.16 + 88.09) ÷ 4 = 79.70

For the S&P 500 we have the following values for the rescaled ranges:

Now that you have rescaled each range in the time series, you can calculate the Hurst exponent, *H*, that will summarize in one number the degree of persistence, randomness, or mean reversion in your time series.

**Calculating the Hurst Exponent Steps**

1. **Calculate the logarithmic values for the size of each region and for each region’s rescaled range.** For example, consider the above S&P 500 data:

2. **Plot the logarithm of the size (x axis) of each series versus the logarithm of the rescaled range (y axis).** This results in a graph that looks something like this one for the S&P 500:

**Rescaled Range Analysis of the S&P 500 (3 January 1950 to 15 November 2012)**

3. **Calculate the slope of the data to find the Hurst exponent.** *H* is the slope of the plot of each range’s log (*R*/*S*) versus each range’s log (size). For the S&P 500 from 3 January 1950 to 15 November 2012, *H* is 0.49. Recall that a value this close to 0.5 means that the S&P 500 demonstrates randomness.
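For readers who prefer code to a spreadsheet, here is a minimal end-to-end sketch of the procedure. It is not the spreadsheet’s implementation. The helper names, base-10 logarithms, the population standard deviation, and the ordinary least-squares slope are all assumptions, and the input is synthetic Gaussian noise rather than S&P 500 returns.

```python
import math
import random
from itertools import accumulate
from statistics import mean, pstdev


def rescaled_range(values):
    """R/S for one range: spread of the running total over the std."""
    m = mean(values)
    running = list(accumulate(x - m for x in values))
    return (max(running) - min(running)) / pstdev(values)


def split_into_ranges(series, parts):
    """Split a series into `parts` contiguous, near-equal ranges."""
    base, extra = divmod(len(series), parts)
    out, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        out.append(series[start:start + size])
        start += size
    return out


def hurst_exponent(series, divisors=(1, 2, 4, 8, 16, 32)):
    """Estimate H as the slope of log(average R/S) against log(size)."""
    xs, ys = [], []
    for parts in divisors:
        chunks = split_into_ranges(series, parts)
        xs.append(math.log10(mean([len(c) for c in chunks])))
        ys.append(math.log10(mean([rescaled_range(c) for c in chunks])))
    # Ordinary least-squares slope of ys on xs.
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Synthetic, independent Gaussian "returns" (not market data).
random.seed(0)
noise = [random.gauss(0.0, 0.01) for _ in range(4096)]
print(round(hurst_exponent(noise), 2))
```

The base of the logarithm does not change the slope, so natural logs would give the same *H*. With only six points on the plot, and only one or two ranges behind the largest sizes, the estimate is noisy, so a result somewhat away from 0.5 on random data is not by itself evidence of persistence or mean reversion.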

Knowing *H* suggests some hypothetical trading strategies. For example, stocks with *H* > 0.5 (that is, persistence) and positive price appreciation would be attractive to a growth manager wanting future capital appreciation. Stocks with *H* < 0.5 whose prices have been declining for some time, by contrast, suggest an eventual price trend reversal to a value investor.

*Please note that the content of this site should not be construed as investment advice, nor do the opinions expressed necessarily reflect the views of CFA Institute.*

Photo credit: ©iStockphoto.com/ugurhan

I have read your write-up with great interest. Have you carried out the Hurst exponent calculations, and if so, for which stocks and markets? How can I use this for shares listed on the Colombo Stock Exchange? Just one more question: is this something like mean reversion?

Hello Hisham,

Thanks for your comment. Yes, the Hurst exponent calculations for the S&P 500 will appear in a subsequent post here on CFA Institute’s The Enterprising Investor blog.

In order to use rescaled range analysis for the Colombo Stock Exchange you simply need a long time series of data. I would recommend daily closing prices over the course of at least a decade. Then follow the steps I describe in the above blog post. Also, there is a link in the above post that allows you to download an Excel spreadsheet that demonstrates each of the calculations.

A Hurst exponent of less than 0.5 suggests mean reversion. The lower the Hurst exponent the greater the mean reversion.

I hope that helps!

With smiles,

Jason

Hello Jason,

Very interesting work. I am preparing an article to publish. I was interested to know if you could run the Hurst technique on the S&P for me for specific time periods? Happy to provide you with more info, and happy to cite you should the results be supportive.

Paul

Hi Paul,

Thank you for the kind offer; however, I am going to decline and blame it on my lack of time.

If you follow the link in the article above – http://cfa.wpengine.netdna-cdn.com/investor/files/2013/01/SP-500-Rescaled-Range-Analysis.xlsx – you can download the spreadsheet that contains all of my data as well as all of the above calculations. Hopefully from that you can conduct the analysis for the time periods you need for your research.

With smiles!

Jason

Hello Jason

Your article and accompanying workbook are well done and very helpful.

I am testing a quant strategy using all listed ETFs in the US, and I was looking for a supplementary measure to help predict short-term price movements.

I will let you know if this technique works consistently across the 12 years that I am doing my backtesting.

Thanks again for your time in putting this together.

Best wishes

Savio

Hello Savio,

Thanks very much for your comments. I hope that you found this piece useful. And please do let us know what you find when applying ‘rescaled range analysis’ to your time series.

With smiles!

Jason

Excellent article, and very well explained. The attached example was really helpful too. Thank you very much.

Hi Jason.

Quick question. At the end of point 9 we have a matrix with region count, average data point count, and average rescaled range.

If I understand correctly, by using these data and logarithms we should be able to get the data in point one under “Calculating the Hurst Exponent Steps”.

But I don’t understand what I have to do to get 4.2 and 2.18 from 1, 15821 and 151.77.

Do you see where my problem is and can you help me?

Thank you.

Hi,

Can you upload the document that contains all of your data as well as all of the calculations?

Thanks

Hello Andris and Dani,

The downloadable spreadsheet has been a part of the post from the beginning. Click on the language in the post above that says, “the spreadsheet of the rescaled range analysis of daily return data for the S&P 500 Index from 3 January 1950 through 15 November 2012.” Once you open the spreadsheet you should be able to see all of the data – many thousands of daily returns for the S&P 500 – plus all of the calculations. I just downloaded it myself to confirm this is possible.

If that doesn’t work, please let me know and I will see about giving a more in-depth answer.

Thanks for reading!

Jason

Hi Jason.

Why do you run your calculations on the Returns column? What’s the reason not to use the Adj Close column?

Thank you.

Andris.

In your opinion, is there anything better out there than Hurst for determining trend?

Hi Jason,

Thank you for your clear explanation as to how to calculate the rescaled range values and the Hurst exponent.

I saw that you had included a link to an Excel spreadsheet that allows one to calculate the Hurst exponent, but the link you provided unfortunately no longer works.

Could you please update this link? I’m trying to calculate R/S values for different currency pairs, and it would be a big help to have a professional spreadsheet like yours to do the calculations.

Thanks,

Adrian

Hi Adrian,

Thank you for pointing out that the link did not work. I will endeavor to update the spreadsheet and ensure that the link is restored.

With smiles,

Jason

Hello Jason,

Thanks for the spreadsheet, very helpful. Just one query, Peters (1991) has the fractal as 1/H rather than 2-H. Is this subjective?

Thanks,

Aaron.

Hello Aaron,

I have not read Peters’ work since about 1995 so am not sure what he uses to estimate the fractal dimension. In preparing the above blog post I used several references that were in agreement with the method I described above. However, I have also found other folks using an entirely different method to estimate the fractal dimension. So agreement here is not unanimous.

I think that the important point here is the concept of a fractal dimension. Namely, that the geometry we were taught is not entirely descriptive of real-world phenomena.

I hope this helps!

Jason

That should read ‘fractal dimension’ rather than ‘fractal’

Why do you calculate y in step 5, but don’t use it for anything?

In the Wikipedia article you link to (http://en.wikipedia.org/wiki/Hurst_exponent) they use this cumulative sum (what you call “the running total”) in the calculation of R, but you use the deviations Y.

Hi Filip,

Thanks for your comment. Take a look at the spreadsheet that I provided to see how that series is utilized.

With smiles,

Jason

Hi,

Thank you for the information on this page.

I was wondering if you could shed any light on this graph:

Graph: http://www.bearcave.com/misl/misl_tech/wavelets/hurst/moving_hurst.jpg

From website: http://www.bearcave.com/misl/misl_tech/wavelets/hurst/

Say for a 15-day return, would you:

1) Just take the stock price every 15 days and calculate returns from that

2) or make the width of your 1/32 component above to be 15 days so the whole range is 480 trading days (15 x 32).

I presume it is 1) above otherwise it would be impossible to calculate a 2 day Hurst exponent using this method (as 1/32 would be 1.5hr!).

Anyway if you could clarify that would be great.

Thank you

Hi Tejay,

The decision of what ranges to use is entirely subjective and up to the analyst. Each analyst will be interested in persistence over time horizons unique to their own analytical work, seeking a signal from amongst the noise. That said, Hurst developed rescaled range analysis to look at very long-term data on the level of the Nile River. But in a world of high-frequency trades being executed in picoseconds, a minute seems like an eternity. If pressed for a recommendation, I would say that if your consciousness can comprehend an insight for a chosen time scale (picoseconds all the way up to millennia), then you should be able to use rescaled range analysis. The conditional factor here is not the length of time, but whether or not there is meaningful data in dividing the time frame up into ever smaller bits of time.

Hope that helps!

Jason

Hi Jason.

The Hurst algorithm takes a time series F1, …, FN. A financial time series looks something like a Brownian random walk: http://upload.wikimedia.org/wikipedia/commons/d/da/Random_Walk_example.svg

But you have transformed the Brownian signal into something like Gaussian noise by (FN+1/FN)-1. This is confusing for me. Could you explain why you did that? Why is that necessary?

The results are different for the same time series, so I assume it is important.

What type of input did Hurst use for the Nile?

Thank you.

Hello Kovalevskis,

So sorry for the delayed response on your question. Somehow my automatic notification of comments is broken on this post so just now saw your question. Apologies.

In answer to your question about why I calculated something the way I calculated it…I was following the procedure outlined by Hurst himself nearly 100 years ago.

Hurst’s input for the Nile, if I remember correctly, was the annual level of flooding, probably measured in meters. If I remember correctly he was trying to help build a dam for the river and he needed to know how high to make the dam in order to ensure there was no downstream flooding caused by having too short a dam height.

Cheers!

Jason

the sub-samples are taken without replacement.

There are some flaws in rescaled range analysis as noted in this paper

http://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=2220942

Sorry, here is the correct link to my rescaled range analysis paper

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2448648

Two questions…

In light of the comments of July 4, 2014, is the spreadsheet with data through November 2012 still correct?

I am updating the spreadsheet to more recent data, and I noticed that there are hardcoded values spread from cell J15829 to J16289. They are not used anywhere. Can you shed some light as to whether they are needed (I suspect not) or what function they perform?

Thanks in advance

Hello San Fran Sam,

To the best of my memory those hardcoded data are the values that were originally calculated for the maximums and minimums in the ranges in the first iteration of the spreadsheet. If you take a look at the top of this post describing rescaled range analysis you will see that someone pointed out an error in my calculations in the spreadsheet. In order to show the readers of this piece the difference that made, I hardcoded the old (mistaken) values to the right so as to eliminate confusion for those that had different versions of the spreadsheet.

I hope that helps!

Jason