Thursday, January 7, 2010

Degrees of freedom--KISS

I’m not the person you would ever hire to design a trading system, mainly because my programming skills are (being kind to myself) modest. But I think it’s critically important for every trader and investor, whatever his style, to know in principle how to go about developing a trading system. I plan to write a short series of posts on this topic geared to the intelligent novice. Today’s theme is the statistical concept of degrees of freedom.

“Degrees of freedom” is defined mathematically as the rank of a quadratic form. Well, that gets us nowhere fast. The statistical definition isn’t much better: the number of values in the final calculation of a statistic that are free to vary. But we know that the concept is critical in system development. Ralph Vince put it succinctly if not elegantly: “The key to ensure that you have a positive mathematical expectancy in the future is to not restrict your system’s degrees of freedom.” (The Mathematics of Money Management, p. 19)

Urban Jaekle and Emilio Tomasini, in Trading Systems: A New Approach to System Development and Portfolio Optimisation (Harriman House, 2009) tackle this subject from enough points of view that eventually even the most statistically challenged should understand the relevance of this concept in developing a trading system. First, a feeble statistical joke. A married man comments: “There is only one subject, my wife, and my degree of freedom is zero. I should increase my ‘sample size’ by looking at other women.” (p. 16) Second, building on this joke, an illustration by Robert Schulle: “In a scatter plot when there is only one data point, you cannot make any estimation of the regression line. The line can go in any direction . . . Here you have no degrees of freedom . . . for estimation (this may remind you of the joke about the married man). In order to plot a regression line you must have at least two data points (a wife and a mistress). In this case you have one degree of freedom for estimation. . . . In other words, the degree of freedom tells you the number of useful data for estimation. However, when you have two data points only, you can always join them to be a straight regression line and get a perfect correlation. . . Thus the lower the degree of freedom is, the poorer the estimation is.” (p. 17)

The concept of degrees of freedom expresses rigorously the intuitive notion that the larger the sample size and the smaller the number of variables, the better the estimation. Generally, the authors state, “less than 90% remaining degrees of freedom is considered too few.”

Let’s consider some examples so that we have a better idea of how to calculate degrees of freedom for practical purposes. We have a trading strategy that uses a 20-day average of highs and a 60-day average of lows, and we’re working with a data sample of three years of daily highs, lows, opens, and closes for a total of 3120 data points (260 days per year x 3 years x 4 prices per day). The 20-day average uses 21 degrees of freedom (20 highs plus 1 more for the rule); the 60-day average uses 61 degrees of freedom (60 lows plus 1 for the rule). The total is 82 degrees of freedom. In percentage terms we’re using 2.6% of the available degrees of freedom (82/3120), leaving 97.4%. This sample size is adequate to the trading strategy.
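For readers who like to see the arithmetic spelled out, here is a minimal Python sketch of the calculation above. The function name, its parameters, and the `MIN_REMAINING_PCT` constant are my own illustrative choices, not anything taken from the book. The sketch simply assumes that each moving average consumes its lookback length plus one degree of freedom for the rule, and it applies the authors' 90% guideline as a rough adequacy check.

```python
# Illustrative sketch only: names and structure are assumptions, not the authors' code.
MIN_REMAINING_PCT = 90.0  # the authors' rule of thumb for remaining degrees of freedom


def dof_summary(lookbacks, total_data_points):
    """Return (used, percent_used, percent_remaining) for a list of lookbacks.

    Each moving average is assumed to consume its lookback length
    plus one degree of freedom for the rule that uses it.
    """
    used = sum(n + 1 for n in lookbacks)
    pct_used = 100.0 * used / total_data_points
    return used, pct_used, 100.0 - pct_used


# Three years of daily data, four prices per day (open, high, low, close).
total = 260 * 3 * 4
used, pct_used, pct_left = dof_summary([20, 60], total)
adequate = pct_left >= MIN_REMAINING_PCT

print(total, used, round(pct_used, 1), round(pct_left, 1), adequate)
# prints: 3120 82 2.6 97.4 True
```

Adding more averages to the list, or shrinking the data sample, shows how quickly the remaining percentage erodes.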

Degrees of freedom don’t double count. For instance, if you are using a 5-day and a 10-day moving average of closes you would consume only 12 degrees of freedom. The five closes in the 5-day moving average are already among the ten closes in the 10-day moving average. So count only 10 data points plus 2 rules.
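The no-double-counting rule can be sketched the same way. This is my own generalization of the example, so treat it as an assumption: for each price series (closes, highs, lows) count only the longest lookback, since the shorter averages on that series reuse the same bars, and then add one degree of freedom per rule. The function name and the tuple format are hypothetical.

```python
# Illustrative sketch only: the grouping-by-series logic is an assumption
# generalized from the 5-day/10-day example in the text.
def dof_used(averages):
    """Count degrees of freedom for a list of (series_name, lookback) pairs.

    Averages on the same series overlap, so only the longest lookback
    per series is counted; each average still costs one rule.
    """
    longest = {}
    for series, lookback in averages:
        longest[series] = max(longest.get(series, 0), lookback)
    return sum(longest.values()) + len(averages)


print(dof_used([("close", 5), ("close", 10)]))  # prints: 12
print(dof_used([("high", 20), ("low", 60)]))    # prints: 82
```

The second call reproduces the earlier example: highs and lows are different series, so nothing overlaps and the full 82 degrees of freedom are consumed.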

In brief, in order to produce a statistically reliable historical simulation you have to match system complexity to sample size. Or, as Vince argues, “You want to keep your system’s degrees of freedom as high as possible to ensure the positive mathematical expectation in the future. This is accomplished not only by eliminating, or at least minimizing, the number of optimizable parameters, but also by eliminating, or at least minimizing, as many of the system rules as possible. Every parameter you add, every rule you add, every little adjustment and qualification you add to your system diminishes its degrees of freedom. Ideally, you will have a system that is very primitive and simple, and that continually grinds out marginal profits over time. . . . [I]t is important that you realize that it really doesn’t matter how profitable the system is, so long as it is profitable. The money you will make trading will be made by how effective the money management you employ is.” (p. 19)
