Tuesday, December 1, 2009

Trading and the problem of random reinforcement

In Beating the Financial Futures Market (Wiley, 2006), Art Collins makes a case for systems trading over discretionary trading. His argument is that the very nature of markets works against the discretionary trader, since markets are “overwhelmingly random noise with a small trend component. The latter is what mechanical traders largely hang their hopes on, but it’s not perceivable on an individual trade basis. You need many trials to expose biases the same way Vegas needs many bets to exemplify the fact that their long-term edges are ultimately insurmountable.” (p. 1) Discretionary traders are attuned to the noise, and very, very few can extract anything meaningful from it.
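To put a rough number on "many trials," here is a minimal sketch. The model is mine, not Collins's: every trade wins or loses one unit, trades are independent, and the 53% win rate is an invented figure chosen only for illustration. The function names are hypothetical as well.

```python
import math
import random


def simulate_trades(n, win_prob, seed=0):
    """Return n trade results of +1 (win) or -1 (loss), drawn independently.

    Assumption: fixed win probability and equal-sized wins and losses.
    """
    rng = random.Random(seed)
    return [1 if rng.random() < win_prob else -1 for _ in range(n)]


def trades_needed(win_prob, z=2.0):
    """Rough number of trades before the average result is expected to sit
    z standard errors above zero, for +1/-1 payoffs.

    Assumption: win_prob is not exactly 0.5 (a zero edge never shows up).
    """
    edge = 2 * win_prob - 1      # expected result per trade
    variance = 1 - edge ** 2     # variance of a single +1/-1 outcome
    return math.ceil(z ** 2 * variance / edge ** 2)


if __name__ == "__main__":
    results = simulate_trades(100_000, win_prob=0.53, seed=1)
    print("first 10 trades:", results[:10])
    print("average over all trades:", sum(results) / len(results))
    print("trades needed at z=2:", trades_needed(0.53))
```

Under these assumptions, a handful of trades looks like a coin flip, and it takes on the order of a thousand of them before the small edge stands clear of the noise. Real trades are neither equal-sized nor independent, so treat the figure as an order of magnitude at best.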

Moreover, Collins argues, the discretionary trader is rewarded in an arbitrary fashion. Unlike most areas of life where repeated experience results in learning, “trading doesn’t conform because good ideas (whatever they are) won’t necessarily produce winning trades any more than bad decisions will become losses. People get frustrated trying to form rules around an overload of information.”

That is, the discretionary trader is essentially receiving random feedback. And “researchers have scientifically demonstrated that the hardest behavior to break or modify is that which is rewarded in an arbitrary fashion. A lab rat that only occasionally receives food, a drug rush, and so forth when he pushes a lever will continue to hit it until his paw is raw.” (p. 2)

Ah, you might reply, but what about random reinforcement in dog training? Didn’t you quote Scott Page in your October 12 post as saying that when you’re teaching a dog to sit you shouldn’t give him a treat every time as a reward? Yes, but there are two big differences. First, the dog trainer rewards only the sought-after behavior; she doesn’t give the dog a treat if he doesn’t sit. Second, the dog trainer rewards the appropriate response most of the time, so there’s not a lot of randomness. The element of randomness that is introduced into the training process supposedly strengthens the link in the dog’s brain between the trainer’s command and the dog’s response; it doesn’t turn him into a compulsive wreck.

By contrast, the market (especially in a small time horizon) rewards both bad behavior and good behavior, punishes both good behavior and bad behavior, and may do this more or less randomly. And, contrary to Collins, I would suggest that it doesn’t single out the discretionary trader for this random reinforcement. The market is a terrible, perhaps even deranged teacher! You can’t take your cues from how this teacher reacts to your behavior.

There’s been a lot written about replacing an external reward system with an internal one—rewarding yourself for sticking to your plan, following your rules, being disciplined. There’s no doubt that this task is psychologically extraordinarily difficult because it flies in the face of a lifetime of experience where, at least for the most part, a person is rewarded by the outside world for doing the “right” thing and punished for doing the “wrong” thing. I for one don’t take much solace in giving myself an expensive “A.”

We need to get out of this random feedback loop. And I don’t think we can do it by substituting less powerful feedback for more powerful feedback. We have to replace this debilitating random feedback with something that’s even more powerful. If we take our cue from Art Collins (and, yes, I understand it’s not exactly original), one solution is to substitute the continuous for the discrete. The market gives us discrete feedback; it can recognize only one trade at a time. If we shift our focus from these discrete data points to a line that connects them in a meaningful way (to wit, our equity curve), we presumably have something that no longer exhibits the properties of randomness. (Or, if it does, we know that we’re doing something very wrong.) In the simplest terms, if our equity curve is sloping upward, we know we’re doing the right thing and are being rewarded. If it’s sloping downward, we know we’re doing the wrong thing and are being punished. Yes, there will always be some random reinforcement in the equity curve, but with any luck it will make us more obedient (and wealthier) dogs. Oops, traders.
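For readers who like to see the idea spelled out, here is a minimal sketch of reading the slope of an equity curve rather than reacting to individual trades. The helper names are hypothetical, the least-squares line is just one simple way to measure slope, and the sample results are invented.

```python
from itertools import accumulate


def equity_curve(trade_results):
    """Running total of per-trade profit and loss."""
    return list(accumulate(trade_results))


def slope(curve):
    """Least-squares slope of the curve against trade number (0, 1, 2, ...)."""
    n = len(curve)
    if n < 2:
        raise ValueError("need at least two points to measure a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(curve) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(curve))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


if __name__ == "__main__":
    # Invented results: wins and losses arrive in no obvious order.
    results = [1, -1, -1, 1, 1, -1, 1, 1, -1, 1]
    curve = equity_curve(results)
    print("equity curve:", curve)
    print("slope per trade:", slope(curve))
```

A positive slope over a long run of trades is the feedback worth listening to; the sign of any single entry in the list is not. Over a short run, of course, the slope is itself mostly noise.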
