Wednesday, April 21, 2010

Recognizing a spoken word, predicting a market

Renaissance Technologies, the renowned quantitative hedge fund firm, is now headed by two men who came from the world of voice-recognition technology. See the WSJ article from March 16. A Renaissance researcher enticed them to join the firm almost 20 years ago because, as he said, “I realized that there are some deep technical links between the way speech recognition is done and some good ways of predicting the markets.”

I’m not planning to go very far into yet another unknown world, but here’s a linguistic thought that might just trigger some trading ideas. In “Spoken Word Recognition,” a contribution to Traxler and Gernsbacher’s Handbook of Psycholinguistics, 2d ed. (Academic Press, 2006), Delphine Dahan and James S. Magnuson argue that we cannot recognize a word simply by looking at a string of phonemes. I won’t trace their whole argument, but I was struck by one paragraph. See if it rings a bell with you.

“What purpose might . . . fine-grained sensitivity serve? One challenge posed by assuming that words are identified from a string of phonemes is the embedding problem; most long words have multiple shorter words embedded within their phonemic transcriptions (e.g., . . . unitary contains you, unit, knit, it, tarry, air, and airy). . . . Successful spoken word recognition depends on distinguishing words from embeddings. However, the embedding problem is significantly mitigated when subphonemic information in the input is considered. For example, listeners are sensitive to very subtle durational differences (in the range of 15-20 ms) that distinguish phonemically identical syllables that occur in short words (ham) from those embedded in longer words (hamster).” (pp. 250-51)

Think, for instance, about comparing charts with volume bars to charts with time bars. Volume-based bars can sometimes differentiate between bars that look identical on a time-based chart. Can we distinguish bars that will immediately reverse from those that belong to a trend by separating out “sub-bar” information? There are lots of ways to play around with this idea, especially if you have access to relatively high-frequency data. Go for it!
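
If you want a starting point, here is a minimal Python sketch of the time-bar versus volume-bar comparison. Everything in it is my own assumption for illustration: the tick layout (seconds since the open, price, volume), the names time_bars and volume_bars, the 500-share threshold, and above all the sample ticks, which are made up and not market data. It also takes a shortcut: a tick is never split across two volume bars, so a bar can overshoot the threshold.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical tick layout: (seconds since the open, price, volume).
Tick = Tuple[float, float, int]


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: int


def _make_bar(ticks: List[Tick]) -> Bar:
    """Collapse a non-empty list of ticks into one OHLC bar."""
    prices = [price for _, price, _ in ticks]
    total_volume = sum(volume for _, _, volume in ticks)
    return Bar(prices[0], max(prices), min(prices), prices[-1], total_volume)


def time_bars(ticks: List[Tick], seconds: int) -> List[Bar]:
    """Group ticks into fixed-length time buckets."""
    buckets: Dict[int, List[Tick]] = {}
    for tick in ticks:
        buckets.setdefault(int(tick[0] // seconds), []).append(tick)
    return [_make_bar(buckets[key]) for key in sorted(buckets)]


def volume_bars(ticks: List[Tick], volume_per_bar: int) -> List[Bar]:
    """Close a bar once accumulated volume reaches volume_per_bar.

    Simplification: ticks are not split, so a bar may exceed the threshold.
    """
    bars: List[Bar] = []
    current: List[Tick] = []
    accumulated = 0
    for tick in ticks:
        current.append(tick)
        accumulated += tick[2]
        if accumulated >= volume_per_bar:
            bars.append(_make_bar(current))
            current, accumulated = [], 0
    if current:  # leftover partial bar at the end of the data
        bars.append(_make_bar(current))
    return bars


# Made-up ticks: two minutes with the same price path but very different volume.
sample_ticks: List[Tick] = [
    (0, 100.0, 100), (20, 101.0, 100), (40, 99.0, 100), (59, 100.5, 100),
    (60, 100.0, 100), (70, 101.0, 900), (80, 99.0, 900), (119, 100.5, 100),
]

for bar in time_bars(sample_ticks, 60):
    print("time  ", bar)
for bar in volume_bars(sample_ticks, 500):
    print("volume", bar)
```

On this toy data the two one-minute bars share the same open, high, low, and close, while the volume bars break the busy second minute into several pieces. That is the "sub-bar" information in its crudest form; whether it tells you anything about reversals is the part you would have to test on real data.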
