
It's not a case of exclusive or. Gather a list of 100 or so human stock pickers. Train a model to predict the effect of following their suggestions (buy, hold, sell), both to identify which experts are right most often and to identify which experts do well for particular kinds of stocks. This model will do better than random guessing and, quite likely, better than any individual stock picker.
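To make the idea concrete, here is a minimal sketch, not the actual model. It assumes each past call is recorded as a hypothetical `(expert, sector, call, outcome)` tuple, and it swaps a trained model for a plain smoothed hit rate per expert and sector, used as a voting weight. The function names are made up for illustration.

```python
from collections import defaultdict


def fit_expert_weights(history):
    """Estimate a hit rate per (expert, sector) from past calls.

    history: iterable of (expert, sector, call, outcome) tuples, where call
    and outcome are +1 (buy / went up), -1 (sell / went down) or 0 (hold / flat).
    Returns {(expert, sector): hit_rate}, Laplace-smoothed so that a short
    track record stays close to 0.5.
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for expert, sector, call, outcome in history:
        if call == 0:
            continue  # a hold carries no directional bet to score
        total[(expert, sector)] += 1
        hits[(expert, sector)] += int(call == outcome)
    return {key: (hits[key] + 1) / (total[key] + 2) for key in total}


def combined_signal(weights, sector, calls):
    """Weighted vote over current calls ({expert: call}) for one sector.

    Each weight is hit_rate - 0.5, so a coin-flipping expert contributes
    nothing and a consistently wrong expert is effectively inverted.
    Unknown (expert, sector) pairs default to a weight of zero.
    """
    score = 0.0
    for expert, call in calls.items():
        score += (weights.get((expert, sector), 0.5) - 0.5) * call
    return score
```

A positive score leans buy and a negative score leans sell. The weights must be fitted only on calls whose outcomes were already known at prediction time.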



Algorithms that predict the past accurately but are no better than random at the future are a dime a dozen, and I don't see how your suggested algorithm is different.


Corollary: having a time machine that can go a few minutes back in time is worth much less than one that can go a few minutes forward in time.


This implies the time machine can go only one way. Taking information from the future back in time is worth more than taking information from the present into the future. We do the latter all the time; we do not need a time machine for that, just patience.


Then look at backtesting. The evaluation data set is out-of-time, meaning the better-than-random performance is measured on unseen future data. The algorithm was actually implemented; it is not merely a suggestion.
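For readers unfamiliar with out-of-time evaluation, a rough sketch of a walk-forward backtest follows. It is only an illustration: `fit` and `predict` are placeholder callables supplied by the caller, and the records are assumed to be sorted oldest first.

```python
def walk_forward_accuracy(records, fit, predict, min_train=3):
    """Score predictions using only data that precedes each prediction.

    records: list of (features, label) pairs sorted oldest first.
    fit(train_records) -> model; predict(model, features) -> label.
    The model used for record t is fitted on records[:t] only, so nothing
    from the scored record or anything after it can leak in.
    """
    hits = 0
    scored = 0
    for t in range(min_train, len(records)):
        model = fit(records[:t])
        features, label = records[t]
        hits += int(predict(model, features) == label)
        scored += 1
    return hits / scored if scored else float("nan")


def fit_majority(train_records):
    """Toy baseline: remember the most common label seen so far."""
    labels = [label for _, label in train_records]
    return max(set(labels), key=labels.count)


def predict_majority(model, features):
    """Toy baseline: always predict the remembered label."""
    return model
```

Calling `walk_forward_accuracy(records, fit_majority, predict_majority)` gives a baseline that any real model should beat on the same records.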


You can still leak information from your backtested time series by choosing WHICH algorithm to use out of a large pool of algorithms. You'll get regression to the mean because you optimized for a noisy signal (backtesting performance) of future earnings.
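A small simulation illustrates the effect. It is a sketch under a deliberately extreme assumption: every strategy is pure zero-mean Gaussian noise with no real edge. The function name and parameters are made up for this example.

```python
import random


def selection_bias_demo(n_strategies=100, n_days=100, n_trials=30, seed=0):
    """Pick the best of many edge-free strategies by backtest, then look ahead.

    Every strategy's daily return is drawn from the same zero-mean Gaussian,
    so any backtest winner is lucky rather than skilled. Returns the average
    backtest return of the winners and their average future return.
    """
    rng = random.Random(seed)

    def mean_return():
        return sum(rng.gauss(0, 1) for _ in range(n_days)) / n_days

    best_backtest = []
    best_future = []
    for _ in range(n_trials):
        backtests = [mean_return() for _ in range(n_strategies)]
        best_backtest.append(max(backtests))
        # The winner has no true edge, so its future is a fresh zero-mean draw.
        best_future.append(mean_return())
    return sum(best_backtest) / n_trials, sum(best_future) / n_trials
```

The first returned value should come out clearly positive while the second hovers near zero, which is the regression to the mean described above.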


It is possible to leak information, but then you are doing it wrong. Don't use a single out-of-time test set for both parameter or model selection and final evaluation; keep a separate out-of-time holdout set.
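Roughly, that looks like the sketch below. It assumes rows are dicts with a hypothetical ISO-formatted `date` key, so string comparison matches chronological order, and that `score` is a caller-supplied function where higher is better.

```python
def chronological_three_way_split(rows, train_end, validation_end):
    """Split rows into train / validation / holdout by date.

    train:      date < train_end
    validation: train_end <= date < validation_end
    holdout:    date >= validation_end
    """
    train = [r for r in rows if r["date"] < train_end]
    validation = [r for r in rows if train_end <= r["date"] < validation_end]
    holdout = [r for r in rows if r["date"] >= validation_end]
    return train, validation, holdout


def select_then_confirm(candidates, score, validation, holdout):
    """Choose among candidates on validation, then score the winner on holdout.

    candidates: {name: model}; score(model, rows) -> float, higher is better.
    The holdout is only consulted after the choice is made, so its score is
    not inflated by the selection step.
    """
    winner = max(candidates, key=lambda name: score(candidates[name], validation))
    return winner, score(candidates[winner], holdout)
```

The holdout number is only trustworthy if it is looked at once. Going back to try other candidates after seeing it turns the holdout into a second validation set.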

But really, these are the bare basics of forecasting. It is somewhat annoying to have to regurgitate all of this, as if non-leaking forecasting were somehow impossible. It would be a better discussion if everyone just assumed proper forecasting practices. Instead, people seem to assume I have no clue what I am doing and discard my technique because I did not mention removing duplicates, scaling, proper validation techniques, ... and 100 other things that are of no importance to the technique itself.


So you're basically saying the model should identify which human stock pickers to follow for each sector?

If it's so easy, why not identify directly which stocks to invest in?

If you can predict which investors will perform well, then just try to predict which stocks will perform well directly.


> If it's so easy, why not identify directly which stocks to invest in?

It is not easy: neither the setup nor the problem itself (you won't get very high accuracy, but you will do much better than random guessing).

> If you can predict which investors will perform well, then just try to predict which stocks will perform well directly.

This won't work, because you don't have access to all the information that the stock pickers have access to, just their advice, and some features about the stock / the company the stock pickers work for.



