I'd love to hear more about what kind of data augmentation you're doing. A friend of mine recently got a GAN to work for time series, which is really interesting.
I've done a lot of work in the space and would love to chat - just emailed you :)
I use a supervised learning setup (although with a custom loss function).
The kind of data augmentation I do is adding different candle sizes. I validate with 5m candles, but I train with 2, 3, 4, 5, 6, and 7m ones.
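A minimal sketch of this kind of multi-timeframe augmentation, assuming you start from 1-minute OHLCV bars in a pandas DataFrame with a DatetimeIndex (the column names and the dummy data are just for illustration, not the actual pipeline):

```python
import numpy as np
import pandas as pd

# Dummy 1-minute bars, purely for illustration.
idx = pd.date_range("2024-01-01", periods=60 * 8, freq="1min")
close = 100 + np.cumsum(np.random.randn(len(idx)) * 0.05)
df_1m = pd.DataFrame({
    "open": close, "high": close + 0.1, "low": close - 0.1,
    "close": close, "volume": np.random.randint(1, 100, len(idx)),
}, index=idx)

OHLCV_AGG = {"open": "first", "high": "max", "low": "min",
             "close": "last", "volume": "sum"}

def resample_candles(df, minutes):
    """Aggregate 1-minute bars into `minutes`-minute candles."""
    return df.resample(f"{minutes}min").agg(OHLCV_AGG).dropna()

# Train on several candle sizes, validate on 5m only.
train_sets = {m: resample_candles(df_1m, m) for m in (2, 3, 4, 5, 6, 7)}
val_set = resample_candles(df_1m, 5)
```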
I also sample more recent data more frequently. I train jointly with ~22 symbols, but within each X containing those symbols, I randomly set some to zero, invert the price of others, and reverse some in time. This helps generalization for some reason. I tried many kinds of noise, but what I described above is what I found to work best in my case.
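A rough sketch of that per-batch augmentation (symbol dropout, price inversion, time reversal) plus recency-weighted sampling. It assumes X has shape (n_symbols, seq_len, n_features) and the features are returns, so "inverting the price" reduces to negation; the probabilities, shapes, and exponential recency weighting are my own assumptions, not the actual settings:

```python
import numpy as np

def augment_batch(X, rng, p_zero=0.1, p_invert=0.2, p_reverse=0.2):
    """Randomly zero, price-invert, or time-reverse each symbol's series."""
    X = X.copy()
    for i in range(X.shape[0]):
        r = rng.random(3)
        if r[0] < p_zero:
            X[i] = 0.0                # drop this symbol entirely
        else:
            if r[1] < p_invert:
                X[i] = -X[i]          # mirror the price path
            if r[2] < p_reverse:
                X[i] = X[i][::-1]     # reverse along the time axis
    return X

def sample_start_indices(n_windows, n_samples, rng, decay=3.0):
    """Sample window starts with more weight on recent data."""
    w = np.exp(decay * np.linspace(0.0, 1.0, n_windows))
    return rng.choice(n_windows, size=n_samples, p=w / w.sum())

rng = np.random.default_rng(0)
X = rng.standard_normal((22, 128, 4))   # ~22 symbols, 128 steps, 4 features
X_aug = augment_batch(X, rng)
starts = sample_start_indices(n_windows=10_000, n_samples=256, rng=rng)
```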
I have a more ambitious idea to generate synthetic data using self-play: have a bunch of agents trading against one another. This creates new price data I can train the agents with, and repeat (this self-play training scheme would be similar to what DeepMind did with AlphaGo/AlphaZero). The issue with it is the need to tune the parameters exactly so that the resulting synthetic data is realistic enough that I can transfer the agents to real data.
For example, during self-play, should you have only trading agents, or should you add "retail traders" that buy during bubbles, "normal buyers" that buy only below and sell only above certain prices, and institutional buyers that randomly move the price a lot in a given direction? That is a lot of parameters to get right, and it's an optimization problem on its own. You could treat this as a two-fold optimization problem such as in this paper: https://arxiv.org/pdf/1810.02513.pdf, but it gets tricky very fast.
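Just to make the idea concrete, here is a toy sketch of that kind of multi-agent market simulation: a few hand-rolled agent types interact through a single price via a simple linear impact model, producing synthetic price paths. Every agent behavior and parameter here is illustrative, and tuning them so the output is realistic is exactly the hard part described above:

```python
import numpy as np

rng = np.random.default_rng(42)

def retail_demand(price, history):
    # Chases momentum, i.e. buys during bubbles.
    return 0.5 * (history[-1] - history[-10]) if len(history) >= 10 else 0.0

def value_demand(price, history, low=95.0, high=105.0):
    # Buys below `low`, sells above `high`.
    if price < low:
        return 1.0
    if price > high:
        return -1.0
    return 0.0

def institutional_demand(price, history):
    # Occasionally pushes the price hard in a random direction.
    return rng.choice([-5.0, 0.0, 0.0, 0.0, 5.0])

def simulate(n_steps=1000, p0=100.0, impact=0.05):
    prices = [p0]
    for _ in range(n_steps):
        p = prices[-1]
        net = (retail_demand(p, prices)
               + value_demand(p, prices)
               + institutional_demand(p, prices)
               + rng.normal(0.0, 0.5))          # noise traders
        prices.append(p + impact * net)         # linear price impact
    return np.array(prices)

synthetic_prices = simulate()
```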