"which is 3 orders of magnitude faster than a single frame in a video game."
You're right on the money. I worked in HFT for half a decade.
Back in the late 2000s you were on the cutting edge if you were writing really good C++ and had overclocked some CPUs to hell and back and then shoved them in a rack in a datacenter in new jersey. "To hell and back" means "they only crash every hour or two" (because while they're up, they're faster than your competition).
Around the mid 2010's it had moved entirely to FPGAs. Do something smart in software, then give the FPGA something dumb to wait for (e.g. use software to produce a model and wait for some kind of alpha to develop, then tell the FPGA "ok wait for X > Y and fire the order" was basically the name of the game. And it was often better to be faster than it was to be smarter, so really you just kept the FPGA as simple and fast as possible.
At that point, your hard realtime constraints for the language doing the smart stuff really dissolve. Compared to an FPGA getting an order out the door in some fraction of a mic, a CPU doing anything is slow. So you might as well use python, yknow?
People also play all kinds of different speed games. There's latency races around NJ and chicago where microseconds matter around price movements and C++ isn't really part of the picture in a competitive way anymore, but that's not to say someone isn't ekeing out a niche where they're doing something vaguely smarter faster than opponents. But these things, at scale, tend to be questions of - either your alpha is very short-lived (microseconds, and if it isn't organically microseconds, it will be competed until it is), or it is fundamentally longer-lived (seconds to minutes?) and you might as well use python and develop 100x faster.
The silly thing is, the really good trades that focus on short-term alphas (ms's and less) are generally obviously good trades and so you don't have to be super smart to realize it's a really fucking good trade, you had better be fast and go get it. So there's also this kind of built-in bias for being fast because if a trade is so marginally good that you needed a crazy smart and slow model to tell it apart from a bad trade, it's probably still a relatively crummy trade and all your smarts let you do is pick up a few extra pennies for your trouble.
I'll close by saying don't take my words as representative of the industry - the trades my firm specialized in was literally a dying breed and my understanding is that most of the big crazy-crazy-fast-microwave-network people have moved to doing order execution anyway, because most of the money in the crazy-crazy-fast game dried up as more firms switched to either DIYing the order execution or paying a former-HFT to do it for them.
Even if the low-latency logic moves to the FPGA, you still need your slow path to be reasonably fast, say about 100us, and absolutely never more than 1ms.
To my knowledge Python is not suitable, though I know some players embed scripting languages that are.
It depends entirely on what you've got on the FPGA and what's slow-path. My firm had order entry done exclusively by the FPGA and the non-FPGA code would have orders in the FPGA for sometimes tens of seconds before they would fire. The non-FPGA code was still C++, but the kind of everyday C++ where nobody has been wringing all the water out of it for speed gains. If we rewrote it all, it'd probably be python. But that's just us. Every firm has their own architecture and spin on how this all works. I'm sure somebody out there has C++ code that still has to be fast to talk to an FPGA. We didn't - you were either FPGA-fast or we didn't care.
edit: also, fwiw, "say about 100us, and absolutely never more than 1ms." things measured in that many microseconds are effectively not what my firm considered "low-latency". But like I said, there are lots of different speed games out there, so there's no one-size-fits-all here.
Your hardware is configured by your software. You pre-arm transactions to be released when a trigger occurs.
Releasing the transaction when the trigger occurs is the fast path, and what the hardware does.
The slow path is what actually decides what should be armed and with what trigger.
If your software is not keeping up with information coming in, either you let that happen and trigger on stale data, or you automatically invalidate stale triggers and therefore never trigger.
100us is short enough to keep up with most markets, though obviously not fast enough to be competitive in the trigger to release loop.
But it's mostly an interesting magnitude to consider when comparing technologies; Python can't really do that scale. In practice a well-designed system in C++ doesn't struggle at all to stay well below that figure.
The idea for aggressive orders in HFT is to buy before the market goes up, and sell before the market goes down.
HFT signals are about detecting those patterns right before they occur, but as early as the information is available globally. For example someone just bought huge amounts of X, driving the price up, and you know Y is positively correlated to X, so Y will go up as well, so you buy it before it does.
X and Y might be fungible, correlated mechanically or statistically.
You can also do it on Y=X as well; then what you need is the ability to statistically tell whether a trade will initiate/continue a trend or revert. You only need to be right most of the time.
On one hand you have quantitatively driven strategies that try to predict either a price or direction based on various inputs. Here you’re mostly focused on predictive accuracy, and the challenge is in exiting the trade at the right time. This is where a lot of the speed comes into play (what is your predictive horizon, and can you act fast enough to take advantage of the current market prices?).
The other mode of trading tends to focus on structural mispricing in the market. An easy to understand example is an intermarket arbitrage trade where one market’s buyer or seller crosses prices with the opposite side of the market on another exchange. These events permit a trader to swoop in a capture the delta between the two order prices (provided they can get to both markets in time).
As easy opportunity has dried up (markets have grown more efficient as systems have gotten faster, and parties understanding of the market structure has improved) you see some blending of the two styles (this is where another commenter was talking about mixing a traditionally computed alpha with some hardware solution to generate the order), but both come with different technical challenges and performance requirements.
Isn't there challenges with slippage and managing the volumes while exiting? And isn't speed also about processing the data feed as fast as possible to time the exit decisions accurately?
Absolutely to both questions, with different answers depending on what style of strategy you’re running.
The market mechanics trades tend to have no recoverability if you miss the opportunity, so you’re often trading out in error and it’s a matter of trying to stem the loss on a position that you do not have an opinionated value signal on.
And there’s definitely an angle to inbound processing speed for both styles of trading, with differing levels of sensitivity depending on the time horizons you are attempting to predict or execute against. Using the example above again, detecting the arb opportunity and firing quickly is obviously paramount, but if you’re running a strategy where you have a 1 minute predictive time horizon sure, there’s some loss that can be associated with inefficiency if you aren’t moving quickly and someone else is firing at a similar signal, but generally speaking there’s enough differentiation in underlying alpha between you and any competitors that the sensitivity to absolute speed isn’t as prevalent as most people expect.
Basically it boils down to going fast enough to beat the competition, and if there isn’t any you have all the time in the world to make decisions and act on them.
This is true -- in the early 00's there was hardly any competition and we could take out both sides of a price cross with 100ms latency. Even after colocation we could still be competitive with over 4ms latency (plus the network). Trading technology has come a long way in 20 years.
If every other exchange is selling $AAPL at $100 and suddenly the top level of one exchange drops to $99, then if you just take out that order you basically gain a free dollar. Do this very fast and have pricing the product accurately and you will print tons of money.
Yeah I gather that is the expectation, but if you are the first to execute an order you will sell that order at the old 100 price before it lowers. You are fighting for making an order before the information spreads to the other bots. (Right?!)
You're right on the money. I worked in HFT for half a decade.
Back in the late 2000s you were on the cutting edge if you were writing really good C++ and had overclocked some CPUs to hell and back and then shoved them in a rack in a datacenter in new jersey. "To hell and back" means "they only crash every hour or two" (because while they're up, they're faster than your competition).
Around the mid 2010's it had moved entirely to FPGAs. Do something smart in software, then give the FPGA something dumb to wait for (e.g. use software to produce a model and wait for some kind of alpha to develop, then tell the FPGA "ok wait for X > Y and fire the order" was basically the name of the game. And it was often better to be faster than it was to be smarter, so really you just kept the FPGA as simple and fast as possible.
At that point, your hard realtime constraints for the language doing the smart stuff really dissolve. Compared to an FPGA getting an order out the door in some fraction of a mic, a CPU doing anything is slow. So you might as well use python, yknow?
People also play all kinds of different speed games. There's latency races around NJ and chicago where microseconds matter around price movements and C++ isn't really part of the picture in a competitive way anymore, but that's not to say someone isn't ekeing out a niche where they're doing something vaguely smarter faster than opponents. But these things, at scale, tend to be questions of - either your alpha is very short-lived (microseconds, and if it isn't organically microseconds, it will be competed until it is), or it is fundamentally longer-lived (seconds to minutes?) and you might as well use python and develop 100x faster.
The silly thing is, the really good trades that focus on short-term alphas (ms's and less) are generally obviously good trades and so you don't have to be super smart to realize it's a really fucking good trade, you had better be fast and go get it. So there's also this kind of built-in bias for being fast because if a trade is so marginally good that you needed a crazy smart and slow model to tell it apart from a bad trade, it's probably still a relatively crummy trade and all your smarts let you do is pick up a few extra pennies for your trouble.
I'll close by saying don't take my words as representative of the industry - the trades my firm specialized in was literally a dying breed and my understanding is that most of the big crazy-crazy-fast-microwave-network people have moved to doing order execution anyway, because most of the money in the crazy-crazy-fast game dried up as more firms switched to either DIYing the order execution or paying a former-HFT to do it for them.