I think I actually saw these folks present at JavaOne a couple years ago? Either that or there's more than one shop branding itself as "HFT" that uses Java.

I worked in the industry and it's always a little funny to see who calls themselves HFTs vs quants.

Basically, there's a bit of a spectrum of fast vs smart. In general it's hard to do incredibly smart stuff fast enough to compete in the "speed-critical" bucket of trades, and conversely there's barely any point in being ultra-fast in the "non-speed-critical" bucket because your alphas last for minutes to hours.

Just from this read, I feel like these folks are just a hair to the right of "fast" in the [fast]---------[smart] continuum. I mostly make this appraisal based on these paragraphs:

>To gain those few crucial microseconds, most players invest in expensive hardware: pools of servers with overclocked liquid-cooled CPUs (in 2020 you can buy a server with 56 cores at 5.6 GHz and 1 TB RAM), collocation in major exchange datacentres, high-end nanosecond network switches, dedicated sub-oceanic lines (Hibernian Express is a major provider), even microwave networks.

>It’s common to see highly customised Linux kernels with OS bypass so that the data “jumps” directly from the network card to the application, IPC (Interprocess communication) and even FPGAs (programmable single-purpose chips).

That's nice, but that's where the cutting edge of the speed game was in 2007ish. Everything mentioned here is table stakes at this point (colocation, dedicated fiber, expensive switches, bypassing the kernel in network code, etc). The fact that "even FPGAs" is listed as "even" is the biggest thing I focus on. FPGAs and/or custom silicon are where the speed game is right now. Similarly, "even microwave networks" is also table stakes at this point (you can get on Nasdaq's wireless[0] just by paying).

This is the kind of game where capex for technology is dwarfed by the margin you're slinging around every day in trading, so you see some pretty absurd hardware justified.

[0] http://n.nasdaq.com/WirelessConnectivitySuite

Edit: Also shout-out to a different comment in this thread mentioning ISLD, a story I considered telling as well: https://news.ycombinator.com/item?id=24896603

oppositelock

I discovered something amazing when working with some people who were writing HFT software.

Why do you need 1TB of RAM in these machines? Because when you're Java based, you want to avoid stop-the-world GC pauses. These trading systems only have to be up from 9:30AM-4:30PM EST, so they simply disable GC altogether! At the end of a trading day, restart the app or reboot the system.
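
For anyone wanting to check whether a JVM really goes a whole session without collecting, here is a minimal sketch using the standard management beans. `GcWatch` is a made-up name for illustration. On JDK 11+ the no-op Epsilon collector (`-XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC`) is one way to get "no GC at all"; whether the systems described above used something like that or simply sized the heap so a collection never triggered is not something I can confirm.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: reports how many collections the JVM has run so far. */
final class GcWatch {
    /** Sums collection counts across all collectors registered in this JVM. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means the count is undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }
}
```

Take a snapshot of `GcWatch.totalCollections()` at the open and again at the close; if the two numbers match, nothing was collected in between.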

benjaminjackman

Speaking from experience with JVM HFT applications (we used Scala).

There are a lot of tricks though to not require 1TB.

And allocation in general is a bad idea even if you don't collect, because it scatters stuff all over memory and messes up cache locality. You really, really don't want to allocate in a performance-sensitive JVM application if you can avoid it. It's the opposite of a lot of what I was told and taught (e.g. never do object pooling), but empirically, in my experience, allocations are the biggest slowdown. You can get an application a lot faster just by opening up the memory allocation tab in a JMC flight recording and refactoring the biggest allocators. Usually there is a lot of easy-to-optimize low-hanging fruit that will give good performance improvements, even better than focusing on hot spots in code (in my personal experience).
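
A minimal sketch of the object pooling idea, assuming a single-threaded hot path. `Pool` and `Order` are hypothetical names made up for illustration, not code from any real engine. Everything is allocated at startup, and the hot path only borrows and returns instances.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool: every instance is created up front. Not thread-safe. */
final class Pool<T> {
    private final ArrayDeque<T> free;

    Pool(int size, Supplier<T> factory) {
        free = new ArrayDeque<>(size);
        for (int i = 0; i < size; i++) {
            free.push(factory.get()); // allocate everything at startup
        }
    }

    /** Returns a pooled instance, or null if the pool is exhausted. */
    T acquire() {
        return free.poll();
    }

    /** Hands an instance back; the caller must not use it afterwards. */
    void release(T obj) {
        free.push(obj);
    }

    int available() {
        return free.size();
    }
}

/** Hypothetical mutable order: overwritten and reused instead of reallocated. */
final class Order {
    long price;   // fixed-point price, units are an assumption
    int quantity;

    Order set(long price, int quantity) {
        this.price = price;
        this.quantity = quantity;
        return this;
    }
}
```

Typical use is `Order o = pool.acquire().set(price, qty);` followed by `pool.release(o)` once the order has been handled. The pool never grows, so an exhausted pool shows up as a `null` rather than as a hidden allocation.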

By far the biggest allocator in trading is going to be market data and calculations on it. For reading market data from the exchange it's best to leave raw data in memory and access it with a ByteBuffer / sun.misc.Unsafe. Under this pattern classes have 1 value, the memory address to pass into sun.misc.Unsafe, then everything from there on is done with offsets onto that address. For calculations it's better to write things as static functions, or use object pooling.
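
Roughly what that pattern looks like, sketched with a `ByteBuffer` instead of `sun.misc.Unsafe` so it stays bounds-checked. The class name and the message layout (price at offset 0, quantity at 8, side at 12) are invented for illustration and do not correspond to any real exchange feed.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical flyweight over a raw quote message. Assumed layout:
 *   offset 0: long price (fixed-point), offset 8: int quantity, offset 12: byte side
 */
final class QuoteFlyweight {
    static final int PRICE_OFFSET = 0;
    static final int QUANTITY_OFFSET = 8;
    static final int SIDE_OFFSET = 12;
    static final int LENGTH = 13;

    private ByteBuffer buffer; // the raw data stays where it was received
    private int base;          // start of the current message within the buffer

    /** Points this flyweight at a message; copies nothing and allocates nothing. */
    QuoteFlyweight wrap(ByteBuffer buffer, int base) {
        this.buffer = buffer;
        this.base = base;
        return this;
    }

    long price()   { return buffer.getLong(base + PRICE_OFFSET); }
    int quantity() { return buffer.getInt(base + QUANTITY_OFFSET); }
    byte side()    { return buffer.get(base + SIDE_OFFSET); }

    /** Sums quantities over `count` back-to-back messages, reusing one flyweight. */
    static long totalQuantity(ByteBuffer buffer, int count) {
        QuoteFlyweight q = new QuoteFlyweight(); // single allocation, outside the loop
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += q.wrap(buffer, i * LENGTH).quantity();
        }
        return total;
    }
}
```

The `Unsafe` version is the same shape with the buffer and base replaced by a single raw address. In a real engine the flyweight itself would also be created once at startup rather than per call.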

In the course of optimizing a trading engine I wrote lots and lots of code to get allocations down to zero. It's definitely doable, but best done from the start. I refactored an existing trading engine to do that, and it was not very fun.

the_only_law

Do you know of any public examples of how this sort of Java looks? I imagine you lose out on being able to take advantage of much of the JVM ecosystem and I struggle to see what using Java even adds anymore.

bluestreak

Here is one: https://github.com/questdb/questdb. Disclaimer: I work on this project. The main reason we use Java is speed of development (which increased with the amount of base libraries written) and ease of testing.