We use cookies and other tracking technologies to improve your browsing experience on our site, analyze site traffic, and understand where our audience is coming from. To find out more, please read our privacy policy.

By choosing 'I Accept', you consent to our use of cookies and other tracking technologies.

We use cookies and other tracking technologies to improve your browsing experience on our site, analyze site traffic, and understand where our audience is coming from. To find out more, please read our privacy policy.

By choosing 'I Accept', you consent to our use of cookies and other tracking technologies. Less

We use cookies and other tracking technologies... More

Login or register
to apply for this job!

Login or register
to save this job!

Login or register to start contributing with an article!

Login or register
to see more jobs from this company!

Login or register
to boost this post!

Show some love to the author of this blog by giving their post some rocket fuel 🚀.

Login or register to search for your ideal job!

Login or register to start working on this issue!

Engineers who find a new job through Functional Works average a 15% increase in salary 🚀

Blog hero image

What I’ve learned after coding for HFT and Low Latency Systems

Ariel Silahian 24 November, 2020 | 2 min read

Choose the right language: FORGET about scripting languages, they won’t work. When you are looking to shave those last few microseconds off your processing time you cannot have the overhead of an interpreted language. Additionally, you will want a strong memory model to enable lock free programming so you should be looking at Java, Scala or my preferred one, C++11.

Keep it all in memory: Read everything at the beginning. FORGET about databases or active persistence. I/O will kill your latency, so make sure all of your data is in memory. This generally means managing your own in-memory data structures and maintaining a persistent log, so you can rebuild the state after a machine or process restart. Alternatively, you might be able to get away with running a local, persisted in-memory database like Redis or MongoDB (with memory >> data). Note that you can loose some data on crash due to their background syncing to disk, but that’s the price isn’t?

Keep your hardware underutilized: Low latency requires always having resources to process the request. Don’t try to run at the limit of what your hardware/software can provide. Always have lots of head room for bursts and then some.

Keep context switches to a minimum: your code must be tailored on your hardware. Context switches are a sign that you are doing more compute work than you have resources for. You will want to limit your number of threads to the number of cores on your system and to pin each thread to its own core.

Keep your reads sequential: make sure you are using CPU caches. All forms of storage, wither it be rotational, flash based, or memory performs significantly better when used sequentially. When issuing sequential reads to memory you trigger the use of prefetching at the RAM level as well as at the CPU cache level. If done properly, the next piece of data you need will always be in L1 cache right before you need it. The easiest way to help this process along is to make heavy use of arrays of primitive data types or structs. Following pointers, either through use of linked lists or through arrays of objects, should be avoided at all costs.

Non blocking as much as possible: design your system to create non blocking and wait free data structures and algorithms. Every time you use a lock you have to go down the stack to the OS to mediate the lock which is a huge overhead. Often, if you know what you are doing, you can get around locks by understanding the memory model of the JVM or C++11

Async as much as possible: Any processing and particularly any I/O that is not absolutely necessary for building the response should be done outside the critical path.

Parallelize as much as possible: Any processing and particularly any I/O that can happen in parallel should be done in parallel. For instance, if your high availability strategy includes logging transactions to disk and sending transactions to a secondary server those actions can happen in parallel.

Originally published on www.linkedin.com

Author's avatar
Ariel Silahian
Lead Developer | Design and implementation of algorithmic trading strategies/models.
    engineering manager
    quantitative trading
    C#
    C# dotnet
    C++
    Agile Development

Related Issues

WorksHub / client
  • Started
  • 0
  • 17
  • Intermediate
  • Clojure
  • $150
viebel / klipse-clj
viebel / klipse-clj
  • Open
  • 0
  • 0
  • Intermediate
  • Clojure
viebel / klipse
  • Open
  • 0
  • 0
  • Intermediate
  • Clojure
viebel / klipse
  • 1
  • 0
  • Intermediate
  • Clojure
viebel / klipse
  • Started
  • 0
  • 2
  • Intermediate
  • Clojure
  • $80
viebel / klipse
  • Open
  • 0
  • 0
  • Advanced
  • Clojure
  • $80
viebel / klipse
  • Started
  • 0
  • 2
  • Advanced
  • Clojure
  • $180
viebel / klipse
  • Started
  • 0
  • 1
  • Intermediate
  • Clojure
viebel / klipse
  • 1
  • 1
  • Advanced
  • Clojure
  • $300

Get hired!

Sign up now and apply for roles at companies that interest you.

Engineers who find a new job through Functional Works average a 15% increase in salary.

Start with GitHubStart with Stack OverflowStart with Email