Alpaca unlocks 100x faster order processing with Redpanda

Better performance and lower latencies = happy customers

November 30, 2021

After years of high-volume algorithmic trading, Alpaca set out to introduce major performance improvements into our Order Management System (OMS). For the unfamiliar, Alpaca is a developer-first API for stock, options, and crypto trading. Our custom-built OMS is designed for ultra-low latency and scale to accommodate millions of investors worldwide.

Performance is at the core of Alpaca’s DNA, and we wanted to build a best-of-breed system that could keep up with our rapid growth. To roll out a faster and easily scalable version of this system, we turned to Redpanda, a unified streaming data platform that is fully compatible with Apache Kafka® APIs and does not need a JVM or other complexities.

Now, Alpaca is excited to roll out version 2.0 of our OMS and provide customers with the following upgrades:

  • 100x faster order processing
  • Consistently low latency (even during heavy trading volumes)

The graph below illustrates how Alpaca’s order processing improved from ~150 milliseconds to consistently under 1.5 milliseconds!

Graph showing the speed of order processing with Redpanda.

In this post, we walk through Alpaca’s journey to 100x faster order processing with the help of Redpanda’s lean streaming data platform and their invaluable tech support.

What’s an order management system (OMS) in trading?

At Alpaca, the OMS is the “middle man” between every client order and the market. As the backbone of our trading service, the OMS needs to process orders efficiently. At a high level, OMS processes include:

  • Request processing and validation: When an order-related request is received, we must determine which account it relates to and, where applicable, which existing order it refers to, and then ensure that the order parameters are valid.
  • Account state: To validate the order, we associate the request with an account and evaluate the account’s buying power, current positions, etc. Since each processed order impacts the account’s effective buying power and asset positions, order processing is inherently sequential by design.
  • Market venue routing: Depending on the parameters of the order and account, we route the order to the appropriate market venue through rule processing.
  • Execution report processing: When messages, known as execution reports, come back from the market venues, we receive FIX messages for order acknowledgment, partial fills, fills, cancels, etc. The OMS parses these execution reports so that the order state reflects what the market has returned.
  • Complex order triggers: At Alpaca, we support various complex order types. Some order types require triggers that the OMS must execute based on the conditions of the order type.
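To make the validation and account-state steps above concrete, here is a minimal Go sketch. It is not Alpaca's code: the `Account` and `Order` types, the `Validate` and `Accept` functions, and the rule "notional value must not exceed buying power" are all assumptions made for illustration. It shows why orders for one account have to be handled one after another: each accepted order changes the buying power that the next order is checked against.

```go
package main

import (
	"errors"
	"fmt"
)

// Account is a hypothetical, simplified in-memory account state.
type Account struct {
	ID               string
	BuyingPowerCents int64 // cents, to avoid floating-point rounding
}

// Order is a hypothetical buy-order request.
type Order struct {
	AccountID  string
	Symbol     string
	Qty        int64
	LimitCents int64
}

var (
	ErrInvalidOrder = errors.New("invalid order parameters")
	ErrBuyingPower  = errors.New("insufficient buying power")
)

// Validate checks the order parameters and the account's buying power
// without changing any state.
func Validate(a *Account, o Order) error {
	if o.Symbol == "" || o.Qty <= 0 || o.LimitCents <= 0 {
		return ErrInvalidOrder
	}
	if o.Qty*o.LimitCents > a.BuyingPowerCents {
		return ErrBuyingPower
	}
	return nil
}

// Accept validates the order and reserves its notional value, so the next
// order for the same account is checked against the reduced buying power.
func Accept(a *Account, o Order) error {
	if err := Validate(a, o); err != nil {
		return err
	}
	a.BuyingPowerCents -= o.Qty * o.LimitCents
	return nil
}

func main() {
	acct := &Account{ID: "acct-1", BuyingPowerCents: 100_000}
	order := Order{AccountID: "acct-1", Symbol: "AAPL", Qty: 5, LimitCents: 15_000}

	fmt.Println(Accept(acct, order)) // <nil>: 75,000 cents are reserved
	fmt.Println(Accept(acct, order)) // insufficient buying power: only 25,000 remain
}
```

If the two calls in `main` ran concurrently against the same account, both could pass validation before either reservation was applied, which is the situation sequential per-account processing rules out.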

Since order processing is sequential per account, every millisecond saved on one order also shortens the wait for the orders queued behind it, so reducing processing time gets orders to market faster. OMS v2 is entirely in-memory and independent of other systems and services, which eliminates the data marshaling and network round-trip overhead of our previous OMS.

Key components of Alpaca’s OMS

For the curious, here’s an overview of our OMS’ architecture.

Architecture of Alpaca’s order management system

We won’t go into every detail here, but a few key components are worth highlighting.

OMS nodes with in-memory state

At the core of OMS v2 are horizontally scalable OMS nodes that maintain the account state entirely in memory. Each OMS is responsible for a particular set of accounts (defined via our sharding strategy).

By maintaining all account state in memory, we avoid the significant overhead of the round trips that would be required if the account state lived in a traditional RDBMS. We shifted the account and position state from a single relational database to a set of distributed nodes, each maintaining the state of a subset of accounts.
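As a rough illustration of what "state in memory" means for a single node, the sketch below keeps account state in a Go map guarded by a mutex. The `Store` and `AccountState` types and their methods are hypothetical names, and a single lock over all accounts is a simplification chosen for brevity. The post does not describe how the real OMS organizes its memory.

```go
package main

import (
	"fmt"
	"sync"
)

// AccountState is a hypothetical, simplified view of one account.
type AccountState struct {
	BuyingPowerCents int64
	Positions        map[string]int64 // symbol -> shares held
}

// Store keeps the state of every account assigned to this node in memory.
type Store struct {
	mu       sync.Mutex
	accounts map[string]*AccountState
}

func NewStore() *Store {
	return &Store{accounts: make(map[string]*AccountState)}
}

// Update applies fn to one account while holding the lock, creating the
// account entry on first use.
func (s *Store) Update(accountID string, fn func(*AccountState)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	st, ok := s.accounts[accountID]
	if !ok {
		st = &AccountState{Positions: make(map[string]int64)}
		s.accounts[accountID] = st
	}
	fn(st)
}

// Position reads the shares held for a symbol straight from memory;
// unknown accounts and symbols report zero.
func (s *Store) Position(accountID, symbol string) int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	if st, ok := s.accounts[accountID]; ok {
		return st.Positions[symbol]
	}
	return 0
}

func main() {
	store := NewStore()
	// Record a hypothetical fill of 10 shares.
	store.Update("acct-1", func(a *AccountState) { a.Positions["AAPL"] += 10 })
	fmt.Println(store.Position("acct-1", "AAPL")) // 10
}
```

Both the read and the write are map operations inside the process, where an RDBMS-backed design would pay a network round trip for each.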

Distributed Write Ahead Log (WAL)

For durability and recovery, we built a write-ahead log powered by Redpanda. When an OMS instance starts up, it rebuilds its state by hydrating the account state from a database and then replaying events from Redpanda, starting at the last known offsets of the partitions that the particular OMS node consumes.
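The recovery path described above can be sketched as "load a snapshot, then replay everything after its offsets". The Go example below models that with plain slices and maps rather than a live broker. The `Event` and `Snapshot` types, the `Replay` function, and the idea that each event carries a buying-power delta are assumptions for illustration. In a real deployment the events would come from a Kafka-API consumer positioned at the stored offsets.

```go
package main

import "fmt"

// Event is a hypothetical WAL record; real records would carry order and
// execution details rather than a single delta.
type Event struct {
	Partition  int32
	Offset     int64
	AccountID  string
	DeltaCents int64 // change in buying power
}

// Snapshot is the state hydrated from the database, together with the
// first offset per partition that it does NOT yet reflect.
type Snapshot struct {
	BuyingPower map[string]int64
	NextOffset  map[int32]int64
}

// Replay applies every event at or after the snapshot's offsets and
// returns how many events were applied. Events are assumed to arrive in
// offset order within each partition.
func Replay(s *Snapshot, wal []Event) int {
	applied := 0
	for _, e := range wal {
		if e.Offset < s.NextOffset[e.Partition] {
			continue // already reflected in the snapshot
		}
		s.BuyingPower[e.AccountID] += e.DeltaCents
		s.NextOffset[e.Partition] = e.Offset + 1
		applied++
	}
	return applied
}

func main() {
	snap := &Snapshot{
		BuyingPower: map[string]int64{"acct-1": 100_000},
		NextOffset:  map[int32]int64{0: 2}, // offsets 0 and 1 are in the snapshot
	}
	wal := []Event{
		{Partition: 0, Offset: 0, AccountID: "acct-1", DeltaCents: -10_000},
		{Partition: 0, Offset: 1, AccountID: "acct-1", DeltaCents: -5_000},
		{Partition: 0, Offset: 2, AccountID: "acct-1", DeltaCents: -20_000},
		{Partition: 0, Offset: 3, AccountID: "acct-1", DeltaCents: 3_000},
	}
	fmt.Println(Replay(snap, wal), snap.BuyingPower["acct-1"]) // 2 83000
}
```

Because the snapshot records the next offset per partition, running `Replay` twice over the same log applies nothing the second time, which keeps a restart from double-counting events.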

This setup has the added benefit that other services can listen to trade and order-related events by consuming the WAL in an entirely decoupled fashion.

As a bonus, Redpanda introduced us to Travis Bischel, creator of franz-go, the Go client we used to interact with Kafka within OMS v2. Travis was extremely helpful in providing suggestions and adding changes to franz-go to support our use case.

Balancer

We use stateless load balancers that are made aware of the OMS nodes via configuration. Our externally facing platform API communicates with OMS v2 via gRPC calls to our OMS balancer pool. The balancers maintain gRPC connections to each healthy OMS node and route calls from the API to the appropriate OMS instance.
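A stateless balancer of this kind can be sketched as a pure function from account ID to node address. The example below is an illustration only: the post does not describe Alpaca's sharding strategy, so the FNV hash modulo the node count, the `Node` and `Balancer` types, and the addresses are all assumptions. It leaves out the gRPC connections and models health as a boolean flag.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// Node is one configured OMS node; its index in the list is its shard.
type Node struct {
	Addr    string
	Healthy bool
}

// Balancer holds only configuration and health flags, no account state.
type Balancer struct {
	nodes []Node
}

var (
	ErrNoNodes         = errors.New("no OMS nodes configured")
	ErrNodeUnavailable = errors.New("owning OMS node is unavailable")
)

// shardFor maps an account to a shard index. Hash-modulo is an assumed
// strategy for this sketch, not the one described in the post.
func shardFor(accountID string, shards int) int {
	h := fnv.New32a()
	h.Write([]byte(accountID))
	return int(h.Sum32() % uint32(shards))
}

// Route returns the address of the node that owns the account. It does not
// fall back to another node, because only the owner holds the account's
// in-memory state.
func (b *Balancer) Route(accountID string) (string, error) {
	if len(b.nodes) == 0 {
		return "", ErrNoNodes
	}
	n := b.nodes[shardFor(accountID, len(b.nodes))]
	if !n.Healthy {
		return "", ErrNodeUnavailable
	}
	return n.Addr, nil
}

func main() {
	b := &Balancer{nodes: []Node{
		{Addr: "oms-0:9000", Healthy: true},
		{Addr: "oms-1:9000", Healthy: true},
	}}
	addr, err := b.Route("acct-42")
	fmt.Println(addr, err)
}
```

Since routing depends only on the account ID and the configured node list, any balancer in the pool gives the same answer, which is what lets the balancers stay stateless.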

What’s next for Alpaca?

While we’re thrilled with the significant improvements and scalability of our new OMS, we look forward to continuing our roadmap with further work to reduce order processing time, lower latencies, and increase the data throughput of our order execution services.

Lastly, we’d like to thank the Redpanda team for their continued support and guidance in designing our write-ahead log. We highly recommend checking out Redpanda’s streaming data platform for its speed and simplified deployment. Working with both the platform and their team — from development to production — was a pleasure. Based on our experience, we plan to leverage Redpanda across many more services.

Check out the Redpanda Blog for examples, step-by-step tutorials, and real-world customer stories. To get started with Redpanda, sign up for a free trial and dive into the Docs.
