Benchmarking database sharding in Akka


Akka 24.05 introduced a significant feature: database sharding for event storage. It lets developers reach very high throughput with ordinary relational databases like PostgreSQL, throughput that normally requires high-priced databases. In this post, we explore this feature through a series of benchmarks that show how it scales from one to eight database instances.

For a comprehensive understanding of Akka's approach to database sharding, we encourage you to read the article “The database is the bottleneck” by Patrik Nordwall. It explains in depth how Akka implements sharding, the concept of slice ranges, and how these are mapped to databases.
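As a rough illustration of the slice-range idea, the sketch below assumes the 1024 slices used by Akka Persistence and splits them into equal, contiguous ranges, one per database. The equal split, the `zlib.crc32` hash (a stand-in for the JVM hash Akka applies to the persistence id), and all function names are assumptions made for this example, not Akka APIs.

```python
import zlib

NUMBER_OF_SLICES = 1024  # number of slices in Akka Persistence


def slice_for_persistence_id(persistence_id: str) -> int:
    """Map a persistence id to a slice in [0, 1023].

    Assumption: crc32 is only a stand-in hash for this sketch.
    """
    return zlib.crc32(persistence_id.encode("utf-8")) % NUMBER_OF_SLICES


def slice_ranges(number_of_databases: int) -> list:
    """Split the slices into equal, contiguous ranges, one per database."""
    if number_of_databases < 1 or NUMBER_OF_SLICES % number_of_databases != 0:
        raise ValueError("number of databases must divide 1024 evenly")
    size = NUMBER_OF_SLICES // number_of_databases
    return [range(i * size, (i + 1) * size) for i in range(number_of_databases)]


def database_for_slice(slice_number: int, number_of_databases: int) -> int:
    """Return the index of the database whose slice range contains the slice."""
    for index, slices in enumerate(slice_ranges(number_of_databases)):
        if slice_number in slices:
            return index
    raise ValueError("slice must be between 0 and 1023")


if __name__ == "__main__":
    device_slice = slice_for_persistence_id("device-42")  # hypothetical id
    print(device_slice, database_for_slice(device_slice, 4))
```

With four databases, slices 0 to 255 go to the first database, 256 to 511 to the second, and so on. Because an entity always hashes to the same slice, all of its events land in the same database.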

Benchmark setup

Application

To evaluate the performance of Akka's new database sharding feature, we created a small Akka application that emulates a digital twin solution for IoT devices. Each device keeps track of its last ten updates, to simulate some complex failure detection logic when processing a new update and to put some pressure on the heap memory. A single update (write request) contains a 500-byte array of random data, which is appended to the event store. A read request returns the latest update from the device. Depending on the test, we keep from 3k to 5k device entities per Akka node. Each Akka node is deployed as a Kubernetes pod on a separate host. In other words, one host = one Akka pod. We test request-response HTTP communication with JSON-based payload serialization. Internally, we use Protocol Buffers to exchange data between Akka nodes and for persistence.
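The entity behaviour described above can be sketched roughly as follows. This is not the benchmark application's code: the `DeviceTwin` class, its method names, and the in-memory `journal` list standing in for the event store are all hypothetical.

```python
import os
from collections import deque


class DeviceTwin:
    """In-memory digital twin that remembers the most recent updates."""

    def __init__(self, device_id: str, history_size: int = 10):
        self.device_id = device_id
        # A deque with maxlen drops the oldest update automatically.
        self.history = deque(maxlen=history_size)
        # Hypothetical stand-in for the append-only event store.
        self.journal = []

    def update(self, payload: bytes) -> None:
        """Write path: append the event, then update the in-memory state."""
        self.journal.append(payload)
        self.history.append(payload)

    def latest(self):
        """Read path: answer from memory without touching the journal."""
        return self.history[-1] if self.history else None


if __name__ == "__main__":
    twin = DeviceTwin("device-42")  # hypothetical device id
    for _ in range(15):
        twin.update(os.urandom(500))  # 500 bytes of random data per update
    print(len(twin.history), len(twin.journal))
```

The write path is the only one that appends to the journal; reads are served from the state held in memory.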

Database

Every RDS database used for benchmarks has the same configuration:

  • Instance type: db.r6g.2xlarge (8 vCPUs, 64 GB RAM)
  • Storage: 100 GB io2, 5000 IOPS

Test

For testing, we used Gatling as a load generator tool. In each scenario we are testing a mixed load with 75% read operations and 25% update (write) operations. We gradually increase the throughput until we reach the maximum for a given setup and keep it for 20 minutes.
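The 75/25 mix translates directly into read and write rates for any total throughput. A minimal sketch of that arithmetic (the function name `split_load` is made up for this example):

```python
def split_load(total_req_per_s: int, read_ratio: float = 0.75):
    """Split a total request rate into (reads, writes) per second."""
    reads = round(total_req_per_s * read_ratio)
    writes = total_req_per_s - reads
    return reads, writes


if __name__ == "__main__":
    for total in (108_000, 252_000, 500_000, 1_000_000):
        print(total, split_load(total))
```

For a total of 108,000 req/s this gives 81,000 reads and 27,000 writes per second.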

Results

1 database instance

        Throughput (req/s)   Latency (ms, p99)
All     108,000              9
Read    81,000               3
Write   27,000               14

To handle this load, the Akka cluster used 6 * EC2 m5.4xlarge (16 vCPUs) instances. That’s our baseline. If you are not familiar with Akka, keep in mind that thanks to the Actor Model, read requests do not touch the database. Handling 27k writes per second with a single Postgres database is also pretty impressive. Using AWS io2 storage type definitely helps a lot to hold that throughput.

2 database instances

        Throughput (req/s)   Latency (ms, p99)
All     252,000              17
Read    189,000              8
Write   63,000               28

This time we pushed the throughput even further, sacrificing some latency. The Akka cluster used 12 * EC2 m5.4xlarge (16 vCPUs) instances.

4 database instances

        Throughput (req/s)   Latency (ms, p99)
All     500,000              29
Read    375,000              15
Write   125,000              49

We are starting to see a very nice, close to linear, scaling characteristic. To handle 500k req/s, the Akka cluster required 25 * EC2 m5.4xlarge (16 vCPUs) instances.

8 database instances

        Throughput (req/s)   Latency (ms, p99)
All     1,000,000            34
Read    750,000              23
Write   250,000              52

I must admit, seeing 1M req/s in Grafana was very satisfying. Keep in mind that the latency was measured at very high resource saturation in all tests. It can be improved by switching to a less intensive load or increasing the number of Akka nodes.

[Figure: requests-responses]

This time we used a different instance configuration. The Akka cluster fully utilized 25 * EC2 m5.8xlarge (32 vCPUs) instances. Together with the databases, 864 vCPUs were required to handle this load. The load generation part required 4 * c7i.4xlarge Gatling instances to produce the required throughput.
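The 864 vCPU figure follows from the instance sizes listed in this post, as this small check shows (the constant and function names are made up for the example):

```python
AKKA_NODES = 25
VCPUS_PER_AKKA_NODE = 32  # m5.8xlarge
DATABASES = 8
VCPUS_PER_DATABASE = 8    # db.r6g.2xlarge


def total_vcpus() -> int:
    """Sum the vCPUs of the Akka nodes and the database instances."""
    akka = AKKA_NODES * VCPUS_PER_AKKA_NODE
    databases = DATABASES * VCPUS_PER_DATABASE
    return akka + databases


if __name__ == "__main__":
    print(total_vcpus())
```

That is 800 vCPUs for the Akka cluster plus 64 vCPUs for the databases. The Gatling load generators are not part of this total.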

Analysis

A graph is worth more than a thousand words, so let’s examine this data from a different perspective.

[Figure: scaling-characteristic]

The benchmark results reveal a compelling story of scalability. As we increase the number of database instances from one to eight, we observe a near-linear growth in throughput. This impressive scalability demonstrates the effectiveness of Akka's database sharding strategy.
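To put numbers on the scaling, the sketch below takes the total throughput from each results table and derives the throughput per database and the speedup over the single-database baseline. The helper names are made up for this example.

```python
# Total throughput (req/s) per number of database instances, from the results.
RESULTS = {1: 108_000, 2: 252_000, 4: 500_000, 8: 1_000_000}


def per_database_throughput(results: dict) -> dict:
    """Total throughput divided by the number of databases."""
    return {n: total / n for n, total in results.items()}


def speedup(results: dict) -> dict:
    """Throughput relative to the single-database baseline."""
    baseline = results[1]
    return {n: total / baseline for n, total in results.items()}


if __name__ == "__main__":
    print(per_database_throughput(RESULTS))
    print(speedup(RESULTS))
```

Per database, throughput works out to 108,000, 126,000, 125,000 and 125,000 req/s, so each added database contributes about as much as the ones before it. Relative to the baseline, the speedups are roughly 2.3x, 4.6x and 9.3x. The Akka cluster also grew between runs, so these ratios describe the whole setup rather than the databases alone.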

[Figure: cost-per-1k-iops-per-month]

On the other hand, costs are close to a flat line, varying from $24.32 to $28.78 per 1000 IOPS per month, where one IOPS is one read or write operation per second. For all calculations, we used a one-year reserved instance pricing.
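The metric itself is simple to compute once the monthly bill is known. The sketch below shows one way to express it; the function name is made up, and the monthly cost in the usage example is a placeholder for illustration, not a figure from the benchmark.

```python
def cost_per_1k_iops(monthly_cost_usd: float, ops_per_second: float) -> float:
    """Monthly cost divided by throughput expressed in thousands of ops/s."""
    if ops_per_second <= 0:
        raise ValueError("ops_per_second must be positive")
    return monthly_cost_usd / (ops_per_second / 1000)


if __name__ == "__main__":
    # Placeholder input: a hypothetical 2,500 USD/month setup at 100,000 ops/s.
    print(cost_per_1k_iops(2_500.0, 100_000))
```

A flat line on this metric means the monthly cost grows in proportion to the throughput.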

Summary

The ability to distribute write operations across multiple databases while maintaining data consistency is a key factor in this performance boost. By alleviating the pressure on a single database, Akka allows each instance to operate more efficiently, resulting in a cumulative increase in overall system throughput.

However, it's important to note that perfect linear scaling is rare in real-world scenarios. Factors such as network latency, coordination overhead, and potential hotspots in data distribution may introduce some deviation from the ideal linear growth.

By enabling high throughput with ordinary PostgreSQL instances, Akka provides a cost-effective scaling solution. Organizations can leverage their existing PostgreSQL expertise and infrastructure while achieving performance levels that previously may have required specialized and more complex database solutions, which could have led to a major system rewrite.

For me, as an architect, the actual numbers are not that important. Whether it is 500k req/s or 501k req/s is not a big difference. The most important aspect is that even if I decide to use a moderate Postgres instance, I keep my options open for future scaling requirements. I can scale not only vertically but also horizontally. Some data migration is required for a live data set, but at least I’m not blocked. Not to mention that, from the application perspective, not a single line of code needs to change to work with more than one database. It’s just a matter of changing the database configuration.
