Emils Solmanis

io is not (that) slow

the boogie man used to be real

You may have heard that IO is slow. When I was just learning engineering, it was always emphasised that disk access was the ultimate sin if you wanted your application to perform. It would grind your program to a halt, and if you really HAD to do it, it had to be sequential. Random disk access used to be feared even more than swapping. Similar things could be said about sending or receiving data across the network.

.. but he’s afraid of the future

Nowadays, round-trip latency in data centres has gone way down, and some networks have started supporting proper RDMA, which drives latencies into the tens of microseconds. Even without RDMA, round trips are no longer in the 500μs range.

Similarly, storage drives have seen explosive innovation, and even an SSD from 2012 is many times slower than the ones available today. But here’s the thing not many people know — it’s not the drive that’s slow. It’s the interface.

The SATA 3 interface is limited to 6 Gbps by the standard. Most SATA SSDs nowadays clock in a bit under 550 MB/s in sequential reads, and that 550 MB/s (4.4 Gbps) mark is about what you'll get even with high-end hardware, because line coding and protocol overhead eat into the raw 6 Gbps. The non-technical reason the ceiling sits where it does is rather simple: the spec was written while rotational disks still hadn't properly died off.
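If you want to see where the gap between the headline 6 Gbps and the real-world 550 MB/s comes from, here's a rough back-of-the-envelope sketch in Python; the 8b/10b line coding factor is part of the SATA spec, and the 550 MB/s figure is the typical drive throughput mentioned above, not a fresh measurement.

```python
# Back-of-the-envelope: why a "6 Gbps" SATA 3 link tops out near 550 MB/s.
# SATA uses 8b/10b line coding, so only 8 of every 10 bits on the wire carry
# payload; protocol overhead shaves off a little more on top of that.

LINE_RATE_GBPS = 6.0           # SATA 3 signalling rate
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b line coding

usable_mb_s = LINE_RATE_GBPS * 1000 / 8 * ENCODING_EFFICIENCY
print(f"usable payload bandwidth ~ {usable_mb_s:.0f} MB/s")          # ~600 MB/s

observed_mb_s = 550  # typical high-end SATA SSD sequential read
print(f"drive uses ~{observed_mb_s / usable_mb_s:.0%} of the link")  # ~92%
```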

an alternate universe

Well not really. It’s this one that’s evolving, and at an astonishing pace. Since introducing new interfaces is expensive, the simplest and most obvious solution is reusing PCIe, the interface that already powers your GPU’s monstrous appetite. For example, Samsung recently released their first proper consumer-grade PCIe SSD, the 950 Pro. Retailing at £270, it’s not much more expensive than a regular SSD. It uses a PCIe x4 interface, and the 512GB model clocks a ridiculous 354k IOPS, which works out to a bit under 3μs per random 4K read when the drive is kept busy. Go check that archaic table about how slow disk access is. I’ll wait. Yes, it says a random 4K read is 0.15ms, i.e. 150μs.

At 2.6 GB/s, the sequential reads are just plain ridiculous. Think about that number for a moment. For comparison, DDR4 DRAM bandwidth, even at 3200 MHz, is around 60 GB/s. That’s not orders of magnitude faster than non-volatile storage anymore; it’s merely 20 times faster, and that’s for high-end gaming memory. Server-grade ECC modules don’t have the luxury of sacrificing reliability, so they’re stuck at lower speeds and will generally sit around 30 GB/s, a mere 10 times faster.
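The arithmetic behind those comparisons is straightforward; here's a small sketch of it. The 354k IOPS, 2.6 GB/s, 60 GB/s and 30 GB/s figures are the ones quoted above, not independent measurements.

```python
# Rough arithmetic behind the numbers in the text above.

iops = 354_000                  # random 4K read IOPS quoted for the 950 Pro 512GB
per_op_us = 1_000_000 / iops
print(f"time per random 4K read ~ {per_op_us:.1f} us")   # just under 3 us

ssd_seq_gb_s = 2.6              # sequential read, GB/s
dram_gaming_gb_s = 60.0         # high-end DDR4-3200, roughly
dram_server_gb_s = 30.0         # typical server-grade ECC memory, roughly

print(f"gaming DRAM : SSD ~ {dram_gaming_gb_s / ssd_seq_gb_s:.0f}x")  # ~23x, the "merely 20 times" above
print(f"server DRAM : SSD ~ {dram_server_gb_s / ssd_seq_gb_s:.0f}x")  # ~12x, the "mere 10 times" above
```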

into the void

There are a couple of takeaways from these technological advances. One is that you should study your algorithms. Optimising for hardware is going away to a large degree, so you'd better know your data structures. This doesn’t mean that CPU caches are going anywhere, but much of the “in-memory” buzz is going to become irrelevant.

Also, chucking resources at things is going to become really cheap. With persistent storage approaching memory speeds, real-time data pipelines become much easier to build. Imagine having terabytes upon terabytes of RAM-speed storage available. It’s hard to believe that only a couple of years ago we were stuck with Hadoop and MapReduce.

What do you think? Is super-fast persistent storage just another hipster fad, or will it take root and eventually replace existing drives? Let us know in the comments!

