Modern Memory

Each of the solid-state disks in the BBC's media server will be written to and read from almost constantly, so it's critical that the disks have good I/O performance lest they become a bottleneck in our system.

The results of initial tests seemed good for an upgrade candidate SSD: little regression from our current disks, but with double the capacity. So we selected what will be referred to as disk 'B', a 1TB disk, to replace the current 500GB disk 'A'. After building out several pools, though, things didn't look so good.

This is where we started considering all the knobs and levers we could twist and pull: disk-related properties such as flash cell geometry, the behaviour of our RAID controller, and any on-controller or on-disk cache. We also started looking at disk C, the "enterprise version" of disk B.

Before we go on, it's worth having a quick overview of how an operating system interacts with a disk, and how SSDs in particular work.

Most SSDs do some kind of wear levelling - distributing writes over all the available erase blocks in order to extend the lifetime of the disk. SSDs tend to be black boxes in this respect, and there is more than one approach, e.g. static and dynamic wear levelling.
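To make the idea concrete, here is a minimal sketch of dynamic wear levelling - not how any particular disk's firmware works, just the general shape: the flash translation layer always directs the next write to the free erase block with the fewest program/erase cycles. The `WearLeveller` class and its method names are hypothetical.

```python
import heapq

class WearLeveller:
    """Toy model of dynamic wear levelling in an SSD's flash
    translation layer (FTL). Free erase blocks are kept in a
    min-heap keyed on erase count, so writes always land on the
    least-worn block and wear spreads across the whole disk."""

    def __init__(self, num_blocks):
        # (erase_count, block_id) pairs; every block starts unworn.
        self.free_blocks = [(0, b) for b in range(num_blocks)]
        heapq.heapify(self.free_blocks)

    def allocate(self):
        """Hand out the least-worn free block for the next write."""
        erase_count, block = heapq.heappop(self.free_blocks)
        return block, erase_count

    def release(self, block, erase_count):
        """After garbage collection erases a block, return it to the
        free pool with its wear counter incremented."""
        heapq.heappush(self.free_blocks, (erase_count + 1, block))

wl = WearLeveller(4)
block, wear = wl.allocate()  # always the least-worn free block
wl.release(block, wear)      # erased and returned, now one cycle older
```

Static wear levelling goes further by also migrating long-lived cold data off lightly-worn blocks, which this sketch does not attempt.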

Now recall that some approaches to reducing write amplification include compression and block deduplication... disk A does these things, probably turning several gibibytes of writes into a few kibibytes of flash usage. The other disks tested do not appear to do these tricks.
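The effect of block deduplication on a repetitive workload can be sketched in a few lines: hash each block and count the distinct ones, as a deduplicating controller might before committing anything to flash. The 4 KiB block size is an assumption for illustration.

```python
import hashlib
import os

BLOCK = 4096  # assume a 4 KiB flash block for illustration

def unique_blocks(data):
    """Count distinct BLOCK-sized chunks - a rough proxy for how much
    a block-deduplicating controller would actually write to flash."""
    return len({hashlib.sha256(data[i:i + BLOCK]).digest()
                for i in range(0, len(data), BLOCK)})

zeros = bytes(BLOCK * 256)             # 1 MiB of zero-filled writes
random_data = os.urandom(BLOCK * 256)  # 1 MiB of random writes

print(unique_blocks(zeros))        # 1   - almost nothing reaches flash
print(unique_blocks(random_data))  # 256 - every block reaches flash
```

A mebibyte of zeros collapses to a single unique block, while random data gives the controller nothing to deduplicate - which is why a repetitive test payload flatters a disk like A.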

After replacing the mock origin with something that produced data that looked truly random, the incredible load test results from disk A started looking a lot like those of the other disks. The difference is remarkable.
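A cheap sanity check for a load-test payload is to measure how well it compresses: if the mock origin's data squeezes down to almost nothing, a compressing disk will make the test look far better than production would. This sketch uses zlib as a stand-in for whatever the disk does internally; the repeated `GET /segment.ts` payload is a made-up example of patterned mock data.

```python
import os
import zlib

def compress_ratio(data):
    """Compressed size over original size: near 0.0 means highly
    compressible, near 1.0 means effectively incompressible."""
    return len(zlib.compress(data)) / len(data)

patterned = b"GET /segment.ts\r\n" * 65536  # repetitive mock payload
truly_random = os.urandom(len(patterned))   # what we switched to

print(round(compress_ratio(patterned), 3))     # tiny: mostly squeezed away
print(round(compress_ratio(truly_random), 3))  # ~1.0: incompressible
```

Running a check like this against the test payload before a benchmark would have flagged the problem much earlier.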

As a team developing and maintaining caching infrastructure, we should know that caches are important. It seems that the impact of write-back cache can vary, and any new disk should be tested with it both enabled and disabled.