Author: Leif Walsh (Two Sigma)
Presented at: Percona Live, Amsterdam
Abstract: Whether your data’s in MySQL, a NoSQL store, or somewhere in the cloud, you’re likely paying decent money for storage and IOPS. With ever-growing data volumes, the need for SSDs to cut latency, and the need for replication to provide insurance, your storage footprint is an important place to look for savings. It makes sense, then, that so many storage vendors tout compression as a key metric and differentiator.
The language that vendors and users employ to reason about storage footprint and compression is embarrassingly vague, if not meaningless or downright deceptive. We can do better, and we must.
This presentation discusses each part of the durable storage stack, from the hardware on up, and how usage numbers can take on different meanings at each layer. It covers what’s important to know at each layer, and how to think and talk about concepts like compression, fragmentation, write amplification, and wear leveling. Finally, it examines the ways benchmarketers present data deceptively, and provides some techniques for identifying and cutting through those misrepresentations.