What’s broken? - Storage sprawl

In the physical world, the ratio of OS instances to boxes was one to one. VM technology blows this up, not only with all the live VMs, but also with all the other images that are suspended or saved for backup or rollback. It is not hard to build a data center with tens – even hundreds – of thousands of VM images, and together they suck up a lot of disk space.

VM Sprawl

IT pros say that, all things being equal, virtual servers consume 15-25% more storage than physical servers. VM sprawl hits the hardware budget directly and adds an ongoing operating cost on top.

The irony is that the vast majority of the bits inside those images are exactly the same, because the VMs are cloned from a small set of golden masters, and a lot of the bits in a VM don’t change as the VM runs. Deduplication could help here, but in this use case, needing dedupe is more of a bug than a feature. If the storage infrastructure were built right, the dupes wouldn’t be created in the first place.
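
To make the "don't create dupes in the first place" point concrete, here is a minimal sketch of one way to do it: copy-on-write cloning, where each clone stores only the blocks it has changed and reads everything else from the shared golden master. The text above does not name a technique, so treat copy-on-write as an assumption. The class names (`GoldenMaster`, `CowClone`), the 4 KB block size, and the fleet numbers are all hypothetical, chosen only for illustration.

```python
BLOCK_SIZE = 4096  # bytes per block; an arbitrary illustrative value


class GoldenMaster:
    """Read-only base image, stored once as a list of fixed-size blocks."""

    def __init__(self, blocks: list[bytes]) -> None:
        self.blocks = blocks

    def size(self) -> int:
        return sum(len(b) for b in self.blocks)


class CowClone:
    """Hypothetical copy-on-write clone: keeps only blocks that differ from the master."""

    def __init__(self, master: GoldenMaster) -> None:
        self.master = master
        self.overlay: dict[int, bytes] = {}  # block index -> modified block

    def read(self, index: int) -> bytes:
        # Unmodified blocks are served from the shared master.
        return self.overlay.get(index, self.master.blocks[index])

    def write(self, index: int, data: bytes) -> None:
        # A write never touches the master; it lands in this clone's overlay.
        self.overlay[index] = data

    def own_size(self) -> int:
        return sum(len(b) for b in self.overlay.values())


def compare_storage(num_blocks: int, num_clones: int, changed_per_clone: int) -> tuple[int, int]:
    """Return (full_copy_bytes, cow_bytes) for a fleet of clones of one master."""
    master = GoldenMaster([bytes([i % 256]) * BLOCK_SIZE for i in range(num_blocks)])
    clones = []
    for _ in range(num_clones):
        clone = CowClone(master)
        for idx in range(changed_per_clone):
            clone.write(idx, b"\xff" * BLOCK_SIZE)
        clones.append(clone)
    # Full copies: every clone carries its own copy of every block.
    full_copy = num_clones * master.size()
    # Copy-on-write: one master plus each clone's changed blocks.
    cow = master.size() + sum(c.own_size() for c in clones)
    return full_copy, cow


if __name__ == "__main__":
    full, cow = compare_storage(num_blocks=1000, num_clones=100, changed_per_clone=10)
    print(f"full copies: {full} bytes, copy-on-write: {cow} bytes")
```

With these made-up numbers (100 clones of a 1,000-block master, each changing 10 blocks), the full copies take about 410 MB while the copy-on-write layout takes about 8 MB, and no after-the-fact dedupe pass is needed because the duplicate blocks were never written.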