Can your legacy SAN deliver Quality of Service (QoS)? Is popcorn a vegetable?
Tuesday, February 12, 2013 posted by Dave Wright
Think about it. If corn is a vegetable, why isn't popcorn?
Likewise, if storage performance can be guaranteed, why can't any
storage architecture do it?
It's a hard truth to face: legacy storage systems are simply not
designed to handle the demands of multi-tenant cloud environments.
More specifically, the few systems that claim storage Quality of
Service (QoS) - or want to claim it on their roadmap - are really
just "bolting it on" as an afterthought. And these "bolted on"
methods of achieving QoS have unfortunate side effects.
Before we dive in further, let's first discuss why you should care
about true storage QoS as a cloud service provider.
Hosting business-critical applications in the cloud represents a
large revenue growth opportunity for cloud service providers. But
until storage performance is predictable and guaranteed, you won't
be able to reliably attract this type of business from your
enterprise customers. Is there a solution? Yes, and the answer is
storage QoS architected from the ground up with guaranteed
performance in mind.
Let's take a closer look at some of the "bolt-on" methods that
legacy systems use to try to perform something they can market as
"QoS."
Prioritization
How it works - Prioritization defines applications simply as "more" or "less" important in relation to one another. This is often done in canned, well-described tiers such as "mission critical," "moderate," and "low."
Why it doesn't really offer QoS - While prioritization can give some apps higher relative performance than others, it doesn't tell you what performance to expect from any given tier. It certainly can't guarantee performance, particularly if the problematic "noisy neighbor" sits at the same priority level, so no single application is assured of the performance it needs. What's more, a tenant has no way to understand what their priority designation means relative to the other priorities on the same system. Telling a tenant they are prioritized as "moderate" means nothing unless they know how "moderate" compares to the other categories and what system resources are dedicated to that tier. Finally, priority-based QoS can make a "noisy neighbor" LOUDER if that tenant has a higher priority, because a higher-priority tenant is allowed more resources with which to turn up the volume.
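To make the same-tier "noisy neighbor" problem concrete, here is a minimal sketch of a priority-weighted allocator. It is a toy model, not any vendor's scheduler: the tier weights, the `allocate_iops` function, and the assumption that service is shared in proportion to weight times outstanding demand are all hypothetical choices made for illustration.

```python
# Hypothetical weights for the canned tiers; real systems pick their own.
PRIORITY_WEIGHTS = {"mission critical": 4, "moderate": 2, "low": 1}


def allocate_iops(total_iops, tenants):
    """Split total_iops among tenants in proportion to weight * demand.

    tenants maps a tenant name to (priority, demanded_iops).
    Assumption: within the system, a tenant that queues more IO gets
    proportionally more service, scaled by its priority weight.
    A tenant never receives more than it asked for.
    """
    pressure = {
        name: PRIORITY_WEIGHTS[priority] * demand
        for name, (priority, demand) in tenants.items()
    }
    total_pressure = sum(pressure.values())
    return {
        name: min(tenants[name][1], total_iops * p / total_pressure)
        for name, p in pressure.items()
    }


# Two "moderate" tenants with modest demand: both are fully served.
quiet = allocate_iops(10_000, {"A": ("moderate", 2_000),
                               "B": ("moderate", 2_000)})

# Tenant B turns noisy. A keeps the same priority but loses most of its IO.
noisy = allocate_iops(10_000, {"A": ("moderate", 2_000),
                               "B": ("moderate", 50_000)})
```

In this model tenant A's priority label never changes, yet its delivered IOPS fall from 2,000 to under 500 once B floods the same tier. The label by itself says nothing about what A will actually receive.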
Rate limiting
How it works - Rate limiting attempts to deal with performance
requirements by setting a hard limit on an application's rate of IO
or bandwidth. Customers that pay for a higher service level get a
higher limit.
Why it doesn't really offer QoS - Rate limiting
can help quiet noisy neighbors, but does so only by "limiting" the
amount of performance that an application has access to. This
one-sided approach does nothing to guarantee that the set
performance limit can actually be attained. Rate limiting is all
about protecting the storage system rather than delivering true QoS
to the applications. In addition, firm rate limits set on
high-performance or bursty applications can inject significant
undesired latency.
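A token bucket is one common way to implement a hard rate limit, and a small simulation shows where the latency comes from. The sketch below is an assumption for illustration only: the `TokenBucket` class, its `admit` method, and the numbers are hypothetical, and the article does not say which mechanism any particular array uses.

```python
class TokenBucket:
    """Toy IOPS limiter: tokens refill at rate_iops, capped at burst."""

    def __init__(self, rate_iops, burst):
        self.rate = rate_iops
        self.capacity = burst
        self.tokens = float(burst)
        self.last = 0.0

    def admit(self, now, n_ios):
        """Return how many seconds the request must wait at time `now`.

        The bucket is allowed to go into debt; the debt divided by the
        refill rate is the delay imposed on the request.
        """
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        self.tokens -= n_ios
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate


# A tenant limited to 1,000 IOPS with a 100-IO burst allowance.
bucket = TokenBucket(rate_iops=1_000, burst=100)
first_wait = bucket.admit(now=0.0, n_ios=100)   # fits within the burst
burst_wait = bucket.admit(now=0.0, n_ios=500)   # bursty app is throttled
```

Here the 500-IO burst is held back for half a second. Note also what the limiter does not do: nothing in it checks whether the backend can actually deliver 1,000 IOPS to this tenant, so the number is only a ceiling.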
Dedicated storage
How it works - IT managers attempt to deliver predictable performance by dedicating specific disks or drives to a particular application, isolating it from other applications or noisy neighbors.
Why it doesn't really offer QoS - Dedicating storage to an application goes a long way toward eliminating "noisy neighbors," yet even dedicated infrastructure cannot guarantee a level of performance. A component failure in one of these storage islands can have a massive impact on application performance as system bandwidth and IO are redirected to recovering from the failure. Despite the dedication of resources, this approach still falls short in its ability to guarantee performance at any level.
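The effect of a component failure on a dedicated storage island can be sketched with simple arithmetic. The function name, the per-disk IOPS figure, and the share of IO consumed by the rebuild are all hypothetical values chosen for illustration; real numbers depend on the media, RAID level, and rebuild policy.

```python
def usable_iops(disks, iops_per_disk, failed=0, rebuild_share=0.0):
    """Estimate application-visible IOPS for a dedicated disk group.

    rebuild_share is the assumed fraction of remaining IO capacity
    consumed by recovery while a failed disk is being rebuilt.
    """
    raw = (disks - failed) * iops_per_disk
    if failed:
        return raw * (1.0 - rebuild_share)
    return raw


healthy = usable_iops(disks=8, iops_per_disk=150)
degraded = usable_iops(disks=8, iops_per_disk=150,
                       failed=1, rebuild_share=0.5)
```

Under these assumptions the application drops from 1,200 IOPS to 525 IOPS during the rebuild, even though no other tenant ever touched its disks.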
Tiered storage
How it works - Multiple tiers of different storage media (SSD, 15K
rpm HDD, 7.2K rpm HDD) are combined to deliver different tiers of
performance and capacity. Application performance is determined by
the type of media the application resides on. In an effort to
optimize application performance, predictive algorithms are layered
over the system. Based on historical performance information, they
try to predict which data is "hot" and should be kept on SSD and
which data is "cold" and should be kept on HDD.
Why it doesn't really offer QoS - Tiering is the
worst of all the "bolted on" solutions designed for delivering
predictable performance. Quite simply, this solution is unable to
deliver any level of storage QoS. Tiering actually amplifies "noisy
neighbors" because they appear hot and are promoted to
higher-performing (and scarcer) SSDs, thereby displacing other
volumes to lower-performing, cold disks. Performance for every
tenant varies
wildly as algorithms move their data between media. No particular
tenant knows what to expect of their IO as they don't control the
tiering algorithm or have any insight into the effect on other
tenants. Some tiering solutions try to offer QoS by pinning the
data of a particular application into a specific tier, but this is
essentially dedicated storage (discussed above) at an even higher
cost than usual.
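The displacement effect is easy to reproduce with a toy placement policy. The sketch below assumes the simplest possible heuristic, in which the volumes with the most recent IO win the limited SSD slots. The `place_volumes` function, the volume names, and the IO counts are hypothetical; production tiering algorithms are more elaborate but face the same scarcity.

```python
def place_volumes(recent_io, ssd_slots):
    """Put the hottest volumes on SSD and everything else on HDD.

    recent_io maps a volume name to its recent IO count, the only
    signal this toy policy uses to decide what is "hot".
    """
    ranked = sorted(recent_io, key=recent_io.get, reverse=True)
    hot = set(ranked[:ssd_slots])
    return {vol: ("SSD" if vol in hot else "HDD") for vol in recent_io}


# Only one volume fits on SSD. The database is the hottest, so it wins.
before = place_volumes({"db": 900, "web": 400, "batch": 100}, ssd_slots=1)

# A batch job turns noisy. It now looks hottest and evicts the database.
after = place_volumes({"db": 900, "web": 400, "batch": 5_000}, ssd_slots=1)
```

The database's own workload is identical in both runs, yet it moves from SSD to HDD purely because of another tenant's behavior.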
Stay tuned to our blog to learn more about storage QoS, and about why a scale-out storage architecture designed from the ground up to deliver and guarantee consistent performance to thousands of volumes simultaneously is the ideal system for building performance SLAs in a multi-tenant cloud environment.
-Adam Carter, Director of Product Management

