Blog


Extending The Storage Disruption Cycle

Thursday, January 26, 2012 posted by Dave Cahill

"There comes a time when a storage company needs to define itself by what it does for customers and not by the machinery it uses to do so."

Chris Mellor, "How to tell if your biz will do a Kodak", The Register

The Register's Chris Mellor penned a great article the other day reflecting on the continuous cycles of innovation and disruption that have come to characterize the storage media industry. He uses Kodak to paint the picture of an incumbent getting capsized by a media transition. He goes on to cite other examples across tape and optical media where incumbents failed to manage the transition to the next generation media.

As the storage industry has transitioned through different media types, there have always been opportunistic stopgap innovations that have bridged the gap from one generation to the next. Virtual Tape Library (VTL) technology is a great example of an innovation serving as a transitional bridge between the tape and disk eras. Once applications were written with the capability to natively interface with disk, deduplication and compression drove down solution costs quickly, making disk an effective bulk storage medium. Once disk was financially viable, the floodgates opened and tape was relegated to deep archive. Similarly, today we are seeing flash-based caching and tiering technologies form a transitional bridge while the $/GB economics of flash fully converge with, and eventually eclipse, those of disk.

So with history as a guide for how this plays out, why will the disk to flash media transition be any different than the ones before it? Well, I suspect this cloud thing might have something to do with it.

In the enterprise IT sector, systems always seem to consume features over time. At its core, the cloud is a massive infrastructure system that, when used properly, is an extension of existing IT. However, cloud infrastructures will increasingly chip away at the incumbent IT footprint by rapidly incorporating new innovations into their architectures. These enabling innovations allow cloud providers to continually expand their portfolio of cloud services. Over time the IT use cases applicable to this medium naturally expand as applications and interfaces catch up, performance improves and the economic value proposition can no longer be ignored.

So what does this mean? From our perspective, the cloud adds a third leg to the innovation sequence we have witnessed in the past. New component level technologies will continue to enable new architectures. But where it gets interesting is when these new architectures drive the performance and economics to enable new cloud services.

In storage, the media innovations that Mellor refers to, and their related price/performance value proposition, are a powerful enabling force behind new storage architectures. Applied to traditional IT cost centers, these architectures are interesting; applied to profit-driven cloud services, they are game changing. Amazon's recently announced DynamoDB service is an early instantiation of this extended innovation sequence, where component level technologies (SSD) enable new architectures that drive new services. Fortunately for the end-customers, the economics of flash are only getting better from here. Now it is up to the storage industry to innovate on top of this medium, delivering next generation systems that can extend the reach of cloud hosted services to an even wider range of application workloads.

-Dave Cahill, Director of Strategic Alliances

Inefficiency & Unpredictability...A Service Provider's Worst Enemy

Tuesday, January 24, 2012 posted by Dave Wright

In our first two posts on storage tiering we talked through the difference between capacity-centric vs. performance-centric approaches and also exposed some of the hidden costs of an automated tiering implementation. Closing out this mini-series I wanted to touch on a few other deficiencies inherent to an automated tiering solution.

Within a storage infrastructure it is IOPS, not capacity, that are the most expensive and limited resource. In a tiered architecture, SSDs are inserted into the equation to try to improve the balance between IOPS and capacity. However, while an SSD tier may reduce performance issues for well-placed data, the usage of this expensive tier remains inefficient. This inefficiency stems from a lack of granularity in the data movement of a tiered system. If a sub-LUN tiering system has to move hot data in chunks of anywhere from 32MB to 1GB, it will likely promote a lot of cold data along with the hot data in each chunk. This overhead forces sub-optimal utilization of the premium SSD capacity.
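As a rough illustration of the granularity problem, the sketch below estimates how much of a promoted chunk is actually hot. The 32MB to 1GB chunk sizes come from the paragraph above; the 4MB hot region and the function name `ssd_tier_efficiency` are assumptions made up for this example, not measurements from any real array.

```python
MB = 1024 * 1024
GB = 1024 * MB


def ssd_tier_efficiency(hot_bytes: int, chunk_bytes: int) -> float:
    """Fraction of a promoted chunk that is actually hot data."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    return min(hot_bytes, chunk_bytes) / chunk_bytes


# Assumption for illustration only: each hot region is 4MB of contiguous data.
HOT_REGION = 4 * MB

for chunk in (32 * MB, 256 * MB, 1 * GB):
    efficiency = ssd_tier_efficiency(HOT_REGION, chunk)
    cold = chunk - HOT_REGION  # cold bytes dragged into SSD with the hot region
    print(f"chunk={chunk // MB:>5}MB  hot share={efficiency:.2%}  "
          f"cold data promoted={cold // MB}MB")
```

Under that assumption, the larger the chunk, the smaller the share of premium SSD capacity doing useful work. Real workloads will have different hot-region sizes, so treat the output as a shape, not a benchmark.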

Another potential problem area from tiering, specifically in a multi-tenant environment, is dealing with IO density - that is, how IO is distributed across a range of disk space. Applications whose IOs are concentrated within close proximity to each other (IO dense) will gain greater benefit from sub-LUN tiering than those whose IOs are spread more evenly over the entire logical block address space (IO sparse). Because tiering mechanisms measure data usage at the chunk level, an application that has more hits within a small number of chunks is more likely to be promoted than an application that spreads the same number of IOPS across more chunks. From an array performance perspective this approach is reasonable, as you get more performance within the same resource footprint. However, in a multi-tenant setting with data distributed across many distinct applications, this leads to serious problems with fairness and performance consistency across workloads.
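The fairness issue can be seen in a toy model. The sketch below is not any vendor's tiering algorithm; it simply counts hits per chunk and promotes the hottest chunks, which is the behavior described above. The tenant names, the 32MB chunk size, the IO counts and the function name `promote_hottest_chunks` are all assumptions for illustration.

```python
from collections import Counter

MB = 1024 * 1024
CHUNK = 32 * MB  # assumed tiering chunk size


def promote_hottest_chunks(io_offsets_by_tenant, chunk_size, ssd_chunk_slots):
    """Rank (tenant, chunk) pairs by hit count and return the promoted ones."""
    hits = Counter()
    for tenant, offsets in io_offsets_by_tenant.items():
        for offset in offsets:
            hits[(tenant, offset // chunk_size)] += 1
    # Highest hit count first; ties broken by key so the result is deterministic.
    ranked = sorted(hits.items(), key=lambda item: (-item[1], item[0]))
    return [key for key, _ in ranked[:ssd_chunk_slots]]


# Both hypothetical tenants issue the same 1,000 IOs.
workloads = {
    # IO dense: all IOs land in 2 chunks (500 hits per chunk).
    "dense": [(i % 2) * CHUNK for i in range(1000)],
    # IO sparse: IOs spread over 100 chunks (10 hits per chunk).
    "sparse": [(i % 100) * CHUNK for i in range(1000)],
}

promoted = promote_hottest_chunks(workloads, CHUNK, ssd_chunk_slots=2)
print(promoted)
```

With only two SSD slots available, every promoted chunk belongs to the dense tenant, even though both tenants generate identical IO load. The sparse tenant is left entirely on the slow tier.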

We originally discussed the performance implications of tiering in July of last year. In a multi-tenant setting this performance variability exposure is magnified. Customers are continually exposed to the risk that the promotion of another customer's hot data will result in the demotion of their own. The order of magnitude difference in latencies and IOPS between the different tiers makes it practically impossible for a service provider to guarantee performance to an individual application (or tenant) under these conditions.

In recognition of the deficiencies of a tiered architecture, SolidFire sought a better way. Our Performance Virtualization technology decouples the tight binding between the storage performance and capacity, resulting in a far more precise allocation of IOPS and capacity on a volume by volume basis regardless of issues such as IO density. Instead of best guess efforts as to the size and tiers of media required to meet customer performance requirements, a service provider can now dial-in IOPS and capacity individually at the volume-level from cluster-wide independent pools of capacity and performance. These allocations can also be dynamically adjusted over time as application requirements change. All things considered, Performance Virtualization is a far more efficient way to address IOPS scarcity, without exposing customers to the inefficiency and unpredictable performance inherent in an automated tiering architecture.
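To show what allocating IOPS and capacity independently might look like, here is a minimal sketch. It is a hypothetical model, not SolidFire's API or implementation: the `Cluster` and `Volume` classes, their methods and all of the numbers are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    size_gb: int
    guaranteed_iops: int


@dataclass
class Cluster:
    """Hypothetical cluster with separate pools of capacity and IOPS."""
    capacity_gb: int
    iops: int
    volumes: dict = field(default_factory=dict)

    def free_capacity_gb(self) -> int:
        return self.capacity_gb - sum(v.size_gb for v in self.volumes.values())

    def free_iops(self) -> int:
        return self.iops - sum(v.guaranteed_iops for v in self.volumes.values())

    def create_volume(self, name: str, size_gb: int, guaranteed_iops: int) -> None:
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        if size_gb > self.free_capacity_gb():
            raise ValueError("not enough free capacity")
        if guaranteed_iops > self.free_iops():
            raise ValueError("not enough free IOPS")
        self.volumes[name] = Volume(size_gb, guaranteed_iops)

    def set_iops(self, name: str, guaranteed_iops: int) -> None:
        """Adjust a volume's IOPS guarantee without touching its capacity."""
        volume = self.volumes[name]
        if guaranteed_iops - volume.guaranteed_iops > self.free_iops():
            raise ValueError("not enough free IOPS")
        volume.guaranteed_iops = guaranteed_iops


# Made-up cluster size and tenant volumes.
cluster = Cluster(capacity_gb=10_000, iops=200_000)
cluster.create_volume("small-but-hot", size_gb=100, guaranteed_iops=50_000)
cluster.create_volume("large-but-cold", size_gb=5_000, guaranteed_iops=1_000)
cluster.set_iops("large-but-cold", 20_000)  # requirements changed over time
print(cluster.free_capacity_gb(), cluster.free_iops())
```

The point of the design is that a small volume can be given a large IOPS guarantee and a large volume a small one, and either number can be changed later without resizing a tier.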

-Dave Wright, Founder & CEO

Amazon launches DynamoDB...We like what we see!

Wednesday, January 18, 2012 posted by Dave Wright

Amazon launched a new service today: DynamoDB. It's a scalable NoSQL database service that will run in the AWS cloud. It is akin to a hosted version of Cassandra or MongoDB with unlimited scalability. The most notable section of Werner Vogels' blog announcing the new service is worth repeating:

Cloud-based systems have invented solutions to ensure fairness and present their customers with uniform performance, so that no burst load from any customer should adversely impact others. This is a great approach and makes for many happy customers, but often does not give a single customer the ability to ask for higher throughput if they need it.

As satisfied as engineers can be with the simplicity of cloud-based solutions, they would love to specify the request throughput they need and let the system reconfigure itself to meet their requirements. Without this ability, engineers often have to carefully manage caching systems to ensure they can achieve low-latency and predictable performance as their workloads scale. This introduces complexity that takes away some of the simplicity of using cloud-based solutions.

The number of applications that need this type of performance predictability is increasing: online gaming, social graphs applications, online advertising, and real-time analytics to name a few. AWS customers are building increasingly sophisticated applications that could benefit from a database that can give them fast, predictable performance that exactly matches their needs.

Looking under the covers a bit further here there are two really interesting enabling components of the DynamoDB service that deserve highlighting:

  1. All-SSD - The service is deployed using 100% SSDs to provide consistent high performance at a very large scale. This is notable in that it is AWS' first use of SSDs in their cloud architecture.
  2. Guaranteed Throughput - The DynamoDB service includes a concept called "Provisioned Throughput". This is essentially a guaranteed QoS model, where a customer can purchase reserved capacity (measured in queries per second), rather than paying for the actual queries run. Applied to a storage service, this would be akin to paying based on guaranteed IOPS. Amazon EBS's current pricing model is based on actual IO operations with no guaranteed throughput or latency.

Amazon DynamoDB is a strong endorsement of several of SolidFire's key principles. The first is that the cloud needs Solid-State Drives (SSD) to adequately support the evolving performance demands of multi-tenant storage. The second is the idea that as more of these performance-sensitive applications make their way to the cloud there is a clear requirement for guaranteed QoS controls that can dynamically support performance requirements at a much more granular level. Finally, and building off the first two, is the validation that when armed with the enabling architecture to confidently and economically deliver performance-based services, service providers can stand up cloud service offerings based on committed performance.

Amazon is a great indicator on the pulse and direction of the industry. The broader implications here for running performance sensitive applications in a cloud environment are intriguing to think about. Here at SolidFire, the continued innovations around the enabling architectures required to make this a reality are what get us really excited.

-Dave Wright, Founder & CEO

The Diseconomies of Tiering

Tuesday, January 17, 2012 posted by Dave Wright

In the initial post of our series on tiering we covered the merits of a proactive performance-driven approach to tiering relative to the more traditional capacity-centric discussions. Today we take a closer look at some of the less obvious cost implications of "automated" tiering. On the surface, the promise of tiering looks like a clear win - SSD performance with spinning disk capacity and cost. However, the true economics of this type of solution are not nearly as compelling as some vendors would lead you to believe. Considered in the context of the unique burdens faced by cloud service providers, the proposed value proposition is even less appealing.

To start with, the "SSD performance" promise part of the catchy tagline above must be caveatted by the fact that this only proves to be the case if the data is actually residing in the SSD tier. Easier said than done. The ability to guarantee SSD performance in a tiered architecture requires a substantial SSD tier and/or extremely accurate data placement algorithms. Rightsizing the former skews the proposed economics of a tiered solution substantially, while the latter has been long on promise but short on delivery for at least three generations of marketing executives. Before the industry marketed this functionality as Automated Tiering it was known as Information Lifecycle Management (ILM) and a few years before that it was Hierarchical Storage Management (HSM). Regardless of what you call it, tiering has always been impaired by the inability to accurately predict and automate the movement of data between tiers. In the context of cloud environments the significant scale requirements and extremely low application-level visibility make solving this challenge even more difficult.

It's also important to consider the flash media requirements of a tiered solution. The write patterns in the flash layer of a tiered architecture require a higher grade flash solution to withstand the impact of write amplification and churn. Vendors are forced to use the most expensive SLC flash to ensure adequate media endurance. The cost impact of even modest amounts of SLC flash destroys the economic advantage of a tiered architecture relative to an all-MLC design. In many examples we've seen, the "combined" $/GB of a storage solution that incorporates SLC flash, 15k SAS and SATA is actually higher than that of an all-flash MLC solution with similar raw capacity. Importantly, this price advantage for MLC over tiered storage is achieved before factoring in the favorable impact of compression and deduplication for the all-flash solution, making the flash design even more compelling.
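For readers who want to run this comparison against their own quotes, the "combined" $/GB is simply a capacity-weighted average, and data reduction divides the raw $/GB by the reduction ratio. The sketch below shows that arithmetic. The function names are ours, and every capacity, price and ratio in it is a placeholder chosen only to exercise the formulas. None of them are market data, so the result depends entirely on the inputs you substitute.

```python
def blended_cost_per_gb(tiers):
    """Capacity-weighted $/GB for tiers given as (capacity_gb, cost_per_gb)."""
    total_gb = sum(capacity for capacity, _ in tiers)
    if total_gb <= 0:
        raise ValueError("total capacity must be positive")
    total_cost = sum(capacity * cost for capacity, cost in tiers)
    return total_cost / total_gb


def effective_cost_per_gb(raw_cost_per_gb, reduction_ratio):
    """$/GB of stored data after compression and deduplication."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_gb / reduction_ratio


# PLACEHOLDER inputs, not quotes or market data: (capacity in GB, $/GB).
tiered = [
    (5_000, 20.0),   # SLC flash tier
    (20_000, 3.0),   # 15k SAS tier
    (75_000, 1.0),   # SATA tier
]
all_mlc = [(100_000, 2.0)]  # all-flash MLC system of the same raw capacity

tiered_raw = blended_cost_per_gb(tiered)
mlc_raw = blended_cost_per_gb(all_mlc)
# Placeholder 2:1 data reduction applied to the all-flash design only.
mlc_effective = effective_cost_per_gb(mlc_raw, reduction_ratio=2.0)

print(f"tiered raw $/GB:        {tiered_raw:.2f}")
print(f"all-MLC raw $/GB:       {mlc_raw:.2f}")
print(f"all-MLC effective $/GB: {mlc_effective:.2f}")
```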

Tiering also hurts capacity utilization and controller performance. In order to ensure data is in the right place at the right time it is constantly being promoted and demoted between the flash and disk tiers. There needs to be a certain capacity buffer to accommodate this movement. There is also a controller processing cost to keep up with all this activity. Most legacy systems have limited CPU and controller memory relative to their overall capacity, making the overhead of tiered storage processing one more burden for them to manage. Even complex tiering requires only a fraction of the processing power and memory needed for in-line data reduction features like compression and deduplication, which is why those features are seldom found on legacy primary storage controllers. A recent article from TechWorld references a Forrester Research report by Andrew Reichman (@ReichmanIT) that expands on the data management burden of a tiered storage topology.

The issues outlined above are just a few examples of the hidden costs embedded in an "automated" tiering solution. In some cases these deficiencies may be acceptable in smaller IT environments. However, in a large scale multi-tenant cloud infrastructure the capital and management costs of these shortcomings are magnified. The hyper-competitive nature of the service provider business model necessitates a more efficient approach.

-Dave Wright, Founder & CEO

Capacity vs. Performance Tiering

Tuesday, January 10, 2012 posted by Dave Wright

In our end of year blog we reviewed a number of the unique storage challenges that infrastructure service providers face in building and operating a large-scale, and profitable, cloud offering. A clear understanding of these issues provides a more constructive lens through which to assess the viability of a storage solution within a high-performance cloud-scale setting. This approach is particularly useful for understanding the basis of SolidFire's thoughts on the merits of "automated" storage tiering in a large scale cloud.

As promised, we kick off our first of three blogs on this topic below. If you happened to miss our initial thoughts on this subject you can go back and read them here and here. We look forward to your feedback as we go.

Within the enterprise, storage tiering has become a popular vendor solution to improve performance for a subset of applications. With tiering the performance gain is achieved by retrofitting a disk-based array with an SSD tier and some intelligent fetching/data placement algorithms. Tiered storage systems are most effective when an IT manager has direct visibility into the usage profiles of the applications that reside on the system.  This allows the IT manager to size each tier appropriately, continually ensuring there is enough room in "fast disk" to accommodate demand. When data is not in demand it is then moved to slower speed disk. Overall, this is both a reactive and human-centric model that requires constant monitoring and adjustments to ensure each storage tier is rightsized to accommodate the access patterns of different volumes across the data set. The continuous promotion and demotion of data to the different tiers also comes at the cost of endurance due to excess wear on the flash media.

When operating a large scale public cloud environment, customer applications and their associated usage patterns are largely unknown to the service provider. How do you most effectively allocate tiers of storage without ongoing visibility into the access requirements of a particular application? How big should the SSD tier be? How much SATA capacity should be used? When should data be promoted or demoted between tiers? Might a better question be: how many IOPS need to be available within the storage system? Unfortunately, for cloud service providers with unpredictable demand patterns across a large number of tenants, trying to spec out a system in this manner is impossible.

From SolidFire's perspective, the best way to manage performance in a multi-tenant cloud environment is to approach this problem from the demand side of the equation (i.e. application performance) as opposed to the supply side (i.e. storage capacity).  Proactive performance management based on IOPS demanded by the application offers a far more efficient approach to allocating storage resources, rather than trying to guess the right quantity and capacity of each tier within the system. Armed with fine-grain performance controls, storage performance management should no longer be a complex, reactive and resource intensive experience. By leveraging a system that can assign and guarantee IOPS on a volume by volume basis, all of the guesswork around right sizing for application performance is eliminated. 

For a quick graphical depiction of how SolidFire brings this concept to life, check out our 90 second video on  Performance Virtualization.

-Dave Wright, Founder & CEO

 

Looking Back Before We Charge Forward

Tuesday, December 13, 2011 posted by Dave Wright

2011 was a foundational year for us here at SolidFire. Emerging from stealth mode at Structure in June, to a great VMworld panel and TechFieldDay appearance in September, and more recently announcing our new financing round in late-October, we have been hard at work. The best part is we are just getting started. Beyond enhancing our product, building our team and spreading the word, we have spent countless hours with cloud service providers (CSPs) listening to the challenges they encounter when attempting to deploy profitable high-performance cloud-storage infrastructures.  Today we are solving these problems with a select group of early access customers, and we look forward to making the SolidFire system broadly available in 2012.  We don't think that cloud computing will ever be the same.

Conversations with CSPs throughout the past year have continued to reinforce our belief that this customer segment is unique in its scale, business model and the solutions that it requires. Driving this conclusion are four important qualifiers that clearly differentiate their IT environment and resulting storage system requirements from those of the traditional enterprise. These factors are:

  • Ability to Provide Predictable Performance
  • Massive Scale
  • Multi-Tenancy
  • Lack of Application-level visibility

Individually, each of these factors imposes unique pressures on the IT environment. Taken together, they demand an entirely new approach. Deeply understanding the architectural implications of these collective burdens provides a more constructive lens through which to assess the viability of one solution versus another in a cloud-scale setting.


A frequently debated topic that highlights the importance of evaluating customer requirements from a more holistic viewpoint is that of automated storage tiering. Originally blogging on the topic back in July, we have continued to evolve our thinking on the topic and I would like to introduce a blog series in which we cover our view on tiering at length. 

Talk to any IT manager about how they are keeping up with performance demands and you will increasingly hear talk about resorting to unpredictable and resource intensive band-aids like tiering. In a controlled single system environment the dynamic tiering of data between SSD and SATA drives would seem to make sense. Unfortunately, the economics of these more tactical approaches, while viable in smaller topologies, start to break down under the burdens imposed in a cloud environment.

Once you introduce the elements of multi-tenancy, multiple applications, and the need to scale across multiple systems, a tiered approach exposes CSP customers to "all or nothing" performance disparity. Working around the unpredictable nature of this setup requires human intervention, eroding the proposed cost benefits of the automated tiering value proposition. When evaluated against the criteria above, the shortcomings of an automated tiering approach start to become very clear. Cloud service providers are forced to seek out alternative solutions that are better aligned with both the performance controls required in a multi-tenant cloud environment, and the efficiency mandated by the hyper-competitive nature of the cloud services market.

There is little debate that Quality of Service (QoS) is a key competitive differentiator for CSPs.  Consequently, they cannot afford to gamble with the performance variability inherent to a tiering or cache-based solution. The manual intervention required to tune and optimize these architectures on an ongoing basis is the antithesis of a profitable cloud-scale management model.  Coming out of the holiday break we will further explore storage tiering in even greater detail.  We will look at the differences between capacity and performance tiering, dive into the true economics of tiered solutions, and hash out the merits of local versus global deduplication. As always, please provide your feedback here on our blog.  We look forward to the conversation.

Happy and safe holidays to everyone and we look forward to seeing you in 2012.

-Dave Wright, Founder & CEO

SolidFire adds fuel to all-SSD storage solution with $25M in funding

Monday, October 31, 2011 posted by Jay Prassl

Over the last few months we have been writing about a number of topics surrounding SolidFire's all-SSD storage technology. It is important for us to strike a balance between being informational about SolidFire and being educational about how some of the most successful cloud service providers in the world are thinking about SSD technology and how guaranteed QoS is impacting their business. Here are SoftLayer and Virtustream discussing their thoughts on the use of solid-state technology in their cloud.

Currently there are a number of world-class cloud service providers evaluating over 500TB of SolidFire's all-SSD storage technology. They are evaluating the solution both technically and from a business perspective. For every customer we work with, SolidFire technology is radically changing their business. These IaaS providers are now able to invite new mission critical and performance sensitive applications into their cloud, and build new revenue streams and customer value around guaranteed performance. There is a very good reason that 3Par, EMC, and NetApp customers have all joined our Early Access program.

No other storage technology in the world has SolidFire's capability to combine revolutionary performance management, in-line storage efficiency, and full system automation.

There are many service providers out there simply making do with what has been available in the market. If you are reading this blog, you are probably one of them. You and each of our early access customers all feel the same way - current technology can't get me to where I want to go.

SolidFire can. SolidFire can bring your cloud to the next level and add to your bottom line in a way that no 3Par or NetApp system ever could. Think deduplicating a single volume is interesting? How about deduplicating your entire data store across thousands of customers? SolidFire offers not just incremental change, but rather a massive leap forward in how storage systems really SHOULD be built. Why wait for your current vendor to drag themselves up to date?

To help get SolidFire technology in front of every cloud service provider, today we announced the closing of our $25 million Series B funding round, bringing our total funding to $37 million. We will be investing in our sales and marketing teams to broaden our reach, and will be accelerating our technical development as well.

SolidFire is on a fantastic roll and we want to give you the chance to learn about our technology.  We have a webinar coming up on November 17th and want to urge you to carve out an hour to spend with us. We will be talking about: How Performance Virtualization Enables New Storage Services in the Cloud.

If your cloud is held back by complex, expensive storage systems and you would like to know more about our solution, attend our webinar or Talk with Us!!

-Jay Prassl, VP of Marketing

The Challenges of Cloud Service Providers-Part Three - Recap

Tuesday, October 18, 2011 posted by Dave Wright


To wrap up our VMworld video series hosted by Silicon Angle TV, I sat down with Virtustream's Matt Theurer and Softlayer's Duke Skarda to discuss, as a group, some of the challenges faced by cloud service providers. This conversation focuses largely on the barriers that these two companies face with traditional storage systems in the cloud, and the opportunities that flash storage presents.

For both companies, the use of all-SSD based technologies is changing the way they think about storage, and how they approach resolving the gap between server and storage performance. Matt discusses how SSD technology has inverted the capacity / performance imbalance that has existed for many years and how capacity will soon be the limiting factor within cloud storage architectures; a much easier metric to manage. Duke explains how block storage is a fundamental building block of cloud infrastructure, and traditionally the most problematic part to deal with.

I also got a bit of airtime to talk about the history of SolidFire and how my experience at Rackspace, and evaluating how traditional storage is used within the cloud, both helped me shape the technology of SolidFire and the market focus of the company.  It is important to keep in mind that SSDs do not constitute a different approach to storage.  SSDs are just part of the system.  How that system is architected, the functionality designed around the SSDs, and deep knowledge of your customer and their key feature-set, are all required when delivering a next generation storage solution. 

Many thanks to Matt and Duke for sharing their views on performance storage in the cloud, and to Silicon Angle TV for hosting us!

-Dave Wright, Founder & CEO

The Challenges of Cloud Service Providers-Part Two - Virtustream

Wednesday, October 12, 2011 posted by Jay Prassl


At VMworld earlier this fall Matt Theurer, SVP of Solutions Architecture, and Rodney Rogers, Chairman and CEO of Virtustream, took some time to sit down with the folks at SiliconAngle TV to discuss some of the challenges that cloud service providers are facing. During their discussion they talked about the specifics of their business and their focus on enabling high performance applications like SAP within their shared infrastructure. Key to their success in this space has been their ability to carve up compute, networking and IOPS and bundle them into what they call an "infrastructure unit". Customers can combine as many IUs as needed to meet their requirements, and this enables Virtustream to provide some of the most comprehensive SLAs in the industry.

Matt takes the conversation a bit deeper discussing some of the more granular performance challenges posed by traditional spinning media.  He discusses how the ability to guarantee storage performance would allow them to be even more exacting in their SLAs and raise their overall efficiency.  At SolidFire, one of our primary goals is to enable cloud service providers to allocate storage performance as easily as they allocate storage capacity; and to do so for thousands of volumes within a shared infrastructure.  This capability allows companies like Virtustream to wrap SLAs around exact performance metrics and maintain customer performance expectations regardless of the activity within the system.

-Jay Prassl, VP of Sales & Marketing

The Challenges of Cloud Service Providers - Part One - Softlayer

Tuesday, October 04, 2011 posted by Adam Carter

 


Nathan Day and Duke Skarda of SoftLayer were kind enough to talk with the guys from Silicon Angle/Wikibon.org on the Cube at this year's VMworld. During their discussion they touched on a major challenge that many cloud service providers are dealing with today: storage performance. One of the points that was brought up was fine-grained control over Quality of Service. They referred to "per volume" or "per account" control as storage nirvana. At SolidFire, our architecture was designed from the ground up with this in mind. The ability to guarantee consistent QoS to thousands of applications and thousands of customers is how SolidFire is making storage nirvana a reality.

-Adam Carter, Director of Product Management