9 Responses

  1. vaughn stewart (@vStewed)

    == Disclaimer – Pure Storage employee ==

    Michael,

    I’d like to elaborate on some of the points in this post. Some I vehemently agree with, and on others… well, let’s say I have a differing opinion.

    When it comes to data reduction results, both the data type and the implementation of the data reduction technologies will determine the results. For example, look at a data set that is complex to reduce, like Oracle. By and large, these databases are more conducive to data reduction via compression algorithms than deduplication.

    Inline compression processes tend to produce modest results with Oracle (let’s say 1.5:1 to 2:1 on average). This is due in large part to the storage platform prioritizing serving host-side (or front-end) I/O over achieving maximum data reduction. Platforms with background compression processes tend to produce greater data reduction results (let’s say 3:1 to 4:1), although the capacity savings are achieved sometime in the future. The coup de grâce is the platforms that implement both inline and background compression.
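
    A minimal sketch of the capacity arithmetic behind those ratios (the 100TB figure is illustrative, not from any test):

        # Effective capacity at the reduction ratios quoted above.
        raw_tb = 100  # hypothetical 100 TB of Oracle data

        for label, ratio in [("inline 1.5:1", 1.5), ("inline 2:1", 2.0),
                             ("background 3:1", 3.0), ("background 4:1", 4.0)]:
            stored_tb = raw_tb / ratio
            print(f"{label}: {raw_tb} TB stored in {stored_tb:.1f} TB "
                  f"({100 * (1 - stored_tb / raw_tb):.0f}% saved)")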

    With data deduplication, inline versus post-process tends to be a moot point. The granularity of the implementation will correlate to the reducibility of data. Fixed-block implementations are challenged (and often unable) to find commonalities due to the unique block header and tailcheck section of each data block in the database. Variable-length dedupe with 512B granularity provides a ‘sliding window’ to operate within and thus can find a 7.5KB dedupe match in the default Oracle data block size of 8KB.
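
    A toy Python illustration of that contrast (not any vendor’s actual algorithm; for simplicity it uses aligned 512B fingerprints rather than a true sliding window, which would also catch unaligned matches):

        import hashlib, os

        BLOCK, CHUNK = 8192, 512
        body = os.urandom(BLOCK - CHUNK)  # 7.5 KB of content shared by both blocks

        def oracle_style_block(block_id: int) -> bytes:
            # Stand-in for the unique per-block header/tailcheck metadata.
            header = hashlib.sha256(str(block_id).encode()).digest() * (CHUNK // 32)
            return header + body

        a, b = oracle_style_block(1), oracle_style_block(2)

        # Fixed 8 KB dedupe: one fingerprint per block, so the unique
        # header makes the two blocks look entirely different -> no match.
        print("8 KB fingerprints equal:",
              hashlib.sha256(a).digest() == hashlib.sha256(b).digest())

        # 512 B granularity: fingerprint each 512 B piece; the 7.5 KB
        # body dedupes even though the first 512 B of each block is unique.
        fingerprints_a = {hashlib.sha256(a[i:i + CHUNK]).digest()
                          for i in range(0, BLOCK, CHUNK)}
        matched = sum(hashlib.sha256(b[i:i + CHUNK]).digest() in fingerprints_a
                      for i in range(0, BLOCK, CHUNK))
        print(f"512 B pieces matched: {matched} of {BLOCK // CHUNK}")  # 15 of 16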

    With all consideration given to data compression and deduplication, data reduction will suffer if the contents of the database are encrypted. The loss of reduction often closely correlates to the complexity of the encryption.
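
    A quick way to see the encryption effect (random bytes stand in for strong ciphertext, which is statistically indistinguishable from random data):

        import os, zlib

        plaintext = b"customer_row,status=ACTIVE;" * 4096   # repetitive data
        ciphertext = os.urandom(len(plaintext))             # proxy for encrypted data

        for label, data in [("plaintext", plaintext), ("'ciphertext'", ciphertext)]:
            ratio = len(data) / len(zlib.compress(data))
            print(f"{label}: compresses {ratio:.1f}:1")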

    Bottom line: data type and the implementation of data reduction technologies matter equally in terms of results. I would posit that testing dedupe-friendly workloads like VDI will produce the most similar results between dissimilar storage platforms, whereas a varied testbed will produce varied results, allowing a customer to best determine the capabilities of the data reduction engine for broad adoption.

    All of this is meaningless unless we go beyond the technical vacuum and consider the impact of implementation on performance.

    I believe the salient point the AFA vendors have put forth is affordability with high performance. Modern AFAs tend to provide robust storage efficiencies (data reduction and dual-fault, parity-based data protection) and sub-millisecond, high-performance I/O.

    While flash is fast, one needs to understand that architectures matter. For example, readers can refer to a recent Storage Field Day 9 presentation, where an all-flash HCI configuration produced sub-millisecond latency results with an OLTP workload. These high-performance results were obtained on a capacity-inefficient configuration (mirroring with no data reduction). In other words, strong performance was achieved from a costly configuration (much more expensive than an AFA). Test results declined to around 6.5ms (aka high-end hybrid class) with a moderately more cost-effective configuration (mirroring with data reduction), and no results were shared from testing with a cost-efficient architecture (erasure coding and data reduction).

    I share these points not as a means to go negative but to reinforce clarity of message. Modern AFA platforms promote affordability and high performance. This combination allows one to run their most demanding workloads alongside those that are cost sensitive.

    I’d suggest that in order to demonstrate data reduction results with a hybrid architecture similar to what is possible from a modern AFA, you need to share test results that go beyond the modest I/O profiles (bandwidth and IOPS) shared in this post.

    I commend you for having the courage to ‘lift the kimono’, if you will, but based on the results shared in this post, I don’t think you have demonstrated that data reduction can be enabled for every, or even most, workloads that reside on a platform with a hybrid storage subsystem.

    – Cheers,
    v

    (apologies in advance for any typos)

  2. Carlos Quintas

    Great article. I think the strongest point here is that the focus should be on the type of data AND the type of data reduction techniques implemented, rather than on the media (HDD, SSD, etc.), although I feel that when facing customers, that turns out to be an exercise in itself, since it requires a deeper assessment of the environment to build a proper business case…

  3. vaughn stewart (@vStewed)

    I need to correct a mistake I made in my earlier comment re: the SFD9 test results with an AFA configuration. The sub-millisecond results were obtained on a configuration which was capacity-inefficient (single mirror without data reduction). The test results declined to around 5.5ms with a moderately cost-effective configuration (single mirror with data reduction), and 6.5ms was achieved from a more cost-efficient architecture (RAID 5 erasure coding with data reduction).

    The point I was trying to convey was the trade-off in performance and cost based on implementing capacity efficiency technologies. The SFD9 presentation did not include results obtained with a cost-optimized configuration (RAID 6 erasure coding with data reduction enabled).
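
    For context, a back-of-the-envelope sketch of the capacity side of that trade-off (the raw capacity, stripe widths, and 2:1 reduction ratio here are assumptions for illustration, not SFD9 figures):

        # Effective capacity under the configurations discussed, assuming
        # 100 TB raw, RAID-5-style 7+1 and RAID-6-style 6+2 erasure coding,
        # and a hypothetical 2:1 data reduction where enabled.
        raw_tb, reduction = 100.0, 2.0

        configs = {
            "mirror, no reduction":  (0.5, 1.0),        # 2 copies of all data
            "mirror + reduction":    (0.5, reduction),
            "RAID-5 EC + reduction": (7 / 8, reduction), # 7 data + 1 parity
            "RAID-6 EC + reduction": (6 / 8, reduction), # 6 data + 2 parity
        }

        for name, (efficiency, r) in configs.items():
            effective_tb = raw_tb * efficiency * r
            print(f"{name}: {effective_tb:.0f} TB effective from {raw_tb:.0f} TB raw")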

    My apologies for misrepresenting the results in my prior comments.

  4. Citrix XenDesktop deployment? Here is everything you need to know about Nutanix vs. Other SDS » myvirtualcloud.net

    […] Real World Data Reduction from Hybrid SSD + HDD Storage […]

  5. How effective are Nutanix’s data compression features? - Virtualization solution with a nuts

    […] The original post in English is here: http://longwhiteclouds.com/2016/05/23/real-world-data-reduction-from-hybrid-ssd-hdd-storage […]

  6. Your Network Is Too Slow For Flash And What To Do About It | Long White Virtual Clouds

    […] that exist with more traditional storage systems based on spinning disks. As I demonstrated in Real World Data Reduction from Hybrid SSD + HDD Storage you don’t need flash to get data reduction, but flash storage with data reduction […]

  7. Citrix MCS on Nutanix AHV: Unleashing the power of clones! • My Virtual Vision

    […] Real World Data Reduction from Hybrid SSD + HDD Storage […]
