Violin Bolts Dedupe, Compression Onto All-Flash Arrays
All-flash array maker Violin Memory has beefed up its all-flash arrays with much-needed inline deduplication and compression features. While both of these features have been around for years with disk-based arrays, adding them to flash arrays is a bit trickier given the high I/O rates of flash devices. But both dedupe and compression are absolutely necessary if the cost of flash arrays is going to be brought down closer to disk arrays.
The new Concerto 2200 is a dual-controller head that runs the dedupe and compression software separately from the Violin controllers inside of the all-flash arrays. These Concerto 2200 heads can either be acquired new and equipped with flash memory shelves for capacity or plugged into existing flash arrays. Eric Herzog, chief marketing officer and senior vice president of business development at Violin, tells EnterpriseTech that the dual-head controllers are the same Xeon E5-based controllers that came out with the Concerto 7000 arrays, which launched back in June and which run a suite of data management services such as asynchronous and synchronous replication, snapshots, transparent LUN mirroring, thin provisioning, and thin cloning, just to name a few.
To support the deduplication and compression software, Violin has bumped up the main memory in the dual controller heads. Deduplication and compression are both CPU-intensive workloads, and in fact, the performance penalty associated with deduplication is one reason why many companies do not use it in production.
The dedupe and compression software in the Concerto 2200 are absolutely meant to be used in production, and in fact are necessary to get the cost of all-flash arrays down to be competitive with all-disk arrays, says Herzog. The dedupe and compression software was not developed from scratch by Violin, but through a source code licensing agreement from FalconStor Software, which inked a $12 million joint development agreement with Violin Memory back in July 2013 to provide the data protection, dedupe, and compression software.
Herzog says that this FalconStor code had to be modified quite a bit to work with the Violin Memory Operating System, or VMOS for short, and the hardware-accelerated vRAID flash controllers inside of the storage shelves. The compression used by FalconStor in the Violin Memory controller heads is based on the LZO algorithm, and the combination of dedupe and compression – what Violin Memory calls data reduction – somewhere between 3:1 to 10:1 reduction, depending on the workload. On average, Violin Memory is telling customers to expect somewhere around a 6:1 data reduction with the combination of the two, which is a little better than the 5:1 reduction factors a lot of the industry players are giving as a baseline.
There are many workloads where deduplication is going to hinder performance or not offer a big benefit, and therefore the way Violin Memory has implemented its data reduction algorithms allows for customers to turn them on and off at a granular level. The data reduction features of the Concerto 2200 controllers will initially support NGFS file systems, offering deduplication on a share-by-share basis, and in the first quarter deduplication will be available on block-style data on a LUN-by-LUN basis.
The NFS ingest support is particularly useful for virtual server and virtual desktop workloads, and Violin Memory is chasing the VDI opportunity pretty hard. Herzog says that according to surveys done by VMware and Citrix Systems, whose respective Horizon View and XenDektop VDI brokers account for the majority of the market, about 85 percent of customers have a mix of persistent and non-persistent virtual desktops. A typical desktop needs somewhere around 20 I/O operations per second (IOPS) and about 20 GB of storage for a persistent desktop. A typical setup would include the Concerto 2200 head nodes, with the deduplication and compression software, plugging into a Violin 6000 flash array through Fibre Channel links. A cluster of servers running the ESXi hypervisor hosts the VDI images, and links back to the flash arrays over an Ethernet network using the NFS protocol. With persistent desktop images, such a setup can handle about 2,500 end users, and with non-persistent images, you can double that to around 5,000 seats, says Herzog.
The dual-head Concerto 2200 data reduction controller has a list price of $160,000. You can hang up to three Violin 6000 arrays off it in a 13U form factor, providing 112 TB of usable capacity across those three shelves that after the 6:1 compression and deduplication is added results in an effective usable capacity of 672 TB. This setup would cost around $4.02 per GB at list, says Herzog, but after discounts to channel partners, it would work out to around $1.81 per GB. That works out to somewhere around $75 per desktop to drive storage and IOPS in a VDI setup, after discounts.
The Concerto 2200 heads can plug into the Violin 6000 series of all-flash arrays as well as into the Concerto 7000s, which are versions of the arrays with the other data management software on them. In the latter case, with the Concerto 7100 you will end up with four controllers and two shelves for a total of 140 TB of raw capacity and with the Concerto 7200, it will be four controllers and four shelves with 280 TB of capacity. List price for the top-end Concerto 7200 hardware is $2.56 million, with another $105,000 for the data management software and another $160,000 for the dual-head Concerto 2200 running the inline dedupe and compression software. That works out to $2.83 million for 280 TB of raw flash capacity. Apply your discounts and data reduction ratios to get the effective cost per GB, but it could be under $1 per GB.