Covering Scientific & Technical AI | Monday, November 25, 2024

IBM Appliances Ease Deployment of DB2 BLU Analytics 

Back in April, IBM put out a tech preview of its BLU Acceleration in-memory feature for its DB2 database. Now the software is shipping, and the company is packaging it up into an appliance to make it easier for companies to get up and running quickly.

The appliances are modestly sized to start, but IBM says they will scale up to take on bigger jobs in the coming months.

The BLU Acceleration feature for IBM's homegrown DB2 database is only available on Power Systems machines running AIX, and it is not clear if that is a technical decision based on the feeds and speeds of the Power7+ processors or a marketing one because IBM's mission these days is to peddle more Power-based systems. (We suspect it is a little of the former and a lot of the latter.)

With BLU Acceleration, IBM is doing a number of different things to radically boost the performance of queries against the DB2 database. First, it includes a column-oriented table structure that sits alongside the traditional row-oriented tables in a relational database. The software also has data compression that can squeeze the tables down by a factor of 10, and interestingly, the encoding is such that queries can actually be run against the compressed data. BLU Acceleration is a bit different from other in-memory database implementations in that you do not have to put the entire database into main memory. The software also makes use of all of the threads in a Power processor to scan database tables in parallel, and uses vector processing to speed up those scans.
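To make the idea of running queries against compressed data more concrete, here is a minimal Python sketch. It assumes an order-preserving dictionary encoding, which is one general way to get this property; IBM's actual encoding may well differ, and the function names `encode_column` and `count_in_range` are hypothetical, made up for this illustration.

```python
from bisect import bisect_left, bisect_right

def encode_column(values):
    """Dictionary-encode one column using order-preserving integer codes.

    Assumption for illustration only: codes are assigned in sorted order,
    so comparing two codes gives the same answer as comparing the values.
    """
    dictionary = sorted(set(values))              # code i stands for dictionary[i]
    code_of = {v: i for i, v in enumerate(dictionary)}
    codes = [code_of[v] for v in values]
    return dictionary, codes

def count_in_range(dictionary, codes, low, high):
    """Count rows with low <= value <= high without decoding any row."""
    lo_code = bisect_left(dictionary, low)        # smallest code whose value >= low
    hi_code = bisect_right(dictionary, high) - 1  # largest code whose value <= high
    return sum(1 for c in codes if lo_code <= c <= hi_code)

# A tiny "sale year" column stored by itself, as a columnar table would store it.
years = [2004, 2012, 2012, 2007, 2013, 2012, 2009]
dictionary, codes = encode_column(years)
matches = count_in_range(dictionary, codes, 2012, 2012)
print(codes, matches)
```

The predicate is translated into code space once, and the scan then touches only the small integer codes of a single column, which is the combination of columnar layout and compression the paragraph above describes.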

To take an example, start with a machine with 32 Power7+ cores and 1 TB of memory, and cram in a 10 TB database table holding ten years of data. What if you wanted to scan sales in that database for one year? Well, the 10-to-1 compression gets you down to a 1 TB footprint, so the table can now fit into main memory instead of sitting on disk, and memory is orders of magnitude faster. Shifting to the columnar tables means you need to process only 10 GB of the data, because the query reads just the columns it needs, and using a feature called data skipping (which IBM has not adequately explained) gets it down to 1 GB of data to be scanned because you only pick out the data for that year. With 32 cores, each core scans a 32 MB chunk of that data, and with vector processing, you can speed that up by a factor of four. The result is that a query that might have taken several minutes now takes a few seconds. Generically speaking, IBM says that DB2 BLU speeds up query processing by somewhere between a factor of 8 and 25, with some early customers seeing a speedup as high as a factor of 1,000.
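The arithmetic in that example can be laid out step by step. The Python sketch below simply replays the article's round numbers. The variable names are hypothetical, and the final "scan-equivalent" figure is our own reading of the factor-of-four vector speedup rather than a number IBM has published.

```python
# Figures taken from the worked example; everything else is derived from them.
raw_tb = 10.0            # uncompressed table holding ten years of data
compression_ratio = 10   # "10 to 1" compression
columnar_gb = 10.0       # data left to process once only the needed columns are read
years_total = 10         # data skipping keeps one year out of ten
cores = 32
vector_speedup = 4

compressed_tb = raw_tb / compression_ratio   # 1 TB, fits in the 1 TB of main memory
skipped_gb = columnar_gb / years_total       # 1 GB left to scan
per_core_mb = skipped_gb * 1024 / cores      # 32 MB per core
# Assumption: a 4x vector speedup makes each core's scan cost about what
# 8 MB would cost without vector processing.
scan_equivalent_mb = per_core_mb / vector_speedup

print(f"compressed footprint: {compressed_tb:.0f} TB")
print(f"after data skipping:  {skipped_gb:.0f} GB")
print(f"per-core chunk:       {per_core_mb:.0f} MB")
print(f"scan-equivalent work: {scan_equivalent_mb:.0f} MB per core")
```

Seen this way, each stage cuts the work by an order of magnitude or more, which is why a scan measured in minutes can plausibly drop to seconds.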

The initial DB2 BLU Accelerator appliances are based on IBM's midrange Power 770 enterprise-class servers, and they come in the following configurations:

[Figure: IBM BLU Acceleration appliance configurations]

The initial machine comes on a single node in a Power 770 box, which in theory can have up to four nodes linked together through NUMA clustering. IBM will eventually scale up the node counts on the Power 770 machine to extend the performance, and will also offer BLU Acceleration appliances based on its Power 780 system, which at 128 cores has double the core count of the Power 770. Both machines top out at 4 TB of memory, which at 10-to-1 compression means you can fit a 40 TB database into main memory. If you want to use the memory paging capability of BLU Acceleration, you can put an even larger database on the machine.

AIwire