MapR Distribution for Hadoop Employed by IDEXX
MapR Technologies, Inc. today announced that IDEXX Laboratories, Inc. is using the MapR Distribution for Apache Hadoop via Amazon Web Services (AWS) to flexibly scale its business at lower cost, and gain access to critical customer data instantly for rapid response times.
IDEXX aggregates and analyzes data from participating practices and offers them a free industry benchmark report. IDEXX also uses the aggregated, de-identified data to create syndicated reports that help pharmaceutical and nutrition companies identify industry trends.
As IDEXX's business continued to grow, its primary data store, a relational database hosted on AWS, could not keep pace. As the database grew, the daily jobs that aggregate and summarize data took too long to run and consumed database resources, which affected online operations. "We needed a solution which could offload this processing, easily scale to support the growth of the existing product and could be leveraged for future data intensive projects," said Terry Schutte, IDEXX senior systems administrator for software R&D.
The new solution had to be compatible with existing systems, which included the AWS infrastructure and Java 7. "Our primary reason for choosing MapR M3 on Amazon Elastic Compute Cloud (Amazon EC2) was the ability to run Hadoop under Java 7 against Java 7 compiled applications," said Schutte. "It was hard to find support for Java 7 in the MapReduce ecosystem. MapR's performance and architectural improvements stood out. From an operational perspective, MapR is easier to use than other distributions we tested and higher performing, and we benefit from the optimizations made to Hadoop."
IDEXX has realized multiple benefits from its MapR and AWS solution, including increased flexibility and control to scale its business, faster customer response times, the ability to retain all its data, ease of experimentation, and support for additional lines of business.
"By running MapR clusters on EC2, we retain full control over the configuration and operations of the MapR cluster, while continuing to have the benefits of EC2 hosting which means no capital expenditures and the flexibility to scale our environment based on demand. We pay only for capacity that we use and it lets us easily scale to support growth of the business," said Schutte.
The new MapR/AWS solution dramatically improves the company's ability to respond to customer requests. For example, if a customer asks a specific question related to the marketplace or trends, IDEXX can respond immediately. "Before, we would have to schedule developer time to write a query to see if we even have data. The whole process could take months," explained Schutte. "Being able to scale the environment quickly and cost effectively helps us turn around answers to our customers' questions much faster. Today we can cut through our data and run queries using MapReduce and get the answer in a day, or even hours."
The new system removes prior constraints on how much data IDEXX can store and process. "Now we have the ability to store and process all the data, all the time. Everything is at our fingertips. If a customer has a question, we simply write a quick job to answer the query," he said.
The MapR/AWS solution also makes the testing cycle much easier: IDEXX can run multiple development environments in parallel with less risk.
"IDEXX is realizing the advantages of Hadoop at scale," said Jack Norris, chief marketing officer, MapR Technologies. "Other distributions let them test the waters and gain experience with Hadoop and once they were ready to scale their business, they turned to MapR as their enterprise-grade partner. The MapR and AWS solution is enabling IDEXX to flexibly scale operations and to do so at lower cost."