Business Challenge: Minimize search time and maximize number of simultaneous customized data streams for real-time ecommerce data
Application: Apache Lucene / elasticsearch
Data size: 400 GB
Hardware: Cluster of 16 custom-built servers
Gigabyte Technology motherboard
Hexacore AMD Phenom II X6 processors
1x 56 GB OCZ Agility3 SSD as primary storage in each server (no hard disks)
16 GB RAM per server
4.7x increase in I/O performance
50% increase in application performance
35% reduction of CPU load
Non-disruptive, 60-second installation, transparent to applications and storage infrastructure
DataFeedFile.com is an e-commerce platform that aggregates merchant store data feeds and provides price comparisons for products sold online. Thousands of affiliate websites use DataFeedFile.com because of the platform’s ability to deliver unique, highly customized data streams to each of its clients. DataFeedFile.com simultaneously performs, in real time, thousands of permutations of hundreds of millions of records in order to update pricing, inventory, and other information for each product advertised on its client websites.
The engineers at DataFeedFile.com knew that the company’s IT infrastructure was a key competitive differentiator. To improve performance, they upgraded DataFeedFile.com’s information database to a distributed data library solution based on Apache Lucene. The most performance-sensitive front-end search application (Apache Lucene / elasticsearch) was deployed on a cluster of 16 custom-built servers using all-SSD storage. Each server was configured with one OCZ Agility3 SSD as primary storage and 16 GB of RAM (2 GB used by the operating system and 14 GB used by the application for caching).
DataFeedFile.com quickly outgrew its infrastructure. The servers ran at capacity almost 100% of the time. To add new clients while keeping the same service levels, the company had to either purchase and deploy additional servers or find a way to squeeze more performance from its existing infrastructure. The engineering team investigated multiple options for improving the price-performance of the all-SSD server cluster.
How VeloBit Helped
VeloBit offered an all-software solution that increased I/O performance through a combination of SSD caching, RAM caching, and compression. VeloBit was configured to cache data in RAM and use the OCZ Agility3 SSD as primary storage. Of each server's 16 GB of RAM, 4 GB was allocated to VeloBit, reducing the application cache from 14 GB to 10 GB.
The VeloBit software installed seamlessly in 60 seconds. The results were nearly instantaneous: VeloBit compressed cached data by 4x, quadrupling the effective cache memory. VeloBit produced higher cache hit rates, which reduced I/O operations to primary storage. Response times improved from 1234 microseconds to 263 microseconds, a 4.7x increase in I/O performance. Application performance increased by 50% (from an average of 120 queries per second to 180 queries per second), and server CPU load decreased by 25%.
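The headline figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, it assumes that I/O performance is measured as the inverse of response time, and it assumes the 4x compression ratio applies uniformly to the 4 GB of RAM allocated to VeloBit.

```python
# Figures quoted in the case study (per server).
latency_before_us = 1234   # response time before VeloBit, in microseconds
latency_after_us = 263     # response time after VeloBit, in microseconds
qps_before = 120           # average queries per second before
qps_after = 180            # average queries per second after
velobit_cache_gb = 4       # RAM allocated to VeloBit
compression_ratio = 4      # assumed uniform across all cached data


def speedup(before: float, after: float) -> float:
    """Ratio of old to new latency (higher means faster)."""
    return before / after


def percent_increase(before: float, after: float) -> float:
    """Relative gain of `after` over `before`, in percent."""
    return (after - before) / before * 100


io_speedup = speedup(latency_before_us, latency_after_us)
app_gain_pct = percent_increase(qps_before, qps_after)
effective_cache_gb = velobit_cache_gb * compression_ratio

print(f"I/O speedup: {io_speedup:.1f}x")
print(f"Application throughput gain: {app_gain_pct:.0f}%")
print(f"Effective VeloBit cache: {effective_cache_gb} GB")
```

Under these assumptions, the 4 GB given to VeloBit holds roughly as much data as 16 GB of uncompressed cache, which is more than the 4 GB of application cache that was given up to make room for it.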
“At DataFeedFile.com, we call VeloBit ‘the magic dust’ - invisible software that boosts the performance of even the fastest hardware you can build,” said Andrew Nurchaya, CEO of DataFeedFile.com. “VeloBit delivered a dramatic performance boost to our storage infrastructure, which enabled us to increase productivity and reduce the cost of scaling I/O capacity. Before selecting VeloBit, we evaluated multiple caching solutions; VeloBit demonstrated the highest performance with unmatched deployment simplicity.”