Feb 29, 2008 (10:02 AM EST)
How eBay Manages Its Data Centers

The road to data center automation has been long and arduous, and even innovative companies like eBay haven't perfected it. But eBay is closer than most.

The company is a third of the way through a major three-year grid computing initiative. It is developing software and employing technologies that can describe the relationships among the hardware and software in its data centers, with the eventual goal of making eBay easier to manage, quicker to upgrade, and scalable beyond imagination.

At eBay's scale, commercial software is often out of the question. After all, there are more than 3 petabytes of data to manage across six data centers. Even in such a complicated environment, the company is able to manage all that storage with fewer than a dozen administrators. As more management tools and methods fall in line, eBay is hoping to take a step toward automatic service-level management at scale.

There's plenty of information to sort through at eBay. The company's site lists 106 million active auctions at any given time, which can be accessed by any of its 243 million registered users. All told, the company has to manage more than 3 petabytes of data spread across about 600 production database instances, many of which run inside virtual machines, sitting on more than 100 computing clusters in six different data centers. One reason this complexity -- the storage piece, anyway -- can be managed by so few people is that eBay limits the number of ways data is stored.

Many companies already do this on a lesser scale. For example, a company might store business-critical information on redundant, high-performance SANs and older information in backup. For eBay, that means fewer vendors and products overall, and fewer simultaneous storage-related processes to manage.

"Management is made easier by having fewer things to manage," eBay distinguished research scientist Paul Strong, who helps to design and manage the company's massive IT infrastructure, said in an interview. "By having patterns and fixing processes around them, you minimize variability, risk, and cost, and you maximize efficiency and to some degree agility."

Beyond simplifying the IT infrastructure, management software is key for eBay, especially since the Web site is constantly changing. "Whenever we deploy new code or change something, we have to examine which other sets of services inside eBay's infrastructure they interact with," Strong said.
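The kind of check Strong describes can be pictured as a walk over a service dependency graph. What follows is only a minimal sketch, not eBay's software: the service names, the `CALLS` map, and the `affected_by` function are all hypothetical, and the article does not say how eBay actually models these relationships.

```python
from collections import deque

# Hypothetical map: each service -> the services it calls directly.
CALLS = {
    "search": ["catalog", "user-profile"],
    "checkout": ["catalog", "payments"],
    "catalog": ["storage"],
    "payments": ["storage"],
    "user-profile": [],
    "storage": [],
}


def affected_by(changed, calls):
    """Return every service that directly or indirectly calls `changed`."""
    # Invert the edges so we can look up callers of a given service.
    callers = {}
    for service, deps in calls.items():
        for dep in deps:
            callers.setdefault(dep, set()).add(service)

    # Breadth-first walk up the caller chain from the changed service.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


if __name__ == "__main__":
    print(sorted(affected_by("catalog", CALLS)))
```

In this toy graph, a change to "catalog" would flag "search" and "checkout" for review, while a change to "storage" would flag every service that reaches it through any chain of calls.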

In some cases, eBay's infrastructure is so large that commercial products just won't cut it. "We would prefer commercial off-the-shelf software if it could meet our needs," Strong said. "However, historically we have tended to break most that we have used." Though distributed resource management tools -- software to manage grids -- from companies including Gemstone, GigaSpaces, and Oracle's Tangosol have evolved over the last few years, Strong said the functionality wasn't available when eBay first required it, which forced eBay to develop its own custom management software.