TechWeb

How Oracle Helps Polk Decode Car Buying Secrets

Aug 30, 2011 (10:08 AM EDT)

Read the Original Article at http://www.informationweek.com/news/showArticle.jhtml?articleID=231600497


How many BMW 3 Series vehicles were sold in California last month--and is the model gaining or losing share against the Audi A4? Which Hyundai dealers are driving the lion's share of that brand's sales gains? Which websites and mailing lists would be best to use to reach potential Chevy Camaro buyers?

These are just a few of the sorts of questions that can be answered by Polk, the kingpin of auto-market data and analysis. It's a blue-chip marketing company and longtime Oracle customer that last year decided to upgrade to that vendor's Oracle Exadata Database Machine.

One year later, Polk is now nearly halfway through a steady enterprise-wide migration of multiple databases and applications. Along the way it has consolidated database licenses, eliminated servers, compressed data dramatically, and delivered results 10 times faster than on its previous Oracle Real Application Clusters (RAC) deployments. It has also upgraded its BI deployment to improve data visualization and dashboarding capabilities.

As the database market share leader, Oracle has plenty of customers contemplating whether to follow in Polk's footsteps. Here's a look inside an Oracle Exadata and Oracle Business Intelligence Enterprise Edition deployment, along with instructive advice and lessons learned by Polk's top application and database executives.

Serving Many Customers

Polk has tracked auto sales since the 1920s, and its information stores and services have grown along with the industry. Today Polk combines state-supplied information on new and used auto registrations with manufacturer-supplied sales data and third-party demographic and lifestyle data.

The company has several key customer groups, but the top revenue driver is Polk Insight, an analytic datamart that helps manufacturers understand where they're winning and where they're losing by product, competitor, new vs. used, region, dealer, and other dimensions.

Polk's Multichannel Target Marketing products help the people who market cars and trucks acquire new customers, retain existing ones, build brand awareness, and increase revenue by identifying specific consumer segments that can be reached with advertising and promotional campaigns delivered across email, direct mail, and websites. And it's not just about what might work in theory; complementary services help marketers measure the performance of campaigns through actual vehicle sales.

The manufacturer and marketer insights all rely on data that doesn't include personal information, but the demographic and lifestyle enrichment data provides a clear picture of consumer tastes and preferences by age, income, zip code, leisure activities, and other dimensions.

Some applications do call for personally identifiable information. Polk's VINtelligence database, for example, keeps track of owners tied to specific vehicle identification numbers (VINs). It's used by manufacturers in the event of a recall, and insurance companies check VINs to fight fraud.

These are the core data services provided by Polk, but there are several other niche databases that complete the picture. As of last year, the total data environment was approaching 46 terabytes, but it's not all managed as a single data warehouse. Polk makes focused datamarts accessible online through subscription-accessed portal interfaces. Many manufacturers pay for customized blends of information sliced and diced in particular ways.

Moving To Exadata

Diversity and customization demands are two big reasons Polk has remained an Oracle customer. The company did a proof-of-concept (POC) project with Netezza back in 2008, but the company wanted to stick with familiar and extensive capabilities.

"Netezza offered a good product, but we're an Oracle shop and the Netezza database itself seemed primitive by comparison; it didn't offer all the partitioning and database-management capabilities we need," said Doug Miller, Polk's director of database development and operations. (Indeed, since 2008, Netezza and other data-warehouse appliance manufacturers have focused on adding database-management capabilities, but perhaps not all those required in an earlier era of database deployment.)

Soon after the Netezza POC, Exadata 1.0 was announced, but Polk wasn't ready to make a change just yet.

"There were two reasons we didn't buy Exadata 1.0: first, it was a 1.0 product, and our CIO said, 'No, we won't be buying any version one products,'" Miller said. "Second, it was bad timing for a refresh because we had just bought a bunch of new technology."




When Polk finally made the decision to upgrade last summer, you could say the move was overdue. By that point the toughest multidimensional queries--those requesting demographic buyer insight within specific regions and dealer zones--were taking as long as three to five minutes.

The first wave of database migrations from a conventional RAC environment onto an Oracle Exadata Database Machine X2-2 half-rack took about four months, according to Miller. An immediate benefit was storage efficiency, thanks to Exadata's hybrid columnar compression. As an example, the Polk Insight database that previously housed about 2 terabytes compressed down to about 600 gigabytes. Performance instantly improved without any Exadata-specific performance tuning.

"Initially we didn't do anything other than migrate the databases, and all the legacy queries and reports started running dramatically faster," Miller said, adding that the improvement was on the order of 10 times.

Polk chose a capacious half rack with consolidation and room for growth in mind. Since the initial wave last fall, Polk has migrated multiple databases totaling about 22 terabytes out of conventional RAC deployments. On Exadata, compression has brought those stores down to about 13 terabytes. Among the tricks the company has employed is careful sorting of data before loading, a step that helps improve database performance.
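
For readers who want the compression figures as ratios, the arithmetic can be worked out in a few lines of Python. This is only a back-of-the-envelope sketch: the sizes are the approximate figures quoted above, the helper name compression_summary is made up for illustration, and 1 terabyte is assumed to equal 1,000 gigabytes.

```python
def compression_summary(before_gb, after_gb):
    """Return (compression ratio, percent of space saved) for a before/after size."""
    ratio = before_gb / after_gb
    saved_pct = 100 * (1 - after_gb / before_gb)
    return ratio, saved_pct

GB_PER_TB = 1000  # assumption: decimal terabytes

# Approximate figures quoted in the article.
insight_ratio, insight_saved = compression_summary(2 * GB_PER_TB, 600)
migrated_ratio, migrated_saved = compression_summary(22 * GB_PER_TB, 13 * GB_PER_TB)

print(f"Polk Insight: {insight_ratio:.1f}x smaller, {insight_saved:.0f}% saved")
print(f"All migrated databases: {migrated_ratio:.1f}x smaller, {migrated_saved:.0f}% saved")
```

The single Polk Insight database compressed more than the migrated stores did in aggregate, which suggests that results vary from one data set to another.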

"Exadata builds storage indexes automatically, and if you can lump the data that goes together, the index knows exactly where to find the data you're after," Miller said. If you sort data by zip code, for example, the database knows just where to find the data needed for a query by geography and it will run that much faster.
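
The effect Miller describes can be illustrated with a small Python sketch. It is not Exadata's implementation; it assumes only the general idea that a storage index keeps a minimum and maximum value for each region of storage, so that regions whose range cannot contain the requested value are skipped. The region size, the random zip codes, and the function names are all hypothetical.

```python
import random

def build_region_summaries(values, region_size):
    """Split values into fixed-size regions and record (min, max) for each one."""
    summaries = []
    for start in range(0, len(values), region_size):
        region = values[start:start + region_size]
        summaries.append((min(region), max(region)))
    return summaries

def regions_to_scan(summaries, target):
    """Count regions whose [min, max] range could contain the target value."""
    return sum(1 for low, high in summaries if low <= target <= high)

# Hypothetical data: 10,000 random five-digit zip codes in load order.
rng = random.Random(42)
zip_codes = [rng.randint(10000, 99999) for _ in range(10_000)]
target = sorted(zip_codes)[5050]  # a value from the middle of the range

unsorted_summaries = build_region_summaries(zip_codes, 100)
sorted_summaries = build_region_summaries(sorted(zip_codes), 100)

unsorted_hits = regions_to_scan(unsorted_summaries, target)
sorted_hits = regions_to_scan(sorted_summaries, target)

print(f"Regions scanned without sorting: {unsorted_hits} of {len(unsorted_summaries)}")
print(f"Regions scanned after sorting:   {sorted_hits} of {len(sorted_summaries)}")
```

When the values arrive in random order, nearly every region spans most of the zip code range and has to be read. After sorting, each region covers a narrow slice, so a lookup touches only a region or two.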

Polk is also making extensive use of materialized views, which effectively store often-requested query results for rapid recall. Exadata's compression has helped reduce the size of materialized views and supported more sophisticated views, said Miller, and that has reduced the number of table joins that have to take place in real time. This, too, improves performance, even when exploring multiple dimensions of data.
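
The trade-off behind a materialized view can be sketched in plain Python, with dictionaries standing in for tables. This is a conceptual illustration only, not Oracle syntax or Polk's schema; the rows, zone names, and function names are invented for the example.

```python
from collections import Counter

# Hypothetical fact rows: (vehicle_id, dealer_zone, purchase_type)
sales = [
    (1, "zone_a", "lease"),
    (2, "zone_a", "purchase"),
    (3, "zone_b", "lease"),
    (4, "zone_a", "lease"),
]

# Hypothetical dimension table mapping each dealer zone to a region.
zone_region = {"zone_a": "west", "zone_b": "east"}

def query_with_join(region, purchase_type):
    """Answer at query time: join every sale to its region, then count matches."""
    return sum(
        1
        for _, zone, ptype in sales
        if zone_region[zone] == region and ptype == purchase_type
    )

def build_materialized_view():
    """Do the join and aggregation once, ahead of time, and store the result."""
    return Counter((zone_region[zone], ptype) for _, zone, ptype in sales)

view = build_materialized_view()

def query_from_view(region, purchase_type):
    """Answer from the stored result with a single lookup and no join."""
    return view[(region, purchase_type)]

print(query_with_join("west", "lease"), query_from_view("west", "lease"))
```

Both functions return the same answer, but the second does no join work when the question is asked. The cost is the space needed to store the precomputed result and the work of refreshing it when the underlying data changes.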

"If a customer wanted to look across multiple dealer zones and then start bringing customer demographics and data such as lease versus purchase, that would have taken as long as two to five minutes in the old environment," Miller said. "In Exadata these sorts of queries are running in 10 seconds."

Polk's Exadata migration is still in progress, but to date it has consolidated nine separate databases down to four and eliminated eight out of 22 production database servers. Polk has yet to move the other half of its data still in conventional deployments, but it expects to consolidate on a total of 10 servers.

Improving Insight

Polk had some 10,000 custom reports built in Oracle Discoverer before the Exadata migration began. With an upgrade to Oracle Business Intelligence Enterprise Edition (OBIEE), Polk is now delivering user-customizable dashboards that used to require custom development work. This has greatly improved productivity at Polk.

Customers now take advantage of OBIEE threshold and alerting capabilities so they can review dashboards when certain conditions are met, rather than having to review reports and look for notable changes.

OBIEE has ushered in new types of reports. For example, the suite taps Oracle 11g-managed spatial information to map data geographically. Thus, manufacturers can see customer and model counts projected onto maps, with five-year trends and competitive insights.

"We can use shading to indicate market-share gains and losses," said Kelly Garcia, Polk's VP of global application development. "They can instantly see whether they are up or down in a particular region."

There are areas where Polk is still taking a wait-and-see stance where Oracle capabilities are concerned. For example, several Polk customers have equipped or want to equip their field reps with Apple iPads for email access and dealer interaction, but Garcia said Oracle's current iPad release is just a start.

"We're unclear whether we're going to go with that product because you can't rebrand the interface, as we may need to do," he explained. "If a customer really needs a way to get our data on the iPad, we can give it to them, but it's not very interactive and we'd like to see more capabilities."

As was the case with Exadata, Polk seems content to communicate what it wants in an iPad release to its incumbent vendor and to wait a bit--say to a 2.0 version--to get what it wants.
