TechWeb

Businesses Mine Data To Predict What Happens Next

May 28, 2006 (08:05 PM EDT)

Read the Original Article at http://www.informationweek.com/news/showArticle.jhtml?articleID=188500520


Real-time information, once a competitive differentiator that produced more timely and relevant business decisions, is now a commodity. Even midsize companies process transactions as fast as the New York Stock Exchange, while decision makers communicate and collaborate over broadband networks as if they were in the same office. Sheer speed isn't the advantage it once was.

So what's next? What's next is what's next--the ability to forecast where events are heading, then make informed decisions based on that assessment. Predictive analytics, the scientific name for using a data warehouse as a crystal ball, is where business intelligence is going. It involves running historical data through mathematical algorithms--neural networks, decision trees, Bayesian networks--to identify trends and patterns and predict future outcomes. Will product demand surge? Will a patient relapse? Will a customer take his business elsewhere? Our ability to make such educated guesses is key to improving service, cutting costs, and exploiting new market opportunities.

Blue Cross Blue Shield of Tennessee now predicts the health care resources postoperative patients will need years down the road. The Federal Aviation Administration is identifying links between pilot health conditions and aviation accidents, with an eye toward avoiding them. FedEx anticipates which customers are most likely to respond to a new service or defect to a competitor.

The idea isn't new. Insurance companies have used actuarial tables for decades to predict how long policy holders will live or the likelihood of their getting into a car accident. Financial firms have used predictive analytics to assign credit-risk scores to borrowers.

What's different now is that vendors are building predictive analytics into mainstream applications for everyday decision making by all types of employees. IDC expects sales of predictive analytics software to grow 8% annually, to $3 billion by 2008.

Startup TrueDemand Software has developed supply chain applications that use data from radio-frequency identification systems to help retailers and manufacturers predict product demand and optimize inventories. Atrenda makes software that uses predictive analytics to verify early in development whether a semiconductor design will meet its specifications.

IBM last week introduced an inventory management application for retailers that uses built-in predictive analytics and replenishment rules to monitor product inventory, develop safety stock, and recommend orders based on an analysis of historical demand. The commercial app has been used by IBM consultants for years.
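The article doesn't describe how IBM's application works internally. The general idea of deriving safety stock and an order recommendation from historical demand can still be illustrated with a textbook formula. The Python below is a minimal sketch under stated assumptions, not IBM's method: it treats daily demand as roughly normal, and the function name, parameters, and the simple "order the shortfall" rule are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def reorder_plan(daily_demand, lead_time_days, service_level, on_hand):
    """Textbook safety-stock and reorder-point calculation (illustrative only).

    daily_demand   -- historical units sold per day (at least two values)
    lead_time_days -- days between placing an order and receiving it
    service_level  -- target probability of not running out, e.g. 0.95
    on_hand        -- units currently in stock
    """
    avg = mean(daily_demand)
    sd = stdev(daily_demand)  # sample standard deviation of daily demand

    # z-score for the target service level under a normal-demand assumption
    z = NormalDist().inv_cdf(service_level)

    # Buffer against demand variability during the replenishment lead time
    safety_stock = z * sd * math.sqrt(lead_time_days)

    # Stock level at which a new order should be triggered
    reorder_point = avg * lead_time_days + safety_stock

    # Hypothetical rule: order just enough to get back up to the reorder point
    recommended_order = max(0, math.ceil(reorder_point - on_hand))

    return {
        "safety_stock": safety_stock,
        "reorder_point": reorder_point,
        "recommended_order": recommended_order,
    }
```

Calling `reorder_plan([10, 12, 8, 11, 9], lead_time_days=4, service_level=0.95, on_hand=30)` returns the three figures as a dictionary. A production system would also account for seasonality, promotions, and variable lead times.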




Crime Stoppers
In Richmond, Va., police use predictive analysis to determine the probability that a particular type of crime--armed robbery, auto theft, murder--will occur in a specific area at a given time. Police lieutenants who command the city's 12 sectors use desktop computers linked to the system to decide where to deploy a mobile task force of 30 officers. "Based on the predictive models, we deploy them almost every three or four hours," Police Chief Rodney Monroe says.

Officers have arrested 16 fugitives and confiscated 18 guns based on the system's guidance. In the first week of May, Richmond had no homicides, compared with three in the same week last year. Monroe attributes that outcome, in part, to moving officers around based on the calculated probability of shooting incidents. "It's more proactive," Monroe says. "We're not waiting for a homicide to occur."


Police Chief Monroe crunches data, then locks up fugitives. -- Photo by Dean Hoffmeyer/Richmond Times-Dispatch/AP

A 911 call is a real-time event; pre-empting that call involves predictive analysis. Vivek Ranadive, CEO of Tibco Software, the data-integration and middleware company, is convinced a growing number of companies are about to begin applying data forecasting. "I've spent all my life evangelizing 'real time,' but by its nature it's still reactive," Ranadive says. "You really have to get ahead of the curve."

Ranadive lays out his thinking in a new book titled The Power To Predict (McGraw Hill, 2006), which includes a foreword by FedEx CEO Fred Smith. "Companies have always had to be nimble to succeed," Smith writes, "but the imperative to become proactive gets stronger each day." Smith envisions predicting what customers want "before they know they want it" and anticipating service disruptions before they occur.

Ranadive says he expects predictive analytics to be widely adopted over the next few years to make customers more loyal, make supply chains more efficient, and keep store shelves stocked with the right items. Tibco, of course, smells an opportunity. The company has trademarked the phrases "predictive business" and "the power to predict," and it has a hand in the game: Tibco Business Events is a rules engine that combs databases and applications to find patterns in data for identifying business opportunities.

Researchers have advanced the science by developing new algorithms with exotic names--the Markov decision process, stream mining, and support vector machines--that provide more ways to analyze data and find subtle patterns. And there's a practical reason a growing number of companies are interested in such tools: They need help making sense of all those terabytes of data they're accumulating. Cheaper, more powerful computers put it all within reach. "We have the ability to apply predictive analytics in ways that would have been impossible a few years ago," says Richard Vlasimsky, CTO at Valen Technologies, which markets predictive analysis software for the property and casualty insurance market.

But vendors need to be careful not to overpromise. The Richmond police will never be able to foretell the actions of lawbreakers in the way Tom Cruise's character, Chief John Anderton, did as a member of a "precrime" team in the Steven Spielberg film Minority Report.

"I watched that and thought, 'We could almost do that now,'" says Charlie Berger, an Oracle senior director of data mining technologies. Oracle has built predictive modeling into its database using data mining software acquired with its purchase of Thinking Machines in 1999, and it added predictive analysis applications for CRM and retailing through its PeopleSoft and ProfitLogic acquisitions. Tech vendors are automating the steps of building predictive models and preprocessing the data, making the technology easier to use by a wider audience, Berger says.




Your Accuracy May Vary
The accuracy of predictive analytics hinges on the complexity of the situation being assessed and the number of other variables. In other words, a lot can go wrong in the process of presaging. "Let's be real," says Valen Technologies' Vlasimsky. "It will never be clairvoyant."

The day will never come when we can predict the outcome of the stock market, says Lutz Hamel, a professor of computer science and statistics at the University of Rhode Island. There are simply too many variables that change too quickly. On the other hand, Wall Street firms can predict short-term trading trends--that's what automated trading is all about--and many make a nice profit doing just that.

Tom Wicinski, managing director of customer marketing analytics at FedEx, will happily take the 65% to 90% accuracy rate he says the package-shipping company's predictive analysis system is providing. FedEx uses SAS Institute's Enterprise Miner and other tools to develop models that predict how customers will respond to price changes and new services, which customers are at risk of jumping to a competitor, and how much revenue will be generated by new storefront or drop-box locations. Accuracy, Wicinski says, depends not just on a problem's complexity and the number of variables, but also on the amount and quality of the supporting data.

FedEx began using predictive analytics for customer prospecting in the 1990s. But the company has broadened its use of the technology, applying it to more complex business problems. Applications, including the customer-at-risk system, are relatively new. "It's becoming a more mainstream business process," Wicinski says.

FedEx next will deploy predictive analytics in real-time operational settings such as call centers, he says, helping customer service reps identify at-risk customers and take the necessary steps to make them happy. Today, FedEx call-center agents and other front-line personnel must alert a sales rep when red flags go up--and that process may not be fast enough.

Financial results always are a good indicator of success. Alumni donations to the University of Utah's David Eccles School of Business increased 73% last year after the school used predictive analysis software from Kintera to determine which of the 300,000 people in its alumni database were most likely to respond to its annual appeal for donations. "It's always a question of who do we want to reach given the limited resources we have," says Erika Marken, development research director at the university.

While predictive analytics technology is most prevalent in financial and marketing applications, it's expanding into areas such as health care and crime prevention and even counterterrorism.

In Richmond, the predictive analysis system police began using two years ago determines where and when crimes are most likely to occur using a database of past calls to police, arrests, and crime incidents--some going back 15 years. The system also factors in weather data, and it tracks local festival, sporting, and other events. The system comprises SPSS's Clementine predictive analysis software, reporting and visualization tools from Information Builders, and predictive models developed by RTI International, a research organization.

Police commanders can query the system about specific crimes, such as determining which neighborhoods are most likely to experience armed robberies or auto thefts. For example, the police have zeroed in on armed robberies in nightclub parking lots near closing time--robbers consider inebriated club-goers to be easy marks, says Colleen McCue, a senior research scientist at RTI.

As more data is added to the system, accuracy should improve, Chief Monroe says. But it has its limitations. The analysis is primarily restricted to time, place, and type of crime, while details such as the type of weapon used in past crimes aren't considered. And the predictive models must be updated with new information, such as increases or decreases in the types of drugs being sold on the streets.
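To make the time, place, and type-of-crime idea concrete, here is a deliberately simple Python sketch. It is not the SPSS Clementine or RTI model, which the article doesn't detail. It only estimates, from past incident records, the share of days on which a given crime type occurred in a given sector and time block, with Laplace smoothing. The record layout, function names, and smoothing constant are all hypothetical.

```python
from collections import defaultdict


def estimate_probabilities(incidents, n_days, alpha=1.0):
    """Estimate P(at least one incident on a given day) per (sector, block, type).

    incidents -- iterable of (day, sector, hour_block, crime_type) records
    n_days    -- number of days covered by the historical data
    alpha     -- Laplace smoothing constant (an assumption, not from the article)
    """
    days_seen = defaultdict(set)
    for day, sector, hour_block, crime_type in incidents:
        # Track distinct days so repeat incidents on one day count once
        days_seen[(sector, hour_block, crime_type)].add(day)

    # Smoothed fraction of days with at least one matching incident
    return {
        key: (len(days) + alpha) / (n_days + 2 * alpha)
        for key, days in days_seen.items()
    }


def rank_sectors(probs, hour_block, crime_type, top=3):
    """Return the sectors with the highest estimated probability, best first."""
    matches = [
        (p, sector)
        for (sector, block, ctype), p in probs.items()
        if block == hour_block and ctype == crime_type
    ]
    return [(sector, p) for p, sector in sorted(matches, reverse=True)[:top]]
```

A commander's query such as "where are armed robberies most likely between midnight and 4 a.m.?" then becomes `rank_sectors(probs, "00-04", "armed robbery")`. Combinations that never appear in the history are simply absent from the result, and a real model would also weigh the weather and event data the article mentions.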

When it comes to homeland security, details about how government agencies are using data mining and predictive analytics aren't easy to get. But work under way at the Pacific Northwest National Laboratory provides some hints. As part of the counterterrorism work done under the departments of Defense and Homeland Security, the lab is combining predictive analytics with visualization technology for trend analysis and pattern recognition to detect signs of an impending terrorist attack. Senior program manager Steve Martin is circumspect about how exactly such applications would be used, but he says it's likely the feds are using the technology to analyze phone call patterns.

The lab also is combining predictive analytics and behavioral analysis in the belief that terrorists might be caught on security cameras and identified through their behavior--if they loiter in a specific place, for example--before they have a chance to carry out an attack.




It Gets Personal
As electronic health records become more prevalent, those databases will provide a rich source of information for predictive analysis. Blue Cross Blue Shield of Tennessee has used a neural net-based predictive model from MEDai for the past 18 months to analyze claims data to predict which health care resources individual members will need months and even years into the future.

"If we're seeing a pattern that predicts heart failure, kidney failure, or diabetes, we want to know that as soon as possible," says Soyal Momin, manager of research and development and consulting at the insurer. While Momin says the technology can't predict disease, it can portend the severity of an illness.

Children's Memorial Research Center in Chicago hopes to come one step closer to actually predicting the recurrence of tumors. It uses SPSS's Clementine data mining software to classify pediatric brain tumors. Then, using predictive analytics, genomic research, and tools that search electronic medical texts for relevant information, doctors can determine the best therapy and predict the probability that a tumor will recur. "I think this idea is going to go hand in hand with personalized medicine," says Dr. Eric Bremer, director of the center's brain tumor research program.


Beyond Fortunetelling

Predictive analytics is moving from a specialty into the business process mainstream



The main ingredients are business data, mathematical algorithms, and forecasting models



The market will grow to $3 billion by 2008, according to IDC



In the airline industry, the FAA just started a program to correlate pilot health and aircraft accidents. The agency is applying S-Plus and Insightful Miner predictive analysis software from Insightful to cross-analyze information about plane crashes and pilot incapacitation incidents with data on pilot health and certification. (Private and commercial pilots are required to have regular medical exams, and the results are maintained in an FAA database that holds 15 million records.)

Researchers plan to examine pilot cardiovascular and neurological conditions for possible links to accidents, as well as scrutinize whether pilot age is a factor in flight safety. "Can we spot an issue before it becomes a safety problem, like older pilots taking up flying later in life?" asks Stephen Veronneau, bioinformatics research team leader for the FAA's Civil Aerospace Medical Institute. The program, which uses medical data that doesn't identify individual pilots, also will look for markers in pilots' blood, such as sugars and even genomic data, that indicate a pilot is more susceptible to fatigue. "But that's way out there," Veronneau concedes.

Valen's Vlasimsky says his company's technology can similarly be used to examine telematics data collected from trucks to identify drivers who may be at risk for getting into accidents because of their driving habits or susceptibility to fatigue.

All this data analysis raises ethical questions. Will predictive analytics lead to mass profiling, denying people jobs because of their genes or denying them insurance coverage because of their predicted health conditions? "You could certainly imagine some kind of Orwellian system with all kinds of data," the FAA's Veronneau says.

The police in Richmond are considering adding more data to their system, such as intelligence from informants about drug dealers and gangs, and even the crime records of individuals, to determine their propensity to commit more crimes. While privacy advocates are sure to squeal, Chief Monroe thinks he's on solid ground using hard data rather than discredited practices like racial profiling.

At Blue Cross Blue Shield of Tennessee, Momin is aware of the potential misuse of predictive technology and says the insurer would never deny coverage to subscribers based on a predicted health condition. We'll see.




The Future Of Forecasting
Indeed, there's no stopping the proliferation of predictive analytics apps. It will be up to the people who deploy the technology to do so responsibly. Ultimately, predictive analytics could be used to mine unstructured content across the Internet, which the University of Rhode Island's Hamel calls "the largest text database on the planet."

Predictive analytics remains an analyst-intensive undertaking, but in the future it will be built into business processes, shortening the distance between analysis and actions, says Chid Apte, senior manager of data analytics at IBM Research. IBM is studying ways to embed predictive analysis into service-oriented architecture apps so that businesses can act immediately on the results.

"Predictive analytics is going to become more operational," says Scott Burk, senior statistician and technical lead for marketing analytics at Overstock.com, which predicts demand for specific products at specific prices using a Teradata-based data warehouse, Teradata Warehouse Miner, and models built using Kxen tools. The online retailer uses that information to manage its inventory. Says Burk, "We're definitely doing things a lot smarter than we were six months ago."

So the past really can be used to predict the future. And all arrows are pointing in one direction: Your company will be doing it, too.

Illustration by Steven Noble
