Data-Driven Crime Fighting

Nov 13, 2009 (10:11 AM EST)

Police can use data to 'change the environment,' says Berkow
On a Saturday afternoon last summer, Mark Rasch took his son to a baseball game at a park in Washington, D.C.'s Georgetown neighborhood. The ballpark sits in an area with zone parking and a two-hour limit. Rasch was forced to park in a spot that was a bit of a hike from the ball field. When he later spotted an opening closer to the park, he moved his car there.

The game over, Rasch packed up and was about to pull away when he noticed a parking enforcement officer writing tickets. "I'm OK, right?" he asked, assuming that because he had moved his car she wouldn't know he'd been parked in the zone for longer than two hours.

Wrong. The officer not only knew that he had moved his car but when and how long he'd been parked within the zone. Fortunately, she didn't write him a ticket, as he was about to pull out. But the encounter left Rasch, who is a lawyer and a cybersecurity consultant, a little spooked at the realization of just how much information law enforcement is generating.

If there was a time when law enforcement agencies suffered from an information deficit, it's passed. Of the more than 18,000 law enforcement agencies across the United States, the vast majority have some form of technology for collecting crime-related data in digital form. The biggest city agencies have sophisticated data warehouses, and even the most provincial are database-savvy.

So it's not surprising that law enforcement and criminal justice agencies are running into the same data-related problems that CIOs have been experiencing for years: ensuring data quality and accessibility, developing and enforcing standards for interoperability, and exploiting those digital resources in the most effective manner.

The era of data-driven law enforcement began in the early 1990s in New York City. It was there that police commissioner William Bratton sought to impress newly elected mayor Rudolph Giuliani with a radical approach to policing that came to be known as CompStat. CompStat put an emphasis on leveraging data--accurate, detailed, and timely--to optimize police work.

"Police departments are powerful collectors of data," says Michael Berkow, president of Altegrity Security Consulting, a newly launched division of security firm Altegrity. Before joining ASC last month, Berkow was chief of the Savannah-Chatham police department, and before that he was second in command to Bratton in Los Angeles after Bratton left New York to be chief of the LAPD.

Police departments were motivated to implement or upgrade IT systems by the Y2K frenzy, Berkow says. "By 2000-2001, everybody had some level of digital information," he says. That and CompStat led to a movement known by the initials ILP, which stand for "information-led policing" or, according to some, "intelligence-led policing."

The concept is simple: leverage data to help position limited police resources where they can do the most good. It's an effort to be more proactive, to "change the environment," Berkow says, from the reactive, response-oriented methods of the past.

To a great extent, data is about the context of criminal behavior. "We know that the same small group of criminals is responsible for a disproportionate amount of crime," says Berkow. Police refer to that group as PPOs--persistent prolific offenders. Past criminal behavior, such as domestic violence, can be a strong indicator of future problems. When Berkow was chief in Savannah, his department combed through data on recent homicide cases and noticed a telling pattern: Of the twenty-some people arrested for homicide, 18 had prior arrests for possession of firearms. "We started this very detailed review of every aspect of our gun arrests," he says.

Data Impetus

In the corporate world, the impetus for improving data policy and procedures might be a delayed project or a lost sale. In law enforcement, it's sometimes much more dramatic.

Last year, the North Carolina legislature voted to fund the Criminal Justice Law Enforcement Automated Data Services (CJLEADS) project following the March 2008 slaying of University of North Carolina student body president Eve Carson. During the subsequent investigation, prosecutors learned that at least one suspect in the case, since arrested, had been on parole at the time of the slaying and had appeared in court just days before the crime.

The CJLEADS project seeks to reconcile and integrate data that's spread out across many systems and agencies so that known offenders don't slip through the cracks. "Right now that information is available but you have to log on to multiple systems to manually build that complete profile," says Kay Meyer, data integration manager in the Office of the State Controller, who is heading up the project for the state.

The project has two objectives. The first is to generate a comprehensive picture of an offender that incorporates as much information as possible from the state's systems. The second is to provide a "watchlist" capability that lets criminal justice professionals keep track of repeat offenders.

State officials are using data analysis technology from SAS Institute, which is headquartered in North Carolina. A data warehouse that stores some of the CJLEADS data is, for the time being, located in a SAS facility; other data from state agencies is accessed in real time. The project is slated for rollout starting next year and into 2011.

The North Carolina project involves justice system data--courts, probation, and parole, for example. Closer to the streets is what's known as crime-report or incident-report data.

There are two basic IT systems that almost all police departments use: computer-aided dispatch and a records management system. The CAD system handles 911 calls and retrieves and displays data related to those calls, such as location of origin. The RMS stores data generated from crime incident reports, such as arrests and bookings. Alongside the RMS, many police agencies have developed data stores of specific information, such as gang or sex-offender databases. Many of these systems, built with off-the-shelf technology or by third-party service providers, are standalone, which has led to another situation familiar to corporate IT: silos of data.

One of the first sophisticated data integration systems written for law enforcement is called Coplink. It was developed in the late 1990s by a Tucson police officer and a computer science professor at the University of Arizona.

The Tucson police department was Coplink's first customer, says James Wysocki, an IT administrator for the city of Tucson who was the IT administrator for Tucson PD when Coplink was first implemented. "It became obvious that this was a game-changer," he says. "I reconfigured my division to take advantage of it."

Coplink is a data analysis tool that searches for patterns among data loaded into the system from disparate sources. Wysocki first used Coplink to reconcile and query data from the various systems within the Tucson PD. Because each Coplink data warehouse is a node on the system, he reached out to police agencies across the county that had implemented Coplink, then to agencies in other jurisdictions.

Today, Coplink is used in more than 3,000 police jurisdictions, says Bob Griffin, CEO of i2 Inc., which markets the system. "Everybody recognizes that you have to share data," he says.

The next step is querying systems around the state and interacting with the feds. Tucson has worked out deals with the Justice Department and Homeland Security to query into their systems, Wysocki says, which is "a big thing, because for years the information flowed in only one direction."

Fusion Centers

Another bottom-up data integration effort is what's called the Fusion Center initiative. Fusion Centers emerged after the Sept. 11, 2001, terrorist attacks as a way for states to work with federal agencies on anti-terrorist activities. There are 74 fusion centers around the country in states and major cities.

Fusion centers are made up of "multiagency groups trying to sift through multiagency data," says Ron Hawley, executive director of Search Group, a consortium of state agencies that promotes information sharing.

In terms of top-down data sharing, there are two national databases of crime-related information available to local, state, and federal agencies: the National Crime Information Center (NCIC) and the Integrated Automated Fingerprint Identification System.

The FBI recently launched a more comprehensive data-sharing initiative, a national data warehouse of crime report information. Called the Law Enforcement National Data Exchange, or N-DEx, the repository resides in an FBI facility in Clarksburg, W.Va. "There are 65 million reports in there already even though it only opened for business last July," says Paul Wormelli, executive director of the Integrated Justice Information Systems Institute, a nonprofit organization funded by many of the largest technology vendors.

The vast majority of those reports come from local and state agencies. In fact, "Coplink is the largest feeder of information to the N-DEx system," says i2's Griffin.

Integrating data from all levels of law enforcement--local, state, and federal--is a big challenge. The Global Justice XML Data Model, a data exchange standard sponsored by the Justice Department, was already in the works when Sept. 11 happened, but the attacks accelerated that process. Since its introduction in 2003, the GJXDM has gathered momentum. "It got very popular, very fast, particularly in the state and local world," according to Wormelli, who says the standard now has 6,500 data elements.

In 2005, officials from Justice and Homeland Security suggested a more inclusive standard that would incorporate data from emergency, immigration, and trade systems. This new standard is called the National Information Exchange Model (NIEM), and it's also gathering steam. The FBI is putting all its active cases in N-DEx using NIEM. "Now we have for the first time in history a single data standard," Wormelli says.
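To give a feel for what an XML exchange standard buys, here is a minimal, NIEM-flavored incident record assembled in Python. The namespace URI and element names below are simplified stand-ins invented for illustration; the actual NIEM schema is far larger and more rigorously structured.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, standing in for a real NIEM namespace URI
NC = "http://example.org/niem-core"

report = ET.Element(f"{{{NC}}}IncidentReport")
incident = ET.SubElement(report, f"{{{NC}}}Incident")
ET.SubElement(incident, f"{{{NC}}}ActivityCategoryText").text = "Burglary"
ET.SubElement(incident, f"{{{NC}}}ActivityDate").text = "2009-07-14"
subject = ET.SubElement(report, f"{{{NC}}}Subject")
ET.SubElement(subject, f"{{{NC}}}PersonFullName").text = "John Doe"

# Any agency that knows the shared vocabulary can parse this unambiguously
print(ET.tostring(report).decode())
```

The point of a common data model is exactly this: once two agencies agree on the element names, a record written by one system can be read by any other without custom translation code.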

Business Intelligence

Law enforcement officials often refer to the need for "actionable information." One of the first ways police agencies used incident-report data in digital form was in conjunction with geographical information systems, in support of what's known as electronic crime mapping, or hot-spot analysis.
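At its core, hot-spot analysis is spatial binning: count incidents per grid cell and rank the cells. The sketch below, with made-up coordinates, shows the idea in a few lines of Python; production systems layer this on proper GIS map projections rather than raw latitude/longitude grids.

```python
import math
from collections import Counter

def hot_spots(incidents, cell_size=0.01, top_n=3):
    """Bin incident coordinates into a grid and return the busiest cells.

    incidents: list of (latitude, longitude) pairs.
    cell_size: grid resolution in degrees (roughly 1 km of latitude).
    """
    counts = Counter(
        (math.floor(lat / cell_size), math.floor(lon / cell_size))
        for lat, lon in incidents
    )
    return counts.most_common(top_n)

# Hypothetical incident reports clustered around two locations
reports = [(41.881, -87.623)] * 5 + [(41.902, -87.631)] * 2 + [(41.850, -87.700)]
print(hot_spots(reports, top_n=2))
```

Ranked cells like these are what get overlaid on a map at morning roll call to decide where patrols go.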

Police in the city of Edmonton, Alberta, brought in data analysis technology from business intelligence vendor Cognos (now part of IBM) a few years ago. The first project police officials concentrated on was using the reporting tool in conjunction with a new geographic-based resource-deployment model being implemented by the agency. "Our business analytics reports became a key component of how we deployed policemen around the city," says John Warden, staff sergeant in the business performance section of the Edmonton Police Service.

Now the agency is using the data to plot criminal activity according to both geographic area and comparative history. "We're really delving into those analytics in terms of place and time," says Warden.

The holy grail of information-led policing is what's referred to as predictive policing: being able to predict where and when crimes may occur.

Chicago PD is experimenting with 'predictive policing,' says Lewin
That's where Chicago wants to go. The Chicago PD operates what Jonathan Lewin, commander of information services, refers to as "the largest police transaction database in the United States." Costing $35 million, Chicago's Citizen and Law Enforcement Analysis and Reporting (CLEAR) system processes "all the arrests for all the departments in Cook County--about 120--in real time," Lewin says, and 450 local, state, and federal law enforcement agencies have query access to it. Lewin's IT shop has about 100 staffers and employs between 10 and 20 contract workers from Oracle, whose database technology the system is based on.

Chicago PD is working with the Illinois Institute of Technology, by way of a $200,000 grant from the National Institute of Justice, on an "initial exploration" of a predictive policing model. The grant was awarded partly on the basis of work done by Dr. Miles Wernick of IIT in the area of medical imaging and pattern recognition, and the project involves exploring "non-traditional disciplines" and how they might apply to crime projection. "We're going to be using all the data in the CLEAR system," Lewin says, including arrests, incidents, calls for service, street-gang activity, as well as weather data and community concerns such as reports of streetlights out. "This model will seek to use all these variables" in attempting to model future patterns of criminal activity, he says.

SPSS is a name often associated with predictive policing. The statistical-analysis software developer, recently acquired by IBM, has customer histories that tout the success of its tools in the criminal justice environment, such as the Memphis, Tenn., police force, which SPSS says reduced robberies by 80% by identifying a particular "hot spot" and proactively deploying resources there.

But can software really predict crime? "It's not a binary yes or no; it's more of an assessment of risk--how probable something is," says Bill Haffey, technical director for the public sector at SPSS.
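Haffey's point can be made concrete with a toy logistic model: the output is a probability, not a verdict. The variables and coefficients below are invented for illustration; a real tool such as SPSS would estimate weights from historical case data rather than hard-code them.

```python
import math

def reoffense_risk(prior_arrests, months_since_release, weapon_history):
    """Toy logistic risk score. The weights are illustrative only,
    not drawn from any real criminal justice model."""
    # Hypothetical coefficients a real model would fit to historical data
    z = (-2.0
         + 0.6 * prior_arrests
         - 0.05 * months_since_release
         + 1.2 * weapon_history)
    return 1.0 / (1.0 + math.exp(-z))  # probability between 0 and 1

# A subject with 3 prior arrests, out 6 months, with a weapons history
print(round(reoffense_risk(3, 6, 1), 2))
```

The score never says "this person will offend"; it says how much more attention one case merits than another, which is how analysts describe using such tools in practice.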

Technology Trends

In an effort to involve citizens more closely in law enforcement, many police agencies have begun to post crime statistics and regional geographic crime data on their Web sites. The Edmonton Police Service uses Google Maps and an in-house application to display essentially the same geographic crime data police officers get at their morning report, minus specific addresses.

That effort hasn't been lost on Internet entrepreneurs. offers a crime mapping capability in the software-as-a-service model to small and midsize police agencies. "Most agencies are tiny," says CEO Greg Whisenant. "Ninety-three percent serve a population of fewer than 20,000 people."'s software can query an agency's CAD system or RMS, or any ODBC-compliant database, and generate reports as often as the agency wants, generally once a day. "Whatever data store they have, we can connect to it," he says.
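A daily extract of that kind reduces to a straightforward query against whatever store the agency runs. The sketch below uses Python's built-in sqlite3 as a stand-in for an ODBC connection; the table and column names are hypothetical, not any vendor's actual schema.

```python
import sqlite3

# sqlite3 stands in here for any ODBC-compliant data store
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (id INTEGER, category TEXT, occurred DATE, lat REAL, lon REAL)"
)
conn.executemany(
    "INSERT INTO incidents VALUES (?, ?, ?, ?, ?)",
    [(1, "burglary", "2009-11-12", 41.88, -87.62),
     (2, "theft",    "2009-11-12", 41.90, -87.63),
     (3, "burglary", "2009-11-11", 41.85, -87.70)],
)

# Daily extract: everything reported on a given day, ready to plot on a map
rows = conn.execute(
    "SELECT category, lat, lon FROM incidents WHERE occurred = ?",
    ("2009-11-12",),
).fetchall()
print(rows)
```

Because the query layer is generic, the same extract logic works whether the agency's RMS sits on SQL Server, Oracle, or a small-town Access database.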

Mobility is an important aspect of police work, and one that is receiving increasing attention from technology vendors. Tucson's Wysocki says he is piloting a PDA-based version of Coplink and, in the same vein, Chicago PD's Lewin says Oracle is developing "a scaled-down incident-reporting system for the Blackberry."

As they are in the corporate world, cell phones are an increasingly common piece of technology equipment among police officers. Nixle offers a service that publishes text messages and e-mails to cell phones or Web sites in targeted geographic areas, and has signed on 3,000 public agencies since its launch in March, according to CEO Craig Mitnick. Over and above sending out traffic alerts, Nixle increasingly is being tested by police agencies as a way for officers to communicate in emergency situations. The Pittsburgh Police Department successfully used Nixle in September to facilitate cell-phone communication among its 900 police officers and the 4,200 auxiliary police brought in to help with crowd control at the G20 Summit.

Too Much?

All this collecting, warehousing, and mining of crime-related data raises the question: How much is too much?

In-Q-Tel, the investment arm of the CIA, caused a stir this fall when it invested in Visible Technologies, which markets a service that monitors the digital conversations going on in social media. And John Motler, a sales engineer with SAS Institute, says one metropolitan police department has discussed with him the possibility of using SAS technology for social network analysis.

Both examples illustrate a trend known as "open source intelligence," which involves examining information available in the public domain for potential intelligence leads.

So how does this all relate to parking your car? The Georgetown incident still bothers Rasch. "What it meant was that D.C. was keeping a database of people who are legally parked," says Rasch, which, from a privacy standpoint, is "more intrusive than chalking the tires."

Pertinent questions include: How long do they hold on to that data? And with whom do they share it? It's an important discussion to have, both in terms of privacy and effective police methods. After all, as Rasch points out, it was a parking ticket that led to the arrest of serial killer Son of Sam.

The Vision
  • Share data more broadly among police departments and local, state, and federal agencies
  • Use advanced analytics to model patterns of criminal activity
  • Scan the Web for potential leads, known as 'open source intelligence'

The Challenges
  • Integrate many disparate, disconnected databases
  • Adopt standards for interoperability of various data formats
  • Address privacy issues around the type of data stored and who sees it

John Soat is a freelance business and technology writer and former editor of InformationWeek. He has covered many key developments in the IT industry, including Microsoft's long battle with the U.S. Justice Department.