Sep 29, 2009 (12:09 PM EDT)
Thoughts on Integrating OLTP and Data Warehousing (Especially in Exadata 2)

Read the Original Article at InformationWeek

Oracle is pushing Exadata 2 as being a great system for OLTP (OnLine Transaction Processing), data warehousing or, presumably, the integration of same. This claim rests on a few premises, namely:
  • Exadata is great for data warehousing. At this time, that's a claim much better supported by marketing and theory than by practice.

  • Exadata 2 is a suitable annual improvement over last year's Exadata 1. That's quite plausible.
  • Oracle is outstanding for OLTP. That's borne out by vast amounts of experience, especially if by "outstanding" you mean "Gets the job done really, really well at a very high cost in terms of both licenses and labor."
  • The Flash memory in Exadata 2 makes Oracle even better for OLTP.* That's plausible too. Worst-case is probably that Flash support doesn't really work well in those release, but will be cleaned up soon.**
  • OLTP and data warehousing uses for Exadata don't interfere with each other. That one bears some discussion.*Oracle has repeatedly emphasized that the Flash memory in Exadata 2 is meant to speed up OLTP. By way of contrast, I've only noticed one vague claim that Flash memory helps data warehousing -- a reference to a doubling in "user scan rates", which perhaps was a slip of the marketing pen. **Oracle probably has been working on Flash memory support for a long time. But it's likely that Oracle didn't have a strategic commitment to Sun's specific technology until April of this year. After all, back in March it looked as if IBM would wind up owning Sun.

The integration-versus-separation argument for OLTP and analytic databases is an old one. In the early 1980s, IBM pushed both the "Information Center" (precursor to the data warehouse) and relational DBMS (portrayed as good for query and maybe for OLTP as well). In the early 1990s, Ted Codd opined that relational DBMS were good for OLTP but not analytics, instead favoring "OLAP" systems like Arbor Software's Essbase (which, ironically, is now owned by Oracle). As the 1990s progressed, a consensus emerged that most large* enterprises should have at least one relational data warehouse separate from the core OLTP DBMS, a view that has persisted to this day. Until the announcement of Exadata 2, Oracle hadn't seriously disputed this consensus, although of course it always has wanted its DBMS software to run your OLTP and analytic databases alike.

*At a sufficiently small enterprise, one DBMS suffices. If a single commodity server has enough power to do all your processing, without even requiring you to have the expertise to tune very seriously, that's probably the right way to go.

Assuming one DBMS has plenty of functionality for OLTP and analytics alike -- as Oracle certainly does -- the main arguments for separating OLTP and data warehousing revolve around performance. Reasons to split out a separate analytic database include:

  • You might just want to run a separate brand of DBMS for your OLTP and data warehousing. Oracle thinks this is a terrible idea. (I disagree, as do a whole lot of analytic DBMS vendors -- Teradata, Netezza, Greenplum, Sybase, Vertica, Aster Data, Infobright, Kognitio, et al. -- and their customers.)
  • You may want to lay out or index your tables differently for OLTP and data warehousing. Materialized view capabilities as flexible as Oracle's should let you do that in a single database.
  • You may want to lay out your files differently for OLTP and data warehousing (e.g., in terms of block sizes). Oracle might claim that ASM (Automatic Storage Management) and, in particular, the "Stripe and Mirror Everything" option obviate that point. I'm far from convinced.
  • OLTP and analytic workloads step on each other's toes, Part 1. For example, analytic queries that call for table scans often don't mix well with OLTP operations that call for random reads and (especially) writes.* In principle, Flash memory could greatly reduce the problem, if the OLTP workload talks mainly to Flash, while Flash talks to disk mainly via microbatches. But I'll be quite surprised if Oracle has aced that challenge on the first try. More likely, a longish stretch of Bottleneck Whack-A-Mole lies ahead.
  • OLTP and analytic workloads step on each other's toes, Part 2. Even more fundamentally: If you don't have sufficiently good workload management tools, combining OLTP and analytic workloads is a ghastly performance idea, with OLTP slowing to a crawl while analytic queries rumble to completion. However, I'd think Oracle is in pretty good shape in that area.

    *If this weren't a terribly difficult problem, Oracle, IBM, and/or Teradata -- all of which can do a reasonably decent job of mixing long and short queries in the same workload -- would probably have solved it years ago.

Bottom line: Some day, Oracle Exadata may be a great system for integrated OLTP and data warehousing -- but probably not in the current release.Oracle is pushing Exadata 2 as a great system for OLTP (OnLine Transaction Processing), data warehousing or both. This claim rests on a few premises, the first being that Exadata is great for data warehousing. At this time, that's a claim much better supported by marketing and theory than by practice...