Jan 31, 2008 (12:01 PM EST)
ELT vs. ETL: Much Ado about Something
Read the Original Article at InformationWeek
My recent blog post on Informatica not only led to what an ITBusinessEdge.com blog called a "mini-buzz" about the fate of the company, but also invited reader comments that, among other things, took up opposing points of view on Informatica's ELT capabilities - and, yes, that's extract-load-transform (also called "pushdown"), not conventional extract-transform-load (ETL). There's no doubt that ELT is now a mainstream capability, and Informatica's inclusion of pushdown optimization in the recently released PowerCenter version 8.5 brings ELT the legitimacy it deserves.
At first sight, ELT seems like a new twist on the conventional ETL approach at best (as well as a dyslexic's nightmare), but if done right, there's a lot to like about it (except the acronym). The central premise of ELT is the ability to send the ETL process, mid-stream, swooping down into the database, not unlike a bird that, mid-flight, takes a dip into a lake and resumes the flight without pause. But while a bird slows down when it dips into the lake, ELT pushdown actually speeds up the overall data movement process. Perhaps the best-known pure-play vendor for ELT was Sunopsis, which was acquired by Oracle in 2006.
Compared to pure ETL or pure ELT, I find mixed-mode ETL (mixing some pushdown capabilities into the overall ETL process) more appealing. Whereas ETL usually brings better organization of the data load process, in-database processing is generally faster than ETL, so the ability to send part of the process (part of an Informatica "mapping," for example) into the database is attractive. The standard workaround today, when we need to do some heavy lifting in the database, is to write, say, stored procedures invoked from an ETL process. The downside of this approach - besides having to develop logic in two environments, ETL and database SQL - is that it is often not easy to make the mid-stream data available to the stored procedure, leading to convoluted architecture and ETL design compromises. If you can code uniformly in one environment and then just send part of that code down to the database, you gain flexibility as well as performance.
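To make the distinction concrete, here is a minimal sketch of the same aggregation step done both ways. It is an illustration only, not how Informatica or any other vendor implements pushdown: SQLite stands in for the warehouse database, and the table names (staging_orders, sales_by_region) and function names are hypothetical. In a real mixed-mode tool, the SQL in the second function would presumably be generated from the mapping rather than written by hand.

```python
import sqlite3


def make_db():
    """Build an in-memory database with a small, made-up staging table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging_orders (region TEXT, amount REAL)")
    conn.execute("CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total REAL)")
    conn.executemany(
        "INSERT INTO staging_orders VALUES (?, ?)",
        [("east", 10.0), ("west", 5.0), ("east", 2.5), ("west", 7.5)],
    )
    return conn


def load_etl_style(conn):
    """Conventional ETL: pull rows out, aggregate outside the database, write back."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM staging_orders"):
        totals[region] = totals.get(region, 0.0) + amount
    conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)", sorted(totals.items()))


def load_pushdown_style(conn):
    """Pushdown: the same logic as one SQL statement that runs inside the database."""
    conn.execute(
        "INSERT INTO sales_by_region "
        "SELECT region, SUM(amount) FROM staging_orders GROUP BY region"
    )


def result(conn):
    """Return the loaded target table in a stable order for comparison."""
    return conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()


# Run each approach against its own fresh copy of the data.
etl_conn = make_db()
load_etl_style(etl_conn)

pushdown_conn = make_db()
load_pushdown_style(pushdown_conn)
```

Both paths load the same totals into the target table; the difference is where the work happens. In the first, every row crosses the wire to the ETL engine and back. In the second, no row leaves the database, which is where the speed advantage of pushdown would be expected to come from on large volumes.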
It's getting increasingly hard to stay excited about ETL (Indisputable Need + Near Ubiquity ≠ Enthrallment), but pushdown seems like the best thing that has happened to ETL since data quality and EII. I can see mixed-mode options, such as the one offered by Informatica, giving ETL solution architects and designers a needed boost in designing sophisticated ETL solutions and in keeping data load jobs within their processing windows. I also fully expect that pushdown will become a new frontier in the battle for ETL supremacy - and once again, Informatica seems to have the edge.
But please, can we just continue calling all that ETL without coining a new acronym for a clever improvisation?