
Five Data Warehousing Disasters and How to Avoid Them

By
Joe Brockmeier

Joe 'Zonker' Brockmeier is a freelance writer and editor who has covered IT for years. Formerly the openSUSE Community Manager for Novell, Brockmeier has written for Linux Magazine, Sys Admin, Linux Pro Magazine, IBM developerWorks, Linux.com, CIO.com, Linux Weekly News, ZDNet, and many other publications.

Companies often spend a lot of time and resources implementing a data warehousing system without doing the homework required to make sure it will be used appropriately or even at all. In this article, we outline five strategies an IT pro should follow when implementing a data warehousing project to avoid disaster.

1: Get the End User on Board

Robin Schumacher, director of product strategy for EnterpriseDB, says that it's not uncommon for a company to set up a data warehouse without getting the appropriate buy-in from executives and users.

“You're going to go through a lot of expense and time, and then end up with conflicting sources,” according to Schumacher. For example, an expensive data warehouse solution may end up competing with Microsoft Access.

Schumacher says that it's not uncommon for financial analysts to have their own Microsoft Access databases that they've used for years and don't want to give up. “I can't say how many customers we've talked to. The problem is they have all these Access DBs and people compiling their own information, and none of it agrees.”

2: Properly Train Stakeholders

Another problem, says Schumacher, is that users aren't trained or given the right tools prior to deployment: “I've seen both extremes; users are given very few tools, no automation, no assistance and told 'here's the data model. Good luck.'”

In one extreme case, Schumacher recounts a user in the mid-’90s who was trying to wrangle data out of a Teradata database but was unsure whether he was writing the query correctly. In short: no, he wasn’t.

“We ran Explain on it [to see what the query would do prior to running it], it ran a very long time,” Schumacher says. The result? Explain found that the query would have finished sometime in 2049 and would likely have bogged the entire system down in the process.
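
The habit Schumacher describes, asking the database for its plan before running a query, can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for Teradata's Explain, the orders table and the helper names plan_details and has_full_scan are hypothetical, and SQLite's plan does not include a time estimate like the one in the anecdote. It simply flags a query whose plan contains a full table scan.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the plan steps SQLite reports for sql, without running the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable description of the step.
    return [row[-1] for row in rows]


def has_full_scan(conn, sql, params=()):
    """True if any plan step is a full scan rather than an indexed search."""
    return any(step.startswith("SCAN") for step in plan_details(conn, sql, params))


# Hypothetical table, used only to have something to plan against.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
query = "SELECT total FROM orders WHERE customer_id = ?"

before = has_full_scan(conn, query, (42,))  # no index on customer_id yet
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = has_full_scan(conn, query, (42,))  # the planner can now use the index
print(before, after)
```

A check like this could sit in front of an ad hoc query tool so that users see a warning before an expensive query reaches the warehouse; what counts as "expensive" would depend on the database in use.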

How do you avoid queries from Hell?

Schumacher says it’s simple: Tools and training. Provide tools appropriate for the audience and training for everyone. Some users are sophisticated; others need drag-and-drop tools that can help create queries without users having to know a SQL query from a spreadsheet.