Getting Big Data Organized Is An Iterative Process
A new report out from Oracle gives one take on big data in the enterprise, at least from executives’ point of view. Oracle interviewed 333 executives in 11 industries.
Some of the findings were at the “water is wet” level of insight (e.g. “Nearly all surveyed (97%) say their organization must make a change to improve information optimization over the next two years”). One of the more interesting findings is that “29% of executives give their organization a ‘D’ or ‘F’ in preparedness to manage the data deluge.” Only 29%? If I had to bet, I’d say there are a lot of executives who don’t understand the potential of big data and what is needed to actually extract value from it. At the risk of sounding Rumsfeldian, there are executives who understand the known unknowns of big data and those who do not.
Jessica Miller-Merrell, for example, sees problems with big data in HR:
“It’s a giant collection of crap, fashion favorites and memories stored from years past with no clear sense of direction, organization, or any idea in which to start.”
Big data by itself is not neatly organized.
It isn’t waiting to unveil the useful information implied within it. You have to go looking for it. That means we need direction from executives on what is strategically important. That shouldn’t be a problem: executives set and execute strategies; it’s their job. It is a problem, though. We can get high-level direction from C-level executives, but it won’t be nearly precise enough to direct a group of analysts who are skilled with statistical analysis packages and data mining software and can generate regressions, clusters, classifiers and a whole host of other models that might be useful to someone somewhere. The problem is the chasm between what executives want, namely quantifiable measurements they can use to make decisions and drive operations, and what analysts target in their data mining efforts.
Big data can be organized with respect to a particular analytics problem. To do that we need someone who understands both the business drivers and the details of the structure and semantics of big data sets: a data scientist. The work of a data scientist is iterative. There will be days spent generating descriptive statistics about data sets, determining how to join multiple data sets, and performing other data exploration tasks. Sometimes these exercises don’t lead anywhere and it’s back to the drawing board. Other days are spent proposing plans for what can be done with the data. This is where those with deep business knowledge are needed: they help identify the big data analysis projects that are both valuable and feasible. Their feedback is critical input to the next round of analysis by data scientists.
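To make one of those exploration days a little more concrete, here is a minimal sketch in Python of two tasks mentioned above: generating descriptive statistics and checking how well two data sets join. Everything in it is an assumption made for illustration: the sample HR-style data, the `employee_id` key, and the helper names `describe`, `inner_join` and `unmatched_keys` are hypothetical and do not come from the Oracle report or any particular tool.

```python
import statistics

# Hypothetical sample data: two small data sets keyed by employee_id.
tenure_years = {101: 2.0, 102: 7.5, 103: 4.0, 104: 11.0}
training_hours = {101: 12, 103: 30, 104: 8, 105: 20}

def describe(values):
    """Return basic descriptive statistics for a sequence of numbers."""
    values = list(values)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }

def inner_join(left, right):
    """Join two dicts on shared keys, returning {key: (left_value, right_value)}."""
    return {key: (left[key], right[key]) for key in left if key in right}

def unmatched_keys(left, right):
    """Return the sorted keys that appear in only one of the two data sets."""
    return sorted(set(left) ^ set(right))

tenure_summary = describe(tenure_years.values())
joined = inner_join(tenure_years, training_hours)
orphans = unmatched_keys(tenure_years, training_hours)

print(tenure_summary)
print(joined)
print(orphans)
```

A long list of unmatched keys is often the first sign that two data sets do not join as cleanly as hoped, which is exactly the kind of result that sends an analyst back to the drawing board.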
If executives think they are making the grade when it comes to managing the data deluge, I hope they are thinking about more than servers and storage arrays. To extract value from big data we’ll need an iterative analytics process that includes personnel from a broad cross-section of the organization chart.
Dan Sullivan is an author, systems architect, and consultant with over 20 years of IT experience with engagements in systems architecture, enterprise security, advanced analytics and business intelligence. He has worked in a broad range of industries, including financial services, manufacturing, pharmaceuticals, software development, government, retail, gas and oil production, power generation, life sciences, and education. Dan has written 16 books and numerous articles and white papers about topics ranging from data warehousing, Cloud Computing and advanced analytics to security management, collaboration, and text mining.