Converged Architecture: When Big Data Meets Better Infrastructure

By William Van Winkle

Businesses that learn to harness today's inundation of bits, known as "big data," stand to benefit immensely.

According to the May 2011 issue of The Economist, people stored enough data during 2010 to fill 60,000 Libraries of Congress. While the number of smartphones consuming data (much of it video) continues to grow by 20% annually, the number of sensors generating data is growing by 30%. The business landscape is being inundated by a flood of data the likes of which have never been indexed before.

Collectively, this inundation of bits is known as “big data.” Businesses that are learning to harness it stand to benefit immensely. For example, a May 2011 report by McKinsey & Company noted that a retailer leveraging big data could “increase its operating margin by more than 60 percent.” Similarly, manufacturers could see up to a 50% reduction in product development and assembly costs. Such are the potential benefits of being able to channel the big data tide and extract from it the information needed to achieve business goals more effectively.

In the world of large-scale data processing, there are three basic types of data: structured, semi-structured, and unstructured. Structured data is best typified by spreadsheets and relational databases, which organize values into tabular columns and rows. Structured data is very common in data processing systems, and it’s the type of data that enterprises have grown up on over the last few decades.

Semi-structured data has no discrete columns or cells that can be examined, but there is some loose structure. For example, on a blog, certain types of information tend to fall into general areas consistently. Unstructured data, on the other hand, has no such divisions. This might include email, Word documents, videos, pictures, and content needing natural language processing. Twenty years ago, when the systems architectures for data processing were created, these types of files were comparatively rare or not used at all in the business world. Today, they are ubiquitous. Taken together, all three of these data types, as found within an organization, comprise its big data.
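To make the three categories concrete, here is a minimal Python sketch. The sample records, field names, and the keyword search are all invented for illustration; they are assumptions, not drawn from any real system. The point is only how the amount of built-in structure changes what a program can ask of the data directly.

```python
import csv
import io
import json
import re

# Structured: fixed columns and rows, as in a spreadsheet or relational table.
# (Hypothetical sample data.)
structured_csv = "store_id,item,price\n12,notebook,3.50\n12,pen,1.25\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))
total = sum(float(row["price"]) for row in rows)

# Semi-structured: loose, self-describing structure; fields may vary per record.
# (Hypothetical blog post.)
blog_post = json.loads(
    '{"title": "Sale this week", "tags": ["promo"], "body": "Pens are half off."}'
)
tags = blog_post.get("tags", [])  # a field may be missing, so use a default

# Unstructured: free text with no fields at all. A naive keyword search stands
# in here for real natural language processing.
email_body = "Hi team, the campus store ran out of notebooks again on Friday."
mentions_stockout = re.search(r"\bran out of\b", email_body) is not None

print(total, tags, mentions_stockout)
```

The structured rows can be summed by column name, the semi-structured post has to be probed field by field, and the email yields nothing until some form of text analysis is applied.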

“Big data is more about characteristic than size,” says Greg Battas, chief technologist, data management for HP’s Business Critical Systems division. “Maybe think of big data as ‘non-traditional’ data. When an organization starts to use unstructured and semi-structured data, they start to use a lot of different technologies. That typically makes the data larger, with more volume, but it’s a different way of processing. But a lot of customers are starting to leverage these non-traditional sources and use them for things they never did before.”

To illustrate the data difference in real life, imagine a retail chain in a large college town. Some stores are doing well while others are not. The chain’s IT department wants to analyze the problem and so requests all of the data for every line item invoiced over the last six months. But this is not big data. An outside analyst, by contrast, might want to look at the promotions run by competing stores around the days when our chain was also promoting. What was the competition’s ad copy? What was the weather history? How far is each store in question from the campus? Data of this type is typically not considered in classic IT analyses.
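The retail scenario can be sketched in a few lines of Python. Everything here is hypothetical: the store names, dates, amounts, weather labels, and distances are made-up sample values, and the function name is an assumption for illustration. The sketch simply shows invoice line items being combined with outside data that a classic IT analysis would not include.

```python
from collections import defaultdict

# Hypothetical invoice line items: (store, date, amount)
line_items = [
    ("north", "2011-05-02", 120.0),
    ("north", "2011-05-03", 80.0),
    ("campus", "2011-05-02", 300.0),
    ("campus", "2011-05-03", 90.0),
]

# Hypothetical external data that never appears on an invoice.
weather = {"2011-05-02": "sunny", "2011-05-03": "rain"}
miles_to_campus = {"north": 4.2, "campus": 0.3}


def sales_by_store_and_weather(items, weather_by_date):
    """Total sales per (store, weather) pair; unknown dates map to 'unknown'."""
    totals = defaultdict(float)
    for store, date, amount in items:
        totals[(store, weather_by_date.get(date, "unknown"))] += amount
    return dict(totals)


totals = sales_by_store_and_weather(line_items, weather)
for (store, sky), amount in sorted(totals.items()):
    print(f"{store:7s} {miles_to_campus[store]:4.1f} mi  {sky:6s} {amount:8.2f}")
```

A real analysis would pull these inputs from very different sources, such as a transactional database, a weather service, and mapping data, which is part of what makes this kind of work differ from traditional reporting.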

Similarly, even among SMBs, old-school data might have been derived strictly from point-of-sale (POS) systems. But this evolved into Web storefront POS, which then grew to include browsing habits and might now add time spent viewing each item and what gets left in abandoned shopping carts. Such minutiae are exactly the type of raw data that modern analysts use to unearth constructive, actionable information.
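As a rough illustration of how such storefront minutiae become usable information, the following Python sketch totals viewing time per item and finds items left in abandoned carts. The event format, session IDs, and item names are hypothetical, chosen only for this example.

```python
from collections import defaultdict

# Hypothetical storefront events: (session_id, event, item, seconds_viewed)
events = [
    ("s1", "view", "lamp", 42),
    ("s1", "add_to_cart", "lamp", 0),
    ("s1", "purchase", "lamp", 0),
    ("s2", "view", "desk", 95),
    ("s2", "add_to_cart", "desk", 0),
    ("s2", "view", "lamp", 10),
]


def summarize(events):
    """Return (seconds viewed per item, items carted but never purchased per session)."""
    view_seconds = defaultdict(int)
    carted = defaultdict(set)
    purchased = defaultdict(set)
    for session, event, item, seconds in events:
        if event == "view":
            view_seconds[item] += seconds
        elif event == "add_to_cart":
            carted[session].add(item)
        elif event == "purchase":
            purchased[session].add(item)
    abandoned = {}
    for session, items in carted.items():
        left_behind = items - purchased[session]
        if left_behind:
            abandoned[session] = left_behind
    return dict(view_seconds), abandoned


view_seconds, abandoned = summarize(events)
print(view_seconds)
print(abandoned)
```

Here session s2 viewed the desk for a long time and carted it but never bought it, which is the sort of signal a point-of-sale record alone would never reveal.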

William Van Winkle has been a full-time tech writer and author since 1998. He specializes in a wide range of coverage areas, including unified communications, virtualization, Cloud Computing, storage solutions and more. William lives in Hillsboro, Oregon with his wife and 2.4 kids, and, when not scrambling to meet article deadlines, he enjoys reading, travel, and writing fiction.
