Big Data – What it is and why it matters

Big data is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. And big data may be as important to business – and society – as the Internet has become. Why? More data may lead to more accurate analyses.

More accurate analyses may lead to more confident decision making. And better decisions can mean greater operational efficiencies, cost reductions and reduced risk.

Big data defined

As far back as 2001, industry analyst Doug Laney (currently with Gartner) articulated the now mainstream definition of big data as the three Vs: volume, velocity and variety.

  • Volume. Many factors contribute to the increase in data volume. Transaction-based data stored through the years. Unstructured data streaming in from social media. Increasing amounts of sensor and machine-to-machine data being collected. In the past, excessive data volume was a storage issue. But with decreasing storage costs, other issues emerge, including how to determine relevance within large data volumes and how to use analytics to create value from relevant data.
  • Velocity. Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time. Reacting quickly enough to deal with data velocity is a challenge for most organizations.
  • Variety. Data today comes in all types of formats. Structured, numeric data in traditional databases. Information created from line-of-business applications. Unstructured text documents, email, video, audio, stock ticker data and financial transactions. Managing, merging and governing different varieties of data is something many organizations still grapple with.

At SAS, we consider two additional dimensions when thinking about big data:

  • Variability. In addition to the increasing velocities and varieties of data, data flows can be highly inconsistent with periodic peaks. Is something trending in social media? Daily, seasonal and event-triggered peak data loads can be challenging to manage. Even more so with unstructured data involved.
  • Complexity. Today’s data comes from multiple sources. And it is still an undertaking to link, match, cleanse and transform data across systems. However, it is necessary to connect and correlate relationships, hierarchies and multiple data linkages or your data can quickly spiral out of control.
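
To make the "link, match, cleanse" step concrete, here is a minimal Python sketch. It assumes two hypothetical systems (a CRM export and a sales feed) that both carry a customer email; the field names, sample records and the choice of email as the matching key are illustrative assumptions, not part of any SAS product.

```python
def cleanse_email(raw):
    """Cleanse: trim whitespace and lowercase so the same address matches across systems."""
    return raw.strip().lower()


def link_records(crm_rows, sales_rows):
    """Link: match sales rows to CRM customers on the cleansed email key."""
    by_email = {cleanse_email(row["email"]): row for row in crm_rows}
    linked = []
    for sale in sales_rows:
        key = cleanse_email(sale["email"])
        customer = by_email.get(key)
        if customer is not None:  # unmatched sales are skipped in this sketch
            linked.append(
                {"name": customer["name"], "email": key, "amount": sale["amount"]}
            )
    return linked


# Hypothetical sample data from two systems with inconsistent formatting.
crm_rows = [{"name": "Ana Diaz", "email": " Ana.Diaz@Example.com "}]
sales_rows = [
    {"email": "ana.diaz@example.com", "amount": 42.5},
    {"email": "unknown@example.com", "amount": 10.0},
]

linked = link_records(crm_rows, sales_rows)
print(linked)
```

Real matching is rarely this clean. Names, addresses and IDs usually need fuzzier rules than an exact key, which is part of why the text calls this "still an undertaking."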

Why big data should matter to you

The real issue is not that you are acquiring large amounts of data. It’s what you do with the data that counts. The hopeful vision is that organizations will be able to take data from any source, harness relevant data and analyze it to find answers that enable 1) cost reductions, 2) time reductions, 3) new product development and optimized offerings, and 4) smarter business decision making. For instance, by combining big data and high-powered analytics, it is possible to:

  • Determine root causes of failures, issues and defects in near-real time, potentially saving billions of dollars annually.
  • Optimize routes for many thousands of package delivery vehicles while they are on the road.
  • Analyze millions of SKUs to determine prices that maximize profit and clear inventory.
  • Generate retail coupons at the point of sale based on the customer’s current and past purchases.
  • Send tailored recommendations to mobile devices while customers are in the right area to take advantage of offers.
  • Recalculate entire risk portfolios in minutes.
  • Quickly identify customers who matter the most.
  • Use clickstream analysis and data mining to detect fraudulent behavior.

(http://www.sas.com)

What Is Big Data?

Big data is new and “ginormous” and scary – very, very scary. No, wait. Big data is just another name for the same old data marketers have always used, and it’s not all that big, and it’s something we should be embracing, not fearing. No, hold on. That’s not it, either. What I meant to say is that big data is as powerful as a tsunami, but it’s a deluge that can be controlled . . . in a positive way, to provide business insights and value. Yes, that’s right, isn’t it?

Over the past few years, I have heard big data defined in many, many different ways, and so, I’m not surprised there’s so much confusion surrounding the term. Because of all the misunderstanding and misperceptions, I have to ask:

You won’t get far untangling your big data hairball if, for example, half of your company is forgetting to include traditional data in the calculus or if some don’t think social network interactions “really” matter. So, please, take a minute to get back to basics and do a simple self-check. Ask yourself, your team, the C-suite:

How do we define big data?

While I fully expect your company to add its own individual tweaks here or there, here’s the one-sentence definition of big data I like to use to get the conversation started:

Big data is a collection of data from traditional and digital sources inside and outside your company that represents a source for ongoing discovery and analysis.

Some people like to constrain big data to digital inputs like web behavior and social network interactions; however, the CMOs and CIOs I talk with agree that we can’t exclude traditional data derived from product transaction information, financial records and interaction channels, such as the call center and point-of-sale. All of that is big data, too, even though it may be dwarfed by the volume of digital data that’s now growing at an exponential rate.

In defining big data, it’s also important to understand the mix of unstructured and multi-structured data that comprises the volume of information.

Unstructured data comes from information that is not organized or easily interpreted by traditional databases or data models, and typically, it’s text-heavy. Metadata, Twitter tweets, and other social media posts are good examples of unstructured data.

Multi-structured data refers to a variety of data formats and types and can be derived from interactions between people and machines, such as web applications or social networks. A great example is web log data, which includes a combination of text and visual images along with structured data like form or transactional information. As digital disruption transforms communication and interaction channels, and as marketers enhance the customer experience across devices, web properties, face-to-face interactions and social platforms, multi-structured data will continue to evolve.
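
For readers who want to see what that mix looks like in practice, here is a small Python sketch that pulls structured fields out of one web log line. The sample line, the field names and the assumption that the log follows the widely used Common Log Format are mine for illustration; real logs vary by server and configuration.

```python
import re

# Assumed layout: Common Log Format (host, identity, user, [time], "request", status, size).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of structured fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])  # numeric field, ready for analysis
    return record


# Hypothetical log entry: a request for an image on a product page.
sample = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /products/shoes.jpg HTTP/1.1" 200 2326'
)

parsed = parse_log_line(sample)
print(parsed)
```

The structured pieces (status code, timestamp, path) drop neatly into a table, while the image the path points to remains unstructured content. That combination is what makes the data multi-structured.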

Industry leaders like the global analyst firm Gartner use phrases like “volume” (the amount of data), “velocity” (the speed of information generated and flowing into the enterprise) and “variety” (the kind of data available) to begin to frame the big data discussion. Others have focused on additional V’s, such as big data’s “veracity” and “value.”

One thing is clear: Every enterprise needs to fully understand big data – what it is to them, what it does for them, what it means to them – and the potential of data-driven marketing, starting today. Don’t wait. Waiting will only delay the inevitable and make it even more difficult to unravel the confusion.

Once you start tackling big data, you’ll learn what you don’t know, and you’ll be inspired to take steps to resolve any problems. Best of all, you can use the insights you gather at each step along the way to start improving your customer engagement strategies; that way, you’ll put big data marketing to work and immediately add more value to both your offline and online interactions.