What's "Big Data"?

The first reference to "big data" I have come across (please send me citations to any earlier appearances if you know of them) was in the McKinsey Global Institute report Big Data: The Next Frontier for Innovation, Competition and Productivity, published in May 2011. Since then, the term has appeared in numerous sources and has been used to mean a wide range of different things.

In its simplest and most literal sense, "big data" means "high volumes of information in systems that support analytics at fast speed" (Shawn Rogers, Enterprise Management Associates, October 2011). Picking this apart, big data's more nuanced meanings can be arranged in a coherent framework around the themes of "bigness," "access," "analysis," and "management".

The "bigness" part entails:
  • An explosion in volume and sources. In 2011, the amount of digital data in the world reached one zettabyte. Data are no longer just the figures you key into your own dataset; they now include all kinds and volumes of data from sensors, machine-specific records, transactions, social networks, telemetry, etc. For example, a Boeing jumbo jet transmits one terabyte of data back to earth every 29 minutes. It won't be long until your car insurance premium is calculated from how safely you drive, based on real-time driving behavior data transmitted directly from your car to your insurance company.
  • Going beyond connectivity. Big data goes a level beyond large databases in your own system to integrate multiple internal and external data sets across partners and chains.
The "access" part means:
  • Cloud-based locations for operations and data.
  • Open-source frameworks (e.g. Hadoop).
  • Having information ‘as it occurs’, i.e. moving from batch to stream processing.
  • Flexibility at the lowest, most decentralized level, with information uploaded and viewed in real time from the field and sliceable in multiple ways.
  • Hand-held access, review, and management.
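To make the batch-to-stream shift concrete, here is a minimal Python sketch. It is only an illustration under assumed inputs: the readings list is made up, and the function names batch_mean and streaming_mean are hypothetical, not taken from any particular framework. The batch version waits for the full dataset before answering; the stream version updates its answer as each reading arrives.

```python
from typing import Iterable, Iterator


def batch_mean(readings: list[float]) -> float:
    """Batch style: wait for the full dataset, then compute once."""
    return sum(readings) / len(readings)


def streaming_mean(readings: Iterable[float]) -> Iterator[float]:
    """Stream style: yield an updated mean as each reading arrives."""
    count = 0
    mean = 0.0
    for value in readings:
        count += 1
        # Incremental update: no need to keep earlier readings in memory.
        mean += (value - mean) / count
        yield mean


# Hypothetical sensor readings, invented for illustration only.
readings = [10.0, 20.0, 30.0, 40.0]

print(batch_mean(readings))            # one answer, after all data is in
print(list(streaming_mean(readings)))  # an answer after every reading
```

Both approaches agree once all the data has arrived; the difference is that the streaming version has a usable answer at every step along the way.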
The "big analysis" part means:
  • A proliferation of dashboards and metrics.
  • Data mining.
  • Predictive analytics.
  • Precision analytics: multiple-level slice-and-dice to the smallest cell level.
  • Dynamic visualization: rapid-fire interrogation of datasets with instant display and toggling between views to allow identification of peak performers, underperformers, and outliers.
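As a rough illustration of slicing to the smallest cell level and spotting peak performers and underperformers, the Python sketch below uses only the standard library. The sales records, the dimension layout (region, product, revenue), and the helper name slice_totals are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, revenue).
records = [
    ("North", "A", 120.0),
    ("North", "B", 80.0),
    ("South", "A", 200.0),
    ("South", "B", 40.0),
    ("North", "A", 100.0),
]


def slice_totals(rows, key_indexes):
    """Sum revenue for each combination of the chosen dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[2]  # revenue is the third field
    return dict(totals)


by_region = slice_totals(records, [0])   # coarse slice: region only
by_cell = slice_totals(records, [0, 1])  # smallest cell: region x product

# Identify the best- and worst-performing cells.
peak = max(by_cell, key=by_cell.get)
weakest = min(by_cell, key=by_cell.get)

print(by_region)
print(by_cell)
print("peak:", peak, "weakest:", weakest)
```

The same function serves every level of the slice; only the list of dimensions changes, which is the essence of toggling between views.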
Finally, the "big management" part means:
  • Operating within a culture of data-based decision-making, where executives expect and look forward to periodic reports and rely on them for strategic as well as operational insight.
  • Having the ability to ask the right questions of the data. As many others have noted, "if you are not asking the right question, then the answer really doesn't matter."
  • Performance-relevant measures.
  • Moving beyond accounting for activities and outputs, and towards measuring return on investment (ROI).
  • A need for more data-capable executives who understand data and take meaningful action on what the data are showing.
