basictraining
BY ART VAN BODEGRAVEN AND KENNETH B. ACKERMAN

“Big data”—bloated or just large-boned?
It appears that “big data” is the new rallying cry for consultants and others who can’t quite explain what they’re up to. So, at any conference worth its price tag, those who have been expounding ad infinitum about risk mitigation are often being out-shouted by those who see big data as the key to unlocking the heretofore insoluble mysteries of why supply chains operate the way they do.
But what is big data? Shoveling a path through the amassed piles of bovine excrement reveals a perhaps too-simple truth: big data is more data than you know what to do with.
Said another way, big data is a pile of transactions and conditions that is just too humongous, and has too many dimensions, to be processed and analyzed by the traditional tools you have on hand. Last-century architectures resting on foundations of relational databases are not up to new-century tasks, volumes, and velocities, or to the nuances in the data that are available today (with more on the way).
TRASH AND TREASURE
To defy those who would like to have a universal definition, it turns out that one company’s trash is another’s treasure, for a number of reasons.
One is raw data availability, in quantity and in quality, in
refinement and in nuance, in relevance and in robustness.
But each organization might well look at these radically different buckets of stuff, with enormously different potentials
for analysis and the development of business intelligence, as
big data.
Further, each company’s tools for storage and analysis could
be far apart in capacity and capability. The company with a
really robust tool would, then, have a much different threshold
for calling its source material big data than would the company with less sophisticated analytical capabilities, even if the
data itself were virtually identical.
HOW WE GOT HERE
Much like the plight of the 800-pound man who can’t be extricated from his home to be taken to a hospital, this is a condition that develops over time and starts small. It’s the first binge on a bag of Cheetos that leads to snacking on gallons of ice cream.
Many of us are still recovering from the realization that a cheap wristwatch today has enough computing power and storage capacity to support a NASA Apollo mission. It is just too much for us to contemplate that the cloud (or buildings full of servers, for the less poetically inclined) permits data storage in quantities that we don’t even know how to say.
So, comfortable prefixes such as “mega” and “giga” have given way to “tera” and “zebi.” These new terms are, almost by definition, useful: if we did not have the technology to capture such quantities, we would have little need for names for them.
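To put those names in perspective, here is a minimal sketch of the scale jump, using the standard SI and IEC prefix values (note that “zebi,” unlike the others, is a binary prefix defined as 2 to the 70th power):

```python
# Rough scale of the prefixes mentioned above.
# "mega", "giga", and "tera" are decimal (SI) prefixes;
# "zebi" is a binary (IEC) prefix: 2**70 bytes, a bit more
# than 10**21 bytes.
PREFIXES = {
    "mega": 10**6,
    "giga": 10**9,
    "tera": 10**12,
    "zebi": 2**70,
}

for name, value in PREFIXES.items():
    print(f"1 {name}byte = {value:,} bytes")
```

A zebibyte is roughly a thousand billion times larger than a terabyte, which is why the comfortable old vocabulary no longer covers what the cloud can hold.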
NOW THAT WE’VE ARRIVED
Not that we have reached an end point; we’re
only at one point in the continuing evolution of
data acquisition, storage, and analysis. But we
are at a point at which there is a more-than-nagging question, namely: What do we do with
all this stuff?
The simple answer is that we analyze it; make
up hypotheses; search for evidence in the data,
pro or con; and make decisions. But there is a
hard reality. Inordinately few supply chain practitioners know how to frame the questions, let
alone how to tease relevant analytics out of the