basictraining
BY ART VAN BODEGRAVEN AND KENNETH B. ACKERMAN

“Big data”—bloated or just large-boned?
It appears that “big data” is the new rallying cry for consultants and others who can’t quite explain what they’re up to. So, at any conference worth its price tag, those who have been expounding ad infinitum about risk mitigation are often being out-shouted by those who see big data as the key to unlocking the heretofore insoluble mysteries of why supply chains operate the way they do.
But what is big data? Shoveling a path through the amassed piles of bovine excrement reveals a perhaps too-simple truth: big data is more data than you know what to do with.
Said another way, big data is a pile of transactions and conditions that is just too humongous, and has too many dimensions, to be processed and analyzed by the traditional tools you have on hand. Last-century architectures resting on foundations of relational databases are not up to new-century tasks, volumes, and velocities, or to the nuances in the data that are available today (with more on the way).
TRASH AND TREASURE
To defy those who would like to have a universal definition, it turns out that one company’s trash is another’s treasure, for a number of reasons.
One is raw data availability, in quantity and in quality, in
refinement and in nuance, in relevance and in robustness.
But each organization might well look at these radically different buckets of stuff, with enormously different potentials
for analysis and the development of business intelligence, as
big data.
Further, each company’s tools for storage and analysis could
be far apart in capacity and capability. The company with a
really robust tool would, then, have a much different threshold
for calling its source material big data than would the company with less sophisticated analytical capabilities, even if the
data itself were virtually identical.
HOW WE GOT HERE
Much like the plight of the 800-pound man who can’t be extricated from his home to be taken to a hospital, this is a condition that develops over time and starts small. It’s the first binge on a bag of Cheetos that leads to snacking on gallons of ice cream.
Many of us are still recovering from the realization that a cheap wristwatch today has enough computing power and storage capacity to support a NASA Apollo mission. It is just too much for us to contemplate that the cloud (or buildings full of servers, for the less poetically inclined) permits data storage in quantities that we don’t even know how to say.
So, comfortable prefixes such as “mega” and “giga” have given way to “tera” and “zebi.” These new terms are, almost by definition, useful: if we did not have the technology to capture such quantities, we would have little need for names for them.
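To put those names in perspective, here is a minimal sketch of the scale jump, using the standard SI and IEC prefix values (note that “zebi,” unlike the others, is a binary prefix defined as 2 to the 70th power):

```python
# Rough scale of the prefixes mentioned above.
# "mega", "giga", and "tera" are decimal (SI) prefixes;
# "zebi" is a binary (IEC) prefix: 2**70 bytes, a bit more
# than 10**21 bytes.
PREFIXES = {
    "mega": 10**6,
    "giga": 10**9,
    "tera": 10**12,
    "zebi": 2**70,
}

for name, value in PREFIXES.items():
    print(f"1 {name}byte = {value:,} bytes")
```

A zebibyte is roughly a thousand billion times larger than a terabyte, which is why the comfortable old vocabulary no longer covers what the cloud can hold.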
NOW THAT WE’VE ARRIVED
Not that we have reached an end point; we’re
only at one point in the continuing evolution of
data acquisition, storage, and analysis. But we
are at a point at which there is a more-than-nagging question, namely: What do we do with
all this stuff?
The simple answer is that we analyze it; make
up hypotheses; search for evidence in the data,
pro or con; and make decisions. But there is a
hard reality. Inordinately few supply chain practitioners know how to frame the questions, let
alone how to tease relevant analytics out of the