Revolution That Will Transform How We
Live, Work, and Think. In this book, they
define big data as the “universe” of data
for a given subject. Their definition does not
address whether the amount of data is too great
for current mainstream technologies; rather,
it holds that it is now possible to
have all the data. The authors argue that
there are two important implications of
having (or having the possibility of getting) this universe of data. First, once you
have all the data, it is possible to look for
correlations and gain insights you would
not have seen before. For example, if you
have data on everything a truck was doing
before it was involved in an accident, you
can better determine what led to the accident and use this information to prevent
future accidents. Because accidents are (one hopes) rare events, conventional data sampling
would likely have missed the information
needed to find the right correlations.
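To make the sampling point concrete, here is a minimal sketch in Python. All of the numbers are hypothetical and chosen only for illustration: suppose 50 of 100,000 truck trips ended in an accident, and an analyst keeps a 1 percent uniform random sample of trips. The function names are likewise illustrative and do not come from the book.

```python
import random


def expected_sampled_events(n_events: int, sample_rate: float) -> float:
    """Expected number of rare events that land in a uniform random sample."""
    return n_events * sample_rate


def simulate_sample(n_records: int, n_events: int,
                    sample_rate: float, seed: int = 0) -> int:
    """Draw a uniform random sample of records and count the rare events in it.

    Assumption for this sketch: records 0 .. n_events - 1 are the rare
    events (accidents); every other record is an uneventful trip.
    """
    rng = random.Random(seed)
    sample_size = int(n_records * sample_rate)
    sampled_ids = rng.sample(range(n_records), sample_size)
    return sum(1 for record_id in sampled_ids if record_id < n_events)


if __name__ == "__main__":
    # Hypothetical figures: 100,000 trips, 50 accidents, 1 percent sample.
    print("expected accidents in sample:", expected_sampled_events(50, 0.01))
    print("accidents in one simulated sample:",
          simulate_sample(100_000, 50, 0.01))
    # With the full "universe" (sample rate 1.0), every accident is retained.
    print("accidents in full data:", simulate_sample(100_000, 50, 1.0))
```

Under these assumed figures, the sample is expected to contain only half an accident on average, so many samples would contain none at all, while the full data set retains all 50.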
The second important implication is
that there may be a lot of value in having
the universe of data. For example, right
now there is a lively debate going on in
the farming industry. Large seed manufacturers say they can help drive up crop
yields if farmers send them detailed data
on their soil quality on something like a
square-foot-by-square-foot basis. (While
this may seem like a huge amount of data
to collect, it is actually easy to do with
today’s high-tech tractors.) Obviously,
an increase in yields is a good thing,
but farmers recognize that if the seed
companies had detailed data about every
farm, they could use that data for other
purposes, such as trading agricultural
commodity futures. Understandably, the
farmers would like to make sure they own
this valuable data.
There are two important lessons here
for supply chain managers. First, you may
have access to a “universe” of data that has
economic value outside of its original purpose. Second, you should be careful about
giving away your data to other organizations that could reap its economic value.
The third definition of big data is derived
from how the term is used in the popular
press. The press tends to label some interesting
or creative use of data as “big data.”
When you read the article, though, you
quickly realize that the data set being discussed
is not “the universe of data on
a particular subject.” In fact, the data
set typically is not even very large.
Rather, in most cases, the data set is
being used creatively.
It would be wrong to dismiss this
definition as simply a case of the
press misusing a buzzword. Instead,
this definition points out something
meaningful: It is important to think
creatively—about using the data
you have access to, about combining
the data you have in unique ways,
and about looking for new, readily
available external data sets (such as
weather, housing starts, or demo-