The Beginning of “Big Data”
The first mention of the term “Big Data” appeared in the Association for Computing Machinery (ACM) library over two decades ago. Michael Cox and David Ellsworth wrote, “Visualization provides an interesting challenge for computer systems: data sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk. We call this the problem of big data. When data sets do not fit in main memory (in core), or when they do not fit even on local disk, the most common solution is to acquire more resources.” In other words, at that time, a Big Data definition was essentially “data that could no longer fit on available hardware.”
What Is Big Data—Today?
Fast forward to just over two decades later—after the explosion of the Internet, smartphones, the Internet of Things, and cloud computing—and the Big Data definition has expanded far beyond the confines of a “local disk.”
Wikipedia’s Big Data definition is “a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software.” Some experts define Big Data with the “Four V’s”: Volume, the amount of information produced; Variety, the diversity of data; Velocity, the speed at which data is created; and Veracity, the integrity and accuracy of the data you create and collect.
But for many, these definitions aren’t precise enough. Enter the phrase “what is big data” in Quora or Google and you’ll find a diverse array of answers as curious thinkers wonder, “How big does data need to be to be ‘big’?” “If ‘big data’ is data that can’t be handled with the typical tools, what tools count as ‘typical’?” Of course, even if we define Big Data correctly, is amassing huge collections of data sets the right goal for today’s enterprise?
A New Big Data Definition
In 2015, Gartner analyst Nick Heudecker wrote that Big Data “is no longer a topic unto itself.” Instead, the term could now be divided amongst several other ideas, including advanced analytics and data science, business intelligence, enterprise information management, and more. He wrote, “The characteristics that defined big data…are no longer exotic. They’re common. The technology landscape continues to change rapidly, but new options look increasingly like old options and old options are evolving quickly.” A better approach, according to Heudecker, was to think less about “doing” Big Data and more about “actual business needs, infrastructure impacts and how your enterprise architectures need to evolve.”
At Teradata, this has been a helpful framework as we help enterprises achieve tangible outcomes from data. We’ve found it best to think of Big Data in terms of value-adding actions that actually move the business forward. Too often, the enterprise spends too much time, effort, and money on Big Data preparation and loading and not nearly enough resources on applying analytics to find difference-making insights.
Big Data isn’t one approach or tool — for example, visualizations are needed in some situations, while connected analytics are the right answer in others. Like so much else in Big Data, it comes down to business problems and objectives. Are users seeking:
- Temporal patterns or geographical views of market data?
- Procedural insights from machine logs or sensor data?
- Correlations of behavioral patterns for a single product, multiple products or a yet-to-be-launched product?
Big Data is often about predictive capabilities and recommendation engines. But it’s also about operational actions guided by market sensitivity, about gaining a deeper understanding of the structure and nature of relationships between people and processes, and about defining patterns that lead to user-defined outcomes.
In the end, defining Big Data comes down to how a particular enterprise will use it. As experts debate whether corporations should focus on data minimization and smart data instead of Big Data, the enterprises that focus on harnessing data to create business value will succeed. Making Big Data work requires a strategic design and thoughtful architecture that not only examines current data streams and repositories but also accounts for specific business objectives, customer behavioral context, and longer-term market trends.