
Tapping Into the “Long Tail” of Big Data

Variety, not volume or velocity, drives big-data investments.


Gartner defines big data by the three Vs: high-volume, high-velocity, and high-variety information assets. While all three Vs are growing, variety is becoming the single biggest driver of big-data investments, as the results of a recent survey by New Vantage Partners show. This trend will continue as firms seek to integrate more sources and focus on the “long tail” of big data. From schema-free JSON, to nested types in relational and NoSQL databases, to non-flat formats such as Avro, Parquet, and XML, data formats are multiplying, and connectors are becoming crucial. In 2017, analytics platforms will be evaluated on their ability to provide live, direct connectivity to these disparate sources.
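To make the connectivity challenge concrete, here is a minimal Python sketch (using pandas) that pulls three of these formats into a single analysis layer. The file names and the customer_id join key are hypothetical, and a real analytics platform would connect to live sources rather than local files.

```python
import json
import pandas as pd

# Schema-free, possibly nested JSON: flatten nested fields into columns.
with open("events.json") as f:
    events = pd.json_normalize(json.load(f))

# Columnar, non-flat Parquet (pd.read_parquet needs pyarrow or fastparquet).
metrics = pd.read_parquet("metrics.parquet")

# A legacy relational extract, delivered as a flat CSV file.
customers = pd.read_csv("customers.csv")

# Analysis across disparate sources then reduces to joins on a shared key
# ("customer_id" is an assumed key for this example).
combined = events.merge(metrics, on="customer_id").merge(customers, on="customer_id")
print(combined.head())
```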


When asked about the drivers of Big Data success, 69% of corporate executives named greater data variety as the most important factor, followed by volume (25%), with velocity (6%) trailing far behind. In the corporate world, the big opportunity lies in integrating more sources of data, not bigger amounts. Variety, not volume, is king. MIT professor and 2014 Turing Award recipient Michael Stonebraker calls this the “long tail” of Big Data: companies are focusing on integrating sources of data that have traditionally been ignored, as well as identifying new data sources. Stonebraker cites the example of life sciences firms with thousands of research scientists, each with their own research databases that have never been tied together for analysis. Tapping into more data sources has emerged as the new data frontier within the corporate world.
How are corporations focusing their data management efforts to develop more robust data and analytics? Firms are taking three primary paths:



Capture Legacy Data Sources

It may come as a surprise, but many firms see the big opportunity in Big Data as the capture of traditional legacy data sources that have gone untapped in the past. These are data sets that have typically sat outside the purview of traditional data marts or warehouses: the “long tail” data. A majority of firms (57%) identified this as their top data priority. One of the beauties of Big Data is that organizations can now go deeper into their own data before they turn to new sources.

Integrate Unstructured Data

Businesses have been inhibited in their ability to mine and analyze the vast amounts of information residing in text and documents. Traditional data environments were designed to maintain and process structured data — numbers and variables — not words and pictures. A growing percentage of firms (29%) are now focusing on integrating this unstructured data, for purposes ranging from customer sentiment analysis to analysis of regulatory documents to insurance claim adjudication. The ability to integrate unstructured data is broadening traditional analytics to combine quantitative metrics with qualitative content.
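As a toy sketch of what that combination can look like, the Python snippet below scores free-text customer comments with a tiny hand-rolled sentiment lexicon and joins the scores to structured account data. The lexicon, field names, and sample rows are all illustrative assumptions, not a production sentiment method.

```python
import pandas as pd

# Illustrative word lists; real sentiment analysis would use a trained model.
POSITIVE = {"great", "helpful", "fast", "love"}
NEGATIVE = {"slow", "broken", "frustrating", "cancel"}

def sentiment_score(text: str) -> int:
    """Count positive words minus negative words in a comment (naive split)."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Unstructured side: free-text comments (sample data, assumed schema).
comments = pd.DataFrame({
    "customer_id": [1, 2],
    "comment": ["Great service, love the app", "Checkout is slow and frustrating"],
})

# Structured side: quantitative account metrics.
accounts = pd.DataFrame({
    "customer_id": [1, 2],
    "annual_spend": [1200.0, 300.0],
})

# Blend qualitative content with quantitative metrics in one table.
comments["sentiment"] = comments["comment"].map(sentiment_score)
blended = comments.merge(accounts, on="customer_id")
print(blended)
```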

Add Social Media and Behavioral Data Sources

While much of the early excitement around Big Data came from the capture of social media and behavioral activity by firms like eBay and Facebook, these applications remain relatively nascent among the Fortune 1000, with just 14% citing this as a priority. As firms progress with their Big Data efforts, they are likely to turn their attention to untapped opportunities in social and behavioral data, in areas such as patient adherence and mobile recommendations based on consumer purchasing behavior and preferences, where timely recommendations can yield immediate results.
As mainstream companies progress on their Big Data journey, we should expect that expanding the variety of data sources for analysis will continue to dominate their interests.
