Introduction:
In this blog, I will discuss Big Data, its characteristics, the different sources of Big Data, and some key components of the Hadoop Framework.
In this two-part blog series, I will cover the basics of the Hadoop Ecosystem.
Let us start with Big Data and its importance to the Hadoop Framework. Ethics, privacy, and security measures are also very important and need to be taken care of while dealing with the challenges of Big Data.
Big Data: When the Data itself becomes the part of the problem.
Data is crucial for all organizations, and it has to be stored for future use. The term Big Data refers to data that is beyond the storage capacity and processing power of an organization.
What are the sources of this huge data?
There are many different sources of data, such as social networks, CCTV cameras, sensors, online shopping portals, hospitality data, GPS, and the automobile industry, all of which generate data on a massive scale.
Big Data can be characterized as:
- Volume of the Data
- Velocity of the Data
- Variety of the Data being processed
Volume of Data → Data is increasing rapidly, measured in GB, TB, PB, and beyond, and requires huge amounts of disk space to store.
Velocity of Data → Huge volumes of data are stored in data centres to cater to organizational needs, and high-speed data processors are needed to move that data to local workstations.
Variety of Data → Data can be broadly classified into the following types: Structured, Unstructured, and Semi-structured.
Big Data = (Volume + Velocity + Variety) of Data
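To make the Variety characteristic concrete, here is a minimal Python sketch (purely illustrative, not part of any Hadoop API) showing one small sample of each of the three data types mentioned above; all the sample values are made up:

```python
import csv
import io
import json

# Structured data: fixed schema of rows and columns (e.g. CSV files,
# relational database tables).
structured = "id,name,city\n1,Alice,Pune\n2,Bob,Delhi\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured data: self-describing with a flexible schema
# (e.g. JSON, XML) -- fields can vary from record to record.
semi_structured = '{"id": 3, "name": "Carol", "tags": ["gps", "mobile"]}'
record = json.loads(semi_structured)

# Unstructured data: no predefined schema at all (free text, images,
# audio, video frames from CCTV cameras, and so on).
unstructured = "CCTV frame captured at 10:42; pedestrian crossing detected."

print(rows[0]["name"])   # a field parsed from the structured sample
print(record["tags"])    # a nested field from the semi-structured sample
print(unstructured[:10]) # raw text with no schema to parse
```

In practice, the Hadoop Ecosystem provides different tools for each of these varieties, which is one reason Variety is treated as a defining characteristic of Big Data.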