Today, the world depends heavily on the Internet, and a very large share of the world's population is online. These users generate enormous amounts of data, measured in gigabytes (GB), terabytes (TB), petabytes (PB), and beyond, and such huge volumes of data are called Big Data.
Big Data is classified into 3 types :-
1) Structured (e.g., relational database tables)
2) Semi-structured (e.g., JSON or XML files)
3) Unstructured (e.g., free text, images, audio, and video)
Big Data is also characterized by the 3 Vs: Volume, Velocity, and Variety. Let us see what the 3 Vs mean:
1) Volume :- Volume describes how much data is generated on the Internet.
2) Velocity :- Velocity describes the speed at which data is generated on the Internet.
3) Variety :- Variety describes the types of data generated on the Internet (structured, semi-structured, and unstructured).
To tackle such large datasets, we use Big Data frameworks like Hadoop, Splunk, Spark, Cloudera, etc. So let us discuss the Hadoop framework and why we use it.
1. Hadoop
Hadoop is one of the best-known data analytics tools, and it has various components like HDFS, MapReduce, Hive, Pig, Oozie, Sqoop, etc. These components make it very useful for working with Big Data.
So Hadoop is a framework for Big Data, and its core is divided into two parts: 1) HDFS and 2) MapReduce.
1) HDFS :- HDFS stands for Hadoop Distributed File System. It is the storage layer: we store our Big Data in HDFS and then process it according to the task we want to do (a small sketch of loading a file into HDFS is shown after this list).
2) MapReduce :- MapReduce is the data processing layer. It extracts the relevant data from HDFS and produces the output (for example, on the console or back into HDFS) by running Mapper and Reducer code, which can be written in Java, Python, etc. (a minimal word-count example follows below).
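To illustrate how data lands in HDFS in the first place, here is a minimal Python sketch that simply wraps the standard `hdfs dfs` shell commands. The file name `sales.csv` and the directory `/user/demo/input` are only placeholders, not paths from this post.

```python
import subprocess

def put_into_hdfs(local_path, hdfs_dir):
    # Create the target directory in HDFS if it does not already exist.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir], check=True)
    # Copy the local file into the HDFS directory (-f overwrites an existing copy).
    subprocess.run(["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir], check=True)
    # List the directory to confirm the file is now stored in HDFS.
    subprocess.run(["hdfs", "dfs", "-ls", hdfs_dir], check=True)

if __name__ == "__main__":
    # Placeholder file and HDFS path for illustration only.
    put_into_hdfs("sales.csv", "/user/demo/input")
```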
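And here is a minimal word-count sketch of the MapReduce idea in Python: the mapper emits (word, 1) pairs, the pairs are sorted by key (the shuffle step), and the reducer sums the counts. In a real Hadoop Streaming job the mapper and reducer would be two separate scripts submitted with the hadoop-streaming jar, and the framework would do the sort/shuffle; here everything runs in one process only to show the flow.

```python
import sys
from itertools import groupby

def mapper(lines):
    # Mapper: emit (word, 1) for every word in the input lines.
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1

def reducer(pairs):
    # Reducer: pairs arrive sorted by key, so group by word and sum the counts.
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    # Run both phases locally on stdin to show map -> sort (shuffle) -> reduce.
    mapped = sorted(mapper(sys.stdin))
    for word, count in reducer(mapped):
        print(f"{word}\t{count}")
```

For example, piping the text "big data big" into this script prints "big 2" and "data 1" as tab-separated pairs.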