[hadoop] When to use Hadoop, HBase, Hive and Pig?

Hadoop:

Hadoop combines two pieces: HDFS (the Hadoop Distributed File System), which stores data across the cluster, and MapReduce, the computational processing model that runs batch jobs over that data.
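To make the MapReduce model concrete, here is a minimal sketch of a word count in plain Python. It does not use Hadoop at all; it only imitates the three stages (map, shuffle, reduce) in a single process. The function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are made up for this illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key before reducing."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: collapse all values seen for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["Hive Pig HBase", "pig hive", "hive"])
print(counts)
```

On a real cluster the mappers and reducers run on many machines and read their input from HDFS; the sketch only shows how the work is divided.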

HBase:

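HBase is a key-value store that runs on top of HDFS. It is a good fit when you need to read and write individual records in near real time, rather than process the whole data set in batch.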
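The key-value access pattern can be sketched in a few lines of Python. This is an in-memory stand-in, not the HBase client API: the class `TinyTable` and its `put`, `get` and `scan` methods are hypothetical names, and the row keys and column names are invented sample data. It assumes, as HBase does, that rows are kept sorted by row key so that a range of keys can be scanned cheaply.

```python
import bisect


class TinyTable:
    """In-memory stand-in for a sorted key-value table (not the HBase API)."""

    def __init__(self):
        self._keys = []  # row keys, kept in sorted order
        self._rows = {}  # row key -> {column: value}

    def put(self, row_key, column, value):
        """Write one cell; create the row if it does not exist yet."""
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        """Point lookup by row key; returns an empty dict if the row is missing."""
        return dict(self._rows.get(row_key, {}))

    def scan(self, start, stop):
        """Return rows whose key is in [start, stop), in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(key, dict(self._rows[key])) for key in self._keys[lo:hi]]


table = TinyTable()
table.put("user#002", "info:name", "Bob")
table.put("user#001", "info:name", "Alice")
table.put("user#001", "info:city", "Pune")
table.put("video#001", "info:title", "Intro")

print(table.get("user#001"))
print(table.scan("user#", "user#~"))
```

The point of the sketch is the access pattern: single-row reads and writes by key, plus ordered range scans, which is what distinguishes this kind of store from a batch job over files.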

Hive:

Hive is used to extract and query data stored in HDFS using a SQL-like syntax. Its query language is HQL (HiveQL).
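To give a feel for the SQL-like style, here is a small sketch that runs an aggregate query from Python. It uses the built-in `sqlite3` module purely as a stand-in so the example is runnable anywhere; it is not Hive and does not touch HDFS. The table `page_views` and its columns are invented sample data. A query of this shape (SELECT with GROUP BY and ORDER BY) is the kind of thing you would write in HQL.

```python
import sqlite3

# In-memory database standing in for a table that Hive would expose over HDFS files.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (user_id TEXT, page TEXT)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [
        ("u1", "/home"),
        ("u2", "/home"),
        ("u1", "/cart"),
        ("u3", "/home"),
        ("u2", "/cart"),
        ("u3", "/help"),
    ],
)

# Declarative query: say what result you want, not how to compute it.
query = """
SELECT page, COUNT(*) AS hits
FROM page_views
GROUP BY page
ORDER BY hits DESC
"""
rows = conn.execute(query).fetchall()
for page, hits in rows:
    print(page, hits)
```

With Hive, a query like this is turned into distributed jobs for you, which is why it suits people who already know SQL.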

Pig:

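Pig is a data flow language for building ETL (extract, transform, load) pipelines. It is a scripting language: you describe the processing step by step instead of writing a single declarative query.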
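The data flow style can be imitated in Python as a chain of named steps, each taking the output of the previous one. This is only a sketch of the idea, not Pig itself: the helper functions `load`, `filter_rows`, `group_by` and `generate_totals` are hypothetical, and the comments name the Pig Latin operators (LOAD, FILTER, GROUP, FOREACH ... GENERATE) they loosely correspond to. The input data is invented.

```python
import csv
import io
from collections import defaultdict

RAW = "user,action,amount\nalice,buy,30\nbob,view,0\nalice,buy,12\nbob,buy,5\n"


def load(text):
    """Roughly LOAD: parse delimited text into a list of records."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows, predicate):
    """Roughly FILTER: keep only the records that match a condition."""
    return [row for row in rows if predicate(row)]


def group_by(rows, key):
    """Roughly GROUP: collect records that share the same key value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return groups


def generate_totals(groups):
    """Roughly FOREACH ... GENERATE: compute one output value per group."""
    return {
        user: sum(int(row["amount"]) for row in rows)
        for user, rows in groups.items()
    }


# Each step is a named intermediate result, as in a Pig script.
records = load(RAW)
purchases = filter_rows(records, lambda row: row["action"] == "buy")
by_user = group_by(purchases, "user")
totals = generate_totals(by_user)
print(totals)
```

Compared with the SQL-like approach, the pipeline makes every intermediate step explicit, which is convenient for multi-stage ETL jobs.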
