[database] Insert data into hive table

Using a Cygwin distribution, I've installed Hadoop 0.20.3 and Hive 0.11.0.

First of all, I don't understand how to use the Hive CLI:

hive> show tables;

Then enter and nothing happens. I can execute queries using hive -e/-f.

Then, I've created a table:

CREATE TABLE tweet_table(
tweet STRING
)
COMMENT 'Table of string'

But how can I insert data into this table? I see some INSERT INTO examples but when I try:

INSERT INTO TABLE tweet_table (tweet) VALUES ("data")

I've got an error:

FAILED: ParseException line 1:30 cannot recognize input near '(' 'tweet' ')' in select clause

How can I append data in my table?

This question is related to database hadoop hive

The answer is


Try to use this with single quotes in data:

insert into table test_hive values ('1','puneet');

If table is without partition then code will be,

Insert into table table_name select col_a,col_b,col_c from another_table(source table)

--here any condition can be applied such as limit, group by, order by etc...

If table is with partitions then code will be,

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

insert into table table_name partition(partition_col1, paritition_col2) select col_a,col_b,col_c,partition_col1,partition_col2 from another_table(source table)

--here any condition can be applied such as limit, group by, order by etc...


Although there is an accepted answer I would want to add that as of Hive 0.14, record level operations are allowed. The correct syntax and query would be:

INSERT INTO TABLE tweet_table VALUES ('data');

If you already have a table pre_loaded_tbl with some data. You can use a trick to load the data into your table with following query

INSERT INTO TABLE tweet_table 
  SELECT  "my_data" AS my_column 
    FROM   pre_loaded_tbl 
   LIMIT   5;

Also please note that "my_data" is independent of any data in the pre_loaded_tbl. You can select any data and write any column name (here my_data and my_column). Hive does not require it to have same column name. However structure of select statement should be same as that of your tweet_table. You can use limit to determine how many times you can insert into the tweet_table.

However if you haven't' created any table, you will have to load the data using file copy or load data commands in above answers.


Examples related to database

Implement specialization in ER diagram phpMyAdmin - Error > Incorrect format parameter? Authentication plugin 'caching_sha2_password' cannot be loaded Room - Schema export directory is not provided to the annotation processor so we cannot export the schema SQL Query Where Date = Today Minus 7 Days MySQL Error: : 'Access denied for user 'root'@'localhost' SQL Server date format yyyymmdd How to create a foreign key in phpmyadmin WooCommerce: Finding the products in database TypeError: tuple indices must be integers, not str

Examples related to hadoop

Hadoop MapReduce: Strange Result when Storing Previous Value in Memory in a Reduce Class (Java) What is the difference between spark.sql.shuffle.partitions and spark.default.parallelism? How to check Spark Version What are the pros and cons of parquet format compared to other formats? java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient How to export data from Spark SQL to CSV How to copy data from one HDFS to another HDFS? How to calculate Date difference in Hive Select top 2 rows in Hive Spark - load CSV file as DataFrame?

Examples related to hive

select rows in sql with latest date for each ID repeated multiple times PySpark: withColumn() with two conditions and three outcomes java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient Hive cast string to date dd-MM-yyyy How to save DataFrame directly to Hive? How to calculate Date difference in Hive Select top 2 rows in Hive Just get column names from hive table Create hive table using "as select" or "like" and also specify delimiter Hive Alter table change Column Name