Enter this line into Hive command line interface:
insert overwrite directory '/data/test' row format delimited fields terminated by '\t' stored as textfile select * from testViewQuery;
testViewQuery
- some specific view
@sarath how to overwrite the file if i want to run another select * command from a different table and write to same file ?
INSERT OVERWRITE LOCAL DIRECTORY '/home/training/mydata/outputs'
SELECT expl , count(expl) as total
FROM (
SELECT explode(splits) as expl
FROM (
SELECT split(words,' ') as splits
FROM wordcount
) t2
) t3
GROUP BY expl ;
This is an example to sarath's question
the above is a word count job stored in outputs file which is in local directory :)
I agree with tnguyen80's response. Please note that when there is a specific string value in query better to given entire query in double quotes.
For example:
$hive -e "select * from table where city = 'London' and id >=100" > /home/user/outputdirectory/city details.csv
The ideal way to do it will be using "INSERT OVERWRITE DIRECTORY '/pathtofile' select * from temp where id > 100" instead of "hive -e 'select * from...' > /filepath.txt"
Two ways can store HQL query results:
INSERT OVERWRITE DIRECTORY "HDFS Path" ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
SELECT * FROM XXXX LIMIT 10;
$hive -e "select * from table_Name" > ~/sample_output.txt
$hive -e "select * from table where city = 'London' and id >=100" > /home/user/outputdirectory/city details.csv
This will put the results in tab delimited file(s) under a directory:
INSERT OVERWRITE LOCAL DIRECTORY '/home/hadoop/YourTableDir'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
SELECT * FROM table WHERE id > 100;
The following query will insert the results directly into HDFS:
INSERT OVERWRITE DIRECTORY '/path/to/output/dir' SELECT * FROM table WHERE id > 100;
This command will redirect the output to a text file of your choice:
$hive -e "select * from table where id > 10" > ~/sample_output.txt
To directly save the file in HDFS, use the below command:
hive> insert overwrite directory '/user/cloudera/Sample' row format delimited fields terminated by '\t' stored as textfile select * from table where id >100;
This will put the contents in the folder /user/cloudera/Sample in HDFS.
Example:
Creating external table to store the query results at '/user/myName/projectA_additionaData/'
CREATE EXTERNAL TABLE additionaData
(
ID INT,
latitude STRING,
longitude STRING
)
COMMENT 'Additional Data gathered by joining of the identified cities with latitude and longitude data'
ROW FORMAT DELIMITED FIELDS
TERMINATED BY ',' STORED AS TEXTFILE location '/user/myName/projectA_additionaData/';
Feeding the query results into the temp table
insert into additionaData
Select T.ID, C.latitude, C.longitude
from TWITER
join CITY C on (T.location_name = C.location);
Dropping the temp table
drop table additionaData
To set output directory and output file format and more, try the following:
INSERT OVERWRITE [LOCAL] DIRECTORY directory1
[ROW FORMAT row_format] [STORED AS file_format]
SELECT ... FROM ...
Example:
INSERT OVERWRITE DIRECTORY '/path/to/output/dir'
ROW FORMAT DELIMITED
STORED AS PARQUET
SELECT * FROM table WHERE id > 100;
Source: Stackoverflow.com