Load HDFS data into a Hive table


Confirm that the target location (the Hive warehouse directory) is empty:

[cloudera@quickstart ~]$ hadoop fs -ls /user/hive/warehouse

[cloudera@quickstart ~]$
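
The warehouse path shown above is Hive's default location for managed tables and is controlled by the hive.metastore.warehouse.dir property. If the listing is not empty on your system, you can check which directory your installation uses (a minimal check, assuming the stock Cloudera QuickStart configuration):

hive> set hive.metastore.warehouse.dir;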

Put the source data file into an HDFS staging location:

[cloudera@quickstart ~]$ hadoop fs -ls /test

[cloudera@quickstart ~]$ hadoop fs -put T1.csv /test

[cloudera@quickstart ~]$ hadoop fs -put T1.csv /test/T2.csv

[cloudera@quickstart ~]$ hadoop fs -ls /test

Found 2 items

-rw-r--r--   1 cloudera supergroup          8 2020-03-26 09:31 /test/T1.csv

-rw-r--r--   1 cloudera supergroup          8 2020-03-26 09:31 /test/T2.csv

[cloudera@quickstart ~]$
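
The source file here is a two-row CSV (its contents are shown by hadoop fs -cat at the end of this walkthrough). If you need to recreate it, a simple sketch that produces the same 8-byte file:

[cloudera@quickstart ~]$ printf '1,2\n3,4\n' > T1.csv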

Enter the Hive CLI:

[cloudera@quickstart ~]$ hive

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties

WARNING: Hive CLI is deprecated and migration to Beeline is recommended.

hive> show tables;

OK

Time taken: 0.318 seconds
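
As the startup warning notes, the Hive CLI is deprecated in favor of Beeline. The same session could be run through Beeline instead; a minimal sketch, assuming HiveServer2 is listening on the QuickStart VM's default port 10000:

[cloudera@quickstart ~]$ beeline -u jdbc:hive2://localhost:10000 -n cloudera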

Create tables T1 and T2. T1 uses Hive's default row format; T2 is declared as comma-delimited text:

hive> create table T1(a int,b int);

OK

Time taken: 0.253 seconds

hive> create table T2(a int,b int) row format delimited fields terminated by ',' stored as textfile;

OK

Time taken: 0.194 seconds
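
The two statements differ only in their row format: T1 keeps Hive's default text SerDe, whose field delimiter is the Ctrl-A character (\001), while T2 is explicitly comma-delimited. The difference can be confirmed from the table metadata, for example:

hive> describe formatted T2;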

Load data into T1 and query it; every column comes back as NULL:

hive> load data inpath '/test/T1.csv' into table T1;

Loading data to table default.t1

Table default.t1 stats: [numFiles=1, totalSize=8]

OK

Time taken: 0.632 seconds

hive> select * from T1;

OK

NULL    NULL

NULL    NULL

Time taken: 0.395 seconds, Fetched: 2 row(s)
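
The NULLs appear because T1 has no delimiter clause, so Hive splits each line on Ctrl-A, finds no such character in "1,2", and cannot cast the whole string to int. One way to repair it is to drop and recreate T1 with the matching delimiter, as was done for T2 (a sketch; note that dropping a managed table also deletes the file that was moved into its warehouse directory, so the CSV would have to be uploaded and loaded again):

hive> drop table T1;

hive> create table T1(a int, b int) row format delimited fields terminated by ',' stored as textfile;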

Load data into T2; this time the query returns the expected rows:

hive> load data inpath '/test/T2.csv' into table T2;

Loading data to table default.t2

Table default.t2 stats: [numFiles=1, totalSize=8]

OK

Time taken: 0.259 seconds

hive> select * from T2;

OK

1       2

3       4

Time taken: 0.057 seconds, Fetched: 2 row(s)
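
LOAD DATA ... INPATH moves the file within HDFS. If the source file should stay where it is, LOAD DATA LOCAL INPATH copies it from the client's local filesystem instead; a sketch, assuming T1.csv is still present in the home directory:

hive> load data local inpath '/home/cloudera/T1.csv' into table T2;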

hive> exit;

WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.

WARN: Please see for an explanation.

[cloudera@quickstart ~]$

After loading, the source files have been moved (not copied) from /test into the tables' warehouse directories, keeping their original file names:

[cloudera@quickstart ~]$ hadoop fs -ls /test

[cloudera@quickstart ~]$ hadoop fs -ls -R /user/hive/warehouse/

drwxrwxrwx   - cloudera supergroup          0 2020-03-26 09:35 /user/hive/warehouse/t1

-rwxrwxrwx   1 cloudera supergroup          8 2020-03-26 09:31 /user/hive/warehouse/t1/T1.csv

drwxrwxrwx   - cloudera supergroup          0 2020-03-26 09:35 /user/hive/warehouse/t2

-rwxrwxrwx   1 cloudera supergroup          8 2020-03-26 09:31 /user/hive/warehouse/t2/T2.csv

[cloudera@quickstart ~]$ hadoop fs -cat /user/hive/warehouse/t1/T1.csv

1,2

3,4

[cloudera@quickstart ~]$ hadoop fs -cat /user/hive/warehouse/t2/T2.csv

1,2

3,4

[cloudera@quickstart ~]$
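
Because T1 and T2 are managed tables, LOAD DATA relocated their files under the warehouse. If the data should stay at its original HDFS location, an external table pointed at that directory avoids the move entirely; a minimal sketch, assuming the CSV files remain in /test:

hive> create external table T3(a int, b int) row format delimited fields terminated by ',' stored as textfile location '/test';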
