【Hive】基本建库、建表操作

网友投稿 316 2022-08-25

【Hive】基本建库、建表操作

文章目录

​​一、Hive 数据仓库的操作​​

​​1、创建数据仓库​​​​2、查看 db 数据仓库的信息及路径​​​​3、删除 db 数据仓库​​

​​二、Hive 数据表的操作​​

​​1、创建内部表​​​​2、创建外部表​​​​3、修改表结构​​​​4、删除表​​​​5、创建同结构表​​

环境准备Hadoop 完全分布式(一主两从即可)MySQL环境、Hive环境

一、Hive 数据仓库的操作

验证 hadoop 集群、mysql服务均已启动:

在 ​​Hive​​ 安装目录下的 ​​bin​​ 目录下启动: ​​./hive​​ :

[root@server bin]# ./hiveSLF4J: Class path contains multiple SLF4J bindings.SLF4J: Found binding in [jar:file:/usr/local/src/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: Found binding in [jar:file:/usr/local/src/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: See for an explanation.SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]Logging initialized using configuration in jar:file:/usr/local/src/hive/lib/hive-common-2.3.9.jar!/hive-log4j2.properties Async: trueHive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.hive>

1、创建数据仓库

// 小技巧:显示当前所使用的数据库名hive> set hive.cli.print.current.db=true // 创建db数据库hive (test)> create database if not exists db;OKTime taken: 0.011 seconds// 使用db数据库hive (test)> use db;OKTime taken: 0.021

在 ​​Hive​​ 中创建一个 ​​db​​ 数据仓库。在创建时,为了避免新建的库名与已有的库名重复,创建的时候添加 ​​if not exists​​ 如果库名重复将会报错:​​FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Database db already exists​​,

2、查看 db 数据仓库的信息及路径

hive (db)> describe database db;OKdb hdfs://192.168.64.183:9000/user/hive/warehouse/db.db root USER Time taken: 0.329 seconds, Fetched: 1 row(s)

3、删除 db 数据仓库

// 删除数据仓库hive (db)> drop database if exists db;Sun Mar 06 09:21:29 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.Sun Mar 06 09:21:30 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.Sun Mar 06 09:21:30 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.Sun Mar 06 09:21:30 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.OKTime taken: 0.438 seconds// 再次查看所有数据仓库hive (db)> show databases;OKdefaulttestTime taken: 0.025 seconds, Fetched: 2 row(s)

​​返回顶部​​

二、Hive 数据表的操作

Hive 的数据表分为两种:​​内部表​​ 和 ​​外部表​​。

Hive 创建内部表的时候,会将数据移动到数据仓库指向的路径;若创建外部表,仅记录数据所在的路径,不会对数据的位置做出任何改变。

在删除表的时候,内部表的元数据和数据会被一起删除,而外部表只删除元数据,不删除数据。这样外部表相对来说更加安全些,数据组织也更加灵活,方便共享源数据,生产中常使用外部表。

1、创建内部表

查看已存在的表(当前数据仓库中没有表):

hive (db)> show tables;OKTime taken: 0.026

创建一个名为 ​​cat​​ 的内部表,包含有两个字段:​​cat_id​​、​​cat_name​​,字符类型为 ​​string​​:

hive (db)> create table cat(cat_id string,cat_name string);OKTime taken: 1.046 secondshive (db)> show tables;OKcatTime taken: 0.02 seconds, Fetched: 1 row(s)

如果 cat 内部表已经存在,会报错:​​FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. AlreadyExistsException(message:Table cat already exists)​​

2、创建外部表

创建一个名为 ​​cat1​​ 的外部表,包含有两个字段:​​cat_id​​、​​cat_name​​,字符类型为 ​​string​​:

// 创建外部表hive (db)> create external table if not exists cat1(cat_id string,cat_name string) > row format delimited fields terminated by '\t' > location '/usr/root/goods';OKTime taken: 0.315 seconds// 查看表hive (db)> show tables;OKcatcat1Time taken: 0.017 seconds, Fetched: 2 row(s)

3、修改表结构

修改 ​​cat​​ 表的表结构,对 ​​cat​​ 表添加两个字段 ​​group_id​​ 和 ​​cat_code​​:

hive (db)> alter table cat add columns(group_id string,cat_code string);OKTime taken: 0.2 secondshive (db)> desc cat; // 查看表结构OKcat_id string cat_name string group_id string cat_code string Time taken: 0.042 seconds, Fetched: 4 row(s)

修改 ​​cat1​​ 表的表名,将其重命名为 ​​cat2​​ :

hive (db)> alter table cat1 rename to cat2;OKTime taken: 0.142 secondshive (db)> show tables; // 查看表OKcatcat2Time taken: 0.019 seconds, Fetched: 2 row(s)

该命令可以修改表名,数据所在的位置和分区名并不改变!!!

4、删除表

删除名为 ​​cat2​​ 的表:

hive (db)> drop table cat2;OKTime taken: 0.337 secondshive (db)> show tables;OKcatTime taken: 0.024 seconds, Fetched: 1 row(s)

5、创建同结构表

创建与已知表相同结构的表,创建一个与 ​​cat​​ 表相同结构的表,名为 ​​cat3​​,使用 ​​like​​ 关键字:

hive (db)> create table cat3 like cat;OKTime taken: 0.271 secondshive (db)> show tables;OKcatcat3Time taken: 0.018 seconds, Fetched: 2 row(s)hive (db)> desc cat3;OKcat_id string cat_name string group_id string cat_code string Time taken: 0.031 seconds, Fetched: 4 row(s)hive (db)> select * from cat3;OKTime taken: 1.599

通过查看表 cat3 的表结构及其表数据,可以看到 like 就是复制已有表的表结构。

​​返回顶部​​

版权声明:本文内容由网络用户投稿,版权归原作者所有,本站不拥有其著作权,亦不承担相应法律责任。如果您发现本站中有涉嫌抄袭或描述失实的内容,请联系我们jiasou666@gmail.com 处理,核实后本网站将在24小时内删除侵权内容。

上一篇:内容和结果均不及格,12强赛收官战国足两球不敌阿曼!(国足vs阿曼历史战绩)
下一篇:【Hive】Hive 分区表、桶的基本操作
相关文章

 发表评论

暂时没有评论,来抢沙发吧~