Hive on Spark configuration

Reader submission 290 2022-11-23


1. Install Java, Maven, Scala, Hadoop, MySQL, and Hive

(Omitted.)

2. Build Spark

    ./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-2.6,parquet-provided"

3. Install Spark

    tar -zxvf spark-1.6.0-bin-hadoop2-without-hive.tgz -C /opt/cdh5/

4. Configure Spark

spark-env.sh:

    export JAVA_HOME=/opt/service/jdk1.8.0_151
    export SCALA_HOME=/opt/service/scala-2.10.5
    export HADOOP_HOME=/opt/cdh5/hadoop-2.6.0-cdh5.10.0
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export HIVE_CONF_DIR=/opt/cdh5/hive-2.1.0/conf
    export SPARK_WORKER_CORES=4
    export SPARK_WORKER_INSTANCES=4
    export SPARK_WORKER_MEMORY=1g
    export SPARK_DRIVER_MEMORY=1g
    export SPARK_MASTER_IP=chavin.king
    export SPARK_LIBRARY_PATH=/opt/cdh5/spark-1.6.0-bin-hadoop2-without-hive/lib
    export SPARK_MASTER_WEBUI_PORT=8080
    export SPARK_WORKER_WEBUI_PORT=8081
    export SPARK_WORKER_DIR=/opt/cdh5/spark-1.6.0-bin-hadoop2-without-hive/work
    export SPARK_MASTER_PORT=7077
    export SPARK_WORKER_PORT=7078
    export SPARK_LOG_DIR=/opt/cdh5/spark-1.6.0-bin-hadoop2-without-hive/log

spark-defaults.conf:

    #spark.master                     yarn
    spark.master                      spark://chavin.king:7077
    spark.home                        /opt/cdh5/spark-1.6.0-bin-hadoop2-without-hive
    spark.eventLog.enabled            true
    spark.eventLog.dir                hdfs://chavin.king:8020/spark-log
    spark.serializer                  org.apache.spark.serializer.KryoSerializer
    spark.executor.memory             1g
    spark.driver.memory               1g
    spark.executor.extraJavaOptions   -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"

slaves:

    chavin.king

5. Configure YARN

yarn-site.xml (property name, then value):

    yarn.resourcemanager.scheduler.class   org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler

6. Configure Hive

hive-site.xml (property name, then value):

    hive.execution.engine               spark
    hive.enable.spark.execution.engine  true
    spark.home                          /opt/cdh5/spark-1.6.0-bin-hadoop2-without-hive
    spark.master                        spark://chavin.king:7077
    spark.eventLog.enabled              true
    spark.eventLog.dir                  hdfs://chavin.king:8020/spark-log
    spark.serializer                    org.apache.spark.serializer.KryoSerializer
    spark.executor.memory               1g
    spark.driver.memory                 1g
    spark.executor.extraJavaOptions     -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"

7. Add the Spark jars to Hive

Copy spark-core into Hive's lib directory, link the Spark assembly jar there, and upload the assembly jar to HDFS:

    cp /opt/software/spark-1.6.0/core/target/spark-core_2.10-1.6.0.jar /opt/cdh5/hive-2.1.0/lib/
    ln -s /opt/cdh5/spark-1.6.0-bin-hadoop2-without-hive/lib/spark-assembly-1.6.0-hadoop2.6.0.jar /opt/cdh5/hive-2.1.0/lib/
    bin/hdfs dfs -put /opt/cdh5/spark-1.6.0-bin-hadoop2-without-hive/lib/spark-assembly-1.6.0-hadoop2.6.0.jar hdfs://chavin.king:8020/spark-assembly-1.6.0-hadoop2.6.0.jar

Then add the following to hive-site.xml (property name, then value):

    spark.yarn.jar   hdfs://chavin.king:8020/spark-assembly-1.6.0-hadoop2.6.0.jar

8. Verify that Hive on Spark is configured correctly

Start the Hive CLI and run a query that needs a job. If the output shows "Starting Spark Job" and the query finishes successfully, Hive is running on Spark:

    $ bin/hive
    which: no hbase in (/opt/cdh5/spark-1.6.0-bin-hadoop2-without-hive/bin:/opt/service/maven-3.3.3/bin:/opt/service/scala-2.10.5/bin:/opt/service/jdk1.8.0_151/bin:/opt/service/jdk1.8.0_151/jre/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/hadoop/.local/bin:/home/hadoop/bin)
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/cdh5/hive-2.1.0/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/cdh5/spark-1.6.0-bin-hadoop2-without-hive/lib/spark-assembly-1.6.0-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/cdh5/hadoop-2.6.0-cdh5.10.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

    Logging initialized using configuration in file:/opt/cdh5/hive-2.1.0/conf/hive-log4j2.properties Async: true
    hive (default)> show tables ;
    OK
    tab_name
    t1
    Time taken: 0.966 seconds, Fetched: 1 row(s)
    hive (default)> select count(*) from t1;
    Query ID = hadoop_20171204024017_cda99c42-21eb-480f-9d2a-e0dbb18a9b63
    Total jobs = 1
    Launching Job 1 out of 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=
    In order to set a constant number of reducers:
      set mapreduce.job.reduces=
    Starting Spark Job = e8b4ccc6-2dfa-43b9-99cc-7a066e2c0a0f

    Query Hive on Spark job[0] stages:
    0
    1

    Status: Running (Hive on Spark job[0])
    Job Progress Format
    CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
    2017-12-04 02:40:32,861 Stage-0_0: 0/1 Stage-1_0: 0/1
    ... ...
    2017-12-04 02:44:11,388 Stage-0_0: 1/1 Finished Stage-1_0: 0(+1)/1
    2017-12-04 02:44:50,826 Stage-0_0: 1/1 Finished Stage-1_0: 1/1 Finished
    Status: Finished successfully in 268.11 seconds
    OK
    c0
    3
    Time taken: 338.493 seconds, Fetched: 1 row(s)
    hive (default)> exit;
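The hive-site.xml settings in steps 6 and 7 are given without their XML markup. The fragment below is a minimal sketch of how they might be written in the file, with each pair wrapped in a `<property>` element inside the existing `<configuration>` root. The host name, paths, and values are taken from this article's environment and are assumptions for any other cluster. The property names use the spelling from the Spark documentation (`spark.eventLog.enabled`, `spark.eventLog.dir`, `spark.executor.memory`, `spark.driver.memory`).

```xml
<!-- Sketch only: Hive on Spark entries for hive-site.xml.
     Place these inside the file's existing <configuration> element.
     Host name and paths come from this article's setup; adjust for your cluster. -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>hive.enable.spark.execution.engine</name>
  <value>true</value>
</property>
<property>
  <name>spark.home</name>
  <value>/opt/cdh5/spark-1.6.0-bin-hadoop2-without-hive</value>
</property>
<property>
  <!-- Standalone master, as used in this article -->
  <name>spark.master</name>
  <value>spark://chavin.king:7077</value>
</property>
<property>
  <name>spark.eventLog.enabled</name>
  <value>true</value>
</property>
<property>
  <name>spark.eventLog.dir</name>
  <value>hdfs://chavin.king:8020/spark-log</value>
</property>
<property>
  <name>spark.serializer</name>
  <value>org.apache.spark.serializer.KryoSerializer</value>
</property>
<property>
  <name>spark.executor.memory</name>
  <value>1g</value>
</property>
<property>
  <name>spark.driver.memory</name>
  <value>1g</value>
</property>
<property>
  <name>spark.executor.extraJavaOptions</name>
  <value>-XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"</value>
</property>
<property>
  <!-- Assembly jar uploaded to HDFS in step 7 -->
  <name>spark.yarn.jar</name>
  <value>hdfs://chavin.king:8020/spark-assembly-1.6.0-hadoop2.6.0.jar</value>
</property>
```

With event logging enabled, the directory named in `spark.eventLog.dir` generally has to exist in HDFS before the first job runs, so it is worth creating it up front.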

