Hadoop — Running the Built-in WordCount on Hadoop 3.x Fails with "Container exited with a non-zero exit code 1"

Problem:

Today I set up a Hadoop cluster based on Hadoop 3.2.0 and configured HA for both the NameNode and YARN. However, running the WordCount example that ships with Hadoop failed. The exact error was:

2019-06-26 16:08:50,513 INFO mapreduce.Job: Job job_1561536344763_0001 failed with state FAILED due to: Application application_1561536344763_0001 failed 2 times due to AM Container for appattempt_1561536344763_0001_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2019-06-26 16:08:48.218]Exception from container-launch.
Container id: container_1561536344763_0001_02_000001
Exit code: 1
[2019-06-26 16:08:48.287]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See for more info.
[2019-06-26 16:08:48.288]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See for more info.
For more detailed output, check the application tracking page: Then click on links to logs of each attempt.. Failing the application.
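For context, the job in question is the stock WordCount example. A typical submission looks roughly like the following sketch; the jar path matches a default Hadoop 3.2.0 layout, and the HDFS input/output directories are placeholders of my own, not from the original run:

# Prepare some input in HDFS and run the bundled WordCount example.
# Paths below are illustrative; adjust them to your installation and data.
hdfs dfs -mkdir -p /user/hadoop/input
hdfs dfs -put $HADOOP_HOME/etc/hadoop/*.xml /user/hadoop/input
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar \
    wordcount /user/hadoop/input /user/hadoop/output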

Analysis and first attempts:

Searching online for quite a while mostly turned up answers blaming the classpath, so I started by setting the classpath. The steps were as follows:

1. Check YARN's classpath

Run the following command to print YARN's classpath:

-bash-4.1$ yarn classpath
/usr/local/hadoop-3.2.0/etc/hadoop:/usr/local/hadoop-3.2.0/share/hadoop/common/lib/*:/usr/local/hadoop-3.2.0/share/hadoop/common/*:/usr/local/hadoop-3.2.0/share/hadoop/hdfs:/usr/local/hadoop-3.2.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop-3.2.0/share/hadoop/hdfs/*:/usr/local/hadoop-3.2.0/share/hadoop/mapreduce/lib/*:/usr/local/hadoop-3.2.0/share/hadoop/mapreduce/*:/usr/local/hadoop-3.2.0/share/hadoop/yarn:/usr/local/hadoop-3.2.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-3.2.0/share/hadoop/yarn/*

Note: inspect the classpath entries printed above.

If the output above is empty, continue with the steps below.
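As a quick sketch (the variable names are just the standard Hadoop ones), you can also check which Hadoop-related variables your shell actually exports, since an empty classpath usually means they were never set:

# List the Hadoop environment variables visible to the current shell.
# Empty output here usually explains an empty `yarn classpath`.
env | grep -E 'HADOOP_(HOME|CONF_DIR|COMMON_HOME|HDFS_HOME|MAPRED_HOME|YARN_HOME)'
echo "$HADOOP_HOME"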

2. Modify mapred-site.xml

Add:

<property>
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>

3. Modify yarn-site.xml

Add:

<property>
  <name>yarn.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
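A quick way to confirm that both files now contain the new properties (the install path below matches the one shown in the `yarn classpath` output above):

# Grep each config file for the newly added classpath property and its value.
grep -A 1 'mapreduce.application.classpath' /usr/local/hadoop-3.2.0/etc/hadoop/mapred-site.xml
grep -A 1 'yarn.application.classpath' /usr/local/hadoop-3.2.0/etc/hadoop/yarn-site.xml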

4. Modify the environment variables

sudo vim /etc/profile

Append the following to the end of the file:

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native

Then make the environment variables take effect:

source /etc/profile
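As a quick spot check (assuming HADOOP_HOME itself is already exported elsewhere in /etc/profile), the variables should now expand and the classpath should no longer be empty:

# The variable should expand to the Hadoop install directory,
# and `hadoop classpath` should print a non-empty, colon-separated list.
echo "$HADOOP_MAPRED_HOME"
hadoop classpath | tr ':' '\n' | head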

But at this point the problem was still NOT solved!

The final fix:

Calming down and analyzing the problem again, the log shows that the container running the ApplicationMaster exited before it ever requested resources for the tasks from the ResourceManager. So I suspected a communication problem between the AM and the RM. With YARN HA there is one active RM and one standby RM: when the MapReduce job requests resources from the active RM everything works, but the problem appears when it ends up talking to the standby RM. That gave me a direction for the fix, and the next step was to add the corresponding per-RM settings to yarn-site.xml.
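Before touching the configuration, you can confirm which ResourceManager is currently active and which is standby; rm1 and rm2 are the rm-ids already declared in yarn-site.xml:

# Query the HA state of each ResourceManager by its rm-id.
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2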

Open yarn-site.xml and add the following configuration:

<property>
  <name>yarn.resourcemanager.address.rm1</name>
  <value>binghe103:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm1</name>
  <value>binghe103:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>binghe103:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
  <value>binghe103:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm1</name>
  <value>binghe103:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.admin.address.rm1</name>
  <value>binghe103:23142</value>
</property>
<property>
  <name>yarn.resourcemanager.address.rm2</name>
  <value>binghe104:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address.rm2</name>
  <value>binghe104:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>binghe104:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
  <value>binghe104:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address.rm2</name>
  <value>binghe104:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.admin.address.rm2</name>
  <value>binghe104:23142</value>
</property>
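As a rough connectivity check on the web addresses configured above (ports are the ones from the config), you can hit each RM's web UI; the active RM answers directly, while the standby typically redirects to the active one:

# Print only the HTTP status code returned by each ResourceManager's web UI.
curl -s -o /dev/null -w '%{http_code}\n' http://binghe103:8088/cluster
curl -s -o /dev/null -w '%{http_code}\n' http://binghe104:8088/cluster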


The complete yarn-site.xml now looks like this:

<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>binghe103</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>binghe104</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>binghe105:2181,binghe106:2181,binghe107:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>binghe103:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm1</name>
    <value>binghe103:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>binghe103:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
    <value>binghe103:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm1</name>
    <value>binghe103:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.admin.address.rm1</name>
    <value>binghe103:23142</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>binghe104:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address.rm2</name>
    <value>binghe104:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>binghe104:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
    <value>binghe104:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address.rm2</name>
    <value>binghe104:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.admin.address.rm2</name>
    <value>binghe104:23142</value>
  </property>
</configuration>

After adding this configuration, copy yarn-site.xml to every server in the cluster and rerun the job (one way to do the copy and restart is sketched after the note below).

Note: the YARN ResourceManager HA pair runs on the hosts binghe103 and binghe104, but yarn-site.xml still needs to be copied to every server in the cluster.
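One way to do the distribution and restart, sketched as a simple loop; the host names follow this post, so adjust the list and paths to your own cluster, and restart YARN from whichever node you normally start it on:

# Copy the updated yarn-site.xml to every node in the cluster,
# then restart YARN so the ResourceManagers and NodeManagers reload it.
for host in binghe103 binghe104 binghe105 binghe106 binghe107; do
    scp /usr/local/hadoop-3.2.0/etc/hadoop/yarn-site.xml \
        "${host}":/usr/local/hadoop-3.2.0/etc/hadoop/
done
/usr/local/hadoop-3.2.0/sbin/stop-yarn.sh
/usr/local/hadoop-3.2.0/sbin/start-yarn.sh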
