Hadoop Distributed Deployment

User submission, 2022-11-25


1. Nodes (hosts entries for the four machines):

    192.168.100.40 node04.magedu.com node04
    192.168.100.30 node03.magedu.com node03
    192.168.100.20 node02.magedu.com node02
    192.168.100.10 node01.magedu.com node01
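The pssh commands in the later steps read their targets from a file named iplist, which the walkthrough never shows. A minimal sketch, assuming iplist simply holds one short hostname per line and that the addresses follow the pattern listed above; the function names gen_hosts and gen_iplist are hypothetical:

```shell
#!/bin/sh
# Print hosts-file entries for node01..node04 (addresses taken from step 1).
gen_hosts() {
    for i in 1 2 3 4; do
        printf '192.168.100.%s0 node0%s.magedu.com node0%s\n' "$i" "$i" "$i"
    done
}

# Print one hostname per line, the layout pssh accepts for its -h file.
gen_iplist() {
    gen_hosts | awk '{ print $3 }'
}

gen_hosts     # append this output to /etc/hosts on every node
gen_iplist    # redirect this output to the file named iplist
```

Generating both files from one function keeps the hosts entries and the pssh target list from drifting apart.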

2. Configure the Java environment

On a single node:

    yum install java-1.7.0-openjdk-devel
    vim /etc/profile.d/java.sh
        export JAVA_HOME=/usr
    . /etc/profile.d/java.sh        # same as: source /etc/profile.d/java.sh

Or on all nodes at once with pssh:

    pssh -ih iplist 'yum install java-1.7.0-openjdk-devel -y'
    pssh -ih iplist 'echo export JAVA_HOME=/usr > /etc/profile.d/java.sh'
    pssh -ih iplist '. /etc/profile.d/java.sh'

Verify:

    [root@node01 ~]# java -version
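With JAVA_HOME=/usr, Hadoop's scripts end up running /usr/bin/java, so the setting is only valid if that binary exists. A small check along those lines, as a sketch; check_java_home is a hypothetical helper, not part of Hadoop or the JDK:

```shell
#!/bin/sh
# Return 0 if the given directory looks like a usable JAVA_HOME,
# i.e. it contains an executable bin/java.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if check_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME ok: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no bin/java: '${JAVA_HOME:-}'"
fi
```

Saved as a script and copied to each node, it can be run everywhere through pssh in the same way as the other commands in this step.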

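3. Time synchronization

Edit /etc/chrony.conf:

    server 192.168.100.10 iburst
    allow 192.168.100.0/24

Enable and start chronyd:

    systemctl is-enabled chronyd
    systemctl enable chronyd
    systemctl start chronyd

    clock -w    # write the system time to the hardware clock
    clock -s    # set the system time from the hardware clock

Check that the nodes agree:

    pssh -ih iplist 'date'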
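Comparing date output by eye gets tedious with more than a few nodes. The sketch below reduces a list of epoch timestamps to a single skew figure; max_skew is a hypothetical helper, and the sample input imitates the status lines that pssh -i prints before each host's output:

```shell
#!/bin/sh
# Read text on stdin, keep only lines that are a bare integer (epoch seconds),
# and print the difference between the largest and smallest value.
max_skew() {
    awk '/^[0-9]+$/ {
             if (n == 0 || $1 < min) min = $1
             if (n == 0 || $1 > max) max = $1
             n++
         }
         END { if (n > 0) printf "%d\n", max - min }'
}

# Sample input: a pssh status line, then the command output, per host.
printf '[1] 20:00:01 [SUCCESS] node01\n1604896147\n[2] 20:00:01 [SUCCESS] node02\n1604896149\n' | max_skew
# prints 2
```

In practice the input would come from pssh -ih iplist 'date +%s' | max_skew. The remote commands do not start at exactly the same instant, so a result of a second or so does not by itself indicate a chrony problem.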

4. Configure hadoop.sh

    [root@node01 ~]# cat /etc/profile.d/hadoop.sh
    export HADOOP_PREFIX=/bdapps/hadoop
    export PATH=$PATH:${HADOOP_PREFIX}/bin:${HADOOP_PREFIX}/sbin
    export HADOOP_YARN_HOME=${HADOOP_PREFIX}
    export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
    export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
    export HADOOP_HDFS_HOME=${HADOOP_PREFIX}

Load it, then copy both profile scripts to the other nodes:

    . /etc/profile.d/hadoop.sh
    for i in node0{2..4}; do scp /etc/profile.d/java.sh /etc/profile.d/hadoop.sh $i:/etc/profile.d/; done
    pssh -ih iplist ". /etc/profile.d/java.sh ; . /etc/profile.d/hadoop.sh"
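A misspelled variable name in hadoop.sh fails silently: the shell exports the wrong name and the variable Hadoop actually reads (for example HADOOP_MAPRED_HOME) stays empty. A sketch of a check that catches this, assuming the five variables above are the ones that matter; check_hadoop_env is a hypothetical helper:

```shell
#!/bin/sh
# Print the name of every listed variable that is unset or empty;
# return non-zero if any is missing.
check_hadoop_env() {
    missing=0
    for name in HADOOP_PREFIX HADOOP_YARN_HOME HADOOP_MAPRED_HOME \
                HADOOP_COMMON_HOME HADOOP_HDFS_HOME; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name"
            missing=1
        fi
    done
    return "$missing"
}

if check_hadoop_env; then
    echo "all Hadoop variables are set"
fi
```

Run it in a fresh login shell so that it tests what /etc/profile.d really provides rather than what the current session happens to have exported.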

5. Create the user (the same hadoop user on all four nodes)

On a single node:

    groupadd hadoop
    useradd -g hadoop hadoop
    echo 'magedu' | passwd --stdin hadoop
    su - hadoop

Or on all nodes with pssh:

    pssh -ih iplist "groupadd hadoop"
    pssh -ih iplist "useradd -g hadoop hadoop"
    pssh -ih iplist "echo 'centos' | passwd --stdin hadoop"

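6. Passwordless SSH for the hadoop user

    su - hadoop
    for i in 1 2 3 4; do ssh-copy-id -i .ssh/id_rsa.pub hadoop@node0${i}; done

ssh-copy-id needs an existing key pair; if .ssh/id_rsa.pub is missing, create it first with ssh-keygen -t rsa.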
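The key distribution can be rehearsed before anything is sent to the other nodes. The sketch below prints the commands by default and only executes them when DRY_RUN=0; the run and setup_ssh helpers and the DRY_RUN variable are hypothetical, and the empty passphrase (-N '') is an assumption that suits unattended cluster start-up:

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create an RSA key pair if none exists, then push the public key to every node.
setup_ssh() {
    key="$HOME/.ssh/id_rsa"
    if [ ! -f "$key" ]; then
        run ssh-keygen -q -t rsa -N '' -f "$key"
    fi
    for i in 1 2 3 4; do
        run ssh-copy-id -i "$key.pub" "hadoop@node0$i"
    done
}

setup_ssh
```

node01 is included in the loop on purpose: the start scripts also log in to the local machine over SSH.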

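7. sudo access (visudo)

Use visudo to check that the wheel group may run sudo, then add hadoop to wheel:

    usermod -aG wheel hadoop
    pssh -ih iplist 'usermod -aG wheel hadoop'
    pssh -ih iplist 'id hadoop'

(useradd -g wheel hadoop is an alternative only when the user does not exist yet.)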

8. Create the data and configuration directories

    [root@node01 ~]# pssh -ih iplist 'mkdir -pv /bdapps /data/hadoop/hdfs/{nn,snn,dn}'
    [root@node01 ~]# pssh -ih iplist 'chown -R hadoop:hadoop /data/hadoop/hdfs'
    [root@node01 ~]# pssh -ih iplist 'tar xf hadoop-2.7.3.tar.gz -C /bdapps'
    [root@node01 ~]# pssh -ih iplist 'ln -sv /bdapps/hadoop-2.7.3 /bdapps/hadoop'
    [root@node01 ~]# pssh -ih iplist 'ls -l /bdapps/hadoop'
    [root@node01 ~]# pssh -ih iplist 'cd /bdapps/hadoop ; mkdir logs ; chmod g+w logs'
    [root@node01 ~]# pssh -ih iplist 'chown -R hadoop:hadoop /bdapps'
    [root@node01 ~]# pssh -ih iplist 'ls -ld /bdapps/hadoop/logs'

To copy the whole configuration directory to a single node:

    scp /bdapps/hadoop/etc/hadoop/* node03:/bdapps/hadoop/etc/hadoop

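9. HDFS configuration files

    [root@node01 ~]# cd /bdapps/hadoop/etc/hadoop/
    [root@node01 hadoop]# cat core-site.xml
    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://node01:8020</value>
            <final>true</final>
        </property>
    </configuration>

    [root@node01 hadoop]# cat hdfs-site.xml
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>2</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>file:///data/hadoop/hdfs/nn</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>file:///data/hadoop/hdfs/dn</value>
        </property>
        <property>
            <name>fs.checkpoint.dir</name>
            <value>file:///data/hadoop/hdfs/snn</value>
        </property>
        <property>
            <name>fs.checkpoint.edits.dir</name>
            <value>file:///data/hadoop/hdfs/snn</value>
        </property>
    </configuration>

    [root@node01 hadoop]# cat mapred-site.xml
    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>

    [root@node01 hadoop]# cat yarn-site.xml
    <configuration>
        <property>
            <name>yarn.resourcemanager.address</name>
            <value>node01:8032</value>
        </property>
        <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>node01:8030</value>
        </property>
        <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>node01:8031</value>
        </property>
        <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>node01:8033</value>
        </property>
        <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>node01:8088</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
            <name>yarn.resourcemanager.scheduler.class</name>
            <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
        </property>
    </configuration>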
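Each of these files is a configuration element wrapping property elements that carry a name, a value and an optional final flag. Writing them by hand invites unbalanced tags, so one option is to generate them. A minimal sketch that emits the core-site.xml used here; the property and core_site function names are hypothetical:

```shell
#!/bin/sh
# Emit one Hadoop <property> element. The third argument is optional;
# pass "true" to mark the property final.
property() {
    printf '    <property>\n'
    printf '        <name>%s</name>\n' "$1"
    printf '        <value>%s</value>\n' "$2"
    if [ -n "${3:-}" ]; then
        printf '        <final>%s</final>\n' "$3"
    fi
    printf '    </property>\n'
}

# Emit a complete core-site.xml with the value used in this walkthrough.
core_site() {
    printf '<?xml version="1.0"?>\n<configuration>\n'
    property fs.defaultFS hdfs://node01:8020 true
    printf '</configuration>\n'
}

core_site
```

Once Hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS prints the value the client actually resolves, which is a quick way to confirm the file was picked up.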

10. Copy the configuration files

    [root@node01 hadoop]# su - hadoop
    [hadoop@node01 ~]$ cd /bdapps/hadoop/etc/hadoop/
    [hadoop@node01 hadoop]$ for i in node0{2..4}; do scp core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml $i:/bdapps/hadoop/etc/hadoop/; done
    [root@node01 ~]# pssh -ih iplist 'ls -l /bdapps/hadoop/etc/hadoop/'
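A directory listing shows that the files arrived, not that they are identical. Comparing checksums is stricter. The sketch below decides whether a set of md5sum lines all carry the same hash; all_same is a hypothetical helper, and the hashes in the sample are placeholders rather than checksums of the real files:

```shell
#!/bin/sh
# Read md5sum-style lines ("<hash>  <path>") on stdin, ignore anything else,
# and succeed only if exactly one distinct hash was seen.
all_same() {
    distinct=$(awk 'length($1) == 32 && $1 ~ /^[0-9a-f]+$/ { print $1 }' |
               sort -u | wc -l | tr -d ' ')
    [ "$distinct" = "1" ]
}

# Sample input: two nodes reporting the same placeholder hash.
if printf '%s  core-site.xml\n' \
        d41d8cd98f00b204e9800998ecf8427e \
        d41d8cd98f00b204e9800998ecf8427e | all_same; then
    echo "identical on all nodes"
else
    echo "files differ"
fi
```

Against the cluster, the input would come from pssh -ih iplist 'md5sum /bdapps/hadoop/etc/hadoop/core-site.xml' | all_same, repeated for each of the four files; the pssh status lines are skipped because their first field is not a 32-character hash.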

11. Format the NameNode (once, on node01, as the hadoop user)

    su - hadoop
    hdfs namenode -format
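Formatting a second time is what causes the clusterID mismatch described under troubleshooting, so it is worth guarding the command. A formatted NameNode directory contains current/VERSION; the sketch below tests for that file. needs_format is a hypothetical helper and the path is the one configured in hdfs-site.xml above:

```shell
#!/bin/sh
# Succeed if the NameNode directory has not been formatted yet,
# i.e. it holds no current/VERSION file.
needs_format() {
    [ ! -f "$1/current/VERSION" ]
}

nn_dir=/data/hadoop/hdfs/nn
if needs_format "$nn_dir"; then
    echo "$nn_dir is not formatted: run 'hdfs namenode -format' once"
else
    echo "$nn_dir is already formatted: do not format again"
fi
```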

12. Start the services

Each daemon can be started or stopped on its own:

    NameNode:            hadoop-daemon.sh (start|stop) namenode
    DataNode:            hadoop-daemon.sh (start|stop) datanode
    Secondary NameNode:  hadoop-daemon.sh (start|stop) secondarynamenode
    ResourceManager:     yarn-daemon.sh (start|stop) resourcemanager
    NodeManager:         yarn-daemon.sh (start|stop) nodemanager

jps lists the running Java processes; jps -v also shows the arguments passed to each JVM.

The sbin directory also contains scripts that start or stop a whole layer at once:

    [hadoop@node01 mapreduce]$ cd /bdapps/hadoop/sbin/
    [hadoop@node01 sbin]$ ls
    distribute-exclude.sh  kms.sh                   start-balancer.sh    stop-all.cmd      stop-yarn.cmd
    hadoop-daemon.sh       mr-jobhistory-daemon.sh  start-dfs.cmd        stop-all.sh       stop-yarn.sh
    hadoop-daemons.sh      refresh-namenodes.sh     start-dfs.sh         stop-balancer.sh  yarn-daemon.sh
    hdfs-config.cmd        slaves.sh                start-secure-dns.sh  stop-dfs.cmd      yarn-daemons.sh
    hdfs-config.sh         start-all.cmd            start-yarn.cmd       stop-dfs.sh
    start-all.sh           start-yarn.sh            stop-secure-dns.sh

    [hadoop@node01 sbin]$ stop-yarn.sh
    [hadoop@node01 sbin]$ stop-dfs.sh
    [hadoop@node01 sbin]$ start-dfs.sh
    [hadoop@node01 sbin]$ start-yarn.sh
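After the start scripts return, jps on each node should list the daemons that node is meant to run. The sketch below reports the ones that are absent; missing_daemons is a hypothetical helper and the sample jps output is made up for illustration:

```shell
#!/bin/sh
# Read jps output ("<pid> <class>") on stdin and print each expected
# daemon that is absent. Expected names are passed as arguments.
missing_daemons() {
    running=$(awk '{ print $2 }')
    for d in "$@"; do
        echo "$running" | grep -qx "$d" || echo "$d"
    done
}

# Sample jps output from a node that runs only the master daemons.
printf '2101 NameNode\n2290 SecondaryNameNode\n2450 ResourceManager\n3012 Jps\n' |
    missing_daemons NameNode SecondaryNameNode ResourceManager DataNode NodeManager
```

With this sample the function prints DataNode and NodeManager. On the cluster it would be fed from jps directly, for example jps | missing_daemons DataNode NodeManager on a worker node; which daemons belong on which node is an assumption that depends on the cluster layout.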

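13. Test Hadoop

    hdfs dfs -mkdir /test
    hdfs dfs -put /etc/fstab /test/fstab
    hdfs dfs -cat /test/fstab
    hdfs dfs -put /etc/rc.d/init.d/functions /test/
    hdfs dfs -put /etc/issue hdfs://node01:8020/test/

When the target is given as a full URI, it must name the NameNode configured in fs.defaultFS.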

14. Troubleshooting

(1) A DataNode keeps logging connection retries to the NameNode:

    [hadoop@node02 dn]$ tail /bdapps/hadoop/logs/hadoop-hadoop-datanode-node02.magedu.com.log
    2020-11-08 19:34:49,420 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.100.10:8020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Cause: the firewall had not been stopped. Fix:

    sudo systemctl stop firewalld

(2) Reformatting the NameNode gives it a new clusterID, which then no longer matches the clusterID stored on the DataNodes.
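Both the NameNode and each DataNode record the clusterID in a VERSION file under their storage directory, as a line of the form clusterID=... A sketch for reading and comparing the two values; cluster_id and same_cluster are hypothetical helpers, and the paths in the comments follow the directories created in step 8:

```shell
#!/bin/sh
# Print the clusterID recorded in a Hadoop storage VERSION file.
cluster_id() {
    sed -n 's/^clusterID=//p' "$1"
}

# Succeed if both VERSION files carry the same, non-empty clusterID.
same_cluster() {
    a=$(cluster_id "$1")
    b=$(cluster_id "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage:
#   cluster_id /data/hadoop/hdfs/nn/current/VERSION    # on node01
#   cluster_id /data/hadoop/hdfs/dn/current/VERSION    # on each DataNode
```

If the values differ, the usual remedies are to stop the DataNode and either set its clusterID to the NameNode's value or, on a test cluster with no data worth keeping, empty the dn directory and start the DataNode again so that it registers afresh.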

Output of a word count job written to /test/fstab.out:

    [hadoop@node01 mapreduce]$ hdfs dfs -ls /test/fstab.out/
    Found 2 items
    -rw-r--r--   2 hadoop supergroup          0 2020-11-08 20:29 /test/fstab.out/_SUCCESS
    -rw-r--r--   2 hadoop supergroup        432 2020-11-08 20:29 /test/fstab.out/part-r-00000

    [hadoop@node01 mapreduce]$ hdfs dfs -cat /test/fstab.out/part-r-00000
    '/dev/disk'   1
    /             1
    /boot         1
    /etc/fstab    1
    0             6
    08:19:43      1
    2020          1
    5             1
    Accessible    1
    Created       1
    Nov           1
    See           1
    Thu           1

YARN:

    [hadoop@node01 sbin]$ yarn application -list -appStates=ALL
    20/11/08 20:40:02 INFO client.RMProxy: Connecting to ResourceManager at node01/192.168.100.10:8032
    Total number of applications (application-types: [] and states: [NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED]):1
    Application-Id                  Application-Name  Application-Type  User    Queue    State     Final-State  Progress  Tracking-URL
    application_1604890022094_0001  word count        MAPREDUCE         hadoop  default  FINISHED  SUCCEEDED    100%

    [hadoop@node01 sbin]$ yarn application -list
    20/11/08 20:40:26 INFO client.RMProxy: Connecting to ResourceManager at node01/192.168.100.10:8032
    Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):0
    Application-Id  Application-Name  Application-Type  User  Queue  State  Final-State  Progress  Tracking-URL
    [hadoop@node01 sbin]$

    [hadoop@node01 sbin]$ yarn application -status application_1604890022094_0001
    20/11/08 20:44:49 INFO client.RMProxy: Connecting to ResourceManager at node01/192.168.100.10:8032
    Application Report :
        Application-Id : application_1604890022094_0001
        Application-Name : word count
        Application-Type : MAPREDUCE
        User : hadoop
        Queue : default
        Start-Time : 1604896147557
        Finish-Time : 1604896174398
        Progress : 100%
        State : FINISHED
        Final-State : SUCCEEDED
        Tracking-URL :
        Port : 39072
        AM Host : node01.magedu.com
        Aggregate Resource Allocation : 81943 MB-seconds, 46 vcore-seconds
        Diagnostics :
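Start-Time and Finish-Time in the report are epoch timestamps in milliseconds, so the job's run time is their difference. A sketch that extracts it from the report text; app_duration_ms is a hypothetical helper, and the sample values are the two timestamps from the report above:

```shell
#!/bin/sh
# Read "yarn application -status" output on stdin and print the run time
# in milliseconds (Finish-Time minus Start-Time, both epoch milliseconds).
app_duration_ms() {
    awk -F' : ' '
        /Start-Time/  { start = $2 }
        /Finish-Time/ { finish = $2 }
        END { printf "%d\n", finish - start }'
}

printf 'Start-Time : 1604896147557\nFinish-Time : 1604896174398\n' | app_duration_ms
# prints 26841
```

The word count job therefore ran for about 26.8 seconds. On the cluster the input would be piped in directly: yarn application -status application_1604890022094_0001 | app_duration_ms.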

Killing an application (-kill):

    [hadoop@node01 sbin]$ yarn application -kill application_1604890022094_0001
    20/11/08 20:47:24 INFO client.RMProxy: Connecting to ResourceManager at node01/192.168.100.10:8032
    Application application_1604890022094_0001 has already finished

Other yarn subcommands:

    node:       yarn node -list
                yarn node -status $node-id
    logs:       yarn logs -applicationId application_1444_01
    classpath:  yarn classpath
    RMAdmin:    yarn rmadmin -refreshNodes
    DaemonLog

    [hadoop@node01 sbin]$ yarn node -list
    20/11/08 20:49:48 INFO client.RMProxy: Connecting to ResourceManager at node01/192.168.100.10:8032
    Total Nodes:4
    Node-Id                  Node-State  Node-Http-Address       Number-of-Running-Containers
    node01.magedu.com:44801  RUNNING     node01.magedu.com:8042  0
    node02.magedu.com:35211  RUNNING     node02.magedu.com:8042  0
    node03.magedu.com:38156  RUNNING     node03.magedu.com:8042  0
    node04.magedu.com:38078  RUNNING     node04.magedu.com:8042  0

    [hadoop@node01 sbin]$ yarn node -status node01.magedu.com:44801
    20/11/08 20:51:53 INFO client.RMProxy: Connecting to ResourceManager at node01/192.168.100.10:8032
    Node Report :
        Node-Id : node01.magedu.com:44801
        Rack : /default-rack
        Node-State : RUNNING
        Node-Http-Address : node01.magedu.com:8042
        Last-Health-Update : Sun 08/Nov/20 08:50:39:428PST
        Health-Report :
        Containers : 0
        Memory-Used : 0MB
        Memory-Capacity : 8192MB
        CPU-Used : 0 vcores
        CPU-Capacity : 8 vcores
        Node-Labels :

    [hadoop@node01 sbin]$ yarn logs -applicationId application_1604890022094_0001
    20/11/08 20:53:55 INFO client.RMProxy: Connecting to ResourceManager at node01/192.168.100.10:8032
    /tmp/logs/hadoop/logs/application_1604890022094_0001 does not exist.
    Log aggregation has not completed or is not enabled.

    [hadoop@node01 sbin]$ yarn rmadmin -refreshNodes
    20/11/08 20:55:51 INFO client.RMProxy: Connecting to ResourceManager at node01/192.168.100.10:8033

Running a YARN application
