Hadoop 3-Node High-Availability (HA) Distributed Installation
1. First, plan which services and processes run on each node:

IP | Host | Software | Processes
10.10.10.5 | master | hadoop, zookeeper | NameNode, DFSZKFailoverController, JournalNode, DataNode, ResourceManager, JobHistoryServer, NodeManager, QuorumPeerMain
10.10.10.6 | slave1 | hadoop, zookeeper | NameNode, DFSZKFailoverController, JournalNode, DataNode, ResourceManager, NodeManager, QuorumPeerMain
10.10.10.7 | slave2 | hadoop, zookeeper | JournalNode, DataNode, NodeManager, QuorumPeerMain
Environment preparation
Disable the firewall:
systemctl stop iptables.service
systemctl disable iptables.service
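If the machines run firewalld instead of the iptables service (the default on CentOS 7, for example), the equivalent commands would be:
systemctl stop firewalld.service
systemctl disable firewalld.service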
1. Upload the installation packages hadoop-2.6.0-cdh5.16.2.tar.gz and zookeeper-3.4.5-cdh5.16.2.tar.gz to the /opt/soft directory.
2. Set the hostnames
On master:
hostname master
vi /etc/sysconfig/network
On slave1:
hostname slave1
On slave2:
hostname slave2
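A minimal sketch of what /etc/sysconfig/network would contain on master after the edit (on CentOS 6-style systems this is what makes the hostname persist across reboots); slave1 and slave2 use their own names:
NETWORKING=yes
HOSTNAME=master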
Configure the IP-to-hostname mapping:
vim /etc/hosts
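A sketch of the entries to add, using the IPs from the plan above (the slave2 address 10.10.10.7 is an assumption):
10.10.10.5 master
10.10.10.6 slave1
10.10.10.7 slave2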
Then send the modified file to slave1 and slave2:
scp /etc/hosts root@slave1:/etc/
scp /etc/hosts root@slave2:/etc/
I have already set up passwordless SSH (mutual trust) between the three servers, so the file can be copied directly; if that does not work for you, configure SSH key-based trust first (a quick sketch follows).
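One common way to set up the trust, assuming the root account is used on all three nodes; run on each node:
ssh-keygen -t rsa        # press Enter through the prompts
ssh-copy-id root@master
ssh-copy-id root@slave1
ssh-copy-id root@slave2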
3. Configure the environment for the JDK, Hadoop, and ZooKeeper
In my setup the unpacked JDK is at /usr/local/jdk, Hadoop at /opt/soft2/hadoop, and ZooKeeper at /opt/soft2/zookeeper.
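One way to expose these paths, assuming the variables are appended to /etc/profile on every node and then loaded with source /etc/profile:
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/opt/soft2/hadoop
export ZOOKEEPER_HOME=/opt/soft2/zookeeper
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$PATH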
4. Modify the ZooKeeper configuration
cd /opt/soft2/zookeeper/conf
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg
The main change is dataDir, the path where ZooKeeper stores its data; create that directory first:
mkdir /opt/soft2/zookeeper/zkData
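A minimal sketch of the relevant zoo.cfg settings, keeping the defaults from zoo_sample.cfg and assuming the usual 2888/3888 quorum and election ports; the server.N numbers are what the myid files below must match:
dataDir=/opt/soft2/zookeeper/zkData
clientPort=2181
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888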
Copy the whole zookeeper directory to the other nodes:
scp -r zookeeper slave1:/opt/soft2/
scp -r zookeeper slave2:/opt/soft2/
Then, in the data directory of each node, create the myid file according to the server IDs in the configuration file:
On master:  echo 1 > /opt/soft2/zookeeper/zkData/myid
On slave1:  echo 2 > /opt/soft2/zookeeper/zkData/myid
On slave2:  echo 3 > /opt/soft2/zookeeper/zkData/myid
Install Hadoop
Modify the Hadoop configuration files:
cd /opt/soft2/hadoop/etc/hadoop
vim hadoop-env.sh
Set the JDK path in this file.
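In hadoop-env.sh the only change needed here is pointing JAVA_HOME at the JDK path from above:
export JAVA_HOME=/usr/local/jdk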
Configure the Hadoop core settings:
vim core-site.xml
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
  <property><name>fs.trash.checkpoint.interval</name><value>0</value></property>
  <property><name>fs.trash.interval</name><value>10080</value></property>
  <property><name>hadoop.tmp.dir</name><value>/opt/soft2/hadoop/data</value></property>
  <property><name>ha.zookeeper.quorum</name><value>master:2181,slave1:2181,slave2:2181</value></property>
  <property><name>ha.zookeeper.session-timeout.ms</name><value>2000</value></property>
  <property><name>hadoop.proxyuser.hadoop.hosts</name><value>*</value></property>
  <property><name>hadoop.proxyuser.hadoop.groups</name><value>*</value></property>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
</configuration>
Configure HDFS:
vim hdfs-site.xml
<configuration>
  <property><name>dfs.permissions.superusergroup</name><value>hadoop</value></property>
  <property><name>dfs.webhdfs.enabled</name><value>true</value></property>
  <!-- local directory where the NameNode stores the name table (fsimage); change as needed -->
  <property><name>dfs.namenode.name.dir</name><value>/opt/soft2/hadoop/data/dfsname</value></property>
  <!-- local directory where the NameNode stores the transaction files (edits); change as needed -->
  <property><name>dfs.namenode.edits.dir</name><value>${dfs.namenode.name.dir}</value></property>
  <!-- local directory where the DataNode stores blocks; change as needed -->
  <property><name>dfs.datanode.data.dir</name><value>/opt/soft2/hadoop/data/dfsdata</value></property>
  <property><name>dfs.replication</name><value>3</value></property>
  <property><name>dfs.blocksize</name><value>134217728</value></property>
  <!-- HDFS HA: nameservice and NameNode addresses -->
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>master:8020</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>slave1:8020</value></property>
  <!-- NameNode edit log synchronization: the JournalNodes used to store the edit log -->
  <property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://master:8485;slave1:8485;slave2:8485/mycluster</value></property>
  <property><name>dfs.journalnode.edits.dir</name><value>/home/hadoop/data/dfs/jn</value></property>
  <property><name>dfs.client.failover.proxy.provider.mycluster</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
  <property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/home/hadoop/.ssh/id_rsa</value></property>
  <property><name>dfs.ha.fencing.ssh.connect-timeout</name><value>30000</value></property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
  <property><name>dfs.hosts</name><value>/opt/soft2/hadoop/etc/hadoop/slaves</value></property>
</configuration>
Modify mapred-site.xml
This file does not exist by default, so create it from the template:
cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>mapreduce.jobhistory.address</name><value>master:10020</value></property>
  <property><name>mapreduce.jobhistory.webapp.address</name><value>master:19888</value></property>
  <property><name>mapreduce.map.output.compress</name><value>true</value></property>
  <property><name>mapreduce.map.output.compress.codec</name><value>org.apache.hadoop.io.compress.SnappyCodec</value></property>
</configuration>
vim slaves, then add the following hosts:
master
slave1
slave2
vim yarn-env.sh
vim yarn-site.xml
<configuration>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>
  <!-- Address where the localizer IPC is -->
  <property><name>yarn.nodemanager.localizer.address</name><value>0.0.0.0:23344</value></property>
  <!-- NM Webapp address -->
  <property><name>yarn.nodemanager.webapp.address</name><value>0.0.0.0:23999</value></property>
  <property><name>yarn.resourcemanager.connect.retry-interval.ms</name><value>2000</value></property>
  <!-- ResourceManager HA -->
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.ha.automatic-failover.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.ha.automatic-failover.embedded</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>yarn-cluster</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.scheduler.class</name><value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value></property>
  <property><name>yarn.resourcemanager.recovery.enabled</name><value>true</value></property>
  <property><name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name><value>5000</value></property>
  <!-- RM state store in ZooKeeper -->
  <property><name>yarn.resourcemanager.store.class</name><value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>master:2181,slave1:2181,slave2:2181</value></property>
  <property><name>yarn.resourcemanager.zk.state-store.address</name><value>master:2181,slave1:2181,slave2:2181</value></property>
  <!-- rm1 runs on master, rm2 on slave1 -->
  <property><name>yarn.resourcemanager.address.rm1</name><value>master:23140</value></property>
  <property><name>yarn.resourcemanager.address.rm2</name><value>slave1:23140</value></property>
  <property><name>yarn.resourcemanager.scheduler.address.rm1</name><value>master:23130</value></property>
  <property><name>yarn.resourcemanager.scheduler.address.rm2</name><value>slave1:23130</value></property>
  <property><name>yarn.resourcemanager.admin.address.rm1</name><value>master:23141</value></property>
  <property><name>yarn.resourcemanager.admin.address.rm2</name><value>slave1:23141</value></property>
  <property><name>yarn.resourcemanager.resource-tracker.address.rm1</name><value>master:23125</value></property>
  <property><name>yarn.resourcemanager.resource-tracker.address.rm2</name><value>slave1:23125</value></property>
  <property><name>yarn.resourcemanager.webapp.address.rm1</name><value>master:8088</value></property>
  <property><name>yarn.resourcemanager.webapp.address.rm2</name><value>slave1:8088</value></property>
</configuration>
Start ZooKeeper
On every node, start ZooKeeper with zkServer.sh start and check its state with zkServer.sh status.
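Before starting the cluster, make sure the configured Hadoop directory is present on slave1 and slave2 as well; a sketch, assuming the same /opt/soft2 layout on every node:
scp -r /opt/soft2/hadoop slave1:/opt/soft2/
scp -r /opt/soft2/hadoop slave2:/opt/soft2/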
Start Hadoop (HDFS + YARN)
1. Start the JournalNode daemon on all three machines:
hadoop-daemon.sh start journalnode
2. Format HDFS (run this on master only):
hadoop namenode -format
Send the generated metadata to the standby NameNode node (slave1):
[root@master hadoop]# scp -r data slave1:/opt/soft2/hadoop/
fsimage_0000000000000000000 100% 317 0.3KB/s 00:00
VERSION 100% 202 0.2KB/s 00:00
fsimage_0000000000000000000.md5 100% 62 0.1KB/s 00:00
seen_txid
3. Format the ZKFC state in ZooKeeper:
hdfs zkfc -formatZK
4. Start the HDFS distributed file system:
start-dfs.sh
5. Start YARN:
start-yarn.sh
On slave1, start the standby ResourceManager separately:
yarn-daemon.sh start resourcemanager
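The plan above also places a JobHistoryServer on master; a sketch of starting it, assuming the standard sbin script shipped with Hadoop 2.x:
mr-jobhistory-daemon.sh start historyserver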
Shut down the cluster
Stop YARN: stop-yarn.sh
Stop HDFS: stop-dfs.sh
Stop ZooKeeper: run zkServer.sh stop on every node
Start the cluster
1. Start ZooKeeper: run zkServer.sh start on every node
2. Start Hadoop: start-dfs.sh, then start-yarn.sh; on the standby ResourceManager node run yarn-daemon.sh start resourcemanager
Monitor the cluster: hdfs dfsadmin -report
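Beyond dfsadmin -report, the HA state of each NameNode and ResourceManager can also be checked; a quick sketch using the nn1/nn2 and rm1/rm2 IDs configured above:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2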