[Ceph troubleshooting] After adding an OSD to a Ceph cluster, the OSD stays down

Reader submission, 2022-11-12

Background

After adding an OSD to the Ceph cluster, that OSD stayed in the down state.

Error messages

The status checks were as follows:

1. Check the OSD tree
[root@node1 Asia]# ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.05388 root default
-2 0.01469     host node1
 0 0.00490         osd.0       up  1.00000          1.00000
 1 0.00490         osd.1       up  1.00000          1.00000
 2 0.00490         osd.2       up  1.00000          1.00000
-3 0.01959     host node2
 4 0.00490         osd.4       up  1.00000          1.00000
 5 0.00490         osd.5       up  1.00000          1.00000
 6 0.00490         osd.6       up  1.00000          1.00000
 7 0.00490         osd.7       up  1.00000          1.00000
-4 0.01959     host node3
 8 0.00490         osd.8       up  1.00000          1.00000
 9 0.00490         osd.9       up  1.00000          1.00000
 3 0.00490         osd.3       up  1.00000          1.00000
10 0.00490         osd.10      up  1.00000          1.00000
11       0 osd.11            down        0          1.00000
[root@node1 Asia]#

2. Check the OSD service status
[root@node1 /]# systemctl status ceph-osd@11
● ceph-osd@11.service - Ceph object storage daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
   Active: failed (Result: start-limit) since Sun 2018-09-09 22:15:25 EDT; 4h 57min ago
  Process: 10331 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=1/FAILURE)

Sep 09 22:15:05 node1 systemd[1]: ceph-osd@11.service: control process exited, code=exited status=1
Sep 09 22:15:05 node1 systemd[1]: Failed to start Ceph object storage daemon.
Sep 09 22:15:05 node1 systemd[1]: Unit ceph-osd@11.service entered failed state.
Sep 09 22:15:05 node1 systemd[1]: ceph-osd@11.service failed.
Sep 09 22:15:25 node1 systemd[1]: ceph-osd@11.service holdoff time over, scheduling restart.
Sep 09 22:15:25 node1 systemd[1]: start request repeated too quickly for ceph-osd@11.service
Sep 09 22:15:25 node1 systemd[1]: Failed to start Ceph object storage daemon.
Sep 09 22:15:25 node1 systemd[1]: Unit ceph-osd@11.service entered failed state.
Sep 09 22:15:25 node1 systemd[1]: ceph-osd@11.service failed.

3. Try to start the OSD
[root@node1 /]# systemctl start ceph-osd@11
Job for ceph-osd@11.service failed because the control process exited with error code. See "systemctl status ceph-osd@11.service" and "journalctl -xe" for details.

4. Look at the error
[root@node1 /]# journalctl -xe
Sep 10 03:12:52 node1 polkitd[723]: Unregistered Authentication Agent for unix-process:10473:4129481 (system bus name :1.52, object p
Sep 10 03:13:12 node1 systemd[1]: ceph-osd@11.service holdoff time over, scheduling restart.
Sep 10 03:13:12 node1 systemd[1]: Starting Ceph object storage daemon...
-- Subject: Unit ceph-osd@11.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit ceph-osd@11.service has begun starting up.
Sep 10 03:13:12 node1 ceph-osd-prestart.sh[10483]: OSD data directory /var/lib/ceph/osd/ceph-11 does not exist; bailing out.
Sep 10 03:13:12 node1 systemd[1]: ceph-osd@11.service: control process exited, code=exited status=1
Sep 10 03:13:12 node1 systemd[1]: Failed to start Ceph object storage daemon.
-- Subject: Unit ceph-osd@11.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit ceph-osd@11.service has failed.
--
-- The result is failed.
Sep 10 03:13:12 node1 systemd[1]: Unit ceph-osd@11.service entered failed state.
Sep 10 03:13:12 node1 systemd[1]: ceph-osd@11.service failed.
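The checks in this section can be condensed into a small shell sketch. It assumes the whitespace-separated column layout that `ceph osd tree` prints on this (Jewel-era) cluster, with the OSD name in the third column and the up/down state in the fourth, and it takes the data directory path from the journal message. The function names `down_osds` and `osd_data_dir_exists`, and the optional base-directory argument, are hypothetical helpers for illustration, not part of Ceph.

```shell
#!/bin/sh
# Print the names of OSDs that `ceph osd tree` reports as down.
# Reads the tree output on stdin; assumes NAME is column 3 and UP/DOWN is column 4.
down_osds() {
    awk '$3 ~ /^osd\.[0-9]+$/ && $4 == "down" { print $3 }'
}

# Succeed if the data directory for OSD id $1 exists.
# $2 is an optional base directory (assumption: defaults to the path seen in the journal).
osd_data_dir_exists() {
    [ -d "${2:-/var/lib/ceph/osd}/ceph-$1" ]
}

# Example use on a cluster node (needs a working `ceph` CLI):
#   ceph osd tree | down_osds
#   osd_data_dir_exists 11 || echo "data directory for osd.11 is missing"
```

If the directory check fails for a down OSD, the prestart script will bail out exactly as in the journal, so there is little point in retrying `systemctl start` until the directory is present.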

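To be honest, I don't know the root cause of the error above; the journal only shows that the prestart script could not find the OSD data directory /var/lib/ceph/osd/ceph-11. According to my notes, though, the whole cluster was in the ERR state when this OSD was added.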

Resolution

The cluster status at the time the OSD was added:

[root@node1 ceph]# ceph -s
    cluster 8eaa3f15-0946-4500-b018-6d31d1cc69f6
     health HEALTH_ERR
            clock skew detected on mon.node2, mon.node3
            121 pgs are stuck inactive for more than 300 seconds
            121 pgs peering
            121 pgs stuck inactive
            121 pgs stuck unclean
            Monitor clock skew detected
     monmap e1: 3 mons at {node1=192.168.209.100:6789/0,node2=192.168.209.101:6789/0,node3=192.168.209.102:6789/0}
            election epoch 266, quorum 0,1,2 node1,node2,node3
     osdmap e5602: 12 osds: 11 up, 11 in; 120 remapped pgs
            flags sortbitwise,require_jewel_osds
      pgmap v16259: 128 pgs, 1 pools, 0 bytes data, 0 objects
            1421 MB used, 54777 MB / 56198 MB avail
                 120 remapped+peering
                   7 active+clean
                   1 peering

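Adding an OSD while the cluster is in the ERR state can cause problems, so I decided to remove the OSD and add it again (the cluster is currently HEALTH_OK).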
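One way to avoid repeating the mistake is to gate the add on the cluster's health. The following is a minimal sketch, assuming the one-line summary printed by `ceph health` starts with the status word (HEALTH_OK, HEALTH_WARN or HEALTH_ERR); the function name `health_is_ok` is a hypothetical helper, not a Ceph command.

```shell
#!/bin/sh
# Succeed only if the first word of the health summary on stdin is HEALTH_OK.
health_is_ok() {
    status=$(awk 'NR==1 { print $1 }')
    [ "$status" = "HEALTH_OK" ]
}

# Example use before adding an OSD (needs a working `ceph` CLI):
#   if ceph health | health_is_ok; then
#       ceph-deploy --overwrite-conf osd create node1:/dev/sde
#   else
#       echo "cluster is not HEALTH_OK; fix that first" >&2
#   fi
```

Anything other than HEALTH_OK is treated as a reason to stop, which is stricter than necessary but matches the caution described here.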

Removal steps:

1. Cluster status
[root@node1 ceph]# ceph -s
    cluster 8eaa3f15-0946-4500-b018-6d31d1cc69f6
     health HEALTH_OK
     monmap e1: 3 mons at {node1=192.168.209.100:6789/0,node2=192.168.209.101:6789/0,node3=192.168.209.102:6789/0}
            election epoch 292, quorum 0,1,2 node1,node2,node3
     osdmap e5664: 12 osds: 12 up, 12 in
            flags sortbitwise,require_jewel_osds
      pgmap v16508: 128 pgs, 1 pools, 0 bytes data, 0 objects
            1500 MB used, 59806 MB / 61307 MB avail
                 128 active+clean

2. Mark the down OSD out of the cluster
[root@node1 /]# ceph osd out osd.11
osd.11 is already out.

3. Remove the down OSD
[root@node1 /]# ceph osd rm osd.11
removed osd.11

4. Remove the down OSD from the CRUSH map
[root@node1 /]# ceph osd crush rm osd.11
device 'osd.11' does not appear in the crush map

5. Delete the OSD's authentication key
[root@node1 /]# ceph auth del osd.11
updated
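The removal commands above can be wrapped in a small helper so they are reviewed before anything is run. This is only a sketch: `osd_removal_commands` is a hypothetical function that prints the four commands in the same order used here, and it does not execute them.

```shell
#!/bin/sh
# Print (without running) the removal commands for OSD id $1,
# in the same order as the steps above.
osd_removal_commands() {
    id=$1
    case $id in
        ''|*[!0-9]*)
            echo "usage: osd_removal_commands <numeric-osd-id>" >&2
            return 1
            ;;
    esac
    printf 'ceph osd out osd.%s\n' "$id"
    printf 'ceph osd rm osd.%s\n' "$id"
    printf 'ceph osd crush rm osd.%s\n' "$id"
    printf 'ceph auth del osd.%s\n' "$id"
}

# Example: review the commands for osd.11 before running them by hand.
#   osd_removal_commands 11
```

Here the OSD was already down and out, so `ceph osd rm` succeeded right away; for an OSD whose daemon is still running, it would need to be stopped first, since Ceph refuses to remove an OSD that is still up.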

Steps to add the OSD again:

1. Zap the disk
[root@node1 /]# ceph-disk zap /dev/sde
Caution: invalid backup GPT header, but valid main header; regenerating backup header from main header.
Warning! Main and backup partition tables differ! Use the 'c' and 'e' options on the recovery & transformation menu to examine the two tables.
Warning! One or more CRCs don't match. You should repair the disk!
****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk verification and recovery are STRONGLY recommended.
****************************************************************************
GPT data structures destroyed! You may now partition the disk using fdisk or other utilities.
Creating new GPT entries.
The operation has completed successfully.
[root@node1 /]#

2. Create the OSD
[root@node1 ceph]# ceph-deploy --overwrite-conf osd create node1:/dev/sde
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.39): /usr/bin/ceph-deploy --overwrite-conf osd create node1:/dev/sde
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] block_db : None
[ceph_deploy.cli][INFO ] disk : [('node1', '/dev/sde', None)]
[ceph_deploy.cli][INFO ] dmcrypt : False
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] bluestore : None
[ceph_deploy.cli][INFO ] block_wal : None
[ceph_deploy.cli][INFO ] overwrite_conf : True
[ceph_deploy.cli][INFO ] subcommand : create
[ceph_deploy.cli][INFO ] dmcrypt_key_dir : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf :
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] fs_type : xfs
[ceph_deploy.cli][INFO ] filestore : None
[ceph_deploy.cli][INFO ] func :
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.cli][INFO ] zap_disk : False
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks node1:/dev/sde:
[node1][DEBUG ] connected to host: node1
[node1][DEBUG ] detect platform information from remote host
[node1][DEBUG ] detect machine type
[node1][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO ] Distro info: CentOS Linux 7.5.1804 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to node1
[node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.osd][DEBUG ] Preparing host node1 disk /dev/sde journal None activate True
[node1][DEBUG ] find the location of an executable
[node1][INFO ] Running command: /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /dev/sde
[node1][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[node1][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
[node1][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
[node1][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] set_type: Will colocate journal with data on /dev/sde
[node1][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mkfs_options_xfs
[node1][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mkfs_options_xfs
[node1][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
[node1][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] ptype_tobe_for_name: name = journal
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] create_partition: Creating journal partition num 2 size 5120 on /dev/sde
[node1][WARNIN] command_check_call: Running command: /usr/sbin/sgdisk --new=2:0:+5120M --change-name=2:ceph journal --partition-guid=2:22ea9667-570d-4697-b9dc-21968d31c445 --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt -- /dev/sde
[node1][DEBUG ] The operation has completed successfully.
[node1][WARNIN] update_partition: Calling partprobe on created device /dev/sde
[node1][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
[node1][WARNIN] command: Running command: /usr/bin/flock -s /dev/sde /usr/sbin/partprobe /dev/sde
[node1][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde2 uuid path is /sys/dev/block/8:66/dm/uuid
[node1][WARNIN] prepare_device: Journal is GPT partition /dev/disk/by-partuuid/22ea9667-570d-4697-b9dc-21968d31c445
[node1][WARNIN] prepare_device: Journal is GPT partition /dev/disk/by-partuuid/22ea9667-570d-4697-b9dc-21968d31c445
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] set_data_partition: Creating osd partition on /dev/sde
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] ptype_tobe_for_name: name = data
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] create_partition: Creating data partition num 1 size 0 on /dev/sde
[node1][WARNIN] command_check_call: Running command: /usr/sbin/sgdisk --largest-new=1 --change-name=1:ceph data --partition-guid=1:e9aecd36-93a6-456a-b05f-b8097d16d88d --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be --mbrtogpt -- /dev/sde
[node1][DEBUG ] Warning: The kernel is still using the old partition table.
[node1][DEBUG ] The new table will be used at the next reboot.
[node1][DEBUG ] The operation has completed successfully.
[node1][WARNIN] update_partition: Calling partprobe on created device /dev/sde
[node1][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
[node1][WARNIN] command: Running command: /usr/bin/flock -s /dev/sde /usr/sbin/partprobe /dev/sde
[node1][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde1 uuid path is /sys/dev/block/8:65/dm/uuid
[node1][WARNIN] populate_data_path_device: Creating xfs fs on /dev/sde1
[node1][WARNIN] command_check_call: Running command: /usr/sbin/mkfs -t xfs -f -i size=2048 -- /dev/sde1
[node1][DEBUG ] meta-data=/dev/sde1              isize=2048   agcount=4, agsize=327615 blks
[node1][DEBUG ]          =                       sectsz=512   attr=2, projid32bit=1
[node1][DEBUG ]          =                       crc=1        finobt=0, sparse=0
[node1][DEBUG ] data     =                       bsize=4096   blocks=1310459, imaxpct=25
[node1][DEBUG ]          =                       sunit=0      swidth=0 blks
[node1][DEBUG ] naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
[node1][DEBUG ] log      =internal log           bsize=4096   blocks=2560, version=2
[node1][DEBUG ]          =                       sectsz=512   sunit=0 blks, lazy-count=1
[node1][DEBUG ] realtime =none                   extsz=4096   blocks=0, rtextents=0
[node1][WARNIN] mount: Mounting /dev/sde1 on /var/lib/ceph/tmp/mnt.5St2Fg with options noatime,inode64
[node1][WARNIN] command_check_call: Running command: /usr/bin/mount -t xfs -o noatime,inode64 -- /dev/sde1 /var/lib/ceph/tmp/mnt.5St2Fg
[node1][WARNIN] command: Running command: /usr/sbin/restorecon /var/lib/ceph/tmp/mnt.5St2Fg
[node1][WARNIN] populate_data_path: Preparing osd data dir /var/lib/ceph/tmp/mnt.5St2Fg
[node1][WARNIN] command: Running command: /usr/sbin/restorecon -R /var/lib/ceph/tmp/mnt.5St2Fg/ceph_fsid.10803.tmp
[node1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.5St2Fg/ceph_fsid.10803.tmp
[node1][WARNIN] command: Running command: /usr/sbin/restorecon -R /var/lib/ceph/tmp/mnt.5St2Fg/fsid.10803.tmp
[node1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.5St2Fg/fsid.10803.tmp
[node1][WARNIN] command: Running command: /usr/sbin/restorecon -R /var/lib/ceph/tmp/mnt.5St2Fg/magic.10803.tmp
[node1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.5St2Fg/magic.10803.tmp
[node1][WARNIN] command: Running command: /usr/sbin/restorecon -R /var/lib/ceph/tmp/mnt.5St2Fg/journal_uuid.10803.tmp
[node1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.5St2Fg/journal_uuid.10803.tmp
[node1][WARNIN] adjust_symlink: Creating symlink /var/lib/ceph/tmp/mnt.5St2Fg/journal -> /dev/disk/by-partuuid/22ea9667-570d-4697-b9dc-21968d31c445
[node1][WARNIN] command: Running command: /usr/sbin/restorecon -R /var/lib/ceph/tmp/mnt.5St2Fg
[node1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.5St2Fg
[node1][WARNIN] unmount: Unmounting /var/lib/ceph/tmp/mnt.5St2Fg
[node1][WARNIN] command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.5St2Fg
[node1][WARNIN] get_dm_uuid: get_dm_uuid /dev/sde uuid path is /sys/dev/block/8:64/dm/uuid
[node1][WARNIN] command_check_call: Running command: /usr/sbin/sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/sde
[node1][DEBUG ] The operation has completed successfully.
[node1][WARNIN] update_partition: Calling partprobe on prepared device /dev/sde
[node1][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
[node1][WARNIN] command: Running command: /usr/bin/flock -s /dev/sde /usr/sbin/partprobe /dev/sde
[node1][WARNIN] command_check_call: Running command: /usr/bin/udevadm settle --timeout=600
[node1][WARNIN] command_check_call: Running command: /usr/bin/udevadm trigger --action=add --sysname-match sde1
[node1][INFO ] Running command: systemctl enable ceph.target
[node1][INFO ] checking OSD status...
[node1][DEBUG ] find the location of an executable
[node1][INFO ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
[node1][WARNIN] there is 1 OSD down
[node1][WARNIN] there is 1 OSD out
[ceph_deploy.osd][DEBUG ] Host node1 is now ready for osd use.

3. Check the OSD tree
[root@node1 ceph]# ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.05878 root default
-2 0.01959     host node1
 0 0.00490         osd.0       up  1.00000          1.00000
 1 0.00490         osd.1       up  1.00000          1.00000
 2 0.00490         osd.2       up  1.00000          1.00000
11 0.00490         osd.11      up  1.00000          1.00000
-3 0.01959     host node2
 4 0.00490         osd.4       up  1.00000          1.00000
 5 0.00490         osd.5       up  1.00000          1.00000
 6 0.00490         osd.6       up  1.00000          1.00000
 7 0.00490         osd.7       up  1.00000          1.00000
-4 0.01959     host node3
 8 0.00490         osd.8       up  1.00000          1.00000
 9 0.00490         osd.9       up  1.00000          1.00000
 3 0.00490         osd.3       up  1.00000          1.00000
10 0.00490         osd.10      up  1.00000          1.00000

4. Check the OSD service status
[root@node1 ceph]# systemctl status ceph-osd@11
● ceph-osd@11.service - Ceph object storage daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
   Active: active (running) since Mon 2018-09-10 03:20:37 EDT; 20min ago
 Main PID: 11379 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@11.service
           └─11379 /usr/bin/ceph-osd -f --cluster ceph --id 11 --setuser ceph --setgroup ceph

Sep 10 03:20:36 node1 systemd[1]: ceph-osd@11.service holdoff time over, scheduling restart.
Sep 10 03:20:36 node1 systemd[1]: Starting Ceph object storage daemon...
Sep 10 03:20:37 node1 ceph-osd-prestart.sh[11325]: create-or-move updating item name 'osd.11' weight 0.0049 at location {hos...sh map
Sep 10 03:20:37 node1 systemd[1]: Started Ceph object storage daemon.
Sep 10 03:20:38 node1 ceph-osd[11379]: starting osd.11 at :/0 osd_data /var/lib/ceph/osd/ceph-11 /var/lib/ceph/osd/ceph-11/journal
Sep 10 03:21:13 node1 ceph-osd[11379]: 2018-09-10 03:21:13.399072 7f09b5797ac0 -1 osd.11 0 log_to_monitors {default=true}
Hint: Some lines were ellipsized, use -l to show in full.
[root@node1 ceph]#

While the OSD is being added, the cluster briefly enters the ERR state:

[root@node1 ceph]# ceph -s
    cluster 8eaa3f15-0946-4500-b018-6d31d1cc69f6
     health HEALTH_ERR
            11 pgs are stuck inactive for more than 300 seconds
            13 pgs peering
            11 pgs stuck inactive
            11 pgs stuck unclean
     monmap e1: 3 mons at {node1=192.168.209.100:6789/0,node2=192.168.209.101:6789/0,node3=192.168.209.102:6789/0}
            election epoch 292, quorum 0,1,2 node1,node2,node3
     osdmap e5664: 12 osds: 12 up, 12 in
            flags sortbitwise,require_jewel_osds
      pgmap v16499: 128 pgs, 1 pools, 0 bytes data, 0 objects
            1519 MB used, 59788 MB / 61307 MB avail
                  98 active+clean
                  17 activating
                  13 peering
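Because this ERR state is transient, it can help to wait for the cluster to settle before doing anything else. Below is a rough polling sketch; `wait_for_health_ok` is a hypothetical helper, the attempt count and delay are arbitrary, and it assumes the health command prints the status word first (as `ceph health` does).

```shell
#!/bin/sh
# Poll a health command until its first output word is HEALTH_OK.
# $1 = maximum number of attempts, $2 = seconds to sleep between attempts,
# remaining arguments = the command to run (for example: ceph health).
wait_for_health_ok() {
    max_tries=$1
    delay=$2
    shift 2
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        status=$("$@" 2>/dev/null | awk 'NR==1 { print $1 }')
        if [ "$status" = "HEALTH_OK" ]; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$delay"
    done
    echo "still not HEALTH_OK after $max_tries checks (last: ${status:-unknown})" >&2
    return 1
}

# Example: check every 5 seconds, for up to 5 minutes.
#   wait_for_health_ok 60 5 ceph health
```

Passing the command as arguments keeps the loop independent of how health is queried, so the same function can be tried against a stub command before it is pointed at a real cluster.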
