Ceph troubleshooting notes

MON failures

MON fails to start

  • Symptoms

For some reason, ceph-mon fails to start. Possible causes include a corrupted monitor leveldb or lost data. The OS is otherwise healthy (the OSDs run normally); only ceph-mon cannot start.
In this situation, removing the node from the cluster and re-adding it would trigger two rounds of data rebalancing, which takes a long time when the data volume is large. Instead, the monitor can be rebuilt in place as follows.

  • Handling

1. On the current node, stop ceph-mon and export the monmap. Edit the monmap manually with the monmaptool utility, inject the modified map back into the remaining monitors, and finally verify with ceph mon dump that the removal succeeded.

ceph-mon -i {mon-id} --extract-monmap {map-path}
# for example,
ceph-mon -i a --extract-monmap /tmp/monmap

monmaptool {map-path} --rm {mon-id}
# for example
monmaptool /tmp/monmap --rm b

ceph-mon -i {mon-id} --inject-monmap {map-path}
# for example,
ceph-mon -i a --inject-monmap /tmp/monmap
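
A consolidated sketch of the removal flow, run against a surviving monitor (mon IDs a/b and the /tmp path are taken from the examples above; the service commands assume a sysvinit-managed cluster and may differ with another init system):

service ceph stop mon.a                      # the daemon must be stopped while its monmap is edited
ceph-mon -i a --extract-monmap /tmp/monmap   # export the map from the surviving monitor
monmaptool /tmp/monmap --rm b                # drop the broken monitor "b" from the map
ceph-mon -i a --inject-monmap /tmp/monmap    # write the edited map back
service ceph start mon.a
ceph mon dump                                # verify that mon.b is no longer listed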

2. Once the removal succeeds, re-add the monitor. First export the current monmap from the ceph-mon leader:

ceph -m 10.10.10.62 mon getmap -o monmap

3. Create the ceph-mon data directory under /data/ (the mon-id here is a 5-character string):

ceph-mon -i wsgws --mkfs --monmap monmap
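
Note: in the standard manual-add procedure the monitor keyring is also passed to --mkfs. A sketch assuming the keyring is dumped to /tmp:

ceph auth get mon. -o /tmp/keyring
ceph-mon -i wsgws --mkfs --monmap monmap --keyring /tmp/keyring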

4. Add the new ceph-mon to the monmap:

ceph mon add wsgws 10.10.10.61:6789

5. Restart ezs3-agent, which adds the new mon to ceph.conf.
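
Finally, start the new monitor daemon so it joins the quorum. If its address is not yet present in ceph.conf, it can be given on the command line (a sketch using the address from step 4):

ceph-mon -i wsgws --public-addr 10.10.10.61:6789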

OSD failures

FAILED assert(0 == "unexpected error")

  • Symptoms

On Ceph 0.67, the OSD log contains the following error:

    -3> 2017-10-18 16:44:55.365325 7fdb5a6c0bc0  0 filestore(/data/osd.13)  error (39) Directory not empty not handled on operation 21 (28581659019.0.0, or op 0, counting from 0)
    -2> 2017-10-18 16:44:55.365340 7fdb5a6c0bc0  0 filestore(/data/osd.13) ENOTEMPTY suggests garbage data in osd data dir
    -1> 2017-10-18 16:44:55.365343 7fdb5a6c0bc0  0 filestore(/data/osd.13)  transaction dump:
{ "ops": [
        { "op_num": 0,
          "op_name": "rmcoll",
          "collection": "24.36_TEMP"}]}
     0> 2017-10-18 16:44:55.367380 7fdb5a6c0bc0 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int)' thread 7fdb5a6c0bc0 time 2017-10-18 16:44:55.365391
os/FileStore.cc: 2471: FAILED assert(0 == "unexpected error")
  • Handling

The log indicates garbage data in the 24.36_TEMP collection. Entering that directory manually, we found a leftover file; deleting it resolved the problem.
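
A sketch of the cleanup (the OSD data path comes from the log above; the exact location of the _TEMP collection directory under current/ is an assumption, so confirm it before deleting anything):

    # stop the OSD, inspect the temp collection named in the transaction dump, remove the leftover file
    service ceph stop osd.13
    ls -la /data/osd.13/current/24.36_TEMP/
    rm /data/osd.13/current/24.36_TEMP/<leftover-file>
    service ceph start osd.13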

Handling unfound objects

  • Symptoms

On Ceph 0.67, after an OSD failure was recovered, the cluster reported 2 unfound objects.

  • Handling

    1. Identify the PGs that contain the unfound objects:
    root@Storage-b6:/etc/cron.d# ceph health detail |grep unfound
    HEALTH_WARN 1219 pgs backfill; 1 pgs backfilling; 1079 pgs degraded; 2 pgs recovering; 1222 pgs stuck unclean; recovery 3661003/50331903 degraded (7.274%); 3/25061739 unfound (0.000%); recovering 5 o/s, 25439KB/s; noout flag(s) set
    pg 17.2c3 is active+recovering+degraded, acting [20,2], 1 unfound
    pg 17.2b1 is active+recovering+degraded, acting [20,2], 2 unfound
    2. To see exactly which objects are missing, use:
    ceph pg 17.2b1 list_missing
    3. Mark the unfound objects as lost:
    ceph pg 17.2b1 mark_unfound_lost revert

Note that revert is not always a rollback: it reverts the object to a previous version if one exists, and for a newly created object with no earlier version it simply forgets the object (the delete option forgets unconditionally); a scripted sketch follows.
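
When many PGs report unfound objects, a minimal loop can be used (the awk field positions assume the ceph health detail output format shown above; review each list_missing result before uncommenting the revert line):

    for pg in $(ceph health detail | awk '/unfound$/ {print $2}'); do
        ceph pg "$pg" list_missing
        # ceph pg "$pg" mark_unfound_lost revert
    done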

  • Reference

http://docs.ceph.com/docs/arg…

unable to open OSD superblock on /data/osd

  • Symptoms

    2017-10-10 12:00:15.534831 7fda0e7b97c0 -1 ** ERROR: unable to open OSD superblock on /data/osd.2: (2) No such file or directory
    2017-10-10 14:14:58.196144 7f71363c27c0 0 ceph version 0.94.9-1061-g9bcd143 (9bcd143a70b819086b9e58e0799ba93364d7ee31), process ceph-osd, pid 7250
    2017-10-10 14:14:58.198476 7f71363c27c0 -1 ** ERROR: unable to open OSD superblock on /data/osd.2: (2) No such file or directory

This error means the OSD disk is not mounted, so the superblock file cannot be opened; df -h confirms that the OSD disk is indeed not mounted.

/var/log/kern.log contains the following errors:

    Oct 10 14:14:54 142 kernel: [ 120.133734] EXT4-fs (dm-0): ext4_check_descriptors: Checksum for group 32640 failed (58283!=0)
    Oct 10 14:14:54 142 kernel: [ 120.133739] EXT4-fs (dm-0): group descriptors corrupted!
    Oct 10 14:14:54 142 kernel: [ 120.214674] EXT4-fs (dm-1): ext4_check_descriptors: Inode bitmap for group 34688 not in group (block 2037277037)!
    Oct 10 14:14:54 142 kernel: [ 120.214679] EXT4-fs (dm-1): group descriptors corrupted!

Checking the dm devices with tune2fs showed no recorded EXT4 errors, which shows that tune2fs does not always reflect corruption promptly; the check is sketched below.
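
For reference, the tune2fs check that was run (a sketch; dm-0 is the device named in the kernel log above):

    tune2fs -l /dev/dm-0 | grep -iE 'state|error'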

  • Handling

    e2fsck -p /dev/dm-0 refuses to repair (preen mode reports it cannot fix the errors automatically)
    e2fsck -y /dev/dm-0 performs a best-effort repair; the OSD needs to be reformatted later for a thorough fix
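
A fuller recovery sketch (the device name, mount point, and sysvinit-style service command are assumptions based on this environment):

    df -h | grep /data/osd.2 || echo "osd.2 is not mounted"
    e2fsck -y /dev/dm-0                 # best-effort repair of the corrupted group descriptors
    mount /dev/dm-0 /data/osd.2
    service ceph start osd.2
    # plan to reformat and rebuild this OSD later for a thorough fix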

  • Root cause

The disk cache policy was found to be disabled, and the node had previously been powered off forcibly (a converged environment with the RAID card in passthrough mode).

MDS failures

mds/MDCache.cc: 216: FAILED assert(inode_map.count(in->vino()) == 0)

The MDS stays in the replay state and never recovers; the Ceph version is the old 0.67 release.

    -4> 2017-10-06 21:21:23.199440 7f5c63631700  0 mds.0.cache.dir(609) _fetched  badness: got (but i already had) [inode 1000004e2b0 [2,head] /shareroot/block-hz/routinebackup/2017-09-29/log/check_img.5b8ebf83-30c9-4c3c-b6be-bb37544d672f.log auth v296 s=0 n(v0 1=1+0) (iversion lock) 0x368f050] mode 33188 mtime 2017-09-29 05:09:03.364737
    -3> 2017-10-06 21:21:23.199470 7f5c63631700  0 log [ERR] : loaded dup inode 1000004e2b0 [2,head] v4917497405 at ~mds0/stray9/1000004e2b0, but inode 1000004e2b0.head v296 already exists at /shareroot/block-hz/routinebackup/2017-09-29/log/check_img.5b8ebf83-30c9-4c3c-b6be-bb37544d672f.log
    -2> 2017-10-06 21:21:23.199478 7f5c63631700  0 mds.0.cache.dir(609) _fetched  badness: got (but i already had) [inode 1000004e2b1 [2,head] /shareroot/block-hz/routinebackup/2017-09-29/log/check_img.bf45b4c0-717f-4ae4-b709-0f3cb8cd8650.log auth v298 s=0 n(v0 1=1+0) (iversion lock) 0x3695590] mode 33188 mtime 2017-09-29 05:09:03.413889
    -1> 2017-10-06 21:21:23.199500 7f5c63631700  0 log [ERR] : loaded dup inode 1000004e2b1 [2,head] v4917497407 at ~mds0/stray9/1000004e2b1, but inode 1000004e2b1.head v298 already exists at /shareroot/block-hz/routinebackup/2017-09-29/log/check_img.bf45b4c0-717f-4ae4-b709-0f3cb8cd8650.log
     0> 2017-10-06 21:21:23.201731 7f5c63631700 -1 mds/MDCache.cc: In function 'void MDCache::add_inode(CInode*)' thread 7f5c63631700 time 2017-10-06 21:21:23.199831
mds/MDCache.cc: 216: FAILED assert(inode_map.count(in->vino()) == 0)

  • Handling

Add the mds config option mds_wipe_ino_prealloc = true to ceph.conf (example below), restart the mds, and then remove the option from ceph.conf again.
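
A minimal example of the temporary ceph.conf entry (placing it in the [mds] section is an assumption; remove it again after the restart):

    [mds]
        mds_wipe_ino_prealloc = true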

mds/journal.cc: 1397: FAILED assert(mds->sessionmap.version == cmapv)

The log of the active MDS hits this assert.

  • Handling

Because this cluster runs Ceph 0.67, we rebuilt the mds binary with that assert removed.
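
For illustration only, the rebuild amounts to disabling the failing assertion before recompiling ceph-mds; a sketch of the kind of change (the surrounding code in mds/journal.cc is not reproduced here):

    --- a/src/mds/journal.cc
    +++ b/src/mds/journal.cc
    -  assert(mds->sessionmap.version == cmapv);
    +  // assert(mds->sessionmap.version == cmapv);  // tolerated: sessionmap version mismatch during replay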

    Original author: shengguo
    Original article: https://segmentfault.com/a/1190000011474137