mon failures
mon fails to start
- Symptoms
For various reasons ceph-mon may fail to start, for example because the monitor's LevelDB is corrupted or its data is lost. The OS itself is fine (the OSDs run normally); only ceph-mon fails to start.
In this situation, removing the node and then re-adding it would rebalance the data twice, which takes a long time when the data set is large, so the monitor can instead be rebuilt as follows.
- Resolution
1. On the current node, stop ceph-mon and extract the monmap. Edit the monmap manually with the monmaptool utility, inject the modified map back into the remaining monitors, and finally verify with ceph mon dump that the removal succeeded:
ceph-mon -i {mon-id} --extract-monmap {map-path}
# for example,
ceph-mon -i a --extract-monmap /tmp/monmap
monmaptool {map-path} --rm {mon-id}
# for example
monmaptool /tmp/monmap --rm b
ceph-mon -i {mon-id} --inject-monmap {map-path}
# for example,
ceph-mon -i a --inject-monmap /tmp/monmap
2. After the removal succeeds, re-add the monitor. Fetch the current monmap from the ceph-mon leader:
ceph -m 10.10.10.62 mon getmap -o monmap
3. Create the ceph-mon data directory under /data/; the mon-id is a 5-character string:
ceph-mon -i wsgws --mkfs --monmap monmap
4. Add the new ceph-mon to the monmap:
ceph mon add wsgws 10.10.10.61:6789
5. Restart ezs3-agent, which adds the new mon to ceph.conf. A sketch of starting and verifying the rebuilt monitor follows below.
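Starting the rebuilt daemon and the verification mentioned in step 1 are not shown above; a minimal sketch, assuming the mon-id wsgws and address 10.10.10.61 from the example (the --public-addr form follows the standard manual-add procedure):
ceph-mon -i wsgws --public-addr 10.10.10.61:6789   # start the rebuilt monitor
ceph mon dump                                      # mon.wsgws should now appear in the monmap
ceph quorum_status                                 # and should be listed in the quorum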
OSD failures
FAILED assert(0 == "unexpected error")
- Symptoms
On ceph 0.67, the OSD log contains the following errors:
-3> 2017-10-18 16:44:55.365325 7fdb5a6c0bc0 0 filestore(/data/osd.13) error (39) Directory not empty not handled on operation 21 (28581659019.0.0, or op 0, counting from 0)
-2> 2017-10-18 16:44:55.365340 7fdb5a6c0bc0 0 filestore(/data/osd.13) ENOTEMPTY suggests garbage data in osd data dir
-1> 2017-10-18 16:44:55.365343 7fdb5a6c0bc0 0 filestore(/data/osd.13) transaction dump:
{ "ops": [
{ "op_num": 0,
"op_name": "rmcoll",
"collection": "24.36_TEMP"}]}
0> 2017-10-18 16:44:55.367380 7fdb5a6c0bc0 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int)' thread 7fdb5a6c0bc0 time 2017-10-18 16:44:55.365391
os/FileStore.cc: 2471: FAILED assert(0 == "unexpected error")
- Resolution
The log shows that the 24.36_TEMP collection contains garbage data. Entering the directory manually revealed a leftover file; deleting it fixes the problem (see the sketch below).
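A hedged sketch of the cleanup, assuming the FileStore layout puts the collection at current/24.36_TEMP under the data path shown in the log (the exact path and the init commands are assumptions):
service ceph stop osd.13                  # the OSD must not be running during the cleanup
ls -la /data/osd.13/current/24.36_TEMP/   # inspect the leftover entries first
rm -f /data/osd.13/current/24.36_TEMP/*   # the failing rmcoll op was deleting this collection anyway
service ceph start osd.13                 # the assert should no longer trigger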
unfound object handling
- Symptoms
On ceph 0.67, two unfound objects appeared in the cluster after an OSD recovered from a failure.
- Resolution
1. Find the PGs that contain the unfound objects:
root@Storage-b6:/etc/cron.d# ceph health detail |grep unfound
HEALTH_WARN 1219 pgs backfill; 1 pgs backfilling; 1079 pgs degraded; 2 pgs recovering; 1222 pgs stuck unclean; recovery 3661003/50331903 degraded (7.274%); 3/25061739 unfound (0.000%); recovering 5 o/s, 25439KB/s; noout flag(s) set
pg 17.2c3 is active+recovering+degraded, acting [20,2], 1 unfound
pg 17.2b1 is active+recovering+degraded, acting [20,2], 2 unfound
2. To see exactly which objects are missing (a jq sketch follows after this list):
ceph pg 17.2b1 list_missing
3. Mark the unfound objects:
ceph pg 17.2b1 mark_unfound_lost revert
Note that this command does not necessarily roll the object back: what happens if the object is new? Do ceph objects carry version information?
- Reference
http://docs.ceph.com/docs/arg…
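list_missing prints JSON; a hedged sketch of extracting just the object names, assuming jq is available and that this release nests the name under objects[].oid (field names vary across releases):
ceph pg 17.2b1 list_missing | jq '.objects[].oid'   # one entry per missing object
ceph pg 17.2b1 query                                # recovery_state shows which OSDs might still hold them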
unable to open OSD superblock on /data/osd
- Symptoms
2017-10-10 12:00:15.534831 7fda0e7b97c0 -1 ** ERROR: unable to open OSD superblock on /data/osd.2: (2) No such file or directory
2017-10-10 14:14:58.196144 7f71363c27c0 0 ceph version 0.94.9-1061-g9bcd143 (9bcd143a70b819086b9e58e0799ba93364d7ee31), process ceph-osd, pid 7250
2017-10-10 14:14:58.198476 7f71363c27c0 -1 ** ERROR: unable to open OSD superblock on /data/osd.2: (2) No such file or directory
This error means the OSD disk was not mounted, so the superblock file could not be opened. df -h confirmed that the OSD disk was indeed not mounted, and /var/log/kern.log contains the following errors:
Oct 10 14:14:54 142 kernel: [ 120.133734] EXT4-fs (dm-0): ext4_check_descriptors: Checksum for group 32640 failed (58283!=0)
Oct 10 14:14:54 142 kernel: [ 120.133739] EXT4-fs (dm-0): group descriptors corrupted!
Oct 10 14:14:54 142 kernel: [ 120.214674] EXT4-fs (dm-1): ext4_check_descriptors: Inode bitmap for group 34688 not in group (block 2037277037)!
Oct 10 14:14:54 142 kernel: [ 120.214679] EXT4-fs (dm-1): group descriptors corrupted!
Checking the dm devices with tune2fs showed no recorded EXT4 errors, which shows that tune2fs error records can lag behind reality; see the sketch below.
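The tune2fs check referred to above, using the device names from the kernel log:
tune2fs -l /dev/dm-0 | grep -i error   # prints the superblock's recorded error fields (FS error count, first/last error time), if any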
- Resolution
e2fsck -p /dev/dm-0 refuses: preen mode reports it cannot fix the problems.
e2fsck -y /dev/dm-0 performs a best-effort repair; the OSD should later be reformatted for a thorough fix. A sketch of bringing the OSD back follows below.
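A hedged sketch of the full recovery sequence, assuming /dev/dm-0 maps to osd.2 and its usual mount point (the device-to-OSD mapping and the init command are assumptions):
e2fsck -y /dev/dm-0           # best-effort repair, answering yes to every prompt
mount /dev/dm-0 /data/osd.2   # remount the repaired filesystem
service ceph start osd.2      # the superblock error should be gone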
- Root cause
The disk cache policy was checked and found disabled; the machine had previously been forcibly powered off (a converged environment with the RAID card in passthrough mode).
mds failures
mds/MDCache.cc: 216: FAILED assert(inode_map.count(in->vino()) == 0)
The mds stays in the replay state and never recovers; the ceph release is the old 0.67 version.
-4> 2017-10-06 21:21:23.199440 7f5c63631700 0 mds.0.cache.dir(609) _fetched badness: got (but i already had) [inode 1000004e2b0 [2,head] /shareroot/block-hz/routinebackup/2017-09-29/log/check_img.5b8ebf83-30c9-4c3c-b6be-bb37544d672f.log auth v296 s=0 n(v0 1=1+0) (iversion lock) 0x368f050] mode 33188 mtime 2017-09-29 05:09:03.364737
-3> 2017-10-06 21:21:23.199470 7f5c63631700 0 log [ERR] : loaded dup inode 1000004e2b0 [2,head] v4917497405 at ~mds0/stray9/1000004e2b0, but inode 1000004e2b0.head v296 already exists at /shareroot/block-hz/routinebackup/2017-09-29/log/check_img.5b8ebf83-30c9-4c3c-b6be-bb37544d672f.log
-2> 2017-10-06 21:21:23.199478 7f5c63631700 0 mds.0.cache.dir(609) _fetched badness: got (but i already had) [inode 1000004e2b1 [2,head] /shareroot/block-hz/routinebackup/2017-09-29/log/check_img.bf45b4c0-717f-4ae4-b709-0f3cb8cd8650.log auth v298 s=0 n(v0 1=1+0) (iversion lock) 0x3695590] mode 33188 mtime 2017-09-29 05:09:03.413889
-1> 2017-10-06 21:21:23.199500 7f5c63631700 0 log [ERR] : loaded dup inode 1000004e2b1 [2,head] v4917497407 at ~mds0/stray9/1000004e2b1, but inode 1000004e2b1.head v298 already exists at /shareroot/block-hz/routinebackup/2017-09-29/log/check_img.bf45b4c0-717f-4ae4-b709-0f3cb8cd8650.log
0> 2017-10-06 21:21:23.201731 7f5c63631700 -1 mds/MDCache.cc: In function 'void MDCache::add_inode(CInode*)' thread 7f5c63631700 time 2017-10-06 21:21:23.199831
mds/MDCache.cc: 216: FAILED assert(inode_map.count(in->vino()) == 0)
- Resolution
Add the mds config option mds_wipe_ino_prealloc = true to ceph.conf.
Restart the mds; once it has recovered, remove the option from ceph.conf again. A sketch follows below.
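A minimal sketch of the temporary change (the section placement and the init command are assumptions):
# in ceph.conf, temporarily add under the [mds] section:
#   mds_wipe_ino_prealloc = true
service ceph restart mds   # restart so the option takes effect
# once the mds is active again, remove the line and restart once more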
mds/journal.cc: 1397: FAILED assert(mds->sessionmap.version == cmapv)
This assert appears in the active mds log.
- Resolution
Because the ceph version is 0.67, the mds binary was rebuilt with the assert removed; a sketch of that workaround follows below.
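A heavily hedged sketch of that workaround; the line number comes from the assert message itself, and the build steps are assumptions about the 0.67 autotools tree:
# comment out the failing assert (assumes it still sits on line 1397 of src/mds/journal.cc)
sed -i '1397s|^|// |' src/mds/journal.cc
# rebuild only the mds binary
./autogen.sh && ./configure && make -C src ceph-mds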