Author: Li Xiaohui (李晓辉)
Contact:
1. WeChat: Lxh_Chat
2. Email: 939958092@qq.com
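Describing the OSD Map
The OSD map contains the address and status of every OSD, the list of pools with their details, and other information such as the limits at which an OSD is considered close to full capacity.
When the cluster infrastructure changes, for example when an OSD joins or leaves the cluster, the MONs update the corresponding maps. The MONs keep a history of map revisions. Ceph identifies each version of each map with an ordered set of incrementing integers called epochs.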
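The ceph status -f json-pretty command displays the epoch of each map. Use the ceph MAP dump subcommand to display an individual map, where MAP is the name of the map, for example ceph osd dump. Whenever an OSD joins or leaves the cluster, Ceph updates the OSD map. An OSD can leave the Ceph cluster because the OSD itself fails or because of a hardware failure.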
    [root@serverc ~]# ceph status -f json-pretty | more

    {
        "fsid": "2ae6d05a-229a-11ec-925e-52540000fa0c",
        "health": {
            "status": "HEALTH_OK",
            "checks": {},
            "mutes": []
        },
        "election_epoch": 122,
        "quorum": [
            0,
            1,
            2,
            3
        ],
        "quorum_names": [
            "serverc.lab.example.com",
            "clienta",
            "serverd",
            "servere"
        ],
        "quorum_age": 3832,
        "monmap": {
            "epoch": 4,
            "min_mon_release_name": "pacific",
            "num_mons": 4
        },
        "osdmap": {
            "epoch": 290,
            "num_osds": 9,
            "num_up_osds": 9,
            "osd_up_since": 1722915456,
            "num_in_osds": 9,
            "osd_in_since": 1635491258,
            "num_remapped_pgs": 0
        },
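The epochs can also be pulled out of the status document programmatically. The following is a minimal Python sketch, not part of the original walkthrough. The helper name map_epochs and the trimmed SAMPLE document are assumptions made for illustration; the sample keeps only a few of the fields shown in the output above.

```python
import json


def map_epochs(status: dict) -> dict:
    """Collect the epoch of every '*map' section in a Ceph status document."""
    return {
        name: section["epoch"]
        for name, section in status.items()
        if name.endswith("map") and isinstance(section, dict) and "epoch" in section
    }


# Trimmed sample based on the output above (hypothetical stand-in for the
# complete JSON that `ceph status -f json-pretty` prints without `| more`).
SAMPLE = """
{
    "fsid": "2ae6d05a-229a-11ec-925e-52540000fa0c",
    "election_epoch": 122,
    "monmap": {"epoch": 4, "min_mon_release_name": "pacific", "num_mons": 4},
    "osdmap": {"epoch": 290, "num_osds": 9, "num_up_osds": 9}
}
"""

epochs = map_epochs(json.loads(SAMPLE))
print(epochs)  # {'monmap': 4, 'osdmap': 290}
```

On a live cluster the same function could be fed the full command output instead of SAMPLE; the truncated output shown above is not valid JSON on its own because it was cut off by the pager.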
Viewing OSD information
As expected, the output shows the epoch, the pool information, the OSD information, and more.
    [root@serverc ~]# ceph osd dump | more
    epoch 290
    fsid 2ae6d05a-229a-11ec-925e-52540000fa0c
    created 2021-10-01T09:30:32.028240+0000
    modified 2024-08-07T06:56:44.969909+0000
    flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
    crush_version 36
    full_ratio 0.95
    backfillfull_ratio 0.9
    nearfull_ratio 0.85
    require_min_compat_client luminous
    min_compat_client luminous
    require_osd_release pacific
    stretch_mode_enabled false
    pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 261 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
    pool 2 '.rgw.root' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 262 flags hashpspool stripe_width 0 application rgw
    pool 3 'default.rgw.log' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 263 flags hashpspool stripe_width 0 application rgw
    pool 4 'default.rgw.control' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 264 flags hashpspool stripe_width 0 application rgw
    pool 5 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 265 lfor 0/184/182 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
    pool 6 'abc' erasure profile default size 4 min_size 3 crush_rule 3 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 290 flags hashpspool stripe_width 8192 application rbd
    pool 7 'ecpool' erasure profile lxhecprofile size 6 min_size 5 crush_rule 3 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 289 flags hashpspool stripe_width 16384 application rbd
    max_osd 9
    osd.0 up in weight 1 up_from 190 up_thru 281 down_at 189 last_clean_interval [24,185) [v2:172.25.250.12:6800/1196803373,v1:172.25.250.12:6801/1196803373] [v2:172.25.249.12:6802/1196803373,v1:172.25.249.12:6803/1196803373] exists,up 5be66be9-8262-4c4b-9654-ed549f6280f7
    osd.1 up in weight 1 up_from 189 up_thru 276 down_at 188 last_clean_interval [24,185) [v2:172.25.250.12:6807/4220845531,v1:172.25.250.12:6808/4220845531] [v2:172.25.249.12:6809/4220845531,v1:172.25.249.12:6810/4220845531] exists,up 3f751363-a03c-4b76-af92-8114e38bfa09
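To make the layout of this dump easier to see, here is a rough Python sketch that extracts the epoch, the pools, and the OSD states from the plain-text form. The function summarize_osd_dump, the two regular expressions, and the shortened SAMPLE text are all hypothetical and only cover the fields visible in the listing above. For real tooling the structured output of ceph osd dump -f json is likely a better starting point than parsing text.

```python
import re

# Patterns for the two line shapes shown in the dump above (illustrative only).
POOL_RE = re.compile(
    r"^pool (\d+) '([^']+)' (replicated|erasure)\b.*?\bsize (\d+) min_size (\d+)"
)
OSD_RE = re.compile(r"^osd\.(\d+)\s+(up|down)\s+(in|out)\b")


def summarize_osd_dump(text):
    """Pull the epoch, pools, and OSD states out of plain `ceph osd dump` text."""
    summary = {"epoch": None, "pools": [], "osds": []}
    for raw in text.splitlines():
        line = raw.strip()
        pool = POOL_RE.match(line)
        osd = OSD_RE.match(line)
        if line.startswith("epoch "):
            summary["epoch"] = int(line.split()[1])
        elif pool:
            pool_id, name, kind, size, min_size = pool.groups()
            summary["pools"].append({
                "id": int(pool_id), "name": name, "type": kind,
                "size": int(size), "min_size": int(min_size),
            })
        elif osd:
            osd_id, up_state, in_state = osd.groups()
            summary["osds"].append({
                "id": int(osd_id), "up": up_state == "up", "in": in_state == "in",
            })
    return summary


# Shortened sample lines taken from the listing above.
SAMPLE = """\
epoch 290
fsid 2ae6d05a-229a-11ec-925e-52540000fa0c
pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 1
pool 7 'ecpool' erasure profile lxhecprofile size 6 min_size 5 crush_rule 3
max_osd 9
osd.0 up in weight 1 up_from 190 up_thru 281 down_at 189
"""

summary = summarize_osd_dump(SAMPLE)
print(summary["epoch"], [p["name"] for p in summary["pools"]])
```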
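Analyzing OSD Map Updates
Whenever an OSD joins or leaves the cluster, Ceph updates the OSD map. An OSD can leave the Ceph cluster because the OSD itself fails or because of a hardware failure.
OSDs do not rely on a leader to manage the OSD map; they propagate the map among themselves. OSDs tag every message they exchange with their OSD map epoch. When an OSD detects that it has fallen behind, it updates its map from its peer OSD, and the receiving node applies the changes as incremental map updates.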
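The epoch tagging and incremental catch-up described above can be pictured with a toy model. This is only a sketch under simplifying assumptions, not Ceph's actual implementation: the ToyOSD class, its method names, and the idea of storing each incremental change as a small dictionary are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOSD:
    """Hypothetical OSD that keeps a map epoch plus the increments it applied."""
    name: str
    epoch: int = 0
    state: dict = field(default_factory=dict)         # OSD name -> "up" / "down"
    incrementals: dict = field(default_factory=dict)  # epoch -> change applied

    def apply(self, epoch, change):
        """Apply one incremental update; epochs must advance one at a time."""
        if epoch != self.epoch + 1:
            raise ValueError("incremental updates must be applied in epoch order")
        self.state.update(change)
        self.incrementals[epoch] = change
        self.epoch = epoch

    def catch_up_from(self, peer):
        """Fetch only the increments this OSD is missing from a newer peer."""
        for epoch in range(self.epoch + 1, peer.epoch + 1):
            self.apply(epoch, peer.incrementals[epoch])

    def send(self, peer, payload):
        """Every message carries the sender's current map epoch."""
        peer.receive(self, self.epoch, payload)

    def receive(self, sender, sender_epoch, payload):
        """Whichever side is behind catches up from the other."""
        if sender_epoch > self.epoch:
            self.catch_up_from(sender)
        elif sender_epoch < self.epoch:
            sender.catch_up_from(self)


a, b = ToyOSD("osd.0"), ToyOSD("osd.1")
for osd in (a, b):
    osd.apply(1, {"osd.0": "up", "osd.1": "up"})
a.apply(2, {"osd.2": "up"})
a.apply(3, {"osd.2": "down"})

a.send(b, "heartbeat")   # b sees epoch 3 > 1 and pulls increments 2 and 3
print(b.epoch, b.state)  # 3 {'osd.0': 'up', 'osd.1': 'up', 'osd.2': 'down'}
```

The point of the model is that no central coordinator is involved: comparing epochs on ordinary messages is enough for a lagging OSD to notice and request only the missing increments.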
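Propagating the OSD Map
OSDs report their status to the monitors at regular intervals. In addition, OSDs exchange heartbeats so that an OSD can detect the failure of a peer and report that event to the monitors.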
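Heartbeat-based failure detection boils down to checking how long ago each peer was last heard from. The sketch below assumes a fixed grace period of 20 seconds and uses explicit timestamps so that it is deterministic; the function failed_peers and the sample data are hypothetical and do not reflect how Ceph stores heartbeat state.

```python
def failed_peers(last_heartbeat, now, grace=20.0):
    """Return the peers whose last heartbeat is older than the grace period."""
    return sorted(
        peer for peer, seen in last_heartbeat.items() if now - seen > grace
    )


# Timestamps (in seconds) of the last heartbeat received from each peer.
last_heartbeat = {"osd.1": 100.0, "osd.2": 118.0, "osd.3": 75.0}

# At t=120 only osd.3 has been silent for longer than the grace period,
# so it is the peer this OSD would report to the monitors.
suspects = failed_peers(last_heartbeat, now=120.0)
print(suspects)  # ['osd.3']
```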
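When the leader monitor learns of an OSD failure, it updates the map, increments the epoch, and uses the Paxos update protocol to notify the other monitors, revoking their leases at the same time. After a majority of the monitors acknowledge the update and the cluster has quorum, the leader monitor issues new leases so that the monitors can distribute the updated OSD map. This approach prevents the map epoch from ever going backward anywhere in the cluster and prevents older leases that are still valid from being found.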
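The lease and majority rule can be illustrated with a heavily simplified model. It is not an implementation of Paxos and not Ceph's monitor code; ToyMonitor, publish_update, and the reachable flag are assumptions introduced only to show why a new epoch is served only after a majority has acknowledged it.

```python
class ToyMonitor:
    """Hypothetical monitor holding a map epoch and a lease flag."""

    def __init__(self, name):
        self.name = name
        self.epoch = 0
        self.lease_valid = True   # may this monitor hand out its map?
        self.reachable = True     # can the leader currently talk to it?


def publish_update(leader, peons):
    """Propose epoch+1; commit and renew leases only with a majority of acks."""
    monitors = [leader, *peons]
    proposed = leader.epoch + 1

    # Revoke all leases first so nobody serves the old map during the update.
    for mon in monitors:
        mon.lease_valid = False

    # The leader counts itself; peons acknowledge if reachable and not ahead.
    ackers = [leader] + [p for p in peons if p.reachable and p.epoch < proposed]
    if len(ackers) <= len(monitors) // 2:
        return False  # no majority: leases stay revoked, epoch is unchanged

    for mon in ackers:
        mon.epoch = proposed
        mon.lease_valid = True    # new lease: may distribute the updated map
    return True


leader = ToyMonitor("serverc")
peons = [ToyMonitor(n) for n in ("clienta", "serverd", "servere")]
peons[2].reachable = False        # one of four monitors is unreachable

committed = publish_update(leader, peons)
print(committed, [(m.name, m.epoch, m.lease_valid) for m in [leader, *peons]])
```

With three of four monitors acknowledging, the update commits; the unreachable monitor keeps its old epoch but has no valid lease, so in this model it cannot serve the stale map.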
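Copyright notice: The content of this blog may not be reproduced or quoted without permission; reproduction and quotation require the author's consent. Author WeChat: Lxh_Chat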