0 Environment overview
IP address | Hostname | Extra disks | Already in the Ceph cluster |
---|---|---|---|
10.0.0.141 | ceph141 | sdb 300G, sdc 500G | Yes |
10.0.0.142 | ceph142 | sdb 300G, sdc 500G, sdd 1024G | No |
10.0.0.143 | ceph143 | sdb 300G, sdc 500G | No |
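In the previous article we initialized a Ceph admin node, ceph141. The next step is to add the ceph142 and ceph143 nodes to the cluster.
Until the new hosts join the cluster, their resources cannot be used: the device list and the host list contain nothing but ceph141.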
[root@ceph141~]# ceph orch device ls
HOST PATH TYPE DEVICE ID SIZE AVAILABLE REFRESHED REJECT REASONS
ceph141 /dev/sdb hdd ATA_VMware_Virtual_S_01000000000000000001 300G Yes 6m ago
ceph141 /dev/sdc hdd ATA_VMware_Virtual_S_02000000000000000001 500G Yes 6m ago
[root@ceph141~]# ceph orch host ls
HOST ADDR LABELS STATUS
ceph141 10.0.0.141 _admin
1 hosts in cluster
Checking the cluster state shows that only this one freshly initialized node is present so far.
1 Adding and removing Ceph nodes
orch is short for orchestrator.
1. The cluster currently has a single host; next, add the other nodes. Hosts can also be added from the web dashboard.
[root@ceph141~]# ceph orch host ls
HOST ADDR LABELS STATUS
ceph141 10.0.0.141 _admin
1 hosts in cluster
2. Copy the cluster's SSH public key to the other servers to enable passwordless login
[root@ceph141 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub ceph142
[root@ceph141 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub ceph143
3. Add the other nodes to the cluster
[root@ceph141~]# ceph orch host add ceph142 10.0.0.142
Added host 'ceph142' with addr '10.0.0.142'
[root@ceph141~]# ceph orch host add ceph143 10.0.0.143
Added host 'ceph143' with addr '10.0.0.143'
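With more than a couple of nodes, steps 2 and 3 can be wrapped in a loop. The following is only a sketch: add_hosts is a hypothetical helper name, and it assumes the key path and root SSH access used in this article.

```shell
# Hypothetical helper: distribute the cluster SSH key and register each host.
# Arguments are "hostname:ip" pairs, e.g. ceph142:10.0.0.142.
add_hosts() (
    for pair in "$@"; do
        host=${pair%%:*}   # text before the first colon
        ip=${pair#*:}      # text after the first colon
        # Same two commands as in steps 2 and 3.
        ssh-copy-id -f -i /etc/ceph/ceph.pub "$host" || exit 1
        ceph orch host add "$host" "$ip" || exit 1
    done
)

# Usage (on the admin node):
#   add_hosts ceph142:10.0.0.142 ceph143:10.0.0.143
```

The function stops at the first failing command, so a host whose key copy fails is not registered.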
4. List the hosts again; all three nodes are now in the list
[root@ceph141~]# ceph orch host ls
HOST ADDR LABELS STATUS
ceph141 10.0.0.141 _admin
ceph142 10.0.0.142
ceph143 10.0.0.143
3 hosts in cluster
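5. Try removing a node (remember to add it back afterwards). The removal is refused while daemons are still running on the host: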
[root@ceph141~]# ceph orch host rm ceph143
Error EINVAL: Not allowed to remove ceph143 from cluster. The following daemons are running in the host:
type id
-------------------- ---------------
ceph-exporter ceph143
crash ceph143
node-exporter ceph143
mon ceph143
Please run 'ceph orch host drain ceph143' to remove daemons from host
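As the error message suggests, the host has to be drained first. Below is a rough sketch of that sequence: remove_host, RETRIES and DELAY are names made up for this example, and the retry loop simply relies on 'ceph orch host rm' failing for as long as daemons remain. Note that draining a host that carries OSDs also moves its data elsewhere, which can take a long time.

```shell
# Hypothetical helper: drain a host, then retry the removal until it succeeds.
# RETRIES and DELAY are illustrative knobs for this sketch, not Ceph settings.
remove_host() (
    host=$1
    retries=${RETRIES:-30}
    delay=${DELAY:-10}
    # Ask the orchestrator to move all daemons off the host.
    ceph orch host drain "$host" || exit 1
    i=0
    while [ "$i" -lt "$retries" ]; do
        # 'host rm' keeps failing (as in the error above) while daemons remain.
        if ceph orch host rm "$host"; then
            exit 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "giving up: $host could not be removed after $retries attempts" >&2
    exit 1
)

# Usage:
#   remove_host ceph143
#   ceph orch host add ceph143 10.0.0.143   # add it back afterwards
```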
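2 Adding OSD devices to the cluster
For a device to join the Ceph cluster as an OSD (object storage device), it must meet two conditions:
- The device is unused; a disk that has already been partitioned or put to use cannot be added
- The device is larger than 5 GB
1. Before adding any OSDs, check the current devices and the OSD tree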
[root@ceph141~]# ceph orch device ls
HOST PATH TYPE DEVICE ID SIZE AVAILABLE REFRESHED REJECT REASONS
ceph141 /dev/sdb hdd ATA_VMware_Virtual_S_01000000000000000001 300G Yes 45s ago
ceph141 /dev/sdc hdd ATA_VMware_Virtual_S_02000000000000000001 500G Yes 45s ago
ceph142 /dev/sdb hdd ATA_VMware_Virtual_S_01000000000000000001 300G Yes 7m ago
ceph142 /dev/sdc hdd ATA_VMware_Virtual_S_02000000000000000001 500G Yes 7m ago
ceph142 /dev/sdd hdd ATA_VMware_Virtual_S_03000000000000000001 1024G Yes 7m ago
ceph143 /dev/sdb hdd ATA_VMware_Virtual_S_01000000000000000001 300G Yes 6m ago
ceph143 /dev/sdc hdd ATA_VMware_Virtual_S_02000000000000000001 500G Yes 6m ago
[root@ceph141~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0 root default
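2. Add the devices listed above, adjusting the disk names to your own environment. The daemon subcommand means each OSD is added as a daemon.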
[root@ceph141 ~]# ceph orch daemon add osd ceph141:/dev/sdb
[root@ceph141 ~]# ceph orch daemon add osd ceph141:/dev/sdc
[root@ceph141 ~]# ceph orch daemon add osd ceph142:/dev/sdb
[root@ceph141 ~]# ceph orch daemon add osd ceph142:/dev/sdc
[root@ceph141 ~]# ceph orch daemon add osd ceph142:/dev/sdd
[root@ceph141 ~]# ceph orch daemon add osd ceph143:/dev/sdb
[root@ceph141 ~]# ceph orch daemon add osd ceph143:/dev/sdc
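Typing one command per disk gets tedious on larger clusters. The sketch below derives the host:path targets from the device listing instead. It assumes the column layout shown in step 1 (host, path, type, device ID, size, available); available_devices, add_available_osds and DRY_RUN are hypothetical names introduced here.

```shell
# Hypothetical helper: read `ceph orch device ls` text on stdin and print one
# "host:path" per device whose sixth column (AVAILABLE) says Yes.
available_devices() {
    awk '$6 == "Yes" { print $1 ":" $2 }'
}

# Hypothetical helper: add every available device as an OSD.
# With DRY_RUN=1 the commands are only printed, not executed.
add_available_osds() (
    ceph orch device ls | available_devices | while read -r target; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ceph orch daemon add osd $target"
        else
            # </dev/null keeps the command from consuming the loop's input.
            ceph orch daemon add osd "$target" </dev/null || exit 1
        fi
    done
)

# Usage:
#   DRY_RUN=1 add_available_osds   # review the commands first
#   add_available_osds
```

Ceph also ships a built-in alternative, ceph orch apply osd --all-available-devices, which keeps consuming every eligible disk; the explicit loop is easier to review before anything is changed.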
3. Check the added OSDs and their LVM volumes
[root@ceph141~]# ceph orch device ls
Has a FileSystem, Insufficient space (<10 extents) on vgs, LVM detected
Has a FileSystem, Insufficient space (<10 extents) on vgs, LVM detected
[root@ceph141~]# lsblk | egrep -A 1 '^sdb|^sdc'
sdb 8:16 0 300G 0 disk
└─ceph--ae423c15--8ff6--4e72--af97--ac909ca57fac-osd--block--6fc95050--e75e--4833--845b--b0e7698e1543 253:0 0 300G 0 lvm
sdc 8:32 0 500G 0 disk
└─ceph--00a6dda3--90da--46ba--8962--31835455173a-osd--block--2f6e56ff--9926--4c4e--8931--a780eb353431 253:1 0 500G 0 lvm
This shows that Ceph builds each OSD on top of an LVM logical volume.
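The long names in the lsblk output are device-mapper names, in which every hyphen of the volume group and logical volume names is doubled and a single hyphen separates the two. A small sketch for splitting such a name back into its parts (dm_to_vg_lv is a hypothetical helper, and it expects the bare name without the tree-drawing prefix):

```shell
# Hypothetical helper: split a device-mapper name into "VG LV".
dm_to_vg_lv() {
    printf '%s\n' "$1" | awk '{
        gsub("--", "\001")          # protect the escaped (doubled) hyphens
        i = index($0, "-")          # the remaining single hyphen splits VG and LV
        if (i == 0) exit 1
        vg = substr($0, 1, i - 1)
        lv = substr($0, i + 1)
        gsub("\001", "-", vg)       # restore the original hyphens
        gsub("\001", "-", lv)
        print vg, lv
    }'
}

# Usage with the sdb entry from the listing:
#   dm_to_vg_lv ceph--ae423c15--8ff6--4e72--af97--ac909ca57fac-osd--block--6fc95050--e75e--4833--845b--b0e7698e1543
```

For that entry the volume group is ceph-ae423c15-8ff6-4e72-af97-ac909ca57fac and the logical volume is osd-block-6fc95050-e75e-4833-845b-b0e7698e1543, which should match what vgs and lvs report on the host.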
4. Check the cluster status again. The command ceph status is equivalent to ceph -s
[root@ceph141~]# ceph status
  cluster:
    id:     12fad866-9aa0-11ef-8656-6516a17ad6dd
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph141,ceph142,ceph143 (age 10m)
    mgr: ceph141.yvswvf(active, since 10m), standbys: ceph142.gtcikx
    osd: 7 osds: 7 up (since 7m), 7 in (since 7m)

  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 577 KiB
    usage:   192 MiB used, 3.3 TiB / 3.3 TiB avail
    pgs:     1 active+clean
5. With everything added, the cluster lists 7 OSDs, which matches the 7 disks exactly
[root@ceph141~]# ceph osd ls
0
1
2
3
4
5
6
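The two checks from steps 4 and 5 can be combined into a single pass/fail test, for example at the end of a provisioning script. This is a minimal sketch; check_cluster is a hypothetical name, and it relies only on ceph osd ls printing one ID per line and ceph health starting with the health state.

```shell
# Hypothetical helper: succeed only if the cluster reports the expected number
# of OSDs and `ceph health` starts with HEALTH_OK.
check_cluster() (
    expected=$1
    # One OSD ID per line, so the line count is the OSD count.
    actual=$(ceph osd ls | wc -l | tr -d ' ')
    if [ "$actual" != "$expected" ]; then
        echo "expected $expected OSDs, found $actual" >&2
        exit 1
    fi
    case "$(ceph health)" in
        HEALTH_OK*) echo "OK: $actual OSDs, HEALTH_OK" ;;
        *) echo "cluster is not healthy" >&2; exit 1 ;;
    esac
)

# Usage:
#   check_cluster 7
```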
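3 Configuring time synchronization for the Ceph cluster
This cluster uses chrony to keep its clocks in sync. If the clocks of the nodes drift too far apart, the cluster's health status can become abnormal.
1. Install chrony on all nodes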
apt -y install chrony
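2. Edit the configuration on ceph141, which is set up as the time synchronization server.
Add one line to the configuration file: pool ntp.aliyun.com iburst maxsources 4
Finally, restart the service: systemctl restart chronyd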
[root@ceph141~]# yy /etc/chrony/chrony.conf
confdir /etc/chrony/conf.d
pool ntp.ubuntu.com iburst maxsources 4
pool 0.ubuntu.pool.ntp.org iburst maxsources 1
pool 1.ubuntu.pool.ntp.org iburst maxsources 1
pool 2.ubuntu.pool.ntp.org iburst maxsources 2
pool ntp.aliyun.com iburst maxsources 4
sourcedir /run/chrony-dhcp
sourcedir /etc/chrony/sources.d
keyfile /etc/chrony/chrony.keys
driftfile /var/lib/chrony/chrony.drift
ntsdumpdir /var/lib/chrony
logdir /var/log/chrony
maxupdateskew 100.0
rtcsync
makestep 1 3
leapsectz right/UTC
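When the same edit has to be repeated on several nodes, it helps if adding the line is idempotent. A minimal sketch follows; ensure_line is a hypothetical helper, not part of chrony.

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
ensure_line() (
    line=$1
    file=$2
    # -x matches whole lines, -F treats the text literally (no regex).
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
)

# Usage on ceph141, followed by the restart from step 2:
#   ensure_line 'pool ntp.aliyun.com iburst maxsources 4' /etc/chrony/chrony.conf
#   systemctl restart chronyd
```

Running it twice leaves a single copy of the line in the file.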
Verify that the service is working:
[root@ceph141~]# chronyc activity -v
200 OK
7 sources online
0 sources offline
2 sources doing burst (return to online)
0 sources doing burst (return to offline)
7 sources with unknown address
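chronyc activity only shows that the sources are reachable. To see how far the local clock is actually off, the "System time" line of chronyc tracking can be checked against a limit. The sketch below assumes that line has the usual "System time : <seconds> seconds fast|slow of NTP time" layout; clock_offset_ok is a hypothetical helper, and the 0.05 second default is an assumption chosen to resemble the drift Ceph monitors tolerate by default.

```shell
# Hypothetical helper: read `chronyc tracking` output on stdin and fail if the
# absolute system clock offset exceeds the limit (in seconds, default 0.05).
clock_offset_ok() {
    awk -v limit="${1:-0.05}" '
        /^System time/ {
            split($0, parts, ":")
            split(parts[2], f, " ")     # f[1] is the offset in seconds
            found = 1
            ok = (f[1] + 0 <= limit + 0)
        }
        END {
            if (found && ok) exit 0
            exit 1
        }
    '
}

# Usage on any node:
#   chronyc tracking | clock_offset_ok && echo "clock offset within limit"
```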
4 Setting up a Ceph admin node
1. Copy the apt source and its signing key to the ceph142 node
scp /etc/apt/sources.list.d/ceph.list ceph142:/etc/apt/sources.list.d/
scp /etc/apt/trusted.gpg.d/ceph.release.gpg ceph142:/etc/apt/trusted.gpg.d/
2. Install the client package on the ceph142 node
[root@ceph142~]# apt update
[root@ceph142~]# apt -y install ceph-common
[root@ceph142~]# ceph -v
ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)
3. From ceph141, copy the configuration and admin keyring to the ceph142 node
[root@ceph141~]# scp /etc/ceph/ceph.{conf,client.admin.keyring} ceph142:/etc/ceph/
4. On ceph142, verify that the Ceph cluster is reachable
[root@ceph142~]# ceph -s
  cluster:
    id:     12fad866-9aa0-11ef-8656-6516a17ad6dd
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph141,ceph142,ceph143 (age 15m)
    mgr: ceph141.yvswvf(active, since 14m), standbys: ceph142.gtcikx
    osd: 7 osds: 7 up (since 14m), 7 in (since 35m)

  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 577 KiB
    usage:   1003 MiB used, 3.3 TiB / 3.3 TiB avail
    pgs:     1 active+clean
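The four steps above can be replayed for any other host from ceph141. This is a sketch only: setup_admin_client is a hypothetical name, and it assumes root can reach the target over SSH and that the files live at the paths used in this article.

```shell
# Hypothetical helper: repeat steps 1 to 4 for another host, run from ceph141.
setup_admin_client() (
    host=$1
    # Step 1: apt source and signing key.
    scp /etc/apt/sources.list.d/ceph.list "$host:/etc/apt/sources.list.d/" || exit 1
    scp /etc/apt/trusted.gpg.d/ceph.release.gpg "$host:/etc/apt/trusted.gpg.d/" || exit 1
    # Step 2: install the client tools on the target.
    ssh "$host" 'apt update && apt -y install ceph-common' || exit 1
    # Step 3: cluster configuration and admin keyring.
    scp /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring "$host:/etc/ceph/" || exit 1
    # Step 4: confirm the target can talk to the cluster.
    ssh "$host" 'ceph -s'
)

# Usage:
#   setup_admin_client ceph143
```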
5 Managing node labels
Label the nodes according to their role to make later management easier.
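1. Add labels to the ceph142 and ceph143 nodes. _admin is a special label recognized by cephadm, while wzy666 is an arbitrary custom label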
[root@ceph141 ~]# ceph orch host label add ceph142 _admin
Added label _admin to host ceph142
[root@ceph141 ~]# ceph orch host label add ceph143 _admin
Added label _admin to host ceph143
[root@ceph141 ~]# ceph orch host label add ceph143 wzy666
Added label wzy666 to host ceph143
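To confirm from a script that a label actually landed on a host, the ceph orch host ls table can be parsed. The sketch below assumes the layout shown earlier (host, address, then labels) and accepts labels separated by either commas or spaces, since that detail may differ between releases; host_has_label is a hypothetical helper. A word in the STATUS column would also be matched, so treat the result as a convenience check only.

```shell
# Hypothetical helper: read `ceph orch host ls` text on stdin and succeed if
# the given host carries the given label.
host_has_label() {
    awk -v host="$1" -v label="$2" '
        $1 == host {
            # Everything after HOST and ADDR is treated as labels.
            for (i = 3; i <= NF; i++) {
                n = split($i, parts, ",")
                for (j = 1; j <= n; j++)
                    if (parts[j] == label) found = 1
            }
        }
        END {
            if (found) exit 0
            exit 1
        }
    '
}

# Usage:
#   ceph orch host ls | host_has_label ceph143 wzy666 && echo "label present"
```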
2. Remove labels. The label name must match exactly: admin is rejected, _admin works
[root@ceph141 ~]# ceph orch host label rm ceph143 wzy666
Removed label wzy666 from host ceph143
[root@ceph141 ~]# ceph orch host label rm ceph143 admin
Host ceph143 does not have label 'admin'. Please use 'ceph orch host ls' to list all the labels.
[root@ceph141 ~]# ceph orch host label rm ceph143 _admin
Removed label _admin from host ceph143