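K8s Study Notes: StatefulSet in Practice

These notes follow the Geek Time (极客时间) course 《深入剖析Kubernetes》.

On the principle that reading something a thousand times is no match for doing it once by hand, I work through the exercise myself and record the results.

Corresponding chapter: 20 | 深入理解StatefulSet(三):有状态应用实践 (Understanding StatefulSet in Depth, Part 3: Stateful Applications in Practice)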

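MySQL cluster

Description

  1. It is a MySQL cluster using master-slave replication
  2. It has one master node and multiple slave nodes
  3. The slave nodes can be scaled horizontally
  4. All writes can only be performed on the master node
  5. Reads can be performed on any node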

Create the ConfigMap for my.cnf

apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql
  labels:
    app: mysql
data:
  master.cnf: |
    [mysqld]
    log-bin
  slave.cnf: |
    [mysqld]
    super-read-only

$ kubectl get configmap
mysql   2      19s

Create two Services

apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  clusterIP: None
  selector:
    app: mysql
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-read
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  selector:
    app: mysql

$ kubectl get service
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
mysql        ClusterIP   None            <none>        3306/TCP   4s
mysql-read   ClusterIP   10.105.117.44   <none>        3306/TCP   4s
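
The `mysql` Service is headless (`clusterIP: None`), so it gets no virtual IP. Instead, each Pod of a StatefulSet that names it as `serviceName` gets a stable DNS name of the form `<pod name>.<service name>`. A minimal sketch that lists those names; `pod_dns_names` is a hypothetical helper written only for illustration, not part of any Kubernetes tooling:

```shell
# List the stable DNS names of the Pods behind a headless Service.
# pod_dns_names is a hypothetical helper, used here only for illustration.
# Usage: pod_dns_names <statefulset name> <service name> <replicas>
pod_dns_names() (
  i=0
  while [ "$i" -lt "$3" ]; do
    echo "$1-$i.$2"
    i=$((i + 1))
  done
)

pod_dns_names mysql mysql 3
# prints:
# mysql-0.mysql
# mysql-1.mysql
# mysql-2.mysql
```

These short names resolve from within the same namespace. `mysql-read`, by contrast, is an ordinary ClusterIP Service that spreads connections over every Pod labeled `app: mysql`.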

Create the StatefulSet

This YAML file is very long, so I copied it out by hand without fully understanding it yet.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  serviceName: mysql
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
    spec:
      initContainers:
      - name: init-mysql
        image: mysql:5.7
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Generate mysql server-id from pod ordinal index.
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          echo [mysqld] > /mnt/conf.d/server-id.cnf
          # Add an offset to avoid reserved server-id=0 value.
          echo server-id=$((100 + $ordinal)) >> /mnt/conf.d/server-id.cnf
          # Copy appropriate conf.d files from config-map to emptyDir.
          if [[ $ordinal -eq 0 ]]; then
            cp /mnt/config-map/master.cnf /mnt/conf.d/
          else
            cp /mnt/config-map/slave.cnf /mnt/conf.d/
          fi
        volumeMounts:
        - name: conf
          mountPath: /mnt/conf.d
        - name: config-map
          mountPath: /mnt/config-map
      - name: clone-mysql
        image: 10.160.15.5/google_containers/xtrabackup:1.0
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Skip the clone if data already exists.
          [[ -d /var/lib/mysql/mysql ]] && exit 0
          # Skip the clone on master (ordinal index 0).
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          [[ $ordinal -eq 0 ]] && exit 0
          # Clone data from previous peer.
          ncat --recv-only mysql-$(($ordinal-1)).mysql 3307 | xbstream -x -C /var/lib/mysql
          # Prepare the backup.
          xtrabackup --prepare --target-dir=/var/lib/mysql
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
      containers:
      - name: mysql
        image: mysql:5.7
        env:
        - name: MYSQL_ALLOW_EMPTY_PASSWORD
          value: "1"
        ports:
        - name: mysql
          containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
        livenessProbe:
          exec:
            command: ["mysqladmin", "ping"]
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe:
          exec:
            # Check we can execute queries over TCP (skip-networking is off).
            command: ["mysql", "-h", "127.0.0.1", "-e", "SELECT 1"]
          initialDelaySeconds: 5
          periodSeconds: 2
          timeoutSeconds: 1
      - name: xtrabackup
        image: 10.160.15.5/google_containers/xtrabackup:1.0
        ports:
        - name: xtrabackup
          containerPort: 3307
        command:
        - bash
        - "-c"
        - |
          set -ex
          cd /var/lib/mysql
          # Determine binlog position of cloned data, if any.
          if [[ -f xtrabackup_slave_info ]]; then
            # XtraBackup already generated a partial "CHANGE MASTER TO" query
            # because we're cloning from an existing slave.
            mv xtrabackup_slave_info change_master_to.sql.in
            # Ignore xtrabackup_binlog_info in this case (it's useless).
            rm -f xtrabackup_binlog_info
          elif [[ -f xtrabackup_binlog_info ]]; then
            # We're cloning directly from master. Parse binlog position.
            [[ `cat xtrabackup_binlog_info` =~ ^(.*?)[[:space:]]+(.*?)$ ]] || exit 1
            rm xtrabackup_binlog_info
            echo "CHANGE MASTER TO MASTER_LOG_FILE='${BASH_REMATCH[1]}',\
                  MASTER_LOG_POS=${BASH_REMATCH[2]}" > change_master_to.sql.in
          fi
          # Check if we need to complete a clone by starting replication.
          if [[ -f change_master_to.sql.in ]]; then
            echo "Waiting for mysqld to be ready (accepting connections)"
            until mysql -h 127.0.0.1 -e "SELECT 1"; do sleep 1; done
            echo "Initializing replication from clone position"
            # In case of container restart, attempt this at-most-once.
            mv change_master_to.sql.in change_master_to.sql.orig
            mysql -h 127.0.0.1 <<EOF
          $(<change_master_to.sql.orig),
            MASTER_HOST='mysql-0.mysql',
            MASTER_USER='root',
            MASTER_PASSWORD='',
            MASTER_CONNECT_RETRY=10;
          START SLAVE;
          EOF
          fi
          # Start a server to send backups when requested by peers.
          exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \
            "xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root"
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
      volumes:
      - name: conf
        emptyDir: {}
      - name: config-map
        configMap:
          name: mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "nfs-client"
      resources:
        requests:
          storage: 10Gi
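
To make the `init-mysql` step easier to follow, here is a small sketch of what it derives from the Pod hostname: the ordinal, the `server-id` (ordinal plus 100), and which file from the ConfigMap gets copied. It is a POSIX sh rewrite for illustration rather than the script above, which uses a bash regex; the function names `ordinal_of`, `server_id_for` and `cnf_for` are hypothetical.

```shell
# Sketch of what init-mysql derives from the Pod hostname.
# ordinal_of, server_id_for and cnf_for are hypothetical helper names.

# Print the trailing ordinal of a hostname such as "mysql-2"; fail if absent.
ordinal_of() (
  n=${1##*-}                        # keep what follows the last "-"
  [ "$n" != "$1" ] || exit 1        # the name contains no "-" at all
  case "$n" in
    ''|*[!0-9]*) exit 1 ;;          # suffix is empty or not purely digits
  esac
  echo "$n"
)

# server-id is the ordinal plus 100, which avoids the reserved value 0.
server_id_for() (
  n=$(ordinal_of "$1") || exit 1
  echo $((100 + n))
)

# Ordinal 0 is the master; every other ordinal is a slave.
cnf_for() (
  n=$(ordinal_of "$1") || exit 1
  if [ "$n" -eq 0 ]; then echo master.cnf; else echo slave.cnf; fi
)

for host in mysql-0 mysql-1 mysql-2; do
  echo "$host server-id=$(server_id_for "$host") $(cnf_for "$host")"
done
# prints:
# mysql-0 server-id=100 master.cnf
# mysql-1 server-id=101 slave.cnf
# mysql-2 server-id=102 slave.cnf
```

The role of each Pod is thus decided purely by its ordinal, which is why a single Pod template can serve both the master and the slaves.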

Checks

  • PV and PVC
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                  STORAGECLASS   REASON   AGE
pvc-ba5e0a68-59f6-4812-b3cc-919898c72f21   10Gi       RWO            Delete           Bound    default/data-mysql-2   nfs-client              116s
pvc-cd69aa2b-6e92-48da-998b-363fb3c796e9   10Gi       RWO            Delete           Bound    default/data-mysql-0   nfs-client              106m
pvc-f89ba84c-f87c-498a-ac19-99eca1645f4f   10Gi       RWO            Delete           Bound    default/data-mysql-1   nfs-client              105m

$ kubectl get pvc
NAME           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-mysql-0   Bound    pvc-cd69aa2b-6e92-48da-998b-363fb3c796e9   10Gi       RWO            nfs-client     106m
data-mysql-1   Bound    pvc-f89ba84c-f87c-498a-ac19-99eca1645f4f   10Gi       RWO            nfs-client     105m
data-mysql-2   Bound    pvc-ba5e0a68-59f6-4812-b3cc-919898c72f21   10Gi       RWO            nfs-client     110s
  • pod
$ kubectl get pods -l app=mysql
NAME      READY   STATUS    RESTARTS   AGE
mysql-0   2/2     Running   0          104m
mysql-1   2/2     Running   1          3m27s
mysql-2   2/2     Running   1          3m7s

$ kubectl describe pod mysql-2
...
Events:
  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  Warning  FailedScheduling  3m35s                  default-scheduler  running "VolumeBinding" filter plugin for pod "mysql-2": pod has unbound immediate PersistentVolumeClaims
  Warning  FailedScheduling  3m35s                  default-scheduler  persistentvolumeclaim "data-mysql-2" not found
  Normal   Scheduled         3m32s                  default-scheduler  Successfully assigned default/mysql-2 to node1
  Normal   Pulled            3m32s                  kubelet, node1     Container image "mysql:5.7" already present on machine
  Normal   Created           3m32s                  kubelet, node1     Created container init-mysql
  Normal   Started           3m31s                  kubelet, node1     Started container init-mysql
  Normal   Started           3m30s                  kubelet, node1     Started container clone-mysql
  Normal   Pulled            3m30s                  kubelet, node1     Container image "10.160.15.5/google_containers/xtrabackup:1.0" already present on machine
  Normal   Created           3m30s                  kubelet, node1     Created container clone-mysql
  Normal   Pulled            2m55s                  kubelet, node1     Container image "10.160.15.5/google_containers/xtrabackup:1.0" already present on machine
  Normal   Created           2m55s                  kubelet, node1     Created container xtrabackup
  Normal   Started           2m55s                  kubelet, node1     Started container xtrabackup
  Normal   Pulled            2m54s (x2 over 2m55s)  kubelet, node1     Container image "mysql:5.7" already present on machine
  Normal   Created           2m54s (x2 over 2m55s)  kubelet, node1     Created container mysql
  Normal   Started           2m54s (x2 over 2m55s)  kubelet, node1     Started container mysql
  • service
$ kubectl get service
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
mysql        ClusterIP   None            <none>        3306/TCP   21h
mysql-read   ClusterIP   10.105.117.44   <none>        3306/TCP   21h

$ kubectl get statefulset
NAME    READY   AGE
mysql   3/3     107m
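
The claim names in the `kubectl get pvc` output follow the pattern `<volumeClaimTemplates name>-<Pod name>`, which is how a recreated Pod finds its old volume again. A tiny sketch of that naming rule; `pvc_for_pod` is a hypothetical helper used only for illustration:

```shell
# Build the PVC name that a StatefulSet Pod uses for one claim template.
# pvc_for_pod is a hypothetical helper, used here only for illustration.
# Usage: pvc_for_pod <claim template name> <pod name>
pvc_for_pod() {
  echo "$1-$2"
}

for pod in mysql-0 mysql-1 mysql-2; do
  echo "$pod -> $(pvc_for_pod data "$pod")"
done
# prints:
# mysql-0 -> data-mysql-0
# mysql-1 -> data-mysql-1
# mysql-2 -> data-mysql-2
```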

Verify the cluster

  • Create test data
$ kubectl run mysql-client --image=mysql:5.7 -i --rm --restart=Never -- \
    mysql -h mysql-0.mysql <<EOF
CREATE DATABASE test;
CREATE TABLE test.messages (message VARCHAR(250));
INSERT INTO test.messages VALUES ('hello');
EOF

If you don't see a command prompt, try pressing enter.
pod "mysql-client" deleted
  • Read the test data
$ kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never -- mysql -h mysql-read -e "SELECT * FROM test.messages"
If you don't see a command prompt, try pressing enter.
+---------+
| message |
+---------+
| hello   |
+---------+
pod "mysql-client" deleted
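
The two commands above use different hosts on purpose: writes go to the master by its stable name `mysql-0.mysql`, while reads go through the `mysql-read` Service. A minimal sketch of that routing rule, assuming a client that knows whether an operation is a read or a write; `host_for` is a hypothetical helper:

```shell
# Pick the MySQL host for an operation type.
# host_for is a hypothetical helper, used here only for illustration.
host_for() {
  case "$1" in
    write) echo "mysql-0.mysql" ;;  # only the master (ordinal 0) accepts writes
    read)  echo "mysql-read" ;;     # Service over all app=mysql Pods, master included
    *)     echo "unknown operation: $1" >&2; return 1 ;;
  esac
}

echo "writes -> $(host_for write)"
echo "reads  -> $(host_for read)"
# prints:
# writes -> mysql-0.mysql
# reads  -> mysql-read
```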

Scale out the MySQL cluster

  • Change replicas to 5
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  serviceName: mysql
  replicas: 5
  • pods
$ kubectl get pods -l app=mysql
NAME      READY   STATUS    RESTARTS   AGE
mysql-0   2/2     Running   0          4h51m
mysql-1   2/2     Running   1          3h10m
mysql-2   2/2     Running   1          3h10m
mysql-3   2/2     Running   1          119s
mysql-4   2/2     Running   0          71s
  • Read the data
$ kubectl run mysql-client --image=mysql:5.7 -it --rm --restart=Never -- mysql -h mysql-4.mysql -e "SELECT * FROM test.messages"
+---------+
| message |
+---------+
| hello   |
+---------+
pod "mysql-client" deleted
  • Try to write
$ kubectl run mysql-client --image=mysql:5.7 -it --rm --restart=Never -- mysql -h mysql-4.mysql -e "INSERT INTO test.messages VALUES ('test')"
ERROR 1290 (HY000) at line 1: The MySQL server is running with the --super-read-only option so it cannot execute this statement
pod "mysql-client" deleted
pod default/mysql-client terminated (Error)
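
The new Pods already hold the `hello` row because `clone-mysql` copies data from the previous peer before MySQL starts, unless the data directory is already populated. A sketch of the peer selection only; `clone_source` is a hypothetical helper, and the actual transfer in the StatefulSet is done with `ncat` and `xbstream`:

```shell
# Which peer does the replica with ordinal N clone its data from?
# clone_source is a hypothetical helper, used here only for illustration.
clone_source() (
  n=$1
  case "$n" in
    ''|*[!0-9]*) exit 1 ;;          # the ordinal must be a non-negative integer
  esac
  if [ "$n" -eq 0 ]; then
    echo "none"                     # the master has nothing to clone from
  else
    echo "mysql-$((n - 1)).mysql"   # previous peer, as in clone-mysql
  fi
)

for n in 3 4; do
  echo "mysql-$n clones from $(clone_source "$n")"
done
# prints:
# mysql-3 clones from mysql-2.mysql
# mysql-4 clones from mysql-3.mysql
```

Cloning from the previous peer rather than always from `mysql-0` keeps the backup load off the master as the cluster grows.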

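Analysis

The experiment got done while solving all sorts of problems along the way. Looking back, here is my understanding of what the YAML file does.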

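  1. initContainers defines two containers: init-mysql (mysql image) and clone-mysql (xtrabackup image)
    1. init-mysql uses the hostname to decide whether the node is set up as master or slave: it derives server-id from the ordinal and copies master.cnf or slave.cnf
    2. clone-mysql does nothing on the master; on a slave it copies data from the previous peer, but only if no data exists yet
  2. containers likewise defines two containers: mysql and xtrabackup
    1. The mysql container starts the MySQL service directly on the configuration and data prepared by the init containers
    2. The xtrabackup container starts replication from the cloned position when needed, then listens on port 3307 so that other slaves can copy data from it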
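
Before it starts serving backups, the xtrabackup sidecar turns the binlog coordinates left by the clone into a partial `CHANGE MASTER TO` statement. A sketch of that parsing step in POSIX sh, assuming the input is one line of the form `<binlog file> <position>`; `change_master_sql` is a hypothetical helper, and the sample file name and position are made-up values:

```shell
# Read one line shaped like xtrabackup_binlog_info from stdin and print the
# partial CHANGE MASTER TO statement built from it.
# change_master_sql is a hypothetical helper, used here only for illustration.
change_master_sql() (
  # Accept input with or without a trailing newline.
  read -r file pos _rest || [ -n "$file" ] || exit 1
  # Both the binlog file name and the position are required.
  [ -n "$file" ] && [ -n "$pos" ] || exit 1
  echo "CHANGE MASTER TO MASTER_LOG_FILE='$file', MASTER_LOG_POS=$pos"
)

# Sample input: the file name and position are illustration values only.
printf 'mysql-0-bin.000003\t154\n' | change_master_sql
# prints:
# CHANGE MASTER TO MASTER_LOG_FILE='mysql-0-bin.000003', MASTER_LOG_POS=154
```

In the StatefulSet, the sidecar then appends `MASTER_HOST='mysql-0.mysql'` and the credentials to this statement and runs `START SLAVE`.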

Update

Change the image to 10.160.15.5/pub/mysql:5.7

The new image is in fact the same image as the old one (identical MD5), so the update causes no other problems.

$ kubectl get pods -l app=mysql
NAME      READY   STATUS        RESTARTS   AGE
mysql-0   2/2     Running       0          26h
mysql-1   2/2     Running       1          24h
mysql-2   2/2     Running       1          24h
mysql-3   2/2     Running       1          21h
mysql-4   2/2     Terminating   0          46s

$ kubectl get pods -l app=mysql
NAME      READY   STATUS        RESTARTS   AGE
mysql-0   2/2     Running       0          26h
mysql-1   2/2     Running       1          24h
mysql-2   2/2     Running       1          24h
mysql-3   2/2     Terminating   1          21h
mysql-4   2/2     Running       0          46s

$ kubectl get pods -l app=mysql
NAME      READY   STATUS        RESTARTS   AGE
mysql-0   2/2     Running       0          26h
mysql-1   2/2     Terminating   1          25h
mysql-2   2/2     Running       0          17s
mysql-3   2/2     Running       0          68s
mysql-4   2/2     Running       0          2m19s

$ kubectl get pods -l app=mysql
NAME      READY   STATUS     RESTARTS   AGE
mysql-0   0/2     Init:0/2   0          1s
mysql-1   2/2     Running    0          47s
mysql-2   2/2     Running    0          90s
mysql-3   2/2     Running    0          2m21s
mysql-4   2/2     Running    0          3m32s

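As the output shows, after the StatefulSet is updated, the Pod with the highest ordinal (mysql-4) is updated first. Once it is running again, mysql-3 is updated, and so on down to mysql-0.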
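
This matches the default RollingUpdate strategy of a StatefulSet, which replaces Pods one at a time from the largest ordinal to the smallest. A small sketch that prints that order; `update_order` is a hypothetical helper used only for illustration:

```shell
# Print Pod names in the order a StatefulSet rolling update replaces them.
# update_order is a hypothetical helper, used here only for illustration.
# Usage: update_order <statefulset name> <replicas>
update_order() (
  i=$(( $2 - 1 ))                   # start at the highest ordinal
  while [ "$i" -ge 0 ]; do
    echo "$1-$i"
    i=$((i - 1))
  done
)

update_order mysql 5
# prints:
# mysql-4
# mysql-3
# mysql-2
# mysql-1
# mysql-0
```

Updating the master last means the slaves are replaced while the master is still serving writes.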

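Summary

A StatefulSet is really a special kind of Deployment, except that every Pod instance of this "Deployment" carries a unique, fixed ordinal in its name. The order of these ordinals fixes the topology of the Pods; the DNS record that corresponds to each ordinal fixes how the Pod is reached; the PV that corresponds to each ordinal binds the Pod to its persistent storage. So when a Pod is deleted and recreated, all of this "state" stays the same.

As you can see, because Pod names are fixed and the PVs and PVCs are fixed, deleting a Pod results in a new Pod with the same name that reuses the original PV and PVC. The Pod therefore comes back up with the same contents.

However, because the master and the slaves are defined in a single StatefulSet, the implementation feels very complicated.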