Binary deployment of Kubernetes on Ubuntu
1. Environment preparation
1.1 Installation plan
Role | IP | Components |
---|---|---|
k8s-master1 | 192.168.80.45 | etcd, api-server, controller-manager, scheduler, docker |
k8s-node01 | 192.168.80.46 | etcd, kubelet, kube-proxy, docker |
k8s-node02 | 192.168.80.47 | etcd, kubelet, kube-proxy, docker |
Software versions:
Software | Version | Notes |
---|---|---|
OS | Ubuntu 16.04.6 LTS | |
Kubernetes | 1.19.11 | |
Etcd | v3.4.15 | |
Docker | 19.03.9 | |
1.2 System settings
# 1. Set the hostname (run only the line that matches each node)
hostnamectl set-hostname k8s-master1
hostnamectl set-hostname k8s-node01
hostnamectl set-hostname k8s-node02

# 2. Hostname resolution
cat >> /etc/hosts <<EOF
192.168.80.45 k8s-master1
192.168.80.46 k8s-node01
192.168.80.47 k8s-node02
EOF

# 3. Disable swap
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# 4. Pass bridged IPv4 traffic to the iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

# 5. DNS resolution
echo "nameserver 8.8.8.8" >> /etc/resolv.conf

# 6. Time synchronization
apt install ntpdate -y
ntpdate ntp1.aliyun.com
# Add the following entry with `crontab -e`:
# */30 * * * * /usr/sbin/ntpdate -u ntp1.aliyun.com >> /var/log/ntpdate.log 2>&1

# 7. Log directory
mkdir -p /var/log/kubernetes
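The swap step only survives a reboot if no active swap entry is left in /etc/fstab. A minimal sketch for checking that, assuming bash and awk; `active_swap_entries` is a hypothetical helper name, not part of any tool used in this guide:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list fstab entries that would re-enable swap on reboot.
# Prints every non-comment line whose filesystem type (3rd field) is "swap".
active_swap_entries() {
    local fstab="${1:-/etc/fstab}"
    awk '$1 !~ /^#/ && $3 == "swap"' "$fstab"
}

# Usage: empty output means no swap entry remains active.
# active_swap_entries /etc/fstab
```

If the function prints anything, the sed command in step 3 did not match that line and it has to be commented out by hand.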
2. Install Docker
mkdir -p $HOME/k8s-install && cd $HOME/k8s-install

# 1. Download and install
wget https://download.docker.com/linux/static/stable/x86_64/docker-19.03.9.tgz
tar zxvf docker-19.03.9.tgz
mv docker/* /usr/bin
docker version

# 2. systemd unit (\$MAINPID is escaped so that the heredoc writes it literally)
cat > /lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP \$MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target
EOF

# 3. Start and enable on boot
systemctl daemon-reload
systemctl start docker
systemctl status docker
systemctl enable docker
3. TLS certificates
3.1 Certificate tools
mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 -O /usr/local/bin/cfssl
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 -O /usr/local/bin/cfssljson
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64 -O /usr/bin/cfssl-certinfo
chmod +x /usr/bin/cfssl* /usr/local/bin/cfssl*
3.2 Certificate overview
The generated CA certificate and key files are as follows:
Component | Certificates | Keys | Notes |
---|---|---|---|
etcd | ca.pem, etcd.pem | etcd-key.pem | |
apiserver | ca.pem, apiserver.pem | apiserver-key.pem | |
controller-manager | ca.pem, kube-controller-manager.pem | ca-key.pem, kube-controller-manager-key.pem | kubeconfig |
scheduler | ca.pem, kube-scheduler.pem | kube-scheduler-key.pem | kubeconfig |
kubelet | ca.pem | | kubeconfig + token |
kube-proxy | ca.pem, kube-proxy.pem | kube-proxy-key.pem | kubeconfig |
kubectl | ca.pem, admin.pem | admin-key.pem | |
3.3 CA certificate
CA: Certificate Authority
mkdir -p /root/ssl && cd /root/ssl

# 1. CA configuration file
cat > ca-config.json <<EOF
{
  "signing": {
    "default": { "expiry": "87600h" },
    "profiles": {
      "kubernetes": {
        "usages": ["signing", "key encipherment", "server auth", "client auth"],
        "expiry": "87600h"
      }
    }
  }
}
EOF

# 2. CA certificate signing request (CSR) file
cat > ca-csr.json <<EOF
{
  "CN": "kubernetes",
  "key": { "algo": "rsa", "size": 2048 },
  "names": [
    { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "System" }
  ],
  "ca": { "expiry": "87600h" }
}
EOF

# 3. Generate the CA certificate and key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
ls ca*
# ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem
3.4 etcd
Note: the IP addresses in hosts are the IPs of the etcd cluster hosts.
# 1. CSR file
cat > etcd-csr.json <<EOF
{
  "CN": "etcd",
  "hosts": ["127.0.0.1", "localhost", "192.168.80.45", "192.168.80.46", "192.168.80.47"],
  "key": { "algo": "rsa", "size": 2048 },
  "names": [
    { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "etcd", "OU": "System" }
  ]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
3.5 kube-apiserver certificate
Note: the IP addresses in hosts are the IPs of the Kubernetes master hosts plus the service IP of the kubernetes Service (normally the first IP of the range given to kube-apiserver via service-cluster-ip-range, for example 10.254.0.1).
# 1. CSR file
cat > apiserver-csr.json <<EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "localhost",
    "192.168.80.1",
    "192.168.80.2",
    "192.168.80.45",
    "192.168.80.46",
    "192.168.80.47",
    "10.254.0.1",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": { "algo": "rsa", "size": 2048 },
  "names": [
    { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "System" }
  ]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver
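The service IP in the hosts list depends on the chosen service range, so it can be derived instead of typed by hand. A minimal sketch in bash, assuming an IPv4 CIDR; `first_service_ip` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first usable address of an IPv4 CIDR,
# i.e. the ClusterIP that the "kubernetes" Service is expected to receive.
first_service_ip() {
    local cidr="$1" ip prefix a b c d n mask
    ip="${cidr%/*}"
    prefix="${cidr#*/}"
    IFS=. read -r a b c d <<< "$ip"
    n=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    n=$(( (n & mask) + 1 ))   # network address + 1
    echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

first_service_ip 10.254.0.0/16
```

For the range used in this guide the call prints 10.254.0.1, which matches the entry in apiserver-csr.json.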
3.6 kube-controller-manager certificate
# 1. CSR file
cat > kube-controller-manager-csr.json <<EOF
{
  "CN": "system:kube-controller-manager",
  "hosts": [],
  "key": { "algo": "rsa", "size": 2048 },
  "names": [
    { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:masters", "OU": "System" }
  ]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
3.8 kube-scheduler certificate
# 1. CSR file
cat > kube-scheduler-csr.json << EOF
{
  "CN": "system:kube-scheduler",
  "hosts": [],
  "key": { "algo": "rsa", "size": 2048 },
  "names": [
    { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:masters", "OU": "System" }
  ]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
3.9 admin certificate
- kube-apiserver later uses RBAC to authorize client requests (from kubelet, kube-proxy, Pods, and so on);
- kube-apiserver predefines a number of RBAC bindings. For example, the ClusterRoleBinding cluster-admin binds Group system:masters to ClusterRole cluster-admin, which grants permission to call every kube-apiserver API;
- O sets the certificate's Group to system:masters. When kubectl uses this certificate to access kube-apiserver, authentication succeeds because the certificate is signed by the CA, and because the group system:masters is pre-authorized, access to all APIs is granted;
# 1. CSR file
cat > admin-csr.json <<EOF
{
  "CN": "admin",
  "hosts": [],
  "key": { "algo": "rsa", "size": 2048 },
  "names": [
    { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:masters", "OU": "System" }
  ]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
ls admin*
# admin.csr  admin-csr.json  admin-key.pem  admin.pem
Once the cluster is up, run kubectl get clusterrolebinding cluster-admin -o yaml to see that the subject of the ClusterRoleBinding cluster-admin has kind Group and name system:masters, and that roleRef points to ClusterRole cluster-admin. Every user or serviceAccount in the system:masters group therefore holds the cluster-admin role, which is why kubectl has full administrative rights over the cluster.
kubectl get clusterrolebinding cluster-admin -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2017-04-11T11:20:42Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: cluster-admin
  resourceVersion: "52"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/cluster-admin
  uid: e61b97b2-1ea8-11e7-8cd7-f4e9d49f8ed0
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:masters
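Because RBAC takes the user from the certificate's CN and the group from O, it can help to read both fields back from a generated certificate before using it. A minimal sketch, assuming the subject is printed in comma-separated RFC 2253 form; `subject_field` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract one attribute (CN, O, ...) from a
# comma-separated subject such as "CN=admin,OU=System,O=system:masters".
subject_field() {
    local subject="$1" key="$2" part
    local IFS=,
    for part in $subject; do
        part="${part# }"              # drop one leading space, if any
        if [[ "$part" == "$key="* ]]; then
            echo "${part#*=}"
            return 0
        fi
    done
    return 1
}

# Usage (assumes admin.pem from this section is in the current directory):
# s=$(openssl x509 -in admin.pem -noout -subject -nameopt RFC2253)
# subject_field "${s#subject=}" CN   # user seen by kube-apiserver
# subject_field "${s#subject=}" O    # group seen by kube-apiserver
```

The exact match on `key=` keeps O from matching OU and C from matching CN.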
3.10 kube-proxy certificate
- CN sets the certificate's User to system:kube-proxy;
- The ClusterRoleBinding system:node-proxier predefined by kube-apiserver binds User system:kube-proxy to ClusterRole system:node-proxier, which grants permission to call the proxy-related kube-apiserver APIs;
# 1. CSR file
cat > kube-proxy-csr.json <<EOF
{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": { "algo": "rsa", "size": 2048 },
  "names": [
    { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "System" }
  ]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
3.11 Certificate information
cfssl-certinfo -cert apiserver.pem { subject: { common_name: kubernetes, country: CN, organization: k8s, organizational_unit: System, locality: BeiJing, province: BeiJing, names: [ CN, BeiJing, BeiJing, k8s, System, kubernetes ] }, issuer: { common_name: kubernetes, country: CN, organization: k8s, organizational_unit: System, locality: BeiJing, province: BeiJing, names: [ CN, BeiJing, BeiJing, k8s, System, kubernetes ] }, serial_number: 275867496157961939649344217740970264800633176866, sans: [ localhost, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster, kubernetes.default.svc.cluster.local, 127.0.0.1, 192.168.80.1, 192.168.80.2, 192.168.80.45, 192.168.80.46, 192.168.80.47, 10.254.0.1 ], not_before: 2021-06-09T05:20:00Z, not_after: 2031-06-07T05:20:00Z, sigalg: SHA256WithRSA, authority_key_id: , subject_key_id: E3:84:0F:9C:00:07:4A:8F:5C:B2:35:45:A0:50:4D:3E:9D:C0:B4:D0, pem: -----BEGIN CERTIFICATE-----\nMIIEezCCA2OgAwIBAgIUMFJTjEXe9sDDDpPXcAiUBt5+QyIwDQYJKoZIhvcNAQEL\nBQAwZTELMAkGA1UEBhMCQ04xEDAOBgNVBAgTB0JlaUppbmcxEDAOBgNVBAcTB0Jl\naUppbmcxDDAKBgNVBAoTA2s4czEPMA0GA1UECxMGU3lzdGVtMRMwEQYDVQQDEwpr\ndWJlcm5ldGVzMB4XDTIxMDYwOTA1MjAwMFoXDTMxMDYwNzA1MjAwMFowZTELMAkG\nA1UEBhMCQ04xEDAOBgNVBAgTB0JlaUppbmcxEDAOBgNVBAcTB0JlaUppbmcxDDAK\nBgNVBAoTA2s4czEPMA0GA1UECxMGU3lzdGVtMRMwEQYDVQQDEwprdWJlcm5ldGVz\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAw0BpjZQNEd6Oqu8ubEWG\nhbdwJecOTCfdbY+VLIKEm0Tys8ZBlu7OrtZ8Rj5OAZTXil0ZJz+hvHo8YTNJJ16g\njHV88VSpfoXD5DE59PITSFwfY1lWHVctC3Ddo9CM9cU9Ty+Kf29XcrLbc/VNGZTB\ncvKXoM3b6NkBKOdKphVjUvafhKC6ls2ac5uub3uqZTpPgBs/1PvINKNZkP5U6lUV\noTBMAT+qbQ9aggA+bA+WegL3jHU78ngo1XMnsb1HfAjwKDOf66smNJ/K+YjD+Cul\ngjpyqOQKGlz5xqXUcBgIMO9djI4f5hvaMsSje1aSJ/oh5AfQbxQsGjajlS80ED08\nxwIDAQABo4IBITCCAR0wDgYDVR0PAQH/BAQDAgWgMB0GA1UdJQQWMBQGCCsGAQUF\nBwMBBggrBgEFBQcDAjAMBgNVHRMBAf8EAjAAMB0GA1UdDgQWBBTjhA+cAAdKj1yy\nNUWgUE0+ncC00DCBvgYDVR0RBIG2MIGzgglsb2NhbGhvc3SCCmt1YmVybmV0ZXOC\nEmt1YmVybmV0ZXMuZGVmYXVsdIIWa3ViZXJuZXRlcy5kZWZhdWx0LnN
2Y4Iea3Vi\nZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVygiRrdWJlcm5ldGVzLmRlZmF1bHQu\nc3ZjLmNsdXN0ZXIubG9jYWyHBH8AAAGHBMCoUAGHBMCoUAKHBMCoUC2HBMCoUC6H\nBMCoUC+HBAr+AAEwDQYJKoZIhvcNAQELBQADggEBAG+RUKp4cxz4EOqmAPiczkl2\nHciAg01RbCavoLoUWmoDDAQf7PIhQF2pLewFCwR5w6SwvCJAVdg+eHdefJ2MBtJr\nKQgbmEOBXd4Z5ZqBeSP6ViHvb1pKtRSldznZLfxjsVd0bN3na/JmS4TZ90SqLLtL\nN4CgGfTs2AfrtbtWIqewDMS9aWjBK8VePzLBmsdLddD4WYQOnl+QjdrX9bbqYRCG\nQo3CKvJ3JZqh6AJHcgKsm0702uMU/TCJwe1M8I8SpYrwA74uCBy3O9jXed1rZlrp\nRVURB6Ro7SMLjiadTJyf6AbLPMmZcPKHhZ1XG07q8Od2Kd+KVx1PxF3et6OOteE=\n-----END CERTIFICATE-----\n }
3.12 Distribute certificates
All nodes
# On the node where the certificates were generated
mkdir -p /etc/kubernetes/pki
cp *.pem /etc/kubernetes/pki
tar cvf pki.tar /etc/kubernetes/pki
scp pki.tar root@192.168.80.46:/root
scp pki.tar root@192.168.80.47:/root

# On each of the other nodes
sudo -i
cd / && mv /root/pki.tar / && tar xvf pki.tar && rm -f pki.tar
4. Install etcd
4.1 Node etcd-1
mkdir -p $HOME/k8s-install && cd $HOME/k8s-install

# 1. Download and install
wget https://github.com/etcd-io/etcd/releases/download/v3.4.15/etcd-v3.4.15-linux-amd64.tar.gz
tar zxvf etcd-v3.4.15-linux-amd64.tar.gz
mv etcd-v3.4.15-linux-amd64/{etcd,etcdctl} /usr/bin/

# 2. Configuration file
mkdir -p /etc/etcd
cat > /etc/etcd/etcd.conf << EOF
#[Member]
ETCD_NAME=etcd-1
ETCD_DATA_DIR=/var/lib/etcd/default.etcd
ETCD_LISTEN_PEER_URLS=https://192.168.80.45:2380
ETCD_LISTEN_CLIENT_URLS=https://192.168.80.45:2379,https://127.0.0.1:2379
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.80.45:2380
ETCD_ADVERTISE_CLIENT_URLS=https://192.168.80.45:2379
ETCD_INITIAL_CLUSTER=etcd-1=https://192.168.80.45:2380,etcd-2=https://192.168.80.46:2380,etcd-3=https://192.168.80.47:2380
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
ETCD_INITIAL_CLUSTER_STATE=new
EOF

# 3. systemd unit
cat > /lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/bin/etcd \
--cert-file=/etc/kubernetes/pki/etcd.pem \
--key-file=/etc/kubernetes/pki/etcd-key.pem \
--peer-cert-file=/etc/kubernetes/pki/etcd.pem \
--peer-key-file=/etc/kubernetes/pki/etcd-key.pem \
--trusted-ca-file=/etc/kubernetes/pki/ca.pem \
--peer-trusted-ca-file=/etc/kubernetes/pki/ca.pem \
--logger=zap
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

# 4. Prepare the files to clone to the other nodes
tar cvf etcd-clone.tar /usr/bin/etcd* /etc/etcd /lib/systemd/system/etcd.service
scp etcd-clone.tar root@192.168.80.46:/root
scp etcd-clone.tar root@192.168.80.47:/root
4.2 Other nodes
# 1. Unpack the cloned files
sudo -i
cd / && mv /root/etcd-clone.tar / && tar xvf etcd-clone.tar && rm -f etcd-clone.tar

# 2. Edit the configuration file
vim /etc/etcd/etcd.conf
# Change the name and every local URL to this node's values.
# The values below are for etcd-2 on 192.168.80.46; only
# ETCD_INITIAL_CLUSTER and the last two lines stay identical on all nodes.
#[Member]
ETCD_NAME=etcd-2
ETCD_DATA_DIR=/var/lib/etcd/default.etcd
ETCD_LISTEN_PEER_URLS=https://192.168.80.46:2380
ETCD_LISTEN_CLIENT_URLS=https://192.168.80.46:2379,https://127.0.0.1:2379
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.80.46:2380
ETCD_ADVERTISE_CLIENT_URLS=https://192.168.80.46:2379
ETCD_INITIAL_CLUSTER=etcd-1=https://192.168.80.45:2380,etcd-2=https://192.168.80.46:2380,etcd-3=https://192.168.80.47:2380
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
ETCD_INITIAL_CLUSTER_STATE=new
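Editing etcd.conf by hand on every node is easy to get wrong, since five values change together. As an alternative, here is a minimal sketch that renders the file from a member name and IP; `render_etcd_conf` and `INITIAL_CLUSTER` are hypothetical names, and the member list is the one used in this guide:

```shell
#!/usr/bin/env bash
# Hypothetical generator: print /etc/etcd/etcd.conf content for one member.
# Member names and IPs are the ones used in this guide.
INITIAL_CLUSTER="etcd-1=https://192.168.80.45:2380,etcd-2=https://192.168.80.46:2380,etcd-3=https://192.168.80.47:2380"

render_etcd_conf() {
    local name="$1" ip="$2"
    cat << EOF
#[Member]
ETCD_NAME=${name}
ETCD_DATA_DIR=/var/lib/etcd/default.etcd
ETCD_LISTEN_PEER_URLS=https://${ip}:2380
ETCD_LISTEN_CLIENT_URLS=https://${ip}:2379,https://127.0.0.1:2379
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://${ip}:2380
ETCD_ADVERTISE_CLIENT_URLS=https://${ip}:2379
ETCD_INITIAL_CLUSTER=${INITIAL_CLUSTER}
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
ETCD_INITIAL_CLUSTER_STATE=new
EOF
}

# Usage on 192.168.80.47:
# render_etcd_conf etcd-3 192.168.80.47 > /etc/etcd/etcd.conf
```

The member name passed in must be one of the names listed in ETCD_INITIAL_CLUSTER, otherwise the member will not be accepted into the cluster.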
4.3 Start
# 1. Start and enable on boot
systemctl daemon-reload
systemctl start etcd
systemctl status etcd
systemctl enable etcd

# 2. Member list
etcdctl member list --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem --write-out=table
+------------------+---------+--------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |  NAME  |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+--------+----------------------------+----------------------------+------------+
| 46bc5ad35e418584 | started | etcd-1 | https://192.168.80.45:2380 | https://192.168.80.45:2379 |      false |
| 8f347c1327049bc8 | started | etcd-3 | https://192.168.80.47:2380 | https://192.168.80.47:2379 |      false |
| b01e7a29099f3eb8 | started | etcd-2 | https://192.168.80.46:2380 | https://192.168.80.46:2379 |      false |
+------------------+---------+--------+----------------------------+----------------------------+------------+

# 3. Health status
etcdctl endpoint health --cacert=/etc/kubernetes/pki/ca.pem --cert=/etc/kubernetes/pki/etcd.pem --key=/etc/kubernetes/pki/etcd-key.pem --cluster --write-out=table
+----------------------------+--------+-------------+-------+
|          ENDPOINT          | HEALTH |    TOOK     | ERROR |
+----------------------------+--------+-------------+-------+
| https://192.168.80.47:2379 |   true | 20.973639ms |       |
| https://192.168.80.46:2379 |   true | 29.842299ms |       |
| https://192.168.80.45:2379 |   true | 30.564766ms |       |
+----------------------------+--------+-------------+-------+

# 4. Check the leader
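5. Master node
Kubernetes master node components:
- kube-apiserver
- kube-scheduler
- kube-controller-manager
- kubelet (not strictly required, but needed in practice)
- kube-proxy (not strictly required, but needed in practice)
5.1 Installation preparation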
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md
mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
wget https://dl.k8s.io/v1.19.11/kubernetes-server-linux-amd64.tar.gz
tar zxvf kubernetes-server-linux-amd64.tar.gz
cd kubernetes/server/bin
cp kube-apiserver kube-scheduler kube-controller-manager kubectl kubelet kube-proxy /usr/bin
5.2 apiserver
5.2.1 TLS Bootstrapping Token
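Enable the TLS bootstrapping mechanism:
TLS bootstrapping: once the master's apiserver enables TLS authentication, the kubelet and kube-proxy on every node must present a valid CA-signed certificate to talk to kube-apiserver. With many nodes, issuing these client certificates by hand is a lot of work and makes scaling the cluster harder. To simplify the process, Kubernetes introduced TLS bootstrapping to issue client certificates automatically: the kubelet requests a certificate from the apiserver as a low-privilege user, and the certificate is signed dynamically on the cluster side. This approach is strongly recommended on nodes. It is currently used mainly for the kubelet; kube-proxy still uses a single certificate that we issue centrally.
Generate the bootstrap token used in the TLS bootstrapping workflow: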
BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')

# Format: token,user name,UID,group
cat > /etc/kubernetes/token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,system:node-bootstrapper
EOF
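A malformed token.csv is an easy mistake to make and hard to spot later, so a quick format check can be worthwhile. A minimal sketch, assuming the token is generated as above (16 random bytes, so 32 hex characters); `valid_token_line` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical check: a token.csv line needs four comma-separated fields
# (token,user,uid,group) and, as generated in this guide, a token made of
# 32 lowercase hex characters.
valid_token_line() {
    local line="$1" token user uid groups
    IFS=, read -r token user uid groups <<< "$line"
    [[ "$token" =~ ^[0-9a-f]{32}$ ]] && [[ -n "$user" ]] \
        && [[ "$uid" =~ ^[0-9]+$ ]] && [[ -n "$groups" ]]
}

# Usage:
# valid_token_line "$(head -n 1 /etc/kubernetes/token.csv)" && echo "token.csv looks fine"
```

The function only checks the shape of the line; it cannot tell whether the apiserver has actually loaded that token.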
5.2.2 Configuration file
--service-cluster-ip-range=10.254.0.0/16: Service IP range
cat > /etc/kubernetes/kube-apiserver.conf << EOF
KUBE_APISERVER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/var/log/kubernetes \\
--etcd-servers=https://192.168.80.45:2379,https://192.168.80.46:2379,https://192.168.80.47:2379 \\
--bind-address=192.168.80.45 \\
--secure-port=6443 \\
--advertise-address=192.168.80.45 \\
--allow-privileged=true \\
--service-cluster-ip-range=10.254.0.0/16 \\
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \\
--authorization-mode=RBAC,Node \\
--enable-bootstrap-token-auth=true \\
--token-auth-file=/etc/kubernetes/token.csv \\
--service-node-port-range=30000-32767 \\
--kubelet-client-certificate=/etc/kubernetes/pki/apiserver.pem \\
--kubelet-client-key=/etc/kubernetes/pki/apiserver-key.pem \\
--tls-cert-file=/etc/kubernetes/pki/apiserver.pem \\
--tls-private-key-file=/etc/kubernetes/pki/apiserver-key.pem \\
--client-ca-file=/etc/kubernetes/pki/ca.pem \\
--service-account-key-file=/etc/kubernetes/pki/ca-key.pem \\
--service-account-issuer=api \\
--service-account-signing-key-file=/etc/kubernetes/pki/apiserver-key.pem \\
--etcd-cafile=/etc/kubernetes/pki/ca.pem \\
--etcd-certfile=/etc/kubernetes/pki/etcd.pem \\
--etcd-keyfile=/etc/kubernetes/pki/etcd-key.pem \\
--requestheader-client-ca-file=/etc/kubernetes/pki/ca.pem \\
--proxy-client-cert-file=/etc/kubernetes/pki/apiserver.pem \\
--proxy-client-key-file=/etc/kubernetes/pki/apiserver-key.pem \\
--requestheader-allowed-names=kubernetes \\
--requestheader-extra-headers-prefix=X-Remote-Extra- \\
--requestheader-group-headers=X-Remote-Group \\
--requestheader-username-headers=X-Remote-User \\
--enable-aggregator-routing=true \\
--audit-log-maxage=30 \\
--audit-log-maxbackup=3 \\
--audit-log-maxsize=100 \\
--audit-log-path=/var/log/kubernetes/k8s-audit.log"
EOF
5.2.3 Start on boot
# 1. systemd unit
cat > /lib/systemd/system/kube-apiserver.service << EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/etc/kubernetes/kube-apiserver.conf
ExecStart=/usr/bin/kube-apiserver \$KUBE_APISERVER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

# 2. Start
systemctl daemon-reload
systemctl start kube-apiserver
systemctl status kube-apiserver
systemctl enable kube-apiserver
5.3 controller-manager
5.3.1 kubeconfig file
KUBE_CONFIG=/etc/kubernetes/kube-controller-manager.kubeconfig
KUBE_APISERVER=https://192.168.80.45:6443

kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/pki/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBE_CONFIG}

kubectl config set-credentials kube-controller-manager \
  --client-certificate=/etc/kubernetes/pki/kube-controller-manager.pem \
  --client-key=/etc/kubernetes/pki/kube-controller-manager-key.pem \
  --embed-certs=true \
  --kubeconfig=${KUBE_CONFIG}

kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-controller-manager \
  --kubeconfig=${KUBE_CONFIG}

kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
5.3.2 Configuration file
--cluster-cidr=10.244.0.0/16: Pod IP range
--service-cluster-ip-range=10.254.0.0/16: Service IP range
cat > /etc/kubernetes/kube-controller-manager.conf << EOF
KUBE_CONTROLLER_MANAGER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/var/log/kubernetes \\
--leader-elect=true \\
--kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
--bind-address=127.0.0.1 \\
--allocate-node-cidrs=true \\
--cluster-cidr=10.244.0.0/16 \\
--service-cluster-ip-range=10.254.0.0/16 \\
--cluster-signing-cert-file=/etc/kubernetes/pki/ca.pem \\
--cluster-signing-key-file=/etc/kubernetes/pki/ca-key.pem \\
--root-ca-file=/etc/kubernetes/pki/ca.pem \\
--service-account-private-key-file=/etc/kubernetes/pki/ca-key.pem \\
--cluster-signing-duration=87600h0m0s"
EOF
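The Pod range and the Service range must not overlap, and a typo in either flag is easy to miss. A minimal sketch of an overlap check in bash for IPv4 CIDRs; `cidr_range` and `cidrs_overlap` are hypothetical helper names:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first and last address of an IPv4 CIDR
# as two integers separated by a space.
cidr_range() {
    local ip="${1%/*}" prefix="${1#*/}" a b c d n mask
    IFS=. read -r a b c d <<< "$ip"
    n=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    echo "$(( n & mask )) $(( (n & mask) | (~mask & 0xFFFFFFFF) ))"
}

# Succeeds (exit status 0) when the two CIDRs share at least one address.
cidrs_overlap() {
    local s1 e1 s2 e2
    read -r s1 e1 <<< "$(cidr_range "$1")"
    read -r s2 e2 <<< "$(cidr_range "$2")"
    (( s1 <= e2 && s2 <= e1 ))
}

# Usage with the ranges from this guide:
# cidrs_overlap 10.244.0.0/16 10.254.0.0/16 && echo "ranges overlap" || echo "ranges are disjoint"
```

The same check can be run against the node network (192.168.80.0/24 is an assumption about the netmask, which this guide does not state).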
5.3.3 Start on boot
cat > /lib/systemd/system/kube-controller-manager.service << EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/kube-controller-manager.conf
ExecStart=/usr/bin/kube-controller-manager \$KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kube-controller-manager
systemctl status kube-controller-manager
systemctl enable kube-controller-manager
5.4 scheduler
5.4.1 kubeconfig file
KUBE_CONFIG=/etc/kubernetes/kube-scheduler.kubeconfig
KUBE_APISERVER=https://192.168.80.45:6443

kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/pki/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBE_CONFIG}

kubectl config set-credentials kube-scheduler \
  --client-certificate=/etc/kubernetes/pki/kube-scheduler.pem \
  --client-key=/etc/kubernetes/pki/kube-scheduler-key.pem \
  --embed-certs=true \
  --kubeconfig=${KUBE_CONFIG}

kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-scheduler \
  --kubeconfig=${KUBE_CONFIG}

kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
5.4.2 Configuration file
cat > /etc/kubernetes/kube-scheduler.conf << EOF
KUBE_SCHEDULER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/var/log/kubernetes \\
--leader-elect \\
--kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\
--bind-address=127.0.0.1"
EOF
5.4.3 Start on boot
cat > /lib/systemd/system/kube-scheduler.service << EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/kube-scheduler.conf
ExecStart=/usr/bin/kube-scheduler \$KUBE_SCHEDULER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kube-scheduler
systemctl enable kube-scheduler
systemctl status kube-scheduler
5.5 kubelet
5.5.1 Parameter file
cat > /etc/kubernetes/kubelet-config.yml << EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
cgroupDriver: cgroupfs
clusterDNS:
- 10.254.0.2
clusterDomain: cluster.local
failSwapOn: false
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.pem
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
maxOpenFiles: 1000000
maxPods: 110
EOF
5.5.2 kubeconfig file
BOOTSTRAP_TOKEN=$(cat /etc/kubernetes/token.csv | awk -F, '{print $1}')
KUBE_CONFIG=/etc/kubernetes/bootstrap.kubeconfig
KUBE_APISERVER=https://192.168.80.45:6443

# Generate the kubelet bootstrap kubeconfig file
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/pki/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBE_CONFIG}

kubectl config set-credentials kubelet-bootstrap \
  --token=${BOOTSTRAP_TOKEN} \
  --kubeconfig=${KUBE_CONFIG}

kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet-bootstrap \
  --kubeconfig=${KUBE_CONFIG}

kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
5.5.3 Configuration file
Note: the file given by --kubeconfig=/etc/kubernetes/kubelet.kubeconfig is generated automatically when the node joins the cluster.
cat > /etc/kubernetes/kubelet.conf << EOF
KUBELET_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/var/log/kubernetes \\
--hostname-override=k8s-master1 \\
--network-plugin=cni \\
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
--bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \\
--config=/etc/kubernetes/kubelet-config.yml \\
--cert-dir=/etc/kubernetes/pki \\
--pod-infra-container-image=mirrorgooglecontainers/pause-amd64:3.1"
EOF
5.5.4 Authorize the kubelet-bootstrap user to request certificates
This prevents the error: failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User kubelet-bootstrap cannot create resource certificatesigningrequests in API group certificates.k8s.io at the cluster scope
kubectl create clusterrolebinding kubelet-bootstrap \
  --clusterrole=system:node-bootstrapper \
  --user=kubelet-bootstrap
5.5.5 Start on boot
cat > /lib/systemd/system/kubelet.service << EOF
[Unit]
Description=Kubernetes Kubelet
After=docker.service

[Service]
EnvironmentFile=/etc/kubernetes/kubelet.conf
ExecStart=/usr/bin/kubelet \$KUBELET_OPTS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kubelet
systemctl enable kubelet
systemctl status kubelet
5.5.6 Join the cluster
# List the kubelet certificate requests
kubectl get csr
NAME                                                   AGE   SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-qlwTsndFeZfb4r45MpY8b0fRyf6NnH-Y42cCuWCF2dk   25s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending

# Approve the request (use the name shown by the previous command)
kubectl certificate approve node-csr-qlwTsndFeZfb4r45MpY8b0fRyf6NnH-Y42cCuWCF2dk

# List the requests again
kubectl get csr
NAME                                                   AGE   SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-qlwTsndFeZfb4r45MpY8b0fRyf6NnH-Y42cCuWCF2dk   53m   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Approved,Issued

# List the nodes (the node stays NotReady because the network plugin is not deployed yet)
kubectl get node
NAME          STATUS     ROLES    AGE    VERSION
k8s-master1   NotReady   <none>   4m8s   v1.19.11
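Copying CSR names by hand is error-prone because they are long and random. A minimal sketch that picks the pending requests out of the `kubectl get csr` table, assuming the column layout shown above with CONDITION as the last column; `pending_csrs` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical filter: read `kubectl get csr` output on stdin and print the
# names of requests whose last column (CONDITION) is exactly "Pending".
pending_csrs() {
    awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Usage (review the list before approving anything):
# kubectl get csr | pending_csrs
# kubectl get csr | pending_csrs | xargs -r kubectl certificate approve
```

Approving every pending request is only reasonable on a cluster where all bootstrapping nodes are known; otherwise check each requestor first.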
5.6 kube-proxy
5.6.1 Parameter file
clusterCIDR: 10.244.0.0/16: Pod IP range, consistent with --cluster-cidr of kube-controller-manager (kube-proxy uses this value to tell Pod traffic apart from other traffic, so it is the Pod range rather than the Service range)
cat > /etc/kubernetes/kube-proxy-config.yml << EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
clientConnection:
  kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
hostnameOverride: k8s-master1
clusterCIDR: 10.244.0.0/16
EOF
5.6.2 kubeconfig file
KUBE_CONFIG=/etc/kubernetes/kube-proxy.kubeconfig
KUBE_APISERVER=https://192.168.80.45:6443

kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/pki/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBE_CONFIG}

kubectl config set-credentials kube-proxy \
  --client-certificate=/etc/kubernetes/pki/kube-proxy.pem \
  --client-key=/etc/kubernetes/pki/kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=${KUBE_CONFIG}

kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=${KUBE_CONFIG}

kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
5.6.3 Configuration file
cat > /etc/kubernetes/kube-proxy.conf << EOF
KUBE_PROXY_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/var/log/kubernetes \\
--config=/etc/kubernetes/kube-proxy-config.yml"
EOF
5.6.4 Start on boot
cat > /lib/systemd/system/kube-proxy.service << EOF
[Unit]
Description=Kubernetes Proxy
After=network.target

[Service]
EnvironmentFile=/etc/kubernetes/kube-proxy.conf
ExecStart=/usr/bin/kube-proxy \$KUBE_PROXY_OPTS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start kube-proxy
systemctl enable kube-proxy
systemctl status kube-proxy
5.7 Authorize apiserver to access kubelet
mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
cat > apiserver-to-kubelet-rbac.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-apiserver-to-kubelet
rules:
  - apiGroups:
      - ""
    resources:
      - nodes/proxy
      - nodes/stats
      - nodes/log
      - nodes/spec
      - nodes/metrics
      - pods/log
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-apiserver
  namespace: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-apiserver-to-kubelet
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kubernetes
EOF

kubectl apply -f apiserver-to-kubelet-rbac.yaml
5.8 集群管理
5.8.1 kubeconfig file
mkdir -p /root/.kube
KUBE_CONFIG=/root/.kube/config
KUBE_APISERVER=https://192.168.80.45:6443

kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/pki/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=${KUBE_CONFIG}

kubectl config set-credentials cluster-admin \
  --client-certificate=/etc/kubernetes/pki/admin.pem \
  --client-key=/etc/kubernetes/pki/admin-key.pem \
  --embed-certs=true \
  --kubeconfig=${KUBE_CONFIG}

kubectl config set-context default \
  --cluster=kubernetes \
  --user=cluster-admin \
  --kubeconfig=${KUBE_CONFIG}

kubectl config use-context default --kubeconfig=${KUBE_CONFIG}
5.8.2 Cluster configuration
kubectl config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://192.168.80.45:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: cluster-admin
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: cluster-admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
5.8.3 Cluster status
kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-1               Healthy   {"health":"true"}
etcd-2               Healthy   {"health":"true"}
etcd-0               Healthy   {"health":"true"}
5.9 Command completion
apt install -y bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc
6. Worker nodes
Kubernetes worker node components:
- kubelet
- kube-proxy
6.1 Prepare the clone (run on the master node)
mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
tar cvf worker-node-clone.tar /usr/bin/{kubelet,kube-proxy} /lib/systemd/system/{kubelet,kube-proxy}.service /etc/kubernetes/kubelet* /etc/kubernetes/kube-proxy* /etc/kubernetes/pki /etc/kubernetes/bootstrap.kubeconfig
scp worker-node-clone.tar root@192.168.80.46:/root
scp worker-node-clone.tar root@192.168.80.47:/root
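6.2 Clone onto the node
cd / && mv /root/worker-node-clone.tar / && tar xvf worker-node-clone.tar && rm -f worker-node-clone.tar

# Remove the files that were generated automatically when the master's CSR was approved; they are regenerated for this node later
rm -f /etc/kubernetes/kubelet.kubeconfig
rm -f /etc/kubernetes/pki/kubelet*

# Log directory
mkdir -p /var/log/kubernetes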
6.3 Adjust the configuration
Change the values to the actual node name.
# kubelet
vim /etc/kubernetes/kubelet.conf
--hostname-override=k8s-node01

# kube-proxy
vim /etc/kubernetes/kube-proxy-config.yml
hostnameOverride: k8s-node01
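The two edits can also be scripted so that both files always receive the same node name. A minimal sketch using GNU sed, assuming the file layout written earlier in this guide; `set_node_name` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the node name into the kubelet options file and
# the kube-proxy parameter file. Paths are passed in so that the function can
# be tried on copies first.
set_node_name() {
    local node="$1" kubelet_conf="$2" proxy_conf="$3"
    # Replace only the value after --hostname-override= (up to whitespace or a quote).
    sed -i "s|--hostname-override=[^[:space:]\"]*|--hostname-override=${node}|" "$kubelet_conf"
    # Replace the whole hostnameOverride line.
    sed -i "s|^hostnameOverride:.*|hostnameOverride: ${node}|" "$proxy_conf"
}

# Usage on k8s-node01:
# set_node_name k8s-node01 /etc/kubernetes/kubelet.conf /etc/kubernetes/kube-proxy-config.yml
```

Passing "$(hostname)" as the first argument works as well, provided the hostname was set as described in section 1.2.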
6.4 Start on boot
systemctl daemon-reload
systemctl start kubelet kube-proxy
systemctl enable kubelet kube-proxy
systemctl status kubelet kube-proxy
6.5 Join the cluster (run on the master node)
# 1. Certificate requests from the nodes
kubectl get csr
NAME                                                   AGE   SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-j51DeSAxg95ZULzX0rm8RBIjUQU1O3d4gxBYcAsZkHk   28s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending
node-csr-oK3jPn4eE3vsNrO88g4vSq2Z66k-8nhEJhDAKPgWZ5k   14s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending
node-csr-qlwTsndFeZfb4r45MpY8b0fRyf6NnH-Y42cCuWCF2dk   14m   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Approved,Issued

# 2. Approve the requests
kubectl certificate approve node-csr-j51DeSAxg95ZULzX0rm8RBIjUQU1O3d4gxBYcAsZkHk
kubectl certificate approve node-csr-oK3jPn4eE3vsNrO88g4vSq2Z66k-8nhEJhDAKPgWZ5k

# 3. Cluster nodes
kubectl get node
NAME          STATUS     ROLES    AGE   VERSION
k8s-master1   NotReady   <none>   45m   v1.19.11
k8s-node01    NotReady   <none>   6s    v1.19.11
k8s-node02    NotReady   <none>   10s   v1.19.11

# 4. Set labels, which changes the node roles shown
kubectl label node k8s-master1 node-role.kubernetes.io/master=
kubectl label node k8s-node01 node-role.kubernetes.io/node=
kubectl label node k8s-node02 node-role.kubernetes.io/node=

kubectl get node
NAME          STATUS     ROLES    AGE     VERSION
k8s-master1   NotReady   master   49m     v1.19.11
k8s-node01    NotReady   node     3m45s   v1.19.11
k8s-node02    NotReady   node     3m49s   v1.19.11

# 5. Set a taint so that ordinary Pods are not scheduled on the master node
kubectl taint nodes k8s-master1 node-role.kubernetes.io/master=:NoSchedule

kubectl describe node k8s-master1
Taints:             node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
7. CNI network
# Node status
kubectl get node
NAME          STATUS     ROLES    AGE     VERSION
k8s-master1   NotReady   master   49m     v1.19.11
k8s-node01    NotReady   node     3m45s   v1.19.11
k8s-node02    NotReady   node     3m49s   v1.19.11

# The kubelet log shows that no network plugin is installed
journalctl -u kubelet -f
Jun 02 14:24:29 k8s-master1 kubelet[75636]: W0602 14:24:29.172144 75636 cni.go:239] Unable to update cni config: no networks found in /etc/cni/net.d
Jun 02 14:24:32 k8s-master1 kubelet[75636]: E0602 14:24:32.958021 75636 kubelet.go:2129] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
The IP range configured in the network plugin must match --cluster-cidr of kube-controller-manager.
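To avoid copying the range by hand, the value can be read back from the options file written in section 5.3.2. A minimal sketch, assuming that file uses the `--flag=value` form; `flag_value` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of the first "--<flag>=<value>"
# occurrence in a component options file.
flag_value() {
    local flag="$1" file="$2"
    grep -o -- "--${flag}=[^[:space:]\"]*" "$file" | head -n 1 | cut -d= -f2-
}

# Usage:
# POD_CIDR=$(flag_value cluster-cidr /etc/kubernetes/kube-controller-manager.conf)
# echo "Use ${POD_CIDR} as the Pod network in the CNI manifest"
```

The value printed here is the one that belongs in CALICO_IPV4POOL_CIDR or in the flannel Network field, depending on which plugin is chosen below.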
7.1 Install the CNI plugins
Run on all nodes.
mkdir -p $HOME/k8s-install/network && cd $_
wget https://github.com/containernetworking/plugins/releases/download/v0.9.1/cni-plugins-linux-amd64-v0.9.1.tgz
mkdir -p /opt/cni/bin
tar zxvf cni-plugins-linux-amd64-v0.9.1.tgz -C /opt/cni/bin
7.2 calico
Calico is a pure layer-3 data center networking solution and currently one of the mainstream network plugins for Kubernetes.
Note: if the pods stay Pending because the images cannot be pulled, pull the images manually onto the nodes first.
mkdir -p $HOME/k8s-install/network && cd $HOME/k8s-install/network

# 1. Download the manifest
wget https://docs.projectcalico.org/manifests/calico.yaml

# The CIDR value must match "--cluster-cidr=10.244.0.0/16" of kube-controller-manager
vi calico.yaml
            # The default IPv4 pool to create on startup if none exists. Pod IPs will be
            # chosen from this range. Changing this value after installation will have
            # no effect. This should fall within `--cluster-cidr`.
            - name: CALICO_IPV4POOL_CIDR
              value: "10.244.0.0/16"

# 2. Install the network plugin
kubectl apply -f calico.yaml

# 3. Check that the pods have started
kubectl get pod -n kube-system
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-7f4f5bf95d-tgklk   1/1     Running   0          2m7s
calico-node-fwv5x                          1/1     Running   0          2m8s
calico-node-ttt2c                          1/1     Running   0          2m8s
calico-node-xjvjf                          1/1     Running   0          2m8s

# 4. The nodes are now Ready
kubectl get node
NAME          STATUS   ROLES    AGE   VERSION
k8s-master1   Ready    master   65m   v1.19.11
k8s-node01    Ready    node     20m   v1.19.11
k8s-node02    Ready    node     20m   v1.19.11
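The manual edit of CALICO_IPV4POOL_CIDR can also be scripted. The sketch below assumes GNU sed and assumes that the downloaded manifest contains the "- name: CALICO_IPV4POOL_CIDR" line immediately followed by its "value:" line, either commented out or already active. The function name set_calico_pool_cidr is hypothetical; check the result with grep before applying the manifest.

```shell
#!/usr/bin/env bash
# Hypothetical helper: activate CALICO_IPV4POOL_CIDR in a Calico manifest ($1)
# and set it to the given CIDR ($2). Assumes GNU sed.
set_calico_pool_cidr() {
  local file=$1 cidr=$2
  sed -i -E \
    -e 's|^([[:space:]]*)#[[:space:]]*(- name: CALICO_IPV4POOL_CIDR)|\1\2|' \
    -e "/CALICO_IPV4POOL_CIDR/{n;s|^([[:space:]]*)#[[:space:]]{3}value:.*|\1  value: \"${cidr}\"|;s|^([[:space:]]*)value:.*|\1value: \"${cidr}\"|;}" \
    "$file"
}

# Usage:
#   set_calico_pool_cidr calico.yaml 10.244.0.0/16
#   grep -A1 'CALICO_IPV4POOL_CIDR' calico.yaml
```

The first expression uncomments the name line. The second moves to the following line and rewrites the value whether or not it was commented out, so running the function twice gives the same result.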
7.3 flannel
Flannel is another network plugin. Install either Flannel or Calico, not both.
mkdir -p $HOME/k8s-install/network && cd $HOME/k8s-install/network

# raw.githubusercontent.com may only be reachable through a proxy
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# Make sure the network matches --cluster-cidr before applying
vim kube-flannel.yml
      "Network": "10.244.0.0/16",

kubectl apply -f kube-flannel.yml

kubectl get pod -n kube-system
NAME                    READY   STATUS    RESTARTS   AGE
kube-flannel-ds-8qnnx   1/1     Running   0          10s
kube-flannel-ds-979lc   1/1     Running   0          16m
kube-flannel-ds-kgmgg   1/1     Running   0          16m

kubectl get node
NAME          STATUS   ROLES    AGE   VERSION
k8s-master1   Ready    master   85m   v1.19.11
k8s-node01    Ready    node     40m   v1.19.11
k8s-node02    Ready    node     40m   v1.19.11
Source file:
cat <<EOF > kube-flannel.yml --- apiVersion: policy/v1beta1 kind: PodSecurityPolicy metadata: name: psp.flannel.unprivileged annotations: seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default spec: privileged: false volumes: - configMap - secret - emptyDir - hostPath allowedHostPaths: - pathPrefix: /etc/cni/net.d - pathPrefix: /etc/kube-flannel - pathPrefix: /run/flannel readOnlyRootFilesystem: false # Users and groups runAsUser: rule: RunAsAny supplementalGroups: rule: RunAsAny fsGroup: rule: RunAsAny # Privilege Escalation allowPrivilegeEscalation: false defaultAllowPrivilegeEscalation: false # Capabilities allowedCapabilities: ['NET_ADMIN'] defaultAddCapabilities: [] requiredDropCapabilities: [] # Host namespaces hostPID: false hostIPC: false hostNetwork: true hostPorts: - min: 0 max: 65535 # SELinux seLinux: # SELinux is unused in CaaSP rule: 'RunAsAny' --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: name: flannel rules: - apiGroups: ['extensions'] resources: ['podsecuritypolicies'] verbs: ['use'] resourceNames: ['psp.flannel.unprivileged'] - apiGroups: - resources: - pods verbs: - get - apiGroups: - resources: - nodes verbs: - list - watch - apiGroups: - resources: - nodes/status verbs: - patch --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: flannel roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: flannel subjects: - kind: ServiceAccount name: flannel namespace: kube-system --- apiVersion: v1 kind: ServiceAccount metadata: name: flannel namespace: kube-system --- kind: ConfigMap apiVersion: v1 metadata: name: kube-flannel-cfg namespace: kube-system labels: tier: node app: flannel data: cni-conf.json: | { name: cbr0, cniVersion: 0.3.1, plugins: [ 
{ type: flannel, delegate: { hairpinMode: true, isDefaultGateway: true } }, { type: portmap, capabilities: { portMappings: true } } ] } net-conf.json: | { Network: 10.244.0.0/16, Backend: { Type: vxlan } } --- apiVersion: apps/v1 kind: DaemonSet metadata: name: kube-flannel-ds-amd64 namespace: kube-system labels: tier: node app: flannel spec: selector: matchLabels: app: flannel template: metadata: labels: tier: node app: flannel spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: beta.kubernetes.io/os operator: In values: - linux - key: beta.kubernetes.io/arch operator: In values: - amd64 hostNetwork: true tolerations: - operator: Exists effect: NoSchedule serviceAccountName: flannel initContainers: - name: install-cni image: quay.io/coreos/flannel:v0.11.0-amd64 command: - cp args: - -f - /etc/kube-flannel/cni-conf.json - /etc/cni/net.d/10-flannel.conflist volumeMounts: - name: cni mountPath: /etc/cni/net.d - name: flannel-cfg mountPath: /etc/kube-flannel/ containers: - name: kube-flannel image: quay.io/coreos/flannel:v0.11.0-amd64 command: - /opt/bin/flanneld args: - --ip-masq - --kube-subnet-mgr resources: requests: cpu: 100m memory: 50Mi limits: cpu: 100m memory: 50Mi securityContext: privileged: false capabilities: add: [NET_ADMIN] env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace volumeMounts: - name: run mountPath: /run/flannel - name: flannel-cfg mountPath: /etc/kube-flannel/ volumes: - name: run hostPath: path: /run/flannel - name: cni hostPath: path: /etc/cni/net.d - name: flannel-cfg configMap: name: kube-flannel-cfg --- apiVersion: apps/v1 kind: DaemonSet metadata: name: kube-flannel-ds-arm64 namespace: kube-system labels: tier: node app: flannel spec: selector: matchLabels: app: flannel template: metadata: labels: tier: node app: flannel spec: affinity: nodeAffinity: 
requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: beta.kubernetes.io/os operator: In values: - linux - key: beta.kubernetes.io/arch operator: In values: - arm64 hostNetwork: true tolerations: - operator: Exists effect: NoSchedule serviceAccountName: flannel initContainers: - name: install-cni image: quay.io/coreos/flannel:v0.11.0-arm64 command: - cp args: - -f - /etc/kube-flannel/cni-conf.json - /etc/cni/net.d/10-flannel.conflist volumeMounts: - name: cni mountPath: /etc/cni/net.d - name: flannel-cfg mountPath: /etc/kube-flannel/ containers: - name: kube-flannel image: quay.io/coreos/flannel:v0.11.0-arm64 command: - /opt/bin/flanneld args: - --ip-masq - --kube-subnet-mgr resources: requests: cpu: 100m memory: 50Mi limits: cpu: 100m memory: 50Mi securityContext: privileged: false capabilities: add: [NET_ADMIN] env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace volumeMounts: - name: run mountPath: /run/flannel - name: flannel-cfg mountPath: /etc/kube-flannel/ volumes: - name: run hostPath: path: /run/flannel - name: cni hostPath: path: /etc/cni/net.d - name: flannel-cfg configMap: name: kube-flannel-cfg --- apiVersion: apps/v1 kind: DaemonSet metadata: name: kube-flannel-ds-arm namespace: kube-system labels: tier: node app: flannel spec: selector: matchLabels: app: flannel template: metadata: labels: tier: node app: flannel spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: beta.kubernetes.io/os operator: In values: - linux - key: beta.kubernetes.io/arch operator: In values: - arm hostNetwork: true tolerations: - operator: Exists effect: NoSchedule serviceAccountName: flannel initContainers: - name: install-cni image: quay.io/coreos/flannel:v0.11.0-arm command: - cp args: - -f - /etc/kube-flannel/cni-conf.json - /etc/cni/net.d/10-flannel.conflist volumeMounts: - 
name: cni mountPath: /etc/cni/net.d - name: flannel-cfg mountPath: /etc/kube-flannel/ containers: - name: kube-flannel image: quay.io/coreos/flannel:v0.11.0-arm command: - /opt/bin/flanneld args: - --ip-masq - --kube-subnet-mgr resources: requests: cpu: 100m memory: 50Mi limits: cpu: 100m memory: 50Mi securityContext: privileged: false capabilities: add: [NET_ADMIN] env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace volumeMounts: - name: run mountPath: /run/flannel - name: flannel-cfg mountPath: /etc/kube-flannel/ volumes: - name: run hostPath: path: /run/flannel - name: cni hostPath: path: /etc/cni/net.d - name: flannel-cfg configMap: name: kube-flannel-cfg --- apiVersion: apps/v1 kind: DaemonSet metadata: name: kube-flannel-ds-ppc64le namespace: kube-system labels: tier: node app: flannel spec: selector: matchLabels: app: flannel template: metadata: labels: tier: node app: flannel spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: beta.kubernetes.io/os operator: In values: - linux - key: beta.kubernetes.io/arch operator: In values: - ppc64le hostNetwork: true tolerations: - operator: Exists effect: NoSchedule serviceAccountName: flannel initContainers: - name: install-cni image: quay.io/coreos/flannel:v0.11.0-ppc64le command: - cp args: - -f - /etc/kube-flannel/cni-conf.json - /etc/cni/net.d/10-flannel.conflist volumeMounts: - name: cni mountPath: /etc/cni/net.d - name: flannel-cfg mountPath: /etc/kube-flannel/ containers: - name: kube-flannel image: quay.io/coreos/flannel:v0.11.0-ppc64le command: - /opt/bin/flanneld args: - --ip-masq - --kube-subnet-mgr resources: requests: cpu: 100m memory: 50Mi limits: cpu: 100m memory: 50Mi securityContext: privileged: false capabilities: add: [NET_ADMIN] env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: 
fieldPath: metadata.namespace volumeMounts: - name: run mountPath: /run/flannel - name: flannel-cfg mountPath: /etc/kube-flannel/ volumes: - name: run hostPath: path: /run/flannel - name: cni hostPath: path: /etc/cni/net.d - name: flannel-cfg configMap: name: kube-flannel-cfg --- apiVersion: apps/v1 kind: DaemonSet metadata: name: kube-flannel-ds-s390x namespace: kube-system labels: tier: node app: flannel spec: selector: matchLabels: app: flannel template: metadata: labels: tier: node app: flannel spec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: beta.kubernetes.io/os operator: In values: - linux - key: beta.kubernetes.io/arch operator: In values: - s390x hostNetwork: true tolerations: - operator: Exists effect: NoSchedule serviceAccountName: flannel initContainers: - name: install-cni image: quay.io/coreos/flannel:v0.11.0-s390x command: - cp args: - -f - /etc/kube-flannel/cni-conf.json - /etc/cni/net.d/10-flannel.conflist volumeMounts: - name: cni mountPath: /etc/cni/net.d - name: flannel-cfg mountPath: /etc/kube-flannel/ containers: - name: kube-flannel image: quay.io/coreos/flannel:v0.11.0-s390x command: - /opt/bin/flanneld args: - --ip-masq - --kube-subnet-mgr resources: requests: cpu: 100m memory: 50Mi limits: cpu: 100m memory: 50Mi securityContext: privileged: false capabilities: add: [NET_ADMIN] env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name - name: POD_NAMESPACE valueFrom: fieldRef: fieldPath: metadata.namespace volumeMounts: - name: run mountPath: /run/flannel - name: flannel-cfg mountPath: /etc/kube-flannel/ volumes: - name: run hostPath: path: /run/flannel - name: cni hostPath: path: /etc/cni/net.d - name: flannel-cfg configMap: name: kube-flannel-cfg EOF
8. Addons
8.1 CoreDNS
CoreDNS resolves Service names inside the cluster.
Installation method 1:
Troubleshooting reference: https://blog.51cto.com/hexiaoshuai/2812394
Official repository: https://github.com/coredns/deployment/tree/master/kubernetes
apt install jq -y
mkdir -p $HOME/k8s-install/coredns && cd $HOME/k8s-install/coredns
git clone https://github.com/coredns/deployment.git
cd deployment/kubernetes

export CLUSTER_DNS_SVC_IP=10.254.0.2
export CLUSTER_DNS_DOMAIN=cluster.local

# Edit coredns.yaml.sed and remove the "loop" line

# Deploy
./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
Installation method 2:
# Download the manifest
wget https://storage.googleapis.com/kubernetes-the-hard-way/coredns.yaml

# Change the cluster IP in the file
  clusterIP: 10.254.0.2

# Apply the manifest
kubectl apply -f coredns.yaml

# Check the status
kubectl get pods -n kube-system | grep coredns
coredns-7bb48b4bc5-42j9n   1/1   Running   0   3m36s
coredns-7bb48b4bc5-8ppbl   1/1   Running   0   3m36s

# Check the status
kubectl get pods -n kube-system | grep coredns
coredns-746fcb4bc5-nts2k   1/1   Running   0   6m2s

# Verify (the author notes problems with busybox 1.28.4)
kubectl run -it --rm dns-test --image=busybox:1.28.4 /bin/sh
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes
Server:    10.254.0.2
Address:   10.254.0.2:53

Name:      kubernetes.default.svc.cluster.local
Address:   10.0.0.1
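The clusterIP chosen for the DNS Service has to lie inside the Service IP range of the apiserver, otherwise the Service cannot be created. The sketch below is a pure-bash membership check. The range 10.254.0.0/16 is an assumption inferred from the addresses 10.254.0.1 and 10.254.0.2 used in this guide (the actual --service-cluster-ip-range is not shown in this section), and the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 when the IP ($1) lies inside the CIDR ($2), non-zero otherwise.
ip_in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/}
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}

# Usage (the range is an assumed value, replace it with your own):
#   ip_in_cidr 10.254.0.2 10.254.0.0/16 && echo "DNS IP is inside the service range"
```

The same check can be used to confirm that the pod CIDR of the network plugin falls within --cluster-cidr.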
DNS troubleshooting:
# DNS service
kubectl get svc -n kube-system
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.254.0.2   <none>        53/UDP,53/TCP,9153/TCP   13m

# Check that the endpoints are populated
kubectl get endpoints kube-dns -n kube-system
NAME       ENDPOINTS                                              AGE
kube-dns   10.244.85.194:53,10.244.85.194:53,10.244.85.194:9153   13m

# Enable query logging in CoreDNS
CoreDNS configuration parameters:
errors: writes errors to standard output.
health: exposes a health check at http://localhost:8080/health; the pod is restarted when it reports unhealthy.
ready: once all plugins have finished loading, an endpoint on port 8181 returns HTTP status 200.
kubernetes: CoreDNS answers DNS queries based on the IPs of Kubernetes Services and Pods.
prometheus: enables the CoreDNS metrics endpoint at http://localhost:9153/metrics when configured.
forward: queries for names outside the Kubernetes cluster are forwarded to the predefined resolvers (/etc/resolv.conf).
cache: enables caching with a 30-second TTL.
loop: detects simple forwarding loops and stops the CoreDNS process when one is found.
reload: watches the CoreDNS configuration and reloads it when it changes.
loadbalance: DNS load balancer, round_robin by default.

# Edit the CoreDNS configuration
kubectl edit configmap coredns -n kube-system
apiVersion: v1
data:
  Corefile: |
    .:53 {
        log    # newly added
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
          max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"Corefile":".:53 {\n errors\n health {\n lameduck 5s\n }\n ready\n kubernetes cluster.local in-addr.arpa ip6.arpa {\n fallthrough in-addr.arpa ip6.arpa\n }\n prometheus :9153\n forward . /etc/resolv.conf {\n max_concurrent 1000\n }\n cache 30\n loop\n reload\n loadbalance\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"coredns","namespace":"kube-system"}}
  creationTimestamp: "2021-05-13T11:57:45Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "38460"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: c62a856d-1fc3-4fe9-b5f1-3ca0dbeb39c1
Rollback (requires internet access; applies to installation method 1):
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/rollback.sh
chmod +x rollback.sh

export CLUSTER_DNS_SVC_IP=10.254.0.2
export CLUSTER_DNS_DOMAIN=cluster.local

# Internet access is recommended for this step
./rollback.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
kubectl delete --namespace=kube-system deployment coredns
8.2 Dashboard
GitHub: https://github.com/kubernetes/dashboard/blob/master/aio/deploy/recommended.yaml
If an image cannot be pulled automatically, pull it manually with docker pull <image-name>.
mkdir -p $HOME/k8s-install/dashboard && cd $HOME/k8s-install/dashboard

# 1. Download and install (use the raw URL; the github.com/.../blob/... URL returns an HTML page)
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.1/aio/deploy/recommended.yaml
kubectl apply -f recommended.yaml

# 2. Check the pod status
kubectl get pods -n kubernetes-dashboard -o wide
NAME                                         READY   STATUS              RESTARTS   AGE   IP       NODE          NOMINATED NODE   READINESS GATES
dashboard-metrics-scraper-5b8896d7fc-58fgt   0/1     ContainerCreating   0          7s    <none>   k8s-node01    <none>           <none>
kubernetes-dashboard-7b5d774449-tn7hk        0/1     ContainerCreating   0          7s    <none>   k8s-master1   <none>           <none>

# 3. Check the service status
kubectl get svc -n kubernetes-dashboard -o wide
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE   SELECTOR
dashboard-metrics-scraper   ClusterIP   10.254.14.1      <none>        8000/TCP   24m   k8s-app=dashboard-metrics-scraper
kubernetes-dashboard        ClusterIP   10.254.219.125   <none>        443/TCP    24m   k8s-app=kubernetes-dashboard

# 4. Change the service type to NodePort
kubectl edit svc kubernetes-dashboard -n kubernetes-dashboard
  type: ClusterIP  =>  type: NodePort

kubectl get svc -n kubernetes-dashboard -o wide
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE     SELECTOR
dashboard-metrics-scraper   ClusterIP   10.254.14.1      <none>        8000/TCP        3h30m   k8s-app=dashboard-metrics-scraper
kubernetes-dashboard        NodePort    10.254.219.125   <none>        443:31639/TCP   3h30m   k8s-app=kubernetes-dashboard

# 5. Create a service account and bind it to the built-in cluster-admin cluster role
kubectl create serviceaccount dashboard-admin -n kube-system
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin

# 6. Get the access token
kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
Name:         dashboard-admin-token-xwd72
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: dashboard-admin
              kubernetes.io/service-account.uid: 013e9f84-827f-4dc7-81b3-874a28bfebc6

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1310 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6InNQRElCQTlPRUZ5SU54STQ1QWllLXlKMTFCcmZieG0wVTJnRlpzYlBNLXcifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4teHdkNzIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiMDEzZTlmODQtODI3Zi00ZGM3LTgxYjMtODc0YTI4YmZlYmM2Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.O-DI-0IlLFP2pDRKzQYJrZeDAnVvW1IjU-iVwGzvwID7BH0v6kXfWnti07qm8VkuGFJtpuQsmrf6v4sUeRDhr95kZlEVV8Rxnes6oixrkXdk3fR4xreh4lh6ZgCzbER6xI8pMG-j9KNjTRdY6gQPJuOThtI9ab13dpTT5AYpggA2O98DFfgcJ_DzD05hhk6TghOdoro00msHRSUrsEiH0CYa_3PiyPlkvmmY3MlJPsBTdO2pCDzcrjQ2L5EaJAvSh6OodkRY6ymOwfcbfPs3WwSocCEfwkogYOCAQhMC4NU3Jea_hoeFqzLdS1PK5R2rPT-wqemwjDKn0E6jUv6juw

# 7. Open the dashboard
https://192.168.80.45:31639
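Changing the Service type through kubectl edit requires an interactive editor. As a possible alternative, kubectl patch can make the same change from a script. The sketch below wraps the two commands in functions. The KUBECTL variable and both function names are assumptions added for illustration, so that the commands can be previewed with KUBECTL="echo kubectl" before they are run against the cluster.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden (for example with "echo kubectl") to preview commands.
KUBECTL=${KUBECTL:-kubectl}

# Switch the dashboard Service to NodePort without opening an editor.
dashboard_to_nodeport() {
  $KUBECTL patch svc kubernetes-dashboard -n kubernetes-dashboard \
    -p '{"spec":{"type":"NodePort"}}'
}

# Print the node port that was allocated to the Service.
dashboard_nodeport() {
  $KUBECTL get svc kubernetes-dashboard -n kubernetes-dashboard \
    -o jsonpath='{.spec.ports[0].nodePort}'
}

# Usage:
#   dashboard_to_nodeport
#   echo "https://192.168.80.45:$(dashboard_nodeport)"
```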
9. High Availability
Role | IP | Components | Notes |
---|---|---|---|
k8s-master1 | 192.168.80.45 | etcd, api-server, controller-manager, scheduler, kubelet, kube-proxy, docker | |
k8s-node01 | 192.168.80.46 | etcd, kubelet, kube-proxy, docker | |
k8s-node02 | 192.168.80.47 | etcd, kubelet, kube-proxy, docker | |
k8s-master2 | 192.168.80.49 | etcd, api-server, controller-manager, scheduler, kubelet, kube-proxy, docker | New node |
9.1 Preparation (Master-1)
9.1.1 Update the kube-apiserver certificate
The following steps are needed when the IP of the new node is not yet listed in the certificate:
mkdir -p /root/ssl && cd /root/ssl

# 1. Certificate signing request file
cat > apiserver-csr.json <<EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "localhost",
    "192.168.80.1",
    "192.168.80.2",
    "192.168.80.3",
    "192.168.80.45",
    "192.168.80.46",
    "192.168.80.47",
    "192.168.80.48",
    "192.168.80.49",
    "10.254.0.1",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver

# 3. Distribute the certificate
cp apiserver*.pem /etc/kubernetes/pki
scp apiserver*.pem root@k8s-node01:/root
scp apiserver*.pem root@k8s-node02:/root

# 4. On each node, move the certificate into place
chown root:root /root/apiserver*.pem
mv /root/apiserver*.pem /etc/kubernetes/pki

# 5. Restart the apiserver
systemctl restart kube-apiserver
systemctl status kube-apiserver
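After regenerating the certificate it is worth confirming that the new address really appears among the Subject Alternative Names. The sketch below assumes the text layout printed by openssl x509 -noout -text, where SAN entries are comma-separated and IP entries look like "IP Address:192.168.80.49". The function name san_has_ip is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed when the certificate text on stdin lists the IP ($1) as a SAN entry.
# Expects the output of: openssl x509 -in apiserver.pem -noout -text
san_has_ip() {
  tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -Fxq "IP Address:$1"
}

# Usage:
#   openssl x509 -in /etc/kubernetes/pki/apiserver.pem -noout -text \
#     | san_has_ip 192.168.80.49 && echo "192.168.80.49 is covered"
```

Splitting the entries onto separate lines and matching whole lines avoids a false positive where 192.168.80.4 would match as a prefix of 192.168.80.45.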
9.1.2 Add the host entry
Run on k8s-master1, k8s-node01 and k8s-node02:
echo '192.168.80.49 k8s-master2' >> /etc/hosts
9.2 Scale out the Master
9.2.1 Initialization
# 1. Set the hostname
hostnamectl set-hostname k8s-master2

# 2. Hostname resolution
cat >> /etc/hosts <<EOF
192.168.80.45 k8s-master1
192.168.80.46 k8s-node01
192.168.80.47 k8s-node02
192.168.80.49 k8s-master2
EOF

# 3. Disable swap
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

# 4. Pass bridged IPv4 traffic to the iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system

# 5. DNS resolver
echo nameserver 8.8.8.8 >> /etc/resolv.conf

# 6. Time synchronization
apt install ntpdate -y
ntpdate ntp1.aliyun.com

crontab -e
# add the following entry:
*/30 * * * * /usr/sbin/ntpdate -u ntp1.aliyun.com >> /var/log/ntpdate.log 2>&1

# 7. Log directory
mkdir -p /var/log/kubernetes
9.2.2 Clone
# 1. Run on k8s-master1
mkdir -p $HOME/k8s-install && cd $HOME/k8s-install
tar zcvf master-node-clone.tar.gz /usr/bin/kube* /lib/systemd/system/kube*.service /etc/kubernetes /root/.kube/config /usr/bin/docker* /usr/bin/runc /usr/bin/containerd* /usr/bin/ctr /etc/docker /lib/systemd/system/docker.service
scp master-node-clone.tar.gz root@k8s-master2:/root

# 2. Run on k8s-master2
cd / && mv /root/master-node-clone.tar.gz / && tar zxvf master-node-clone.tar.gz && rm -f master-node-clone.tar.gz

# Remove the kubelet credentials copied from k8s-master1 so that new ones are requested
rm -f /etc/kubernetes/kubelet.kubeconfig
rm -f /etc/kubernetes/pki/kubelet*
9.2.3 Update the configuration
vim /etc/kubernetes/kube-apiserver.conf
  --bind-address=192.168.80.49 \
  --advertise-address=192.168.80.49 \

sed -i 's#k8s-master1#k8s-master2#' /etc/kubernetes/*
sed -i 's#192.168.80.45:6443#192.168.80.49:6443#' /etc/kubernetes/*

vi /root/.kube/config
  server: https://192.168.80.49:6443
9.2.4 Start the services and enable them at boot
systemctl daemon-reload
systemctl start docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
systemctl status docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
systemctl enable docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
9.2.5 Cluster status
kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-2               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}
etcd-0               Healthy   {"health":"true"}
9.2.6 Join the cluster
kubectl get csr
NAME                                                   AGE     SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-HfzAqSEc7sIIG9QFHip4vGFnFZhyZnYjBVGWQyGpz54   7m49s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending

# Approve the request
kubectl certificate approve node-csr-HfzAqSEc7sIIG9QFHip4vGFnFZhyZnYjBVGWQyGpz54

kubectl get node
NAME          STATUS     ROLES    AGE   VERSION
k8s-master1   Ready      master   27h   v1.19.11
k8s-master2   NotReady   <none>   11s   v1.19.11
k8s-node01    Ready      node     27h   v1.19.11
k8s-node02    Ready      node     27h   v1.19.11
9.2.7 Labels and taints
# Set the label
kubectl label node k8s-master2 node-role.kubernetes.io/master=

# Taint the master node so that ordinary pods are not scheduled on it
kubectl taint nodes k8s-master2 node-role.kubernetes.io/master=:NoSchedule

# Node information
kubectl get nodes --show-labels
NAME          STATUS   ROLES    AGE     VERSION    LABELS
k8s-master1   Ready    master   2d17h   v1.19.11   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master1,kubernetes.io/os=linux,node-role.kubernetes.io/master=
k8s-master2   Ready    master   2m33s   v1.19.11   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-master2,kubernetes.io/os=linux,node-role.kubernetes.io/master=
k8s-node01    Ready    node     2d17h   v1.19.11   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node01,kubernetes.io/os=linux,node-role.kubernetes.io/node=
k8s-node02    Ready    node     2d17h   v1.19.11   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8s-node02,kubernetes.io/os=linux,node-role.kubernetes.io/node=
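9.3 Highly available load balancing
Nginx: a mainstream web server and reverse proxy. Here its layer-4 (stream) proxying is used to load-balance the apiserver.
Keepalived: a mainstream high-availability tool that provides active/standby failover by binding a virtual IP (VIP). Keepalived decides whether to fail over (move the VIP) based on the state of Nginx: when Nginx on the primary node goes down, the VIP is bound to the backup Nginx node automatically, so the VIP stays reachable and Nginx remains highly available.
Server plan:
Role | IP | Components |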
---|---|---|
k8s-master1 | 192.168.80.45 | kube-apiserver |
k8s-master2 | 192.168.80.49 | kube-apiserver |
k8s-loadbalancer1 | 192.168.80.2 | nginx, keepalived |
k8s-loadbalancer2 | 192.168.80.3 | nginx, keepalived |
VIP | 192.168.80.100 | Virtual IP |
9.3.1 Install the software
apt install nginx keepalived -y
sudo useradd nginx -G www-data
9.3.2 Configure Nginx
Fix for the stream module problem: https://blog.csdn.net/qq_39043100/article/details/89644264
# The heredoc delimiter is quoted ('EOF') so that the shell leaves nginx
# variables such as $remote_addr untouched
cat > /etc/nginx/nginx.conf << 'EOF'
load_module /usr/lib/nginx/modules/ngx_stream_module.so;
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

stream {
    log_format main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log /var/log/nginx/k8s-access.log main;

    upstream k8s-apiserver {
        server 192.168.80.45:6443;   # Master1 APISERVER IP:PORT
        server 192.168.80.49:6443;   # Master2 APISERVER IP:PORT
    }

    server {
        listen 16443;
        proxy_pass k8s-apiserver;
    }
}

http {
    log_format main '$remote_addr - $remote_user [$time_local] $request '
                    '$status $body_bytes_sent $http_referer '
                    '$http_user_agent $http_x_forwarded_for';
    access_log /var/log/nginx/access.log main;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    server {
        listen 80 default_server;
        server_name _;

        location / {
        }
    }
}
EOF
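When nginx.conf is written through a heredoc, the nginx variables ($remote_addr and so on) must reach the file literally. With an unquoted delimiter the shell expands them first, usually to empty strings. The following small demonstration, using throwaway files in a temporary directory (the file names are arbitrary), shows the difference between the two delimiter forms:

```shell
#!/usr/bin/env bash
# Make sure the names are not set in the environment of this demo.
unset remote_addr upstream_addr

demo_dir=$(mktemp -d)

# Unquoted delimiter: the shell expands $remote_addr and $upstream_addr.
cat > "$demo_dir/unquoted.conf" << EOF
log_format main '$remote_addr $upstream_addr';
EOF

# Quoted delimiter: the body is written verbatim.
cat > "$demo_dir/quoted.conf" << 'EOF'
log_format main '$remote_addr $upstream_addr';
EOF

# unquoted.conf has lost the variables; quoted.conf keeps them for nginx.
cat "$demo_dir/unquoted.conf" "$demo_dir/quoted.conf"
```

Escaping each dollar sign as \$ inside an unquoted heredoc has the same effect as quoting the delimiter.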
9.3.3 keepalived configuration (master)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    notification_email {
        acassen@firewall.loc
        failover@firewall.loc
        sysadmin@firewall.loc
    }
    notification_email_from Alexandre.Cassen@firewall.loc
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id NGINX_MASTER
}

# Health check script
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state MASTER
    interface ens33          # change to the actual interface name
    virtual_router_id 51     # VRRP router ID, unique per instance
    priority 100             # priority; use 90 on the backup server
    advert_int 1             # VRRP advertisement interval, 1 second by default
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        192.168.80.100/24
    }
    track_script {
        check_nginx
    }
}
EOF
9.3.4 keepalived configuration (backup)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    notification_email {
        acassen@firewall.loc
        failover@firewall.loc
        sysadmin@firewall.loc
    }
    notification_email_from Alexandre.Cassen@firewall.loc
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id NGINX_BACKUP
}

# Health check script
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33          # change to the actual interface name
    virtual_router_id 51     # VRRP router ID, unique per instance
    priority 90              # priority; lower than the master (100)
    advert_int 1             # VRRP advertisement interval, 1 second by default
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        192.168.80.100/24
    }
    track_script {
        check_nginx
    }
}
EOF
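9.3.5 keepalived check script
# The delimiter is quoted so that $(...), $count and $$ are written to the
# script as-is instead of being expanded while the file is created
cat > /etc/keepalived/check_nginx.sh << 'EOF'
#!/bin/bash
# Exit non-zero when ss shows no socket on port 16443 (the nginx stream listener)
count=$(ss -antp | grep 16443 | egrep -cv "grep|$$")

if [ "$count" -eq 0 ]; then
    exit 1
else
    exit 0
fi
EOF
chmod +x /etc/keepalived/check_nginx.sh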
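Grepping the whole ss output for the string 16443 can also match unrelated lines, for example a peer port or a PID that happens to contain those digits. A narrower variant, sketched below, counts only listening sockets whose local port equals the given number. It assumes the ss -ltn column layout in which the fourth column is "Local Address:Port", and the function name count_listeners is made up for illustration.

```shell
#!/usr/bin/env bash
# Count listening TCP sockets whose local port equals $1.
# Reads `ss -ltn` output on stdin (column 4 is "Local Address:Port").
count_listeners() {
  awk -v port="$1" '
    NR > 1 { n = split($4, parts, ":"); if (parts[n] == port) c++ }
    END    { print c + 0 }'
}

# Usage inside a check script:
#   [ "$(ss -ltn | count_listeners 16443)" -gt 0 ] || exit 1
```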
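9.3.6 Start the services
systemctl daemon-reload
systemctl start nginx keepalived
systemctl enable nginx keepalived

# Commands for uninstalling nginx, if needed
sudo apt-get remove nginx nginx-common        # removes everything except the configuration files
sudo apt-get purge nginx nginx-common         # removes everything, including the configuration files
sudo apt-get autoremove                       # run after the commands above to remove dependencies that are no longer used
sudo apt-get remove nginx-full nginx-common   # removes the two main packages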
9.3.7 Status check
ip addr

curl -k https://192.168.80.100:16443/version
{
  "major": "1",
  "minor": "19",
  "gitVersion": "v1.19.11",
  "gitCommit": "c6a2f08fc4378c5381dd948d9ad9d1080e3e6b33",
  "gitTreeState": "clean",
  "buildDate": "2021-05-12T12:19:22Z",
  "goVersion": "go1.15.12",
  "compiler": "gc",
  "platform": "linux/amd64"
}
9.3.8 Connect the worker nodes to the LB VIP
On k8s-master1, add the VIP address to the hosts list of the certificate signing request:
mkdir -p /root/ssl && cd /root/ssl

# 1. Certificate signing request file
cat > apiserver-csr.json <<EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "localhost",
    "192.168.80.1",
    "192.168.80.2",
    "192.168.80.3",
    "192.168.80.45",
    "192.168.80.46",
    "192.168.80.47",
    "192.168.80.48",
    "192.168.80.49",
    "192.168.80.100",
    "10.254.0.1",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

# 2. Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver

# 3. Distribute the certificate
cp apiserver*.pem /etc/kubernetes/pki
scp apiserver*.pem root@k8s-master2:/root
scp apiserver*.pem root@k8s-node01:/root
scp apiserver*.pem root@k8s-node02:/root

# 4. On each receiving host, move the certificate into place
chown root:root /root/apiserver*.pem
mv /root/apiserver*.pem /etc/kubernetes/pki

# 5. Restart the apiserver
systemctl restart kube-apiserver
systemctl status kube-apiserver
On the worker nodes:
sed -i 's#192.168.80.45:6443#192.168.80.100:16443#' /etc/kubernetes/*
sed -i 's#192.168.80.45:6443#192.168.80.100:16443#' /etc/kubernetes/pki/*

systemctl restart kubelet kube-proxy

grep '192.168.80.' /etc/kubernetes/*

kubectl get node
NAME          STATUS   ROLES    AGE     VERSION
k8s-master1   Ready    master   3d17h   v1.19.11
k8s-master2   Ready    master   2d16h   v1.19.11
k8s-node01    Ready    node     3d15h   v1.19.11
k8s-node02    Ready    node     3d15h   v1.19.11
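The glob /etc/kubernetes/* also matches subdirectories such as pki, for which GNU sed -i reports that the target is not a regular file. One way to avoid that noise, sketched below, is to loop over regular files only and rewrite just those that actually contain the old endpoint. The function name retarget_apiserver is hypothetical. As in the sed commands above, the dots in the old address are treated as regex wildcards, which is harmless for this input.

```shell
#!/usr/bin/env bash
# Replace the old endpoint ($2) with the new endpoint ($3) in regular files
# directly under the directory ($1); subdirectories such as pki/ are skipped.
retarget_apiserver() {
  local dir=$1 old=$2 new=$3 f
  for f in "$dir"/*; do
    [ -f "$f" ] || continue                # skip directories
    grep -qF "$old" "$f" || continue       # skip files without the old endpoint
    sed -i "s#${old}#${new}#g" "$f"
  done
}

# Usage on a worker node:
#   retarget_apiserver /etc/kubernetes 192.168.80.45:6443 192.168.80.100:16443
#   systemctl restart kubelet kube-proxy
```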
10. Remove a node
# 1. On k8s-master2, stop the kubelet process
systemctl stop kubelet

# 2. Check whether k8s-master2 has gone offline
kubectl get nodes
NAME          STATUS     ROLES    AGE   VERSION
k8s-master1   Ready      master   40h   v1.19.11
k8s-master2   NotReady   master   12h   v1.19.11
k8s-node01    Ready      node     40h   v1.19.11
k8s-node02    Ready      node     40h   v1.19.11

# 3. Drain the node
kubectl drain k8s-master2
node/k8s-master2 cordoned
error: unable to drain node k8s-master2, aborting command...

There are pending nodes to be drained:
 k8s-master2
error: cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): kube-system/calico-node-lwj2r

# 4. Drain again, ignoring DaemonSet-managed pods
kubectl drain k8s-master2 --ignore-daemonsets
node/k8s-master2 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-lwj2r
node/k8s-master2 drained

# 5. Status after draining
kubectl get nodes
NAME          STATUS                     ROLES    AGE   VERSION
k8s-master1   Ready                      master   40h   v1.19.11
k8s-master2   Ready,SchedulingDisabled   master   12h   v1.19.11
k8s-node01    Ready                      node     39h   v1.19.11
k8s-node02    Ready                      node     39h   v1.19.11

# 6. Restore the node (if necessary)
kubectl uncordon k8s-master2
node/k8s-master2 uncordoned

kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
k8s-master1   Ready    master   40h   v1.19.11
k8s-master2   Ready    master   12h   v1.19.11
k8s-node01    Ready    node     39h   v1.19.11
k8s-node02    Ready    node     39h   v1.19.11

# 7. Delete the node permanently
kubectl delete node k8s-master2

kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
k8s-master1   Ready    master   41h   v1.19.11
k8s-node01    Ready    node     40h   v1.19.11
k8s-node02    Ready    node     40h   v1.19.11
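The drain and delete steps can be combined into one small function. This is only a sketch of the sequence shown above, not a complete decommissioning procedure: it does not stop the kubelet on the node itself. The KUBECTL variable and the function name remove_node are assumptions added so that the commands can be previewed with KUBECTL="echo kubectl".

```shell
#!/usr/bin/env bash
# KUBECTL can be set to "echo kubectl" to print the commands instead of running them.
KUBECTL=${KUBECTL:-kubectl}

# Drain a node (skipping DaemonSet-managed pods), then delete it from the cluster.
# The node is deleted only if the drain succeeded.
remove_node() {
  local node=$1
  $KUBECTL drain "$node" --ignore-daemonsets || return 1
  $KUBECTL delete node "$node"
}

# Usage:
#   KUBECTL="echo kubectl" remove_node k8s-master2   # preview
#   remove_node k8s-master2                           # run for real
```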