Environment Preparation

We have three nodes, all running CentOS 7.6 with kernel 3.10.0-1062.4.1.el7.x86_64. Add the hosts entries on every node:
[root@master1 ~]# cat /etc/hosts
10.8.0.1 master1
10.8.0.14 node1
10.8.0.18 node2
Node hostnames must use standard DNS naming, and you must never keep a default hostname such as localhost, which will cause all kinds of errors. In the Kubernetes project, machine names, like every API object stored in etcd, must use standard DNS naming (RFC 1123). You can change a hostname with hostnamectl set-hostname node1.
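For example, following the hosts file above, the three machines would be named like this (run the matching command on each node):

hostnamectl set-hostname master1   # on 10.8.0.1
hostnamectl set-hostname node1     # on 10.8.0.14
hostnamectl set-hostname node2     # on 10.8.0.18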
Disable the firewall:
systemctl stop firewalld
systemctl disable firewalld
Disable SELinux:
setenforce 0
cat /etc/selinux/config
SELINUX=disabled
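Note that setenforce 0 only disables SELinux until the next reboot; the SELINUX=disabled setting in /etc/selinux/config is what makes it permanent. If you prefer a one-liner over editing the file by hand, this is equivalent:

sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
getenforce   # prints Permissive now, Disabled after a reboot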
Enabling kernel IPv4 forwarding requires the br_netfilter module, so load that module first (modprobe br_netfilter), then create the file /etc/sysctl.d/k8s.conf with the following content:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
bridge-nf lets netfilter filter IPv4/ARP/IPv6 packets traversing a Linux bridge. For example, with net.bridge.bridge-nf-call-iptables=1 set, packets forwarded by a layer-2 bridge are also filtered by the iptables FORWARD rules. The commonly used options are:
net.bridge.bridge-nf-call-arptables: whether ARP packets crossing the bridge are filtered by the arptables FORWARD chain
net.bridge.bridge-nf-call-ip6tables: whether IPv6 packets are filtered by the ip6tables chains
net.bridge.bridge-nf-call-iptables: whether IPv4 packets are filtered by the iptables chains
net.bridge.bridge-nf-filter-vlan-tagged: whether VLAN-tagged packets are filtered by iptables/arptables
Run the following command to make the changes take effect:
sysctl -p /etc/sysctl.d/k8s.conf
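Keep in mind that a module loaded with modprobe is gone after a reboot. One common complement to the sysctl file above (not shown in the steps here) is a systemd modules-load entry so br_netfilter comes back automatically:

cat <<EOF > /etc/modules-load.d/k8s.conf
br_netfilter
EOF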
Install ipvs:
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
The script above creates the file /etc/sysconfig/modules/ipvs.modules so the required modules are loaded automatically again after a node reboot. Use lsmod | grep -e ip_vs -e nf_conntrack_ipv4 to check that the kernel modules loaded correctly.
Next, make sure the ipset package is installed on each node:
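yum install -y ipset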
To make it easy to inspect the ipvs proxy rules, it is also worth installing the management tool ipvsadm:
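yum install -y ipvsadm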
If the cluster is small, you can also simply use the iptables proxy mode.
Synchronize Server Time
yum install chrony -y
systemctl enable chronyd
systemctl start chronyd
chronyc sources
210 Number of sources = 4
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^- makaki.miuku.net              2  10   375    37m    +59ms[  +59ms] +/-   94ms
^- ntp1.ams1.nl.leaseweb.net     2  10   377    232  +2521us[+2521us] +/-  198ms
^- electrode.felixc.at           3  10   377     98  +6250us[+6250us] +/-  125ms
^* 119.28.183.184                2  10   377    553   -806us[ -837us] +/-   27ms
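To double-check that the clock is actually locked to a source (the ^* line above), chronyc can also print the current offset and stratum:

chronyc tracking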
Disable the swap partition:
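swapoff -a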
Edit /etc/fstab and comment out the automatic mounting of swap, then use free -m to confirm that swap is off. Also tune the swappiness parameter by adding the line below to /etc/sysctl.d/k8s.conf — setting it to 0 is the usual choice for Kubernetes nodes:
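vm.swappiness=0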
Run sysctl -p /etc/sysctl.d/k8s.conf to make the change take effect.
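If you would rather not edit /etc/fstab by hand, a sed one-liner that comments out the swap entry works too (verify the file afterwards):

sed -ri 's/.*swap.*/#&/' /etc/fstab
free -m   # the Swap line should now show 0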
Install Containerd

Refer to the separate article "Containerd 搭建以及使用" (Containerd setup and usage).
Deploying Kubernetes with kubeadm

With the environment configuration above complete, we can now install kubeadm. Here we install it by configuring a yum repo explicitly:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
Of course, the yum repo above requires access past the GFW; if that is not available, use the Alibaba Cloud mirror instead:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Then install kubeadm, kubelet, and kubectl:
# --disableexcludes disables every repo except kubernetes
yum makecache fast
yum install -y kubelet-1.22.2 kubeadm-1.22.2 kubectl-1.22.2 --disableexcludes=kubernetes
kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:37:34Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
You can see we installed v1.22.2. Now set kubelet on the master node to start at boot:
systemctl enable --now kubelet
Everything up to this point must be done on all of the nodes.
Initializing the Cluster

When we run kubelet --help, we can see that most of the original command-line flags are marked DEPRECATED. That is because the official recommendation is to use --config to point at a configuration file and set those parameters there; see the official document "Set Kubelet parameters via a config file" for more details. This is also what enables Kubernetes to support Dynamic Kubelet Configuration; see "Reconfigure a Node's Kubelet in a Live Cluster".
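A quick way to see this for yourself is to count how many flags carry the deprecation notice in the help output:

kubelet --help 2>&1 | grep -c DEPRECATED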
We can then dump the default configuration used for cluster initialization on the master node with the following command:
kubeadm config print init-defaults --component-configs KubeletConfiguration > kubeadm.yaml
Then adjust the configuration to our needs: for example, change imageRepository to control where the images required for cluster initialization are pulled from, and set the kube-proxy mode to ipvs. Also note that since we plan to install the flannel network plugin, networking.podSubnet must be set to 10.244.0.0/16:
[root@master1 ~]# cat kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.8.0.1
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: master
  taints:
  - effect: "NoSchedule"
    key: "node-role.kubernetes.io/master"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/k8sxio
kind: ClusterConfiguration
kubernetesVersion: 1.22.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging: {}
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
Configuration tip
The documentation for the manifest above is rather scattered; to fully understand all the properties of these resource objects, consult the corresponding godoc at https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta3.
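Before actually touching the node, you can also sanity-check the config with kubeadm's dry-run mode, which prints what would be done without applying any changes:

kubeadm init --config kubeadm.yaml --dry-run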
Before starting cluster initialization, you can use kubeadm config images pull --config kubeadm.yaml to pre-pull the container images Kubernetes needs on each server node. With the config file ready, pull the images first:
kubeadm config images pull --config kubeadm.yaml
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-apiserver:v1.22.2
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-controller-manager:v1.22.2
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-scheduler:v1.22.2
[config/images] Pulled registry.aliyuncs.com/k8sxio/kube-proxy:v1.22.2
[config/images] Pulled registry.aliyuncs.com/k8sxio/pause:3.5
[config/images] Pulled registry.aliyuncs.com/k8sxio/etcd:3.5.0-0
failed to pull image "registry.aliyuncs.com/k8sxio/coredns:v1.8.4": output: time="2021-11-18T17:34:48+08:00" level=fatal msg="pulling image: rpc error: code = NotFound desc = failed to pull and unpack image \"registry.aliyuncs.com/k8sxio/coredns:v1.8.4\": failed to resolve reference \"registry.aliyuncs.com/k8sxio/coredns:v1.8.4\": registry.aliyuncs.com/k8sxio/coredns:v1.8.4: not found", error: exit status 1
To see the stack trace of this error execute with --v=5 or higher
Pulling the coredns image failed above because the image was not found in that repository. We can pull it manually and then re-tag it with the expected address:
ctr -n k8s.io i pull docker.io/coredns/coredns:1.8.4
docker.io/coredns/coredns:1.8.4:                                                  resolved |++++++++++++++++++++++++++++++++++++++|
index-sha256:6e5a02c21641597998b4be7cb5eb1e7b02c0d8d23cce4dd09f4682d463798890:    done     |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:10683d82b024a58cc248c468c2632f9d1b260500f7cd9bb8e73f751048d7d6d4: done     |++++++++++++++++++++++++++++++++++++++|
layer-sha256:bc38a22c706b427217bcbd1a7ac7c8873e75efdd0e59d6b9f069b4b243db4b4b:    done     |++++++++++++++++++++++++++++++++++++++|
config-sha256:8d147537fb7d1ac8895da4d55a5e53621949981e2e6460976dae812f83d84a44:   done     |++++++++++++++++++++++++++++++++++++++|
layer-sha256:c6568d217a0023041ef9f729e8836b19f863bcdb612bb3a329ebc165539f5a80:    exists   |++++++++++++++++++++++++++++++++++++++|
elapsed: 12.4s                                                                    total:  12.0 M (991.3 KiB/s)
unpacking linux/amd64 sha256:6e5a02c21641597998b4be7cb5eb1e7b02c0d8d23cce4dd09f4682d463798890...
done: 410.185888ms

ctr -n k8s.io i tag docker.io/coredns/coredns:1.8.4 registry.aliyuncs.com/k8sxio/coredns:v1.8.4
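You can confirm both image references now exist in containerd's k8s.io namespace:

ctr -n k8s.io i ls -q | grep coredns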
Now we can initialize the master node with the config file above:
➜  ~ kubeadm init --config kubeadm.yaml
[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
...
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.8.0.1:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:ca0c87226c69309d7779096c15b6a41e14b077baf4650bfdb6f9d3178d4da645
Copy the kubeconfig file as instructed by the installer output:
➜  ~ mkdir -p $HOME/.kube
➜  ~ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
➜  ~ sudo chown $(id -u):$(id -g) $HOME/.kube/config
Then kubectl confirms that the master node initialized successfully:
➜  ~ kubectl get nodes
NAME      STATUS   ROLES                  AGE   VERSION
master1   Ready    control-plane,master   41s   v1.22.2
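Since we set mode: ipvs in the KubeProxyConfiguration, the ipvsadm tool installed earlier gives a quick sanity check that kube-proxy really programmed IPVS rules:

ipvsadm -ln   # should list virtual servers for the kubernetes service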
Adding Nodes

Remember that the configuration and steps from the cluster initialization section above must be completed on the new node in advance. Copy the $HOME/.kube/config file from the master node to the corresponding file on the worker node, install kubeadm, kubelet, and kubectl (optional), then run the join command printed when initialization finished:
➜  ~ kubeadm join 10.8.0.1:6443 --token abcdef.0123456789abcdef \
>   --discovery-token-ca-cert-hash sha256:ca0c87226c69309d7779096c15b6a41e14b077baf4650bfdb6f9d3178d4da645
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Join command
If you have lost the join command above, you can regenerate it with kubeadm token create --print-join-command.
After it succeeds, run the get nodes command:
kubectl get nodes
NAME      STATUS   ROLES                  AGE     VERSION
master1   Ready    control-plane,master   2m35s   v1.22.2
node1     Ready    <none>                 45s     v1.22.2
At this point the cluster is not actually usable yet, because no network plugin has been installed. Pick one from the list at https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/; here we install flannel:
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# If any node has multiple NICs, specify the internal NIC in the manifest:
# find the DaemonSet named kube-flannel-ds and edit the kube-flannel container
vi kube-flannel.yml
......
containers:
- name: kube-flannel
  image: quay.io/coreos/flannel:v0.15.0
  command:
  - /opt/bin/flanneld
  args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=eth0  # on multi-NIC nodes, the name of the internal NIC
......
kubectl apply -f kube-flannel.yml  # install the flannel network plugin
After a short while, check the Pod status:
kubectl get pods -n kube-system
NAME                             READY   STATUS    RESTARTS   AGE
coredns-7568f67dbd-5mg59         1/1     Running   0          8m32s
coredns-7568f67dbd-b685t         1/1     Running   0          8m31s
etcd-master                      1/1     Running   0          66m
kube-apiserver-master            1/1     Running   0          66m
kube-controller-manager-master   1/1     Running   0          66m
kube-flannel-ds-dsbt6            1/1     Running   0          11m
kube-flannel-ds-zwlm6            1/1     Running   0          11m
kube-proxy-jq84n                 1/1     Running   0          66m
kube-proxy-x4hbv                 1/1     Running   0          19m
kube-scheduler-master            1/1     Running   0          66m
Flannel network plugin
After the network plugin is deployed, running ifconfig should normally show two new virtual devices, cni0 and flannel.1. If you do not see the cni0 device, don't worry: check whether the /var/lib/cni directory exists. If it does not, that is not a deployment problem; it just means no workload has run on that node yet. As soon as a Pod runs on the node, the directory will be created and the cni0 device will appear as well.
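A quick way to inspect this on a node (assuming the VXLAN backend that flannel uses by default):

ip -d link show flannel.1   # the VXLAN device created by flannel
ip link show cni0           # appears once a Pod has run on this node
ls /var/lib/cni             # likewise created on the first Pod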
Add the remaining node the same way.
Problems encountered during setup:
Fixing kube-flannel-ds stuck in Init and kube-proxy stuck in ContainerCreating after kubeadm join
Fixing the k8s error "failed to delegate add: failed to set bridge addr: \"cni0\" already has an IP address different from xxxx"
Fixing ingress forwarding request timeouts after registering a node
Cleanup

If you ran into other problems during cluster installation, you can reset everything with the following commands:
kubeadm reset
ifconfig cni0 down && ip link delete cni0
ifconfig flannel.1 down && ip link delete flannel.1
rm -rf /var/lib/cni/
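Note that kubeadm reset does not clean up iptables or IPVS rules; since this cluster uses the ipvs proxy mode, you may want to clear those manually as well:

ipvsadm --clear
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X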
Course content: https://youdianzhishi.com/web/course/1030