
🐳 K8S + Rancher Setup
Installing K8S
-
Set the hostname
hostnamectl set-hostname kubemaster
echo 172.16.0.22 kubemaster >> /etc/hosts
-
Disable SELinux enforcement (switch to permissive mode)
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
-
Forward IPv4 and let iptables see bridged traffic: https://kubernetes.io/zh-cn/docs/setup/production-environment/container-runtimes/
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Set the required sysctl parameters; they persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply the sysctl parameters without rebooting
sudo sysctl --system
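A quick check after applying the settings can catch a kernel module that failed to load. The sketch below is not part of the original steps: the function name `verify_bridge_sysctls` is illustrative, and it assumes `sysctl -n` is available on the host.

```shell
# Hypothetical helper: confirm the three kernel parameters set above are 1.
verify_bridge_sysctls() {
    for key in \
        net.bridge.bridge-nf-call-iptables \
        net.bridge.bridge-nf-call-ip6tables \
        net.ipv4.ip_forward
    do
        # sysctl -n prints only the value of the named parameter
        value=$(sysctl -n "$key" 2>/dev/null)
        if [ "$value" != "1" ]; then
            echo "$key is '${value}', expected 1" >&2
            return 1
        fi
    done
    echo "bridge and forwarding sysctls are all 1"
}
```

Calling `verify_bridge_sysctls` right after `sudo sysctl --system` returns a non-zero status and names the first parameter that is not 1.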
-
Disable swap
swapoff -a
sed -e '/swap/s/^/#/g' -i /etc/fstab
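The sed command above prefixes every line that mentions swap with `#`, including lines that are already commented out. A stricter variant might look like the sketch below; the function name `comment_out_swap` and its file argument are illustrative, and GNU sed is assumed.

```shell
# Hypothetical helper: comment out active swap entries in an fstab-style file.
comment_out_swap() {
    fstab="${1:-/etc/fstab}"
    # Skip lines that are already comments; on the rest, prefix '#' when the
    # line contains a whitespace-delimited "swap" field.
    sed -i -E '/^[[:space:]]*#/!{/[[:space:]]swap[[:space:]]/s/^/#/;}' "$fstab"
}
```

Running `comment_out_swap /etc/fstab` a second time leaves the file unchanged, because the lines it commented on the first pass are now skipped.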
-
Install a container runtime
Using containerd, with a modified configuration
dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
dnf install -y containerd.io
mv /etc/containerd/config.toml /etc/containerd/config.toml.orig
containerd config default > /etc/containerd/config.toml
# Open the file and set SystemdCgroup to true
vi /etc/containerd/config.toml
#   [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
#     SystemdCgroup = true
# Enable at boot and check the status
systemctl enable --now containerd.service
systemctl status containerd.service
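The SystemdCgroup edit can also be scripted instead of done in vi. A minimal sketch, assuming the generated config contains a `SystemdCgroup = false` line (as the default output usually does) and that GNU sed is available; the function name is illustrative.

```shell
# Hypothetical helper: flip SystemdCgroup from false to true in a containerd config.
enable_systemd_cgroup() {
    config="${1:-/etc/containerd/config.toml}"
    # Preserve the original indentation and spacing around '='
    sed -i -E 's/^([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*)false/\1true/' "$config"
}
```

After `enable_systemd_cgroup`, `grep SystemdCgroup /etc/containerd/config.toml` shows whether the change took effect; containerd must be restarted if it was already running.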
Using Docker
dnf install -y docker-ce
systemctl enable --now docker
systemctl status docker
VER=$(curl -s https://api.github.com/repos/Mirantis/cri-dockerd/releases/latest | grep tag_name | cut -d '"' -f 4 | sed 's/v//g')
echo $VER
wget https://github.com/Mirantis/cri-dockerd/releases/download/v${VER}/cri-dockerd-${VER}.amd64.tgz
tar xvf cri-dockerd-${VER}.amd64.tgz
mv cri-dockerd/cri-dockerd /usr/local/bin/
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.service
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.socket
sudo mv cri-docker.socket cri-docker.service /etc/systemd/system/
sudo sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
sudo systemctl daemon-reload
sudo systemctl enable cri-docker.service
sudo systemctl enable --now cri-docker.socket
-
Open the firewall ports
firewall-cmd --permanent --add-port={6443,2379,2380,10250,10251,10252}/tcp
firewall-cmd --reload
-
Configure the K8S package mirror
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF
-
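Install K8S. The current version is 1.27.1; according to the official documentation, the previous version, 1.26.3, is generally the stable one. If you plan to install Rancher, check the versions it supports against the Rancher chart: https://artifacthub.io/packages/helm/rancher-stable/rancher#views
# Install the current version
dnf install -y {kubelet,kubeadm,kubectl} --disableexcludes=kubernetes
# Or install the previous version
dnf install -y kubelet-1.26.3 kubeadm-1.26.3 kubectl-1.26.3 --disableexcludes=kubernetes
systemctl enable --now kubelet.service
# The status shows a startup error at this point (expected until kubeadm init has run)
systemctl status kubelet.service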
-
Enable bash completion
source <(kubectl completion bash)
kubectl completion bash > /etc/bash_completion.d/kubectl
-
Pull the container images from the Aliyun mirror
# List the required images and their versions; needed later when retagging
kubeadm config images list
# containerd variant
kubeadm config images pull --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers
# Docker variant: specify the CRI socket
kubeadm config images pull --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers --cri-socket unix:///var/run/cri-dockerd.sock
-
Retag the images (7 in total)
# containerd variant
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9 registry.k8s.io/pause:3.9
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.10.1 registry.k8s.io/coredns:v1.10.1
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.7-0 registry.k8s.io/etcd:3.5.7-0
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.27.1 registry.k8s.io/kube-apiserver:v1.27.1
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.27.1 registry.k8s.io/kube-controller-manager:v1.27.1
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.27.1 registry.k8s.io/kube-scheduler:v1.27.1
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.27.1 registry.k8s.io/kube-proxy:v1.27.1
# Docker variant: retag, then remove the mirror-named tags
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.26.4 k8s.gcr.io/kube-apiserver:v1.26.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.26.4 k8s.gcr.io/kube-controller-manager:v1.26.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.26.4 k8s.gcr.io/kube-scheduler:v1.26.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.26.4 k8s.gcr.io/kube-proxy:v1.26.4
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.9.3 k8s.gcr.io/coredns:v1.9.3
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.6-0 k8s.gcr.io/etcd:3.5.6-0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9 k8s.gcr.io/pause:3.9
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.26.4
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.26.4
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.26.4
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.26.4
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.9.3
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.6-0
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
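Typing the retag commands by hand makes version mismatches easy. As a sketch only, the list printed by `kubeadm config images list` could drive the retagging for the containerd variant. The function name `print_retag_commands` is illustrative, and it assumes the mirror stores each image under its last path component (for example `coredns:v1.10.1`); the target name is whatever kubeadm printed.

```shell
# Hypothetical helper: read image names (one per line) on stdin and print
# the matching "ctr image tag" command for each.
print_retag_commands() {
    mirror="${1:-registry.cn-hangzhou.aliyuncs.com/google_containers}"
    while IFS= read -r image; do
        # Ignore empty lines
        [ -n "$image" ] || continue
        # Keep only the final path component, e.g. "pause:3.9"
        name="${image##*/}"
        printf 'ctr -n k8s.io image tag %s/%s %s\n' "$mirror" "$name" "$image"
    done
}
```

Running `kubeadm config images list | print_retag_commands` only prints the commands; once they look right, the same pipeline can be piped to `sh` to execute them.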
-
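Install the CNI plugin
containerd variant
mkdir -p /opt/bin
# The release asset is a tarball, so download it and unpack it into /opt/bin
curl -fsSLo /tmp/flannel.tar.gz https://github.com/flannel-io/flannel/releases/download/v0.20.1/flannel-v0.20.1-linux-amd64.tar.gz
tar -xzf /tmp/flannel.tar.gz -C /opt/bin
chmod +x /opt/bin/flanneld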
cat << EOF | tee /etc/cni/net.d/10-containerd-net.conflist
{
  "cniVersion": "1.0.0",
  "name": "containerd-net",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "promiscMode": true,
      "ipam": {
        "type": "host-local",
        "ranges": [
          [{ "subnet": "10.88.0.0/16" }],
          [{ "subnet": "2001:db8:4860::/64" }]
        ],
        "routes": [
          { "dst": "0.0.0.0/0" },
          { "dst": "::/0" }
        ]
      }
    },
    {
      "type": "portmap",
      "capabilities": {"portMappings": true},
      "externalSetMarkChain": "KUBE-MARK-MASQ"
    }
  ]
}
EOF
Docker variant
vi /etc/systemd/system/cri-docker.service
# Change the following setting
#   ExecStart=/usr/local/bin/cri-dockerd --network-plugin cni --container-runtime-endpoint fd://
systemctl daemon-reload
systemctl restart cri-docker
cat << EOF > 01-cri-dockerd.json
{
  "cniVersion": "0.4.0",
  "name": "dbnet",
  "type": "bridge",
  "bridge": "cni0",
  "ipam": {
    "type": "host-local",
    "subnet": "10.1.0.0/16",
    "gateway": "10.1.0.1"
  }
}
EOF
-
Initialize
# containerd variant
kubeadm init --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers
# Docker variant
kubeadm init --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers --cri-socket unix:///var/run/cri-dockerd.sock
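If initialization fails, check the logs
Error 1:
failed to pull image \"registry.k8s.io/pause:3.6\": failed to pull and unpack image \"registry.k8s.io/pause:3.6\"
Cause: although the images pulled from the command line include pause 3.9, initialization actually asks for 3.6. By default that image is fetched from Google's registry, which is unreachable from mainland China, hence the error.
Fix:
Pull 3.6 manually and retag it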
# containerd variant
ctr -n k8s.io image pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6 registry.k8s.io/pause:3.6
# Docker variant
docker image pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6 registry.k8s.io/pause:3.6
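One possible explanation, stated here as an assumption rather than something these notes confirm: on the containerd variant, the 3.6 request comes from the `sandbox_image` setting in /etc/containerd/config.toml, not from kubeadm. If that is the case, pointing the setting at the mirror is an alternative to pulling and retagging. A sketch, with an illustrative function name and GNU sed assumed:

```shell
# Hypothetical helper: rewrite the sandbox_image line in a containerd config.
set_sandbox_image() {
    image="$1"
    config="${2:-/etc/containerd/config.toml}"
    # Keep the indentation and 'sandbox_image =' prefix, replace the quoted value
    sed -i -E "s|^([[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*).*|\1\"$image\"|" "$config"
}
```

For example, `set_sandbox_image registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9` followed by `systemctl restart containerd`, then `kubeadm reset` and a fresh `kubeadm init`.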
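Reset the initialization with
kubeadm reset
then run the init command again. On success, the output includes the command that worker nodes use to join the cluster; save it.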
-
Finish
Following the prompt printed by kubeadm, configure the following
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile.d/k8s.sh
View node information
kubectl get nodes
View cluster information
kubectl cluster-info
Deploy the pod network plugin
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
View all pods
kubectl get pods --all-namespaces
Common commands
# Deploy an application
kubectl apply -f app.yaml
# List deployments
kubectl get deployment
# List pods
kubectl get pod -o wide
# Show pod details
kubectl describe pod pod-name
# View logs
kubectl logs pod-name
# Open a shell in the pod
kubectl exec -it pod-name -- bash
# Open a shell in a specific container
kubectl exec -it pod-name -c container-name -- bash
# Scale the number of replicas
kubectl scale deployment test-k8s --replicas=5
# Forward a local port to a port on the pod
kubectl port-forward pod-name 8090:8080
# View rollout history
kubectl rollout history deployment test-k8s
# Roll back to the previous revision
kubectl rollout undo deployment test-k8s
# Roll back to a specific revision
kubectl rollout undo deployment test-k8s --to-revision=2
# Delete the deployment
kubectl delete deployment test-k8s
Installing Rancher
-
Install Helm
wget https://get.helm.sh/helm-v3.11.3-linux-amd64.tar.gz
tar -zxvf helm-v3.11.3-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/helm
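The archive name embeds the version, so the version in the URL and in the file name must agree. One way to keep them in step is to derive the URL from a single value. This is a sketch: the function name is illustrative, and using get.helm.sh as the download host is an assumption on my part.

```shell
# Hypothetical helper: build the Helm archive URL from version, OS and architecture.
helm_download_url() {
    version="$1"          # e.g. v3.11.3
    os="${2:-linux}"
    arch="${3:-amd64}"
    printf 'https://get.helm.sh/helm-%s-%s-%s.tar.gz\n' "$version" "$os" "$arch"
}
```

With this in place, `wget "$(helm_download_url v3.11.3)"` fetches the archive, and the extracted directory is `linux-amd64/` as in the steps above.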
-
Add the Helm chart repository
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
# Create the namespace
kubectl create namespace cattle-system
-
# The --ssl-domain parameter can be omitted if you have no domain name
./create_self-signed-cert.sh --ssl-size=2048 --ssl-date=3650
# After verifying the certificates, add the TLS secrets
# Create the ingress secret
kubectl -n cattle-system create secret tls tls-rancher-ingress --cert=./tls.crt --key=./tls.key
# Create the CA certificate secret
kubectl -n cattle-system create secret generic tls-ca --from-file=./cacerts.pem
-
helm install rancher rancher-stable/rancher \
  --namespace cattle-system \
  --set bootstrapPassword=admin \
  --set ingress.tls.source=secret \
  --set replicas=1
# If something goes wrong and you need to uninstall, run
# helm uninstall rancher --namespace cattle-system
-
Verify the installation
kubectl -n cattle-system rollout status deploy/rancher
Waiting for deployment "rancher" rollout to finish: 0 of 3 updated replicas are available...
deployment "rancher" successfully rolled out
kubectl -n cattle-system get deploy rancher
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
rancher   3         3         3            3           3m
-
Edit the service configuration
kubectl -n cattle-system edit service rancher
-
Change
type: ClusterIP
to type: NodePort
spec:
  type: NodePort
  ports:
  - name: http
    port: 443
    targetPort: 444
    nodePort: 30409
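The same change can be made without an interactive editor. A minimal sketch, assuming it is acceptable for Kubernetes to choose the node ports itself and not pin 30409; the function name is illustrative.

```shell
# Hypothetical helper: switch the rancher service to NodePort with a merge patch.
expose_rancher_nodeport() {
    kubectl -n cattle-system patch service rancher \
        --type merge -p '{"spec":{"type":"NodePort"}}'
}
```

After `expose_rancher_nodeport`, `kubectl -n cattle-system get service rancher` lists the ports that were assigned.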
-
Check the assigned ports; here they are 30777 and 30409
kubectl -n cattle-system get service rancher
NAME      TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
rancher   NodePort   10.102.224.3   <none>        80:30777/TCP,443:30409/TCP   8m1s
-
View the initial password
kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}{{ "\n" }}'
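Secret values are stored base64-encoded, which is why the template pipes the field through base64decode. An equivalent sketch using jsonpath and the `base64` utility (GNU coreutils assumed), wrapped in a function whose name is illustrative:

```shell
# Hypothetical helper: print the Rancher bootstrap password in clear text.
rancher_bootstrap_password() {
    # jsonpath extracts the encoded field; base64 -d decodes it
    kubectl get secret --namespace cattle-system bootstrap-secret \
        -o jsonpath='{.data.bootstrapPassword}' | base64 -d
}
```

The function prints the password without a trailing newline, so `echo "$(rancher_bootstrap_password)"` is convenient on a terminal.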
-
Visit https://172.16.0.22:30409 to log in.
-
Uninstall Rancher: https://github.com/rancher/rancher-cleanup
kubectl create -f deploy/rancher-cleanup.yaml
# Follow the cleanup log
kubectl -n kube-system logs -l job-name=cleanup-job -f
Troubleshooting
-
Running
kubectl -n namespace describe pod
to inspect the pod shows this error under Events: 0/1 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane:}. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
Cause: this is a single-node deployment, and by default the control-plane node carries the node-role.kubernetes.io/control-plane: taint, so pods cannot be scheduled on it.
Fix: allow pods to be scheduled on the node
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
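Further notes: to put the taint back, run
kubectl taint nodes --all node-role.kubernetes.io/control-plane=:NoSchedule
- NoSchedule: pods must not be scheduled on the node
- PreferNoSchedule: avoid scheduling pods on the node where possible
- NoExecute: pods are not scheduled on the node, and pods already running there are evicted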
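The add and remove forms differ only in their suffix: `key=:Effect` adds the taint, and a trailing `-` removes it. A small sketch that prints the right command for each case; the function name and its `add`/`remove` arguments are illustrative.

```shell
# Hypothetical helper: print the kubectl command that adds or removes
# the control-plane taint on all nodes.
control_plane_taint_cmd() {
    key='node-role.kubernetes.io/control-plane'
    case "$1" in
        remove)
            printf 'kubectl taint nodes --all %s-\n' "$key"
            ;;
        add)
            # Effect defaults to NoSchedule; PreferNoSchedule or NoExecute may be passed instead
            printf 'kubectl taint nodes --all %s=:%s\n' "$key" "${2:-NoSchedule}"
            ;;
        *)
            echo "usage: control_plane_taint_cmd add [effect] | remove" >&2
            return 1
            ;;
    esac
}
```

For example, `control_plane_taint_cmd add PreferNoSchedule` prints the command for the softer effect without running it.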
-
systemctl enable cri-docker.service/cri-docker.socket
fails with: Failed to enable unit: Unit file cri-docker.service/cri-docker.socket does not exist.
Cause: a problem with the SELinux configuration
Fix: use
restorecon
to restore the SELinux file attributes, that is, the files' security context.
restorecon /etc/systemd/system/cri-docker.service
systemctl enable cri-docker.service
Created symlink /etc/systemd/system/multi-user.target.wants/cri-docker.service → /etc/systemd/system/cri-docker.service.
restorecon /etc/systemd/system/cri-docker.socket
systemctl enable cri-docker.socket
Created symlink /etc/systemd/system/sockets.target.wants/cri-docker.socket → /etc/systemd/system/cri-docker.socket.
systemctl start cri-docker