Prerequisite: Complete Part 1 (preparing all nodes and installing CRI-O and Kubernetes).
First control plane (kube-ctrl-01)
The first control plane is special: it initializes the cluster from scratch, generates the certificates, and creates etcd. The other control planes will join using the certificates generated in this step.
Connect to node kube-ctrl-01 (10.48.9.2):
sudo hostnamectl set-hostname kube-ctrl-01.geanmartins.net
The kubeadm-config.yaml file defines all cluster initialization parameters. Create a directory for artifacts:
mkdir -p ~/artefacts/yaml
cd ~/artefacts/yaml
Create the kubeadm-config.yaml file:
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
localAPIEndpoint:
  # IP of the first control plane (kube-ctrl-01)
  advertiseAddress: 10.48.9.2
  bindPort: 6443
nodeRegistration:
  # CRI-O socket (do not use Docker)
  criSocket: unix:///var/run/crio/crio.sock
  # Node IPs (IPv4 and IPv6)
  kubeletExtraArgs:
    - name: node-ip
      value: "10.48.9.2,fd00:0:b:9::2"
  name: kube-ctrl-01.geanmartins.net
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
clusterName: kubernetes
kubernetesVersion: 1.35.0
# Endpoint that all clients will use to reach the API
# Points to HAProxy (kube-lb)
controlPlaneEndpoint: "api.cluster.geanmartins.net:6443"
# API server configuration
apiServer:
  certSANs:
    # Control plane hostnames (for the certificates)
    - "kube-ctrl-01"
    - "kube-ctrl-01.geanmartins.net"
    - "kube-ctrl-02"
    - "kube-ctrl-02.geanmartins.net"
    - "kube-ctrl-03"
    - "kube-ctrl-03.geanmartins.net"
    # HAProxy endpoint
    - "api.cluster.geanmartins.net"
    - "10.48.9.100"
    - "fd00:0:b:9::63"
    # Default Kubernetes names
    - "kubernetes"
    - "kubernetes.default"
    - "kubernetes.default.svc"
    - "kubernetes.default.svc.cluster.local"
    # Control plane IPs (for certificate validation)
    - "10.48.9.2"
    - "10.48.9.3"
    - "10.48.9.4"
    - "fd00:0:b:9::2"
    - "fd00:0:b:9::3"
    - "fd00:0:b:9::4"
    # Localhost (for local access)
    - "127.0.0.1"
    - "::1"
    - "localhost"
# Network configuration
networking:
  dnsDomain: cluster.local
  # Pod networks (Calico)
  podSubnet: "10.51.0.0/16,fd00:0:b:a000::/56"
  # Service networks (ClusterIP)
  serviceSubnet: "10.49.0.0/16,fd00:0:b:d::/108"
Initialize the cluster with --upload-certs
The --upload-certs flag makes kubeadm encrypt the control plane certificates and store them as a Secret in the cluster for 2 hours. This lets the other control planes join without copying certificates manually via scp.
sudo kubeadm init --config kubeadm-config.yaml --upload-certs
Expected output (abridged):
[init] Using Kubernetes version: v1.35.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
...
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods
[control-plane-check] Checking kube-apiserver at https://10.48.9.2:6443/livez
...
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
02f196c79ff6a5d849da67c2cf07d4e8ea29f8623c1b4567dedd01bcade9b441
[bootstrap-token] Using token: x9yc1c.33p2qxamjylnnx1e
...
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You can now join any number of control-plane nodes running the following command on each as root:
kubeadm join api.cluster.geanmartins.net:6443 --token x9yc1c.33p2qxamjylnnx1e \
--discovery-token-ca-cert-hash sha256:1d608bd78192f705846c2ae16ad0dd3b159edb0093cd6322a3f9ce72ab11412a \
--control-plane --certificate-key 02f196c79ff6a5d849da67c2cf07d4e8ea29f8623c1b4567dedd01bcade9b441
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join api.cluster.geanmartins.net:6443 --token x9yc1c.33p2qxamjylnnx1e \
--discovery-token-ca-cert-hash sha256:1d608bd78192f705846c2ae16ad0dd3b159edb0093cd6322a3f9ce72ab11412a
Important: Save the token, discovery-token-ca-cert-hash, and certificate-key values. You will need them to add the other control planes and the workers.
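Before pasting the saved values into the join files, it can help to confirm they were copied intact. The following is a minimal sketch, assuming the usual kubeadm formats (a token of 6 and 16 lowercase alphanumerics separated by a dot, a sha256: hash with 64 hex digits, and a 64-hex-digit certificate key). The function names are hypothetical helpers, not part of kubeadm.

```shell
# Hypothetical helpers: sanity-check values copied from the `kubeadm init` output.

# matches VALUE REGEX: succeeds when VALUE matches the extended regex.
matches() { printf '%s\n' "$1" | grep -Eq "$2"; }

# Bootstrap token: 6 lowercase alphanumerics, a dot, then 16 more.
valid_token() { matches "$1" '^[a-z0-9]{6}\.[a-z0-9]{16}$'; }

# CA certificate hash: "sha256:" followed by 64 hex digits.
valid_hash() { matches "$1" '^sha256:[0-9a-f]{64}$'; }

# Certificate key: 64 hex digits.
valid_cert_key() { matches "$1" '^[0-9a-f]{64}$'; }

# check_join_values TOKEN HASH CERT_KEY: reports every malformed value on stderr.
check_join_values() {
  status=0
  valid_token "$1"    || { echo "malformed token" >&2; status=1; }
  valid_hash "$2"     || { echo "malformed CA cert hash" >&2; status=1; }
  valid_cert_key "$3" || { echo "malformed certificate key" >&2; status=1; }
  return "$status"
}
```

For example, `check_join_values "$TOKEN" "$HASH" "$CERTIFICATE"` returns 0 when all three values look well-formed. It only checks the shape of the values, not whether the cluster still accepts them.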
To use kubectl without sudo:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Check the cluster status:
kubectl get nodes
NAME                           STATUS     ROLES           AGE   VERSION
kube-ctrl-01.geanmartins.net   NotReady   control-plane   72s   v1.35.3
Note: The status is NotReady because the pod network (Calico) has not been installed yet. This is normal.
Second control plane (kube-ctrl-02)
On node kube-ctrl-02 (10.48.9.3):
sudo hostnamectl set-hostname kube-ctrl-02.geanmartins.net
For kubeadm-config-join-ctrl.yaml, use the values obtained from the initialization above:
HASH="sha256:1d608bd78192f705846c2ae16ad0dd3b159edb0093cd6322a3f9ce72ab11412a"
TOKEN="x9yc1c.33p2qxamjylnnx1e"
CERTIFICATE="02f196c79ff6a5d849da67c2cf07d4e8ea29f8623c1b4567dedd01bcade9b441"
Create the file:
cat > kubeadm-config-join-ctrl.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta4
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: api.cluster.geanmartins.net:6443
    caCertHashes:
      - $HASH
    token: $TOKEN
  tlsBootstrapToken: $TOKEN
controlPlane:
  localAPIEndpoint:
    # IP of the second control plane
    advertiseAddress: 10.48.9.3
    bindPort: 6443
  # Certificate key used to retrieve the certificates from the Secret
  certificateKey: $CERTIFICATE
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  kubeletExtraArgs:
    - name: node-ip
      value: "10.48.9.3,fd00:0:b:9::3"
  name: kube-ctrl-02.geanmartins.net
EOF
sudo kubeadm join --config=kubeadm-config-join-ctrl.yaml
Wait a few minutes. The node will connect to the cluster, retrieve the certificates, and start the control plane components.
Third control plane (kube-ctrl-03)
On node kube-ctrl-03 (10.48.9.4):
sudo hostnamectl set-hostname kube-ctrl-03.geanmartins.net
For kubeadm-config-join-ctrl.yaml, use the same HASH, TOKEN, and CERTIFICATE values:
HASH="sha256:1d608bd78192f705846c2ae16ad0dd3b159edb0093cd6322a3f9ce72ab11412a"
TOKEN="x9yc1c.33p2qxamjylnnx1e"
CERTIFICATE="02f196c79ff6a5d849da67c2cf07d4e8ea29f8623c1b4567dedd01bcade9b441"
cat > kubeadm-config-join-ctrl.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta4
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: api.cluster.geanmartins.net:6443
    caCertHashes:
      - $HASH
    token: $TOKEN
  tlsBootstrapToken: $TOKEN
controlPlane:
  localAPIEndpoint:
    advertiseAddress: 10.48.9.4
    bindPort: 6443
  certificateKey: $CERTIFICATE
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  kubeletExtraArgs:
    - name: node-ip
      value: "10.48.9.4,fd00:0:b:9::4"
  name: kube-ctrl-03.geanmartins.net
EOF
sudo kubeadm join --config=kubeadm-config-join-ctrl.yaml
Workers do not run control plane components (API server, etcd, scheduler, controller-manager). They only run the kubelet and receive pods.
Worker kube-worker-01
sudo hostnamectl set-hostname kube-worker-01.geanmartins.net
HASH="sha256:1d608bd78192f705846c2ae16ad0dd3b159edb0093cd6322a3f9ce72ab11412a"
TOKEN="x9yc1c.33p2qxamjylnnx1e"
cat > kubeadm-config-join.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta4
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: api.cluster.geanmartins.net:6443
    caCertHashes:
      - $HASH
    token: $TOKEN
  tlsBootstrapToken: $TOKEN
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  kubeletExtraArgs:
    - name: node-ip
      value: "10.48.9.20,fd00:0:b:9::14"
  name: kube-worker-01.geanmartins.net
EOF
sudo kubeadm join --config=kubeadm-config-join.yaml
Worker kube-worker-02
sudo hostnamectl set-hostname kube-worker-02.geanmartins.net
HASH="sha256:1d608bd78192f705846c2ae16ad0dd3b159edb0093cd6322a3f9ce72ab11412a"
TOKEN="x9yc1c.33p2qxamjylnnx1e"
cat > kubeadm-config-join.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta4
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: api.cluster.geanmartins.net:6443
    caCertHashes:
      - $HASH
    token: $TOKEN
  tlsBootstrapToken: $TOKEN
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  kubeletExtraArgs:
    - name: node-ip
      value: "10.48.9.21,fd00:0:b:9::15"
  name: kube-worker-02.geanmartins.net
EOF
sudo kubeadm join --config=kubeadm-config-join.yaml
Worker kube-worker-03
sudo hostnamectl set-hostname kube-worker-03.geanmartins.net
HASH="sha256:1d608bd78192f705846c2ae16ad0dd3b159edb0093cd6322a3f9ce72ab11412a"
TOKEN="x9yc1c.33p2qxamjylnnx1e"
cat > kubeadm-config-join.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta4
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: api.cluster.geanmartins.net:6443
    caCertHashes:
      - $HASH
    token: $TOKEN
  tlsBootstrapToken: $TOKEN
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  kubeletExtraArgs:
    - name: node-ip
      value: "10.48.9.22,fd00:0:b:9::16"
  name: kube-worker-03.geanmartins.net
EOF
sudo kubeadm join --config=kubeadm-config-join.yaml
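The three worker files differ only in the node name and the node-ip pair, so they can be generated from one template. This is a sketch rather than part of the original procedure: render_worker_join and its positional arguments are hypothetical, while the endpoint, domain, and CRI-O socket are copied from the files above.

```shell
# render_worker_join NAME IPV4 IPV6 TOKEN HASH
# Prints a worker JoinConfiguration on stdout (hypothetical helper).
render_worker_join() {
  cat <<EOF
apiVersion: kubeadm.k8s.io/v1beta4
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: api.cluster.geanmartins.net:6443
    caCertHashes:
      - $5
    token: $4
  tlsBootstrapToken: $4
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  kubeletExtraArgs:
    - name: node-ip
      value: "$2,$3"
  name: $1.geanmartins.net
EOF
}
```

On kube-worker-02, for instance, `render_worker_join kube-worker-02 10.48.9.21 fd00:0:b:9::15 "$TOKEN" "$HASH" > kubeadm-config-join.yaml` would produce the same file as the heredoc shown for that node.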
Note on expired tokens: kubeadm bootstrap tokens expire after 24 hours. If the token expires before you have added all the workers, generate a new one from any control plane:
kubeadm token create --print-join-command
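The certificate key expires sooner than the token (the Secret lives for 2 hours), so a late control plane join also needs a fresh key. Running `sudo kubeadm init phase upload-certs --upload-certs` on an existing control plane re-uploads the certificates and prints a new key. The sketch below, with a hypothetical helper name, pulls that key out of the output, assuming it appears on its own line as in the init output shown earlier.

```shell
# extract_cert_key: reads kubeadm output on stdin and prints any line that
# consists solely of 64 hex digits (the certificate key). Hypothetical helper.
extract_cert_key() {
  sed -n 's/^[[:space:]]*\([0-9a-f]\{64\}\)[[:space:]]*$/\1/p'
}

# Intended use on a control plane (not run here):
#   CERTIFICATE=$(sudo kubeadm init phase upload-certs --upload-certs | extract_cert_key)
```

The new CERTIFICATE value can then replace the expired one in kubeadm-config-join-ctrl.yaml.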
Back on kube-ctrl-01, verify that all nodes have joined:
kubectl get nodes
NAME                             STATUS     ROLES           AGE     VERSION
kube-ctrl-01.geanmartins.net     NotReady   control-plane   18m     v1.35.3
kube-ctrl-02.geanmartins.net     NotReady   control-plane   7m11s   v1.35.3
kube-ctrl-03.geanmartins.net     NotReady   control-plane   6m26s   v1.35.3
kube-worker-01.geanmartins.net   NotReady   <none>          2m26s   v1.35.3
kube-worker-02.geanmartins.net   NotReady   <none>          119s    v1.35.3
kube-worker-03.geanmartins.net   NotReady   <none>          86s     v1.35.3
All nodes showing NotReady? That is expected. They will stay NotReady until the pod network (Calico) is installed.
Calico is a CNI (Container Network Interface) plugin that provides pod networking and network security policies. Run from kube-ctrl-01:
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.31.4/manifests/operator-crds.yaml
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.31.4/manifests/tigera-operator.yaml
Download the default custom-resources.yaml file:
curl -O https://raw.githubusercontent.com/projectcalico/calico/v3.31.4/manifests/custom-resources.yaml
Edit it to configure the pod networks (IPv4 and IPv6):
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    mtu: 1450
    ipPools:
      # IPv4 pool for pods
      - name: default-ipv4-ippool
        blockSize: 26
        cidr: 10.51.0.0/16
        encapsulation: VXLANCrossSubnet
        natOutgoing: Enabled
        nodeSelector: all()
      # IPv6 pool for pods
      - name: default-ipv6-ippool
        blockSize: 122
        cidr: fd00:0:b:a000::/56
        encapsulation: VXLANCrossSubnet
        natOutgoing: Enabled
        nodeSelector: all()
---
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
---
apiVersion: operator.tigera.io/v1
kind: Goldmane
metadata:
  name: default
---
apiVersion: operator.tigera.io/v1
kind: Whisker
metadata:
  name: default
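The blockSize values decide how Calico's IPAM carves each pool into per-node blocks. As a rough illustration of the arithmetic only (the helper names are hypothetical), a block holds 2^(address bits - blockSize) addresses, and a pool holds 2^(blockSize - pool prefix) blocks:

```shell
# addrs_per_block ADDRESS_BITS BLOCK_SIZE: addresses in one IPAM block.
addrs_per_block() { echo $(( 1 << ($1 - $2) )); }

# blocks_in_pool POOL_PREFIX BLOCK_SIZE: number of blocks the pool can hold.
# Only meaningful while the difference stays below 63 bits.
blocks_in_pool() { echo $(( 1 << ($2 - $1) )); }

addrs_per_block 32 26     # IPv4 pool, blockSize 26   -> 64
addrs_per_block 128 122   # IPv6 pool, blockSize 122  -> 64
blocks_in_pool 16 26      # 10.51.0.0/16 in /26 blocks -> 1024
```

With these settings, both address families hand out pod addresses in blocks of 64, and the IPv4 pool has room for 1024 such blocks across the cluster.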
kubectl create -f custom-resources.yaml
kubectl get pods -n calico-system
Wait until all pods are in the Running status:
NAME                                       READY   STATUS    RESTARTS   AGE
calico-apiserver-7bf78799c8-h25cg          1/1     Running   0          2m27s
calico-apiserver-7bf78799c8-nxdm2          1/1     Running   0          2m27s
calico-kube-controllers-67d8d84fd7-5n94l   1/1     Running   0          2m24s
calico-node-vflkx                          1/1     Running   0          2m24s
calico-typha-56fd687bd8-hl98h              1/1     Running   0          2m25s
csi-node-driver-pjhwr                      2/2     Running   0          2m24s
goldmane-58f96f7c58-nfbch                  1/1     Running   0          2m26s
whisker-5965949594-kqsvg                   2/2     Running   0          77s
Once Calico is ready, all nodes should be in the Ready status:
kubectl get nodes
NAME                             STATUS   ROLES           AGE     VERSION
kube-ctrl-01.geanmartins.net     Ready    control-plane   18m     v1.35.3
kube-ctrl-02.geanmartins.net     Ready    control-plane   7m11s   v1.35.3
kube-ctrl-03.geanmartins.net     Ready    control-plane   6m26s   v1.35.3
kube-worker-01.geanmartins.net   Ready    <none>          2m26s   v1.35.3
kube-worker-02.geanmartins.net   Ready    <none>          119s    v1.35.3
kube-worker-03.geanmartins.net   Ready    <none>          86s     v1.35.3
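Instead of rereading the table by eye, the node list can be checked with a small filter. This is a minimal sketch with a hypothetical helper name; it treats any STATUS other than exactly Ready as not ready, so a cordoned node (Ready,SchedulingDisabled) is counted too.

```shell
# count_not_ready: reads `kubectl get nodes` output on stdin (header included)
# and prints how many nodes have a STATUS other than "Ready".
count_not_ready() {
  awk 'NR > 1 && $2 != "Ready" { n++ } END { print n + 0 }'
}

# Intended use (not run here):
#   kubectl get nodes | count_not_ready
```

A result of 0 means every node reports Ready. kubectl also has a built-in alternative, `kubectl wait --for=condition=Ready nodes --all --timeout=300s`, which blocks until the condition holds or the timeout expires.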
kubectl cluster-info
Kubernetes control plane is running at https://api.cluster.geanmartins.net:6443
CoreDNS is running at https://api.cluster.geanmartins.net:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
sudo apt install bash-completion
# bash completion
sudo sh -c 'kubeadm completion bash > /etc/bash_completion.d/kubeadm'
sudo sh -c 'kubectl completion bash > /etc/bash_completion.d/kubectl'
sudo sh -c 'crictl completion bash > /etc/bash_completion.d/crictl'
# Load bash completion into the current shell
source /usr/share/bash-completion/bash_completion
Run from any control plane:
kubectl -n kube-system exec etcd-kube-ctrl-01.geanmartins.net -- \
etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/peer.crt \
--key /etc/kubernetes/pki/etcd/peer.key \
member list -w table
Expected output:
+------------------+---------+------------------------------+------------------------+------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+------------------------------+------------------------+------------------------+------------+
| 11b8766d55d9f565 | started | kube-ctrl-02.geanmartins.net | https://10.48.9.3:2380 | https://10.48.9.3:2379 | false |
| 1e6a4d502836b082 | started | kube-ctrl-03.geanmartins.net | https://10.48.9.4:2380 | https://10.48.9.4:2379 | false |
| 850f503a191542d7 | started | kube-ctrl-01.geanmartins.net | https://10.48.9.2:2380 | https://10.48.9.2:2379 | false |
+------------------+---------+------------------------------+------------------------+------------------------+------------+
kubectl -n kube-system exec etcd-kube-ctrl-01.geanmartins.net -- \
etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/peer.crt \
--key /etc/kubernetes/pki/etcd/peer.key \
endpoint health \
--endpoints=https://10.48.9.2:2379,https://10.48.9.3:2379,https://10.48.9.4:2379
Expected output:
https://10.48.9.2:2379 is healthy: successfully committed proposal: took = 39.926304ms
https://10.48.9.3:2379 is healthy: successfully committed proposal: took = 40.601729ms
https://10.48.9.4:2379 is healthy: successfully committed proposal: took = 54.600412ms
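For scripted checks, the health lines can be counted rather than read. The helper below is a hypothetical sketch that matches the "is healthy" wording shown in the output above. Depending on the etcdctl version, the health lines may be written to stderr, so the usage example merges both streams.

```shell
# count_healthy: reads `etcdctl endpoint health` output on stdin and prints
# the number of endpoints reported as healthy.
count_healthy() {
  awk '/ is healthy/ { n++ } END { print n + 0 }'
}

# Intended use (not run here); "..." stands for the etcdctl flags shown above:
#   kubectl -n kube-system exec etcd-kube-ctrl-01.geanmartins.net -- \
#     etcdctl ... endpoint health --endpoints=... 2>&1 | count_healthy
```

With all three members up, the expected count is 3.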
kubectl -n kube-system exec etcd-kube-ctrl-01.geanmartins.net -- \
etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/peer.crt \
--key /etc/kubernetes/pki/etcd/peer.key \
endpoint status \
--endpoints=https://10.48.9.2:2379,https://10.48.9.3:2379,https://10.48.9.4:2379 \
-w table
The member with IS LEADER = true is the current leader.
To confirm that the cluster stays operational after losing a control plane:
# 1. Check the current state
kubectl get nodes
# 2. Simulate a failure by stopping the kubelet on kube-ctrl-01
#    (already-running static pod containers typically keep running;
#    power off the node to simulate a complete outage)
sudo systemctl stop kubelet
# 3. Wait about 40 seconds and check again
kubectl get nodes
# kube-ctrl-01 will appear as NotReady
# 4. Confirm that the API still responds through HAProxy
kubectl get pods -n kube-system
# 5. Confirm that etcd still has quorum (2 of 3 members)
kubectl -n kube-system exec etcd-kube-ctrl-02.geanmartins.net -- \
etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/peer.crt \
--key /etc/kubernetes/pki/etcd/peer.key \
endpoint health \
--endpoints=https://10.48.9.3:2379,https://10.48.9.4:2379
# 6. Restore the kubelet
sudo systemctl start kubelet
Expected result: With 3 etcd members, the cluster tolerates the failure of 1 control plane without losing quorum. Losing 2 control planes at the same time leaves etcd without quorum, so the cluster can no longer accept writes.
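The tolerance figure follows from majority quorum: a cluster of n members needs floor(n/2) + 1 of them to agree, so it survives n minus that many failures. A small sketch of the arithmetic, with hypothetical helper names:

```shell
# quorum N: members required for a majority in an N-member etcd cluster.
quorum() { echo $(( $1 / 2 + 1 )); }

# tolerated_failures N: members that can fail while a majority remains.
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

quorum 3               # -> 2
tolerated_failures 3   # -> 1
tolerated_failures 4   # -> 1
tolerated_failures 5   # -> 2
```

Note that a fourth member adds no fault tolerance over three, which is why etcd clusters are normally sized with an odd number of members.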
On the kube-lb node:
sudo journalctl -u haproxy -f
HAProxy will log events such as:
- kube-ctrl-01 is DOWN when the backend becomes unavailable
- kube-ctrl-01 is UP when the backend recovers
This confirms that HAProxy detects failures automatically and redirects traffic.