LINUX.ORG.RU

Cilium pod Init:0/6 Init:Error — not working


Cilium does not work in Talos. Pod status:

Init:0/6

Init:Error

Adding tolerations does not help.

Installing Talos:

talosctl gen secrets  

talosctl gen config --with-secrets secrets.yaml talos-vbox https://192.168.1.100:6443 

talosctl machineconfig patch controlplane.yaml --patch patch.yaml -o controlplane_patched.yaml

talosctl apply-config --insecure -n 192.168.1.100 --file controlplane_patched.yaml


talosctl bootstrap --nodes 192.168.1.100 --endpoints 192.168.1.100   --talosconfig=talosconfig


talosctl kubeconfig -n 192.168.1.100 --endpoints 192.168.1.100   --talosconfig=talosconfig
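
At this point the node registers but stays NotReady, which is expected until a CNI is deployed. A quick sanity check (assumes the node IP and talosconfig from above):

talosctl -n 192.168.1.100 health --talosconfig=talosconfig   # may keep waiting on kube components until the CNI is up
kubectl get nodes -o wide                                    # node shows NotReady without a CNI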

In patch.yaml we disable flannel and kube-proxy:

cluster:
  network:
    cni:
      name: none
  proxy:
    disabled: true
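
A quick way to confirm the patch actually applied: with cni set to none and the proxy disabled, kube-system should contain neither flannel nor kube-proxy:

kubectl -n kube-system get ds     # no kube-flannel or kube-proxy DaemonSets should be listed
kubectl -n kube-system get pods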

Installing Cilium with Helm:

helm install cilium cilium/cilium --version 1.18.0 -n kube-system

helm upgrade cilium cilium/cilium --version 1.18.0 \
  --namespace kube-system \
  --set ipam.mode=kubernetes \
  --set kubeProxyReplacement=true \
  --set operator.replicas=1 \
  --set hubble.enabled=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set l2podAnnouncements.interface="enp0s3" \
  --set devices=enp0s3
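
One way to wait for the agent to come up (or fail) after the upgrade; the second line is optional and assumes the cilium CLI is installed locally:

kubectl -n kube-system rollout status daemonset/cilium --timeout=120s
cilium status --wait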

Checking pod status:

kubectl get pod -A -o wide

NAMESPACE     NAME                                         READY   STATUS                  RESTARTS        AGE
kube-system   cilium-envoy-6n6pc                           1/1     Running                 0               29m
kube-system   cilium-operator-85c86d7fb9-rmft5             0/1     Pending                 0               29m
kube-system   cilium-operator-85c86d7fb9-t7xbs             0/1     Pending                 0               8m9s
kube-system   cilium-w96nb                                 0/1     Init:CrashLoopBackOff   7 (9m58s ago)   29m
kube-system   coredns-7859998f6-chfpr                      0/1     Pending                 0               77m
kube-system   coredns-7859998f6-f55jm                      0/1     Pending                 0               77m
kube-system   kube-apiserver-node01.localdomain            1/1     Terminated              0               33m
kube-system   kube-controller-manager-node01.localdomain   1/1     Terminated              2 (33m ago)     33m
kube-system   kube-scheduler-node01.localdomain            1/1     Terminated              2 (33m ago)     33m

What's going on with this poor pod:

kubectl describe pods/cilium-w96nb -n kube-system

Name:                 cilium-w96nb
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      cilium
Node:                 node01.localdomain/192.168.1.100
Start Time:           Fri, 06 Feb 2026 00:09:39 +0300
Labels:               app.kubernetes.io/name=cilium-agent
                      app.kubernetes.io/part-of=cilium
                      controller-revision-hash=7b7c49857d
                      k8s-app=cilium
                      pod-template-generation=1
Annotations:          kubectl.kubernetes.io/default-container: cilium-agent
Status:               Pending
SeccompProfile:       Unconfined
IP:                   192.168.1.100
IPs:
  IP:           192.168.1.100
Controlled By:  DaemonSet/cilium
Init Containers:
  config:
    Container ID:  containerd://a4e7bcc4d71a2f98e9b50a2969c5c80e43e53d35b0f51c2e8816c80fd2822e0b
    Image:         quay.io/cilium/cilium:v1.18.0@sha256:dfea023972d06ec183cfa3c9e7809716f85daaff042e573ef366e9ec6a0c0ab2
    Image ID:      quay.io/cilium/cilium@sha256:dfea023972d06ec183cfa3c9e7809716f85daaff042e573ef366e9ec6a0c0ab2
    Port:          <none>
    Host Port:     <none>
    Command:
      cilium-dbg
      build-config
    State:       Running
      Started:   Fri, 06 Feb 2026 00:10:46 +0300
    Last State:  Terminated
      Reason:    Error
      Message:   time=2026-02-05T21:09:40.784802739Z level=info msg=Running subsys=cilium-dbg
time=2026-02-05T21:09:40.786691799Z level=info msg="Starting hive" subsys=cilium-dbg
time=2026-02-05T21:09:40.786901748Z level=info msg="Establishing connection to apiserver" subsys=cilium-dbg module=k8s-client ipAddr=https://10.96.0.1:443
time=2026-02-05T21:10:15.816825365Z level=info msg="Establishing connection to apiserver" subsys=cilium-dbg module=k8s-client ipAddr=https://10.96.0.1:443
time=2026-02-05T21:10:45.842985179Z level=error msg="Unable to contact k8s api-server" subsys=cilium-dbg module=k8s-client ipAddr=https://10.96.0.1:443 error="Get \"https://10.96.0.1:443/api/v1/namespaces/kube-system\": dial tcp 10.96.0.1:443: i/o timeout"
time=2026-02-05T21:10:45.843082548Z level=error msg="Start hook failed" subsys=cilium-dbg function="client.(*compositeClientset).onStart (k8s-client)" error="Get \"https://10.96.0.1:443/api/v1/namespaces/kube-system\": dial tcp 10.96.0.1:443: i/o timeout"
time=2026-02-05T21:10:45.843105244Z level=error msg="Failed to start hive" subsys=cilium-dbg error="Get \"https://10.96.0.1:443/api/v1/namespaces/kube-system\": dial tcp 10.96.0.1:443: i/o timeout" duration=1m5.056323338s
time=2026-02-05T21:10:45.843150241Z level=info msg="Stopping hive" subsys=cilium-dbg
time=2026-02-05T21:10:45.843208798Z level=info msg="Stopped hive" subsys=cilium-dbg duration=47.542µs
Error: Build config failed: failed to start: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system": dial tcp 10.96.0.1:443: i/o timeout


      Exit Code:    1
      Started:      Fri, 06 Feb 2026 00:09:40 +0300
      Finished:     Fri, 06 Feb 2026 00:10:45 +0300
    Ready:          False
    Restart Count:  1
    Environment:
      K8S_NODE_NAME:          (v1:spec.nodeName)
      CILIUM_K8S_NAMESPACE:  kube-system (v1:metadata.namespace)
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pqjr8 (ro)
  mount-cgroup:
    Container ID:
    Image:         quay.io/cilium/cilium:v1.18.0@sha256:dfea023972d06ec183cfa3c9e7809716f85daaff042e573ef366e9ec6a0c0ab2
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -ec
      cp /usr/bin/cilium-mount /hostbin/cilium-mount;
      nsenter --cgroup=/hostproc/1/ns/cgroup --mount=/hostproc/1/ns/mnt "${BIN_PATH}/cilium-mount" $CGROUP_ROOT;
      rm /hostbin/cilium-mount

    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      CGROUP_ROOT:  /run/cilium/cgroupv2
      BIN_PATH:     /opt/cni/bin
    Mounts:
      /hostbin from cni-path (rw)
      /hostproc from hostproc (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pqjr8 (ro)
  apply-sysctl-overwrites:
    Container ID:
    Image:         quay.io/cilium/cilium:v1.18.0@sha256:dfea023972d06ec183cfa3c9e7809716f85daaff042e573ef366e9ec6a0c0ab2
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -ec
      cp /usr/bin/cilium-sysctlfix /hostbin/cilium-sysctlfix;
      nsenter --mount=/hostproc/1/ns/mnt "${BIN_PATH}/cilium-sysctlfix";
      rm /hostbin/cilium-sysctlfix

    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      BIN_PATH:  /opt/cni/bin
    Mounts:
      /hostbin from cni-path (rw)
      /hostproc from hostproc (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pqjr8 (ro)
  mount-bpf-fs:
    Container ID:
    Image:         quay.io/cilium/cilium:v1.18.0@sha256:dfea023972d06ec183cfa3c9e7809716f85daaff042e573ef366e9ec6a0c0ab2
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      -c
      --
    Args:
      mount | grep "/sys/fs/bpf type bpf" || mount -t bpf bpf /sys/fs/bpf
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /sys/fs/bpf from bpf-maps (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pqjr8 (ro)
  clean-cilium-state:
    Container ID:
    Image:         quay.io/cilium/cilium:v1.18.0@sha256:dfea023972d06ec183cfa3c9e7809716f85daaff042e573ef366e9ec6a0c0ab2
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /init-container.sh
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      CILIUM_ALL_STATE:           <set to the key 'clean-cilium-state' of config map 'cilium-config'>         Optional: true
      CILIUM_BPF_STATE:           <set to the key 'clean-cilium-bpf-state' of config map 'cilium-config'>     Optional: true
      WRITE_CNI_CONF_WHEN_READY:  <set to the key 'write-cni-conf-when-ready' of config map 'cilium-config'>  Optional: true
    Mounts:
      /run/cilium/cgroupv2 from cilium-cgroup (rw)
      /sys/fs/bpf from bpf-maps (rw)
      /var/run/cilium from cilium-run (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pqjr8 (ro)
  install-cni-binaries:
    Container ID:
    Image:         quay.io/cilium/cilium:v1.18.0@sha256:dfea023972d06ec183cfa3c9e7809716f85daaff042e573ef366e9ec6a0c0ab2
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /install-plugin.sh
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        100m
      memory:     10Mi
    Environment:  <none>
    Mounts:
      /host/opt/cni/bin from cni-path (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pqjr8 (ro)
Containers:
  cilium-agent:
    Container ID:
    Image:         quay.io/cilium/cilium:v1.18.0@sha256:dfea023972d06ec183cfa3c9e7809716f85daaff042e573ef366e9ec6a0c0ab2
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      cilium-agent
    Args:
      --config-dir=/tmp/cilium/config-map
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://127.0.0.1:9879/healthz delay=0s timeout=5s period=30s #success=1 #failure=10
    Readiness:      http-get http://127.0.0.1:9879/healthz delay=0s timeout=5s period=30s #success=1 #failure=3
    Startup:        http-get http://127.0.0.1:9879/healthz delay=5s timeout=1s period=2s #success=1 #failure=300
    Environment:
      K8S_NODE_NAME:                  (v1:spec.nodeName)
      CILIUM_K8S_NAMESPACE:          kube-system (v1:metadata.namespace)
      CILIUM_CLUSTERMESH_CONFIG:     /var/lib/cilium/clustermesh/
      GOMEMLIMIT:                    node allocatable (limits.memory)
      KUBE_CLIENT_BACKOFF_BASE:      1
      KUBE_CLIENT_BACKOFF_DURATION:  120
    Mounts:
      /host/etc/cni/net.d from etc-cni-netd (rw)
      /host/proc/sys/kernel from host-proc-sys-kernel (rw)
      /host/proc/sys/net from host-proc-sys-net (rw)
      /lib/modules from lib-modules (ro)
      /run/xtables.lock from xtables-lock (rw)
      /sys/fs/bpf from bpf-maps (rw)
      /tmp from tmp (rw)
      /var/lib/cilium/clustermesh from clustermesh-secrets (ro)
      /var/lib/cilium/tls/hubble from hubble-tls (ro)
      /var/run/cilium from cilium-run (rw)
      /var/run/cilium/envoy/sockets from envoy-sockets (rw)
      /var/run/cilium/netns from cilium-netns (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pqjr8 (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 False
  Ready                       False
  ContainersReady             False
  PodScheduled                True
Volumes:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  cilium-run:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/cilium
    HostPathType:  DirectoryOrCreate
  cilium-netns:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/netns
    HostPathType:  DirectoryOrCreate
  bpf-maps:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/fs/bpf
    HostPathType:  DirectoryOrCreate
  hostproc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc
    HostPathType:  Directory
  cilium-cgroup:
    Type:          HostPath (bare host directory volume)
    Path:          /run/cilium/cgroupv2
    HostPathType:  DirectoryOrCreate
  cni-path:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:  DirectoryOrCreate
  etc-cni-netd:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:  DirectoryOrCreate
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  envoy-sockets:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/cilium/envoy/sockets
    HostPathType:  DirectoryOrCreate
  clustermesh-secrets:
    Type:        Projected (a volume that contains injected data from multiple sources)
    SecretName:  cilium-clustermesh
    Optional:    true
    SecretName:  clustermesh-apiserver-remote-cert
    Optional:    true
    SecretName:  clustermesh-apiserver-local-cert
    Optional:    true
  host-proc-sys-net:
    Type:          HostPath (bare host directory volume)
    Path:          /proc/sys/net
    HostPathType:  Directory
  host-proc-sys-kernel:
    Type:          HostPath (bare host directory volume)
    Path:          /proc/sys/kernel
    HostPathType:  Directory
  hubble-tls:
    Type:        Projected (a volume that contains injected data from multiple sources)
    SecretName:  hubble-server-certs
    Optional:    true
  kube-api-access-pqjr8:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    Optional:                false
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type    Reason     Age                From               Message
  ----    ------     ----               ----               -------
  Normal  Scheduled  82s                default-scheduler  Successfully assigned kube-system/cilium-w96nb to node01.localdomain
  Normal  Pulled     16s (x2 over 82s)  kubelet            Container image "quay.io/cilium/cilium:v1.18.0@sha256:dfea023972d06ec183cfa3c9e7809716f85daaff042e573ef366e9ec6a0c0ab2" already present on machine and can be accessed by the pod
  Normal  Created    16s (x2 over 82s)  kubelet            Container created
  Normal  Started    16s (x2 over 82s)  kubelet            Container started



You have got to be kidding. Not you personally, I mean the situation in general. Lately there are more and more wannabe devops who are completely clueless. Learn to read your own log. It tells you, several times:

dial tcp 10.96.0.1:443: i/o timeout

You literally enabled --set kubeProxyReplacement=true while kube-proxy itself is disabled, yet you never told Cilium where the API server actually is (k8sServiceHost/k8sServicePort). So the init container keeps dialing the kubernetes Service ClusterIP 10.96.0.1, and without kube-proxy (or an already running Cilium) there is nothing on the node to translate that ClusterIP into the real API server address.

What's more, the phrase dial tcp 10.96.0.1:443: i/o timeout googles instantly, right down to a similar case in GH issues, and even basic AI agents find the cause right away (I double-checked on purpose). You can't just dump things on the forum like this, mindlessly, without even trying to understand the problem.
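
Verifying it takes a minute right on the cluster:

kubectl -n kube-system get ds kube-proxy   # should return NotFound, since kube-proxy is disabled
kubectl get svc kubernetes                  # ClusterIP 10.96.0.1, the address the init container times out on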

Obezyan
()

Here is a working set of cilium helm parameters:

helm upgrade cilium cilium/cilium --version 1.18.0 \
  --namespace kube-system \
  --set ipam.mode=kubernetes \
  --set kubeProxyReplacement=true \
  --set securityContext.capabilities.ciliumAgent="{CHOWN,KILL,NET_ADMIN,NET_RAW,IPC_LOCK,SYS_ADMIN,SYS_RESOURCE,DAC_OVERRIDE,FOWNER,SETGID,SETUID}" \
  --set securityContext.capabilities.cleanCiliumState="{NET_ADMIN,SYS_ADMIN,SYS_RESOURCE}" \
  --set cgroup.autoMount.enabled=false \
  --set cgroup.hostRoot=/sys/fs/cgroup \
  --set k8sServiceHost=localhost \
  --set k8sServicePort=7445 \
  --set hubble.enabled=true \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set l2podAnnouncements.interface="enp0s3" \
  --set devices=enp0s3 \
  --set operator.replicas=1
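
With k8sServiceHost/k8sServicePort pointing at Talos KubePrism (localhost:7445, enabled by default on recent Talos), the init containers get through. A quick check after the upgrade (the last line assumes the cilium CLI is installed locally):

kubectl -n kube-system get pods -l k8s-app=cilium   # agent should reach Running 1/1
kubectl get nodes                                   # the node turns Ready once the CNI is up
cilium status --wait                                # optional, needs the cilium CLI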

This should come in handy for Talos newcomers.

antonio-an
() topic author

Before posting, did you try reading the text?

kube-system   kube-apiserver-node01.localdomain            1/1     Terminated              0               33m
kube-system   kube-controller-manager-node01.localdomain   1/1     Terminated              2 (33m ago)     33m
kube-system   kube-scheduler-node01.localdomain            1/1     Terminated              2 (33m ago)     33m
msg="Unable to contact k8s api-server"
l0stparadise ★★★★★
()