Self-hosted vanilla k8s

I’m trying to install self-hosted Gitpod on vanilla k8s. The node-daemon pod doesn’t start (Warning BackOff), and I’m looking for some additional info to help debug. I’m not sure what can cause the problem.

hi @fernfab!
can you share the output of kubectl logs and kubectl describe when you run it on that pod?
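For example (the pod name is a placeholder; use whatever kubectl get pods shows for the node-daemon pod):

kubectl get pods
kubectl describe pod <node-daemon-pod-name>
kubectl logs <node-daemon-pod-name>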

sure, actually now I’m having a problem with the image-builder pod:

thank you

k describe pod image-builder-7b9c669698-nw6kr
Name:           image-builder-7b9c669698-nw6kr
Namespace:      default
Priority:       0
Node:           asit4edev2163/10.86.210.203
Start Time:     Fri, 31 Jan 2020 12:41:15 -0600
Labels:         app=gitpod
                component=image-builder
                kind=pod
                pod-template-hash=7b9c669698
                stage=production
Annotations:    checksum/builtin-registry-auth: 0314f197223f233859ee247fa375973f7b0fe84ff1dd7ff7b967fdcbbf44e0dd
                checksum/image-builder-configmap.yaml: ff788f01aec5c9f2d5ade752e3f02a011f914a47320604dae68d1cf59b792c6e
Status:         Running
IP:             10.244.3.62
Controlled By:  ReplicaSet/image-builder-7b9c669698
Containers:
  dind:
    Container ID:  docker://ff81f595efe8c4b783d5096d74be8d2914a35ca646f39e9b4e481dd535729b77
    Image:         docker:18.06-dind
    Image ID:      docker-pullable://docker@sha256:cab1016728d2637f856cb9f5b16769de6806b55f4def7b2856abac42b1b21b0a
    Port:          <none>
    Host Port:     <none>
    Args:
      dockerd
      --userns-remap=default
      -H tcp://127.0.0.1:2375
    State:          Running
      Started:      Fri, 31 Jan 2020 12:41:16 -0600
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     100m
      memory:  256Mi
    Environment:
      KUBE_STAGE:                     production
      KUBE_NAMESPACE:                 default (v1:metadata.namespace)
      VERSION:                        v0.3.0
      HOST_URL:                       http://<internal-server>
      GITPOD_REGION:                  local
      GITPOD_INSTALLATION_LONGNAME:   production.gitpod.local.00
      GITPOD_INSTALLATION_SHORTNAME:  local-00
    Mounts:
      /etc/docker/certs.d/reg.<internal-server> from docker-tls-certs-0 (rw)
      /var/lib/docker from dind-storage (rw)
  service:
    Container ID:  docker://e996fff82bdf4065b557d25458be905ba5d87894bf1978185c06c21c0f2924c2
    Image:         gcr.io/gitpod-io/image-builder:v0.3.0
    Image ID:      docker-pullable://gcr.io/gitpod-io/image-builder@sha256:a373d8a97d1f2fa07867967076b8f71262c4875eb5ff1c50d755bba257bfd848
    Ports:         9500/TCP, 8080/TCP
    Host Ports:    0/TCP, 0/TCP
    Args:
      run
      -v
      --config
      /config/image-builder.json
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 31 Jan 2020 12:44:13 -0600
      Finished:     Fri, 31 Jan 2020 12:44:14 -0600
    Ready:          False
    Restart Count:  5
    Requests:
      cpu:     100m
      memory:  200Mi
    Environment:
      KUBE_STAGE:                     production
      KUBE_NAMESPACE:                 default (v1:metadata.namespace)
      VERSION:                        v0.3.0
      HOST_URL:                       http://<internal-server>
      GITPOD_REGION:                  local
      GITPOD_INSTALLATION_LONGNAME:   production.gitpod.local.00
      GITPOD_INSTALLATION_SHORTNAME:  local-00
      DOCKER_HOST:                    tcp://localhost:2375
    Mounts:
      /config/image-builder.json from configuration (rw,path="image-builder.json")
      /config/pull-secret.json from pull-secret (rw,path=".dockerconfigjson")
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  configuration:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      image-builder-config
    Optional:  false
  dind-storage:
    Type:          HostPath (bare host directory volume)
    Path:          /var/gitpod/docker
    HostPathType:  DirectoryOrCreate
  pull-secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  builtin-registry-auth
    Optional:    false
  docker-tls-certs-0:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  builtin-registry-certs
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From                    Message
  ----     ------     ----                   ----                    -------
  Normal   Scheduled  5m19s                  default-scheduler       Successfully assigned default/image-builder-7b9c669698-nw6kr to asit4edev2163
  Normal   Pulling    5m18s                  kubelet, asit4edev2163  Pulling image "docker:18.06-dind"
  Normal   Pulled     5m17s                  kubelet, asit4edev2163  Successfully pulled image "docker:18.06-dind"
  Normal   Created    5m17s                  kubelet, asit4edev2163  Created container dind
  Normal   Started    5m17s                  kubelet, asit4edev2163  Started container dind
  Normal   Pulling    4m33s (x4 over 5m17s)  kubelet, asit4edev2163  Pulling image "gcr.io/gitpod-io/image-builder:v0.3.0"
  Normal   Pulled     4m33s (x4 over 5m16s)  kubelet, asit4edev2163  Successfully pulled image "gcr.io/gitpod-io/image-builder:v0.3.0"
  Normal   Created    4m33s (x4 over 5m16s)  kubelet, asit4edev2163  Created container service
  Normal   Started    4m33s (x4 over 5m15s)  kubelet, asit4edev2163  Started container service
  Warning  BackOff    17s (x24 over 5m10s)   kubelet, asit4edev2163  Back-off restarting failed container

 k logs image-builder-7b9c669698-nw6kr -c service
{"message":"enabled JSON logging","serviceContext":{"service":"image-builder","version":""},"severity":"info","time":"2020-01-31T18:47:17Z"}
{"message":"enabled verbose logging","serviceContext":{"service":"image-builder","version":""},"severity":"info","time":"2020-01-31T18:47:17Z"}
{"interval":"6h0m0s","message":"starting Docker ref pre-cache","refs":["gitpod/workspace-full:latest"],"serviceContext":{"service":"image-builder","version":""},"severity":"info","time":"2020-01-31T18:47:17Z"}
{"gitpodLayer":"/app/workspace-image-layer.tar.gz","hash":"67e372e9b1e5d3b16899d25a341667a2e802e7a0a610635257f6fa1078c764d9","message":"computed Gitpod layer hash","serviceContext":{"service":"image-builder","version":""},"severity":"info","time":"2020-01-31T18:47:17Z"}
{"gitpodLayer":"/app/workspace-image-layer.tar.gz","message":"running self-build","serviceContext":{"service":"image-builder","version":""},"severity":"info","time":"2020-01-31T18:47:17Z"}
{"message":"self-build context sent","serviceContext":{"service":"image-builder","version":""},"severity":"debug","time":"2020-01-31T18:47:17Z"}
{"message":"Step 1/7 : FROM alpine:3.9","severity":"debug","time":"2020-01-31T18:47:17Z"}
{"message":"3.9: Pulling from library/alpine","severity":"debug","time":"2020-01-31T18:47:18Z"}
{"message":"pre-cached Docker ref","ref":"gitpod/workspace-full:latest","resolved-to":"docker.io/gitpod/workspace-full:latest@sha256:aac2ef307933e5ee9ffc73ea5cf55939498c918e159ec3628b1ebd9f590a4329","serviceContext":{"service":"image-builder","version":""},"severity":"debug","time":"2020-01-31T18:47:18Z"}
{"message":"Digest: sha256:115731bab0862031b44766733890091c17924f9b7781b79997f5f163be262178","severity":"debug","time":"2020-01-31T18:47:18Z"}
{"message":"Status: Image is up to date for alpine:3.9","severity":"debug","time":"2020-01-31T18:47:18Z"}
{"message":" ---\u003e 82f67be598eb","severity":"debug","time":"2020-01-31T18:47:18Z"}
{"message":"Step 2/7 : RUN addgroup -g 33333 gitpod     \u0026\u0026 adduser -D -h /home/gitpod -s /bin/sh -u 33333 -G gitpod gitpod     \u0026\u0026 echo \"gitpod:gitpod\" | chpasswd","severity":"debug","time":"2020-01-31T18:47:18Z"}
{"message":" ---\u003e Running in a48fe681884e","severity":"debug","time":"2020-01-31T18:47:18Z"}
{"@type":"type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent","error":"OCI runtime create failed: container_linux.go:348: starting container process caused \"process_linux.go:301: running exec setns process for init caused \\\"exit status 41\\\"\": unknown","message":"self-build failed","serviceContext":{"service":"image-builder","version":""},"severity":"error","time":"2020-01-31T18:47:18Z"}
{"message":"OCI runtime create failed: container_linux.go:348: starting container process caused \"process_linux.go:301: running exec setns process for init caused \\\"exit status 41\\\"\": unknown","serviceContext":{"service":"image-builder","version":""},"severity":"fatal","time":"2020-01-31T18:47:18Z"}

k logs image-builder-7b9c669698-nw6kr -c dind
time="2020-01-31T18:41:16.892759478Z" level=info msg="User namespaces: ID ranges will be mapped to subuid/subgid ranges of: dockremap:dockremap"
time="2020-01-31T18:41:16.893327448Z" level=warning msg="Error while setting daemon root propagation, this is not generally critical but may cause some functionality to not work or fallback to less desirable behavior" dir=/var/lib/docker/165536.165536 error="error writing file to signal mount cleanup on shutdown: open /var/run/docker/unmount-on-shutdown: no such file or directory"
time="2020-01-31T18:41:16.893490937Z" level=warning msg="[!] DON'T BIND ON ANY IP ADDRESS WITHOUT setting --tlsverify IF YOU DON'T KNOW WHAT YOU'RE DOING [!]"
time="2020-01-31T18:41:16.894276440Z" level=info msg="libcontainerd: started new docker-containerd process" pid=18
time="2020-01-31T18:41:16.894333746Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2020-01-31T18:41:16.894343319Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
time="2020-01-31T18:41:16.894403168Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0  <nil>}]" module=grpc
time="2020-01-31T18:41:16.894413582Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
time="2020-01-31T18:41:16.894481879Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4201765a0, CONNECTING" module=grpc
time="2020-01-31T18:41:16Z" level=info msg="starting containerd" revision=468a545b9edcd5932818eb9de8e72413e616e86e version=v1.1.2
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.content.v1.content"..." type=io.containerd.content.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.btrfs"..." type=io.containerd.snapshotter.v1
time="2020-01-31T18:41:16Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.btrfs" error="path /var/lib/docker/165536.165536/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.aufs"..." type=io.containerd.snapshotter.v1
time="2020-01-31T18:41:16Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.aufs" error="modprobe aufs failed: "ip: can't find device 'aufs'\nmodprobe: can't change directory to '/lib/modules': No such file or directory\n": exit status 1"
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.native"..." type=io.containerd.snapshotter.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." type=io.containerd.snapshotter.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshotter.v1
time="2020-01-31T18:41:16Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.zfs" error="path /var/lib/docker/165536.165536/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter"
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.metadata.v1.bolt"..." type=io.containerd.metadata.v1
time="2020-01-31T18:41:16Z" level=warning msg="could not use snapshotter btrfs in metadata plugin" error="path /var/lib/docker/165536.165536/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"
time="2020-01-31T18:41:16Z" level=warning msg="could not use snapshotter aufs in metadata plugin" error="modprobe aufs failed: "ip: can't find device 'aufs'\nmodprobe: can't change directory to '/lib/modules': No such file or directory\n": exit status 1"
time="2020-01-31T18:41:16Z" level=warning msg="could not use snapshotter zfs in metadata plugin" error="path /var/lib/docker/165536.165536/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter"
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.differ.v1.walking"..." type=io.containerd.differ.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.gc.v1.scheduler"..." type=io.containerd.gc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.service.v1.containers-service"..." type=io.containerd.service.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.service.v1.content-service"..." type=io.containerd.service.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.service.v1.diff-service"..." type=io.containerd.service.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.service.v1.images-service"..." type=io.containerd.service.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.service.v1.leases-service"..." type=io.containerd.service.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.service.v1.namespaces-service"..." type=io.containerd.service.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.service.v1.snapshots-service"..." type=io.containerd.service.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.monitor.v1.cgroups"..." type=io.containerd.monitor.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.runtime.v1.linux"..." type=io.containerd.runtime.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.service.v1.tasks-service"..." type=io.containerd.service.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.containers"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.content"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.diff"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.events"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.healthcheck"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.images"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.leases"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.namespaces"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.snapshots"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.tasks"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.version"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg="loading plugin "io.containerd.grpc.v1.introspection"..." type=io.containerd.grpc.v1
time="2020-01-31T18:41:16Z" level=info msg=serving... address="/var/run/docker/containerd/docker-containerd-debug.sock"
time="2020-01-31T18:41:16Z" level=info msg=serving... address="/var/run/docker/containerd/docker-containerd.sock"
time="2020-01-31T18:41:16Z" level=info msg="containerd successfully booted in 0.013874s"
time="2020-01-31T18:41:16.925407393Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4201765a0, READY" module=grpc
time="2020-01-31T18:41:16.926632857Z" level=info msg="User namespaces: ID ranges will be mapped to subuid/subgid ranges of: dockremap:dockremap"
time="2020-01-31T18:41:16.928032679Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2020-01-31T18:41:16.928052661Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
time="2020-01-31T18:41:16.928091749Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0  <nil>}]" module=grpc
time="2020-01-31T18:41:16.928104209Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
time="2020-01-31T18:41:16.928147450Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420170ef0, CONNECTING" module=grpc
time="2020-01-31T18:41:16.928293113Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc420170ef0, READY" module=grpc
time="2020-01-31T18:41:16.943786828Z" level=info msg="[graphdriver] using prior storage driver: overlay2"
time="2020-01-31T18:41:16.969568463Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
time="2020-01-31T18:41:16.970386395Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2020-01-31T18:41:16.970400527Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
time="2020-01-31T18:41:16.970435427Z" level=info msg="ccResolverWrapper: sending new addresses to cc: [{unix:///var/run/docker/containerd/docker-containerd.sock 0  <nil>}]" module=grpc
time="2020-01-31T18:41:16.970443683Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
time="2020-01-31T18:41:16.970481912Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4204e75a0, CONNECTING" module=grpc
time="2020-01-31T18:41:16.970714161Z" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc4204e75a0, READY" module=grpc
time="2020-01-31T18:41:16.970753627Z" level=info msg="Loading containers: start."
time="2020-01-31T18:41:17.113405725Z" level=warning msg="Running modprobe bridge br_netfilter failed with message: ip: can't find device 'bridge'\nbridge                151336  1 br_netfilter\nstp                    12976  1 bridge\nllc                    14552  2 bridge,stp\nip: can't find device 'br_netfilter'\nbr_netfilter           22256  1 xt_physdev\nbridge                151336  1 br_netfilter\nmodprobe: can't change directory to '/lib/modules': No such file or directory\n, error: exit status 1"
time="2020-01-31T18:41:17.122190846Z" level=warning msg="Running modprobe nf_nat failed with message: `ip: can't find device 'nf_nat'\nnf_nat_ipv6            14131  1 ip6table_nat\nnf_nat_masquerade_ipv4    13412  1 ipt_MASQUERADE\nnf_nat_ipv4            14115  1 iptable_nat\nnf_nat                 26583  4 nf_nat_ipv6,xt_nat,nf_nat_masquerade_ipv4,nf_nat_ipv4\nnf_conntrack          137239  8 nf_conntrack_ipv6,nf_nat_ipv6,nf_nat_masquerade_ipv4,nf_conntrack_netlink,nf_nat_ipv4,nf_nat,nf_conntrack_ipv4,xt_conntrack\nlibcrc32c              12644  4 libceph,nf_nat,nf_conntrack,xfs\nmodprobe: can't change directory to '/lib/modules': No such file or directory`, error: exit status 1"
time="2020-01-31T18:41:17.126271954Z" level=warning msg="Running modprobe xt_conntrack failed with message: `ip: can't find device 'xt_conntrack'\nxt_conntrack           12760 41 \nnf_conntrack          137239  8 nf_conntrack_ipv6,nf_nat_ipv6,nf_nat_masquerade_ipv4,nf_conntrack_netlink,nf_nat_ipv4,nf_nat,nf_conntrack_ipv4,xt_conntrack\nmodprobe: can't change directory to '/lib/modules': No such file or directory`, error: exit status 1"
time="2020-01-31T18:41:17.225464804Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
time="2020-01-31T18:41:17.316024470Z" level=info msg="Loading containers: done."
time="2020-01-31T18:41:17.331490791Z" level=info msg="Docker daemon" commit=d7080c1 graphdriver(s)=overlay2 version=18.06.3-ce
time="2020-01-31T18:41:17.331634812Z" level=info msg="Daemon has completed initialization"
time="2020-01-31T18:41:17.334656859Z" level=warning msg="Could not register builder git source: failed to find git binary: exec: \"git\": executable file not found in $PATH"
time="2020-01-31T18:41:17.342085236Z" level=info msg="API listen on 127.0.0.1:2375"
time="2020-01-31T18:41:19Z" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/8682b1918450bfe26d62f05fb80a198bfa3443376d94c938084c963e65bf047d/shim.sock" debug=false pid=192
time="2020-01-31T18:41:19Z" level=info msg="shim reaped" id=8682b1918450bfe26d62f05fb80a198bfa3443376d94c938084c963e65bf047d
time="2020-01-31T18:41:19.418221962Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:41:19.418277921Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:41:19.482083279Z" level=error msg="8682b1918450bfe26d62f05fb80a198bfa3443376d94c938084c963e65bf047d cleanup: failed to delete container from containerd: no such container"
time="2020-01-31T18:41:22Z" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/01efd7de34d205cc408d48b8f4cb40f5e65b1072ccf7a886e0dc7b325a7a7854/shim.sock" debug=false pid=219
time="2020-01-31T18:41:22Z" level=info msg="shim reaped" id=01efd7de34d205cc408d48b8f4cb40f5e65b1072ccf7a886e0dc7b325a7a7854
time="2020-01-31T18:41:22.239044441Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:41:22.239263697Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:41:22.301119347Z" level=error msg="01efd7de34d205cc408d48b8f4cb40f5e65b1072ccf7a886e0dc7b325a7a7854 cleanup: failed to delete container from containerd: no such container"
time="2020-01-31T18:41:36Z" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/54413e7d00691681f9ae5ce6e2e83729b875f9d4c5f9f00693f5b0b9eed07edc/shim.sock" debug=false pid=245
time="2020-01-31T18:41:36Z" level=info msg="shim reaped" id=54413e7d00691681f9ae5ce6e2e83729b875f9d4c5f9f00693f5b0b9eed07edc
time="2020-01-31T18:41:36.452271430Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:41:36.452312485Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:41:36.513199577Z" level=error msg="54413e7d00691681f9ae5ce6e2e83729b875f9d4c5f9f00693f5b0b9eed07edc cleanup: failed to delete container from containerd: no such container"
time="2020-01-31T18:42:01Z" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/3bd51fb65d81ab59afff25205b4c76903b7eaa657f95aea81d7397e2cb579e97/shim.sock" debug=false pid=271
time="2020-01-31T18:42:01Z" level=info msg="shim reaped" id=3bd51fb65d81ab59afff25205b4c76903b7eaa657f95aea81d7397e2cb579e97
time="2020-01-31T18:42:01.882366970Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:42:01.888105688Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:42:01.952078844Z" level=error msg="3bd51fb65d81ab59afff25205b4c76903b7eaa657f95aea81d7397e2cb579e97 cleanup: failed to delete container from containerd: no such container"
time="2020-01-31T18:42:52Z" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/f3c538ef749477d9492cd8444fc22ce84ff013f220cd02fa011fcd7b7dccccba/shim.sock" debug=false pid=297
time="2020-01-31T18:42:52Z" level=info msg="shim reaped" id=f3c538ef749477d9492cd8444fc22ce84ff013f220cd02fa011fcd7b7dccccba
time="2020-01-31T18:42:52.097380469Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:42:52.097569247Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:42:52.163521686Z" level=error msg="f3c538ef749477d9492cd8444fc22ce84ff013f220cd02fa011fcd7b7dccccba cleanup: failed to delete container from containerd: no such container"
time="2020-01-31T18:44:14Z" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/5d525f18410d7230085acfa5c056c6af5f81e374bad505782d6027d97a6f02ba/shim.sock" debug=false pid=324
time="2020-01-31T18:44:14Z" level=info msg="shim reaped" id=5d525f18410d7230085acfa5c056c6af5f81e374bad505782d6027d97a6f02ba
time="2020-01-31T18:44:14.907452203Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:44:14.907499658Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:44:14.963308134Z" level=error msg="5d525f18410d7230085acfa5c056c6af5f81e374bad505782d6027d97a6f02ba cleanup: failed to delete container from containerd: no such container"
time="2020-01-31T18:47:18Z" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/a48fe681884ecd83dda2c24093222c9082bbf6f69c6d25e8b920fa65d7f5040e/shim.sock" debug=false pid=350
time="2020-01-31T18:47:18Z" level=info msg="shim reaped" id=a48fe681884ecd83dda2c24093222c9082bbf6f69c6d25e8b920fa65d7f5040e
time="2020-01-31T18:47:18.830582297Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:47:18.830636374Z" level=error msg="stream copy error: reading from a closed fifo"
time="2020-01-31T18:47:18.904550760Z" level=error msg="a48fe681884ecd83dda2c24093222c9082bbf6f69c6d25e8b920fa65d7f5040e cleanup: failed to delete container from containerd: no such container"

Hi @fernfab, thank you for the detailed log output.

I can see that the service container from the image builder is in a crash restart loop.
Also, the dind container has several error messages that may indicate it’s not working properly.

On the service container, I find this error noteworthy:

{
  "@type": "type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent",
  "error": "OCI runtime create failed: container_linux.go:348: starting container process caused \"process_linux.go:301: running exec setns process for init caused \\\"exit status 41\\\"\": unknown",
  "message": "self-build failed",
  "serviceContext": {
    "service": "image-builder",
    "version": ""
  },
  "severity": "error",
  "time": "2020-01-31T18:47:18Z"
}

Can you describe what platform you run Kubernetes on and what flavor (vanilla, GKE, OpenShift, etc.) of Kubernetes you are running? Are there security measures in place that would prevent Docker’s dind from working on your Kubernetes?

Hi @meysholdt, thanks for the quick response. It’s a vanilla k8s v1.15.1 with no additional Docker security in place. It looks like something is missing from the pod log:

k logs image-builder-7b9c669698-nw6kr
Error from server (BadRequest): a container name must be specified for pod image-builder-7b9c669698-nw6kr, choose one of: [dind service] 

potentially something missing?
Thanks

potentially something missing?

yes, your k logs command is missing the -c flag.
You actually used it properly in your previous comment: k logs image-builder-7b9c669698-nw6kr -c service

@meysholdt, never mind my last message, I missed the container …. I’m looking right now: my Docker engine is v18.09, while the dind container uses v18.06. Just to test, I changed the deployment by hand to use v18.09, but the error stays the same. Do you have any thoughts on whether this can be a Docker engine problem? https://hub.docker.com/_/docker mentions 18.09.
Thank you
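(For anyone following along, one way to make that change, with the deployment and container names taken from the describe output above:)

k set image deployment/image-builder dind=docker:18.09-dind
# or edit the deployment and change the dind image tag by hand
k edit deployment image-builder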

hi @fernfab,

the version of the docker engine should be fine.

To verify that a docker engine generally works on your cluster, could you launch a dind-pod using this deployment? https://github.com/meysholdt/spring-petclinic/blob/master/docker-engine.k8s.yaml

Then you can kubectl exec into the pod and check whether docker build and docker run work properly inside the dind.
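Roughly like this (the pod name and the test image are placeholders):

kubectl get pods
kubectl exec -it <dind-pod-name> -- sh

# inside the pod: build and run a trivial image to check the engine
mkdir /tmp/smoke && cd /tmp/smoke
printf 'FROM alpine:3.9\nRUN echo build-ok\n' > Dockerfile
docker build -t dind-smoke-test .
docker run --rm dind-smoke-test echo run-ok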

hi @meysholdt,
I was able to create the deployment, exec inside, and run docker build and docker run without any problem.
I have also been trying to run Gitpod in other environments (not running yet - I’m waiting to solve the required wildcards). At least I was able to run this deployment in my target environment, but no success with Gitpod there yet. Is this deployment also supposed to run in a minikube or docker-desktop environment? I’m evaluating other options, at least to get it running and see what is missing in my actual environment.
thanks

hi @meysholdt, any thoughts on my last comment? thank you

@fernfab - did the helm installation finish successfully? I am doing similar testing on my vanilla k8s cluster, where I see the following:

[root@kubemaster self-hosted]# helm --debug upgrade --install $(for i in $(cat configuration.txt); do echo -e "-f $i"; done) gitpod .
history.go:52: [debug] getting history for release gitpod
Release "gitpod" does not exist. Installing it now.
install.go:158: [debug] Original chart version: ""
install.go:175: [debug] CHART PATH: /root/self-hosted

client.go:98: [debug] creating 94 resource(s)
client.go:244: [debug] Starting delete for "db-migrations" Job
client.go:98: [debug] creating 1 resource(s)
client.go:449: [debug] Watching for changes to Job db-migrations with timeout of 5m0s
client.go:477: [debug] Add/Modify event for db-migrations: ADDED
client.go:516: [debug] db-migrations: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
client.go:477: [debug] Add/Modify event for db-migrations: MODIFIED
client.go:516: [debug] db-migrations: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:244: [debug] Starting delete for "db-migrations" Job
Error: failed post-install: timed out waiting for the condition
helm.go:75: [debug] failed post-install: timed out waiting for the condition

I can see that the same node-daemon pod is down. Probably the install never finishes. Does anyone know how I can fix this?

I was able to install the helm chart, no errors during the install.

hi @nemke82!

the db-migrations job runs as a post-install helm-hook and it looks like it failed in your case.

Right after you run helm install, there should be a terminated pod in Kubernetes. Can you take a look at its log output via kubectl logs db-migration-.....? This log output should explain what went wrong.
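For example (the Job controller labels its pods with job-name, so something like this should find it):

kubectl get pods -l job-name=db-migrations
kubectl logs <db-migrations-pod-name>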

Hello @meysholdt

Name:          minio
Namespace:     default
StorageClass:
Status:        Pending
Volume:
Labels:        app=minio
               chart=minio-2.5.18
               heritage=Helm
               release=gitpod
Annotations:
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Mounted By:    minio-6684dbbb9-qvvdt
Events:
  Type    Reason         Age                   From                         Message
  ----    ------         ----                  ----                         -------
  Normal  FailedBinding  44s (x26 over 6m53s)  persistentvolume-controller  no persistent volumes available for this claim and no storage class is set

Seems the problem is with the PVC. I’ve followed the https://www.gitpod.io/docs/self-hosted/latest/install/install-on-kubernetes/ instructions and they never say I need to create a storage class.

error while running “VolumeBinding” filter plugin for pod “db-5c9b9988df-4cbkv”: pod has unbound immediate PersistentVolumeClaims

Guess I will need to create Minio manually and configure it; not sure what part I did wrong.

hey @nemke82

persistentvolume-controller no persistent volumes available for this claim and no storage class is set

This sounds like your Kubernetes cluster does not have PersistentVolumes configured.
Gitpod does not need any special storage class… just standard PersistentVolumes.

Minio also uses PersistentVolumes by default. Or do you plan to configure Minio with a different storage backend?
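For example, on a test cluster even a plain hostPath-backed PersistentVolume like the sketch below would satisfy such a claim (name, size and path are placeholders):

# example only: a standard PersistentVolume backed by a host directory
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example
spec:
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data/pv-example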

Hey @meysholdt

Since a dynamic provisioner is required, I’ve added an NFS provisioner, and since this is a dedicated server it also required a patch to the proxy svc to set the external IP. The domain name is alive and it gets to the very end of booting a workspace, but now fails with the following message:

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: gitpod.io--theia.v0.3.0
                operator: Exists
              - key: gitpod.io--ws-sync
                operator: Exists
  schedulerName: default-scheduler
  tolerations:
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
  priority: 0
  dnsConfig:
    nameservers:
      - 1.1.1.1
      - 8.8.8.8
  enableServiceLinks: false
status:
  phase: Pending
  conditions:
    - type: PodScheduled
      status: 'False'
      lastProbeTime: null
      lastTransitionTime: '2020-02-17T18:34:20Z'
      reason: Unschedulable
      message: '0/3 nodes are available: 3 node(s) didn''t match node selector.'
  qosClass: Burstable

Hey,

the nodeSelectorTerms look a bit odd. They should be something like gitpod.io/theia.v0.3.0.
Is this just an issue with copy&paste/Discourse, or do they actually look like that when running kubectl get pod -o yaml -l component=workspace?
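To compare, something like this should show how the affinity keys actually look in the API and which gitpod.io labels are present on your nodes:

kubectl get pod -o yaml -l component=workspace | grep -A 6 nodeSelectorTerms
kubectl get nodes --show-labels | tr ',' '\n' | grep gitpod.io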


I am having the exact same issue. Unfortunately this is the only hit for "exit status 41" on Google, which means this is a rare problem to debug/solve. The problem occurs on both Docker 1.13 and Docker 1.19, so I doubt the problem is with the underlying engine. I am using RKE 1.17.5 in both deployments, so perhaps there is some extra configuration required for an RKE deployment?

This is interesting. If I change the dind image in the deployment to docker:19.03-dind (matching the version of the Docker CE runtime), I get a different error message:
level=error msg="Handler for POST /v1.40/containers/ff717258ed81a1db16c540091c6bde8fa94b7542a80de4a0de2f72881f284d7c/start returned error: OCI runtime create failed: container_linux.go:349: starting container process caused \"process_linux.go:319: getting the final child's pid from pipe caused \\\"EOF\\\"\": unknown"

This tells me there is something incorrect about the Docker-in-Docker (dind) container running in this environment.

This issue has been resolved as it was a problem with RHEL 7 not supporting user namespaces by default. It is tracked in this issue here
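For anyone hitting the same thing: depending on the RHEL 7 minor release, user namespaces have to be enabled via a kernel boot parameter and/or the user.max_user_namespaces sysctl. Roughly as below, but please check Red Hat’s documentation for your exact kernel before applying:

# enable user namespaces via a kernel boot parameter and raise the per-user limit, then reboot
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
echo "user.max_user_namespaces=15000" > /etc/sysctl.d/99-userns.conf
sysctl -p /etc/sysctl.d/99-userns.conf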
