"oc cluster join" is failing on: kubelet cgroup driver #17190

Closed

ikus060 opened this issue Nov 5, 2017 · 2 comments
ikus060 commented Nov 5, 2017

I'm trying to use oc cluster join to add a new node to a running master.

Version

oc v3.6.0+c4dd4cf
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://jerd.patrikdufresne.com:8443
openshift v3.6.0+c4dd4cf
kubernetes v1.6.1+5115d708d7

Steps To Reproduce
  1. I've configured a new server to run OpenShift.
  2. I've installed Docker on Debian stretch.
  3. I've installed "oc".
  4. I'm running /usr/local/bin/oc cluster up --use-existing-config --public-hostname jerd.patrikdufresne.com --routing-suffix oc.patrikdufresne.com --host-data-dir /srv --host-config-dir /var/lib/origin/openshift.local.config --logging
  5. I've edited master-config.yaml to include the following and restarted:
  controllerArguments:
    cluster-signing-cert-file: [ /var/lib/origin/openshift.local.config/master//ca.crt ]
    cluster-signing-key-file: [ /var/lib/origin/openshift.local.config/master//ca.key ]
  6. Then I execute the following line to accept all CSRs: oc observe csr -- oc adm certificate approve
  7. Then, on a new server, I'm running oc cluster join.
  8. I'm pasting the following secret:
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM2akNDQWRLZ0F3SUJBZ0lCQVRBTkJna3Foa2lHOXcwQkFRc0ZBREFtTVNRd0lnWURWUVFEREJ0dmNHVnUKYzJocFpuUXRjMmxuYm1WeVFERTFNRGs0TVRjek56a3dIaGNOTVRjeE1UQTBNVGMwTWpVNVdoY05Nakl4TVRBegpNVGMwTXpBd1dqQW1NU1F3SWdZRFZRUUREQnR2Y0dWdWMyaHBablF0YzJsbmJtVnlRREUxTURrNE1UY3pOemt3CmdnRWlNQTBHQ1NxR1NJYjNEUUVCQVFVQUE0SUJEd0F3Z2dFS0FvSUJBUURNRWl1SlRyZXJVdHh2V3VQQ0crMFYKNXVZMFRGRVhQeUNBMStpeXk4RW9TbmxubmdwaFRYYlBja0FPM1NxV0hobVdFRVBJbzhLU3lZbnBBRmQ3cnd6MgplRnR3bTJUZzhTNzVEaWdxd2VuZ215Wnp6OWlKcXByR25yUVpVd1hSdDRwLy8xaWxZczVlNUhjRm9NT1o0MTNnCjlNM3lsQnZQNWZDZkpQU0FBeEdiV2U4RXg1ZUI5Nitod3Zyc3hEU2FKUU5ibTZmZHlMREg4bFlyT25ycUljQlYKOGMxRW5adEY0Q2hOdVFzV3Fma3NqZFhYdTZsVzA0aUxTUC95MnBxOVZWcE5hYkg0R1djUWYycFZSSEFGT3UyWgpFQTJ5L1REdXlTc1VEU2FjTkY2TFgzYXc4NHVVUytUckwwUXpZemRscHZsaTgydy9Jb2hBTFpQSzdlQm1jSVkzCkFnTUJBQUdqSXpBaE1BNEdBMVVkRHdFQi93UUVBd0lDcERBUEJnTlZIUk1CQWY4RUJUQURBUUgvTUEwR0NTcUcKU0liM0RRRUJDd1VBQTRJQkFRQjJNaTlnVUtaamMyYnhPTXAyYXRXaTdPMHQxUTZlZHNGNTYxMi9TSGtJQndMbgpFdktJdDZHT01sTlMyUFcrRHJKZzRzbWRtMHlIK3ZwNjdrc3VzbUFPeFhFVlpxUHFDUTFsUXFrT2FTUGRpb21rCmtnTVh3TU9rZHhZb0MwaXBudTFoNHdzb1dvUzlQZlYvZXhqZG5zZklGcVUxYy80TSs0Ry9CTUUxaGJxUVhPT28KdTFhajhmaE9hY1N5TDlVckx4dlB1M1dkNHZKclRDNzRvK1hjUUVyUkV2WkJFbHJsRGlYYjNuZytMQStNMnRmTApXajZyd3NrVy9pb25saDQ4L1Nrb053NmplSTE0TWY4YjZmTkM0cngxbVhITklqbENyQjd5UmxQUUhDT2Q1WTY3CkNOUEl4WExuZG81ZjM5TldkZGxsVVRwNW5rSFJNMTZPcENkeFMySm4KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    server: https://jerd.patrikdufresne.com:8443
  name: jerd-patrikdufresne-com:8443
contexts:
- context:
    cluster: jerd-patrikdufresne-com:8443
    namespace: default
    user: system:admin/127-0-0-1:8443
  name: default/jerd-patrikdufresne-com:8443/system:admin
current-context: default/jerd-patrikdufresne-com:8443/system:admin
kind: Config
preferences: {}
users:
- name: system:admin/127-0-0-1:8443
  user:
    client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUREVENDQWZXZ0F3SUJBZ0lCQmpBTkJna3Foa2lHOXcwQkFRc0ZBREFtTVNRd0lnWURWUVFEREJ0dmNHVnUKYzJocFpuUXRjMmxuYm1WeVFERTFNRGs0TVRjek56a3dIaGNOTVRjeE1UQTBNVGMwTWpVNVdoY05NVGt4TVRBMApNVGMwTXpBd1dqQTNNUjR3SEFZRFZRUUtFeFZ6ZVhOMFpXMDZZMngxYzNSbGNpMWhaRzFwYm5NeEZUQVRCZ05WCkJBTVRESE41YzNSbGJUcGhaRzFwYmpDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUIKQUxQMER0QTJlUnp4R0lnZEszNTBNNHlDY1RKcnN0Z2NaNkZTaURURzFoUCtIOGh2aVdnSnhaLzZVQ3hKVTFzVwprWURNWG5oaDRnN0hyRk1wb2lQYTA1a2NBcm83RTJSckFkcm8rSXNRKzBzOGwrL1RaQStNY1J1Mjc2NUxzeEhOCisraXZ5K3lZTS83dHZiMmx2dHJkTjNjOTAzckErcmZoSDM2Q282OGFBYjVra2w5Vm41SGZRNE5TS3ZkQ3Zob3gKdG4zdG14TERHZzlYSElQMDNSemZ1R01YVjVGV0YxaWViNTlHQU5LM0d6LzlLNUFFWEphdzVUbEI3WDI5MU9IWQphRWJLU0ZTTlA1T2wzT1ZIZWtuS0NKUDRwNFhYUnpWWGNOWXZxTW5XRVhycGphRDRHTnZwKzVQd1VYS0hGY1hUClg0MGtlN0pVMmI2S2tHYkVMU3l4cW5zQ0F3RUFBYU0xTURNd0RnWURWUjBQQVFIL0JBUURBZ1dnTUJNR0ExVWQKSlFRTU1Bb0dDQ3NHQVFVRkJ3TUNNQXdHQTFVZEV3RUIvd1FDTUFBd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQgpBRHh0c0NCbXY5eFkyMW54SytTNko4Zm41VmtGS2IzR05IZWtnYis5S0VCR3doRU8wVWdoK0k5OG5nZlZxN2ZLClg2VS8wNEpnbTkvQjFkSkRRUUVBeWozVVdrSHpkOHliUzB3MXNKQXk0akc3UXBqNW14UjV0ZFdVSzhGbzgzZ24KYmtsQzZMbWJaSDEwekxqRzdMemZPUm9McUpKak1acXR6SzJQTmFKQktYc0FsZWZqK1B1K3A2ZWtoMTVnMDYxUAp3RmNIRENnd283M1lLZDV5a3VjNjRHbWJScXRYYndJZ01WZURTVnJ2ajE2U3ZMQlFUY3pobzYrQzFwU0I2cElTCmNPTjZQZ1NaOW1DMFN3dzkzQnZ4S3ZTQTlvODV4MnhXeW8yT3ZMdEJFSEp3ZTZXSDc3LzZGeWRRNjZGb2l5WEMKT0x4eWEySEUyQlFZUHZhb1Bxdkd5bFU9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBcy9RTzBEWjVIUEVZaUIwcmZuUXpqSUp4TW11eTJCeG5vVktJTk1iV0UvNGZ5RytKCmFBbkZuL3BRTEVsVFd4YVJnTXhlZUdIaURzZXNVeW1pSTlyVG1Sd0N1anNUWkdzQjJ1ajRpeEQ3U3p5WDc5TmsKRDR4eEc3YnZya3V6RWMzNzZLL0w3Smd6L3UyOXZhVysydDAzZHozVGVzRDZ0K0VmZm9LanJ4b0J2bVNTWDFXZgprZDlEZzFJcTkwSytHakcyZmUyYkVzTWFEMWNjZy9UZEhOKzRZeGRYa1ZZWFdKNXZuMFlBMHJjYlAvMHJrQVJjCmxyRGxPVUh0ZmIzVTRkaG9Sc3BJVkkwL2s2WGM1VWQ2U2NvSWsvaW5oZGRITlZkdzFpK295ZFlSZXVtTm9QZ1kKMituN2svQlJjb2NWeGROZmpTUjdzbFRadm9xUVpzUXRMTEdxZXdJREFRQUJBb0lCQUR2eEFiWXRUdTVyQ0tiZQpRSXlnbkVNamVCMDVicHM1NnZMN2tNOHpwRCtJbUlHbFZYbklONEh3V1NCSFZISzA4OGFaVEthQXhGSDBCTnkyCnM0R0o5STI5bk5MM3RwL3VYUEhVUkdYZVJEWnRlcGF5TFZSWWpaeVR0UWF6eEhRYnp0dFZJM0l0eUxRVDhPM28KOWNmbGhBSStIK0Yxd28zWmVTb2t6ZTBYbHBrYVRBSUFnSERyN2Z1OVlWWjFWMDB4a2grbDZpV1U5WXFRNk1OYgpUaXMzVHdOalVYQWw2b2ZKZ0FRU3c3dmd3ZVRUZEtmZHk0OGpua3lDS0V6RVpDMmpWRGZESWhyQm1Pa2tsTVhzCk1wV1JRS3RQLzhXdGFPQTZ3RXVvQlVzQUdMejV0Nkw1SFpyZ3Y1d2F2a2xWVHFyNW95MjduRGJncC9EK2wreGcKR3lrYVdsRUNnWUVBMWlJYTB4bHN2dEVBVnNYbTJCZk1lNzQrZ0ViVzE4dWw2NmQ2aHd6cUlzWTdEcm5obktTUwoyWEFEb3BJbTgxcFdWd0UyMmhXaW1odDJnSlFuOVJ4NFE4bEZlL3lNSHVSbTBCYnJIV2lUbHQzd2ZQU01uQ0RRCkdFVnZxZlVVNTFha1VMQkRnWWp4VVVYU0FCV2M3c2Z4SWpUZG9lREdqL1k1M002K2xhY2o2VjhDZ1lFQTF5TXEKbGpwdVF6ZUxyVnJkVWZ3eGY0QmhuWHpxdnA5MlZFd01MYkVrSFVrbVA2ZG55R0xrNmRPUmpROEx5TXZRSTRrSApIcFJNTiszM2RtYUVQOFNKZUZsUVBVeWgrcFl1RWdISXNOTUFZVWxpbzVJQlFEcHBSWlllQXVCckYzV3lnTHpSCkFySkl1enFad0Y2aHJmZEtJNDJJRGJGcDVzbVRVcys1Z3o1SWFHVUNnWUJwRmIrWVRXZmlsT3JYcXJOSTVSVUMKdlRBcS81aTd1a01vekw0Q3ZNSENZd25raGpCRUVUZkg0WUxISzNaV0FzVlFXVll6M0Y0NXhyUjNFVDR5dWRBRApaQ2puV2Q3aDRqRGFlZ1RVSDlnZU43aW5lZFB2WXVMOHBrYlFYMmZzeDhaSG10am1IdkJlZENkRXgxYUdrRFMwCkZzR1ZpWnVvVnF1Nnd2TWd1aStUZlFLQmdRQ1RyblJCaEFMRTZaQmNoQStCaEJtR2FONlplWEs3UUVPK1FpWHEKQjd1K3pzUU8xaUFNRXJjSlBFNmlBajRZckxCSWRId0twY3BjYW1LQlNJWm9MOFllYzFEOWcybDkxekh3OG9DTgp3WXAxUGljVUdkbjUrUjdpd2RZQUs0WFlLTmRNUUZGS0JKQ2cxZTFOZktpSC8wVkplcEoyczk3NnFrMFRmN3pkCk54Z0ZjUUtCZ1FDVnFENXQ0NTJzK0JXZGlORUhLMkp5eXhYQWlybnEyL1QzZlF1WGZBTnYwWlJrUmtPV3RneUYKNkpHaTk4aXpLY0FwQkd1dmdISG9XVjFmWXJiSGZFUVdwaFJubGVkMnpjT2NNRmJrVHVPRkRLbnBZOTgxalprZwpzUVJQalZKMm1FdFVHL1B4ViszcGJCOTJmcnlRUnpYUlhjQU9DcUhUbUJ2ZS9ZMUMzakR2bHc9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
Current Result

The command oc cluster join fails with:

-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... OK
-- Checking for existing OpenShift container ... 
   Deleted existing OpenShift container
-- Checking for openshift/origin:v3.6.0 image ... OK
-- Checking Docker daemon configuration ... OK
-- Checking for available ports ... OK
-- Checking type of volume mount ... 
   Using Docker shared volumes for OpenShift volumes
-- Creating host directories ... OK
-- Finding server IP ... 
   Using 127.0.0.1 as the server IP
-- Joining OpenShift cluster ... 
   Starting OpenShift Node using container 'origin'
   Waiting for server to start listening
FAIL
   Error: timed out waiting for OpenShift container "origin"
   Details:
     No log available from "origin" container
$ docker logs origin
I1105 22:15:18.989870   10732 bootstrap_node.go:266] Bootstrapping from API server https://jerd.patrikdufresne.com:8443 (experimental)
I1105 22:15:24.269381   10732 docker.go:364] Connecting to docker on unix:///var/run/docker.sock
I1105 22:15:24.269410   10732 docker.go:384] Start docker client with request timeout=2m0s
W1105 22:15:24.270999   10732 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
I1105 22:15:24.285513   10732 start_node.go:345] Starting node skelly (v3.6.0+c4dd4cf)
I1105 22:15:24.287322   10732 start_node.go:354] Connecting to API server https://jerd.patrikdufresne.com:8443
I1105 22:15:24.287465   10732 docker.go:364] Connecting to docker on unix:///var/run/docker.sock
I1105 22:15:24.287488   10732 docker.go:384] Start docker client with request timeout=2m0s
I1105 22:15:24.288557   10732 node.go:134] Connecting to Docker at unix:///var/run/docker.sock
I1105 22:15:24.294198   10732 feature_gate.go:144] feature gates: map[]
I1105 22:15:24.294976   10732 manager.go:143] cAdvisor running in container: "/docker/a9d1becb1e26a1b39f1a5d31f778ccf9b036fcb12a2b2a1aef42b5dd022ae27f"
I1105 22:15:24.298434   10732 node.go:348] Using iptables Proxier.
W1105 22:15:24.300228   10732 node.go:488] Failed to retrieve node info: nodes "skelly" not found
W1105 22:15:24.300315   10732 proxier.go:309] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP
W1105 22:15:24.300334   10732 proxier.go:314] clusterCIDR not specified, unable to distinguish between internal and external traffic
I1105 22:15:24.300350   10732 node.go:380] Tearing down userspace rules.
W1105 22:15:24.307926   10732 manager.go:151] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp [::1]:15441: getsockopt: connection refused
I1105 22:15:24.320658   10732 fs.go:117] Filesystem partitions: map[/dev/vda1:{mountpoint:/rootfs major:254 minor:1 fsType:ext4 blockSize:0} overlay:{mountpoint:/ major:0 minor:38 fsType:overlay blockSize:0}]
I1105 22:15:24.322495   10732 manager.go:198] Machine: {NumCores:4 CpuFrequency:2400084 MemoryCapacity:4147773440 MachineID:9bb73c7811bc4896bed7fbbefabd6cea SystemUUID:9FDDE162-DDCC-44A5-93BB-06A28C34B506 BootID:f65e32b6-4302-49c1-abe5-9b22b66da3e8 Filesystems:[{Device:/dev/vda1 DeviceMajor:254 DeviceMinor:1 Capacity:122061389824 Type:vfs Inodes:7602176 HasInodes:true} {Device:overlay DeviceMajor:0 DeviceMinor:38 Capacity:122061389824 Type:vfs Inodes:7602176 HasInodes:true}] DiskMap:map[254:0:{Name:vda Major:254 Minor:0 Size:128849018880 Scheduler:none}] NetworkDevices:[{Name:ens18 MacAddress:a2:cb:8c:f2:47:b0 Speed:-1 Mtu:1500}] Topology:[{Id:0 Memory:4147773440 Cores:[{Id:0 Threads:[0] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:4194304 Type:Unified Level:2}]} {Id:1 Threads:[1] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:4194304 Type:Unified Level:2}]} {Id:2 Threads:[2] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:4194304 Type:Unified Level:2}]} {Id:3 Threads:[3] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:4194304 Type:Unified Level:2}]}] Caches:[{Size:16777216 Type:Unified Level:3}]}] CloudProvider:Unknown InstanceType:Unknown InstanceID:None}
I1105 22:15:24.323403   10732 manager.go:204] Version: {KernelVersion:4.9.0-3-amd64 ContainerOsVersion:CentOS Linux 7 (Core) DockerVersion:17.09.0-ce DockerAPIVersion:1.32 CadvisorVersion: CadvisorRevision:}
I1105 22:15:24.324265   10732 server.go:509] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
W1105 22:15:24.326477   10732 container_manager_linux.go:217] Running with swap on is not supported, please disable swap! This will be a fatal error by default starting in K8s v1.6! In the meantime, you can opt-in to making this a fatal error by enabling --experimental-fail-swap-on.
I1105 22:15:24.326569   10732 container_manager_linux.go:244] container manager verified user specified cgroup-root exists: /
I1105 22:15:24.326599   10732 container_manager_linux.go:249] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd ProtectKernelDefaults:false EnableCRI:true NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>}]} ExperimentalQOSReserved:map[]}
I1105 22:15:24.326784   10732 kubelet.go:265] Watching apiserver
W1105 22:15:24.327356   10732 kubelet.go:417] Invalid clusterDNS ip '""'
W1105 22:15:24.333196   10732 kubelet_network.go:70] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I1105 22:15:24.333230   10732 kubelet.go:494] Hairpin mode set to "hairpin-veth"
W1105 22:15:24.336789   10732 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
I1105 22:15:24.354641   10732 docker_service.go:184] Docker cri networking managed by kubernetes.io/no-op
F1105 22:15:24.364394   10732 node.go:281] failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

So basically, kubelet is misconfigured and I don't know why. I have no clue how to get this fixed.
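For reference, the Docker side of the mismatch is easy to confirm on the node; docker info reports the active driver (cgroupfs here, matching the error above):

$ docker info | grep -i 'cgroup driver'
Cgroup Driver: cgroupfs

The systemd value on the kubelet side presumably comes from the node configuration generated during the join.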

Expected Result

I'm expecting the bootstrap to work.

sjenning (Contributor) commented Dec 1, 2017

@vikaschoudhary16 PTAL

sjenning removed their assignment Dec 1, 2017
vikaschoudhary16 (Contributor) commented:
failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

As the log indicates, there is a mismatch between Docker's and the kubelet's cgroup-driver configuration. The following should be helpful:
kubernetes/kubernetes#43805 (comment)

Try adding --cgroup-driver=systemd to kubelet args.
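Whichever driver is chosen, the fix is to make both sides report the same value. Two sketches follow (not verified on this exact version; file locations may vary by install). To standardize on systemd as suggested above, Docker needs the matching exec-opt in /etc/docker/daemon.json, followed by a Docker restart:

{
  "exec-opts": ["native.cgroupdriver=systemd"]
}

Alternatively, the kubelet can be pointed at Docker's current cgroupfs driver through the node's node-config.yaml:

kubeletArguments:
  cgroup-driver:
    - "cgroupfs"

Either way, docker info and the kubelet's --cgroup-driver should then agree.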
