ContainerCannotRun: devmapper: Error activating devmapper device for ''...'': devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed #17787

Closed
tnozicka opened this issue Dec 14, 2017 · 32 comments
Labels: component/containers, kind/test-flake, lifecycle/rotten, priority/P0, sig/containers

Comments

@tnozicka
Contributor

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/17751/test_pull_request_origin_extended_conformance_install/4109/

  containerStatuses:
  - containerID: docker://b7f14c8009aee731cd5199c9cce74b5b46912ffd0175c26b3ff73f2139a8212b
    image: openshift/origin-deployer:551f02f
    imageID: docker://sha256:ed03e11e54a7b36812b1a76d0f821357f727b59f3fa56fd290efa5a94fd745a6
    lastState: {}
    name: deployment
    ready: false
    restartCount: 0
    state:
      terminated:
        containerID: docker://b7f14c8009aee731cd5199c9cce74b5b46912ffd0175c26b3ff73f2139a8212b
        exitCode: 128
        finishedAt: 2017-12-14T13:59:13Z
        message: 'devmapper: Error activating devmapper device for ''46d82310de547f7f9e3e87d95f6a5af6420fe4b69ea3e1142bfc86486999e0f0'':
          devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed'
        reason: ContainerCannotRun
        startedAt: 2017-12-14T13:59:13Z
@tnozicka tnozicka added component/containers priority/P1 sig/containers kind/test-flake Categorizes issue or PR as related to test flakes. labels Dec 14, 2017
@tnozicka tnozicka changed the title ContainerCannotRun: devmapper: Error activating devmapper device for ''46d82310de547f7f9e3e87d95f6a5af6420fe4b69ea3e1142bfc86486999e0f0'': devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed ContainerCannotRun: devmapper: Error activating devmapper device for ''...'': devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed Dec 14, 2017
@sjenning sjenning assigned jwhonce and unassigned sjenning Dec 14, 2017
@jwhonce
Contributor

jwhonce commented Dec 15, 2017

@rhvgoyal Excerpts from the logs are below. It looks like your patch for additional debugging is not in this build.

Client:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-1.12.6-68.gitec8512b.el7.x86_64
 Go version:      go1.8.3
 Git commit:      ec8512b/1.12.6
 Built:           Thu Nov 16 15:19:17 2017
 OS/Arch:         linux/amd64

Server:
 Version:         1.12.6
 API version:     1.24
 Package version: docker-1.12.6-68.gitec8512b.el7.x86_64
 Go version:      go1.8.3
 Git commit:      ec8512b/1.12.6
 Built:           Thu Nov 16 15:19:17 2017
 OS/Arch:         linux/amd64
Dec 14 13:59:12 ip-172-18-2-50.ec2.internal oci-umount[112015]: umounthook <debug>: prestart container_id:f92f4b74698a rootfs:/var/lib/docker/devicemapper/mnt/451a040fd9afc89f24aae64bd046d1755e841c4c25e5ad515f7b5d6abe05d487/rootfs
Dec 14 13:59:12 ip-172-18-2-50.ec2.internal oci-umount[112015]: umounthook <error>: f92f4b74698a: Failed to read directory /usr/share/oci-umount/oci-umount.d: No such file or directory
Dec 14 13:59:14 ip-172-18-2-50.ec2.internal dockerd-current[16589]: time="2017-12-14T13:59:14.909237523Z" level=error msg="devmapper: Error unmounting device 46d82310de547f7f9e3e87d95f6a5af6420fe4b69ea3e1142bfc86486999e0f0: invalid argument"
Dec 14 13:59:14 ip-172-18-2-50.ec2.internal dockerd-current[16589]: time="2017-12-14T13:59:14.909278876Z" level=error msg="Error unmounting container b7f14c8009aee731cd5199c9cce74b5b46912ffd0175c26b3ff73f2139a8212b: invalid argument"
Dec 14 13:59:14 ip-172-18-2-50.ec2.internal dockerd-current[16589]: time="2017-12-14T13:59:14.911250303Z" level=error msg="Handler for POST /v1.24/containers/b7f14c8009aee731cd5199c9cce74b5b46912ffd0175c26b3ff73f2139a8212b/start returned error: devmapper: Error activating devmapper device for '46d82310de547f7f9e3e87d95f6a5af6420fe4b69ea3e1142bfc86486999e0f0': devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed"
Dec 14 13:59:14 ip-172-18-2-50.ec2.internal dockerd-current[16589]: time="2017-12-14T13:59:14.911285913Z" level=error msg="Handler for POST /v1.24/containers/b7f14c8009aee731cd5199c9cce74b5b46912ffd0175c26b3ff73f2139a8212b/start returned error: devmapper: Error activating devmapper device for '46d82310de547f7f9e3e87d95f6a5af6420fe4b69ea3e1142bfc86486999e0f0': devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed"
Dec 14 13:59:16 ip-172-18-2-50.ec2.internal oci-register-machine[112541]: 2017/12/14 13:59:16 Register machine: prestart 158ff1a13e78d92571cf9cd2de8df046b399b82a124cb134ce9b6e07bfc82912 112533 /var/lib/docker/devicemapper/mnt/e56517f04a762ed475c067b149aa6020d1cc8d09a64041a0a967511951aab1d3/rootfs
Dec 14 13:59:16 ip-172-18-2-50.ec2.internal oci-umount[112548]: umounthook <debug>: prestart container_id:158ff1a13e78 rootfs:/var/lib/docker/devicemapper/mnt/e56517f04a762ed475c067b149aa6020d1cc8d09a64041a0a967511951aab1d3/rootfs

@stevekuznetsov
Contributor

The AMIs now have docker-1.12.6-71.git3e8e77d.el7

@bparees
Contributor

bparees commented Jan 25, 2018

@jwhonce Do we think this would also have been addressed by the devmapper fix in the new docker? It's a different error.

@jwhonce
Contributor

jwhonce commented Jan 25, 2018

@bparees If it is not addressed, the AMI should provide more detailed logging, which will help root-cause the issue.

@rhvgoyal

I don't think -71 has the fix for this issue. We will have to root-cause it. Given that libdm is complaining about errors, something is wrong with the thin pool or some other condition. Let's enable libdm logging to get more verbose messages. Use the following docker daemon parameter to enable libdm logs.

'--storage-opt dm.libdm_log_level=3'
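
For hosts configured through a daemon config file rather than command-line flags, the option above could be merged in with something like the sketch below. It assumes the daemon reads a JSON config (commonly `/etc/docker/daemon.json`) and honors its `storage-opts` key; on RHEL-packaged Docker the storage options may instead live in `/etc/sysconfig/docker-storage`. The helper name `add_libdm_logging` is hypothetical.

```python
import json

# Option suggested in the comment above; the helper below is illustrative only.
LIBDM_OPT = "dm.libdm_log_level=3"


def add_libdm_logging(config_text: str) -> str:
    """Return daemon config JSON with the libdm log-level storage option set.

    Assumes the config is a JSON object that may already contain a
    "storage-opts" list. An empty input is treated as an empty config.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    opts = config.setdefault("storage-opts", [])
    # Drop any earlier libdm log-level setting so the option appears only once.
    opts[:] = [o for o in opts if not o.startswith("dm.libdm_log_level=")]
    opts.append(LIBDM_OPT)
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(add_libdm_logging('{"storage-driver": "devicemapper"}'))
```

The daemon would need a restart to pick up the change, and other storage options already present are left untouched.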

@stevekuznetsov
Contributor

@rhvgoyal Can you add the requisite changes here so that we configure Docker correctly in our tests, please?

@bparees
Contributor

bparees commented Jan 27, 2018

this is definitely still happening:

2018-01-27T13:27:10.205813981Z F0127 13:27:10.205539       1 helpers.go:119] error: build error: building docker.io/extended-test-build-sti-inc-rbsds-hg2fj/internal-build-1:3765aef7 failed when committing the image due to error: Error response from daemon: {"message":"devmapper: Error activating devmapper device for '0ba36102e39b248741f810fccf6ea56e2e3ed07661751663a00d8790d60dd3f1': devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed"}

https://ci.openshift.redhat.com/jenkins/job/test_branch_origin_extended_builds/354/

@bparees
Contributor

bparees commented Jan 28, 2018

https://ci.openshift.redhat.com/jenkins/job/test_branch_origin_extended_builds/355/consoleFull

  Warning  Failed                 17s   kubelet, ip-172-18-5-67.ec2.internal  Error: failed to start container "docker-build": Error response from daemon: devmapper: Error activating devmapper device for '5c568d2b6df36bdbc60adf23071ac7706a30a042b7b3da30d34dd565be419872': devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed

@stevekuznetsov
Contributor

@jwhonce @rhvgoyal ping

@jwhonce
Contributor

jwhonce commented Feb 1, 2018

See openshift/origin-ci-tool#152, which adds additional debugging and sets json-file as the logging driver.

/cc @rhvgoyal @stevekuznetsov

@rhvgoyal

rhvgoyal commented Feb 2, 2018

Apart from the debug output, are there any kernel messages on the system? libdm seems to say that a device activation failed. Maybe the kernel has logged something that might also give a clue.
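
One way to narrow a captured kernel log down to device-mapper activity is sketched below. The match patterns are assumptions (kernel device-mapper messages are typically prefixed with `device-mapper:`, and `dm-N` is the usual block device naming), and the helper name `dm_kernel_lines` is hypothetical.

```python
import re
from typing import List

# Assumed patterns: the "device-mapper" prefix and dm-N block device names.
DM_PATTERN = re.compile(r"device-mapper|\bdm-\d+\b", re.IGNORECASE)


def dm_kernel_lines(dmesg_text: str) -> List[str]:
    """Return only the kernel log lines that mention device-mapper."""
    return [line for line in dmesg_text.splitlines() if DM_PATTERN.search(line)]


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python this_script.py
    for line in dm_kernel_lines(sys.stdin.read()):
        print(line)
```

The filter is intentionally broad; anything it surfaces near the failure time still has to be read in context.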

@stevekuznetsov
Contributor

@rhvgoyal PID 1 log is here for the job last linked by Ben. Can you take a look?

@rhvgoyal

rhvgoyal commented Feb 2, 2018

@stevekuznetsov These are PID 1 logs. I am looking for logs from the kernel.

@stevekuznetsov
Contributor

@rhvgoyal I misremembered; I thought we got kernel logs in that file too. I'm adding kernel logs to the jobs now, so we will need a new report to see them.

@bparees
Contributor

bparees commented Feb 5, 2018

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/18440/test_pull_request_origin_extended_image_ecosystem/511/

          message: 'devmapper: Error activating devmapper device for ''4310c37b85962b06ab305a8212c1fde0946f2087ba745d823807f7c176bea859'':
            devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run

@stevekuznetsov
Contributor

@rhvgoyal dmesg output here for the log @bparees just linked to

@legionus
Contributor

https://ci.openshift.redhat.com/jenkins/job/test_branch_origin_extended_builds/379/consoleFull#-169259343956c60d7be4b02b88ae8c268b

2018-02-19T13:25:29.654156375Z error: build error: building docker.io/extended-test-new-app-ht59s-bq949/a234567890123456789012345678901234567890123456789012345678-1:9a714a1e failed when committing the image due to error: Error response from daemon: {"message":"devmapper: Error activating devmapper device for '5f89b8a25b72959bd260aaff47dfcc685de8f3b7b0f9d38cbee8713f9a0884c2': devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed"}

@stevekuznetsov
Contributor

@rhvgoyal ping

@stevekuznetsov
Contributor

@rhvgoyal ping

@jwhonce
Contributor

jwhonce commented Mar 16, 2018

@runcom Can you assist @rhvgoyal ?

@jwhonce jwhonce assigned runcom, mrunalp and jwhonce and unassigned jwhonce and runcom Mar 16, 2018
@jwhonce
Contributor

jwhonce commented Mar 27, 2018

@bparees I'm going to have to work with the CI guys to debug saving the S3 artifacts. All the files appear to have zero length, so I have no insight into why that job failed. /cc @runcom

@wozniakjan
Contributor

wozniakjan commented Apr 6, 2018

something similar might have happened today again: https://ci.openshift.redhat.com/jenkins/job/test_branch_origin_extended_builds/426/consoleFull#-60125067356cbb9a5e4b02b88ae8c2f77

2018-04-06T11:13:18.687466133Z error: build error: devmapper: Error activating devmapper device for 'e8e4aad8db3aab8c67f96479a218c146eac193784e421d614804e39e551529b3': devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed to contain substring
<string>: MEMORY=209715200

@jwhonce
Contributor

jwhonce commented Apr 6, 2018

@wozniakjan I reviewed the artifacts for this event. The docker log does not sufficiently cover the time of the event. I don't know whether the cause was journald throttling the logging or something else. Only the console log shows that error. I need the low-level debug logging to correlate what went wrong. Please attach future events as you see them. TIA

/cc @runcom @mrunalp
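
A coarse check for whether a collected docker log spans the time of a failure could look like the sketch below. It assumes timestamps in the UTC `...Z` form seen in the excerpts in this thread, truncates nanoseconds to microseconds, and only compares against the earliest and latest timestamps, so it would not detect gaps in the middle of a log (for example from throttling). The names `parse_ts` and `log_covers` are hypothetical.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

# Matches timestamps such as 2017-12-14T13:59:14.909237523Z
TS = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")


def parse_ts(text: str) -> Optional[datetime]:
    """Parse the first UTC timestamp in text, truncated to microseconds."""
    m = TS.search(text)
    if not m:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    frac = (m.group(2) or "")[:6].ljust(6, "0")
    return base + timedelta(microseconds=int(frac))


def log_covers(log_text: str, event_ts: str) -> bool:
    """True if event_ts lies between the first and last timestamps in the log."""
    stamps = [t for t in map(parse_ts, log_text.splitlines()) if t is not None]
    event = parse_ts(event_ts)
    if not stamps or event is None:
        return False
    return min(stamps) <= event <= max(stamps)
```

Running this against each job's artifacts before filing a new occurrence would show early whether the low-level logs are usable for correlation.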

@wozniakjan
Contributor

@jwhonce possibly another occurrence today: https://ci.openshift.redhat.com/jenkins/job/test_branch_origin_extended_builds/433/consoleFull#83489914556cbb9a5e4b02b88ae8c2f77

2018-04-13T11:30:15.548743367Z Removing intermediate container 0ccef51568e5
2018-04-13T11:30:15.675685762Z F0413 11:30:15.675036       1 helpers.go:119] error: build error: devmapper: Error activating devmapper device for 'be467e484fbabbe279fb86f4d80579d3d7d045dee8face8977aa0217b3c5e1c3': devicemapper: Error running deviceCreate (ActivateDevice) dm_task_run failed

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 12, 2018
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 11, 2018
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci-robot

@openshift-bot: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
