Unable to restart service origin-node #18963

Closed
mfojtik opened this issue Mar 13, 2018 · 6 comments
Labels: area/infrastructure, kind/test-flake, lifecycle/rotten, priority/P1, sig/pod

Comments

@mfojtik
Contributor

mfojtik commented Mar 13, 2018

Seen in https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/18957/test_pull_request_origin_extended_conformance_install/9007/

  1. Hosts:    localhost
     Play:     Configure nodes
     Task:     restart node
     Message:  Unable to restart service origin-node: Job for origin-node.service failed because the control process exited with error code. See "systemctl status origin-node.service" and "journalctl -xe" for details.

Not sure if this failure is related:

Configure nodes [localhost] nickhammond.logrotate : nickhammond.logrotate | Setup logrotate.d scripts 21m31s
go run hack/e2e.go -v -test --test_args='--ginkgo.focus=Configure\snodes\s\[localhost\]\snickhammond\.logrotate\s\:\snickhammond\.logrotate\s\|\sSetup\slogrotate\.d\sscripts$'
@sdodson
Member

sdodson commented Mar 15, 2018

The node never sends the systemd notification that it's started successfully even though it looks to be running just fine.

Mar 13 12:36:12 ip-172-18-15-85.ec2.internal systemd[1]: Starting OpenShift Node...
... five minutes later ...
Mar 13 12:41:12 ip-172-18-15-85.ec2.internal systemd[1]: origin-node.service start operation timed out. Terminating.
Mar 13 12:41:12 ip-172-18-15-85.ec2.internal origin-node[3143]: I0313 12:41:12.256208    3143 docker_server.go:73] Stop docker server
Mar 13 12:41:12 ip-172-18-15-85.ec2.internal systemd[1]: origin-node.service: main process exited, code=exited, status=1/FAILURE
Mar 13 12:41:12 ip-172-18-15-85.ec2.internal systemd[1]: Failed to start OpenShift Node.
Mar 13 12:41:12 ip-172-18-15-85.ec2.internal systemd[1]: Unit origin-node.service entered failed state.
Mar 13 12:41:12 ip-172-18-15-85.ec2.internal systemd[1]: origin-node.service failed.
Mar 13 12:41:17 ip-172-18-15-85.ec2.internal systemd[1]: origin-node.service holdoff time over, scheduling restart.
Mar 13 12:41:17 ip-172-18-15-85.ec2.internal systemd[1]: Starting OpenShift Node...

@sjenning
Contributor

Possibly related to #18886.
Was there a corresponding change to the node's systemd unit to switch it to Type=notify (if it isn't already)?

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 14, 2018
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 14, 2018
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close
