Container creation fails because of "Failed create pod sandbox" #17047

Closed
Ocimum-basilicum opened this issue Oct 26, 2017 · 35 comments
Assignees
Labels
component/kubernetes kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. priority/P1

Comments

@Ocimum-basilicum

Pods are not getting created anymore

Version

oc v3.6.173.0.7
kubernetes v1.6.1+5115d708d7
features: Basic-Auth

Server https://api.starter-ca-central-1.openshift.com:443
openshift v3.7.0-0.143.7
kubernetes v1.7.0+80709908fd

Steps To Reproduce
  1. create an application (e.g. redis (persistent) from the catalog; see the sketch after these steps)
  2. check pod/container creation
  3. wait for timeouts
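
For reference, a rough sketch of the reproduction with the oc CLI; the redis-persistent template name is my assumption about the catalog entry, not something verified against the cluster:

# create the application from the catalog template (assumed name: redis-persistent)
oc new-app --template=redis-persistent
# watch pod/container creation and wait for the timeouts described below
oc get pods -w
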
Current Result

Warning messages on the pod:
1:33:46 PM | Normal | Sandbox changed | Pod sandbox changed, it will be killed and re-created. 2 times in the last 5 minutes
1:33:42 PM | Warning | Failed create pod sand box | Failed create pod sandbox. 2 times in the last 5 minutes
--> pod is not created

The only real error I could grab was:
Failed kill pod | error killing pod: failed to "KillPodSandbox" for "c4c2ec61-ba29-11e7-8b2c-02d8407159d1" with KillPodSandboxError: "rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod \"redis-1-deploy_instantsoundbot\" network: CNI request failed with status 400: 'Failed to execute iptables-restore: exit status 4 (Another app is currently holding the xtables lock. Perhaps you want to use the -w option?\n)\n'"
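
For context, the quoted error is about contention on the xtables lock, and the message itself points at the -w/--wait flag. The actual fix mentioned later in this thread was rolled out on the cluster side; the lines below are only a sketch of what that flag does on a node, with /tmp/rules.v4 as a made-up example path:

# wait for the xtables lock instead of failing with "Another app is currently holding the xtables lock"
iptables-restore -w < /tmp/rules.v4
# the same wait flag exists for plain iptables invocations
iptables -w -L -n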

Expected Result

Pods should start up as they used to...

Additional Information

Couldn't get oc adm diagnostics working at the moment (the usual invocation is sketched below).
I guess it could have to do with the introduction of #15880
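
For anyone trying the same, the usual 3.x diagnostics invocation looks roughly like the sketch below; it generally needs cluster-admin, which is not available on the Online starter tier, so I could not verify it here:

# run the full diagnostics suite
oc adm diagnostics
# or only the network diagnostic, if the client and cluster support it
oc adm diagnostics NetworkCheck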

@tumido

tumido commented Oct 26, 2017

I'm facing the same in the starter-us-east-1.openshift.com environment. It's been unstable for a couple of days already...

@pweil- pweil- added component/kubernetes kind/bug Categorizes issue or PR as related to a bug. priority/P2 priority/P1 and removed priority/P2 labels Oct 26, 2017
@pweil-

pweil- commented Oct 26, 2017

/cc @jupierce

@sjenning
Contributor

This is a known issue that has a fix, and it is being rolled out to the starter clusters presently.

@skjolber

I think I'm facing the same issue at starter-ca-central-1.openshift.com:

10:04:10 AM | Normal | Deadline exceeded | Pod was active on the node longer than the specified deadline
10:00:03 AM | Normal | Sandbox changed | Pod sandbox changed, it will be killed and re-created. 14 times in the last 58 minutes
9:59:40 AM | Warning | Failed create pod sand box | Failed create pod sandbox. 14 times in the last 58 minutes

When will the fix finish rolling out?
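
In case it helps with triage, the events above can be pulled per pod with the oc CLI; a small sketch with the project and pod name as placeholders:

# recent events in the project, oldest first
oc get events -n <project> --sort-by=.lastTimestamp
# full event stream and status for one failing pod
oc describe pod <pod-name> -n <project>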

@14yannick

Facing the same issue; not possible to roll out anything on starter-ca-central-1.openshift.com. Hope it will be fixed soon.

@osamahassan245

osamahassan245 commented Oct 30, 2017

I got the same issue. I tried to create an application using Tomcat 8 and to build the source code at this path:
https://github.com/osamahassan245/samplepp

I got a build error; when I tried to check the log, I got this:

container "sti-build" in pod is not available

@Artod

Artod commented Oct 30, 2017

Same problem on starter-ca-central-1.openshift.com

error streaming logs from build pod: sii-test/app-5-build container: , container "sti-build" in pod "app-5-build" is not available

@osamahassan245

Issue solved. I tried to use "Red Hat JBoss Web Server 3.1 Tomcat 8 1.0", and it's working fine now.

@edevyatkin

The issue is still present on starter-us-east-1.openshift.com

@izderadicka

Still a problem on ca-central.

@brianHollingsworth

Glad I'm not the only one seeing this issue. It's been occurring for me on console.starter-us-west-1.openshift.com since last weekend (11/4).

@sothawo

sothawo commented Nov 12, 2017

still seeing this on starter-ca-central-1.openshift.com

@mavajsunco

I have the same issue: error streaming logs from build pod: mavajsunco-website/mavajsunco-msc-6-build container: , container "sti-build" in pod "mavajsunco-msc-6-build" is not available

@axl8713

axl8713 commented Nov 21, 2017

Same issue deploying rhscl/mysql-57-rhel7 on starter-us-east-1.

@sjenning sjenning assigned dcbw and unassigned sjenning Nov 29, 2017
@sjenning
Contributor

@dcbw this is the all too familiar iptables-restore issue. You are closer to this than I am and hopefully can provide better feedback about the progress.

@warmchang
Contributor

👍

@nevadascout

Still having this problem on starter-us-west-2.

I've got 7 failed deployments in a row for this error message.

@skoorupa

skoorupa commented Feb 4, 2018

^same

@DanyC97
Contributor

DanyC97 commented Feb 21, 2018

@dcbw @sjenning any input as to where the issue might be?

@jamestenglish

Seeing this on pro-us-east-1

@jherson

jherson commented Feb 22, 2018

Seeing this for the last couple of days on pro-us-east-1 as well

@shreyasgombi

Same here!!! Observing on pro-us-east-1.

@saurabhdevops

Hey folks! Any update on this one? Do you have a fix already in the openshift or openshift-ansible repos that I can pick up? Is there a temporary workaround for this issue? We are facing the same issue with our OpenShift cluster on AWS.

Version
OpenShift Master:
v3.7.0+7ed6862
Kubernetes Master:
v1.7.6+a08f5eeb62

@saurabhdevops

@pweil-, @jupierce, are you still looking into this issue? Is there any progress or a workaround available?

@pweil-

pweil- commented Mar 26, 2018

@dcbw @knobunc ping

@agajdosi

I am facing a similar issue using OCP v3.9.30 with CDK. In my case I have Che deployed on OpenShift, and when I start a new workspace, its node crashes with Sandbox changed events:

11:52:32 AM 	Normal 	Killing  	Killing container with id docker://container:Need to kill Pod
11:52:30 AM 	Normal 	Sandbox Changed  	Pod sandbox changed, it will be killed and re-created.
11:52:28 AM 	Normal 	Started  	Started container
11:52:28 AM 	Normal 	Created  	Created container

Is there any update on this issue @dcbw?

@14yannick

14yannick commented Jun 12, 2018

I used OpenShift for more than 5 years. Spent a lot of time making my app run on v2 again. In the end, traffic was just not routed anymore. Moved to Heroku; it took me 2 hours to migrate all my data (db) and make the necessary source code changes. Since then, no more problems. Sorry, OpenShift.

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 10, 2018
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 10, 2018
@ghost

ghost commented Oct 31, 2018

Seeing this (or something similar) currently on OpenShift Online starter-us-west-1. Unable to build or deploy because of it. No logs from pods that have this issue. Status page says all green.

@jhaohai

jhaohai commented Nov 13, 2018

We still see this issue on OKD 3.7.1

@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci-robot

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@matthewmcneilly

matthewmcneilly commented May 10, 2019

I am seeing this issue, or something similar, when deploying the 3scale API Management Platform on OpenShift, in particular with system-sidekiq.

Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "2cc1e1d064082f2a2b8cd7a10efb7d135a8a150e7d95fb7b939d6368e1717309" network for pod "system-sidekiq-6-deploy-debug": NetworkPlugin cni failed to set up pod "system-sidekiq-6-deploy-debug_mmcneilly-3scale-onprem" network: CNI request failed with status 400: 'pods "system-sidekiq-6-deploy-debug" not found '

Can this issue be reopened?
/reopen

@openshift-ci-robot

@matthewmcneilly: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
