If a plugin pod's host is terminated, aggregator gets stuck #1978

Open
jpdstan opened this issue Jun 28, 2024 · 1 comment
jpdstan commented Jun 28, 2024

What steps did you take and what happened:
tl;dr: if the sonobuoy aggregator and a sonobuoy plugin pod are running on separate hosts and the plugin pod's host dies, the aggregator gets stuck logging the following message and keeps retrying until it times out:

time="2024-06-28T21:49:39Z" level=error msg="could not find pod created by plugin my-plugin-test, will retry: no pods were created by plugin my-plugin-test"

steps to reproduce:
  1. sonobuoy run with a test that takes more than a few minutes to finish
  2. wait for the sonobuoy pod to create the plugin pod (e.g. sonobuoy-my-plugin-test)
  3. force delete the node that sonobuoy-my-plugin-test is running on. it MUST be a different node from the one running the sonobuoy pod.
  4. check the logs of the sonobuoy pod; the error above repeats until the run times out.

What did you expect to happen:
it would be good if sonobuoy re-created the plugin pods. perhaps this check could have its own timeout, after which sonobuoy tries to re-create the pods instead of retrying the lookup indefinitely.

alternatively, the parent caller of sonobuoy run could do the retry, but i'm wondering if there's a better way to handle this in sonobuoy itself.

Anything else you would like to add:

Environment:

  • Sonobuoy version: 0.56.15
  • Kubernetes version: (use kubectl version): 1.22
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release): ubuntu 20.04
  • Sonobuoy tarball (which contains * below)

stale bot commented Jan 31, 2025

There has not been much activity here. We'll be closing this issue if there are no follow-ups within 15 days.

@stale stale bot added the stale label Jan 31, 2025