Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "disable ResilientWatchCacheInitialization feature" #2192

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

benluddy
Copy link

@benluddy benluddy commented Jan 30, 2025

When this feature is enabled, watch requests that are to be served from the watch cache immediately return 429 if the cache is not initialized and the client retries. When disabled, the same watch requests "hang" until they either time out or complete successfully.

There is an OCP test that counts the number of watch requests during a job on a per-user basis by scraping audit logs. The test fails if a user exceeds an arbitrary threshold that has been selected based on historical observations. With this feature enabled, any issue that delays watch cache initialization or forces a watch cache to reinitialize now results in an increase in the number of watch requests appearing in the audit logs (due to the retries), which in turn causes the test thresholds to breach.

This was temporarily disabled for kube-apiserver to improve the CI signal-to-noise ratio during the 1.31 rebase. It was not disabled for openshift-apiserver.

Sample job from the 1.31 rebase process before the feature was disabled: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-cluster-kube-apiserver-operator-1734-openshift-kubernetes-2055-openshift-cluster-kube-apiserver-operator-1734-nightly-4.18-e2e-aws-ovn-single-node-serial/1835775665903767552

@openshift-ci-robot openshift-ci-robot added the backports/unvalidated-commits Indicates that not all commits come to merged upstream PRs. label Jan 30, 2025
@openshift-ci-robot
Copy link

@benluddy: the contents of this pull request could not be automatically validated.

The following commits could not be validated and must be approved by a top-level approver:

Comment /validate-backports to re-evaluate validity of the upstream PRs, for example when they are merged upstream.

@benluddy
Copy link
Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 30, 2025
@openshift-ci openshift-ci bot requested review from jerpeter1 and tkashem January 30, 2025 22:25
@openshift-ci openshift-ci bot added the vendor-update Touching vendor dir or related files label Jan 30, 2025
Copy link

openshift-ci bot commented Jan 30, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: benluddy
Once this PR has been reviewed and has the lgtm label, please assign p0lyn0mial for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

openshift-ci bot commented Jan 31, 2025

@benluddy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-agnostic-ovn-cmd 3a3459a link false /test e2e-agnostic-ovn-cmd
ci/prow/e2e-gcp 3a3459a link true /test e2e-gcp
ci/prow/k8s-e2e-conformance-aws 3a3459a link true /test k8s-e2e-conformance-aws
ci/prow/e2e-aws-ovn-serial 3a3459a link true /test e2e-aws-ovn-serial
ci/prow/e2e-aws-ovn-hypershift 3a3459a link true /test e2e-aws-ovn-hypershift
ci/prow/e2e-aws-crun-wasm 3a3459a link true /test e2e-aws-crun-wasm
ci/prow/e2e-aws-ovn-runc 3a3459a link true /test e2e-aws-ovn-runc
ci/prow/e2e-aws-ovn-crun 3a3459a link true /test e2e-aws-ovn-crun
ci/prow/verify-commits 3a3459a link true /test verify-commits
ci/prow/e2e-aws-ovn-cgroupsv2 3a3459a link true /test e2e-aws-ovn-cgroupsv2
ci/prow/okd-scos-e2e-aws-ovn 3a3459a link false /test okd-scos-e2e-aws-ovn
ci/prow/e2e-aws-csi 3a3459a link false /test e2e-aws-csi
ci/prow/k8s-e2e-aws-ovn-serial 3a3459a link false /test k8s-e2e-aws-ovn-serial
ci/prow/e2e-aws-ovn-fips 3a3459a link true /test e2e-aws-ovn-fips

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backports/unvalidated-commits Indicates that not all commits come to merged upstream PRs. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. vendor-update Touching vendor dir or related files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants