Revert "disable ResilientWatchCacheInitialization feature" #2192
base: master
Conversation
…eature" This reverts commit 4772890.
@benluddy: the contents of this pull request could not be automatically validated. The following commits could not be validated and must be approved by a top-level approver:
/hold
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: benluddy

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
@benluddy: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:

Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
When this feature is enabled, watch requests that are to be served from the watch cache immediately return 429 if the cache is not yet initialized, and the client retries. When it is disabled, the same watch requests "hang" until they either time out or complete successfully.
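The distinction is easier to see in code. Below is a minimal sketch of the two behaviors in plain Go, not the actual apiserver implementation; `cacheReady` and `handleWatch` are hypothetical names standing in for the watch cache's initialization signal and the watch handler.

```go
package main

import "net/http"

// Assumed feature gate state; in the real apiserver this comes from --feature-gates.
var resilientWatchCacheInitialization = true

// cacheReady is a hypothetical stand-in for the watch cache's "initialized" signal.
var cacheReady = make(chan struct{})

func handleWatch(w http.ResponseWriter, r *http.Request) {
	select {
	case <-cacheReady:
		// Cache is initialized; fall through and serve the watch normally.
	default:
		if resilientWatchCacheInitialization {
			// Feature enabled: fail fast with 429 so the client retries.
			w.Header().Set("Retry-After", "1")
			http.Error(w, "watch cache is not initialized", http.StatusTooManyRequests)
			return
		}
		// Feature disabled: block ("hang") until the cache initializes,
		// the request times out, or the client goes away.
		select {
		case <-cacheReady:
		case <-r.Context().Done():
			return
		}
	}
	// ... stream watch events from the cache ...
}

func main() {
	http.HandleFunc("/watch", handleWatch)
	_ = http.ListenAndServe(":8080", nil)
}
```

Each 429 is a completed (audited) request, so one slow cache initialization can turn a single long-lived watch into many short, retried ones.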
There is an OCP test that counts watch requests per user over the course of a job by scraping audit logs. The test fails if any user exceeds an arbitrary threshold chosen from historical observations. With this feature enabled, any issue that delays watch cache initialization or forces a watch cache to reinitialize increases the number of watch requests appearing in the audit logs (because of the retries), which in turn causes the test's thresholds to be breached.
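For illustration, here is a rough sketch of the kind of per-user counting such a test performs, assuming one JSON-encoded audit event per line in the Kubernetes audit v1 format; the threshold value is invented and this is not the actual OCP test code.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// auditEvent extracts just the fields needed from a Kubernetes audit v1 event.
type auditEvent struct {
	Verb string `json:"verb"`
	User struct {
		Username string `json:"username"`
	} `json:"user"`
}

func main() {
	const threshold = 1000 // arbitrary per-user limit (hypothetical value)

	counts := map[string]int{}
	scanner := bufio.NewScanner(os.Stdin) // e.g. piped from the scraped audit log
	scanner.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // audit lines can be long
	for scanner.Scan() {
		var ev auditEvent
		if err := json.Unmarshal(scanner.Bytes(), &ev); err != nil {
			continue // skip malformed lines
		}
		if ev.Verb == "watch" {
			counts[ev.User.Username]++
		}
	}
	for user, n := range counts {
		if n > threshold {
			fmt.Printf("FAIL: user %s issued %d watch requests (limit %d)\n", user, n, threshold)
		}
	}
}
```

A check like this cannot distinguish one hanging watch from many fast 429-and-retry cycles, which is why enabling the feature inflates the counts.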
This was temporarily disabled for kube-apiserver to improve the CI signal-to-noise ratio during the 1.31 rebase. It was not disabled for openshift-apiserver.
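For context, a feature gate like this can be flipped at runtime with the kube-apiserver flag `--feature-gates=ResilientWatchCacheInitialization=false`, or its registered default can be overridden in code. The sketch below shows the latter pattern using the `k8s.io/component-base/featuregate` package; it is illustrative only, not the actual commit 4772890 being reverted.

```go
package main

import (
	"fmt"

	"k8s.io/component-base/featuregate"
)

const ResilientWatchCacheInitialization featuregate.Feature = "ResilientWatchCacheInitialization"

func main() {
	gates := featuregate.NewFeatureGate()
	_ = gates.Add(map[featuregate.Feature]featuregate.FeatureSpec{
		// Registering with Default: false is the effect of "disabling" the
		// feature; reverting such a change restores the upstream default.
		ResilientWatchCacheInitialization: {Default: false, PreRelease: featuregate.Beta},
	})
	fmt.Println("enabled:", gates.Enabled(ResilientWatchCacheInitialization))
}
```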
Sample job from the 1.31 rebase process before the feature was disabled: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/openshift-cluster-kube-apiserver-operator-1734-openshift-kubernetes-2055-openshift-cluster-kube-apiserver-operator-1734-nightly-4.18-e2e-aws-ovn-single-node-serial/1835775665903767552