Router: Changed default resource resync interval from 10mins to 30mins #17012
@knobunc @eparis @rajatchopra PTAL
Rationale:
- Resyncs are mainly intended for robustness: they handle the case where the resource handler failed to process an item, in the hope that processing it again after some time (the resync interval) fixes the problem. This may fix some transient errors, but resyncing frequently can carry big penalties.
- Currently the router watches these resources: routes, endpoints, nodes, namespaces, ingresses and secrets. With many routes (several thousand in the Online case), processing these items takes a long time, and a router reload itself takes a few seconds (not milliseconds). A short resync interval therefore causes constant churn, reprocessing all the items for all these resources.
- Earlier we needed a shorter resync interval because the sharded router depended on it, but openshift#16039 removed that limitation.

10 minutes seems aggressive for some rare transient errors, so the default is changed to 30 minutes. Admins can edit the router deployment config if they need a custom resync interval.
ResyncInterval is a tunable parameter and can go as low as 1 second.
1ebc7b1 to 5e20571
/unassign
/hold
@eparis
LGTM
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: eparis, pravisankar
/retest
/retest Please review the full test history for this PR and help us cut down flakes.
/test all [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue (batch tested with PRs 17012, 17243).
Resyncs are mainly intended for robustness: they handle the case
where the resource handler failed to process an item, in the hope that
processing it again after some time (the resync interval) fixes the problem.
This may fix some transient errors, but resyncing frequently can carry
big penalties.
Currently the router watches these resources: routes, endpoints, nodes,
namespaces, ingresses and secrets. With many routes
(several thousand in the Online case), processing these items takes
a long time, and a router reload itself takes a few seconds (not milliseconds).
A short resync interval therefore causes constant churn, reprocessing
all the items for all these resources.
Earlier we needed a shorter resync interval because the sharded router depended
on this interval, but #16039 ("Sharded router based on namespace labels should notice routes immediately") removed that limitation.
10 minutes seems aggressive for some rare transient errors, so the default is
changed to 30 minutes. Admins can edit the router deployment config if they need a custom resync interval.
Fixed project sync interval in router