-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change the router reload supression, and add some more controls #16564
Conversation
Code looks ok. The flow is actually a more clear this way. |
204995f
to
cc3edd2
Compare
@imcsk8 Thanks for the review. I plan to add some tests soon. |
70a6561
to
d09631e
Compare
@openshift/networking @openshift/sig-networking FYI |
902bd6f
to
929af8f
Compare
Can we use the controls' default value (RELOAD_INTERVAL=5s, RELOAD_GAP=100ms, RELOAD_EVENT_WAIT=10ms) to verify the bug https://bugzilla.redhat.com/show_bug.cgi?id=1471899 ? If not can you give the recommended value for that scenario of thousands routes? And my understanding for this commit is: the max reload duration will be limited to 4890ms (5s-100ms-10ms), right ? |
@lihongan those values would work. You could drop RELOAD_INTERVAL to 1s and RELOAD_GAP and RELOAD_EVENT_WAIT to 0ms if you wanted. That might make it easier to reproduce and validate. The duration is the longest of the times. So if a reload is desired, then it can reload if (and only if):
So, the longest possible reload time is really unknowable. If the time taken to actually perform the haproxy reload is longer than the RELOAD_INTERVAL, it will be the actual reload time + RELOAD_GAP. Otherwise, if the actual reload time is less than RELOAD_INTERVAL - RELOAD_GAP, then it will reload at RELOAD_INTERVAL. RELOAD_EVENT_WAIT will likely be ignored on a busy system since if you have events coming in all the time that would trigger a reload, then a reload will be requested long before RELOAD_INTERVAL and RELOAD_GAP will allow it. On the other hand, on a system with few route or endpoint changes, then before, the reload would happen immediately as soon as the first event was delivered, and the second event would have to wait for RELOAD_INTERVAL before it took effect, even though it was probably a route or endpoint change related to the first event, and the first event may not be useful without it. So, we now allow the router to hesitate to gather more events before actually triggering the reload. |
a6ee50f
to
9f2f273
Compare
9f2f273
to
2cb02bb
Compare
pkg/cmd/infra/router/template.go
Outdated
value = time.Duration(defaultReloadEventWait * time.Millisecond) | ||
} | ||
return value | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kinda seems like these could be one function that takes the env var name and the default interval as arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
glog.V(4).Infof("Router state synchronized for the first time") | ||
r.synced = true | ||
r.stateChanged = true | ||
if hasSynced { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused... if hasSynced = true that implies the state is fully synchronized, but then below r.synced can be false? maybe there's a better name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user tells us we have synced, but our internal state doesn't yet reflect that... update the internal state and emit a message
// We need to track the earliest commit time so we can do burst supression. | ||
now := time.Now() | ||
r.commitReqTime = &now | ||
r.lock.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if you just defer r.lock.Unlock()
at the top, and then in this block do defer r.commitWorker()
so the locking is a bit simpler? The second defer should run after the unlock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't do it that way because the unlock would be called after the commitWorker because defers are called in reverse order from definition.
pkg/router/template/router.go
Outdated
lastReloadEnd time.Time | ||
// commitReqTime is nil if no commit is needed, otherwise it is the time the commit was requested | ||
commitReqTime *time.Time | ||
// commitRunning indicates whether there is the commitFunc is actively running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit "commitRunning indicates whether the commitFunc is actively running"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
0ed02ab
to
5b50019
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple of nits, otherwise good
pkg/cmd/infra/router/template.go
Outdated
@@ -37,6 +37,12 @@ import ( | |||
// defaultReloadInterval is how often to do reloads in seconds. | |||
const defaultReloadInterval = 5 | |||
|
|||
// defaultReloadGap is the minimum gap between a reload end and start in milliseconds. | |||
const defaultReloadGap = 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should just be time.Durations
too...
pkg/router/template/router.go
Outdated
commitRunning bool | ||
// commitTimer is the timer we use to make callbacks to the delayed commits | ||
commitTimer *time.Timer | ||
// The minimum time between the starts of reloads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
godoc: reloadInterval is the minimum...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto the next few fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/router/template/router.go
Outdated
untilNextCallback = getNextReload(timeUntilNextAction(now, r.lastReloadEnd, r.reloadGap)) | ||
untilNextCallback = getNextReload(timeUntilNextAction(now, *r.commitReqTime, r.reloadEventWait)) | ||
|
||
if untilNextCallback > 0*time.Second { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need * time.Second
here, I believe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/router/template/router.go
Outdated
if r.commitTimer == nil { | ||
r.commitTimer = time.AfterFunc(untilNextCallback, r.commitWorker) | ||
} else { | ||
r.commitTimer.Reset(untilNextCallback) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should note the reason for the reset here as well ("we reset to give a chance to grab extra changes before a reload, but the maxReloadInterval above prevents us from resetting forever").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarified in a comment.
The main purpose is to change the locking so that a reload doesn't hold a lock of the router object for the duration of the reload so that updates that happen while the router is reloaded can be processed immediately and batched up and included when the next reload occurs. Before this, if a reload went longer than the reload interval, one event would be processed, and then trigger a new reload. Which would then lock out other event processing. So the router would not make any meaningful progress consuming events, and spend a lot of time and CPU reloading continually. The router controls are: - $RELOAD_INTERVAL - Max Reload Frequency (start-to-start of a reload), default 5s - $RELOAD_GAP - Gaps between reloads (end-to-start of a reload), default 100ms - $RELOAD_EVENT_WAIT - Bunch events by waiting a little before reload (wait N ms between the time the event comes in and when the reload starts), default 10ms The commit routine is now changed to have a top and bottom half. The top half checks to see if a reload is scheduled, and if it is, it returns immediately. If there's no reload scheduled then it calls the bottom half and returns. The bottom half is in charge of determining if it can immediately reload or if it has to wait. If it must wait, then it works out the earliest time it can reload and schedules a callback to itself for that time. If it determines it can reload, then it runs the reload code immediately. When the reload is complete, it calls itself again to make sure there was no other pending reload that had come in while the reload was running.
These are the autogenerated shell completion files that correspond to the new router reload command line parameters.
Added new environment variables to help with debugging: - OPENSHIFT_LOG_LEVEL: Defaults to 4, but sets the debug level to the given value - OPENSHIFT_GET_ALL_DOCKER_LOGS: A boolean that enables dumping of all container logs if any container failed (rather than just giving the logs from the failure)
The reload constants were previously integers that were multipled by a duration constant later. This is confusing since the units are different, and the environment variables are duration strings. This patch also changes the way the constants are used and moves them inline in the parameter definitions with a helper function to translate to durations.
5b50019
to
4099687
Compare
/approve |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: knobunc Assign the PR to them by writing The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/assign @smarterclayton |
@@ -103,7 +103,9 @@ func (o *TemplateRouter) Bind(flag *pflag.FlagSet) { | |||
flag.StringVar(&o.DefaultDestinationCAPath, "default-destination-ca-path", util.Env("DEFAULT_DESTINATION_CA_PATH", "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"), "A path to a PEM file containing the default CA bundle to use with re-encrypt routes. This CA should sign for certificates in the Kubernetes DNS space (service.namespace.svc).") | |||
flag.StringVar(&o.TemplateFile, "template", util.Env("TEMPLATE_FILE", ""), "The path to the template file to use") | |||
flag.StringVar(&o.ReloadScript, "reload", util.Env("RELOAD_SCRIPT", ""), "The path to the reload script to use") | |||
flag.DurationVar(&o.ReloadInterval, "interval", reloadInterval(), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the last reload time.") | |||
flag.DurationVar(&o.ReloadInterval, "interval", getDurationEnv("RELOAD_INTERVAL", 5*time.Second), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the time the last reload started.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typos
@@ -103,7 +103,9 @@ func (o *TemplateRouter) Bind(flag *pflag.FlagSet) { | |||
flag.StringVar(&o.DefaultDestinationCAPath, "default-destination-ca-path", util.Env("DEFAULT_DESTINATION_CA_PATH", "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"), "A path to a PEM file containing the default CA bundle to use with re-encrypt routes. This CA should sign for certificates in the Kubernetes DNS space (service.namespace.svc).") | |||
flag.StringVar(&o.TemplateFile, "template", util.Env("TEMPLATE_FILE", ""), "The path to the template file to use") | |||
flag.StringVar(&o.ReloadScript, "reload", util.Env("RELOAD_SCRIPT", ""), "The path to the reload script to use") | |||
flag.DurationVar(&o.ReloadInterval, "interval", reloadInterval(), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the last reload time.") | |||
flag.DurationVar(&o.ReloadInterval, "interval", getDurationEnv("RELOAD_INTERVAL", 5*time.Second), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the time the last reload started.") | |||
flag.DurationVar(&o.ReloadGap, "reload-gap", getDurationEnv("RELOAD_GAP", 100*time.Millisecond), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the time the last reload ended.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How often are we really going to tune these? Adding lots of extra tunables comes with a cost - do these really need to be changeable or can they just be proportional to the overall duration?
Every tunable we add is more choices that someone can get wrong. Also, I couldn't understand how the flags are used from the descriptions, which is a problem.
@@ -205,6 +207,14 @@ func (o *TemplateRouterOptions) Complete() error { | |||
return fmt.Errorf("invalid reload interval: %v - must be a positive duration", nsecs) | |||
} | |||
|
|||
if o.ReloadGap < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be in validate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the other checks in here be in validate too? Or is there some pattern that I don't grok fully.
lock sync.Mutex | ||
// Track the start and end of router reloads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow. This is a LOT of variables. This should be abstracted into some sort of separate object that is passed in because the rest of the template router has nothing to do with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please hide this complexity so that it doesn't get entangled with other places.
// Note: If this is changed FakeCommit() in fake.go should also be updated | ||
func (r *templateRouter) Commit() { | ||
func (r *templateRouter) realCommit(hasSynced bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow.
So this is incredibly complicated and really needs to be abstracted into something smaller and more well tested on its own. The original point of the "commit wrapper" was to have a function that you invoke that does the right thing, and then the rest of it is ignored. This has now become entangled with the template router.
Please separate all of this logic out into three pieces - the signal (from the rest of the template router to reload), the rate limiter / tracker (the bit that says when an actual commit should go), and the underlying commit action that controls how fast it happens.
I.e.:
- router controller signals (changes happened)
- a rate limiter / commit controller decides whether to pass it on (this code)
- the underlying commit.
There really cannot be a connection between them, and the code here is too complicated for me to be confident it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liggitt can you help suggest here? i think of this as controller model - several caches:
informer cache -> feeding secondary cache -> controller loop that reads secondary cache and needs to sync up with secondary cache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have worked out a possible solution.
keyFunc := func(_ interface{}) (string, error) { | ||
return "templaterouter", nil | ||
} | ||
// Do an immediate commit so that we can reply to health checks before the state syncs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this make the router reply that it is healthy prior to having synced? is that correct? couldn't that cause traffic to be directed to it prior to actually having all the routes set up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already cover that behavior in the router... the haproxy template is passed a synced flag. So it doesn't bind 80/443 until the sync happens, but it does bind the status port for health checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems really odd. It reports healthy before it can serve traffic?
t1 (time between start and end of reload) is variable as it depends on the number of routes/endpoints and also on cpu load on the router machine. So as routes/endpoints increase or decrease, what ever value we initially set may not be accurate/desired. I think instead of 3 knobs, just RELOAD_GAP knob will solve most of the reload problems and the current implementation of doing router reload and not blocking other in-memory router updates during this time will help a lot. |
Superseded by #17049 |
The main purpose is to change the locking so that a reload doesn't
hold a lock of the router object for the duration of the reload so that updates that happen while the router is reloaded can be processed immediately and batched up and included when the next reload occurs. Before this, if a reload went longer than the reload interval, one event would be processed, and then trigger a new reload. Which would then lock out other event processing. So the router would not make any meaningful progress consuming events, and spend a lot of time and CPU reloading continually.
The router controls are:
the time the event comes in and when the reload starts)
The commit routine is now changed to have a top and bottom half. The
top half checks to see if a reload is scheduled, and if it is, it
returns immediately. If there's no reload scheduled then it calls the
bottom half and returns.
The bottom half is in charge of determining if it can immediately
reload or if it has to wait. If it must wait, then it works out the
earliest time it can reload and schedules a callback to itself for
that time.
If it determines it can reload, then it runs the reload code
immediately. When the reload is complete, it calls itself again to
make sure there was no other pending reload that had come in while the
reload was running.