Change the router reload supression, and add some more controls #16564

knobunc · 2017-09-26T17:29:33Z

The main purpose is to change the locking so that a reload doesn't
hold a lock of the router object for the duration of the reload so that updates that happen while the router is reloaded can be processed immediately and batched up and included when the next reload occurs. Before this, if a reload went longer than the reload interval, one event would be processed, and then trigger a new reload. Which would then lock out other event processing. So the router would not make any meaningful progress consuming events, and spend a lot of time and CPU reloading continually.

The router controls are:

Max Reload Frequency (start-to-start of a reload)
Gaps between reloads (end-to-start of a reload)
Bunch events by waiting a little before reload (wait N ms between
the time the event comes in and when the reload starts)

The commit routine is now changed to have a top and bottom half. The
top half checks to see if a reload is scheduled, and if it is, it
returns immediately. If there's no reload scheduled then it calls the
bottom half and returns.

The bottom half is in charge of determining if it can immediately
reload or if it has to wait. If it must wait, then it works out the
earliest time it can reload and schedules a callback to itself for
that time.

If it determines it can reload, then it runs the reload code
immediately. When the reload is complete, it calls itself again to
make sure there was no other pending reload that had come in while the
reload was running.

imcsk8 · 2017-09-26T23:00:10Z

Code looks ok. The flow is actually a more clear this way.
Are there any suggested ways to test this?

knobunc · 2017-09-27T17:49:27Z

@imcsk8 Thanks for the review. I plan to add some tests soon.

knobunc · 2017-09-28T12:24:24Z

@openshift/networking @openshift/sig-networking FYI

lihongan · 2017-09-29T02:07:04Z

Can we use the controls' default value (RELOAD_INTERVAL=5s, RELOAD_GAP=100ms, RELOAD_EVENT_WAIT=10ms) to verify the bug https://bugzilla.redhat.com/show_bug.cgi?id=1471899 ? If not can you give the recommended value for that scenario of thousands routes?

And my understanding for this commit is: the max reload duration will be limited to 4890ms (5s-100ms-10ms), right ?

knobunc · 2017-09-29T12:09:05Z

@lihongan those values would work. You could drop RELOAD_INTERVAL to 1s and RELOAD_GAP and RELOAD_EVENT_WAIT to 0ms if you wanted. That might make it easier to reproduce and validate.

The duration is the longest of the times. So if a reload is desired, then it can reload if (and only if):

The last attempt to reload was more than RELOAD_INTERVAL ago
The last reload finished more than RELOAD_GAP ago
The current time is more than RELOAD_EVENT_WAIT than the first message that triggered the reload (since we expect more than one message to come in in a bunch related to a route change)

So, the longest possible reload time is really unknowable. If the time taken to actually perform the haproxy reload is longer than the RELOAD_INTERVAL, it will be the actual reload time + RELOAD_GAP. Otherwise, if the actual reload time is less than RELOAD_INTERVAL - RELOAD_GAP, then it will reload at RELOAD_INTERVAL.

RELOAD_EVENT_WAIT will likely be ignored on a busy system since if you have events coming in all the time that would trigger a reload, then a reload will be requested long before RELOAD_INTERVAL and RELOAD_GAP will allow it. On the other hand, on a system with few route or endpoint changes, then before, the reload would happen immediately as soon as the first event was delivered, and the second event would have to wait for RELOAD_INTERVAL before it took effect, even though it was probably a route or endpoint change related to the first event, and the first event may not be useful without it. So, we now allow the router to hesitate to gather more events before actually triggering the reload.

dcbw · 2017-09-29T16:06:40Z

pkg/cmd/infra/router/template.go

+		value = time.Duration(defaultReloadEventWait * time.Millisecond)
+	}
+	return value
+}


Kinda seems like these could be one function that takes the env var name and the default interval as arguments.

dcbw · 2017-09-29T16:10:20Z

pkg/router/template/router.go

-		glog.V(4).Infof("Router state synchronized for the first time")
-		r.synced = true
-		r.stateChanged = true
+	if hasSynced {


I'm confused... if hasSynced = true that implies the state is fully synchronized, but then below r.synced can be false? maybe there's a better name.

If the user tells us we have synced, but our internal state doesn't yet reflect that... update the internal state and emit a message

dcbw · 2017-09-29T16:13:02Z

pkg/router/template/router.go

+			// We need to track the earliest commit time so we can do burst supression.
+			now := time.Now()
+			r.commitReqTime = &now
+			r.lock.Unlock()


What if you just defer r.lock.Unlock() at the top, and then in this block do defer r.commitWorker() so the locking is a bit simpler? The second defer should run after the unlock.

We can't do it that way because the unlock would be called after the commitWorker because defers are called in reverse order from definition.

JacobTanenbaum · 2017-09-29T16:47:49Z

pkg/router/template/router.go

+	lastReloadEnd   time.Time
+	// commitReqTime is nil if no commit is needed, otherwise it is the time the commit was requested
+	commitReqTime *time.Time
+	// commitRunning indicates whether there is the commitFunc is actively running


nit "commitRunning indicates whether the commitFunc is actively running"

DirectXMan12

couple of nits, otherwise good

DirectXMan12 · 2017-09-29T15:58:27Z

pkg/cmd/infra/router/template.go

@@ -37,6 +37,12 @@ import (
 // defaultReloadInterval is how often to do reloads in seconds.
 const defaultReloadInterval = 5

+// defaultReloadGap is the minimum gap between a reload end and start in milliseconds.
+const defaultReloadGap = 100


these should just be time.Durations too...

DirectXMan12 · 2017-09-29T17:10:19Z

pkg/router/template/router.go

+	commitRunning bool
+	// commitTimer is the timer we use to make callbacks to the delayed commits
+	commitTimer *time.Timer
+	// The minimum time between the starts of reloads


godoc: reloadInterval is the minimum...

ditto the next few fields

DirectXMan12 · 2017-09-29T17:16:04Z

pkg/router/template/router.go

+	untilNextCallback = getNextReload(timeUntilNextAction(now, r.lastReloadEnd, r.reloadGap))
+	untilNextCallback = getNextReload(timeUntilNextAction(now, *r.commitReqTime, r.reloadEventWait))
+
+	if untilNextCallback > 0*time.Second {


you don't need * time.Second here, I believe

DirectXMan12 · 2017-09-29T17:19:09Z

pkg/router/template/router.go

+		if r.commitTimer == nil {
+			r.commitTimer = time.AfterFunc(untilNextCallback, r.commitWorker)
+		} else {
+			r.commitTimer.Reset(untilNextCallback)


you should note the reason for the reset here as well ("we reset to give a chance to grab extra changes before a reload, but the maxReloadInterval above prevents us from resetting forever").

Clarified in a comment.

The main purpose is to change the locking so that a reload doesn't hold a lock of the router object for the duration of the reload so that updates that happen while the router is reloaded can be processed immediately and batched up and included when the next reload occurs. Before this, if a reload went longer than the reload interval, one event would be processed, and then trigger a new reload. Which would then lock out other event processing. So the router would not make any meaningful progress consuming events, and spend a lot of time and CPU reloading continually. The router controls are: - $RELOAD_INTERVAL - Max Reload Frequency (start-to-start of a reload), default 5s - $RELOAD_GAP - Gaps between reloads (end-to-start of a reload), default 100ms - $RELOAD_EVENT_WAIT - Bunch events by waiting a little before reload (wait N ms between the time the event comes in and when the reload starts), default 10ms The commit routine is now changed to have a top and bottom half. The top half checks to see if a reload is scheduled, and if it is, it returns immediately. If there's no reload scheduled then it calls the bottom half and returns. The bottom half is in charge of determining if it can immediately reload or if it has to wait. If it must wait, then it works out the earliest time it can reload and schedules a callback to itself for that time. If it determines it can reload, then it runs the reload code immediately. When the reload is complete, it calls itself again to make sure there was no other pending reload that had come in while the reload was running.

These are the autogenerated shell completion files that correspond to the new router reload command line parameters.

Added new environment variables to help with debugging: - OPENSHIFT_LOG_LEVEL: Defaults to 4, but sets the debug level to the given value - OPENSHIFT_GET_ALL_DOCKER_LOGS: A boolean that enables dumping of all container logs if any container failed (rather than just giving the logs from the failure)

The reload constants were previously integers that were multipled by a duration constant later. This is confusing since the units are different, and the environment variables are duration strings. This patch also changes the way the constants are used and moves them inline in the parameter definitions with a helper function to translate to durations.

knobunc · 2017-09-29T17:51:16Z

/approve

openshift-merge-robot · 2017-09-29T17:51:26Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: knobunc
We suggest the following additional approver: smarterclayton

Assign the PR to them by writing /assign @smarterclayton in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

knobunc · 2017-09-29T17:53:28Z

/assign @smarterclayton

smarterclayton · 2017-09-29T18:25:04Z

pkg/cmd/infra/router/template.go

@@ -103,7 +103,9 @@ func (o *TemplateRouter) Bind(flag *pflag.FlagSet) {
 	flag.StringVar(&o.DefaultDestinationCAPath, "default-destination-ca-path", util.Env("DEFAULT_DESTINATION_CA_PATH", "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"), "A path to a PEM file containing the default CA bundle to use with re-encrypt routes. This CA should sign for certificates in the Kubernetes DNS space (service.namespace.svc).")
 	flag.StringVar(&o.TemplateFile, "template", util.Env("TEMPLATE_FILE", ""), "The path to the template file to use")
 	flag.StringVar(&o.ReloadScript, "reload", util.Env("RELOAD_SCRIPT", ""), "The path to the reload script to use")
-	flag.DurationVar(&o.ReloadInterval, "interval", reloadInterval(), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the last reload time.")
+	flag.DurationVar(&o.ReloadInterval, "interval", getDurationEnv("RELOAD_INTERVAL", 5*time.Second), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the time the last reload started.")


smarterclayton · 2017-09-29T18:26:37Z

pkg/cmd/infra/router/template.go

@@ -103,7 +103,9 @@ func (o *TemplateRouter) Bind(flag *pflag.FlagSet) {
 	flag.StringVar(&o.DefaultDestinationCAPath, "default-destination-ca-path", util.Env("DEFAULT_DESTINATION_CA_PATH", "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"), "A path to a PEM file containing the default CA bundle to use with re-encrypt routes. This CA should sign for certificates in the Kubernetes DNS space (service.namespace.svc).")
 	flag.StringVar(&o.TemplateFile, "template", util.Env("TEMPLATE_FILE", ""), "The path to the template file to use")
 	flag.StringVar(&o.ReloadScript, "reload", util.Env("RELOAD_SCRIPT", ""), "The path to the reload script to use")
-	flag.DurationVar(&o.ReloadInterval, "interval", reloadInterval(), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the last reload time.")
+	flag.DurationVar(&o.ReloadInterval, "interval", getDurationEnv("RELOAD_INTERVAL", 5*time.Second), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the time the last reload started.")
+	flag.DurationVar(&o.ReloadGap, "reload-gap", getDurationEnv("RELOAD_GAP", 100*time.Millisecond), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the time the last reload ended.")


How often are we really going to tune these? Adding lots of extra tunables comes with a cost - do these really need to be changeable or can they just be proportional to the overall duration?

Every tunable we add is more choices that someone can get wrong. Also, I couldn't understand how the flags are used from the descriptions, which is a problem.

smarterclayton · 2017-09-29T18:26:48Z

pkg/cmd/infra/router/template.go

@@ -205,6 +207,14 @@ func (o *TemplateRouterOptions) Complete() error {
 		return fmt.Errorf("invalid reload interval: %v - must be a positive duration", nsecs)
 	}

+	if o.ReloadGap < 0 {


This should be in validate

Should the other checks in here be in validate too? Or is there some pattern that I don't grok fully.

smarterclayton · 2017-09-29T18:27:26Z

pkg/router/template/router.go

 	lock sync.Mutex
+	// Track the start and end of router reloads


Oh wow. This is a LOT of variables. This should be abstracted into some sort of separate object that is passed in because the rest of the template router has nothing to do with it.

Please hide this complexity so that it doesn't get entangled with other places.

smarterclayton · 2017-09-29T18:40:22Z

pkg/router/template/router.go

 // Note: If this is changed FakeCommit() in fake.go should also be updated
-func (r *templateRouter) Commit() {
+func (r *templateRouter) realCommit(hasSynced bool) {


Wow.

So this is incredibly complicated and really needs to be abstracted into something smaller and more well tested on its own. The original point of the "commit wrapper" was to have a function that you invoke that does the right thing, and then the rest of it is ignored. This has now become entangled with the template router.

Please separate all of this logic out into three pieces - the signal (from the rest of the template router to reload), the rate limiter / tracker (the bit that says when an actual commit should go), and the underlying commit action that controls how fast it happens.

I.e.:

router controller signals (changes happened)

a rate limiter / commit controller decides whether to pass it on (this code)

the underlying commit.

There really cannot be a connection between them, and the code here is too complicated for me to be confident it works.

@liggitt can you help suggest here? i think of this as controller model - several caches:

informer cache -> feeding secondary cache -> controller loop that reads secondary cache and needs to sync up with secondary cache

I think I have worked out a possible solution.

liggitt · 2017-09-29T19:32:50Z

pkg/router/template/router.go

-	keyFunc := func(_ interface{}) (string, error) {
-		return "templaterouter", nil
-	}
+	// Do an immediate commit so that we can reply to health checks before the state syncs


does this make the router reply that it is healthy prior to having synced? is that correct? couldn't that cause traffic to be directed to it prior to actually having all the routes set up?

We already cover that behavior in the router... the haproxy template is passed a synced flag. So it doesn't bind 80/443 until the sync happens, but it does bind the status port for health checks.

That seems really odd. It reports healthy before it can serve traffic?

pravisankar · 2017-10-18T19:17:12Z

                  t1                    t2
   StartReload ----------> EndReload -----------> StartNextReload

t1 (time between start and end of reload) is variable as it depends on the number of routes/endpoints and also on cpu load on the router machine. So as routes/endpoints increase or decrease, what ever value we initially set may not be accurate/desired.
t2 (time between end and next start of reload) can be tuned irrespective of the number of routes or cpu load and this allows coalescing bunch of updates on router.
RELOAD_EVENT_WAIT may be unnecessary as RELOAD_GAP implicitly includes this case and also if we have a stream of events continuously then we may not reload for a long time.

I think instead of 3 knobs, just RELOAD_GAP knob will solve most of the reload problems and the current implementation of doing router reload and not blocking other in-memory router updates during this time will help a lot.

knobunc · 2017-10-26T13:54:20Z

Superseded by #17049

knobunc self-assigned this Sep 26, 2017

knobunc requested a review from DirectXMan12 September 26, 2017 17:29

openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 26, 2017

openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 26, 2017

pravisankar added the component/routing label Sep 26, 2017

knobunc force-pushed the feature/change-router-locking branch from 204995f to cc3edd2 Compare September 27, 2017 16:54

knobunc force-pushed the feature/change-router-locking branch 2 times, most recently from 70a6561 to d09631e Compare September 27, 2017 20:35

openshift-merge-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 28, 2017

knobunc requested review from rajatchopra and pravisankar September 28, 2017 12:23

knobunc changed the title ~~[WIP] Changed the router reload timing controls~~ Change the router reload supression, and add some more controls Sep 28, 2017

openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 28, 2017

knobunc force-pushed the feature/change-router-locking branch 7 times, most recently from 902bd6f to 929af8f Compare September 28, 2017 17:01

knobunc force-pushed the feature/change-router-locking branch 4 times, most recently from a6ee50f to 9f2f273 Compare September 29, 2017 15:16

knobunc force-pushed the feature/change-router-locking branch from 9f2f273 to 2cb02bb Compare September 29, 2017 15:46

dcbw reviewed Sep 29, 2017

View reviewed changes

JacobTanenbaum reviewed Sep 29, 2017

View reviewed changes

knobunc force-pushed the feature/change-router-locking branch 2 times, most recently from 0ed02ab to 5b50019 Compare September 29, 2017 17:19

DirectXMan12 reviewed Sep 29, 2017

View reviewed changes

knobunc added 4 commits September 29, 2017 13:49

Added autogenerated completions for new router reload parameters

af1a03a

These are the autogenerated shell completion files that correspond to the new router reload command line parameters.

knobunc force-pushed the feature/change-router-locking branch from 5b50019 to 4099687 Compare September 29, 2017 17:49

openshift-ci-robot assigned smarterclayton Sep 29, 2017

smarterclayton reviewed Sep 29, 2017

View reviewed changes

liggitt reviewed Sep 29, 2017

View reviewed changes

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 17, 2017

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 20, 2017

knobunc closed this Oct 26, 2017

knobunc deleted the feature/change-router-locking branch June 7, 2018 12:39

Change the router reload supression, and add some more controls #16564

Change the router reload supression, and add some more controls #16564

Conversation

knobunc commented Sep 26, 2017 • edited Loading

imcsk8 commented Sep 26, 2017

knobunc commented Sep 27, 2017

knobunc commented Sep 28, 2017

lihongan commented Sep 29, 2017

knobunc commented Sep 29, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

DirectXMan12 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

knobunc commented Sep 29, 2017

openshift-merge-robot commented Sep 29, 2017

knobunc commented Sep 29, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pravisankar commented Oct 18, 2017 • edited Loading

knobunc commented Oct 26, 2017

knobunc commented Sep 26, 2017 •

edited

Loading

pravisankar commented Oct 18, 2017 •

edited

Loading