-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce HAProxy reloads - adds support to use the haproxy dynamic config api #19073
Conversation
@ramr: GitHub didn't allow me to request PR reviews from the following users: fyi. Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
glide.yaml
Outdated
version: release-1.9.1 | ||
repo: [email protected]:openshift/kubernetes-apiserver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please undo this reformatting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's all done by glide. I couldn't run glide on my local repo - it doesn't run for me, times out all the time, so @rajatchopra ran it for me and he didn't do anything other than glide get
&& ./hack/update-deps.sh
. Is something that's glide version specific causing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve never had glide time out - are you on an old version? Bad internet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
glide version is v0.13.1 and its a fast[ish] pipe - 200mb+ consistently (and also tried it on a T3 connection) - so i suspect its not the speed. Plus did try it on 3 different vms as well (all fedora 27 though), so something else. Anyway, will look at it at some other time.
@@ -194,9 +194,6 @@ func (b *configMapVolumeMounter) SetUpAt(dir string, fsGroup *int64) error { | |||
if err := wrapped.SetUpAt(dir, fsGroup); err != nil { | |||
return err | |||
} | |||
if err := volumeutil.MakeNestedMountpoints(b.volName, dir, b.pod); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix this, you reverted changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, maybe because we did the glide get
and update-deps.sh
a few weeks back - this might be outdated. Will do.
test/end-to-end/router_test.go
Outdated
@@ -1324,6 +1324,10 @@ func createAndStartRouterContainerExtended(dockerCli *dockerClient.Client, maste | |||
fmt.Sprintf("ROUTER_BIND_PORTS_AFTER_SYNC=%s", strconv.FormatBool(bindPortsAfterSync)), | |||
fmt.Sprintf("ROUTER_ENABLE_INGRESS=%s", strconv.FormatBool(enableIngress)), | |||
fmt.Sprintf("NAMESPACE_LABELS=%s", namespaceLabels), | |||
fmt.Sprintf("ROUTER_CONFIG_MANAGER=haproxy-manager"), | |||
fmt.Sprintf("ROUTER_DYNAMIC_SERVER_PREFIX=_test-dynamic"), | |||
fmt.Sprintf("ROUTER_MAX_DYNAMIC_SERVERS=3"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should have reasonable defaults. I'd prefer if your testing required only ROUTER_CONFIG_MANAGER unless there's a good reason not to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok - i'll set it to only the defaults.
pkg/router/template/types.go
Outdated
Initialize(router RouterInterface, certPath string) | ||
|
||
// Register registers an id to be associated with a route. | ||
Register(id string, route *routeapi.Route, wildcard bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you passing wildcard here when it should be on the route?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already had boolean deciphered based on the policy so I just passed the boolean but yeah we could recompute
it in register/add.
pkg/router/template/types.go
Outdated
|
||
// ConfigManager is used by the router to make configuration changes using | ||
// the template router's dynamic configuration API (if any). | ||
type ConfigManager interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please specify the threading guarantees this code requires with the router controller here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
@@ -0,0 +1,209 @@ | |||
package haproxy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Give this its own unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a map_test.go
in there. Or did you mean something else?
} | ||
|
||
// NewCSVConverter returns a new CSVConverter. | ||
func NewCSVConverter(headers string, out interface{}, fn ByteConverterFunc) CSVConverter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be a pointer unless you have a good reason not to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really - ok will change.
|
||
// Convert runs a haproxy dynamic config API command. | ||
func (c CSVConverter) Convert(data []byte) ([]byte, error) { | ||
glog.V(4).Infof("CSV converter input data bytes: %s", string(data)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much data is this? If it's going to regularly be more than a few hundred bytes it should be at v5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah it might be more than a few hundred - will set to v5.
if c.converterFunc != nil { | ||
convertedBytes, err := c.converterFunc(data) | ||
if err != nil { | ||
glog.Errorf("CSV converter error: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't log errors like this, wrap or log when you eat it with utilruntime.HandleError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was a remnant from the logs I was generating during dev - will remove this as its not needed - error are anyways propagated up and handled by the caller.
// fixupHeaders fixes up haproxy API responses that don't contain any CSV | ||
// header information. This allows us to easily parse the data and marshal | ||
// into an array of native golang structs. | ||
func (c CSVConverter) fixupHeaders(data []byte) ([]byte, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't make this a method, just a function that takes headers and data.
Give this file unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
} | ||
|
||
// isRetryable checks if a haproxy command can be retried. | ||
func isRetryable(err error, cmd string) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isRetriable
// Backends returns the list of configured haproxy backends. | ||
func (c *Client) Backends() ([]*Backend, error) { | ||
if len(c.backends) == 0 { | ||
if backends, err := GetHAProxyBackends(c); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these methods don't need to be public (GetHAProxyBackends
) they shouldn't be public.
// if the error for the command is a retryable error. | ||
func (c *Client) runCommandWithRetries(cmd string, limit int) (*bytes.Buffer, error) { | ||
retryAttempt := 0 | ||
for { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use util.ExponentialBackoff or util.PollImmediate instead of inventing your own code for it.
pkg/oc/admin/router/router.go
Outdated
env["ROUTER_METRICS_TLS_CERT_FILE"] = "/etc/pki/tls/metrics/tls.crt" | ||
env["ROUTER_METRICS_TLS_KEY_FILE"] = "/etc/pki/tls/metrics/tls.key" | ||
if cfg.Type == "haproxy-router" { | ||
env["ROUTER_CONFIG_MANAGER"] = "haproxy-manager" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If everyone is going to get these just make these the new default on the command. Don't force people to change their config and make a pointless choice. Otherwise you'll have to add this to the upgrade scripts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, so because the template router is also used for nginx, this needs to be either set here since its haproxy specific or we need to pass down the type of router to the infra command (that's not done today). I can either pass down this new setting or then the type and set the defaults based on that in the infra command ... but in either case it still means upgrading existing haproxy-router deployments. So which do you prefer?
Edited
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if the config manager works, there’s no reason we wouldn’t have it on by default for haproxy. However, we will likely default it off at first until we’re comfortable that we’ve caught all the edge cases and then default it on. So ideally we just have a Boolean that has an internal default and if the user specifies it we obey their rule. That avoids end users ending up with the flag in their config.
// blueprintRoutes returns all the routes in the blueprint namespace. | ||
func (o *TemplateRouterOptions) blueprintRoutes(routeclient *routeinternalclientset.Clientset) ([]*routeapi.Route, error) { | ||
blueprints := make([]*routeapi.Route, 0) | ||
if len(o.BlueprintRouteNamespace) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like an entire namespace of route blueprints. Require them to be identified with a label or associated metadata within a specific namespace, or require them to be specified by name directly.
Whole namespaces devoted to this is pretty ugly, although the idea of pulling it from the route is good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok will do.
pkg/cmd/infra/router/template.go
Outdated
|
||
var cfgManager templateplugin.ConfigManager | ||
if o.ConfigManagerName == "haproxy-manager" { | ||
blueprintRoutes, err := o.blueprintRoutes(routeclient) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're going to have to refresh these, they shouldn't be static or require restart. Make that dynamic (add a controller plugin wrapper that triggers reload of the blueprints).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's going to be some issues with blueprint refresh that I need to think about - basically vis-a-vis replacement and deletion and how the allocations/config changes, so can we do this in a follow-up PR?
pkg/cmd/infra/router/template.go
Outdated
flag.DurationVar(&o.CommitInterval, "commit-interval", getIntervalFromEnv("COMMIT_INTERVAL", defaultCommitInterval), "Controls how often to commit (to the actual config) all the changes made using the router specific dynamic configuration manager.") | ||
flag.StringVar(&o.BlueprintRouteNamespace, "blueprint-route-namespace", util.Env("ROUTER_BLUEPRINT_ROUTE_NAMESPACE", ""), "Specifies the namespace which contains the routes that serve as blueprints for the dynamic configuration manager.") | ||
flag.IntVar(&o.BlueprintRoutePoolSize, "blueprint-route-pool-size", int(util.EnvInt("ROUTER_BLUEPRINT_ROUTE_POOL_SIZE", 10, 1)), "Specifies the size of the pre-allocated pool for each route blueprint managed by the router specific dynamic configuration manager. This can be overriden by an annotation router.openshift.io/pool-size on an individual route.") | ||
flag.StringVar(&o.DynamicServerPrefix, "dynamic-server-prefix", util.Env("ROUTER_DYNAMIC_SERVER_PREFIX", ""), "Specifies the prefix for dynamic servers added to router backends. These dynamic servers are handled by the router specific dynamic configuration manager.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why this is configurable. Seems pointless?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this.
pkg/cmd/infra/router/template.go
Outdated
flag.BoolVar(&o.BindPortsAfterSync, "bind-ports-after-sync", util.Env("ROUTER_BIND_PORTS_AFTER_SYNC", "") == "true", "Bind ports only after route state has been synchronized") | ||
flag.StringVar(&o.MaxConnections, "max-connections", util.Env("ROUTER_MAX_CONNECTIONS", ""), "Specifies the maximum number of concurrent connections.") | ||
flag.StringVar(&o.Ciphers, "ciphers", util.Env("ROUTER_CIPHERS", ""), "Specifies the cipher suites to use. You can choose a predefined cipher set ('modern', 'intermediate', or 'old') or specify exact cipher suites by passing a : separated list.") | ||
flag.BoolVar(&o.StrictSNI, "strict-sni", isTrue(util.Env("ROUTER_STRICT_SNI", "")), "Use strict-sni bind processing (do not use default cert).") | ||
flag.StringVar(&o.MetricsType, "metrics-type", util.Env("ROUTER_METRICS_TYPE", ""), "Specifies the type of metrics to gather. Supports 'haproxy'.") | ||
flag.StringVar(&o.ConfigManagerName, "config-manager", util.Env("ROUTER_CONFIG_MANAGER", ""), "Specifies the manager to use for dynamically configuring changes with the underlying router. Supports 'haproxy-manager'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a reason not to default this on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we don't pass the router type to the infra command, so this needs to be set from above as of now. We can't set this for nginx template routers. We can either pass the type of router down (ala haproxy etc) and then do the required defaults here or then (as was done) pass this env variable down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the defaulting later when you know the type. I doubt we’ll ever have multiple, so this is really just a bool.
pkg/cmd/infra/router/template.go
Outdated
flag.BoolVar(&o.BindPortsAfterSync, "bind-ports-after-sync", util.Env("ROUTER_BIND_PORTS_AFTER_SYNC", "") == "true", "Bind ports only after route state has been synchronized") | ||
flag.StringVar(&o.MaxConnections, "max-connections", util.Env("ROUTER_MAX_CONNECTIONS", ""), "Specifies the maximum number of concurrent connections.") | ||
flag.StringVar(&o.Ciphers, "ciphers", util.Env("ROUTER_CIPHERS", ""), "Specifies the cipher suites to use. You can choose a predefined cipher set ('modern', 'intermediate', or 'old') or specify exact cipher suites by passing a : separated list.") | ||
flag.BoolVar(&o.StrictSNI, "strict-sni", isTrue(util.Env("ROUTER_STRICT_SNI", "")), "Use strict-sni bind processing (do not use default cert).") | ||
flag.StringVar(&o.MetricsType, "metrics-type", util.Env("ROUTER_METRICS_TYPE", ""), "Specifies the type of metrics to gather. Supports 'haproxy'.") | ||
flag.StringVar(&o.ConfigManagerName, "config-manager", util.Env("ROUTER_CONFIG_MANAGER", ""), "Specifies the manager to use for dynamically configuring changes with the underlying router. Supports 'haproxy-manager'.") | ||
flag.StringVar(&o.ConfigManagerConnectionInfo, "config-manager-connection-info", "", "Specifies connection information for the dynamic configuration manager.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems overkill to configure this. We don't allow configuring the stats socket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well the stats socket is sorta tied into how the template is configured (using af unix sockets).
It probably is overkill but if the haproxy template is potentially changed by the user, this is just a knob to use tcp if so needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove this - we can add it later if needed.
pkg/cmd/infra/router/template.go
Outdated
flag.StringVar(&o.ConfigManagerConnectionInfo, "config-manager-connection-info", "", "Specifies connection information for the dynamic configuration manager.") | ||
flag.DurationVar(&o.CommitInterval, "commit-interval", getIntervalFromEnv("COMMIT_INTERVAL", defaultCommitInterval), "Controls how often to commit (to the actual config) all the changes made using the router specific dynamic configuration manager.") | ||
flag.StringVar(&o.BlueprintRouteNamespace, "blueprint-route-namespace", util.Env("ROUTER_BLUEPRINT_ROUTE_NAMESPACE", ""), "Specifies the namespace which contains the routes that serve as blueprints for the dynamic configuration manager.") | ||
flag.IntVar(&o.BlueprintRoutePoolSize, "blueprint-route-pool-size", int(util.EnvInt("ROUTER_BLUEPRINT_ROUTE_POOL_SIZE", 10, 1)), "Specifies the size of the pre-allocated pool for each route blueprint managed by the router specific dynamic configuration manager. This can be overriden by an annotation router.openshift.io/pool-size on an individual route.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on an individual blueprint route
You need to explain blueprint routes in the command help for this.
pkg/cmd/infra/router/template.go
Outdated
flag.BoolVar(&o.BindPortsAfterSync, "bind-ports-after-sync", util.Env("ROUTER_BIND_PORTS_AFTER_SYNC", "") == "true", "Bind ports only after route state has been synchronized") | ||
flag.StringVar(&o.MaxConnections, "max-connections", util.Env("ROUTER_MAX_CONNECTIONS", ""), "Specifies the maximum number of concurrent connections.") | ||
flag.StringVar(&o.Ciphers, "ciphers", util.Env("ROUTER_CIPHERS", ""), "Specifies the cipher suites to use. You can choose a predefined cipher set ('modern', 'intermediate', or 'old') or specify exact cipher suites by passing a : separated list.") | ||
flag.BoolVar(&o.StrictSNI, "strict-sni", isTrue(util.Env("ROUTER_STRICT_SNI", "")), "Use strict-sni bind processing (do not use default cert).") | ||
flag.StringVar(&o.MetricsType, "metrics-type", util.Env("ROUTER_METRICS_TYPE", ""), "Specifies the type of metrics to gather. Supports 'haproxy'.") | ||
flag.StringVar(&o.ConfigManagerName, "config-manager", util.Env("ROUTER_CONFIG_MANAGER", ""), "Specifies the manager to use for dynamically configuring changes with the underlying router. Supports 'haproxy-manager'.") | ||
flag.StringVar(&o.ConfigManagerConnectionInfo, "config-manager-connection-info", "", "Specifies connection information for the dynamic configuration manager.") | ||
flag.DurationVar(&o.CommitInterval, "commit-interval", getIntervalFromEnv("COMMIT_INTERVAL", defaultCommitInterval), "Controls how often to commit (to the actual config) all the changes made using the router specific dynamic configuration manager.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would tie this to the router reload interval unless you have some sort of real reason to make this configurable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is a separate config option is because we don't want to change the reload interval to something higher. As that means if the dynamic config manager can't bring routes online (because of certs/custom annotations etc), the reloads to bring those route would be staggered by the reload interval - not good if say that's 1 hour ... which the default for the commit interval.
@@ -113,12 +132,19 @@ func (o *TemplateRouter) Bind(flag *pflag.FlagSet) { | |||
flag.StringVar(&o.DefaultDestinationCAPath, "default-destination-ca-path", util.Env("DEFAULT_DESTINATION_CA_PATH", "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt"), "A path to a PEM file containing the default CA bundle to use with re-encrypt routes. This CA should sign for certificates in the Kubernetes DNS space (service.namespace.svc).") | |||
flag.StringVar(&o.TemplateFile, "template", util.Env("TEMPLATE_FILE", ""), "The path to the template file to use") | |||
flag.StringVar(&o.ReloadScript, "reload", util.Env("RELOAD_SCRIPT", ""), "The path to the reload script to use") | |||
flag.DurationVar(&o.ReloadInterval, "interval", reloadInterval(), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the last reload time.") | |||
flag.DurationVar(&o.ReloadInterval, "interval", getIntervalFromEnv("RELOAD_INTERVAL", defaultReloadInterval), "Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the last reload time.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just took the reloadInterval code and made it into a reusable function so that we can call it to get the interval value for both reload and commit intervals (same code passes a different parameters).
@@ -179,8 +195,9 @@ var helperFunctions = template.FuncMap{ | |||
"matchValues": matchValues, //compares a given string to a list of allowed strings | |||
|
|||
"genSubdomainWildcardRegexp": genSubdomainWildcardRegexp, //generates a regular expression matching the subdomain for hosts (and paths) with a wildcard policy | |||
"generateRouteRegexp": generateRouteRegexp, //generates a regular expression matching the route hosts (and paths) | |||
"generateRouteRegexp": GenerateRouteRegexp, //generates a regular expression matching the route hosts (and paths) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this exported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are trying to reuse this, put it in its own package (templatehelpers
maybe) and include it here. That gets more crap out of this package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok will do.
pkg/router/template/router.go
Outdated
glog.V(4).Infof("Dynamically replacing endpoints for associated backend %s", backendKey) | ||
if err := r.dynamicConfigManager.ReplaceRouteEndpoints(backendKey, oldEndpoints, newEndpoints, weight); err != nil { | ||
// Error dynamically modifying the config, so return false to cause a reload to happen. | ||
glog.Warningf("Router will reload as dynamic endpoint replacement for service id %s (backend=%s, weight=%v) failed: %v", id, backendKey, weight, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could easily spew errors here, i'm a little concerned with this in the long term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change to glog.V(4).Infof
, so that its available if someone turns up the debugging level and for all the other "Router will reload" messages.
pkg/router/template/router.go
Outdated
|
||
// If no initial sync was done, don't try to dynamically add the | ||
// route as we will need a reload anyway. | ||
if !r.synced { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this after registration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we don't add the routes if the initial sync has not been done. There's nothing in the haproxy config at that point in time (backends don't exist for blueprints as yet, etc). Plus since there's going to be a reload and the config is going to change anyway, the configmanager is not really going to do anything here (other than registration).
So, the registration is done always - irrespective of whether the router sync state. And that ensures that the mappings in the configmanager are correct.
/retest |
/test integration |
/retest |
…eded when configuration changes. This commit includes: o Support for a dynamic configuration manager in the template router. o HAProxy configuration manager implementation. o Scale up support with a pool of dynamic servers for each backend. o Dynamic route addition support with a pre-allocated pool of blueprint routes. o HAProxy dynamic configuration retry support. o Passthrough routes weights are relative, so as to allow them to be scaled. Fixes a bug with the haproxy dynamic config api hanging. o Support for a blueprint route namespace, so that we can load custom blueprints at startup. o Refactor, log errors, cleanup and bug fixes. o Remove unused/extraneous map entries for blueprint routes. o Self-review changes. o Add new tests and modify existing integration+e2e tests.
… the manager appropriately. And some more changes as per review comments.
…with passthrough routes.
/retest |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deads2k, knobunc, ramr The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looks good
/retest |
@ramr: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/retest Please review the full test history for this PR and help us cut down flakes. |
Add support to the template router to dynamically configure haproxy. This reduces the reloads
needed when the router configuration changes due to routes/endpoints changing.
Note that HAProxy doesn't have support to dynamically change certificates, so we need to pre-allocate pools of "blueprint" routes to dynamically bring new routes online.
We now add/remove servers from a backend on endpoint changes (for scale up/down, service addition/removal/update, node drain/evictions etc).
Each haproxy backend (for a route) has a set of dynamic servers added named
_dynamic-pod-[1-5]
- which are used when endpoints change (example for scale up).Scaling down endpoints just sets the existing servers in a haproxy backend to "maint" or
non-serving state.
The infra router options - command line flags or environment variables named
ROUTER_DYNAMIC_SERVER_PREFIX
andROUTER_MAX_DYNAMIC_SERVERS
(default: 5)controls these options.
We dynamically add and remove routes if a route matches a pre-allocated/configured blueprint route.
The default config has blueprint routes for insecure/http, edge terminated and passthrough routes.
Re-encrypt routes are an issue as they could have custom certificates so is not currently in that
default blueprint set (commented out).
Route removal just removes the map entries, marks the servers in
maint
state but leaves thebackend config in haproxy intact. The backend is not referenced from anywhere, so is unused.
Route addition checks if it can bring the route online without a router reload - it searches for a
matching blueprint and then a free slot from the pre-configured pool of routes for that blueprint.
If the new route can't be dynamically configured, then the behavior is as today and a reload occurs.
A route might not match a blueprint if it has information that causes a custom haproxy config.
Examples of that could be a route that has custom certificate information or custom CA information
to validate re-encrypted routes or it could even be for custom config that is generated for the
route annotations. Those would fail to find a matching blueprint (and hence cause a reload).
However that said, the template router does also provide the ability to add custom blueprints.
These are basically just routes in a namespace which serve as templates or blueprints - similar
to the default blueprints (for insecure/http, edge terminated and passthrough routes).
So one can easily add a re-encrypt route blueprint if needed.
Also, the add route code would search the custom blueprints (in the same fashion that it does
the default blueprints - custom ones are just added to the list).
The infra router options
ROUTER_BLUEPRINT_ROUTE_NAMESPACE
andROUTER_BLUEPRINT_ROUTE_POOL_SIZE
(default: 10)provide some knobs for these custom blueprints.
Tests will automatically use this new functionality (extended ones needed a config change) but the
others use the router image, so automatically check with the haproxy router with the dynamic config
manager enabled.
Again, a point to note here is that if we exhaust the pool of dynamic servers or blueprint routes or
don't find a matching blueprint, the template router will need/do a reload.
Associated trello card: https://trello.com/c/W21FD0v0/585-13-implement-dynamic-changes-to-the-router-routerscaledemo
/cc @knobunc @zhaozhanqi fyi