
Fix for bugz 1564984 - rejected by router when using router sharding and NAMESPACE_LABELS #19330

Merged: 1 commit into openshift:master on Apr 20, 2018

Conversation

@ramr (Contributor) commented Apr 12, 2018

Fixes the host admitter plugin to respect the namespaces it is supposed to service, and to not update the status of a route that it is not supposed to process (e.g. for router shards).

@knobunc fyi

Commit: Fix the host admitter plugin to respect the namespaces it is supposed to service, and to not update the status of a route that it is not supposed to process (e.g. for router shards).
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 12, 2018
```
@@ -122,6 +126,12 @@ func (p *HostAdmitter) HandleEndpoints(eventType watch.EventType, endpoints *kap

// HandleRoute processes watch events on the Route resource.
func (p *HostAdmitter) HandleRoute(eventType watch.EventType, route *routeapi.Route) error {
	if p.allowedNamespaces != nil && !p.allowedNamespaces.Has(route.Namespace) {
```
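The hunk is truncated at the new guard. A minimal sketch of the likely body, assuming the plugin simply skips routes outside the namespaces it services (the remaining added lines are not shown in this excerpt):

```go
	if p.allowedNamespaces != nil && !p.allowedNamespaces.Has(route.Namespace) {
		// This router shard does not service the route's namespace:
		// skip the route entirely so the plugin neither admits it nor
		// updates its status.
		return nil
	}
```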

@pravisankar commented Apr 12, 2018:

We could push this namespace label filtering up to a higher level (the router controller); that way the individual router plugins (host-admitter, unique-host, etc.) wouldn't each need to handle this case.
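To sketch that idea in miniature (all names below are illustrative, not the actual openshift/origin identifiers):

```go
package sketch

// Route is a stand-in for the real route API object.
type Route struct {
	Namespace string
	Host      string
}

// Plugin is a stand-in for the router plugin interface.
type Plugin interface {
	HandleRoute(route *Route) error
}

// RouterController filters events once, at the source.
type RouterController struct {
	allowedNamespaces map[string]bool // nil means "all namespaces"
	plugins           []Plugin
}

// HandleRoute drops events for namespaces outside the shard's label
// selection before they ever reach the plugin chain, so no individual
// plugin needs its own namespace check.
func (c *RouterController) HandleRoute(route *Route) error {
	if c.allowedNamespaces != nil && !c.allowedNamespaces[route.Namespace] {
		return nil // not our shard; drop at the source
	}
	for _, p := range c.plugins {
		if err := p.HandleRoute(route); err != nil {
			return err
		}
	}
	return nil
}
```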


@ramr (Contributor, Author) replied:

@pravisankar that's a good idea. Though I think in an ideal world, this wouldn't need to be in every plugin. We do need a refactor to consolidate the work being done in both host_admitter (which does the wildcard-specific checks) and unique_host (which does a host-uniqueness check that overlaps with host_admitter). There's duplication in there that needs a review before we can collapse this into a single plugin. So in a follow-up PR past this release?

@pravisankar commented Apr 12, 2018:

I'm fine with the follow-up PR.
I wasn't even thinking about consolidating host uniqueness and the host admitter. My idea was to filter events (namespaces, endpoints, routes, etc.) at the source, which is the router controller, and then propagate the filtered events to the chain of router plugins (unique host, admitter, status, validator).
Currently, some of the router plugins do their own filtering, like the unique_host plugin. This raises one more concern. What will happen in the scenario below?

  • router1 handles all namespaces matching labelset ls1
  • router2 handles all namespaces matching labelset ls2
  • There could be overlap between labelsets ls1 and ls2.

The unique_host plugin filters routes based on its namespace filter. For the same route, router1 may reject it while router2 accepts it. And if we look at all the routes in the cluster, host uniqueness may be broken. Don't we want the unique_host plugin to look at all the routes in the cluster to determine host uniqueness, even when a namespace label filter is present?
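To make the overlap concrete, a small runnable sketch (the label sets are illustrative):

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// Two sharded routers selecting namespaces by different label sets.
	ls1 := labels.SelectorFromSet(labels.Set{"env": "prod"})
	ls2 := labels.SelectorFromSet(labels.Set{"team": "web"})

	// A namespace carrying both labels matches both routers, so a route
	// in that namespace is evaluated independently by each shard.
	nsLabels := labels.Set{"env": "prod", "team": "web"}
	fmt.Println(ls1.Matches(nsLabels), ls2.Matches(nsLabels)) // true true
}
```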

@ramr (Contributor, Author) replied:

Yeah, I agree with moving the filtering out (up the chain).

If I understood your question correctly, router1 and router2 are two separate router environments that co-exist on the same cluster. The unique_host plugin handles the namespaces (via HandleNamespaces) that a particular router is filtering on. It is not filtering on its own set; it filters on whatever the router is filtering on.
Meaning that the unique_host plugin would just get that set of namespaces (matching ls1 or ls2, depending on the router it's running inside of) and would admit routes based on those exact same namespaces.
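A rough sketch of that flow, extending the illustrative types from the earlier sketch (the real plugin interface in openshift/origin differs, e.g. it also carries watch event types):

```go
package sketch

// UniqueHost is a stand-in for the unique-host plugin: it filters on
// whatever namespace set the router hands it, not on a selector of its own.
type UniqueHost struct {
	allowedNamespaces map[string]bool // nil until the router calls HandleNamespaces
}

// HandleNamespaces receives the router's filtered namespace set
// (e.g. the namespaces matching ls1 for router1, or ls2 for router2).
func (u *UniqueHost) HandleNamespaces(namespaces []string) {
	u.allowedNamespaces = make(map[string]bool, len(namespaces))
	for _, ns := range namespaces {
		u.allowedNamespaces[ns] = true
	}
}

// HandleRoute admits routes only from that exact same namespace set.
func (u *UniqueHost) HandleRoute(route *Route) error {
	if u.allowedNamespaces != nil && !u.allowedNamespaces[route.Namespace] {
		return nil // outside this router's shard; ignore
	}
	// ...host-uniqueness checks against previously seen routes go here...
	return nil
}
```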

As for doing the uniqueness checks cluster-wide (across multiple routers or across all routes), I don't think that's a good thing, or even something we can do, for a few different reasons:

  • A router might not have access to all the routes (i.e. it could be namespace scoped).
  • A sharded router environment could service a subset of the routes in each router shard. You might do this for performance/distribution reasons on a high-occupancy cluster. The uniqueness check becomes more expensive, plus there could be overlaps that you actually want. Again, this could all be namespace scoped, or even scoped to a subset of namespaces.
  • To go further down the rabbit hole ;^) on the above point, you could have the same routes, or even different routes with the same host name, pointing to different services for SLA reasons (high/medium/low), and a front-end load balancer could select the different shards based on its own SLAs.
  • It makes it difficult to deploy multiple environments in the same cluster.
    Example: different namespaces on the same cluster could represent different environments (à la staging/QE/multiple devs), and each of these environments/namespaces runs its own router and other objects (routes, services, etc.). The host name/route is then specific to that environment, and you don't want to enforce a cluster-wide check for uniqueness.
  • You might have two routes that point to different services (say version1 and version2 of a service) and want to use label filters to bring version2 online without requiring downtime or a new deployment: just set everything up for version2 and then change the labels on the route that points to version1.

@pravisankar replied:

@ramr Thank you for the detailed explanation. Now I understand the scope of the unique host plugin.

@ramr (Contributor, Author) commented Apr 12, 2018:

flake #17970

@ramr (Contributor, Author) commented Apr 12, 2018:

/retest

@ramr (Contributor, Author) commented Apr 12, 2018:

/test gcp
same flake on gcp

@pravisankar left a comment:

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 12, 2018
@knobunc (Contributor) left a comment:

/lgtm
/approve

Thanks for the good test cases.

@openshift-ci-robot commented:
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: knobunc, pravisankar, ramr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 18, 2018
@ramr (Contributor, Author) commented Apr 18, 2018:

/retest
flakes pulling down wordpress template/images.

@ramr (Contributor, Author) commented Apr 20, 2018:

/retest
same flake with wordpress-mysql template

@openshift-merge-robot openshift-merge-robot merged commit 9b0043e into openshift:master Apr 20, 2018
@ramr ramr deleted the fix-bugz-1564984 branch April 25, 2018 19:30
Labels: approved · component/routing · lgtm · size/L

5 participants