Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple auto-egress-IP fixes #18808

Merged
merged 5 commits into from
Mar 8, 2018

Conversation

danwinship
Copy link
Contributor

@danwinship danwinship commented Mar 2, 2018

OK, 3 "setup" commits followed by 2 commits with fixes:

  • the first and second commits only change egressip_test.go, so clearly don't break the actual functioning of auto-egress-IP
  • the third commit ("Have multiple egress IP ovscontroller methods rather than one confusing one") is small and fairly self-explanatory.
  • the fourth commit ("Fix egressip handling when a NetNamespace is updated") involves a small fix to egressip.go and then a larger update to egressip_test.go to show that the fix works. Specifically: if the EgressIPs field on a NetNamespace was changed while that egress IP was active, then we did not clean up all of the state associated with the old egress IP, because we were accidentally cleaning up the state associated with the new egress IP instead.
  • the last commit ("Rearrange egressip internals, add duplication tests") is very large and complicated and needs the most review, but in particular note that all of the existing tests still pass with the new code, and it adds two more tests (which don't pass with the old code). This fixes the problem that we were previously mostly doing the right thing when the user screwed up and added a duplicate EgressIP, but we weren't doing the right thing when they removed the duplicates, because we weren't keeping enough information about the global merged node/namespace EgressIP state. I think the end result is actually simpler than the old code.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1551028
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1547899
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1543786
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1520363

@openshift/sig-networking PTAL

This makes the tests a little bit more self-documenting, and makes it
simpler to add new tests. Also, change it to only look at OVS flows in
the egress table, so that when there is a mismatch, the error message
is of a manageable length.
Split out one test case, add some new (already-passing) test cases,
add some comments, and remove some double-checking-of-internal-state
that will break with upcoming rewrites.
(We handled delete+recreate correctly, but not update.)
@danwinship danwinship added kind/bug Categorizes issue or PR as related to a bug. component/networking sig/networking labels Mar 2, 2018
@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 2, 2018
@danwinship danwinship removed the request for review from smarterclayton March 2, 2018 14:31
Copy link
Contributor

@knobunc knobunc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@knobunc
Copy link
Contributor

knobunc commented Mar 2, 2018

/assign @dcbw

@knobunc knobunc self-assigned this Mar 2, 2018
ns.assignedIP = ""
ns.nodeIP = ""
if ns.requestedIP != "" {
eg := eip.egressIPs[ns.requestedIP]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure eg isn't going to be nil here?

if oldNode := eip.nodesByEgressIP[ip]; oldNode != nil {
utilruntime.HandleError(fmt.Errorf("Multiple nodes claiming EgressIP %q (nodes %q, %q)", ip, node.nodeIP, oldNode.nodeIP))
continue
eg := eip.egressIPs[ip]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I almost want:

	eg := eip.ensureEgressIPInfo(ip)
	if eg == nil {
		return
	}
	...
}

func (eip *egressIPWatcher) ensureEgressIPInfo(ip string) *egressIPInfo {
	if ip == "" {
		return nil
	}
	eg := eip.egressIPs[ip]
	if eg == nil {
		eg = &egressIPInfo{ip: ip}
		eip.egressIPs[ip] = eg
	}
	return eg
}

which you'd also use down below.

}

eip.nodesByEgressIP[ip] = node
eip.maybeAddEgressIP(ip)
if len(eg.nodes) != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also kinda want:

func (eg *egressIPInfo) addNode(node *nodeEgress, ip string) {
	if len(eg.nodes) != 0 {
		utilruntime.HandleError(fmt.Errorf("Multiple nodes claiming EgressIP %q (nodes %q, %q)", ip, node.nodeIP, eg.nodes[0].nodeIP))
	}
	eg.nodes = append(eg.nodes, node)
}

func (eg *egressIPInfo) delNode(node *nodeEgress) {
	for i := range eg.nodes {
		if eg.nodes[i] == node {
			eg.nodes = append(eg.nodes[:i], eg.nodes[i+1:]...)
			return
 		}
 	}
}

and the corresponding functions for namespaces.

// The egressIPInfo should have an assigned node IP if and only if the
// egress IP is active (ie, it is assigned to exactly 1 node and exactly
// 1 namespace).
egressIPActive := (len(eg.nodes) == 1 && len(eg.namespaces) == 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if you do choose to go with struct egressIPInfo functions, some of the stuff here could be done as functions too. Obviously it can go too far, but maybe a bit cleaner. Your call.

maybe even

changed := eg.syncNodeIP(eip.masqueradeBit, eip.localIP, eip.assignEgressIP, eip.releaseEgressIP)

func (eg *egressIPInfo) syncNodeIP(masqBit uint32, localIP string, assign func(string, string) error, func(string, string) error) bool {
   ...
}

@danwinship
Copy link
Contributor Author

I took most of your suggestions. I didn't change syncEgressIP much other than to split it into two subfunctions, one for iptables stuff and one for OVS stuff. The OVS stuff doesn't really split further very well because the blockedVNIDs state has to flow through it so they'd just have to be passed around if we did split it up further

There should never be multiple HostSubnets or multiple NetNamespaces
claiming the same egress IP, but if there are, we need to track them
carefully so we don't get out sync with reality after things are
fixed.
@knobunc
Copy link
Contributor

knobunc commented Mar 5, 2018

The changes LGTM too. @dcbw?

@danwinship
Copy link
Contributor Author

/retest

@dcbw
Copy link
Contributor

dcbw commented Mar 5, 2018

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 5, 2018
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, dcbw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danwinship
Copy link
Contributor Author

/retest

Copy link

@pravisankar pravisankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments, rest of the changes LGTM

ns.requestedIP = egressIP
eip.namespacesByEgressIP[egressIP] = ns
eip.maybeAddEgressIP(egressIP)
if egressIP == "" {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is to handle delete case where ns is no longer referenced in eip.namespacesByVNID?
To be clear, may be we could assign ns.requestedIP = egressip after this check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you're right, the ns.requestedIP assignment is irrelevant (but harmless) in this case

@@ -266,44 +233,120 @@ func (eip *egressIPWatcher) updateNamespaceEgress(vnid uint32, egressIP string)

ns := eip.namespacesByVNID[vnid]
Copy link

@pravisankar pravisankar Mar 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to consider join and isolate project operations otherwise there will be some problems.
Example:

  1. ns1 with vnid v1 requested egrees ip e1
  2. ns2 with vnid v2 requested egress ip e2
  3. ns2 joined with ns1, now ns2 has vnid v1
  4. eip.namespacesByVNID[v1] will now have requestedIP=e2
  5. ns2 got isolated

Now, v1 has only one namespace ns1 but the requested ip will be set as ep2 instead of ep1 which is incorrect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filed #18852


func (eg *egressIPInfo) addNode(node *nodeEgress) {
if len(eg.nodes) != 0 {
utilruntime.HandleError(fmt.Errorf("Multiple nodes claiming EgressIP %q (nodes %q, %q)", eg.ip, node.nodeIP, eg.nodes[0].nodeIP))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we clarify that only the first node that claimed the egress IP will have egress rules?
(syncEgressIPTablesState() checks eg.nodes[0].nodeIP)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually none of them will; syncEgressIPTablesState() only looks at eg.nodes[0] when len(eg.nodes)==1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// 1 namespace).
egressIPActive := (len(eg.nodes) == 1 && len(eg.namespaces) == 1)
assignedNodeIPChanged := false
if egressIPActive && eg.assignedNodeIP != eg.nodes[0].nodeIP {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There could be very rare corner case when F5 is involved.
Scenario:

  1. F5 node n1(no real backing node only hostSubnet object) requested egress ip e1
  2. node n2 requested same egress ip e1
  3. eg.nodes == [n1, n2], n1 claims the egress ip e1
  4. SDN controller calls delete + create on F5 node n1 (removes AssignHostSubnetAnnoation and assigns vnid)
  5. eg.nodes == [n2, n1] now n2 claims the egress ip e1 (side effect of how we handle F5 case)

This is theoretical case but may never happen in practice. I don't think we need to fix this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above, there's no actual corner case here because egressIPActive would be false at both step 3 and step 5

@danwinship
Copy link
Contributor Author

/retest


func (eg *egressIPInfo) addNode(node *nodeEgress) {
if len(eg.nodes) != 0 {
utilruntime.HandleError(fmt.Errorf("Multiple nodes claiming EgressIP %q (nodes %q, %q)", eg.ip, node.nodeIP, eg.nodes[0].nodeIP))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danwinship
Copy link
Contributor Author

/test end_to_end

@danwinship
Copy link
Contributor Author

/retest

3 similar comments
@danwinship
Copy link
Contributor Author

/retest

@danwinship
Copy link
Contributor Author

/retest

@danwinship
Copy link
Contributor Author

/retest

@danwinship
Copy link
Contributor Author

/retest

1 similar comment
@knobunc
Copy link
Contributor

knobunc commented Mar 6, 2018

/retest

@knobunc
Copy link
Contributor

knobunc commented Mar 6, 2018

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 6, 2018
@knobunc
Copy link
Contributor

knobunc commented Mar 6, 2018

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 6, 2018
@danwinship
Copy link
Contributor Author

/retest

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue (batch tested with PRs 18780, 18802, 18391, 18832, 18808).

@openshift-merge-robot openshift-merge-robot merged commit 6b99107 into openshift:master Mar 8, 2018
@danwinship danwinship deleted the egressip-fixes branch July 20, 2018 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. component/networking kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. sig/networking size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants