-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Semi-automatic egress IP #16561
Semi-automatic egress IP #16561
Conversation
e6046ee
to
89cca5f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good stuff Dan. It's clear and well written. Thanks
pkg/network/node/ovscontroller.go
Outdated
otx.AddFlow("table=100, priority=%d,tcp,tcp_dst=53,nw_dst=%s,actions=output:2", networkapi.EgressNetworkPolicyMaxRules+1, nodeIP) | ||
otx.AddFlow("table=100, priority=%d,udp,udp_dst=53,nw_dst=%s,actions=output:2", networkapi.EgressNetworkPolicyMaxRules+1, nodeIP) | ||
// Table 100: egress routing; edited by UpdateNamespaceEgressRules() | ||
// eg, FIXME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor omission
/test all |
89cca5f
to
a8df5db
Compare
@liggitt, @smarterclayton: Can I get an API review on this one too please? |
/lgtm |
|
||
for i, ip := range netnamespace.EgressIPs { | ||
if net.ParseIP(ip) == nil { | ||
allErrs = append(allErrs, field.Invalid(field.NewPath("egressIPs").Index(i), ip, "invalid IP address")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this allowed to be localhost? Multicast? Any implications of those? (extra validation is needed only if it there is no clear use case for ever allowing those)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no use case for setting it to either of those, but there's no danger either. (It would just fail, just like if they tried to claim a random far-away IP address; EgressIPs have to be on the same subnet as some node's primary IP.) We don't check for semantically bad IPs in validation anywhere else, so I think just letting it through here and then failing later makes sense.
@smarterclayton since you've looked over it, can you approve it if it meets with your approval? Thanks |
pkg/network/node/egressip.go
Outdated
|
||
if nodeIP == eip.localIP { | ||
if err := eip.claimEgressIP(ip, hex); err != nil { | ||
glog.Errorf("Error claiming Egress IP %q: %v", ip, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we continue at this point? Or update the rules anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should update the rules, but we should update them to drop the packets (so the code is currently wrong)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(fixed)
/unassign |
/retest |
a8df5db
to
b398614
Compare
@smarterclayton the bot added needs-api-review back after you removed it... |
/lgtm |
b398614
to
e6919ba
Compare
e6919ba
to
7d92222
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: eparis, knobunc The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/retest |
1 similar comment
/retest |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest |
1 similar comment
/retest |
@@ -68,6 +68,9 @@ type HostSubnet struct { | |||
HostIP string `json:"hostIP" protobuf:"bytes,3,opt,name=hostIP"` | |||
// Subnet is the CIDR range of the overlay network assigned to the node for its pods | |||
Subnet string `json:"subnet" protobuf:"bytes,4,opt,name=subnet"` | |||
|
|||
// EgressIPs is the list of automatic egress IP addresses currently hosted by this node | |||
EgressIPs []string `json:"egressIPs" protobuf:"bytes,5,rep,name=egressIPs"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
omitempty?
|
||
// EgressIPs is a list of reserved IPs that will be used as the source for external traffic coming from pods in this namespace. (If empty, external traffic will be masqueraded to Node IPs.) | ||
// +optional | ||
EgressIPs []string `json:"egressIPs" protobuf:"bytes,4,rep,name=egressIPs"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same omitempty?
for i, egressIP := range hs.EgressIPs { | ||
if net.ParseIP(egressIP) == nil { | ||
allErrs = append(allErrs, field.Invalid(field.NewPath("egressIPs").Index(i), egressIP, "invalid IP address")) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we expect egress IP to work for IPV6 as well? If not, may be we should add ipv4 check here and in netnamespace.EgressIPs validation.
} | ||
} | ||
|
||
func (eip *egressIPWatcher) Start(networkClient networkclient.Interface, iptables *NodeIPTables) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this method will not return an error may be this method can have no return value.
for _, link := range links { | ||
addrs, err := netlink.AddrList(link, syscall.AF_INET) | ||
if err != nil { | ||
return fmt.Errorf("could not get addresses of interface %q while adding egress IP: %v", link.Attrs().Name, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Failed link may not have desired IP address, we could log an error and continue
iptables *NodeIPTables | ||
|
||
// from HostSubnets | ||
nodesByNodeIP map[string]*nodeEgress |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already know the nodeIP (map key), don't need to duplicate in the map value
nodesByNodeIP map[string]*sets.String ?
|
||
// from HostSubnets | ||
nodesByNodeIP map[string]*nodeEgress | ||
nodesByEgressIP map[string]*nodeEgress |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly this could be nodesByEgressIP map[string]string
nodesByEgressIP map[string]*nodeEgress | ||
|
||
// From NetNamespaces | ||
namespacesByVNID map[uint32]*namespaceEgress |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vnid may be duplicate here but considering most of the fields are also used by namespacesByEgressIP, I think this is a good compromise.
} | ||
|
||
if ns, exists := eip.namespacesByEgressIP[ip]; exists { | ||
if ns.assignedIP == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If assignedIP is already set, then drop the traffic for this namespace?
} | ||
ns = &namespaceEgress{vnid: vnid} | ||
eip.namespacesByVNID[vnid] = ns | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When namespace is deleted, where are we deleting vnid from namespacesByVNID map?
/retest |
Automatic merge from submit-queue. |
@@ -332,7 +335,9 @@ func (node *OsdnNode) Start() error { | |||
if err != nil { | |||
return err | |||
} | |||
|
|||
if err = node.egressIP.Start(node.networkClient, nodeIPTables); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not call this for single-tenant/subnet plugin as it depends on NetNamespaces.
I think, we should move this to node.policy.StartEgressIP?
This is the first half of the automatic egress IP work; it handles routing traffic through egress IPs, but not the assigning of IPs to nodes, which must be done manually by the administrator at this point (by editing the HostSubnet records; in the future the SDN master will do this automatically based on annotations on the Node records).
The set of Egress IPs active on a node was stored in HostSubnet rather than Node because (a) there are no upstream compatibility issues to deal with that way, and (b) nodes already have a HostSubnet watch, but not a Node watch. (And Node statuses change constantly, while HostSubnets are much more low-bandwidth.)
Oh, the other thing that's there for future expansion is that a NetNamespace can have an array of EgressIPs, but only the first one gets used. In the future we will likely support multiple IPs, either at the same time, or sequentially failing over.
@openshift/sig-networking PTAL