Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sdn: clean up startup scripts, remove DNS, watch config file #21654

Merged
merged 3 commits into from
Jan 8, 2019
Merged

Conversation

squeed
Copy link
Contributor

@squeed squeed commented Dec 13, 2018

Over the years, wrappers on wrappers on wrappers have been added. Clean those up somewhat (but not entirely, sadly).

Remove DNS from the network process. We no longer need this.

Watch for configuration file changes, and exit the SDN process if it does.

@openshift-ci-robot openshift-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Dec 13, 2018
@@ -0,0 +1,96 @@
package openshift_sdn
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is moved over from pkg/cmd/server, but had enough deletions not to trigger the rename heuristic.

@squeed
Copy link
Contributor Author

squeed commented Dec 14, 2018

hey @dcbw, do you know if crio still does hostport mapping itself? We disable openshift-sdn hostport forwarding when the runtime is crio.

@danwinship
Copy link
Contributor

hey @dcbw, do you know if crio still does hostport mapping itself? We disable openshift-sdn hostport forwarding when the runtime is crio.

Yes, crio and dockershim still behave differently here. But are we still supporting docker in 4.x

@openshift-ci-robot openshift-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 15, 2018
@squeed
Copy link
Contributor Author

squeed commented Dec 15, 2018

Thanks for the review. I managed to kill another wrapper struct; updated.

One notable change is that we no longer assume the IP address of the DNS server is the node's IP (in other words, the dnsmasq setup is dead). I wonder if we can get rid of it completely - the only thing we used the cluster DNS IP is passing it to the CNI plugin, which in turn sets up a route, which is only used in the event of a macvlan interface for the egress router.

@danwinship
Copy link
Contributor

One notable change is that we no longer assume the IP address of the DNS server is the node's IP (in other words, the dnsmasq setup is dead). I wonder if we can get rid of it completely - the only thing we used the cluster DNS IP is passing it to the CNI plugin, which in turn sets up a route, which is only used in the event of a macvlan interface for the egress router.

So that's from #20115 which was a spinoff of #20094. I'm not sure why we decided we needed to special-case the DNS IP specifically... the only case where that would be needed would be if the DNS IP was an IP on the local host that was neither the eth0 IP nor the tun0 IP.

@squeed
Copy link
Contributor Author

squeed commented Dec 17, 2018

So that's from #20115 which was a spinoff of #20094. I'm not sure why we decided we needed to special-case the DNS IP specifically... the only case where that would be needed would be if the DNS IP was an IP on the local host that was neither the eth0 IP nor the tun0 IP.

Actually, if the DNS IP was the host's eth0, then that would be problematic, since the traffic would probably go via the macvlan, which would fail thanks to hairpin. So that makes sense.

The DNS IP now is a serviceip, so we definitely want to make sure its routed over the default pod interface. So I guess it "cant hurt."

@danwinship
Copy link
Contributor

Actually, if the DNS IP was the host's eth0, then that would be problematic, since the traffic would probably go via the macvlan, which would fail thanks to hairpin.

Right, but we already handle that separately. We already set up routes to use the pod network for traffic to clusterNetworkCIDR, traffic to serviceNetworkCIDR, and traffic to any IP address on the node's eth0. So if the DNS IP is a service IP or any of eth0's IP, then adding a route to it is just redundant. The DNS IP special case only matters if the IP is some other local IP.

And given that when using egress IPs, we only do the special case for traffic to port 53 on the node's IP, without a special case for "DNS IP", we probably don't actually need the DNS IP case here either.

@squeed
Copy link
Contributor Author

squeed commented Dec 17, 2018

Awesome. I'll rip all that out.

@squeed
Copy link
Contributor Author

squeed commented Dec 17, 2018

On second thought, there's nothing stopping us from moving the DNS IP again. So I'll keep it for now.

@squeed
Copy link
Contributor Author

squeed commented Dec 17, 2018

/retest

1 similar comment
@squeed
Copy link
Contributor Author

squeed commented Dec 17, 2018

/retest

@squeed
Copy link
Contributor Author

squeed commented Dec 17, 2018

Added one more bit of cleanup: we no longer special-case the DNS IP in the pod's routing table.

@danwinship
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. and removed lgtm Indicates that a PR is ready to be merged. labels Dec 18, 2018
@squeed
Copy link
Contributor Author

squeed commented Dec 20, 2018

/retest

@danwinship
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 20, 2018
Lots of changes:
* Over the years, wrappers on wrappers on wrappers have been added. Clean
  those up somewhat (and yet somehow never enough).
* Remove DNS from the network process. We no longer need this.
* Watch for configuration file changes, and exit the SDN process if it
  does.
* Remove ability to run standalone proxy from the openshift-sdn binary
squeed added 2 commits January 3, 2019 11:35
We already explicitly add a route to the service cidr and the node's IP
address. There's no need to also add one for the DNS IP.
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 3, 2019
@squeed
Copy link
Contributor Author

squeed commented Jan 3, 2019

Rebased on master.

@squeed
Copy link
Contributor Author

squeed commented Jan 3, 2019

/retest

@squeed
Copy link
Contributor Author

squeed commented Jan 3, 2019

/retest

seems like flakes.

@squeed
Copy link
Contributor Author

squeed commented Jan 5, 2019

/retest

@squeed
Copy link
Contributor Author

squeed commented Jan 5, 2019

All green!

@danwinship, @mfojtik would you mind approving?

@danwinship
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 7, 2019
@squeed
Copy link
Contributor Author

squeed commented Jan 7, 2019

That's cool, this PR was all green on Friday.

/test e2e-aws-serial

@squeed
Copy link
Contributor Author

squeed commented Jan 7, 2019

Back to all green.
@smarterclayton @pweil- @mfojtik @soltysh can one of you LGTM?

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, soltysh, squeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 8, 2019
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 8, 2019

@squeed: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-gcp 5776d04 link /test e2e-gcp
ci/prow/e2e-gcp-serial 5776d04 link /test e2e-gcp-serial
ci/prow/launch-aws 5776d04 link /test launch-aws

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 392cb3b into openshift:master Jan 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants