Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replaced event queue based openshift sdn resource watches with shared informers #16766

Merged

Conversation

pravisankar
Copy link

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 10, 2017
@openshift-merge-robot openshift-merge-robot added needs-api-review needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Oct 10, 2017
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 10, 2017
@pravisankar
Copy link
Author

/retest

@pravisankar pravisankar changed the title [WIP] Replaced event queue based openshift sdn resource watches with shared informers Replaced event queue based openshift sdn resource watches with shared informers Oct 11, 2017
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 11, 2017
@pravisankar
Copy link
Author

@openshift/networking PTAL

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 11, 2017
Copy link
Contributor

@danwinship danwinship left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool

Is this targeted for 3.7 or for after the branch?

@@ -66,17 +70,21 @@ func New(options configapi.NodeConfig, clusterDomain string, proxyConfig *compon

var sdnNode network.NodeInterface
var sdnProxy network.ProxyInterface
var internalNetworkInformers networkinformers.SharedInformerFactory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could get rid of these temporary variables if you moved the creation of config up here, and then in the "if IsOpenShiftNetworkPlugin" section, you could just set config.SDNNode, config.SDNProxy, and config.InternalNetworkInformers

@@ -29,6 +33,11 @@ func NewSDNInterfaces(options configapi.NodeConfig, networkClient networkclient.
// SDN's hostport handling when run under CRI-O.
enableHostports := !strings.Contains(runtimeEndpoint, "crio")

sdnInformers := sdncommon.SDNInformers{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDNInformers type is useful to the SDN code itself for passing to RegisterSharedInformer, but I don't think you should use it from outside code like this; just pass the two informers separately.

networkClient := ctx.ClientBuilder.OpenshiftInternalNetworkClientOrDie(bootstrappolicy.InfraSDNControllerServiceAccountName)
sdnInformers := sdncommon.SDNInformers{
KubeInformers: ctx.InternalKubeInformers,
NetworkInformers: networkinformers.NewSharedInformerFactory(networkClient, networkapi.DefaultInformerResyncPeriod),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the other OpenShift API groups already have an Informers field in ControllerContext. We should make NetworkInformers work the same way. (Add a field to ControllerContext and initialize it (and eventually call Start() on it) from pkg/cmd/server/start/.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally took this route. Adding a new field in existing informers struct in start/informers.go. All the informers are initialized in NewInformers() and passed to newCreateControllerContext() and ControllerContext runs all controllers(including SDN). But the problem is we don't want to run our network informers if the user is not using openshift-sdn, so I started sprinkling IsOpenShiftNetworkPlugin() in few places. Then I realized why NewInformers/ControllerContext() need to know about openshift network plugin? SDN RunController() is the starting point for openshift sdn and putting network informers there seems the right place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... it's weird to treat pkg/networking/apis differently from all the other API groups, but then again, pkg/networking/apis is different from all the other API groups; it's only used by our network plugins, not by the OpenShift core or by third-party network plugins. So, OK, I guess this makes sense.

metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

const (
ClusterNetworkDefault = "default"
EgressNetworkPolicyMaxRules = 50
DefaultInformerResyncPeriod = 30 * time.Minute
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkg/network/apis/network/ is public; it's for things that external software might need to use. Stuff that is only used by other parts of origin goes in pkg/network/plugin.go. (They used to be all mixed up together but this got fixed recently along with the other reorgs.)

(OTOH it's not clear to me that we should be using our own default value here anyway, rather than picking up some shared origin-wide default from somewhere else...)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenShift is using 10 mins as informer resync period and k8s is using 15 mins. These values seems aggressive to me. Resync is intended to help the case where you failed to process the first time due to some transient error and hoping that processing again after sometime will succeed. If the watch code is smart enough to recognize nothing has changed, frequent resync is okay but if we are unnecessarily updating iptables, etc. then I think we should resync less often. Moved 30 min resync period constant to pkg/network/plugin.go

hs := obj.(*networkapi.HostSubnet)
log.V(5).Infof("Watch %s event for HostSubnet %q", eventType, hs.Name)

if _, ok := hs.Annotations[networkapi.AssignHostSubnetAnnotation]; ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flip this; just return from the function early if the annotation is not present. (likewise in handleDeleteSubnet)

oc *ovsController
localIP string
oc *ovsController
informers common.SDNInformers

networkClient networkclient.Interface
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

networkClient is unused now; it's replaced by informers. (And you should put informers in the group where networkClient is now, not the group with localIP and oc)

@@ -57,10 +57,11 @@ type egressIPWatcher struct {
testModeChan chan string
}

func newEgressIPWatcher(localIP string, oc *ovsController) *egressIPWatcher {
func newEgressIPWatcher(localIP string, oc *ovsController, informers common.SDNInformers) *egressIPWatcher {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't pass informers here, pass it to Start(); that way you don't have to create a fake-not-actually-usable informers arg to pass here in egressip_test.go (because egressip_test.go never calls Start(), because it just calls the update funcs by hand)

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 13, 2017
@pravisankar
Copy link
Author

@danwinship
This may not meet the bar for 3.7 at this time, I guess 3.7.1 then.
Addressed your feedback, PTAL

hs := obj.(*networkapi.HostSubnet)
log.V(5).Infof("Watch %s event for HostSubnet %q", watch.Deleted, hs.Name)

if _, ok := hs.Annotations[networkapi.AssignHostSubnetAnnotation]; !ok {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier we were releasing subnet when networkapi.AssignHostSubnetAnnotation is not present. (Refer: https://github.com/openshift/origin/blob/master/pkg/network/master/subnets.go#L276)
I'm guessing that's a existing bug in the code or my understanding could be wrong.
@rajatchopra can you confirm?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a bug. We do not want to release if the annotation is present (because that indicates a manually created hostsubnet). Otherwise we do want to release it. Who will release it otherwise?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @rajatchopra on IRC,
AssignHostSubnetAnnotation is for F5 use case and there is no real node exists in the cluster.
HostSubnet is manually created with this special annotation to assign a subnet.
Since we don't get a node deletion event in this case, delete HostSubnet is not triggered.
When user manually deletes this created HostSubnet and if the special annotation is present, then we need to release the subnet (existing bug).
I will add some comments to make it clear.

@pravisankar pravisankar force-pushed the replace-eventqueue branch 3 times, most recently from cb5bf5f to 60f072d Compare October 16, 2017 22:10
@pravisankar
Copy link
Author

/retest

@danwinship
Copy link
Contributor

/hold
for post-branch

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 17, 2017
origNetns := obj.(*networkapi.NetNamespace)
log.V(5).Infof("Watch %s event for NetNamespace %q", eventType, origNetns.Name)

// Informer cache should not be mutated, so get a copy of the object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's updateVNID that mutates it, so I feel like we should do the copy there

return
}
netns, ok := objCopy.(*networkapi.NetNamespace)
if !ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just assume that it's the right type. (We don't double-check the typecast at the start of the function, so there's no reason to double-check this one either.)

Ravi Sankar Penta added 2 commits January 10, 2018 13:15
…nd proxy.

- Callers of SDN master and proxy need to check if the network plugin is
openshift specific before calling any openshift SDN methods.
For sdn master, SDNControllerConfig.RunController() and for sdn proxy,
NetworkConfig.New() already does these checks.
@openshift-merge-robot openshift-merge-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 12, 2018
@pravisankar
Copy link
Author

@smarterclayton @danwinship @knobunc

Currently we don't need to sync the informers because in case of dependency we are not looking at the other resource's shared informer cache, we make a direct API call.

For example, sdn master gets node event, it will make an API call to check existence of corresponding hostsubnet and then updates or creates hostsubnet as needed. Yes, we could have used Hostsubnet shared informer if the cache is populated. For this, we need to maintain additional queue for each watching resource and start consuming items on the queue only after the caches are populated.

This needs decent amount of changes as we watch for several resources in SDN. I prefer this to be handled in a separate pr, created trello card: https://trello.com/c/Uifetuz3
Current pr is the first step and at least it will solve 2 issues (P0 and P1). Can we get this in for 3.9 ?

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 12, 2018

@pravisankar: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/openshift-jenkins/experimental/integration 6a1a983 link /test origin-it
ci/openshift-jenkins/experimental/unit 4add321 link /test origin-ut

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@pravisankar
Copy link
Author

/retest

@smarterclayton
Copy link
Contributor

You should always wait for sync (see my last comment) for every resource, no exceptions. It's possible some controllers don't need it, but there is no reason not to wait.

@pravisankar
Copy link
Author

pravisankar commented Jan 15, 2018

Just waiting for informer sync after informer start will not help because informer start launches informer queue asynchronously and all the registered event handlers will start receiving the items. In this case, informer sync will be in the main thread which slows the openshift master/node start. I think the proper way is to:

(1) Openshift master/node main thread calls SDN mater/node initalization.
(2) As part of RunSDNController()/sdnNode.Start(), register event handlers to the informers but the event handlers should only queue the items but not process it.
(3) Start informers
(4) Asynchronously run a go routine that will:

 (a) Wait for the informers to sync
 (b) Start workers to pick items from the queue and do necessary processing.

(5) Main thread is done with SDN and runs other stuff
[These steps will be in-line with other controllers that we have]

This is what was proposed in https://trello.com/c/Uifetuz3 . Current PR only replaces existing event queue which had few issues but doesn't wait for informer sync before processing. Note that older event queue implementation doesn't handle synchronization as well. And the reason for not having issues before is that one event-queue/informer did not depend on another event-queue/informer.

So the question is whether to merge this pr together with https://trello.com/c/Uifetuz3 or can we do incremental push by merging this pr in 3.9 release and handling informer sync in another pr?

@smarterclayton what do you prefer?

@smarterclayton
Copy link
Contributor

Ok, you don't have workers yet. If you're only making live calls then we can live with this for now.

@danwinship
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 16, 2018
@pravisankar pravisankar added the kind/bug Categorizes issue or PR as related to a bug. label Jan 16, 2018
@pravisankar
Copy link
Author

Need blessings from pkg/cmd/OWNERS
@smarterclayton can you please approve this pr?

@smarterclayton
Copy link
Contributor

/approve

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, knobunc, pravisankar, smarterclayton

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 17, 2018
@eparis
Copy link
Member

eparis commented Jan 17, 2018

/hold
not sure if we want this after feature freeze. Is there a specific bug?

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 17, 2018
@knobunc
Copy link
Contributor

knobunc commented Jan 17, 2018

/hold cancel

Since this fixes:
openshift/origin: Issue #16080
openshift/origin: Issue #13879

We should take this. I will risk a the scorn and opprobrium of my peers if this regresses.

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 17, 2018
@danwinship
Copy link
Contributor

Yeah, this isn't a feature. It's fixing a bug

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue (batch tested with PRs 18075, 17725, 16766, 18070, 18113).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. component/networking kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. sig/networking size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants