Replaced event queue based openshift sdn resource watches with shared informers #16766

pravisankar · 2017-10-10T01:12:41Z

Use the widely used and relatively stable shared informers instead of the custom SDN
event queue, which has some issues:
openshift/origin: Issue Panic observed in master: *errors.errorString: invalid state transition: Added -> Added #16080
openshift/origin: Issue *errors.errorString: invalid state transition: Updated -> Sync #13879
We watch network objects in multiple places for the same resource
(HostSubnet/NetNamespace/EgressNetworkPolicy), and each watch consumes a goroutine.
A shared informer reduces the bandwidth/CPU/memory footprint by running only one
goroutine per resource watch and allowing multiple subscribers.
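
To make the fan-out idea concrete, here is a minimal sketch using only the Go standard library. It is not the client-go shared informer API; `Event`, `sharedWatcher`, `AddHandler`, `Run`, and `countDeliveries` are hypothetical names chosen for illustration. One goroutine consumes the event source and delivers each event to every registered subscriber.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical stand-in for a watch notification.
type Event struct {
	Type string // e.g. "ADDED", "MODIFIED", "DELETED"
	Name string
}

// sharedWatcher fans events from a single source out to many handlers.
type sharedWatcher struct {
	mu       sync.RWMutex
	handlers []func(Event)
}

// AddHandler registers a subscriber; it does not open a new watch.
func (w *sharedWatcher) AddHandler(h func(Event)) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.handlers = append(w.handlers, h)
}

// Run consumes the source until it is closed and delivers every event
// to every registered handler, in registration order.
func (w *sharedWatcher) Run(source <-chan Event) {
	for ev := range source {
		w.mu.RLock()
		for _, h := range w.handlers {
			h(ev)
		}
		w.mu.RUnlock()
	}
}

// countDeliveries wires n subscribers to one watcher and returns the
// total number of handler invocations for the given events.
func countDeliveries(n int, events []Event) int {
	w := &sharedWatcher{}
	total := 0
	for i := 0; i < n; i++ {
		w.AddHandler(func(Event) { total++ })
	}
	source := make(chan Event)
	done := make(chan struct{})
	go func() { // the single goroutine for this resource
		defer close(done)
		w.Run(source)
	}()
	for _, ev := range events {
		source <- ev
	}
	close(source)
	<-done
	return total
}

func main() {
	events := []Event{{"ADDED", "node-1"}, {"MODIFIED", "node-1"}}
	fmt.Println(countDeliveries(3, events)) // 3 subscribers x 2 events = 6
}
```

Adding a fourth subscriber in this sketch costs one more function call per event, not another goroutine or another watch connection.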

pravisankar · 2017-10-11T07:15:26Z

/retest

pravisankar · 2017-10-11T07:16:40Z

@openshift/networking PTAL

danwinship

Cool

Is this targeted for 3.7 or for after the branch?

danwinship · 2017-10-11T19:20:58Z

pkg/cmd/server/kubernetes/network/network_config.go

@@ -66,17 +70,21 @@ func New(options configapi.NodeConfig, clusterDomain string, proxyConfig *compon

 	var sdnNode network.NodeInterface
 	var sdnProxy network.ProxyInterface
+	var internalNetworkInformers networkinformers.SharedInformerFactory


You could get rid of these temporary variables if you moved the creation of config up here, and then in the "if IsOpenShiftNetworkPlugin" section, you could just set config.SDNNode, config.SDNProxy, and config.InternalNetworkInformers

danwinship · 2017-10-11T19:23:01Z

pkg/cmd/server/kubernetes/network/sdn_linux.go

@@ -29,6 +33,11 @@ func NewSDNInterfaces(options configapi.NodeConfig, networkClient networkclient.
 	// SDN's hostport handling when run under CRI-O.
 	enableHostports := !strings.Contains(runtimeEndpoint, "crio")

+	sdnInformers := sdncommon.SDNInformers{


The SDNInformers type is useful to the SDN code itself for passing to RegisterSharedInformer, but I don't think you should use it from outside code like this; just pass the two informers separately.

danwinship · 2017-10-11T19:28:34Z

pkg/cmd/server/origin/controller/network_linux.go

+	networkClient := ctx.ClientBuilder.OpenshiftInternalNetworkClientOrDie(bootstrappolicy.InfraSDNControllerServiceAccountName)
+	sdnInformers := sdncommon.SDNInformers{
+		KubeInformers:    ctx.InternalKubeInformers,
+		NetworkInformers: networkinformers.NewSharedInformerFactory(networkClient, networkapi.DefaultInformerResyncPeriod),


All the other OpenShift API groups already have an Informers field in ControllerContext. We should make NetworkInformers work the same way. (Add a field to ControllerContext and initialize it (and eventually call Start() on it) from pkg/cmd/server/start/.)

I originally took this route: adding a new field to the existing informers struct in start/informers.go. All the informers are initialized in NewInformers() and passed to newCreateControllerContext(), and ControllerContext runs all controllers (including SDN). But the problem is that we don't want to run our network informers if the user is not using openshift-sdn, so I started sprinkling IsOpenShiftNetworkPlugin() in a few places. Then I asked myself why NewInformers()/ControllerContext need to know about the OpenShift network plugin at all. SDN RunController() is the starting point for openshift-sdn, so putting the network informers there seems like the right place.

Hm... it's weird to treat pkg/networking/apis differently from all the other API groups, but then again, pkg/networking/apis is different from all the other API groups; it's only used by our network plugins, not by the OpenShift core or by third-party network plugins. So, OK, I guess this makes sense.

danwinship · 2017-10-11T19:32:33Z

pkg/network/apis/network/types.go

 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 )

 const (
 	ClusterNetworkDefault       = "default"
 	EgressNetworkPolicyMaxRules = 50
+	DefaultInformerResyncPeriod = 30 * time.Minute


pkg/network/apis/network/ is public; it's for things that external software might need to use. Stuff that is only used by other parts of origin goes in pkg/network/plugin.go. (They used to be all mixed up together but this got fixed recently along with the other reorgs.)

(OTOH it's not clear to me that we should be using our own default value here anyway, rather than picking up some shared origin-wide default from somewhere else...)

OpenShift uses 10 minutes as the informer resync period and k8s uses 15 minutes. These values seem aggressive to me. Resync is intended to help in the case where processing failed the first time due to some transient error, in the hope that processing again after some time will succeed. If the watch code is smart enough to recognize that nothing has changed, frequent resync is okay, but if we are unnecessarily updating iptables, etc., then I think we should resync less often. Moved the 30-minute resync period constant to pkg/network/plugin.go.

danwinship · 2017-10-11T19:39:09Z

pkg/network/master/subnets.go

+	hs := obj.(*networkapi.HostSubnet)
+	log.V(5).Infof("Watch %s event for HostSubnet %q", eventType, hs.Name)
+
+	if _, ok := hs.Annotations[networkapi.AssignHostSubnetAnnotation]; ok {


flip this; just return from the function early if the annotation is not present. (likewise in handleDeleteSubnet)
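
A minimal sketch of the early-return shape being suggested, under stated assumptions: `HostSubnet` here is a trimmed-down stand-in for the API type, `assignSubnetAnnotation` is a placeholder key rather than the real constant's value, and `handleSubnet` is a hypothetical handler that reports whether it did any work.

```go
package main

import "fmt"

// assignSubnetAnnotation is a placeholder key for illustration only.
const assignSubnetAnnotation = "example.io/assign-subnet"

// HostSubnet is a hypothetical, trimmed-down stand-in for the API type.
type HostSubnet struct {
	Name        string
	Annotations map[string]string
}

// handleSubnet returns early when the annotation is absent, so the main
// logic stays at the top indentation level. It reports whether it acted.
func handleSubnet(hs *HostSubnet) bool {
	if _, ok := hs.Annotations[assignSubnetAnnotation]; !ok {
		return false // nothing to do for ordinary HostSubnets
	}
	// ... the annotation-specific work would go here, un-nested ...
	return true
}

func main() {
	plain := &HostSubnet{Name: "node-1"}
	manual := &HostSubnet{
		Name:        "f5-gateway",
		Annotations: map[string]string{assignSubnetAnnotation: "true"},
	}
	fmt.Println(handleSubnet(plain), handleSubnet(manual))
}
```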

danwinship · 2017-10-11T19:46:27Z

pkg/network/node/egressip.go

-	oc      *ovsController
+	localIP   string
+	oc        *ovsController
+	informers common.SDNInformers

 	networkClient networkclient.Interface


networkClient is unused now; it's replaced by informers. (And you should put informers in the group where networkClient is now, not the group with localIP and oc)

danwinship · 2017-10-11T19:47:53Z

pkg/network/node/egressip.go

@@ -57,10 +57,11 @@ type egressIPWatcher struct {
 	testModeChan chan string
 }

-func newEgressIPWatcher(localIP string, oc *ovsController) *egressIPWatcher {
+func newEgressIPWatcher(localIP string, oc *ovsController, informers common.SDNInformers) *egressIPWatcher {


don't pass informers here, pass it to Start(); that way you don't have to create a fake-not-actually-usable informers arg to pass here in egressip_test.go (because egressip_test.go never calls Start(), because it just calls the update funcs by hand)
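
As a rough illustration of why this helps the tests, here is a sketch in which the constructor takes no informers and only `Start()` touches them. The `informers` interface and the `claimed` field are hypothetical simplifications, not the real types in pkg/network/node.

```go
package main

import "fmt"

// informers is a hypothetical stand-in for the real informer factories.
type informers interface {
	AddEventHandler(handler func(egressIP string))
}

type egressIPWatcher struct {
	localIP string
	claimed []string // illustrative state only
}

// newEgressIPWatcher needs no informers, so a unit test can build a
// watcher without constructing a fake informer factory.
func newEgressIPWatcher(localIP string) *egressIPWatcher {
	return &egressIPWatcher{localIP: localIP}
}

// Start is the only method that needs informers; tests that drive the
// update functions by hand never call it.
func (w *egressIPWatcher) Start(inf informers) {
	inf.AddEventHandler(w.updateEgressIP)
}

func (w *egressIPWatcher) updateEgressIP(egressIP string) {
	w.claimed = append(w.claimed, egressIP)
}

func main() {
	// Test-style usage: no informers anywhere.
	w := newEgressIPWatcher("172.17.0.3")
	w.updateEgressIP("172.17.0.100")
	fmt.Println(len(w.claimed))
}
```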

pravisankar · 2017-10-13T17:48:10Z

@danwinship
This may not meet the bar for 3.7 at this time, I guess 3.7.1 then.
Addressed your feedback, PTAL

pravisankar · 2017-10-13T18:08:29Z

pkg/network/master/subnets.go

+	hs := obj.(*networkapi.HostSubnet)
+	log.V(5).Infof("Watch %s event for HostSubnet %q", watch.Deleted, hs.Name)
+
+	if _, ok := hs.Annotations[networkapi.AssignHostSubnetAnnotation]; !ok {


Earlier we were releasing the subnet when networkapi.AssignHostSubnetAnnotation is not present. (Refer: https://github.com/openshift/origin/blob/master/pkg/network/master/subnets.go#L276)
I'm guessing that's an existing bug in the code, or my understanding could be wrong.
@rajatchopra can you confirm?

Not a bug. We do not want to release if the annotation is present (because that indicates a manually created hostsubnet). Otherwise we do want to release it. Who will release it otherwise?

Discussed with @rajatchopra on IRC.
AssignHostSubnetAnnotation is for the F5 use case, where no real node exists in the cluster.
The HostSubnet is manually created with this special annotation to assign a subnet.
Since we don't get a node deletion event in this case, HostSubnet deletion is not triggered.
When the user manually deletes this HostSubnet and the special annotation is present, we need to release the subnet (existing bug).
I will add some comments to make it clear.

pravisankar · 2017-10-17T00:04:48Z

/retest

danwinship · 2017-10-17T13:26:28Z

/hold
for post-branch

danwinship · 2017-10-17T13:42:00Z

pkg/network/master/vnids.go

+	origNetns := obj.(*networkapi.NetNamespace)
+	log.V(5).Infof("Watch %s event for NetNamespace %q", eventType, origNetns.Name)
+
+	// Informer cache should not be mutated, so get a copy of the object


it's updateVNID that mutates it, so I feel like we should do the copy there
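
A sketch of what "do the copy where the mutation happens" could look like, assuming a hand-written `DeepCopy` on a simplified `NetNamespace`. The struct fields, the `pending-change` annotation key, and the `updateVNID` signature are all hypothetical; the real code and its copy helper may differ.

```go
package main

import "fmt"

// NetNamespace is a hypothetical, simplified stand-in for the API type.
type NetNamespace struct {
	Name        string
	NetID       uint32
	Annotations map[string]string
}

// DeepCopy returns a copy that shares no mutable state with the receiver.
func (n *NetNamespace) DeepCopy() *NetNamespace {
	out := *n
	if n.Annotations != nil {
		out.Annotations = make(map[string]string, len(n.Annotations))
		for k, v := range n.Annotations {
			out.Annotations[k] = v
		}
	}
	return &out
}

// updateVNID is the function that mutates, so it owns the copy; callers
// can hand it the object straight from the informer cache.
func updateVNID(cached *NetNamespace, netID uint32) *NetNamespace {
	netns := cached.DeepCopy()
	netns.NetID = netID
	delete(netns.Annotations, "pending-change") // placeholder key
	return netns
}

func main() {
	cached := &NetNamespace{
		Name:        "project-a",
		NetID:       10,
		Annotations: map[string]string{"pending-change": "join:project-b"},
	}
	updated := updateVNID(cached, 42)
	fmt.Println(cached.NetID, len(cached.Annotations))   // cache untouched
	fmt.Println(updated.NetID, len(updated.Annotations)) // copy modified
}
```

The handler itself then stays read-only, which also removes the need for a second type assertion on the copied object.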

danwinship · 2017-10-17T13:45:40Z

pkg/network/master/vnids.go

+		return
+	}
+	netns, ok := objCopy.(*networkapi.NetNamespace)
+	if !ok {


Just assume that it's the right type. (We don't double-check the typecast at the start of the function, so there's no reason to double-check this one either.)

…nd proxy. - Callers of SDN master and proxy need to check if the network plugin is openshift specific before calling any openshift SDN methods. For sdn master, SDNControllerConfig.RunController() and for sdn proxy, NetworkConfig.New() already does these checks.

pravisankar · 2018-01-12T22:12:11Z

@smarterclayton @danwinship @knobunc

Currently we don't need to sync the informers because, where there is a dependency, we don't look at the other resource's shared informer cache; we make a direct API call.

For example, when the SDN master gets a node event, it makes an API call to check whether the corresponding HostSubnet exists and then updates or creates the HostSubnet as needed. Yes, we could have used the HostSubnet shared informer if the cache is populated. For this, we need to maintain an additional queue for each watched resource and start consuming items from the queue only after the caches are populated.

This needs a decent amount of changes, as we watch several resources in SDN. I would prefer to handle this in a separate PR; created Trello card: https://trello.com/c/Uifetuz3
The current PR is the first step, and at least it will solve 2 issues (P0 and P1). Can we get this in for 3.9?

openshift-ci-robot · 2018-01-12T22:17:25Z

@pravisankar: The following tests failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/openshift-jenkins/experimental/integration | `6a1a983` | link | `/test origin-it` |
| ci/openshift-jenkins/experimental/unit | `4add321` | link | `/test origin-ut` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

pravisankar · 2018-01-12T22:22:34Z

/retest

smarterclayton · 2018-01-15T19:31:18Z

You should always wait for sync (see my last comment) for every resource, no exceptions. It's possible some controllers don't need it, but there is no reason not to wait.

pravisankar · 2018-01-15T20:26:10Z

Just waiting for informer sync after informer start will not help, because informer start launches the informer queue asynchronously and all the registered event handlers will start receiving items. In this case, the informer sync will be in the main thread, which slows down OpenShift master/node startup. I think the proper way is to:

(1) The OpenShift master/node main thread calls SDN master/node initialization.
(2) As part of RunSDNController()/sdnNode.Start(), register event handlers with the informers, but the event handlers should only queue the items, not process them.
(3) Start informers
(4) Asynchronously run a go routine that will:

 (a) Wait for the informers to sync
 (b) Start workers to pick items from the queue and do necessary processing.

(5) Main thread is done with SDN and runs other stuff
[These steps will be in-line with other controllers that we have]
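
The steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the proposed implementation: `controller`, `onEvent`, `startInformer`, `run`, and `processAll` are hypothetical names, the informer is simulated by a goroutine that delivers an initial list, and a buffered channel stands in for a work queue. Real code would more likely use client-go's `cache.WaitForCacheSync` and a `workqueue`.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// controller queues keys from event handlers and processes them only
// after the (simulated) informer cache has synced.
type controller struct {
	queue  chan string   // step 2: handlers only enqueue
	synced chan struct{} // closed once the initial list is delivered

	mu        sync.Mutex
	processed []string
}

// onEvent is the registered handler: it queues the key and returns.
func (c *controller) onEvent(key string) { c.queue <- key }

// startInformer simulates step 3: deliver the initial objects
// asynchronously, then mark the cache as synced.
func (c *controller) startInformer(initial []string) {
	go func() {
		for _, key := range initial {
			c.onEvent(key)
		}
		close(c.synced)
	}()
}

// run is step 4: in the background, wait for sync (a), then start
// workers that drain the queue (b). The caller is not blocked (step 5).
func (c *controller) run(workers int) *sync.WaitGroup {
	var wg sync.WaitGroup
	wg.Add(workers)
	go func() {
		<-c.synced
		for i := 0; i < workers; i++ {
			go func() {
				defer wg.Done()
				for key := range c.queue {
					c.mu.Lock()
					c.processed = append(c.processed, key)
					c.mu.Unlock()
				}
			}()
		}
	}()
	return &wg
}

// processAll runs the whole flow once and returns the processed keys,
// sorted because worker ordering is not deterministic.
func processAll(initial []string, workers int) []string {
	c := &controller{
		// The buffer must hold everything queued before workers start.
		queue:  make(chan string, len(initial)),
		synced: make(chan struct{}),
	}
	c.startInformer(initial)
	wg := c.run(workers)
	<-c.synced
	close(c.queue) // demo only: no further events will arrive
	wg.Wait()
	sort.Strings(c.processed)
	return c.processed
}

func main() {
	fmt.Println(processAll([]string{"node-b", "node-a"}, 2)) // [node-a node-b]
}
```

The bounded channel is the main simplification: if more events arrive before sync than the buffer holds, the handler blocks, which is why an unbounded work queue is the usual choice.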

This is what was proposed in https://trello.com/c/Uifetuz3. The current PR only replaces the existing event queue, which had a few issues, but doesn't wait for informer sync before processing. Note that the older event queue implementation doesn't handle synchronization either. The reason this did not cause issues before is that one event queue/informer did not depend on another event queue/informer.

So the question is whether to merge this PR together with https://trello.com/c/Uifetuz3, or to push incrementally by merging this PR in the 3.9 release and handling informer sync in another PR.

@smarterclayton what do you prefer?

smarterclayton · 2018-01-15T21:06:57Z

Ok, you don't have workers yet. If you're only making live calls then we can live with this for now.

danwinship · 2018-01-16T19:10:52Z

/lgtm

pravisankar · 2018-01-16T19:28:19Z

Need blessings from pkg/cmd/OWNERS
@smarterclayton can you please approve this pr?

smarterclayton · 2018-01-17T18:49:50Z

/approve

openshift-ci-robot · 2018-01-17T18:49:57Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, knobunc, pravisankar, smarterclayton

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

~~pkg/cmd/OWNERS~~ [smarterclayton]
~~pkg/network/OWNERS~~ [danwinship,knobunc,pravisankar,smarterclayton]

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

eparis · 2018-01-17T18:53:00Z

/hold
not sure if we want this after feature freeze. Is there a specific bug?

knobunc · 2018-01-17T19:48:00Z

/hold cancel

Since this fixes:
openshift/origin: Issue #16080
openshift/origin: Issue #13879

We should take this. I will risk the scorn and opprobrium of my peers if this regresses.

danwinship · 2018-01-17T21:07:47Z

Yeah, this isn't a feature. It's fixing a bug

openshift-merge-robot · 2018-01-17T23:46:01Z

Automatic merge from submit-queue (batch tested with PRs 18075, 17725, 16766, 18070, 18113).

openshift-ci-robot added the do-not-merge/work-in-progress and size/XXL labels Oct 10, 2017

openshift-merge-robot assigned knobunc and smarterclayton Oct 10, 2017

pravisankar force-pushed the replace-eventqueue branch from f8c3c25 to 6a1a983 October 10, 2017 06:33

openshift-merge-robot added the needs-api-review and needs-rebase labels Oct 10, 2017

pravisankar force-pushed the replace-eventqueue branch from 6a1a983 to 4add321 October 10, 2017 21:18

openshift-merge-robot removed the needs-rebase label Oct 10, 2017

pravisankar changed the title ~~[WIP] Replaced event queue based openshift sdn resource watches with shared informers~~ Replaced event queue based openshift sdn resource watches with shared informers Oct 11, 2017

openshift-ci-robot removed the do-not-merge/work-in-progress label Oct 11, 2017

pravisankar added the component/networking label Oct 11, 2017

openshift-merge-robot added the needs-rebase label Oct 11, 2017

danwinship suggested changes Oct 11, 2017

pravisankar force-pushed the replace-eventqueue branch from 4add321 to bcc276c October 13, 2017 17:35

openshift-merge-robot removed the needs-api-review label Oct 13, 2017

pravisankar force-pushed the replace-eventqueue branch from bcc276c to c6c2dc9 October 13, 2017 17:42

openshift-merge-robot removed the needs-rebase label Oct 13, 2017

pravisankar commented Oct 13, 2017

pravisankar force-pushed the replace-eventqueue branch 3 times, most recently from cb5bf5f to 60f072d October 16, 2017 22:10

openshift-ci-robot added the do-not-merge/hold label Oct 17, 2017

danwinship reviewed Oct 17, 2017

pravisankar force-pushed the replace-eventqueue branch from 60f072d to 7bc3558 October 17, 2017 18:48

Ravi Sankar Penta added 2 commits January 10, 2018 13:15

Remove SDNInformers and pass kube and network informers appropriately

d997cb8

pravisankar force-pushed the replace-eventqueue branch from c39f8bf to 99cb946 January 12, 2018 21:20

openshift-merge-robot removed the lgtm label Jan 12, 2018

openshift-ci-robot added the lgtm label Jan 16, 2018

pravisankar added the kind/bug label Jan 16, 2018

openshift-ci-robot added the approved label Jan 17, 2018

openshift-ci-robot added the do-not-merge/hold label Jan 17, 2018

openshift-ci-robot removed the do-not-merge/hold label Jan 17, 2018

openshift-merge-robot merged commit aecc074 into openshift:master Jan 17, 2018

pravisankar mentioned this pull request Jan 18, 2018

Panic observed in master: *errors.errorString: invalid state transition: Added -> Added #16080

Closed

danwinship mentioned this pull request Feb 20, 2018

openshift-sdn is opening more watches than expected #16397

Closed

pravisankar pushed a commit to pravisankar/origin that referenced this pull request Mar 2, 2018

Fix handleDeleteSubnet() to release network from subnet allocator.

32ad85b

This was introduced in openshift#16766

pravisankar mentioned this pull request Mar 2, 2018

Fix handleDeleteSubnet() to release network from subnet allocator #18801

Merged

openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/origin that referenced this pull request Mar 3, 2018

Fix handleDeleteSubnet() to release network from subnet allocator.

9b2e5ed

This was introduced in openshift#16766

deads2k pushed a commit to openshift/sdn that referenced this pull request Jun 18, 2019

Fix handleDeleteSubnet() to release network from subnet allocator.

2e4afec

This was introduced in openshift/origin#16766
