Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimize direct API calls in SDN master by using informer cache #18911

Merged

Conversation

@openshift-ci-robot openshift-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Mar 9, 2018
@pravisankar
Copy link
Author

@openshift/sig-networking PTAL

@pravisankar
Copy link
Author

/retest

@danwinship
Copy link
Contributor

danwinship commented Mar 9, 2018

Allow allocator to mark assigned subnet

What is this commit doing?

@danwinship
Copy link
Contributor

Use existing network informers from openshift controller context

This seems completely independent of everything else, and needs a pkg/cmd approver, so you probably should split it into a separate PR

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 10, 2018
pravisankar pushed a commit to pravisankar/origin that referenced this pull request Mar 15, 2018
- Currently, we need to pass all allocated subnets during
subnet allocator creation time (inUse arg to NewSubnetAllocator()).
This means we need to know all existing subnets beforehand.

- This change exposes additional method so that we can mark a
specific subnet as already allocated dynamically (after the subnet
allocated is created).

Precursor for openshift#18911
pravisankar pushed a commit to pravisankar/origin that referenced this pull request Mar 15, 2018
- Currently, we need to pass all allocated subnets during
subnet allocator creation time (inUse arg to NewSubnetAllocator()).
This means we need to know all existing subnets beforehand.

- This change exposes additional method so that we can mark a
specific subnet as already allocated dynamically (after the subnet
allocator is created).

Precursor for openshift#18911
@pravisankar
Copy link
Author

pravisankar commented Mar 15, 2018 via email

@pravisankar
Copy link
Author

pravisankar commented Mar 15, 2018 via email

@pravisankar pravisankar added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 15, 2018
pravisankar pushed a commit to pravisankar/origin that referenced this pull request Mar 16, 2018
- Currently, we need to pass all allocated subnets during
subnet allocator creation time (inUse arg to NewSubnetAllocator()).
This means we need to know all existing subnets beforehand.

- This change exposes additional method so that we can mark a
specific subnet as already allocated dynamically (after the subnet
allocator is created).

Precursor for openshift#18911
pravisankar pushed a commit to pravisankar/origin that referenced this pull request Mar 16, 2018
- Currently, we need to pass all allocated subnets during
subnet allocator creation time (inUse arg to NewSubnetAllocator()).
This means we need to know all existing subnets beforehand.

- This change exposes additional method so that we can mark a
specific subnet as already allocated dynamically (after the subnet
allocator is created).

Precursor for openshift#18911
openshift-merge-robot added a commit that referenced this pull request Mar 20, 2018
Automatic merge from submit-queue (batch tested with PRs 18999, 18543).

Allow subnet allocator to mark assigned subnet dynamically

- Moved subnet allocator from pkg/util/netutils to pkg/network/master
  Subnet allocator is specific to SDN master and not used anywhere.

- Currently, we need to pass all allocated subnets during
subnet allocator creation time (inUse arg to NewSubnetAllocator()).
This means we need to know all existing subnets beforehand.

- This change exposes additional method so that we can mark a
specific subnet as already allocated dynamically (after the subnet
allocator is created).

Precursor for #18911
@pravisankar pravisankar force-pushed the sdn-minimize-live-apicalls branch from 02ecfac to 5920c3c Compare March 20, 2018 20:46
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 20, 2018
@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 20, 2018
@pravisankar pravisankar force-pushed the sdn-minimize-live-apicalls branch from 5920c3c to 9be3468 Compare March 21, 2018 21:15
@pravisankar pravisankar removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 21, 2018
@pravisankar
Copy link
Author

dependent prs got merged, now this is ready for review/merge
@openshift/networking @danwinship PTAL

}

func (master *OsdnMaster) handleDeleteNode(obj interface{}) {
node := obj.(*kapi.Node)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need the "Tombstone" pattern here? eg

                        ns, ok := obj.(*kapi.Namespace)
                        if !ok {
                                tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
                                if !ok {
                                        logrus.Errorf("couldn't get object from tombstone %+v", obj)
                                        return
                                }
                                ns, ok = tombstone.Obj.(*kapi.Namespace)
                                if !ok {
                                        logrus.Errorf("tombstone contained object that is not a namespace %#v", obj)
                                        return
                                }
                        }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pravisankar
Copy link
Author

/retest


func (master *OsdnMaster) startSubSystems(pluginName string) {
if err := master.SubnetStartMaster(master.networkInfo.ClusterNetworks); err != nil {
panic(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

glog.Fatalf
(likewise below and in other commits)

@@ -77,26 +75,17 @@ func (vmap *masterVNIDMap) isAdminNamespace(nsName string) bool {
return false
}

func (vmap *masterVNIDMap) populateVNIDs(netNamespaceInformer networkinformers.NetNamespaceInformer) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch doesn't work. You're removing the initial bulk NetNamespace->netIDManager initialization, but there isn't any lazy NetNamespace->netIDManager initialization anywhere, so it will never learn the existing namespace->VNID mappings. So when you start up the master the second time, it will see Namespace Added events for each pre-existing Namespace, and it will call assignVNID, which will call allocateNetID, which will call getVNID to see if the Namespace is already known, but it's not already known, so that will return !found, and so then allocateNetID will call vmap.netIDManager.AllocateNext() to allocate a new VNID, then try to Create a NetNamespace object with that VNID, fail because the NetNamespace already exists, and then log an error.

Even if you updated handleAddOrUpdateNetNamespace to do the right thing, you'd still have to deal with the fact that you don't know if you're going to see Namespace "foo" or NetNamespace "foo" first. So I think basically you can't get rid of populateVNIDs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, initializing vnid allocator is done in 00bf4a7

@openshift-ci-robot openshift-ci-robot added the ¯\_(ツ)_/¯ ¯\\\_(ツ)_/¯ label Mar 28, 2018
@danwinship
Copy link
Contributor

/lol

go master.startSubSystems(networkConfig.NetworkPluginName)

return nil
}

func (master *OsdnMaster) initSubSystems() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no reason this has to be a separate function, is there? You could just as easily call initSubnetMaster() at the start of startSubSystems().

Actually is there even any reason left at this point for separate init and start methods? It's true that we do that in some other files but that's usually because the start method does things we don't want to do from unit tests. But there are no unit tests of these files (and the init method does things we probably wouldn't want done from unit tests as well anyway).

Same comment applies to vnid code

Ravi Sankar Penta added 9 commits March 30, 2018 10:32
- Move HostSubnet reconciliation inside watchSubnets() which will
help us to detect and fix issues sooner and not during next restart.
- Move host IP validations for existing subnets inside watchSubnets()
Subnet watch gets both existing and new subnets and this avoids
duplicate host IP validations.
- Now we do not need to call release subnet in all the cluster networks.
- Removed inUse arg to newSubnetAllocator (no longer used)
- Renamed method names
@pravisankar pravisankar force-pushed the sdn-minimize-live-apicalls branch from d2baa57 to ae0f3e1 Compare March 30, 2018 17:34
@pravisankar
Copy link
Author

@danwinship PTAL

@pravisankar
Copy link
Author

/retest

@pravisankar
Copy link
Author

/test gcp

@danwinship
Copy link
Contributor

/test extended_networking

@danwinship
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 2, 2018
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, pravisankar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danwinship
Copy link
Contributor

/hold
until extended_networking passes

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 2, 2018
@pravisankar
Copy link
Author

extended_networking tests passed
/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 2, 2018
@openshift-merge-robot openshift-merge-robot merged commit 2cf54b9 into openshift:master Apr 2, 2018
deads2k pushed a commit to openshift/sdn that referenced this pull request Jun 18, 2019
- Currently, we need to pass all allocated subnets during
subnet allocator creation time (inUse arg to NewSubnetAllocator()).
This means we need to know all existing subnets beforehand.

- This change exposes additional method so that we can mark a
specific subnet as already allocated dynamically (after the subnet
allocator is created).

Precursor for openshift/origin#18911
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. sig/networking size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. ¯\_(ツ)_/¯ ¯\\\_(ツ)_/¯
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants