
deploy: tweak enqueueing in the trigger controller #11501

Merged: 1 commit, merged into openshift:master on Oct 25, 2016
Conversation

@0xmichalis (Contributor):
// but I think this is fine.
shouldInstantiate := true
if newDc.Status.LatestVersion > 0 {
latestRc, err := c.rcLister.ReplicationControllers(newDc.Namespace).Get(deployutil.LatestDeploymentNameForConfig(newDc))
Contributor:

If the rc lister significantly lags, what happens here?

Contributor Author:

As is, we are dropping the dc event. e343373 changed the dc controller to update observedGeneration when a new rc fails to be created. If we change that back and have just the dc condition updated (I think I should have done that in the first place), then we can safely ignore errors from the rc cache and proceed with enqueueing, since the dc has to be synced at this point. Not sure if there is any case where we shouldn't block instantiate because the dc controller fails to create a new rc.

@smarterclayton (Contributor):

[test]

@0xmichalis (Contributor Author):

Updated to ignore errors from the rc cache and defer to instantiate to do the rc check. I opted for letting the dc controller update observedGeneration even when it observes that it cannot create a new rc so that the trigger controller will never be blocked.

@openshift-bot (Contributor):

Evaluated for origin test up to 5398c1a

@openshift-bot (Contributor):

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/10549/) (Base Commit: 9195050)

// rcLister provides a local cache for replication controllers.
rcLister cache.StoreToReplicationControllerLister
// rcListerSynced makes sure the rc store is synced before reconciling any replication controller.
rcListerSynced func() bool
Contributor:

do you need this for testing?

Contributor Author:

Nope, it's needed for the controller to know whether its caches are synced.
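For illustration, a minimal stand-in for that `rcListerSynced func() bool` field; the `controller` type here is hypothetical, assuming (as in the real code) that the informer's HasSynced method gets wired into it so the controller never acts on a half-populated cache:

```go
package main

import "fmt"

// controller is a hypothetical, stripped-down version of the trigger
// controller: it holds a func that reports whether the rc cache has
// completed its initial sync.
type controller struct {
	rcListerSynced func() bool
}

// ready reports whether the controller may safely consult its rc cache.
func (c *controller) ready() bool {
	return c.rcListerSynced()
}

func main() {
	synced := false
	c := &controller{rcListerSynced: func() bool { return synced }}

	fmt.Println(c.ready()) // cache still filling: false
	synced = true
	fmt.Println(c.ready()) // informer reports synced: true
}
```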

glog.V(2).Infof("Cannot decode dc from replication controller %s: %v", deployutil.LabelForDeployment(latestRc), err)
return
}
shouldInstantiate = !reflect.DeepEqual(newDc.Spec.Template, initial.Spec.Template)
Contributor:

how expensive is this?

Contributor Author:

As expensive as a deep equal check can be? We were running this in the trigger controller already prior to refactoring the controller to use the instantiate endpoint.

Contributor:

Deep equal is expensive but it's the correct thing to use for now (vs something net new).

if err != nil {
// If we get an error here it may be due to the rc cache lagging behind. In such a case
// just defer to the api server (instantiate REST) where we will retry this.
glog.V(2).Infof("Cannot get latest rc for dc %s:%d (%v) - will defer to instantiate", deployutil.LabelForDeploymentConfig(newDc), newDc.Status.LatestVersion, err)
Contributor:

V(4)

} else {
initial, err := deployutil.DecodeDeploymentConfig(latestRc, c.codec)
if err != nil {
glog.V(2).Infof("Cannot decode dc from replication controller %s: %v", deployutil.LabelForDeployment(latestRc), err)
Contributor:

do we want to reconcile here? if we failed to decode, what are the chances that we will succeed next time?

Contributor Author:

We won't reconcile; I just return without adding the dc to the queue.

@mfojtik (Contributor):

mfojtik commented Oct 25, 2016

LGTM [merge]

@openshift-bot (Contributor):

Evaluated for origin merge up to 5398c1a

@smarterclayton (Contributor):

I guess it depends on what observedGeneration is supposed to mean. Right
now it means "I've seen your request and I'm trying to bring it to
completion" but not "I've succeeded at reaching your request".

I don't think you can treat that the rc cache has anything to do with the
dc cache in this case though - the rc cache can wedge at any time and you
can be arbitrarily far behind the real state

From @Kargakis's review comment in pkg/deploy/controller/generictrigger/factory.go (#11501):

if len(newDc.Spec.Triggers) == 0 || newDc.Spec.Paused {
	return
}
// We don't want to compete with the main deployment config controller. Let's process this
// config once it's synced. Note that this does not eliminate conflicts between the two
// controllers because the main controller is constantly updating deployment configs as
// owning replication controllers and pods are updated.
if !deployutil.HasSynced(newDc, newDc.Generation) {
	return
}
// Compare deployment config templates before enqueueing. This reduces the amount of times
// we will try to instantiate a deployment config to the exact number of times we need at
// the expense of duplicating some of the work that the instantiate endpoint is already doing
// but I think this is fine.
shouldInstantiate := true
if newDc.Status.LatestVersion > 0 {
	latestRc, err := c.rcLister.ReplicationControllers(newDc.Namespace).Get(deployutil.LatestDeploymentNameForConfig(newDc))


@0xmichalis (Contributor Author):

I guess it depends on what observedGeneration is supposed to mean. Right
now it means "I've seen your request and I'm trying to bring it to
completion" but not "I've succeeded at reaching your request".

I don't think you can treat that the rc cache has anything to do with the
dc cache in this case though - the rc cache can wedge at any time and you
can be arbitrarily far behind the real state

The rc cache has nothing to do with the dc cache, that's right. It will also be much heavier, but my point in expecting a synced dc is that we have a better chance of not conflicting with the dc controller.

@openshift-bot (Contributor):

openshift-bot commented Oct 25, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/10549/) (Base Commit: 8b8e813) (Image: devenv-rhel7_5240)

@openshift-bot openshift-bot merged commit a76ec57 into openshift:master Oct 25, 2016
@0xmichalis 0xmichalis deleted the tweak-trigger-controller-enqueueing branch October 26, 2016 09:48