Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

apps: retry when scaler hits the conflict #19271

Closed
wants to merge 1 commit into from

Conversation

mfojtik
Copy link
Contributor

@mfojtik mfojtik commented Apr 9, 2018

Fixes: #19270

This is weird, seems like the new genericScaler does not handle the conflicts? Probably need more investigation.

//cc @tnozicka
//cc @DirectXMan12

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mfojtik

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Apr 9, 2018
Copy link
Contributor

@tnozicka tnozicka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with this hot fix, while we look for what went wrong and better way to fix it

@mfojtik
Copy link
Contributor Author

mfojtik commented Apr 9, 2018

@tnozicka lets wait for the tests to prove this is fixing the problem, I'm still curious why the new genericScaler is not handing the conflicts (if that is per-design)... Also wonder if upstream deployments scaler is dealing with conflicts in the caller now (not that deployers acts as kubectl client here, so I guess this should be handled somewhere in kubectl scale code?)

@mfojtik
Copy link
Contributor Author

mfojtik commented Apr 9, 2018

/cc @liggitt

@@ -234,6 +234,10 @@ func (s *RecreateDeploymentStrategy) scaleAndWait(deployment *kapi.ReplicationCo
if scaleErr == nil {
return true, nil
}
// Handle conflicts (shouldn't this be already handled by the scaler?)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was bugged. kubernetes/kubernetes#61790 comes to mind.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That only fixes the result, not the cause. Do we want the scalers to handle conflicts? Personally, the current approach is fine, it's the callers duty to handle those.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another reason we should not use kubectl scaler but rather directly implement that functionality.

Copy link
Contributor

@tnozicka tnozicka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the genericScaler should already handle retries on conflict

@tnozicka
Copy link
Contributor

tnozicka commented Apr 9, 2018

conflicts are handled here:

cond := ScaleCondition(s, preconditions, namespace, resourceName, newSize, nil)
if err := wait.PollImmediate(retry.Interval, retry.Timeout, cond); err != nil {

and here:
case ScaleError:
// Retry only on update conflicts.
if e.FailureType == ScaleUpdateConflictFailure {
return false, nil
}

@mfojtik
Copy link
Contributor Author

mfojtik commented Apr 9, 2018

Closing in favor of #19275

@mfojtik mfojtik closed this Apr 9, 2018
@mfojtik mfojtik deleted the fix-scaler-01 branch September 5, 2018 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants