Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail new builds that can't start build pod because it already exists #12057

Merged

Conversation

mmilata
Copy link
Contributor

@mmilata mmilata commented Nov 29, 2016

Consider following scenario:

  • build is started by user
  • build controller creates build pod
  • build is deleted by user
  • build pod is killed but remains Terminating during grace period
  • build with same name as in first step is started by user

Previously the new build stayed in New phase until the old build pod finished and then went into Failed after the pod terminated. This commit causes the new build to go into Error phase when it first tries creating build pod and fails because pod with the same name exists.

Slightly improves bug 1300949 (the described behavior is still very similar).

glog.V(4).Infof("Build pod already existed: %#v", podSpec)
build.Status.Phase = buildapi.BuildPhaseError
build.Status.Reason = buildapi.StatusReasonBuildPodExists
build.Status.Message = err.Error()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to be checking if the pod is older than the build. if the pod is newer than the build, we should continue to no-op this (it indicates something else managed to create the pod for us).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, totally forgot about this - updated.

@mmilata mmilata force-pushed the bz1300949-buildconfig-deletion branch from 7b5f209 to 9ae2650 Compare November 30, 2016 16:07
@mmilata
Copy link
Contributor Author

mmilata commented Nov 30, 2016

[test][testextended][extended:core(builds)]

@bparees
Copy link
Contributor

bparees commented Nov 30, 2016

lgtm [merge]

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 1, 2016
@openshift-bot
Copy link
Contributor

openshift-bot commented Dec 1, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/11862/) (Image: devenv-rhel7_5458)

@bparees bparees self-assigned this Dec 1, 2016
Consider following scenario:

- build is started by user
- build controller creates build pod
- build is deleted by user
- build pod is killed but remains Terminating during grace period
- build with same name as in first step is started by user

Previously the new build stayed in New phase until the old build pod
finished and then went into Failed after the pod terminated. This commit
causes the new build to go into Error phase when it first tries creating
build pod and fails because pod with the same name (and earlier
creationTimestamp) exists.

Slightly improves bug 1300949 (the described behavior is still very
similar).
@mmilata mmilata force-pushed the bz1300949-buildconfig-deletion branch from 9ae2650 to 0cfb747 Compare December 1, 2016 14:53
@mmilata
Copy link
Contributor Author

mmilata commented Dec 1, 2016

Rebased, also updated the code to be in line with #10817 - the message field is set to predefined message instead of propagating the AlreadyExists error.

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to 0cfb747

@openshift-bot
Copy link
Contributor

Evaluated for origin testextended up to 0cfb747

@bparees
Copy link
Contributor

bparees commented Dec 1, 2016

[merge]

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to 0cfb747

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/11862/) (Base Commit: ff5bd5a)

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/testextended FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin_extended/853/) (Base Commit: b644d1f) (Extended Tests: core(builds))

@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 1, 2016
@bparees
Copy link
Contributor

bparees commented Dec 1, 2016

extended test failure is a known issue.

@openshift-bot openshift-bot merged commit b7eca0d into openshift:master Dec 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants