Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pausing in deploymentconfigs #9086

Merged
merged 3 commits into from
Jun 9, 2016
Merged

Add pausing in deploymentconfigs #9086

merged 3 commits into from
Jun 9, 2016

Conversation

0xmichalis
Copy link
Contributor

@0xmichalis 0xmichalis commented May 30, 2016

@@ -311,6 +311,10 @@ type DeploymentConfigSpec struct {
// or failing. Post strategy hooks and After actions can be used to integrate successful deployment with an action.
Test bool

// Paused indicates that the deployment config is paused resulting in no new deployments on template
// changes or changes in the template caused by other triggers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it also mean I can't do oc deploy --latest ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the command will succeed but nothing will be deployed until the config is un-paused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am disallowing oc deploy --latest for paused deployments. Regarding pausing the deployer, as per @smarterclayton's comments, we probably need to give it more thought (and work) so I guess the first version of oc rollout pause will not pause the deployer if there is one running.

@mfojtik
Copy link
Contributor

mfojtik commented May 31, 2016

@Kargakis does this also affect the deployments that are currently running? IOW. what happen when I go update the DC in a middle of rolling deployment? Should it pause the active deployment as well?

@0xmichalis
Copy link
Contributor Author

0xmichalis commented May 31, 2016

@Kargakis does this also affect the deployments that are currently running? IOW. what happen when I go update the DC in a middle of rolling deployment?

Regarding running deployments, that's the plan. We are going to pause all deployer pods (I am already doing this in #8927). It needs a bit more thought though. My issue is how to let the paused deployer know that it is paused so that it wont take that time into account when computing its progress.

Should it pause the active deployment as well?

Regarding active ie. complete deployments, then pausing should remove the ability to create a new deployment (scaling should continue working so autoscalers can continue doing their thing)

@smarterclayton
Copy link
Contributor

We should probably split the work into two parts (as I think you're
saying below) - support pausing without affecting the current
deployment, and then support pausing in a running deployment.

The latter is harder because we need to be able to be paused for much
longer, which means we have to start worrying about the node dying
under us and the deployer pod being deleted (which we probably
should deal with today, but in practice hasn't yet been an issue).
To do that, deployments need to be reentrant and we have to start
distinguishing between deployments that fail internally (especially
custom deployments) and that fail due to the underlying node. That
sounds like a lot of work.

@0xmichalis
Copy link
Contributor Author

0xmichalis commented May 31, 2016

To do that, deployments need to be reentrant and we have to start distinguishing between deployments that fail internally (especially custom deployments) and that fail due to the underlying node.

Are you saying that internal failures may delete the deployer pod? Apart from custom deployments, how is this possible? Otherwise, node goes down -> deployer is deleted. Internal failure -> deployer pod stays around. We have some form of reentracy currently: if the latest deployment is deleted, it will be recreated, albeit we have an issue about it: #2809

@smarterclayton
Copy link
Contributor

On May 31, 2016, at 11:11 AM, Michail Kargakis [email protected]
wrote:

To do that, deployments need to be reentrant and we have to start
distinguishing between deployments that fail internally (especially custom
deployments) and that fail due to the underlying node.

Are you saying that internal failures may delete the deployer pod? Apart
from custom deployments, how is this possible?

No, I'm saying that we can't tell the difference today between "hook pod
deleted because the node was lost" and "custom deployment step failed".
Deployer pod will stay around.

Components that assume the pod can't be deleted at any time will eventually
be surprised.

Otherwise, node goes down -> deployer is deleted. Internal failure ->
deployer pod stays around. We have some form of reentrace currently: if the
latest deployment is deleted, it will be recreated, albeit we have an issue
about it: #2809 #2809

@0xmichalis 0xmichalis changed the title [WIP] Add pausing in deploymentconfigs Add pausing in deploymentconfigs Jun 1, 2016
@0xmichalis
Copy link
Contributor Author

@mfojtik @ironcladlou I have added an extended test and blocked part of our cli tooling (oc deploy, oc rollback) from working with paused deployments. In #8927, I will add oc rollout {pause,resume}. PTAL

cc: @fabianofranz but I guess since the console cannot instantiate deployments or rollback, you don't really need to do anything atm regarding paused deployments.

@openshift/api-review for the api

@@ -211,6 +211,9 @@ func (o DeployOptions) RunDeploy() error {
// deploy launches a new deployment unless there's already a deployment
// process in progress for config.
func (o DeployOptions) deploy(config *deployapi.DeploymentConfig, out io.Writer) error {
if config.Spec.Paused {
return fmt.Errorf("cannot deploy a paused deploymentconfig")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deployment configuration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "deploymentconfig" is fine as we already return it for anything cli-related. Try creating, or editing a dc and you will get "deploymentconfig" back. Cli users that pause their configs are probably familiar with our api too (/deploymentconfigs/namespace/name)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When describing to end users we use "deployment config" or "deployment
configuration". When we present commands or other semi-actionable output
(oc status) we use "dc/*". We shouldn't use "deploymentconfig" ever.

On Wed, Jun 1, 2016 at 2:57 PM, Michail Kargakis [email protected]
wrote:

In pkg/cmd/cli/cmd/deploy.go
#9086 (comment):

@@ -211,6 +211,9 @@ func (o DeployOptions) RunDeploy() error {
// deploy launches a new deployment unless there's already a deployment
// process in progress for config.
func (o DeployOptions) deploy(config *deployapi.DeploymentConfig, out io.Writer) error {

  • if config.Spec.Paused {
  •   return fmt.Errorf("cannot deploy a paused deploymentconfig")
    

I think "deploymentconfig" is fine as we already return it for anything
cli-related. Try creating, or editing a dc and you will get
"deploymentconfig" back. Cli users that pause their configs are probably
familiar with our api too (/deploymentconfigs/namespace/name)


You are receiving this because you are on a team that was mentioned.
Reply to this email directly, view it on GitHub
https://github.com/openshift/origin/pull/9086/files/bbb3b59832e3980da80e07404d17b43aad07d6a4#r65421656,
or mute the thread
https://github.com/notifications/unsubscribe/ABG_p7oRPMSaKRs-ctOovTCykzZVnlWWks5qHdYxgaJpZM4Ip6MH
.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't use "deploymentconfig" ever.

The thing is that we already do it. For all I know, "deploymentconfig" is valid input in the cli and "deployment config" or "deployment configuration" is not.

@0xmichalis 0xmichalis closed this Jun 2, 2016
@0xmichalis 0xmichalis reopened this Jun 2, 2016
@0xmichalis
Copy link
Contributor Author

[test]

@fabianofranz
Copy link
Member

@Kargakis actually you can start, cancel and rollback in the web console. So @jwforres @spadgett we may need to evaluate how this affects the action buttons we currently have (maybe even have a new "pause" action).

@0xmichalis
Copy link
Contributor Author

@Kargakis actually you can start, cancel and rollback in the web console. So @jwforres @spadgett we may need to evaluate how this affects the action buttons we currently have (maybe even have a new "pause" action).

All of those buttons should be disabled in case the deployment config is paused.

@openshift-bot openshift-bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 2, 2016
@0xmichalis
Copy link
Contributor Author

This is ready for a final look

@0xmichalis
Copy link
Contributor Author

test flake about to come: #9155

@@ -352,6 +361,9 @@ func (o DeployOptions) cancel(config *deployapi.DeploymentConfig, out io.Writer)

// reenableTriggers enables all image triggers and then persists config.
func (o DeployOptions) reenableTriggers(config *deployapi.DeploymentConfig, out io.Writer) error {
if config.Spec.Paused {
return fmt.Errorf("cannot enable triggers for a paused deployment config")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't enforced on the server and should be benign anyway (since triggering shouldn't have an effect when paused). Seems safe to let users continue to mutate their spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want paused to have triggers set to automatic:false. I think we need to enforce it on the server too. When a deployment config is paused then we shouldnt' allow updates on automatic: false to automatic: true or the addition of a config change trigger.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we not modeling paused as a new field that overrides trigger state?

On Mon, Jun 6, 2016 at 9:53 AM, Michail Kargakis [email protected]
wrote:

In pkg/cmd/cli/cmd/deploy.go
#9086 (comment):

@@ -352,6 +361,9 @@ func (o DeployOptions) cancel(config *deployapi.DeploymentConfig, out io.Writer)

// reenableTriggers enables all image triggers and then persists config.
func (o DeployOptions) reenableTriggers(config *deployapi.DeploymentConfig, out io.Writer) error {

  • if config.Spec.Paused {
  •   return fmt.Errorf("cannot enable triggers for a paused deployment config")
    

We want paused to have triggers set to automatic:false. I think we need
to enforce it on the server too. When a deployment config is paused then we
shouldnt' allow updates on automatic: false to automatic: true or the
addition of a config change trigger.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/openshift/origin/pull/9086/files/3d2699cf1c43a644b1af175d23b5291c6956a708#r65893001,
or mute the thread
https://github.com/notifications/unsubscribe/ABG_p_VdGeCmPZZVO09qsqDM5K1pNQYZks5qJCZ0gaJpZM4Ip6MH
.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I just realized that automatic and a config change trigger can be whatever they want, paused supersedes them. I think it's better to supersede than try to align all values. I guess it's ok to remove the restriction here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we not modeling paused as a new field that overrides trigger state?

It overrides, this was an oversight.

@0xmichalis
Copy link
Contributor Author

@ironcladlou @smarterclayton any other comments here?

@smarterclayton
Copy link
Contributor

No, I was fine with this. Dan has final sign off.

On Mon, Jun 6, 2016 at 10:09 AM, Michail Kargakis [email protected]
wrote:

@ironcladlou https://github.com/ironcladlou @smarterclayton
https://github.com/smarterclayton any other comments here?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#9086 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/ABG_p-SJCvqipVuWG_eWtUWp581xIbscks5qJCoegaJpZM4Ip6MH
.


out, err = oc.Run("deploy").Args(resource, "--enable-triggers").Output()
o.Expect(err).To(o.HaveOccurred())
o.Expect(out).To(o.ContainSubstring("cannot enable triggers for a paused deploymentconfig"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this restriction was lifted (according to the latest code and #9086 (comment))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, removed the test too

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to 064936b

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/4520/)

@ironcladlou
Copy link
Contributor

LGTM!

@0xmichalis
Copy link
Contributor Author

[merge]

@0xmichalis
Copy link
Contributor Author

flaked on #9217
[merge]

@0xmichalis
Copy link
Contributor Author

same flake again [merge]

@0xmichalis
Copy link
Contributor Author

yum flake [merge]

@openshift-bot
Copy link
Contributor

openshift-bot commented Jun 8, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/4642/) (Image: devenv-rhel7_4339)

@0xmichalis
Copy link
Contributor Author

[merge]

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to 064936b

@openshift-bot openshift-bot merged commit ad36e24 into openshift:master Jun 9, 2016
@0xmichalis 0xmichalis deleted the add-pausing-in-dcs branch June 9, 2016 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants