Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

template completion detection #15402

Conversation

jim-minter
Copy link
Contributor

@jim-minter jim-minter commented Jul 21, 2017

@jim-minter jim-minter self-assigned this Jul 21, 2017
@jim-minter
Copy link
Contributor Author

@coreydaley fyi

@jim-minter jim-minter force-pushed the trello135-completion-detection branch from bf6c5bd to 606869c Compare July 21, 2017 17:12
@jim-minter
Copy link
Contributor Author

@bparees fyi

@openshift-merge-robot openshift-merge-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jul 24, 2017
@coreydaley
Copy link
Member

@openshift-merge-robot openshift-merge-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 28, 2017
@jim-minter jim-minter force-pushed the trello135-completion-detection branch from 606869c to 52bbbb7 Compare August 2, 2017 21:44
@openshift-merge-robot openshift-merge-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 2, 2017
@jim-minter
Copy link
Contributor Author

@deads2k @smarterclayton @bparees @coreydaley @openshift/devex I've reworked the template completion detection again to follow @smarterclayton's guidance (for now, just a "template.alpha.openshift.io/wait-for-ready": "true" annotation, which as of now works on DeploymentConfigs, Deployments, Jobs and StatefulSets). Tests not implemented yet; feedback would be very welcome.

// WaitForReadyAnnotation indicates that the TemplateInstance controller
// should wait for the object to be ready before reporting the template
// instantiation complete.
WaitForReadyAnnotation = "template.alpha.openshift.io/wait-for-ready"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @smarterclayton's desire was for this to be entirely seamless to the user (no template updates needed), so we should just wait on any object we know how to wait on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with it being opt in for now. Lots of things don't need to wait. But were you planning on putting this annotation on every template we ship ootb?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do - @bparees? Also if this annotation is specifically alpha, what are the implications for documentation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're going to have the annotation be necessary than yeah, i think we'd have to put it on the templates we control (and ideally, on the middleware templates too) to be good citizens.

As for documentation, i think we'd document it in the "writing a template" guide, but we'd need to note it as being tech preview/subject to change. Which of course means we're going to make template authors unhappy when we suddenly stop respecting this value (ie not waiting for completion).

Which, while "legitimate" under the tech preview umbrella, feels like we're just setting ourselves up to make people unhappy.

So i'd still sort of prefer that waiting just be a thing we do, if we know how to wait for the resource.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

talked with @smarterclayton and his preference is to allow users to request waiting on a per resource basis.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this similar to the tolerate-unready-endpoints annotation that is useful for statefulsets? Do we want to switch to an unready concept so that we don't need to add the annotation in all templates?

Copy link
Contributor

@bparees bparees Aug 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now at least we decided we did not want to default to enabled(we want to force users to opt-in), but it is an alpha api so we can leave the door open to revisit that in the future.

deployapi "github.com/openshift/origin/pkg/deploy/apis/apps"
)

func checkDeploymentReadiness(obj runtime.Object) (bool, bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we never got a pointer from @smarterclayton about where upstream was prototyping this sort of logic.. @smarterclayton you did say there was an upstream helper (perhaps in a PR) that was trying to do this sort of readiness checking for various resources, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is code all over e2e tests that basically emulates all of this multiple times. There was an external project that tried to do this and got it wrong.

I had expected a per api group helper upstream (dealing with both internal and unversioned) which exposed a condition that took a deployment / statefulset / daemonset and returned "ready" or "not ready". And maybe a variant that took a client for services and fetched an endpoint or something. I then expected to layer a kubectl wait command around those basic libraries. Eventually, I expected to have a subresource for resources that would encapsulate whatever we learned from kubectl wait.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton some follow up questions:

I had expected a per api group helper upstream (dealing with both internal and unversioned)

do you mean just a utility function, or an actual api endpoint? Sounds like the former.

which exposed a condition that took a deployment / statefulset / daemonset and returned "ready" or "not ready".

exposed a condition how? is this a reference to api conditions, or just a general use of the term? my current interpretation of this is "utility function that takes as inputs a desired condition and a resource object, and returns whether that object meets the desired condition or not".... this utility function would be implemented by various api groups for their resource types accurate? Not sure who is responsible for looking up the utility function to call for a given Object.

And maybe a variant that took a client for services and fetched an endpoint or something.

so in this case the readiness check would be available as an api endpoint... associated w/ the api group presumably?

I then expected to layer a kubectl wait command around those basic libraries. Eventually, I expected to have a subresource for resources that would encapsulate whatever we learned from kubectl wait.

How much of this vision are you requesting be driven via this PR? Or can this PR proceed as is with the intent of adapting it to the aforementioned helpers when they become available?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

answers from @smarterclayton, roughly interpreted:

this is fine for now for detection logic, but the follow up is to do an upstream pull to introduce the various resource checking logic as a utility function.

and longer term expose that logic as an api.

if err != nil {
glog.V(4).Infof("TemplateInstance controller: checkReadiness %s returned %v", key, err)

templateInstance.DeleteCondition(templateapi.TemplateInstanceWaiting)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the philosophy behind Conditions was that you just add/update them? Should we not move the Waiting condition back to false instead of deleting it? (obviously this requires other logic in the controller for deciding whether it needs to process a template or not..presumably that could be checking for the Sucess || Failed conditions).

}

if allReady {
templateInstance.DeleteCondition(templateapi.TemplateInstanceWaiting)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same condition comment/question here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also wondering if checkReadiness should really be doing this, might be better for it just to return true/false and let the caller decide what to do w/ that information? Updating the template instance feels like a side-effect here.

Status: kapi.ConditionTrue,
Reason: "Waiting",
Message: "Waiting for template objects to become ready",
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be re-asserted? why? (and same concerns about updating the templateinstance from within this function)

templateInstance.SetCondition(templateapi.TemplateInstanceCondition{
Type: templateapi.TemplateInstanceReady,
Status: kapi.ConditionTrue,
Reason: "Created",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would move this outside this function also.

@jim-minter
Copy link
Contributor Author

@bparees thanks - fresh eyes helped! Updated; awaiting comment from @smarterclayton and tests

@jim-minter jim-minter force-pushed the trello135-completion-detection branch from 6a4d147 to aa216ca Compare August 4, 2017 22:08
@bparees
Copy link
Contributor

bparees commented Aug 4, 2017

open discussion about how/where to structure the actual resource-specific checking logic not withstanding, the framework lgtm.

@jim-minter
Copy link
Contributor Author

/retest

@jim-minter
Copy link
Contributor Author

@openshift/api-review (@smarterclayton) please can you sign off on this on the api-review side officially?

@bparees bparees changed the title WIP completion detection template completion detection Aug 11, 2017
@jim-minter jim-minter force-pushed the trello135-completion-detection branch from aa216ca to 19777c5 Compare August 11, 2017 16:44
@openshift-merge-robot openshift-merge-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 11, 2017
@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 15, 2017
@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 15, 2017
@jim-minter jim-minter force-pushed the trello135-completion-detection branch from c84d271 to 8414c08 Compare August 15, 2017 20:35
@openshift-merge-robot openshift-merge-robot removed the lgtm Indicates that a PR is ready to be merged. label Aug 15, 2017
@bparees
Copy link
Contributor

bparees commented Aug 15, 2017

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 15, 2017
@jim-minter jim-minter force-pushed the trello135-completion-detection branch from 8414c08 to 3f0e488 Compare August 15, 2017 21:58
@openshift-merge-robot openshift-merge-robot removed the lgtm Indicates that a PR is ready to be merged. label Aug 15, 2017
@bparees
Copy link
Contributor

bparees commented Aug 15, 2017

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 15, 2017
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 16, 2017
@jim-minter jim-minter force-pushed the trello135-completion-detection branch from 3f0e488 to 54a58e9 Compare August 16, 2017 15:16
@openshift-merge-robot openshift-merge-robot removed lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 16, 2017
@bparees
Copy link
Contributor

bparees commented Aug 16, 2017

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2017
@jim-minter jim-minter force-pushed the trello135-completion-detection branch from 54a58e9 to c0db5d8 Compare August 16, 2017 16:19
@openshift-merge-robot openshift-merge-robot removed the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2017
@bparees
Copy link
Contributor

bparees commented Aug 16, 2017

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2017
@openshift-merge-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bparees, jim-minter

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-merge-robot
Copy link
Contributor

/test all [submit-queue is verifying that this PR is safe to merge]

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue

@openshift-merge-robot openshift-merge-robot merged commit f970ded into openshift:master Aug 16, 2017
// ensuring that it does not contain conditions of condType.
func FilterTemplateInstanceCondition(conditions []TemplateInstanceCondition, condType TemplateInstanceConditionType) []TemplateInstanceCondition {
newConditions := make([]TemplateInstanceCondition, 0, len(conditions)+1)
func (templateInstance *TemplateInstance) HasCondition(typ TemplateInstanceConditionType, status kapi.ConditionStatus) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@openshift/api-review we missed these guys sneaking in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-approved approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. needs-api-review size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants