Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Made build webhooks return new Build name. #12573

Merged
merged 1 commit into from
Mar 17, 2017
Merged

Made build webhooks return new Build name. #12573

merged 1 commit into from
Mar 17, 2017

Conversation

oatmealraisin
Copy link

We now write the name of builds created with webhooks to the body of the
response, which was previously unused. This allows the client to alert
the user of the new build, similarly to if it was created through
non-webhook methods.

Old output:

$ oc start-build --from-webhook https://10.192.217.252:8443/oapi/v1/namespaces/myproject/buildconfigs/ruby-sample-build/webhooks/secret101/generic

New output:

$ oc start-build --from-webhook https://10.192.217.252:8443/oapi/v1/namespaces/myproject/buildconfigs/ruby-sample-build/webhooks/secret101/generic
build "ruby-sample-build-2" started

Fixes bz1373441

@csrwng @wanghaoran1988 ptal?

@bparees
Copy link
Contributor

bparees commented Jan 19, 2017

if the webhook is going to start returning a response body, then:

  1. it's possible that's a breaking change, so we can't do it at all (@smarterclayton ?)
  2. we need to define a struct for the body so we can extend our response object in the future, if we do what this PR proposes we're painted into a corner on extending the response body in the future.

@bparees
Copy link
Contributor

bparees commented Jan 19, 2017

also needs a test case

@smarterclayton
Copy link
Contributor

smarterclayton commented Jan 19, 2017 via email

@bparees
Copy link
Contributor

bparees commented Jan 20, 2017

We have to be careful about returning a body.

can you be more explicit about what it means to be careful in this context?

But if we do, why wouldn't we return the build itself?

makes sense to me.

@oatmealraisin
Copy link
Author

@smarterclayton @bparees We probably should be returning the build, I'll change that. However, I don't think this is a super important fix, because it just tells the user what the next build is. Up to you both.

@bparees
Copy link
Contributor

bparees commented Jan 20, 2017 via email

@oatmealraisin
Copy link
Author

Change it to returning the Build object, also added a test case to test-cmd

@oatmealraisin
Copy link
Author

oatmealraisin commented Jan 20, 2017

@bparees @smarterclayton shall we proceed?

return errors.NewInternalError(fmt.Errorf("could not generate a build: %v", err))
}
// Send back the build name so that the client can alert the user.
newBuildJson, _ := json.Marshal(newBuild)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't throw away errors. I think we should return an internal error, though that my be confusing since the build still got created. @openshift/api-review ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured we wouldn't want to die if we got an error there. Maybe we should just print a warning and write an empty object?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not really "dying", because the build is going to get run regardless, it's about what you return to the caller.

Certainly just proceeding w/ writing the json seems like a mistake since you have no idea what's in there, given that the marshal returned an error.

@bparees
Copy link
Contributor

bparees commented Jan 20, 2017

@oatmealraisin aside from the marshal error it lgtm. Need to get @openshift/api-review opinion on how to handle the marshal error and general sign off on the change to return a body from this api.

@bparees bparees self-assigned this Jan 20, 2017
return errors.NewInternalError(fmt.Errorf("could not generate a build: %v", err))
}
// Send back the build name so that the client can alert the user.
newBuildJson, _ := json.Marshal(newBuild)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this an internal version of the object? If so, then no client will be able to deserialize it. You'd need to go through a runtime.Encode path using a codec.

Also, if a caller is expecting a status object or a build object, writing something other than those things seems wrong (for the reasons @bparees already called out). Writing a status success (the call actually did succeed right?) that includes a message (intended to be human readable) that explains the situation seems more appropriate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the caller is expecting nothing today. Returning a status object that includes the build (if nothing went wrong) sounds like the most flexible course of action.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the caller is expecting nothing today. Returning a status object that includes the build (if nothing went wrong) sounds like the most flexible course of action.

I intended to agree with what you said before about the encoding error case: you shouldn't return unknown content.

Returning a build object (as long as its an external version) seems reasonable for success cases. Returning a status object for encoding failure cases is ok. Most of our API already behaves that way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid an api that potentially returns two different types. The biggest challenge here is it's not exactly a "failure" case. The build was launched, but we can't return you the build object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid an api that potentially returns two different types. The biggest challenge here is it's not exactly a "failure" case. The build was launched, but we can't return you the build object.

That's already a thing. curl https://localhost:8443/api/v1/namespaces/foo if "foo" doesn't exist returns a kind: Status. curl -XDELETE https://localhost:8443/api/v1/namespaces/foo/something/bar has the option of returning either the resource you deleted or a status object on a 200 return.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok cool, thanks for the sample. @oatmealraisin let's do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't use marshal here - you need to use the normal encoder (kapi.Codecs.LegacyCodec().Encode(...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, missed that David already said that in the thread.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks all, I've changed it to use Encode()/Decode()

@oatmealraisin
Copy link
Author

@bparees I've changed to the Encode/Decode logic, also now report all errors. ptal?

[test]

// Send back the build name so that the client can alert the user.
newBuildEncoded, err := runtime.Encode(kapi.Codecs.LegacyCodec(buildapi.SchemeGroupVersion), newBuild)
if err != nil {
return errors.NewInternalError(fmt.Errorf("could not encode build: %v", err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error message should make it clear the build was still started/created.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a short string. Perhaps we should set warning to the err instead?

// Send back the build name so that the client can alert the user.
newBuildEncoded, err := runtime.Encode(kapi.Codecs.LegacyCodec(buildapi.SchemeGroupVersion), newBuild)
if err != nil {
return errors.NewInternalError(fmt.Errorf("Build %s was started, but could not encode build: %v", newBuild.Name, err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"but could not encode response object"

lowercase B in build.

@oatmealraisin
Copy link
Author

@bparees merge pending test?

@bparees
Copy link
Contributor

bparees commented Jan 24, 2017

@oatmealraisin awaiting @openshift/api-review sign off.

return errors.NewInternalError(fmt.Errorf("could not generate a build: %v", err))
}

// Send back the build name so that the client can alert the user.
newBuildEncoded, err := runtime.Encode(kapi.Codecs.LegacyCodec(buildapi.SchemeGroupVersion), newBuild)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version specified here looks wrong. You need to use the version under which you were requested so that a hook under /oapi/v1 returns Build.v1 and /oapi/v2 return Build.v2.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k Could you perhaps point me to an example of how to do this? Is there a util method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k Could you perhaps point me to an example of how to do this? Is there a util method?

You could pass the groupversion to your RESTStorage and just hold it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k Here's where it's created, where would I get the current version from? Not sure what I'm looking for.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k Here's where it's created, where would I get the current version from? Not sure what I'm looking for.

From that call site, you know it will be buildv1.SchemeGroupVersion. Whoever updates it to handle multiple versions (probably platform) will be able to follow your intent from there. Just pass that constant through. Make sure its v1.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deads2k Roger, thanks!

// Send back the build name so that the client can alert the user.
newBuildEncoded, err := runtime.Encode(kapi.Codecs.LegacyCodec(buildapi.SchemeGroupVersion), newBuild)
if err != nil {
return errors.NewInternalError(fmt.Errorf("build \"%s\" was started, but could not encode response object: %v", newBuild.Name, err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks ok to me.

@deads2k
Copy link
Contributor

deads2k commented Jan 24, 2017

@smarterclayton I like the idea, its predictable and consistent. I'll make sure it encodes correctly, but you happy with it otherwise?

@deads2k
Copy link
Contributor

deads2k commented Jan 24, 2017

I'd like to see an integration test that actually ensures the wire encoding is correct.

@smarterclayton
Copy link
Contributor

Barring future changes to support alternative protocols (which is unlikely), I agree with David. Marking as API approved.

@oatmealraisin
Copy link
Author

@deads2k I've added a GroupVersion to the Webhook object that we get from master.go. I've also added a section in the Integration test to check that the returned build object can be decoded. ptal?

body, _ := ioutil.ReadAll(resp.Body)

var newBuild buildapi.Build
if err = runtime.DecodeInto(kapi.Codecs.UniversalDecoder(), body, &newBuild); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks compatibilty with old servers since they don't return the build in the body. Also, be sure you properly handle status errors.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe status errors are handled shortly before this, are there some that would be able to make it to this part of the code?

@@ -656,6 +656,7 @@ func (c *MasterConfig) GetRestStorage() map[string]rest.Storage {
buildConfigWebHooks := buildconfigregistry.NewWebHookREST(
buildConfigRegistry,
buildclient.NewOSClientBuildConfigInstantiatorClient(bcClient),
v1.SchemeGroupVersion,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should point explictly to the external group version from pkg/build/api/v1.

@deads2k
Copy link
Contributor

deads2k commented Jan 27, 2017

Fix up the wiring to use the build specific constant and add a comment on it explaining what it's for and that it needs to match the the API version the storage is going to be used for.

After that, the version encoding looks right, so the API is fine. Pretty sure you broke client compatibility with old servers though and you can't do that.

@oatmealraisin
Copy link
Author

@deads2k Thanks for the review, I've added a check in startbuild.go to make sure we don't break backwards compatibility. Also, I had to add a check in some webhook.go tests due to the error when runtime.Encode fails. ptal?

Copy link
Contributor

@bparees bparees left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also needs a rebase.

// Send back the build name so that the client can alert the user.
if newBuildEncoded, err := runtime.Encode(kapi.Codecs.LegacyCodec(w.groupVersion), newBuild); err != nil {
glog.Warningf("build \"%s\" was started, but could not encode response object: %v", newBuild.Name, err)
//return errors.NewInternalError(fmt.Errorf("build \"%s\" was started, but could not encode response object: %v", newBuild.Name, err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete this.

@@ -299,6 +300,11 @@ func postFile(client restclient.HTTPClient, event, filename, url string, expStat
if resp.StatusCode != expStatusCode {
t.Errorf("Wrong response code, expecting %d, got %d: %s!", expStatusCode, resp.StatusCode, string(body))
}
var newBuild buildapi.Build
if err = runtime.DecodeInto(kapi.Codecs.UniversalDecoder(), body, &newBuild); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump

@@ -97,7 +97,7 @@ os::cmd::expect_success_and_text 'oc start-build --list-webhooks=all bc/ruby-sam
os::cmd::expect_success_and_text 'oc start-build --list-webhooks=all ruby-sample-build' 'github'
os::cmd::expect_success_and_text 'oc start-build --list-webhooks=github ruby-sample-build' 'secret101'
os::cmd::expect_failure 'oc start-build --list-webhooks=blah'
os::cmd::expect_success "oc start-build --from-webhook='$(oc start-build --list-webhooks='generic' ruby-sample-build --api-version=v1 | head -n 1)'"
os::cmd::expect_success_and_text "oc start-build --from-webhook='$(oc start-build --list-webhooks='generic' ruby-sample-build --api-version=v1 | head -n 1)'" "build \"ruby-sample-build-[0-9]\" started"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a test the exercises the failure to instantiate flow.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test that sends an odd webhook, which returns a 400 w/o a body. Is that what you meant?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was thinking of a case where the build can't actually be started because, for example, the imagestream won't resolve.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(the test you added is good too)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oatmealraisin i'm still looking for a test that validates the behavior when the build can't be started because it references an invalid/unresolved imagestream.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed these comments. On it.

@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 22, 2017

// Send back the build name so that the client can alert the user.
if newBuildEncoded, err := runtime.Encode(kapi.Codecs.LegacyCodec(w.groupVersion), newBuild); err != nil {
glog.Warningf("build \"%s\" was started, but could not encode response object: %v", newBuild.Name, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty bad. utilruntime.handleError so sentry smacks us and we go find you :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for letting me know about that, I didn't know that existed :O Did you want this in addition to the warning, or instead of?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead. handlerror does its own logging of the message so you don't need to.

@oatmealraisin
Copy link
Author

oatmealraisin commented Mar 1, 2017

@bparees I've added a few tests, I believe that should cover everything. ptal?

Sorry for the delay, I didn't know I needed to add _ "github.com/openshift/origin/pkg/api/install" to the tests to get Decode to work.

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 3, 2017
@oatmealraisin
Copy link
Author

@bparees @deads2k bump

@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 6, 2017
@oatmealraisin
Copy link
Author

@bparees I added a test that patches ruby-sample-build to use a bad imagestream, and a test that runs that bc and captures the output. ptal?

@oatmealraisin
Copy link
Author

@csrwng ptal?

@oatmealraisin
Copy link
Author

@bparees @csrwng I changed a test to attempt to get the created build instead of using a watch.Interface. The watch was occasionally returning the wrong Build (build-1 instead of build-3). ptal?

}

// TODO: improve negative testing
timer := time.NewTimer(time.Second / 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

increase this timeout. 1/2 second is just begging for flakes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it to 5 minutes

@bparees
Copy link
Contributor

bparees commented Mar 9, 2017

The watch was occasionally returning the wrong Build (build-1 instead of build-3).

it would be good to understand why that is... assuming you create the watch before you create any builds, you the watch should always return the builds in the order they are created. (and the first build the watch should see looks like it would be one created by the imagechangetrigger, then you'd have 3 builds created by the github triggers).

@bparees
Copy link
Contributor

bparees commented Mar 9, 2017

i think the test was always flawed because it does the following:

  1. create an imagestreammapping which will trigger a build
  2. create a build watch
  3. trigger more builds via the hook

1+2 are a race condition, the watch could include the build from (1) or it could not.

the easiest fix would be to create the buildconfig without an imagechangetrigger associated. then the only builds you'll get will be the hook triggered ones.

@bparees
Copy link
Contributor

bparees commented Mar 9, 2017

hm. nm, the buildconfig doesn't have an ICT on it. The reason you're seeing watch events for different builds is because builds go through multiple changes, so you get an event for each one and they could be interlaced. (build created, build pending, build running, etc). The build-1 event you saw when expecting a build-3 event was probably build-1 going to the pending state.

Elsewhere we've dealt with this by explicitly ignoring events for a build once we've seen the event we care about, but i think your approach is ok for the purposes of this test.

@oatmealraisin
Copy link
Author

@bparees I've changed the timeout for the tests, ptal?

@bparees
Copy link
Contributor

bparees commented Mar 15, 2017

lgtm, squash and i'll merge.

@bparees
Copy link
Contributor

bparees commented Mar 16, 2017

[merge]

@oatmealraisin
Copy link
Author

@bparees made the timer too long, shortening to 30sec, we'll see if that passes the test

@bparees
Copy link
Contributor

bparees commented Mar 16, 2017 via email

@oatmealraisin
Copy link
Author

oatmealraisin commented Mar 16, 2017

@bparees My mistake. I looked at the webhookgithub test first, where the timeout is the success case, and mirrored my changes to the generic test, where it is the opposite. I've changed the GitHub test back, and changed the bitbucket test to 3 minutes, which is under the 4 minute timeout that was getting triggered in the first place.

@bparees
Copy link
Contributor

bparees commented Mar 16, 2017

@bparees My mistake. I looked at the webhookgithub test first, where the timeout is the success case, and mirrored my changes to the generic test, where it is the opposite. I've changed the GitHub test back, and changed the bitbucket test to 3 minutes, which is under the 4 minute timeout that was getting triggered in the first place.

  1. i think the webhookgithub timeout of 1/2 second is also too short.. It's so short that the test might pass simply because the "bad" event didn't occur yet. So that one should at least wait, say, 5 seconds, to ensure we are not going to get the event.

  2. I'd reduce the 3 minute timeout to 1 minute, that's more than enough.

We now write the name of builds created with webhooks to the body of the
response, which was previously unused. This allows the client to alert
the user of the new build, similarly to if it was created through
non-webhook methods.

Added a GroupVersion parameter to Webhook object
@oatmealraisin
Copy link
Author

@bparees Changed the timeouts, waiting on tests

@oatmealraisin
Copy link
Author

Flake #12797, [test]

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to 5909f5f

@bparees
Copy link
Contributor

bparees commented Mar 17, 2017

[merge]

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to 5909f5f

@openshift-bot
Copy link
Contributor

openshift-bot commented Mar 17, 2017

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_origin/129/) (Base Commit: e6a87ad) (Image: devenv-rhel7_6082)

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/291/) (Base Commit: 9e19032)

@openshift-bot openshift-bot merged commit 769c079 into openshift:master Mar 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants