
allow template instance controller to create CRDs #19396

Merged — 1 commit merged into openshift:master on May 4, 2018

Conversation

bparees (Contributor) commented Apr 17, 2018:

Still need to resolve https://bugzilla.redhat.com/show_bug.cgi?id=1568607

merging this allows the creation of CRDs, but GC is still going to do weird things.

fixes #19368

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 17, 2018
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 17, 2018
```diff
@@ -47,7 +49,7 @@ objects:
   ports:
   - name: port
     port: 1234
-- apiVersion: v1
+- apiVersion: route.openshift.io/v1
```
bparees (author):
@liggitt I think this template covers all the variants you were concerned w/ now:

  1. openshift api object by group (route)
  2. openshift api object by legacy (service)
  3. k8s api object by standard api (configmap)
  4. k8s api object by group (CRD)
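A template covering the four variants above might look roughly like this (a minimal sketch; the object names and the CRD group/kind are hypothetical, and only the identifying fields are shown):

```yaml
apiVersion: template.openshift.io/v1
kind: Template
metadata:
  name: variant-coverage              # hypothetical name
objects:
- apiVersion: route.openshift.io/v1   # 1. openshift object by group
  kind: Route
  metadata:
    name: route
- apiVersion: v1                      # 2. legacy (ungroupified) API
  kind: Service
  metadata:
    name: service
- apiVersion: v1                      # 3. k8s object by standard API
  kind: ConfigMap
  metadata:
    name: configmap
- apiVersion: apiextensions.k8s.io/v1beta1   # 4. k8s object by group (CRD)
  kind: CustomResourceDefinition
  metadata:
    name: widgets.example.com         # hypothetical CRD
```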

Contributor:

> openshift api object by legacy (service)

service is a kube object

bparees (author):

yeah dunno what i was thinking. updated w/ a second route object that uses the legacy api.

```go
rbac.NewRule("create").Groups(rbac.APIGroupAll).Resources(rbac.ResourceAll).RuleOrDie(),
rbac.NewRule("get").Groups(rbac.APIGroupAll).Resources(rbac.ResourceAll).RuleOrDie(),
// delete needed to be able to set ownerrefs on objects
rbac.NewRule("delete").Groups(rbac.APIGroupAll).Resources(rbac.ResourceAll).RuleOrDie(),
```
bparees (author):

@smarterclayton @liggitt @deads2k let the debate begin...

Contributor:

For blockOwnerDeletion? I thought it was update of */finalizers

bparees (author):

The error was something like you can't set an owner ref on a resource type you don't have the power to delete. Made sense to me since setting an owner ref indirectly allows you to delete the object later (assuming you can delete the thing the owner ref points to)
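The behavior described matches the Kubernetes OwnerReferencesPermissionEnforcement admission plugin, which requires delete permission on an object in order to write its metadata.ownerReferences (and update on the owner's finalizers subresource to set blockOwnerDeletion). A minimal sketch of the kind of ownerRef the controller sets on created objects (all field values here are hypothetical placeholders):

```yaml
metadata:
  ownerReferences:
  - apiVersion: template.openshift.io/v1
    kind: TemplateInstance
    name: example-instance                       # hypothetical
    uid: 00000000-0000-0000-0000-000000000000    # placeholder
    blockOwnerDeletion: true
```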

Contributor:

That makes sense on an update, I didn't realize that was required to set an ownerRef on create. Is that what you're seeing?

bparees (author):

I believe so but I didn't look at it super closely. (given that I didn't grant the controller update permissions, though, it seems like it would have to have been on create)

```go
rbac.NewRule("create").Groups(rbac.APIGroupAll).Resources(rbac.ResourceAll).RuleOrDie(),
rbac.NewRule("get").Groups(rbac.APIGroupAll).Resources(rbac.ResourceAll).RuleOrDie(),
// delete needed to be able to set ownerrefs on objects
rbac.NewRule("delete").Groups(rbac.APIGroupAll).Resources(rbac.ResourceAll).RuleOrDie(),
```
Contributor:

How about #19404 instead?

bparees (author):

sure, that seems like it would work for this.

bparees (author):

(for delete specifically. still need create+get though)

```diff
@@ -163,6 +163,11 @@ func init() {
 	rbac.NewRule("create").Groups(kAuthzGroup).Resources("subjectaccessreviews").RuleOrDie(),
 	rbac.NewRule("update").Groups(templateGroup).Resources("templateinstances/status").RuleOrDie(),
 	rbac.NewRule("update").Groups(templateGroup).Resources("templateinstances/finalizers").RuleOrDie(),
+	// template instance controller needs to be able to create and get any resource the user is allowed to create.
+	rbac.NewRule("create").Groups(rbac.APIGroupAll).Resources(rbac.ResourceAll).RuleOrDie(),
```
Contributor:

Wow, that's broad.

liggitt (Contributor) commented Apr 18, 2018:

yeah, this allows escalation flows like:

  • user authorized to create builds can now create custom builds (user passes create builds SAR check, template service broker passes create builds/custom check during actual create)
  • user authorized to create roles or create rolebindings can now create or bind a role containing any creation permission (user passes create roles SAR check, template service broker passes RBAC escalation check)

bparees (author):

like i said, let the debate begin :)

the alternative I can see is to switch to using impersonation instead of SAR checks, which also has the advantage of fixing some other potential issues @liggitt noted, where the user might have permission to create X, but isn't allowed to create X when field Y is Z (e.g. privileged pods). Today that's not an issue since the controller SA can't create that either, but if we ever have a case where the controller SA can pass a particular admission control that the user can't pass, the SAR check isn't sufficient anymore and impersonation would be a better solution.

Contributor:

It seems like this is a good target for cluster role aggregation. It wouldn't eliminate cases like jordan indicated, but at least you wouldn't accidentally have the power to create a node or an oauthtoken.
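Cluster role aggregation lets the controller's ClusterRole be assembled automatically from other labeled ClusterRoles, so its permissions grow only as a cluster-admin opts in. A minimal sketch of what that could look like (the role name and label key are hypothetical, not what this PR uses):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: template-instance-controller              # hypothetical
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      template.openshift.io/aggregate-to-controller: "true"   # hypothetical label
rules: []   # populated automatically from ClusterRoles carrying the label
```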

bparees (author):

i'm not familiar w/ cluster role aggregation?

bparees (author):

That would mean granting those new permissions to everyone with "admin". That doesn't seem like a desirable model. (Nor, in general, does requiring the cluster-admin to go modify permissions to make Templates do what they are supposed to be able to do, strike me as very user friendly).

Also what is behind the dislike of impersonation? It seems like a better solution than what we've done already, nevermind the growing complexity/limitations as we are now seeing as we try to extend it.

Contributor:

> That would mean granting those new permissions to everyone with "admin". That doesn't seem like a desirable model.

That is the only model that makes any sense from a normal user perspective. Consider if a broker creates a CRD and no user has access to it. The CRD is basically useless at that point. Without aggregation, there is also no way for the user to create a role / binding to the CRD. Thus they basically have to use the TSB as an escalation enabler.

> (Nor, in general, does requiring the cluster-admin to go modify permissions to make Templates do what they are supposed to be able to do, strike me as very user friendly).

Seems like the exact UX I would want as a cluster admin. Normal users do not get to "install" new APIs into the cluster and say they "own" those new APIs. The cluster admin gets to opt in to that. Afterwards, users are free to create instances of those CRDs just like any other resource they have access to.

> Also what is behind the dislike of impersonation? It seems like a better solution than what we've done already, nevermind the growing complexity/limitations as we are now seeing as we try to extend it.

Because it turns the TSB into a component that is an indirect cluster admin (forever). In general we do not have a good way to state "do X as user Y but do not let me escalate into permissions I do not already have - i.e. make sure RBAC / admission / etc. would let both of us do this thing".

bparees (author):

> That is the only model that makes any sense from a normal user perspective. Consider if a broker creates a CRD and no user has access to it. The CRD is basically useless at that point.

Not if I only want some subset of my users to have access to it.

> Seems like the exact UX I would want as a cluster admin. Normal users do not get to "install" new APIs into the cluster and say they "own" those new APIs. The cluster admin gets to opt in to that. Afterwards, users are free to create instances of those CRDs just like any other resource they have access to.

Again you're assuming I want all users in my cluster to be able to do this thing. If I don't want that, then I need to grant the TSB a different role (not admin) so I can grant access on a more fine-grained level. This is a mess. The TSB should not need special permissions to create things; it should be able to create anything the user is allowed to create. Hence why impersonation is the right solution.

> Because it turns the TSB into a component that is an indirect cluster admin (forever).

So? As long as the TSB doesn't take any action that isn't impersonating the user requesting the action, who cares?

Contributor:

> Not if I only want some subset of my users to have access to it.

If you want a more nuanced flow, then I am fine with you having to do it manually. I agree with @deads2k's assessment that the TSB is a user-focused tool and not a cluster-admin-focused tool.

> So? As long as the TSB doesn't take any action that isn't impersonating the user requesting the action, who cares?

Right, as long as the TSB has no bugs and there are no unforeseen paths that cause it to misbehave (or expose its token), it is perfectly fine. I want fewer components with root on the cluster, not more.

bparees (author):

> If you want a more nuanced flow, then I am fine with you having to do it manually.

I understand where you're coming from, but here's how that statement comes across to me: "i'm fine with @bparees spending his day explaining to customers how to make this work because we wouldn't let him/couldn't give him a way to do it that would have just worked out of the box."

> I agree with @deads2k's assessment that the TSB is a user-focused tool and not a cluster-admin-focused tool.

It's precisely because it's user focused that a cluster-admin should not have to go manage the TSB to enable a user to do things they can already do with their user account. The TSB is just another api for creating resources.

It's pretty easy for me to envision a cluster administrator creating a class of users who can create a particular CRD instance (e.g. people allowed to instantiate a particular operator), and those users wanting to be able to create those CRD instances via a template. With your proposal the administrator would have to understand that they need to go explicitly grant said permission to the TSB SA (rather than just granting it to the new class of user). And they're not going to realize that (even if we doc it) until it doesn't work. And then they're going to forget to do it when they stand up their second cluster. It's "just one more thing" on top of the way too many things we already have.

> Right, as long as the TSB has no bugs and there are no unforeseen paths that cause it to misbehave (or expose its token), it is perfectly fine. I want fewer components with root on the cluster, not more.

I know, and ideally you'd like the cluster to be on a hard drive that has been ripped out of the PC and melted down for scrap, which would make all the potential security holes go away. But we also have to provide a good user experience.

deads2k (Contributor) commented Apr 18, 2018:

> merging this allows the creation of CRDs, but GC is still going to do weird things.

Why has this become an objective? Allowing the creation of CRDs takes the templateservicebroker from a user focused tool to a cluster-admin focused tool.

bparees (author) commented Apr 18, 2018:

> Why has this become an objective? Allowing the creation of CRDs takes the templateservicebroker from a user focused tool to a cluster-admin focused tool.

The ASB team seems to have a need to do this (see #19368, which started all of this), but even if we don't want to create CRDs, we do need to be able to create arbitrary resource types that might be introduced in the future.

```diff
@@ -284,6 +297,10 @@ func (c *TemplateInstanceController) Run(workers int, stopCh <-chan struct{}) {
 		go wait.Until(c.runWorker, time.Second, stopCh)
 	}

+	go func() {
```
Contributor:

double goroutines is odd

bparees (author):

cut+paste error, fixed.

```diff
@@ -10646,7 +10648,7 @@ objects:
   ports:
   - name: port
     port: 1234
-- apiVersion: v1
+- apiVersion: route.openshift.io/v1
```
Contributor:

we still want a v1 openshift object that would need to be submitted to /oapi (service is kube)

Contributor:

and a groupified openshift object

```diff
@@ -59,6 +61,31 @@ objects:
     to:
       kind: Service
       name: service
+- apiVersion: apiextensions.k8s.io/v1beta1
+  kind: CustomResourceDefinition
```
Contributor:

until the cluster-scoped->namespaced and cross-namespace ownerRef issues are settled, the TSB shouldn't set up objects that result in those invalid relationships

bparees (author):

that's fair, but i'd kinda rather put the test in place now (the test passes) while i'm in here.

arguably we don't want to merge this PR at all until we sort out the GC question of what happens if templateinstance creates a CRD since we're just setting people up for undefined behavior.

Contributor:

I actually wanted to put code in place to prevent those relationships (the TSB will happily set them up now, and they don't work, and I don't anticipate ownerRef API changes to support them in 3.10 timeframe)

@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 18, 2018
@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 18, 2018
enj (Contributor) commented Apr 18, 2018:

@openshift/sig-security

bparees (author) commented Apr 18, 2018:

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 18, 2018
simo5 (Contributor) commented Apr 19, 2018:

Is there somewhere a write-up of the reasons why the TSB is getting into this business? We should really figure that out before any more work in this direction is done in the TSB or elsewhere.

bparees (author) commented Apr 19, 2018:

> Is there somewhere a write-up of the reasons why the TSB is getting into this business?

The TSB is already in this business. The TSB is an api for creating resources on your cluster. The fact that there are resources it cannot create is, imho, a bug.

@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 19, 2018
bparees (author) commented Apr 27, 2018:

/retest

1 similar comment
bparees (author) commented Apr 27, 2018:

/retest

bparees (author) commented Apr 27, 2018:

@deads2k ok it passed w/ the extra wrapper deleted. final review please?

bparees (author) commented Apr 28, 2018:

/retest

1 similar comment
bparees (author) commented Apr 28, 2018:

/retest

```go
restmapper     meta.RESTMapper
config         *rest.Config
templateClient templateclient.Interface
restmapper     meta.RESTMapper
```
Contributor:

why do you have two restmappers.

bparees (author):

same as below, we're using a different one for readiness checking. possibly it should all be switched to the dynamic mapper + dynamic(json?) client.

Contributor:

> same as below, we're using a different one for readiness checking. possibly it should all be switched to the dynamic mapper + dynamic(json?) client.

You can keep the extra config, but only one restmapper. Compiled restmappers are being removed in 1.11.

bparees (author):

restmapper removed. dynamic mapping for everyone.

```go
templateClient    templateclient.Interface
restmapper        meta.RESTMapper
dynamicRestMapper *discovery.DeferredDiscoveryRESTMapper
config            *rest.Config
```
Contributor:

why do you have two configs?

bparees (author):

one is for json and one is not... we added the json one to handle unstructured content. the bulk mapper for checkReadiness is still using the non-json one:

```go
cli, err := bulk.ClientMapperFromConfig(c.config).ClientForMapping(mapping)
```

is it safe to switch it to the json based config?

bparees (author):

additional context on what we do w/ the client we retrieve:

```go
obj, err := cli.Get().Resource(mapping.Resource).NamespaceIfScoped(object.Ref.Namespace, mapping.Scope.Name() == meta.RESTScopeNameNamespace).Name(object.Ref.Name).Do().Get()
```

bparees (author) commented Apr 30, 2018:

extra mapper removed in new commit

bparees (author) commented Apr 30, 2018:

/retest

1 similar comment
bparees (author) commented May 1, 2018:

/retest

bparees (author) commented May 2, 2018:

/retest

1 similar comment
bparees (author) commented May 2, 2018:

/retest

```go
config            *rest.Config
templateClient    templateclient.Interface
dynamicRestMapper meta.RESTMapper
config            *rest.Config
```
Contributor:

Technically this can be done with a codec against the dynamic client and can remove the kubectl dependencies this places in our controllers. Doesn't have to be done here. A TODO would be good.

bparees (author):

added todo

deads2k (Contributor) commented May 3, 2018:

/lgtm

might want to squash the commit named for me.

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 3, 2018
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label May 3, 2018
openshift-ci-robot:

New changes are detected. LGTM label has been removed.

bparees (author) commented May 3, 2018:

added todo and squashed, re-applying lgtm

@bparees bparees added the lgtm Indicates that a PR is ready to be merged. label May 3, 2018
bparees (author) commented May 3, 2018:

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 3, 2018
bparees (author) commented May 3, 2018:

/retest

openshift-ci-robot commented May 4, 2018:

@bparees: The following test failed, say /retest to rerun them all:

Failed test: ci/openshift-jenkins/unit (commit fc9893b) — rerun with /test unit

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit d859add into openshift:master May 4, 2018
@bparees bparees deleted the template_crds branch May 19, 2018 20:26
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. sig/security size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Successfully merging this pull request may close these issues.

Template Service Broker fails to provision CRDs
8 participants