Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable a simple image policy admission controller #8995

Merged
merged 4 commits into from
Aug 12, 2016

Conversation

smarterclayton
Copy link
Contributor

@smarterclayton smarterclayton commented May 24, 2016

This adds an admission controller to OpenShift which can do the following:

  1. Allow admins to reject images that do not resolve to images stored on the cluster
  2. Allow admins to reject images with certain annotations, docker image labels, or image labels
  3. Allow admins to force images to associate a resource cost with an image (whether arbitrary counted, like "license.jboss" = "1", or an existing resource like "cpu")
  4. Allow images to force images to have a node selector constraint associated with them, based on a hardcoded value.

Open items:

  1. Allow an admin to bypass policy rules on a per namespace basis
  2. Allow an admin/service account to bypass policy rules by virtue of privilege (the ability to mutate images may be considered sufficient)
  3. Allow tolerations to be enforced by virtue of image metadata
  4. Check permissions on the image before allowing the user to create a pod with it.

@deads2k
Copy link
Contributor

deads2k commented May 24, 2016

In addition to signatures, we have an annotation we want to enforce against: images.openshift.io/deny-execution (takes a bool). It seems like we could enforce that unconditionally if we wanted to.

I think we'll want a pedigree check: who made this image? This could either be a check for an author claim in a signature or a subject check of the signature itself. The problem that I see is timing. We can't sign inside the builder (doing so would expose the key), so we need to have the build controller do the signing. That means we can't tag the image until after the build controller has signed it. I'm not sure of the order today @bparees

We might want multiple levels of signature checking. I think that people may want more control over which images are allowed to run as root. Combined with exec denial privileges and enforcement of entry points, this could allow a cluster-admin to curate a list of images that unprivileged users and SA could be allowed to run as root.

I think one significant question is whether we like signatures better than annotations on an image? For pedigree checks I think it makes sense to prevent someone with API access from lying about who created the image. For straight denial of execution, I don't think it buys much. I can see whitelisting falling in the in-between case. I suspect that for most people, annotations controlled by API access are enough. Others may want the image to be signed by a particular subject, but I think I'd expect most of the "this image can run" logic to be held by the signer and we simply trust that subject knew what they were doing.

@bparees
Copy link
Contributor

bparees commented May 24, 2016

so we need to have the build controller do the signing. That means we can't tag the image until after the build controller has signed it. I'm not sure of the order today @bparees

order is:

  1. build image
  2. tag locally
  3. push to registry

@deads2k
Copy link
Contributor

deads2k commented May 24, 2016

  1. tag locally
  2. push to registry

Are these step done by the build controller or (more likely) the build pod?

@smarterclayton
Copy link
Contributor Author

As a Rorschach test for image policy this is already effective.

@bparees
Copy link
Contributor

bparees commented May 24, 2016

Are these step done by the build controller or (more likely) the build pod?

pod.

@deads2k
Copy link
Contributor

deads2k commented May 24, 2016

As a Rorschach test for image policy this is already effective.

Wait, if that's not what you had in your head, what were you thinking?

@smarterclayton
Copy link
Contributor Author

smarterclayton commented May 24, 2016 via email

@smarterclayton
Copy link
Contributor Author

smarterclayton commented May 24, 2016 via email

@deads2k
Copy link
Contributor

deads2k commented May 24, 2016

There have been a number of issues and discussions, I'm just enjoying
that I can open an empty shell and people congregate.

Now I feel dirty and used! :)

The build controller would probably be the signer.

I like the build controller as the signer, but the images are only available locally on the given node that ran the build. Maybe we could push to some sort of internal staging ImageStream? That would work, but it would produce quota problems.

@smarterclayton
Copy link
Contributor Author

I think we'll want a pedigree check: who made this image? This could either be a check for an author claim in a signature or a subject check of the signature itself. The problem that I see is timing. We can't sign inside the builder (doing so would expose the key), so we need to have the build controller do the signing. That means we can't tag the image until after the build controller has signed it. I'm not sure of the order today @bparees

Unfortunately that's not really enough - most people won't enforce on the top level - they'll enforce on the root level (base) which requires us to do a join against image layers.

I think that people may want more control over which images are allowed to run as root.

That's a good scenario - something that ultimately may overlap with PSP in ways we haven't discussed yet. Certainly any sort of "auto installer" style pod needs to come from a trusted root by default. @pweil- so he's aware.

I think one significant question is whether we like signatures better than annotations on an image?

I think annotations are the primary use case, and signatures would be the secondary. We would allow someone to set constraints (based on the generic metadata extracted) and check those. I'm not sure I'm interested in deep policy engine behavior on signatures in the near term. Requiring signed by or even just is signed is probably a high bar. I'd prefer to have an external engine set annotations that come from signatures if necessary.

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 24, 2016
@smarterclayton
Copy link
Contributor Author

I like the build controller as the signer, but the images are only
available locally on the given node that ran the build. Maybe we could push
to some sort of internal staging ImageStream? That would work, but it would
produce quota problems.

The problem is you can't trust the system that signed the image, any more
than you can trust that system to report back to you that it signed it. So
signing has to derive its value from either the security of the underlying
system, or of the system on top that watches the underlying system. The
build controller knows that it told a pod to do something - but it has no
way of verifying that what the pod did matched what it asked.

On Tue, May 24, 2016 at 11:20 AM, OpenShift Bot [email protected]
wrote:

Origin Action Required: Pull request cannot be automatically merged,
please rebase your branch from latest HEAD and push again


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#8995 (comment)

@smarterclayton
Copy link
Contributor Author

I wrote out some scenarios in the comments of the policy API that cover the use cases that I've heard. Core goals are making the image the central policy artifact, but affordances based on everything we've done with policies so far (combination of user perm, group perm, SA checks, namespace checks, node selector) to deal with different ways of organizing topology.

I'd like to do a perm check to allow bypass of certain rules, but admins have to decide to allow that (i.e. saying "no unsigned images in production" shouldn't be something a cluster-admin accidentally violates).

As much as possible all annotations / labels / values should be coming from config or images or namespaces. I don't want any special values hardcoded.

@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 24, 2016
@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 12, 2016
@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 26, 2016
@smarterclayton smarterclayton force-pushed the image_policy branch 2 times, most recently from d9dfa2a to 04c95de Compare August 1, 2016 02:02
@smarterclayton smarterclayton changed the title WIP - Enable a simple image policy admission controller Enable a simple image policy admission controller Aug 1, 2016

// resourcesToCheck is a map of resources and corresponding kinds of things that
// we want handled in this plugin
// TODO: Include a function that will extract the PodSpec from the resource for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's fix the debt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now fixed

@smarterclayton
Copy link
Contributor Author

Also turns on the policy by default, still need to add a test-cmd.sh for to test for failure.

@smarterclayton
Copy link
Contributor Author

=== BEGIN TEST CASE ===
test/cmd/migrate.sh:38: executing 'oc import-image --from=php:latest test:2 --confirm' expecting success
FAILURE after 0.400s: test/cmd/migrate.sh:38: executing 'oc import-image --from=php:latest test:2 --confirm' expecting success: the command returned the wrong error code
There was no output from the command.
Standard error from the command:
Error from server: Operation cannot be fulfilled on imagestream "test": the image stream was updated from "5949" to "5950"
=== END TEST CASE ===

new flake, import image should be tolerant to retry conflicts.

@smarterclayton
Copy link
Contributor Author

I added resolution of ImageStreamTags and ImageStreamImages on Builds (to collapse them down to DockerImage) and ended up with a cleaner code path. Resolution now creates the attributes (since resolution knows the resolved image name, and ISI/ISTag also implies internal registry). The MutateImageReferenceFunc takes an Object reference, and we error on any reference that says it's an image but we don't recognize (this may fail on older builds that allowed ImageStream, but I'm ok with that I think).

[ ] I need to add a deeper test on builds to verify resolution in edge cases.

@deads2k
Copy link
Contributor

deads2k commented Aug 10, 2016

The lack of alias recognition really hurts the usability of this feature. It's definitely a good starting point, but can I convince you to take a patch that allows multi-name resolution if I can get it in this week?

matchImageAnnotations:
- key: images.openshift.io/deny-execution
value: "true"
allowResolutionFailure: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like this to resolve: true too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should resolve without much deeper testing. It's a more
invasive change.

On Wed, Aug 10, 2016 at 9:50 AM, David Eads [email protected]
wrote:

In pkg/image/admission/imagepolicy/api/v1/default-policy.yaml
#8995 (comment):

@@ -0,0 +1,20 @@
+kind: ImagePolicyConfig
+apiVersion: v1
+executionRules:
+- name: execution-denied

  • Reject all images that have the annotation images.openshift.io/deny-execution set to true.

  • This annotation may be set by infrastructure that wishes to flag particular images as dangerous

  • onResources:
    • resource: pods
    • resource: builds
  • reject: true
  • matchImageAnnotations:
    • key: images.openshift.io/deny-execution
  • value: "true"
  • allowResolutionFailure: true

I'd like this to resolve: true too.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/openshift/origin/pull/8995/files/5acd1663143e0c8196692974691537db1b40cc53#r74247302,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_p2wTLrmv4FCvfIDNAbk-nQe1UdFoks5qedc4gaJpZM4IlDlb
.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I.e. resolve changes the whole behavior of the platform. I might have
enabled it last week - but right now it would require a retest of every
failure mode.

On Wed, Aug 10, 2016 at 10:03 AM, Clayton Coleman [email protected]
wrote:

I don't think we should resolve without much deeper testing. It's a more
invasive change.

On Wed, Aug 10, 2016 at 9:50 AM, David Eads [email protected]
wrote:

In pkg/image/admission/imagepolicy/api/v1/default-policy.yaml
#8995 (comment):

@@ -0,0 +1,20 @@
+kind: ImagePolicyConfig
+apiVersion: v1
+executionRules:
+- name: execution-denied

  • Reject all images that have the annotation images.openshift.io/deny-execution set to true.

  • This annotation may be set by infrastructure that wishes to flag particular images as dangerous

  • onResources:
    • resource: pods
    • resource: builds
  • reject: true
  • matchImageAnnotations:
    • key: images.openshift.io/deny-execution
  • value: "true"
  • allowResolutionFailure: true

I'd like this to resolve: true too.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/openshift/origin/pull/8995/files/5acd1663143e0c8196692974691537db1b40cc53#r74247302,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_p2wTLrmv4FCvfIDNAbk-nQe1UdFoks5qedc4gaJpZM4IlDlb
.

@smarterclayton
Copy link
Contributor Author

Alias? I was up too late - explain.

On Wed, Aug 10, 2016 at 8:33 AM, David Eads [email protected]
wrote:

The lack of alias recognition really hurts the usability of this feature.
It's definitely a good starting point, but can I convince you to take a
patch that allows multi-name resolution if I can get it in this week?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#8995 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_p1MYFHozM-S9gWYpJkOka7Bs7P4fks5qecUPgaJpZM4IlDlb
.

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 10, 2016
@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 10, 2016
@smarterclayton
Copy link
Contributor Author

Added resolution tests and a test/cmd/images.sh test to verify the default
policy rejects the image.

On Wed, Aug 10, 2016 at 6:01 PM, OpenShift Bot [email protected]
wrote:

continuous-integration/openshift-jenkins/test SUCCESS (
https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/7760/)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#8995 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABG_pwtcyahW5gNjK85LUFm8FbBIuxipks5qekojgaJpZM4IlDlb
.

@deads2k
Copy link
Contributor

deads2k commented Aug 11, 2016

govet pkg/image/admission/imagepolicy/imagepolicy_test.go:623: missing argument for Errorf("%v"): format reads arg 2, have only 1 args

@deads2k
Copy link
Contributor

deads2k commented Aug 11, 2016

once you're green, this lgtm.

Support basic acceptance policy, and provide an ootb default config. Lay
the groundwork for future consumption and placement policy but do not
enable it in the public API.
@smarterclayton
Copy link
Contributor Author

@stevekuznetsov networking flake?

@stevekuznetsov
Copy link
Contributor

Yeah, known issue with the Jenkins set up. re[test]

@openshift-bot
Copy link
Contributor

The Origin test job could not be run again for this pull request.

  • If the proposed changes in this pull request caused the job to fail, update this pull request with new code to fix the issue(s).
  • If flaky tests caused the job to fail, leave a comment with links to the GitHub issue(s) in the openshift/origin repository with the kind/test-flake label that are tracking the flakes. If no issue already exists for the flake you encountered, create one.
  • If something else like CI system downtime or maintenance caused the job to fail, contact a member of the Team Project Committers group to trigger the job again.

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to de0ae2d

@smarterclayton
Copy link
Contributor Author

[merge]

@openshift-bot
Copy link
Contributor

openshift-bot commented Aug 12, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/7830/) (Image: devenv-rhel7_4810)

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to de0ae2d

@openshift-bot openshift-bot merged commit 399d7e6 into openshift:master Aug 12, 2016
@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/7831/)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants