Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Imagestream tag exclude from pruning - using object selection #17511

Closed
wants to merge 4 commits into from

Conversation

deads2k
Copy link
Contributor

@deads2k deads2k commented Nov 29, 2017

Alternative to #16580.

To help clarify comments in #16580, https://bugzilla.redhat.com/show_bug.cgi?id=1410018, and openshift/api#9 I have put together a complete pull which adds the ability to protect namespaces and/or individual imagestreamtags based on our standard label and field selection techniques. This pull allows flows like this:

# old stock behavior
oc adm prune images <normal args>

# prune images, but only in unprotected namespaces (label of my choosing)
oc adm prune images <normal args> -l 'cluster-admin-protection-status notin (special)'

# prune images, but only gold level, then prune silver with different args
oc adm prune images <normal args> -l 'plan=gold'
oc adm prune images <more restrictive args> -l 'plan=silver'

# protect particular images, you don't get to choose this label.  myspecial:latest isn't pruned
oc label ns/openshift "image.openshift.io/allow-prune-immunity=true"
oc annotate -n openshift istag/myspecial:latest "image.openshift.io/prune-immunity=true"
oc adm prune images <normal args> [optional selector]

I'd still prefer to label the istag instead of using the equivalent of a field selector, but this gets the job done without any API changes.

@bparees @smarterclayton @mfojtik @openshift/cli-review

@bparees
Copy link
Contributor

bparees commented Nov 29, 2017

@deads2k are you going to write the docs for this approach (@legionus already wrote docs for the earlier implementation)?

Are you going to work w/ QE to update their tests and shepherd it through the rest of the process since the design has changed?

I know you think this is trivial, but you're creating significant churn on our side that you seem unwilling to acknowledge.

Also from what I see this approach does not currently offer a way to ignore immunity(let alone selectively ignore it), which is another advantage of the regex approach.

@deads2k
Copy link
Contributor Author

deads2k commented Nov 29, 2017

Also from what I see this approach does not currently offer a way to ignore immunity(let alone selectively ignore it), which is another advantage of the regex approach.

Ignoring immunity is essentially the same as removing a label or an annotation. With either implementation it requires a modification to perform selectively. To perform non-selectively, you can simply remove labels globally. As with the rest of our API objects, the ACL boundary is the namespace.

@deads2k are you going to write the docs for this approach (@legionus already wrote docs for the earlier implementation)?

Are you going to work w/ QE to update their tests and shepherd it through the rest of the process since the design has changed?

If need be, yes. I would prefer to do that and gain the benefits described in the bugzilla along with consistency of experience when using the CLI.

My preference would be for broad CLI changes introducing new patterns or deviating from existing patterns to be subject to review in the broad sense and individual commands to maintained separately. This would mirror the approach upstream, where additions using existing patterns are subject to cursory review by sig-cli-leads and involved changes are more heavily scrutinized. I don't think that juan or fabiano are pruning experts, but the experience they are trying to cultivate and maintain for consistency within oc is something they are aware of.

If I need to own the card end-to-end, I can do so. However, I think we'd be better served in the end to grow the knowledge about common patterns and user experience across the teams which own particular commands. Consistency in these areas makes using new commands easier and knowledge of the general patterns gives more confidence when adding CLI features.

@bparees
Copy link
Contributor

bparees commented Nov 29, 2017

Ignoring immunity is essentially the same as removing a label or an annotation. With either implementation it requires a modification to perform selectively. To perform non-selectively, you can simply remove labels globally. As with the rest of our API objects, the ACL boundary is the namespace.

i still don't agree that finding and updating N server side objects is easier or better than tweaking a few regexes locally (or not passing the regex at all to disable immunity).

@miminar
Copy link

miminar commented Nov 30, 2017

i still don't agree that finding and updating N server side objects is easier or better than tweaking a few regexes locally (or not passing the regex at all to disable immunity).

@bparees without the immunity being set on namespaces/imagestreams, each time the user attempts to prune manually, he needs to remember all namespaces/imagestreams he wants to preserve and write it down into a regexp on command-line, which gets tedious. I know that being able to write this regexps into a file and passing it to the command helps, but the user still needs to remember he has some namespaces/istags he wants to protect.

IMHO the namespace immunity won't change easily over time - company will want particular namespaces/imagestreams to protect against pruning in any case. Setting the label once and not needing to care about it anymore is a much more convenient solution to the problem.

@miminar
Copy link

miminar commented Nov 30, 2017

Building upon this approach also allows for per namespace/is pruning options. E.g. keep this X tag revisions in proj/myapp and X+2 revisions in proj/myprecious.

@@ -351,8 +365,18 @@ func (p *pruner) addImageStreamsToGraph(streams *imageapi.ImageStreamList, limit
continue
}

isNamespaceEligible := p.namespacesToPrune.Has(stream.Namespace)
doesNamespaceAllowImmunity := p.namespacesToAllowImmunity.Has(stream.Namespace)
isISTagImmune := stream.Spec.Tags[tag].Annotations[protectionAnnotation] == "true"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd welcome per-imagestream control as well. Where istag's annotation overrides is' annotation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd welcome per-imagestream control as well. Where istag's annotation overrides is' annotation.

It's easy enough to add, but the initial use-case didn't request it. @miminar could it be a followup?

@soltysh @fabianofranz @juanvallejo add it or no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for adding it (can be followup). IIRC discussions in the topic it was about both IS and ISTag to be excluded from pruning.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for a follow-up

@@ -162,6 +170,7 @@ func NewCmdPruneImages(f *clientcmd.Factory, parentName, name string, out io.Wri
cmd.Flags().StringVar(&opts.CABundle, "certificate-authority", opts.CABundle, "The path to a certificate authority bundle to use when communicating with the managed Docker registries. Defaults to the certificate authority data from the current user's config file. It cannot be used together with --force-insecure.")
cmd.Flags().StringVar(&opts.RegistryUrlOverride, "registry-url", opts.RegistryUrlOverride, "The address to use when contacting the registry, instead of using the default value. This is useful if you can't resolve or reach the registry (e.g.; the default is a cluster-internal URL) but you do have an alternative route that works. Particular transport protocol can be enforced using '<scheme>://' prefix.")
cmd.Flags().BoolVar(&opts.ForceInsecure, "force-insecure", opts.ForceInsecure, "If true, allow an insecure connection to the docker registry that is hosted via HTTP or has an invalid HTTPS certificate. Whenever possible, use --certificate-authority instead of this dangerous option.")
cmd.Flags().StringVarP(&opts.NamespacesToPruneSelector, "selector", "l", "", "Selector (label query) to filter on, not including uninitialized ones, supports '=', '==', and '!='.(e.g. -l key1=value1,key2=value2).")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would turn the e.g. part into a full example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would turn the e.g. part into a full example.

Not sure I understand. Those are valid examples.

@@ -104,19 +106,24 @@ var (
defaultKeepYoungerThan = 60 * time.Minute
defaultKeepTagRevisions = 3
defaultPruneImageOverSizeLimit = false

// TODO choose final name and promote to API
protectionLabel = "image.openshift.io/allow-prune-immunity"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be more general? (e.g. openshift.io/allow-prune-immunity). We have pruners for resources other than image.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be more general? (e.g. openshift.io/allow-prune-immunity). We have pruners for resources other than image.

I would rather start specific and move out to generality. Pruning isn't a well explored space yet. See how deployments do it as a counter example.

@wanghaoran1988
Copy link
Member

/cc @geliu2016

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 6, 2017
@deads2k
Copy link
Contributor Author

deads2k commented Dec 18, 2017

Building upon this approach also allows for per namespace/is pruning options. E.g. keep this X tag revisions in proj/myapp and X+2 revisions in proj/myprecious.

Yep.

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 18, 2017
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 18, 2017
@deads2k
Copy link
Contributor Author

deads2k commented Dec 18, 2017

@soltysh @miminar I have rebased.

I like the suggestion of respecting the flag set on an imagestream, but I'd like to move the bulk of the work and circle back to it. Perhaps @juanvallejo could pick it up and work through the test matrix?

@deads2k
Copy link
Contributor Author

deads2k commented Dec 19, 2017

#17879
#17880

@deads2k
Copy link
Contributor Author

deads2k commented Dec 19, 2017

/retest

1 similar comment
@deads2k
Copy link
Contributor Author

deads2k commented Dec 20, 2017

/retest

@deads2k deads2k force-pushed the image-11-prune branch 2 times, most recently from 681aa6b to f8f07d0 Compare December 20, 2017 14:56
@deads2k
Copy link
Contributor Author

deads2k commented Dec 20, 2017

/retest

@@ -55,6 +55,9 @@ const (
// ReferencedImageLayerEdgeKind defines an edge from an ImageStreamNode or an
// ImageNode to an ImageComponentNode.
ReferencedImageLayerEdgeKind = "ReferencedImageLayer"

// TODO choose final name and promote to API
protectionAnnotation = "image.openshift.io/prune-immunity"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@openshift/api-review ptal (yes David I know it's you, but also others 😉)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@openshift/api-review ptal (yes David I know it's you, but also others )

Fair. :)

@soltysh
Copy link
Contributor

soltysh commented Dec 21, 2017

The implementation itself lgtm, I'm ok with @juanvallejo working on the followup. Just need to make it in a working state.

@deads2k
Copy link
Contributor Author

deads2k commented Dec 21, 2017

/retest

to see if it stumbles on the same imagestreams.

Copy link

@miminar miminar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit, LGTM otherwise.

@@ -166,6 +186,7 @@ func NewCmdPruneImages(f *clientcmd.Factory, parentName, name string, out io.Wri
cmd.Flags().StringVar(&opts.RegistryUrlOverride, "registry-url", opts.RegistryUrlOverride, "The address to use when contacting the registry, instead of using the default value. This is useful if you can't resolve or reach the registry (e.g.; the default is a cluster-internal URL) but you do have an alternative route that works. Particular transport protocol can be enforced using '<scheme>://' prefix.")
cmd.Flags().BoolVar(&opts.ForceInsecure, "force-insecure", opts.ForceInsecure, "If true, allow an insecure connection to the docker registry that is hosted via HTTP or has an invalid HTTPS certificate. Whenever possible, use --certificate-authority instead of this dangerous option.")
cmd.Flags().BoolVar(opts.PruneRegistry, "prune-registry", *opts.PruneRegistry, "If false, the prune operation will clean up image API objects, but the none of the associated content in the registry is removed. Note, if only image API objects are cleaned up through use of this flag, the only means for subsequently cleaning up registry data corresponding to those image API objects is to employ the 'hard prune' administrative task.")
cmd.Flags().StringVarP(&opts.NamespacesToPruneSelector, "selector", "l", "", "Selector (label query) to filter on, not including uninitialized ones, supports '=', '==', and '!='.(e.g. -l key1=value1,key2=value2).")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help should mention that the selector selects namespaces.

@juanvallejo
Copy link
Contributor

/test end_to_end

@mfojtik
Copy link
Contributor

mfojtik commented Jan 17, 2018

@deads2k digged into the failed test, I think all 4 image streams that failed to update have one thing in common. They were all docker push-ed and the image stream tag was created by registry...

I wonder if we default the referencePolicy.type in API or if registry can somehow bypass this validation check.

I guess this is a separate issue, so we can opt-out the failing image streams (grep -v or something) for now to move this PR forward. Make an issue for the @openshift/sig-developer-experience or @smarterclayton (who added referencePolicy) to investigate how it is possible that those tag have that field set to an empty string.

Wild guess: we don't validate the ImageStreamMappings for the referencePolicy.type, which allows registry to create IST that has the field set to "" ?

@openshift-merge-robot
Copy link
Contributor

@deads2k PR needs rebase

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 24, 2018
@deads2k deads2k closed this Jan 24, 2018
@deads2k deads2k deleted the image-11-prune branch January 24, 2018 14:38
@deads2k deads2k restored the image-11-prune branch January 24, 2018 14:38
@deads2k deads2k reopened this Jan 24, 2018
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@mfojtik
Copy link
Contributor

mfojtik commented Feb 27, 2018

@bparees @openshift/sig-developer-experience were there any progress on determining the root cause of why certain image streams cannot be annotated (iow. we store the corrupted object in etcd) ?

@miminar
Copy link

miminar commented Feb 27, 2018

/retest

@openshift-ci-robot
Copy link

@deads2k: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/openshift-jenkins/service-catalog 7f60fe8 link /test service-catalog
ci/openshift-jenkins/gcp 7f60fe8 link /test gcp
ci/openshift-jenkins/verify 7f60fe8 link /test verify
ci/openshift-jenkins/extended_image_registry 7f60fe8 link /test extended_image_registry
ci/openshift-jenkins/extended_conformance_install 7f60fe8 link /test extended_conformance_install
ci/openshift-jenkins/cmd 7f60fe8 link /test cmd
ci/openshift-jenkins/unit 7f60fe8 link /test unit
ci/openshift-jenkins/integration 7f60fe8 link /test integration
ci/openshift-jenkins/extended_networking_minimal 7f60fe8 link /test extended_networking_minimal
ci/openshift-jenkins/end_to_end 7f60fe8 link /test end_to_end

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@dmage
Copy link
Contributor

dmage commented Feb 27, 2018

@mfojtik #18519

@bparees
Copy link
Contributor

bparees commented Feb 27, 2018

@bparees @openshift/sig-developer-experience were there any progress on determining the root cause of why certain image streams cannot be annotated (iow. we store the corrupted object in etcd) ?

@mfojtik it has always been this way, see my comment here where this was found and "worked around" (by setting the reference policy explicitly):
#12731 (review)

it's effectively by design, see the entire discussion (including comment from clayton) here:
#18519 (I tagged you in this issue, btw, are you not getting my notifications? You haven't responded to a number of them)

and my PR to at least make the error message clearer:
#18532

@openshift-bot
Copy link
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 28, 2018
@openshift-bot
Copy link
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 27, 2018
@openshift-bot
Copy link
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. sig/developer-experience size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.