Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Imagestream tag exclude from pruning #16580

Merged

Conversation

legionus
Copy link
Contributor

@legionus legionus self-assigned this Sep 27, 2017
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 27, 2017
@legionus
Copy link
Contributor Author

/test extended_image_registry

@miminar
Copy link

miminar commented Sep 27, 2017

The regular expression can get very long and obfuscated.

Instead of adding new flag to the prune command. Could we allow to mark images and imagestream(tag)s as not prunable? For example with annotation like openshift.io/image.noPrune=true?

In case of the flag added to the prune command. I'd prefer the solution suggested by the customer - let the users specify a list of imagestream(tag)s in a file. I find the annotation approach more convenient though.

@bparees
Copy link
Contributor

bparees commented Sep 27, 2017

@miminar the annotation approach was discussed and rejected during planning and noted in the trello card:
"The better option would probably be to allow the imagestream itself to indicate it should be immune from pruning, but that's a big security problem (who's allowed to set that annotation/indicator?)"

we don't have good security models around annotation editing. So this was the short path to provide the capability being requested.

The intent of the card was to include support for a file listing images to be excluded(in addition to the regex support), if that's not in this PR it should be added (I haven't reviewed it yet).

@@ -143,6 +145,7 @@ func NewCmdPruneImages(f *clientcmd.Factory, parentName, name string, out io.Wri
cmd.Flags().BoolVar(opts.AllImages, "all", *opts.AllImages, "Include images that were not pushed to the registry but have been mirrored by pullthrough.")
cmd.Flags().DurationVar(opts.KeepYoungerThan, "keep-younger-than", *opts.KeepYoungerThan, "Specify the minimum age of an image for it to be considered a candidate for pruning.")
cmd.Flags().IntVar(opts.KeepTagRevisions, "keep-tag-revisions", *opts.KeepTagRevisions, "Specify the number of image revisions for a tag in an image stream that will be preserved.")
cmd.Flags().StringVar(&opts.ExcludeImageStreamTag, "exclude-imagestream-tag", "", "The regular expression that will be applied to the ImageStream Tag to preserve it from pruning.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an arg that takes a file containing a list of regexes to exclude (probably won't be true regexes normally, will just be a list of image names, but no reason not to allow them to be regexes, unless you have performance concerns)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bparees Would it suit you if there are two options (this on and another with file as an argument) ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, two args is what i'd expect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bparees done

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exclude-imagestreamtags?

@legionus
Copy link
Contributor Author

The intent of the card was to include support for a file listing images to be excluded(in addition to the regex support), if that's not in this PR it should be added (I haven't reviewed it yet).

@bparees I thought we agreed to add an option with a regular expression and not a file.
Well. I'll add another option.

@legionus legionus changed the title Imagestream tag exclude from prunning [WIP] Imagestream tag exclude from prunning Sep 27, 2017
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 27, 2017
@miminar
Copy link

miminar commented Sep 27, 2017

@miminar the annotation approach was discussed and rejected during planning and noted in the trello card:
"The better option would probably be to allow the imagestream itself to indicate it should be immune from pruning, but that's a big security problem (who's allowed to set that annotation/indicator?)"

I see. I'm sorry for not participating in the discussion. I could imagine allowing at least a marking of images for those cluster-wide roles that already allow to edit images (e.g. image-auditor). I don't see security implications in that. Marking of imagestream(tag)s would be more challenging for sure.

@legionus legionus force-pushed the image-exclude-from-prune branch from 939bbb9 to b6375e6 Compare September 27, 2017 15:56
@bparees
Copy link
Contributor

bparees commented Sep 27, 2017

@bparees I thought we agreed to add an option with a regular expression and not a file.

Sorry there was confusion, I have a different recollection but either way I think it speaks to the fact that we need to do a better job of updating the trello cards during planning or immediately after, with exactly what we intend to implement for the card (checklists are probably the best way to do this, rather than just putting it in the description). I'll try to do better with that, but I definitely will need help from everyone else as well.

(those updates would also help QE drafting TCs and also bystanders who sometimes watch our cards and wonder what we're doing).

@bparees
Copy link
Contributor

bparees commented Sep 27, 2017

I could imagine allowing at least a marking of images for those cluster-wide roles that already allow to edit images (e.g. image-auditor). I don't see security implications in that. Marking of imagestream(tag)s would be more challenging for sure.

Sure, that seems reasonable. However for now, this is exactly the behavior the customer requested in their RFE and I think it has some advantages (and some disadvantages, certainly) over having to mark the images, so i think it's worth doing and if we get additional requests for functionality in the future we can always do more.

@legionus
Copy link
Contributor Author

/retest

@legionus
Copy link
Contributor Author

/test extended_image_registry

glog.V(4).Infof("Image %q is younger than minimum pruning age, skipping (age=%v)", image.Name, age)
continue
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if these images were not added to the graph, the algorithm could remove blobs referenced by them, could it not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@miminar yes. The layers will not have links in the graph.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a serious bug. And we need to back-port this pretty far. The mystery of blobs disappearing from the registry is solved!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please capture the case of layers being removed when they should be kept in some unit/extended test? This is pretty important gap we should cover.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@miminar I'm on it. BZ 1487408

@@ -143,6 +147,8 @@ func NewCmdPruneImages(f *clientcmd.Factory, parentName, name string, out io.Wri
cmd.Flags().BoolVar(opts.AllImages, "all", *opts.AllImages, "Include images that were not pushed to the registry but have been mirrored by pullthrough.")
cmd.Flags().DurationVar(opts.KeepYoungerThan, "keep-younger-than", *opts.KeepYoungerThan, "Specify the minimum age of an image for it to be considered a candidate for pruning.")
cmd.Flags().IntVar(opts.KeepTagRevisions, "keep-tag-revisions", *opts.KeepTagRevisions, "Specify the number of image revisions for a tag in an image stream that will be preserved.")
cmd.Flags().StringVar(&opts.ExcludeImageStreamTag, "exclude-imagestream-tag", "", "The regular expression that will be applied to the ImageStream Tag to preserve it from pruning.")
cmd.Flags().StringVar(&opts.ExcludeImageStreamTagFile, "exclude-imagestream-tag-file", "", "The filename that contains the regular expressions that will be applied to the ImageStream Tag to preserve it from pruning.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/that will be applied to the ImageStream Tag to preserve it from pruning./matching ImageStreamTags excluded from pruning./

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an example and point to regular expression syntax accepted.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also describe the string you're matching against (<namespace>/<name>:<tag>).

@@ -143,6 +145,7 @@ func NewCmdPruneImages(f *clientcmd.Factory, parentName, name string, out io.Wri
cmd.Flags().BoolVar(opts.AllImages, "all", *opts.AllImages, "Include images that were not pushed to the registry but have been mirrored by pullthrough.")
cmd.Flags().DurationVar(opts.KeepYoungerThan, "keep-younger-than", *opts.KeepYoungerThan, "Specify the minimum age of an image for it to be considered a candidate for pruning.")
cmd.Flags().IntVar(opts.KeepTagRevisions, "keep-tag-revisions", *opts.KeepTagRevisions, "Specify the number of image revisions for a tag in an image stream that will be preserved.")
cmd.Flags().StringVar(&opts.ExcludeImageStreamTag, "exclude-imagestream-tag", "", "The regular expression that will be applied to the ImageStream Tag to preserve it from pruning.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exclude-imagestreamtags?

@miminar
Copy link

miminar commented Sep 28, 2017

Also another extended test for the patterns file would be nice.

@legionus
Copy link
Contributor Author

Also another extended test for the patterns file would be nice.

@miminar Both of these options fill same array. If one works, then filtering works.

@legionus legionus force-pushed the image-exclude-from-prune branch from b6375e6 to 304d2b8 Compare September 28, 2017 14:22
@legionus
Copy link
Contributor Author

@miminar comments addressed.

@legionus legionus changed the title [WIP] Imagestream tag exclude from prunning Imagestream tag exclude from prunning Sep 28, 2017
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 28, 2017
Copy link
Contributor

@dmage dmage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not limit the length of regexps.

if exceedsLimits(stream, imageNode.Image, limits) {
kind = WeakReferencedImageEdgeKind

if !istExcluded {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if istExcluded {
   ...
} else if algorithm.pruneOverSizeLimit {
   ...
} else {

Do we want to add log messages for other cases?

kind = ReferencedImageEdgeKind
}
glog.V(4).Infof("ImageStreamTag %q is excluded by the regular expression", dockerImageReference)
kind = ReferencedImageEdgeKind
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be WeakReferencedImageEdgeKind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I thought excluded=pruned. s/excluded/protected/ or kept.

@@ -74,6 +76,10 @@ var (
# To actually perform the prune operation, the confirm flag must be appended
%[1]s %[2]s --keep-tag-revisions=3 --keep-younger-than=60m --confirm

# To exclude an imagestream from pruning you can specify a regular expression
# which will be applied to the '<namespace>/<name>:<tag>' string.
%[1]s %[2]s --exclude-imagestreamtag=myproject/hello-openshift
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth mentioning the anchoring (^ and $).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or at least change the example to --exclude-imagestreamtag=^myproject/hello-openshift:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I'd change the option name --keep-imagestreamtag=^myproject/hello-openshift:


var patterns []*regexp.Regexp

scanner := bufio.NewScanner(file)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use bufio.NewScanner for reading lines. It limits the length of tokens (iow. lines).
Use file.ReadString('\n') instead.

@legionus legionus force-pushed the image-exclude-from-prune branch 2 times, most recently from 4c0492e to 84cf280 Compare September 28, 2017 15:54
@bparees
Copy link
Contributor

bparees commented Nov 15, 2017

/lgtm
/retest

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 15, 2017
@openshift-merge-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bparees, dmage, legionus

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 15, 2017
@legionus
Copy link
Contributor Author

Test extended_image_registry is broken:

Failure summary:


  1. Hosts:    localhost
     Play:     Create OpenShift certificates for master hosts
     Task:     Set Default scheduler predicates and priorities
     Message:  An unhandled exception occurred while running the lookup plugin 'openshift_master_facts_default_predicates'. Error was a <class 'ansible.errors.AnsibleError'>, original message: Unknown short_version 3.8

@bparees
Copy link
Contributor

bparees commented Nov 15, 2017

@legionus i worked with @stevekuznetsov on that this morning, it should be resolved now, hopefully the current run will pass.

@bparees bparees changed the title Imagestream tag exclude from prunning Imagestream tag exclude from pruning Nov 16, 2017
@bparees
Copy link
Contributor

bparees commented Nov 16, 2017

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/test all [submit-queue is verifying that this PR is safe to merge]

@openshift-ci-robot
Copy link

openshift-ci-robot commented Nov 16, 2017

@legionus: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/openshift-jenkins/extended_conformance_install_update b70983d link /test extended_conformance_install_update

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue (batch tested with PRs 16487, 16580).

@openshift-merge-robot openshift-merge-robot merged commit 8c5b43d into openshift:master Nov 16, 2017
@mfojtik
Copy link
Contributor

mfojtik commented Nov 16, 2017

@bparees i'm going to revert this commit because registry/util was removed from rebase branch so you can't use that helper anymore. Second option would be to disable the test.

@mfojtik
Copy link
Contributor

mfojtik commented Nov 16, 2017

@bparees i reverted this but realized with Alex that we moved that registry helper func into different package, so you just need to remove that import and strip the registryutil. in the test. Anyway, can you please resubmit this after the rebase lands? I really hate to rebase 80 commits to adapt this to post-rebase world.

@deads2k
Copy link
Contributor

deads2k commented Nov 16, 2017

The use a regex here appears to make the command hard to use. In addition, if I'm reading the code correctly, the caller either has to specify specific namespaces in his regex to avoid accidentally matching image streams owned by other people that should be pruned.

Given those two things, why not just make it official and simply making some namespaces immune to pruning for a command. That could be done via label selection of namespaces (users can't label namespaces), it wouldn't require an additional flag, and it wouldn't be vulnerable to less than obvious matching problems.

@bparees
Copy link
Contributor

bparees commented Nov 16, 2017

The use a regex here appears to make the command hard to use.

if you're doing this, you have a need for the complexity.

In addition, if I'm reading the code correctly, the caller either has to specify specific namespaces in his regex to avoid accidentally matching image streams owned by other people that should be pruned.

yup.

Given those two things, why not just make it official and simply making some namespaces immune to pruning for a command.

"some namespaces"? what namespaces?

That could be done via label selection of namespaces (users can't label namespaces), it wouldn't require an additional flag, and it wouldn't be vulnerable to less than obvious matching problems.

because you're assuming that the exclusion is scoped to a particular namespace. What if i want to prune some but not all images in a particular namespace?

This feature was implemented based on the specific request of a customer in fulfillment of how they wanted it to work. You can read the RFE.

@bparees
Copy link
Contributor

bparees commented Nov 17, 2017

fyi i'm deferring this suggestion. @legionus implemented sufficient functionality to address the customer need and we're moving on, i'm not willing to waste his time rewriting it when we have more pressing concerns.

Certainly in the future we can introduce the ability to exclude pruning by a namespace label. Providing this (more flexible) capability today does not preclude providing that capability in the future.

once the rebase has landed we'll be re-merging this feature.

@smarterclayton
Copy link
Contributor

smarterclayton commented Nov 28, 2017

I'm ok with this as it is - I understand the improvement David wants to make, I think it's a good long term thing to do, but I don't see a justification to go with that approach over the current approach that is worth rewriting over.

Also, this is good design feedback that would be better earlier in the process. What can we do to make these suggestions earlier?

@miminar
Copy link

miminar commented Nov 29, 2017

Also, this is good design feedback that would be better earlier in the process. What can we do to make these suggestions earlier?

Well, the suggestion was raised much earlier and was discussed in planning as well but was turned down.

@deads2k
Copy link
Contributor

deads2k commented Nov 29, 2017

I'm ok with this as it is - I understand the improvement David wants to make, I think it's a good long term thing to do, but I don't see a justification to go with that approach over the current approach that is worth rewriting over.

Just so we can avoid the concern about cost, I put together a pull which uses label and field selection to protect image. It is complete, including tests, without any API changes here: #17511.

Also, this is good design feedback that would be better earlier in the process. What can we do to make these suggestions earlier?

I'm sure it was an oversight, but I don't see @juanvallejo, @fabianofranz, @openshift/cli-review, or @openshift/api-review tagged in this pull or in the bugzilla or in the trello card for review or comment. I suspect that any of those actors are familiar with the over-arching design principles of kubectl and oc regarding object selection via the REST API and the CLI tools.

@ctrought
Copy link

/reopen

PR was reverted, I don't think any more granularity has been provided from any other options yet. Would still like to see more control over pruning image tags.

@openshift-ci-robot
Copy link

@ctrought: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

PR was reverted, I don't think any more granularity has been provided from any other options yet. Would still like to see more control over pruning image tags.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. component/image lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.