Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import manifest lists with linux/amd64 platform #16322

Merged

Conversation

dmage
Copy link
Contributor

@dmage dmage commented Sep 13, 2017

This is an alternative for #16319

@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 13, 2017
@dmage dmage force-pushed the manifest-list-linux-amd64 branch from 157f754 to 2869cb7 Compare September 13, 2017 14:46
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 13, 2017
@@ -549,6 +572,7 @@ func (isi *ImageStreamImporter) importRepositoryFromDocker(ctx gocontext.Context
importTag.Image, err = schema2ToImage(deserializedManifest, imageConfig, "")
} else {
glog.V(5).Infof("unsupported manifest type: %T", manifest)
importTag.Err = fmt.Errorf("unsupported manifest type")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the error contain the type name as well?

continue
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not handled in for i := range repository.Digests { loop as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it supposed to be a hack to unblock the queue.

if manifestList, isManifestList := manifest.(*manifestlist.DeserializedManifestList); isManifestList {
manifest = nil
for _, m := range manifestList.Manifests {
if m.Platform.OS == "linux" && m.Platform.Architecture == "amd64" {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about runtime.GOOS and runtime.GOARCH or maybe some other internal package defining what we support? At least, these could be constants taken from https://golang.org/pkg/runtime/internal/sys/#pkg-constants.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to read this from an annotation on the image stream tag and have it default to Linux / amd64. We are very shortly going to need to do something better here for other reasons (multiarch), and if we need to work around for Windows we'll need the tag to be controlled.

The annotation should be a struct with arch and os called out as input. If there is only one tag

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is only one tag, ignore the constraints for backwards compatibility. Default the constraint to Linux/amd64

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know for sure that all valid Linux amd64 images have both these values set correct from red hat or customer registries?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmage bump ^^

if the answer is no, perhaps we update this PR to just strip out the windows manifests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may still have other types. As I mentioned, defaulting to linux/amd64 is fine, taking the first one if there is only one is fine, but we probably still need the escape hatch which is why the spec tag annotation API is probably going to be the safe fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the annotation api something we think is the long term solution to this? I don't want to get stuck supporting that annotation 12 months from now when we've properly implemented manifestlist support.

(that annotation api would, at a minimum, i assume, have to support a the user providing a list of architectures to allow, not just one).

glog.V(5).Infof("unable to access digest %q for tag %q for repository %#v: %#v", m.Digest, importTag.Name, repository, err)
importTag.Err = formatRepositoryError(repository, importTag.Name, "", err)
continue
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should break here once you've gotten one manifest? (what are the implicaitions of multpile manifests being in the list?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(guessing the answer is going to be "because it's a hack to unblock the queue" :) )

}
}
if manifest == nil {
manifest, err = s.Get(ctx, manifestList.Manifests[0].Digest)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if we can't find a linux manifest, we just take the first windows manifest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's a hack to unblock the queue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor

@smarterclayton smarterclayton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spec tag needs to allow override of this for other reasons and near term needs. I'm ok with an annotation on the spec tag that overrides.

@dmage
Copy link
Contributor Author

dmage commented Sep 13, 2017

As the queue is going to be unblocked by #16324, I close this PR in favour of https://trello.com/c/4EcAIJrd/1303-13-registry-add-support-for-manifest-list

@dmage dmage closed this Sep 13, 2017
@bparees
Copy link
Contributor

bparees commented Sep 13, 2017

@dmage that card is a 13 point card. Are we really ok with living w/ the current situation until we can implement a 13 point card? I would think not. Seems like we need a short term workaround for users.

@mfojtik
Copy link
Contributor

mfojtik commented Sep 13, 2017

@dmage #16324 is just temporary and it will become p0 to re-enable that test. If this PR is making the test green again, we should continue working in it. It is hack, but better than have those tests disabled until we add manifest list support which can take months.

@smarterclayton i would strongly suggest not to add more cleverness to this hack and rather focus on manifest list support long term.

@bparees
Copy link
Contributor

bparees commented Sep 13, 2017

@smarterclayton i would strongly suggest not to add more cleverness to this hack and rather focus on manifest list support long term.

+1. This PR is not the "how to add multiarch support" PR, this PR is the "let's get unbroken for our users" PR. The card @dmage referenced is the long term solution (or if it's not as described already, making sure windows manifests work needs to be part of that card).

@soltysh
Copy link
Contributor

soltysh commented Sep 14, 2017

@smarterclayton i would strongly suggest not to add more cleverness to this hack and rather focus on manifest list support long term.

+1 we need to have a simple working solution asap, disabling tests to unblock the queue happened ONLY to unblock the queue. This is a p0 blocker for 3.7 release, it needs a "proper" fix. I'd like to see this re-opened and comments addressed.

@bparees bparees reopened this Sep 14, 2017
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 14, 2017
@smarterclayton
Copy link
Contributor

When you do a hack, the hack needs to actually take into account the next things that happen. You'd need to convince me that the linux/amd64 filter is safe before I'd agree that the hack should be merged. Given that, the annotation is your escape hatch for the hack - you can survive being wrong about the filter because the filter can change. I'm not convinced this hack is giving enough wiggle room.

@bparees
Copy link
Contributor

bparees commented Sep 14, 2017

When you do a hack, the hack needs to actually take into account the next things that happen. You'd need to convince me that the linux/amd64 filter is safe before I'd agree that the hack should be merged. Given that, the annotation is your escape hatch for the hack - you can survive being wrong about the filter because the filter can change. I'm not convinced this hack is giving enough wiggle room.

agreed, but in that case i'd prefer the hack be that we just strip all windows manifests, which I think is @dmage's next plan for this PR.

As I understand it, there are basically two problems we need to solve today:

  1. tolerate getting back a manifestlist object. I don't think there's any debate on this one.

  2. When we get back a manifestlist, how do we ensure the tags we actually import out of the manifestlist are the ones that "make sense". (because otherwise the top 5 tags we import might be windows images). To me this is less urgent, frankly we could just take all of them and tell people using "dockerImageRepository" imports to stop doing that, the set of tags they get has never been deterministic and it's just "less" deterministic now that additional tags have been introduced. But if we want to be nice, we should strip the windows ones because we explicitly know they won't work today.

@dmage dmage force-pushed the manifest-list-linux-amd64 branch from 2869cb7 to f36f972 Compare September 14, 2017 16:03
@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 14, 2017
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 14, 2017
@dmage dmage force-pushed the manifest-list-linux-amd64 branch from f36f972 to 2b3e41f Compare September 14, 2017 16:07
continue
glog.V(5).Infof("unable to get manifest by tag %q for image %s: %#v", importTag.Name, ref.Exact(), err)
importTag.Err = formatRepositoryError(ref, err)
return true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"true"? i would have guessed false but w/o code comments its hard to understand the intent either way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Result should be appended to repository.Tags.

const (
defaultArch = "amd64"
defaultOS = "linux"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some counseling from @smarterclayton, i'm not satisfied with this solution. While true that it makes things no worse than they were earlier this week (before the manifestlists were introduced), we need to at least make sure we can tolerate manifestlists that do not have a linux/amd64 entry (And by tolerate i mean, allow users to consume those images, not just "not fail"). So that means the default arch/os needs to be user-selectable, even if only via an undocumented annotation(on the imagestreamtag) that we can hand out to users who run into this problem.

True manifestlist support is too far down the road for us to be able to wait for it to solve this problem, we need a more medium-term solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for a record: manifest lists were introduced by Docker in June 2016.

we need to at least make sure we can tolerate manifestlists that do not have a linux/amd64 entry

Why do we need this? Do we officially support clusters with Windows nodes? I ask this, because we have this problem for more than 1 year.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need it so that we can limp along until we get proper manifestlist support in place. We've had the problem for more than a year, it finally got us, and now we should fix it in such a way that we don't trip over the next scenario without at least a workaround in the code that we can offer users.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cautioning because we're about to go put a change into place that, while it fixes the problem, doesn't prevent the next problem. We are getting close to actually having the requirement to do all of these things:

  1. run linux/power64le clusters (would be broken if we hardcode linux/amd64)
  2. support mixed mode clusters (windows and linux)
  3. the possibility that anyone in origin running on non-amd64 is broken when we make this change

It's good to get a quick fix in. We probably also should be defaulting arch and OS to the compiled version (goos and goarch) rather than hardcoding linux/amd64, although we need to think a little bit about how that could break non-amd64 issues.

@dmage
Copy link
Contributor Author

dmage commented Sep 14, 2017

You'd need to convince me that the linux/amd64 filter is safe

I can convince you that not supporting manifest lists is unsafe. This PR is sophisticated way to NOT support manifest lists. It simulates the same processes that happens in Docker Distribution if the client doesn't support manifest lists.

It doesn't happen on the registry side for OpenShift, because:

  1. The Docker Client is built in a way that it always sends the Accept header for manifest lists if the manifestlist module is imported, and the integrated registry imports this module. oc, on the other side, doesn't import this module and that's why it can tolerate some official images.
  2. If downconverting is happens on the registry side, we will get 404 for images that doesn't built for the linux/amd64 platform (see the link above). That would break our import for mongo repository, for example. Because the tag 3.5-windowsservercore is present in the tag list, but we would get an 404 error for it, because it is a Windows-only image.

@bparees
Copy link
Contributor

bparees commented Sep 14, 2017

@dmage i don't think there's debate about whether we should (pseudo) support manifestlists. it's a question of, given a manifestlist, which manifests should we ignore and which should we not ignore? Defaulting to ignoring everything except amd64 is fine. But it's also necessary we have a way to pick something else in case there is a manifest that defines some other arch that we need to be able to import.

Yes it's true that choosing amd64-only is equivalent to what the docker registry does when faced with a client that doesn't support manifestlist, but the docker registry will also provide the full manifest list to a client that supports them fully. We can't just operate in a crippled mode that acts as if we don't support manifest lists at all (by only respecting the amd64 entry in the manifest list).

@dmage
Copy link
Contributor Author

dmage commented Sep 18, 2017

otherwise we should choose the first image

It may be a good idea for manifests with just one manifest.

But what is the use case for selecting an image for a random platform from a manifest list?

@dmage dmage force-pushed the manifest-list-linux-amd64 branch from 9544fab to 916605c Compare September 18, 2017 14:21
@bparees
Copy link
Contributor

bparees commented Sep 18, 2017

But what is the use case for selecting an image for a random platform from a manifest list?

It's erroring on the side of giving the user "something" vs "nothing". If they have a manifestlist that only contains 32 bit images, for example, us picking one of those(even though we're running on 64 bit) is probably better than us picking none of them.

Yes if all the entries are "windows" then us picking one of them won't lead to a great outcome, but in that case they shouldn't have tried to import that tag in the first place.

It's basically a "fail conservatively" attitude.

@legionus
Copy link
Contributor

It's erroring on the side of giving the user "something" vs "nothing".

This will create a new category of errors: The import process completed successfully, but imagestream can not be used (because first OS in list was netbsd for example). Also there will be hidden errors in the builds for the same reason (image imported but incompatible).

I think it's much better not to get on this path from the very beginning and not import something that we were not asked to do. A import error is better than a hidden error about an incompatible image.

@dmage
Copy link
Contributor Author

dmage commented Sep 18, 2017

I prefer fail-fast attitude.

@dmage dmage force-pushed the manifest-list-linux-amd64 branch from 916605c to 7e87728 Compare September 19, 2017 12:04
@bparees
Copy link
Contributor

bparees commented Sep 19, 2017

I'm ok with this either way (refusing to import anything if we get no match and telling the user they need to add an appropriate annotation to their IST, or importing the first entry).

So this is lgtm as far as i'm concerned. @smarterclayton last chance to exercise veto power if you think always importing something, even when nothing matches, is critical.

@dmage
Copy link
Contributor Author

dmage commented Sep 19, 2017

@bparees the current version fallbacks to the first manifest. But it seems it has a bug.

/retest

@bparees
Copy link
Contributor

bparees commented Sep 19, 2017

@bparees the current version fallbacks to the first manifest. But it seems it has a bug.

sorry, missed your latest round of changes.

@smarterclayton
Copy link
Contributor

I'm ok with delaying that behavior since we have to address the root problem in the API (making it possible to import an arbitrary image from within a manifest list).

@smarterclayton
Copy link
Contributor

Needs a test proving this works (and the annotation works)

}

var manifestDigest digest.Digest
for _, manifestDescriptor := range manifestList.Manifests {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is one manifest, we should always select, regardless of OS and Arch. Otherwise we are breaking backwards compatibility for clients that switch to manifest list but still only publish a single image.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe that's what line 404 is handling. (the case that we didn't manage to match anything).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thanks.

@dmage dmage force-pushed the manifest-list-linux-amd64 branch from 7e87728 to 74dd00e Compare September 20, 2017 10:36
@dmage
Copy link
Contributor Author

dmage commented Sep 20, 2017

I've added a test: 74dd00e#diff-d54b8cd494242c0f5294390575027003R147

@bparees
Copy link
Contributor

bparees commented Sep 20, 2017

lgtm, @smarterclayton can you sign off on the annotation names for @openshift/api-review ?

// DefaultArchAnnotation represents an architecture that should be
// selected if an image uses a manifest list and it should be
// downconverted.
DefaultArchAnnotation = "openshift.io/image.defaultArch"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inside your own group, I'd expect this to be scoped to your group rather than claiming a bit of the global namespace.

image.openshift.io/defaultArch

@smarterclayton
Copy link
Contributor

smarterclayton commented Sep 20, 2017 via email

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 21, 2017
@dmage dmage force-pushed the manifest-list-linux-amd64 branch from 74dd00e to 27c1eb4 Compare September 21, 2017 10:17
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 21, 2017
@bparees
Copy link
Contributor

bparees commented Sep 21, 2017

api review comments have been addressed.
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 21, 2017
@openshift-merge-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bparees, dmage

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 21, 2017
@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue (batch tested with PRs 13840, 16033, 16322, 16441)

@openshift-merge-robot openshift-merge-robot merged commit 7b36998 into openshift:master Sep 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. needs-api-review size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants