image not found when pulling from integrated registry - service account not allowed to pull? #17523
Comments
From the diagnostics, could you please try restarting your master and paste the registry log here? |
I've already restarted the master, but did again and also restarted the registry pod and tried to recreate the pod in question. Here're the logs:
|
If you are unable to log in to your registry directly from inside the cluster with your OpenShift token something is very broken in your installation. I would first recommend removing the registry and redeploying then trying to verify that you can log in with a token. The command you listed is correct |
@pweil- Why did you close this issue? Removing the registry and redeploying it was one of the first things I thought of. In fact, I did remove not only the registry but the whole cluster multiple times now and can reproduce this issue by installing the OS from scratch and ramping up the cluster by following the docs starting at Host Preparation up to Accessing the Registry. |
@sverboven thanks for the report. Did you also test trying to log in to the registry with a service account manually? Do you have registry logs from the times when the authorization is failing? Can you also detail the env a bit for us, version numbers at least. Thanks! While we need to avoid using GH as a support channel I think we need a little more information to determine if this is a bug since we have not seen this in our testing. |
were RBAC roles reconciled during/after the upgrade process? |
@pweil @bparees I will provide the detailed scenario of @sverboven.
Version
Steps To Reproduce (steps that worked perfectly fine using OpenShift 3.6, but that are not giving the expected result in OpenShift 3.7)
Current result
Internal docker registry logs
Important |
From the internal docker registry logs, we can derive that the pods are being authorized as |
@Jens-vd can you confirm via the pod.yaml what the service account is for the pod? I think there are also some issues/changes in 3.7 w/ the format of the dockercfg secret to be used. The old .dockercfg format is no longer accepted and only the docker/config.json format is allowed. Can you show us your secret content (redacting the token value of course)? And/or show us how you're creating your secret. |
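To make the format difference mentioned above concrete, here is a minimal sketch of how the legacy .dockercfg layout relates to the docker/config.json layout: the newer file nests the same per-registry entries under a top-level "auths" key. The function name `dockercfg_to_config_json` and all sample values (host, user, token placeholders) are hypothetical and only serve as illustration; this is not how oc or the kubelet performs the conversion.

```python
import json


def dockercfg_to_config_json(legacy: dict) -> dict:
    """Wrap a legacy .dockercfg mapping in the "auths" key used by config.json."""
    if "auths" in legacy:
        # Already in the newer layout; return a shallow copy unchanged.
        return dict(legacy)
    return {"auths": dict(legacy)}


# Hypothetical legacy content; host and credentials are placeholders.
legacy = {
    "docker-registry.default.svc:5000": {
        "username": "serviceaccount",
        "password": "<token>",
        "email": "serviceaccount@example.org",
        "auth": "<base64 of username:password>",
    }
}

converted = dockercfg_to_config_json(legacy)
print(json.dumps(converted, indent=2))
```

Comparing the printed result with the decoded content of an existing secret is one quick way to tell which of the two layouts that secret actually holds.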
@bparees I will paste the command that we used to create the secret (for @Jens-vd)
|
@mfojtik @sjenning @pweil- I can never remember, does the secret need to be linked to the pod's service account in order for it to be able to use a particular secret when pulling the image? What other debug can we do to determine why it would appear that the secret is not being picked up when the pod attempts to pull the image? (Or is there a way to see what secret the pod is using when it attempts to pull the image?) |
@bparees Here is more detailed information regarding our configurations. Configuration of the service account:
Configuration of the pod which cannot pull an image from the registry:
Configuration of the docker registry secret:
|
Honestly, I'd need to look into it more. I can recreate this situation though; that being an image pushed to the internal registry from outside the cluster and can't deploy pods that reference that image because whatever is doing the pulling is not authorized (resolves to system:anonymous) according to the internal registry logs. I did notice that the However, I don't seem to be able to pull the image from inside the cluster for deployment using the same SA so.. that is strange. There must be something I'm missing. @liggitt any help? |
presumably the URL used to reference the registry inside and outside the cluster is different? are you manually creating the image pull secret for use inside the cluster? |
wondering if kubernetes/kubernetes#25435 is related... normalization of the URL for the passed credentials... can you try creating the credential for |
The result is the same. I created a second secret (trying 2 different ways, taking @sjenning's comment into account):
I updated the pod's configuration accordingly:
I also created the deployment using 2 different values for the image:
and
|
Can you try this:
I am still wondering which command you used to create the old docker config secret. |
I managed to get the pod running by performing the steps you suggested. I first created the secret as we did before. Then, I captured the base64 contents shown when exporting the secret and put the decoded content in a new file dockercfg.json. I used this file to create a new secret which I linked to the service account we are using for the pod. This was not sufficient for getting the pod to run, but when I added the new secret to the imagePullSecrets of the pod, it was able to pull the image. Steps I performed summarized:
|
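As an illustration of the kind of objects involved in the workaround described in the previous comment (not the commenter's actual commands), the sketch below builds a Secret manifest from decoded docker config content and the matching imagePullSecrets fragment for a pod spec. It assumes the secret type kubernetes.io/dockerconfigjson with the data key .dockerconfigjson; which type the commenter ended up with is not shown in the thread. The helper names, the secret name and the credential placeholders are hypothetical.

```python
import base64
import json


def build_pull_secret(name: str, docker_config: dict) -> dict:
    """Return a Secret manifest holding docker_config under .dockerconfigjson."""
    payload = json.dumps(docker_config).encode("utf-8")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "kubernetes.io/dockerconfigjson",
        # Secret data values are base64-encoded strings.
        "data": {".dockerconfigjson": base64.b64encode(payload).decode("ascii")},
    }


def pod_pull_secrets(secret_name: str) -> dict:
    """Return the pod spec fragment that references the pull secret by name."""
    return {"imagePullSecrets": [{"name": secret_name}]}


# Hypothetical decoded content, standing in for the dockercfg.json file.
docker_config = {
    "auths": {
        "docker-registry.default.svc:5000": {"auth": "<base64 of username:password>"}
    }
}

secret = build_pull_secret("registry-pull-secret", docker_config)
print(json.dumps(secret, indent=2))
print(json.dumps(pod_pull_secrets("registry-pull-secret"), indent=2))
```

The printed manifest could be saved to a file and created in the pod's project; the secret would still need to be linked to the service account and referenced from the pod, as described in the comment.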
@juanvallejo @liggitt seems like there's still an issue w/ how oc secret is creating secrets when you use the |
@bparees will take a look |
@bparees There was a PR from a few months ago that defaulted secrets created using the Based on what I've seen locally with the |
@liggitt @juanvallejo i'm still a little confused because if you look at the SA secrets we create, they are in the old format and seem to work fine:
|
the content of the |
that doesn't sound proper... the name should match the format |
@liggitt that's what I figured, and yet: |
@liggitt @bparees This is the contents of .dockercfg for our docker-registry-secret:
Where <encoded> translates to the following:
|
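For readers who want to inspect such an encoded value themselves: the auth field of a docker config entry is the base64 encoding of "username:password". The following sketch decodes it; the function name `decode_auth` and the sample credentials are made up for illustration and are not the values from this thread.

```python
import base64


def decode_auth(auth: str) -> tuple:
    """Split a base64-encoded "username:password" auth value into its parts."""
    decoded = base64.b64decode(auth).decode("utf-8")
    # Split on the first colon only; tokens and passwords may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


# Made-up credentials, encoded the same way the auth field is.
sample_auth = base64.b64encode(b"serviceaccount:example-token").decode("ascii")
username, password = decode_auth(sample_auth)
print(username)
```

Checking the decoded username is a simple way to confirm which identity a given secret would present to the registry.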
this does not appear to be a registry issue (registry auth appears to be working fine when a valid secret is presented); this is an issue w/ how k8s creates/finds/uses secrets, so fixing the issue ownership. |
@DirectXMan12 PTAL |
It looks like your problem is related to the one described in https://bugzilla.redhat.com/show_bug.cgi?id=1531511, can you please verify? |
We have changed the way we authorize pods for pulling images from the docker registry. Since the imagePullSecrets were not working as desired for us (and are more intended for pulling from external registries), we decided to investigate how we could authorize pods to the registry using their service account token. This was not working when using the docker registry route (docker-registry-default.router.default.svc.cluster.local), but when using the docker registry service (docker-registry.default.svc:5000) pods were authorizing correctly using their service account token. This allowed us to remove the imagePullSecrets in the pod templates and the dockercfg secret we had to create for this. |
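Because the registry hostname in the image reference decides which credentials apply, a small sketch may help when debugging this kind of mismatch. It follows the common Docker convention that the first path component of an image reference is treated as a registry host only if it contains a dot or a colon, or is "localhost". This is a simplified model and an assumption on my part, not the kubelet's or the registry's actual matching logic; the function names and the credential placeholder are hypothetical.

```python
from typing import Optional


def registry_host(image: str) -> Optional[str]:
    """Return the registry host of an image reference, or None if it has none."""
    first, sep, _ = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return None


def find_credentials(image: str, auths: dict) -> Optional[dict]:
    """Look up credentials by exact registry host (simplified matching)."""
    host = registry_host(image)
    return auths.get(host) if host is not None else None


# Credentials keyed by the internal service name only (placeholder value).
auths = {"docker-registry.default.svc:5000": {"auth": "<base64 of username:password>"}}

for image in (
    "docker-registry.default.svc:5000/default/ipsec-router",
    "registry.mycompany.com/default/ipsec-router",
    "ipsec-router",
):
    print(image, "->", registry_host(image), find_credentials(image, auths) is not None)
```

In this model, credentials stored under the service name are not found for an image referenced through the route hostname, and a bare name such as ipsec-router carries no registry host at all.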
I think the root cause of this issue was fixed in #18062 for 3.7. I'm closing this issue based on that. If you're still experiencing issues and you're using the latest version of oc, which includes the aforementioned fix, please reopen. |
I still have the issue with an OpenShift 3.9 cluster: the docker client wrongly returns the error "Error: image myimage/myimage:latest not found" when it is not logged in. It should return an error like "Authentication required". |
that would reveal/confirm that the image exists, to users who potentially should not know the image exists. |
Awesome. I had this frustrating issue and didn't realize that I had to |
hello all, I had the same problem (version 3.9), and resolved it by adding the role |
I can successfully deploy the integrated registry and push a custom image from my client. However, creating a pod that uses this image does not work. Error messages indicate that the image could not be found, but as I've tried to reference the image by different names and none worked, and I also see authentication related error messages from docker, I assume this has to do with failed authentication of the serviceaccount against the internal registry.
Version
Steps To Reproduce
2a. I've exposed the registry with a route and public hostname so I could use it from my client.
2b. I've created a user for remote access as described in Accessing the Registry
2c. I've pushed the image to the registry using the public hostname:
$ docker push registry.mycompany.com/default/ipsec-router
2d. I can see the image on the master logged in as system:admin:
NOTE: I'm not sure about the naming conventions of the image attribute, but I've tried different variations like ipsec-router, docker-registry.default.svc:5000/default/ipsec-router, etc. All errors indicate that the image was not found, but I don't think that this is the issue, see below.
All actions happen in the default project.
Current Result
The pod creation fails.
Pod events:
Expected Result
Pod should get created.
Additional Information
I hit "TASK [template_service_broker : Reconcile with RBAC file] fails" (openshift-ansible#6086), so my openshift-ansible checkout from which I installed is not master but openshift-ansible-3.7.2-1-8-g56b529e.
oc adm diagnostics shows one error which is related to the registry:
According to this mailing list entry this could be a bug. In any case, I do not experience DNS related issues. In fact, the service can be reached and I see credentials for the IP address as well as the service name (see below).
registry container logs show no relevant information (only health checks)
dockerd system logs on the system where the pod should be created indicate an authentication problem:
This sounds to me as if there was an authentication problem for the internal registry and the not found message comes from the other registries tried.
As said, not sure if that should actually work, but it would match the error message seen in the system log from dockerd.