pod failed to be created - cannot set an ownerRef on a resource you can't delete #922

Closed
magaldima opened this issue Jul 27, 2018 · 3 comments

Comments

@magaldima
Contributor

BUG REPORT

What happened:
Triggered the hello-world workflow using Argo Events, and the workflow controller reported the following error:
pods \"arguments-via-webhook-event\" is forbidden: cannot set an ownerRef on a resource you can't delete: User \"system:serviceaccount:dev-axis:argo-sa\" cannot delete pods in project

I assume this is a permissions issue, but my cluster role for argo-sa is in sync with the one in the Helm repo.

What you expected to happen:
The workflow to run.

How to reproduce it (as minimally and precisely as possible):

  1. Install Argo to cluster
  2. Install Argo events to cluster
  3. k create -f examples/webhook-with-resource-param.yaml (from argo events)
  4. curl -d '{"message":"this is my first webhook"}' -H "Content-Type: application/json" -X POST http://webhook-dev-axis.devkubewd.dev.blackrock.com/hello2

Anything else we need to know?:
Running on OpenShift

argo-sa cluster role:

apiVersion: authorization.openshift.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: 2018-07-09T21:34:43Z
  name: argo-cluster-role
  resourceVersion: "114466986"
  selfLink: /apis/authorization.openshift.io/v1/clusterroles/argo-cluster-role
  uid: e50bd3f0-83bf-11e8-8146-fa163e3d2f07
rules:
- apiGroups:
  - argoproj.io
  attributeRestrictions: null
  resources:
  - workflows
  verbs:
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - ""
  attributeRestrictions: null
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  attributeRestrictions: null
  resources:
  - persistentvolumeclaims
  verbs:
  - create
  - delete
- apiGroups:
  - ""
  attributeRestrictions: null
  resources:
  - pods
  - pods/exec
  verbs:
  - create
  - get
  - list
  - patch
  - update
  - watch

Environment:

  • Argo version:
v2.1.0-beta1
  • Kubernetes version :
clientVersion:
  buildDate: 2018-04-13T22:29:03Z
  compiler: gc
  gitCommit: d4ab47518836c750f9949b9e0d387f20fb92260b
  gitTreeState: clean
  gitVersion: v1.10.1
  goVersion: go1.9.5
  major: "1"
  minor: "10"
  platform: darwin/amd64
serverVersion:
  buildDate: 2018-04-19T18:22:07Z
  compiler: gc
  gitCommit: a0ce1bc
  gitTreeState: clean
  gitVersion: v1.9.1+a0ce1bc657
  goVersion: go1.9.2
  major: ""
  minor: ""
  platform: linux/amd64

Other debugging information (if applicable):

time="2018-07-26T23:46:04Z" level=info msg="workflow controller configuration from rolling-otter-workflow-controller-configmap:\ninstanceID: \nserviceAccountName: argo-sa\nnamespace: dev-axis\nartifactRepository:\nexecutorImage: \"argoproj/argoexec:v2.1.0-beta1\"\n"
time="2018-07-26T23:46:04Z" level=info msg="Workflow Controller (version: v2.1.0-beta1) starting"
time="2018-07-26T23:46:04Z" level=info msg="Watch Workflow controller config map updates"
time="2018-07-26T23:46:04Z" level=info msg="Detected ConfigMap update. Updating the controller config."
time="2018-07-26T23:46:04Z" level=info msg="workflow controller configuration from rolling-otter-workflow-controller-configmap:\ninstanceID: \nserviceAccountName: argo-sa\nnamespace: dev-axis\nartifactRepository:\nexecutorImage: \"argoproj/argoexec:v2.1.0-beta1\"\n"
time="2018-07-26T23:48:59Z" level=info msg="Processing workflow" namespace=dev-axis workflow=arguments-via-webhook-event
time="2018-07-26T23:48:59Z" level=info msg="Updated phase  -> Running" namespace=dev-axis workflow=arguments-via-webhook-event
time="2018-07-26T23:48:59Z" level=info msg="Failed to create pod arguments-via-webhook-event (arguments-via-webhook-event): pods \"arguments-via-webhook-event\" is forbidden: cannot set an ownerRef on a resource you can't delete: User \"system:serviceaccount:dev-axis:argo-sa\" cannot delete pods in project \"dev-axis\", <nil>" namespace=dev-axis workflow=arguments-via-webhook-event
@jessesuen
Member

cannot set an ownerRef on a resource you can't delete

Yes, the error message is pretty clear about what needs to be done. I haven't seen this on K8s v1.10 on GKE. I searched the k8s codebase for this error message, and it appears to come from the GC admission plugin.

https://github.com/kubernetes/kubernetes/blob/88c25ca2d957ed32b9d24b91880450560b0062c1/plugin/pkg/admission/gc/gc_admission.go#L116

I think the fix should be simple:

- apiGroups:
  - ""
  resources:
  - pods
  - pods/exec
  verbs:
  - create
  - get
  - list
  - watch
  - update
  - patch
  - delete  # <<<<< this verb should be added

I don't have a way to reproduce this. Can you confirm that by adding the delete to the policy rule, it will fix the issue? If so I'll update the rules.
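
One way to confirm this without rerunning the workflow is to ask the API server directly with `kubectl auth can-i`, impersonating the service account named in the error. The sketch below only builds that command line; the helper name `can_i_command` is hypothetical, and the namespace and service account are taken from the log above. Impersonation with `--as` assumes the caller is itself allowed to impersonate users.

```python
import shlex


def can_i_command(verb, resource, namespace, service_account):
    """Build a `kubectl auth can-i` invocation that impersonates a service account.

    Hypothetical helper for illustration; it only assembles the argument list
    and does not contact the cluster.
    """
    # Service accounts authenticate as system:serviceaccount:<namespace>:<name>,
    # which is the user string shown in the controller's error message.
    user = f"system:serviceaccount:{namespace}:{service_account}"
    return [
        "kubectl", "auth", "can-i", verb, resource,
        "--namespace", namespace,
        "--as", user,
    ]


# Values below come from the error message in this issue.
cmd = can_i_command("delete", "pods", "dev-axis", "argo-sa")
print(shlex.join(cmd))
```

Passing the returned list to `subprocess.run` against a real cluster should print `yes` or `no`, so the check can be repeated after editing the cluster role.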

@magaldima
Contributor Author

@jessesuen after adding the delete verb, I got a slightly different error:

time="2018-07-27T12:43:28Z" level=info msg="Processing workflow" namespace=dev-axis workflow=arguments-via-webhook-event
time="2018-07-27T12:43:28Z" level=info msg="Updated phase  -> Running" namespace=dev-axis workflow=arguments-via-webhook-event
time="2018-07-27T12:43:28Z" level=info msg="Failed to create pod arguments-via-webhook-event (arguments-via-webhook-event): pods \"arguments-via-webhook-event\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: User \"system:serviceaccount:dev-axis:argo-sa\" cannot update workflows/finalizers.argoproj.io in project \"dev-axis\", <nil>" namespace=dev-axis workflow=arguments-via-webhook-event
time="2018-07-27T12:43:28Z" level=info msg="Pod node arguments-via-webhook-event (arguments-via-webhook-event) initialized Error (message: pods \"arguments-via-webhook-event\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: User \"system:serviceaccount:dev-axis:argo-sa\" cannot update workflows/finalizers.argoproj.io in project \"dev-axis\", <nil>)" namespace=dev-axis workflow=arguments-via-webhook-event

After finding this OpenShift issue, I updated the ClusterRole to:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: argo-cluster-role
rules:
- apiGroups: ["argoproj.io"]
  resources: ["workflows", "workflows/finalizers"]
  verbs: ["get", "list", "watch", "update", "patch"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "watch", "list"]
- apiGroups: [""]
  resources: ["persistentvolumeclaims"]
  verbs: ["create", "delete"]
- apiGroups: [""]
  resources: ["pods", "pods/exec", "pods/log"]
  verbs: ["create", "get", "list", "watch", "update", "patch", "delete"]

This worked!

@jessesuen
Member

Thanks for discovering this! I had no idea that */finalizers was a resource. I'll update the rules to include workflows/finalizers and delete.
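
To summarize the two permissions this thread uncovered, here is a minimal sketch that checks a role's rule list for them offline. The function names (`grants`, `missing`) and the `REQUIRED` list are hypothetical; the rule data is copied from the two cluster roles posted above. Wildcard handling is deliberately simplified (only a bare `*` is recognized), so this is not a full reimplementation of the Kubernetes RBAC matching rules.

```python
# Permissions the two error messages in this issue complained about:
# (apiGroup, resource, verb). This list is an assumption drawn from the logs.
REQUIRED = [
    ("", "pods", "delete"),
    ("argoproj.io", "workflows/finalizers", "update"),
]


def _matches(allowed, wanted):
    """True if `wanted` is listed explicitly or covered by a bare '*'."""
    return "*" in allowed or wanted in allowed


def grants(rules, api_group, resource, verb):
    """Return True if any single rule allows `verb` on `resource` in `api_group`."""
    return any(
        _matches(rule.get("apiGroups", []), api_group)
        and _matches(rule.get("resources", []), resource)
        and _matches(rule.get("verbs", []), verb)
        for rule in rules
    )


def missing(rules):
    """List the REQUIRED permissions that `rules` does not grant."""
    return [req for req in REQUIRED if not grants(rules, *req)]


# Rules from the original cluster role in the bug report.
original_rules = [
    {"apiGroups": ["argoproj.io"], "resources": ["workflows"],
     "verbs": ["get", "list", "patch", "update", "watch"]},
    {"apiGroups": [""], "resources": ["configmaps"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": [""], "resources": ["persistentvolumeclaims"],
     "verbs": ["create", "delete"]},
    {"apiGroups": [""], "resources": ["pods", "pods/exec"],
     "verbs": ["create", "get", "list", "patch", "update", "watch"]},
]

# Rules from the updated cluster role that resolved the issue.
fixed_rules = [
    {"apiGroups": ["argoproj.io"], "resources": ["workflows", "workflows/finalizers"],
     "verbs": ["get", "list", "watch", "update", "patch"]},
    {"apiGroups": [""], "resources": ["configmaps"],
     "verbs": ["get", "watch", "list"]},
    {"apiGroups": [""], "resources": ["persistentvolumeclaims"],
     "verbs": ["create", "delete"]},
    {"apiGroups": [""], "resources": ["pods", "pods/exec", "pods/log"],
     "verbs": ["create", "get", "list", "watch", "update", "patch", "delete"]},
]

print("original role is missing:", missing(original_rules))
print("updated role is missing:", missing(fixed_rules))
```

The same check could be pointed at the rules of any other role that creates pods with owner references on a cluster where this admission check is enforced.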
