-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve the process of pod updates by preferring non-mutating SCCs and reducing pod mutations #16934
Improve the process of pod updates by preferring non-mutating SCCs and reducing pod mutations #16934
Conversation
8941821
to
13f83ad
Compare
/assign @eparis |
pkg/security/admission/admission.go
Outdated
if errs := scc.AssignSecurityContext(provider, pod, field.NewPath(fmt.Sprintf("provider %s: ", provider.GetSCCName()))); len(errs) > 0 { | ||
clone, err := kapi.Scheme.DeepCopy(pod) | ||
if err != nil { | ||
return admission.NewForbidden(a, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liggitt This is the equivalent for pod.DeepCopy()
. I hope that it's ok to return Forbidden when we can't clone a pod.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that's fine. definitely a "should never happen" type of error
// ensure anything we expected to be defaulted on the container level is set | ||
if *containers[0].SecurityContext.RunAsUser != v.expectedUID { | ||
t.Errorf("%s expected UID %d but found %d", k, v.expectedUID, *containers[0].SecurityContext.RunAsUser) | ||
podSC := securitycontext.NewPodSecurityContextAccessor(v.pod.Spec.SecurityContext) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liggitt I decided to use accessors here but I suspect there should be a better option for updating this test. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it needs to look like the upstream tests... expectedPodSecurityContext and expectedContainerSecurityContext, and do literal comparisons
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've filled a separate PR for this change: #17296
@@ -15,6 +15,7 @@ import ( | |||
v1kapi "k8s.io/kubernetes/pkg/api/v1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liggitt Shall I backport the tests for admission
from your PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it probably won't be an exact copy (since we prefer more restrictive rather than non-mutating), but we do need the tests to start looking at exact matches of PodSecurityContext and SecurityContext, and to check for non-mutation on update
PTAL @openshift/sig-security |
@@ -141,16 +141,17 @@ func TestAdmitCaps(t *testing.T) { | |||
for i := 0; i < 2; i++ { | |||
for k, v := range tc { | |||
v.pod.Spec.Containers, v.pod.Spec.InitContainers = v.pod.Spec.InitContainers, v.pod.Spec.Containers | |||
|
|||
testSCCAdmit(k, v.sccs, v.pod, v.shouldPass, t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because testSCCAdmit()
modifies containers, I had to move it above, so when we're assigning containers to a local variable we get the updated value.
As far I understand, prior this change the admissions plugin was modifying a members of the passed argument and now it completely replacing an argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that's fine... shouldn't start saving pointers to internal bits of the admitted object until after it is admitted
pkg/security/admission/admission.go
Outdated
allowingProvider = provider | ||
break loop | ||
case specMutationAllowed && allowedPod == nil: | ||
// if mutation is allowed and this is the first SCC to allow the pod, remember it, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach that @liggitt has used for Kubernetes differs from the our current. For example, I created a simple pod under system:admin
and a pod was admitted by the privileged
SCC (because it accepts a pod without mutation). But the current version of OpenShift would admit my pod by the anyuid
SCC (because it less permissive).
@liggitt @pweil- What is the desired behavior for this case? Preserve backward compatibility, be compatible with Kubernetes, or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have to preserve backwards compatibility as much as possible here while working around the mutation bug. But if it doesn't mutate the pod does it matter which SCC admitted it? I would think only a change in mutation behavior would be a problem here. (wouldn't anyuid
still mutate the selinux labels where privileged
would not?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on create, keep our current priority order. on update (where mutation is not allowed), follow the non-mutating logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
our handling of this will look different than PSP, because of our "prefer most restrictive" behavior we have to stay compatible with.
I'd expect something along these lines:
// the entire pod validated
switch {
case specMutationAllowed:
// if we allow mutation, just use the first SCC that allows the pod
allowedPod = podCopy
allowingProvider = provider
break loop
case apiequality.Semantic.DeepEqual(pod, podCopy):
// if we don't allow mutation, only use the validated pod if it didn't require any spec changes
allowedPod = podCopy
allowingProvider = provider
glog.V(6).Infof("pod %s (generate: %s) validated against provider %s without mutation", pod.Name, pod.GenerateName, provider.GetSCCName())
break loop
default:
glog.V(6).Infof("pod %s (generate: %s) validated against provider %s, but required mutation, skipping", pod.Name, pod.GenerateName, provider.GetSCCName())
}
Test Flake #16999 /test extended_conformance_gce |
/test cmd Test flake #16273 |
sc := securitycontext.NewEffectiveContainerSecurityContextMutator( | ||
securitycontext.NewPodSecurityContextAccessor(pod.Spec.SecurityContext), | ||
securitycontext.NewContainerSecurityContextMutator(container.SecurityContext), | ||
) | ||
annotations := maps.CopySS(pod.Annotations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to copy this? we're copying top-level... we should be free to modify anything in the pod now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't need to copy, so why didn't you remove it -- kubernetes/kubernetes@8c5b013#diff-291b8dd7d08cc034975ddb3925dbb08fR245 ? Should we remove it then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oversight. we can remove it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've submitted kubernetes/kubernetes#55504 for that.
582ef5a
to
5d40689
Compare
Automatic merge from submit-queue (batch tested with PRs 17369, 17296). admission_test.go(TestAdmit): compare SecurityContexts instead of particular members This makes the test close to the upstream implementation. Extracted from #16934 (comment) PTAL @liggitt @pweil- @adelton @simo5
e857076
to
753f7ae
Compare
@liggitt All comments were addressed. PTAL. |
/hold |
7ec5eba
to
de94214
Compare
/hold cancel |
All checks failed because #17884 |
/retest |
[Feature:Builds][pruning] prune builds based on settings in the buildconfig should prune completed builds based on the successfulBuildsHistoryLimit setting [Suite:openshift/conformance/parallel] failed because of #17330 executing 'oc observe services --once --all-namespaces -a "bad{ .metadata.annotations.unset }key" --strict-templates' expecting failure and text 'annotations is not found' failed because of #17811 |
@liggitt PTAL |
@liggitt New Year Ping :) |
/retest |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: liggitt, php-coder The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
Test flake #17781 |
Test flake #17811 |
#17811 again |
Tests failed because of #17985 |
/retest |
that is caused by:
that is a test flake #17330
that is test flake #17581 /test extended_conformance_install |
Test /test extended_conformance_install |
Automatic merge from submit-queue (batch tested with PRs 17856, 16934, 17979, 17993, 18001). |
…ring_pod_update Automatic merge from submit-queue (batch tested with PRs 17856, 16934, 17979, 17993, 18001). Improve the process of pod updates by preferring non-mutating SCCs and reducing pod mutations This is adaptation of kubernetes/kubernetes#52849 to SCC. Details (mostly copy&pasted from original PR): - SCCs which allow the pod as-is (no defaulting/mutating) are preferred - During update operations, when mutations to pod specs are disallowed, only non-mutating SCCs are used to validate the pod - Removes unnecessary mutation of pods: - Determines effective security context for pods using a wrapper containing the pod and container security context, rather than building/setting a combined struct on every admission - Does not set `privileged: &false` on security contexts with `privileged: nil` - Does not set `runAsNonRoot: &true` on security contexts that already have a non-nil, non-0 runAsUser - Does not mutate/normalize container capabilities unless changes are required (missing `defaultAddCapabilities` or `requiredDropCapabilities`) Fixes #16467
This is adaptation of kubernetes/kubernetes#52849 to SCC.
Details (mostly copy&pasted from original PR):
privileged: &false
on security contexts withprivileged: nil
runAsNonRoot: &true
on security contexts that already have a non-nil, non-0 runAsUserdefaultAddCapabilities
orrequiredDropCapabilities
)Fixes #16467