UPSTREAM: <carry>: pods in openshift-* namespace can be marked critical #19104

Merged

Conversation

derekwaynecarr
Member

If we pursue the static pod based deployment topology, we should extend critical pod support to openshift-* namespaces until priority/preemption graduates in Kubernetes.

input from @smarterclayton @liggitt @sjenning @aveshagarwal appreciated.

@openshift-ci-robot added labels on Mar 26, 2018: approved (indicates a PR has been approved by an approver from all required OWNERS files), size/S (denotes a PR that changes 10-29 lines, ignoring generated files)
@derekwaynecarr
Member Author

/hold pending review.

@openshift-merge-robot added the vendor-update label (touching vendor dir or related files) on Mar 26, 2018
@derekwaynecarr added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Mar 26, 2018
@aveshagarwal
Contributor

@derekwaynecarr where are we using scheduler.alpha.kubernetes.io/critical-pod in openshift?

@aveshagarwal
Contributor

@derekwaynecarr is it to avoid eviction by the kubelet when an eviction threshold is reached? Or is there another reason? Just trying to understand.

@aveshagarwal
Contributor

Using scheduler.alpha.kubernetes.io/critical-pod, which never came out of alpha and is being deprecated by priority/preemption, does not seem like the right way to go at this stage. Maybe we can find some other way for the time being than using scheduler.alpha.kubernetes.io/critical-pod?

@aveshagarwal
Contributor

@derekwaynecarr it is not clear to me why we need this for the static pod based deployment topology.

@derekwaynecarr
Member Author

@aveshagarwal see details here: kubernetes/kubernetes#40573

static pods once evicted are never restarted, so we need to ensure they are never evicted.

@aveshagarwal
Contributor

> @aveshagarwal see details here: kubernetes/kubernetes#40573
>
> static pods once evicted are never restarted, so we need to ensure they are never evicted.

Wondering what would evict those pods in the context we are discussing here? If the eviction happens because an eviction threshold is reached on a machine, we should make sure such conditions don't occur on the machines where infra static pods are running (this should not be difficult if we don't allow any other pods on those machines). If it's about drain, drain by default does not evict static pods unless --force is used.

@smarterclayton
Contributor

smarterclayton commented Mar 27, 2018 via email

@derekwaynecarr
Member Author

@aveshagarwal independent of masters, openshift-logging would benefit as well.

@aveshagarwal
Contributor

> @aveshagarwal independent of masters, openshift-logging would benefit as well.

I agree with that.

Was just wondering if there is an option to not use scheduler.alpha.kubernetes.io/critical-pod and instead disable eviction on the kubelet for the static pod deployment strategy (assuming only critical workloads are run, and no non-critical workloads).

For openshift-logging, which needs to run on all nodes, right now it seems only scheduler.alpha.kubernetes.io/critical-pod can help to avoid eviction. Still wondering if there is any possibility of using the priority/preemption alpha feature rather than this annotation-based alpha feature.

@sjenning
Contributor

This doesn't build right now

vendor/k8s.io/kubernetes/pkg/kubelet/types/pod_update.go:155:56: not enough arguments in call to strings.HasPrefix
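
For context on the error above: strings.HasPrefix takes two arguments, the string to inspect and the prefix. The following is only a minimal sketch of what a namespace-aware critical-pod check could look like, not the code in this PR. The function name isCritical, the constant names, and the rule that the annotation must be present with an empty value are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// Annotation key discussed in this thread.
	criticalPodAnnotation = "scheduler.alpha.kubernetes.io/critical-pod"
	// Hypothetical constant names; the values follow the PR title and discussion.
	kubeSystemNamespace      = "kube-system"
	openshiftNamespacePrefix = "openshift-"
)

// isCritical is a hypothetical helper: a pod counts as critical only if it
// lives in kube-system or an openshift-* namespace AND carries the annotation.
// Treating only an empty annotation value as valid is an assumption.
func isCritical(namespace string, annotations map[string]string) bool {
	// strings.HasPrefix needs both the string and the prefix.
	inAllowedNamespace := namespace == kubeSystemNamespace ||
		strings.HasPrefix(namespace, openshiftNamespacePrefix)
	if !inAllowedNamespace {
		return false
	}
	val, ok := annotations[criticalPodAnnotation]
	return ok && val == ""
}

func main() {
	annotated := map[string]string{criticalPodAnnotation: ""}
	fmt.Println(isCritical("openshift-sdn", annotated)) // annotated pod in an openshift-* namespace
	fmt.Println(isCritical("default", annotated))       // annotated pod outside the allowed namespaces
	fmt.Println(isCritical("openshift-sdn", nil))       // allowed namespace, no annotation
}
```

Under these assumptions, an annotated pod in the default namespace (for example the router) would still not be treated as critical.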

@derekwaynecarr
Member Author

derekwaynecarr commented Apr 20, 2018

current list of critical pods:

  1. sdn
  2. ovs
  3. sync

should be critical (but do not appear to be as of 4/20)

  1. master-api
  2. master-controllers
  3. master-etcd
  4. logging fluentd ds

up for debate:

  1. router (in the default namespace, so not covered by this PR; I don't think it's critical)
  2. docker-registry (^)
  3. web-console (don't think it's critical)

fyi @smarterclayton

@derekwaynecarr
Member Author

i think this should pass all checkers now...

@derekwaynecarr removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Apr 24, 2018
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) on Apr 24, 2018
@openshift-ci-robot added size/M (denotes a PR that changes 30-99 lines, ignoring generated files) and removed size/S (denotes a PR that changes 10-29 lines, ignoring generated files) on Apr 24, 2018
@openshift-bot removed the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) on Apr 24, 2018
@openshift-ci-robot added size/S (denotes a PR that changes 10-29 lines, ignoring generated files) and removed size/M (denotes a PR that changes 30-99 lines, ignoring generated files) on Apr 24, 2018
@liggitt
Contributor

liggitt commented May 1, 2018

/lgtm

will drop once we get priority in 1.11

@openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) on May 1, 2018
@derekwaynecarr
Member Author

/test gcp
/test extended_conformance_install

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@@ -287,7 +287,7 @@ var defaultKubernetesFeatureGates = map[utilfeature.Feature]utilfeature.FeatureS
 AppArmor: {Default: true, PreRelease: utilfeature.Beta},
 DynamicKubeletConfig: {Default: false, PreRelease: utilfeature.Alpha},
 ExperimentalHostUserNamespaceDefaultingGate: {Default: false, PreRelease: utilfeature.Beta},
-ExperimentalCriticalPodAnnotation: {Default: false, PreRelease: utilfeature.Alpha},
+ExperimentalCriticalPodAnnotation: {Default: true, PreRelease: utilfeature.Alpha},
Contributor

I know that I am responding late, but yesterday I realized that a side effect of enabling this feature is that the kubelet will start preempting pods (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/preemption/preemption.go) when there is a resource crunch on the node. This happens when the kubelet starts or when static pods are admitted onto the node, but there is a very good chance that some 'high priority' (but not critical) pods are evicted from the node, because the preemption logic there is not priority-aware (except for critical pods).

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
size/S: Denotes a PR that changes 10-29 lines, ignoring generated files.
vendor-update: Touching vendor dir or related files
9 participants