In the current setup, the NVIDIA_VISIBLE_DEVICES environment variable is added to a ConfigMap so that the pod is pinned to a MIG slice. A user could set this variable directly in the pod spec at submit time, which would give the container access to a slice not chosen by InstaSlice and, in the worst case, access to all the GPUs on the node. We should modify the webhook to reject such pods at submit time.
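To make the proposed check concrete, here is a minimal sketch of the rejection logic, not the actual InstaSlice webhook code. The `EnvVar`, `Container`, and `PodSpec` types are simplified stand-ins for the Kubernetes `core/v1` types, and `validateNoReservedEnv` is a hypothetical helper name.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes core/v1 types (assumption: a real
// webhook would decode the pod into k8s.io/api/core/v1 types instead).
type EnvVar struct {
	Name  string
	Value string
}

type Container struct {
	Name string
	Env  []EnvVar
}

type PodSpec struct {
	InitContainers []Container
	Containers     []Container
}

// reservedEnv is the variable that only InstaSlice should control.
const reservedEnv = "NVIDIA_VISIBLE_DEVICES"

// validateNoReservedEnv (hypothetical helper) returns an error naming the
// first container, init or regular, that sets reservedEnv directly.
func validateNoReservedEnv(spec PodSpec) error {
	groups := [][]Container{spec.InitContainers, spec.Containers}
	for _, group := range groups {
		for _, c := range group {
			for _, e := range c.Env {
				if e.Name == reservedEnv {
					return fmt.Errorf("container %q sets %s, which is reserved",
						c.Name, reservedEnv)
				}
			}
		}
	}
	return nil
}
```

Calling `validateNoReservedEnv` on the pod spec as submitted by the user, before the webhook applies its own changes, would let the webhook deny the request with the returned message. This sketch only inspects literal `env` entries; a variable injected through a user-supplied ConfigMap or Secret reference would need a separate check.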
Also, we could consider having the webhook intercept pod updates as well as pod creation, to make sure users don't intentionally set NVIDIA_VISIBLE_DEVICES to 0 after the pod has been submitted.
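A rough sketch of what covering both operations could look like is shown below. The `admit` function and its flat list of env var names are hypothetical simplifications; the operation strings `CREATE` and `UPDATE` match the values used by the Kubernetes admission API.

```go
package main

// Operation values as they appear in Kubernetes admission requests.
const (
	opCreate = "CREATE"
	opUpdate = "UPDATE"
)

// reservedEnv is the variable that only InstaSlice should control.
const reservedEnv = "NVIDIA_VISIBLE_DEVICES"

// admit (hypothetical helper) reports whether a request is allowed, plus a
// denial reason. envNames holds the env var names found in the containers of
// the submitted (new) pod object.
func admit(operation string, envNames []string) (bool, string) {
	switch operation {
	case opCreate, opUpdate:
		// Apply the same check on creation and on update.
		for _, name := range envNames {
			if name == reservedEnv {
				return false, "pods must not set " + reservedEnv + " directly"
			}
		}
	}
	// Other operations (for example DELETE) are not inspected here.
	return true, ""
}
```

For this to take effect, the webhook registration would also need to list the update operation in its rules; how that is configured in InstaSlice is not shown in this issue.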
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/remove-lifecycle stale
openshift-cibot added the lifecycle/rotten label (denotes an issue or PR that has aged beyond stale and will be auto-closed) and removed the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) on Feb 16, 2025.