[Feature Request] Hot-plug vCPUs #2609
Comments
@jeromegn is vertical scaling the only solution to the problem you are trying to fix? Would horizontal scaling, like running multiple instances, not help? Just curious to understand the use case.
Hi @jeromegn! You might also want to look at the kernel's CPU online/offline feature: https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu.
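For context on that kernel interface: each CPU exposes an `online` file under sysfs that the guest can write to. Below is a minimal, illustrative sketch of toggling it from inside the guest (assuming a Linux guest kernel built with CONFIG_HOTPLUG_CPU and root privileges; this is not Firecracker-specific code):

```rust
use std::fs;
use std::io;

/// Bring a guest CPU online (true) or offline (false) through the sysfs
/// hotplug interface documented at the link above.
fn set_cpu_online(cpu: usize, online: bool) -> io::Result<()> {
    // Note: cpu0 typically cannot be offlined and may not expose an
    // `online` file at all.
    let path = format!("/sys/devices/system/cpu/cpu{}/online", cpu);
    fs::write(path, if online { "1" } else { "0" })
}

fn main() -> io::Result<()> {
    // Example: take CPU 1 offline, then bring it back online.
    set_cpu_online(1, false)?;
    set_cpu_online(1, true)
}
```

This only toggles CPUs the VMM already exposed at boot, so it suits a "give all cores up front, then enable/disable" approach rather than true hotplug.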
@KarthikNedunchezhiyan we need to support a large variety of workloads. More VMs isn't always the solution, but we're already doing that. @raduiliescu thanks! That could work, but our users have …
We need to think about it for a bit; we will get back to you.
Marking this as parked for now, but we will track it as part of our roadmap and take it into consideration while working on related features.
Hey all, …
Thanks for the update on this. Just to clarify, 30 ms per vCPU sounds like an acceptable amount of latency for a vCPU hotplug process for most use cases, but it sounds like there's an issue in the prototype that requires re-adding previously hotplugged vCPUs after a snapshot restore, which is why this latency applies to the post-restore scenario. In the linked branch, it sounds like the issue has something to do with the vCPU config not persisting after a restore (see firecracker/docs/vcpu-hotplug.md, lines 127 to 132 at c41ceb5).
Is this understanding correct? Does the next step in the implementation involve updating the snapshot state format to include the relevant vCPU hotplug info in the microVM state, or was there some other issue in the implementation that incurs vCPU hotplug latency in the snapshot restore process?
Hi Will,
We only implemented pre-snapshot hotplugging back in July to get an idea of the latencies involved, but for the use case we were looking at, we actually want to hotplug vCPUs only after restore (i.e. no hotplug ever happens before a snapshot is taken).
Essentially, the entire ACPI hotplug device would need to be persisted in the snapshot, but generally there shouldn't be much of an issue with that.
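Purely as an illustration of what persisting that state might involve (the type and field names below are invented for this sketch and are not Firecracker's actual snapshot types), the extra state is conceptually small:

```rust
use serde::{Deserialize, Serialize};

/// Hypothetical record of vCPU hotplug state that a snapshot would need
/// to carry so the guest sees a consistent view after restore.
/// None of these names exist in Firecracker today.
#[derive(Serialize, Deserialize)]
struct VcpuHotplugState {
    /// Maximum vCPU count the ACPI tables were sized for at boot.
    max_vcpus: u8,
    /// vCPUs currently online in the guest at snapshot time.
    online_vcpus: u8,
    /// Register state of the ACPI CPU hotplug device.
    hotplug_device_regs: Vec<u8>,
}
```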
Feature Request
We'd like to be able to add (and remove) vCPUs from running Firecracker microVMs. It appears to be possible to do that with KVM. Examples I've seen define how many vCPUs at most a VM might use and then the actual number it will be using at boot; you can then add vCPUs on the fly via virsh: https://www.unixarena.com/2015/12/linux-kvm-how-to-add-remove-vcpu-to-guest-on-fly.html/
This would allow us to add a "burst" feature when CPU usage spikes.
Describe the desired solution
An API to modify a running microVM's vCPU count.
This should notify the guest VM of the change.
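For illustration only, such an API could be driven over the existing Firecracker API socket; the `/hotplug/vcpus` endpoint, its JSON body, and the socket path below are invented for this sketch and are not part of Firecracker's current API:

```rust
use std::io::Write;
use std::os::unix::net::UnixStream;

fn main() -> std::io::Result<()> {
    // The socket path depends on how the VMM was started (--api-sock).
    let mut sock = UnixStream::connect("/tmp/firecracker.socket")?;

    // Hypothetical request: raise the online vCPU count of the running
    // microVM to 4. Endpoint and body are made up for illustration.
    let body = r#"{"vcpu_count": 4}"#;
    let request = format!(
        "PATCH /hotplug/vcpus HTTP/1.1\r\n\
         Host: localhost\r\n\
         Content-Type: application/json\r\n\
         Content-Length: {}\r\n\r\n{}",
        body.len(),
        body
    );
    sock.write_all(request.as_bytes())?;
    Ok(())
}
```

On the guest side, the VMM would then raise an ACPI notification so the kernel enumerates the new CPU, after which it can be brought online via the sysfs interface mentioned earlier.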
Describe possible alternatives
We could give every Firecracker microVM access to all cores and only use cgroups to limit actual scheduling time. This is not great, though, as it might create a lot of CPU steal; we prefer to give full cores when possible.
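For comparison, here is a minimal sketch of that cgroup-based alternative, assuming cgroup v2 and an example cgroup path (the path depends on how the jailer/host lays out cgroups):

```rust
use std::fs;
use std::io;

/// Cap a microVM's cgroup at roughly `cpus` cores using cgroup v2's
/// cpu.max file, which takes "<quota_us> <period_us>".
fn set_cpu_quota(cgroup: &str, cpus: f64) -> io::Result<()> {
    let period_us: u64 = 100_000;
    let quota_us = (cpus * period_us as f64) as u64;
    fs::write(
        format!("/sys/fs/cgroup/{}/cpu.max", cgroup),
        format!("{} {}", quota_us, period_us),
    )
}

fn main() -> io::Result<()> {
    // Example: let this microVM use the equivalent of 2.5 cores.
    set_cpu_quota("firecracker/microvm-1", 2.5)
}
```

This throttles scheduling time rather than changing what the guest sees, which is exactly the CPU-steal downside described above.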