
Adopters, Integrations and Presentations

Adopters

This list is based on public documentation; please open an issue if you would like to be added to or removed from it.

AWS:

  • Amazon EKS supports running superpods with LeaderWorkerSet to serve large LLMs, see the blog post here.
  • A Terraform-based EKS Blueprints pattern can be found here. This pattern demonstrates an Amazon EKS cluster with an EFA-enabled node group that supports multi-node inference using vLLM and LeaderWorkerSet.

DaoCloud: LeaderWorkerSet is the default deployment method for running large models across multiple nodes on Kubernetes.

Google Cloud:

  • GKE leverages LeaderWorkerSet to deploy and serve multi-host generative AI large open models, see the blog post here.
  • A guide to serving DeepSeek-R1 671B or Llama 3.1 405B on GKE is available here.

NVIDIA: LeaderWorkerSet deployments are the recommended method for deploying multi-node models with NIM, see the documentation here.

Integrations

Feel free to submit a PR if you use LeaderWorkerSet in your project and want to be added here.

llmaz: An easy-to-use, advanced inference platform that uses LeaderWorkerSet as its underlying workload to support both single-host and multi-host inference scenarios.

vLLM: A fast and easy-to-use library for LLM inference. It can be deployed with LWS on Kubernetes for distributed model serving, see the documentation here; a minimal manifest sketch follows below.
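For context, a LeaderWorkerSet groups one leader pod with a set of worker pods and scales, schedules, and restarts them as a unit, which is what makes it a fit for multi-host serving. The sketch below is a minimal, illustrative LWS manifest for running vLLM across two nodes; the image tag, model name, GPU counts, and startup commands are assumptions for illustration, not the project's official example (see the vLLM documentation linked above for that).

```yaml
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: vllm
spec:
  replicas: 1                        # number of leader+worker groups
  leaderWorkerTemplate:
    size: 2                          # pods per group: 1 leader + 1 worker
    restartPolicy: RecreateGroupOnPodRestart  # restart the whole group together
    leaderTemplate:
      spec:
        containers:
          - name: vllm-leader
            image: vllm/vllm-openai:latest   # illustrative image tag
            command: ["sh", "-c"]
            args:
              # Illustrative: start a Ray head, then serve an assumed model
              # spanning both pods via tensor + pipeline parallelism.
              - ray start --head --port=6379 &&
                vllm serve meta-llama/Llama-3.1-405B
                --tensor-parallel-size 8 --pipeline-parallel-size 2
            resources:
              limits:
                nvidia.com/gpu: "8"
    workerTemplate:
      spec:
        containers:
          - name: vllm-worker
            image: vllm/vllm-openai:latest
            command: ["sh", "-c"]
            args:
              # Workers join the leader's Ray cluster; LWS injects the
              # leader's address into each pod as LWS_LEADER_ADDRESS.
              - ray start --address=$(LWS_LEADER_ADDRESS):6379 --block
            resources:
              limits:
                nvidia.com/gpu: "8"
```

Applying this with `kubectl apply -f` creates the leader and worker pods as one gang; scaling `replicas` adds whole groups rather than individual pods.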

Talks and Presentations