tensor-parallelism

Here are 15 public repositories matching this topic...

vLLM - High-throughput, memory-efficient LLM inference engine with PagedAttention, continuous batching, CUDA/HIP optimization, quantization (GPTQ/AWQ/INT4/INT8/FP8), tensor/pipeline parallelism, OpenAI-compatible API, multi-GPU/TPU/Neuron support, prefix caching, and multi-LoRA capabilities

  • Updated Jan 11, 2026
  • Elixir
