vLLM

Organization

@vllm-project

On GitHub since June 2023

Published Tools

616,369 Total Stars

Weekly Downloads

3,435 GitHub Followers

Public Repos

100/100 Avg Security

Published Tools

15 Skills, 1 Agent across 3 categories

vllm-tpu

vLLM Team

A high-throughput and memory-efficient inference and serving engine for LLMs

Skill · ai-ml

77K · 1 dir

vllm-cpu-amxbf16

vLLM CPU inference engine (AVX512 + VNNI + BF16 + AMX optimized)

Skill · ai-ml

75K · 1 dir

vllm-cpu-avx512bf16

vLLM CPU inference engine (AVX512 + VNNI + BF16 optimized)

Skill · ai-ml

75K · 1 dir

vllm-cpu-avx512vnni

vLLM CPU inference engine (AVX512 + VNNI optimized)

Skill · ai-ml

75K · 1 dir

vllm-cpu-avx512

vLLM CPU inference engine (AVX512 optimized)

Skill · ai-ml

75K · 1 dir

vllm-cpu

vLLM Team

A high-throughput and memory-efficient inference and serving engine for LLMs

Skill · ai-ml

75K · 1 dir

vllm-hust

vLLM Team

A high-throughput and memory-efficient inference and serving engine for LLMs

Skill · ai-ml

74K · 1 dir

vllm

vLLM Team

A high-throughput and memory-efficient inference and serving engine for LLMs

Agent · LLM Inference

74K · 3 dirs

vllm-omni

vLLM-Omni Team

A framework for efficient model inference with omni-modality models

Skill · uncategorised

3.7K · 2 dirs

vllm-sr-sim

vLLM Semantic Router Team

vLLM Semantic Router fleet simulator for capacity planning, SLO validation, and what-if analysis

Skill · ai-ml

3.5K · 1 dir

llm-katan

Yossi Ovadia

LLM Katan - Lightweight LLM Server for Testing - Real tiny models with FastAPI and HuggingFace

Skill · uncategorised

3.5K · 2 dirs

llmcompressor

Neuralmagic, Inc.

A library for compressing large language models using the latest techniques and research in the field, covering both training-aware and post-training approaches. The library is designed to be flexible and easy to use on top of PyTorch and HuggingFace Transformers, allowing for quick experimentation.

Skill · ai-ml

3K · 1 dir

vllm-ascend

vLLM-Ascend team

vLLM Ascend backend plugin

Skill · ai-ml

1.9K · 1 dir

vllm-sr

vLLM-SR Team

vLLM Semantic Router - Intelligent routing for Mixture-of-Models

Skill · uncategorised

2 dirs

wxy-test

vLLM Team

A high-throughput and memory-efficient inference and serving engine for LLMs

Skill · ai-ml

1 dir

vllm-cpu-nightly

vLLM Team

A high-throughput and memory-efficient inference and serving engine for LLMs

Skill · ai-ml

1 dir