NVIDIA Framework for LLM Inference
Cross-referenced across 55 tracked directories
Popularity Rank: #355
Listed In: 2 / 55
Adoption Stage: Emerging
First Seen: Mar 13, 2026 (recently added to the ecosystem)
Score: 100/100 (0 dependency vulnerabilities found)
A toolkit for deploying and serving Large Language Models (LLMs).
To speed up inference for long-context LLMs, attention is computed with approximate, dynamic sparse methods, which reduces pre-filling latency by up to 10x on an A100 while maintaining accuracy.
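To make the dynamic sparse attention idea above concrete, here is a minimal, hypothetical sketch in pure Python: each query block is cheaply scored against key-block means, only the top-scoring key blocks are kept, and exact attention is computed inside that dynamically chosen sparse pattern. This is an illustration of the general technique, not the actual GPU kernel or API of any library; all function names here are invented for the example.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(q_rows, k_rows, v_rows):
    """Dense reference: softmax(Q K^T / sqrt(d)) V, row by row."""
    d = len(k_rows[0])
    out = []
    for q in q_rows:
        w = softmax([dot(q, k) / math.sqrt(d) for k in k_rows])
        out.append([sum(wi * v[c] for wi, v in zip(w, v_rows))
                    for c in range(len(v_rows[0]))])
    return out

def dynamic_block_sparse_attention(q_rows, k_rows, v_rows, block=2, keep=1):
    """Toy approximation: score each query block against key-block means,
    keep only the top-`keep` key blocks for it, and attend exactly
    within that sparse pattern. Skipped blocks cost nothing, which is
    where the prefill speedup comes from."""
    d = len(k_rows[0])
    nb = len(k_rows) // block
    # Cheap per-block summaries of the keys (block means).
    kmeans = [[sum(k_rows[b * block + i][c] for i in range(block)) / block
               for c in range(d)] for b in range(nb)]
    out = []
    for qb in range(len(q_rows) // block):
        qmean = [sum(q_rows[qb * block + i][c] for i in range(block)) / block
                 for c in range(d)]
        # Dynamic pattern: pick the `keep` key blocks with highest estimate.
        scores = [dot(qmean, km) for km in kmeans]
        top = sorted(range(nb), key=lambda b: scores[b])[-keep:]
        idx = [b * block + i for b in top for i in range(block)]
        # Exact attention restricted to the selected key/value rows.
        for i in range(block):
            q = q_rows[qb * block + i]
            w = softmax([dot(q, k_rows[j]) / math.sqrt(d) for j in idx])
            out.append([sum(wi * v_rows[j][c] for wi, j in zip(w, idx))
                        for c in range(len(v_rows[0]))])
    return out
```

When `keep` equals the total number of key blocks the sparse version reproduces dense attention exactly (attention is invariant to key ordering), which is a handy sanity check; real systems pick `keep` far below that and rely on attention's empirical sparsity to preserve accuracy.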
SGLang is a fast serving framework for large language models and vision language models.
NVIDIA Framework for LLM Inference (transitioned to TensorRT-LLM)