LLM Inference & Serving
74 tools
Inference runtimes, model serving platforms, fine-tuning infra, and GPU/accelerator providers for LLMs.
OpenRouter
OpenRouter, Inc.
The Unified Interface For LLMs
24 stories
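OpenRouter's "unified interface" is an OpenAI-format chat completions endpoint that fronts many providers behind one URL and one key. A minimal sketch of building such a request with only the standard library — the endpoint URL is the documented one, but the model slug and key below are placeholders:

```python
import json

# OpenRouter exposes an OpenAI-compatible chat completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_chat_request(model, prompt, api_key):
    """Return (url, headers, body) for an OpenAI-format chat completion."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # placeholder key, not a real one
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,  # provider/model slug, e.g. "meta-llama/llama-3.1-8b-instruct"
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return OPENROUTER_URL, headers, body

url, headers, body = build_chat_request(
    "meta-llama/llama-3.1-8b-instruct", "Hello!", "sk-or-...")
# Send with urllib.request once a real key is supplied; the same payload
# shape works unchanged for any model the router lists.
```

Because the request body is plain OpenAI chat format, swapping providers is just a change of model slug.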
vLLM
vLLM Project
A high-throughput and memory-efficient inference and serving engine for LLMs
18 stories
SGLang
LMSYS Corp.
Fast serving framework for LLMs and agents
12 stories
AI Studio
Google
Fastest way to start building with Gemini
5 stories
Ollama
Ollama Inc.
Ollama is the easiest way to run open AI models locally or in the cloud, with a simple API and 40,000+ integrations.
5 stories
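The "simple API" Ollama's card mentions is a local HTTP server (port 11434 by default) with an `/api/generate` endpoint. A sketch of the request body for a single non-streaming completion — the model tag is an example and assumes it has been pulled locally:

```python
import json

# Ollama's local server listens on localhost:11434 by default;
# /api/generate is its native completion endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"

def generate_body(model, prompt):
    """JSON body requesting one non-streaming completion from a local model."""
    return json.dumps({
        "model": model,      # e.g. "llama3.2" -- must be pulled first
        "prompt": prompt,
        "stream": False,     # return one JSON object instead of a token stream
    })

body = generate_body("llama3.2", "Why is the sky blue?")
# POST `body` to OLLAMA_URL once `ollama serve` is running and the model
# has been fetched with `ollama pull llama3.2`.
```

With `"stream": True` (the default) the server instead emits one JSON object per generated token.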
Amazon Bedrock
Amazon Web Services
The platform for building generative AI applications and agents at production scale
2 stories
Hugging Face Hub
Hugging Face
The central hub for models, datasets, and Spaces.
2 stories
NVIDIA NIM
NVIDIA
Deploy AI models with optimized inference microservices
2 stories
Anthropic
Anthropic
AI developer platform
1 story
Baidu Qianfan
Baidu
Baidu AI Cloud's large-model platform
1 story
Decoupled DiLoCo
Google DeepMind
Resilient, distributed AI training at scale
1 story
DFlash
DFlash
DFlash software product
1 story
FlashQLA
Alibaba Cloud
Alibaba Cloud software product
1 story
Miles RL Training
RadixArk
Reinforcement-learning training software
1 story
Nous Portal
Nous Research
Access Nous Research's AI portal
1 story
OpenAI
OpenAI
AI platform and product suite
1 story
Tile Kernels
Hangzhou DeepSeek Artificial Intelligence Co., Ltd.
A kernel library written in TileLang
1 story
Zyphra Inference
Zyphra
Serverless inference for frontier open-weight models
1 story
AFM Playground
Arcee AI
Playground for AFM models
0 stories
AI/ML API
AI/ML API
One API, 400+ AI models
0 stories
Baidu AI Studio LLM API
Baidu
LLM API for Baidu AI Studio
0 stories
Baseten
Baseten
Inference Platform: Deploy AI models in production
0 stories
BytePlus
BytePlus Pte Ltd.
AI-Native Cloud for Enterprise Growth
0 stories
Cerebras
Cerebras Systems
AI training and inference platform
0 stories
Coding Agents
Baseten
The best coding agents run on Baseten
0 stories
Conway
Conway Research
Infrastructure for self-improving, self-replicating, autonomous AI
0 stories
DeepClaude
Asterisk
1+1 > 2: Combine Advanced Reasoning and Coding
0 stories
DeepEP
DeepSeek
High-throughput, low-latency expert parallel communication library.
0 stories
DeepGEMM
DeepSeek
FP8 GEMM library
0 stories
DGX Spark
NVIDIA
AI supercomputer on your desk
0 stories
Exo
Exo Labs
Run frontier AI locally.
0 stories
fal
fal
Generative media platform for developers.
0 stories
Fireworks AI
Fireworks AI, Inc.
Build with the best open models.
0 stories
FlashMLA
DeepSeek
Fast MLA decoding kernel for Hopper GPUs
0 stories
Gemini Live API
Google
Real-time, bidirectional multimodal API for Gemini.
0 stories
Google AI Edge Gallery
Google LLC
Explore, Experience, and Evaluate the Future of On-Device Generative AI with Google AI Edge.
0 stories
Google Cloud
Google
Cloud computing services from Google
0 stories
Grok Build
xAI
Build with Grok
0 stories
GuideLLM
Red Hat
SLO-aware Benchmarking and Evaluation Platform for Optimizing Real-World LLM Inference
0 stories
Hugging Face
Hugging Face
The AI community building the future.
0 stories
Interfaze
JigsawStack, Inc.
AI interface platform
0 stories
Keras Kinetic
Keras
Run ML workloads remotely on cloud TPUs and GPUs.
0 stories
Lightning AI
Lightning AI
Idea to AI product, ⚡️ fast.
0 stories
LiteLLM
BerriAI
AI Gateway to provide model access, fallbacks and spend tracking across 100+ LLMs. All in the OpenAI format.
0 stories
LM Studio
Element Labs, Inc.
Run AI models, locally and privately.
0 stories
Mirage
Crisp
Augment your apps with AI
0 stories
ModelScope
Alibaba Cloud
ModelScope model community (魔搭社区)
0 stories
Modular
Modular
Inference from Kernel to Cloud.
0 stories
Mooncake
KVCache.ai
A KVCache-centric Disaggregated Architecture for LLM Serving
0 stories
Multimodal Max
Modular
GenAI-native serving and modeling, built for performance.
0 stories
NeMo-RL
NVIDIA
Scalable RL post-training for language models.
0 stories
NVIDIA DGX 8xB200
NVIDIA
The foundation for your AI factory.
0 stories
Open Generative AI
Muapi
Generative AI service
0 stories
Open Responses
Open Responses
Unverified product profile
0 stories
OrcaRouter
Continuum AI
One API. Multi-provider. Zero markup.
0 stories
PaddlePaddle AI Studio
Baidu, Inc.
One-stop AI development platform
0 stories
Pocket TTS
Kyutai
Text-to-speech by Kyutai
0 stories
Prime Intellect
Prime Intellect
Distributed training and inference infrastructure
0 stories
Prime Intellect Lab
Prime Intellect
AI lab for experimenting with language models
0 stories
RunPod
Runpod, Inc.
AI infrastructure developers trust
0 stories
TileLang
Tile-AI
A concise domain-specific language for high-performance GPU/CPU kernels.
0 stories
TileLang-Ascend
Tile-AI
Ascend TileLang adapter
0 stories
Together Fine-Tuning
Together AI
Fine-tune language models on Together AI.
0 stories
Train Models
TrainEngine.ai
Train models with TrainEngine.ai
0 stories
Unsloth
Unsloth AI
Easily run & train models locally.
0 stories
Unsloth AI
Unsloth
Train and Run Models Locally
0 stories
Unsloth Studio
Unsloth
Open-source, no-code web UI for training, running, and exporting open models in one unified local interface
0 stories
Vast.ai
Vast.ai
Cloud GPU Marketplace
0 stories
Venice API
Venice.ai
Developer API for AI models
0 stories
Vertex AI
Google Cloud
Build, deploy, and scale machine learning models.
0 stories
vLLM Omni
vLLM Project
Multimodal inference and serving
0 stories
xAI
xAI
Developer AI platform
0 stories
ZenMux
AI Force Singapore Pte. Ltd.
AI software platform
0 stories
Zyphra Cloud
Zyphra Technologies Inc.
A full-stack AI platform on AMD powered by TensorWave
0 stories