BenchWiki
Community AI inference benchmarks
LLM Inference
LLM Quality
Cloud Latency
Image Gen
Speech
Embedding
Topology: All, UNIFIED (Apple), GPU_SINGLE, GPU_MULTI, HYBRID, CPU_ONLY
Framework: All, Ollama, MLX, vLLM, llama.cpp, TensorRT-LLM, LM Studio, oMLX
Quant: All, GGUF, MLX, AWQ, GPTQ, FP8, BF16
Provider: All, OpenAI, Anthropic, Together, Fireworks, Groq, Gemini