Inception: Mercury
inception
Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed-optimized models like GPT-4.1 Nano and Claude...
- Context window: 128,000 tokens
- Input cost: $0.25 / 1M tokens
- Output cost: $0.75 / 1M tokens
- Latency (p50): not listed
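
The listed rates can be turned into a rough per-request estimate. The sketch below is illustrative only: `estimate_cost_usd` is a hypothetical helper, not part of any Inception or provider SDK, and it assumes that input and output tokens together must fit within the 128,000-token context window, which the listing does not state. Actual billing may differ.

```python
# Rates and context size copied from the listing above.
INPUT_USD_PER_MILLION = 0.25    # USD per 1,000,000 input tokens
OUTPUT_USD_PER_MILLION = 0.75   # USD per 1,000,000 output tokens
CONTEXT_WINDOW_TOKENS = 128_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimate the cost of one request in USD."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption (not stated in the listing): input and output tokens
    # share the same context window.
    if input_tokens + output_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("request exceeds the listed context window")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Example: a 100,000-token prompt with a 2,000-token reply.
print(f"${estimate_cost_usd(100_000, 2_000):.4f}")  # prints $0.0265
```

Output tokens cost three times as much as input tokens here, so long generations dominate the bill sooner than long prompts do.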
