AI21 Jamba 1.6 Large
ai21
AI21 Jamba 1.6 Large uses a hybrid Mamba-Transformer architecture, offering a lower memory footprint and higher throughput than comparable Transformer-only models. It provides a 256K-token context window at a fraction of the inference cost.
- Context window: 256,000 tokens
- Input cost: $2.00 / 1M tokens
- Output cost: $8.00 / 1M tokens
- Latency (p50): —
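A minimal sketch of how the listed per-token rates translate into per-request cost. The prices ($2.00 per 1M input tokens, $8.00 per 1M output tokens) come from the listing above; the function name and token counts are illustrative assumptions, not part of any AI21 API.

```python
# Listed rates for AI21 Jamba 1.6 Large (USD per 1M tokens).
INPUT_PRICE_PER_M = 2.00
OUTPUT_PRICE_PER_M = 8.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates.

    Hypothetical helper for illustration; token counts would come
    from your own usage metering.
    """
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# e.g. a 200K-token prompt (well within the 256K context window)
# with a 2K-token completion:
print(f"${estimate_cost(200_000, 2_000):.2f}")  # → $0.42
```

Because output tokens cost 4x input tokens here, long-context summarization workloads (large prompt, short completion) are dominated by the input rate.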
