llama-3.3-70b-instruct-fp8-fast
meta
Llama 3.3 70B quantized to fp8 precision, optimized to be faster.
- Context window
- 24,000 tokens
- Input cost
- $0.29 / 1M
- Output cost
- $2.25 / 1M
- Latency (p50)
- —
Run side-by-side checks for pricing, context window, and latency.
meta
Llama 3.3 70B quantized to fp8 precision, optimized to be faster.