qwen1.5-7b-chat-awq
qwen
Qwen1.5 is an improved version of Qwen, the large language model series developed by Alibaba Cloud. This variant is quantized with AWQ (Activation-aware Weight Quantization), an efficient and accurate low-bit weight quantization method that currently supports 4-bit quantization.
- Context window: 20,000 tokens
- Input cost: $0.00 / 1M tokens
- Output cost: $0.00 / 1M tokens
- Latency (p50): n/a
