llama-2-7b-chat-hf-lora
meta-llama
This is a Llama 2 base model that Cloudflare has dedicated to inference with LoRA adapters. Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted to the Hugging Face Transformers format.
- Context window: 8,192 tokens
- Input cost: $0.00 / 1M tokens
- Output cost: $0.00 / 1M tokens
- Latency (p50): n/a
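
Because this model exists to serve LoRA adapters, a request would normally name the adapter alongside the prompt. The following is a minimal sketch, not an official client. It assumes the Cloudflare Workers AI REST convention of `POST /accounts/{account_id}/ai/run/{model}`, the model identifier `@cf/meta-llama/llama-2-7b-chat-hf-lora`, and request fields named `prompt`, `raw`, and `lora`. None of these appear in the listing itself, so check them against the current Cloudflare documentation. The account ID, API token, and adapter name are placeholders.

```python
import json
import urllib.request

# Assumed model identifier: vendor and model name from this listing,
# with the "@cf/" prefix that Workers AI model IDs conventionally use.
MODEL = "@cf/meta-llama/llama-2-7b-chat-hf-lora"


def build_request(account_id, api_token, prompt, lora_name):
    """Build (url, headers, body) for an inference call. Nothing is sent."""
    # Assumed endpoint layout for the Workers AI REST API.
    url = (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{account_id}/ai/run/{MODEL}"
    )
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
    }
    body = {
        "prompt": prompt,
        # Assumption: "raw" skips the default chat template so the prompt
        # reaches the model in the format the adapter was trained on.
        "raw": True,
        # Assumption: "lora" names (or gives the ID of) an uploaded adapter.
        "lora": lora_name,
    }
    return url, headers, body


def run(account_id, api_token, prompt, lora_name):
    """Send the request and return the decoded JSON response."""
    url, headers, body = build_request(account_id, api_token, prompt, lora_name)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping `build_request` separate from `run` lets the payload be inspected or logged without making a network call. The prompt and the generated output together have to fit within the 8,192-token context window.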
