modelstop.top
Back to models
theblokemodel

TinyLlama-1.1B-Chat-v0.3-GPTQ

Open-source TinyLlama-1.1B-Chat-v0.3-GPTQ model from thebloke — available for download and self-hosting on Hugging Face.

Best for

General ChatSummarisationContent Generation
Context Window
Input Cost
Free
Output Cost
Latency p50

Pricing Details

No pricing data. Model may be free or requires direct access.

Run it locally

Local run command

pip install transformers accelerate && python -c "from transformers import AutoModelForCausalLM, AutoTokenizer; tok = AutoTokenizer.from_pretrained('thebloke/TinyLlama-1.1B-Chat-v0.3-GPTQ'); model = AutoModelForCausalLM.from_pretrained('thebloke/TinyLlama-1.1B-Chat-v0.3-GPTQ'); print('Loaded thebloke/TinyLlama-1.1B-Chat-v0.3-GPTQ')"

Presumptive Specs:

GPU: 12+ GB VRAM; System RAM: 16+ GB; Disk: 20+ GB free

Hallucination Score™ (est.)

Community reliability estimate · not official

Not yet rated

About this score: Community-estimated based on user reports and publicly available benchmark data (e.g. TruthfulQA). This is not an official score from the model provider. Scores may be inaccurate — always verify with the official leaderboard before making production decisions.

Price History

Not enough historical data yet. Check back after the next pricing sync.

Provider

thebloke

Community Prompts

Proven prompts shared by the community for this model

Loading prompts…