
TinyLlama 1.1B Chat v1.0

flagship
tinyllama · released 2023-09-01 · text
Context: 2.0M tokens
Input: — / 1M
Output: — / 1M
Speed: — t/s
License: open
About

TinyLlama 1.1B Chat v1.0 is a compact 1.1-billion parameter language model built on the LLaMA architecture, designed for resource-constrained deployments. Despite its small size, it provides functional chat capabilities through instruction tuning on conversational datasets.

The model was trained as a community project to create a truly lightweight open model that can run on consumer hardware, mobile devices, and even CPUs. It's suitable for basic chat, simple question answering, and lightweight text generation tasks where larger models would be impractical.

TinyLlama is ideal for educational purposes, prototyping, embedded applications, and any scenario where model size is the primary constraint. It demonstrates that useful AI can exist at very small scales.
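Chat-tuned models in this family expect conversation turns to be wrapped in a role-marker template before generation. A minimal sketch of building such a prompt by hand, assuming the Zephyr-style `<|system|>` / `<|user|>` / `<|assistant|>` markers and `</s>` end-of-turn token commonly used with TinyLlama chat checkpoints (the exact tokens are an assumption, not stated on this page):

```python
def build_chat_prompt(messages):
    """Format chat messages into a Zephyr-style prompt string.

    Assumes <|role|> markers and a </s> end-of-turn token; verify
    against the model card's chat template before relying on this.
    """
    prompt = ""
    for msg in messages:
        prompt += f"<|{msg['role']}|>\n{msg['content']}</s>\n"
    # A trailing assistant marker cues the model to begin its reply.
    prompt += "<|assistant|>\n"
    return prompt

prompt = build_chat_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is TinyLlama?"},
])
```

In practice a tokenizer's built-in chat template (when one is provided) should be preferred over hand-built strings, since it stays in sync with the model's training format.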

Providers for TinyLlama 1.1B Chat v1.0

1 route · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.
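The fallback behavior described above can be sketched as trying providers in descending uptime order until one succeeds. This is a hypothetical illustration of the routing idea, not ClosedRouter's actual implementation; the provider names and `call` interface below are invented for the example:

```python
from typing import Callable, List

def route_with_fallback(providers: List[dict],
                        call: Callable[[str, str], str],
                        prompt: str) -> str:
    """Try each provider in descending 30-day uptime; fall back on failure."""
    # Rank providers so the most reliable one is attempted first.
    ranked = sorted(providers, key=lambda p: p["uptime"], reverse=True)
    last_error = None
    for provider in ranked:
        try:
            return call(provider["name"], prompt)
        except RuntimeError as err:  # provider-level failure triggers fallback
            last_error = err
    raise RuntimeError(f"all providers failed: {last_error}")

# Usage with a stub backend: the highest-uptime provider always fails,
# so the request falls through to the next one.
def fake_call(name, prompt):
    if name == "alpha":
        raise RuntimeError("alpha is down")
    return f"{name}: {prompt}"

result = route_with_fallback(
    [{"name": "alpha", "uptime": 99.9}, {"name": "beta", "uptime": 95.0}],
    fake_call, "hello")
```

With a single route, as listed for this model, the "fallback" chain degenerates to one attempt, which is why per-provider uptime matters here.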

Provider | Context | Quant | Uptime · 30d
—        | —       | bf16  | 0.00%