Run meta-llama/Llama-3.2-3B-Instruct on your data

meta-llama/Llama-3.2-3B-Instruct is an LLM designed for multilingual dialogue, agentic retrieval, and summarization tasks. It officially supports eight languages, offers a long context length of 128,000 tokens, and is optimized for efficient on-device use where privacy and low latency are important.

Other noteworthy use cases of meta-llama/Llama-3.2-3B-Instruct include tool use (such as extracting action items or sending calendar invites) and fine-tuning for domain-specific applications.

Metric	Value
Parameter Count	3.21 billion
Mixture of Experts	No
Context Length	128,000 tokens
Multilingual	Yes
Quantized*	No

*Quantization is specific to the inference provider; other providers may offer the model at different quantization levels.
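To give a feel for why the quantization level matters for on-device use, here is a rough sketch that estimates the memory needed for the model weights alone at a few common precisions. It uses the 3.21 billion parameter count from the table. The bit widths, the `weight_memory_gb` helper, and the assumption that an unquantized deployment stores 16 bits per weight are illustrative and do not come from any provider's documentation. The estimate ignores the KV cache, activations, and runtime overhead.

```python
# Parameter count taken from the table above.
PARAMETER_COUNT = 3.21e9

# Assumed storage widths per weight (illustrative, not provider-specific).
BITS_PER_WEIGHT = {"16-bit (unquantized, assumed)": 16, "8-bit": 8, "4-bit": 4}


def weight_memory_gb(parameter_count: float, bits_per_weight: int) -> float:
    """Approximate weights-only footprint in decimal gigabytes."""
    bytes_total = parameter_count * bits_per_weight / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    for label, bits in BITS_PER_WEIGHT.items():
        print(f"{label}: ~{weight_memory_gb(PARAMETER_COUNT, bits):.2f} GB")
```

Under these assumptions the weights take roughly 6.4 GB at 16 bits and about a quarter of that at 4 bits, which is why the same model can behave very differently across providers and devices.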

Model	Input Modality	Output Modality	Input Price (per 1M tokens)	Output Price (per 1M tokens)
Llama 3.1 405B Instruct	text	text	$3.00	$3.00
Llama 3.1 70B Instruct	text	text	$0.90	$0.90
Llama 3.3 70B Instruct	text	text	$0.90	$0.90
Llama 3.3 70B Versatile 128k	text	text	$0.59	$0.79
Llama 4 Maverick	text, image	text	$0.22	$0.88
meta-llama/Llama-3.1-8B-Instruct	text	text	$0.05	$0.08
meta-llama/Llama-3.2-1B-Instruct	text	text	N/A	N/A
meta-llama/Llama-3.2-3B-Instruct	text	text	$0.02	$0.02
meta-llama/Llama-4-Scout-17B-16E-Instruct	text	text	$0.08	$0.30
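As a quick way to compare these prices for a given workload, the sketch below turns per-million-token prices into a cost estimate. It assumes the last two columns of the table are the input and output price in USD per 1M tokens, and copies three rows from it. The `estimate_cost` helper and the token counts in the example are hypothetical.

```python
# (input price, output price) in USD per 1M tokens, copied from the table above.
PRICES_PER_MILLION = {
    "meta-llama/Llama-3.2-3B-Instruct": (0.02, 0.02),
    "meta-llama/Llama-3.1-8B-Instruct": (0.05, 0.08),
    "meta-llama/Llama-4-Scout-17B-16E-Instruct": (0.08, 0.30),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload from its token counts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 5M input tokens and 1M output tokens.
    for model in PRICES_PER_MILLION:
        cost = estimate_cost(model, 5_000_000, 1_000_000)
        print(f"{model}: ${cost:.2f}")
```

For that hypothetical workload, meta-llama/Llama-3.2-3B-Instruct comes to about $0.12. Actual billing depends on how the provider counts tokens, so treat the result as an estimate.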
