
Cheapest LLM for a chatbot

For an always-on chatbot, cost is dominated by the system prompt and conversation history replayed on every turn — so input price matters most. These are the cheapest generally-available models for a typical chat workload.
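To make the replay effect concrete, here is a minimal sketch of how billed input tokens accumulate over one conversation. It assumes every turn re-sends the system prompt plus all earlier messages and that message sizes stay constant. The function name and the token sizes are hypothetical, chosen only for illustration.

```python
def conversation_tokens(system_tokens, user_tokens, reply_tokens, turns):
    """Return (total_input, total_output) tokens billed over one conversation.

    Assumes each turn re-sends the system prompt and the full prior history,
    and that every message has the same size (a simplification).
    """
    total_input = 0
    total_output = 0
    history = 0  # tokens from earlier user messages and replies
    for _ in range(turns):
        # Input for this turn: system prompt + replayed history + new question.
        total_input += system_tokens + history + user_tokens
        total_output += reply_tokens
        history += user_tokens + reply_tokens
    return total_input, total_output


# Hypothetical sizes: 200-token system prompt, 50-token questions,
# 150-token replies, 3 turns.
total_in, total_out = conversation_tokens(200, 50, 150, 3)
print(total_in, total_out)  # 1350 450
```

With these made-up sizes the conversation bills three input tokens for every output token, even though each reply is three times longer than each question. Longer conversations push the ratio higher.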

The cheapest pick: Llama 3.1 8B Instruct
$0.29/mo for a busy chatbot handling ~10M input and ~3M output tokens a month · $0.02 in / $0.03 out per 1M · Meta
The ranking

Cheapest models for chatbots

Monthly cost for a busy chatbot handling ~10M input and ~3M output tokens a month. Sorted cheapest first.

| # | Model | Context | Input $/M | Output $/M | Monthly cost |
|---|---|---|---|---|---|
| 1 | Llama 3.1 8B Instruct (Meta) | 128K | $0.02 | $0.03 | $0.29 |
| 2 | Ministral 3 3B (Mistral) | | $0.04 | $0.04 | $0.52 |
| 3 | Amazon Nova Micro (Amazon) | 128K | $0.035 | $0.14 | $0.77 |
| 4 | Command R7B (Cohere) | 128K | $0.037 | $0.15 | $0.82 |
| 5 | Amazon Nova Lite (Amazon) | 300K | $0.06 | $0.24 | $1.32 |
| 6 | Qwen-Flash (Alibaba) | 1M | $0.05 | $0.4 | $1.70 |
| 7 | Llama 4 Scout (17B-16E Instruct) (Meta) | 10M | $0.1 | $0.3 | $1.90 |
| 8 | Ministral 3 8B (Mistral) | 256K | $0.15 | $0.15 | $1.95 |
| 9 | Llama 3.3 70B Instruct (Meta) | 128K | $0.1 | $0.32 | $1.96 |
| 10 | Qwen3.5-Flash (Alibaba) | 1M | $0.1 | $0.4 | $2.20 |
| 11 | Ministral 3 14B (Mistral) | | $0.2 | $0.2 | $2.60 |
| 12 | Llama 4 Maverick (17B-128E Instruct) (Meta) | 1M | $0.15 | $0.6 | $3.30 |

Estimate only; excludes prompt caching, batch discounts and free tiers. Different volumes change the ranking, so run your own numbers. Prices verified against official docs · catalog updated 2026-06-28.

Methodology

Chat traffic skews input-heavy (~3:1), because the system prompt and prior turns are re-sent each message while replies stay short. We weight a 10M-in / 3M-out monthly workload accordingly. A small, fast model is usually enough; reach for a flagship only when answer quality demands it.
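The monthly figure is plain arithmetic: input price times input volume plus output price times output volume. A minimal sketch, assuming prices in USD per 1M tokens and volumes in millions of tokens (the function and parameter names are illustrative, not part of any provider API):

```python
def monthly_cost(input_price, output_price, input_mtok=10, output_mtok=3):
    """Monthly USD cost from per-1M-token prices and volumes in millions of tokens.

    Defaults match the 10M-in / 3M-out workload used on this page.
    """
    return input_price * input_mtok + output_price * output_mtok


# Llama 3.1 8B Instruct at the listed $0.02 in / $0.03 out per 1M tokens.
print(f"${monthly_cost(0.02, 0.03):.2f}")  # $0.29
```

To model your own traffic, pass your own volumes. Like the table, this ignores prompt caching, batch discounts and free tiers.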

FAQ

Cheapest LLM for chatbots

What is the cheapest LLM for chatbots?

Llama 3.1 8B Instruct (Meta) is the cheapest generally-available model we track for chatbots, at $0.02 per 1M input tokens and $0.03 per 1M output tokens — about $0.29/month for a busy chatbot handling ~10M input and ~3M output tokens a month. Ministral 3 3B is the next cheapest at $0.52/month.

How is "cheapest for chatbots" calculated?

We price a representative monthly workload — a busy chatbot handling ~10M input and ~3M output tokens a month — against every generally-available model, then rank by total cost. All prices are USD per 1M tokens, sourced from official provider documentation.
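The ranking step can be sketched as follows, using four rows from the table above. The `PRICES` dictionary and `rank` function are illustrative names, and the output-heavy workload is a hypothetical one, included only to show that a different traffic mix can reorder the list.

```python
# (input $/1M, output $/1M), copied from four rows of the table above.
PRICES = {
    "Qwen-Flash": (0.05, 0.40),
    "Llama 4 Scout": (0.10, 0.30),
    "Ministral 3 8B": (0.15, 0.15),
    "Llama 3.3 70B Instruct": (0.10, 0.32),
}


def rank(prices, input_mtok, output_mtok):
    """Return (model, monthly_cost) pairs sorted cheapest first.

    Volumes are in millions of tokens per month.
    """
    costs = {
        name: p_in * input_mtok + p_out * output_mtok
        for name, (p_in, p_out) in prices.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# The page's chat workload: 10M input, 3M output.
for name, cost in rank(PRICES, 10, 3):
    print(f"{name}: ${cost:.2f}")

# A hypothetical output-heavy workload: 3M input, 10M output.
for name, cost in rank(PRICES, 3, 10):
    print(f"{name}: ${cost:.2f}")
```

On the 10M-in / 3M-out workload, Qwen-Flash comes first of these four and Ministral 3 8B third. On the output-heavy workload, Ministral 3 8B moves to first and Qwen-Flash drops to last, because its low input price no longer offsets its higher output price.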

Is the cheapest model always the right choice for chatbots?

No. Price is one axis; quality, latency, rate limits and reliability matter too. Use this ranking to shortlist, then test the top candidates on your own chatbot workload before committing. Cost is easy to measure; fit is not.