Cerebras logo
AI provider on ChatMitra

Cerebras for WhatsApp business

World-record inference speed — Llama 3.3 70B at 2,600+ tokens/sec.

  • Free tier ✓
  • Cost: Free–Very low
  • Speed: World-record (2,600+ TPS)
  • Best for: Record-fast inference
What is a Cerebras chatbot?

It's a WhatsApp bot that runs on Cerebras's wafer-scale chips — replies arrive in milliseconds. With ChatMitra you bring your Cerebras API key — setup takes about 5 minutes, and the free tier is the most generous in the industry: 1M tokens / day.

Founded
2015
HQ
Sunnyvale, USA

What it looks like in real businesses

Four chats from real industries, all using Cerebras (llama3.3-70b) inside ChatMitra. Mix of English, Hindi, and Hinglish — the way customers actually type.

Big Earphone Store
E-commerce • Online
hinglish
Bhai earphone ka kya price hai aur kahan kahan deliver karte ho?
10:42 PM ✓✓
Hamare bestseller earphones ₹999 se start hote hain — free home delivery across India. Specs ya colour ke baare me kuch puchhna hai?
10:42 PM
Order karna hai. Ahmedabad deliver hoga?
10:42 PM ✓✓
Haan ji, Ahmedabad me delivery available hai. Aapka naam aur phone bhej dijiye — main hamari sales team ko forward kar deta hoon, woh order confirm karke link bhej denge ek-do minute me.
10:42 PM
Sharma Kirana
Food / kirana • Online
hindi
Aaj fresh paneer hai?
7:30 AM ✓✓
Haan ji, paneer hum daily morning fresh order karte hain — ₹400/kg. Kitna chahiye?
7:30 AM
500 gram. Ghar pe deliver kar do
7:30 AM ✓✓
Theek hai. 500g paneer = ₹200. Aapka naam aur address bhej dein — main shop owner ko message kar deta hoon, woh stock confirm karke delivery time aapko bhejenge.
7:30 AM
Wandr Travels
Travel • Online
hinglish
Bali 5N 6D family of 4 ka package?
2:18 PM ✓✓
Bali 5N6D for family of 4: standard 3-star package starts ~₹95,000 (per family, July low season) — flights + hotels + 2 day-trips. Premium 5-star same itinerary ~₹1,80,000. Aap dates aur budget share karein, hamari travel team aapko personalised quote bhej degi.
2:18 PM
Last week of July, flexible by 2 days
2:18 PM ✓✓
Got it. Email kya hai? Hamari consultant team ko share kar deta hoon — woh 2 quote options (3-star and 4-star) email pe bhejenge tomorrow morning ke andar.
2:18 PM
Bloom Hair & Beauty
Salon • Online
english
Hi, what's the price for a haircut + global colour?
6:50 PM ✓✓
Combined haircut + global colour with our senior stylists is ₹3,500 + tax — runs about 2 hours. Evenings tend to fill up fast. Want our front desk to find you a slot?
6:50 PM
Yes, tomorrow evening if possible
6:50 PM ✓✓
Sure — share your name and a rough preferred time. I'll forward your request and the salon team will confirm availability and the exact slot back to you.
6:50 PM

The most-recommended free-tier provider on ChatMitra for indie developers and weekend projects.

Best for
Record-fast inference
Cost
Free–Very low
Speed
World-record (2,600+ TPS)
Setup time
~5 minutes

In simple terms: Cerebras is Groq's main competitor on raw inference speed — and they offer the most generous free tier on the market: 1M tokens / day, no credit card. If your customers wait for replies, Cerebras eliminates the wait.

Not 100% sure Cerebras is right for you?

That's okay — you can switch any time inside ChatMitra. Quick orientation:

  • • If cost is your main concern, Groq or DeepSeek are worth a look
  • • If top-shelf quality matters more, OpenAI or Anthropic are the usual picks
  • • If you want balance, Gemini sits well in the middle

Still unsure? Start with Cerebras — switching later is one click inside ChatMitra without rebuilding your bot.

What is Cerebras, really?

If speed and cost matter more than polish, Cerebras is one of two providers that matter (the other is Groq). Replies arrive in milliseconds — by the time a customer finishes typing the next question, the previous reply is already on their screen.

Cerebras is a US AI compute company founded in 2015. Their wafer-scale chips run open-source models like Llama 3.3, Qwen3, and GPT-OSS at 2,600+ tokens / second — world-record speeds in independent benchmarks.

Cerebras models in 2026

The model with the green badge is what we usually recommend for everyday WhatsApp business chats. Step up only if you actually need it.

llama3.3-70b Pick
FastCheapFree tier
Context:128K tokens (8K free)
Input: $0.6(~₹50)
Output: $0.6(~₹50)
Best for:everyday WhatsApp chat at world-record speeds
llama3.1-8b
FastCheapFree tier
Context:128K tokens (8K free)
Input: $0.1(~₹10)
Output: $0.1(~₹10)
Best for:ultra-cheap routine FAQ replies
qwen3-32b
SmartCheapFree tier
Context:128K tokens (8K free)
Input: $0.4(~₹35)
Output: $0.4(~₹35)
Best for:Alibaba's strong open model on Cerebras hardware
gpt-oss-120b
SmartFree tier
Context:128K tokens (8K free)
Input: $0.6(~₹50)
Output: $0.6(~₹50)
Best for:OpenAI's open-source model hosted on Cerebras

Pricing in USD as published on Cerebras's pricing page. INR figures are approximate. Verified on 2026-04-30.

Source: official model docs · official pricing page

What will Cerebras cost my business?

Three rough volumes, costs in approximate INR. Your actual number will vary — see assumptions below.

Cerebras has a free tier: 1,000,000 tokens / day free, no credit card. 30 RPM, 60K-100K TPM. Free tier has an 8,192-token context cap. Up to 24M tokens/day on free tier (≈ $48/day in equivalent value).
Small store
100 chats / month
~₹15
Busy store
1,000 chats / month
~₹125
Diwali / festive scale
10,000 chats / month
~₹1300
Assumptions used:
  • • Avg 15–20 messages per conversation
  • • Avg 50–100 tokens per message
  • • Pricing as published by Cerebras on 2026-04-30 — may change. Source: www.cerebras.ai/pricing
Common mistake on Cerebras: Running long conversations on Cerebras's free tier. The free tier caps context at 8,192 tokens — once your customer chat goes past ~30 messages, the bot starts losing earlier context. For sustained traffic move to the Developer tier (still cheap, ~$0.60/1M) and unlock the full 128K context window.

Best for these businesses

Pick Cerebras when you want Groq's speed AND a more generous free tier. Especially great for prototyping and weekend launches.

  • Indie developers and small teams testing AI chatbots without budget
  • High-volume D2C / kirana with mostly short queries (free 8K context cap is plenty)
  • Festival sale weeks when a free 1M tokens / day covers genuine traffic
Where it doesn't fit
  • Long conversation contexts on the free tier (8K context cap)
  • Closed-flagship-quality replies — open models still trail GPT-5 / Claude on hardest queries

Cerebras's free tier is real and useful, but the 8K context cap means once a conversation hits ~30 messages, you'll start losing earlier context. For longer conversations, upgrade to the Developer tier (still cheap) or fall back to a larger-context provider.

How to get your Cerebras API key

Whole flow takes about 5 minutes. Mockups below are deliberately generic — Cerebras's dashboard may look slightly different by the time you read this, but the steps stay roughly the same.

1. Create your Cerebras account

Sign up using your business email. If you already have a Cerebras login (or a parent-product login), the same one usually works.

cloud.cerebras.ai/
Sign-up form
your-email@business.com
••••••••
Continue

2. Verify your email & claim free credits

Cerebras drops you a free-tier credit (or rate-limit allowance) the moment you verify. That's enough for the first several thousand chats.

Billing • Payment method added

3. Open the API keys section

In the dashboard sidebar (or your profile menu), find "API keys". Click into it.

cloud.cerebras.ai/platform/api-keys
API keys
prod-key-1
+ Create new key

4. Create a new secret key

Click "Create new key", give it a name like "ChatMitra production". Copy the key the moment it appears — most providers won't show it again.

Your new API key
sk-•••••••••••••••••
⚠ Copy now — won't be shown again

5. Paste it into ChatMitra

Inside the AI Chatbot wizard (Step 2 — Pick your AI), select Cerebras, paste the key. ChatMitra validates it on the spot. Green tick = ready.

Sign-up form
your-email@business.com
••••••••
Continue

UI may change slightly as Cerebras updates their dashboard. The flow stays roughly the same — sign up, add a payment method (if needed), find the API keys section, create a key, copy it.

Set up Cerebras in ChatMitra (about 5 minutes)

Once you've copied your API key from Cerebras, the rest happens inside ChatMitra. Same wizard you'd use for any other provider — just pre-filled for Cerebras.

Step 2 of 5

Pick Cerebras as your AI

In the ChatMitra AI Chatbot wizard, Step 2 shows the provider grid. Click Cerebras's tile, paste the API key you just copied. ChatMitra checks the key right then — green tick = ready.

Cerebras Cerebras Selected
Valid
Step 2b

Pick a model

ChatMitra auto-selects llama3.3-70b as the default — that's our pick for everyday WhatsApp business chat. You can override if you want a different one.

llama3.3-70b
everyday WhatsApp chat at world-record speeds
Step 3 of 5

Write the bot's personality

Tell ChatMitra about your business — name, industry, a one-line description. Then write the personality prompt yourself, or click Generate with AI and tweak the draft. Cerebras reads this prompt before every reply.

Step 4 of 5

Add a fallback (optional, recommended)

Add up to 3 backup providers. If Cerebras rate-limits or has an outage, ChatMitra automatically switches to the backup — your customers don't see an error. Most teams add one of Groq or Fireworks here.

Step 5 of 5

Test, then activate

Built-in chat simulator lets you message your bot before any real customer does. Try a tricky question. When it looks right, hit Activate — and the bot starts answering on the WhatsApp number you've already connected to ChatMitra.

In simple words: Cerebras is the brain. ChatMitra is the body — the WhatsApp connection, the inbox, the dashboard, the analytics. You don't have to choose between them.

Worried about getting locked in? You can change AI providers any time inside ChatMitra. The bot's personality, conversation history, and customer data stay exactly where they are — only the underlying AI changes. Try Cerebras first, switch later if something better fits.

Cerebras for Indian SMBs — pros & cons

What's good

  • Most generous free tier in the industry — 1M tokens / day, no card
  • World-record inference speed — 2,600+ tokens / second
  • Llama 3.1 8B on Developer tier at $0.10 / $0.10 per 1M is among the cheapest 'good' models anywhere
  • GPT-OSS 120B available — OpenAI's open-source model, hosted fast

What to watch out for

  • Free tier capped at 8,192-token context — fine for short chat, not long sales conversations
  • Smaller model catalog than Together / Fireworks
  • Reply quality is open-model-tier — not closed-flagship

Compare with other providers

Which is better — Cerebras or Groq? It depends what you need.

Each label is what the provider does best. Match it to your priority — that's usually the right pick.

Top picks by category:

  1. OpenAI — best for accuracy and complex replies
  2. Groq — best for cost and high-volume chat
  3. Google Gemini — best balanced choice for most Indian SMBs
  4. Cerebras — world-record inference speed — llama 3.3 70b at 2,600+ tokens/sec

Sources & references

Every model name, context window, and price on this page is copied from Cerebras's own public pages on 2026-04-30. Providers update pricing fairly often — please double-check before you commit to a plan.

Try Cerebras on your WhatsApp in 5 minutes

15 days free. No card. Switch providers any time without rebuilding your bot.

Frequently Asked Questions

Is Cerebras's 1M-tokens/day free tier really free?

Yes — 1,000,000 tokens / day free, no credit card. Within the rate limits (30 RPM, 60K-100K TPM, 8K context cap), this is genuinely usable for production. Many small ChatMitra stores never leave the free tier.

How does Cerebras compare to Groq on speed?

Both are world-class. Cerebras's 2,600+ TPS on Llama 3.3 70B is the published world record; Groq's 300+ TPS is also outstanding. In practice, both feel instant. Pick Cerebras if you specifically want the larger free tier.

Which Cerebras model should I pick?

Llama 3.3 70B is the right default — best balance of quality and speed. Llama 3.1 8B is for ultra-routine queries. Qwen3 32B if you're testing Chinese models. GPT-OSS 120B if you want OpenAI's open-source model.

How much does a Cerebras WhatsApp chatbot cost?

Many ChatMitra small stores pay ₹0 — they stay inside the 1M-tokens/day free tier. On the Developer tier, expect ₹400–₹2,500 / month for a typical busy store doing 1,000–10,000 chats.

Does Cerebras handle Hindi?

Through Llama 3.3 70B, yes — decently. Hindi and Hinglish work, regional dialects are weaker. Test before committing.

Try Cerebras now →