Use Claude Smarter

A practical guide to understanding tokens, usage limits, and how to get more out of every conversation without burning through your quota.

Free + Pro · Claude.ai · Claude Code · Claude App

What are tokens?

Tokens are the units Claude thinks in. Not words, not characters — chunks.

When you send a message to Claude, it doesn't read your text word by word. It breaks everything into tokens — small pieces that can be a word, part of a word, or even punctuation.

How Claude sees this sentence:

I design products by day and build with AI by night

11 tokens — roughly 1 token per word, but not always
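For a quick sanity check before pasting text, a character count gives a usable estimate. The sketch below assumes the common rule of thumb of about 4 characters per token for English text. That ratio is an assumption, not how Claude's tokenizer actually works, and `estimate_tokens` is a hypothetical helper name.

```shell
# Rough token estimate: ~4 characters per token (rule of thumb, not the real tokenizer).
estimate_tokens() (
  chars=$(wc -c)                 # count bytes read from stdin
  echo $(( (chars + 3) / 4 ))    # divide by 4, rounding up
)

printf '%s' "I design products by day and build with AI by night" | estimate_tokens
```

For the 51-character sentence above this prints 13, near the 11 shown but not identical, which is about the accuracy to expect from a heuristic.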

Every conversation has two token counts that matter:

Input tokens

What you send — your prompt, pasted code, uploaded files, and the entire conversation history Claude re-reads each turn.

Output tokens

What Claude generates — its response, code, explanations. Output tokens typically cost more than input tokens.

Key insight: The longer a conversation goes, the more input tokens each message costs, because Claude re-reads the entire history every single turn. A 50-message thread doesn't cost 50x a single message: every turn re-sends everything before it, so the total grows roughly with the square of the thread length.
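The arithmetic behind that insight can be sketched in a few lines. This is a simplified model that assumes every turn adds the same number of tokens. `thread_input_tokens` and the 200-token figure are illustrative, not measured values.

```shell
# Total input tokens across a thread where every turn re-sends all earlier messages.
# Assumes each message adds the same number of tokens (a simplification).
thread_input_tokens() (
  messages=$1
  tokens_per_message=$2
  # Turn k carries k messages, so the total is 1 + 2 + ... + n = n(n+1)/2 messages.
  echo $(( tokens_per_message * messages * (messages + 1) / 2 ))
)

thread_input_tokens 1 200    # one message: 200
thread_input_tokens 50 200   # 50-message thread: 255000
```

Under these assumptions the 50-message thread costs 1,275 times the input of a single message, not 50 times.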

How usage limits work

You're not paying per token. You're given a budget that resets on a timer.

Free tier

Limited daily messages. Enough to explore, but you'll hit walls fast with heavy use. No Claude Code access.

Pro tier ($20/mo)

Significantly higher limits. Access to Claude Code, Opus model, and extended thinking. Limit resets every 5 hours.

Rate limit

You've sent too many messages too fast. Wait a few minutes and try again. This is about speed, not total usage.

Usage limit

You've used your allocated tokens for this window. The counter resets every ~5 hours on Pro. No extra charge — just a cooldown.

Context window

The maximum amount of text Claude can hold in one conversation (around 200K tokens). Think of it as Claude's working memory: once it fills up, older messages have to be dropped or summarized before the conversation can continue.

Model choice

Opus is the most capable but uses up your limit fastest. Sonnet is faster and lighter. Haiku is the cheapest. Pick based on task complexity, not habit.

Why tokens burn fast

Common mistakes

Most people don't run out of tokens because of hard problems. They run out because of wasteful habits.

Pasting entire files

Dumping a 2,000-line file when you only need Claude to look at 10 lines. Every extra line is tokens you're burning for nothing.
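One way to send only the relevant slice is to cut it out on the command line before pasting. A minimal sketch follows. `slice_lines` is a hypothetical helper, and the path and line range in the usage comment are placeholders.

```shell
# Print only lines START..END of a file, prefixed with their line numbers
# so the prompt can refer to them exactly.
slice_lines() (
  file=$1; start=$2; end=$3
  awk -v s="$start" -v e="$end" \
    'NR >= s && NR <= e { printf "%d: %s\n", NR, $0 } NR > e { exit }' "$file"
)

# Example (placeholder path): copy just lines 35-55 instead of the whole file
# slice_lines src/Cart.js 35 55
```

Keeping the line numbers in the output lets you say "fix line 42" without Claude needing the rest of the file.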

"Review everything"

Asking Claude to "review this whole codebase" or "check everything" without a specific focus. Broad requests = broad (expensive) responses.

Long back-and-forth

Going 40 messages deep in one thread. Remember: Claude re-reads the full history each turn, so message #40 costs way more than message #1.

Letting Claude over-explain

Claude defaults to thorough explanations. If you don't tell it to be concise, it'll write 3 paragraphs when you needed 1 line.

Repeating failed approaches

Trying the same broken prompt 5 times instead of rephrasing or giving Claude more context about what went wrong.

Always using the biggest model

Using Opus for every task — even simple formatting or renaming. A Haiku call costs a fraction of an Opus call for the same simple task.

Tactics to save tokens

Small changes in how you prompt and structure your workflow can 2–3x your effective usage.

Prompting

Be specific about what you want

Vague prompts force Claude to guess, which means longer responses and more follow-ups. Precision eliminates round-trips.

Vague

"Help me with this file"

Specific

"Refactor the handleSubmit function on line 45 of Cart.js to use async/await instead of .then() chains"

Saves ~40–60% of tokens by avoiding "what do you mean?" follow-ups

Set output constraints

Claude defaults to thorough explanations. Telling it to be concise directly cuts output tokens — the most expensive kind.

Unconstrained

"Explain what this useEffect does"

Claude writes 3 paragraphs + a code example

Constrained

"Explain this useEffect in one sentence. No code."

Claude: "It fetches user data on mount and updates the profile state."

Cuts output by 70–80% on explanation-heavy prompts

Give context upfront

When Claude doesn't know your stack, it guesses — often wrong. One line of context prevents an entire wasted exchange.

No context

"Add a loading spinner to my page"

Claude asks: "What framework? Client or server component? CSS framework?"

With context

"Next.js 15, App Router, CSS Modules. Add a loading spinner to the /dashboard page as a loading.js file"

Claude writes the exact file immediately, with no clarifying questions

Eliminates 1–2 clarification round-trips (~2x token savings per task)

Ask for diffs, not full rewrites

When editing a 200-line file, Claude defaults to rewriting the whole thing. Asking for just the changed lines saves massive output.

Full rewrite

"Fix the bug in this file"

Claude outputs all 200 lines with 2 lines changed

Diff only

"Fix the null check on line 87. Show only the changed lines with 3 lines of context above and below"

Claude outputs ~8 lines instead of 200

Up to 95% fewer output tokens on large file edits

Claude Code

Use /compact to compress history

After ~20 messages, your conversation history becomes the biggest token cost. /compact summarizes it so Claude keeps context without re-reading every word.

Without /compact

Message #30 in a long thread
→ Claude re-reads all 29 prior messages
→ ~15,000 input tokens just for history

After /compact

Run /compact at message #20
→ History compressed to ~2,000 tokens
→ Message #30 costs a fraction of before

Can reduce input tokens by 80%+ in long conversations
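Plugging the illustrative figures from the comparison above into a quick calculation shows where the 80%+ comes from. `percent_saved` is a hypothetical helper, and it counts only the history tokens, not the new message itself.

```shell
# Percentage of history tokens saved when history shrinks from BEFORE to AFTER.
percent_saved() (
  before=$1; after=$2
  echo $(( (before - after) * 100 / before ))   # integer percent, rounded down
)

percent_saved 15000 2000   # the ~15,000 -> ~2,000 example: prints 86
```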

Create a CLAUDE.md file

Instead of explaining your project setup every session, put it in a CLAUDE.md at your repo root. Claude Code reads it automatically at the start of every session.

Example CLAUDE.md

# Project
Next.js 15 App Router, CSS Modules, Sanity CMS

# Conventions
- Components in src/components/
- Use CSS Modules, not Tailwind
- Server components by default

# Don'ts
- Never add "use client" unless needed
- Don't create new utility files

Eliminates repeating project context every session (~500–1,000 tokens saved per conversation start)

Start fresh for new tasks

A conversation about auth middleware doesn't help when you switch to styling. Old context adds cost and confusion.

One long thread

Messages 1–15: Fix login bug
Messages 16–30: Style the dashboard
Messages 31–45: Add API endpoint
→ Message #45 carries all prior context (~20K tokens)

Separate sessions

Session 1: Fix login bug (15 messages)
Session 2: Style dashboard (15 messages)
Session 3: Add API endpoint (15 messages)
→ Each starts clean (~0 history tokens)

3 focused sessions use ~60% fewer total tokens than 1 mega-thread
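A back-of-the-envelope version of this comparison, assuming equal-sized messages and counting only how many messages get re-read in total. `messages_reread` is a hypothetical helper, and the model ignores any setup context a fresh session needs.

```shell
# Total messages re-read over a thread of N messages: 1 + 2 + ... + N.
messages_reread() (
  n=$1
  echo $(( n * (n + 1) / 2 ))
)

one_thread=$(messages_reread 45)                  # 1035
three_sessions=$(( 3 * $(messages_reread 15) ))   # 3 * 120 = 360
echo "saved: $(( (one_thread - three_sessions) * 100 / one_thread ))%"
```

Under these assumptions the split saves about 65%, in the same range as the figure above.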

Workflow

Match the model to the task

Opus is powerful but expensive. Most tasks don't need it. Pick the cheapest model that can handle the job.

Model guide

Haiku → Rename variables, fix typos, format code
Sonnet → Write components, debug errors, refactor functions
Opus → Architecture decisions, complex multi-file refactors, debugging race conditions

Haiku costs ~10x less than Opus for the same simple task
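If you script your workflow, the guide above can be encoded as a small lookup. This is only a sketch: `pick_model` and its task category names are invented for illustration, and defaulting to the middle tier is an assumption.

```shell
# Map a task category to the cheapest model tier from the guide above.
# Category names are made up for this sketch.
pick_model() {
  case "$1" in
    rename|typo|format)            echo haiku ;;
    component|debug|refactor)      echo sonnet ;;
    architecture|multi-file|race)  echo opus ;;
    *)                             echo sonnet ;;  # assumed middle-ground default
  esac
}

pick_model typo           # haiku
pick_model architecture   # opus
```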

Break big tasks into focused sessions

One massive prompt with 5 requirements leads to long responses and revision cycles. Smaller scopes = fewer wasted tokens.

One mega-prompt

"Build me a user dashboard with auth, data tables, charts, search, and export to CSV"

Claude writes 500 lines, half need rewriting → double the cost

Step by step

Session 1: "Set up the dashboard layout with a sidebar"
Session 2: "Add the data table with sorting"
Session 3: "Add the chart component using Recharts"

Each piece is reviewable, testable, and cheap to revise

Reduces revision waste by 50%+ on complex features

Use headless mode for batch work

Claude Code's -p flag runs a single prompt without interactive overhead. Perfect for repetitive tasks you'd otherwise do one at a time.

Example

claude -p "Add JSDoc comments to all exported functions in src/utils/auth.js"

claude -p "Convert this CSS file to CSS Modules: src/styles/header.css"

# Or loop through files, one independent call per file:
for f in src/utils/*.js; do
  claude -p "Add TypeScript types to $f"
done

No history accumulation: each call starts with no conversation history to re-read
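Before spending tokens on a whole directory, it can help to preview what the loop would send. The sketch below wraps the loop above in a function. `run_batch` and `dry_run` are hypothetical helper names, and only the `-p` flag comes from the example above.

```shell
# Run one headless prompt per file through RUNNER (e.g. claude), skipping missing files.
# Passing a stand-in runner previews the calls without using any quota.
run_batch() {
  runner=$1; shift
  for f in "$@"; do
    [ -f "$f" ] || continue          # an unmatched glob stays literal, so skip it
    "$runner" -p "Add TypeScript types to $f"
  done
}

# Stand-in runner that just prints the arguments it would have passed.
dry_run() { printf '%s\n' "$*"; }

run_batch dry_run src/utils/*.js     # preview only
# run_batch claude src/utils/*.js    # real run, one independent call per file
```

Swapping `dry_run` for `claude` is the only change needed once the preview looks right.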

Quick reference

Cheat sheet

Instead of this → Do this

"Here's my entire 500-line file, find the bug" → "There's a null error on line 42 of Cart.js. Here are lines 35–55"

"Review this codebase" → "Review the auth middleware for security issues"

"Can you explain how React works and then build me a form?" → "Build a login form using React Hook Form with email/password fields"

Using Opus to rename a variable across 3 files → Using Haiku or Sonnet for mechanical refactors

Continuing a 60-message thread about a new topic → Starting a fresh session with clear context

"Fix it" (after a failed attempt) → "The previous approach failed because X. Try Y instead."

MEET THE DUDE

Mandeep Kumar

Senior Product Designer · Beem

I design products by day and build with AI by night. DudeInDesign is where both sides meet.

Toronto, ON, Canada