Claude Code Best Practices: The Working Developer’s Playbook for 2026

What it is: The capstone of our harness-engineering series — a working developer’s playbook for Claude Code in 2026. Distills Anthropic’s official best-practices article plus everything we’ve covered (failure modes, AGENTS.md, long-running tasks, verification, observability, feature lists) into a single reference you can implement this week.
Who it is for: Developers actively using Claude Code who want one consolidated source for the practices that make agents reliable.
Best if: You want named commands, configuration files, and a 30-day rollout — not a theory tour.
Skip if: You haven’t installed Claude Code yet. Start with Claude Code Beginners Guide first. Want one practical AI workflow every morning? Subscribe to our free daily newsletter.

Heads up — this is the capstone of a more intermediate AI series. Brand-new to AI? Start with How to Use AI and Claude Code Beginners Guide first. This post pulls everything from the rest of our harness-engineering cluster into one playbook: Why AI Coding Agents Fail, What is AGENTS.md?, Long-Running Claude Code Tasks, AI Agent Verification, AI Agent Observability, and Feature Lists for AI Coding Agents.

What is the Claude Code playbook?

Claude Code is at its most powerful when it runs against a deliberate harness — a small stack of files, hooks, and habits that turn the model from a brilliant-but-erratic intern into a reliable engineer. Anthropic publishes their own best-practices article (linked below) that’s required reading. The list of distilled rules in this guide is faithful to Anthropic’s official guidance, layered with patterns the broader harness-engineering community (Martin Fowler, HumanLayer, walkinglabs) has converged on through 2025 and 2026.

Anthropic’s single most-quoted line on the topic: “Give Claude a way to verify its work … this is the single highest-leverage thing you can do.” Everything else in this playbook is a corollary of that one rule.

What is Anthropic’s “explore, plan, code, commit” loop?

Anthropic’s canonical four-step workflow, named in their official docs, is the spine of every reliable Claude Code session:

  • Explore. Read the relevant files. Use Plan Mode (Shift+Tab) so the agent reads without editing. Look at adjacent code, tests, and any spec.
  • Plan. Produce a written plan describing what will change, in which files, in what order. For non-trivial work, save the plan to a file the implementation phase can refer back to.
  • Code. Implement against the plan. Run tests as you go. Don’t deviate from the plan without an explicit decision.
  • Commit. Commit cleanly, write a real message, open a PR. The git history becomes part of the handoff for the next session.

Anthropic’s own advice on skipping the plan step: “Skip planning only if you could describe the diff in one sentence.” If you can’t describe the change in one sentence, you need a plan.

How do you set up Claude Code on day one?

  • Run /init. Anthropic ships a slash command that generates a starter CLAUDE.md from inspecting your repo. Their explicit advice: “Keep it short and human-readable. For each line, ask: would removing this cause Claude to make mistakes? If not, cut it. Bloated CLAUDE.md files cause Claude to ignore your actual instructions.”
  • Learn the CLAUDE.md hierarchy. ~/.claude/CLAUDE.md for global rules across all your projects; ./CLAUDE.md at the repo root for team rules (commit it); ./CLAUDE.local.md for personal rules (gitignore it). Parent and child directory files are imported on demand. See our CLAUDE.md Guide for the full pattern.
  • Add an AGENTS.md too. Cross-tool standard, sibling to CLAUDE.md. Symlink one to the other so all your agents read the same file. Full walkthrough in What is AGENTS.md?
  • Write the “be specific” rule into your habits. Anthropic gives this exact contrast: bad — "add tests for foo.py"; good — "write a test for foo.py covering the edge case where the user is logged out. avoid mocks." The specificity gap is the quality gap.
  • Reference files with @. Paste images, give URLs, pipe in data (cat error.log | claude). Letting the agent fetch context itself is more reliable than describing it.

What does the “complete harness” look like?

Working developers in 2026 typically run a stack of eight pieces. None are mandatory; you don’t need all eight to get value. But the more you have, the more reliable Claude Code becomes.

  • CLAUDE.md — the persistent rules for your codebase. Conventions, build commands, things not to touch.
  • .claude/skills/ — reusable domain workflows. A SKILL.md per workflow, with frontmatter naming the skill and a one-line description.
  • .claude/agents/ — specialist subagents. Each lives in its own context window; only summaries return to the main session.
  • .claude/settings.json — hooks (PreToolUse, PostToolUse, Stop, etc.) and permissions allowlist.
  • feature_list.json — scope guardrail. What we’re building, with verification steps per feature. See Feature Lists for AI Coding Agents.
  • claude-progress.md — state between sessions. What was tried, what worked, what failed and why. See Long-Running Claude Code Tasks.
  • Test suite + linter — the verification loop. npm test or pytest wired into a PostToolUse hook so it runs after every edit. See AI Agent Verification.
  • Observability — Logfire, Langfuse, or Claude Code’s native OpenTelemetry (CLAUDE_CODE_ENABLE_TELEMETRY=1) pointed at any OTel-compatible platform. See AI Agent Observability.

The minimum viable harness is three files: CLAUDE.md, a test suite, and one PostToolUse hook that runs the tests. Everything else is depth.

Want a daily Claude Code workflow in your inbox? The free Beginners in AI daily brief ships one practical pattern per day. Plain English, no tech background required.

What is Plan Mode and when should you use it?

Plan Mode is the single most underused feature in Claude Code. Press Shift+Tab twice and the agent reads files, lists trade-offs, and writes a plan — without editing anything. Press Ctrl+G to open the plan in your editor and adjust it. Approve the plan and the agent executes it.

Use Plan Mode whenever the task is non-trivial — multi-file changes, architectural decisions, refactors, or anything where you’d want to think before typing. Skip it only for genuinely one-liner edits.

Claude Code’s thinking budget toggle compounds with Plan Mode. Type "think" in your prompt for a modest reasoning budget; "think hard" or "think harder" for a larger one; "ultrathink" for a 31,999-token reasoning budget on a single decision. Available in the terminal client (not the API). Use it on architecture decisions, hard debugging, and design reviews.

What are hooks, skills, and subagents?

  • Hooks. Shell commands that fire at 25 lifecycle points (PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart, SessionEnd, UserPromptSubmit, PreCompact, Notification, etc.). Configure in .claude/settings.json or run /hooks. Anthropic’s rule: “Unlike CLAUDE.md instructions which are advisory, hooks are deterministic and guarantee the action happens.” Use hooks for the things that absolutely must happen every time. The single highest-value hook is a PostToolUse that runs your test command after every file edit.
  • Skills. Reusable workflows defined in .claude/skills/<name>/SKILL.md. Frontmatter names the skill and gives a one-line description; the body describes the workflow. The agent invokes them as if they were tools. Great for repeat patterns (release deploy, dependency upgrade, doc generation).
  • Subagents. Specialist agents defined in .claude/agents/<name>.md with frontmatter declaring their tools and model. Each runs in an isolated context window; only the summary comes back to the main conversation. Use them for investigation tasks where you don’t want exploration to pollute your main session’s context.
  • Plugins. Browse and install via /plugin. Bundle skills + hooks + subagents + MCP servers into one installable package. The 2026 marketplace has plugins for almost every common workflow.

How should you handle permissions and safety?

  • Use the permissions allowlist. Run /permissions and pre-approve common commands you’ll always want to allow (npm run lint, git commit, your test command). Reduces the prompting overhead without giving the agent blanket access.
  • Sandboxing. Run /sandbox for OS-level filesystem and network isolation. Useful for testing untrusted code or running risky operations.
  • Devcontainer mode. Anthropic ships a reference devcontainer with an egress firewall and persistent volumes. Their words: “Enhanced security measures allow you to safely run --dangerously-skip-permissions for unattended operation.” For overnight or unattended runs, this is the right setup.
  • Never run --dangerously-skip-permissions on bare metal. Even Anthropic’s own engineers only run it inside containers. The performance gain isn’t worth the blast radius of a runaway agent on your real filesystem.
  • Use git worktrees for parallel sessions. git worktree add per task. Each Claude session lives in its own working copy. Parallel agents on the same codebase can’t collide.
  • Auto mode (claude --permission-mode auto) routes routine work through a classifier model that blocks scope escalation and hostile-content-driven actions. A good middle ground between manual prompting and blanket permission.

How do you manage cost and rate limits in 2026?

  • Pro $20/mo. Light usage. You’ll hit rate limits within hours if Claude Code is your daily driver.
  • Max 5x $100/mo. Roughly 88K tokens per 5-hour rolling window. About 140–280 hours of Sonnet plus 15–35 hours of Opus weekly. The right tier for “Claude Code is my primary tool.”
  • Max 20x $200/mo. Roughly 220K tokens per 5-hour window. 240–480 hours of Sonnet plus 24–40 hours of Opus weekly. The right tier for continuous autonomous sessions.
  • 5-hour rolling windows. Usage resets continuously, not on the hour. If you hit the limit at 2pm, the window starts opening up around 7pm rather than waiting until the next day.
  • API vs flat-rate. API metered per token; the Max plans include prompt cache reads (often more than 90% of tokens in long sessions). Most working developers come out ahead on Max plans once they’re running multi-hour sessions weekly.
  • Prompt caching can cut effective cost by 90%+ on workloads where the same context loads repeatedly. The Claude Code session model leans on this heavily — you typically don’t have to do anything to benefit.

What are the named anti-patterns?

Anthropic’s own docs name five recurring failure patterns. Each is recoverable; recognition is the skill.

  • Kitchen-sink session. You let one chat sprawl across six unrelated tasks. Context gets cluttered. Quality drops. Fix: /clear between unrelated tasks.
  • Correcting over and over. Anthropic’s specific rule: “If you’ve corrected Claude more than twice on the same issue in one session, the context is cluttered. Run /clear and start fresh.”
  • Over-specified CLAUDE.md. Bloated files cause Claude to ignore your actual instructions. Cut anything where removing it wouldn’t cause a mistake.
  • Trust-then-verify gap. The agent says done, you accept, and only discover later that the tests are red. Fix: a PostToolUse or Stop hook that runs the test command before “done” can complete. See AI Agent Verification.
  • Infinite exploration. The agent spends 20 minutes reading files instead of writing code. Fix: use Plan Mode to bound the exploration phase; subagents for deep investigation that shouldn’t bloat the main session.

Three more from community experience:

  • Skipping /init. Starting without a CLAUDE.md leaves the agent guessing about every convention. Fifteen minutes of /init avoids hours of correction.
  • Multiple agents in the same directory. Without git worktrees, parallel sessions overwrite each other’s work. Always worktree the second agent.
  • Letting the agent edit the spec. If you have a feature_list.json, lock the description and steps fields. The agent flips passes only. See Feature Lists for AI Coding Agents.

What’s the 30-day Claude Code mastery rollout?

  • Days 1–3 (install + first feature). Install Claude Code. Run /init. Ship one small feature using the explore-plan-code-commit loop. Learn the navigation shortcuts: Esc to interrupt, Esc+Esc or /rewind to restore state, /clear to reset, "Undo that" to revert.
  • Days 4–10 (habits + tooling). Add 3–5 lines to your CLAUDE.md (project-specific conventions). Install gh so the agent can interact with GitHub. Use /permissions to allowlist 5 commands you trust. Try a Plan Mode session on a non-trivial change.
  • Days 11–17 (verification + skills). Add a PostToolUse hook that runs your test command after every edit. Write your first SKILL.md for a workflow you repeat (deploy, dependency upgrade, weekly cleanup). Try ultrathink on a real architecture decision.
  • Days 18–24 (parallel + safety). Set up a devcontainer. Spawn parallel work via git worktree. Try a writer/reviewer pattern across two sessions — one builds, one reviews against your AGENTS.md and feature list.
  • Days 25–30 (autonomy + observability). Run a /goal-driven autonomous refactor on a contained task. Wire up Logfire, Langfuse, or native OTel observability. Install one plugin from the marketplace. Try shipping a small task via claude -p in CI.

After 30 days you have a complete working harness, a daily habit, and the production-grade safety setup most teams take quarters to assemble. Nothing in the list requires a Max plan — Pro is enough to learn the workflow before you upgrade.

Frequently asked questions

Do I need all of these patterns to use Claude Code?

No. The minimum viable harness is three things: a CLAUDE.md (run /init), a test suite, and one PostToolUse hook that runs the tests after edits. That covers maybe 70% of the reliability gap. Add the other patterns when their specific failure modes start hurting you — not before.

Pro, Max 5x, or Max 20x — which plan should I start with?

Start with Pro at $20/mo. After two weeks of real use you’ll know whether you’re hitting rate limits often enough to justify Max. Most developers who use Claude Code daily land on Max 5x ($100/mo); only those running continuous autonomous sessions need Max 20x. There’s no harm in upgrading later — the prompt cache state and configs all carry over.

When should I use Plan Mode vs just letting Claude code?

Anthropic’s heuristic: skip planning if you could describe the diff in one sentence. Otherwise use Plan Mode. In practice that means: trivial typos and one-line tweaks go direct; multi-file changes, architecture decisions, refactors, and anything that touches more than one component go through Plan Mode first.

Is --dangerously-skip-permissions ever safe?

Inside a devcontainer or VM with no access to your real filesystem, yes. On your bare-metal machine, no. Anthropic’s own engineers only use it inside containers. The flag exists for long unattended autonomous runs where prompting would defeat the purpose — the safety story relies on the container, not on trust in the agent.

How is Claude Code different from Cursor or GitHub Copilot?

Claude Code is terminal-first and built around long-running autonomous sessions — the explore-plan-code-commit loop. Cursor is IDE-first with strong autocomplete plus a more recent agentic Composer mode. GitHub Copilot started as autocomplete and has added agentic surfaces more recently. They overlap in capability; the right choice depends on whether you live in a terminal (Claude Code), an IDE (Cursor), or already inside GitHub’s product ecosystem (Copilot). See Claude Code vs Cursor vs Copilot for the head-to-head.

Get Smarter About AI Every Morning

Free daily newsletter — one story, one tool, one tip. Plain English, no jargon.

Free forever. Unsubscribe anytime.

1-on-1 Coaching

Claude AI Crash Course

1-hour private video session with James. Walk through your real codebase: CLAUDE.md, hooks, Plan Mode, /goal command, subagents, and the full harness pattern from this series. Best for developers who want to compress the 30-day rollout into one focused session.

$75

1-hour live

Book session →

Group Format

AI Workshops for Teams

Team workshops for engineering departments standardising the full Claude Code harness across a codebase — CLAUDE.md, hooks, feature lists, observability, verification gates. Best for teams of 3+ developers shipping production code with AI agents. Custom-built around your repository.

Custom

pricing

Get a quote →

You may also like

Sources

Last reviewed: May 2026. Claude Code changes weekly — verify command syntax and pricing on the Anthropic docs above before relying on any specific detail for a production setup.

Discover more from Beginners in AI

Subscribe now to keep reading and get access to the full archive.

Continue reading