Verification Loops

Give Claude a way to verify its own work — the single most important technique for high-quality AI-assisted output.

A feedback loop that lets Claude verify its own work is the biggest force multiplier for output quality. Without verification, Claude produces first-draft quality: plausible but unchecked. With a verification loop, Claude can iterate until the output passes the checks you define. Verification closes the gap between “looks right” and “works right.”


The Core Principle

Every verification loop follows the same structure:

1. CHANGE  — Claude makes an edit or generates output
2. CHECK   — Claude runs a verification step (tests, build, browser, logs)
3. READ    — Claude reads the result
4. FIX     — If verification failed, Claude fixes and re-checks
5. REPEAT  — Until verification passes

The difference between a mediocre AI-assisted workflow and a great one is whether step 2 exists. Without it, you are reviewing every line manually. With it, Claude catches its own mistakes before you ever see them.
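
The five steps above can be sketched as a small driver. This is an illustration of the control flow only, not how Claude is implemented: `check`, `fix`, `command_check`, and the `max_rounds` cap are hypothetical names introduced here, with `fix` standing in for whatever edit Claude makes after reading the failure output.

```python
import subprocess
from typing import Callable, Sequence, Tuple

CheckResult = Tuple[bool, str]  # (passed, combined output)


def command_check(cmd: Sequence[str]) -> Callable[[], CheckResult]:
    """Wrap a command such as ["npm", "test"] as a check (hypothetical helper)."""
    def run() -> CheckResult:
        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.returncode == 0, result.stdout + result.stderr
    return run


def verify_loop(
    check: Callable[[], CheckResult],
    fix: Callable[[str], None],
    max_rounds: int = 5,
) -> bool:
    """Repeat CHECK -> READ -> FIX until the check passes or rounds run out."""
    for _ in range(max_rounds):
        passed, output = check()  # CHECK: run the verification step
        if passed:
            return True           # verification passed, stop iterating
        fix(output)               # READ + FIX: act on the failure output
    return False                  # still failing; hand back to a human
```

The cap on rounds is a design choice rather than part of the principle: it keeps a loop that cannot converge from running forever.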


Types of Verification Loops

Test Suite Verification

Run the tests after every change, read the failures, fix, and repeat. This is the most common verification loop and usually the most reliable. The TDD red-green-refactor cycle is itself a verification loop: each test run tells Claude whether the implementation satisfies the tests written so far.

Implement a password strength validator that requires at least 8 characters,
one uppercase letter, and one special character. After each change, run
npm test. If any tests fail, read the failure output and fix the code.
Repeat until all tests pass.

Why it works: Test output is unambiguous. Either tests pass or they do not. Claude can read failure messages and fix the exact problem.
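
To make "unambiguous" concrete, here is a minimal sketch of the validator and its tests. The prompt above targets an npm project; this version is in Python purely for illustration. The name `is_strong_password` and the reading of "special character" as any non-alphanumeric character are assumptions, not part of the prompt.

```python
def is_strong_password(password: str) -> bool:
    """Return True if the password meets the three rules from the prompt."""
    long_enough = len(password) >= 8
    has_upper = any(ch.isupper() for ch in password)
    # Assumption: "special" means any character that is neither a letter nor a digit.
    has_special = any(not ch.isalnum() for ch in password)
    return long_enough and has_upper and has_special


def test_rules() -> None:
    """Each assertion isolates one rule, so a failure names the broken rule."""
    assert is_strong_password("Secure#Pass")
    assert not is_strong_password("Sh#rt1")        # too short (6 characters)
    assert not is_strong_password("lowercase#1")   # no uppercase letter
    assert not is_strong_password("NoSpecial123")  # no special character
```

Because each failing case violates exactly one rule, a red test tells Claude which condition to fix.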

Browser and UI Verification

Use the --chrome flag or Playwright/Cypress to verify visual and interactive changes. Claude opens a browser, inspects the result, and iterates on what is broken.

Run the dev server, open http://localhost:4321 in Chrome, and verify that
the navigation menu renders correctly on mobile viewport (375px wide).
If anything is misaligned or overflowing, fix the CSS and reload to verify.

When to use: Layout changes, responsive design, component rendering, any work where “does it look right” matters more than “does it compile.”
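
One check from the prompt above, horizontal overflow at a 375px viewport, can also be scripted. The sketch below assumes Playwright for Python is installed; the function names and the 667px viewport height are illustrative choices, not values from the prompt.

```python
def overflows_horizontally(scroll_width: int, viewport_width: int) -> bool:
    """A page overflows when its content is wider than the viewport."""
    return scroll_width > viewport_width


def check_mobile_overflow(url: str, width: int = 375, height: int = 667) -> bool:
    """Load url at a mobile viewport and report whether it scrolls sideways."""
    # Imported lazily so the pure helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": width, "height": height})
        page.goto(url)
        # Width of the rendered document, including any overflowing content.
        scroll_width = page.evaluate("document.documentElement.scrollWidth")
        browser.close()
    return overflows_horizontally(scroll_width, width)
```

With the dev server running, `check_mobile_overflow("http://localhost:4321")` returns True when something spills past the viewport. It catches overflow only. Misalignment still needs a visual check.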

Linter and Formatter Verification

PostToolUse hooks can run formatters and linters after every file edit, which makes this verification loop automatic. Lint problems surface as soon as Claude writes or edits a file instead of at review time.

{
  "hooks": {
    "PostToolUse": [{
      "matcher": "Write|Edit",
      "hooks": [{ "type": "command", "command": "npm run lint:fix || true" }]
    }]
  }
}

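Add this to .claude/settings.json. Every time Claude writes or edits a file, the linter runs automatically and applies whatever it can fix on its own. The trailing || true forces a zero exit status, so a lint failure never blocks the edit. The trade-off is that remaining errors are not reported as a failure. For Claude to see and correct them, the hook has to surface the failure; in Claude Code's hook contract, as described in the hooks reference, that means exiting with status 2 and writing the details to stderr.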
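
If you want remaining lint errors to reach Claude rather than be swallowed by || true, the hook can be a small script. This sketch rests on two assumptions about the hook contract that are worth checking against the hooks reference: the event arrives as JSON on stdin with the edited path under tool_input.file_path, and exit status 2 routes stderr back to Claude. `run_hook` is a hypothetical name.

```python
import json
import subprocess
import sys
from typing import Sequence

LINT_CMD = ["npm", "run", "lint:fix"]  # same command as the config above


def run_hook(stdin_text: str, lint_cmd: Sequence[str] = LINT_CMD) -> int:
    """Lint after an edit and return the exit status the hook should use."""
    payload = json.loads(stdin_text)
    # Assumption: Write/Edit events carry the edited path at tool_input.file_path.
    path = payload.get("tool_input", {}).get("file_path", "<unknown file>")
    result = subprocess.run(lint_cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return 0  # clean: let Claude continue
    # Assumption: exit status 2 is what sends stderr back to Claude.
    print(f"Lint errors remain after editing {path}:", file=sys.stderr)
    print(result.stdout + result.stderr, file=sys.stderr)
    return 2

# Saved as a script, the last line would be: sys.exit(run_hook(sys.stdin.read()))
```

The hook's "command" would then point at the script, for example `python3 .claude/hooks/lint_hook.py` (a hypothetical path).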

Build Verification

Run the build after changes to catch type errors, missing imports, and configuration issues that tests might not cover.

Make the change, then run npm run build. If it fails, read the error output,
fix the issues, and rebuild. Repeat until the build succeeds with zero errors.

Best for: TypeScript projects, monorepos, anything with a compilation step. Build errors are fast to detect and usually straightforward to fix.
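
"Read the error output" is easier when the output is structured. As a sketch, assuming `npm run build` invokes the TypeScript compiler with plain (non-pretty) diagnostics, each error line can be parsed into file, position, and code. `BuildError` and `parse_tsc_errors` are hypothetical names.

```python
import re
from typing import List, NamedTuple


class BuildError(NamedTuple):
    path: str
    line: int
    column: int
    code: str
    message: str


# Matches tsc's plain diagnostic format, for example:
#   src/app.ts(12,5): error TS2322: Type 'string' is not assignable to type 'number'.
_TSC_ERROR = re.compile(
    r"^(?P<path>.+?)\((?P<line>\d+),(?P<col>\d+)\): "
    r"error (?P<code>TS\d+): (?P<msg>.*)$"
)


def parse_tsc_errors(output: str) -> List[BuildError]:
    """Extract structured errors from build output; other lines are ignored."""
    errors = []
    for raw in output.splitlines():
        match = _TSC_ERROR.match(raw.strip())
        if match:
            errors.append(BuildError(
                match["path"], int(match["line"]), int(match["col"]),
                match["code"], match["msg"],
            ))
    return errors
```

An empty list alone does not mean the build succeeded. Check the command's exit status as well, since other tools in the build print errors in different formats.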

Log-Based Verification

For backend and distributed systems, verification means checking Docker logs, server output, and database state. Claude reads logs, identifies issues, fixes the code, and re-checks.

Start the Docker containers with docker compose up -d, then check the logs
with docker compose logs api. If you see any errors or stack traces,
fix the code and restart the affected container. Repeat until the logs
show clean startup.

Best for: API development, microservices, database migrations, infrastructure changes.
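
"Clean startup" needs a definition before it can be checked. A minimal sketch, assuming failures show up as ERROR or FATAL level lines, Python tracebacks, or unhandled promise rejections. The marker list and function names are illustrative and should be tuned per service.

```python
import re
import subprocess
from typing import List, Tuple

# Assumption: these markers cover the failures you care about.
ERROR_MARKERS = re.compile(
    r"\b(ERROR|FATAL)\b"
    r"|Traceback \(most recent call last\)"
    r"|Unhandled\w*Rejection"
)


def find_problem_lines(log_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for log lines that look like failures."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if ERROR_MARKERS.search(line)
    ]


def fetch_logs(service: str) -> str:
    """Read a service's logs through docker compose, without color codes."""
    result = subprocess.run(
        ["docker", "compose", "logs", "--no-color", service],
        capture_output=True, text=True,
    )
    return result.stdout + result.stderr
```

`find_problem_lines(fetch_logs("api"))` returning an empty list is then the pass condition for the loop.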


Setting Up Verification in Your Workflow

Step 1: Identify Your Verification Method

What tells you a change is correct? The answer depends on the domain:

  • Frontend: Does it render correctly in a browser?
  • Backend: Do the tests pass? Do the logs look clean?
  • Full-stack: Do the E2E tests pass?
  • Infrastructure: Does it build? Does the deploy succeed?

Step 2: Tell Claude About It

Include verification steps directly in your prompts or in your project’s CLAUDE.md:

## Verification
After making changes to any component, run `npm test` and `npm run build`.
Both must pass before considering the work complete.

Claude reads CLAUDE.md at the start of every session. Putting verification instructions there means you never have to repeat them.

Step 3: Automate with Hooks

PostToolUse hooks make verification automatic. Claude does not need to remember to run checks, because the hooks fire after every tool use that matches the matcher (here, every Write or Edit).

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          { "type": "command", "command": "npm run lint:fix || true" },
          { "type": "command", "command": "npm run typecheck || true" }
        ]
      }
    ]
  }
}

Step 4: Close the Loop

The critical instruction is telling Claude to iterate until done:

Run the tests. If any fail, fix them and run again. Repeat until all pass.
Do not stop until you see zero failures.

Without this explicit instruction, Claude may report the first failure and wait for you to respond. Closing the loop means Claude keeps working autonomously until verification passes.


Quick Reference

Domain          | Verification Method                                | Speed  | Reliability
----------------|----------------------------------------------------|--------|-----------------------------
Frontend        | Browser testing, visual regression, Storybook      | Medium | High for visual issues
Backend         | Test suites, API endpoint testing, log analysis    | Fast   | High
Full-stack      | E2E tests (Playwright/Cypress), integration tests  | Slow   | Highest
Infrastructure  | Build verification, type checking, linting         | Fast   | High for compile-time errors
Data            | Query validation, schema checks                    | Fast   | Medium

Tips

  • Automatic beats manual. Hooks that run on every edit are more reliable than prompts that ask Claude to remember. Start with hooks.
  • Specific criteria produce better self-correction. “Run the tests” is good. “Run the tests, and if the snapshot test for Header fails, update the snapshot only if the new output is correct” is better.
  • Combine verification types for highest confidence. A linter, a build, and a test suite each catch different classes of errors, and no single method catches everything.
  • Start with the fastest loop. Linters are typically the quickest check, type checks and builds come next, test suites take seconds to minutes, and browser checks take the longest. Run the fast checks first so Claude fixes trivial issues immediately.
  • Verification is not just for code. Claude can verify documentation by checking that all links resolve, verify configs by running validation commands, and verify data by running queries.
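
The "combine" and "start with the fastest loop" tips can be sketched together: run the checks in order of cost and stop at the first failure. The script names come from the examples earlier on this page and are assumed to exist in package.json; `first_failure` is a hypothetical helper.

```python
import subprocess
from typing import List, Optional, Sequence, Tuple

# Assumption: these npm scripts exist; they are ordered fastest first.
CHECKS: List[Tuple[str, List[str]]] = [
    ("lint", ["npm", "run", "lint:fix"]),
    ("typecheck", ["npm", "run", "typecheck"]),
    ("build", ["npm", "run", "build"]),
    ("test", ["npm", "test"]),
]


def first_failure(
    checks: Sequence[Tuple[str, Sequence[str]]],
) -> Optional[Tuple[str, str]]:
    """Run checks in order; return (name, output) of the first failure, else None."""
    for name, cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Stop early: slower checks are skipped until this one passes.
            return name, result.stdout + result.stderr
    return None
```

Stopping at the first failure keeps each round short: a lint error is reported without waiting for the full test suite.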