Claude, Stop Making Mocks


Generative AI has carved out a comfortable spot in daily development. Need to scaffold a feature? Generate test cases? Explain a gnarly regex? Your AI assistant handles it.

But mocks are different. They’re deterministic infrastructure that must match interfaces exactly. LLMs improvise. That’s their strength, and precisely what makes them hazardous for mock generation.

Should We Let LLMs Generate Mocks? #

The case for AI-generated mocks: it’s fast. You describe the interface, Claude writes the struct, you move on. For prototypes and throwaway experiments, this is fine.

The case against: mocks need to be boring. When they drift from their interfaces (wrong signatures, missing methods, incorrect return types), your tests lie. They pass when they shouldn’t. They fail for baffling reasons. And when you change an interface, your AI-generated mocks stay stale.

Tools like mockery exist to solve exactly this. They generate mocks deterministically from interfaces. When your interface changes, you regenerate. No drift. No surprises.
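In practice that loop is a single command you re-run whenever the interface moves. The interface name and path below are hypothetical; the flags are the same ones recommended later in this post:

# Generate (or regenerate) the mock for a hypothetical UserStore interface;
# when the interface changes, rerun the same command instead of hand-editing the mock
mockery --name=UserStore --dir=./internal/store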

The Problem: Claude Keeps Generating Mocks #

Here’s the scenario that prompted me to build a hook: I work primarily in Go, where mockery is the standard tool for generating test mocks. I use Claude Code daily. And despite instructions to the contrary, it kept generating mocks manually.

You could add “never generate mocks manually, always use mockery” to your CLAUDE.md file. I did. It even works, sometimes. But CLAUDE.md instructions compete with other context. When the user prompts “write me a mock for this interface,” the instruction to help directly can override the instruction to use tooling. Context windows are finite, priorities shift, and instructions fade.

So I needed a guardrail. Not for me - I know to run mockery. But for the AI assistant itself. A way to intercept mock generation attempts and redirect them to the right tool.

Hooks don’t forget. They don’t get deprioritized. They operate at the tool layer, not the instruction layer. A blocked Write operation is a harder boundary than a remembered preference.

Designing the Hook #

Claude Code’s PreToolUse hook runs before Write or Edit operations. Perfect for validation: check what Claude is about to write, block if it looks like a mock, provide the right command instead.

The logic is straightforward:

1. Read the tool parameters from stdin (JSON)
2. Extract tool_name, file_path, and content
3. If the file matches *_test.go or *mock*.go, check the content for mock patterns:
   - "type Mock* struct"
   - "func (m *Mock..."
   - "testify/mock" imports
4. If any pattern is found, return JSON with blocked: true and a reason, then exit 1 to block the operation
5. Otherwise, exit 0 to allow the operation
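A minimal sketch of that logic in bash, assuming jq is on the PATH and that the stdin payload nests file_path and content under tool_input (the new_string fallback for Edit calls is my assumption; the full hook adds more patterns and logging):

#!/usr/bin/env bash
# enforce-mockery.sh - trimmed-down sketch of the hook described above
set -euo pipefail

# Claude Code passes the pending tool call as JSON on stdin.
payload=$(cat)

# Extract the fields we care about. Write operations carry "content";
# Edit operations are assumed to carry "new_string" instead.
file_path=$(jq -r '.tool_input.file_path // empty' <<<"$payload")
content=$(jq -r '.tool_input.content // .tool_input.new_string // empty' <<<"$payload")

# Only inspect Go test files and mock-looking files.
case "$file_path" in
  *_test.go|*mock*.go) ;;
  *) exit 0 ;;
esac

# Block when the content looks like a hand-written mock.
if grep -Eq 'type Mock[A-Za-z0-9_]* struct|func \(m \*Mock|testify/mock' <<<"$content"; then
  echo '{"blocked": true, "reason": "Mocks must be generated with mockery, not written by hand."}'
  exit 1
fi

exit 0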

When the hook blocks, it returns a structured response that Claude displays to the user:

{
  "blocked": true,
  "reason": "AI-generated mocks are not deterministic.\n\n⚠️  Mock Generation Policy:\n- Mocks MUST be generated using mockery\n- Use: mockery --name=InterfaceName --dir=./path/to/package\n\nThis hook ensures test code quality."
}

Why this works:

Hooks encode workflow preferences directly in your development environment. Instead of repeatedly asking Claude not to generate mocks, the hook makes it impossible. It enforces the rule automatically, preventing human error and AI improvisation.

The full implementation includes additional patterns and logging for debugging.

When LLM Mocks Might Be Acceptable #

If you’re spiking a feature or teaching someone how mocks work, having Claude generate one on the fly is fine. It’s disposable. You’re exploring, not building for production.

But once you need reliability (once your codebase depends on these mocks), AI-generated code becomes a liability. You’re trading convenience for technical debt that accumulates quietly until it bites you during a refactor.

Hooks as Workflow Guardrails #

The hook enforces a simple rule: mocks come from mockery, not from AI generation. This keeps Claude useful for everything else while preventing it from generating inconsistent test infrastructure.

This hook is specific to mocks, but the pattern applies broadly. Hooks let you define boundaries where determinism matters more than convenience.

Some other examples where this pattern works:

  • Database migrations: Block hand-written SQL, enforce migration tooling
  • API clients: Block manual HTTP code, enforce generated clients from OpenAPI specs
  • Configuration files: Block JSON edits, enforce structured config management
  • Generated code: Block edits to auto-generated files, enforce regeneration instead

In each case, you’re drawing a line: here, deterministic tools should operate. AI can help with everything else (documentation, refactoring, debugging) but not here. This is where the pipes live, and pipes should be boring.
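As an illustration of that last bullet, a hypothetical guard for generated files could follow the same shape as the mockery hook. The script name and marker check below are my own sketch, not part of the template; it leans on Go’s "Code generated ... DO NOT EDIT." header convention:

#!/usr/bin/env bash
# protect-generated.sh - hypothetical PreToolUse guard for auto-generated files
set -euo pipefail

payload=$(cat)
file_path=$(jq -r '.tool_input.file_path // empty' <<<"$payload")

# Generated Go files carry a "Code generated ... DO NOT EDIT." header;
# if the file on disk has one, refuse the edit and point at regeneration.
if [ -n "$file_path" ] && [ -f "$file_path" ] && grep -q 'DO NOT EDIT' "$file_path"; then
  echo '{"blocked": true, "reason": "This file is auto-generated. Re-run the generator instead of editing it."}'
  exit 1
fi

exit 0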

Getting the Hook #

The enforce-mockery hook is available as part of my claude_code_template repository, along with other workflow guardrails for Claude Code.

# Clone the template
git clone https://github.com/rikdc/claude_code_template.git

# Copy the hook
cp claude_code_template/.claude/hooks/enforce-mockery.sh your-project/.claude/hooks/

# Make it executable
chmod +x your-project/.claude/hooks/enforce-mockery.sh

Then configure it in .claude/settings.json:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "\"${CLAUDE_PROJECT_DIR}\"/.claude/hooks/enforce-mockery.sh"
          }
        ]
      }
    ]
  }
}
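You can smoke-test the hook before handing it to Claude by piping a fake payload through it (the payload below is illustrative, not captured from a real session):

# Simulate a Write that hand-rolls a mock; the hook should print the blocked JSON and exit non-zero
echo '{"tool_name":"Write","tool_input":{"file_path":"store_mock.go","content":"type MockUserStore struct{}"}}' \
  | ./.claude/hooks/enforce-mockery.sh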

Final Thought #

AI is phenomenally useful. But not everywhere. Not for everything. The trick is knowing where to draw lines.

Mocks are infrastructure. They’re deterministic contracts between your tests and your code. They don’t benefit from creativity or improvisation. They benefit from tools designed to generate them reliably.

Hooks let you encode that preference once and enforce it automatically. Claude remains helpful for everything else—the explanations, the refactors, the late-night debugging sessions—but it doesn’t get to touch the pipes.

That’s the real win here. Not rejecting AI, but using it deliberately. Letting deterministic tools handle deterministic outputs, while AI handles the rest.

Sometimes the best way to work with an assistant is to teach it when to step back.