Inside the Claude Code PostToolUse hook: what fires on edit

The Claude Code PostToolUse hook is the most useful integration point in the agent’s lifecycle and also the most under-documented one. This post walks through what fires, what payload you get, what the exit codes mean, and how tailtest hooks the lifecycle without stalling Claude’s turn. PostToolUse is what makes hook-based testing possible, so understanding the contract is worth the half-hour.

I’m Vaishnavi. I work on tailtest’s Claude Code plugin specifically. I have spent the last four months reading PostToolUse logs, instrumenting the hook entry point, and reproducing the few cases where it does something unexpected. Most of what is here is in the Claude Code source if you read closely; I am collecting it in one place because the official docs cover the happy path and the source covers the rest.

What fires, and when

When Claude Code invokes a tool (Edit, Write, MultiEdit, Bash, and the rest), the lifecycle is:

PreToolUse hooks fire. These can veto the tool call by exiting non-zero.
The tool runs.
The tool returns a result.
PostToolUse hooks fire. These see the result.
The result is rendered into Claude’s context as tool output.
Claude composes its next response.

PostToolUse is the boundary between “the tool ran” and “Claude sees the result.” This is the only place in the lifecycle where you can intercept what Claude just did before Claude responds to it. Pre-tool hooks see the intent; post-tool hooks see the consequence. For testing, you want the consequence.

The hooks themselves are configured in ~/.claude/settings.json (user level) or .claude/settings.json (project level). The shape is an array under the hooks key:

{
  "hooks": [
    {
      "name": "tailtest-post-edit",
      "match": {
        "event": "PostToolUse",
        "tool": "Edit",
        "file_path_regex": "\\.(py|js|ts|tsx|go)$"
      },
      "command": "uvx tailtest --hook claude --event PostToolUse",
      "timeout_ms": 30000
    }
  ]
}

The matchers Claude Code supports as of v0.8.4 are event, tool, file_path_regex, and cwd_regex. The first two are required for any sensible PostToolUse hook; the last two narrow the scope. The timeout_ms field defaults to 60 seconds. Hooks that exceed the timeout get killed and the tool result is rendered without the hook output.
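To make the matcher semantics concrete, here is a minimal Python sketch of how a hook entry with the four matchers described above could be evaluated against an incoming event. It is an illustration, not Claude Code’s implementation: the function name hook_matches is hypothetical, and the sketch assumes that an absent matcher accepts everything and that the regex matchers use search semantics.

```python
import re


def hook_matches(match: dict, event: dict) -> bool:
    """Return True if every matcher present in `match` accepts `event`.

    Assumptions (not confirmed by the post): a matcher missing from the
    config matches anything, and regexes are applied with re.search.
    """
    if "event" in match and match["event"] != event.get("event"):
        return False
    if "tool" in match and match["tool"] != event.get("tool"):
        return False
    if "file_path_regex" in match:
        path = event.get("tool_input", {}).get("file_path", "")
        if not re.search(match["file_path_regex"], path):
            return False
    if "cwd_regex" in match:
        if not re.search(match["cwd_regex"], event.get("cwd", "")):
            return False
    return True
```

With the config shown earlier, an Edit to /abs/path/to/file.py would match, while an Edit to a README.md or any Bash call would not.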

The event payload

When PostToolUse fires, Claude Code passes a JSON payload to the hook command’s stdin. The shape, as of v0.8.4:

{
  "event": "PostToolUse",
  "tool": "Edit",
  "tool_input": {
    "file_path": "/abs/path/to/file.py",
    "old_string": "...",
    "new_string": "..."
  },
  "tool_result": {
    "success": true,
    "diff": "...",
    "file_path": "/abs/path/to/file.py"
  },
  "cwd": "/abs/path/to/project",
  "session_id": "01HMXN..."
}

The hook can also access a subset of this via environment variables, which is easier than parsing stdin for simple cases. The variables Claude Code sets for PostToolUse are:

TOOL_NAME (Edit, Write, MultiEdit, etc.)
TOOL_FILE_PATH (absolute path of the edited file, when applicable)
TOOL_SUCCESS (true or false)
CWD (current working directory of the Claude session)
SESSION_ID (the session UUID)

For tailtest’s PostToolUse handler we read both stdin (for the full payload) and the env vars (for the fast path). Most hook implementations will only need the env vars.
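As a rough illustration of reading both sources, the sketch below parses the stdin payload and falls back to the environment variables when a field is missing. The field and variable names are the ones listed in this post; the function name load_event and the shape of the returned dict are hypothetical, and this is not tailtest’s actual handler.

```python
import json
import os
from typing import Mapping, Optional


def load_event(stdin_text: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Summarize a PostToolUse event from the stdin payload, with env-var fallback."""
    env = os.environ if env is None else env
    try:
        payload = json.loads(stdin_text) if stdin_text.strip() else {}
    except json.JSONDecodeError:
        payload = {}  # malformed stdin: rely on the env vars alone
    if not isinstance(payload, dict):
        payload = {}
    tool_input = payload.get("tool_input") or {}
    tool_result = payload.get("tool_result") or {}
    success = tool_result.get("success")
    if success is None:
        # Env vars are strings, so compare against the literal "true".
        success = env.get("TOOL_SUCCESS", "").lower() == "true"
    return {
        "tool": payload.get("tool") or env.get("TOOL_NAME", ""),
        "file_path": tool_input.get("file_path") or env.get("TOOL_FILE_PATH", ""),
        "success": bool(success),
        "cwd": payload.get("cwd") or env.get("CWD", ""),
        "session_id": payload.get("session_id") or env.get("SESSION_ID", ""),
    }
```

In a real hook this would be called as load_event(sys.stdin.read()); passing env explicitly is only there to make the function easy to test.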

Exit codes and what they mean

The exit code of the hook command is load-bearing. Claude Code interprets it as follows:

0 means the hook succeeded. Stdout is captured and may be surfaced to Claude as a context note.
Non-zero means the hook failed. Stderr is captured. Claude sees an “additional context: hook failed” line and may or may not act on it depending on the tool.
Specifically for PostToolUse, non-zero exit does not roll back the tool call. The edit already happened. The non-zero is informational. (This is different from PreToolUse, where non-zero blocks the tool.)

The way tailtest uses this: we exit 0 in almost all cases and use stdout to surface a structured summary that Claude can read in its next turn. The summary looks like:

[tailtest] tests:passed=12 failed=0 classified=ok adversarial=skipped budget=384

That one line is sufficient for Claude to know “edits are clean, continue.” When tests fail, the line expands:

[tailtest] tests:passed=10 failed=2 classified=real_bug,test_bug
[tailtest] real_bug: tests/test_pricing.py::test_negative_quantity
[tailtest] test_bug: tests/test_cart.py::test_add_item (stale fixture)
[tailtest] report: .tailtest/reports/latest.json

Claude reads this in its next turn and acts on it: usually fixing the real_bug, sometimes acknowledging the test_bug and regenerating the stale fixture. The structured form (real_bug, test_bug) maps to R12 classification labels.
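For readers who want to emit a similar summary from their own hook, here is a small sketch that reproduces the two output shapes shown above. The function name format_summary, its parameters, and the (label, test_id, note) tuple layout are assumptions made for illustration; only the line format itself comes from the examples in this post.

```python
from typing import List, Tuple


def format_summary(
    passed: int,
    failures: List[Tuple[str, str, str]],
    adversarial: str = "skipped",
    budget: int = 0,
    report_path: str = ".tailtest/reports/latest.json",
) -> str:
    """Build the stdout summary. `failures` holds (label, test_id, note) tuples."""
    if not failures:
        return (
            f"[tailtest] tests:passed={passed} failed=0 classified=ok "
            f"adversarial={adversarial} budget={budget}"
        )
    # Collect distinct labels in first-seen order for the header line.
    labels: List[str] = []
    for label, _, _ in failures:
        if label not in labels:
            labels.append(label)
    lines = [
        f"[tailtest] tests:passed={passed} failed={len(failures)} "
        f"classified={','.join(labels)}"
    ]
    for label, test_id, note in failures:
        suffix = f" ({note})" if note else ""
        lines.append(f"[tailtest] {label}: {test_id}{suffix}")
    lines.append(f"[tailtest] report: {report_path}")
    return "\n".join(lines)
```

The hook would print this string to stdout and exit 0, leaving the decision about what to fix to Claude’s next turn.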

Timing constraints and the latency budget

PostToolUse runs synchronously in Claude’s turn. However long the hook takes, the user feels it. This is the single biggest constraint on hook design.

We measured per-stage latency across 2,100 PostToolUse events in tailtest’s own dogfooding logs. The shape:

Hook dispatch overhead (Claude Code’s machinery): 18ms median, 42ms p99
Tailtest entry point (Python startup, config load, dispatch): 142ms median, 290ms p99
Test runner (pytest with testmon): 980ms median, 3.2s p99
R12 classification: 28ms median, 60ms p99
R15 adversarial (when budget allows): 2.4s median, 8.1s p99
Report write and emit: 12ms median, 24ms p99

Total median for a standard PostToolUse cycle without adversarial: 1.18 seconds. With adversarial: 3.58 seconds. Both are well within the 30-second timeout_ms set in the config above (and the 60-second default), but the second one is felt as a pause.
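The two totals are simply the stage medians added together, which the short sketch below reproduces. Summing medians only approximates the true end-to-end median, so treat the result as a budget estimate rather than a measured figure; the dictionary keys are illustrative names, and the values are the medians listed above.

```python
# Stage medians in milliseconds, copied from the measurements above.
MEDIANS_MS = {
    "hook_dispatch": 18,
    "entry_point": 142,
    "test_runner": 980,
    "r12_classification": 28,
    "r15_adversarial": 2400,
    "report_write": 12,
}


def total_ms(include_adversarial: bool) -> int:
    """Sum the stage medians, optionally leaving out the R15 adversarial pass."""
    return sum(
        ms
        for stage, ms in MEDIANS_MS.items()
        if include_adversarial or stage != "r15_adversarial"
    )


print(total_ms(False))  # 1180 ms, i.e. 1.18 s
print(total_ms(True))   # 3580 ms, i.e. 3.58 s
```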

The quick depth mode in tailtest exists to compress this. It skips R15 entirely and runs only impacted tests via pytest --testmon against the changed file’s blast radius. Median drops to 380ms. For tight refactor loops this is what you want.

Non-obvious gotchas

A few things we learned the hard way.

Hooks fire on every Edit, including failed ones. If Claude’s Edit tool returns a failure (the old_string did not match), PostToolUse still fires. The tool_result.success field tells you. Tailtest no-ops on failed edits, but it took us two days of weird logs to figure out why we were getting empty test runs.

MultiEdit emits one PostToolUse for the whole batch, not one per edit. If Claude makes 6 edits to the same file in one MultiEdit call, you get one PostToolUse with the file path. We dedupe on the file path within a 200ms window to handle this.
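The window logic behind that dedupe can be sketched in a few lines. This is an assumption-laden illustration, not tailtest’s code: the class name PathDeduper is hypothetical, and it keeps timestamps in memory, whereas a real hook runs as a separate process per event and would need to persist them somewhere shared, such as a small state file.

```python
import time
from typing import Callable, Dict


class PathDeduper:
    """Suppress repeat events for the same file inside a short time window."""

    def __init__(self, window_s: float = 0.2,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_s = window_s
        self.clock = clock
        self._last_run: Dict[str, float] = {}

    def should_run(self, file_path: str) -> bool:
        now = self.clock()
        last = self._last_run.get(file_path)
        if last is not None and (now - last) < self.window_s:
            return False  # same file seen within the window: skip
        self._last_run[file_path] = now
        return True
```

The window here is fixed rather than sliding: a suppressed event does not extend it, so a steady stream of events for one file still triggers a run every 200ms.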

Bash tool also fires PostToolUse. If Claude runs a shell command, PostToolUse fires with TOOL_NAME=Bash and no TOOL_FILE_PATH. Tailtest filters on the tool name to only act on file edits. Forgetting this matcher will cause your test runner to fire after every ls.
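The failed-edit and Bash gotchas both reduce to a guard at the top of the handler. A minimal sketch, assuming the tool names listed in this post and a hypothetical function name should_run_tests:

```python
# Tools that modify files, per the tool names used in this post.
FILE_EDIT_TOOLS = {"Edit", "Write", "MultiEdit"}


def should_run_tests(tool: str, success: bool, file_path: str) -> bool:
    """Act only on successful file edits that carry a file path."""
    return tool in FILE_EDIT_TOOLS and success and bool(file_path)
```

Anything that fails the guard should exit 0 immediately, so a Bash call or a failed Edit costs only the process startup.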

Working directory is not always the project root. Claude Code’s cwd is wherever the session was started. If the user invoked Claude from a subdirectory, cwd is that subdir. We resolve to the project root by walking up to the nearest .tailtest/config.yaml or .git/.
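The walk-up described above fits in a few lines of pathlib. This is a sketch of the idea rather than tailtest’s implementation; the function name find_project_root is hypothetical, and returning None when no marker is found is an assumption about how a caller would want to handle that case.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: str) -> Optional[Path]:
    """Walk up from `start` to the nearest dir with .tailtest/config.yaml or .git."""
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        has_config = (candidate / ".tailtest" / "config.yaml").is_file()
        # .git is a directory in a normal clone but a file in a worktree.
        has_git = (candidate / ".git").exists()
        if has_config or has_git:
            return candidate
    return None
```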

Settings files are merged user-then-project, and the project wins. A project-level hook with the same name as a user-level hook replaces it rather than merging with it. This is how tailtest’s installer works: it writes a project-level hook so the user’s other hooks in ~/.claude/settings.json are not affected.
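The replace-by-name behavior can be modeled as follows. This is a sketch of the rule as described in this post, not Claude Code’s merge code: the function name merge_hooks is hypothetical, and keeping every unnamed hook from both files is an assumption.

```python
from typing import Dict, List


def merge_hooks(user_hooks: List[dict], project_hooks: List[dict]) -> List[dict]:
    """Merge user-then-project; a project hook replaces a same-named user hook."""
    named: Dict[str, dict] = {}
    unnamed: List[dict] = []
    for hook in [*user_hooks, *project_hooks]:
        name = hook.get("name")
        if name is None:
            unnamed.append(hook)  # assumption: unnamed hooks never collide
        else:
            named[name] = hook  # a later (project) entry overwrites the earlier one
    return [*named.values(), *unnamed]
```

A user-level hook with a different name passes through untouched, which is why a project-level install does not disturb the rest of ~/.claude/settings.json.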

What we hook beyond PostToolUse

Tailtest’s Claude Code plugin also installs:

A PreToolUse veto on Bash calls that would run a known-destructive pattern in the project root (rm -rf, git push --force, etc.). This is conservative and overridable.
A Stop hook that runs at the end of Claude’s turn. The Stop hook does the cross-file consistency check that PostToolUse cannot do efficiently (some R-rules need the full set of edits in a turn before they can run).
A Notification hook that writes a structured event to .tailtest/sessions/<session_id>.jsonl for the eventual session replay tooling.
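To give a feel for the PreToolUse veto, here is a deliberately small pattern check. The two regexes are illustrative and cover only the rm -rf and git push --force examples named above; they are not tailtest’s pattern list, and the function name is_destructive is hypothetical. The check is conservative: it also flags git push --force-with-lease.

```python
import re

# Illustrative patterns only; a real list would be longer and configurable.
DESTRUCTIVE_PATTERNS = [
    # rm with both -r and -f in one flag group, in either order (-rf, -fr, -rfv).
    re.compile(r"\brm\s+-[a-zA-Z]*(r[a-zA-Z]*f|f[a-zA-Z]*r)"),
    # git push with --force anywhere after it.
    re.compile(r"\bgit\s+push\b.*--force\b"),
]


def is_destructive(command: str) -> bool:
    """Return True if the shell command matches a known-destructive pattern."""
    return any(pattern.search(command) for pattern in DESTRUCTIVE_PATTERNS)
```

A PreToolUse hook built on this would exit non-zero when is_destructive returns True, which blocks the Bash call as described earlier.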

The Stop hook is worth highlighting. PostToolUse is per-edit; Stop is per-turn. If Claude made 8 edits in one turn, PostToolUse fired 8 times and Stop fires once at the end. The Stop hook is where we run the suite-level checks (full test suite at standard depth, integration tests at thorough depth). The per-edit cycle catches the obvious bugs at edit time; the per-turn cycle catches the integration bugs at turn boundary.
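One way to connect the per-edit and per-turn cycles is for each PostToolUse run to append the edited path to a per-session file, and for the Stop hook to read and clear it. The post does not describe how tailtest passes this state along, so treat the following as one possible design; the function names, the state directory, and the .edits file suffix are all hypothetical.

```python
from pathlib import Path
from typing import List


def record_edit(state_dir: Path, session_id: str, file_path: str) -> None:
    """Per-edit (PostToolUse) side: append one edited file to the session log."""
    state_dir.mkdir(parents=True, exist_ok=True)
    with (state_dir / f"{session_id}.edits").open("a", encoding="utf-8") as fh:
        fh.write(file_path + "\n")


def drain_edits(state_dir: Path, session_id: str) -> List[str]:
    """Per-turn (Stop) side: return the unique edited files in order, then reset."""
    log = state_dir / f"{session_id}.edits"
    if not log.exists():
        return []
    # dict.fromkeys removes duplicates while preserving first-seen order.
    paths = list(dict.fromkeys(log.read_text(encoding="utf-8").splitlines()))
    log.unlink()
    return paths
```

With 8 edits across 3 files in a turn, record_edit runs 8 times and drain_edits hands the Stop hook the 3 distinct paths once.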

The split mirrors what Shridip described in the 5 levels of AI testing maturity. Level 3 (per-edit hooks) catches unit-level bugs. Level 4 (per-turn integration plus failure classification) catches the next layer.

How to write your own PostToolUse hook

If you want to write your own hook without tailtest, the minimum viable shape:

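#!/usr/bin/env bash
# .claude/hooks/run-tests-on-edit.sh
# Runs the tests related to the file Claude just edited.
set -e
# PostToolUse also fires for failed edits; nothing changed, so skip them.
[ "${TOOL_SUCCESS:-true}" = "true" ] || exit 0
case "${TOOL_FILE_PATH:-}" in
  *.py) pytest --testmon -q "$(dirname "$TOOL_FILE_PATH")" ;;   # impacted Python tests
  *.ts|*.tsx) jest --findRelatedTests "$TOOL_FILE_PATH" ;;      # tests that import this file
  *) exit 0 ;;                                                  # no file path or other file types
esac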

Add this to your .claude/settings.json:

{
  "hooks": [
    {
      "match": {"event": "PostToolUse", "tool": "Edit"},
      "command": "bash .claude/hooks/run-tests-on-edit.sh"
    }
  ]
}

That is the entire MVP. Under twenty lines including the JSON. You do not need tailtest to start. What tailtest adds is the runner dispatch across four languages, R12 classification, the R15 adversarial pass, the structured report, and the four-agent abstraction so the same config works across Claude Code, Cursor, Codex CLI, and Cline. If you only care about Claude Code and Python, the bash script above is a real starting point. If you grow into needing the rest, the upgrade path is uvx tailtest install --agent claude.

Where to read more

The agent edits platform page covers the runtime where the hook plugs in. The hook-based testing explained post covers the broader architectural argument for hooks over prompts. The Claude Code solution page walks through the full integration.

FAQ

What is the Claude Code PostToolUse hook?

PostToolUse is a hook event that fires after Claude Code’s tools (Edit, Write, MultiEdit, Bash) complete. It is configured in .claude/settings.json and runs a shell command with the tool’s result available in stdin and environment variables.

Does PostToolUse block Claude’s response?

Yes, synchronously. The hook runs before Claude composes its next message. Whatever latency the hook adds is felt by the user. Median for a typical testing hook is around one to two seconds.

Can PostToolUse roll back an edit?

No. PostToolUse fires after the edit has already happened. To prevent edits before they occur, use PreToolUse, which can veto the tool call by exiting non-zero.

What is the maximum runtime for a PostToolUse hook?

The default timeout is 60 seconds. The timeout_ms field in the hook config overrides this. Hooks that exceed the timeout get killed and the tool result is rendered without the hook output.

How is PostToolUse different from a git pre-commit hook?

PostToolUse fires per edit, inside the agent’s turn. Pre-commit fires per commit, when the human commits. Agents make dozens of edits between commits. PostToolUse catches what pre-commit misses.