Open Source AI Agent Testing

The pytest
for AI agents

You don't write tests. You build your agent - tailtest watches, learns, tests. Scan your project, auto-generate tests, run them in CI, guard production.

Apache 2.0 · Zero telemetry · Fully local · Python 3.11+ · Any framework · Any model

Terminal
$ pip install tailtester
$ tailtest scan .
Detected: OpenAI Agents SDK, 3 agent files, 12 tool calls
Generated: 8 deterministic tests, 4 LLM-judge tests, 2 red-team suites
$ tailtest run
14 passed, 0 failed, 0 skipped (2.11s)

93% of developers building AI agents don't test them

The tooling doesn't exist. The patterns aren't established. If you ship an agent today, you're shipping it blind.

93%
of developers don't test AI agents
29%
trust AI output accuracy (down from 40%)
94.4%
of agents are vulnerable to prompt injection
85%
per-step accuracy compounds to roughly 20% success across a 10-step workflow
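The last figure is simple compounding: if each step succeeds independently with probability p, an n-step workflow only succeeds when every step does. A quick sketch (illustrative arithmetic, not tailtest code):

```python
# Per-step accuracy compounds multiplicatively: an n-step chain of
# independent steps succeeds with probability p ** n.
def workflow_success_rate(per_step_accuracy: float, steps: int) -> float:
    return per_step_accuracy ** steps

rate = workflow_success_rate(0.85, 10)
print(f"{rate:.1%}")  # 19.7% - roughly one successful run in five
```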

This is what happens without testing

Replit's GPT-4 assistant

Deleted a CEO's production database, then fabricated reports claiming the data was irrecoverable.

$47K LangChain loop

Four agents got stuck in a conversation loop that ran for 11 days. No one noticed until the bill arrived.

$3.2M procurement fraud

Attackers compromised a vendor-validation agent through supply chain prompt injection.

Promptfoo acquired by OpenAI

300K+ developers lost their vendor-neutral testing tool overnight, leaving a vacuum in the category.

22
Assertion Types
64
Red-Team Attacks
6
Framework Detectors
13
CLI Commands
$0
Forever

How it works

Four positions. From first scan to production guardian. No config files. No account creation.

0

Watch & Auto-Generate

Tailtest scans your codebase, detects your framework, watches file edits and OTel traces. It auto-generates deterministic, LLM-judged, and red-team tests without you writing a single line.

$ tailtest scan .
Detected: LangChain, 5 agents, 23 tool calls
Generated: 12 tests (8 deterministic, 4 LLM-judge)
1

Run Tests in CI

Run your test suite in CI/CD with exit codes, JUnit XML output, and parallel execution. Deterministic assertions run instantly at zero cost. LLM-judged assertions use local models via Ollama by default.

$ tailtest run --ci --output junit.xml
22 passed, 1 failed, 0 skipped (3.4s)
JUnit XML written to junit.xml
2

Watch for Regressions

Continuously watch your agent in development. When behavior changes, tailtest detects drift and generates new regression tests automatically. Your test suite grows as your agent evolves.

$ tailtest watch
Watching 5 agent files...
Drift detected: tool_selector changed behavior
Generated: 2 new regression tests
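To make drift detection concrete, here is a minimal sketch of one possible signal - comparing the tool-call sequence of a baseline trace against the latest run. The function name and threshold are illustrative, not tailtest internals:

```python
from difflib import SequenceMatcher

def tool_call_drift(baseline: list[str], latest: list[str],
                    threshold: float = 0.8) -> bool:
    # Flag drift when the ordered tool-call sequences fall below
    # `threshold` similarity (SequenceMatcher compares any sequences
    # of hashable items).
    similarity = SequenceMatcher(None, baseline, latest).ratio()
    return similarity < threshold

baseline = ["classify_intent", "lookup_order", "format_reply"]
latest = ["classify_intent", "web_search", "format_reply"]
print(tool_call_drift(baseline, latest))  # True: tool selection changed
```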
3

Guard Production

Ingest production traces via OpenTelemetry, detect anomalies, and auto-generate regression tests from real failures. When your agent breaks in prod, the fix comes with a test.

$ tailtest guard --otel-endpoint http://localhost:4318
Ingesting production traces...
Anomaly: latency spike on order_lookup (p99: 8.2s)
Generated: 1 regression test from production failure

Everything you need to test AI agents

One framework. Deterministic + LLM-judged + red-team assertions. Any framework, any model.

22 Assertion Types

10 deterministic (cost, latency, tool calls, PII, regex) + 7 LLM-judged (faithfulness, tone, helpfulness) + 5 reliability (pass rate, consistency). Deterministic first - free, fast, instant.

64 Red-Team Attacks

Prompt injection, jailbreak, PII extraction, and more across 8 categories. Built-in OWASP LLM Top 10 and Agent Top 10 compliance checks. Know your vulnerabilities before attackers do.

6 Framework Detectors

Auto-detects OpenAI, Anthropic, LangChain, CrewAI, PydanticAI, and generic agents. Scans your codebase and generates framework-specific tests without configuration.

Production Monitoring

Ingest OpenTelemetry traces from production, detect behavioral drift, and auto-generate regression tests from real failures. Your test suite grows from actual production issues.

MCP Server

6 MCP tools for IDE integration. Run tests, generate assertions, check coverage, and view results - all from your editor. Works with Claude Code, Cursor, Windsurf, and any MCP-compatible IDE.

Visual HTML Reports

5 report formats: terminal, JUnit XML, JSON, compliance text, and rich HTML reports. Beautiful, shareable test results with pass/fail breakdowns, assertion details, and trend tracking.

The expect() API

Familiar. Expressive. Deterministic assertions run instantly at zero cost. LLM-judged assertions default to local models.

test_my_agent.py
from tailtest import agent_test, expect

@agent_test
async def test_order_lookup():
    response = await agent.chat("What's the status of order #12345?")

    # Deterministic assertions  -  free, instant
    expect(response).to_call_tool("lookup_order")
    expect(response).tool_called_with("lookup_order", order_id="12345")
    expect(response).to_contain("order")
    expect(response).no_pii()
    expect(response).latency_under(3000)
    expect(response).cost_under(0.50)

@agent_test
async def test_response_quality():
    response = await agent.chat("Explain your return policy")

    # LLM-judged assertions  -  local model via Ollama
    expect(response).faithful_to(context="Returns accepted within 30 days...")
    expect(response).helpful()
    expect(response).tone("professional", "empathetic")

@agent_test(retries=10)
async def test_reliability():
    response = await agent.chat("What are your business hours?")

    # Reliability assertions  -  statistical guarantees
    expect(response).to_contain("9am")
    expect(response).pass_rate(0.95)
10
Deterministic
Free. Instant. No LLM needed.
7
LLM-Judged
Local model via Ollama by default.
5
Reliability
Statistical pass rates over N runs.
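Under the hood, a reliability assertion amounts to repeating a check and comparing the observed pass fraction to a threshold. A minimal sketch of the idea - `pass_rate` and `business_hours_check` here are hypothetical names, not tailtest internals:

```python
import asyncio

async def pass_rate(check, runs: int, threshold: float) -> bool:
    # Run the same check N times concurrently, then require the
    # observed pass fraction to meet the threshold.
    results = await asyncio.gather(*(check() for _ in range(runs)))
    return sum(results) / runs >= threshold

async def business_hours_check() -> bool:
    # Stand-in for calling the agent and asserting "9am" appears
    # in the response.
    return True

ok = asyncio.run(pass_rate(business_hours_check, runs=10, threshold=0.95))
```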

How tailtest compares

The only tool that auto-generates tests, runs deterministic-first, and never phones home.

Open source
Tailtest Apache 2.0
DeepEval Apache 2.0
Promptfoo Acquired by OpenAI
Braintrust Proprietary
Vendor-neutral
Tailtest Yes
DeepEval Partial (OpenAI default)
Promptfoo No (OpenAI owned)
Braintrust Yes
Auto-generate tests
Tailtest Position 0 (scan + watch)
DeepEval No
Promptfoo No
Braintrust No
Deterministic-first
Tailtest 10 types, zero cost
DeepEval LLM-first (expensive)
Promptfoo Mixed
Braintrust LLM-first
Red-team attacks
Tailtest 64 attacks, 8 categories
DeepEval Limited
Promptfoo Yes (being absorbed)
Braintrust No
OWASP compliance
Tailtest LLM Top 10 + Agent Top 10
DeepEval Partial
Promptfoo LLM Top 10
Braintrust No
Production monitoring
Tailtest OTel ingestion + drift
DeepEval Via Confident AI (paid)
Promptfoo No
Braintrust Yes (paid platform)
Telemetry
Tailtest Zero. Never.
DeepEval Forced (Confident AI)
Promptfoo OpenAI-controlled
Braintrust Platform-dependent
Runs fully local
Tailtest Yes, forever
DeepEval Partial
Promptfoo Was yes, now unclear
Braintrust No (cloud platform)
CLI-first / CI/CD native
Tailtest Yes (exit codes, JUnit XML)
DeepEval Yes (pytest plugin)
Promptfoo Yes
Braintrust Dashboard-first
Framework agnostic
Tailtest 6 auto-detectors
DeepEval Yes
Promptfoo Yes
Braintrust Yes
Pricing
Tailtest $0 forever
DeepEval Free + paid cloud
Promptfoo OpenAI pricing TBD
Braintrust $800M valuation, enterprise

Truly open source

Not "open core" with a paid cloud. Not "source available" with restrictions. Actually open source. Actually free.

Apache 2.0 License

Use it in production, modify it, fork it, sell products built on it. No CLA, no contributor licensing traps. Apache 2.0 with patent grant.

Zero Telemetry

No data leaves your machine. No analytics. No usage tracking. No phone-home. Nothing to opt out of - the tracking code simply doesn't exist.

Fully Local

Runs entirely on your machine. LLM-judged assertions use Ollama by default. No cloud accounts, no API keys required for core functionality.

Self-Hostable

Every feature works on your own infrastructure. Production monitoring, report generation, MCP server - all run on your terms.

What we are NOT building

- Not a dashboard-first enterprise product (that's Braintrust)
- Not a framework-specific tool (that's LangSmith)
- Not a security-only scanner (that's Promptfoo/OpenAI now)
- Not a cloud-required service (runs fully local, forever)

Ready to test your agents properly?

Three commands. No config files. No account creation. Meaningful test results in under 3 minutes.

Quick Start
# Install
$ pip install tailtester
# Scan your project and auto-generate tests
$ tailtest scan .
# Run all tests
$ tailtest run