
From Guardrails to Methodology

5 min read
ai tooling open-source methodology

It started as a leash.

I was using Claude Code on a few projects and kept hitting the same wall. The AI would confidently generate code without tests, hallucinate API signatures, suggest installing dependencies I already had, and — once — tried to force-push to main. It wasn’t a capability problem. The model was sharp. It just had no principles.

So I wrote a bootstrap script. A set of markdown procedures that told the AI: here’s how to analyze a project, here are the rules, here’s what never to do. A guardrails file. A config dump. Claude-specific, rigid, about 2,600 lines of “stop doing that.”

It worked. Mostly. And then it didn’t.

The Discovery Problem

The first thing that broke was enforcement. I had rules like “always write tests first” and “never bypass RLS” (row-level security). But the AI couldn’t follow them because it didn’t understand the project well enough to know what to test or where the data layer lived. Rules without context are just noise.

So the bootstrap grew. Deep project analysis — scanning the directory structure, reading package manifests, detecting frameworks, identifying patterns. Before the AI could follow rules, it needed to discover the codebase. Before enforcement, comprehension.
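
The discovery step itself is unglamorous. It amounts to a handful of read-only commands; here’s a sketch of the kind of thing the procedure has the AI run (illustrative commands only — the real procedure is a markdown guide, not a script):

# A sketch of the discovery pass (illustrative; not the actual bootstrap)
find . -maxdepth 2 -type d -not -path '*/node_modules/*' -not -path '*/.git/*'  # map the layout
cat package.json pyproject.toml go.mod 2>/dev/null                              # read whatever manifests exist
grep -rl "describe(" --include='*.test.*' . 2>/dev/null | head                  # learn the existing test convention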

This is obvious in retrospect. You don’t hand a new engineer the style guide on day one and expect production-ready code. You give them a tour first.

Tests as Hypotheses

The next shift was philosophical. I’d been writing TDD enforcement as a checklist: write test, run test, see it fail, write code, run test, see it pass. Red-green-refactor. Standard.

But something clicked when I started watching how the AI used these instructions. The AI doesn’t understand why TDD matters. It follows the procedure mechanically. To make it actually work, I had to reframe the entire concept.

A test is not proof that code works. A test is a hypothesis — a falsifiable claim about expected behavior. Red-green-refactor is the scientific method: state your prediction, run the experiment, refine the model. When I rewrote the TDD guide in those terms, the AI’s output improved measurably. Not because the model got smarter, but because the framing gave it a reasoning structure it could actually use.
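
Spelled out as an experiment, the loop is short. A sketch assuming a Node project with Vitest and an invented slugify() feature; the guide itself is stack-agnostic:

# Hypothesis: slugify("Hello, World!") returns "hello-world"   (invented example)
npx vitest run slugify.test.ts   # red: the prediction fails because slugify() does not exist yet
# write the minimal slugify() that makes the claim true
npx vitest run slugify.test.ts   # green: the experiment supports the hypothesis
# refactor, then run it again; the hypothesis must still hold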

That led to two more methodology guides. Iterative problem-solving: don’t jump to a fix, narrow the hypothesis space first. Multi-approach validation: when you’re not sure which solution is right, prototype two and let evidence decide. These aren’t AI features. They’re engineering discipline that happens to be executable by a machine.

The Rename

Up to this point, everything was Claude-specific. The files were called CLAUDE.md. The procedures referenced Claude Code directly. The MCP server configuration assumed a single tool.

Then I looked at the AGENTS.md convention — a standard for AI coding assistant configuration that 20,000+ repositories had already adopted. Tool-agnostic by design. And I realized my system wasn’t really about Claude. It was about methodology. The methodology doesn’t care which model reads it.

So I renamed everything. CLAUDE.md became a two-line forwarding file. AGENTS.md became the source of truth. The procedures were rewritten to detect which AI tool was running — Claude Code, Cursor, Windsurf, Cline — and adapt accordingly.
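
The forwarder is deliberately boring. It looks roughly like this (my paraphrase, not the repo’s literal file):

# A sketch of the two-line forwarder (paraphrased, not the actual file contents)
cat > CLAUDE.md <<'EOF'
This file exists for Claude Code compatibility.
All project instructions live in AGENTS.md; read that instead.
EOF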

This was the hardest decision in the project, and the most important one. I lost Claude-specific optimizations. I gained portability, longevity, and a principle: your way of working should outlive your toolchain.

Two Concerns, Two Modes

The architecture had another problem. Installing global tools (MCP servers, shared configuration) and bootstrapping a specific project (generating AGENTS.md, analyzing the codebase) were tangled together. Run the bootstrap twice and things broke. Skip a step and things broke differently.

The fix was separation. Environment setup runs once per machine — it installs Serena for semantic code navigation, Context7 for live library documentation, and Sequential Thinking for structured reasoning. Project bootstrap runs once per repo — it detects your stack, generates a tailored AGENTS.md, creates methodology guides, and sets up architecture decision records.

First-time users get both automatically. The system checks what’s already done and skips it. Idempotent, like good infrastructure should be.
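
The pattern behind that is the boring one you’d use in any provisioning script: check for the artifact before creating it. A sketch with hypothetical marker files and paths, not the project’s actual code:

# Idempotency sketch (hypothetical paths; not the actual bootstrap logic)
if [ -f "$HOME/.config/ai-praxis/.installed" ]; then
  echo "Global tooling already present; skipping environment setup"
fi
if [ -f AGENTS.md ]; then
  echo "Project already bootstrapped; skipping"
else
  echo "Would generate AGENTS.md, methodology guides, and ADR scaffolding here"
fi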

Progressive Disclosure

A recurring mistake in early versions was trying to stuff everything into one file. The AI would get a 400-line AGENTS.md and treat every instruction with equal weight, burning through its context window before reaching the code.

The fix came from documentation design: progressive disclosure. AGENTS.md is a table of contents — commands, key paths, critical rules, and links to deeper docs. When the AI needs to write tests, it reads the TDD guide. When it needs security patterns, it reads the security doc. When it doesn’t need either, those files don’t exist in its context.
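
To make that concrete, a generated AGENTS.md has roughly this shape (a sketch with invented commands and paths, not actual output):

# Sketch of the shape of a generated AGENTS.md (invented commands and paths)
cat > AGENTS.md <<'EOF'
# AGENTS.md

## Commands
npm run dev · npm test · npm run lint

## Key paths
src/app (routes) · src/lib (domain logic) · tests/

## Critical rules
Write the test first. Never bypass RLS. Never force-push.

## Deeper guides
docs/methodology/tdd.md · docs/methodology/security.md · docs/adr/
EOF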

This matters more for AI than for humans. A developer skims and ignores what’s irrelevant. An AI model processes everything you give it, relevant or not. Smaller context, focused content, better output.

The Meta Moment

At some point I bootstrapped AI Praxis with AI Praxis.

The repo now has its own AGENTS.md, its own methodology guides, its own Serena configuration. The tool that teaches AI assistants how to work on codebases is itself a codebase that an AI assistant works on, following the methodology the tool teaches.

It’s recursive, and it actually converges. The human judgment lives in the methodology choices — what to enforce, what to suggest, where to draw the line between guardrail and straitjacket. The AI handles execution. The system handles the bridge.

What It Is Now

AI Praxis is open source. Two steps to set up:

git clone https://github.com/rosudrag/ai-praxis.git /tmp/ai-praxis

Then tell your AI assistant:

Read /tmp/ai-praxis and use it to bootstrap my project

It detects your stack, installs the tooling, and generates everything — AGENTS.md, TDD guides, security docs, code quality standards, architecture decision records. There are examples for Next.js, .NET, Django, and Go. It works with any MCP-compatible tool.

What I Learned

Configuration tells an AI what to do. Methodology teaches it how to think. The difference matters when you hand an agent autonomous control over your codebase.

Methodology scales better than rules. Rules are brittle — they break when context changes. A methodology adapts because it encodes reasoning patterns, not just constraints.

Tool-agnosticism is a feature, not a compromise. The moment you couple your workflow to a vendor, you’ve built a workaround, not a system.

And progressive disclosure matters more for AI than for humans. Because AI will use everything you give it, whether it should or not.

The first version was a leash. The current version is a curriculum.
