Interactive Guide
Every terminal below is a real scenario. Watch commands execute, see what gets generated, understand what Forge does at each step.
Your Stack, Understood
forge setup reads your codebase - package.json, directory structure, config files - and builds a complete architecture profile. No questionnaires, no configuration files. It just works.
Backend Project
9 gates in under 10 seconds. Forge detects your full stack (NestJS, Prisma, PostgreSQL, JWT, Bull queues), audits naming conventions across all services, measures existing skill coverage, and generates a tailored dev-pipeline with 50+ quality patterns scoped to your exact capabilities.
◆ Gate 1: Architecture Audit
Stack: NestJS 11.0 + TypeScript 5.8 + Prisma 6.4
Architecture: Clean Architecture (3 layers)
Database: PostgreSQL 16 | CI: GitHub Actions
Auth: JWT + Passport | Events: Bull queues
Parallel: 7/10 | Conflicts: 2 zones
✓ Gate 2: Development Pipeline
Git: trunk-based · Commits: conventional (92%) · Mode: balanced
✓ Gate 3: Task Tracking → GitHub Issues + 3 scripts
✓ Gate 3.5: Script Verification → 3/3 syntax + smoke tests
◆ Gate 4: Quality Checklist
Capabilities: api · database · auth · events · docker
✓ Loaded: api.md + infra.md + shared.md → 53 patterns
◆ Gate 5: Naming Conventions
✓ Services: *Service (8/8) ✓ DTOs: Create*Dto (12/15)
⚠ Events: inconsistent → adopted *CreatedEvent
✓ Gate 6: Knowledge Export → CLAUDE.md updated, 15 rationalization detectors
◆ Gate 6.5: Skill Reconciliation
dev 35% → Replace (5/13 core features)
deploy 62% → Augment (+resume, +tasklist, +self-review)
testing 90% → Keep (12/13 core features)
◆ Generation: Dev-skills & Config
.claude/
├── CLAUDE.md ← architecture + conventions
├── settings.json ← 3 execution modes
└── skills/dev/
├── SKILL.md ← 6-phase TDD pipeline
├── step-catalog.md ← 52 scoped steps
└── quality-patterns/ ← api + infra + shared
✦ Setup complete. 9/9 gates passed. AI is now architecture-aware.
Works with: Claude Code · Cursor · Windsurf · any AI agent
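Gate 5's convention audit boils down to pattern matching over exported names. A minimal sketch of that idea, assuming a simple rule table (the rule set and `audit` helper are illustrative, not Forge's actual implementation):

```typescript
// Illustrative sketch: check class names against naming rules and
// report a pass ratio per category, like Gate 5's "8/8" output.
type Rule = { category: string; pattern: RegExp };

const rules: Rule[] = [
  { category: "Services", pattern: /Service$/ },    // *Service
  { category: "DTOs", pattern: /^Create\w+Dto$/ },  // Create*Dto
  { category: "Events", pattern: /CreatedEvent$/ }, // *CreatedEvent
];

function audit(names: Record<string, string[]>): Record<string, string> {
  const report: Record<string, string> = {};
  for (const rule of rules) {
    const candidates = names[rule.category] ?? [];
    const ok = candidates.filter((n) => rule.pattern.test(n)).length;
    report[rule.category] = `${ok}/${candidates.length}`;
  }
  return report;
}
```

A category that scores below 100% (like the 12/15 DTOs above) is where Forge proposes a convention to adopt going forward.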
Frontend Project
Same command, completely different output. On a Next.js + React 19 + Tailwind v4 project, Forge generates UI-specific quality patterns (accessibility, performance, component boundaries), configures component testing with Testing Library + Vitest, and adapts the entire pipeline to frontend conventions.
◆ Gate 1: Architecture Audit
Stack: Next.js 15.3 + React 19 + TypeScript 5.8
Rendering: App Router (hybrid SSR/SSG)
Styling: Tailwind v4 + shadcn/ui | Auth: NextAuth v5
✓ Gate 2: Development Pipeline
✓ Gate 4: Quality Checklist → 42 UI patterns (React, a11y, perf)
✓ Gate 5: Naming Conventions
✓ Gate 6: Knowledge Export → 12 rationalization detectors
✓ Gate 6.5: Skill Reconciliation
◆ Generation: Dev-skills & Config
✓ 48 steps · quality-patterns/ui.md · quality-patterns/shared.md
✓ Component testing configured (Testing Library + Vitest)
✦ Setup complete. 7/7 gates passed. AI is now UI-aware.
Ecosystem Discovery
Gate 0 scans your installed adapters and recommends additional modules based on your stack and workflow. Install them in seconds - they auto-register into the dev pipeline.
◆ Gate 0: Ecosystem Discovery
Installed:
✓ forge-core v8.0.0
✓ forge-product v4.1.1
Recommended for your stack:
◇ forge-qa Test strategy orchestrator
◇ forge-tracker ClickUp/Linear/Jira sync
◇ forge-prompts Prompt architecture tools
Installing recommended adapters...
✓ forge-qa v3.8.1 installed
✓ Test skills registered in dev-pipeline
✓ Quality gates activated
✦ Ecosystem ready. 3 adapters configured.
Tell It What You Want
Without Hub, you need to know that /forge:product should run before /forge:qa, which should run before /dev. With Hub, you describe what you want in plain English. /forge:hub checks which prerequisites already exist, works out what's missing, and builds the right execution chain automatically.
Automatic Prerequisite Resolution
Ask for a payment system. Hub checks if product stories exist for payments - they don't. Checks if a test plan exists - it doesn't. Instead of jumping straight to code with missing context, Hub builds a 3-step chain: design the feature first, generate tests from acceptance criteria, then implement with TDD. Every prerequisite resolved before a single line of code is written.
◆ Intent → product_feature · domain: payments
Prerequisite check:
✓ discovery → market context exists
✓ marketing → positioning exists
✗ product → no stories for payments
✗ qa → no test plan for payments
Execution chain:
1. product design "payments" → stories + acceptance criteria
2. qa test "payments" → test plan from AC
3. dev "payments" → implementation + TDD
✓ Proceed with 3 steps? Y
◆ Step 1/3: product design "payments"
✓ Generated 3 user stories, 7 acceptance criteria, 2 user flows
◆ Step 2/3: qa test "payments"
✓ Mapped 7 AC to 12 assertions · Generated 4 test files
◆ Step 3/3: dev "payments" → 6-phase TDD pipeline
✓ 12/12 tests passing · 7/7 AC covered · PR #94
✦ Chain complete. 3 steps · 12 tests · 7 AC · 0 skipped prerequisites.
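The prerequisite check above can be pictured as a walk over an ordered list of stages, queueing every stage whose artifact is missing. This is an illustrative guess at the resolution logic, not Hub's actual implementation; only the stage names come from the transcript:

```typescript
// Ordered stages from the transcript; earlier stages feed later ones.
const stageOrder = ["discovery", "marketing", "product", "qa", "dev"];

// Walk the stages up to the goal, queueing each one whose artifact is
// missing; the goal stage itself always runs.
function buildChain(goal: string, existing: Set<string>): string[] {
  const chain: string[] = [];
  for (const stage of stageOrder) {
    if (!existing.has(stage) || stage === goal) chain.push(stage);
    if (stage === goal) break;
  }
  return chain;
}
```

With discovery and marketing artifacts present and a `dev` goal, this yields the 3-step chain shown above: product, then qa, then dev.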
The Dev Pipeline
When Hub routes to development, /dev takes over with a 6-phase TDD pipeline. All your modules work together: forge-worktree isolates the work in its own branch, forge-product loads acceptance criteria, forge-qa generates tests, forge-tracker syncs status - full integration, zero manual coordination.
6-Phase Pipeline with Module Integration
Give it a feature description. Forge creates an isolated worktree, loads acceptance criteria from forge-product, generates failing tests via forge-qa (dispatched on Sonnet for cost efficiency), implements until green, runs self-review on Haiku, verifies AC coverage, and opens a PR - all in one command. Subagents run on cheaper models so you don't burn Opus tokens on mechanical work.
◆ Phase 0: Gate
Task: new-feature · ~8 files · Mode: balanced
✓ forge-worktree → branch feat/eng-142-webhook in isolated worktree
✓ forge-product → loaded 5 AC from features/payment.md
✓ forge-tracker → linked ENG-142 → In Progress
✓ Red flags: none
◆ Phase 1: Understand
Schema: WebhookEvent table · Layers: domain, infra, interface
✓ Contracts: IWebhookRepository, ProcessWebhookCommand, WebhookDto
◆ Phase 2: Test (RED)
✓ forge-qa → AC mapped to 8 assertions · dispatched test-generator (sonnet)
Generated: webhook.spec.ts (5 tests) + stripe-sig.spec.ts (3 tests)
✗ 8 tests failing - ready for implementation
◆ Phase 3: Implement (GREEN)
core.3.10: WebhookEvent entity + value objects
core.3.20: WebhookRepository (Prisma)
core.3.30: ProcessWebhookHandler + signature verification
core.3.40: WebhookController + route guards
✓ 8/8 tests passing
◆ Phase 4: Verify
✓ Type check · ✓ Lint · ✓ Quality patterns: 14/14
✓ Self-review (haiku) → 0 issues · 0 rationalization flags
✓ forge-product → AC coverage: 5/5
✓ forge-tracker → ENG-142 marked complete
◆ Phase 5: Close
✓ forge-worktree → PR #91 created, worktree cleaned
✦ Feature complete. 6 phases · 5/5 AC · 8 tests · 0 flags.
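Phase 4's "AC coverage: 5/5" is, at its core, a set check: every acceptance criterion must be referenced by at least one test. A sketch of that check, assuming a hypothetical shape where each generated test declares the AC ids it covers:

```typescript
// Hypothetical shape: each generated test lists the AC ids it covers.
type TestCase = { name: string; covers: string[] };

// Report covered/total plus any AC with no test behind it, as in
// Phase 4's "AC coverage: 5/5" line.
function acCoverage(acIds: string[], tests: TestCase[]) {
  const seen = new Set(tests.flatMap((t) => t.covers));
  const missing = acIds.filter((id) => !seen.has(id));
  return { covered: acIds.length - missing.length, total: acIds.length, missing };
}
```

A non-empty `missing` list is what would keep the pipeline from advancing to Phase 5.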
Rationalization Detectors
AI models are trained to be helpful - sometimes too helpful. When an AI agent rationalizes skipping best practices ("let's add tests later", "error handling can wait"), Forge's 15 rationalization detectors catch it mid-generation and block the shortcut. Patterns are scoped to your capabilities: API guards, database migrations, auth flows, event handling - not generic checklists.
⚠ Rationalization Detector Triggered
DETECTED: "Skip for now" pattern
Rule: api.quality.error-handling
What AI said:
"We can add error handling later
to keep things simple for now"
Why it matters:
→ 73% of production incidents trace to
missing error handling added "later"
✗ Blocked. Error handling is required.
✓ Generating error boundary template...
✦ Quality preserved. AI stayed on track.
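At its simplest, a rationalization detector is a set of phrase patterns scanned against generated output, each tied to the rule it protects. A minimal sketch; the patterns and rule ids here are illustrative, not Forge's real detector set:

```typescript
// Illustrative detectors: each pairs a quality rule with a "do it
// later" phrase pattern to flag in generated text.
type Detection = { rule: string; match: string };

const detectors: { rule: string; pattern: RegExp }[] = [
  { rule: "api.quality.error-handling", pattern: /add (error handling|validation) later/i },
  { rule: "testing.no-deferral", pattern: /(add|write) tests later/i },
  { rule: "generic.skip-for-now", pattern: /skip (this|it|.+) for now/i },
];

function scan(text: string): Detection[] {
  const hits: Detection[] = [];
  for (const d of detectors) {
    const m = text.match(d.pattern);
    if (m) hits.push({ rule: d.rule, match: m[0] });
  }
  return hits;
}
```

Any hit blocks the shortcut mid-generation, as in the transcript above, instead of letting it land in the diff.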
Specialized Modules
Beyond setup and the dev pipeline, Forge has specialized modules for testing, task tracking, product design, prompt engineering, and fully autonomous development.
Test Generation (forge-qa)
Tests generated from acceptance criteria, not AI guesswork. forge-qa reads your feature specs from forge-product, maps each AC to concrete assertions, and generates tests across 11 modes (unit, integration, component, acceptance, e2e). Every test traces back to a requirement through 4-level traceability: AC, use cases, UX criteria, and LLM-as-Judge evaluation.
◆ Analyzing acceptance criteria...
Feature: Payment Webhook Endpoint
Source: .claude/forge/product/features/payment.md
✓ 5 acceptance criteria found
✓ 3 edge cases derived
◆ Generated tests:
src/test/
├── payment-webhook.spec.ts 5 tests
├── stripe-signature.spec.ts 3 tests
└── payment-events.e2e.spec.ts 4 tests
Running tests...
✓ 12/12 passing (1.3s)
✦ Tests traced to requirements. Not guesswork.
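One way to make "traced to requirements" concrete is an AC-to-test index built from tags in test names. The `[AC-n]` tagging convention below is an assumption for this sketch, not Forge's documented format:

```typescript
// Build an index from AC id to the tests that cover it, by extracting
// "[AC-n]" tags from test names (tagging convention assumed).
function traceability(testNames: string[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const name of testNames) {
    for (const m of Array.from(name.matchAll(/\[(AC-\d+)\]/g))) {
      const list = index.get(m[1]) ?? [];
      list.push(name);
      index.set(m[1], list);
    }
  }
  return index;
}
```

An AC id absent from the index is a requirement with no test behind it, which is exactly what 4-level traceability is meant to surface.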
Task Sync (forge-tracker)
Bidirectional sync between your task tracker and the AI agent. Pull tasks from Linear, ClickUp, Jira, GitHub, or Notion - acceptance criteria flow directly into the dev pipeline. When implementation is done, forge-tracker marks the task complete and posts a summary. No copy-pasting ticket IDs, no manual status updates.
◆ Syncing with Linear...
Sprint: March W2 (4 items remaining)
ENG-142 Payment webhook Urgent
ENG-138 Email notification fix High
ENG-145 Dashboard filters Medium
ENG-147 API rate limiting Medium
Picked: ENG-142 - Payment webhook
✓ Status → In Progress
✓ Branch: feat/eng-142-payment-webhook
✓ Acceptance criteria loaded (5 items)
✦ Task synced. AI has full context.
Agent Teams (forge-autopilot)
The endgame. Forge triages your backlog, detects file conflicts between features, groups them into parallel waves, and spawns AI agent teams in isolated git worktrees. A tech-lead agent coordinates while teammates execute the full /dev pipeline independently. 3 autonomy modes: supervised (you approve each PR), semi-auto (auto-merge green PRs), full-auto (hands-off sprint execution).
◆ Phase 1: Triage
Analyzing backlog: 6 features classified
✓ 4 parallelizable, 2 sequential
◆ Phase 2: Wave Planning
Wave 1 (parallel):
● Team A → ENG-142 Payment webhook
● Team B → ENG-145 Dashboard filters
● Team C → ENG-147 Rate limiting
◆ Phase 3: Executing Wave 1...
Team A: ████████████░░ 85% (GREEN)
Team B: ██████████████ 100% ✓ PR #89
Team C: ██████░░░░░░░░ 45% (RED)
✦ 2/3 features complete. 12 min elapsed.
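The wave planning in Phase 2 can be sketched as greedy packing: two features can share a wave only if their file sets don't overlap. A minimal version under that assumption (the `Feature` shape and greedy strategy are illustrative, not autopilot's actual planner):

```typescript
// Greedy wave packing: a feature joins the first wave where it has no
// file conflict with any feature already in that wave.
type Feature = { id: string; files: string[] };

function planWaves(features: Feature[]): Feature[][] {
  const waves: Feature[][] = [];
  for (const f of features) {
    const wave = waves.find((w) =>
      w.every((g) => !g.files.some((file) => f.files.includes(file)))
    );
    if (wave) wave.push(f);
    else waves.push([f]);
  }
  return waves;
}
```

Each resulting wave maps to a set of agent teams running in parallel worktrees; later waves wait for the conflicting files to merge first.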
And More
forge-product
Feature design with user stories, acceptance criteria, UX flows
forge-prompts
Prompt engineering with CO-STAR, RISEN, TIDD-EC frameworks
forge-worktree
Git worktree lifecycle - parallel branches, zero port conflicts
9 Advisory Modules
Discovery, marketing, analytics, SEO, copy, onboarding, growth, A/B tests
Start Building Better
Setup runs in under 10 seconds. The artifacts live in .claude/ - they work with Claude Code, Cursor, Windsurf, and any AI agent. Remove Forge after setup if you want - the artifacts are standalone.
EUR 29
Starter - core + worktree
EUR 79
Pro - + product, QA, tracker, prompts
EUR 149
Complete - all 15+ modules