Interactive Guide
Every terminal below is a real scenario. Watch commands execute, see what gets generated, understand what Forge does at each step.
Your Stack, Understood
forge setup reads your codebase - package.json, directory structure, config files - and builds a complete architecture profile. No questionnaires, no configuration files. It just works.
Backend Project
9 gates in under 10 seconds. Forge detects your full stack (NestJS, Prisma, PostgreSQL, JWT, Bull queues), audits naming conventions across all services, measures existing skill coverage, and generates a tailored dev-pipeline with 50+ quality patterns scoped to your exact capabilities.
◆ Gate 1: Architecture Audit
Stack: NestJS 11.0 + TypeScript 5.8 + Prisma 6.4
Architecture: Clean Architecture (3 layers)
Database: PostgreSQL 16 | CI: GitHub Actions
Auth: JWT + Passport | Events: Bull queues
Parallel: 7/10 | Conflicts: 2 zones
✓ Gate 2: Development Pipeline
Git: trunk-based · Commits: conventional (92%) · Mode: balanced
✓ Gate 3: Task Tracking → GitHub Issues + 3 scripts
✓ Gate 3.5: Script Verification → 3/3 syntax + smoke tests
◆ Gate 4: Quality Checklist
Capabilities: api · database · auth · events · docker
✓ Loaded: api.md + infra.md + shared.md → 53 patterns
◆ Gate 5: Naming Conventions
✓ Services: *Service (8/8) ✓ DTOs: Create*Dto (12/15)
⚠ Events: inconsistent → adopted *CreatedEvent
✓ Gate 6: Knowledge Export → CLAUDE.md updated, 15 rationalization detectors
◆ Gate 6.5: Skill Reconciliation
dev 35% → Replace (5/13 core features)
deploy 62% → Augment (+resume, +tasklist, +self-review)
testing 90% → Keep (12/13 core features)
◆ Generation: Dev-skills & Config
.claude/
├── CLAUDE.md ← architecture + conventions
├── settings.json ← 3 execution modes
└── skills/dev/
├── SKILL.md ← 6-phase TDD pipeline
├── step-catalog.md ← 52 scoped steps
└── quality-patterns/ ← api + infra + shared
✦ Setup complete. 9/9 gates passed. AI is now architecture-aware.
Works with: Claude Code · Cursor · Windsurf · any AI agent
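Gate 5's convention audit boils down to pattern matching over exported names. A minimal sketch of that idea, assuming a simple rule table (the rule set and `audit` helper are illustrative, not Forge's actual implementation):

```typescript
// Illustrative sketch: check class names against naming rules and
// report a pass ratio per category, like Gate 5's "8/8" output.
type Rule = { category: string; pattern: RegExp };

const rules: Rule[] = [
  { category: "Services", pattern: /Service$/ },    // *Service
  { category: "DTOs", pattern: /^Create\w+Dto$/ },  // Create*Dto
  { category: "Events", pattern: /CreatedEvent$/ }, // *CreatedEvent
];

function audit(names: Record<string, string[]>): Record<string, string> {
  const report: Record<string, string> = {};
  for (const rule of rules) {
    const candidates = names[rule.category] ?? [];
    const ok = candidates.filter((n) => rule.pattern.test(n)).length;
    report[rule.category] = `${ok}/${candidates.length}`;
  }
  return report;
}
```

A category that scores below 100% (like the 12/15 DTOs above) is where Forge proposes a convention to adopt going forward.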
Frontend Project
Same command, completely different output. On a Next.js + React 19 + Tailwind v4 project, Forge generates UI-specific quality patterns (accessibility, performance, component boundaries), configures component testing with Testing Library + Vitest, and adapts the entire pipeline to frontend conventions.
◆ Gate 1: Architecture Audit
Stack: Next.js 15.3 + React 19 + TypeScript 5.8
Rendering: App Router (hybrid SSR/SSG)
Styling: Tailwind v4 + shadcn/ui | Auth: NextAuth v5
✓ Gate 2: Development Pipeline
✓ Gate 4: Quality Checklist → 42 UI patterns (React, a11y, perf)
✓ Gate 5: Naming Conventions
✓ Gate 6: Knowledge Export → 12 rationalization detectors
✓ Gate 6.5: Skill Reconciliation
◆ Generation: Dev-skills & Config
✓ 48 steps · quality-patterns/ui.md · quality-patterns/shared.md
✓ Component testing configured (Testing Library + Vitest)
✦ Setup complete. 7/7 gates passed. AI is now UI-aware.
Ecosystem Discovery
Gate 0 scans your installed adapters and recommends additional modules based on your stack and workflow. Install them in seconds - they auto-register into the dev pipeline.
◆ Gate 0: Ecosystem Discovery
Installed:
✓ forge-core v8.0.0
✓ forge-product v4.1.1
Recommended for your stack:
◇ forge-qa Test strategy orchestrator
◇ forge-tracker ClickUp/Linear/Jira sync
◇ forge-prompts Prompt architecture tools
Installing recommended adapters...
✓ forge-qa v3.8.1 installed
✓ Test skills registered in dev-pipeline
✓ Quality gates activated
✦ Ecosystem ready. 3 adapters configured.
Tell It What You Want
Without Hub, you need to know that /forge:product should run before /forge:qa, which should run before /dev. With Hub, you describe what you want in plain English. /forge:hub checks which prerequisites already exist, works out what's missing, and builds the right execution chain automatically.
Automatic Prerequisite Resolution
Ask for a payment system. Hub checks if product stories exist for payments - they don't. Checks if a test plan exists - it doesn't. Instead of jumping straight to code with missing context, Hub builds a 3-step chain: design the feature first, generate tests from acceptance criteria, then implement with TDD. Every prerequisite resolved before a single line of code is written.
◆ Intent → product_feature · domain: payments
Prerequisite check:
✓ discovery → market context exists
✓ marketing → positioning exists
✗ product → no stories for payments
✗ qa → no test plan for payments
Execution chain:
1. product design "payments" → stories + acceptance criteria
2. qa test "payments" → test plan from AC
3. dev "payments" → implementation + TDD
✓ Proceed with 3 steps? Y
◆ Step 1/3: product design "payments"
✓ Generated 3 user stories, 7 acceptance criteria, 2 user flows
◆ Step 2/3: qa test "payments"
✓ Mapped 7 AC to 12 assertions · Generated 4 test files
◆ Step 3/3: dev "payments" → 6-phase TDD pipeline
✓ 12/12 tests passing · 7/7 AC covered · PR #94
✦ Chain complete. 3 steps · 12 tests · 7 AC · 0 skipped prerequisites.
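The prerequisite check above can be pictured as a walk over an ordered list of stages, queueing every stage whose artifact is missing. This is an illustrative guess at the resolution logic, not Hub's actual implementation; only the stage names come from the transcript:

```typescript
// Ordered stages from the transcript; earlier stages feed later ones.
const stageOrder = ["discovery", "marketing", "product", "qa", "dev"];

// Walk the stages up to the goal, queueing each one whose artifact is
// missing; the goal stage itself always runs.
function buildChain(goal: string, existing: Set<string>): string[] {
  const chain: string[] = [];
  for (const stage of stageOrder) {
    if (!existing.has(stage) || stage === goal) chain.push(stage);
    if (stage === goal) break;
  }
  return chain;
}
```

With discovery and marketing artifacts present and a `dev` goal, this yields the 3-step chain shown above: product, then qa, then dev.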
The Dev Pipeline
When Hub routes to development, /dev takes over with a 6-phase TDD pipeline. All your modules work together: forge-worktree isolates the work in its own branch, forge-product loads acceptance criteria, forge-qa generates tests, forge-tracker syncs status - full integration, zero manual coordination.
6-Phase Pipeline with Module Integration
Give it a feature description. Forge creates an isolated worktree, loads acceptance criteria from forge-product, generates failing tests via forge-qa (dispatched on Sonnet for cost efficiency), implements until green, runs self-review on Haiku, verifies AC coverage, and opens a PR - all in one command. Subagents run on cheaper models so you don't burn Opus tokens on mechanical work.
◆ Phase 0: Gate
Task: new-feature · ~8 files · Mode: balanced
✓ forge-worktree → branch feat/eng-142-webhook in isolated worktree
✓ forge-product → loaded 5 AC from features/payment.md
✓ forge-tracker → linked ENG-142 → In Progress
✓ Red flags: none
◆ Phase 1: Understand
Schema: WebhookEvent table · Layers: domain, infra, interface
✓ Contracts: IWebhookRepository, ProcessWebhookCommand, WebhookDto
◆ Phase 2: Test (RED)
✓ forge-qa → AC mapped to 8 assertions · dispatched test-generator (sonnet)
Generated: webhook.spec.ts (5 tests) + stripe-sig.spec.ts (3 tests)
✗ 8 tests failing - ready for implementation
◆ Phase 3: Implement (GREEN)
core.3.10: WebhookEvent entity + value objects
core.3.20: WebhookRepository (Prisma)
core.3.30: ProcessWebhookHandler + signature verification
core.3.40: WebhookController + route guards
✓ 8/8 tests passing
◆ Phase 4: Verify
✓ Type check · ✓ Lint · ✓ Quality patterns: 14/14
✓ Self-review (haiku) → 0 issues · 0 rationalization flags
✓ forge-product → AC coverage: 5/5
✓ forge-tracker → ENG-142 marked complete
◆ Phase 5: Close
✓ forge-worktree → PR #91 created, worktree cleaned
✦ Feature complete. 6 phases · 5/5 AC · 8 tests · 0 flags.
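Phase 4's "AC coverage: 5/5" is, at its core, a set check: every acceptance criterion must be referenced by at least one test. A sketch of that check, assuming a hypothetical shape where each generated test declares the AC ids it covers:

```typescript
// Hypothetical shape: each generated test lists the AC ids it covers.
type TestCase = { name: string; covers: string[] };

// Report covered/total plus any AC with no test behind it, as in
// Phase 4's "AC coverage: 5/5" line.
function acCoverage(acIds: string[], tests: TestCase[]) {
  const seen = new Set(tests.flatMap((t) => t.covers));
  const missing = acIds.filter((id) => !seen.has(id));
  return { covered: acIds.length - missing.length, total: acIds.length, missing };
}
```

A non-empty `missing` list is what would keep the pipeline from advancing to Phase 5.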
Rationalization Detectors
AI models are trained to be helpful - sometimes too helpful. When an AI agent rationalizes skipping best practices ("let's add tests later", "error handling can wait"), Forge's 15 rationalization detectors catch it mid-generation and block the shortcut. Patterns are scoped to your capabilities: API guards, database migrations, auth flows, event handling - not generic checklists.
⚠ Rationalization Detector Triggered
DETECTED: "Skip for now" pattern
Rule: api.quality.error-handling
What AI said:
"We can add error handling later
to keep things simple for now"
Why it matters:
→ 73% of production incidents trace to
missing error handling added "later"
✗ Blocked. Error handling is required.
✓ Generating error boundary template...
✦ Quality preserved. AI stayed on track.
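At its simplest, a rationalization detector is a set of phrase patterns scanned against generated output, each tied to the rule it protects. A minimal sketch; the patterns and rule ids here are illustrative, not Forge's real detector set:

```typescript
// Illustrative detectors: each pairs a quality rule with a "do it
// later" phrase pattern to flag in generated text.
type Detection = { rule: string; match: string };

const detectors: { rule: string; pattern: RegExp }[] = [
  { rule: "api.quality.error-handling", pattern: /add (error handling|validation) later/i },
  { rule: "testing.no-deferral", pattern: /(add|write) tests later/i },
  { rule: "generic.skip-for-now", pattern: /skip (this|it|.+) for now/i },
];

function scan(text: string): Detection[] {
  const hits: Detection[] = [];
  for (const d of detectors) {
    const m = text.match(d.pattern);
    if (m) hits.push({ rule: d.rule, match: m[0] });
  }
  return hits;
}
```

Any hit blocks the shortcut mid-generation, as in the transcript above, instead of letting it land in the diff.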
Specialized Modules
Beyond setup and the dev pipeline, Forge has specialized modules for testing, task tracking, product design, prompt engineering, and fully autonomous development.
Test Generation (forge-qa)
Tests generated from acceptance criteria, not AI guesswork. forge-qa reads your feature specs from forge-product, maps each AC to concrete assertions, and generates tests across 11 modes (unit, integration, component, acceptance, e2e). Every test traces back to a requirement through 4-level traceability: AC, use cases, UX criteria, and LLM-as-Judge evaluation.
◆ Analyzing acceptance criteria...
Feature: Payment Webhook Endpoint
Source: .claude/forge/product/features/payment.md
✓ 5 acceptance criteria found
✓ 3 edge cases derived
◆ Generated tests:
src/test/
├── payment-webhook.spec.ts 5 tests
├── stripe-signature.spec.ts 3 tests
└── payment-events.e2e.spec.ts 4 tests
Running tests...
✓ 12/12 passing (1.3s)
✦ Tests traced to requirements. Not guesswork.
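One way to make "traced to requirements" concrete is an AC-to-test index built from tags in test names. The `[AC-n]` tagging convention below is an assumption for this sketch, not Forge's documented format:

```typescript
// Build an index from AC id to the tests that cover it, by extracting
// "[AC-n]" tags from test names (tagging convention assumed).
function traceability(testNames: string[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const name of testNames) {
    for (const m of Array.from(name.matchAll(/\[(AC-\d+)\]/g))) {
      const list = index.get(m[1]) ?? [];
      list.push(name);
      index.set(m[1], list);
    }
  }
  return index;
}
```

An AC id absent from the index is a requirement with no test behind it, which is exactly what 4-level traceability is meant to surface.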
Task Sync (forge-tracker)
Bidirectional sync between your task tracker and the AI agent. Pull tasks from Linear, ClickUp, Jira, GitHub, or Notion - acceptance criteria flow directly into the dev pipeline. When implementation is done, forge-tracker marks the task complete and posts a summary. No copy-pasting ticket IDs, no manual status updates.
◆ Syncing with Linear...
Sprint: March W2 (4 items remaining)
ENG-142 Payment webhook Urgent
ENG-138 Email notification fix High
ENG-145 Dashboard filters Medium
ENG-147 API rate limiting Medium
Picked: ENG-142 - Payment webhook
✓ Status → In Progress
✓ Branch: feat/eng-142-payment-webhook
✓ Acceptance criteria loaded (5 items)
✦ Task synced. AI has full context.
Agent Teams (forge-autopilot)
The endgame. Forge triages your backlog, detects file conflicts between features, groups them into parallel waves, and spawns AI agent teams in isolated git worktrees. A tech-lead agent coordinates while teammates execute the full /dev pipeline independently. 3 autonomy modes: supervised (you approve each PR), semi-auto (auto-merge green PRs), full-auto (hands-off sprint execution).
◆ Phase 1: Triage
Analyzing backlog: 6 features classified
✓ 4 parallelizable, 2 sequential
◆ Phase 2: Wave Planning
Wave 1 (parallel):
● Team A → ENG-142 Payment webhook
● Team B → ENG-145 Dashboard filters
● Team C → ENG-147 Rate limiting
◆ Phase 3: Executing Wave 1...
Team A: ████████████░░ 85% (GREEN)
Team B: ██████████████ 100% ✓ PR #89
Team C: ██████░░░░░░░░ 45% (RED)
✦ 2/3 features complete. 12 min elapsed.
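The wave planning in Phase 2 can be sketched as greedy packing: two features can share a wave only if their file sets don't overlap. A minimal version under that assumption (the `Feature` shape and greedy strategy are illustrative, not autopilot's actual planner):

```typescript
// Greedy wave packing: a feature joins the first wave where it has no
// file conflict with any feature already in that wave.
type Feature = { id: string; files: string[] };

function planWaves(features: Feature[]): Feature[][] {
  const waves: Feature[][] = [];
  for (const f of features) {
    const wave = waves.find((w) =>
      w.every((g) => !g.files.some((file) => f.files.includes(file)))
    );
    if (wave) wave.push(f);
    else waves.push([f]);
  }
  return waves;
}
```

Each resulting wave maps to a set of agent teams running in parallel worktrees; later waves wait for the conflicting files to merge first.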
And More
forge-product
Feature design with user stories, acceptance criteria, UX flows
forge-prompts
Prompt engineering with CO-STAR, RISEN, TIDD-EC frameworks
forge-worktree
Git worktree lifecycle - parallel branches, zero port conflicts
9 Advisory Modules
Discovery, marketing, analytics, SEO, copy, onboarding, growth, A/B tests
Start Building Better
Setup runs in under 10 seconds. The artifacts live in .claude/ - they work with Claude Code, Cursor, Windsurf, and any AI agent. Remove Forge after setup if you want - the artifacts are standalone.
EUR 29
Starter - core + worktree
EUR 79
Pro - + product, QA, tracker, prompts
EUR 149
Complete - all 15+ modules