Part of Forge DevKit ecosystem

forge-prompts

Manage prompts like code

The problem

Prompts drift across sessions

Same question, different answers. No consistent framework. Each session reinvents the prompt wheel.

No way to test prompt quality

You change a system prompt and hope it still works. No regression tests, no quality metrics.

Prompt knowledge stays in one person's head

The developer who wrote the prompt leaves. Nobody knows why it's structured that way.

How it works

1. Install

One command adds forge-prompts to your environment.

forge install forge-prompts

2. Configure

3-gate wizard detects your LLM stack, establishes prompt principles, and selects frameworks (CO-STAR, RISEN, TIDD-EC).

3. Manage

Inventory all prompts, audit against principles, review for quality, test for regressions.

Mode: inventory / audit / review / test / evolve
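The inventory and audit modes above can be pictured as a pass over the project's prompt files: find every prompt, detect which framework its sections follow, and score coverage. The sketch below is a minimal illustration of that idea; the directory layout, heading heuristic, and function names are assumptions, not forge-prompts' actual implementation.

```python
# Hypothetical sketch of an inventory/audit pass: scan prompt files,
# guess the framework from section headings, score section coverage.
from pathlib import Path

# Required section headings per framework (illustrative subsets).
FRAMEWORKS = {
    "CO-STAR": {"context", "objective", "style", "tone", "audience", "response"},
    "RISEN": {"role", "instructions", "steps", "end goal", "narrowing"},
    "TIDD-EC": {"task", "instructions", "dos", "donts", "examples", "context"},
}

def detect_framework(text: str) -> tuple[str, float]:
    """Return the best-matching framework and its section coverage (0..1)."""
    headings = {
        line.lstrip("#").strip().lower().replace("'", "")
        for line in text.splitlines()
        if line.startswith("#")
    }
    best, best_cov = "none", 0.0
    for name, required in FRAMEWORKS.items():
        cov = len(headings & required) / len(required)
        if cov > best_cov:
            best, best_cov = name, cov
    # Below half coverage, treat the prompt as unstructured.
    return (best, best_cov) if best_cov >= 0.5 else ("none", best_cov)

def audit(prompt_dir: str) -> list[dict]:
    """Inventory every prompt file and score it by framework coverage."""
    rows = []
    for path in sorted(Path(prompt_dir).glob("*.md")):
        framework, coverage = detect_framework(path.read_text())
        rows.append({"file": path.name,
                     "framework": framework,
                     "score": round(coverage * 10)})
    return rows
```

A coverage-style score like this is one plausible way to produce the per-file 0-10 ratings shown in the sample audit output further down.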

4. Evolve

Learning loop captures findings from audits and tests. Principles improve automatically over time.

Key capabilities

5 operational modes

Inventory, audit, review, test, evolve. Full lifecycle management for every prompt in your project.

3 prompt frameworks

CO-STAR (context-structured), RISEN (role-based), TIDD-EC (task-decomposed) - or define your own. Each enforces a different prompt architecture.
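To see how a framework "enforces a prompt architecture," here is a sketch of CO-STAR as a typed template: six mandatory sections (Context, Objective, Style, Tone, Audience, Response format), rendered in a fixed order. The field names follow the public CO-STAR acronym; the class and rendering are an illustration, not forge-prompts' own API.

```python
# A CO-STAR prompt as a dataclass: every section is a required field,
# so an incomplete prompt fails at construction time.
from dataclasses import dataclass, fields

@dataclass
class CoStarPrompt:
    context: str
    objective: str
    style: str
    tone: str
    audience: str
    response: str

    def render(self) -> str:
        """Emit the prompt with one labeled section per CO-STAR field."""
        return "\n\n".join(
            f"# {f.name.capitalize()}\n{getattr(self, f.name)}"
            for f in fields(self)
        )

prompt = CoStarPrompt(
    context="You receive a raw customer support ticket.",
    objective="Classify the ticket as billing, bug, or feature-request.",
    style="Terse and factual.",
    tone="Neutral.",
    audience="A downstream routing service.",
    response="Reply with exactly one category label.",
)
```

RISEN and TIDD-EC would follow the same pattern with their own required fields, which is what makes each framework a different, checkable architecture rather than a style suggestion.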

Regression testing

LLM-as-judge tests ensure prompt changes don't break existing behavior. Integrated with forge-qa.
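The LLM-as-judge idea can be sketched as a small harness: each test case pins an input and a rubric, and a judge callable scores the prompt's output against that rubric. In production the judge would be a second model call; the stub below, and all names in it, are illustrative assumptions rather than forge-prompts' real interface.

```python
# Minimal LLM-as-judge regression harness: a case fails when the
# judge's score for the generated output drops below the threshold.
from typing import Callable

JudgeFn = Callable[[str, str], float]  # (output, rubric) -> score in [0, 1]

def run_regression(cases: list[dict], generate: Callable[[str], str],
                   judge: JudgeFn, threshold: float = 0.8) -> list[dict]:
    """Score each case's generated output; a case fails below threshold."""
    results = []
    for case in cases:
        output = generate(case["input"])
        score = judge(output, case["rubric"])
        results.append({"name": case["name"], "score": score,
                        "passed": score >= threshold})
    return results

# Stub judge for demonstration: real usage would prompt a second
# model with the rubric and parse its verdict into a score.
def keyword_judge(output: str, rubric: str) -> float:
    wanted = rubric.lower().split()
    hits = sum(1 for w in wanted if w in output.lower())
    return hits / len(wanted)
```

Running the same cases before and after a prompt change turns "hope it still works" into a pass/fail diff.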

5 cognitive biases

Anchoring to first drafts, confirmation bias in test evaluation, sunk cost attachment to failing prompts, authority bias toward vendor examples, and framing effects in A/B prompt comparisons.

Learning loop

Audit findings become new principles automatically. After 3 cycles, your prompt guidelines reflect real project patterns, not generic best practices.
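One simple mechanism that matches the description above: keep a history of findings per audit cycle and promote any finding that recurs often enough into a principle. The three-cycle threshold mirrors the claim in the text; the function name and data shapes are assumptions for illustration.

```python
# Hedged sketch of the learning loop: findings that recur across
# audit cycles are promoted into the project's principle set.
from collections import Counter

def evolve_principles(principles: set[str],
                      finding_history: list[list[str]],
                      min_cycles: int = 3) -> set[str]:
    """Promote any finding seen in at least `min_cycles` audit cycles."""
    counts = Counter(f for cycle in finding_history for f in set(cycle))
    promoted = {f for f, n in counts.items() if n >= min_cycles}
    return principles | promoted
```

Because promotion is driven by this project's own audit history, the resulting guidelines reflect local patterns rather than generic best practices.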

Sample output

A real-world example of what this module produces.

forge:prompts audit
 Prompt Audit - acme-web

File                              Framework  Score  Issues
prompts/generate-summary.md       CO-STAR    9/10   -
prompts/classify-ticket.md        RISEN      6/10   missing negative examples
prompts/draft-email.md            none       3/10   no role, no output format

Total: 3 prompts | 1 passing | 1 warning | 1 failing

Who is this for

AI Engineer

Manage and version-control prompts with frameworks, audit trails, and regression tests.

Developer Using LLM APIs

Stop ad-hoc prompt writing - get structured frameworks and automated quality checks.

Team Lead

Standardize prompt engineering across the team with shared principles and learning loops.

forge-prompts vs Manual prompt engineering

Dimension           | Manual prompt engineering            | Forge DevKit
Prompt management   | Scattered across files, no inventory | Full catalog with principles and frameworks
Quality assurance   | Manual spot-checking                 | Automated audit + LLM-as-judge regression tests
Knowledge retention | In developer's head                  | Documented principles with learning loop evolution
Consistency         | Each prompt written ad-hoc           | Framework-guided with team-wide principles
Get Forge →