Initial project scaffold (Cleo)
Commit b9a4e75da9
@ -0,0 +1,7 @@
__pycache__/
.env
*.pyc
*.pyo
.vscode/
.DS_Store
*.swp
@ -0,0 +1,39 @@
|
||||||
|
# Initial Product Spec (Draft)
|
||||||
|
|
||||||
|
## Project: Adobe Sign to DocuSign Template Migrator
|
||||||
|
|
||||||
|
### Purpose
|
||||||
|
Develop an agent/toolkit that programmatically extracts template data and field logic from Adobe Sign “library documents”, maps and transforms them into DocuSign’s template model, and creates the corresponding DocuSign templates, reducing manual migration effort.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### High-Level Goals
|
||||||
|
- Download all template structure (PDFs, fields/tabs, roles, routing, logic) from Adobe Sign
|
||||||
|
- Generate best-approximation DocuSign templates programmatically
|
||||||
|
- Handle all basic field types and recipient roles
|
||||||
|
- Detect and warn on features needing special/manual handling (complex logic, custom validations, non-mappable features)
|
||||||
|
|
||||||
|
### Key Features (MVP)
|
||||||
|
- Connect to Adobe Sign and DocuSign APIs via credentials loaded from .env
|
||||||
|
- Extract template listing from Adobe Sign sandbox/account
|
||||||
|
- Pull data from all required endpoints: metadata, formFields, recipients, workflows
|
||||||
|
- Assemble complete data model for each imported template
|
||||||
|
- Mapping layer: field type/role/routing normalization (see field-mapping.md)
|
||||||
|
- Programmatically create equivalent template and roles in DocuSign
|
||||||
|
- Logging and reporting of success, errors, edge cases
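A minimal sketch of the first MVP item, loading credentials from `.env`, using only the standard library. The variable names in `REQUIRED_KEYS` are hypothetical placeholders (the spec does not name them), and the parser handles only simple `KEY=VALUE` lines.

```python
import os
from pathlib import Path

# Hypothetical variable names: the spec only says credentials come from .env.
REQUIRED_KEYS = (
    "ADOBE_SIGN_ACCESS_TOKEN",
    "DOCUSIGN_ACCESS_TOKEN",
    "DOCUSIGN_ACCOUNT_ID",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, # comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_credentials(path: str = ".env") -> dict[str, str]:
    """Read .env, let real environment variables override it, fail on gaps."""
    env_file = Path(path)
    values = parse_env(env_file.read_text()) if env_file.exists() else {}
    values.update({key: os.environ[key] for key in REQUIRED_KEYS if key in os.environ})
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    if missing:
        raise RuntimeError(f"Missing credentials: {', '.join(missing)}")
    return {key: values[key] for key in REQUIRED_KEYS}
```

Failing at startup when a key is missing keeps credential problems out of the extraction and creation steps, where they would be harder to diagnose.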
|
||||||
|
|
||||||
|
### Stretch (Future)
|
||||||
|
- UI for side-by-side compare/QA
|
||||||
|
- Complex feature transform plugins
|
||||||
|
- Bulk mode & idempotent re-runs
|
||||||
|
- Support for in-place PDF field overlay (anchors/rects)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
#### Out of Scope (MVP)
|
||||||
|
- Agreement instance migration (focus on templates only)
|
||||||
|
- Custom integrations outside API surface
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Last updated: 2026-04-14 (scaffolded by Cleo)*
|
||||||
|
|
@ -0,0 +1,25 @@
|
||||||
|
# Initial Project Scaffold (by Cleo)
|
||||||
|
|
||||||
|
## Structure
|
||||||
|
- README.md (project purpose, background, key links/notes)
|
||||||
|
- api-samples.md (JSON examples of Adobe Sign API responses)
|
||||||
|
- docs/agent-harness/ (fill in agent template/execution structure)
|
||||||
|
|
||||||
|
## Next Steps
|
||||||
|
1. Copy/initialize Agent Harness templates:
|
||||||
|
   - /docs/agent-harness/README.md
   - /docs/agent-harness/SPEC-CREATION-GUIDE.md
   - ... (see BOOTSTRAP.md for all recommended files)
|
||||||
|
2. Collect and version sample API payloads for real templates
|
||||||
|
3. Begin mapping table for Adobe→DocuSign fields/roles/workflows
|
||||||
|
4. Draft initial Product Spec in docs/
|
||||||
|
|
||||||
|
## Suggestions for further setup
|
||||||
|
- Add a .gitignore (Node/Python/LLM junk, e.g. .DS_Store, .venv)
|
||||||
|
- Initialize a Git repo
|
||||||
|
- Optionally set up a `requirements.txt` or `package.json` if scripting
|
||||||
|
- (Later) Add test/docs folders, migration state tracking
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*This file generated 2026-04-14 to bootstrap the project scaffold. Update as needed for specifics!*
|
||||||
|
|
@ -0,0 +1,30 @@
|
||||||
|
# adobe-to-docusign-migrator
|
||||||
|
|
||||||
|
## Project Purpose
|
||||||
|
A migration toolkit/agent that automates extraction of Adobe Sign library templates (PDFs, fields, roles, workflow), transforms them into the DocuSign template model, and creates or imports them into DocuSign.
|
||||||
|
|
||||||
|
## Background
|
||||||
|
- Adobe Sign and DocuSign both expose APIs to manage templates, fields, recipients, logic, and documents.
|
||||||
|
- Adobe Sign (Acrobat Sign) uses "library documents" as templates, with data accessible via JSON API calls (but not exportable in a single file). You assemble the template info by calling:
|
||||||
|
  - `/libraryDocuments/{libraryDocumentId}` (metadata, PDFs, roles)
  - `/libraryDocuments/{libraryDocumentId}/formFields` (fields/tags)
  - `/libraryDocuments/{libraryDocumentId}/recipients` (recipients)
  - `/libraryDocuments/{libraryDocumentId}/workflows` (if applicable)
  - `/libraryDocuments/{libraryDocumentId}/auditTrail` (audit log, rarely needed for migration)
|
||||||
|
- DocuSign templates are more easily exported as a single payload, but you can also build them incrementally over the API.
|
||||||
|
- The most complex part of migration is mapping logic/fields/roles that are not 1:1 matches (conditional fields, complex routing).
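The multi-call assembly described above could be sketched roughly as follows. The sub-paths are copied from the list in this README and should be verified against the Adobe Sign API reference. `fetch_json` is a hypothetical callable that you supply, for example a thin wrapper around an authenticated HTTP GET, so the sketch stays independent of any HTTP library.

```python
from typing import Any, Callable

# Assumption: sub-paths copied from the list above; verify each one against
# the Adobe Sign API reference before relying on it.
SUB_RESOURCES = {
    "metadata": "",
    "formFields": "/formFields",
    "recipients": "/recipients",
    "workflows": "/workflows",
}


def assemble_template(
    library_document_id: str,
    fetch_json: Callable[[str], Any],
    optional: tuple[str, ...] = ("workflows",),
) -> dict[str, Any]:
    """Call each sub-resource and merge the responses into one dict."""
    base = f"/libraryDocuments/{library_document_id}"
    template: dict[str, Any] = {"libraryDocumentId": library_document_id, "errors": {}}
    for key, suffix in SUB_RESOURCES.items():
        try:
            template[key] = fetch_json(base + suffix)
        except Exception as exc:
            if key not in optional:
                raise  # required data missing: abort this template
            template[key] = None
            template["errors"][key] = str(exc)
    return template
```

Treating `workflows` as optional mirrors the "(if applicable)" note: a failure there is recorded rather than aborting the whole template.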
|
||||||
|
|
||||||
|
## API Output Samples
|
||||||
|
See `api-samples.md` for category-by-category JSON breakdowns.
|
||||||
|
|
||||||
|
## Next Steps
|
||||||
|
- Add full agent harness scaffold (`docs/agent-harness/` structure, project README, spec templates)
|
||||||
|
- Collect sample real-world Adobe Sign template JSONs
|
||||||
|
- Define mapping/transforms for fields, roles, logic
|
||||||
|
- Write initial extraction + mapping scripts
|
||||||
|
- Draft Product Spec
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Created: 2026-04-14*
|
||||||
|
*scaffolded by Cleo*
|
||||||
|
|
@ -0,0 +1,114 @@
|
||||||
|
# API Data Samples (Adobe Sign)
|
||||||
|
|
||||||
|
## 1. Library Document Metadata (`/libraryDocuments/{libraryDocumentId}`)
|
||||||
|
```json
{
  "libraryDocumentId": "abc123",
  "id": "abc123",
  "name": "Sample NDA Template",
  "status": "ACTIVE",
  "ownerEmail": "someone@example.com",
  "createdDate": "2026-04-14T15:40:00Z",
  "sharingMode": "ACCOUNT",
  "locale": "en_US",
  "participants": [
    {
      "recipientSetRole": "SIGNER",
      "recipientSetMemberInfos": [
        { "name": "Signer 1", "email": "signer1@example.com" }
      ]
    }
  ],
  "fileInfos": [
    {
      "label": "Main PDF",
      "transientDocumentId": "XYZ987",
      "url": "https://secure.na1.adobesign.com/public/document?..."
    }
  ]
}
```
|
||||||
|
|
||||||
|
## 2. Form Fields (`/libraryDocuments/{libraryDocumentId}/formFields`)
|
||||||
|
```json
[
  {
    "fieldName": "FullName",
    "type": "TEXT_FIELD",
    "required": true,
    "readOnly": false,
    "defaultValue": "",
    "locations": [
      { "pageNumber": 1, "rect": { "left": 120, "top": 250, "width": 200, "height": 15 } }
    ],
    "recipientIndex": 0
  },
  {
    "fieldName": "SignatureField1",
    "type": "SIGNATURE",
    "required": true,
    "locations": [
      { "pageNumber": 2, "rect": { "left": 320, "top": 450, "width": 120, "height": 35 } }
    ],
    "recipientIndex": 1
  }
]
```
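As a rough illustration of where the mapping layer could start, the sketch below turns a form-field payload shaped like the sample above into DocuSign-style tab dictionaries grouped by `recipientIndex`. The type mapping in `TAB_GROUPS` is an assumption covering only the two types shown. The tab property names (`tabLabel`, `pageNumber`, `xPosition`, `yPosition`) follow DocuSign's tab conventions as I understand them and should be checked against the DocuSign reference. Coordinates are passed through unconverted.

```python
from typing import Any

# Assumed mapping from the Adobe Sign `type` values in the sample above to
# DocuSign tab collection names. Extend it as real templates surface more types.
TAB_GROUPS = {"TEXT_FIELD": "textTabs", "SIGNATURE": "signHereTabs"}


def map_form_fields(form_fields: list[dict[str, Any]]):
    """Return (tabs grouped by recipientIndex, warnings for unmapped fields)."""
    tabs_by_recipient: dict[int, dict[str, list[dict[str, str]]]] = {}
    warnings: list[str] = []
    for field in form_fields:
        group = TAB_GROUPS.get(field.get("type", ""))
        if group is None:
            warnings.append(
                f"{field.get('fieldName')}: no mapping for type {field.get('type')!r}"
            )
            continue
        recipient_tabs = tabs_by_recipient.setdefault(field.get("recipientIndex", 0), {})
        for location in field.get("locations", []):
            rect = location["rect"]
            tab = {
                "tabLabel": field["fieldName"],
                "pageNumber": str(location["pageNumber"]),
                # Passed through unchanged: confirm whether both systems
                # measure from the same page corner before trusting these.
                "xPosition": str(rect["left"]),
                "yPosition": str(rect["top"]),
            }
            if group == "textTabs":
                tab["width"] = str(rect["width"])
                tab["height"] = str(rect["height"])
                tab["required"] = "true" if field.get("required") else "false"
            recipient_tabs.setdefault(group, []).append(tab)
    return tabs_by_recipient, warnings
```

Returning warnings alongside the mapped tabs lets unmapped field types be reported for manual handling instead of being dropped silently.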
|
||||||
|
|
||||||
|
## 3. Recipients (`/libraryDocuments/{libraryDocumentId}/recipients`)
|
||||||
|
```json
[
  {
    "recipientSetId": "1",
    "recipientSetRole": "SIGNER",
    "recipientSetMemberInfos": [
      { "email": "signer1@example.com", "name": "Signer One", "role": "SIGNER" }
    ],
    "order": 1
  },
  {
    "recipientSetId": "2",
    "recipientSetRole": "APPROVER",
    "recipientSetMemberInfos": [
      { "email": "approver@example.com", "name": "Approver Person", "role": "APPROVER" }
    ],
    "order": 2
  }
]
```
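One possible way to normalize a recipients payload like the sample above into DocuSign-style template role placeholders is sketched below. Placing both roles in the `signers` collection is an assumption, not a confirmed mapping, so `APPROVER` entries are flagged for manual review. The `recipientId`, `roleName`, and `routingOrder` property names follow DocuSign recipient conventions as I understand them. The generated role names are hypothetical.

```python
from typing import Any

# Assumption: both Adobe Sign roles land in DocuSign's "signers" collection.
# APPROVER has no obvious one-to-one target, so it is flagged for review.
RECIPIENT_GROUPS = {"SIGNER": "signers", "APPROVER": "signers"}
NEEDS_REVIEW = {"APPROVER"}


def map_recipients(recipient_sets: list[dict[str, Any]]):
    """Return (role placeholders grouped by collection, warnings)."""
    recipients: dict[str, list[dict[str, str]]] = {}
    warnings: list[str] = []
    for recipient_set in sorted(recipient_sets, key=lambda item: item["order"]):
        set_id = recipient_set["recipientSetId"]
        role = recipient_set["recipientSetRole"]
        group = RECIPIENT_GROUPS.get(role)
        if group is None:
            warnings.append(f"recipient set {set_id}: unmapped role {role!r}")
            continue
        if role in NEEDS_REVIEW:
            warnings.append(
                f"recipient set {set_id}: {role} mapped to {group}, review manually"
            )
        recipients.setdefault(group, []).append({
            "recipientId": str(set_id),
            # Hypothetical naming scheme for the template role placeholder.
            "roleName": f"{role.title()} {recipient_set['order']}",
            "routingOrder": str(recipient_set["order"]),
        })
    return recipients, warnings
```

Names and emails are deliberately left out: a template role is a placeholder that is filled in when an envelope is sent.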
|
||||||
|
|
||||||
|
## 4. Workflow Info (`/libraryDocuments/{libraryDocumentId}/workflows`)
|
||||||
|
```json
[
  {
    "workflowId": "w123",
    "workflowName": "NDA Approval Chain",
    "steps": [
      { "stepType": "SIGNER", "participantSetRole": "SIGNER", "order": 1 },
      { "stepType": "APPROVER", "participantSetRole": "APPROVER", "order": 2 }
    ]
  }
]
```
|
||||||
|
|
||||||
|
## 5. Audit (`/libraryDocuments/{libraryDocumentId}/auditTrail`)
|
||||||
|
```json
[
  {
    "eventType": "CREATED",
    "actingUserEmail": "someone@example.com",
    "date": "2026-04-14T15:40:00Z",
    "description": "Template was created."
  },
  {
    "eventType": "SHARED",
    "actingUserEmail": "admin@example.com",
    "date": "2026-04-15T12:14:00Z",
    "description": "Template shared with account."
  }
]
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Expand/add new samples as we work through real templates.
|
||||||
|
|
@ -0,0 +1,3 @@
|
||||||
|
# Agent Harness Placeholder
|
||||||
|
|
||||||
|
Copy other relevant files from docs/agent-harness/ as needed. This is the main harness doc for the migration agent.
|
||||||
|
|
@ -0,0 +1,3 @@
|
||||||
|
# Execution Board Template
|
||||||
|
|
||||||
|
See main docs/agent-harness/EXECUTION-BOARD-TEMPLATE.md for the structure to use here (copy in as needed during implementation).
|
||||||
|
|
@ -0,0 +1,255 @@
|
||||||
|
# Agent Harness Templates
|
||||||
|
|
||||||
|
A complete system for running autonomous AI coding agents on complex projects.
|
||||||
|
|
||||||
|
## Files
|
||||||
|
|
||||||
|
### Core Templates (copy into your project)
|
||||||
|
| File | Purpose |
|------|---------|
| `AGENT.md` | The agent's "system prompt", which the agent reads every iteration. Defines the core loop, mandatory pre-commit checklist (tests + TypeScript), commit attribution format, Tests-Added rule, and known anti-patterns |
| `PROJECT-SPEC.md` | Template for defining your problem. Sections for: overview, tech stack, requirements with acceptance criteria, data model, API design, constraints, phasing, anti-patterns |
| `DECISIONS.md` | Architecture Decision Record (ADR) template for documenting non-obvious technical choices. Prevents agent drift by creating continuity across fresh contexts |
| `EXECUTION-BOARD-TEMPLATE.md` | **⭐ New.** Pre-implementation planning artifact for a stream. Defines ALL packets, known-answer tests, and acceptance criteria BEFORE any code is written. The core of the plan-then-implement discipline. |
| `VALIDATION-TEMPLATE.md` | **⭐ New.** Per-packet evidence file written after each packet completes. Records test counts, known-answer results, and acceptance criteria tick-off. |
| `PROCESS-EVAL-TEMPLATE.md` | **⭐ New.** Stream retrospective written after merge. Honest assessment of task sizing, test-first compliance, and model quality. |
| `TASK-SPEC-TEMPLATE.md` | Reusable pre-delegation contract for non-trivial tasks. Defines objective, acceptance criteria, constraints, boundaries, verification, and proof artifact before work starts. |
| `ralph-loop.sh` | The Ralph Wiggum bash loop: spawns fresh agent instances, checks for completion signals, restarts until done. Supports Claude, Codex, Aider, Gemini, and custom agents |
| `model-report.ts` | Parses git log `Agent:` trailers to generate a per-model quality table (commits, tests added, TypeScript errors). Copy to `scripts/model-report.ts`, add `"model-report": "ts-node scripts/model-report.ts"` to package.json |
| `scaffold-project.sh` | Helper script to scaffold a new simple or large project with core harness files, starter docs, and optional `.harness/` structure. |
| `PROJECT-KICKOFF.md` | Project-local kickoff checklist template to confirm spec, tooling, evals, and runtime choices are ready before implementation begins. |
| `GAP-AUDIT-2026-04-04.md` | Point-in-time audit of the harness. Documents current strengths, gaps, priorities, and the consolidation work package. |
|
||||||
|
|
||||||
|
### Process Guides (read before you start)
|
||||||
|
| File | Purpose |
|------|---------|
| `SPEC-CREATION-GUIDE.md` | **Start here.** How to create a great spec through structured interview. The interview protocol, domain knowledge extraction, and spec quality checklist |
| `TUTORIAL.md` | **Best way to learn.** Complete 30-minute walkthrough building a markdown link checker CLI tool from zero. Concrete, copy-pasteable example of the entire workflow |
| `GETTING-STARTED.md` | Practical startup/scaffolding guide for real projects: create project root, copy templates, choose harness mode, scaffold `.harness/`, and start the first loop cleanly. |
| `CURRENT-STATE.md` | One-page executive summary of the harness: what is mature, what improved recently, and what should be improved next. |
| `WAVE-BASED-MANAGEMENT.md` | **⭐ New.** How to structure larger projects into waves, streams, and packets. The plan-then-implement discipline, execution boards, known-answer tests, and wave gates. Essential for projects with 10+ tasks. |
| `PLAN-MANAGEMENT.md` | How the IMPLEMENTATION_PLAN.md works: the living document agents update. Task decomposition patterns, intervention strategies, progress tracking |
| `REVIEW-AND-QA.md` | How to evaluate agent output. When to review, what to look for, how to course-correct. Review checklist template including model attribution and TypeScript hygiene checks |
| `EVAL-INFRASTRUCTURE.md` | Consolidated guide to the harness eval stack: implementation correctness, domain correctness, regression protection, and process quality. |
| `POST-RUN-VALIDATION.md` | How the harness decides work is really done after execution. Especially important for script-orchestrated runtimes that must not trust agent self-reporting blindly. |
| `SUPERVISION.md` | Optional operations layer for unattended Ralph runs. Covers supervisor/watchdog patterns, state files, and audit trails for long-running script-orchestrated sessions. |
| `WORKFLOW-SEAMS.md` | Map of the handoffs between spec, plan, execution boards, validation evidence, review, process evals, and runtime orchestration. |
| `WORKFLOW-DIAGRAM.md` | Visual map of the harness showing project phases, where each document matters most, and how the artifacts connect across the lifecycle. |
| `COST-OPTIMIZATION.md` | Getting more work per dollar. Request-based vs token-based billing, optimal strategies per provider, model selection guide, the hybrid strategy, anti-patterns |
| `OPENCLAW-INTEGRATION.md` | Running the harness in OpenClaw with sessions_spawn, cron jobs, and shell scripts. Model selection, monitoring, cost optimization |
| `TROUBLESHOOTING.md` | When things go wrong. The five failure modes (stuck loop, drift, overengineering, test theater, context overflow) and how to fix each |
| `PARALLEL-AGENTS.md` | Running multiple agents simultaneously on independent tasks. When to parallelize, how to split work, how to merge results, conflict resolution, OpenClaw patterns |
|
||||||
|
|
||||||
|
### Examples & Reference
|
||||||
|
| File | Purpose |
|------|---------|
| `EXAMPLES.md` | Worked example: Fintrove-style finance app spec + comparison of three approaches (Ezward, Ralph Wiggum, Nate Jones) |
| `CHANGELOG.md` | Version history and evolution of the agent harness project itself |
|
||||||
|
|
||||||
|
## Quick Start
|
||||||
|
|
||||||
|
## Runtime Models
|
||||||
|
|
||||||
|
There are two different harness runtime models in this system, and it helps to keep them separate:
|
||||||
|
|
||||||
|
### 1. Agent-Orchestrated Runtime
|
||||||
|
|
||||||
|
This is the OpenClaw/manual-orchestration model.
|
||||||
|
|
||||||
|
- a supervising agent decides what to run next
|
||||||
|
- that agent can inspect execution boards, validation evidence, git history, and prior results
|
||||||
|
- that agent can spawn sub-agents, review outcomes, and adapt the workflow dynamically
|
||||||
|
|
||||||
|
Use this when:
|
||||||
|
- you want a smart orchestrator in the loop
|
||||||
|
- you want sub-agent fan-out
|
||||||
|
- you want richer judgment between iterations
|
||||||
|
|
||||||
|
Primary guide:
|
||||||
|
- `OPENCLAW-INTEGRATION.md`
|
||||||
|
|
||||||
|
### 2. Script-Orchestrated Runtime
|
||||||
|
|
||||||
|
This is the `ralph-loop.sh` model.
|
||||||
|
|
||||||
|
- the shell script is the orchestrator
|
||||||
|
- the script must interpret completion/stuck/error signals itself
|
||||||
|
- any judgment the supervising agent would normally provide must be encoded into runtime checks
|
||||||
|
|
||||||
|
Use this when:
|
||||||
|
- you want a portable terminal-native loop
|
||||||
|
- you want tmux/background shell operation
|
||||||
|
- you want minimal dependencies beyond the CLI agent itself
|
||||||
|
|
||||||
|
Important implication:
|
||||||
|
- if the script is the orchestrator, reliability has to come from explicit checks, not from assuming the agent will always judge correctly
|
||||||
|
- for long unattended runs, add a separate supervisor/watchdog layer rather than assuming tmux alone is sufficient
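To make "explicit checks" concrete, here is a language-neutral sketch, written in Python, of the kind of decisions a script orchestrator has to encode. It is not the logic of `ralph-loop.sh`. `DONE_MARKER`, `IterationResult`, and the injected `run_iteration` and `tests_pass` callables are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

DONE_MARKER = "ALL_TASKS_COMPLETE"  # hypothetical completion signal


@dataclass
class IterationResult:
    exit_code: int
    output: str
    head_commit: str  # commit hash observed after the iteration


def run_loop(
    run_iteration: Callable[[], IterationResult],
    tests_pass: Callable[[], bool],
    max_iterations: int = 20,
    max_stalls: int = 3,
) -> str:
    """Run fresh iterations until done, stuck, failed, or out of budget."""
    last_commit: Optional[str] = None
    stalls = 0
    for _ in range(max_iterations):
        result = run_iteration()
        if result.exit_code != 0:
            return "error"
        if DONE_MARKER in result.output:
            # Do not trust self-reporting: confirm with an independent check.
            return "done" if tests_pass() else "false-completion"
        # No new commit since the previous iteration counts as a stall.
        stalls = stalls + 1 if result.head_commit == last_commit else 0
        last_commit = result.head_commit
        if stalls >= max_stalls:
            return "stuck"
    return "iteration-limit"
```

The distinct return values keep a claimed completion that fails verification separate from a genuine one, which is the judgment a supervising agent would otherwise supply.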
|
||||||
|
|
||||||
|
### New to the Harness? (Start Here)
|
||||||
|
1. **Read** `CURRENT-STATE.md` — understand what the harness is good at right now
|
||||||
|
2. **Read** `WORKFLOW-DIAGRAM.md` — get the phase map before diving into details
|
||||||
|
3. **Read** `GETTING-STARTED.md` — scaffold a real project cleanly
|
||||||
|
4. **Use** `scaffold-project.sh` or `new-harness-project` if you want the fastest reliable setup
|
||||||
|
5. **Read** `TUTORIAL.md` — 30-minute hands-on walkthrough building a real CLI tool
|
||||||
|
6. **Read** `SPEC-CREATION-GUIDE.md` — learn the interview protocol
|
||||||
|
7. **Read** `TASK-SPEC-TEMPLATE.md` — learn the packet-sized contract for non-trivial delegation
|
||||||
|
8. **Use** `PROJECT-KICKOFF.md` in your new project as the readiness checklist
|
||||||
|
9. **Try it** — build your own project using the workflow
|
||||||
|
|
||||||
|
### Ready to Build? (Simple project, <10 tasks)
|
||||||
|
1. **Read** `COST-OPTIMIZATION.md` — understand your billing model before you start burning budget
|
||||||
|
2. **Interview** — work with your agent to create the spec (or do it solo)
|
||||||
|
3. **Fill out** `PROJECT-SPEC.md` with your problem definition
|
||||||
|
4. **Read** `EVAL-INFRASTRUCTURE.md` if the project has calculations, regulated logic, or other high-cost-to-be-wrong behavior
|
||||||
|
5. **Read** `POST-RUN-VALIDATION.md` if the runtime will need to validate task completion mechanically
|
||||||
|
6. **Read** `SUPERVISION.md` if the script-orchestrated runtime will run unattended for hours
|
||||||
|
7. **Copy** `PROJECT-SPEC.md`, `AGENT.md`, and `DECISIONS.md` into your project root
|
||||||
|
8. **Choose a runtime**:
|
||||||
|
   - `./ralph-loop.sh` for the script-orchestrated model
   - OpenClaw sessions/sub-agents for the agent-orchestrated model
|
||||||
|
9. **Review** at phase boundaries using `REVIEW-AND-QA.md` checklist
|
||||||
|
10. **Troubleshoot** failures using `TROUBLESHOOTING.md`
|
||||||
|
|
||||||
|
For unattended script-orchestrated runs, consider adding an optional supervisor/watchdog wrapper around `ralph-loop.sh` so process death, stale waits, and silent stalls can be detected independently of the tmux pane.
|
||||||
|
The optional guide and starter templates live in `SUPERVISION.md`, `supervise-ralph-loop.template.sh`, and `audit-ralph-loop.template.sh`.
|
||||||
|
|
||||||
|
### Building Something Larger? (10+ tasks, multiple features)
|
||||||
|
1. **Read** `WAVE-BASED-MANAGEMENT.md` — the plan-then-implement discipline
|
||||||
|
2. **Read** `WORKFLOW-SEAMS.md` — understand how the harness artifacts hand off to each other
|
||||||
|
3. **Read** `EVAL-INFRASTRUCTURE.md` — define the eval stack before implementation starts
|
||||||
|
4. **Read** `POST-RUN-VALIDATION.md` — define how the runtime will decide packet completion is real
|
||||||
|
5. **Create** your `IMPLEMENTATION_PLAN.md` with all tasks grouped into waves
|
||||||
|
6. **Create** `.harness/EXECUTION_MASTER.md` — your wave/stream dashboard
|
||||||
|
7. **For each stream:** copy `EXECUTION-BOARD-TEMPLATE.md`, fill ALL packets before coding any
|
||||||
|
8. **For non-trivial delegated packets:** create a task spec from `TASK-SPEC-TEMPLATE.md`
|
||||||
|
9. **After each packet:** copy `VALIDATION-TEMPLATE.md` and fill it in
|
||||||
|
10. **After each stream:** copy `PROCESS-EVAL-TEMPLATE.md` and write the retrospective
|
||||||
|
11. **At each wave boundary:** run the wave gate checklist before starting the next wave
|
||||||
|
|
||||||
|
If you use `ralph-loop.sh` for a larger project, pass the active board explicitly:
|
||||||
|
|
||||||
|
```bash
./ralph-loop.sh --board .harness/<stream>/execution-board.md
```
|
||||||
|
|
||||||
|
## The Core Insight
|
||||||
|
|
||||||
|
All successful agent approaches share the same loop:
|
||||||
|
|
||||||
|
```
Orient (read spec + plan) → Pick ONE task → Build → Test → Commit → Exit → Restart fresh
```
|
||||||
|
|
||||||
|
The spec defines WHAT. The plan tracks WHERE we are. Fresh context each iteration prevents drift. The human reviews and course-corrects.
|
||||||
|
|
||||||
|
In the script-orchestrated runtime, some of that review must be encoded into the loop itself.
|
||||||
|
In the agent-orchestrated runtime, a supervising agent can supply more of that judgment dynamically.
|
||||||
|
Task specs improve the preconditions for delegation. Post-run validation improves the postconditions.
|
||||||
|
|
||||||
|
See each file for detailed instructions.
|
||||||
|
|
||||||
|
## When to Use Which Guide
|
||||||
|
|
||||||
|
```
┌─────────────────────────────────────────────────┐
│  "Which guide do I need?"                       │
├─────────────────────────────────────────────────┤
│                                                 │
│  Just starting a real project?                  │
│  → GETTING-STARTED.md                           │
│                                                 │
│  Want it scaffolded for you?                    │
│  → scaffold-project.sh / new-harness-project    │
│                                                 │
│  Need a kickoff checklist inside the project?   │
│  → PROJECT-KICKOFF.md                           │
│                                                 │
│  Want the one-page status view?                 │
│  → CURRENT-STATE.md                             │
│                                                 │
│  Want the visual phase map?                     │
│  → WORKFLOW-DIAGRAM.md                          │
│                                                 │
│  Want hands-on learning?                        │
│  → TUTORIAL.md (hands-on learning)              │
│                                                 │
│  Creating a spec?                               │
│  → SPEC-CREATION-GUIDE.md (interview)           │
│                                                 │
│  Delegating a non-trivial task?                 │
│  → TASK-SPEC-TEMPLATE.md                        │
│                                                 │
│  Agent is stuck?                                │
│  → TROUBLESHOOTING.md (failure modes)           │
│                                                 │
│  Reviewing agent output?                        │
│  → REVIEW-AND-QA.md (what to check)             │
│                                                 │
│  Need a full eval strategy?                     │
│  → EVAL-INFRASTRUCTURE.md                       │
│                                                 │
│  Need runtime completion checks?                │
│  → POST-RUN-VALIDATION.md                       │
│                                                 │
│  Need unattended-run supervision?               │
│  → SUPERVISION.md                               │
│                                                 │
│  Confused about how docs hand off?              │
│  → WORKFLOW-SEAMS.md                            │
│                                                 │
│  Worried about cost?                            │
│  → COST-OPTIMIZATION.md (billing models)        │
│                                                 │
│  Multiple independent features?                 │
│  → PARALLEL-AGENTS.md (coordination)            │
│                                                 │
│  Using OpenClaw?                                │
│  → OPENCLAW-INTEGRATION.md (sessions_spawn)     │
│                                                 │
│  Agent keeps changing past decisions?           │
│  → DECISIONS.md (ADR template)                  │
│                                                 │
│  Want to see it in action?                      │
│  → EXAMPLES.md (real project example)           │
│                                                 │
└─────────────────────────────────────────────────┘
```
|
||||||
|
|
||||||
|
## Philosophy
|
||||||
|
|
||||||
|
### Fresh Context > Long Context
|
||||||
|
Each iteration starts with a fresh agent. No accumulated confusion, no stale reasoning. The git history and plan file provide continuity.
|
||||||
|
|
||||||
|
### One Task > Many Tasks
|
||||||
|
Agents that try to do everything in one session produce spaghetti. Agents that focus on ONE task produce clean commits.
|
||||||
|
|
||||||
|
### Spec Quality > Agent Quality
|
||||||
|
A great spec with a mediocre agent beats a vague spec with a great agent. The spec is your leverage point.
|
||||||
|
|
||||||
|
### Review > Repair
|
||||||
|
It's easier to review and guide than to debug and fix. Catch drift early through periodic reviews.
|
||||||
|
|
||||||
|
### Explicit > Implicit
|
||||||
|
Agents can't read your mind. Write down constraints, anti-patterns, and decisions. What's obvious to you is invisible to the agent.
|
||||||
|
|
||||||
|
## Contributing
|
||||||
|
|
||||||
|
This harness is a living system. If you:
|
||||||
|
- Discover new failure modes
|
||||||
|
- Develop better patterns
|
||||||
|
- Find gaps in the guides
|
||||||
|
- Create examples for other project types
|
||||||
|
|
||||||
|
Document them and contribute back. The harness improves as we learn what works.
|
||||||
|
|
||||||
|
## Version
|
||||||
|
|
||||||
|
Current version: **2.0.0** (see `CHANGELOG.md` for history)
|
||||||
|
|
||||||
|
## License
|
||||||
|
|
||||||
|
Public domain. Use it, modify it, share it. No attribution required.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
_The harness doesn't write code. It creates conditions where agents can write code reliably._
|
||||||
|
|
@ -0,0 +1,331 @@
|
||||||
|
# Spec Creation Guide — The Interview Protocol
|
||||||
|
|
||||||
|
> The spec is the most important document in the entire harness.
|
||||||
|
> A bad spec produces bad code, no matter how good the agent is.
|
||||||
|
> This guide teaches you how to create a great spec through structured conversation.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Why Interview-Based Spec Creation?
|
||||||
|
|
||||||
|
Most agent harness guides say "write a good spec" and move on. That's like telling someone "just write good code." The spec requires **two kinds of knowledge**:
|
||||||
|
|
||||||
|
1. **Domain knowledge** — what the human knows (goals, constraints, edge cases, things they've tried)
|
||||||
|
2. **Technical knowledge** — what the agent/engineer knows (architecture patterns, tooling, testing strategies)
|
||||||
|
|
||||||
|
Neither side has the full picture. The interview process brings them together.
|
||||||
|
|
||||||
|
### The Anti-Pattern: Agent-Written Specs
|
||||||
|
|
||||||
|
If you ask an agent to "analyze this codebase and write a spec," you get a description of **what exists**, not a plan for **what should exist**. The agent can't know:
|
||||||
|
- Why you're building this
|
||||||
|
- What you've tried that didn't work
|
||||||
|
- What tradeoffs you're willing to make
|
||||||
|
- What "done" looks like to you
|
||||||
|
|
||||||
|
### The Anti-Pattern: Human-Only Specs
|
||||||
|
|
||||||
|
If the human writes the spec alone, you get:
|
||||||
|
- Vague acceptance criteria ("it should be fast")
|
||||||
|
- Missing technical details (no build commands, no test strategy)
|
||||||
|
- Implied knowledge that the agent can't access
|
||||||
|
- Gaps where the human assumed things were obvious
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## The Interview Protocol
|
||||||
|
|
||||||
|
### Phase 1: Vision & Context (5-10 minutes)
|
||||||
|
|
||||||
|
Start broad. Understand the "why" before the "what."
|
||||||
|
|
||||||
|
**Questions to ask:**
|
||||||
|
|
||||||
|
1. **"What are we building, in one sentence?"**
   - Forces clarity. If they can't say it in one sentence, the scope isn't clear yet.
   - Good: "A CLI toolkit for interacting with DocuSign APIs without a proxy server."
   - Bad: "Something to help with DocuSign stuff."

2. **"Who is this for?"**
   - The user? Other developers? An automated system?
   - This shapes API design, error messages, documentation needs.

3. **"Why now? What's the trigger?"**
   - Understanding urgency and motivation reveals hidden requirements.
   - "I'm tired of copying tokens manually" → auto-refresh is a core requirement, not a nice-to-have.

4. **"What does 'done' look like? How will you know it's working?"**
   - Push for measurable criteria, not feelings.
   - "It works" → "I can run `cli auth` and get a valid token without opening a browser."

5. **"What have you tried before? What didn't work?"**
   - This is GOLD. Anti-patterns save agents hours of wasted effort.
   - "Node.js fetch sends headers that break SpringCM" → use curl instead.

6. **"What options did you consider and reject?"**
   - Capture plausible alternatives that should stay rejected unless something material changes.
   - Example: "We considered WebSockets, but polling is enough for MVP and much simpler to operate."
|
||||||
|
|
||||||
|
**What you're listening for:**
|
||||||
|
- Unstated assumptions ("obviously it needs to...")
|
||||||
|
- Emotional language (frustration = high-priority requirement)
|
||||||
|
- Scope creep indicators ("and eventually it could also...")
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Phase 2: Requirements Extraction (10-15 minutes)
|
||||||
|
|
||||||
|
Now go feature by feature. For each feature:
|
||||||
|
|
||||||
|
**The requirement loop:**
|
||||||
|
|
||||||
|
1. **"Walk me through how you'd use this feature."**
   - Get the happy path first. Concrete scenario, not abstract description.
   - "I'd run `cli templates list` and see my 20 most recent templates with names and IDs."

2. **"What could go wrong?"**
   - Error cases, edge cases, permissions issues.
   - "The token could be expired." → auto-refresh requirement.
   - "The account might not have that API enabled." → graceful error message.

3. **"What's the input? What's the output?"**
   - Be specific about formats, fields, defaults.
   - "Input: template ID. Output: JSON with name, ID, folder, page count, created date."

4. **"How would you test this?"**
   - If they can describe a test, you have an acceptance criterion.
   - "I'd run it and check that I get at least one template back with a valid ID."

5. **"Is this a must-have or nice-to-have?"**
   - Prioritization prevents scope explosion.
   - Phase 1 = must-haves. Phase 2+ = nice-to-haves.
|
||||||
|
|
||||||
|
**Pro tip:** Number requirements as you go (FR-001, FR-002...). It creates shared language for the rest of the project.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Phase 3: Technical Discovery (10-15 minutes)
|
||||||
|
|
||||||
|
This is where the engineer/agent fills in what the human might not think to specify.
|
||||||
|
|
||||||
|
**Questions to explore together:**
|
||||||
|
|
||||||
|
1. **Tech stack confirmation**
   - "You're using TypeScript with npm workspaces. Should we keep that pattern?"
   - Don't assume. The human might want to change direction.

2. **Existing code patterns**
   - Read the codebase. Identify patterns already in use.
   - "I see you're using Commander.js for CLI parsing. Should all packages follow that?"
   - "Your auth module uses JWT with RSA keys. Should new packages share that?"

3. **Build and test infrastructure**
   - "What are the build commands? What test framework? What's the CI/CD setup?"
   - If there's no test framework, that's a Phase 0 task.

4. **Data model and persistence**
   - "Where does data live? Files? Database? Environment variables?"
   - "How do packages share configuration?" (e.g., monorepo root `.env`)

5. **Deployment and environment**
   - "Is this demo-only or does it need production support?"
   - "What environments exist?" (demo, staging, production)

6. **Dependencies and external services**
   - "What APIs are involved? What are their quirks?"
   - "Any rate limits, authentication requirements, or known issues?"
|
||||||
|
|
||||||
|
**What the engineer contributes:**
|
||||||
|
- Suggest architecture patterns the human hasn't considered
|
||||||
|
- Identify missing infrastructure (test framework, linting, CI)
|
||||||
|
- Spot potential issues early (circular dependencies, shared state)
|
||||||
|
- Propose phasing based on technical dependencies
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Phase 4: Constraint Mapping (5 minutes)
|
||||||
|
|
||||||
|
Explicitly capture the guardrails.
|
||||||
|
|
||||||
|
**Three categories (a fourth, ESCALATE, is added in Phase 6 when the spec cannot cover everything):**
|
||||||
|
|
||||||
|
1. **MUST** — Non-negotiable requirements
|
||||||
|
- "MUST use curl for HTTP calls (fetch breaks SpringCM)"
|
||||||
|
- "MUST store tokens in .env file"
|
||||||
|
|
||||||
|
2. **MUST NOT** — Explicit prohibitions
|
||||||
|
- "MUST NOT commit secrets to git"
|
||||||
|
- "MUST NOT use React for the frontend"
|
||||||
|
|
||||||
|
3. **PREFER** — Soft preferences
|
||||||
|
- "PREFER ES modules over CommonJS"
|
||||||
|
- "PREFER shared utilities over code duplication"
|
||||||
|
|
||||||
|
**Why this matters:** Agents follow explicit constraints better than implied ones. A MUST NOT prevents entire categories of mistakes.
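One possible way to keep these categories explicit is to store them as data and render them into the spec, so that no category is silently skipped. This is a minimal sketch; the `Level` enum and `render_constraints` function are hypothetical names, and the output layout is an assumption rather than a required spec format.

```python
from enum import Enum


class Level(Enum):
    MUST = "MUST"
    MUST_NOT = "MUST NOT"
    PREFER = "PREFER"


def render_constraints(constraints: dict[Level, list[str]]) -> str:
    """Render every category, including empty ones, as a bullet list."""
    lines: list[str] = []
    for level in Level:  # Enum iteration preserves definition order
        entries = constraints.get(level, [])
        if not entries:
            lines.append(f"- {level.value}: (none captured)")
        for text in entries:
            lines.append(f"- {level.value} {text}")
    return "\n".join(lines)
```

Printing an explicit "(none captured)" line for an empty category makes a gap visible during the review conversation instead of leaving it implied.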
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Phase 5: Spec Assembly (Agent's job)
|
||||||
|
|
||||||
|
After the interview, the agent assembles the spec:
|
||||||
|
|
||||||
|
1. **Fill the PROJECT-SPEC.md template** with interview answers
|
||||||
|
2. **Add technical details** discovered from code review
|
||||||
|
3. **Write acceptance criteria** from the requirement conversations
|
||||||
|
4. **Propose phasing** based on dependencies
|
||||||
|
5. **Include anti-patterns** from "what didn't work" answers
|
||||||
|
6. **Capture rejected approaches** from the interview in the spec when they are likely to prevent future drift
|
||||||
|
7. **Present to human for review**
|
||||||
|
|
||||||
|
**The review conversation:**
|
||||||
|
- Read through each section together
|
||||||
|
- Human corrects misunderstandings
|
||||||
|
- Agent asks clarifying questions on gaps
|
||||||
|
- Iterate until the human says "yes, that's what I want"
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Phase 6: Self-Containment Test (5 minutes)
|
||||||
|
|
||||||
|
> **The critical test:** Can the spec be solved without the agent needing
|
||||||
|
> to fetch information not included in it?
|
||||||
|
>
|
||||||
|
> This is Tobi Lütke's insight: *Can you state a problem with enough
|
||||||
|
> context that the task is plausibly solvable without the agent going
|
||||||
|
> out and getting more information?*
|
||||||
|
|
||||||
|
**The test — rewrite the spec as if:**
|
||||||
|
1. The reader has never seen your project before
|
||||||
|
2. The reader doesn't know your coding conventions or style
|
||||||
|
3. The reader has no access to information you don't include
|
||||||
|
4. The reader will stop and do nothing if anything is ambiguous
|
||||||
|
|
||||||
|
**The checklist:**
|
||||||
|
- [ ] Every acronym is defined on first use
|
||||||
|
- [ ] File paths referenced actually exist and are correct
|
||||||
|
- [ ] External dependencies have versions pinned or install instructions included
|
||||||
|
- [ ] Domain-specific terms are explained (not everyone knows what "JWT" or "FTS" means)
|
||||||
|
- [ ] The agent can find all referenced files without searching
|
||||||
|
- [ ] No load-bearing context is missing: if adding a sentence would stop the agent from making a mistake, the spec isn't self-contained yet
|
||||||
|
|
||||||
|
**The failure mode this catches:**
|
||||||
|
Agents fill gaps with statistical plausibility — they guess in ways that are
|
||||||
|
often subtly wrong. A spec that relies on shared context (even 5 minutes of
|
||||||
|
prior conversation) will produce outputs that look right but aren't.
|
||||||
|
|
||||||
|
**If the spec fails the test:** Add the missing context. If you can't add it
|
||||||
|
(too much to document), add an ESCALATE constraint: "If you encounter
|
||||||
|
information not covered by this spec, do not assume — ask the human."
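Part of the checklist above can be checked mechanically. The sketch below covers only the "file paths referenced actually exist" item: it pulls backticked tokens that look like file names out of the spec text and reports the ones that are missing on disk. The `missing_paths` helper and its regular expression are assumptions for illustration; the pattern is a heuristic that will miss dotfiles such as `.env` and may flag version-like tokens.

```python
import re
from pathlib import Path

# Heuristic: a backticked token with no spaces that ends in ".<extension>".
PATH_PATTERN = re.compile(r"`([^`\s]+\.[A-Za-z0-9]+)`")


def missing_paths(spec_text: str, root: str) -> list[str]:
    """Return referenced paths that do not exist under `root`, sorted."""
    base = Path(root)
    referenced = set(PATH_PATTERN.findall(spec_text))
    return sorted(p for p in referenced if not (base / p).exists())
```

An empty result does not prove the spec is self-contained; it only removes one class of error before the human review.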
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Spec Quality Checklist
|
||||||
|
|
||||||
|
Before handing a spec to agents, verify:
|
||||||
|
|
||||||
|
### Completeness
|
||||||
|
- [ ] Every feature has numbered acceptance criteria (FR-NNN)
|
||||||
|
- [ ] Data model is defined with types and constraints
|
||||||
|
- [ ] Build and test commands are specified and work
|
||||||
|
- [ ] Anti-patterns section exists with real examples
|
||||||
|
- [ ] Phasing is defined with dependencies noted
|
||||||
|
- [ ] All four constraint categories are filled (MUST / MUST NOT / PREFER / ESCALATE)
|
||||||
|
- [ ] Evaluation design section exists with test cases and verification steps
|
||||||
|
- [ ] Rejected approaches are captured when they would prevent repeated bad decisions
|
||||||
|
|
||||||
|
### Clarity
|
||||||
|
- [ ] A stranger could read this and understand what to build
|
||||||
|
- [ ] No ambiguous words ("fast", "nice", "good") — use numbers
|
||||||
|
- [ ] Input/output examples for key operations
|
||||||
|
- [ ] Error cases are explicitly described
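The "no ambiguous words" item lends itself to a quick automated pass. The following sketch scans a spec for the three example words from the checklist and reports where they occur; `find_vague_words` and the `VAGUE_WORDS` tuple are hypothetical, and a real word list would likely be longer and project-specific.

```python
import re

# Seeded with the examples from the checklist; extend as needed.
VAGUE_WORDS = ("fast", "nice", "good")


def find_vague_words(spec_text: str, vague_words=VAGUE_WORDS) -> list[tuple[int, str]]:
    """Return (line_number, word) pairs for each whole-word match."""
    alternatives = "|".join(re.escape(w) for w in vague_words)
    pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)
    hits = []
    for line_no, line in enumerate(spec_text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((line_no, match.group(1).lower()))
    return hits
```

Each hit is a prompt to ask for a number or a concrete example, not an automatic failure.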
|
||||||
|
|
||||||
|
### Testability
|
||||||
|
- [ ] Every acceptance criterion can be verified by running code
|
||||||
|
- [ ] Sample data or fixtures are provided
|
||||||
|
- [ ] Performance criteria have specific thresholds
|
||||||
|
- [ ] "Done" is objectively measurable
|
||||||
|
|
||||||
|
### Feasibility
|
||||||
|
- [ ] Tech stack is proven for this type of project
|
||||||
|
- [ ] External dependencies are accessible (API keys, permissions)
|
||||||
|
- [ ] Scope fits the timeline (phasing handles overflow)
|
||||||
|
- [ ] Known challenges are documented with mitigation strategies
|
||||||
|
|
||||||
|
### Self-Containment
|
||||||
|
- [ ] A stranger could solve this without asking follow-up questions
|
||||||
|
- [ ] No domain-specific terms used without definition
|
||||||
|
- [ ] All file paths, commands, and references are correct
|
||||||
|
- [ ] ESCALATE constraints cover situations where spec is ambiguous
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Common Interview Mistakes
|
||||||
|
|
||||||
|
### 1. Leading the witness
|
||||||
|
**Bad:** "You probably want auto-refresh, right?"
|
||||||
|
**Good:** "What happens when the token expires mid-session?"
|
||||||
|
|
||||||
|
### 2. Accepting vague answers
|
||||||
|
**Bad:** Human: "It should handle errors well." Agent: "Got it."
|
||||||
|
**Good:** "Can you give me an example of an error? What should the user see?"
|
||||||
|
|
||||||
|
### 3. Skipping the 'why'
|
||||||
|
**Bad:** Jumping straight to features.
|
||||||
|
**Good:** Understanding context first — it changes how you interpret every requirement.
|
||||||
|
|
||||||
|
### 4. Over-engineering the spec
|
||||||
|
**Bad:** 50-page spec with UML diagrams for a CLI tool.
|
||||||
|
**Good:** Enough detail for an agent to work autonomously, no more.
|
||||||
|
|
||||||
|
### 5. Forgetting anti-patterns
|
||||||
|
**Bad:** Only describing what TO do.
|
||||||
|
**Good:** Explicitly listing what NOT to do — saves agents from repeating your mistakes.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Template: Interview Notes
|
||||||
|
|
||||||
|
Use this to capture notes during the interview before assembling the spec:
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
# Interview Notes — [Project Name]
|
||||||
|
**Date:** YYYY-MM-DD
|
||||||
|
**Participants:** [Human], [Agent]
|
||||||
|
|
||||||
|
## Vision
|
||||||
|
- One-liner:
|
||||||
|
- Target user:
|
||||||
|
- Trigger/motivation:
|
||||||
|
- Success criteria:
|
||||||
|
|
||||||
|
## Features (raw notes)
|
||||||
|
1. Feature name — description — happy path — error cases — priority
|
||||||
|
2. ...
|
||||||
|
|
||||||
|
## Technical Context
|
||||||
|
- Existing stack:
|
||||||
|
- Patterns to follow:
|
||||||
|
- Patterns to avoid:
|
||||||
|
- Build/test commands:
|
||||||
|
|
||||||
|
## Constraints
|
||||||
|
- MUST:
|
||||||
|
- MUST NOT:
|
||||||
|
- PREFER:
|
||||||
|
|
||||||
|
## Anti-patterns (things that didn't work)
|
||||||
|
1.
|
||||||
|
2.
|
||||||
|
|
||||||
|
## Rejected Approaches
|
||||||
|
1. [Option] — rejected because [...] — scope: [project-only / reusable lesson]
|
||||||
|
2.
|
||||||
|
|
||||||
|
## Open Questions
|
||||||
|
1.
|
||||||
|
2.
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
_This guide is a living document. Update it as you learn what works._
|
||||||
|
|
@ -0,0 +1,31 @@
|
||||||
|
# Field Mapping: Adobe Sign → DocuSign
|
||||||
|
|
||||||
|
This doc tracks direct mappings and required transforms between Adobe Sign library document (template) properties and DocuSign template properties.
|
||||||
|
|
||||||
|
## Simple Field Type Mapping
|
||||||
|
| Adobe Sign Type | DocuSign Tab Type | Notes |
|
||||||
|
|---------------------|-------------------------|------------------------|
|
||||||
|
| TEXT_FIELD | text | Valid for plain text |
|
||||||
|
| SIGNATURE | signHere | |
|
||||||
|
| CHECKBOX | checkbox | |
|
||||||
|
| DATE | dateSigned | May need transform |
|
||||||
|
| RADIO | radio | Group mapping required |
|
||||||
|
| DROPDOWN | list | Data mapping |
|
||||||
|
| APPROVER | signer/approver role | DocuSign roles more flexible |
|
||||||
|
| ... | ... | ... |
|
||||||
|
|
||||||
|
## Role/Recipient Mapping
|
||||||
|
| Adobe Field | DocuSign Field | Notes |
|
||||||
|
|-----------------------|---------------------|-------|
|
||||||
|
| recipientSetRole | role (signer, etc.) | Matching by role name |
|
||||||
|
| recipientSetMemberInfos.email | role.email | |
|
||||||
|
|
||||||
|
## Workflow Feature Mapping (Rough)
|
||||||
|
- Sequential routing → Recipient order
|
||||||
|
- Parallel routing → Recipients sharing the same routing order (DocuSign sends to same-order recipients in parallel)
|
||||||
|
- Conditional logic → Needs review, possible via DocuSign conditional tabs/logic
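The sequential and parallel cases above can be sketched as a single transform: each routing step gets the next routing order, and every recipient within a step shares it. The input shape (a list of steps, each a list of role names) and the `assign_routing_orders` name are assumptions for illustration; the output uses DocuSign's `roleName` and `routingOrder` recipient properties, with the order written as a string.

```python
def assign_routing_orders(steps: list[list[str]]) -> list[dict[str, str]]:
    """Map ordered routing steps to recipients with routing orders.

    A step with one role is sequential; a step with several roles is
    parallel, so all of its roles share the same routing order.
    """
    recipients = []
    for order, step in enumerate(steps, start=1):
        for role_name in step:
            recipients.append({"roleName": role_name, "routingOrder": str(order)})
    return recipients
```

Conditional routing is deliberately not handled here, since the draft marks it as needing review.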
|
||||||
|
|
||||||
|
## To Do
|
||||||
|
- Add table for conditional logic/rule mapping
|
||||||
|
- Add transforms needed for field masks, validation rules, and default values
|
||||||
|
- Collect pain points/edge cases for high-fidelity migration
|
||||||
|
|
@ -0,0 +1,3 @@
|
||||||
|
requests
|
||||||
|
python-dotenv
|
||||||
|
pydantic
|
||||||