# Architecture & Design — Adobe Sign → DocuSign Migrator

*Last updated: 2026-04-23*

---

## System Overview

The migrator is a Python toolkit with two interfaces that share the same core pipeline:

- **CLI** (`src/`) — shell scripts for one-off or scripted migrations
- **Web UI** (`web/`) — FastAPI + vanilla JS SPA for browser-based, multi-user migrations

Both interfaces execute the same sequence: authenticate → download → normalize → validate → compose → upload → report.

---

## Component Map

```
Browser / CLI
      │
      ▼
┌─────────────────────────────────────────────────┐
│  web/app.py (FastAPI)  OR  src/migrate_*.py     │
│  – session management (web only)                │
│  – OAuth orchestration (web only)               │
│  – batch job queue (in-memory dict, web only)   │
└──────────────┬──────────────────────────────────┘
               │ calls
    ┌──────────┴──────────┐
    ▼                     ▼
src/adobe_api.py     src/upload_docusign_template.py
(Adobe Sign REST)    (DocuSign REST — upsert)
    │                     ▲
    │ raw JSON            │ DocuSign JSON
    ▼                     │
src/services/mapping_service.py
    └─► src/models/normalized_template.py
    │ NormalizedTemplate
    ▼
src/services/validation_service.py
    │ blockers / warnings
    ▼
src/compose_docusign_template.py
    └─► src/models/field_issue.py
    │ (template_dict, warnings, field_issues)
    │
    ▼
src/reports/report_builder.py
    └─► MigrationReport written to migration-output/.history.json
```

---

## Pipeline Stages

### 1. Authentication

| Surface | Adobe Sign | DocuSign |
|---------|-----------|---------|
| CLI | OAuth Auth Code via `adobe_auth.py`; tokens stored in `.env` | OAuth Auth Code via `docusign_auth.py`; tokens stored in `.env` |
| Web | OAuth Auth Code via `/api/auth/adobe/callback`; tokens in server-side session file | OAuth Auth Code via `/api/auth/docusign/callback`; tokens in server-side session file |

The web UI never stores OAuth tokens in `.env`. Each browser session keeps its own tokens in a server-side session file under `.session-store/`, and the session is identified by a `session_id` cookie signed with `SESSION_SECRET_KEY`.

### 2. Download (Adobe Sign)

`src/adobe_api.py` fetches from the Adobe Sign REST v6 API. The shard is configured via `ADOBE_SIGN_BASE_URL` (default: `https://api.eu2.adobesign.com/api/rest/v6`).

For each template, four artifacts (three JSON files and the PDF) are written to `downloads/<template-name>__<id>/`:

| File | Content |
|------|---------|
| `metadata.json` | Template metadata (name, status, creator, dates) |
| `form_fields.json` | Full form field list with locations, conditions, validations |
| `documents.json` | Document list metadata |
| `<name>.pdf` | Binary PDF (base64 decoded) |

### 3. Normalize (`mapping_service.py`)

`MappingService.from_folder(path)` reads the three JSON files and produces a `NormalizedTemplate` (Pydantic model). This platform-agnostic intermediate schema decouples Adobe-specific field names from the DocuSign composition step.

Key transformations at this stage:

- Participant sets → typed role list (`SIGN`, `APPROVE`, `CC`)
- Field locations expanded into flat list (multi-location fields produce N entries)
- Conditional action references converted to normalized `ConditionalRule` objects
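
The location-expansion step can be pictured with a small sketch. This is not the project's `MappingService` code: `NormalizedField`, `expand_locations`, and the raw key names (`locations`, `pageNumber`, `left`, `top`) are assumptions chosen for illustration, and a plain dataclass stands in for the Pydantic model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedField:
    """Hypothetical flat record: one entry per (field, location) pair."""
    name: str
    page: int
    left: float
    top: float


def expand_locations(raw_field: dict) -> list[NormalizedField]:
    """Turn one raw field with N locations into N flat entries."""
    return [
        NormalizedField(
            name=raw_field["name"],
            page=loc["pageNumber"],
            left=loc["left"],
            top=loc["top"],
        )
        for loc in raw_field.get("locations", [])
    ]


# A field placed on two pages yields two normalized entries.
raw = {
    "name": "initials",
    "locations": [
        {"pageNumber": 1, "left": 72.0, "top": 640.0},
        {"pageNumber": 2, "left": 72.0, "top": 640.0},
    ],
}
flat = expand_locations(raw)
```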

### 4. Validate (`validation_service.py`)

Runs pre-migration checks and returns `(blockers: list[str], warnings: list[str])`.

| Check | Result on failure |
|-------|-----------------|
| No recipients | Blocker |
| No documents | Blocker |
| No signature fields | Warning |
| Unassigned fields | Warning |
| Unsupported feature detected | Warning |

Blockers halt migration. Warnings are stored in the history and surfaced in the UI but do not stop the pipeline.
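
A minimal sketch of how the checks in the table could map onto the `(blockers, warnings)` return value. It works on a plain dict rather than the real `NormalizedTemplate`, and the key names (`recipients`, `documents`, `fields`, `type`, `recipient`, `unsupported_features`) are assumptions, not the project's schema.

```python
def validate(template: dict) -> tuple[list[str], list[str]]:
    """Return (blockers, warnings) for a template given as a plain dict."""
    blockers: list[str] = []
    warnings: list[str] = []

    # Blockers: the migration cannot proceed without these.
    if not template.get("recipients"):
        blockers.append("No recipients")
    if not template.get("documents"):
        blockers.append("No documents")

    # Warnings: recorded and surfaced, but the pipeline continues.
    fields = template.get("fields", [])
    if not any(f.get("type") == "SIGNATURE" for f in fields):
        warnings.append("No signature fields")
    if any(not f.get("recipient") for f in fields):
        warnings.append("Unassigned fields")
    if template.get("unsupported_features"):
        warnings.append("Unsupported feature detected")

    return blockers, warnings


blockers, warnings = validate(
    {
        "recipients": [{"role": "SIGN"}],
        "documents": [],
        "fields": [{"type": "TEXT", "recipient": None}],
    }
)
can_migrate = not blockers  # warnings alone never stop the pipeline
```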

### 5. Compose (`compose_docusign_template.py`)

Converts `NormalizedTemplate` → DocuSign `envelopeTemplate` JSON. Returns a 3-tuple:

```python
(template_dict: dict, warnings: list[str], field_issues: list[dict])
```

`field_issues` are structured `FieldIssue` objects (see `src/models/field_issue.py`) emitted when a field migrates successfully but something was silently dropped or approximated. Each issue has a machine-readable `code` (e.g. `CROSS_RECIPIENT_CONDITIONAL`, `HIDE_ACTION`, `FIELD_TYPE_SKIPPED`). See [field-mapping.md](../field-mapping.md) for the full list.

### 6. Upload (`upload_docusign_template.py`)

Upsert pattern:
1. Search DocuSign for an existing template with the same name
2. If found: `PUT /templates/{id}` (update the most recently modified match)
3. If not found: `POST /templates` (create new)
4. `--force-create` flag bypasses the search and always creates
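
The upsert decision above can be sketched as a pure function that takes the search results and returns which call to make. `plan_upload` and the record keys (`templateId`, `name`, `lastModified`) are illustrative assumptions; the real module performs the search and the HTTP calls itself.

```python
from typing import Optional


def plan_upload(
    name: str,
    existing: list[dict],
    force_create: bool = False,
) -> tuple[str, Optional[str]]:
    """Decide the call for one template: ("POST", None) or ("PUT", template_id)."""
    if force_create:
        return "POST", None  # --force-create skips the name search entirely

    matches = [t for t in existing if t["name"] == name]
    if not matches:
        return "POST", None

    # Several templates can share a name; update the most recently modified one.
    # ISO 8601 UTC timestamps in one format sort correctly as plain strings.
    newest = max(matches, key=lambda t: t["lastModified"])
    return "PUT", newest["templateId"]


existing = [
    {"templateId": "a1", "name": "NDA", "lastModified": "2026-01-10T09:00:00Z"},
    {"templateId": "b2", "name": "NDA", "lastModified": "2026-03-02T14:30:00Z"},
]
update_plan = plan_upload("NDA", existing)
create_plan = plan_upload("MSA", existing)
forced_plan = plan_upload("NDA", existing, force_create=True)
```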

### 7. Report (`report_builder.py`)

A `MigrationReport` is built per template and appended to `migration-output/.history.json`. Each record contains:
- template name, Adobe ID, DocuSign ID
- status (`success`, `dry_run`, `skipped`, `error`)
- blockers, warnings, field_issues
- PDF checksum (SHA-256)
- timestamp
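
As a rough illustration of the record fields listed above, the sketch below computes a SHA-256 checksum and appends a record to a history file. It assumes the history file is a single JSON array and that the checksum carries a `sha256:` prefix; the function names are hypothetical and not taken from `report_builder.py`.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def pdf_checksum(pdf_bytes: bytes) -> str:
    """SHA-256 of the PDF bytes, with an assumed "sha256:" prefix."""
    return "sha256:" + hashlib.sha256(pdf_bytes).hexdigest()


def append_history(history_path: Path, record: dict) -> list[dict]:
    """Append one record to a JSON-array history file (assumed layout)."""
    history = json.loads(history_path.read_text()) if history_path.exists() else []
    history.append(record)
    history_path.parent.mkdir(parents=True, exist_ok=True)
    history_path.write_text(json.dumps(history, indent=2))
    return history


# Demo in a temporary directory so nothing is written to a real project tree.
with tempfile.TemporaryDirectory() as tmp:
    history_path = Path(tmp) / "migration-output" / ".history.json"
    record = {
        "template_name": "Sales Agreement",
        "status": "dry_run",
        "blockers": [],
        "warnings": [],
        "field_issues": [],
        "pdf_checksum": pdf_checksum(b"%PDF-1.7 placeholder bytes"),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    append_history(history_path, record)
    history = append_history(history_path, {**record, "status": "success"})
```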

---

## Web Layer

### FastAPI App (`web/app.py`)

- Mounts all routers under `/api/`
- Serves the SPA shell from `web/static/index.html`
- Installs `SanitizingFilter` on the root logger at startup (redacts tokens and secrets from all log output)
- Logs a warning at startup if `SESSION_SECRET_KEY` is the default development value

### Routers

| Router | Prefix | Responsibility |
|--------|--------|---------------|
| `auth.py` | `/api/auth` | Adobe Sign + DocuSign OAuth flows, session status |
| `templates.py` | `/api/templates` | Adobe template listing; migration status per template |
| `migrate.py` | `/api/migrate` | Single and batch migration; history; job polling |
| `verify.py` | `/api/verify` | Send test envelopes; poll status; void |
| `audit.py` | `/api/audit` | Audit log access + CSV export |
| `admin.py` | `/api/admin` | Admin-only operations (admin_emails gating) |

### Session Lifecycle

```
Browser makes first request
  → middleware generates UUID session_id
  → signed cookie set (itsdangerous, SESSION_SECRET_KEY)
  → session file created at .session-store/<session_id>.json

User connects Adobe Sign / DocuSign
  → OAuth tokens written to session file (never to .env)
  → session file updated on every token refresh

User disconnects or session file deleted
  → next request gets a fresh session_id and new file
  → old file can be deleted manually to force re-auth
```

Session files are plain JSON. Delete all files in `.session-store/` to reset all user sessions. Set `SESSION_STORE_DIR` in `.env` to change the location.
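
To show why a signed cookie protects the `session_id`, here is a minimal sketch of sign-and-verify. The application uses `itsdangerous` for this; the sketch swaps in the standard library's `hmac` so it runs without dependencies, and the `sign` / `unsign` helpers and the cookie layout are assumptions rather than the project's actual format.

```python
import hashlib
import hmac
import uuid
from typing import Optional

SECRET = b"dev-only-secret"  # stand-in for SESSION_SECRET_KEY


def sign(session_id: str, secret: bytes = SECRET) -> str:
    """Return "<session_id>.<hex HMAC-SHA256>" as the cookie value."""
    mac = hmac.new(secret, session_id.encode(), hashlib.sha256).hexdigest()
    return f"{session_id}.{mac}"


def unsign(cookie: str, secret: bytes = SECRET) -> Optional[str]:
    """Return the session_id if the signature matches, else None."""
    session_id, _, mac = cookie.rpartition(".")
    expected = hmac.new(secret, session_id.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(mac.encode(), expected.encode()):
        return session_id
    return None


session_id = str(uuid.uuid4())
cookie = sign(session_id)
tampered = cookie[:-1] + ("0" if cookie[-1] != "0" else "1")
```

A cookie that fails verification would be treated like a missing one, so the next request starts a fresh session.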

### Multi-Account DocuSign Support

When a DocuSign user belongs to multiple accounts, the web UI:
1. Fetches `/oauth/userinfo` after the OAuth callback
2. Sorts available accounts alphabetically
3. Prompts the user to pick one account for the session
4. Stores `docusign_account_id` in the session alongside the tokens
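
The account-selection steps might look roughly like the sketch below. It assumes the userinfo payload carries an `accounts` list with `account_id` and `account_name` keys and that sorting is by name, case-insensitively; `account_choices` and `select_account` are hypothetical helpers, not functions from the web layer.

```python
def account_choices(userinfo: dict) -> list[dict]:
    """Return the user's accounts sorted by name, ignoring case."""
    accounts = userinfo.get("accounts", [])
    return sorted(accounts, key=lambda a: a["account_name"].casefold())


def select_account(session: dict, userinfo: dict, chosen_id: str) -> dict:
    """Return a copy of the session with the chosen account id stored."""
    valid_ids = {a["account_id"] for a in userinfo.get("accounts", [])}
    if chosen_id not in valid_ids:
        raise ValueError(f"unknown DocuSign account: {chosen_id}")
    return {**session, "docusign_account_id": chosen_id}


userinfo = {
    "accounts": [
        {"account_id": "222", "account_name": "beta Legal"},
        {"account_id": "111", "account_name": "Acme Sales"},
    ]
}
choices = account_choices(userinfo)
session = select_account({"docusign_access_token": "placeholder"}, userinfo, "111")
```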

### Batch Job State

Batch migrations are tracked in an in-memory dict (`_batch_jobs`) in `web/routers/migrate.py`. Job state is lost on server restart — any in-flight batch becomes unrecoverable. This is a known limitation appropriate for single-operator deployments. Production deployments requiring durability should persist job state to a database or file store.

### Audit Log

`web/audit.py` writes one JSONL record per migration event to `AUDIT_LOG_FILE` (default: `.audit-log.jsonl`). Each record:

```json
{
  "timestamp": "2026-04-23T12:00:00Z",
  "session_id": "abc123",
  "user_email": "user@example.com",
  "action": "migrate",
  "template_name": "Sales Agreement",
  "adobe_template_id": "3AAA...",
  "docusign_template_id": "uuid",
  "status": "success",
  "field_issues_count": 2,
  "pdf_checksum": "sha256:abcdef..."
}
```

The `/api/audit` endpoints expose this log with filtering and CSV export. Sensitive fields (tokens, secrets) are never written — the `SanitizingFilter` on the root logger ensures they are redacted before hitting any output.
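
For readers unfamiliar with JSONL, the sketch below appends one JSON object per line and reads the log back with a simple status filter. The helper names are hypothetical and the filtering is far simpler than whatever the `/api/audit` endpoints implement.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def write_audit_record(log_path: Path, record: dict) -> None:
    """Append one event as a single JSON line (JSONL)."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_audit_records(log_path: Path, status: Optional[str] = None) -> list[dict]:
    """Read all records back, optionally keeping only one status."""
    with log_path.open(encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    return [r for r in records if status is None or r["status"] == status]


# Demo against a temporary file rather than the real AUDIT_LOG_FILE.
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / ".audit-log.jsonl"
    write_audit_record(
        log_path,
        {"action": "migrate", "template_name": "Sales Agreement", "status": "success"},
    )
    write_audit_record(
        log_path,
        {"action": "migrate", "template_name": "NDA", "status": "error"},
    )
    all_records = read_audit_records(log_path)
    failed = read_audit_records(log_path, status="error")
```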

---

## Frontend SPA

Single-page app in `web/static/`. No build step — plain HTML + ES modules.

| File | Responsibility |
|------|---------------|
| `index.html` | Shell, left nav, top bar, router outlet |
| `js/router.js` | Hash-based routing (`#/templates`, `#/results`, etc.) |
| `js/state.js` | Global pub/sub state store |
| `js/api.js` | Typed fetch wrappers for all backend endpoints |
| `js/auth.js` | Auth chip UI, OAuth flow, toast notifications |
| `js/templates.js` | Templates view + detail tabs (overview / issues / history) |
| `js/migration.js` | Migration modal, progress polling, results view |
| `js/issues.js` | Issues & Warnings view |
| `js/verification.js` | Verification view (send / poll / void envelopes) |
| `js/history.js` | History & Audit view |
| `js/settings.js` | Settings view |
| `js/project.js` | Per-customer project context (localStorage) |
| `js/utils.js` | `escHtml`, `formatDate`, `renderFieldIssues`, etc. |

CSS uses DocuSign 2024 brand design tokens defined in `css/tokens.css`.

### Template Issue Summary

The Templates and Issues & Warnings pages use `/api/templates/status`. A
template is shown as `Clean` only when all of these are empty:

- validation `blockers`
- validation `warnings`
- composition `field_issues`

On the web server, migration downloads are temporary. If no persistent
`downloads/` folder exists for re-analysis, `/api/templates/status` falls back
to the current browser session's `migration-output/.history.json` records so
field issues discovered during migration still appear in the Templates summary.
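
The `Clean` rule reduces to a single predicate over the three lists. A minimal sketch, assuming the status payload is a dict keyed by `blockers`, `warnings`, and `field_issues` (the function name `is_clean` is hypothetical):

```python
def is_clean(status: dict) -> bool:
    """True only when blockers, warnings and field_issues are all empty."""
    return not (
        status.get("blockers")
        or status.get("warnings")
        or status.get("field_issues")
    )


clean = is_clean({"blockers": [], "warnings": [], "field_issues": []})
# A single field issue is enough to lose the Clean label.
flagged = is_clean(
    {"blockers": [], "warnings": [], "field_issues": [{"code": "HIDE_ACTION"}]}
)
```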

---

## Security Design

| Concern | Mechanism |
|---------|----------|
| Token leakage in logs | `SanitizingFilter` installed on root logger at startup; redacts Bearer tokens, JWTs, long base64 strings, and key=value assignments for known secret fields |
| Session integrity | Sessions signed with `SESSION_SECRET_KEY` via `itsdangerous`; secret must be set in `.env` |
| Secret exposure at startup | Warning logged if `SESSION_SECRET_KEY` is the default value |
| PDF integrity | SHA-256 checksum computed before upload and stored in history |
| Credential storage | OAuth tokens stored in server-side session files, never in browser localStorage or logs |

---

## Utilities

### `src/utils/retry.py`

`retry_with_backoff` and `async_retry_with_backoff` decorators implement exponential backoff (configurable max retries, base delay, max delay). They target HTTP 429 / 5xx transient errors. These decorators are defined and tested but are not yet applied to API call sites — adding `@retry_with_backoff()` to functions in `adobe_api.py` and `upload_docusign_template.py` is the recommended next step for production hardening.
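
For orientation, a generic exponential-backoff decorator is sketched below. It is not the code in `src/utils/retry.py`: the name `backoff_retry`, the `TransientHTTPError` exception, and the injectable `sleep` parameter are assumptions made so the example is self-contained and testable without real waiting.

```python
import functools
import time


class TransientHTTPError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_transient(status_code: int) -> bool:
    return status_code == 429 or 500 <= status_code < 600


def backoff_retry(max_retries=3, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Retry on 429/5xx, doubling the delay each attempt up to max_delay."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except TransientHTTPError as exc:
                    # Give up on non-transient codes or once retries run out.
                    if not is_transient(exc.status_code) or attempt == max_retries:
                        raise
                    sleep(min(max_delay, base_delay * 2 ** attempt))

        return wrapper

    return decorator


delays = []  # recorded instead of sleeping, so the demo runs instantly
calls = {"n": 0}


@backoff_retry(max_retries=3, base_delay=1.0, max_delay=3.0, sleep=delays.append)
def flaky_call():
    calls["n"] += 1
    if calls["n"] < 4:
        raise TransientHTTPError(429)
    return "ok"


result = flaky_call()
```

In this run the call fails three times with 429 and succeeds on the fourth attempt, with the third delay capped by `max_delay`.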

### `src/utils/log_sanitizer.py`

`install_sanitizing_filter()` attaches a `logging.Filter` to the root logger. The filter runs `redact()` on every log record's message and args, replacing Bearer tokens, JWTs, long base64 strings, and key=value secret assignments with `[REDACTED]`.
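
A simplified sketch of a redacting `logging.Filter` follows. The regular expressions and the `RedactingFilter` class are illustrative assumptions and cover fewer cases than the description above (no long-base64 rule, for instance). The demo attaches the filter to a handler, which is one way to make sure every record passing through that handler is filtered.

```python
import io
import logging
import re

# Illustrative patterns only; the project's real rules may differ.
_PATTERNS = [
    re.compile(r"Bearer\s+[A-Za-z0-9._~+/=-]+"),
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),  # JWT shape
    re.compile(r"(?i)\b(access_token|refresh_token|client_secret)=\S+"),
]


def redact(text: str) -> str:
    """Replace anything matching a secret pattern with [REDACTED]."""
    for pattern in _PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


class RedactingFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Render message + args first so secrets passed as args are caught too.
        record.msg = redact(record.getMessage())
        record.args = None
        return True  # never drop the record, only rewrite it


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(RedactingFilter())

log = logging.getLogger("sanitizer-demo")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

log.info("calling API with %s", "Bearer abc123")
logged = stream.getvalue()
```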

---

## Known Limitations

| Limitation | Impact | Mitigation |
|-----------|--------|-----------|
| Batch job state is in-memory | Lost on restart | Acceptable for single-operator use (the batch queue is web only, so the CLI is unaffected); add DB persistence for multi-operator production |
| Adobe shard configured via full base URL only | Changing shard requires `.env` update | Set `ADOBE_SIGN_BASE_URL` in `.env` |
| Retry decorators not applied to API calls | 429/5xx errors propagate immediately | Apply `@retry_with_backoff()` to `adobe_api.py` + `upload_docusign_template.py` |
| Regression tests require real fixture data | CI cannot run regression tests without downloaded templates | Check in anonymised fixtures or generate synthetic ones |

*Updated 2026-04-23 — reflects v2 web UI, session lifecycle, audit log schema, multi-account support, batch job state, security design.*