# Phase 1: CopyMeThat Format Analysis
**Completed:** 2026-03-28 23:52 EDT
**Analyzed by:** Cleo
---
## File Structure
### TXT Export
- **Location:** `data/exports/Copy_Me_That_TXT_20260328_58775_z1p5lpjsgz/`
- **Format:** One `.txt` file per recipe
- **Naming:** Snake-case recipe names (e.g., `apple_crowned_coffee_cake.txt`)
- **Estimated count:** ~50-100+ recipes
### HTML Export
- **Location:** `data/exports/Copy_Me_That_HTML_20260328_58775_z1p5lpjsgz/`
- **Format:** Single `recipes.html` file containing ALL recipes
- **Images:** `/images/` subfolder with recipe photos
- **Structure:** Semantic HTML with consistent IDs
---
## TXT Format Specification
### Structure
```
[Recipe Title]
Adapted from [URL]
tags: [Tag1], [Tag2], [Tag3]
[Optional: "I made this."]
Servings: [serving info]
INGREDIENTS
[ingredient 1]
[ingredient 2]
...
STEPS
1) [step 1]
2) [step 2]
...
NOTES
[optional notes]
```
### Example
```
Apple-Crowned Coffee Cake
Adapted from http://www.kraftcanada.com/recipes/apple-crowned-coffee-cake-191423
tags: Cake, Dessert
I made this.
Servings: 16 servings, 1 piece (76 g) each
INGREDIENTS
2 cups flour
2 Tbsp. granulated sugar
...
STEPS
1) Heat oven to 375°F.
2) Combine flour...
NOTES
If the glaze is too thick...
```
### Key Fields
- **Title:** First line
- **Source URL:** "Adapted from [URL]"
- **Tags:** Comma-separated after "tags:"
- **Made flag:** Presence of "I made this."
- **Servings:** After "Servings:"
- **Ingredients:** Plain list between "INGREDIENTS" and "STEPS"
- **Instructions:** Numbered list after "STEPS"
- **Notes:** Optional, after "NOTES"
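A minimal sketch of how the TXT layout above could be parsed. It assumes the section markers (`INGREDIENTS`, `STEPS`, `NOTES`) always sit alone on a line and appear in that order. The `TxtRecipe` shape and its field names are hypothetical and are not the app's schema.

```typescript
// Hypothetical result shape for illustration; not the app's Recipe schema.
interface TxtRecipe {
  title: string;
  sourceUrl: string | null;
  tags: string[];
  made: boolean;
  servings: string | null;
  ingredients: string[];
  steps: string[];
  notes: string;
}

type TxtSection = "header" | "ingredients" | "steps" | "notes";

function parseTxtFile(content: string): TxtRecipe {
  // Split on LF or CRLF, trim each line, and drop blank lines.
  const lines = content
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const recipe: TxtRecipe = {
    title: lines[0] ?? "", // the title is the first non-blank line
    sourceUrl: null,
    tags: [],
    made: false,
    servings: null,
    ingredients: [],
    steps: [],
    notes: "",
  };
  const noteLines: string[] = [];
  let section: TxtSection = "header";

  for (const line of lines.slice(1)) {
    // Section markers switch state and are not stored themselves.
    if (line === "INGREDIENTS") { section = "ingredients"; continue; }
    if (line === "STEPS") { section = "steps"; continue; }
    if (line === "NOTES") { section = "notes"; continue; }

    if (section === "ingredients") {
      recipe.ingredients.push(line);
    } else if (section === "steps") {
      // Strip the leading "1) " style numbering.
      recipe.steps.push(line.replace(/^\d+\)\s*/, ""));
    } else if (section === "notes") {
      noteLines.push(line);
    } else if (line.indexOf("Adapted from ") === 0) {
      recipe.sourceUrl = line.slice("Adapted from ".length).trim();
    } else if (line.indexOf("tags:") === 0) {
      recipe.tags = line
        .slice("tags:".length)
        .split(",")
        .map((tag) => tag.trim())
        .filter((tag) => tag.length > 0);
    } else if (line === "I made this.") {
      recipe.made = true;
    } else if (line.indexOf("Servings:") === 0) {
      recipe.servings = line.slice("Servings:".length).trim();
    }
  }

  recipe.notes = noteLines.join("\n");
  return recipe;
}
```

A step that wraps onto an unnumbered second line would be recorded as a separate step here; whether the export ever does that has not been checked.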
---
## HTML Format Specification
### Structure
Single HTML file with repeated `.recipe` div blocks:
```html
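<!-- Element names are illustrative; only the ids and classes are taken from the export -->
<div class="recipe">
  <div id="name">Recipe Title</div>
  <span class="recipeCategory">Tag1</span>
  <span class="recipeCategory">Tag2</span>
  <div id="description">Description text</div>
  <ol>
    <li class="instruction">step 1</li>
    ...
  </ol>
</div>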
```
### Key Selectors
- `.recipe` — Recipe container
- `#name` — Title
- `#original_link` — Source URL
- `.recipeImage` — Image path
- `.recipeCategory` — Tags
- `#description` — Description
- `#made_this` — Made flag
- `#ratingValue` — Rating (1-5)
- `#recipeYield` — Servings
- `.recipeIngredient` — Ingredients (list items)
- `.instruction` — Steps (ordered list items)
- `.recipeNote` — Notes
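The selectors above could be applied to one `.recipe` block roughly as follows. Since the same ids appear to repeat in every block, each lookup is scoped to the block and not to the whole document. The sketch is written against a small structural interface (`ElementLike`, a hypothetical helper) so it does not depend on a particular DOM library. Reading the URL from `href`, the image path from `src`, and the made flag from the mere presence of `#made_this` are assumptions that should be checked against the real export.

```typescript
// Minimal structural view of a DOM element (hypothetical helper type).
interface ElementLike {
  textContent: string | null;
  getAttribute(name: string): string | null;
  querySelector(selector: string): ElementLike | null;
  querySelectorAll(selector: string): ArrayLike<ElementLike>;
}

// Hypothetical result shape; not the app's confirmed Recipe schema.
interface HtmlRecipe {
  title: string;
  sourceUrl: string | null;
  imagePath: string | null;
  tags: string[];
  description: string;
  made: boolean;
  rating: number | null;
  servings: string;
  ingredients: string[];
  steps: string[];
  notes: string[];
}

// Trimmed text of an element, or "" when the element is absent.
function textOf(el: ElementLike | null | undefined): string {
  return el && el.textContent ? el.textContent.trim() : "";
}

// Trimmed, non-empty texts of every match under `root`.
function textsOf(root: ElementLike, selector: string): string[] {
  const found = root.querySelectorAll(selector);
  const out: string[] = [];
  for (let i = 0; i < found.length; i++) {
    const text = textOf(found[i]);
    if (text.length > 0) out.push(text);
  }
  return out;
}

function parseRecipeBlock(block: ElementLike): HtmlRecipe {
  const link = block.querySelector("#original_link");
  const image = block.querySelector(".recipeImage");
  const rating = parseInt(textOf(block.querySelector("#ratingValue")), 10);

  return {
    title: textOf(block.querySelector("#name")),
    sourceUrl: link ? link.getAttribute("href") : null, // assumed attribute
    imagePath: image ? image.getAttribute("src") : null, // assumed attribute
    tags: textsOf(block, ".recipeCategory"),
    description: textOf(block.querySelector("#description")),
    made: block.querySelector("#made_this") !== null, // assumed: presence means made
    rating: rating >= 1 && rating <= 5 ? rating : null, // NaN falls through to null
    servings: textOf(block.querySelector("#recipeYield")),
    ingredients: textsOf(block, ".recipeIngredient"),
    steps: textsOf(block, ".instruction"),
    notes: textsOf(block, ".recipeNote"),
  };
}
```

The standard DOM `Element` is expected to satisfy `ElementLike`, so the same function should work on the results of `document.querySelectorAll(".recipe")` in a browser or on a server-side DOM implementation.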
---
## Implementation Strategy
### Recommended Approach: HTML Parser Primary
**Rationale:**
- The HTML export carries more data than TXT (images, ratings, descriptions)
- Single file = easier batch import
- Well-structured semantic markup
- Images already linked
**Fallback:** TXT parser for edge cases
### Parser Architecture
```
ImportService
├── CopyMeThatHtmlParser
│   ├── parseRecipes(html: string): Recipe[]
│   ├── extractRecipeBlocks(html: string): HTMLElement[]
│   └── parseRecipeBlock(block: HTMLElement): Recipe
└── CopyMeThatTxtParser (optional fallback)
    └── parseTxtFile(content: string): Recipe
```
### API Endpoint Design
```
POST /api/recipes/import/copymethat
Content-Type: multipart/form-data
Request:
- file: recipes.html OR multiple .txt files
- options: { skipDuplicates: boolean, importImages: boolean }
Response:
{
  success: true,
  data: {
    imported: 45,
    skipped: 3,
    failed: 2,
    recipes: [...] // preview
  }
}
```
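The request options and response body above could be typed and assembled as in the sketch below. It assumes `options` arrives as a JSON string in a form field and that both flags default to `true` when absent; neither is specified in the design. `success` is set to `true` whenever the batch was processed, which is one reading of the example, where two recipes failed and `success` is still `true`. All type and function names are hypothetical.

```typescript
type ImportOutcome = "imported" | "skipped" | "failed";

interface ImportOptions {
  skipDuplicates: boolean;
  importImages: boolean;
}

interface ImportItemResult<T> {
  outcome: ImportOutcome;
  recipe: T | null; // null when the item could not be parsed
}

interface ImportResponse<T> {
  success: boolean;
  data: { imported: number; skipped: number; failed: number; recipes: T[] };
}

// Parse the `options` form field, falling back to assumed defaults.
function parseImportOptions(raw: string | undefined): ImportOptions {
  const defaults: ImportOptions = { skipDuplicates: true, importImages: true };
  if (!raw) return defaults;
  try {
    const parsed = JSON.parse(raw);
    return {
      skipDuplicates:
        typeof parsed.skipDuplicates === "boolean" ? parsed.skipDuplicates : defaults.skipDuplicates,
      importImages:
        typeof parsed.importImages === "boolean" ? parsed.importImages : defaults.importImages,
    };
  } catch {
    // Malformed or non-object JSON: ignore it and use the defaults.
    return defaults;
  }
}

// Tally per-recipe outcomes and keep a short preview of imported recipes.
function buildImportResponse<T>(
  results: ImportItemResult<T>[],
  previewLimit: number = 5
): ImportResponse<T> {
  const data = { imported: 0, skipped: 0, failed: 0, recipes: [] as T[] };
  for (const result of results) {
    data[result.outcome] += 1;
    if (
      result.outcome === "imported" &&
      result.recipe !== null &&
      data.recipes.length < previewLimit
    ) {
      data.recipes.push(result.recipe);
    }
  }
  return { success: true, data };
}
```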
---
## Data Mapping
| CopyMeThat Field | Recipe Schema Field | Notes |
|------------------|---------------------|-------|
| `#name` | `title` | Direct mapping |
| `#original_link` | `source_url` | Direct mapping |
| `#description` | `description` | Direct mapping |
| `.recipeCategory` | `tags` | Parse into tag array |
| `#recipeYield` | `servings` | Extract number if possible |
| `.recipeIngredient` | `ingredients[].item` | Plain text list |
| `.instruction` | `steps[].instruction` | Numbered list |
| `.recipeNote` | `notes` | Needs schema extension |
| `.recipeImage` | `image_url` | Copy to app storage |
| `#made_this` | `made` | Boolean flag; needs schema extension |
| `#ratingValue` | `rating` | 1-5 rating; needs schema extension |
### Schema Extensions Needed
- `made: boolean` — User has cooked this
- `rating: number` — 1-5 stars
- `notes: string` — General notes field
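The mapping table and the three proposed extensions can be sketched as a target type plus a conversion function. The `RawFields` input shape is hypothetical, as are the nullable types on `servings`, `rating`, and `image_url`. "Extract number if possible" is interpreted here as taking the first integer in the yield text.

```typescript
// Target shape following the mapping table, including the proposed
// `made`, `rating`, and `notes` extensions. Exact types are assumptions.
interface MappedRecipe {
  title: string;
  source_url: string | null;
  description: string;
  tags: string[];
  servings: number | null;
  ingredients: { item: string }[];
  steps: { instruction: string }[];
  image_url: string | null;
  made: boolean;
  rating: number | null;
  notes: string;
}

// Hypothetical raw values as read from one `.recipe` block.
interface RawFields {
  name: string;
  originalLink: string | null;
  description: string;
  categories: string[];
  recipeYield: string;
  ingredients: string[];
  instructions: string[];
  notes: string[];
  image: string | null;
  madeThis: boolean;
  ratingValue: string;
}

// First integer in the yield text, or null when there is none.
function parseServings(raw: string): number | null {
  const match = /\d+/.exec(raw);
  return match ? parseInt(match[0], 10) : null;
}

// Ratings outside 1-5 (or non-numeric text) are treated as missing.
function parseRating(raw: string): number | null {
  const value = parseInt(raw, 10);
  return value >= 1 && value <= 5 ? value : null;
}

function mapToRecipe(raw: RawFields): MappedRecipe {
  return {
    title: raw.name,
    source_url: raw.originalLink,
    description: raw.description,
    tags: raw.categories,
    servings: parseServings(raw.recipeYield),
    ingredients: raw.ingredients.map((item) => ({ item })),
    steps: raw.instructions.map((instruction) => ({ instruction })),
    image_url: raw.image, // export-relative path; copying to app storage is a separate step
    made: raw.madeThis,
    rating: parseRating(raw.ratingValue),
    notes: raw.notes.join("\n"),
  };
}
```

Taking the first integer is a simplification: a yield such as "Makes 2 dozen" would come out as 2, so the original yield text may be worth keeping alongside the number.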
---
## Edge Cases to Handle
1. **Duplicate detection** — Match on title + source_url
2. **Missing fields** — Title/ingredients/steps are required
3. **Image handling** — Copy images or store paths?
4. **Encoding** — UTF-8 special characters
5. **HTML entities** — `&amp;`, `&quot;`, etc.
6. **Large batches** — Memory limits for 100+ recipes
7. **Malformed HTML** — Graceful degradation
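Edge cases 1 and 2 could be handled in a single triage pass before anything is written, as sketched below. The normalisation is an assumption: titles are compared case-insensitively after trimming, and URLs are compared exactly after trimming. The `Candidate` shape and all function names are hypothetical.

```typescript
// Hypothetical minimal view of a parsed recipe for triage purposes.
interface Candidate {
  title: string;
  source_url: string | null;
  ingredients: string[];
  steps: string[];
}

interface TriageResult {
  accepted: Candidate[];
  duplicates: Candidate[];
  invalid: Candidate[];
}

// Key for duplicate detection: normalised title plus source URL.
function duplicateKey(recipe: Candidate): string {
  const title = recipe.title.trim().toLowerCase();
  const url = (recipe.source_url || "").trim();
  return title + "\n" + url;
}

// Names of required fields that are empty.
function missingRequiredFields(recipe: Candidate): string[] {
  const missing: string[] = [];
  if (recipe.title.trim().length === 0) missing.push("title");
  if (recipe.ingredients.length === 0) missing.push("ingredients");
  if (recipe.steps.length === 0) missing.push("steps");
  return missing;
}

// Split candidates into accepted, duplicate, and invalid groups.
// `existingKeys` holds keys of recipes already stored in the app.
function triageCandidates(candidates: Candidate[], existingKeys: string[]): TriageResult {
  const seen: Record<string, boolean> = {};
  for (const key of existingKeys) seen[key] = true;

  const result: TriageResult = { accepted: [], duplicates: [], invalid: [] };
  for (const candidate of candidates) {
    if (missingRequiredFields(candidate).length > 0) {
      result.invalid.push(candidate);
      continue;
    }
    const key = duplicateKey(candidate);
    if (seen[key] === true) {
      result.duplicates.push(candidate);
      continue;
    }
    seen[key] = true; // also catches duplicates within the same batch
    result.accepted.push(candidate);
  }
  return result;
}
```

The three groups line up with the `imported`, `skipped`, and `failed` counts in the response, assuming duplicates are skipped and invalid recipes are reported as failures.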
---
## Next Steps (Phase 2)
1. Extend Recipe schema with `made`, `rating`, `notes` fields
2. Implement `CopyMeThatHtmlParser` service
3. Create `POST /api/recipes/import/copymethat` endpoint
4. Add multipart file upload handler
5. Unit tests for parser
6. Integration tests for endpoint
---
**Status:** ✅ Analysis complete, ready for implementation