# Phase 1: CopyMeThat Format Analysis

**Completed:** 2026-03-28 23:52 EDT
**Analyzed by:** Cleo

---

## File Structure

### TXT Export
- **Location:** `data/exports/Copy_Me_That_TXT_20260328_58775_z1p5lpjsgz/`
- **Format:** One `.txt` file per recipe
- **Naming:** Snake-case recipe names (e.g., `apple_crowned_coffee_cake.txt`)
- **Estimated count:** ~50-100+ recipes

### HTML Export
- **Location:** `data/exports/Copy_Me_That_HTML_20260328_58775_z1p5lpjsgz/`
- **Format:** Single `recipes.html` file containing ALL recipes
- **Images:** `/images/` subfolder with recipe photos
- **Structure:** Semantic HTML with consistent IDs

---

## TXT Format Specification

### Structure
```
[Recipe Title]

Adapted from [URL]

tags: [Tag1], [Tag2], [Tag3]

[Optional: "I made this."]

Servings: [serving info]

INGREDIENTS

[ingredient 1]
[ingredient 2]
...

STEPS

1) [step 1]

2) [step 2]

...

NOTES

[optional notes]
```

### Example
```
Apple-Crowned Coffee Cake

Adapted from http://www.kraftcanada.com/recipes/apple-crowned-coffee-cake-191423

tags: Cake, Dessert

I made this.

Servings: 16 servings, 1 piece (76 g) each

INGREDIENTS

2 cups flour
2 Tbsp. granulated sugar
...

STEPS

1) Heat oven to 375°F.

2) Combine flour...

NOTES

If the glaze is too thick...
```

### Key Fields
- **Title:** First line
- **Source URL:** "Adapted from [URL]"
- **Tags:** Comma-separated after "tags:"
- **Made flag:** Presence of "I made this."
- **Servings:** After "Servings:"
- **Ingredients:** Plain list between "INGREDIENTS" and "STEPS"
- **Instructions:** Numbered list after "STEPS"
- **Notes:** Optional, after "NOTES"
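
The field rules above translate fairly directly into a line-oriented parser. The following is a minimal sketch rather than the project's implementation. The `TxtRecipe` shape and its field names are hypothetical, and the sketch assumes that `INGREDIENTS`, `STEPS`, and `NOTES` always appear alone on their own lines and that an unnumbered line inside `STEPS` continues the previous step.

```typescript
// Hypothetical result shape; field names are illustrative, not the app's schema.
interface TxtRecipe {
  title: string;
  sourceUrl: string | null;
  tags: string[];
  made: boolean;
  servings: string | null;
  ingredients: string[];
  steps: string[];
  notes: string[];
}

type Section = "header" | "ingredients" | "steps" | "notes";

function parseTxtFile(content: string): TxtRecipe {
  // Split on LF or CRLF, trim each line, and drop blank separator lines.
  const lines = content
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const recipe: TxtRecipe = {
    title: lines[0] ?? "", // Title: first non-blank line
    sourceUrl: null,
    tags: [],
    made: false,
    servings: null,
    ingredients: [],
    steps: [],
    notes: [],
  };

  let section: Section = "header";
  for (const line of lines.slice(1)) {
    // Section headers switch the parsing mode.
    if (line === "INGREDIENTS") { section = "ingredients"; continue; }
    if (line === "STEPS") { section = "steps"; continue; }
    if (line === "NOTES") { section = "notes"; continue; }

    if (section === "ingredients") {
      recipe.ingredients.push(line);
    } else if (section === "steps") {
      const numbered = /^\d+\)\s*(.*)$/.exec(line);
      if (numbered) {
        recipe.steps.push(numbered[1] ?? ""); // strip the "1) " prefix
      } else if (recipe.steps.length > 0) {
        // Assumption: an unnumbered line continues the previous step.
        const last = recipe.steps.length - 1;
        recipe.steps[last] = (recipe.steps[last] ?? "") + " " + line;
      } else {
        recipe.steps.push(line);
      }
    } else if (section === "notes") {
      recipe.notes.push(line);
    } else if (line.startsWith("Adapted from ")) {
      recipe.sourceUrl = line.slice("Adapted from ".length).trim();
    } else if (line.startsWith("tags:")) {
      recipe.tags = line
        .slice("tags:".length)
        .split(",")
        .map((tag) => tag.trim())
        .filter((tag) => tag.length > 0);
    } else if (line === "I made this.") {
      recipe.made = true;
    } else if (line.startsWith("Servings:")) {
      recipe.servings = line.slice("Servings:".length).trim();
    }
  }
  return recipe;
}
```

Header lines that match none of the known prefixes are ignored here, which keeps the sketch tolerant of fields this analysis has not catalogued.
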

---

## HTML Format Specification

### Structure
Single HTML file with repeated `.recipe` div blocks:

```html
<div class="recipe">
  <div id="name">Recipe Title</div>
  <div id="link">
    Adapted from <a id="original_link" href="...">URL</a>
  </div>
  <img class="recipeImage" src="images/filename.jpg"/>
  <div id="categories">
    <span class="recipeCategory">Tag1</span>
    <span class="recipeCategory">Tag2</span>
  </div>
  <div id="description">Description text</div>
  <div id="extra_info">
    <span id="made_this">I made this.</span>
    <span id="rating">Rated <span id="ratingValue">3</span>/5</span>
  </div>
  <div id="servings">
    Servings: <a id="recipeYield">8 servings...</a>
  </div>
  <ul id="recipeIngredients">
    <li class="recipeIngredient">ingredient 1</li>
    ...
  </ul>
  <ol id="recipeInstructions">
    <li class="instruction" value="1">step 1</li>
    ...
  </ol>
  <div id="recipeNotes">
    <div class="recipeNote">note text</div>
  </div>
</div>
```

### Key Selectors
- `.recipe`: Recipe container
- `#name`: Title
- `#original_link`: Source URL
- `.recipeImage`: Image path
- `.recipeCategory`: Tags
- `#description`: Description
- `#made_this`: Made flag
- `#ratingValue`: Rating (1-5)
- `#recipeYield`: Servings
- `.recipeIngredient`: Ingredients (list items)
- `.instruction`: Steps (ordered list items)
- `.recipeNote`: Notes
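
Because every `.recipe` block reuses the same `id` values, each selector has to be resolved against a single block rather than the whole document. The sketch below shows one way to do that. `ElementLike` and `HtmlRecipe` are hypothetical names introduced for illustration: `ElementLike` is the small subset of the DOM `Element` API the sketch needs, which a browser `Element` should satisfy structurally, while a Node environment would need a DOM library to supply equivalent objects.

```typescript
// Hypothetical minimal subset of the DOM Element API used by this sketch.
interface ElementLike {
  textContent: string | null;
  getAttribute(name: string): string | null;
  querySelector(selector: string): ElementLike | null;
  querySelectorAll(selector: string): ArrayLike<ElementLike>;
}

// Hypothetical result shape; field names are illustrative.
interface HtmlRecipe {
  title: string;
  sourceUrl: string | null;
  imagePath: string | null;
  tags: string[];
  description: string | null;
  made: boolean;
  rating: number | null;
  servings: string | null;
  ingredients: string[];
  steps: string[];
  notes: string[];
}

// Trimmed text of the first match inside `root`, or null if absent or empty.
function firstText(root: ElementLike, selector: string): string | null {
  const el = root.querySelector(selector);
  const value = el && el.textContent ? el.textContent.trim() : "";
  return value.length > 0 ? value : null;
}

// Trimmed, non-empty text of every match inside `root`.
function allTexts(root: ElementLike, selector: string): string[] {
  const found = root.querySelectorAll(selector);
  const out: string[] = [];
  for (let i = 0; i < found.length; i++) {
    const el = found[i];
    const value = el && el.textContent ? el.textContent.trim() : "";
    if (value.length > 0) out.push(value);
  }
  return out;
}

function parseRecipeBlock(block: ElementLike): HtmlRecipe {
  const link = block.querySelector("#original_link");
  const image = block.querySelector(".recipeImage");
  const rating = Number(firstText(block, "#ratingValue") ?? "NaN");
  return {
    title: firstText(block, "#name") ?? "",
    sourceUrl: link ? link.getAttribute("href") : null,
    imagePath: image ? image.getAttribute("src") : null,
    tags: allTexts(block, ".recipeCategory"),
    description: firstText(block, "#description"),
    made: block.querySelector("#made_this") !== null,
    // Only accept whole numbers in the documented 1-5 range.
    rating: Number.isInteger(rating) && rating >= 1 && rating <= 5 ? rating : null,
    servings: firstText(block, "#recipeYield"),
    ingredients: allTexts(block, ".recipeIngredient"),
    steps: allTexts(block, ".instruction"),
    notes: allTexts(block, ".recipeNote"),
  };
}
```

In a browser context, a call along the lines of `Array.from(document.querySelectorAll(".recipe")).map((el) => parseRecipeBlock(el))` should cover the whole export, again assuming the markup matches the structure documented above.
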

---

## Implementation Strategy

### Recommended Approach: HTML Parser Primary
**Rationale:**
- The HTML export carries fields the TXT export lacks (images, ratings, descriptions)
- A single file simplifies batch import
- Well-structured semantic markup
- Images already linked

**Fallback:** TXT parser for edge cases

### Parser Architecture
```
ImportService
├── CopyMeThatHtmlParser
│   ├── parseRecipes(html: string): Recipe[]
│   ├── extractRecipeBlocks(html: string): HTMLElement[]
│   └── parseRecipeBlock(block: HTMLElement): Recipe
└── CopyMeThatTxtParser (optional fallback)
    └── parseTxtFile(content: string): Recipe
```
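
One way `ImportService` could route uploaded files to the two parsers is by file extension. This is only a sketch of that routing step: `selectParser`, `planImport`, and `ImportPlan` are hypothetical names, and treating `.htm` as HTML is an assumption, since the export analyzed here only contains `recipes.html`.

```typescript
type ParserKind = "html" | "txt";

// Pick a parser from the file extension; null means the file is not recognized.
function selectParser(filename: string): ParserKind | null {
  const lower = filename.toLowerCase();
  if (lower.endsWith(".html") || lower.endsWith(".htm")) return "html";
  if (lower.endsWith(".txt")) return "txt";
  return null;
}

// Hypothetical grouping of an upload by the parser each file needs.
interface ImportPlan {
  html: string[];
  txt: string[];
  rejected: string[];
}

function planImport(filenames: string[]): ImportPlan {
  const plan: ImportPlan = { html: [], txt: [], rejected: [] };
  for (const name of filenames) {
    const kind = selectParser(name);
    if (kind === null) plan.rejected.push(name);
    else plan[kind].push(name);
  }
  return plan;
}
```
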

### API Endpoint Design
```
POST /api/recipes/import/copymethat
Content-Type: multipart/form-data

Request:
- file: recipes.html OR multiple .txt files
- options: { skipDuplicates: boolean, importImages: boolean }

Response:
{
  success: true,
  data: {
    imported: 45,
    skipped: 3,
    failed: 2,
    recipes: [...] // preview
  }
}
```
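
The response shape above can be pinned down with types and a small aggregation helper. This is a sketch under assumptions: `ImportOutcome` and `buildImportResponse` are hypothetical names, the `recipes` preview is assumed to hold imported titles only, and `success` is kept `true` for partial failures because the example reports `failed: 2` alongside `success: true`.

```typescript
// Hypothetical per-recipe outcome recorded while importing.
type ImportOutcome =
  | { status: "imported"; title: string }
  | { status: "skipped"; title: string }
  | { status: "failed"; title: string; error: string };

interface ImportResponse {
  success: boolean;
  data: {
    imported: number;
    skipped: number;
    failed: number;
    recipes: string[]; // preview; titles only is an assumption of this sketch
  };
}

function buildImportResponse(
  outcomes: ImportOutcome[],
  previewLimit: number = 10
): ImportResponse {
  const data = { imported: 0, skipped: 0, failed: 0, recipes: [] as string[] };
  for (const outcome of outcomes) {
    data[outcome.status] += 1; // tally by status
    if (outcome.status === "imported" && data.recipes.length < previewLimit) {
      data.recipes.push(outcome.title);
    }
  }
  // The request itself succeeded even if individual recipes failed.
  return { success: true, data };
}
```
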

---

## Data Mapping

| CopyMeThat Field | Recipe Schema Field | Notes |
|------------------|---------------------|-------|
| `#name` | `title` | Direct mapping |
| `#original_link` | `source_url` | Direct mapping |
| `#description` | `description` | Direct mapping |
| `.recipeCategory` | `tags` | Parse into tag array |
| `#recipeYield` | `servings` | Extract number if possible |
| `.recipeIngredient` | `ingredients[].item` | Plain text list |
| `.instruction` | `steps[].instruction` | Numbered list |
| `.recipeNote` | `notes` (proposed) | Requires schema extension |
| `.recipeImage` | `image_url` | Copy to app storage |
| `#made_this` | `made` (proposed) | Boolean flag; requires schema extension |
| `#ratingValue` | `rating` (proposed) | 1-5 rating; requires schema extension |

### Schema Extensions Needed
- `made: boolean`: User has cooked this
- `rating: number`: 1-5 stars
- `notes: string`: General notes field
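
Taken together, the mapping table and the proposed extensions suggest a conversion step like the one sketched below. The `Recipe` interface here is a guess at the target schema built only from the field names in the table, `ParsedFields` is a hypothetical parser output, and `rating` is typed as `number | null` on the assumption that unrated recipes exist. "Extract number if possible" is interpreted as taking the first integer in the yield text.

```typescript
// Hypothetical parser output, named after the CopyMeThat selectors.
interface ParsedFields {
  name: string;
  originalLink: string | null;
  description: string | null;
  categories: string[];
  recipeYield: string | null;
  ingredients: string[];
  instructions: string[];
  notes: string[];
  imagePath: string | null;
  madeThis: boolean;
  ratingValue: number | null;
}

// Assumed target schema, reconstructed from the mapping table.
interface Recipe {
  title: string;
  source_url: string | null;
  description: string | null;
  tags: string[];
  servings: number | null;
  ingredients: { item: string }[];
  steps: { instruction: string }[];
  image_url: string | null;
  made: boolean;          // proposed extension
  rating: number | null;  // proposed extension; null when unrated (assumption)
  notes: string;          // proposed extension
}

// First integer in the yield text, or null when there is none.
function parseServings(text: string | null): number | null {
  if (text === null) return null;
  const match = /\d+/.exec(text);
  return match ? Number(match[0]) : null;
}

function toRecipe(p: ParsedFields): Recipe {
  return {
    title: p.name,
    source_url: p.originalLink,
    description: p.description,
    tags: p.categories,
    servings: parseServings(p.recipeYield),
    ingredients: p.ingredients.map((item) => ({ item })),
    steps: p.instructions.map((instruction) => ({ instruction })),
    // The export-relative path is passed through; copying to app storage is out of scope here.
    image_url: p.imagePath,
    made: p.madeThis,
    rating: p.ratingValue,
    notes: p.notes.join("\n\n"), // multiple notes collapse into one field
  };
}
```

For the documented example, `parseServings("16 servings, 1 piece (76 g) each")` yields `16`. Yield text that leads with a different number would be misread, so the raw string may be worth keeping alongside the parsed value.
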

---

## Edge Cases to Handle

1. **Duplicate detection:** Match on title + `source_url`
2. **Missing fields:** Title/ingredients/steps are required
3. **Image handling:** Copy images or store paths?
4. **Encoding:** UTF-8 special characters
5. **HTML entities:** `&amp;`, `&quot;`, etc.
6. **Large batches:** Memory limits for 100+ recipes
7. **Malformed HTML:** Graceful degradation
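
Three of these cases (duplicate detection, required fields, and HTML entities) are small enough to sketch as helpers. All function names below are hypothetical, and the normalization rules (case-insensitive titles, trailing slashes ignored on URLs) are assumptions rather than decisions recorded in this analysis. The entity decoder only covers a handful of named entities plus numeric references, and it is only relevant where text does not pass through a DOM parser, which decodes entities itself.

```typescript
// Edge case 1: comparison key built from title + source_url.
function duplicateKey(title: string, sourceUrl: string | null): string {
  const normTitle = title.trim().toLowerCase().replace(/\s+/g, " ");
  const normUrl = (sourceUrl ?? "").trim().replace(/\/+$/, "");
  return normTitle + "\n" + normUrl;
}

// Edge case 2: names of required fields that are empty.
function missingRequiredFields(recipe: {
  title: string;
  ingredients: string[];
  steps: string[];
}): string[] {
  const missing: string[] = [];
  if (recipe.title.trim().length === 0) missing.push("title");
  if (recipe.ingredients.length === 0) missing.push("ingredients");
  if (recipe.steps.length === 0) missing.push("steps");
  return missing;
}

// Edge case 5: a deliberately small entity table; unknown names are left untouched.
const NAMED_ENTITIES = new Map<string, string>([
  ["amp", "&"],
  ["quot", '"'],
  ["apos", "'"],
  ["lt", "<"],
  ["gt", ">"],
  ["nbsp", "\u00a0"],
]);

function decodeEntities(text: string): string {
  return text.replace(
    /&(#[xX][0-9a-fA-F]+|#\d+|[a-zA-Z]+);/g,
    (match: string, body: string) => {
      if (body.charAt(0) === "#") {
        const hex = body.charAt(1) === "x" || body.charAt(1) === "X";
        const code = parseInt(body.slice(hex ? 2 : 1), hex ? 16 : 10);
        // Leave out-of-range references as-is instead of throwing.
        return code <= 0x10ffff ? String.fromCodePoint(code) : match;
      }
      const named = NAMED_ENTITIES.get(body);
      return named !== undefined ? named : match;
    }
  );
}
```

Because the replacement runs in a single pass, already-escaped text such as `&amp;lt;` decodes one level to `&lt;` rather than all the way to `<`.
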

---

## Next Steps (Phase 2)

1. Extend Recipe schema with `made`, `rating`, `notes` fields
2. Implement `CopyMeThatHtmlParser` service
3. Create `POST /api/recipes/import/copymethat` endpoint
4. Add multipart file upload handler
5. Unit tests for parser
6. Integration tests for endpoint

---

**Status:** ✅ Analysis complete, ready for implementation