21 KiB
FTS5 Virtual Table Schema & Trigger Plan — Design Spec
Task: T1: Design FTS5 virtual table schema + trigger plan for Recipe Manager
Created: 2026-03-30
Target: Implementation blueprint for T2 (migration)
Schema baseline: src/backend/db/schema.sql (2026-03-28 MVP normalized)
1. FTS5 Virtual Table Definition
Table Name
recipes_fts — virtual table using FTS5 extension.
Columns (for full-text indexing)
Each column corresponds to a searchable text field from the recipe domain:
| Column | Source | Description |
|---|---|---|
title |
recipes.title |
Recipe title (primary identifier) |
description |
recipes.description |
Recipe description/brief |
ingredient_item |
ingredients.item aggregated |
Ingredient names (e.g., "chicken", "flour") |
ingredient_notes |
ingredients.notes aggregated |
Ingredient-specific notes (e.g., " diced", "to taste") |
step_instruction |
steps.instruction aggregated |
Cooking step instructions |
tag_name |
tags.name aggregated via recipe_tags |
Tag names associated with the recipe |
Tokenizer Configuration
CREATE VIRTUAL TABLE recipes_fts
USING fts5(
title,
description,
ingredient_item,
ingredient_notes,
step_instruction,
tag_name,
tokenize='porter unicode61'
);
Rationale:
porterstemming: reduces words to root forms (e.g., "chopped" → "chop", "chicken" stays "chicken", "sautéed" → "saut"). Improves recall for cooking terminology.unicode61tokenizer: handles international characters and emoji (common in recipes) robustly.- Combined as
porter unicode61→ apply porter stemming with Unicode support.
Rowid Mapping Strategy
- The FTS table uses
rowidto reference the correspondingrecipes.id. - We will store
recipes.idas therowiddirectly:INSERT INTO recipes_fts(rowid, ...) VALUES(recipe_id, ...). - This enables efficient joins:
SELECT r.* FROM recipes r JOIN recipes_fts fts ON r.id = fts.rowid WHERE fts MATCH ?.
2. Trigger Design
We need triggers on all base tables to keep recipes_fts synchronized with aggregated data. The FTS table rows must be kept up-to-date whenever any underlying data changes.
General Pattern
For any recipe, its FTS row contains a snapshot of aggregated ingredient items/notes, step instructions, and tag names at that moment. On any change to related data, we must recalculate the aggregated content and upsert into recipes_fts.
Helper SQL Functions
We recommend creating SQL functions to encapsulate the aggregation logic:
-- Aggregate ingredient text for a recipe
CREATE OR REPLACE FUNCTION fts_aggregate_ingredients(recipe_id INTEGER)
RETURNS TEXT DETERMINISTIC
BEGIN
SELECT GROUP_CONCAT(item || COALESCE(' ' || notes, ''), ' ')
FROM ingredients
WHERE recipe_id = recipe_id;
END;
-- Aggregate step instructions for a recipe
CREATE OR REPLACE FUNCTION fts_aggregate_steps(recipe_id INTEGER)
RETURNS TEXT DETERMINISTIC
BEGIN
SELECT GROUP_CONCAT(instruction, ' ')
FROM steps
WHERE recipe_id = recipe_id
ORDER BY position;
END;
-- Aggregate tag names for a recipe
CREATE OR REPLACE FUNCTION fts_aggregate_tags(recipe_id INTEGER)
RETURNS TEXT DETERMINISTIC
BEGIN
SELECT GROUP_CONCAT(t.name, ' ')
FROM tags t
JOIN recipe_tags rt ON t.id = rt.tag_id
WHERE rt.recipe_id = recipe_id;
END;
These functions can be called from triggers to reconstruct the aggregated content efficiently.
Trigger Definitions
2.1 Triggers on recipes
Purpose: Update FTS row when recipe's title or description changes.
recipes_after_insert: Insert initial FTS row for new recipe.recipes_after_update: Update FTS row if title/description changed.recipes_after_delete: Delete FTS row when recipe is removed.
-- AFTER INSERT on recipes
CREATE TRIGGER recipes_after_insert
AFTER INSERT ON recipes
BEGIN
INSERT INTO recipes_fts(rowid, title, description, ingredient_item, ingredient_notes, step_instruction, tag_name)
VALUES (
NEW.id,
NEW.title,
NEW.description,
fts_aggregate_ingredients(NEW.id),
fts_aggregate_ingredients(NEW.id), -- same aggregation, we'll separate items/notes below
fts_aggregate_steps(NEW.id),
fts_aggregate_tags(NEW.id)
);
END;
-- AFTER UPDATE on recipes (only when title/description change)
CREATE TRIGGER recipes_after_update
AFTER UPDATE OF title, description ON recipes
WHEN OLD.title != NEW.title OR OLD.description != NEW.description
BEGIN
UPDATE recipes_fts
SET
title = NEW.title,
description = NEW.description
WHERE rowid = OLD.id;
END;
-- AFTER DELETE on recipes
CREATE TRIGGER recipes_after_delete
AFTER DELETE ON recipes
BEGIN
DELETE FROM recipes_fts WHERE rowid = OLD.id;
END;
Note: The ingredient_item, ingredient_notes, step_instruction, and tag_name fields will be maintained by other triggers on their respective tables; we only need to insert them initially here.
2.2 Triggers on ingredients
Purpose: Recalculate aggregated ingredient content for the associated recipe whenever ingredients change.
- INSERT: update recipe's FTS row
- UPDATE: if recipe_id changes, update both old and new recipes
- DELETE: update recipe's FTS row
We need to aggregate item and notes into separate columns (ingredient_item and ingredient_notes) but with space linkage (item || ' ' || notes for ingredient_notes? Actually spec says ingredient_notes column should contain notes only? Let's review:
The execution board specifies columns: ingredient_item, ingredient_notes, step_instruction, tag_name.
Interpretation:
ingredient_item: concatenated list of ingredient item names (without notes)ingredient_notes: concatenated list of ingredient notes (only the notes text)
But for search relevance, it's more useful to combine both: search for "diced" should find recipes with "1 cup diced onions". If we separate them, the note "diced" may not match the item column. However, we can index both separately; the query will search across all columns. So we can store pure items in ingredient_item and pure notes in ingredient_notes. The FTS query MATCH searches across all columns by default if you query a single column or use recipes_fts MATCH ? without column spec. But we might want to weight columns differently? That's an advanced optimization not required now.
Given the spec, I'll implement:
ingredient_item:GROUP_CONCAT(item, ' ')ingredient_notes:GROUP_CONCAT(notes, ' ')(excluding nulls)
-- Helper to get aggregated ingredient item text
CREATE OR REPLACE FUNCTION fts_aggregate_ingredient_items(recipe_id INTEGER)
RETURNS TEXT DETERMINISTIC
BEGIN
SELECT GROUP_CONCAT(item, ' ')
FROM ingredients
WHERE recipe_id = recipe_id;
END;
-- Helper to get aggregated ingredient notes text
CREATE OR REPLACE FUNCTION fts_aggregate_ingredient_notes(recipe_id INTEGER)
RETURNS TEXT DETERMINISTIC
BEGIN
SELECT GROUP_CONCAT(notes, ' ')
FROM ingredients
WHERE recipe_id = recipe_id AND notes IS NOT NULL AND notes != '';
END;
-- AFTER INSERT on ingredients
CREATE TRIGGER ingredients_after_insert
AFTER INSERT ON ingredients
BEGIN
UPDATE recipes_fts
SET
ingredient_item = fts_aggregate_ingredient_items(NEW.recipe_id),
ingredient_notes = fts_aggregate_ingredient_notes(NEW.recipe_id)
WHERE rowid = NEW.recipe_id;
-- If recipe doesn't exist in FTS yet (e.g., recipe added earlier without trigger), insert it
INSERT INTO recipes_fts(rowid, title, description, ingredient_item, ingredient_notes, step_instruction, tag_name)
SELECT
r.id,
r.title,
r.description,
fts_aggregate_ingredient_items(r.id),
fts_aggregate_ingredient_notes(r.id),
fts_aggregate_steps(r.id),
fts_aggregate_tags(r.id)
FROM recipes r
WHERE r.id = NEW.recipe_id AND NOT EXISTS (SELECT 1 FROM recipes_fts WHERE rowid = r.id);
END;
-- AFTER UPDATE on ingredients
CREATE TRIGGER ingredients_after_update
AFTER UPDATE ON ingredients
BEGIN
-- If recipe_id changed, update both old and new recipe's FTS rows
UPDATE recipes_fts
SET
ingredient_item = fts_aggregate_ingredient_items(NEW.recipe_id),
ingredient_notes = fts_aggregate_ingredient_notes(NEW.recipe_id)
WHERE rowid = NEW.recipe_id;
UPDATE recipes_fts
SET
ingredient_item = fts_aggregate_ingredient_items(OLD.recipe_id),
ingredient_notes = fts_aggregate_ingredient_notes(OLD.recipe_id)
WHERE rowid = OLD.recipe_id AND OLD.recipe_id != NEW.recipe_id;
END;
-- AFTER DELETE on ingredients
CREATE TRIGGER ingredients_after_delete
AFTER DELETE ON ingredients
BEGIN
UPDATE recipes_fts
SET
ingredient_item = fts_aggregate_ingredient_items(OLD.recipe_id),
ingredient_notes = fts_aggregate_ingredient_notes(OLD.recipe_id)
WHERE rowid = OLD.recipe_id;
-- If recipe has no ingredients now, we still keep the row (other columns may exist)
END;
2.3 Triggers on steps
-- AFTER INSERT on steps
CREATE TRIGGER steps_after_insert
AFTER INSERT ON steps
BEGIN
UPDATE recipes_fts
SET step_instruction = fts_aggregate_steps(NEW.recipe_id)
WHERE rowid = NEW.recipe_id;
INSERT INTO recipes_fts(rowid, title, description, ingredient_item, ingredient_notes, step_instruction, tag_name)
SELECT
r.id,
r.title,
r.description,
fts_aggregate_ingredient_items(r.id),
fts_aggregate_ingredient_notes(r.id),
fts_aggregate_steps(r.id),
fts_aggregate_tags(r.id)
FROM recipes r
WHERE r.id = NEW.recipe_id AND NOT EXISTS (SELECT 1 FROM recipes_fts WHERE rowid = r.id);
END;
-- AFTER UPDATE on steps
CREATE TRIGGER steps_after_update
AFTER UPDATE ON steps
BEGIN
UPDATE recipes_fts
SET step_instruction = fts_aggregate_steps(NEW.recipe_id)
WHERE rowid = NEW.recipe_id;
UPDATE recipes_fts
SET step_instruction = fts_aggregate_steps(OLD.recipe_id)
WHERE rowid = OLD.recipe_id AND OLD.recipe_id != NEW.recipe_id;
END;
-- AFTER DELETE on steps
CREATE TRIGGER steps_after_delete
AFTER DELETE ON steps
BEGIN
UPDATE recipes_fts
SET step_instruction = fts_aggregate_steps(OLD.recipe_id)
WHERE rowid = OLD.recipe_id;
END;
2.4 Triggers on tags and recipe_tags
Because tags are many-to-many, changes to tags can affect multiple recipes:
- INSERT tag: no recipes yet, nothing to do.
- UPDATE tag name: affects all recipes linked to that tag.
- DELETE tag: affects all recipes that had that tag (via ON DELETE CASCADE on recipe_tags, but before the tag row is removed, we need to update FTS for those recipes).
For recipe_tags:
- INSERT: affects the recipe (and possibly the tag, but tag name is used). The recipe gains a new tag → need to update its
tag_namecolumn. - DELETE: recipe loses a tag → need to update its
tag_namecolumn. - UPDATE on
recipe_tags(rare): combination of delete + insert.
-- AFTER INSERT on recipe_tags
CREATE TRIGGER recipe_tags_after_insert
AFTER INSERT ON recipe_tags
BEGIN
UPDATE recipes_fts
SET tag_name = fts_aggregate_tags(NEW.recipe_id)
WHERE rowid = NEW.recipe_id;
INSERT INTO recipes_fts(rowid, title, description, ingredient_item, ingredient_notes, step_instruction, tag_name)
SELECT
r.id,
r.title,
r.description,
fts_aggregate_ingredient_items(r.id),
fts_aggregate_ingredient_notes(r.id),
fts_aggregate_steps(r.id),
fts_aggregate_tags(r.id)
FROM recipes r
WHERE r.id = NEW.recipe_id AND NOT EXISTS (SELECT 1 FROM recipes_fts WHERE rowid = r.id);
END;
-- AFTER DELETE on recipe_tags
CREATE TRIGGER recipe_tags_after_delete
AFTER DELETE ON recipe_tags
BEGIN
UPDATE recipes_fts
SET tag_name = fts_aggregate_tags(OLD.recipe_id)
WHERE rowid = OLD.recipe_id;
END;
-- AFTER UPDATE on tags (name changed) — need to update all recipes with that tag
CREATE TRIGGER tags_after_update
AFTER UPDATE OF name ON tags
BEGIN
UPDATE recipes_fts
SET tag_name = fts_aggregate_tags(rt.recipe_id)
FROM recipe_tags rt
WHERE recipes_fts.rowid = rt.recipe_id
AND rt.tag_id = NEW.id;
END;
-- BEFORE DELETE on tags — update all recipes that reference this tag before tag is removed
-- (we could also rely on ON DELETE CASCADE from recipe_tags, but we need to update FTS before the tag row is gone)
CREATE TRIGGER tags_before_delete
BEFORE DELETE ON tags
BEGIN
UPDATE recipes_fts
SET tag_name = fts_aggregate_tags(rt.recipe_id)
FROM recipe_tags rt
WHERE recipes_fts.rowid = rt.recipe_id
AND rt.tag_id = OLD.id;
END;
Note: SQLite's UPDATE ... FROM is available in modern versions (3.33.0+). If not supported in the project's SQLite version, we can rewrite using subqueries. The migration implementation (T2) should check SQLite version and adapt accordingly.
3. Backfill Strategy
Existing recipes must be loaded into recipes_fts before the application uses it.
Migration Steps
-
Ensure FTS5 extension is available (SQLite 3.9.0+). Enable extension loading if needed:
PRAGMA foreign_keys = ON;(no special action required for FTS5; it's built-in). -
Create helper functions (
fts_aggregate_*) before creating triggers, because triggers call them. -
Create the virtual table:
CREATE VIRTUAL TABLE IF NOT EXISTS recipes_fts USING fts5( title, description, ingredient_item, ingredient_notes, step_instruction, tag_name, tokenize='porter unicode61' ); -
Backfill all existing recipes in a single transaction (for performance and consistency):
INSERT INTO recipes_fts(rowid, title, description, ingredient_item, ingredient_notes, step_instruction, tag_name) SELECT r.id, r.title, r.description, GROUP_CONCAT(i.item, ' ') as ingredient_items, GROUP_CONCAT(i.notes, ' ') as ingredient_notes, (SELECT GROUP_CONCAT(s.instruction, ' ') FROM steps s WHERE s.recipe_id = r.id) as step_instruction, (SELECT GROUP_CONCAT(t.name, ' ') FROM tags t JOIN recipe_tags rt ON t.id = rt.tag_id WHERE rt.recipe_id = r.id) as tag_name FROM recipes r LEFT JOIN ingredients i ON r.id = i.recipe_id GROUP BY r.id;Why this works: Uses
GROUP BY r.idto aggregate ingredients per recipe. Subqueries aggregate steps and tags. Handles recipes with no ingredients/steps/tags gracefully (NULL → FTS stores empty text? Need to check SQLite FTS behavior: NULL becomes empty string, fine). -
Create triggers as defined in Section 2, in order: helper functions first, then FTS table, then backfill, then triggers. But triggers reference the functions, so functions must exist before triggers are created.
-
Test backfill completeness:
SELECT COUNT(*) FROM recipes; -- should equal COUNT(*) FROM recipes_fts
Idempotency & Rollback
- Migration script should be idempotent: use
CREATE TABLE IF NOT EXISTS,CREATE TRIGGER IF NOT EXISTS, and check if backfill is needed (e.g., by checkingrecipes_ftsrow count before inserting). - For rollback: drop all triggers and the FTS table. Keep data in main tables intact.
4. Search Query Usage Pattern (for backend developer)
Once FTS5 is set up, the search endpoint should:
-- Basic query with BM25 ranking
SELECT
r.*,
bm25(recipes_fts) as rank_score
FROM recipes r
JOIN recipes_fts fts ON r.id = fts.rowid
WHERE fts MATCH ?
ORDER BY rank_score ASC -- lower bm25 is better
LIMIT ? OFFSET ?;
To get total count for pagination:
SELECT COUNT(*) FROM recipes_fts WHERE recipes_fts MATCH ?;
Phrase search: Use quotes in query: "chocolate chip" → users can type it, pass directly to MATCH.
Prefix search: Use * at end: choc* → user may type, treat as prefix.
Security: Parameterize the query string to avoid SQL injection. Use prepared statements.
5. Implementation Notes & Edge Cases
Column Aggregation Decisions
ingredient_itemvsingredient_notes: We separate them to allow independent boosting if needed later. Currently both are concatenated with spaces; nulls are excluded.- Ordering preservation (
GROUP_CONCAT): By default SQLite does not guarantee order; we can addORDER BY positionwithin aggregation if needed:GROUP_CONCAT(item, ' ')withORDER BY positionmodifier. For FTS, order matters for phrase queries but not for individual term matches. Still, it's good to preserve natural order. Modify helpers:SELECT GROUP_CONCAT(item, ' ') FROM (SELECT item FROM ingredients WHERE recipe_id = ? ORDER BY position); - Same for steps: should honor
position.
Update functions accordingly:
CREATE OR REPLACE FUNCTION fts_aggregate_ingredient_items(recipe_id INTEGER)
RETURNS TEXT DETERMINISTIC
BEGIN
SELECT GROUP_CONCAT(item, ' ')
FROM (SELECT item FROM ingredients WHERE recipe_id = recipe_id ORDER BY position);
END;
CREATE OR REPLACE FUNCTION fts_aggregate_ingredient_notes(recipe_id INTEGER)
RETURNS TEXT DETERMINISTIC
BEGIN
SELECT GROUP_CONCAT(notes, ' ')
FROM (SELECT notes FROM ingredients WHERE recipe_id = recipe_id ORDER BY position)
WHERE notes IS NOT NULL AND notes != '';
END;
CREATE OR REPLACE FUNCTION fts_aggregate_steps(recipe_id INTEGER)
RETURNS TEXT DETERMINISTIC
BEGIN
SELECT GROUP_CONCAT(instruction, ' ')
FROM (SELECT instruction FROM steps WHERE recipe_id = recipe_id ORDER BY position);
END;
Multi-Recipes Updated by Single Trigger
Triggers on tags and recipe_tags can affect many recipes (e.g., deleting a tag popular in many recipes). The UPDATE recipes_fts ... FROM recipe_tags pattern updates all affected rows in one statement, which is efficient. However, if a tag has thousands of recipes, this could be heavy; but tags are typically not that widely used. Acceptable.
Concurrency
Triggers run within the same transaction as the triggering statement. So FTS remains consistent with the main tables. No additional locking needed beyond SQLite's transaction isolation.
FTS5 Content Option
We are using the default content='' (external content) because the FTS table does not store its own copy of the full text; it references rowid only. We manage content manually via triggers. That's what we designed above.
Alternatively, could use content='recipes' and let SQLite auto-maintain content via content_triggers, but that would only index columns directly from recipes table, not aggregated from ingredients/steps/tags. So external content + explicit triggers is necessary.
Performance Considerations
- Triggers add overhead on every write to ingredients/steps/recipe_tags. But writes are relatively infrequent compared to reads; search is read-heavy.
- Backfill may be slow for large datasets (>10k recipes). Can batch if needed, but likely okay for a personal recipe manager.
- FTS5 indexes maintain inverted index; queries should be fast.
Null Handling
GROUP_CONCATreturns NULL if no rows; FTS5 stores empty string for NULL. We can coalesce:COALESCE(GROUP_CONCAT(...), '')in functions to be explicit.
6. Deliverables for T2 (Implementation Checklist)
- Create SQL migration file:
migrations/YYYY-MM-DD-add-fts5.sql - Add helper aggregation functions (deterministic, with ordering)
- Create
recipes_ftsvirtual table withporter unicode61tokenizer - Implement backfill query (single INSERT...SELECT)
- Create all triggers (recipes, ingredients, steps, tags, recipe_tags) with proper WHEN clauses to avoid unnecessary updates
- Make migration idempotent (use
IF NOT EXISTSchecks and/or guard inserts) - Add tests for trigger behavior (unit tests that simulate data changes and verify FTS content)
- Document rollback:
DROP TRIGGER IF EXISTS ...andDROP TABLE IF EXISTS recipes_fts - Verify row counts match after backfill
7. Schema Version Bump
Current schema version: 2026-03-28 (from migrations). This FTS5 addition constitutes a schema change. Recommendation: increment to 2026-03-30-fts5 or similar. Update any schemaVersion constant in application code.
8. Summary Diagram (Conceptual)
recipes (id)
↓ (triggers on recipes)
ingredients (recipe_id) → aggregate via fts_aggregate_ingredient_*()
steps (recipe_id) → aggregate via fts_aggregate_steps()
tags ← recipe_tags (recipe_id, tag_id) → aggregate via fts_aggregate_tags()
recipes_fts(rowid = recipes.id) stores searchable text from all above.
Search query: SELECT recipes.* FROM recipes JOIN recipes_fts USING(rowid) WHERE MATCH.
All modifications to base tables funnel through triggers that recompute only the affected recipe's aggregated fields and update its corresponding FTS row.
Design completed for T2 implementation.
Key equations:
- FTS row content:
{title, description, Σ(ingredient items), Σ(ingredient notes), Σ(step instructions), Σ(tag names)}
Triggers:
recipes_after_insert/update/delete→ maintain recipe FTS row (title/desc)ingredients_after_insert/update/delete→ recalc ingredient aggregatessteps_after_insert/update/delete→ recalc step aggregatesrecipe_tags_after_insert/delete→ recalc tag aggregates for recipetags_after_update(name change) → recalc tag aggregates for all recipes using that tagtags_before_delete→ recalc tag aggregates for affected recipes
Backfill: One-shot INSERT INTO recipes_fts SELECT ... GROUP BY recipes.id with left joins/subqueries.