Thingamablog v2 - Technical Design Document

Overview

Thingamablog v2 is a migration and modernization project for Paul's legacy blog data from the early 2000s Thingamablog platform. The system extracts blog posts from an obsolete HSQLDB database, cleans and structures the data, and provides a modern web interface for browsing.

Project Structure

There are two related projects:

  1. thingamablog-api (projects/thingamablog-api/): Contains the Java export tool for data extraction
  2. thingamablog-v2 (projects/thingamablog-v2/): Contains the web app for browsing the extracted data

No Git repositories are set up for these folders yet.

Problem Statement

  • Legacy Data Lock-In: Blog posts stored in HSQLDB 1.8 (obsolete Java DB format from 2000s)
  • Data Extraction Challenges: Binary format difficult to parse reliably with modern tools
  • User Experience: No easy way to browse/search 20+ years of personal blog content
  • Future Integration: Need structured data for AI/vector search (Open Brain project)

Solution Architecture

High-Level Architecture

[Legacy HSQLDB DB] → [Java Export Tool] → [blog-export.json] → [Node.js Backend] → [React Frontend]
   (Binary Data)                             (Clean JSON)         (API Server)        (Web UI)

Components

1. Data Extraction Layer (Java CLI Tool)

  • Location: projects/thingamablog-api/ExportTool.java
  • Purpose: Bridge from legacy DB to modern JSON
  • Technology: Java with HSQLDB 1.8.0.10 driver
  • Input: HSQLDB database files (database.script, database.data)
  • Output: blog-export.json with structured blog entries
  • Design Decision: Standalone CLI tool avoids Spring Boot complexity

2. Data Storage Layer (JSON File)

  • Format: Clean JSON array of blog post objects
  • Schema:
    {
      "id": 1,
      "title": "Digital Imaging Notes",
      "date": "2003-11-03 16:41:22.053",
      "author": "Paul",
      "categories": "Hobbies",
      "content": "<p>Full HTML content...</p>"
    }
    
  • Benefits: Human-readable, easily parseable, version-controllable
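A minimal sketch of how the schema above could be checked when the export is loaded. The function names are hypothetical, and the date pattern is an assumption inferred from the single sample value shown in the schema, not a documented format.

```javascript
// Hypothetical validator for entries of blog-export.json.
// Field names and types follow the schema example; the date pattern is an
// assumption based on the sample value "2003-11-03 16:41:22.053".
const DATE_PATTERN = /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(\.\d{1,3})?$/;

// Returns a list of problems for one entry; an empty list means it looks valid.
function validatePost(post) {
  if (post === null || typeof post !== 'object' || Array.isArray(post)) {
    return ['entry is not an object'];
  }
  const errors = [];
  if (!Number.isInteger(post.id)) errors.push('id must be an integer');
  for (const field of ['title', 'date', 'author', 'categories', 'content']) {
    if (typeof post[field] !== 'string') errors.push(`${field} must be a string`);
  }
  if (typeof post.date === 'string' && !DATE_PATTERN.test(post.date)) {
    errors.push('date does not match YYYY-MM-DD HH:MM:SS.mmm');
  }
  return errors;
}

// Returns only the entries that failed, tagged with their array index.
function validateExport(data) {
  if (!Array.isArray(data)) return [{ index: -1, errors: ['export is not an array'] }];
  return data
    .map((post, index) => ({ index, errors: validatePost(post) }))
    .filter((result) => result.errors.length > 0);
}
```

Running `validateExport(JSON.parse(text))` once at startup would surface malformed entries before any request is served.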

3. Backend API Layer (Node.js/Express)

  • Endpoints:
    • GET /api/posts - List posts with metadata
    • GET /api/posts/:id - Full post content
  • Features:
    • Priority loading of blog-export.json
    • Fallback to legacy HSQLDB parser
    • Static image serving
    • CORS support for frontend
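The priority-loading and fallback behaviour listed above might look roughly like the following. This is a sketch under assumptions: `loadPosts` is a hypothetical name, and `parseLegacy` stands in for whatever hsqldbParser.js actually exports, which this document does not specify.

```javascript
const fs = require('fs');

// Hypothetical loader: prefer the clean JSON export and only call the legacy
// parser when the export is missing, unreadable, or not a JSON array.
// `parseLegacy` is a placeholder callback for the fallback parser.
function loadPosts(jsonPath, parseLegacy) {
  try {
    const posts = JSON.parse(fs.readFileSync(jsonPath, 'utf8'));
    if (!Array.isArray(posts)) throw new Error('export is not a JSON array');
    return { source: 'json', posts };
  } catch (err) {
    // Record why the fallback was taken so it can be logged by the caller.
    return { source: 'legacy', posts: parseLegacy(), reason: err.message };
  }
}
```

Returning the `source` alongside the posts lets the server log which path was taken, which is useful because the two paths may not produce identical data.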

4. Frontend UI Layer (React/Material-UI)

  • Components:
    • PostList: Scrollable filtered list
    • PostDetail: Full content viewer
    • Filters: Category/date navigation
  • Responsive Design: Desktop and mobile support

Detailed Design

Data Extraction Process

The "Bridge" Approach

  1. Java Tool Creation:

    • ExportTool.java: Simple class using JDBC to connect to HSQLDB
    • Uses hsqldb-1.8.0.10.jar driver (downloaded via Maven)
    • Executes SQL query: SELECT * FROM ENTRY_TABLE_1096292361887
    • Maps result set to JSON objects
  2. Compilation & Execution:

    cd projects/thingamablog-api
    javac -cp target/dependency/hsqldb-1.8.0.10.jar ExportTool.java
    java -cp .:target/dependency/hsqldb-1.8.0.10.jar ExportTool > ../thingamablog-v2/backend/blog-export.json
    
  3. Data Cleaning:

    • Converts timestamps to readable format
    • Preserves HTML content in body
    • Extracts categories from metadata
    • Assigns sequential IDs

Migration Results

  • Input: Binary HSQLDB files (~2MB total)
  • Output: Clean JSON (1.3MB, 467 entries)
  • Quality: Titles, dates, and HTML content all preserved intact
  • Performance: One-time export, instant loading thereafter

API Design

REST Endpoints

  • GET /api/posts

    • Response: Array of post summaries
    • Filtering: None (client-side)
    • Sorting: By date, descending (applies when serving from the clean JSON export)
  • GET /api/posts/:id

    • Response: Full post object
    • Image URL rewriting: Converts Windows paths to HTTP URLs
    • Error handling: 404 for missing posts
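The logic behind the two endpoints can be sketched as plain functions, kept independent of Express so they can be read and tested in isolation. All function names are hypothetical. Which fields make up a "summary", the `/images` URL prefix, and the decision to keep only the file name of each Windows path are assumptions; the document does not describe the actual mapping.

```javascript
// GET /api/posts: summaries without the content field, newest first.
// Dates in "YYYY-MM-DD HH:MM:SS.mmm" form sort correctly as plain strings.
function listSummaries(posts) {
  return posts
    .map(({ id, title, date, author, categories }) => ({ id, title, date, author, categories }))
    .sort((a, b) => (a.date < b.date ? 1 : a.date > b.date ? -1 : 0));
}

// Rewrite Windows-style image paths (e.g. C:\blog\pic.jpg or file:///C:/...)
// in post HTML to URLs under a static route. Other src values are left alone.
function rewriteImagePaths(html, baseUrl = '/images') {
  return html.replace(/src="([^"]+)"/gi, (match, src) => {
    const isWindowsPath = /^(file:\/{2,3})?[A-Za-z]:[\\/]/.test(src);
    if (!isWindowsPath) return match;
    const fileName = src.split(/[\\/]/).pop();
    return `src="${baseUrl}/${encodeURIComponent(fileName)}"`;
  });
}

// GET /api/posts/:id: the full post, or a 404-shaped result for unknown ids.
function getPost(posts, idParam) {
  const id = Number(idParam);
  const post = posts.find((p) => p.id === id);
  if (!post) return { status: 404, body: { error: 'Post not found' } };
  return { status: 200, body: { ...post, content: rewriteImagePaths(post.content) } };
}
```

A route handler would then only need to translate the returned `status` and `body` into an HTTP response.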

Data Flow

  1. Frontend requests post list
  2. Backend loads blog-export.json, falling back to the legacy HSQLDB parser if the export is unavailable
  3. Backend serves the sorted post list; filtering is done client-side
  4. Frontend renders with Material-UI components
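Because the API applies no filtering, the frontend has to narrow the list itself. A minimal sketch of such a filter follows. `filterPosts` is a hypothetical name, and the code assumes that `categories` is a string that may hold several comma-separated names and that `date` begins with "YYYY-MM"; only the single-category sample in the schema is actually documented.

```javascript
// Hypothetical client-side filter over the list returned by GET /api/posts.
// Any criterion left as null is ignored.
function filterPosts(posts, { category = null, year = null, month = null } = {}) {
  return posts.filter((post) => {
    if (category !== null) {
      // Assumption: multiple categories, if present, are comma-separated.
      const names = post.categories.split(',').map((name) => name.trim());
      if (!names.includes(category)) return false;
    }
    if (year !== null && post.date.slice(0, 4) !== String(year)) return false;
    if (month !== null && Number(post.date.slice(5, 7)) !== Number(month)) return false;
    return true;
  });
}
```

With 467 entries, re-running this filter on every selection change is cheap enough that no caching should be needed.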

Frontend Design

Component Hierarchy

App
├── Sidebar (Filters)
│   ├── AllPostsFilter
│   ├── CategoryAccordion
│   └── ArchiveAccordion
├── PostList (Scrollable)
└── PostViewer (Rich content)
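The entries shown by CategoryAccordion and ArchiveAccordion can be derived from the post list itself. The helpers below are a hypothetical sketch of that derivation. They assume comma-separated category strings and "YYYY-MM"-prefixed dates, neither of which is confirmed beyond the schema sample.

```javascript
// Count posts per category name, sorted alphabetically by name.
function countByCategory(posts) {
  const counts = new Map();
  for (const post of posts) {
    const names = post.categories.split(',').map((s) => s.trim()).filter(Boolean);
    for (const name of names) {
      counts.set(name, (counts.get(name) || 0) + 1);
    }
  }
  return [...counts.entries()].sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
}

// Count posts per "YYYY-MM" month key, newest month first.
function countByMonth(posts) {
  const counts = new Map();
  for (const post of posts) {
    const key = post.date.slice(0, 7);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()].sort(([a], [b]) => (a < b ? 1 : a > b ? -1 : 0));
}
```

Each returned pair is `[label, count]`, which maps directly onto one accordion row.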

State Management

  • React hooks for local state
  • No external state library (simple app)
  • URL-based state for selected post/filter
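URL-based state can be sketched as a pair of functions that convert between a state object and a query string. The parameter names (`post`, `category`, `month`) and the shape of the state object are assumptions for illustration; the document does not say how the URL is structured.

```javascript
// Serialize the selected post and active filters into a query string.
function stateToQuery(state) {
  const params = new URLSearchParams();
  if (state.postId != null) params.set('post', String(state.postId));
  if (state.category) params.set('category', state.category);
  if (state.month) params.set('month', state.month);
  return params.toString();
}

// Parse a query string (with or without a leading "?") back into state.
// Missing or malformed values come back as null.
function queryToState(query) {
  const params = new URLSearchParams(query);
  const postId = params.get('post');
  return {
    postId: postId !== null && /^\d+$/.test(postId) ? Number(postId) : null,
    category: params.get('category'),
    month: params.get('month'),
  };
}
```

In the browser, a component could initialize its hooks from `queryToState(window.location.search)` and call `history.pushState` with `stateToQuery(...)` on each selection, so that links and the back button restore the same view.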

UI/UX Principles

  • Progressive Disclosure: Filters collapsed by default
  • Responsive Grid: 3-column desktop, stacked mobile
  • Accessibility: Keyboard navigation, screen reader support
  • Performance: Virtual scrolling for large post lists
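The core of virtual scrolling is working out which rows need to exist in the DOM for the current scroll position. The sketch below shows that calculation under the assumption of a fixed row height. `visibleRange` and its parameters are hypothetical, and the real PostList may rely on a library instead.

```javascript
// Compute the slice of rows to render for a fixed-height virtual list.
// `overscan` adds a few extra rows above and below to avoid flicker.
function visibleRange(scrollTop, viewportHeight, rowHeight, totalRows, overscan = 3) {
  const first = Math.floor(scrollTop / rowHeight);
  const visibleCount = Math.ceil(viewportHeight / rowHeight);
  const start = Math.max(0, first - overscan);
  const end = Math.min(totalRows, first + visibleCount + overscan);
  return {
    start, // index of the first row to render (inclusive)
    end, // index after the last row to render (exclusive)
    offsetTop: start * rowHeight, // spacer height above the rendered rows
    totalHeight: totalRows * rowHeight, // full scrollable height
  };
}
```

The list would render only `posts.slice(start, end)` inside a container of `totalHeight`, shifted down by `offsetTop`.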

Implementation Details

Technology Choices

Component     Technology             Rationale
Export Tool   Java + HSQLDB Driver   Native compatibility with legacy DB
Data Format   JSON                   Universal, human-readable
Backend       Node.js/Express        Simple, fast for file-based data
Frontend      React/Material-UI      Modern, component-based UI
Images        Static serving         Direct file access for performance

File Structure

projects/
├── thingamablog-api/        # Data extraction tool
│   ├── ExportTool.java      # Java export tool
│   ├── pom.xml              # Maven config
│   ├── src/                 # Maven source structure
│   ├── target/              # Compiled classes + dependencies
│   └── api.log              # Execution log
└── thingamablog-v2/         # Web application
    ├── backend/
    │   ├── server.js        # API server
    │   ├── hsqldbParser.js  # Fallback parser
    │   └── blog-export.json # Clean data
    └── frontend/src/
        ├── App.js           # Main UI
        ├── theme.js         # Styling
        └── components/      # Reusable UI pieces

Deployment & Operations

Local Development

  1. Extract data: cd projects/thingamablog-api && java -cp .:target/dependency/hsqldb-1.8.0.10.jar ExportTool > ../thingamablog-v2/backend/blog-export.json
  2. Start backend: cd projects/thingamablog-v2/backend && node server.js
  3. Start frontend: cd projects/thingamablog-v2/frontend && npm start
  4. Access: http://localhost:3000

Production Considerations

  • Scalability: JSON file size (1.3MB) is fine for personal use
  • Backup: Version control the JSON export
  • Updates: Re-run export tool if DB changes
  • Security: Local-only access, no authentication needed

Risks & Mitigations

Risk                              Mitigation
HSQLDB driver availability        Downloaded and cached locally
Java version compatibility        Tested with Java 8+
Data corruption                   JSON validation on load
Performance with large datasets   Client-side pagination/filtering

Future Enhancements

  • Search: Full-text search within posts
  • Tagging: Enhanced category management
  • Export: Additional formats (Markdown, PDF)
  • Open Brain Integration: Vector embedding for AI queries
  • Multi-user: User accounts and permissions

Success Metrics

  • Data Integrity: 100% posts extracted with correct metadata
  • Performance: <2s page load, <500ms API response
  • Usability: Intuitive filtering/navigation
  • Maintainability: Clear code structure, comprehensive docs

Conclusion

Thingamablog v2 successfully modernizes legacy blog data through a "bridge" approach, providing a clean migration path from obsolete technology to contemporary web standards. The modular design allows for easy maintenance and future AI integrations.