Thingamablog v2 - Technical Design Document

Overview

Thingamablog v2 is a migration and modernization project for Paul's legacy blog data from the early 2000s Thingamablog platform. The system extracts blog posts from an obsolete HSQLDB database, cleans and structures the data, and provides a modern web interface for browsing.

Project Structure

There are two related projects:

thingamablog-api (projects/thingamablog-api/): Contains the Java export tool for data extraction
thingamablog-v2 (projects/thingamablog-v2/): Contains the web app for browsing the extracted data

No Git repositories are set up for these folders yet.

Problem Statement

Legacy Data Lock-In: Blog posts are stored in HSQLDB 1.8, an obsolete Java database format from the 2000s
Data Extraction Challenges: Binary format difficult to parse reliably with modern tools
User Experience: No easy way to browse/search 20+ years of personal blog content
Future Integration: Need structured data for AI/vector search (Open Brain project)

Solution Architecture

High-Level Architecture

[Legacy HSQLDB DB] → [Java Export Tool] → [blog-export.json] → [Node.js Backend] → [React Frontend]
  (Binary Data)                              (Clean JSON)        (API Server)          (Web UI)

Components

1. Data Extraction Layer (Java CLI Tool)

Location: projects/thingamablog-api/ExportTool.java
Purpose: Bridge from legacy DB to modern JSON
Technology: Java with HSQLDB 1.8.0.10 driver
Input: HSQLDB database files (database.script, database.data)
Output: blog-export.json with structured blog entries
Design Decision: Standalone CLI tool avoids Spring Boot complexity

2. Data Storage Layer (JSON File)

Format: Clean JSON array of blog post objects

Schema:

{
  "id": 1,
  "title": "Digital Imaging Notes",
  "date": "2003-11-03 16:41:22.053",
  "author": "Paul",
  "categories": "Hobbies",
  "content": "<p>Full HTML content...</p>"
}

Benefits: Human-readable, easily parseable, version-controllable
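A shape check against the schema above might look like the following. This is a minimal sketch, not the project's code: `validatePost` is a hypothetical helper, and the rules (integer `id`, string fields, a `YYYY-MM-DD HH:MM:SS.mmm` timestamp) are assumptions inferred from the single example object.

```javascript
// Hypothetical helper: reports how a value deviates from the post schema.
// Field names come from the schema; the exact rules are assumptions.
const TIMESTAMP_PATTERN = /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(\.\d{1,3})?$/;

function validatePost(post) {
  if (post === null || typeof post !== 'object') {
    return ['post is not an object'];
  }
  const errors = [];
  if (!Number.isInteger(post.id)) {
    errors.push('id must be an integer');
  }
  for (const field of ['title', 'author', 'categories', 'content']) {
    if (typeof post[field] !== 'string') {
      errors.push(`${field} must be a string`);
    }
  }
  if (typeof post.date !== 'string' || !TIMESTAMP_PATTERN.test(post.date)) {
    errors.push('date must look like "YYYY-MM-DD HH:MM:SS.mmm"');
  }
  return errors; // empty array means the post passed every check
}
```

Returning a list of messages rather than a boolean makes it easier to log which entries of a 467-post export need attention.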

3. Backend API Layer (Node.js/Express)

Endpoints:
- GET /api/posts - List posts with metadata
- GET /api/posts/:id - Full post content
Features:
- Loads blog-export.json first when it is available
- Falls back to the legacy HSQLDB parser (hsqldbParser.js) when it is not
- Static image serving
- CORS support for frontend
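The JSON-first loading with a legacy fallback could be sketched as below. The function name `loadPosts` and the `parseLegacyDb` callback are hypothetical; the callback stands in for whatever hsqldbParser.js actually exports, which this document does not specify.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Sketch of "JSON first, legacy parser second".
// `parseLegacyDb` is a hypothetical stand-in for the hsqldbParser.js entry point.
function loadPosts(backendDir, parseLegacyDb) {
  const exportPath = path.join(backendDir, 'blog-export.json');
  try {
    const posts = JSON.parse(fs.readFileSync(exportPath, 'utf8'));
    if (!Array.isArray(posts)) {
      throw new Error('export is not a JSON array');
    }
    return { source: 'json', posts };
  } catch (err) {
    // Missing, unreadable, or malformed export: fall back to the legacy parser.
    return { source: 'legacy', posts: parseLegacyDb(backendDir) };
  }
}
```

Reporting the `source` alongside the posts lets the server log which path was taken, which is useful because the two paths may differ in data quality.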

4. Frontend UI Layer (React/Material-UI)

Components:
- PostList: Scrollable filtered list
- PostDetail: Full content viewer
- Filters: Category/date navigation
Responsive Design: Desktop and mobile support
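Category and date filtering over the post list can be illustrated with a small pure function. This is only a sketch: the filter shape (`category`, `month`) is hypothetical, and it assumes `categories` holds a single category name, as in the schema example.

```javascript
// Hypothetical filter shape: { category?: string, month?: "YYYY-MM" }.
// Assumes post.categories is a single name and post.date starts with "YYYY-MM".
function filterPosts(posts, filter = {}) {
  return posts.filter((post) => {
    if (filter.category && post.categories !== filter.category) {
      return false;
    }
    if (filter.month && !post.date.startsWith(filter.month)) {
      return false;
    }
    return true; // an empty filter keeps every post ("All Posts")
  });
}
```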

Detailed Design

Data Extraction Process

The "Bridge" Approach

Java Tool Creation:
- ExportTool.java: Simple class using JDBC to connect to HSQLDB
- Uses hsqldb-1.8.0.10.jar driver (downloaded via Maven)
- Executes SQL query: SELECT * FROM ENTRY_TABLE_1096292361887
- Maps result set to JSON objects

Compilation & Execution:

cd projects/thingamablog-api
javac -cp target/dependency/hsqldb-1.8.0.10.jar ExportTool.java
java -cp .:target/dependency/hsqldb-1.8.0.10.jar ExportTool > ../thingamablog-v2/backend/blog-export.json

Data Cleaning:
- Converts timestamps to readable format
- Preserves HTML content in body
- Extracts categories from metadata
- Assigns sequential IDs

Migration Results

Input: Binary HSQLDB files (~2MB total)
Output: Clean JSON (1.3MB, 467 entries)
Quality: Titles, dates, and HTML content preserved intact
Performance: One-time export; afterwards the backend loads the JSON file directly

API Design

REST Endpoints

GET /api/posts
- Response: Array of post summaries
- Filtering: None (client-side)
- Sorting: By date descending (when serving the clean JSON export)
GET /api/posts/:id
- Response: Full post object
- Image URL rewriting: Converts Windows paths to HTTP URLs
- Error handling: 404 for missing posts
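The image URL rewriting step might work along these lines. This is a sketch under stated assumptions: the `/images/` route, the function name, and the idea that legacy posts reference images as Windows paths or `file:///` URLs inside double-quoted `src` attributes are all guesses, not details taken from server.js.

```javascript
// Assumption: images are served from a hypothetical /images/ route.
const IMAGE_BASE_URL = '/images/';

// Rewrites double-quoted src attributes that point at a Windows drive path
// (C:\... or file:///C:/...) so they reference the static image route instead.
function rewriteImageUrls(html) {
  return html.replace(/src="([^"]+)"/gi, (match, src) => {
    const isWindowsPath = /^(file:\/\/\/)?[a-z]:[\\/]/i.test(src);
    if (!isWindowsPath) {
      return match; // leave http(s) and relative URLs untouched
    }
    const fromFileUrl = /^file:\/\/\//i.test(src);
    const rawName = src.split(/[\\/]/).pop();
    // file:/// URLs are already percent-encoded; bare Windows paths are not.
    const fileName = fromFileUrl ? rawName : encodeURIComponent(rawName);
    return `src="${IMAGE_BASE_URL}${fileName}"`;
  });
}
```

Keeping only the file name flattens any directory structure, so this approach assumes image file names are unique across the blog.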

Data Flow

Frontend requests post list
Backend loads blog-export.json, or falls back to parsing the HSQLDB files
Backend serves the posts sorted by date; filtering happens client-side
Frontend renders with Material-UI components
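The backend's part of this flow can be sketched without any web framework. The helper names are hypothetical, and dropping `content` from list responses is an assumption about what a "post summary" contains. The sort relies on the schema's `YYYY-MM-DD HH:MM:SS.mmm` timestamps ordering correctly as plain strings.

```javascript
// Assumption: a summary is the post without its HTML body.
function toSummary(post) {
  const { content, ...summary } = post;
  return summary;
}

// Body for GET /api/posts: summaries, newest first.
function listPosts(posts) {
  return posts
    .map(toSummary)
    .sort((a, b) => (a.date < b.date ? 1 : a.date > b.date ? -1 : 0));
}

// Result for GET /api/posts/:id: the full post, or a 404 payload.
function findPost(posts, id) {
  const post = posts.find((p) => p.id === Number(id));
  return post
    ? { status: 200, body: post }
    : { status: 404, body: { error: 'Post not found' } };
}
```

An Express route handler would then only need to call these functions and pass `status` and `body` to the response.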

Frontend Design

Component Hierarchy

App
├── Sidebar (Filters)
│   ├── AllPostsFilter
│   ├── CategoryAccordion
│   └── ArchiveAccordion
├── PostList (Scrollable)
└── PostViewer (Rich content)
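An archive sidebar like ArchiveAccordion presumably needs posts grouped by year and month. The sketch below shows one way to derive that grouping from the `date` field; `buildArchiveIndex` is a hypothetical name, and the year/month layout is an assumption about how the accordion is organized.

```javascript
// Groups posts into year -> month -> count, based on the "YYYY-MM-DD ..." date prefix.
function buildArchiveIndex(posts) {
  const index = new Map();
  for (const post of posts) {
    const year = post.date.slice(0, 4);
    const month = post.date.slice(5, 7);
    if (!index.has(year)) {
      index.set(year, new Map());
    }
    const months = index.get(year);
    months.set(month, (months.get(month) || 0) + 1);
  }
  return index;
}
```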

State Management

React hooks for local state
No external state library (simple app)
URL-based state for selected post/filter
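URL-based state for the selected post and filter could be encoded as query parameters. The parameter names (`post`, `category`, `month`) and both helper functions are hypothetical; the sketch only uses the standard `URLSearchParams` API, which is available in browsers and in Node.js.

```javascript
// Hypothetical query layout: ?post=<id>&category=<name>&month=<YYYY-MM>
function stateToQuery(state) {
  const params = new URLSearchParams();
  if (state.postId != null) params.set('post', String(state.postId));
  if (state.category) params.set('category', state.category);
  if (state.month) params.set('month', state.month);
  return params.toString();
}

// Inverse of stateToQuery; absent parameters come back as null.
function queryToState(query) {
  const params = new URLSearchParams(query);
  const postId = params.get('post');
  return {
    postId: postId === null ? null : Number(postId),
    category: params.get('category'),
    month: params.get('month'),
  };
}
```

Keeping the selection in the URL means a reload or a shared link reopens the same post and filter without any extra state library.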

UI/UX Principles

Progressive Disclosure: Filters collapsed by default
Responsive Grid: 3-column desktop, stacked mobile
Accessibility: Keyboard navigation, screen reader support
Performance: Virtual scrolling for large post lists
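The core of virtual scrolling is computing which rows fall inside the viewport. The sketch below assumes fixed-height rows; `visibleRange` and its parameters (`rowHeight`, `overscan`) are illustrative and not taken from the frontend code, which may well rely on a library for this.

```javascript
// Returns the inclusive index range of rows to render, plus the pixel offset
// at which the first rendered row should be positioned.
function visibleRange(scrollTop, viewportHeight, rowHeight, totalRows, overscan = 3) {
  const first = Math.max(0, Math.floor(scrollTop / rowHeight) - overscan);
  const last = Math.min(
    totalRows - 1,
    Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1 + overscan
  );
  return { first, last, offsetY: first * rowHeight };
}
```

With 60 px rows in a 600 px viewport, only about 16 of the 467 posts are mounted at a time, including the overscan rows that soften fast scrolling.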

Implementation Details

Technology Choices

Component	Technology	Rationale
Export Tool	Java + HSQLDB Driver	Native compatibility with legacy DB
Data Format	JSON	Universal, human-readable
Backend	Node.js/Express	Simple, fast for file-based data
Frontend	React/Material-UI	Modern, component-based UI
Images	Static serving	Direct file access for performance

File Structure

projects/
├── thingamablog-api/        # Data extraction tool
│   ├── ExportTool.java      # Java export tool
│   ├── pom.xml              # Maven config
│   ├── src/                 # Maven source structure
│   ├── target/              # Compiled classes + dependencies
│   └── api.log              # Execution log
└── thingamablog-v2/         # Web application
    ├── backend/
    │   ├── server.js        # API server
    │   ├── hsqldbParser.js  # Fallback parser
    │   └── blog-export.json # Clean data
    └── frontend/src/
        ├── App.js           # Main UI
        ├── theme.js         # Styling
        └── components/      # Reusable UI pieces

Deployment & Operations

Local Development

Extract data: cd projects/thingamablog-api && java -cp .:target/dependency/hsqldb-1.8.0.10.jar ExportTool > ../thingamablog-v2/backend/blog-export.json
Start backend: cd projects/thingamablog-v2/backend && node server.js
Start frontend: cd projects/thingamablog-v2/frontend && npm start
Access: http://localhost:3000

Production Considerations

Scalability: JSON file size (1.3MB) is fine for personal use
Backup: Version control the JSON export
Updates: Re-run export tool if DB changes
Security: Local-only access, no authentication needed

Risks & Mitigations

Risk	Mitigation
HSQLDB driver availability	Downloaded and cached locally
Java version compatibility	Tested with Java 8+
Data corruption	JSON validation on load
Performance with large datasets	Client-side pagination/filtering

Future Enhancements

Search: Full-text search within posts
Tagging: Enhanced category management
Export: Additional formats (Markdown, PDF)
Open Brain Integration: Vector embedding for AI queries
Multi-user: User accounts and permissions

Success Metrics

Data Integrity: 100% of posts extracted with correct metadata
Performance: <2s page load, <500ms API response
Usability: Intuitive filtering/navigation
Maintainability: Clear code structure, comprehensive docs

Conclusion

Thingamablog v2 successfully modernizes legacy blog data through a "bridge" approach, providing a clean migration path from obsolete technology to contemporary web standards. The modular design allows for easy maintenance and future AI integrations.
