Thingamablog v2 Project Documentation

Overview

This project extracts and displays blog entries from Paul's old Thingamablog (a Windows blog platform from the early 2000s). The process involved multiple iterations:

Initial Attempt: Python script to parse HTML files (blog.html).
Database Approach: Discovered the blogs were stored in an HSQLDB database (old Java-based SQL DB).
Java/Spring Boot: Used old HSQLDB Java library in a Spring Boot app to extract data.
Node.js Solution: Final implementation using a Node.js app to read HSQLDB directly and export to JSON.
Web App: Built a full-stack web app (React frontend + Express backend) to browse and display the blog entries.

The result is a clean JSON export (blog-export.json) containing all blog posts, which can be browsed via a modern web interface.

Latest Updates (March 2026)

✅ Git Repository: Created on Gitea at https://paje.ca/git/paulh/thingamablog-v2
✅ Documentation: Comprehensive DESIGN.md and updated README.md
✅ Data Quality: 467 blog posts successfully extracted with perfect metadata
🔄 Git Push: Local repo initialized and committed; the push is pending until network issues are resolved
🔗 Integration: Ready for Open Brain vector ingestion

Data Extraction Process

The data was extracted using the "Bridge" approach from the companion thingamablog-api project:

Source: Old HSQLDB database files in docs/pauls-blogs/Paul/database/
Tool: Java CLI application (ExportTool.java) with HSQLDB 1.8.0.10 driver
Command: java -cp .:target/dependency/hsqldb-1.8.0.10.jar ExportTool > blog-export.json
Output: blog-export.json with structured blog entries (title, date, content, categories, etc.)
Quality: All 467 entries extracted, with full HTML content preserved and clean metadata
Size: 1.3MB of structured JSON data
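
One way to sanity-check an export like this is to count the entries and flag any that lack a core field. The sketch below is illustrative only: `summarizeExport` is a hypothetical helper, and the field names (`title`, `date`, `content`) are assumed from the output description above rather than taken from the actual export schema.

```javascript
// Hypothetical check: counts entries and how many are missing a required field.
// The field names are assumptions based on the fields listed for blog-export.json.
function summarizeExport(entries, requiredFields = ["title", "date", "content"]) {
  const isBlank = (value) => value === undefined || value === null || value === "";
  const incomplete = entries.filter((entry) =>
    requiredFields.some((field) => isBlank(entry[field]))
  );
  return { total: entries.length, incomplete: incomplete.length };
}

// Example with two made-up entries; the second has an empty title.
const sampleEntries = [
  { title: "First post", date: "2004-09-27T12:00:00Z", content: "<p>Hello</p>" },
  { title: "", date: "2005-01-02T12:00:00Z", content: "<p>Untitled</p>" },
];
console.log(summarizeExport(sampleEntries));
```

Run against the real export, a result of `total: 467` and `incomplete: 0` would match the quality described above.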

Project Structure

projects/thingamablog-v2/
├── backend/          # Express.js server
│   ├── server.js     # Main server file
│   ├── hsqldbParser.js # Legacy HSQLDB parser (fallback)
│   ├── blog-export.json # Exported blog data (1.7MB)
│   └── package.json
├── frontend/         # React app
│   ├── src/
│   │   ├── App.js    # Main component
│   │   ├── theme.js  # Material-UI theme
│   │   └── index.js
│   └── package.json
├── DESIGN.md         # Technical design document
├── README.md         # This file
└── .gitignore        # Excludes node_modules, build artifacts

Running the Application

Prerequisites

Node.js installed
Backend dependencies: cd backend && npm install
Frontend dependencies: cd frontend && npm install

Start Backend

cd projects/thingamablog-v2/backend
node server.js

Runs on http://localhost:3637
Loads blog-export.json if available, else falls back to HSQLDB parser
Serves images from /home/paulh/.openclaw/workspace/docs/pauls-blogs/Paul/1096292361887/web
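
The "JSON first, parser as fallback" behaviour can be sketched roughly as follows. This is not the code in server.js: `loadPosts` and the `fallbackLoader` callback are hypothetical names, and the sketch assumes the export is a top-level JSON array.

```javascript
const fs = require("fs");

// Hypothetical loader: prefer the JSON export, otherwise call the fallback.
// Assumes the export file contains a top-level array of posts.
function loadPosts(exportPath, fallbackLoader) {
  try {
    const parsed = JSON.parse(fs.readFileSync(exportPath, "utf8"));
    if (Array.isArray(parsed)) {
      return { source: "json", posts: parsed };
    }
  } catch (err) {
    // Missing file or invalid JSON: fall through to the fallback loader.
  }
  return { source: "fallback", posts: fallbackLoader() };
}
```

In the real backend the fallback would be whatever hsqldbParser.js exposes; here it is just a function that returns an array of posts.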

Start Frontend

cd projects/thingamablog-v2/frontend
npm start

Runs on http://localhost:3000 (opens automatically)
Connects to backend at http://localhost:3637

Access the App

Open http://localhost:3000 in your browser.

UI Description

The web app provides a clean, modern interface to browse Paul's old blog posts:

Layout

Header: "Thingamablog Archive" title
Sidebar (Left): Browse filters
- "All Posts" - show everything
- "Categories" accordion - hierarchical categories (e.g., Hobbies > Car Maintenance)
- "Archives" accordion - posts by year
Post List (Middle): Scrollable list of posts with title, date, author
Post Content (Right): Full post display with images, categories as chips

Features

Filtering: Click categories or years to filter posts
Selection: Click a post to view full content
Images: Embedded images load from served static files
Responsive: Adapts to screen size (stacks vertically on mobile)
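
The category and year filters could be expressed as a single pure function, as in the sketch below. `filterPosts` is a hypothetical name, and the post shape (a `categories` array and an ISO 8601 `date` string) is an assumption, not the confirmed schema.

```javascript
// Hypothetical filter: keeps posts matching the selected category and/or year.
// Assumes post.categories is an array of strings and post.date is ISO 8601.
function filterPosts(posts, { category = null, year = null } = {}) {
  return posts.filter((post) => {
    const categories = post.categories || [];
    const matchesCategory = category === null || categories.includes(category);
    const matchesYear =
      year === null || new Date(post.date).getUTCFullYear() === year;
    return matchesCategory && matchesYear;
  });
}
```

Calling `filterPosts(posts)` with no filter returns every post, which corresponds to the "All Posts" entry in the sidebar.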

Sample View

Categories include: Hobbies, Personal, Robotics, etc.
Posts date back to the 2000s, covering topics such as 3D printing, car maintenance, and tech projects
Content includes HTML formatting, links, and images

API Endpoints

GET /api/posts - List all posts (id, title, date, category)
GET /api/posts/:id - Get full post data
Images served at /1096292361887/web/* or /
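
The logic behind the two post endpoints might look like the sketch below, with the Express wiring left out. The function names are hypothetical, and the sketch assumes category information lives in a `categories` field on each post.

```javascript
// Hypothetical summary shape for GET /api/posts: metadata only, no content.
function toSummary(post) {
  const { id, title, date, categories } = post;
  return { id, title, date, categories };
}

function listPosts(posts) {
  return posts.map(toSummary);
}

// Hypothetical lookup for GET /api/posts/:id.
// Express route params arrive as strings, so compare string forms.
function getPost(posts, idParam) {
  return posts.find((post) => String(post.id) === String(idParam)) || null;
}
```

A route handler would then return `listPosts(posts)` for the list endpoint, and either the result of `getPost` or a 404 response when it is `null`.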

Future Integration

This data can be ingested into Open Brain for vector search:

Parse blog-export.json into chunks
Embed with OpenAI
Store in Supabase PGVector
Enable semantic queries across Paul's 20+ year blog history
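
The first step, parsing posts into chunks, could be sketched as below. The embedding and storage steps are omitted. All names and the chunk sizes are hypothetical, the tag stripping is deliberately naive, and the sketch assumes each post has `id`, `title`, and HTML `content` fields.

```javascript
// Naive HTML-to-text conversion: drops tags and collapses whitespace.
function htmlToText(html) {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Splits text into fixed-size character windows that overlap slightly,
// so a sentence cut at a boundary still appears whole in one chunk.
function chunkText(text, maxChars = 1000, overlap = 100) {
  if (overlap >= maxChars) {
    throw new Error("overlap must be smaller than maxChars");
  }
  const chunks = [];
  for (let start = 0; start < text.length; start += maxChars - overlap) {
    chunks.push(text.slice(start, start + maxChars));
    if (start + maxChars >= text.length) break; // last window reached the end
  }
  return chunks;
}

// Hypothetical record shape handed to the embedding step.
function postToChunks(post, maxChars, overlap) {
  const text = htmlToText(post.content || "");
  return chunkText(text, maxChars, overlap).map((chunk, index) => ({
    postId: post.id,
    title: post.title,
    index,
    text: chunk,
  }));
}
```

Character windows are the simplest option. Splitting on paragraph or sentence boundaries would likely give better retrieval results, at the cost of more parsing.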

Development Status

Backend: ✅ Complete, tested with 467 posts
Frontend: ✅ Complete, Material-UI responsive design
Data Extraction: ✅ Complete, high-quality JSON export
Documentation: ✅ Complete (README.md, DESIGN.md)
Git Repository: ✅ Created on Gitea, awaiting push due to network issues
Testing: ✅ Manual testing successful

Notes

No automated tests or CI/CD set up
Assumes local paths for images/database
Backend prioritizes JSON export over HSQLDB parsing for speed
Categories use <Category> and <Parent - Child> format
Network issues are preventing the final Git push; the repos are ready to push once connectivity is restored
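
The category format noted above could be turned into the hierarchy shown in the sidebar with something like the sketch below. It assumes the angle brackets are placeholder notation and that a child category is stored as a single string with " - " between parent and child. Both function names are hypothetical.

```javascript
// Splits "Parent - Child" into its parts; a plain "Category" has no child.
function parseCategory(label) {
  const separator = " - ";
  const index = label.indexOf(separator);
  if (index === -1) {
    return { parent: label.trim(), child: null };
  }
  return {
    parent: label.slice(0, index).trim(),
    child: label.slice(index + separator.length).trim(),
  };
}

// Groups labels into a Map of parent name -> Set of child names.
function buildCategoryTree(labels) {
  const tree = new Map();
  for (const label of labels) {
    const { parent, child } = parseCategory(label);
    if (!tree.has(parent)) tree.set(parent, new Set());
    if (child) tree.get(parent).add(child);
  }
  return tree;
}
```

For example, `parseCategory("Hobbies - Car Maintenance")` yields a parent of `Hobbies` and a child of `Car Maintenance`, matching the Hobbies > Car Maintenance entry in the Categories accordion.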
