
Thingamablog v2 Project Documentation

Overview

This project extracts and displays blog entries from Paul's old Thingamablog (a Java-based desktop blogging application from the early 2000s). The work went through several iterations:

Initial Attempt: Python script to parse HTML files (blog.html).
Database Approach: Discovered the blogs were stored in an HSQLDB database (old Java-based SQL DB).
Java/Spring Boot: Used old HSQLDB Java library in a Spring Boot app to extract data.
Node.js Solution: Final implementation using a Node.js app to read HSQLDB directly and export to JSON.
Web App: Built a full-stack web app (React frontend + Express backend) to browse and display the blog entries.

The result is a clean JSON export (blog-export.json) containing all blog posts, which can be browsed via a modern web interface.

Data Extraction Process

Source: Old HSQLDB database files in docs/pauls-blogs/Paul/database/
Method: Custom Node.js script using an HSQLDB reader library (likely hsqldb or similar npm package)

Command to generate blog-export.json (the original command was not recorded; this is a likely reconstruction):

# Assumed command (run in the backend directory); reconstructed, not verified
node -e "
// Load the fallback parser that ships with the backend
const { parseHSQLDB } = require('./hsqldbParser');
const entries = parseHSQLDB('/home/paulh/.openclaw/workspace/docs/pauls-blogs/Paul/database');
// Map raw column names to the export fields, with defaults for missing values
const cleanEntries = entries.map((e, idx) => ({
  id: idx + 1,
  title: e.TITLE || 'Untitled',
  date: e.TIMESTAMP || '',
  author: e.AUTHOR || 'Paul',
  categories: e.CATEGORIES || '',
  content: e.ENTRY || ''
}));
// Pretty-print to stdout; the shell redirect below writes the file
console.log(JSON.stringify(cleanEntries, null, 2));
" > blog-export.json

This command uses the fallback parser (hsqldbParser.js). The actual export may have been produced with a more robust library that extracts the data more reliably.

Output: blog-export.json with structured blog entries (title, date, content, categories, etc.)
Challenges: The database is in a legacy HSQLDB file format, so compatible libraries had to be found. The export converts the raw DB rows into a readable JSON structure.
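
Because the export command above is a reconstruction, it can help to sanity-check the resulting file. The following is a minimal sketch that counts entries that fell back to the defaults in that mapping. The function name summarizeExport and the sample data are hypothetical; only the field names come from the mapping above.

```javascript
// Sanity-check sketch for an export shaped like the mapping above.
// summarizeExport is a hypothetical helper, not part of the project.
function summarizeExport(entries) {
  const summary = { total: entries.length, untitled: 0, missingDate: 0, emptyContent: 0 };
  for (const entry of entries) {
    // 'Untitled' is the default the mapping assigns when TITLE is missing.
    if (!entry.title || entry.title === 'Untitled') summary.untitled += 1;
    if (!entry.date) summary.missingDate += 1;
    if (!entry.content) summary.emptyContent += 1;
  }
  return summary;
}

// Two in-memory sample entries stand in for the real file here.
const sample = [
  { id: 1, title: 'First post', date: '2004-09-27', content: '<p>Hello</p>' },
  { id: 2, title: 'Untitled', date: '', content: '' },
];
console.log(summarizeExport(sample));
```

Against the real export, the same function could be called with JSON.parse(fs.readFileSync('blog-export.json', 'utf8')).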

Project Structure

projects/thingamablog-v2/
├── backend/          # Express.js server
│   ├── server.js     # Main server file
│   ├── hsqldbParser.js # Legacy HSQLDB parser (fallback)
│   ├── blog-export.json # Exported blog data (1.7MB)
│   └── package.json
└── frontend/         # React app
    ├── src/
    │   ├── App.js    # Main component
    │   ├── theme.js  # Material-UI theme
    │   └── index.js
    └── package.json

Web App Features

Backend: Express server serving API endpoints for posts
Frontend: React with Material-UI for modern UI
Filters: Browse by category (hierarchical) or date archives
Images: Served statically from original blog image folder
Responsive: Works on desktop and mobile

Running the Application

Prerequisites

Node.js installed
Backend dependencies: cd backend && npm install
Frontend dependencies: cd frontend && npm install

Start Backend

cd projects/thingamablog-v2/backend
node server.js

Runs on http://localhost:3637
Loads blog-export.json if available, else falls back to HSQLDB parser
Serves images from /home/paulh/.openclaw/workspace/docs/pauls-blogs/Paul/1096292361887/web
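
The "JSON first, parser as fallback" behaviour described above could look roughly like the sketch below. It is not the code in server.js: loadPosts is a hypothetical helper, and the fallback argument stands in for whatever hsqldbParser.js actually exposes.

```javascript
const fs = require('node:fs');

// Hypothetical helper: prefer the JSON export, otherwise call a fallback.
// `fallback` is any zero-argument function returning an array of posts,
// for example a wrapper around the HSQLDB parser.
function loadPosts(exportPath, fallback) {
  if (fs.existsSync(exportPath)) {
    const raw = fs.readFileSync(exportPath, 'utf8');
    return { source: 'json', posts: JSON.parse(raw) };
  }
  return { source: 'hsqldb', posts: fallback() };
}
```

Returning the source alongside the posts makes it easy to log at startup which path was taken.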

Start Frontend

cd projects/thingamablog-v2/frontend
npm start

Runs on http://localhost:3000 (opens automatically)
Connects to backend at http://localhost:3637
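
As an illustration of how the frontend might talk to the backend, here is a small sketch that requests the post list. The function name fetchPostList and the injectable fetchImpl parameter are assumptions made for this example; the real App.js may be structured differently.

```javascript
// Base URL taken from the backend section of this document.
const API_BASE = 'http://localhost:3637';

// Fetch the post list. `fetchImpl` is injectable so the function can be
// exercised without a running backend; it defaults to the global fetch.
async function fetchPostList(fetchImpl = fetch, base = API_BASE) {
  const response = await fetchImpl(`${base}/api/posts`);
  if (!response.ok) {
    throw new Error(`Backend returned HTTP ${response.status}`);
  }
  return response.json();
}
```

In a React component this would typically be called from an effect, with the result stored in state.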

Access the App

Open http://localhost:3000 in your browser.

UI Description

The web app provides a clean, modern interface to browse Paul's old blog posts:

Layout

Header: "Thingamablog Archive" title
Sidebar (Left): Browse filters
- "All Posts" - show everything
- "Categories" accordion - hierarchical categories (e.g., Hobbies > Car Maintenance)
- "Archives" accordion - posts by year
Post List (Middle): Scrollable list of posts with title, date, author
Post Content (Right): Full post display with images, categories as chips

Features

Filtering: Click categories or years to filter posts
Selection: Click a post to view full content
Images: Embedded images load from served static files
Responsive: Adapts to screen size (stacks vertically on mobile)
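
The "Archives" filter groups posts by year. A minimal sketch of that grouping follows, assuming the date field holds a string that JavaScript's Date constructor can parse (the actual TIMESTAMP format in the export is not documented here). groupByYear is a hypothetical name.

```javascript
// Group posts by the UTC year of their date.
// Assumption: post.date is parseable by `new Date(...)`; anything else
// is collected under the key 'Unknown'.
function groupByYear(posts) {
  const archive = new Map();
  for (const post of posts) {
    // Guard against empty/null dates, since new Date(null) yields 1970.
    const parsed = post.date ? new Date(post.date) : new Date(NaN);
    const year = Number.isNaN(parsed.getTime()) ? 'Unknown' : String(parsed.getUTCFullYear());
    if (!archive.has(year)) archive.set(year, []);
    archive.get(year).push(post);
  }
  return archive;
}
```

Clicking a year in the sidebar would then amount to showing archive.get(year).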

Sample View

Categories include: Hobbies, Personal, Robotics, etc.
Posts date back to the 2000s and cover topics such as 3D printing, car maintenance, and tech projects
Content includes HTML formatting, links, and images

API Endpoints

GET /api/posts - List all posts (id, title, date, category)
GET /api/posts/:id - Get full post data
Images are served at /1096292361887/web/* or from the root path /
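
The two post endpoints can be thought of as thin wrappers around two data functions. The sketch below shows those functions only. The names listPosts and findPost are hypothetical, and mapping the export's categories field to a category key in the summary is an assumption based on the endpoint description above.

```javascript
// Summary for the list endpoint: id, title, date, category.
// Assumption: the summary's `category` comes from the export's `categories`.
function listPosts(posts) {
  return posts.map((post) => ({
    id: post.id,
    title: post.title,
    date: post.date,
    category: post.categories,
  }));
}

// Lookup for the detail endpoint. Route parameters arrive as strings,
// so the id is parsed before comparing. Returns null when nothing matches.
function findPost(posts, idParam) {
  const id = Number.parseInt(idParam, 10);
  return posts.find((post) => post.id === id) || null;
}
```

In an Express app these could be wired as app.get('/api/posts', (req, res) => res.json(listPosts(posts))), with the /api/posts/:id handler reading req.params.id and responding with res.status(404) when findPost returns null.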

Future Integration

This data can be ingested into Open Brain for vector search:

Parse blog-export.json into chunks
Embed with OpenAI
Store in Supabase PGVector
Enable semantic queries across Paul's 20+ year blog history
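
The first step, parsing blog-export.json into chunks, might be sketched as follows. This is one possible approach, not a settled design: the regex-based tag stripping is deliberately naive, the 1000-character default is arbitrary, and the names htmlToText and chunkPost are hypothetical. The embedding and storage steps are left out.

```javascript
// Strip tags with a naive regex and collapse whitespace.
// Good enough for a sketch; a real HTML parser would be more robust.
function htmlToText(html) {
  return html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
}

// Split a post's text into chunks of at most `maxChars` characters,
// breaking on word boundaries. A single word longer than `maxChars`
// becomes its own oversized chunk.
function chunkPost(post, maxChars = 1000) {
  const words = htmlToText(post.content || '').split(' ').filter(Boolean);
  const chunks = [];
  let current = '';
  for (const word of words) {
    if (current && current.length + 1 + word.length > maxChars) {
      chunks.push(current);
      current = word;
    } else {
      current = current ? `${current} ${word}` : word;
    }
  }
  if (current) chunks.push(current);
  // Keep the post id and title with each chunk so search hits can link back.
  return chunks.map((text, index) => ({ postId: post.id, title: post.title, index, text }));
}
```

Each returned object carries enough metadata to map a search result back to /api/posts/:id.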

Notes

No tests or CI/CD set up
Assumes local paths for images/database
Backend prioritizes JSON export over HSQLDB parsing for speed
Category names use the form <Category> for top-level categories and <Parent - Child> for nested ones
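
To illustrate how the category format in the last note could drive the hierarchical sidebar, here is a minimal sketch. It assumes " - " (space, hyphen, space) is the separator between parent and child and that the input is already a list of individual category names; how multiple categories are delimited inside a post's categories field is not documented here. buildCategoryTree is a hypothetical name.

```javascript
// Build a parent -> children map from category names.
// Assumption: a nested category is written as 'Parent - Child'.
function buildCategoryTree(names) {
  const tree = new Map();
  for (const name of names) {
    const [parent, ...rest] = name.split(' - ');
    const key = parent.trim();
    if (!key) continue; // skip empty names
    if (!tree.has(key)) tree.set(key, new Set());
    // Anything after the first separator is treated as the child name.
    const child = rest.join(' - ').trim();
    if (child) tree.get(key).add(child);
  }
  return tree;
}

const tree = buildCategoryTree(['Hobbies', 'Hobbies - Car Maintenance', 'Robotics']);
for (const [parent, children] of tree) {
  console.log(parent, [...children]);
}
```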
