Thingamablog v2 Project Documentation
Overview
This project extracts and displays blog entries from Paul's old Thingamablog (a Windows blog platform from the early 2000s). The process involved multiple iterations:
- Initial Attempt: Python script to parse HTML files (blog.html).
- Database Approach: Discovered the blogs were stored in an HSQLDB database (old Java-based SQL DB).
- Java/Spring Boot: Used old HSQLDB Java library in a Spring Boot app to extract data.
- Node.js Solution: Final implementation using a Node.js app to read HSQLDB directly and export to JSON.
- Web App: Built a full-stack web app (React frontend + Express backend) to browse and display the blog entries.
The result is a clean JSON export (blog-export.json) containing all blog posts, which can be browsed via a modern web interface.
Data Extraction Process
- Source: Old HSQLDB database files in docs/pauls-blogs/Paul/database/
- Method: Custom Node.js script using an HSQLDB reader library (likely hsqldb or a similar npm package)
- Command to generate blog-export.json (not fully documented; this is the likely form):
# Assumed command (run in the backend directory)
node -e "
const { parseHSQLDB } = require('./hsqldbParser');
const entries = parseHSQLDB('/home/paulh/.openclaw/workspace/docs/pauls-blogs/Paul/database');
// Map raw database columns to the cleaned export fields
const cleanEntries = entries.map((e, idx) => ({
  id: idx + 1,
  title: e.TITLE || 'Untitled',
  date: e.TIMESTAMP || '',
  author: e.AUTHOR || 'Paul',
  categories: e.CATEGORIES || '',
  content: e.ENTRY || ''
}));
console.log(JSON.stringify(cleanEntries, null, 2));
" > blog-export.json
- This uses the fallback parser; the actual export may have used a more robust library for better data extraction.
- Output: blog-export.json with structured blog entries (title, date, content, categories, etc.)
- Challenges: The database is in an old HSQLDB file format, so compatible libraries had to be found. The JSON export cleans up the raw DB data into a readable format.
Project Structure
projects/thingamablog-v2/
├── backend/ # Express.js server
│ ├── server.js # Main server file
│ ├── hsqldbParser.js # Legacy HSQLDB parser (fallback)
│ ├── blog-export.json # Exported blog data (1.7MB)
│ └── package.json
└── frontend/ # React app
  ├── src/
  │ ├── App.js # Main component
  │ ├── theme.js # Material-UI theme
  │ └── index.js
  └── package.json
Web App Features
- Backend: Express server serving API endpoints for posts
- Frontend: React with Material-UI for modern UI
- Filters: Browse by category (hierarchical) or date archives
- Images: Served statically from original blog image folder
- Responsive: Works on desktop and mobile
Running the Application
Prerequisites
- Node.js installed
- Backend dependencies: cd backend && npm install
- Frontend dependencies: cd frontend && npm install
Start Backend
cd projects/thingamablog-v2/backend
node server.js
- Runs on http://localhost:3637
- Loads blog-export.json if available, else falls back to the HSQLDB parser
- Serves images from /home/paulh/.openclaw/workspace/docs/pauls-blogs/Paul/1096292361887/web
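The JSON-first loading behaviour described above could look roughly like the following. This is a minimal sketch, not the code in server.js: the helper name loadPosts and its return shape are hypothetical, and the fallback is passed in as a function so the sketch does not depend on how the HSQLDB parser is actually invoked.

```javascript
const fs = require('fs');

// Hypothetical helper: returns posts from the JSON export when the file
// exists and parses to an array; otherwise calls the supplied fallback
// (for example, a function wrapping the HSQLDB parser).
function loadPosts(exportPath, fallback) {
  try {
    const raw = fs.readFileSync(exportPath, 'utf8');
    const posts = JSON.parse(raw);
    if (Array.isArray(posts)) {
      return { source: 'json', posts };
    }
  } catch (err) {
    // Missing, unreadable, or malformed export: use the fallback below.
  }
  return { source: 'fallback', posts: fallback() };
}
```

Returning the source alongside the posts makes it easy to log which path was taken at startup.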
Start Frontend
cd projects/thingamablog-v2/frontend
npm start
- Runs on http://localhost:3000 (opens automatically)
- Connects to backend at http://localhost:3637
Access the App
Open http://localhost:3000 in your browser.
UI Description
The web app provides a clean, modern interface to browse Paul's old blog posts:
Layout
- Header: "Thingamablog Archive" title
- Sidebar (Left): Browse filters
- "All Posts" - show everything
- "Categories" accordion - hierarchical categories (e.g., Hobbies > Car Maintenance)
- "Archives" accordion - posts by year
- Post List (Middle): Scrollable list of posts with title, date, author
- Post Content (Right): Full post display with images, categories as chips
Features
- Filtering: Click categories or years to filter posts
- Selection: Click a post to view full content
- Images: Embedded images load from served static files
- Responsive: Adapts to screen size (stacks vertically on mobile)
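As an illustration of the category and year filtering, here is a small sketch. It is not taken from App.js. It assumes that categories is a comma-separated string and that date is a value the built-in Date can parse; both are guesses about the export, and filterPosts is a hypothetical name.

```javascript
// Assumed post shape, based on the export fields: { id, title, date, categories }.
// `categories` is treated as a comma-separated string (an assumption).
function filterPosts(posts, { category = null, year = null } = {}) {
  return posts.filter((post) => {
    if (category !== null) {
      const names = String(post.categories || '')
        .split(',')
        .map((name) => name.trim());
      // Match the category itself or any child such as "Hobbies - Car Maintenance".
      const hit = names.some(
        (name) => name === category || name.startsWith(category + ' - ')
      );
      if (!hit) return false;
    }
    if (year !== null) {
      const parsed = new Date(post.date);
      // Posts with an unparseable date never match a year filter.
      if (Number.isNaN(parsed.getTime()) || parsed.getUTCFullYear() !== year) {
        return false;
      }
    }
    return true;
  });
}
```

Calling filterPosts(posts) with no options returns every post, which corresponds to the "All Posts" entry in the sidebar.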
Sample View
- Categories include: Hobbies, Personal, Robotics, etc.
- Posts date back to the 2000s and cover topics such as 3D printing, car maintenance, and tech projects
- Content includes HTML formatting, links, and images
API Endpoints
- GET /api/posts - List all posts (id, title, date, category)
- GET /api/posts/:id - Get full post data
- Images served at /1096292361887/web/* or /
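The two post endpoints can be sketched as a pure function, which keeps the routing logic testable without starting a server. This is an assumption-laden illustration rather than the contents of server.js: handleApiRequest is a hypothetical name, the error bodies are invented for the example, and the list view's category field is assumed to come from the export's categories field.

```javascript
// Pure routing function: takes the loaded posts plus a method and path,
// and returns the status and JSON body a server would send.
function handleApiRequest(posts, method, path) {
  if (method !== 'GET') {
    return { status: 405, body: { error: 'Method not allowed' } };
  }
  if (path === '/api/posts') {
    // List view: summary fields only, without the post content.
    const summaries = posts.map(({ id, title, date, categories }) => ({
      id,
      title,
      date,
      category: categories,
    }));
    return { status: 200, body: summaries };
  }
  const match = /^\/api\/posts\/(\d+)$/.exec(path);
  if (match) {
    const post = posts.find((p) => p.id === Number(match[1]));
    return post
      ? { status: 200, body: post }
      : { status: 404, body: { error: 'Post not found' } };
  }
  return { status: 404, body: { error: 'Not found' } };
}
```

In an Express app the same logic would normally live in app.get handlers; the pure-function form is used here only so the behaviour can be checked in isolation.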
Future Integration
This data can be ingested into Open Brain for vector search:
- Parse blog-export.json into chunks
- Embed with OpenAI
- Store in Supabase PGVector
- Enable semantic queries across Paul's 20+ year blog history
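The chunking step could be sketched as follows. Nothing here is implemented in the project yet: htmlToText, chunkPosts, and the maxChars parameter are hypothetical, and the embedding and storage steps are left out because their details are not specified.

```javascript
// Strip tags crudely and collapse whitespace. A regex is enough for a
// sketch; a real HTML parser would handle edge cases better.
function htmlToText(html) {
  return String(html).replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
}

// Split each post's text into chunks of roughly `maxChars` characters,
// breaking on word boundaries. A single word longer than `maxChars`
// is kept whole rather than split.
function chunkPosts(posts, maxChars = 1000) {
  const chunks = [];
  for (const post of posts) {
    const words = htmlToText(post.content || '').split(' ').filter(Boolean);
    let current = '';
    let index = 0;
    for (const word of words) {
      const candidate = current ? current + ' ' + word : word;
      if (candidate.length > maxChars && current) {
        chunks.push({ postId: post.id, chunkIndex: index++, text: current });
        current = word;
      } else {
        current = candidate;
      }
    }
    if (current) {
      chunks.push({ postId: post.id, chunkIndex: index, text: current });
    }
  }
  return chunks;
}
```

Keeping postId and chunkIndex on each chunk lets a later search result be traced back to the original post.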
Notes
- No tests or CI/CD set up
- Assumes local paths for images/database
- Backend prioritizes JSON export over HSQLDB parsing for speed
- Categories use <Category> and <Parent - Child> format
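To show how the category format maps onto a hierarchy, here is a short sketch. It assumes the angle brackets in the note are placeholders and that parent and child are separated by " - "; parseCategory and buildCategoryTree are hypothetical names, not functions from the project.

```javascript
// Split a label such as "Hobbies - Car Maintenance" into parent and child.
// Labels without the " - " separator are treated as top-level categories.
function parseCategory(label) {
  const trimmed = label.trim();
  const at = trimmed.indexOf(' - ');
  if (at === -1) {
    return { parent: trimmed, child: null };
  }
  return {
    parent: trimmed.slice(0, at).trim(),
    child: trimmed.slice(at + 3).trim(),
  };
}

// Group labels into { parent: [children...] } for a sidebar-style hierarchy.
// Duplicate children are ignored.
function buildCategoryTree(labels) {
  const tree = {};
  for (const label of labels) {
    const { parent, child } = parseCategory(label);
    if (!tree[parent]) tree[parent] = [];
    if (child && !tree[parent].includes(child)) tree[parent].push(child);
  }
  return tree;
}
```

For example, the labels "Hobbies", "Hobbies - Car Maintenance", and "Personal" would produce a tree with Hobbies containing Car Maintenance and Personal having no children.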