Thingamablog v2 Project Documentation
Overview
This project extracts and displays blog entries from Paul's old Thingamablog (a Windows blog platform from the early 2000s). The process involved multiple iterations:
- Initial Attempt: Python script to parse HTML files (blog.html).
- Database Approach: Discovered the blogs were stored in an HSQLDB database (old Java-based SQL DB).
- Java/Spring Boot: Used old HSQLDB Java library in a Spring Boot app to extract data.
- Node.js Solution: Final implementation using a Node.js app to read HSQLDB directly and export to JSON.
- Web App: Built a full-stack web app (React frontend + Express backend) to browse and display the blog entries.
The result is a clean JSON export (blog-export.json) containing all blog posts, which can be browsed via a modern web interface.
Latest Updates (March 2026)
- ✅ Git Repository: Created on Gitea at https://paje.ca/git/paulh/thingamablog-v2
- ✅ Documentation: Comprehensive DESIGN.md and updated README.md
- ✅ Data Quality: 467 blog posts successfully extracted with perfect metadata
- 🔄 Git Push: Local repo initialized and committed, awaiting network resolution for push
- 🔗 Integration: Ready for Open Brain vector ingestion
Data Extraction Process
The data was extracted using the "Bridge" approach from the companion thingamablog-api project:
- Source: Old HSQLDB database files in docs/pauls-blogs/Paul/database/
- Tool: Java CLI application (ExportTool.java) with HSQLDB 1.8.0.10 driver
- Command: java -cp .:target/dependency/hsqldb-1.8.0.10.jar ExportTool > blog-export.json
- Output: blog-export.json with structured blog entries (title, date, content, categories, etc.)
- Quality: Perfect extraction - 467 entries, full HTML content preserved, clean metadata
- Size: 1.3MB of structured JSON data
Project Structure
projects/thingamablog-v2/
├── backend/ # Express.js server
│ ├── server.js # Main server file
│ ├── hsqldbParser.js # Legacy HSQLDB parser (fallback)
│ ├── blog-export.json # Exported blog data (1.7MB)
│ └── package.json
├── frontend/ # React app
│ ├── src/
│ │ ├── App.js # Main component
│ │ ├── theme.js # Material-UI theme
│ │ └── index.js
│ └── package.json
├── DESIGN.md # Technical design document
├── README.md # This file
└── .gitignore # Excludes node_modules, build artifacts
Running the Application
Prerequisites
- Node.js installed
- Backend dependencies: cd backend && npm install
- Frontend dependencies: cd frontend && npm install
Start Backend
cd projects/thingamablog-v2/backend
node server.js
- Runs on http://localhost:3637
- Loads blog-export.json if available, else falls back to HSQLDB parser
- Serves images from /home/paulh/.openclaw/workspace/docs/pauls-blogs/Paul/1096292361887/web
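The JSON-first loading with a parser fallback described above can be sketched roughly as follows. This is an illustration, not the actual code in server.js: the helper name loadPosts, the fallbackParser callback, and the assumption that blog-export.json holds a top-level array are all hypothetical.

```javascript
const fs = require("node:fs");

// Hypothetical helper (not taken from server.js): prefer the JSON export,
// and only call the slower fallback parser when the export file is missing.
// Assumes the export is a JSON array of post objects.
function loadPosts(exportPath, fallbackParser) {
  if (fs.existsSync(exportPath)) {
    const raw = fs.readFileSync(exportPath, "utf8");
    return { source: "json", posts: JSON.parse(raw) };
  }
  // fallbackParser stands in for whatever hsqldbParser.js exposes.
  return { source: "hsqldb", posts: fallbackParser() };
}

// Usage: an empty fallback keeps the example runnable without a database.
const { source, posts } = loadPosts("blog-export.json", () => []);
console.log(`Loaded ${posts.length} posts from ${source}`);
```

Returning the source alongside the posts makes it easy to log which path was taken at startup.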
Start Frontend
cd projects/thingamablog-v2/frontend
npm start
- Runs on http://localhost:3000 (opens automatically)
- Connects to backend at http://localhost:3637
Access the App
Open http://localhost:3000 in your browser.
UI Description
The web app provides a clean, modern interface to browse Paul's old blog posts:
Layout
- Header: "Thingamablog Archive" title
- Sidebar (Left): Browse filters
- "All Posts" - show everything
- "Categories" accordion - hierarchical categories (e.g., Hobbies > Car Maintenance)
- "Archives" accordion - posts by year
- Post List (Middle): Scrollable list of posts with title, date, author
- Post Content (Right): Full post display with images, categories as chips
Features
- Filtering: Click categories or years to filter posts
- Selection: Click a post to view full content
- Images: Embedded images load from served static files
- Responsive: Adapts to screen size (stacks vertically on mobile)
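As a rough sketch of how the category and year filters could work, the function below filters an in-memory post list. The post shape (a categories array and an ISO 8601 date string) and the sample posts are assumptions made for illustration; the real field names in blog-export.json may differ.

```javascript
// Assumed post shape: { id, title, date, categories: [...] }, with date as
// an ISO 8601 string. Both filters are optional; null means "no filter".
function filterPosts(posts, { category = null, year = null } = {}) {
  return posts.filter((post) => {
    const matchesCategory =
      category === null || (post.categories || []).includes(category);
    const matchesYear =
      year === null || new Date(post.date).getUTCFullYear() === year;
    return matchesCategory && matchesYear;
  });
}

// Invented sample data, only for demonstration.
const samplePosts = [
  { id: 1, title: "Oil change", date: "2004-09-27T12:00:00Z", categories: ["Hobbies"] },
  { id: 2, title: "Moving day", date: "2005-01-15T08:30:00Z", categories: ["Personal"] },
  { id: 3, title: "Robot arm", date: "2005-06-01T00:00:00Z", categories: ["Hobbies", "Robotics"] },
];

console.log(filterPosts(samplePosts, { category: "Hobbies", year: 2005 }).map((p) => p.id));
```

Calling filterPosts with no options returns every post, which corresponds to the "All Posts" entry in the sidebar.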
Sample View
- Categories include: Hobbies, Personal, Robotics, etc.
- Posts date back to the 2000s, covering topics like 3D printing, car maintenance, and tech projects
- Content includes HTML formatting, links, and images
API Endpoints
- GET /api/posts - List all posts (id, title, date, category)
- GET /api/posts/:id - Get full post data
- Images served at /1096292361887/web/* or /
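The difference between the list and detail endpoints can be illustrated with two small functions that a route handler might call. This is a sketch under assumptions: the function names are hypothetical, the summary fields follow the list above, and the content field on the full post is assumed rather than confirmed.

```javascript
// Hypothetical handler logic, not copied from server.js.
// The list endpoint returns only summary fields for each post.
function listPosts(posts) {
  return posts.map(({ id, title, date, category }) => ({ id, title, date, category }));
}

// The detail endpoint returns the full stored post, or null when the id
// is unknown. Route parameters arrive as strings, so compare as strings.
function getPost(posts, id) {
  return posts.find((post) => String(post.id) === String(id)) ?? null;
}

// Invented sample data, only for demonstration.
const demoPosts = [
  { id: 1, title: "First post", date: "2004-09-27", category: "Personal", content: "<p>Hello</p>" },
  { id: 2, title: "Brake pads", date: "2005-03-02", category: "Hobbies", content: "<p>Notes</p>" },
];

console.log(listPosts(demoPosts));
console.log(getPost(demoPosts, "2"));
```

Keeping the list response small avoids sending full HTML bodies for every post when the UI only needs titles and dates.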
Future Integration
This data can be ingested into Open Brain for vector search:
- Parse blog-export.json into chunks
- Embed with OpenAI
- Store in Supabase PGVector
- Enable semantic queries across Paul's 20+ year blog history
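The first step above (parsing posts into chunks) might look something like the sketch below. It covers only chunking; the embedding and storage steps are left out. The field names (id, title, content), the 1000-character default, and the regex-based tag stripping are assumptions, and a real pipeline may want a proper HTML parser.

```javascript
// Naive HTML-to-text conversion: drops tags and collapses whitespace.
// Good enough for a sketch; it does not decode HTML entities.
function htmlToText(html) {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Splits text on word boundaries into chunks of at most maxChars characters.
// A single word longer than maxChars becomes its own oversized chunk.
function chunkText(text, maxChars = 1000) {
  const chunks = [];
  let current = "";
  for (const word of text.split(" ")) {
    if (word === "") continue;
    if (current !== "" && current.length + 1 + word.length > maxChars) {
      chunks.push(current);
      current = word;
    } else {
      current = current === "" ? word : current + " " + word;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

// Attaches post metadata to each chunk so search hits can link back.
// Assumed post fields: id, title, content (HTML).
function postToChunks(post, maxChars = 1000) {
  return chunkText(htmlToText(post.content), maxChars).map((text, chunkIndex) => ({
    postId: post.id,
    title: post.title,
    chunkIndex,
    text,
  }));
}
```

Each chunk keeps its postId and chunkIndex, so a vector match can be traced back to the original entry.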
Development Status
- Backend: ✅ Complete, tested with 467 posts
- Frontend: ✅ Complete, Material-UI responsive design
- Data Extraction: ✅ Complete, high-quality JSON export
- Documentation: ✅ Complete (README.md, DESIGN.md)
- Git Repository: ✅ Created on Gitea, awaiting push due to network issues
- Testing: ✅ Manual testing successful
Notes
- No automated tests or CI/CD set up
- Assumes local paths for images/database
- Backend prioritizes JSON export over HSQLDB parsing for speed
- Categories use <Category> and <Parent - Child> format
- Network issues preventing final Git push - repos ready to push when connectivity restored
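To show how the category format noted above could be turned into the hierarchical sidebar, here is a small sketch. It assumes the angle brackets are placeholders, so stored labels look like "Hobbies" or "Hobbies - Car Maintenance"; the function name and the sample labels are hypothetical.

```javascript
// Groups flat category labels into { parent: [children] }.
// Assumes " - " separates parent from child, as in "Parent - Child".
function buildCategoryTree(labels) {
  const tree = new Map();
  for (const label of labels) {
    const [parent, ...rest] = label.split(" - ").map((part) => part.trim());
    if (!tree.has(parent)) tree.set(parent, new Set());
    // Anything after the first separator is treated as the child name.
    if (rest.length > 0) tree.get(parent).add(rest.join(" - "));
  }
  return Object.fromEntries([...tree].map(([parent, children]) => [parent, [...children]]));
}

// Invented labels, only for demonstration.
console.log(buildCategoryTree(["Hobbies", "Hobbies - Car Maintenance", "Personal"]));
```

Using a Set for children means a label that appears on many posts is listed only once under its parent.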