Modern web app for browsing Thingamablog posts (React + Node.js)

Latest commit: be1a25c6fb by Paul Huliganga (2026-03-03 11:09:35 -05:00). Update README.md with latest findings: add Git repository status and Gitea links; document successful data extraction (467 posts, 1.3MB); include current development status and testing results; add Open Brain integration readiness; update project structure with new files.

Thingamablog v2 Project Documentation

Overview

This project extracts and displays blog entries from Paul's old Thingamablog (a Windows blog platform from the early 2000s). The process involved multiple iterations:

Initial Attempt: Python script to parse HTML files (blog.html).
Database Approach: Discovered the blogs were stored in an HSQLDB database (old Java-based SQL DB).
Java/Spring Boot: Used the old HSQLDB Java library in a Spring Boot app to extract the data.
Node.js Solution: Final implementation using a Node.js app to read HSQLDB directly and export to JSON.
Web App: Built a full-stack web app (React frontend + Express backend) to browse and display the blog entries.

The result is a clean JSON export (blog-export.json) containing all blog posts, which can be browsed via a modern web interface.

Latest Updates (March 2026)

✅ Git Repository: Created on Gitea at https://paje.ca/git/paulh/thingamablog-v2
✅ Documentation: Comprehensive DESIGN.md and updated README.md
✅ Data Quality: 467 blog posts successfully extracted with perfect metadata
🔄 Git Push: Local repo initialized and committed, awaiting network resolution for push
🔗 Integration: Ready for Open Brain vector ingestion

Data Extraction Process

The data was extracted using the "Bridge" approach from the companion thingamablog-api project:

Source: Old HSQLDB database files in docs/pauls-blogs/Paul/database/
Tool: Java CLI application (ExportTool.java) with HSQLDB 1.8.0.10 driver
Command: java -cp .:target/dependency/hsqldb-1.8.0.10.jar ExportTool > blog-export.json
Output: blog-export.json with structured blog entries (title, date, content, categories, etc.)
Quality: Perfect extraction - 467 entries, full HTML content preserved, clean metadata
Size: 1.3MB of structured JSON data
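
As a quick sanity check on the export, something like the following could confirm the entry count and flag entries with missing fields. This is a minimal sketch, not part of the project: it assumes blog-export.json is a JSON array of objects with string `title` and `content` fields, and `summarizeExport` is a hypothetical helper name.

```javascript
// Sketch: summarize an exported entries array.
// ASSUMPTION: the export is a JSON array of objects with `title` and `content` strings.
function summarizeExport(entries) {
  if (!Array.isArray(entries)) {
    throw new TypeError('Expected the export to be a JSON array of entries');
  }
  // A field counts as missing when it is absent, not a string, or blank.
  const isBlank = (value) => typeof value !== 'string' || value.trim() === '';
  return {
    count: entries.length,
    missingTitle: entries.filter((e) => !e || isBlank(e.title)).length,
    missingContent: entries.filter((e) => !e || isBlank(e.content)).length,
  };
}

// Usage against the real file (path relative to the backend directory is an assumption):
// const entries = JSON.parse(require('fs').readFileSync('blog-export.json', 'utf8'));
// console.log(summarizeExport(entries));
```

If the export matches the figures above, `count` should be 467.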

Project Structure

projects/thingamablog-v2/
├── backend/          # Express.js server
│   ├── server.js     # Main server file
│   ├── hsqldbParser.js # Legacy HSQLDB parser (fallback)
│   ├── blog-export.json # Exported blog data (1.3MB)
│   └── package.json
├── frontend/         # React app
│   ├── src/
│   │   ├── App.js    # Main component
│   │   ├── theme.js  # Material-UI theme
│   │   └── index.js
│   └── package.json
├── DESIGN.md         # Technical design document
├── README.md         # This file
└── .gitignore        # Excludes node_modules, build artifacts

Running the Application

Prerequisites

Node.js installed
Backend dependencies: cd backend && npm install
Frontend dependencies: cd frontend && npm install

Start Backend

cd projects/thingamablog-v2/backend
node server.js

Runs on http://localhost:3637
Loads blog-export.json if available; otherwise falls back to the legacy HSQLDB parser
Serves images from /home/paulh/.openclaw/workspace/docs/pauls-blogs/Paul/1096292361887/web
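
The JSON-first loading behaviour described above could look roughly like this. It is a sketch under assumptions, not the actual server.js: `loadPosts` is a hypothetical name, and `parseLegacy` stands in for whatever hsqldbParser.js really exports.

```javascript
const fs = require('fs');

// Sketch: prefer the JSON export, fall back to the legacy parser.
// ASSUMPTION: `parseLegacy` is a zero-argument function returning an array of posts;
// the real interface of hsqldbParser.js may differ.
function loadPosts(exportPath, parseLegacy) {
  if (fs.existsSync(exportPath)) {
    try {
      const posts = JSON.parse(fs.readFileSync(exportPath, 'utf8'));
      return { source: 'json', posts };
    } catch (err) {
      // An unreadable or malformed export should not stop the server from starting.
      console.warn(`Could not read ${exportPath}: ${err.message}`);
    }
  }
  return { source: 'hsqldb', posts: parseLegacy() };
}
```

Returning the `source` alongside the posts makes it easy to log which path was taken at startup.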

Start Frontend

cd projects/thingamablog-v2/frontend
npm start

Runs on http://localhost:3000 (opens automatically)
Connects to backend at http://localhost:3637

Access the App

Open http://localhost:3000 in your browser.

UI Description

The web app provides a clean, modern interface to browse Paul's old blog posts:

Layout

Header: "Thingamablog Archive" title
Sidebar (Left): Browse filters
- "All Posts" - show everything
- "Categories" accordion - hierarchical categories (e.g., Hobbies > Car Maintenance)
- "Archives" accordion - posts by year
Post List (Middle): Scrollable list of posts with title, date, author
Post Content (Right): Full post display with images, categories as chips
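
The hierarchical category list in the sidebar can be derived from flat category names. The sketch below assumes child categories are written as "Parent - Child" (the format mentioned in the Notes) with " - " as the separator; `buildCategoryTree` is a hypothetical helper, not a function from the project.

```javascript
// Sketch: group flat category names into a parent/children structure.
// ASSUMPTION: a child category is written "Parent - Child"; a name without " - " is top-level.
function buildCategoryTree(names) {
  const tree = new Map(); // parent name -> Set of child names, in first-seen order
  for (const name of names) {
    const sep = name.indexOf(' - ');
    const parent = sep === -1 ? name.trim() : name.slice(0, sep).trim();
    const child = sep === -1 ? null : name.slice(sep + 3).trim();
    if (!tree.has(parent)) tree.set(parent, new Set());
    if (child) tree.get(parent).add(child);
  }
  return [...tree].map(([parentName, children]) => ({
    name: parentName,
    children: [...children],
  }));
}
```

For example, `buildCategoryTree(['Hobbies', 'Hobbies - Car Maintenance'])` yields a single "Hobbies" node with "Car Maintenance" as its child.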

Features

Filtering: Click categories or years to filter posts
Selection: Click a post to view full content
Images: Embedded images load from served static files
Responsive: Adapts to screen size (stacks vertically on mobile)
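
The category and year filtering can be sketched as a pure function. This is an illustration under assumptions rather than the code in App.js: it assumes each post has a `categories` array of strings and a `date` string that starts with a four-digit year (for example ISO 8601); `filterPosts` is a hypothetical name.

```javascript
// Sketch: filter posts by an optional category and an optional year.
// ASSUMPTIONS: `post.categories` is an array of strings and `post.date`
// begins with a four-digit year, e.g. "2004-06-15".
function filterPosts(posts, { category = null, year = null } = {}) {
  return posts.filter((post) => {
    const inCategory = category === null || (post.categories || []).includes(category);
    const inYear = year === null || Number(String(post.date).slice(0, 4)) === year;
    return inCategory && inYear;
  });
}
```

Calling `filterPosts(posts)` with no filters returns every post, which corresponds to the "All Posts" entry in the sidebar.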

Sample View

Categories include: Hobbies, Personal, Robotics, etc.
Posts date back to the 2000s, covering topics like 3D printing, car maintenance, and tech projects
Content includes HTML formatting, links, and images

API Endpoints

GET /api/posts - List all posts (id, title, date, category)
GET /api/posts/:id - Get full post data
Images are served as static files under /1096292361887/web/* or from the root path (/)
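
The behaviour of the two JSON endpoints can be illustrated with a framework-free routing function. This is only a sketch: the real backend uses Express, `handleApiRequest` is a hypothetical name, and the post fields (`id`, `title`, `date`, `category`, `content`) are assumed from the endpoint descriptions above.

```javascript
// Sketch: map a request path to a status code and JSON body.
// ASSUMPTION: posts carry `id`, `title`, `date`, `category`, and `content` fields.
function handleApiRequest(posts, pathname) {
  if (pathname === '/api/posts') {
    // The list endpoint returns summaries only, without the full HTML content.
    const summaries = posts.map(({ id, title, date, category }) => ({ id, title, date, category }));
    return { status: 200, body: summaries };
  }
  const match = /^\/api\/posts\/([^/]+)$/.exec(pathname);
  if (match) {
    // Compare as strings so numeric and string ids both work.
    const post = posts.find((p) => String(p.id) === match[1]);
    return post
      ? { status: 200, body: post }
      : { status: 404, body: { error: 'Post not found' } };
  }
  return { status: 404, body: { error: 'Unknown endpoint' } };
}
```

Keeping the list response small and fetching full content per post is one reasonable way to keep the initial load fast for several hundred entries.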

Future Integration

This data can be ingested into Open Brain for vector search:

Parse blog-export.json into chunks
Embed with OpenAI
Store in Supabase PGVector
Enable semantic queries across Paul's 20+ year blog history
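
The first step, parsing posts into chunks, could be sketched as follows. The embedding and storage steps are left out because they depend on external services. Everything here is an assumption for illustration: the `chunkPost` name, the 1000-character default, the crude regex-based tag stripping, and the `id`, `title`, and `content` fields.

```javascript
// Sketch: turn one post's HTML into plain-text chunks of at most `maxChars` characters.
// ASSUMPTIONS: posts have `id`, `title`, and HTML `content`; tag stripping via regex is
// a rough approximation and not a real HTML parser.
function chunkPost(post, maxChars = 1000) {
  const text = String(post.content || '')
    .replace(/<[^>]*>/g, ' ') // drop tags
    .replace(/\s+/g, ' ')     // collapse whitespace
    .trim();
  const chunks = [];
  let current = '';
  for (const word of text.split(' ')) {
    if (current && current.length + 1 + word.length > maxChars) {
      chunks.push(current);
      current = word; // a single word longer than maxChars becomes its own oversized chunk
    } else {
      current = current ? `${current} ${word}` : word;
    }
  }
  if (current) chunks.push(current);
  return chunks.map((chunkText, index) => ({
    postId: post.id,
    title: post.title,
    index,
    text: chunkText,
  }));
}
```

Carrying `postId` and `title` on every chunk lets a later search result be traced back to its original post.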

Development Status

Backend: ✅ Complete, tested with 467 posts
Frontend: ✅ Complete, Material-UI responsive design
Data Extraction: ✅ Complete, high-quality JSON export
Documentation: ✅ Complete (README.md, DESIGN.md)
Git Repository: ✅ Created on Gitea, awaiting push due to network issues
Testing: ✅ Manual testing successful

Notes

No automated tests or CI/CD are set up
Assumes local paths for images/database
Backend prioritizes JSON export over HSQLDB parsing for speed
Categories use the <Category> and <Parent - Child> formats
Network issues are preventing the final Git push; the repo is ready to push once connectivity is restored