Java tool for extracting Thingamablog HSQLDB data to JSON
Paul Huliganga 08c6e492fc Add comprehensive README.md documentation
- Setup and build instructions for Java export tool
- Usage examples and sample output
- Data quality metrics and troubleshooting
- Integration details with thingamablog-v2
- File structure and development notes
2026-03-03 11:09:31 -05:00
src/main Initial commit: Java HSQLDB export tool 2026-03-03 10:51:01 -05:00
.gitignore Initial commit: Java HSQLDB export tool 2026-03-03 10:51:01 -05:00
ExportTool.java Initial commit: Java HSQLDB export tool 2026-03-03 10:51:01 -05:00
README.md Add comprehensive README.md documentation 2026-03-03 11:09:31 -05:00
pom.xml Initial commit: Java HSQLDB export tool 2026-03-03 10:51:01 -05:00

Thingamablog API - Data Extraction Tool

Overview

This is the data extraction component of the Thingamablog migration project. It contains a Java CLI tool that bridges the legacy HSQLDB database files to JSON, giving the web application clean, structured blog data to serve.

Purpose

The Thingamablog platform (early 2000s) stored blog posts in an obsolete HSQLDB database format. This tool extracts that data into a clean JSON format that can be consumed by modern applications.

Architecture

  • Input: HSQLDB database files (database.script, database.data)
  • Tool: ExportTool.java - JDBC-based Java application
  • Driver: HSQLDB 1.8.0.10 JAR (legacy compatible)
  • Output: blog-export.json - Structured JSON array of blog posts
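
As a rough illustration of the JDBC side of this architecture, the sketch below builds a file-mode HSQLDB URL and opens a connection with the 1.8 driver class. The class name `LegacyConnection`, the base name `database` (inferred from the file names listed above), and the default `sa` user with an empty password are assumptions, not details confirmed by ExportTool.java.

```java
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

final class LegacyConnection {

    // HSQLDB addresses a file database by directory plus base name, without
    // an extension: ".../database" covers database.script, database.data, etc.
    static String jdbcUrl(Path databaseDir, String baseName) {
        String location = databaseDir.resolve(baseName).toString().replace('\\', '/');
        return "jdbc:hsqldb:file:" + location;
    }

    // Assumption: the database still uses HSQLDB's default "sa" account
    // with an empty password.
    static Connection open(Path databaseDir) throws SQLException, ClassNotFoundException {
        Class.forName("org.hsqldb.jdbcDriver"); // driver class shipped in the 1.8 JAR
        return DriverManager.getConnection(jdbcUrl(databaseDir, "database"), "sa", "");
    }
}
```

Calling `open` requires the HSQLDB JAR on the classpath; `jdbcUrl` has no such requirement and can be checked on its own.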

Setup & Build

Prerequisites

  • Java 8 or higher
  • Maven 3.x (for dependency management)

Dependencies

  • HSQLDB 1.8.0.10 JAR (automatically downloaded by Maven)
  • Maven coordinates: org.hsqldb:hsqldb:1.8.0.10

Build Process

# Download dependencies
mvn dependency:copy-dependencies

# Compile the tool
javac -cp target/dependency/hsqldb-1.8.0.10.jar ExportTool.java

# The compiled class will be in the root directory

Usage

Command Line

java -cp .:target/dependency/hsqldb-1.8.0.10.jar ExportTool > ../thingamablog-v2/backend/blog-export.json

What It Does

  1. Connects to HSQLDB database at /home/paulh/.openclaw/workspace/docs/pauls-blogs/Paul/database/
  2. Queries the ENTRY_TABLE_1096292361887 table
  3. Maps database columns to JSON fields:
    • ID → id
    • TITLE → title
    • TIMESTAMP → date
    • ENTRY → content
    • CATEGORIES → categories
    • AUTHOR → author
  4. Outputs clean JSON array to stdout
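
The column mapping in step 3 can be sketched as a small lookup table. This is a minimal illustration rather than the code in ExportTool.java: the class name `ColumnMapping`, the `SELECT *` query, and the idea of passing each row as a `Map` are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ColumnMapping {

    static final String TABLE = "ENTRY_TABLE_1096292361887";

    // Assumption: a plain SELECT * is enough; the real tool may list columns.
    static final String QUERY = "SELECT * FROM " + TABLE;

    // Database column -> JSON field, in the order listed above.
    static final Map<String, String> FIELDS = new LinkedHashMap<>();
    static {
        FIELDS.put("ID", "id");
        FIELDS.put("TITLE", "title");
        FIELDS.put("TIMESTAMP", "date");
        FIELDS.put("ENTRY", "content");
        FIELDS.put("CATEGORIES", "categories");
        FIELDS.put("AUTHOR", "author");
    }

    // Renames the keys of one row; columns without a mapping are dropped.
    static Map<String, Object> toJsonFields(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : FIELDS.entrySet()) {
            if (row.containsKey(field.getKey())) {
                out.put(field.getValue(), row.get(field.getKey()));
            }
        }
        return out;
    }
}
```

Keeping the mapping in one ordered map means the JSON field order follows the table above regardless of the column order the database returns.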

Sample Output

[
  {
    "id": 1,
    "title": "Digital Imaging Notes",
    "date": "2003-11-03 16:41:22.053",
    "author": "Paul",
    "categories": "Hobbies",
    "content": "<p>Full HTML content preserved...</p>"
  }
]
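
The `date` value in the sample has the same shape as the text form of `java.sql.Timestamp` (`yyyy-mm-dd hh:mm:ss.fff`). A consumer that prefers ISO-8601 local date-times could convert it roughly as follows; the class name `ExportDates` is hypothetical, and the sketch assumes every exported date follows the sample's shape.

```java
import java.sql.Timestamp;

final class ExportDates {

    // Parses the exported text with Timestamp.valueOf, which accepts a
    // variable number of fractional digits, and returns the ISO-8601
    // local form. No time zone is applied or assumed.
    static String toIsoLocal(String exportedDate) {
        return Timestamp.valueOf(exportedDate).toLocalDateTime().toString();
    }

    public static void main(String[] args) {
        // prints 2003-11-03T16:41:22.053
        System.out.println(toIsoLocal("2003-11-03 16:41:22.053"));
    }
}
```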

Data Quality

The export has the following properties:

  • Titles and dates extracted intact
  • Full HTML content preserved
  • Categories properly extracted
  • Sequential IDs assigned
  • JSON validation passes
  • 467 entries successfully extracted (1.3MB)

Integration

The exported JSON feeds directly into the thingamablog-v2 web application:

  1. Place blog-export.json in ../thingamablog-v2/backend/
  2. The Node.js backend prioritizes this clean JSON over the fallback HSQLDB parser
  3. Web app serves posts via REST API

Troubleshooting

Common Issues

JDBC Driver Not Found

Error: org.hsqldb.jdbcDriver
  • Ensure Maven has downloaded the dependency: mvn dependency:copy-dependencies
  • Check classpath includes target/dependency/hsqldb-1.8.0.10.jar
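
To confirm the classpath before running the export, a small check along these lines may help. The class name `DriverCheck` is hypothetical; it only tests whether the driver class can be loaded and does not open a database.

```java
final class DriverCheck {

    // Returns true when the named class can be loaded from the classpath.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String driver = "org.hsqldb.jdbcDriver";
        if (isOnClasspath(driver)) {
            System.out.println("Found " + driver);
        } else {
            System.err.println("Missing " + driver
                    + ": add target/dependency/hsqldb-1.8.0.10.jar to -cp");
        }
    }
}
```

Run it with the same `-cp` value as the export command so that both see the same classpath.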

Database Path Issues

SQL Exception: file not found
  • Verify HSQLDB files exist at the hardcoded path
  • Ensure read permissions on database files
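
Both checks can be automated with a few lines of standard Java. The sketch assumes the two file names given in the Architecture section; `DatabaseFileCheck` is a hypothetical helper, not part of the tool.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class DatabaseFileCheck {

    // Assumption: these are the files the export needs. A database with
    // only in-memory tables may legitimately lack database.data.
    static final String[] EXPECTED = {"database.script", "database.data"};

    // Returns the expected files that are missing or not readable.
    static List<String> missingOrUnreadable(Path databaseDir) {
        List<String> problems = new ArrayList<>();
        for (String name : EXPECTED) {
            if (!Files.isReadable(databaseDir.resolve(name))) {
                problems.add(name);
            }
        }
        return problems;
    }
}
```

An empty result means both files exist and the current user can read them.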

Empty Output

  • Check database file integrity
  • Verify table name ENTRY_TABLE_1096292361887 exists
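
One way to verify the table name without starting the driver is to scan database.script, which HSQLDB 1.8 stores as plain-text SQL including the `CREATE ... TABLE` statements. The sketch below assumes the table name appears unquoted in that file; `ScriptTableCheck` is a hypothetical helper.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

final class ScriptTableCheck {

    // True when some CREATE statement in the script names exactly this table.
    static boolean definesTable(Path scriptFile, String tableName) throws IOException {
        String needle = " TABLE " + tableName.toUpperCase(Locale.ROOT);
        // ISO-8859-1 decodes any byte, so odd characters cannot abort the scan.
        try (BufferedReader reader =
                     Files.newBufferedReader(scriptFile, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String upper = line.toUpperCase(Locale.ROOT);
                int at = upper.indexOf(needle);
                if (upper.startsWith("CREATE ") && at >= 0
                        && endsIdentifier(upper, at + needle.length())) {
                    return true;
                }
            }
        }
        return false;
    }

    // Guards against prefix matches such as ENTRY_TABLE_1 vs ENTRY_TABLE_10.
    private static boolean endsIdentifier(String text, int index) {
        if (index >= text.length()) {
            return true;
        }
        char next = text.charAt(index);
        return !Character.isLetterOrDigit(next) && next != '_';
    }
}
```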

Legacy Considerations

  • HSQLDB 1.8.0.10 belongs to the 1.8 release line from the mid-2000s, so the file format is very old
  • Modern HSQLDB versions may not read these files
  • The "bridge" approach confines the legacy HSQLDB driver to this export tool, so downstream applications only need the JSON

File Structure

thingamablog-api/
├── ExportTool.java          # Main extraction tool
├── pom.xml                  # Maven configuration
├── src/main/java/...        # Additional Spring Boot components (unused)
├── target/dependency/       # Maven dependencies
└── .gitignore               # Excludes build artifacts

Development Notes

  • Originally attempted with Spring Boot and newer HSQLDB drivers
  • Simplified to standalone Java CLI for reliability
  • Hardcoded paths for single-purpose extraction
  • JSON escaping implemented for HTML content safety
  • thingamablog-v2: Web application that consumes the exported JSON
  • docs/thingamablog-extract: Alternative extraction results (Markdown format)
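
The JSON escaping mentioned in the notes above matters because blog entries contain quotes, backslashes, and line breaks inside their HTML. The following is a minimal sketch of such an escaper following the JSON string rules; it is not the routine from ExportTool.java, and the name `JsonEscape` is hypothetical.

```java
final class JsonEscape {

    // Returns the value as a quoted JSON string literal, or null as JSON null.
    static String quote(String value) {
        if (value == null) {
            return "null";
        }
        StringBuilder sb = new StringBuilder(value.length() + 2).append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                case '\b': sb.append("\\b"); break;
                case '\f': sb.append("\\f"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters need a 4-digit hex escape.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }
}
```

HTML markup such as `<p>` passes through unchanged; only characters that would break the JSON string itself are rewritten.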

Future Improvements

  • Parameterize database path and output file
  • Add command-line arguments for flexibility
  • Support for other HSQLDB table schemas
  • Integration with modern database migration tools
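
As a starting point for the first two items, command-line parsing could look something like the sketch below. The flags `--db` and `--out` and the class `ExportOptions` are hypothetical; the tool does not currently accept any arguments.

```java
final class ExportOptions {

    final String databaseDir; // directory holding the HSQLDB files
    final String outputFile;  // null means "write to stdout", as the tool does today

    private ExportOptions(String databaseDir, String outputFile) {
        this.databaseDir = databaseDir;
        this.outputFile = outputFile;
    }

    // Parses "--db <dir>" and "--out <file>" in any order; anything else is rejected.
    static ExportOptions parse(String[] args, String defaultDatabaseDir) {
        String dir = defaultDatabaseDir;
        String out = null;
        for (int i = 0; i < args.length; i++) {
            String flag = args[i];
            if (!flag.equals("--db") && !flag.equals("--out")) {
                throw new IllegalArgumentException("Unknown option: " + flag);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            String value = args[++i];
            if (flag.equals("--db")) {
                dir = value;
            } else {
                out = value;
            }
        }
        return new ExportOptions(dir, out);
    }
}
```

Passing the current hardcoded path as `defaultDatabaseDir` would keep the existing behavior when no arguments are given.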