Commit Graph

11 Commits

Author SHA1 Message Date
84ce6f6b92 refactor: replace pdftoppm with go-fitz for in-process PDF rendering
Switch PDF page rendering from external pdftoppm/pdftocairo subprocess calls
to github.com/gen2brain/go-fitz (MuPDF wrapper), eliminating the poppler-utils
runtime dependency. Enable CGO in Dockerfile builder stage and install gcc/musl-dev
for the static MuPDF link; runtime image remains unchanged.
2026-03-31 21:21:17 +08:00
3e07c29376 fix: downgrade dependencies for go 1.20 compatibility 2026-03-31 19:37:51 +08:00
ac078a16bc fix: pin go directive to 1.20, add user ownership check on GetPDFTask
- Downgrade go directive in go.mod from 1.23.0 back to 1.20 to match
  Docker builder image (golang:1.20-alpine); re-run go mod tidy with
  go1.20 (via gvm) to keep go.sum consistent
- GetPDFTask now verifies callerUserID matches task.UserID to prevent
  cross-user data exposure of PDF page content

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 14:52:20 +08:00
9d712c921a feat: add PDF document recognition with 10-page pre-hook
- Migrate recognition_results table to JSON schema (meta_data + content),
  replacing flat latex/markdown/mathml/mml columns
- Add TaskTypePDF constant and update all formula read/write paths
- Add PDFRecognitionService using pdftoppm (Poppler) for CGO-free page
  rendering; limits processing to first 10 pages (pre-hook)
- Reuse existing downstream OCR endpoint (cloud.texpixel.com) for each
  page image; stores results as [{page_number, markdown}] JSON array
- Add Redis queue + distributed lock for PDF worker goroutine
- Add REST endpoints: POST /v1/pdf/recognition, GET /v1/pdf/recognition/:task_no
- Add .pdf to OSS upload file type whitelist
- Add migrations/pdf_recognition.sql for safe data migration

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 14:17:44 +08:00
liuyuanchuang
52c9e48a0f fix: rm router db 2026-01-27 22:22:06 +08:00
liuyuanchuang
a04eedc423 feat: add track point 2026-01-27 22:20:07 +08:00
liuyuanchuang
a5f1ad153e refactor: update package path 2026-01-27 21:56:21 +08:00
5ee1cea0d7 feat: add gls 2025-12-26 17:11:59 +08:00
0bc77f61e2 feat: update dockerfile 2025-12-10 23:17:24 +08:00
083142491f refact: update oss config 2025-12-10 22:23:05 +08:00
liuyuanchuang
48e63894eb init repo 2025-12-10 18:33:37 +08:00