doc_processer/openspec/changes/add-doc-processing-api/tasks.md at main - doc_processer - texpixel


liuyuanchuang 874fd383cc init repo

2025-12-29 17:34:58 +08:00


1. Project Scaffolding

1.1 Create FastAPI project structure (app/, api/, core/, services/, schemas/)
1.2 Use uv to manage dependencies (fastapi, uvicorn, opencv-python, python-multipart, pydantic, httpx)
1.3 Create app/main.py with FastAPI app initialization
1.4 Create app/core/config.py with Pydantic Settings
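
The layout from task 1.1 could be scaffolded with a short script. This is a minimal sketch, assuming the packages live under app/ (as the file paths in the later tasks suggest) and that each one is a regular Python package with an `__init__.py`. The `scaffold` helper and the `PACKAGES` list are hypothetical names, not part of the project.

```python
from pathlib import Path

# Package directories inferred from the file paths used in this task list.
PACKAGES = [
    "app",
    "app/api",
    "app/api/v1",
    "app/api/v1/endpoints",
    "app/core",
    "app/services",
    "app/schemas",
]


def scaffold(root: Path) -> list[Path]:
    """Ensure each package directory exists with an __init__.py; return the dirs."""
    package_dirs = []
    for rel in PACKAGES:
        pkg = root / rel
        pkg.mkdir(parents=True, exist_ok=True)
        (pkg / "__init__.py").touch(exist_ok=True)
        package_dirs.append(pkg)
    return package_dirs


if __name__ == "__main__":
    for path in scaffold(Path.cwd()):
        print(path)
```

The script is idempotent, so rerunning it on an existing tree leaves files untouched.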

2. Image OCR API

2.1 Create request/response schemas in app/schemas/image.py
2.2 Implement image preprocessing service with OpenCV padding (app/services/image_processor.py)
2.3 Implement DocLayout-YOLO wrapper (app/services/layout_detector.py)
2.4 Implement PaddleOCR-VL client (app/services/ocr_service.py)
2.5 Create image OCR endpoint (app/api/v1/endpoints/image.py)
2.6 Wire up router and test endpoint
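
The padding step in task 2.2 is not specified further here. As one possible reading, the sketch below computes the border sizes needed to center an image in a square canvas. The square target and the `square_padding` name are assumptions made for illustration. The arithmetic is kept in pure Python so it can be checked without OpenCV installed.

```python
def square_padding(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (top, bottom, left, right) border sizes that center the image in a square."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    side = max(width, height)
    pad_w = side - width
    pad_h = side - height
    # Any odd leftover pixel goes to the bottom/right edge.
    top = pad_h // 2
    bottom = pad_h - top
    left = pad_w // 2
    right = pad_w - left
    return top, bottom, left, right


if __name__ == "__main__":
    print(square_padding(640, 480))  # (80, 80, 0, 0)
```

The tuple order matches the positional arguments of `cv2.copyMakeBorder(src, top, bottom, left, right, borderType, value=...)`, so the result could be passed straight through with `cv2.BORDER_CONSTANT` and a fill color of the project's choosing.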

3. Markdown to DOCX API

3.1 Create request/response schemas in app/schemas/convert.py
3.2 Integrate markdown_2_docx library (app/services/docx_converter.py)
3.3 Create conversion endpoint (app/api/v1/endpoints/convert.py)
3.4 Wire up router and test endpoint
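
When testing the conversion endpoint (task 3.4), a cheap sanity check is to confirm that the response body is structurally a DOCX file. A DOCX is a ZIP archive, so the sketch below checks for the two parts that Word and common generators conventionally write. The `looks_like_docx` helper is hypothetical and does not validate the document's content.

```python
import io
import zipfile

# Parts conventionally present in a DOCX package.
REQUIRED_PARTS = {"[Content_Types].xml", "word/document.xml"}


def looks_like_docx(data: bytes) -> bool:
    """Return True if data is a ZIP archive containing the core DOCX parts."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return REQUIRED_PARTS.issubset(archive.namelist())
    except zipfile.BadZipFile:
        return False
```

In an endpoint test this could be applied to the raw response bytes, alongside an assertion on the status code.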

4. Deployment

4.1 Create Dockerfile with CUDA base image for RTX 5080
4.2 Create docker-compose.yml (optional, for local development)
4.3 Document deployment steps in README

5. Validation

5.1 Test image OCR endpoint with sample images
5.2 Test Markdown to DOCX conversion
5.3 Verify Docker build and GPU access
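
For task 5.3, GPU visibility inside the container can be checked with a small script. This is a sketch, assuming the NVIDIA driver tools are exposed to the container so that `nvidia-smi` is on the PATH. The `list_gpus` name is hypothetical. It returns an empty list rather than raising when no GPU is visible.

```python
import shutil
import subprocess


def list_gpus(timeout: float = 10.0) -> list[str]:
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=timeout, check=False
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_gpus()
    print("\n".join(gpus) if gpus else "No NVIDIA GPU visible")
```

Running this inside a container started with Docker's `--gpus all` flag should list the card. An empty result usually points at the container runtime configuration, not the image.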
