feat: aggressively shrink the Docker image, keeping PPDocLayoutV3 only

- Remove doclayout-yolo (~4.8GB, torch/torchvision/triton)
- Replace opencv-python with opencv-python-headless (~200MB)
- Strip debug symbols from .so files (~300-800MB)
- Remove paddle C++ headers (~22MB)
- Use cuda:base instead of runtime (~3GB savings)
- Simplify dependencies: remove doc-parser extras
- Clean venv aggressively: no pip, setuptools, include/, share/
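
A minimal sketch of the venv cleanup and debug-symbol stripping listed above, not the commit's actual build step. It assumes a standard `lib/python*/site-packages` layout and GNU `strip` on PATH; the function names and the `strip_binary` parameter are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical names for illustration; the real build step may differ.
PRUNE_DIR_NAMES = ("include", "share")   # top-level venv directories to drop
PRUNE_PACKAGES = ("pip", "setuptools")   # packages not needed at runtime


def find_prunable(venv: Path) -> list[Path]:
    """Return the directories under `venv` that the cleanup would delete."""
    targets = [venv / name for name in PRUNE_DIR_NAMES if (venv / name).is_dir()]
    for site in sorted(venv.glob("lib/python*/site-packages")):
        for pkg in PRUNE_PACKAGES:
            if (site / pkg).is_dir():
                targets.append(site / pkg)
    return targets


def find_shared_objects(venv: Path) -> list[Path]:
    """Return regular files named like `libfoo.so` or `libfoo.so.1`."""
    return sorted(
        p for p in venv.rglob("*")
        if p.is_file() and not p.is_symlink()
        and (p.name.endswith(".so") or ".so." in p.name)
    )


def clean_venv(venv: Path, strip_binary: str = "strip") -> None:
    """Delete prunable directories, then strip debug symbols from .so files."""
    for target in find_prunable(venv):
        shutil.rmtree(target)
    if shutil.which(strip_binary) is None:
        return  # no strip tool available; skip symbol stripping
    for so in find_shared_objects(venv):
        # --strip-debug removes debug sections only; failures are ignored.
        subprocess.run([strip_binary, "--strip-debug", str(so)], check=False)
```

Calling `find_prunable` before `clean_venv` gives a dry run of what would be deleted. Removing pip means packages can no longer be installed into the venv afterwards.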

Expected size reduction:
  Before: 17GB
  After:  ~3GB (82% reduction)

Breakdown:
  - CUDA base: 0.4GB
  - Paddle: 0.7GB
  - PaddleOCR: 0.8GB
  - OpenCV-headless: 0.2GB
  - Other deps: 0.6GB
  Total: ~2.7-3GB
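
A quick arithmetic check of the figures above, using only the sizes quoted in this message. The variable and function names are illustrative.

```python
# Component sizes in GB, copied from the breakdown in this commit message.
components_gb = {
    "CUDA base": 0.4,
    "Paddle": 0.7,
    "PaddleOCR": 0.8,
    "OpenCV-headless": 0.2,
    "Other deps": 0.6,
}
BEFORE_GB = 17.0


def reduction(before_gb: float, after_gb: float) -> float:
    """Fraction of the original size that was removed."""
    return (before_gb - after_gb) / before_gb


total_gb = sum(components_gb.values())

# Evaluate both ends of the quoted "~2.7-3GB" range.
for after_gb in (total_gb, 3.0):
    print(f"{after_gb:.1f} GB -> {reduction(BEFORE_GB, after_gb):.0%} reduction")
```

The components sum to 2.7GB, which is about an 84% reduction. The quoted 82% corresponds to the 3GB upper end.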

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
liuyuanchuang
2026-03-10 11:33:50 +08:00
parent 95c497829f
commit ef98f37525
2 changed files with 34 additions and 59 deletions

@@ -11,7 +11,7 @@ authors = [
dependencies = [
"fastapi==0.128.0",
"uvicorn[standard]==0.40.0",
-"opencv-python==4.12.0.88",
+"opencv-python-headless==4.12.0.88", # headless: no Qt/FFmpeg GUI, server-only
"python-multipart==0.0.21",
"pydantic==2.12.5",
"pydantic-settings==2.12.0",
@@ -20,7 +20,6 @@ dependencies = [
"pillow==12.0.0",
"python-docx==1.2.0",
"paddleocr==3.4.0",
-"doclayout-yolo==0.0.4",
"latex2mathml==3.78.1",
"paddle==1.2.0",
"pypandoc==1.16.2",