# TexTeller Docker Deployment Guide

This guide explains how to deploy TexTeller using Docker with NVIDIA GPU support (optimized for RTX 5080).

## Prerequisites

1. **NVIDIA Driver**: Install NVIDIA driver version 525 or later
2. **NVIDIA Container Toolkit**: Required for GPU access in Docker containers
3. **Docker**: Version 20.10 or later
4. **Docker Compose**: Version 1.29 or later (or use `docker compose` v2)
5. **Pre-downloaded Model**: Model should be in `~/.cache/huggingface/hub/models--OleehyO--TexTeller/`

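As a quick sanity check before building, the commands below verify each prerequisite (a minimal sketch; adjust paths if your model cache lives elsewhere):

```bash
# Driver version (should report 525 or later)
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Docker and Compose versions
docker --version
docker compose version || docker-compose --version

# Pre-downloaded model files
ls ~/.cache/huggingface/hub/models--OleehyO--TexTeller/
```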
## Set Up the NVIDIA Container Toolkit

If you haven't installed the NVIDIA Container Toolkit yet:

```bash
# Add the package repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Install nvidia-container-toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Restart Docker
sudo systemctl restart docker
```
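If the installation and Docker restart succeeded, the NVIDIA runtime should appear in Docker's runtime list (a quick check; the exact output varies by Docker version):

```bash
# Look for "nvidia" among the registered runtimes
docker info | grep -i runtimes
```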
## Quick Start

The easiest way to deploy is using the provided deployment script:

```bash
# Run all checks and deploy
./deploy.sh deploy

# Or check system requirements first
./deploy.sh check

# View available commands
./deploy.sh
```
## Build and Run

### Using the Deployment Script (Recommended)

```bash
# Full deployment (checks, build, and start)
./deploy.sh deploy

# Just build the image
./deploy.sh build

# Start/stop the service
./deploy.sh start
./deploy.sh stop

# View logs
./deploy.sh logs

# Check status
./deploy.sh status
```
### Using Docker Compose

```bash
# Build and start the service
docker-compose up -d

# View logs
docker-compose logs -f

# Stop the service
docker-compose down
```
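If you use the Compose v2 plugin mentioned in the prerequisites, the same workflow uses the `docker compose` subcommand instead:

```bash
docker compose up -d
docker compose logs -f
docker compose down
```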
### Using Docker directly

```bash
# Build the image
docker build -t texteller:latest .

# Run the container
docker run -d \
  --name texteller-server \
  --gpus '"device=0"' \
  -p 8001:8001 \
  -v ~/.cache/huggingface/hub/models--OleehyO--TexTeller:/root/.cache/huggingface/hub/models--OleehyO--TexTeller:ro \
  -e CUDA_VISIBLE_DEVICES=0 \
  texteller:latest
```
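Whichever method you use, it is worth confirming that the container came up and watching the startup logs, since model loading takes a little while (see Performance Notes below):

```bash
# Confirm the container is running
docker ps --filter name=texteller-server

# Follow startup logs until the model has loaded
docker logs -f texteller-server
```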
## API Usage

The server accepts JSON requests at the `/predict` endpoint, containing either a base64-encoded image or an image URL.

### Using base64-encoded image

```bash
# Example with base64 image
curl -X POST http://localhost:8001/predict \
  -H "Content-Type: application/json" \
  -d '{
    "image_base64": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA..."
  }'
```

### Using image URL

```bash
# Example with image URL
curl -X POST http://localhost:8001/predict \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/math_equation.png"
  }'
```
### Python client example

```python
import requests
import base64

# Method 1: Using base64
with open("equation.png", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "http://localhost:8001/predict",
    json={"image_base64": image_base64}
)
print(response.json())

# Method 2: Using URL
response = requests.post(
    "http://localhost:8001/predict",
    json={"image_url": "https://example.com/math_equation.png"}
)
print(response.json())
```

Or use the provided test script:

```bash
# Test with a local image
python examples/test_server.py path/to/equation.png

# Test with both local and URL
python examples/test_server.py path/to/equation.png https://example.com/formula.png
```
### Response format

Success response:

```json
{
  "result": "\\frac{a}{b} = c"
}
```

Error response:

```json
{
  "error": "Failed to decode image"
}
```
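Putting the request and response together, here is a one-shot shell test that encodes a local image and prints only the recognized LaTeX. It is a sketch that assumes GNU `base64` and `jq` are installed and that the server accepts raw base64 as in the Python example above:

```bash
# Encode equation.png, send it to the server, and extract the "result" field
curl -s -X POST http://localhost:8001/predict \
  -H "Content-Type: application/json" \
  -d "{\"image_base64\": \"$(base64 -w0 equation.png)\"}" \
  | jq -r '.result'
```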
## Configuration

You can configure the service by modifying environment variables in `docker-compose.yml`:

- `CUDA_VISIBLE_DEVICES`: GPU device ID (default: 0)
- `RAY_NUM_REPLICAS`: Number of Ray Serve replicas (default: 1)
- `RAY_NCPU_PER_REPLICA`: CPUs per replica (default: 4)
- `RAY_NGPU_PER_REPLICA`: GPUs per replica (default: 1)

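If you run the container directly with `docker run` instead of Compose, the same variables can be overridden on the command line (illustrative values, mirroring the run command above):

```bash
docker run -d \
  --name texteller-server \
  --gpus '"device=0"' \
  -p 8001:8001 \
  -v ~/.cache/huggingface/hub/models--OleehyO--TexTeller:/root/.cache/huggingface/hub/models--OleehyO--TexTeller:ro \
  -e CUDA_VISIBLE_DEVICES=0 \
  -e RAY_NUM_REPLICAS=1 \
  -e RAY_NCPU_PER_REPLICA=4 \
  -e RAY_NGPU_PER_REPLICA=1 \
  texteller:latest
```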
## Monitoring

```bash
# Check container status
docker ps

# View real-time logs
docker-compose logs -f texteller

# Check GPU usage
nvidia-smi

# Check container resource usage
docker stats texteller-server
```
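For continuous GPU monitoring while requests are being served, `nvidia-smi` can also poll utilization and memory once per second:

```bash
# Print GPU utilization and memory usage every second (Ctrl+C to stop)
nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv -l 1
```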
## Troubleshooting

### GPU not detected

```bash
# Verify NVIDIA runtime is available
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```
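If that check fails even though the driver works on the host, re-registering the NVIDIA runtime with Docker and restarting the daemon often helps (this assumes a recent nvidia-container-toolkit that ships the `nvidia-ctk` tool):

```bash
# Re-register the NVIDIA runtime with Docker, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```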
### Port already in use

Change the port mapping in `docker-compose.yml`:

```yaml
ports:
  - "8080:8001"  # Host port 8080 -> Container port 8001
```
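To find out which process currently holds the port before changing the mapping (one option, assuming `ss` from iproute2 is available):

```bash
# List the listener bound to port 8001, including the owning process
ss -ltnp | grep ':8001'
```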
### Model not found

Ensure the model is downloaded to the correct location:

```bash
ls -la ~/.cache/huggingface/hub/models--OleehyO--TexTeller/
```
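If the directory is missing, the model can be pre-downloaded with the Hugging Face CLI. The repository ID below is inferred from the cache directory name, so treat it as an assumption and adjust it if the project documents a different one:

```bash
# Install the Hugging Face CLI and download the model into the default cache
pip install -U "huggingface_hub[cli]"
huggingface-cli download OleehyO/TexTeller
```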
## Performance Notes

- **RTX 5080**: Optimized for CUDA 12.8 with cuDNN 9
- **Memory**: Container requires ~4-6GB GPU memory (RTX 5080 has 16GB)
- **Throughput**: ~10-20 images/second depending on image complexity
- **Startup time**: ~30-60 seconds for model loading
## Advanced Configuration

### Multiple GPUs

To use multiple GPUs, modify `docker-compose.yml`:

```yaml
environment:
  - CUDA_VISIBLE_DEVICES=0,1
  - RAY_NUM_REPLICAS=2
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          device_ids: ['0', '1']
          capabilities: [gpu]
```
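After restarting with this configuration, you can confirm that both GPUs are visible inside the container (this assumes `nvidia-smi` is available in the container, which the NVIDIA runtime normally provides):

```bash
# List the GPUs the container can see
docker exec texteller-server nvidia-smi -L
```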
### Production deployment

For production, consider:

1. Using a reverse proxy (nginx/traefik) for SSL/TLS
2. Adding authentication middleware
3. Implementing rate limiting
4. Setting up monitoring (Prometheus/Grafana)
5. Using orchestration (Kubernetes) for scaling