TexTeller Docker Deployment Guide
This guide explains how to deploy TexTeller using Docker with NVIDIA GPU support (optimized for RTX 5080).
Prerequisites
- NVIDIA Driver: Install NVIDIA driver version 525 or later
- NVIDIA Container Toolkit: Required for GPU access in Docker containers
- Docker: Version 20.10 or later
- Docker Compose: Version 1.29 or later (or use docker compose v2)
- Pre-downloaded Model: Model should be in ~/.cache/huggingface/hub/models--OleehyO--TexTeller/
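The checks below are a quick way to confirm these prerequisites on the host (a minimal sketch; the model cache path matches the location listed above):
# Driver, Docker, and Compose versions
nvidia-smi
docker --version
docker compose version
# Confirm the model cache exists
ls ~/.cache/huggingface/hub/ | grep TexTeller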
Setup NVIDIA Container Toolkit
If you haven't installed the NVIDIA Container Toolkit:
# Add the package repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# Install nvidia-container-toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# Restart Docker
sudo systemctl restart docker
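If containers still cannot see the GPU after restarting Docker, newer toolkit versions may also need the runtime registered with Docker explicitly (an extra step that is not required on every setup):
# Register the NVIDIA runtime with Docker, then restart again
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker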
Quick Start
The easiest way to deploy is using the provided deployment script:
# Run all checks and deploy
./deploy.sh deploy
# Or check system requirements first
./deploy.sh check
# View available commands
./deploy.sh
Build and Run
Using the Deployment Script (Recommended)
# Full deployment (checks, build, and start)
./deploy.sh deploy
# Just build the image
./deploy.sh build
# Start/stop the service
./deploy.sh start
./deploy.sh stop
# View logs
./deploy.sh logs
# Check status
./deploy.sh status
Using Docker Compose
# Build and start the service
docker-compose up -d
# View logs
docker-compose logs -f
# Stop the service
docker-compose down
Using Docker directly
# Build the image
docker build -t texteller:latest .
# Run the container
docker run -d \
--name texteller-server \
--gpus '"device=0"' \
-p 8001:8001 \
-v ~/.cache/huggingface/hub/models--OleehyO--TexTeller:/root/.cache/huggingface/hub/models--OleehyO--TexTeller:ro \
-e CUDA_VISIBLE_DEVICES=0 \
texteller:latest
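Once the logs show the model has finished loading (see Startup time under Performance Notes), a quick smoke test can be sent from the host. This is a sketch: equation.png stands in for any small formula image, and the raw base64 payload matches the Python client example further down.
# Encode a local image and POST it to the /predict endpoint
IMAGE_B64=$(base64 -w0 equation.png)
curl -s -X POST http://localhost:8001/predict \
  -H "Content-Type: application/json" \
  -d "{\"image_base64\": \"${IMAGE_B64}\"}"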
API Usage
The server accepts JSON requests with either base64-encoded images or image URLs at the /predict endpoint.
Using base64-encoded image
# Example with base64 image
curl -X POST http://localhost:8001/predict \
-H "Content-Type: application/json" \
-d '{
"image_base64": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA..."
}'
Using image URL
# Example with image URL
curl -X POST http://localhost:8001/predict \
-H "Content-Type: application/json" \
-d '{
"image_url": "https://example.com/math_equation.png"
}'
Python client example
import requests
import base64
# Method 1: Using base64
with open("equation.png", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "http://localhost:8001/predict",
    json={"image_base64": image_base64}
)
print(response.json())

# Method 2: Using URL
response = requests.post(
    "http://localhost:8001/predict",
    json={"image_url": "https://example.com/math_equation.png"}
)
print(response.json())
Or use the provided test script:
# Test with a local image
python examples/test_server.py path/to/equation.png
# Test with both local and URL
python examples/test_server.py path/to/equation.png https://example.com/formula.png
Response format
Success response:
{
"result": "\\frac{a}{b} = c"
}
Error response:
{
"error": "Failed to decode image"
}
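In shell scripts the two response shapes can be handled together with jq, printing the LaTeX result or falling back to the error message (assumes jq is installed; endpoint and field names as documented above):
# Print the result, or the error message if the request failed
curl -s -X POST http://localhost:8001/predict \
  -H "Content-Type: application/json" \
  -d '{"image_url": "https://example.com/math_equation.png"}' \
  | jq -r '.result // .error'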
Configuration
You can configure the service by modifying environment variables in docker-compose.yml:
- CUDA_VISIBLE_DEVICES: GPU device ID (default: 0)
- RAY_NUM_REPLICAS: Number of Ray Serve replicas (default: 1)
- RAY_NCPU_PER_REPLICA: CPUs per replica (default: 4)
- RAY_NGPU_PER_REPLICA: GPUs per replica (default: 1)
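When running without Compose, the same variables can be passed with -e flags instead; this sketch reuses the docker run invocation from above with the default values:
docker run -d \
  --name texteller-server \
  --gpus '"device=0"' \
  -p 8001:8001 \
  -v ~/.cache/huggingface/hub/models--OleehyO--TexTeller:/root/.cache/huggingface/hub/models--OleehyO--TexTeller:ro \
  -e CUDA_VISIBLE_DEVICES=0 \
  -e RAY_NUM_REPLICAS=1 \
  -e RAY_NCPU_PER_REPLICA=4 \
  -e RAY_NGPU_PER_REPLICA=1 \
  texteller:latest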
Monitoring
# Check container status
docker ps
# View real-time logs
docker-compose logs -f texteller
# Check GPU usage
nvidia-smi
# Check container resource usage
docker stats texteller-server
Troubleshooting
GPU not detected
# Verify NVIDIA runtime is available
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
Port already in use
Change the port mapping in docker-compose.yml:
ports:
  - "8080:8001"  # Host port 8080 -> Container port 8001
Model not found
Ensure the model is downloaded to the correct location:
ls -la ~/.cache/huggingface/hub/models--OleehyO--TexTeller/
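If the directory is missing, the model can usually be fetched into the default cache with the Hugging Face CLI (the repo id OleehyO/TexTeller is inferred from the cache path above; requires the huggingface_hub CLI):
pip install -U "huggingface_hub[cli]"
huggingface-cli download OleehyO/TexTeller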
Performance Notes
- RTX 5080: Optimized for CUDA 12.8 with cuDNN 9
- Memory: Container requires ~4-6GB GPU memory (RTX 5080 has 16GB)
- Throughput: ~10-20 images/second depending on image complexity
- Startup time: ~30-60 seconds for model loading
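To see how much of the ~4-6GB memory estimate is actually in use while the server handles requests, nvidia-smi's standard query flags can be run on the host:
# Report used vs. total GPU memory
nvidia-smi --query-gpu=memory.used,memory.total --format=csv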
Advanced Configuration
Multiple GPUs
To use multiple GPUs, modify docker-compose.yml:
environment:
  - CUDA_VISIBLE_DEVICES=0,1
  - RAY_NUM_REPLICAS=2
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          device_ids: ['0', '1']
          capabilities: [gpu]
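After restarting the service, it is worth confirming that both devices are visible inside the container (container name as used throughout this guide):
# List the GPUs the container can see
docker exec texteller-server nvidia-smi -L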
Production deployment
For production, consider:
- Using a reverse proxy (nginx/traefik) for SSL/TLS
- Adding authentication middleware
- Implementing rate limiting
- Setting up monitoring (Prometheus/Grafana)
- Using orchestration (Kubernetes) for scaling