doc_ai_frontend/content/docs/en/ocr-accuracy.md
yoge 409bbf742e feat: optimize docs pages and add 4 new doc articles (en + zh)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 16:15:22 +08:00


| title | description | slug | date | tags | order |
| --- | --- | --- | --- | --- | --- |
| OCR Accuracy | Understanding TexPixel recognition accuracy and how to get the best results | ocr-accuracy | 2026-03-25 | accuracy, tips | 5 |

OCR Accuracy

TexPixel achieves industry-leading accuracy on mathematical formula recognition — but accuracy isn't uniform across all input types. This guide explains what affects accuracy and how to maximize it.

Accuracy by Formula Type

| Formula Type | Typical Accuracy |
| --- | --- |
| Printed formulas (textbooks, papers) | 95–99% |
| Clean handwritten formulas | 88–95% |
| Scanned documents (300 DPI+) | 93–98% |
| Photos of whiteboards | 82–92% |
| Low-resolution images (< 72 DPI) | 60–80% |

These are approximate ranges. Individual results depend heavily on image quality.

Factors That Affect Accuracy

Image Quality

Image quality is the single biggest factor. A blurry, low-resolution, or poorly lit image will generally produce worse results than a clean scan.

  • Resolution — 150 DPI or higher is recommended. 300 DPI is ideal for documents.
  • Contrast — dark ink on a white background gives the clearest signal to the model.
  • Sharpness — avoid motion blur or out-of-focus shots.
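The three properties above can be estimated before you upload. The following is a minimal sketch in plain Python, assuming the image is already available as a grayscale grid (a list of rows of 0-255 values) and that you know the physical width of the page. The function names are hypothetical, and the contrast and sharpness measures (Michelson contrast, variance of a Laplacian) are common heuristics chosen for illustration, not TexPixel's internal checks. Only the 150 DPI threshold comes from this guide.

```python
from statistics import pvariance

RECOMMENDED_DPI = 150  # recommendation from this guide; 300 is ideal for documents


def effective_dpi(pixel_width: int, physical_width_in: float) -> float:
    """Pixels per inch for a scan or photo of known physical width (inches)."""
    if physical_width_in <= 0:
        raise ValueError("physical width must be positive")
    return pixel_width / physical_width_in


def meets_recommended_dpi(pixel_width: int, physical_width_in: float) -> bool:
    """True when the effective resolution reaches the recommended 150 DPI."""
    return effective_dpi(pixel_width, physical_width_in) >= RECOMMENDED_DPI


def michelson_contrast(gray: list[list[int]]) -> float:
    """Contrast in [0, 1] computed from the darkest and brightest pixels."""
    flat = [p for row in gray for p in row]
    lo, hi = min(flat), max(flat)
    return 0.0 if hi + lo == 0 else (hi - lo) / (hi + lo)


def laplacian_variance(gray: list[list[int]]) -> float:
    """Variance of a 4-neighbour Laplacian over interior pixels.

    Higher values suggest crisper edges; values near zero suggest blur.
    """
    h, w = len(gray), len(gray[0])
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3x3 pixels")
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


# A US Letter page (8.5 in wide) scanned to 2550 pixels across is 300 DPI.
print(effective_dpi(2550, 8.5))
```

These numbers are only useful for comparison: a retake that scores higher on contrast and Laplacian variance than the previous attempt is probably the better one to upload.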

Formula Complexity

Simple single-line equations are recognized with near-perfect accuracy. More complex structures may have occasional errors:

  • Multi-line equation systems
  • Large matrices (6×6 or larger)
  • Heavily nested fractions (3+ levels deep)
  • Non-standard notation or custom symbols
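If you post-process results in bulk, you can flag the structures listed above for a manual check. The sketch below assumes the recognized output is a LaTeX string that writes fractions as `\frac{...}{...}` with braced arguments and matrix rows separated by `\\` with `&` between cells. The helper names are hypothetical, and the thresholds (3 levels of nesting, 6×6 matrices) are taken from the list above.

```python
FRAC_DEPTH_LIMIT = 3   # "3+ levels deep" from the list above
MATRIX_SIZE_LIMIT = 6  # "6x6 or larger" from the list above


def max_frac_depth(latex: str) -> int:
    """Deepest nesting of \\frac arguments; assumes braced arguments."""
    depth = best = 0
    stack = []     # one bool per open brace: is it a \frac argument?
    pending = [0]  # \frac arguments still expected at each brace level
    i = 0
    while i < len(latex):
        if latex.startswith("\\frac", i) and not latex[i + 5:i + 6].isalpha():
            pending[-1] = 2
            i += 5
            continue
        ch = latex[i]
        if ch == "\\":  # skip other commands and escapes such as \{ or \\
            i += 2
            continue
        if ch == "{":
            is_arg = pending[-1] > 0
            if is_arg:
                pending[-1] -= 1
                depth += 1
                best = max(best, depth)
            stack.append(is_arg)
            pending.append(0)
        elif ch == "}" and stack:
            pending.pop()
            if stack.pop():
                depth -= 1
        i += 1
    return best


def matrix_shape(body: str) -> tuple[int, int]:
    """(rows, columns) of a matrix body such as 'a & b \\\\ c & d'."""
    rows = [r for r in body.split("\\\\") if r.strip()]
    cols = max((r.count("&") + 1 for r in rows), default=0)
    return len(rows), cols


def needs_review(latex: str) -> bool:
    """Flag formulas whose fractions are nested 3 or more levels deep."""
    return max_frac_depth(latex) >= FRAC_DEPTH_LIMIT


continued = r"\frac{1}{1+\frac{1}{1+\frac{1}{x}}}"
print(max_frac_depth(continued), needs_review(continued))
```

This does not handle `\dfrac`, `\tfrac`, or unbraced forms such as `\frac12`, so treat it as a starting point for a review filter and not as a parser.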

Handwriting Style

Printed (typed) formulas are recognized more accurately than handwritten ones, but TexPixel handles handwriting well when:

  • Letters are clearly formed and not connected (print style, not cursive)
  • Similar-looking symbols are written distinctly (for example, the variable x and the multiplication sign × are clearly different)
  • Spacing between symbols is consistent

What Reduces Accuracy

  • Rotated images — formulas at an angle are harder to parse
  • Overlapping elements — crossed-out work, annotations, or arrows near symbols
  • Pencil on paper — low contrast; try increasing image brightness/contrast before uploading
  • Multiple formulas in one image — crop to the specific formula you need
  • Decorative fonts — calligraphic or stylized mathematical writing

Improving Results

If you're getting errors, try these steps in order:

  1. Increase image resolution — scan at 300 DPI instead of 150 DPI
  2. Improve contrast — use a photo editor to increase brightness and contrast
  3. Crop tightly — remove surrounding text and whitespace
  4. Straighten the image — correct rotation before uploading
  5. Re-photograph — better lighting, closer distance, sharper focus
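Steps 2 and 3 can also be scripted if you process many images. The sketch below works on a grayscale grid (a list of rows of 0-255 values) so that it stays dependency-free; in practice you would more likely use an imaging library. The function names, the darkness threshold of 128, and the one-pixel margin are illustrative assumptions, not settings used by TexPixel.

```python
def stretch_contrast(gray: list[list[int]]) -> list[list[int]]:
    """Linearly rescale so the darkest pixel becomes 0 and the brightest 255."""
    flat = [p for row in gray for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # uniform image: nothing to stretch
        return [row[:] for row in gray]
    scale = 255 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in gray]


def crop_to_ink(gray: list[list[int]], threshold: int = 128,
                margin: int = 1) -> list[list[int]]:
    """Crop to the bounding box of pixels darker than `threshold`.

    A small margin is kept so strokes at the edge are not clipped.
    """
    height, width = len(gray), len(gray[0])
    ys = [y for y in range(height) if any(p < threshold for p in gray[y])]
    xs = [x for x in range(width) if any(row[x] < threshold for row in gray)]
    if not ys:  # no ink found: return an unchanged copy
        return [row[:] for row in gray]
    top, bottom = max(ys[0] - margin, 0), min(ys[-1] + margin, height - 1)
    left, right = max(xs[0] - margin, 0), min(xs[-1] + margin, width - 1)
    return [row[left:right + 1] for row in gray[top:bottom + 1]]


# Faint pencil mark (value 120) on light grey paper (value 200).
page = [[200] * 7 for _ in range(5)]
page[2][3] = 120
cleaned = crop_to_ink(stretch_contrast(page))
print(len(cleaned), len(cleaned[0]))  # rows and columns after cropping
```

Stretching before cropping matters here: the pencil mark at 120 would survive the default threshold either way, but a fainter mark at, say, 150 would only be detected as ink after the contrast stretch.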

Reporting Errors

Found a formula type that TexPixel consistently gets wrong? Let us know — accuracy feedback directly improves the model over time.

Contact us at: support@texpixel.com


Upload a formula and test accuracy →