| title | description | slug | date | tags | order |
|---|---|---|---|---|---|
| OCR Accuracy | Understanding TexPixel recognition accuracy and how to get the best results | ocr-accuracy | 2026-03-25 | | 5 |
OCR Accuracy
TexPixel achieves industry-leading accuracy on mathematical formula recognition — but accuracy isn't uniform across all input types. This guide explains what affects accuracy and how to maximize it.
Accuracy by Formula Type
| Formula Type | Typical Accuracy |
|---|---|
| Printed formulas (textbooks, papers) | 95–99% |
| Clean handwritten formulas | 88–95% |
| Scanned documents (300 DPI+) | 93–98% |
| Photos of whiteboards | 82–92% |
| Low-resolution images (< 72 DPI) | 60–80% |
These are approximate ranges. Individual results depend heavily on image quality.
Factors That Affect Accuracy
Image Quality
The single biggest factor. A blurry, low-resolution, or poorly lit image will always produce worse results than a clean scan.
- Resolution — 150 DPI or higher is recommended. 300 DPI is ideal for documents.
- Contrast — dark ink on a white background gives the clearest signal to the model.
- Sharpness — avoid motion blur or out-of-focus shots.
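As a rough illustration of the resolution guidance above, the effective DPI of a scan is its pixel width divided by the physical width of the page in inches. The sketch below is not part of TexPixel. The function names are hypothetical, and the "below recommended" label for the 72 to 150 DPI range is an assumption, since this guide only names the 150 DPI, 300 DPI, and sub-72 DPI thresholds.

```python
def effective_dpi(pixels: int, inches: float) -> float:
    """Dots per inch along one axis: pixel count divided by physical size."""
    if inches <= 0:
        raise ValueError("physical size must be positive")
    return pixels / inches


def resolution_tier(dpi: float) -> str:
    """Map a DPI value onto the thresholds mentioned in this guide."""
    if dpi >= 300:
        return "ideal"
    if dpi >= 150:
        return "recommended"
    if dpi >= 72:
        # Assumption: the guide does not name this range explicitly.
        return "below recommended"
    return "low resolution"


# A US Letter page is 8.5 inches wide.
full_scan = effective_dpi(2550, 8.5)   # 2550 px across 8.5 in
half_scan = effective_dpi(1275, 8.5)   # 1275 px across 8.5 in
print(full_scan, resolution_tier(full_scan))
print(half_scan, resolution_tier(half_scan))
```

For photos, the same arithmetic applies to the part of the frame the formula occupies: a formula that spans 600 pixels and is 4 inches wide on paper is captured at 150 DPI, regardless of the camera's total megapixel count.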
Formula Complexity
Simple single-line equations are recognized with near-perfect accuracy. More complex structures may have occasional errors:
- Multi-line equation systems
- Large matrices (6×6 or larger)
- Heavily nested fractions (3+ levels deep)
- Non-standard notation or custom symbols
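To decide which results deserve a second look, one option is to measure how deeply fractions are nested in the recognized output. The sketch below assumes the output is a LaTeX string and that fractions are written as `\frac{...}{...}`. The `frac_depth` helper is hypothetical, not a TexPixel function, and it ignores variants such as `\dfrac` or the brace-less form `\frac12`.

```python
def frac_depth(latex: str) -> int:
    """Return the maximum nesting depth of \\frac{..}{..} in a LaTeX string."""
    depth = 0        # number of \frac arguments currently enclosing position i
    max_depth = 0
    pending = 0      # \frac arguments still expected at the current level
    stack = []       # per open brace: (is_frac_argument, pending to restore)
    i = 0
    while i < len(latex):
        # Match the \frac command itself, but not longer names like \fracture.
        if latex.startswith("\\frac", i) and not latex[i + 5:i + 6].isalpha():
            max_depth = max(max_depth, depth + 1)
            pending = 2
            i += 5
            continue
        ch = latex[i]
        if ch == "\\":
            i += 2   # skip other commands and escaped braces such as \{
            continue
        if ch == "{":
            is_arg = pending > 0
            stack.append((is_arg, pending - 1 if is_arg else 0))
            if is_arg:
                depth += 1
            pending = 0
        elif ch == "}" and stack:
            is_arg, pending = stack.pop()
            if is_arg:
                depth -= 1
        i += 1
    return max_depth


continued = r"\frac{1}{1+\frac{1}{1+\frac{1}{x}}}"
if frac_depth(continued) >= 3:
    print("Deeply nested fraction: compare the result against the source image")
```

The threshold of 3 mirrors the "3+ levels deep" figure in the list above.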
Handwriting Style
Printed (typed) formulas are recognized more accurately than handwritten ones, but TexPixel handles handwriting well when:
- Letters are clearly formed and not connected (print style, not cursive)
- Similar-looking symbols are written distinctly (for example, a clearly different x and ×)
- Spacing between symbols is consistent
What Reduces Accuracy
- Rotated images — formulas at an angle are harder to parse
- Overlapping elements — crossed-out work, annotations, or arrows near symbols
- Pencil on paper — low contrast; try increasing image brightness/contrast before uploading
- Multiple formulas in one image — crop to the specific formula you need
- Decorative fonts — calligraphic or stylized mathematical writing
Improving Results
If you're getting errors, try these steps in order:
1. Increase image resolution: scan at 300 DPI instead of 150 DPI
2. Improve contrast: use a photo editor to increase brightness and contrast
3. Crop tightly: remove surrounding text and whitespace
4. Straighten the image: correct rotation before uploading
5. Re-photograph: better lighting, closer distance, sharper focus
Reporting Errors
Found a formula type that TexPixel consistently gets wrong? Let us know — accuracy feedback directly improves the model over time.
Contact us at: support@texpixel.com
Further reading: 5 Tips for Better Handwriting Recognition →