---
title: OCR Accuracy
description: Understanding TexPixel recognition accuracy and how to get the best results
slug: ocr-accuracy
date: 2026-03-25
tags: [accuracy, tips]
order: 5
---

# OCR Accuracy

TexPixel recognizes mathematical formulas with high accuracy, but accuracy is not uniform across input types. This guide explains what affects accuracy and how to maximize it.

## Accuracy by Formula Type

| Formula Type | Typical Accuracy |
|---|---|
| Printed formulas (textbooks, papers) | 95–99% |
| Clean handwritten formulas | 88–95% |
| Scanned documents (300 DPI+) | 93–98% |
| Photos of whiteboards | 82–92% |
| Low-resolution images (< 72 DPI) | 60–80% |

These are approximate ranges. Individual results depend heavily on image quality.

## Factors That Affect Accuracy

### Image Quality

Image quality is the single biggest factor. A blurry, low-resolution, or poorly lit image will produce worse results than a clean scan.

- **Resolution**: 150 DPI or higher is recommended. 300 DPI is ideal for documents.
- **Contrast**: dark ink on a white background gives the clearest signal to the model.
- **Sharpness**: avoid motion blur and out-of-focus shots.

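As a rough pre-upload check, resolution can be estimated from an image's pixel width and the physical width of the area that was scanned or photographed. The sketch below is illustrative only: `effective_dpi` and `resolution_tier` are hypothetical helper names, not part of TexPixel, and the thresholds simply mirror the 150 and 300 DPI guidance above.

```python
def effective_dpi(pixel_width: int, physical_width_inches: float) -> float:
    """Estimate dots per inch from pixel width and physical width in inches."""
    if pixel_width <= 0 or physical_width_inches <= 0:
        raise ValueError("dimensions must be positive")
    return pixel_width / physical_width_inches


def resolution_tier(dpi: float) -> str:
    """Classify a DPI value against the 150 / 300 DPI guidance in this guide."""
    if dpi >= 300:
        return "ideal"
    if dpi >= 150:
        return "recommended"
    return "below recommended"


# A US Letter page (8.5 in wide) captured at 2550 pixels wide
dpi = effective_dpi(2550, 8.5)
print(dpi, resolution_tier(dpi))
```

For phone photos, the physical width is the width of the paper or whiteboard area that fills the frame, so moving closer raises the effective DPI even though the pixel count stays the same.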

### Formula Complexity

Simple single-line equations are recognized with near-perfect accuracy. More complex structures may have occasional errors:

- Multi-line equation systems
- Large matrices (6×6 or larger)
- Heavily nested fractions (3+ levels deep)
- Non-standard notation or custom symbols

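One way to decide which results deserve a closer manual check is to measure how deeply fractions are nested in the recognized output. The sketch below assumes the output is a LaTeX string that uses brace-delimited `\frac`, `\dfrac`, or `\tfrac` arguments; `max_frac_depth` is a hypothetical helper written for this guide, not a TexPixel function, and it ignores escaped braces such as `\{`.

```python
import re

# Matches \frac, \dfrac and \tfrac, but not longer commands that merely start with "frac"
FRAC = re.compile(r"\\[dt]?frac(?![A-Za-z])")


def _read_group(s: str, i: int):
    """Read a {...} group starting at or after s[i]; return (content, index after group).

    Returns (None, i) if no balanced brace group starts there.
    """
    while i < len(s) and s[i].isspace():
        i += 1
    if i >= len(s) or s[i] != "{":
        return None, i
    level = 0
    for j in range(i, len(s)):
        if s[j] == "{":
            level += 1
        elif s[j] == "}":
            level -= 1
            if level == 0:
                return s[i + 1 : j], j + 1
    return None, i


def max_frac_depth(latex: str) -> int:
    """Return the deepest nesting level of fraction commands in a LaTeX string."""
    best = 0
    pos = 0
    while True:
        m = FRAC.search(latex, pos)
        if m is None:
            return best
        num, after_num = _read_group(latex, m.end())
        den, after_den = _read_group(latex, after_num)
        # Fractions inside the numerator or denominator add one level each
        inner = max(max_frac_depth(num or ""), max_frac_depth(den or ""))
        best = max(best, 1 + inner)
        pos = max(after_den, m.end())


print(max_frac_depth(r"\frac{1}{1 + \frac{1}{x}}"))
```

A result of 3 or more corresponds to the "heavily nested" case listed above, so such output is worth comparing against the source image symbol by symbol.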

### Handwriting Style

Printed (typed) formulas are recognized more accurately than handwritten ones, but TexPixel handles handwriting well when:

- Letters are clearly formed and not connected (print style, not cursive)
- Similar-looking symbols are clearly distinguished (for example, the variable x and the multiplication sign ×)
- Spacing between symbols is consistent

### What Reduces Accuracy

- **Rotated images**: formulas at an angle are harder to parse
- **Overlapping elements**: crossed-out work, annotations, or arrows near symbols
- **Pencil on paper**: low contrast; try increasing image brightness/contrast before uploading
- **Multiple formulas in one image**: crop to the specific formula you need
- **Decorative fonts**: calligraphic or stylized mathematical writing

## Improving Results

If you're getting errors, try these steps in order:

1. **Increase image resolution**: scan at 300 DPI instead of 150 DPI
2. **Improve contrast**: use a photo editor to increase brightness and contrast
3. **Crop tightly**: remove surrounding text and whitespace
4. **Straighten the image**: correct rotation before uploading
5. **Re-photograph**: better lighting, closer distance, sharper focus

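Steps 2 and 3 can also be scripted. The following is a minimal sketch, assuming the image has already been loaded as a grid of grayscale values (0 = black, 255 = white); it uses plain Python lists to stay self-contained, whereas a real workflow would more likely use an image library. The function names `stretch_contrast` and `crop_to_ink` and the threshold of 128 are illustrative choices, not TexPixel settings.

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values so the darkest becomes 0 and the brightest 255."""
    lo = min(min(row) for row in pixels)
    hi = max(max(row) for row in pixels)
    if hi == lo:
        return [row[:] for row in pixels]  # flat image: nothing to stretch
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in pixels]


def crop_to_ink(pixels, threshold=128):
    """Crop to the bounding box of pixels darker than `threshold`."""
    rows = [r for r, row in enumerate(pixels) if any(v < threshold for v in row)]
    cols = [c for c in range(len(pixels[0])) if any(row[c] < threshold for row in pixels)]
    if not rows:
        return pixels  # no ink found: leave the image unchanged
    return [row[cols[0] : cols[-1] + 1] for row in pixels[rows[0] : rows[-1] + 1]]


# Faint pencil-like strokes (value 150) on a light gray page (value 200)
page = [
    [200, 200, 200, 200],
    [200, 150, 150, 200],
    [200, 200, 150, 200],
    [200, 200, 200, 200],
]
stretched = stretch_contrast(page)
cropped = crop_to_ink(stretched)
print(cropped)
```

Stretching before cropping matters here: in the faint original, no pixel is darker than the threshold, so the crop would find no ink until the contrast has been increased.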

## Reporting Errors

Found a formula type that TexPixel consistently gets wrong? Let us know. Accuracy feedback directly improves the model over time.

Contact us at: [support@texpixel.com](mailto:support@texpixel.com)

---

**Further reading:** [5 Tips for Better Handwriting Recognition →](/blog/handwriting-tips)

[Upload a formula and test accuracy →](/app)