feat: optimize docs pages and add 4 new doc articles (en + zh)
- Rewrote DocsListPage and DocDetailPage with landing.css aesthetic (icon cards, skeleton loader, prose styles, CTA box)
- Added docs-specific CSS to landing.css
- Created image-to-latex, copy-to-word, ocr-accuracy, pdf-extraction articles in both English and Chinese
- Updated DocsSeoSection guide cards to link to real doc slugs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
79
content/docs/en/ocr-accuracy.md
Normal file
@@ -0,0 +1,79 @@
---
title: OCR Accuracy
description: Understanding TexPixel recognition accuracy and how to get the best results
slug: ocr-accuracy
date: 2026-03-25
tags: [accuracy, tips]
order: 5
---

# OCR Accuracy

TexPixel recognizes mathematical formulas with high accuracy, but that accuracy is not uniform across input types. This guide explains what affects accuracy and how to maximize it.

## Accuracy by Formula Type

| Formula Type | Typical Accuracy |
|---|---|
| Printed formulas (textbooks, papers) | 95–99% |
| Clean handwritten formulas | 88–95% |
| Scanned documents (300 DPI+) | 93–98% |
| Photos of whiteboards | 82–92% |
| Low-resolution images (< 72 DPI) | 60–80% |

These are approximate ranges. Individual results depend heavily on image quality.

## Factors That Affect Accuracy

### Image Quality

Image quality is the single biggest factor. A blurry, low-resolution, or poorly lit image will always produce worse results than a clean scan.

- **Resolution**: 150 DPI or higher is recommended. 300 DPI is ideal for documents.
- **Contrast**: dark ink on a white background gives the clearest signal to the model.
- **Sharpness**: avoid motion blur or out-of-focus shots.
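
As a rough way to check the resolution guidance above before uploading, the effective DPI of a scan or photo can be estimated from its pixel width and the physical width of the area it covers. The sketch below is illustrative only: the function names and rating labels are assumptions made for this example and are not part of any TexPixel API. The 150 and 300 DPI thresholds come from the recommendations in this section.

```python
def effective_dpi(width_px: int, width_inches: float) -> float:
    """Estimate dots per inch from pixel width and physical width."""
    if width_inches <= 0:
        raise ValueError("width_inches must be positive")
    return width_px / width_inches


def resolution_rating(dpi: float) -> str:
    """Map a DPI value onto this section's guidance (labels are hypothetical)."""
    if dpi >= 300:
        return "ideal"
    if dpi >= 150:
        return "recommended"
    return "too low"


# A page 8.5 in wide scanned to 2550 px across works out to 300 DPI.
print(resolution_rating(effective_dpi(2550, 8.5)))  # ideal
# The same page captured at 1000 px across is roughly 118 DPI.
print(resolution_rating(effective_dpi(1000, 8.5)))  # too low
```

For photos, the physical width is the width of the area the camera actually framed, so moving closer to the formula raises the effective DPI.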

### Formula Complexity

Simple single-line equations are recognized with near-perfect accuracy. More complex structures may have occasional errors:

- Multi-line equation systems
- Large matrices (6×6 or larger)
- Heavily nested fractions (3+ levels deep)
- Non-standard notation or custom symbols
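
One way to decide which results deserve a second look is to measure how deeply fractions are nested in the recognized output. The sketch below assumes the output is a LaTeX string and that fractions use braced arguments (`\frac{a}{b}`, not `\frac12`). `max_frac_depth` is a hypothetical helper written for this example, not a TexPixel function.

```python
FRAC_COMMANDS = {"frac", "dfrac", "tfrac", "cfrac"}


def max_frac_depth(latex: str) -> int:
    """Return the deepest nesting level of fraction commands (0 if none)."""
    deepest = 0
    depth = 0          # number of currently open fraction-argument braces
    expect = [0]       # per brace level: fraction arguments still expected
    is_arg_stack = []  # per open brace: True if it opened a fraction argument
    i, n = 0, len(latex)
    while i < n:
        ch = latex[i]
        if ch == "\\":
            # Read a command name (letters) or a one-character control symbol.
            j = i + 1
            while j < n and latex[j].isalpha():
                j += 1
            if j == i + 1:
                j = min(j + 1, n)
            if latex[i + 1:j] in FRAC_COMMANDS:
                expect[-1] = 2
                deepest = max(deepest, depth + 1)
            i = j
            continue
        if ch == "{":
            is_arg = expect[-1] > 0
            if is_arg:
                expect[-1] -= 1
                depth += 1
            is_arg_stack.append(is_arg)
            expect.append(0)
        elif ch == "}" and is_arg_stack:
            expect.pop()
            if is_arg_stack.pop():
                depth -= 1
        i += 1
    return deepest


nested = r"\frac{1}{1+\frac{1}{1+\frac{1}{x}}}"
if max_frac_depth(nested) >= 3:
    print("deeply nested fraction: double-check the result")
```

The threshold of 3 mirrors the "3+ levels deep" guideline in the list above. A similar check could be written for matrix size.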

### Handwriting Style

Printed (typed) formulas are recognized more accurately than handwritten ones, but TexPixel handles handwriting well when:

- Letters are clearly formed and not connected (print style, not cursive)
- Similar-looking symbols are written distinctly (for example, the variable x and the multiplication sign ×)
- Spacing between symbols is consistent

### What Reduces Accuracy

- **Rotated images**: formulas at an angle are harder to parse
- **Overlapping elements**: crossed-out work, annotations, or arrows near symbols
- **Pencil on paper**: low contrast; try increasing image brightness/contrast before uploading
- **Multiple formulas in one image**: crop to the specific formula you need
- **Decorative fonts**: calligraphic or stylized mathematical writing

## Improving Results

If you're getting errors, try these steps in order:

1. **Increase image resolution**: scan at 300 DPI instead of 150 DPI
2. **Improve contrast**: use a photo editor to increase brightness and contrast
3. **Crop tightly**: remove surrounding text and whitespace
4. **Straighten the image**: correct rotation before uploading
5. **Re-photograph**: better lighting, closer distance, sharper focus
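
Steps 2 and 3 can also be scripted. The sketch below is a minimal pure-Python illustration on a grayscale image stored as rows of 0–255 values. It uses a linear contrast stretch, which is one simple technique and not necessarily what a photo editor applies. The function names and the ink threshold of 128 are assumptions made for this example; in practice an image library would do the same work on real files.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 values: 0 = black ink, 255 = white paper


def stretch_contrast(img: Gray) -> Gray:
    """Linearly rescale so the darkest pixel becomes 0 and the lightest 255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def crop_to_ink(img: Gray, threshold: int = 128) -> Gray:
    """Crop to the bounding box of pixels darker than `threshold`."""
    rows = [y for y, row in enumerate(img) if any(p < threshold for p in row)]
    cols = [x for x in range(len(img[0])) if any(row[x] < threshold for row in img)]
    if not rows:  # no ink found: return a copy of the image unchanged
        return [row[:] for row in img]
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


# Faint pencil strokes (150) on greyish paper (200).
page = [
    [200, 200, 200, 200],
    [200, 150, 150, 200],
    [200, 200, 150, 200],
    [200, 200, 200, 200],
]
cleaned = crop_to_ink(stretch_contrast(page))
print(cleaned)  # [[0, 0], [255, 0]]
```

Stretching before cropping matters here: in the raw image no pixel is darker than the threshold, so the crop would find no ink at all.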

## Reporting Errors

Found a formula type that TexPixel consistently gets wrong? Let us know. Accuracy feedback directly improves the model over time.

Contact us at: [support@texpixel.com](mailto:support@texpixel.com)

---

[Upload a formula and test accuracy →](/app)