feat: add 5 new blog posts (en + zh)

- how-ai-reads-math: plain-English explainer of the recognition pipeline
- student-workflow: lecture-to-LaTeX workflow for students
- pdf-formula-issues: troubleshooting guide for PDF extraction errors
- copy-math-to-word: 3 methods for getting formulas into Word, ranked
- researcher-workflow: digitizing handwritten research notes at scale

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 16:46:31 +08:00
parent 012748fc3d
commit 76f1bde56d
10 changed files with 702 additions and 0 deletions

@@ -0,0 +1,51 @@
---
title: "How AI Reads Math: Inside TexPixel's Recognition Engine"
description: A plain-English explanation of how TexPixel turns a photo of a formula into clean LaTeX code
slug: how-ai-reads-math
date: 2026-01-15
tags: [explainer, technology]
---
# How AI Reads Math: Inside TexPixel's Recognition Engine
When you upload a photo of a handwritten integral and get back clean LaTeX in under a second, it feels like magic. It's not — but the engineering behind it is genuinely interesting. Here's a plain-English explanation of how TexPixel turns pixels into math.
## Step 1: Image Preprocessing
Before any recognition happens, the image is cleaned up. This step matters more than most people realize.
TexPixel normalizes contrast, removes noise, deskews tilted images, and isolates the formula region from surrounding whitespace, printed text, or ruled lines. A formula photographed under harsh side-lighting — or scanned at a slight angle — is corrected before the model ever sees it.
This is why image quality affects accuracy so much: preprocessing can compensate for minor flaws, but severe blur or extremely low resolution (below ~72 DPI) leaves too little information to work with.
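As a rough illustration of two of the cleanup steps named above (contrast normalization and isolating the formula from surrounding whitespace), here is a minimal sketch on a tiny grayscale image stored as a list of rows. The function names and the threshold of 128 are assumptions for illustration; TexPixel's actual preprocessing is not described in this post.

```python
def stretch_contrast(pixels):
    """Rescale grayscale values so the darkest becomes 0 and the brightest 255."""
    lo = min(min(row) for row in pixels)
    hi = max(max(row) for row in pixels)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in pixels]
    scale = 255 / (hi - lo)
    return [[round((v - lo) * scale) for v in row] for row in pixels]


def crop_to_ink(pixels, threshold=128):
    """Crop to the bounding box of pixels darker than `threshold` (assumed ink)."""
    rows = [r for r, row in enumerate(pixels) if any(v < threshold for v in row)]
    cols = [c for c in range(len(pixels[0])) if any(row[c] < threshold for row in pixels)]
    if not rows:  # no ink found: return the image unchanged
        return pixels
    return [row[cols[0]:cols[-1] + 1] for row in pixels[rows[0]:rows[-1] + 1]]


# A washed-out 4x5 image: background 200, faint ink 120.
image = [
    [200, 200, 200, 200, 200],
    [200, 120, 120, 200, 200],
    [200, 200, 120, 200, 200],
    [200, 200, 200, 200, 200],
]
cleaned = crop_to_ink(stretch_contrast(image))
```

Stretching first matters here: before it, the faint ink (120) and the background (200) sit close together, and a fixed threshold would be fragile.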
## Step 2: Symbol Detection
The preprocessed image is fed into a visual encoder — a neural network that has learned, from millions of math images, what mathematical symbols look like.
The key challenge here isn't recognizing individual symbols in isolation. It's recognizing them *in context*. The symbol `x` looks different when it's a variable, when it's a multiplication sign, and when it's written in different handwriting styles. The model learns to distinguish these from surrounding context: is there a dot nearby? What's the vertical position relative to a fraction bar?
This contextual understanding is what separates a good math OCR system from a general-purpose character recognizer.
## Step 3: Structure Parsing
Recognizing symbols is only half the problem. Math is two-dimensional in a way that ordinary text is not. A fraction has a numerator above a denominator. An integral has limits at the top and bottom. A matrix arranges expressions in rows and columns.
TexPixel's parser builds a structural tree from the detected symbols — understanding that this expression is a subscript of that symbol, and that expression lives inside a square root. This tree is then serialized into LaTeX, where the structural relationships are encoded as commands like `\frac{}{}`, `\sqrt{}`, `\sum_{}^{}`.
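To make the idea of a structural tree concrete, here is a small sketch of one possible representation and its serialization to LaTeX. The node classes (`Sym`, `Row`, `Frac`, `Sqrt`, `Sup`) are hypothetical names chosen for this example, not TexPixel's internal types.

```python
from dataclasses import dataclass


@dataclass
class Sym:
    name: str  # a literal symbol or LaTeX command, e.g. "x" or r"\pi"

    def to_latex(self):
        return self.name


@dataclass
class Row:
    items: list  # child nodes laid out left to right

    def to_latex(self):
        return " ".join(item.to_latex() for item in self.items)


@dataclass
class Frac:
    num: object
    den: object

    def to_latex(self):
        return rf"\frac{{{self.num.to_latex()}}}{{{self.den.to_latex()}}}"


@dataclass
class Sqrt:
    body: object

    def to_latex(self):
        return rf"\sqrt{{{self.body.to_latex()}}}"


@dataclass
class Sup:
    base: object
    exponent: object

    def to_latex(self):
        return f"{self.base.to_latex()}^{{{self.exponent.to_latex()}}}"


# "one over the square root of x squared plus one"
tree = Frac(Sym("1"), Sqrt(Row([Sup(Sym("x"), Sym("2")), Sym("+"), Sym("1")])))
latex = tree.to_latex()
```

Each node only knows how to wrap its children, so nesting (a superscript inside a root inside a fraction) falls out of the recursion without special cases.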
## Step 4: LaTeX Generation
The final step is walking the structural tree and emitting valid LaTeX. This includes choosing the right command for ambiguous cases: for example, whether a large `Σ` is the summation operator `\sum` or the capital Greek letter `\Sigma`, based on context such as whether it carries limits and what follows it.
The output is then validated to ensure it compiles without errors before being returned.
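The post does not say how validation is done. As a much weaker stand-in for actually compiling the output, the sketch below checks only that braces are balanced, which catches the most common way a LaTeX string breaks. The function name is an assumption for illustration.

```python
def braces_balanced(latex):
    """Return True if every unescaped { has a matching } in the right order."""
    depth = 0
    i = 0
    while i < len(latex):
        ch = latex[i]
        if ch == "\\":  # skip the backslash and the character it escapes
            i += 2
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # a closing brace with nothing open
                return False
        i += 1
    return depth == 0


ok = braces_balanced(r"\frac{a}{b}")
broken = braces_balanced(r"\frac{a}{b")
```

A check like this says nothing about unknown commands or misplaced `^` and `_`; only a real compile covers those.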
## Why Handwriting Is Harder Than Print
Printed math (from textbooks or PDFs) has consistent, high-contrast strokes. Handwriting varies enormously — in size, slant, stroke weight, and letter formation. Two people's handwritten `7` and `1` can look nearly identical, and two people's `β` can look completely different.
TexPixel's model was trained on a large, diverse dataset of handwritten math to handle this variation. But accuracy on handwriting is always lower than on print: typically 88–95% versus 95–99%. The [tips in our handwriting guide](/blog/handwriting-tips) can push that toward the upper end.
## The Whole Pipeline in One Second
Preprocessing → symbol detection → structure parsing → LaTeX generation: all of this runs in under a second. It's a well-engineered pipeline, not magic — but the speed still surprises most people the first time they try it.
[Upload a formula and see it in action →](/app)

@@ -0,0 +1,71 @@
---
title: "From Whiteboard to LaTeX in 3 Seconds: A Student's Workflow"
description: How students use TexPixel to turn lecture notes and homework into clean digital documents without retyping a single formula
slug: student-workflow
date: 2026-02-01
tags: [tutorial, workflow, students]
---
# From Whiteboard to LaTeX in 3 Seconds: A Student's Workflow
If you've ever spent 20 minutes wrestling with `\underbrace`, `\overset`, or a nested fraction in LaTeX just to transcribe something your professor wrote in 10 seconds on a whiteboard — this workflow is for you.
## The Problem With Retyping
Retyping formulas by hand is slow, error-prone, and interrupts the flow of note-taking. A single misplaced brace breaks compilation. A wrong symbol — `\mu` instead of `\upsilon`, say — can change the meaning entirely. And some constructs, like large piecewise functions or multi-line aligned systems, take real LaTeX expertise to format correctly.
TexPixel removes all of this friction.
## The Workflow
### During the Lecture
Photograph each formula as it appears on the board. Don't worry about perfect framing — a quick phone shot is fine. A 150+ DPI photo taken under decent lighting gives TexPixel everything it needs.
You don't have to process anything during class. Just build up a folder of photos.
### After Class
1. Open TexPixel. Drag and drop the first photo.
2. In under a second, you get LaTeX output — paste it directly into your Overleaf document or VS Code `.tex` file.
3. Repeat for each formula.
For a typical lecture with 10–15 formulas, this takes about 2 minutes. Compare that to 20–30 minutes of manual retyping.
### For Homework
When working through problem sets:
1. Solve the problem on paper as you normally would.
2. Take a photo of your work.
3. Upload to TexPixel to extract the key formulas.
4. Paste into your write-up.
This is especially useful for multi-step derivations where you want to show your work digitally.
## Exporting to Word
Not using LaTeX? If your professor requires Word submissions, use TexPixel's DOCX export. It produces native Word equations — not images — so you can still edit them in Word's equation editor after exporting.
## A Real Example
Here's a typical formula from a linear algebra lecture:
$$A = U \Sigma V^T$$
Manual LaTeX: `A = U \Sigma V^T` — straightforward, but you need to know `\Sigma` and `V^T`.
With TexPixel: photograph it, get `A = U \Sigma V^T` in one second, paste. For more complex expressions, such as the SVD written out as a sum over singular values with indexed vectors, the time savings are even more dramatic.
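For reference, the longer form of the same decomposition looks like this in LaTeX. Here $r$ is the rank of $A$, $\sigma_i$ are the singular values, and $u_i$, $v_i$ are the corresponding columns of $U$ and $V$; these symbol names are the usual textbook convention rather than anything specific to TexPixel.

```latex
A = U \Sigma V^T = \sum_{i=1}^{r} \sigma_i \, u_i v_i^T
```

Note that the line uses both `\Sigma` (the matrix) and `\sum` (the summation operator), which is exactly the kind of pair that is easy to mix up when typing by hand.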
## Tips for Lecture Photography
- **Position yourself centrally** — formulas at the edges of the board get distorted by perspective
- **Wait for the professor to finish writing** — partial formulas confuse the parser
- **Avoid flash** — it creates glare and washes out chalk or whiteboard markers
- **Crop if needed** — if a photo contains multiple formulas, crop before uploading
## Building a Formula Library
Over a semester, you'll accumulate dozens of recognized formulas. Consider organizing them: paste each into a reference `.tex` file with a short comment. By exam time, you'll have a searchable personal formula sheet that took almost no effort to build.
[Start digitizing your notes →](/app)

@@ -0,0 +1,73 @@
---
title: "Why Your PDF Formulas Come Out Wrong (and How to Fix It)"
description: The most common reasons PDF formula extraction produces errors, and exactly how to fix each one
slug: pdf-formula-issues
date: 2026-02-15
tags: [troubleshooting, PDF, tips]
---
# Why Your PDF Formulas Come Out Wrong (and How to Fix It)
PDF formula extraction should be simple — upload, get LaTeX, done. But sometimes the output looks garbled, symbols are missing, or the extractor says no formulas were found. Here's a breakdown of the most common causes and how to fix each one.
## Problem 1: The PDF is a Scan
**Symptoms:** Symbols look correct on screen but extraction output is garbage or empty.
**Why it happens:** A scanned PDF is just a collection of images — there's no actual text layer. The text you see in your PDF reader is either from OCR performed at scan time (often poor quality) or from the image itself.
**Fix:** Run TexPixel's image-based pipeline instead. Export individual pages as PNG at 300 DPI using any PDF viewer (in Preview, File → Export with PNG as the format; in Adobe Acrobat, the Export PDF feature), then upload the PNG directly. Image-based recognition handles scans correctly; direct PDF text extraction does not.
## Problem 2: Low-DPI Scan
**Symptoms:** Some symbols recognized correctly, others replaced with wrong characters or dropped entirely.
**Why it happens:** Below about 150 DPI, strokes in small symbols like `\prime`, `\cdot`, or subscript characters become a few pixels wide — too blurry to reliably distinguish.
**Fix:** Rescan at 300 DPI. Many flatbed scanners default to a lower setting such as 200 DPI; going from 200 to 300 gives every stroke 1.5 times as many pixels across, at the cost of 2.25 times as many pixels per page, which is usually a manageable increase in file size. For phone scans, use a dedicated scanner app (e.g., Adobe Scan, Microsoft Lens), which applies automatic sharpening and perspective correction.
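The effect of resolution on small symbols is simple arithmetic. The sketch below assumes a pen stroke 0.2 mm wide (an illustrative figure, not a measured one) and converts it to pixels at a given scan resolution; it also computes how the raw pixel count grows between two DPI settings.

```python
MM_PER_INCH = 25.4


def stroke_width_px(stroke_mm, dpi):
    """Width in pixels of a stroke `stroke_mm` millimetres wide scanned at `dpi`."""
    return stroke_mm / MM_PER_INCH * dpi


def pixel_count_ratio(old_dpi, new_dpi):
    """How many times more pixels a page has at `new_dpi` than at `old_dpi`."""
    return (new_dpi / old_dpi) ** 2


widths = {dpi: round(stroke_width_px(0.2, dpi), 2) for dpi in (72, 150, 300)}
growth = pixel_count_ratio(200, 300)
```

Under that assumption the stroke is about 1.2 pixels wide at 150 DPI and about 2.4 pixels at 300 DPI, while the page holds 2.25 times as many pixels at 300 DPI as at 200. Compressed file size depends on the format and usually does not scale exactly with pixel count.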
## Problem 3: Password-Protected PDF
**Symptoms:** "No formulas found" or upload fails entirely.
**Why it happens:** Encrypted PDFs require a password to access their content stream. TexPixel cannot process the content of a locked file.
**Fix:** Remove the password protection before uploading. In Preview (Mac), open the file with the password, then choose File → Export as PDF; the exported file won't have the password. In Adobe Reader, File → Print → Save as PDF can work as well, provided the document's permissions allow printing.
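If you want to check a file before uploading, encrypted PDFs carry an `/Encrypt` entry in their trailer dictionary, so a byte search gives a rough answer. This is a heuristic sketch only: the function name is an assumption, the same bytes could in principle appear in an unencrypted file, and a proper check would use a PDF library.

```python
def looks_encrypted(pdf_bytes):
    """Rough heuristic: a PDF whose bytes contain an /Encrypt key is likely encrypted."""
    return pdf_bytes.startswith(b"%PDF-") and b"/Encrypt" in pdf_bytes


# Two tiny stand-ins for real files (not valid PDFs, just enough for the check).
plain = b"%PDF-1.7\ntrailer\n<< /Root 1 0 R >>\n%%EOF"
locked = b"%PDF-1.7\ntrailer\n<< /Root 1 0 R /Encrypt 5 0 R >>\n%%EOF"

plain_result = looks_encrypted(plain)
locked_result = looks_encrypted(locked)
```

For a file on disk, the bytes could be read with `pathlib.Path(path).read_bytes()` and passed to the same function.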
## Problem 4: Formulas Stored as Vector Paths
**Symptoms:** PDF looks perfect, but extraction returns nothing or incorrect text.
**Why it happens:** Some PDF generators (certain Word versions, some online LaTeX renderers) rasterize or vectorize math into paths — the formulas are essentially drawings, not characters. There's no character stream to extract.
**Fix:** Export the page as a high-resolution PNG (300 DPI), then upload as an image. TexPixel's visual recognition pipeline handles vector-rendered formulas well.
## Problem 5: Multi-Column Layout
**Symptoms:** Formulas from two columns are merged or interleaved in the output.
**Why it happens:** PDF text streams don't always encode reading order correctly, especially in two-column academic papers.
**Fix:** Crop to a single column before uploading. Use any image editor to crop the page into left and right halves, then upload each separately.
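If you crop many pages, the two crop boxes can be computed rather than drawn by hand. The sketch below returns boxes in the `(left, upper, right, lower)` convention used by Pillow's `Image.crop`; using Pillow is an assumption here, and the `gutter` parameter (the blank strip between columns, in pixels) is a hypothetical knob for this example.

```python
def column_boxes(width, height, gutter=0):
    """Return (left, upper, right, lower) boxes for the left and right page halves."""
    mid = width // 2
    half_gutter = gutter // 2
    left = (0, 0, mid - half_gutter, height)
    right = (mid + half_gutter, 0, width, height)
    return left, right


# A page of 2480 x 3508 pixels with an 80-pixel gutter.
left_box, right_box = column_boxes(2480, 3508, gutter=80)
```

This splits at the exact middle of the page, so it only suits layouts with symmetric margins; an off-centre gutter needs the split point set by hand.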
## Problem 6: Handwritten Annotations
**Symptoms:** Handwritten notes over a printed formula confuse the output.
**Why it happens:** TexPixel sees both the printed formula and the handwritten annotations together. It may try to recognize the annotations as part of the formula.
**Fix:** Crop tightly to just the printed formula, excluding any handwriting around it.
## Quick Diagnostic Checklist
Before uploading a problematic PDF:
- [ ] Is it a scan or a born-digital PDF?
- [ ] If a scan, what DPI was it scanned at?
- [ ] Is it password-protected?
- [ ] Does it have a two-column layout?
- [ ] Are there handwritten annotations?
Working through this list resolves the issue 90% of the time.
[Upload your PDF →](/app)

@@ -0,0 +1,74 @@
---
title: "Copy Math to Word Without Losing Formatting — The Right Way"
description: Three methods for getting recognized formulas into Microsoft Word, ranked by quality and effort
slug: copy-math-to-word
date: 2026-03-01
tags: [tutorial, Word, export]
---
# Copy Math to Word Without Losing Formatting — The Right Way
Most people's first instinct when they need a formula in a Word document is to take a screenshot. It works — until you need to resize the document, change the font, or edit the formula. Screenshots break. Native equations don't.
Here are three ways to get TexPixel's output into Word, from best to worst.
## Method 1: DOCX Export (Best)
The cleanest option. TexPixel converts your recognized formula into a native Word equation (OMML format) and packages it in a `.docx` file.
**How:**
1. Upload your formula image to TexPixel.
2. Click **Export** → select **DOCX**.
3. Open the downloaded file in Word.
4. Select the equation, copy, paste into your target document.
**Why it's best:** The formula is fully editable in Word's built-in equation editor. Double-click it to open the editor, change any symbol, resize it — it behaves exactly like an equation you typed yourself. It also scales correctly when you change font sizes.
**Limitation:** Each upload produces one `.docx` file. If you have many formulas to insert, you'll need to repeat the process or batch them (see below).
## Method 2: Paste LaTeX into Word's Equation Editor (Good)
Word 2019+ and Microsoft 365 support pasting LaTeX directly into equations.
**How:**
1. Get the LaTeX output from TexPixel (e.g., `x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}`).
2. In Word, insert a new equation: **Insert → Equation** (or press `Alt+=`).
3. Make sure the equation is in **LaTeX mode**: with the equation selected, choose **LaTeX** in the **Conversions** group of the **Equation** tab.
4. Paste the LaTeX string. Press **Enter**, or use **Convert** with a Professional option if the equation does not render on its own.
Word converts the LaTeX to a rendered, editable equation.
**Why it's good:** Fast for single formulas. No file download required.
**Limitation:** Word's LaTeX parser doesn't support all LaTeX commands. Obscure or complex expressions may not render correctly. Test before relying on it for important documents.
## Method 3: Image Export (Worst, But Sometimes Necessary)
Export the formula as a PNG and insert it as an image in Word.
**When to use:** Only when you need the formula in a document being shared with someone who doesn't have Word's equation editor (e.g., older Word versions, third-party editors). Or when a complex formula doesn't render correctly via Methods 1 or 2.
**Downsides:** Not editable. Doesn't scale well. Accessibility tools can't read it.
## Handling Multiple Formulas
If you have many formulas to insert into a single document:
1. Upload each formula image and collect the LaTeX strings.
2. Open a new Word document.
3. For each formula, use the **Alt+=** method above to insert them in sequence.
4. Once all formulas are inserted, copy and paste the entire equation block into your target document.
This is faster than one DOCX export per formula.
## Google Docs
Google Docs doesn't natively support LaTeX paste. Options:
- Use the **Auto-LaTeX Equations** Google Docs add-on, which renders LaTeX strings as inline images.
- Export as DOCX and open in Google Docs (equations import as images, not editable).
- Use a tool like `mathpix-markdown-it` to convert to Markdown and render in a Markdown-compatible environment.
For serious equation-heavy work, Word or Overleaf remain better choices than Google Docs.
[Export your next formula to Word →](/app)

@@ -0,0 +1,82 @@
---
title: "Digitizing a Decade of Research Notes with TexPixel"
description: How researchers use TexPixel to convert years of handwritten math into searchable, editable LaTeX documents
slug: researcher-workflow
date: 2026-03-08
tags: [workflow, research, tutorial]
---
# Digitizing a Decade of Research Notes with TexPixel
Researchers accumulate notebooks. Derivations sketched out at conferences, margin notes on printed papers, whiteboard captures from group meetings, half-finished proofs from 3 AM. For most of history, this material was effectively unsearchable — trapped in physical form, accessible only by paging through stacks of notebooks.
TexPixel changes the equation (so to speak).
## The Scope of the Problem
A typical active researcher might accumulate 5–10 filled notebooks per year, each containing hundreds of equations. Digitizing this by hand, retyping each formula in LaTeX, is essentially impossible. At 3 minutes per formula and 500 formulas per notebook, ten notebooks (a year's worth at the upper end) would take about 250 hours to transcribe manually.
With TexPixel, each formula takes under 5 seconds from photo to LaTeX. The same year's worth of notes: under 7 hours.
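The estimate is easy to rerun with your own numbers. The sketch below assumes 10 notebooks a year and 500 formulas per notebook; both are illustrative figures chosen to be consistent with "hundreds of equations" and with the 7-hour result, not measurements.

```python
def transcription_hours(notebooks, formulas_per_notebook, seconds_per_formula):
    """Total hours needed to transcribe every formula at a fixed time per formula."""
    total_formulas = notebooks * formulas_per_notebook
    return total_formulas * seconds_per_formula / 3600


manual_hours = transcription_hours(10, 500, 180)  # 3 minutes per formula
assisted_hours = transcription_hours(10, 500, 5)  # 5 seconds per formula
```

With these inputs the manual figure is 250 hours and the assisted figure is just under 7 hours. The ratio between them (36 to 1) depends only on the per-formula times, not on how many notebooks you have.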
## A Practical Digitization Workflow
### Step 1: Photograph the Notebooks
Use a phone with a good camera and a document scanner app (Adobe Scan, Microsoft Lens, or Apple's built-in document scanner). These apps:
- Automatically detect page edges
- Correct perspective distortion
- Apply contrast enhancement for faded ink or pencil
- Export to PDF
Scan a full notebook in 15–20 minutes.
### Step 2: Identify Formula-Dense Pages
Not every page needs digitizing. Quickly flip through and flag pages with equations you'll actually need. A single key derivation or set of equations is often worth digitizing even if the surrounding text isn't.
### Step 3: Batch Process with TexPixel
For each flagged page:
1. Export the page or crop area as a PNG
2. Upload to TexPixel
3. Copy the LaTeX output into your notes
For formula-dense pages, consider cropping individual formulas rather than uploading the full page — this gives more accurate results and cleaner output.
### Step 4: Organize into a Reference Document
Create a `.tex` document (or Overleaf project) structured by topic. Paste each extracted formula with a brief comment about its context:
```latex
% Variational lower bound — from 2022 NeurIPS derivation
\mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - D_{KL}(q_\phi(z|x) \| p(z))
```
After a few sessions, you'll have a searchable, compilable reference document that took a fraction of the time of manual transcription.
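For the document to compile, the pasted formulas need a preamble and a math environment around them. A minimal skeleton might look like the following; the section title is an assumed topic name, and `amssymb` is loaded because `\mathbb` needs it.

```latex
\documentclass{article}
\usepackage{amsmath, amssymb}  % amssymb provides \mathbb

\begin{document}

\section{Variational inference}
% Variational lower bound, from 2022 NeurIPS derivation
\[
  \mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - D_{KL}(q_\phi(z|x) \| p(z))
\]

\end{document}
```

One `\section` per topic keeps the file navigable, and the `%` comments stay searchable even though they never appear in the compiled PDF.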
## Working with Whiteboards
Conference room whiteboards are particularly valuable targets. A single group meeting might produce 20–30 key equations that would otherwise be lost when someone erases the board.
**Best practice:** Photograph the whiteboard before it's erased (obvious) but also photograph intermediate steps — derivations that get overwritten as the discussion progresses. The intermediate steps are often where the insight lives.
For whiteboards:
- Photograph straight-on, not at an angle
- Use even lighting — a photo taken with the lights on and no flash usually works better than using flash, which creates glare on glossy boards
- Crop each distinct equation before uploading
## Working with Printed Papers
For annotated printed papers, TexPixel can extract both the printed formulas and (with somewhat lower accuracy) handwritten margin notes. Crop tightly to the region you need, and upload each formula separately from its annotations.
## Building a Long-Term Knowledge Base
The real value of digitization compounds over time. A well-organized LaTeX reference document from 5 years of notes is something you can:
- Search with `grep` or your editor's search
- Cross-reference with a citation manager
- Share with collaborators
- Build on directly when writing new papers
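Searching works best if every formula has a comment line directly above it. As a sketch of what a search over such a file could look like, the function below returns each matching comment with the line that follows it. It assumes that one-comment-then-one-formula layout; the function name and sample text are made up for illustration.

```python
def find_formulas(tex_source, keyword):
    """Return (comment, formula) pairs whose comment line mentions `keyword`."""
    lines = tex_source.splitlines()
    hits = []
    for i, line in enumerate(lines[:-1]):
        is_comment = line.lstrip().startswith("%")
        if is_comment and keyword.lower() in line.lower():
            comment = line.lstrip("% ").strip()
            hits.append((comment, lines[i + 1].strip()))
    return hits


reference = "\n".join([
    "% Variational lower bound",
    r"\mathcal{L}(\theta, \phi)",
    "% Singular value decomposition",
    r"A = U \Sigma V^T",
])
matches = find_formulas(reference, "singular")
```

Multi-line formulas would need a different rule for where a formula ends, for example reading until the next blank line.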
Start with the past year's notebooks. The 7-hour investment pays dividends for years.
[Start digitizing your notes →](/app)

@@ -0,0 +1,51 @@
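---
title: "How AI Reads Math: Inside TexPixel's Recognition Engine"
description: A plain-language explanation of how TexPixel turns a photo of a formula into clean LaTeX code
slug: how-ai-reads-math
date: 2026-01-15
tags: [technology, principles]
---
# How AI Reads Math: Inside TexPixel's Recognition Engine
When you upload a photo of a handwritten integral and get clean LaTeX back in under a second, it feels like magic. It isn't, but the engineering behind it is genuinely interesting. Below is a plain-language explanation of how TexPixel turns pixels into math.
## Step 1: Image Preprocessing
Before recognition starts, the image is cleaned up. This step matters far more than most people expect.
TexPixel normalizes contrast, removes noise, deskews tilted images, and isolates the formula region from surrounding whitespace, printed text, or ruled lines. A formula photographed under harsh side-lighting, or scanned at a slight angle, is corrected before the model ever sees it.
This is why image quality affects accuracy so much: preprocessing can compensate for minor flaws, but severe blur or extremely low resolution (below about 72 DPI) leaves too little information to work with.
## Step 2: Symbol Detection
The preprocessed image is fed into a visual encoder: a neural network that has learned the shapes of mathematical symbols from millions of math images.
The core challenge here is not recognizing individual symbols in isolation but recognizing them **in context**. An `x` looks different as a variable, as a multiplication sign, and in different handwriting. The model tells these cases apart from the surrounding context: is there a dot nearby? Where does the symbol sit vertically relative to a fraction bar?
This contextual understanding is the essential difference between a good math OCR system and a general-purpose character recognizer.
## Step 3: Structure Parsing
Recognizing symbols solves only half the problem. Math is two-dimensional in a way ordinary text is not. A fraction has a numerator above a denominator; an integral has upper and lower limits; a matrix arranges expressions in rows and columns.
TexPixel's parser builds a structural tree from the detected symbols, working out that this expression is a subscript of that symbol and that expression sits inside a square root. The tree is then serialized into LaTeX, where the structural relationships are encoded as commands such as `\frac{}{}`, `\sqrt{}`, and `\sum_{}^{}`.
## Step 4: LaTeX Generation
The final step is walking the structural tree and emitting valid LaTeX. This includes resolving ambiguous cases: for example, deciding from context whether a capital `Σ` is the summation operator `\sum` or the Greek letter `\Sigma`.
The output is validated before it is returned, to make sure it compiles without errors.
## Why Handwriting Is Harder Than Print
Printed math (from textbooks or PDFs) has consistent, high-contrast strokes. Handwriting varies enormously in size, slant, stroke weight, and letter formation. Two people's `7` and `1` can look nearly identical, while two people's `β` can look completely different.
TexPixel's model was trained on a large, diverse dataset of handwritten math to cope with this variation. But accuracy on handwriting is always lower than on print: typically 88–95% versus 95–99%. The advice in the [handwriting tips guide](/blog/handwriting-tips) can push accuracy toward the upper end.
## The Whole Pipeline in One Second
Preprocessing → symbol detection → structure parsing → LaTeX generation: all of this finishes in under a second. It is a carefully designed pipeline, not magic, but the speed still surprises most people the first time they try it.
[Upload a formula and see for yourself →](/app)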

@@ -0,0 +1,71 @@
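---
title: "From Whiteboard to LaTeX in 3 Seconds: A Student's Efficient Workflow"
description: How to use TexPixel to turn lecture notes and homework into clean digital documents without typing a single formula by hand
slug: student-workflow
date: 2026-02-01
tags: [tutorial, workflow, students]
---
# From Whiteboard to LaTeX in 3 Seconds: A Student's Efficient Workflow
If you have ever spent 20 minutes fighting with `\underbrace`, `\overset`, or nested fractions to reproduce something your professor wrote on the board in 10 seconds, this workflow is for you.
## The Problem With Manual Entry
Retyping formulas by hand is slow, error-prone, and breaks the rhythm of note-taking. One misplaced brace makes compilation fail. One wrong symbol, say `\mu` typed as `\upsilon`, can change the meaning entirely. And some constructs, such as large piecewise functions or multi-line aligned systems, take real LaTeX expertise to format correctly.
TexPixel removes all of this friction.
## The Workflow
### During Class
Take a photo each time a formula appears on the board. Don't worry about perfect framing; a quick phone shot is enough. A 150+ DPI photo taken in decent light gives TexPixel what it needs.
You don't need to process anything during class. Just build up a folder of photos.
### After Class
1. Open TexPixel and drag in the first photo
2. In under a second you get LaTeX output; paste it straight into your Overleaf document or a `.tex` file in VS Code
3. Repeat for each formula photo
For a lecture with 10–15 formulas, the whole process takes about 2 minutes. Compared with 20–30 minutes of manual entry, the difference is substantial.
### For Homework
While working through problems:
1. Solve the problem on paper as usual
2. Photograph your working
3. Use TexPixel to extract the key formulas
4. Paste them into your write-up
This is especially useful for multi-step derivations where you need to show your work.
## Exporting to Word
Not using LaTeX? If your professor requires Word submissions, use TexPixel's DOCX export. It produces native Word equations, not images, so you can still edit them in Word's equation editor after exporting.
## A Real Example
A typical formula from a linear algebra class:
$$A = U \Sigma V^T$$
Manual LaTeX: `A = U \Sigma V^T`. Simple enough, but you need to know how to write `\Sigma` and `V^T`.
With TexPixel: take the photo, get `A = U \Sigma V^T` in one second, paste. For more complex expressions, such as the SVD written out with summation notation and subscripts, the time saved is even greater.
## Tips for Lecture Photography
- **Stand in the middle**: formulas near the edges get distorted by perspective
- **Wait until the professor finishes writing**: incomplete formulas confuse the parser
- **Don't use flash**: it causes glare and washes out chalk or whiteboard marker
- **Crop when needed**: if a photo contains several formulas, crop before uploading
## Building a Formula Library
Over a semester you will accumulate dozens of recognized formulas. Consider organizing them: paste each one into a reference `.tex` file with a short comment. By exam time you will have a searchable personal formula sheet that took almost no effort to build.
[Start digitizing your notes →](/app)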

@@ -0,0 +1,73 @@
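---
title: "Why PDF Formula Recognition Goes Wrong and How to Fix It"
description: The most common causes of errors in PDF formula extraction, with a specific fix for each
slug: pdf-formula-issues
date: 2026-02-15
tags: [troubleshooting, PDF, tips]
---
# Why PDF Formula Recognition Goes Wrong and How to Fix It
PDF formula extraction should be simple: upload, get LaTeX, done. But sometimes the output is garbled, symbols go missing, or the tool reports that no formulas were found. Below is an analysis of the most common causes and the fix for each.
## Problem 1: The PDF Is a Scan
**Symptoms:** Formulas display correctly on screen, but the extracted output is garbled or empty.
**Cause:** A scanned PDF is really just a set of images with no true text layer. The text you see in your PDF reader comes either from OCR performed at scan time (often poor quality) or directly from the image itself.
**Fix:** Use TexPixel's image recognition pipeline. Export the page as a 300 DPI PNG with any PDF viewer (in Preview, File → Export with PNG as the format, or Adobe Acrobat's Export PDF feature), then upload the PNG directly. Image recognition handles scans correctly; direct PDF text extraction does not.
## Problem 2: Scan Resolution Too Low
**Symptoms:** Some symbols are recognized correctly; others are replaced with wrong characters or dropped.
**Cause:** Below about 150 DPI, the strokes of small symbols such as `\prime`, `\cdot`, or subscript characters are only a few pixels wide, too blurry to distinguish reliably.
**Fix:** Rescan at 300 DPI. Many flatbed scanners default to 200 DPI; raising this to 300 DPI improves results markedly without a large increase in file size. For phone scans, use a dedicated scanning app such as Adobe Scan or Microsoft Lens; these apps sharpen automatically and correct perspective.
## Problem 3: The PDF Is Password-Protected
**Symptoms:** "No formulas found" is shown, or the upload fails completely.
**Cause:** An encrypted PDF needs a password before its content stream can be accessed. TexPixel cannot process the content of an encrypted file.
**Fix:** Remove the password protection before uploading. In Preview on a Mac, open the file with the password, then choose File → Export as PDF; the exported file has no password. In Adobe Reader, File → Print → Save as PDF can work as well, provided the document's permissions allow printing.
## Problem 4: Formulas Stored as Vector Paths
**Symptoms:** The PDF looks perfect, but extraction returns nothing or incorrect text.
**Cause:** Some PDF generators (certain versions of Word, some online LaTeX renderers) rasterize math or convert it to vector paths. The formulas are essentially drawings rather than characters, so there is no character stream to extract.
**Fix:** Export the page as a high-resolution PNG (300 DPI), then upload it as an image. TexPixel's visual recognition pipeline handles vector-rendered formulas well.
## Problem 5: Two-Column Layout
**Symptoms:** Formulas from the two columns are merged or interleaved in the output.
**Cause:** PDF text streams are not always encoded in the correct reading order, especially in two-column academic papers.
**Fix:** Crop to a single column before uploading. Use any image editor to cut the page into left and right halves and upload each separately.
## Problem 6: Handwritten Annotations
**Symptoms:** Handwritten notes on top of a printed formula interfere with the output.
**Cause:** TexPixel sees the printed formula and the handwritten annotations together and may try to recognize the annotations as part of the formula.
**Fix:** Crop tightly so that only the printed formula remains, excluding the handwriting around it.
## Quick Diagnostic Checklist
Before uploading a problematic PDF, check:
- [ ] Is it a scan or a born-digital PDF?
- [ ] If it is a scan, what DPI was it scanned at?
- [ ] Is it password-protected?
- [ ] Does it use a two-column layout?
- [ ] Are there handwritten annotations?
Working through the list item by item resolves 90% of problems.
[Upload your PDF →](/app)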

@@ -0,0 +1,74 @@
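---
title: "Paste Formulas into Word Without Losing Formatting: The Right Way"
description: Three ways to bring recognized formulas into Microsoft Word, ranked by quality and effort
slug: copy-math-to-word
date: 2026-03-01
tags: [tutorial, Word, export]
---
# Paste Formulas into Word Without Losing Formatting: The Right Way
Most people's first instinct is to take a screenshot. That works until you need to resize the document, change the font, or edit the formula. Screenshots break; native equations don't.
Here are three ways to bring TexPixel's output into Word, ordered from best to worst.
## Method 1: DOCX Export (Best)
The cleanest option. TexPixel converts the recognized formula into a native Word equation (OMML format) and packages it in a `.docx` file.
**Steps:**
1. Upload the formula image to TexPixel
2. Click **Export** → choose **DOCX**
3. Open the downloaded file in Word
4. Select the equation, copy it, and paste it into the target document
**Why it's best:** The formula is fully editable in Word's built-in equation editor. Double-click to open the editor, change any symbol, resize it; it behaves exactly like an equation you typed yourself. It also scales correctly when you change the font size.
**Limitation:** Each upload produces one `.docx` file. If you have many formulas to insert, you need to repeat the process or batch them.
## Method 2: Paste LaTeX into Word's Equation Editor (Good)
Word 2019+ and Microsoft 365 support pasting LaTeX directly into an equation box.
**Steps:**
1. Get the LaTeX output from TexPixel (for example, `x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}`)
2. In Word, insert a new equation: **Insert → Equation** (or press `Alt+=`)
3. Make sure the equation is in **LaTeX mode**: with the equation selected, choose **LaTeX** in the **Conversions** group of the **Equation** tab
4. Paste the LaTeX string, then press **Enter**, or use **Convert** with a Professional option if the equation does not render on its own
Word converts the LaTeX into a rendered, editable equation.
**Why it's good:** Quick for single formulas, with no file to download.
**Limitation:** Word's LaTeX parser does not support every LaTeX command. Complex or uncommon expressions may not render correctly. Test before relying on it for important documents.
## Method 3: Image Export (Worst, But Sometimes Necessary)
Export the formula as a PNG and insert it into Word as an image.
**When to use:** Only when you need to share the document with someone who doesn't have Word's equation editor (e.g., older versions of Word or third-party editors), or when a complex formula does not render correctly via Methods 1 and 2.
**Downsides:** Not editable, scales poorly, and cannot be read by accessibility tools.
## Handling Multiple Formulas
If you need to insert many formulas into one document:
1. Upload each formula image and collect the LaTeX strings
2. Open a new Word document
3. For each formula, insert it in turn using the `Alt+=` method above
4. Once all formulas are inserted, copy and paste the whole block of equations into the target document
This is faster than exporting a separate DOCX for each formula.
## Google Docs
Google Docs does not natively support pasting LaTeX. Options:
- Use the **Auto-LaTeX Equations** add-on for Google Docs, which renders LaTeX strings as inline images
- Export as DOCX and open the file in Google Docs (equations are imported as images and are not editable)
- Use a tool such as `mathpix-markdown-it` to convert to Markdown and render in an environment that supports Markdown
For work that involves a lot of formulas, Word or Overleaf remain better choices than Google Docs.
[Export your next formula to Word →](/app)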

@@ -0,0 +1,82 @@
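---
title: "Digitizing a Decade of Research Notes with TexPixel"
description: How researchers use TexPixel to convert years of handwritten math notes into searchable, editable LaTeX documents
slug: researcher-workflow
date: 2026-03-08
tags: [workflow, research, tutorial]
---
# Digitizing a Decade of Research Notes with TexPixel
Researchers accumulate notebooks. Derivations sketched at conferences, margin notes on printed papers, photos of whiteboards from group meetings, proofs left half-finished at 3 AM. For a long time this material was effectively unsearchable: trapped in physical form, found only by paging through stacks of notebooks.
TexPixel changes that.
## The Scale of the Problem
An active researcher might fill 5–10 notebooks a year, each containing hundreds of equations. Digitizing them by hand, retyping formula after formula in LaTeX, is close to impossible. At 3 minutes per formula and 500 formulas per notebook, ten notebooks (a year's worth at the upper end) would take about 250 hours to transcribe manually.
With TexPixel, each formula takes under 5 seconds from photo to LaTeX. The same year of notes: under 7 hours.
## A Practical Digitization Workflow
### Step 1: Photograph the Notebooks
Use a phone with a good camera and a document scanning app (Adobe Scan, Microsoft Lens, or Apple's built-in document scanner). These apps can:
- Detect page edges automatically
- Correct perspective distortion
- Enhance contrast for faded ink or pencil
- Export to PDF
Scanning a whole notebook takes 15–20 minutes.
### Step 2: Identify Formula-Dense Pages
Not every page needs digitizing. Flip through quickly and flag the pages with equations you will actually need. A single key derivation or set of equations is often worth digitizing even when the surrounding text is not.
### Step 3: Batch Process with TexPixel
For each flagged page:
1. Export the page or a cropped region as a PNG
2. Upload it to TexPixel
3. Copy the LaTeX output into your notes
For formula-dense pages, crop individual formulas rather than uploading the whole page; this gives more accurate results and cleaner output.
### Step 4: Organize into a Reference Document
Create a `.tex` document (or Overleaf project) organized by topic. Paste in each extracted formula with a short note on its context:
```latex
% Variational lower bound, from 2022 NeurIPS derivation
\mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - D_{KL}(q_\phi(z|x) \| p(z))
```
After a few sessions you will have a searchable, compilable reference document built in a fraction of the time manual transcription would take.
## Working with Whiteboards
Meeting-room whiteboards are especially valuable targets. One group meeting can produce 20–30 key equations that would otherwise vanish when someone wipes the board.
**Best practice:** Photograph the board before it is erased (obviously), but also photograph intermediate steps: derivations that get overwritten as the discussion moves on. The intermediate steps are often where the insight lies.
When photographing whiteboards:
- Shoot straight-on, not at an angle
- Use even lighting: room lights on and no flash usually works better than flash, which causes glare on glossy boards
- Crop each formula separately before uploading
## Working with Printed Papers
For annotated printed papers, TexPixel can extract the printed formulas and can also recognize handwritten margin notes, with somewhat lower accuracy. Crop tightly to the region you need, and upload formulas separately from their annotations.
## Building a Long-Term Knowledge Base
The real value of digitization compounds over time. A well-structured LaTeX reference document built from 5 years of notes is something you can:
- Search with `grep` or your editor's search function
- Cross-reference with a citation manager
- Share with collaborators
- Build on directly when writing new papers
Start with the past year's notebooks. The 7-hour investment will pay off for years.
[Start digitizing your notes →](/app)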