From 99e1314bf9972d00841f69aa6ad0b819ea0165b4 Mon Sep 17 00:00:00 2001
From: yoge
Date: Thu, 26 Mar 2026 16:52:27 +0800
Subject: [PATCH] refact: eliminate blog/docs content overlap
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Delete blog/copy-math-to-word (EN+ZH) — identical to docs/copy-to-word
- Rewrite blog/pdf-formula-issues as narrative troubleshooting story;
  operational steps now link out to docs/pdf-extraction
- Add "Further reading" cross-links: 4 docs → relevant blog posts
- Add "See also" cross-links: 3 blog posts → relevant docs

Docs = product reference; Blog = narrative/use cases/opinions

Co-Authored-By: Claude Sonnet 4.6
---
 .../blog/en/2026-02-01-student-workflow.md    |  2 +
 .../blog/en/2026-02-15-pdf-formula-issues.md  | 74 +++++++------------
 .../blog/en/2026-03-01-copy-math-to-word.md   | 74 -------------------
 .../blog/en/2026-03-08-researcher-workflow.md |  2 +
 .../blog/en/2026-03-20-handwriting-tips.md    |  2 +
 .../blog/zh/2026-02-01-student-workflow.md    |  2 +
 .../blog/zh/2026-02-15-pdf-formula-issues.md  | 74 +++++++------------
 .../blog/zh/2026-03-01-copy-math-to-word.md   | 74 -------------------
 .../blog/zh/2026-03-08-researcher-workflow.md |  2 +
 .../blog/zh/2026-03-20-handwriting-tips.md    |  2 +
 content/docs/en/copy-to-word.md               |  2 +
 content/docs/en/image-to-latex.md             |  2 +
 content/docs/en/ocr-accuracy.md               |  2 +
 content/docs/en/pdf-extraction.md             |  2 +
 content/docs/zh/copy-to-word.md               |  2 +
 content/docs/zh/image-to-latex.md             |  2 +
 content/docs/zh/ocr-accuracy.md               |  2 +
 content/docs/zh/pdf-extraction.md             |  2 +
 18 files changed, 82 insertions(+), 242 deletions(-)
 delete mode 100644 content/blog/en/2026-03-01-copy-math-to-word.md
 delete mode 100644 content/blog/zh/2026-03-01-copy-math-to-word.md

diff --git a/content/blog/en/2026-02-01-student-workflow.md b/content/blog/en/2026-02-01-student-workflow.md
index 379a7c4..37de43a 100644
--- a/content/blog/en/2026-02-01-student-workflow.md
+++ 
b/content/blog/en/2026-02-01-student-workflow.md @@ -68,4 +68,6 @@ With TexPixel: photograph it, get `A = U \Sigma V^T` in one second, paste. For m Over a semester, you'll accumulate dozens of recognized formulas. Consider organizing them: paste each into a reference `.tex` file with a short comment. By exam time, you'll have a searchable personal formula sheet that took almost no effort to build. +**See also:** For supported file types, size limits, and copy options, see the [Image to LaTeX documentation →](/docs/image-to-latex) + [Start digitizing your notes →](/app) diff --git a/content/blog/en/2026-02-15-pdf-formula-issues.md b/content/blog/en/2026-02-15-pdf-formula-issues.md index eaf0c19..ed91d12 100644 --- a/content/blog/en/2026-02-15-pdf-formula-issues.md +++ b/content/blog/en/2026-02-15-pdf-formula-issues.md @@ -1,73 +1,53 @@ --- -title: "Why Your PDF Formulas Come Out Wrong (and How to Fix It)" -description: The most common reasons PDF formula extraction produces errors, and exactly how to fix each one +title: "I Tried to Extract Formulas from My Professor's PDF. Here's What I Learned." +description: A real-world account of what goes wrong with PDF formula extraction — and why most problems come down to one of three root causes slug: pdf-formula-issues date: 2026-02-15 -tags: [troubleshooting, PDF, tips] +tags: [troubleshooting, PDF] --- -# Why Your PDF Formulas Come Out Wrong (and How to Fix It) +# I Tried to Extract Formulas from My Professor's PDF. Here's What I Learned. -PDF formula extraction should be simple — upload, get LaTeX, done. But sometimes the output looks garbled, symbols are missing, or the extractor says no formulas were found. Here's a breakdown of the most common causes and how to fix each one. +Last semester I was working through a 200-page lecture notes PDF — the kind that gets scanned from printed transparencies, emailed as a file attachment, and opens with a slightly-off angle on every page. 
I wanted to pull the key equations into my own notes. What followed was an education in how PDFs actually store (or don't store) mathematical content. -## Problem 1: The PDF is a Scan +## The First Surprise: Not All PDFs Are the Same -**Symptoms:** Symbols look correct on screen but extraction output is garbage or empty. +I naively assumed "PDF with formulas" meant "formulas I can extract." Not true. -**Why it happens:** A scanned PDF is just a collection of images — there's no actual text layer. The text you see in your PDF reader is either from OCR performed at scan time (often poor quality) or from the image itself. +There are at least three fundamentally different kinds of PDFs floating around in academic circles, and they behave completely differently: -**Fix:** Run TexPixel's image-based pipeline instead. Export individual pages as PNG at 300 DPI using any PDF viewer (File → Export as Image in Preview, or Adobe Acrobat's Export PDF feature), then upload the PNG directly. Image-based recognition handles scans correctly; direct PDF text extraction does not. +**Born-digital PDFs** (generated from LaTeX, Word, or typesetting software) contain actual vector math. Extraction from these is fast and 95%+ accurate — the formula structure is essentially already there. -## Problem 2: Low-DPI Scan +**Scanned PDFs** are just photographs of printed pages packaged into a container. There's no text layer. Extraction works through image recognition, and accuracy depends entirely on scan quality. My professor's notes were this kind. -**Symptoms:** Some symbols recognized correctly, others replaced with wrong characters or dropped entirely. +**Hybrid PDFs** have a text layer added by OCR software after scanning. Quality varies wildly — sometimes great, sometimes the "text" layer is completely wrong. These are the most unpredictable. 
-**Why it happens:** Below about 150 DPI, strokes in small symbols like `\prime`, `\cdot`, or subscript characters become a few pixels wide — too blurry to reliably distinguish. +## The Three Root Causes of Most Failures -**Fix:** Rescan at 300 DPI. Most modern flatbed scanners default to 200 DPI; bumping to 300 produces dramatically better results without significantly increasing file size. For phone scans, use a dedicated scanner app (e.g., Adobe Scan, Microsoft Lens) which applies automatic sharpening and perspective correction. +After a lot of trial and error, I found that failed extractions almost always come back to one of three things: -## Problem 3: Password-Protected PDF +**1. Resolution.** The scan was done at 150 DPI instead of 300. At low resolution, small symbols — subscripts, primes, dots — become a few pixels wide. The model can't reliably distinguish `\prime` from a stray speck. Rescanning at 300 DPI fixed more than half my problems. -**Symptoms:** "No formulas found" or upload fails entirely. +**2. Encryption.** Some PDFs are password-protected or have content restrictions that prevent any tool from reading the content stream. The PDF appears to open fine, but nothing can extract from it. Removing the password (File → Export as PDF in Preview, without the password lock) solved this. -**Why it happens:** Encrypted PDFs require a password to access their content stream. TexPixel cannot process the content of a locked file. +**3. Formulas stored as vector paths.** Some PDF generators draw equations as shapes rather than encoding them as characters. To any extraction tool, these formulas are invisible — just abstract geometry. The only way around this is to render the page as an image and run visual recognition on that instead. -**Fix:** Remove the password protection before uploading. In Preview (Mac), open with the password, then File → Export as PDF — the exported file won't have the password. In Adobe Reader, use File → Print → Save as PDF. 
+## What Actually Worked -## Problem 4: Formulas Stored as Vector Paths +For my professor's scanned notes, the workflow that worked: -**Symptoms:** PDF looks perfect, but extraction returns nothing or incorrect text. +1. Export each page as a 300 DPI PNG using Preview +2. Upload the PNG to TexPixel +3. Get clean LaTeX back in under a second -**Why it happens:** Some PDF generators (certain Word versions, some online LaTeX renderers) rasterize or vectorize math into paths — the formulas are essentially drawings, not characters. There's no character stream to extract. +Not the direct-PDF workflow I was hoping for, but reliable. The image-based pipeline doesn't care whether the original was scanned or born-digital — it just sees pixels and reads the math. -**Fix:** Export the page as a high-resolution PNG (300 DPI), then upload as an image. TexPixel's visual recognition pipeline handles vector-rendered formulas well. +## The Bigger Lesson -## Problem 5: Multi-Column Layout +PDF is a presentation format, not a data format. It's optimized for how things look, not for what they mean. Mathematical notation in particular gets mangled in transit — rendered, rasterized, path-converted — in ways that destroy the underlying structure. -**Symptoms:** Formulas from two columns are merged or interleaved in the output. +The most reliable signal is always the image. When in doubt, export to PNG and let visual recognition do the work. -**Why it happens:** PDF text streams don't always encode reading order correctly, especially in two-column academic papers. +--- -**Fix:** Crop to a single column before uploading. Use any image editor to crop the page into left and right halves, then upload each separately. - -## Problem 6: Handwritten Annotations - -**Symptoms:** Handwritten notes over a printed formula confuse the output. - -**Why it happens:** TexPixel sees both the printed formula and the handwritten annotations together. 
It may try to recognize the annotations as part of the formula. - -**Fix:** Crop tightly to just the printed formula, excluding any handwriting around it. - -## Quick Diagnostic Checklist - -Before uploading a problematic PDF: - -- [ ] Is it a scan or a born-digital PDF? -- [ ] If a scan, what DPI was it scanned at? -- [ ] Is it password-protected? -- [ ] Does it have a two-column layout? -- [ ] Are there handwritten annotations? - -Working through this list resolves the issue 90% of the time. - -[Upload your PDF →](/app) +For a systematic reference on PDF types, file limits, and what TexPixel can handle, see the [PDF Extraction documentation →](/docs/pdf-extraction) diff --git a/content/blog/en/2026-03-01-copy-math-to-word.md b/content/blog/en/2026-03-01-copy-math-to-word.md deleted file mode 100644 index 71fb121..0000000 --- a/content/blog/en/2026-03-01-copy-math-to-word.md +++ /dev/null @@ -1,74 +0,0 @@ ---- -title: "Copy Math to Word Without Losing Formatting — The Right Way" -description: Three methods for getting recognized formulas into Microsoft Word, ranked by quality and effort -slug: copy-math-to-word -date: 2026-03-01 -tags: [tutorial, Word, export] ---- - -# Copy Math to Word Without Losing Formatting — The Right Way - -Most people's first instinct when they need a formula in a Word document is to take a screenshot. It works — until you need to resize the document, change the font, or edit the formula. Screenshots break. Native equations don't. - -Here are three ways to get TexPixel's output into Word, from best to worst. - -## Method 1: DOCX Export (Best) - -The cleanest option. TexPixel converts your recognized formula into a native Word equation (OMML format) and packages it in a `.docx` file. - -**How:** -1. Upload your formula image to TexPixel. -2. Click **Export** → select **DOCX**. -3. Open the downloaded file in Word. -4. Select the equation, copy, paste into your target document. 
- -**Why it's best:** The formula is fully editable in Word's built-in equation editor. Double-click it to open the editor, change any symbol, resize it — it behaves exactly like an equation you typed yourself. It also scales correctly when you change font sizes. - -**Limitation:** Each upload produces one `.docx` file. If you have many formulas to insert, you'll need to repeat the process or batch them (see below). - -## Method 2: Paste LaTeX into Word's Equation Editor (Good) - -Word 2019+ and Microsoft 365 support pasting LaTeX directly into equations. - -**How:** -1. Get the LaTeX output from TexPixel (e.g., `x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}`). -2. In Word, insert a new equation: **Insert → Equation** (or press `Alt+=`). -3. Make sure the equation box is in **LaTeX mode** (click the dropdown on the right side of the equation box → select "LaTeX"). -4. Paste the LaTeX string. Press **Enter** or click outside. - -Word converts the LaTeX to a rendered, editable equation. - -**Why it's good:** Fast for single formulas. No file download required. - -**Limitation:** Word's LaTeX parser doesn't support all LaTeX commands. Obscure or complex expressions may not render correctly. Test before relying on it for important documents. - -## Method 3: Image Export (Worst, But Sometimes Necessary) - -Export the formula as a PNG and insert it as an image in Word. - -**When to use:** Only when you need the formula in a document being shared with someone who doesn't have Word's equation editor (e.g., older Word versions, third-party editors). Or when a complex formula doesn't render correctly via Methods 1 or 2. - -**Downsides:** Not editable. Doesn't scale well. Accessibility tools can't read it. - -## Handling Multiple Formulas - -If you have many formulas to insert into a single document: - -1. Upload each formula image and collect the LaTeX strings. -2. Open a new Word document. -3. For each formula, use the **Alt+=** method above to insert them in sequence. -4. 
Once all formulas are inserted, copy and paste the entire equation block into your target document. - -This is faster than one DOCX export per formula. - -## Google Docs - -Google Docs doesn't natively support LaTeX paste. Options: - -- Use the **Auto-LaTeX Equations** Google Docs add-on, which renders LaTeX strings as inline images. -- Export as DOCX and open in Google Docs (equations import as images, not editable). -- Use a tool like `mathpix-markdown-it` to convert to Markdown and render in a Markdown-compatible environment. - -For serious equation-heavy work, Word or Overleaf remain better choices than Google Docs. - -[Export your next formula to Word →](/app) diff --git a/content/blog/en/2026-03-08-researcher-workflow.md b/content/blog/en/2026-03-08-researcher-workflow.md index 21d2a5f..3880ad5 100644 --- a/content/blog/en/2026-03-08-researcher-workflow.md +++ b/content/blog/en/2026-03-08-researcher-workflow.md @@ -79,4 +79,6 @@ The real value of digitization compounds over time. A well-organized LaTeX refer Start with the past year's notebooks. The 7-hour investment pays dividends for years. +**See also:** For PDF file limits, supported types, and export options, see the [PDF Extraction documentation →](/docs/pdf-extraction) + [Start digitizing your notes →](/app) diff --git a/content/blog/en/2026-03-20-handwriting-tips.md b/content/blog/en/2026-03-20-handwriting-tips.md index f2f3f66..bd1e279 100644 --- a/content/blog/en/2026-03-20-handwriting-tips.md +++ b/content/blog/en/2026-03-20-handwriting-tips.md @@ -43,3 +43,5 @@ TexPixel works best when each image contains a single formula or a closely relat --- With these habits, you'll see noticeably better accuracy — often 95%+ even for complex handwritten expressions. 
+ +**See also:** For a systematic breakdown of what affects accuracy (DPI, contrast, formula complexity), see the [OCR Accuracy documentation →](/docs/ocr-accuracy) diff --git a/content/blog/zh/2026-02-01-student-workflow.md b/content/blog/zh/2026-02-01-student-workflow.md index 1a3a669..1d59e71 100644 --- a/content/blog/zh/2026-02-01-student-workflow.md +++ b/content/blog/zh/2026-02-01-student-workflow.md @@ -68,4 +68,6 @@ $$A = U \Sigma V^T$$ 一个学期下来,你会积累几十个识别出的公式。不妨整理一下:将每个公式粘贴到一个参考 `.tex` 文件中,加上简短注释。期末时,你将拥有一份几乎不费力气就建立起来的、可搜索的个人公式表。 +**参考文档:** 关于支持的文件类型、大小限制和复制选项,请查看 [图片转 LaTeX 文档 →](/docs/image-to-latex) + [开始数字化你的笔记 →](/app) diff --git a/content/blog/zh/2026-02-15-pdf-formula-issues.md b/content/blog/zh/2026-02-15-pdf-formula-issues.md index d28267a..868861b 100644 --- a/content/blog/zh/2026-02-15-pdf-formula-issues.md +++ b/content/blog/zh/2026-02-15-pdf-formula-issues.md @@ -1,73 +1,53 @@ --- -title: "PDF 公式识别出错的原因及修复方法" -description: PDF 公式提取产生错误最常见的原因,以及每种情况的具体解决方案 +title: "我试着从教授的 PDF 里提取公式,结果学到了这些" +description: 一次真实的 PDF 公式提取经历——以及为什么大多数问题都归结为三个根本原因 slug: pdf-formula-issues date: 2026-02-15 -tags: [故障排查, PDF, 技巧] +tags: [故障排查, PDF] --- -# PDF 公式识别出错的原因及修复方法 +# 我试着从教授的 PDF 里提取公式,结果学到了这些 -PDF 公式提取本应简单——上传、得到 LaTeX、完成。但有时输出乱码、符号丢失,或者提示没有找到公式。以下是最常见原因的分析及对应的修复方法。 +上学期我在啃一份 200 页的讲义 PDF——那种从印刷胶片扫描而来、作为附件发出来、每页都略微倾斜的类型。我想把关键方程提取到自己的笔记里。接下来发生的事,让我深刻理解了 PDF 究竟是怎么存储(或者说不存储)数学内容的。 -## 问题 1:PDF 是扫描件 +## 第一个意外:不是所有 PDF 都一样 -**症状:** 屏幕上公式显示正确,但提取输出是乱码或空白。 +我天真地以为"有公式的 PDF"就意味着"可以提取的公式"。并非如此。 -**原因:** 扫描 PDF 实际上只是一组图片——没有真正的文字层。你在 PDF 阅读器中看到的文字,要么来自扫描时进行的 OCR(往往质量较差),要么直接来自图像本身。 +学术圈里流传着至少三种根本不同的 PDF,它们的行为完全不同: -**解决方法:** 使用 TexPixel 的图像识别流程。用任意 PDF 查看器将页面导出为 300 DPI 的 PNG(Preview 中选择"文件 → 导出为图像",或 Adobe Acrobat 的"导出 PDF"功能),然后直接上传 PNG。图像识别能正确处理扫描件;直接提取 PDF 文字则不行。 +**数字原生 PDF**(由 LaTeX、Word 或排版软件生成)包含真正的矢量数学内容。从这类 PDF 提取速度快、准确率 95% 以上——公式结构本质上已经在那里了。 -## 问题 2:扫描分辨率过低 +**扫描 PDF** 只是打印页面的照片,被包装进一个容器。没有文字层。提取依赖图像识别,准确率完全取决于扫描质量。教授的讲义就是这种。 -**症状:** 
部分符号识别正确,其他符号被替换为错误字符或直接丢失。 +**混合 PDF** 是扫描后由 OCR 软件添加文字层的 PDF。质量参差不齐——有时很好,有时"文字层"完全是错的。这类 PDF 最难预测。 -**原因:** 低于约 150 DPI 时,`\prime`、`\cdot` 或下标字符等小符号的笔画只有几个像素宽——模糊到无法可靠区分。 +## 大多数失败的三个根本原因 -**解决方法:** 以 300 DPI 重新扫描。大多数平板扫描仪默认 200 DPI;提高到 300 DPI 能显著改善效果,且文件大小增加不大。对于手机扫描,使用专用扫描 App(如 Adobe Scan、Microsoft Lens)——这些 App 会自动锐化并进行透视校正。 +经过大量尝试和失败,我发现提取失败几乎总是归结为以下三种情况之一: -## 问题 3:PDF 有密码保护 +**1. 分辨率。** 扫描时用了 150 DPI 而不是 300 DPI。低分辨率下,小符号——下标、撇号、点——只有几个像素宽。模型无法可靠区分 `\prime` 和一个杂散的污点。提高到 300 DPI 重新扫描,解决了一半以上的问题。 -**症状:** 显示"未找到公式"或上传完全失败。 +**2. 加密。** 部分 PDF 有密码保护或内容限制,阻止任何工具读取内容流。PDF 看起来打开正常,但没有工具能从中提取。移除密码(在 Preview 中选择"文件 → 导出为 PDF",不勾选密码锁)解决了这个问题。 -**原因:** 加密 PDF 需要密码才能访问内容流。TexPixel 无法处理加密文件的内容。 +**3. 公式存储为矢量路径。** 部分 PDF 生成器将方程绘制为图形而非编码为字符。对任何提取工具来说,这些公式是隐形的——只是抽象的几何图形。唯一的办法是将页面渲染为图像,然后对图像进行视觉识别。 -**解决方法:** 上传前移除密码保护。在 Mac 的 Preview 中,用密码打开后,选择"文件 → 导出为 PDF"——导出的文件不含密码。在 Adobe Reader 中,使用"文件 → 打印 → 存储为 PDF"。 +## 最终有效的方法 -## 问题 4:公式存储为矢量路径 +对于教授的扫描讲义,有效的工作流是: -**症状:** PDF 显示完美,但提取结果为空或不正确。 +1. 用 Preview 将每页导出为 300 DPI PNG +2. 将 PNG 上传到 TexPixel +3. 不到一秒得到干净的 LaTeX -**原因:** 某些 PDF 生成器(特定版本的 Word、部分在线 LaTeX 渲染器)会将数学公式光栅化或矢量化为路径——公式本质上是图形,而非字符,没有字符流可以提取。 +不是我期望的直接处理 PDF 的工作流,但很可靠。图像识别流程不在乎原文件是扫描的还是数字原生的——它只看像素,读取数学内容。 -**解决方法:** 将页面导出为高分辨率 PNG(300 DPI),然后作为图像上传。TexPixel 的视觉识别流程能很好地处理矢量渲染的公式。 +## 更大的启示 -## 问题 5:双栏排版 +PDF 是展示格式,不是数据格式。它针对外观进行了优化,而不是含义。数学符号在传输过程中尤其容易被损坏——渲染、光栅化、路径转换——以破坏底层结构的方式。 -**症状:** 两栏的公式在输出中被合并或交叉混排。 +最可靠的信号永远是图像。如果不确定,导出为 PNG,让视觉识别来完成工作。 -**原因:** PDF 文字流并不总是以正确的阅读顺序编码,在双栏学术论文中尤为如此。 +--- -**解决方法:** 上传前裁剪为单栏。用任意图像编辑器将页面裁成左右两半,分别上传。 - -## 问题 6:手写批注 - -**症状:** 印刷公式上的手写笔记干扰输出。 - -**原因:** TexPixel 同时看到了印刷公式和手写批注,可能会尝试将批注识别为公式的一部分。 - -**解决方法:** 紧密裁剪,只保留印刷公式部分,排除周围的手写内容。 - -## 快速排查清单 - -上传有问题的 PDF 之前,先检查: - -- [ ] 是扫描件还是数字原生 PDF? -- [ ] 如果是扫描件,分辨率是多少 DPI? -- [ ] 是否有密码保护? -- [ ] 是否是双栏排版? -- [ ] 是否有手写批注? 
- -逐项排查,能解决 90% 的问题。 - -[上传你的 PDF →](/app) +关于 PDF 类型、文件限制以及 TexPixel 支持范围的系统性参考,请查看 [PDF 公式提取文档 →](/docs/pdf-extraction) diff --git a/content/blog/zh/2026-03-01-copy-math-to-word.md b/content/blog/zh/2026-03-01-copy-math-to-word.md deleted file mode 100644 index 817611a..0000000 --- a/content/blog/zh/2026-03-01-copy-math-to-word.md +++ /dev/null @@ -1,74 +0,0 @@ ---- -title: "把公式粘贴到 Word 而不丢失格式——正确的方法" -description: 三种将识别公式导入 Microsoft Word 的方法,按质量和操作难度排序 -slug: copy-math-to-word -date: 2026-03-01 -tags: [教程, Word, 导出] ---- - -# 把公式粘贴到 Word 而不丢失格式——正确的方法 - -大多数人的第一反应是截图。这能用——直到你需要调整文档大小、更改字体或编辑公式。截图会出问题,原生方程式不会。 - -以下是三种将 TexPixel 输出导入 Word 的方法,从最好到最差排序。 - -## 方法 1:DOCX 导出(最佳) - -最干净的选项。TexPixel 将识别的公式转换为原生 Word 方程式(OMML 格式),并打包到 `.docx` 文件中。 - -**操作步骤:** -1. 上传公式图片到 TexPixel -2. 点击**导出** → 选择 **DOCX** -3. 在 Word 中打开下载的文件 -4. 选中方程式,复制,粘贴到目标文档 - -**为什么最好:** 公式在 Word 内置方程式编辑器中完全可编辑。双击打开编辑器,修改任意符号、调整大小——行为和你自己输入的方程式完全一样。更改字体大小时也能正确缩放。 - -**限制:** 每次上传生成一个 `.docx` 文件。如果有很多公式需要插入,需要重复操作或批量处理。 - -## 方法 2:将 LaTeX 粘贴到 Word 方程式编辑器(较好) - -Word 2019+ 和 Microsoft 365 支持直接在方程式框中粘贴 LaTeX。 - -**操作步骤:** -1. 从 TexPixel 获取 LaTeX 输出(例如:`x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}`) -2. 在 Word 中插入新方程式:**插入 → 公式**(或按 `Alt+=`) -3. 确保方程式框处于 **LaTeX 模式**(点击方程式框右侧下拉菜单 → 选择"LaTeX") -4. 粘贴 LaTeX 字符串,按**回车**或点击外部 - -Word 会将 LaTeX 转换为可渲染、可编辑的方程式。 - -**为什么较好:** 单个公式处理很快,无需下载文件。 - -**限制:** Word 的 LaTeX 解析器不支持所有 LaTeX 命令。复杂或不常见的表达式可能无法正确渲染。用于重要文档前请先测试。 - -## 方法 3:图片导出(最差,但有时必要) - -将公式导出为 PNG,在 Word 中作为图片插入。 - -**何时使用:** 仅在需要与没有 Word 方程式编辑器的用户共享文档时使用(例如旧版 Word、第三方编辑器),或当复杂公式通过方法 1 和 2 无法正确渲染时。 - -**缺点:** 不可编辑,缩放效果差,辅助工具无法读取。 - -## 处理多个公式 - -如果需要在一个文档中插入多个公式: - -1. 上传每张公式图片,收集 LaTeX 字符串 -2. 打开一个新 Word 文档 -3. 对每个公式使用上面的 `Alt+=` 方法依次插入 -4. 
插入所有公式后,将整个方程式块复制粘贴到目标文档 - -这比每个公式单独导出 DOCX 更快。 - -## Google 文档 - -Google 文档不原生支持 LaTeX 粘贴。可选方案: - -- 使用 **Auto-LaTeX Equations** Google 文档插件,将 LaTeX 字符串渲染为行内图片 -- 导出为 DOCX 后在 Google 文档中打开(方程式以图片形式导入,不可编辑) -- 使用 `mathpix-markdown-it` 等工具转换为 Markdown,在支持 Markdown 的环境中渲染 - -对于大量包含公式的工作,Word 或 Overleaf 仍然是比 Google 文档更好的选择。 - -[导出你的下一个公式到 Word →](/app) diff --git a/content/blog/zh/2026-03-08-researcher-workflow.md b/content/blog/zh/2026-03-08-researcher-workflow.md index 193894b..6344013 100644 --- a/content/blog/zh/2026-03-08-researcher-workflow.md +++ b/content/blog/zh/2026-03-08-researcher-workflow.md @@ -79,4 +79,6 @@ TexPixel 改变了这个局面。 从过去一年的笔记本开始。7 小时的投入,将带来多年的回报。 +**参考文档:** 关于 PDF 文件限制、支持类型和导出选项,请查看 [PDF 公式提取文档 →](/docs/pdf-extraction) + [开始数字化你的笔记 →](/app) diff --git a/content/blog/zh/2026-03-20-handwriting-tips.md b/content/blog/zh/2026-03-20-handwriting-tips.md index a067095..0293595 100644 --- a/content/blog/zh/2026-03-20-handwriting-tips.md +++ b/content/blog/zh/2026-03-20-handwriting-tips.md @@ -43,3 +43,5 @@ TexPixel 在每张图片只包含一个公式或一组紧密相关的表达式 --- 养成这些习惯后,你会发现识别准确率明显提升——即使是复杂的手写表达式也能达到 95% 以上。 + +**参考文档:** 关于影响准确率的系统性分析(分辨率、对比度、公式复杂度),请查看 [识别准确率文档 →](/docs/ocr-accuracy) diff --git a/content/docs/en/copy-to-word.md b/content/docs/en/copy-to-word.md index 2dab050..64ebb12 100644 --- a/content/docs/en/copy-to-word.md +++ b/content/docs/en/copy-to-word.md @@ -63,4 +63,6 @@ DOCX export is compatible with: --- +**Further reading:** [LaTeX vs MathML: Which Format Should You Use? →](/blog/latex-vs-mathml) + [Try exporting a formula to Word →](/app) diff --git a/content/docs/en/image-to-latex.md b/content/docs/en/image-to-latex.md index 182c247..9fbee28 100644 --- a/content/docs/en/image-to-latex.md +++ b/content/docs/en/image-to-latex.md @@ -77,4 +77,6 @@ After recognition, you can copy output in multiple formats: --- +**Further reading:** [From Whiteboard to LaTeX in 3 Seconds: A Student's Workflow →](/blog/student-workflow) + Ready to try it? 
[Upload a formula image now →](/app)
diff --git a/content/docs/en/ocr-accuracy.md b/content/docs/en/ocr-accuracy.md
index 19bcdc5..747a72a 100644
--- a/content/docs/en/ocr-accuracy.md
+++ b/content/docs/en/ocr-accuracy.md
@@ -76,4 +76,6 @@ Contact us at: [support@texpixel.com](mailto:support@texpixel.com)
 
 ---
 
+**Further reading:** [5 Tips for Better Handwriting Recognition →](/blog/handwriting-tips)
+
 [Upload a formula and test accuracy →](/app)
diff --git a/content/docs/en/pdf-extraction.md b/content/docs/en/pdf-extraction.md
index bfb02e8..70b886e 100644
--- a/content/docs/en/pdf-extraction.md
+++ b/content/docs/en/pdf-extraction.md
@@ -72,4 +72,6 @@ Large PDFs with many pages can take 30–60 seconds. This is normal. The result
 
 ---
 
+**Further reading:** [I tried to extract formulas from my professor's PDF — real-world troubleshooting →](/blog/pdf-formula-issues)
+
 [Upload a PDF and extract formulas →](/app)
diff --git a/content/docs/zh/copy-to-word.md b/content/docs/zh/copy-to-word.md
index 19d744e..46c01e6 100644
--- a/content/docs/zh/copy-to-word.md
+++ b/content/docs/zh/copy-to-word.md
@@ -63,4 +63,6 @@ DOCX 导出与以下软件兼容:
 
 ---
 
+**延伸阅读:** [LaTeX vs MathML:应该选哪种格式?→](/blog/latex-vs-mathml)
+
 [尝试将公式导出到 Word →](/app)
diff --git a/content/docs/zh/image-to-latex.md b/content/docs/zh/image-to-latex.md
index d64f73a..7432940 100644
--- a/content/docs/zh/image-to-latex.md
+++ b/content/docs/zh/image-to-latex.md
@@ -77,4 +77,6 @@ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
 
 ---
 
+**延伸阅读:** [3 秒从白板到 LaTeX:学生的高效工作流 →](/blog/student-workflow)
+
 准备好了吗?[立即上传公式图片 →](/app)
diff --git a/content/docs/zh/ocr-accuracy.md b/content/docs/zh/ocr-accuracy.md
index f9e1d3e..f9485eb 100644
--- a/content/docs/zh/ocr-accuracy.md
+++ b/content/docs/zh/ocr-accuracy.md
@@ -76,4 +76,6 @@ TexPixel 在数学公式识别方面达到行业领先的准确率——但准
 
 ---
 
+**延伸阅读:** [提高手写公式识别准确率的 5 个技巧 →](/blog/handwriting-tips)
+
 [上传公式测试识别准确率 →](/app)
diff --git a/content/docs/zh/pdf-extraction.md b/content/docs/zh/pdf-extraction.md
index be140c8..a31ff79 100644
--- a/content/docs/zh/pdf-extraction.md
+++ b/content/docs/zh/pdf-extraction.md
@@ -72,4 +72,6 @@ PDF 可能已加密,公式可能以复杂矢量路径存储,或使用了非
 
 ---
 
+**延伸阅读:** [我试着从教授的 PDF 里提取公式——真实排障经历 →](/blog/pdf-formula-issues)
+
 [上传 PDF 提取公式 →](/app)