---
title: "Digitizing a Decade of Research Notes with TexPixel"
description: How researchers use TexPixel to convert years of handwritten math into searchable, editable LaTeX documents
slug: researcher-workflow
date: 2026-03-08
tags: [workflow, research, tutorial]
---
# Digitizing a Decade of Research Notes with TexPixel
Researchers accumulate notebooks. Derivations sketched out at conferences, margin notes on printed papers, whiteboard captures from group meetings, half-finished proofs from 3 AM. For most of history, this material was effectively unsearchable — trapped in physical form, accessible only by paging through stacks of notebooks.
TexPixel changes the equation (so to speak).
## The Scope of the Problem
A typical active researcher might accumulate 5–10 filled notebooks per year, each containing hundreds of equations. Retyping every formula in LaTeX by hand is rarely practical. At 3 minutes per formula, a year of notes at the upper end (say 10 notebooks of 800 formulas each, about 8,000 formulas) would take roughly 400 hours to transcribe manually.
With TexPixel, each formula takes under 5 seconds from photo to LaTeX. Even at the full 5 seconds, the same 8,000 formulas take about 11 hours; at around 3 seconds each, under 7 hours.
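The comparison comes down to per-formula arithmetic, so it is easy to rerun with your own numbers. Below is a minimal sketch; the function name `transcription_hours` and the 1,000-formula workload are assumptions chosen for illustration, not figures from TexPixel.

```python
def transcription_hours(formula_count: int, seconds_per_formula: float) -> float:
    """Total hours needed to process formula_count formulas at a fixed per-formula cost."""
    return formula_count * seconds_per_formula / 3600


# Hypothetical workload: 1,000 formulas (an assumed figure, adjust to your notebooks).
formulas = 1_000
manual = transcription_hours(formulas, 3 * 60)  # 3 minutes per formula, typed by hand
assisted = transcription_hours(formulas, 5)     # 5 seconds per formula, the stated upper bound

print(f"manual:   {manual:.1f} h")
print(f"assisted: {assisted:.1f} h")
print(f"ratio:    {manual / assisted:.0f}x")
```

The ratio depends only on the two per-formula times (180 seconds versus 5 seconds), so it stays the same however many formulas you count.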
## A Practical Digitization Workflow
### Step 1: Photograph the Notebooks
Use a phone with a good camera and a document scanner app (Adobe Scan, Microsoft Lens, or Apple's built-in document scanner). These apps:
- Automatically detect page edges
- Correct perspective distortion
- Apply contrast enhancement for faded ink or pencil
- Export to PDF
Scan a full notebook in 15–20 minutes.
### Step 2: Identify Formula-Dense Pages
Not every page needs digitizing. Quickly flip through and flag pages with equations you'll actually need. A single key derivation or set of equations is often worth digitizing even if the surrounding text isn't.
### Step 3: Batch Process with TexPixel
For each flagged page:
1. Export the page or crop area as a PNG
2. Upload to TexPixel
3. Copy the LaTeX output into your notes
For formula-dense pages, consider cropping individual formulas rather than uploading the full page — this gives more accurate results and cleaner output.
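When you crop many individual formulas, it helps to know which crops still need uploading. The sketch below is one possible way to track that with the standard library only. The file naming scheme (`nb<notebook>_p<page>_eq<index>.png`) and the helper names are hypothetical, and the upload itself is still done by hand in TexPixel.

```python
import re
from pathlib import Path

# Hypothetical naming scheme for crops: nb<notebook>_p<page>_eq<index>.png
CROP_NAME = re.compile(r"nb(\d+)_p(\d+)_eq(\d+)\.png$")


def pending_crops(crop_names, done_names):
    """Return crop names not yet processed, ordered by notebook, page, and equation index."""
    done = set(done_names)
    keyed = []
    for name in crop_names:
        match = CROP_NAME.match(name)
        # Skip files that do not follow the naming scheme or are already done.
        if match and name not in done:
            keyed.append((tuple(int(group) for group in match.groups()), name))
    return [name for _, name in sorted(keyed)]


def crops_in(folder):
    """List the PNG file names directly inside a folder (non-recursive)."""
    return [path.name for path in Path(folder).glob("*.png")]
```

A typical call would be `pending_crops(crops_in("crops"), done_names)`, where `done_names` is whatever list of finished files you keep, for example the lines of a plain text file.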
### Step 4: Organize into a Reference Document
Create a `.tex` document (or Overleaf project) structured by topic. Paste each extracted formula with a brief comment about its context:
```latex
% Variational lower bound, from 2022 NeurIPS derivation (\mathbb needs amssymb)
\begin{equation}
  \mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - D_{\mathrm{KL}}(q_\phi(z|x) \| p(z))
\end{equation}
```
After a few sessions, you'll have a searchable, compilable reference document that took a fraction of the time of manual transcription.
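If you would rather keep the extracted formulas in a simple list and generate the `.tex` file from it, something like the following could work. It is a sketch under stated assumptions: the `(topic, note, formula)` tuple layout and the function name `build_reference` are hypothetical, and topics and notes are inserted as-is, so they should not contain special LaTeX characters.

```python
def build_reference(entries):
    """Render (topic, note, formula) tuples into a LaTeX document string, grouped by topic."""
    lines = [
        r"\documentclass{article}",
        r"\usepackage{amsmath, amssymb}",  # amssymb provides \mathbb
        r"\begin{document}",
    ]
    current_topic = None
    # sorted() is stable, so formulas keep their input order within each topic.
    for topic, note, formula in sorted(entries, key=lambda entry: entry[0]):
        if topic != current_topic:
            lines.append(rf"\section{{{topic}}}")
            current_topic = topic
        lines.append(f"% {note}")
        lines.append(r"\begin{equation}")
        lines.append(formula)
        lines.append(r"\end{equation}")
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


entries = [
    (
        "Variational inference",
        "Variational lower bound, from 2022 NeurIPS derivation",
        r"\mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right]"
        r" - D_{KL}(q_\phi(z|x) \| p(z))",
    ),
]
document = build_reference(entries)
```

Writing the result to disk is one more line, for example `pathlib.Path("reference.tex").write_text(document)`, after which the file can be compiled or uploaded to Overleaf.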
## Working with Whiteboards
Conference room whiteboards are particularly valuable targets. A single group meeting might produce 20–30 key equations that would otherwise be lost when someone erases the board.
**Best practice:** Photograph the whiteboard before it's erased (obvious) but also photograph intermediate steps — derivations that get overwritten as the discussion progresses. The intermediate steps are often where the insight lives.
For whiteboards:
- Photograph straight-on, not at an angle
- Use even lighting — a photo taken with the lights on and no flash usually works better than using flash, which creates glare on glossy boards
- Crop each distinct equation before uploading
## Working with Printed Papers
For annotated printed papers, TexPixel can extract both the printed formulas and (with somewhat lower accuracy) handwritten margin notes. Crop tightly to the region you need, and upload each formula separately from its annotations.
## Building a Long-Term Knowledge Base
The real value of digitization compounds over time. A well-organized LaTeX reference document from 5 years of notes is something you can:
- Search with `grep` or your editor's search
- Cross-reference with a citation manager
- Share with collaborators
- Build on directly when writing new papers
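Searching works because both the context comments and the formulas are plain text. As a rough illustration, here is a small case-insensitive search over a `.tex` source, similar in spirit to `grep -n -i -F`. The function name `search_reference` and the sample text are assumptions made for this example.

```python
def search_reference(tex_source, term):
    """Return (line_number, line) pairs whose text contains term, ignoring case."""
    needle = term.lower()
    return [
        (number, line)
        for number, line in enumerate(tex_source.splitlines(), start=1)
        if needle in line.lower()
    ]


# Sample reference text, using the comment-plus-formula layout from Step 4.
sample = "\n".join([
    "% Variational lower bound, from 2022 NeurIPS derivation",
    r"\begin{equation}",
    r"\mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi(z|x)}\left[\log p_\theta(x|z)\right]",
    r"\end{equation}",
])

for number, line in search_reference(sample, "variational"):
    print(f"{number}: {line}")
```

Searching for a LaTeX command such as `\mathbb` works the same way, which is handy for finding every formula that uses a given operator or symbol.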
Start with the past year's notebooks. The 7-hour investment pays dividends for years.
[Start digitizing your notes →](/app)