TexTeller v2

三洋三洋
2024-03-25 06:54:22 +00:00
parent 74341c7e8a
commit ef218d67f6
7 changed files with 28 additions and 22 deletions

@@ -7,9 +7,12 @@
<p align="center">
English | <a href="./assets/README_zh.md">中文</a>
</p>
<p align="center">
<!-- <p align="center">
<img src="./assets/web_demo.gif" alt="TexTeller_demo" width=800>
</p>
</p> -->
<video width="800" controls>
<source src="./assets/test.mp4" type="video/mp4">
</video>
</div>
TexTeller is an end-to-end formula recognition model based on ViT, capable of converting images into corresponding LaTeX formulas.
@@ -21,6 +24,8 @@ TexTeller was trained with ~~550K~~7.5M image-formula pairs (dataset available [
## 🔄 Change Log
* 📮[2024-03-24] TexTeller 2.0 released! The training data for TexTeller 2.0 has grown to 7.5M image-formula pairs (about **15 times more** than TexTeller 1.0, with improved data quality). The trained TexTeller 2.0 shows **superior performance** on the test set, especially in recognizing rare symbols, complex multi-line formulas, and matrices.
> [!INFO]
> More test images, along with a side-by-side comparison of recognition models from different companies, are available [here](./assets/test.pdf).
## 🔑 Prerequisites
@@ -138,8 +143,6 @@ In `TexTeller/src/globals.py` and `TexTeller/src/models/ocr_model/train/train_ar
## 🚧 Limitations
* Some complex multi-line scenarios are not handled well (e.g., long formulas mixed with matrices)
* Does not support scanned images and PDF document recognition
* Does not support handwritten formulas