[chore] Update

This commit is contained in:
OleehyO
2025-04-25 11:59:03 +00:00
parent cab9d664f2
commit 90e16fd868
2 changed files with 30 additions and 18 deletions

View File

@@ -56,37 +56,43 @@ TexTeller was trained with **80M image-formula pairs** (previous dataset can be
</tr>
</table>
## 🔄 Change Log
## 📮 Change Log
- 📮[2024-06-06] **TexTeller3.0 released!** The training data has been increased to **80M** (**10x more than** TexTeller2.0 and also improved in data diversity). TexTeller3.0's new features:
- [2024-06-06] **TexTeller3.0 released!** The training data has been increased to **80M** (**10x more than** TexTeller2.0 and also improved in data diversity). TexTeller3.0's new features:
- Support scanned image, handwritten formulas, English(Chinese) mixed formulas.
- OCR abilities in both Chinese and English for printed images.
- 📮[2024-05-02] Support **paragraph recognition**.
- [2024-05-02] Support **paragraph recognition**.
- 📮[2024-04-12] **Formula detection model** released!
- [2024-04-12] **Formula detection model** released!
- 📮[2024-03-25] TexTeller2.0 released! The training data for TexTeller2.0 has been increased to 7.5M (15x more than TexTeller1.0 and also improved in data quality). The trained TexTeller2.0 demonstrated **superior performance** in the test set, especially in recognizing rare symbols, complex multi-line formulas, and matrices.
- [2024-03-25] TexTeller2.0 released! The training data for TexTeller2.0 has been increased to 7.5M (15x more than TexTeller1.0 and also improved in data quality). The trained TexTeller2.0 demonstrated **superior performance** in the test set, especially in recognizing rare symbols, complex multi-line formulas, and matrices.
> [Here](./assets/test.pdf) are more test images and a horizontal comparison of various recognition models.
## 🚀 Getting Started
1. Install the project's dependencies:
1. Install uv:
```bash
pip install texteller
pip install uv
```
2. If your are using CUDA backend, you may need to install `onnxruntime-gpu`:
2. Install the project's dependencies:
```bash
pip install texteller[onnxruntime-gpu]
uv pip install texteller
```
3. Run the following command to start inference:
3. If your are using CUDA backend, you may need to install `onnxruntime-gpu`:
```bash
uv pip install texteller[onnxruntime-gpu]
```
4. Run the following command to start inference:
```bash
texteller inference "/path/to/image.{jpg,png}"
@@ -164,7 +170,7 @@ Please setup your environment before training:
1. Install the dependencies for training:
```bash
pip install texteller[train]
uv pip install texteller[train]
```
2. Clone the repository:

View File

@@ -74,19 +74,25 @@ TexTeller 使用 **8千万图像-公式对** 进行训练(前代数据集可
## 🚀 快速开始
1. 安装项目依赖
1. 安装uv
```bash
pip install texteller
pip install uv
```
2. 若使用 CUDA 后端,可能需要安装 `onnxruntime-gpu`
2. 安装项目依赖
```bash
pip install texteller[onnxruntime-gpu]
uv pip install texteller
```
3. 运行以下命令开始推理
3. 若使用 CUDA 后端,可能需要安装 `onnxruntime-gpu`
```bash
uv pip install texteller[onnxruntime-gpu]
```
4. 运行以下命令开始推理:
```bash
texteller inference "/path/to/image.{jpg,png}"
@@ -96,7 +102,7 @@ TexTeller 使用 **8千万图像-公式对** 进行训练(前代数据集可
## 🌐 网页演示
运行命令:
命令行运行
```bash
texteller web
@@ -164,7 +170,7 @@ TexTeller的公式检测模型在3415张中文资料图像和8272张[IBEM数据
1. 安装训练依赖:
```bash
pip install texteller[train]
uv pip install texteller[train]
```
2. 克隆仓库: