From 90e16fd868eacde454a24804cc5c3803a9f1dddd Mon Sep 17 00:00:00 2001
From: OleehyO
Date: Fri, 25 Apr 2025 11:59:03 +0000
Subject: [PATCH] [chore] Update

---
 README.md           | 28 +++++++++++++++++-----------
 assets/README_zh.md | 20 +++++++++++++-------
 2 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/README.md b/README.md
index 0cf95f7..3de8a8d 100644
--- a/README.md
+++ b/README.md
@@ -56,37 +56,43 @@ TexTeller was trained with **80M image-formula pairs** (previous dataset can be
 
-## 🔄 Change Log
+## 📮 Change Log
 
-- 📮[2024-06-06] **TexTeller3.0 released!** The training data has been increased to **80M** (**10x more than** TexTeller2.0 and also improved in data diversity). TexTeller3.0's new features:
+- [2024-06-06] **TexTeller3.0 released!** The training data has been increased to **80M** (**10x more than** TexTeller2.0 and also improved in data diversity). TexTeller3.0's new features:
 
   - Support scanned image, handwritten formulas, English(Chinese) mixed formulas.
 
   - OCR abilities in both Chinese and English for printed images.
 
-- 📮[2024-05-02] Support **paragraph recognition**.
+- [2024-05-02] Support **paragraph recognition**.
 
-- 📮[2024-04-12] **Formula detection model** released!
+- [2024-04-12] **Formula detection model** released!
 
-- 📮[2024-03-25] TexTeller2.0 released! The training data for TexTeller2.0 has been increased to 7.5M (15x more than TexTeller1.0 and also improved in data quality). The trained TexTeller2.0 demonstrated **superior performance** in the test set, especially in recognizing rare symbols, complex multi-line formulas, and matrices.
+- [2024-03-25] TexTeller2.0 released! The training data for TexTeller2.0 has been increased to 7.5M (15x more than TexTeller1.0 and also improved in data quality). The trained TexTeller2.0 demonstrated **superior performance** in the test set, especially in recognizing rare symbols, complex multi-line formulas, and matrices.
 
   > [Here](./assets/test.pdf) are more test images and a horizontal comparison of various recognition models.
 
 ## 🚀 Getting Started
 
-1. Install the project's dependencies:
+1. Install uv:
 
    ```bash
-   pip install texteller
+   pip install uv
    ```
 
-2. If your are using CUDA backend, you may need to install `onnxruntime-gpu`:
+2. Install the project's dependencies:
 
    ```bash
-   pip install texteller[onnxruntime-gpu]
+   uv pip install texteller
    ```
 
-3. Run the following command to start inference:
+3. If your are using CUDA backend, you may need to install `onnxruntime-gpu`:
+
+   ```bash
+   uv pip install texteller[onnxruntime-gpu]
+   ```
+
+4. Run the following command to start inference:
 
    ```bash
    texteller inference "/path/to/image.{jpg,png}"
@@ -164,7 +170,7 @@ Please setup your environment before training:
 1. Install the dependencies for training:
 
    ```bash
-   pip install texteller[train]
+   uv pip install texteller[train]
    ```
 
 2. Clone the repository:
diff --git a/assets/README_zh.md b/assets/README_zh.md
index 0a545ba..69eb662 100644
--- a/assets/README_zh.md
+++ b/assets/README_zh.md
@@ -74,19 +74,25 @@ TexTeller 使用 **8千万图像-公式对** 进行训练(前代数据集可
 
 ## 🚀 快速开始
 
-1. 安装项目依赖:
+1. 安装uv:
 
    ```bash
-   pip install texteller
+   pip install uv
    ```
 
-2. 若使用 CUDA 后端,可能需要安装 `onnxruntime-gpu`:
+2. 安装项目依赖:
 
    ```bash
-   pip install texteller[onnxruntime-gpu]
+   uv pip install texteller
    ```
 
-3. 运行以下命令开始推理:
+3. 若使用 CUDA 后端,可能需要安装 `onnxruntime-gpu`:
+
+   ```bash
+   uv pip install texteller[onnxruntime-gpu]
+   ```
+
+4. 运行以下命令开始推理:
 
    ```bash
    texteller inference "/path/to/image.{jpg,png}"
@@ -96,7 +102,7 @@ TexTeller 使用 **8千万图像-公式对** 进行训练(前代数据集可
 
 ## 🌐 网页演示
 
-运行命令:
+命令行运行:
 
 ```bash
 texteller web
@@ -164,7 +170,7 @@ TexTeller的公式检测模型在3415张中文资料图像和8272张[IBEM数据
 1. 安装训练依赖:
 
    ```bash
-   pip install texteller[train]
+   uv pip install texteller[train]
    ```
 
 2. 克隆仓库: