This commit is contained in:
OleehyO
2025-08-22 21:43:30 +08:00
parent 154c8fcab5
commit 9b88cec77b
2 changed files with 2 additions and 4 deletions

View File

@@ -8,7 +8,6 @@
</h1> </h1>
[![](https://img.shields.io/badge/API-Docs-orange.svg?logo=read-the-docs)](https://oleehyo.github.io/TexTeller/) [![](https://img.shields.io/badge/API-Docs-orange.svg?logo=read-the-docs)](https://oleehyo.github.io/TexTeller/)
[![arXiv](https://img.shields.io/badge/arXiv-2508.09200-b31b1b.svg?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2508.09220)
[![](https://img.shields.io/badge/Data-Texteller3.0-brightgreen.svg?logo=huggingface)](https://huggingface.co/datasets/OleehyO/latex-formulas-80M) [![](https://img.shields.io/badge/Data-Texteller3.0-brightgreen.svg?logo=huggingface)](https://huggingface.co/datasets/OleehyO/latex-formulas-80M)
[![](https://img.shields.io/badge/Weights-Texteller3.0-yellow.svg?logo=huggingface)](https://huggingface.co/OleehyO/TexTeller) [![](https://img.shields.io/badge/Weights-Texteller3.0-yellow.svg?logo=huggingface)](https://huggingface.co/OleehyO/TexTeller)
[![](https://img.shields.io/badge/docker-pull-green.svg?logo=docker)](https://hub.docker.com/r/oleehyo/texteller) [![](https://img.shields.io/badge/docker-pull-green.svg?logo=docker)](https://hub.docker.com/r/oleehyo/texteller)
@@ -61,7 +60,7 @@ TexTeller was trained with **80M image-formula pairs** (previous dataset can be
## 📮 Change Log ## 📮 Change Log
- [2025-08-15] We have published the [technical report](https://arxiv.org/abs/2508.09220) of TexTeller. The model evaluated on the Benchmark (which was trained from scratch and had its handwritten subset filtered based on the test set) is available at https://huggingface.co/OleehyO/TexTeller_en. **Please do not directly use the open-source version of TexTeller3.0 to reproduce the experimental results of handwritten formulas**, as this model includes the test sets of these benchmarks. <!-- - [2025-08-15] We have published the [technical report](https://arxiv.org/abs/2508.09220) of TexTeller. The model evaluated on the Benchmark (which was trained from scratch and had its handwritten subset filtered based on the test set) is available at https://huggingface.co/OleehyO/TexTeller_en. **Please do not directly use the open-source version of TexTeller3.0 to reproduce the experimental results of handwritten formulas**, as this model includes the test sets of these benchmarks. -->
- [2025-08-15] We have open-sourced the [training dataset](https://huggingface.co/datasets/OleehyO/latex-formulas-80M) of TexTeller 3.0. Please note that the handwritten* subset of this dataset is collected from existing open-source handwritten datasets (including both training and test sets). If you need to use the handwritten* subset for your experimental ablation, please filter the test labels first. - [2025-08-15] We have open-sourced the [training dataset](https://huggingface.co/datasets/OleehyO/latex-formulas-80M) of TexTeller 3.0. Please note that the handwritten* subset of this dataset is collected from existing open-source handwritten datasets (including both training and test sets). If you need to use the handwritten* subset for your experimental ablation, please filter the test labels first.

View File

@@ -8,7 +8,6 @@
</h1> </h1>
[![](https://img.shields.io/badge/API-文档-orange.svg?logo=read-the-docs)](https://oleehyo.github.io/TexTeller/) [![](https://img.shields.io/badge/API-文档-orange.svg?logo=read-the-docs)](https://oleehyo.github.io/TexTeller/)
[![arXiv](https://img.shields.io/badge/arXiv-2508.09200-b31b1b.svg?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2508.09220)
[![](https://img.shields.io/badge/数据-TexTeller3.0-brightgreen.svg?logo=huggingface)](https://huggingface.co/datasets/OleehyO/latex-formulas-80M) [![](https://img.shields.io/badge/数据-TexTeller3.0-brightgreen.svg?logo=huggingface)](https://huggingface.co/datasets/OleehyO/latex-formulas-80M)
[![](https://img.shields.io/badge/权重-TexTeller3.0-yellow.svg?logo=huggingface)](https://huggingface.co/OleehyO/TexTeller) [![](https://img.shields.io/badge/权重-TexTeller3.0-yellow.svg?logo=huggingface)](https://huggingface.co/OleehyO/TexTeller)
[![](https://img.shields.io/badge/docker-镜像-green.svg?logo=docker)](https://hub.docker.com/r/oleehyo/texteller) [![](https://img.shields.io/badge/docker-镜像-green.svg?logo=docker)](https://hub.docker.com/r/oleehyo/texteller)
@@ -59,7 +58,7 @@ TexTeller 使用 **8千万图像-公式对** 进行训练(前代数据集可
## 📮 更新日志 ## 📮 更新日志
- [2025-08-15] 我们发布了 TexTeller 的[技术报告](https://arxiv.org/abs/2508.09220)。在基准集上评测的模型(从零训练,且对手写子集按测试集进行了过滤)可在 https://huggingface.co/OleehyO/TexTeller_en 获取。**请不要直接使用开源的 TexTeller3.0 版本来复现实验中的手写公式结果**,因为该模型的训练包含了这些基准的测试集。 <!-- - [2025-08-15] 我们发布了 TexTeller 的[技术报告](https://arxiv.org/abs/2508.09220)。在基准集上评测的模型(从零训练,且对手写子集按测试集进行了过滤)可在 https://huggingface.co/OleehyO/TexTeller_en 获取。**请不要直接使用开源的 TexTeller3.0 版本来复现实验中的手写公式结果**,因为该模型的训练包含了这些基准的测试集。 -->
- [2025-08-15] 我们开源了 TexTeller 3.0 的[训练数据集](https://huggingface.co/datasets/OleehyO/latex-formulas-80M)。其中handwritten* 子集来自现有的开源手写数据集(**包含训练集和测试集**),请不要将该子集用于实验消融。 - [2025-08-15] 我们开源了 TexTeller 3.0 的[训练数据集](https://huggingface.co/datasets/OleehyO/latex-formulas-80M)。其中handwritten* 子集来自现有的开源手写数据集(**包含训练集和测试集**),请不要将该子集用于实验消融。