147 Commits
v1.0.2 ... dev

Author SHA1 Message Date
OleehyO
30f7e93c49 📝 [docs] Update README badges and branding consistency
Add arXiv paper badge, fix TexTeller3.0 capitalization, and update documentation links for improved consistency.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-14 22:41:35 +08:00
OleehyO
4f88499de5 🔧 [chore] Replace pre-commit with ruff for linting workflow
- Update CI workflow to use ruff instead of pre-commit
- Remove E999 from ruff ignore rules in pyproject.toml

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-14 22:34:42 +08:00
OleehyO
bfe070f976 📦️ [chore] Update project for TexTeller 3.0 release
- Update dataset references from TexTeller 1.0 to 3.0 in README files
- Add paper.pdf to assets directory
- Configure pre-commit to exclude assets/ from large file checks

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-13 22:01:17 +08:00
OleehyO
af56271e1c 🧑 [chore] Add Claude Code configuration for Git workflow automation
Add Claude agents and commands to enhance developer experience:
- commit-crafter agent for standardized conventional commits
- staged-code-reviewer agent for automated code review
- Commands for code review, GitHub issue fixing, and commit creation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-13 21:59:12 +08:00
三洋三洋
30f88d55ac Upload compare 2025-04-23 22:21:40 +08:00
OleehyO
3d430735a4 [docs] Using uv to install deps 2025-04-23 10:40:12 +00:00
OleehyO
184c890437 [chore] Correct file url 2025-04-23 10:39:24 +00:00
OleehyO
c758dc277b [deps] Pin transformers to 4.47 2025-04-21 12:24:03 +00:00
OleehyO
0ab938aad4 [chore] Setup deps for doc build 2025-04-21 12:24:00 +00:00
OleehyO
90e16fd868 [chore] Update 2025-04-21 12:24:00 +00:00
OleehyO
cab9d664f2 [CD] Add documentation auto-deployment 2025-04-21 12:23:56 +00:00
OleehyO
3f930fdaaf [deps] Add sphinx extension deps 2025-04-21 08:38:06 +00:00
OleehyO
324ab8a03f [docs] Fix typo 2025-04-21 08:21:16 +00:00
OleehyO
a62600f384 [chore] Change logo font 2025-04-21 08:20:16 +00:00
OleehyO
511f69555c 🔧 Fix all ruff typo errors & test CI/CD workflow (#109)
* [chore] Fix ruff typo

* [robot] Fix welcome robot
2025-04-21 13:52:16 +08:00
OleehyO
ae776aa9c7 [CI] Fix deps installation 2025-04-21 05:17:12 +00:00
OleehyO
d46be980ee [CD] Change trigger condition 2025-04-21 05:12:38 +00:00
OleehyO
1201c67237 [chore] Update README_zh.md 2025-04-21 05:11:47 +00:00
OleehyO
d5938c6a2a [deps] Grouped deps & setup vcs 2025-04-21 04:48:29 +00:00
OleehyO
9a388cdfc5 [chore] Update README.md 2025-04-21 04:47:53 +00:00
OleehyO
59bc9bdd41 [CI/CD] Setup complete workflow 2025-04-21 03:00:06 +00:00
OleehyO
7490fa9c5a [chore] Setup vcs and deps 2025-04-21 02:41:46 +00:00
OleehyO
5cf9960a7c [chore] Ignore images 2025-04-21 02:41:06 +00:00
OleehyO
9006edb949 [docs] Set up documentation structure with API reference 2025-04-21 02:38:36 +00:00
OleehyO
d6c659d576 Upload logo.svg 2025-04-21 02:37:28 +00:00
OleehyO
ff02336007 [feat] Support dynamic package vcs 2025-04-21 02:36:13 +00:00
OleehyO
789006894c [docs] Add comprehensive function documentation 2025-04-21 02:34:56 +00:00
OleehyO
2c9ce6b6c1 Add globals test 2025-04-21 02:32:05 +00:00
OleehyO
57b757c0f0 [test] Init 2025-04-19 16:36:48 +00:00
OleehyO
a7a296025a [feat] Add texteller training script 2025-04-19 16:36:43 +00:00
OleehyO
991d6bc00d [CI] Update ruff hook 2025-04-19 14:32:32 +00:00
OleehyO
06edd104e2 [refactor] Init 2025-04-19 14:32:28 +00:00
OleehyO
0e32f3f3bf [chore] Cleanup 2025-04-17 07:08:47 +00:00
OleehyO
6bd68ad3b7 [feat] Support n-gram stop criteria 2025-04-02 03:23:27 +00:00
OleehyO
aae7af445f [deps] Change onnx-gpu to manually install 2025-04-02 02:48:23 +00:00
三洋三洋
38e7c6293f [feat][formatter] Integrate LaTeX formatter for improved formula readability
- Add latex_formatter.py based on tex-fmt (https://github.com/WGUNDERWOOD/tex-fmt)
- Update to_katex.py to use the new formatter
- Enhance LaTeX formula output with better formatting and readability

This integration helps make generated LaTeX formulas more readable and
maintainable by applying consistent formatting rules.
2025-03-01 00:55:41 +08:00
三洋三洋
192e8d6352 [chore] Ignore ruff lint E741 2025-03-01 00:54:57 +08:00
三洋三洋
110cb29d6c [fix] Add project prefix 2025-02-28 23:38:12 +08:00
三洋三洋
abd6057378 [feat] Remove bold style 2025-02-28 23:38:12 +08:00
三洋三洋
e214b508d2 [deps] Add ray serve & python-multipart 2025-02-28 23:37:53 +08:00
三洋三洋
de9deacaf2 [chore] Add build system and package location 2025-02-28 23:18:06 +08:00
三洋三洋
cd0f397f20 [chore] Add python related rules 2025-02-28 23:18:03 +08:00
三洋三洋
5668a2e26c [chore] Remove unused files 2025-02-28 20:54:51 +08:00
三洋三洋
3d546f9993 [chore] exclude paddleocr directory from pre-commit hooks 2025-02-28 20:01:54 +08:00
三洋三洋
a8a005ae10 [chore] Setup project infrastructure 2025-02-28 20:01:52 +08:00
三洋三洋
52fce4d39d [deps] pin transformers to 4.45.2 and sentence-transformers to 3.1.1 2025-02-01 13:00:44 +08:00
OleehyO
b8100517c6 Merge pull request #78 from OleehyO/pre_release
Change to better import dependency
2024-08-07 12:43:15 +08:00
三洋三洋
06701415cc Change to better import dependency 2024-08-07 01:19:26 +08:00
OleehyO
c6eb1b6ea2 Merge pull request #67 from OleehyO/pre_release
Change setting name
2024-07-11 20:34:50 +08:00
三洋三洋
1b685054c9 Change setting name 2024-07-11 20:33:51 +08:00
OleehyO
c835cedcf5 Merge pull request #60 from OleehyO/pre_release
Pre release
2024-06-23 22:16:09 +08:00
三洋三洋
9f3a46e8a9 Update README 2024-06-23 22:14:05 +08:00
三洋三洋
569c72ffe3 Remove onnxruntime-gpu 2024-06-23 22:13:51 +08:00
OleehyO
b4f70a09e0 Merge pull request #59 from OleehyO/pre_release
Pre release
2024-06-22 23:56:45 +08:00
三洋三洋
36a2680d28 Update model config 2024-06-22 22:08:08 +08:00
三洋三洋
c5e859517a Update README 2024-06-22 22:00:14 +08:00
三洋三洋
9638c0030d Support onnx runtime 2024-06-22 22:00:05 +08:00
三洋三洋
8da3fd7418 Add optimum 2024-06-22 21:49:47 +08:00
OleehyO
fb6784b535 Merge pull request #58 from OleehyO/pre_release
Add formula detection service
2024-06-17 21:26:35 +08:00
三洋三洋
76eeb18b83 Add formula detection service 2024-06-17 21:23:55 +08:00
OleehyO
e2d0e91a77 Merge pull request #56 from OleehyO/pre_release
Add docker link
2024-06-11 13:22:17 +08:00
三洋三洋
0d5cd9a75d Add docker link 2024-06-11 13:20:32 +08:00
三洋三洋
624f9531b4 Update server.py
1. Change the default host address to 0.0.0.0.
2. Convert the output to KaTeX.
2024-06-07 12:26:24 +00:00
三洋三洋
aa14674097 Update README 2024-06-07 06:54:23 +00:00
三洋三洋
a7044e0369 Add Apache2.0 license 2024-06-06 13:06:16 +00:00
三洋三洋
837cb6021f Add cover.png 2024-06-06 13:06:16 +00:00
三洋三洋
354833aac8 Modify the names of options in the web.py
Formula only       -> Formula recognition
Text formula mixed -> Paragraph recognition

Improved display during mixed inference
2024-06-06 13:06:16 +00:00
三洋三洋
760bd78c10 Refine mix_inference
1. Add the formula number back to the isolated formula and merge multiple tag.
2. remove bold effect from inline formulas
3. change split environment into aligned
2024-06-06 13:06:11 +00:00
三洋三洋
c0e730f697 Bugfix: to_katex.py
1. Added `change_all` function to fix a bug where some LaTeX formulas with the same wrapper were causing issues.
2. Removed some unnecessary formatting commands.

Bugfix: to_katex.py
2024-06-06 08:25:50 +00:00
三洋三洋
7aad0839c4 Update 2024-05-28 09:51:53 +00:00
三洋三洋
5420e92cc4 Added releasing file 2024-05-28 07:50:09 +00:00
三洋三洋
89aa396cbb Change the model configuration to trocr 2024-05-28 07:50:09 +00:00
三洋三洋
9b11689f22 Using paddleocr with onnxruntime
Deleted the code for test time.
2024-05-28 07:50:09 +00:00
三洋三洋
85d558f772 Added mixed recognition
change suryaocr to paddleocr
2024-05-28 07:50:08 +00:00
三洋三洋
2af1e067c1 Added ONNX file for PaddleOCR model 2024-05-28 07:50:08 +00:00
三洋三洋
6b852d561d Update .gitignore 2024-05-28 07:50:08 +00:00
三洋三洋
e193fe3798 Added code for PaddleOCR inference 2024-05-28 07:50:08 +00:00
三洋三洋
714fef4def Eliminated dependency on paddleocr
Change to trocr
2024-05-28 07:50:08 +00:00
三洋三洋
edef073812 update 2024-05-28 07:50:08 +00:00
OleehyO
1b8f6ba0b6 bugfix: ocr_aug.py
Change "lhy_custom" in ink_swap_color to "random"
2024-05-28 07:49:55 +00:00
三洋三洋
a27cf716ee bugfix: missing filter_fn and inference/train transform 2024-05-12 07:49:04 +00:00
三洋三洋
8557e81374 update 2024-05-12 07:47:35 +00:00
三洋三洋
10e22259a2 update 2024-05-10 03:48:31 +00:00
TonyLee1256
9875fedb1b Update requirements.txt 2024-05-09 00:23:32 +08:00
TonyLee1256
83da4262fd Update mix_inference.py
Replace the text OCR model with paddleocr
2024-05-09 00:23:02 +08:00
TonyLee1256
bd2aaa3e00 Update inference.py
Replace the text OCR model with paddleocr
2024-05-09 00:22:01 +08:00
TonyLee1256
fe7e4a7af0 Update inference.py
Add timing functionality
2024-05-09 00:20:32 +08:00
TonyLee1256
48043d11e3 Update infer_det.py
Add support for running ONNX model inference on GPU
2024-05-09 00:19:39 +08:00
三洋三洋
e495640690 bugfix 2024-05-08 14:34:01 +00:00
三洋三洋
84fa43321f Added Language option in mixed mode 2024-05-07 07:44:24 +00:00
三洋三洋
b116dfae55 Update README 2024-05-07 07:30:29 +00:00
三洋三洋
85b22ff9c7 bugfix 2024-05-07 07:11:34 +00:00
三洋三洋
42959cd6a5 Add train_config.yaml 2024-05-07 07:11:05 +00:00
三洋三洋
4c182aecda update .gitignore 2024-05-07 06:54:53 +00:00
TonyLee1256
d2c1e5e10f bugfix inference.py 2024-05-07 13:28:07 +08:00
TonyLee1256
c5dd0dacd8 Update README_zh.md 2024-05-07 13:27:23 +08:00
TonyLee1256
8981df6bc9 Update README.md 2024-05-07 13:26:50 +08:00
TonyLee1256
bb0594815a Update README.md 2024-05-07 13:25:28 +08:00
TonyLee1256
8c85575260 bugfix inference.py 2024-05-07 13:19:43 +08:00
三洋三洋
7c5a547b1f update 2024-05-02 09:10:21 +00:00
三洋三洋
c6e6622aaf Merge remote-tracking branch 'origin/pre_release' into pre_release 2024-04-21 16:13:49 +00:00
三洋三洋
8fa462b434 update README.md 2024-04-21 16:13:45 +00:00
TonyLee1256
1a7939190f Update rec_infer_from_crop_imgs.py 2024-04-22 00:08:36 +08:00
TonyLee1256
0bb11bebfc Update infer_det.py 2024-04-22 00:07:41 +08:00
TonyLee1256
be19ed8d63 Update README.md 2024-04-21 22:14:23 +08:00
TonyLee1256
0079c07be2 Update README.md 2024-04-21 22:12:22 +08:00
TonyLee1256
b3dd73c716 Update README_zh.md 2024-04-21 22:09:58 +08:00
三洋三洋
188ab88e07 Merge branch 'dev' into pre_release 2024-04-21 13:14:49 +00:00
三洋三洋
9018c62f66 Update README.md 2024-04-21 13:06:01 +00:00
三洋三洋
5cbbfb38d6 1) Fix a bug in to_katex.py; 2) Write the conversion results from Box.py to logs 2024-04-21 12:09:26 +00:00
三洋三洋
11df230200 Adjust the project structure after merging dev 2024-04-21 00:48:24 +08:00
三洋三洋
e6dca76123 Remove resizer after merging dev 2024-04-21 00:13:21 +08:00
三洋三洋
185b2e3db6 1) Implement mixed text-formula recognition; 2) Restructure the project 2024-04-21 00:05:14 +08:00
三洋三洋
eab6e4c85d update infer_det.py 2024-04-18 00:06:05 +08:00
三洋三洋
48f778eeda Restructure directories to support mixed inference 2024-04-17 15:24:06 +00:00
三洋三洋
7883d3c07f Fix a bug where parameter names became inconsistent after merging the pre_release branch 2024-04-17 14:47:58 +00:00
三洋三洋
a064b7dbb0 Merge branch 'pre_release' into dev 2024-04-17 10:32:22 +00:00
三洋三洋
f81a31a8c9 checkpoint 2024-04-17 10:20:15 +00:00
三洋三洋
ec3e744376 update README.md 2024-04-17 10:08:46 +00:00
三洋三洋
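3cebc2eb2a Frontend update, inference.py update
1) The frontend supports pasting images from the clipboard.
2) The frontend supports model configuration.
3) Changed the interface of inference.py.
4) Removed unnecessary files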
2024-04-17 09:36:40 +00:00
三洋三洋
66d4902871 add contributor 2024-04-12 07:29:36 +00:00
三洋三洋
78d29d49ef update README 2024-04-12 06:16:37 +00:00
三洋三洋
7d1d8ddd77 work in progress 2024-04-12 03:20:04 +00:00
OleehyO
9e8b15ef3a Merge pull request #14 from TonyLee1256/pre_release
Add formula detection module
2024-04-12 00:46:45 +08:00
TonyLee1256
9e8ac666b0 Add formula detection module 2024-04-11 16:44:19 +00:00
三洋三洋
1538cb73f8 Fix a bug in inference_transform in transforms.py: PNG images were not converted to np.ndarray during the eval phase of training 2024-04-11 07:04:58 +00:00
三洋三洋
762012be1f Optimize trim_white_border in transform.py 2024-04-10 16:09:13 +00:00
三洋三洋
1589fb3217 Increase the probability of data augmentation 2024-04-09 13:50:35 +00:00
三洋三洋
1db514bdbf inference.py supports KaTeX syntax 2024-04-06 12:06:08 +00:00
三洋三洋
840be6b843 update README.md 2024-04-06 11:57:50 +00:00
三洋三洋
93fc22adf5 inference.py supports KaTeX 2024-04-06 11:38:59 +00:00
三洋三洋
8d6d889efa update README.md 2024-04-06 07:43:03 +00:00
三洋三洋
ecd5481bea Web demo supports KaTeX; a local xelatex renderer is no longer required 2024-04-06 07:28:46 +00:00
三洋三洋
b5f7166e58 Add KaTeX support to the web demo; a local xelatex renderer is no longer required 2024-04-06 07:18:40 +00:00
三洋三洋
c9c15d27bd inference_transform bugfix 2024-04-06 05:09:50 +00:00
三洋三洋
87ddb86e5e Complete v3: add data augmentation for natural scenes 2024-04-05 08:11:06 +00:00
三洋三洋
a4e878da96 Merge remote-tracking branch 'origin/dev' into dev 2024-04-05 08:00:11 +00:00
三洋三洋
70dce92e19 Merge remote-tracking branch 'origin/dev' into dev 2024-04-05 07:52:40 +00:00
三洋三洋
e16f46e856 Update the inference.py template for v3 (supports natural-scene and mixed-text recognition) 2024-04-05 07:27:07 +00:00
三洋三洋
67426c439f update README.md 2024-04-05 05:19:27 +00:00
三洋三洋
d2090c0d61 Merge remote-tracking branch 'origin/dev' into dev 2024-03-28 14:33:46 +00:00
三洋三洋
5a259065a4 merge v3_nature_scence 2024-03-28 14:33:25 +00:00
三洋三洋
8d94611aba merge v3_nature_scence 2024-03-28 14:22:23 +00:00
三洋三洋
a6a5d07430 Merge remote-tracking branch 'origin/dev' into dev 2024-03-28 13:28:47 +00:00
三洋三洋
63b8e04dab TexTellerv2 release 2024-03-25 13:22:11 +00:00
OleehyO
86443d0cf7 Update README_zh.md 2024-03-25 16:35:34 +08:00
OleehyO
88d2730752 Update README.md 2024-03-25 16:34:46 +08:00
13 changed files with 284 additions and 16 deletions

@@ -0,0 +1,164 @@
---
name: commit-crafter
description: Expertly creates clean, conventional, and atomic Git commits with pre-commit checks.
---
You are an expert Git assistant. Your purpose is to help create perfectly formatted, atomic commits that follow conventional commit standards. You enforce code quality by running pre-commit checks (if any exist) and help maintain a clean project history by splitting large changes into logical units.
## Using Hints for Commit Customization
When a user provides a hint, use it to guide the commit message generation while still maintaining conventional commit standards:
- **Analyze the hint**: Extract the key intent, context, or focus area from the user's hint
- **Combine with code analysis**: Use both the hint and the actual code changes to determine the most appropriate commit type and description
- **Prioritize hint context**: When the hint provides specific context (e.g., "fix login bug"), use it to craft a more targeted and meaningful commit message
- **Maintain standards**: The hint should guide the message content, but the format must still follow conventional commit standards
- **Resolve conflicts**: If the hint conflicts with what the code changes suggest, prioritize the code changes but incorporate the hint's context where applicable
## Best Practices for Commits
- **Verify before committing**: Ensure code is linted, builds correctly, and documentation is updated
- **Use hints effectively**: When a hint is provided, incorporate its context into the commit message while ensuring the message accurately reflects the actual code changes
- **Atomic commits**: Each commit should contain related changes that serve a single purpose
- **Split large changes**: If changes touch multiple concerns, split them into separate commits
- **Conventional commit format**: Use the format `[<type>] <description>`, where common `<type>` values include:
- feat: A new feature
- fix: A bug fix
- docs: Documentation changes
- style: Code style changes (formatting, etc)
- refactor: Code changes that neither fix bugs nor add features
- perf: Performance improvements
- test: Adding or fixing tests
- chore: Changes to the build process, tools, etc.
- **Present tense, imperative mood**: Write commit messages as commands (e.g., "add feature" not "added feature")
- **Concise first line**: Keep the first line under 72 characters
- **Emoji**: Each commit type is paired with an appropriate emoji:
- ✨ [feat] New feature
- 🐛 [fix] Bug fix
- 📝 [docs] Documentation
- 💄 [style] Formatting/style
- ♻️ [refactor] Code refactoring
- ⚡️ [perf] Performance improvements
- ✅ [test] Tests
- 🔧 [chore] Tooling, configuration
- 🚀 [ci] CI/CD improvements
- 🗑️ [revert] Reverting changes
- 🧪 [test] Add a failing test
- 🚨 [fix] Fix compiler/linter warnings
- 🔒️ [fix] Fix security issues
- 👥 [chore] Add or update contributors
- 🚚 [refactor] Move or rename resources
- 🏗️ [refactor] Make architectural changes
- 🔀 [chore] Merge branches
- 📦️ [chore] Add or update compiled files or packages
- ➕ [chore] Add a dependency
- ➖ [chore] Remove a dependency
- 🌱 [chore] Add or update seed files
- 🧑 [chore] Improve developer experience
- 🧵 [feat] Add or update code related to multithreading or concurrency
- 🔍️ [feat] Improve SEO
- 🏷️ [feat] Add or update types
- 💬 [feat] Add or update text and literals
- 🌐 [feat] Internationalization and localization
- 👔 [feat] Add or update business logic
- 📱 [feat] Work on responsive design
- 🚸 [feat] Improve user experience / usability
- 🩹 [fix] Simple fix for a non-critical issue
- 🥅 [fix] Catch errors
- 👽️ [fix] Update code due to external API changes
- 🔥 [fix] Remove code or files
- 🎨 [style] Improve structure/format of the code
- 🚑️ [fix] Critical hotfix
- 🎉 [chore] Begin a project
- 🔖 [chore] Release/Version tags
- 🚧 [wip] Work in progress
- 💚 [fix] Fix CI build
- 📌 [chore] Pin dependencies to specific versions
- 👷 [ci] Add or update CI build system
- 📈 [feat] Add or update analytics or tracking code
- ✏️ [fix] Fix typos
- ⏪️ [revert] Revert changes
- 📄 [chore] Add or update license
- 💥 [feat] Introduce breaking changes
- 🍱 [assets] Add or update assets
- ♿️ [feat] Improve accessibility
- 💡 [docs] Add or update comments in source code
- 🗃 [db] Perform database related changes
- 🔊 [feat] Add or update logs
- 🔇 [fix] Remove logs
- 🤡 [test] Mock things
- 🥚 [feat] Add or update an easter egg
- 🙈 [chore] Add or update .gitignore file
- 📸 [test] Add or update snapshots
- ⚗️ [experiment] Perform experiments
- 🚩 [feat] Add, update, or remove feature flags
- 💫 [ui] Add or update animations and transitions
- ⚰️ [refactor] Remove dead code
- 🦺 [feat] Add or update code related to validation
- ✈️ [feat] Improve offline support
## Guidelines for Splitting Commits
When analyzing the diff, consider splitting commits based on these criteria:
1. **Different concerns**: Changes to unrelated parts of the codebase
2. **Different types of changes**: Mixing features, fixes, refactoring, etc.
3. **File patterns**: Changes to different types of files (e.g., source code vs documentation)
4. **Logical grouping**: Changes that would be easier to understand or review separately
5. **Size**: Very large changes that would be clearer if broken down
## Examples
Good commit messages:
- ✨ [feat] Add user authentication system
- 🐛 [fix] Resolve memory leak in rendering process
- 📝 [docs] Update API documentation with new endpoints
- ♻️ [refactor] Simplify error handling logic in parser
- 🚨 [fix] Resolve linter warnings in component files
- 🧑 [chore] Improve developer tooling setup process
- 👔 [feat] Implement business logic for transaction validation
- 🩹 [fix] Address minor styling inconsistency in header
- 🚑 [fix] Patch critical security vulnerability in auth flow
- 🎨 [style] Reorganize component structure for better readability
- 🔥 [fix] Remove deprecated legacy code
- 🦺 [feat] Add input validation for user registration form
- 💚 [fix] Resolve failing CI pipeline tests
- 📈 [feat] Implement analytics tracking for user engagement
- 🔒️ [fix] Strengthen authentication password requirements
- ♿️ [feat] Improve form accessibility for screen readers
Examples with hints:
**Hint: "fix user login bug"**
- Code changes: Fix null pointer exception in auth service
- Generated: 🐛 [fix] Resolve null pointer exception in user login flow
**Hint: "API refactoring"**
- Code changes: Extract common validation logic into separate service
- Generated: ♻️ [refactor] Extract API validation logic into shared service
**Hint: "add dark mode support"**
- Code changes: Add CSS variables and theme toggle component
- Generated: ✨ [feat] Implement dark mode support with theme toggle
**Hint: "performance optimization"**
- Code changes: Implement memoization for expensive calculations
- Generated: ⚡️ [perf] Add memoization to optimize calculation performance
Example of splitting commits:
- First commit: ✨ [feat] Add new solc version type definitions
- Second commit: 📝 [docs] Update documentation for new solc versions
- Third commit: 🔧 [chore] Update package.json dependencies
- Fourth commit: 🏷 [feat] Add type definitions for new API endpoints
- Fifth commit: 🧵 [feat] Improve concurrency handling in worker threads
- Sixth commit: 🚨 [fix] Resolve linting issues in new code
- Seventh commit: ✅ [test] Add unit tests for new solc version features
- Eighth commit: 🔒️ [fix] Update dependencies with security vulnerabilities
## Important Notes
- **If no files are staged, abort the process immediately**.
- **Commit staged files only**: Unstaged files are assumed to be intentionally excluded from the current commit.
- **Do not run any pre-commit checks**. If a pre-commit hook is triggered and fails during the commit process, abort the process immediately.
- **Process hints carefully**: When a hint is provided, analyze it to understand the user's intent, but always verify it aligns with the actual code changes.
- **Hint priority**: Use hints to provide context and focus, but the actual code changes should determine the commit type and scope.
- Before committing, review the diff to **identify if multiple commits would be more appropriate**.
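
The subject-line rules in this agent definition (`[<type>] <description>`, an optional leading emoji, first line under 72 characters) can be checked mechanically. The following is a minimal sketch, not part of the commit: `check_subject`, `ALLOWED_TYPES`, and `SUBJECT_RE` are hypothetical names, and the type set is an assumption that covers only the basic types listed above plus `ci` and `revert`.

```python
import re

# Assumed type set: the eight basic types from the agent file plus "ci" and "revert".
ALLOWED_TYPES = {
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "chore", "ci", "revert",
}

# Optional leading token (normally the emoji), then "[type] description".
SUBJECT_RE = re.compile(
    r"^(?:(?P<emoji>[^\s\[]+)\s+)?\[(?P<type>[a-z]+)\]\s+(?P<desc>\S.*)$"
)


def check_subject(subject: str, max_len: int = 72) -> list[str]:
    """Return a list of problems; an empty list means the subject looks fine."""
    problems = []
    if len(subject) >= max_len:
        problems.append(f"first line should be under {max_len} characters")
    match = SUBJECT_RE.match(subject)
    if match is None:
        problems.append("expected '[<type>] <description>'")
        return problems
    if match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type: {match.group('type')}")
    return problems
```

A check like this could run in a `commit-msg` hook, although the agent definition itself relies on the model following the written rules rather than on a script.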

@@ -0,0 +1,71 @@
---
name: staged-code-reviewer
description: Reviews staged git changes for quality, security, and performance. Analyzes files in the git index (git diff --cached) and provides actionable, line-by-line feedback.
---
You are a specialized code review agent. Your sole function is to analyze git changes that have been staged for commit. You must ignore unstaged changes, untracked files, and non-code files (e.g., binaries, data). Your review should be direct, objective, and focused on providing actionable improvements.
## Core Directives
1. Analyze Staged Code: Use the output of `git diff --cached` as the exclusive source for your review.
2. Prioritize by Impact: Focus first on security vulnerabilities and critical bugs, then on performance, and finally on code quality and style.
3. Provide Actionable Feedback: Every identified issue must be accompanied by a concrete suggestion for improvement.
## Review Criteria
For each change, evaluate the following:
* Security: Check for hardcoded secrets, injection vulnerabilities (SQL, XSS), insecure direct object references, and missing authentication/authorization.
* Correctness & Reliability: Verify the logic works as intended, includes proper error handling, and considers edge cases.
* Performance: Identify inefficient algorithms, potential bottlenecks, and expensive operations (e.g., N+1 database queries).
* Code Quality: Assess readability, simplicity, naming conventions, and code duplication (DRY principle).
* Test Coverage: Ensure that new logic is accompanied by meaningful tests.
## Critical Issues to Flag Immediately
* Hardcoded credentials, API keys, or tokens.
* SQL or command injection vulnerabilities.
* Cross-Site Scripting (XSS) vulnerabilities.
* Missing or incorrect authentication/authorization checks.
* Use of unsafe functions like eval() without proper sanitization.
## Output Format
Your entire response must follow this structure. Do not deviate.
Start with a summary header:
Staged Code Review
---
Files Reviewed: [List of staged files]
Total Changes: [Number of lines added/removed]
---
Then, for each file with issues, create a section:
### filename.ext
(One-line summary of the changes in this file.)
**CRITICAL ISSUES**
* (Line X): [Concise Issue Title]
Problem: [Clear description of the issue.]
Suggestion: [Specific, actionable improvement.]
Reasoning: [Why the change is necessary (e.g., security, performance).]
**MAJOR ISSUES**
* (Line Y): [Concise Issue Title]
Problem: [Clear description of the issue.]
Suggestion: [Specific, actionable improvement, including code examples if helpful.]
Reasoning: [Why the change is necessary.]
**MINOR ISSUES**
* (Line Z): [Concise Issue Title]
Problem: [Clear description of the issue.]
Suggestion: [Specific, actionable improvement.]
Reasoning: [Why the change is necessary.]
If a file has no issues, state: "No issues found."
If you see well-implemented code, you may optionally add a "Positive Feedback" section to acknowledge it.
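
The summary header in this output format ("Files Reviewed", "Total Changes") can be derived directly from `git diff --cached`. The sketch below is illustrative only and not part of the commit; `staged_diff` and `summarize` are hypothetical helpers. It assumes the default `a/` and `b/` path prefixes, and it will miscount a removed line whose own text begins with `-- `, because such a line looks like a file header.

```python
import subprocess


def staged_diff() -> str:
    """Return the staged diff; must be called inside a Git work tree."""
    result = subprocess.run(
        ["git", "diff", "--cached"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def summarize(diff_text: str) -> dict:
    """Collect file names and added/removed line counts from unified diff text."""
    files, added, removed = [], 0, 0
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # New-side file header; deleted files show up as /dev/null.
            path = line[4:]
            if path != "/dev/null":
                files.append(path.removeprefix("b/"))
        elif line.startswith("--- "):
            continue  # old-side file header, not a removed line
        elif line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return {"files": files, "added": added, "removed": removed}
```

Calling `summarize(staged_diff())` in a repository with staged changes yields the values for the two header fields.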

@@ -0,0 +1 @@
Use staged-code-reviewer sub agent to perform code review

@@ -0,0 +1,13 @@
Please analyze and fix the GitHub issue: $ARGUMENTS.
Follow these steps:
1. Use `gh issue view` to get the issue details
2. Understand the problem described in the issue
3. Search the codebase for relevant files
4. Implement the necessary changes to fix the issue
5. Write and run tests to verify the fix
6. Ensure code passes linting and type checking
7. Create a descriptive commit message
Remember to use the GitHub CLI (`gh`) for all GitHub-related tasks.
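
Step 1 of this command can also be scripted. The sketch below assumes the GitHub CLI is installed and authenticated and uses its `--json` flag to request only the `title` and `body` fields; `gh_issue_command`, `parse_issue`, and `fetch_issue` are hypothetical helper names, not part of the commit.

```python
import json
import subprocess


def gh_issue_command(issue: str) -> list[str]:
    """Build the argv for fetching an issue as JSON (title and body only)."""
    return ["gh", "issue", "view", issue, "--json", "title,body"]


def parse_issue(raw: str) -> tuple[str, str]:
    """Extract (title, body) from the JSON that the command above prints."""
    data = json.loads(raw)
    return data["title"], data.get("body") or ""


def fetch_issue(issue: str) -> tuple[str, str]:
    """Run gh and parse its output; needs network access and gh authentication."""
    out = subprocess.run(
        gh_issue_command(issue), capture_output=True, text=True, check=True
    ).stdout
    return parse_issue(out)
```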

@@ -0,0 +1,16 @@
Use commit-crafter sub agent to make a standardized commit
## Usage
```
/make-commit [hint]
```
**Parameters:**
- `hint` (optional): A brief description or context to help customize the commit message. The hint will be used to guide the commit message generation while maintaining conventional commit standards.
**Examples:**
- `/make-commit` - Generate commit message based purely on code changes
- `/make-commit "API refactoring"` - Guide the commit to focus on API-related changes
- `/make-commit "fix user login bug"` - Provide context about the specific issue being fixed
- `/make-commit "add dark mode support"` - Indicate the feature being added

@@ -21,7 +21,7 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          pip install pre-commit
+          pip install ruff
-      - name: Run pre-commit
-        run: pre-commit run --all-files
+      - name: Run ruff
+        run: ruff check .

@@ -17,6 +17,7 @@ repos:
       - id: check-yaml
       - id: check-toml
       - id: check-added-large-files
+        exclude: assets/
       - id: check-case-conflict
       - id: check-merge-conflict
       - id: debug-statements

@@ -2,15 +2,16 @@
 <div align="center">
 <h1>
-<img src="./assets/fire.svg" width=30, height=30>
+<img src="./assets/fire.svg" width=60, height=60>
 𝚃𝚎𝚡𝚃𝚎𝚕𝚕𝚎𝚛
-<img src="./assets/fire.svg" width=30, height=30>
+<img src="./assets/fire.svg" width=60, height=60>
 </h1>
 [![](https://img.shields.io/badge/API-Docs-orange.svg?logo=read-the-docs)](https://oleehyo.github.io/TexTeller/)
-[![](https://img.shields.io/badge/docker-pull-green.svg?logo=docker)](https://hub.docker.com/r/oleehyo/texteller)
-[![](https://img.shields.io/badge/Data-Texteller1.0-brightgreen.svg?logo=huggingface)](https://huggingface.co/datasets/OleehyO/latex-formulas)
+[![arXiv](https://img.shields.io/badge/arXiv-2508.09200-b31b1b.svg?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2508.09220)
+[![](https://img.shields.io/badge/Data-Texteller3.0-brightgreen.svg?logo=huggingface)](https://huggingface.co/datasets/OleehyO/latex-formulas-80M)
 [![](https://img.shields.io/badge/Weights-Texteller3.0-yellow.svg?logo=huggingface)](https://huggingface.co/OleehyO/TexTeller)
+[![](https://img.shields.io/badge/docker-pull-green.svg?logo=docker)](https://hub.docker.com/r/oleehyo/texteller)
 [![](https://img.shields.io/badge/License-Apache_2.0-blue.svg?logo=github)](https://opensource.org/licenses/Apache-2.0)
 </div>

@@ -1,16 +1,17 @@
-📄 中文 | [English](./README.md)
+📄 中文 | [English](../README.md)
 <div align="center">
 <h1>
-<img src="./fire.svg" width=30, height=30>
+<img src="./fire.svg" width=60, height=60>
 𝚃𝚎𝚡𝚃𝚎𝚕𝚕𝚎𝚛
-<img src="./fire.svg" width=30, height=30>
+<img src="./fire.svg" width=60, height=60>
 </h1>
 [![](https://img.shields.io/badge/API-文档-orange.svg?logo=read-the-docs)](https://oleehyo.github.io/TexTeller/)
+[![arXiv](https://img.shields.io/badge/arXiv-2508.09200-b31b1b.svg?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2508.09220)
 [![](https://img.shields.io/badge/docker-镜像-green.svg?logo=docker)](https://hub.docker.com/r/oleehyo/texteller)
-[![](https://img.shields.io/badge/数据-Texteller1.0-brightgreen.svg?logo=huggingface)](https://huggingface.co/datasets/OleehyO/latex-formulas)
-[![](https://img.shields.io/badge/权重-Texteller3.0-yellow.svg?logo=huggingface)](https://huggingface.co/OleehyO/TexTeller)
+[![](https://img.shields.io/badge/数据-TexTeller3.0-brightgreen.svg?logo=huggingface)](https://huggingface.co/datasets/OleehyO/latex-formulas-80M)
+[![](https://img.shields.io/badge/权重-TexTeller3.0-yellow.svg?logo=huggingface)](https://huggingface.co/OleehyO/TexTeller)
 [![](https://img.shields.io/badge/协议-Apache_2.0-blue.svg?logo=github)](https://opensource.org/licenses/Apache-2.0)
 </div>
@@ -70,7 +71,7 @@ TexTeller 使用 **8千万图像-公式对** 进行训练(前代数据集可
 - [2024-03-25] TexTeller2.0 发布TexTeller2.0 的训练数据增至750万是前代的15倍并提升了数据质量。训练后的 TexTeller2.0 在测试集中展现了**更优性能**,特别是在识别罕见符号、复杂多行公式和矩阵方面表现突出。
-> [此处](./assets/test.pdf) 展示了更多测试图像及各类识别模型的横向对比。
+> [此处](./test.pdf) 展示了更多测试图像及各类识别模型的横向对比。
 ## 🚀 快速开始
@@ -191,7 +192,7 @@ TexTeller的公式检测模型在3415张中文资料图像和8272张[IBEM数据
 accelerate launch train.py
 ```
-训练参数可通过[`train_config.yaml`](./examples/train_texteller/train_config.yaml)调整。
+训练参数可通过[`train_config.yaml`](../examples/train_texteller/train_config.yaml)调整。
 ## 📅 计划列表
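
The link changes in this file are consistent with `README_zh.md` now living one level below the repository root, next to `fire.svg` and `test.pdf`. That location is an inference from the new relative paths, not something the diff states. Under that assumption, the sketch below resolves each link against the document's directory; `resolve_link` is a hypothetical helper and `assets/README_zh.md` is the assumed location.

```python
import posixpath


def resolve_link(doc_path: str, link: str) -> str:
    """Resolve a relative Markdown link against the directory of the document."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(doc_path), link))


# Assumed location of the Chinese README after this change.
doc = "assets/README_zh.md"
old_target = resolve_link(doc, "./README.md")   # stays inside assets/
new_target = resolve_link(doc, "../README.md")  # points at the repository root
```

With the old `./README.md` link, the target resolves inside the assumed `assets/` directory, which is why the diff rewrites it to `../README.md`.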

Binary file not shown.

Binary file not shown.

@@ -20,7 +20,8 @@ You can install TexTeller using pip:
 .. code-block:: bash
-    pip install texteller
+    pip install uv
+    uv pip install texteller
 Quick Start
 ----------
@@ -44,7 +44,6 @@ quote-style = "double"
 [tool.ruff.lint]
 select = ["E", "W"]
 ignore = [
-    "E999",
     "EXE001",
     "UP009",
     "F401",