From 72a60f861114cf90b489804042185367bc122572 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=89=E6=B4=8B=E4=B8=89=E6=B4=8B?= <1258009915@qq.com>
Date: Mon, 12 Feb 2024 16:27:58 +0000
Subject: [PATCH] Update README

---
 README.md           | 4 ++--
 assets/README_zh.md | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 8e81cfc..b1bfc27 100644
--- a/README.md
+++ b/README.md
@@ -96,13 +96,13 @@ After the dataset is ready, you should **change the `DIR_URL` variable** in `...
 If you are using a different dataset, you may need to retrain the tokenizer to match your specific vocabulary. After setting up the dataset, you can do this by:
 
-1. Change the line `new_tokenizer.save_pretrained('./your_dir_name')` in `TexTeller/src/models/ocr_model/tokenizer/train.py` to your desired output directory name.
+1. Change the line `new_tokenizer.save_pretrained('./your_dir_name')` in `TexTeller/src/models/tokenizer/train.py` to your desired output directory name.
 
    > To use a different vocabulary size, you should modify the `VOCAB_SIZE` parameter in the `TexTeller/src/models/globals.py`.
 
 2. Running the following command **under `TexTeller/src` directory**:
 
    ```bash
-   python -m models.ocr_model.tokenizer.train
+   python -m models.tokenizer.train
    ```
 
 ### Train the model
diff --git a/assets/README_zh.md b/assets/README_zh.md
index 1a68b64..7768d9d 100644
--- a/assets/README_zh.md
+++ b/assets/README_zh.md
@@ -126,13 +126,13 @@ python serve.py # default settings
 如果你使用了不一样的数据集,你可能需要重新训练tokenizer来得到一个不一样的字典。配置好数据集后,可以通过以下命令来训练自己的tokenizer:
 
-1. 在`TexTeller/src/models/ocr_model/tokenizer/train.py`中,修改`new_tokenizer.save_pretrained('./your_dir_name')`为你自定义的输出目录
+1. 在`TexTeller/src/models/tokenizer/train.py`中,修改`new_tokenizer.save_pretrained('./your_dir_name')`为你自定义的输出目录
 
    > 如果要用一个不一样大小的字典(默认1W个token),你需要在 `TexTeller/src/models/globals.py`中修改`VOCAB_SIZE`变量
 
 2. **在 `TexTeller/src` 目录下**运行以下命令:
 
    ```bash
-   python -m models.ocr_model.tokenizer.train
+   python -m models.tokenizer.train
    ```
 
 ### Train the model