TexTeller/texteller/models/ocr_model/train/__pycache__/train.cpython-310.pyc


[Binary file: compiled CPython 3.10 bytecode, not renderable as text. Only the embedded strings below are recoverable.]

Source path: /Users/Leehy/Code/TexTeller/texteller/models/ocr_model/train/train.py

Imports:
  os
  functools: partial
  pathlib: Path
  datasets: load_dataset
  transformers: Trainer, TrainingArguments, Seq2SeqTrainer, Seq2SeqTrainingArguments, GenerationConfig
  CONFIG
  model.TexTeller: TexTeller
  utils.functional: tokenize_fn, collate_fn, img_train_transform, img_inf_transform, filter_fn
  utils.metrics: bleu_metric
  globals: MAX_TOKEN_SIZE, MIN_WIDTH, MIN_HEIGHT

Functions:
  train(model, tokenizer, train_dataset, eval_dataset, collate_fn_with_tokenizer)
    Builds TrainingArguments from CONFIG, creates a Trainer with train_dataset, eval_dataset, tokenizer and data_collator, then calls trainer.train with a resume_from_checkpoint argument.
  evaluate(model, tokenizer, eval_dataset, collate_fn)
    Copies CONFIG, sets predict_with_generate to True, attaches a GenerationConfig (max_new_tokens, num_beams, do_sample, pad_token_id, eos_token_id, bos_token_id) as generation_config, builds Seq2SeqTrainingArguments and a Seq2SeqTrainer with eval_dataset, tokenizer, data_collator and compute_metrics (a partial of bleu_metric), then prints the result of trainer.evaluate().

Names used in the __main__ block, in order:
  script_dirpath (Path(__file__).resolve().parent), os.chdir
  dataset (load_dataset with "imagefolder" and a data_dir argument)
  filter (lambda comparing the image height and width against MIN_HEIGHT and MIN_WIDTH), shuffle (seed), flatten_indices
  tokenizer (get_tokenizer), filter_fn_with_tokenizer, map_fn
  tokenized_dataset (map with batched=True, remove_columns from column_names, num_proc)
  split_dataset (train_test_split with test_size and seed), train_dataset, eval_dataset ("train" and "test" splits)
  with_transform (img_train_transform, img_inf_transform)
  collate_fn_with_tokenizer, model
  enable_train, enable_evaluate (evaluate runs only when len(eval_dataset) is greater than 0)