TexTeller/texteller/models/ocr_model/utils/__pycache__/functional.cpython-310.pyc

14 lines
2.3 KiB
Plaintext

[Binary content: compiled CPython 3.10 bytecode, not displayable as text. Only the fragments below are readable.]

Source file: /Users/Leehy/Code/TexTeller/texteller/models/ocr_model/utils/functional.py
Imported names: torch; DataCollatorForLanguageModeling (transformers); List, Dict, Any (typing); train_transform, inference_transform (transforms); MIN_HEIGHT, MIN_WIDTH, MAX_TOKEN_SIZE (globals)
Functions defined: left_move, tokenize_fn, collate_fn, img_train_transform, img_inf_transform, filter_fn
Assertion messages: "x should be 2-dimensional", "tokenizer should not be None"
Dictionary keys and keyword names: latex_formula, return_special_tokens_mask, image, pixel_values, mlm, input_ids, decoder_input_ids, attention_mask, decoder_attention_mask, labels, dim