i6_core.bpe.apply
This is the old location of the BPE jobs, kept for backwards compatibility. New setups that use the subword-nmt based BPE should use i6_core.label.bpe; other setups should switch to the SentencePiece implementation.
- class i6_core.bpe.apply.ApplyBPEModelToLexiconJob(*args, **kwargs)
Apply BPE codes to a Bliss lexicon file
- Parameters:
bliss_lexicon (Path) –
bpe_codes (Path) –
bpe_vocab (Path|None) –
subword_nmt_repo (Optional[Path]) –
- class i6_core.bpe.apply.ApplyBPEToTextJob(*args, **kwargs)
Apply BPE codes to a text file
- Parameters:
text_file – text file of words to convert to BPE
bpe_codes – BPE codes file, e.g. ReturnnTrainBpeJob.out_bpe_codes
bpe_vocab – if provided, merge operations that would produce an OOV token are reverted; use e.g. ReturnnTrainBpeJob.out_bpe_dummy_count_vocab
subword_nmt_repo – path to the subword-nmt repository; see also CloneGitRepositoryJob
gzip_output – whether to gzip the output text
mini_task – whether the Job should run locally, e.g. when only a small (<1M lines) text is processed
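To illustrate what "applying BPE codes" to text means, here is a minimal pure-Python sketch. It is not the code that ApplyBPEToTextJob or subword-nmt runs. The function names apply_bpe_to_word and apply_bpe_to_line and the toy merge list are hypothetical. The sketch only assumes the usual subword-nmt conventions: an end-of-word marker "</w>" and "@@" as the separator on non-final pieces. It ignores bpe_vocab, so no merges are reverted.

```python
END_OF_WORD = "</w>"  # end-of-word marker, as used in subword-nmt codes files


def apply_bpe_to_word(word, merges, separator="@@"):
    """Segment one word with an ordered list of merge operations (toy sketch)."""
    # Earlier merges in the codes file have higher priority (lower rank).
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    # Start from single characters; the last one carries the end-of-word marker.
    symbols = list(word[:-1]) + [word[-1] + END_OF_WORD]
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda pair: ranks.get(pair, float("inf")))
        if best not in ranks:
            break  # no learned merge applies any more
        merged, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    # Drop the marker from the final piece and tag all other pieces.
    symbols[-1] = symbols[-1][: -len(END_OF_WORD)]
    return [s + separator for s in symbols[:-1]] + [symbols[-1]]


def apply_bpe_to_line(line, merges):
    """Segment every whitespace-separated word of a text line."""
    return " ".join(" ".join(apply_bpe_to_word(w, merges)) for w in line.split())


# Hypothetical merge list standing in for the contents of a bpe_codes file.
toy_merges = [("l", "o"), ("lo", "w</w>"), ("e", "r</w>")]
print(apply_bpe_to_line("low lower", toy_merges))
```

With these toy merges, "low" is merged back into a single token, while "lower" stays split because no merge joins "lo" and "w" when "w" is not word-final.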