i6_core.text.label.subword_nmt.apply
- class i6_core.text.label.subword_nmt.apply.ApplyBPEModelToLexiconJob(*args, **kwargs)
Apply BPE codes to a Bliss lexicon file
- Parameters:
bliss_lexicon (Path) –
bpe_codes (Path) –
bpe_vocab (Optional[Path]) –
subword_nmt_repo (Optional[Path]) –
- run()
- tasks()
- Returns:
yields Tasks
- Return type:
list[sisyphus.task.Task]
- class i6_core.text.label.subword_nmt.apply.ApplyBPEToTextJob(*args, **kwargs)
Apply BPE codes to a text file
- Parameters:
text_file – text file of words to convert to BPE
bpe_codes – BPE codes file, e.g. ReturnnTrainBpeJob.out_bpe_codes
bpe_vocab – if provided, merge operations that produce OOV tokens are reverted; use e.g. ReturnnTrainBpeJob.out_bpe_dummy_count_vocab
subword_nmt_repo – path to the subword-nmt repository; see also CloneGitRepositoryJob
gzip_output – gzip the output text
mini_task – whether the Job should run locally, e.g. when only a small (<1M lines) text is processed
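To make the roles of bpe_codes and bpe_vocab more concrete, the following is a minimal, self-contained sketch of what applying BPE merge operations to text looks like, including the reverting of merges that produce OOV tokens. It is not the code this Job runs (the Job relies on the subword-nmt repository), and it is simplified compared to subword-nmt. The merge list `MERGES`, the helper names, the `</w>` end-of-word marker and the `@@` separator are assumptions chosen for illustration; reading an actual codes file is omitted.

```python
from typing import Dict, List, Optional, Set, Tuple

END = "</w>"  # assumed end-of-word marker carried by the last symbol of a word

# Hypothetical merge operations, highest priority first (illustration only).
MERGES: List[Tuple[str, str]] = [
    ("l", "o"),
    ("lo", "w"),
    ("e", "r" + END),
    ("low", "er" + END),
]


def segment(word: str, ranks: Dict[Tuple[str, str], int]) -> List[str]:
    """Repeatedly merge the best-ranked adjacent pair until none applies."""
    symbols = list(word[:-1]) + [word[-1] + END]
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no known merge is left for this word
        merged, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


def to_surface(symbol: str, sep: str) -> str:
    """Output form: word-final pieces drop the marker, others get the separator."""
    return symbol[: -len(END)] if symbol.endswith(END) else symbol + sep


def apply_bpe(line: str, merges: List[Tuple[str, str]],
              vocab: Optional[Set[str]] = None, sep: str = "@@") -> str:
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    undo = {a + b: (a, b) for a, b in merges}  # merged symbol -> its two parts

    def split_oov(symbol: str) -> List[str]:
        # Keep the symbol if it is in the vocabulary or cannot be split further.
        if to_surface(symbol, sep) in vocab or symbol not in undo:
            return [symbol]
        left, right = undo[symbol]
        return split_oov(left) + split_oov(right)

    out: List[str] = []
    for word in line.split():
        symbols = segment(word, ranks)
        if vocab is not None:
            symbols = [part for s in symbols for part in split_oov(s)]
        out.extend(to_surface(s, sep) for s in symbols)
    return " ".join(out)


print(apply_bpe("lower low", MERGES))
print(apply_bpe("lower", MERGES, vocab={"low@@", "er"}))
```

Without a vocabulary, "lower" is merged into a single token. With the small vocabulary in the second call, the final merge is reverted because "lower" is not in the vocabulary while "low@@" and "er" are.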
- classmethod hash(parsed_args)
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()
- tasks()
- Returns:
yields Tasks
- Return type:
list[sisyphus.task.Task]