Welcome to the documentation of the i6-core recipes!¶
This is the documentation of the public recipe collection of the RWTH i6 lab for the Sisyphus workflow manager.
The repository is still under construction, so expect ongoing changes to both the source code and the documentation.
API¶
i6_core.adaptation.ivector¶
i6_core.adaptation.linear_adaptation_layer¶
i6_core.adaptation.ubm¶
i6_core.am.config¶
i6_core.am.score_features¶
i6_core.audio.encoding¶
- class i6_core.audio.encoding.BlissChangeEncodingJob(*args, **kwargs)¶
Uses ffmpeg to convert all audio files of a bliss corpus (file format, encoding, channel layout).
For all parameters, “None” means to use the ffmpeg defaults, which depend on the input file and the specified output format.
- Parameters:
corpus_file – bliss corpus
output_format – output file ending to determine container format (without dot)
sample_rate – target sample rate of the audio
codec – specify the codec, codecs are listed with ffmpeg -codecs
codec_options – specify additional codec specific options (be aware of potential conflicts with “fixed bitrate” and “sample_rate”)
fixed_bitrate – a target bitrate (be aware that not all codecs support all bitrates)
force_num_channels – specify the channel number, exceeding channels will be merged
select_channels – tuple of (channel_layout, channel_name), see ffmpeg -layouts. This is useful if the new encoding might have an effect on the duration, or if no duration was specified in the source corpus
ffmpeg_binary – path to a ffmpeg binary, uses system “ffmpeg” if None
hash_binary – In some cases it might be required to work with a specific ffmpeg version, in which case the binary needs to be hashed
recover_duration – This will open all files with “soundfile” and extract the length information. There might be minimal differences when converting the encoding, so only set this to False if you’re willing to accept this risk. None (default) means that the duration is recovered if either output_format or codec is specified because this might possibly lead to duration mismatches.
in_codec – specify the codec of the input file
in_codec_options – specify additional codec specific options for the in_codec
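Example (not part of the generated reference): a minimal Sisyphus usage sketch for BlissChangeEncodingJob. Paths are placeholders, and the output attribute name is an assumption following the usual i6_core out_* convention.
from sisyphus import tk
from i6_core.audio.encoding import BlissChangeEncodingJob

# convert a bliss corpus to 16 kHz mono wav (placeholder input path)
convert_job = BlissChangeEncodingJob(
    corpus_file=tk.Path("/path/to/corpus.xml.gz"),
    output_format="wav",       # container format without the dot
    sample_rate=16000,         # target sample rate
    force_num_channels=1,      # merge exceeding channels down to mono
)
# output attribute name assumed (out_* convention)
tk.register_output("corpora/converted.xml.gz", convert_job.out_corpus)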
i6_core.audio.ffmpeg¶
- class i6_core.audio.ffmpeg.BlissFfmpegJob(*args, **kwargs)¶
Applies an FFMPEG audio filter to all recordings of a bliss corpus. This Job is extremely generic, as any valid audio option/filter string will work. Please consider using more specific jobs that use this Job as a superclass, see e.g. BlissChangeEncodingJob
- WARNING:
This job assumes that file names of individual recordings are unique across the whole corpus.
Do not change the duration of the audio files when you have multiple segments per audio, as the segment information will be incorrect afterwards.
Typical applications:
Changing Audio Format/Encoding
specify in output_format which container you want to use. If the filter string is empty (“”), ffmpeg will automatically use a default encoding option
specify specific encoding with -c:a <codec>. For a list of available codecs and their options see https://ffmpeg.org/ffmpeg-codecs.html#Audio-Encoders
specify a fixed bitrate with -b:a <bit_rate>, e.g. 64k. Variable bitrate options depend on the used encoder, refer to the online documentation in this case
specify a sample rate with -ar <sample_rate>. FFMPEG will do proper resampling, so the speed of the audio is NOT changed.
Changing Channel Layout
for detailed information see https://trac.ffmpeg.org/wiki/AudioChannelManipulation
convert to mono -ac 1
selecting a specific audio channel: -filter_complex [0:a]channelsplit=channel_layout=stereo:channels=FR[right] -map [right] For a list of channels/layouts use ffmpeg -layouts
Simple Filter Syntax
For a list of available filters see: https://ffmpeg.org/ffmpeg-filters.html
-af <filter_name>=<first_param>=<first_param_value>:<second_param>=<second_param_value>
Complex Filter Syntax
-filter_complex [<input>]<simple_syntax>[<output>];[<input>]<simple_syntax>[<output>];…
Inputs and outputs can be named arbitrarily, but the default stream 0 audio can be accessed with [0:a]
The output stream that should be written into the audio is defined with -map [<output_stream>]
IMPORTANT! Do not forget to add and escape additional quotation marks correctly for parameters to -af or -filter_complex
- Parameters:
corpus_file – bliss corpus
ffmpeg_options – list of additional ffmpeg parameters
recover_duration – if the filter changes the duration of the audio, set to True
output_format – output file ending to determine container format (without dot)
ffmpeg_binary – path to a ffmpeg binary, uses system “ffmpeg” if None
hash_binary – In some cases it might be required to work with a specific ffmpeg version, in which case the binary needs to be hashed
ffmpeg_input_options – list of ffmpeg parameters that are applied for reading the input files
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- run_recover_duration()¶
Open all files with “soundfile” and extract the length information
- Returns:
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
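Example (not part of the generated reference): a minimal sketch applying a simple audio filter with BlissFfmpegJob, following the simple filter syntax described above. Paths are placeholders; the output attribute name is an assumption.
from sisyphus import tk
from i6_core.audio.ffmpeg import BlissFfmpegJob

# halve the volume of all recordings via "-af volume=0.5"
filter_job = BlissFfmpegJob(
    corpus_file=tk.Path("/path/to/corpus.xml.gz"),
    ffmpeg_options=["-af", "volume=0.5"],
    recover_duration=False,    # a volume filter does not change the duration
    output_format="wav",
)
tk.register_output("corpora/volume_scaled.xml.gz", filter_job.out_corpus)  # assumed out_* name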
i6_core.bpe.apply¶
This is an old location of the BPE jobs, kept for backwards compatibility. For new setups using the subword-nmt based BPE, please use i6_core.label.bpe; for other setups, please switch to the sentencepiece implementation.
- class i6_core.bpe.apply.ApplyBPEModelToLexiconJob(*args, **kwargs)¶
Apply BPE codes to a Bliss lexicon file
- Parameters:
bliss_lexicon (Path) –
bpe_codes (Path) –
bpe_vocab (Path|None) –
subword_nmt_repo (Optional[Path]) –
- class i6_core.bpe.apply.ApplyBPEToTextJob(*args, **kwargs)¶
Apply BPE codes on a text file
- Parameters:
text_file – words text file to convert to bpe
bpe_codes – bpe codes file, e.g. ReturnnTrainBpeJob.out_bpe_codes
bpe_vocab – if provided, then merge operations that produce OOV are reverted, use e.g. ReturnnTrainBpeJob.out_bpe_dummy_count_vocab
subword_nmt_repo – subword nmt repository path. see also CloneGitRepositoryJob
gzip_output – use gzip on the output text
mini_task – if the Job should run locally, e.g. only a small (<1M lines) text should be processed
i6_core.bpe.train¶
This is an old location of the BPE jobs, kept for backwards compatibility. For new setups using the subword-nmt based BPE, please use i6_core.label.bpe; for other setups, please switch to the sentencepiece implementation.
- class i6_core.bpe.train.ReturnnTrainBpeJob(*args, **kwargs)¶
Create BPE codes and vocab files compatible with RETURNN BytePairEncoding
- Parameters:
text_file – corpus text file, .gz compressed or uncompressed
bpe_size (int) – number of BPE merge operations
unk_label (str) – unknown label
subword_nmt_repo (Path|None) – subword nmt repository path. see also CloneGitRepositoryJob
- class i6_core.bpe.train.TrainBPEModelJob(*args, **kwargs)¶
Create a bpe codes file using the official subword-nmt repo, either installed from pip or https://github.com/rsennrich/subword-nmt
- Parameters:
text_corpus (Path) –
symbols (int) –
min_frequency (int) –
dict_input (bool) –
total_symbols (bool) –
subword_nmt_repo (Optional[Path]) –
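Example (not part of the generated reference): a minimal sketch combining the BPE training and application jobs above. Paths are placeholders; out_bpe_codes and out_bpe_dummy_count_vocab are the outputs referenced in the parameter descriptions above, while the output attribute of ApplyBPEToTextJob is an assumption.
from sisyphus import tk
from i6_core.bpe.train import ReturnnTrainBpeJob
from i6_core.bpe.apply import ApplyBPEToTextJob

subword_nmt_repo = tk.Path("/path/to/subword-nmt")  # e.g. from a CloneGitRepositoryJob

train_bpe = ReturnnTrainBpeJob(
    text_file=tk.Path("/path/to/train_text.gz"),
    bpe_size=10000,
    unk_label="<unk>",
    subword_nmt_repo=subword_nmt_repo,
)
apply_bpe = ApplyBPEToTextJob(
    text_file=tk.Path("/path/to/train_text.gz"),
    bpe_codes=train_bpe.out_bpe_codes,
    bpe_vocab=train_bpe.out_bpe_dummy_count_vocab,  # revert merges that would produce OOV
    subword_nmt_repo=subword_nmt_repo,
)
tk.register_output("text/train.bpe.gz", apply_bpe.out_bpe_text)  # assumed out_* name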
i6_core.cart.estimate¶
- class i6_core.cart.estimate.AccumulateCartStatisticsJob(*args, **kwargs)¶
Goes over all training data and for each triphone state accumulates the values and squared values of the given feature flow
- Parameters:
crp (rasr.crp.CommonRasrParameters) –
alignment_flow (rasr.flow.FlowNetwork) –
keep_accumulators (bool) –
extra_config_accumulate (rasr.config.RasrConfig) –
extra_post_config_accumulate (rasr.config.RasrConfig) –
extra_config_merge (rasr.config.RasrConfig) –
extra_post_config_merge (rasr.config.RasrConfig) –
- accumulate(task_id)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_accumulate_config(crp, alignment_flow, extra_config_accumulate, extra_post_config_accumulate, **kwargs)¶
- Parameters:
crp (rasr.crp.CommonRasrParameters) –
alignment_flow (rasr.flow.FlowNetwork) –
extra_config_accumulate (rasr.config.RasrConfig) –
extra_post_config_accumulate (rasr.config.RasrConfig) –
kwargs –
- Returns:
- Return type:
- create_files()¶
- classmethod create_merge_config(crp, extra_config_merge, extra_post_config_merge, **kwargs)¶
- Parameters:
crp (rasr.crp.CommonRasrParameters) –
extra_config_merge (rasr.config.RasrConfig) –
extra_post_config_merge (rasr.config.RasrConfig) –
kwargs –
- Returns:
- Return type:
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- merge()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.cart.estimate.EstimateCartJob(*args, **kwargs)¶
This job estimates a phonetic decision tree. Given a set of accumulated (squared) feature values, a single Gaussian model is estimated per triphone state. Then states are iteratively merged according to the provided questions such that the loss in log-likelihood of the resulting models is minimized. Finally, states which have a low number of occurrences are merged into the closest cluster.
- Parameters:
crp (rasr.crp.CommonRasrParameters) –
questions (Path|BasicCartQuestions|str) – Either a Path to a questions.xml file, a question object or simply a str
cart_examples (Path) –
variance_clipping (float) –
generate_cluster_file (bool) –
extra_config (rasr.config.RasrConfig) –
extra_post_config (rasr.config.RasrConfig) –
- cleanup_before_run(*args)¶
- classmethod create_config(crp, questions, cart_examples, variance_clipping, generate_cluster_file, extra_config, extra_post_config, **kwargs)¶
- Parameters:
crp (rasr.crp.CommonRasrParameters) –
questions (Path|BasicCartQuestions|str) –
cart_examples (Path) –
variance_clipping (float) –
generate_cluster_file (bool) –
extra_config (rasr.config.RasrConfig) –
extra_post_config (rasr.config.RasrConfig) –
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.cart.questions¶
- class i6_core.cart.questions.BasicCartQuestions(phoneme_path, max_leaves, min_obs)¶
- get_questions()¶
- load_phonemes_from_file()¶
- write_to_file(file)¶
- class i6_core.cart.questions.BeepCartQuestions(include_central_phoneme=True, *args, **kwargs)¶
- get_questions()¶
i6_core.corpus.convert¶
- class i6_core.corpus.convert.CorpusReplaceOrthFromReferenceCorpus(*args, **kwargs)¶
Copies the orth tag from one corpus to another through matching segment names.
- Parameters:
bliss_corpus – Corpus in which the orth tag is to be replaced
reference_bliss_corpus – Corpus from which the orth tag replacement is taken
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.convert.CorpusReplaceOrthFromTxtJob(*args, **kwargs)¶
Merge raw text back into a bliss corpus
- Parameters:
bliss_corpus (Path) – Bliss corpus
text_file (Path) – a raw or gzipped text file
segment_file (Path|None) – only replace the segments as specified in the segment file
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.convert.CorpusToStmJob(*args, **kwargs)¶
Convert a Bliss corpus into a .stm file
- Parameters:
bliss_corpus – Path to Bliss corpus
exclude_non_speech – non speech tokens should be removed
non_speech_tokens – defines the list of non speech tokens
remove_punctuation – should punctuation be removed
punctuation_tokens – defines list/string of punctuation tokens
fix_whitespace – should white space be fixed. !!!be aware that the corpus loading already fixes white space!!!
name – new corpus name
tag_mapping – each entry is a 3-string tuple (“short name”, “long name”, “description”) for a tag, and the Dict[int, tk.Path] is e.g. the out_single_segment_files of a FilterSegments*Job
- classmethod replace_recursive(orthography, token)¶
Recursion is required to find repeated tokens; string.replace is not sufficient. Some other solution might also work.
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.convert.CorpusToTextDictJob(*args, **kwargs)¶
Extract the Text from a Bliss corpus to fit a “{key: text}” structure (e.g. for RETURNN)
- Parameters:
bliss_corpus (Path) – bliss corpus file
segment_file (Path|None) – a segment file as optional whitelist
invert_match (bool) – use segment file as blacklist (needs to contain full segment names then)
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.convert.CorpusToTxtJob(*args, **kwargs)¶
Extract orth from a Bliss corpus and store as raw txt or gzipped txt
- Parameters:
bliss_corpus (Path) – Bliss corpus
segment_file (Path) – segment file
gzip (bool) – gzip the output text file
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
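Example (not part of the generated reference): a minimal sketch dumping the orthography of a bliss corpus to a gzipped text file with CorpusToTxtJob. The path is a placeholder and the output attribute name is an assumption (out_* convention).
from sisyphus import tk
from i6_core.corpus.convert import CorpusToTxtJob

txt_job = CorpusToTxtJob(
    bliss_corpus=tk.Path("/path/to/corpus.xml.gz"),
    gzip=True,
)
tk.register_output("text/corpus.txt.gz", txt_job.out_txt)  # assumed out_* name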
i6_core.corpus.costa¶
- class i6_core.corpus.costa.CostaJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, eval_recordings, eval_lm, extra_config, extra_post_config)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.corpus.data_augmentation¶
- class i6_core.corpus.data_augmentation.ChangeCorpusSpeedJob(*args, **kwargs)¶
Changes the speed of all audio files in the corpus (shifting time AND frequency)
- Parameters:
bliss_corpus (Path) – Bliss corpus
corpus_name (str) – name of the new corpus
speed_factor (float) – relative speed factor
base_frequency (int) – sampling rate of the audio files
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
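Example (not part of the generated reference): a minimal sketch of classic 3-way speed perturbation using ChangeCorpusSpeedJob. The path is a placeholder and the output attribute name is an assumption; the resulting corpora can then be combined with the original data, e.g. via MergeCorporaJob from i6_core.corpus.transform (documented below).
from sisyphus import tk
from i6_core.corpus.data_augmentation import ChangeCorpusSpeedJob

base_corpus = tk.Path("/path/to/train.corpus.xml.gz")
perturbed_corpora = []
for factor in [0.9, 1.1]:
    speed_job = ChangeCorpusSpeedJob(
        bliss_corpus=base_corpus,
        corpus_name=f"train_sp{factor}",
        speed_factor=factor,
        base_frequency=16000,
    )
    perturbed_corpora.append(speed_job.out_corpus)  # assumed out_* name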
- class i6_core.corpus.data_augmentation.SelfNoiseCorpusJob(*args, **kwargs)¶
Add noise to each recording in the corpus. The noise consists of audio data from other recordings in the corpus and is reduced by the given SNR. Only supports .wav files
WARNING: This Job uses /dev/shm for performance reasons, please be cautious
- Parameters:
bliss_corpus (Path) – Bliss corpus with wav files
snr (float) – signal to noise ratio in db, positive values only
corpus_name (str) – name of the new corpus
n_noise_tracks (int) – number of random (parallel) utterances to add
seed (int) – seed for random utterance selection
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.corpus.filter¶
- class i6_core.corpus.filter.FilterCorpusBySegmentDurationJob(*args, **kwargs)¶
- Parameters:
bliss_corpus – path of the corpus file
min_duration – minimum duration for a segment to keep (in seconds)
max_duration – maximum duration for a segment to keep (in seconds)
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.filter.FilterCorpusBySegmentsJob(*args, **kwargs)¶
- Parameters:
bliss_corpus –
segment_file – a single segment file or a list of segment files
compressed –
invert_match –
delete_empty_recordings – if true, empty recordings will be removed
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.filter.FilterCorpusRemoveUnknownWordSegmentsJob(*args, **kwargs)¶
Filter segments of a bliss corpus if there are unknowns with respect to a given lexicon
- Parameters:
bliss_corpus –
bliss_lexicon –
case_sensitive – consider casing for check against lexicon
all_unknown – all words have to be unknown in order for the segment to be discarded
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
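Example (not part of the generated reference): a minimal sketch removing segments with out-of-vocabulary words using FilterCorpusRemoveUnknownWordSegmentsJob. Paths are placeholders; the output attribute name is an assumption.
from sisyphus import tk
from i6_core.corpus.filter import FilterCorpusRemoveUnknownWordSegmentsJob

filter_job = FilterCorpusRemoveUnknownWordSegmentsJob(
    bliss_corpus=tk.Path("/path/to/corpus.xml.gz"),
    bliss_lexicon=tk.Path("/path/to/lexicon.xml.gz"),
    case_sensitive=False,
)
tk.register_output("corpora/no_oov.xml.gz", filter_job.out_corpus)  # assumed out_* name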
- class i6_core.corpus.filter.FilterSegmentsByAlignmentConfidenceJob(*args, **kwargs)¶
- Parameters:
alignment_logs – alignment_job.out_log_file; task_id -> log_file
percentile – percent of alignment segments to keep. Should be in (0,100], passed to np.percentile()
crp – used to set the number of output segments. If None, the number of alignment log files is used instead.
plot – plot the distribution of alignment scores
absolute_threshold – alignments with score above this number are discarded
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.filter.FilterSegmentsByListJob(*args, **kwargs)¶
Filters a segment list file using a given list of segments, which is used either as a black list or as a white list
- Parameters:
segment_files – original segment list files to be filtered
filter_list – list used for filtering, or a path to a text file with the entries of that list, one per line
invert_match – black list (if False) or white list (if True) usage
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.filter.FilterSegmentsByRegexJob(*args, **kwargs)¶
Filters a segment list file using a given regular expression
- Parameters:
segment_files – original segment list files to be filtered
filter_regex – regex used for filtering
invert_match – keep segment if regex does not match (if False) or does match (if True)
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.corpus.segments¶
- class i6_core.corpus.segments.DynamicSplitSegmentFileJob(*args, **kwargs)¶
Split the segment file into as many shares as given by concurrent. This is a variant of the existing SplitSegmentFileJob that requires a tk.Delayed variable (instead of an int) for the argument concurrent.
- Parameters:
segment_file (tk.Path|str) – segment file
concurrent (tk.Delayed) – number of splits
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.SegmentCorpusByRegexJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.SegmentCorpusBySpeakerJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.SegmentCorpusJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.ShuffleAndSplitSegmentsJob(*args, **kwargs)¶
- default_split = {'dev': 0.1, 'train': 0.9}¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.SortSegmentsByLengthAndShuffleJob(*args, **kwargs)¶
- Parameters:
crp – rasr.crp.CommonRasrParameters
shuffle_strength – float in [0, inf) that determines how much the length should affect sorting: 0 -> completely random; inf -> strictly sorted
shuffle_seed – random number seed
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.SplitSegmentFileJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.UpdateSegmentsWithSegmentMapJob(*args, **kwargs)¶
Update a segment file with a segment mapping file (e.g. from corpus compression)
- Parameters:
segment_file (Path) – path to the segment text file (uncompressed)
segment_map (Path) – path to the segment map (gz or uncompressed)
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.corpus.speaker¶
- class i6_core.corpus.speaker.CorpusAddSpeakerTagsFromMappingJob(*args, **kwargs)¶
Adds speaker tags to a corpus from a given mapping defined by a dictionary
- Parameters:
corpus – Corpus to add tags to
mapping – pickled dictionary that defines a mapping corpus fullname -> speaker id
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.speaker.CorpusRemoveSpeakerTagsJob(*args, **kwargs)¶
Remove speaker tags from given corpus
- Parameters:
corpus – Corpus to remove the tags from
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.corpus.stats¶
- class i6_core.corpus.stats.CountCorpusWordFrequenciesJob(*args, **kwargs)¶
Extracts a list of words and their counts in the provided bliss corpus
- Parameters:
bliss_corpus (Path) – path to corpus file
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.stats.ExtractOovWordsFromCorpusJob(*args, **kwargs)¶
Extracts the out of vocabulary words based on a given corpus and lexicon
- Parameters:
bliss_corpus (Union[Path, str]) – path to corpus file
bliss_lexicon (Union[Path, str]) – path to lexicon
casing (str) – changes the casing of the orthography (options: upper, lower, none). Note that str.upper() is problematic for German since ß -> SS, see https://bugs.python.org/issue34928
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.corpus.transform¶
- class i6_core.corpus.transform.AddCacheToCorpusJob(*args, **kwargs)¶
Adds a cache manager call to all audio paths in a corpus file
- Parameters:
bliss_corpus (Path) – bliss corpus file path
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.transform.ApplyLexiconToCorpusJob(*args, **kwargs)¶
Use a bliss lexicon to convert all words in a bliss corpus into their phoneme representation.
Currently only supports picking the first phoneme.
- Parameters:
bliss_corpus (Path) – path to a bliss corpus xml
bliss_lexicon (Path) – path to a bliss lexicon file
word_separation_orth (str|None) – a default word separation lemma orth. The corresponding phoneme (or phonemes in some special cases) is inserted between each word. Usually it makes sense to use something like “[SILENCE]” or “[space]”.
strategy (LexiconStrategy) – strategy to determine which representation is selected
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.transform.CompressCorpusJob(*args, **kwargs)¶
Compresses a corpus by concatenating audio files and using a compression codec. Currently does not support corpora with subcorpora; files need to be .wav
- Parameters:
bliss_corpus (Path) – path to an xml corpus file with wave recordings
format (str) – supported file formats, currently limited to mp3
bitrate (str) – bitrate as string, e.g. ‘32k’ or ‘192k’, can also be an integer e.g. 192000
max_num_splits (int) – maximum number of resulting audio files
- add_duration_to_recordings(c)¶
open each recording, extract the duration and add the duration to the recording object. TODO: this is a lengthy operation, but so far there was no alternative…
- Parameters:
c (corpus.Corpus) –
- Returns:
- info()¶
read the log.run file to extract the current status of the compression job
- Returns:
- run()¶
- run_ffmpeg(ffmpeg_inputs, output_path)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.transform.MergeCorporaJob(*args, **kwargs)¶
Merges Bliss Corpora files into a single file as subcorpora or flat
- Parameters:
bliss_corpora (Iterable[Path]) – any iterable of bliss corpora file paths to merge
name (str) – name of the new corpus (subcorpora will keep the original names)
merge_strategy (MergeStrategy) – how the corpora should be merged, e.g. as subcorpora or flat
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.transform.MergeCorpusSegmentsAndAudioJob(*args, **kwargs)¶
This job merges segments and audio files based on a rasr cluster map and a list of cluster_names. The cluster map should map segments to something like cluster.XXX where XXX is a natural number (starting with 1). The lines in the cluster_names file will be used as names for the recordings in the new corpus.
The job outputs a new corpus file + the corresponding audio files.
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.transform.MergeStrategy(value)¶
An enumeration.
- CONCATENATE = 2¶
- FLAT = 1¶
- SUBCORPORA = 0¶
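Example (not part of the generated reference): a minimal sketch merging several bliss corpora into one flat corpus with MergeCorporaJob, e.g. to combine speed-perturbed variants with the original training data. Paths are placeholders; the output attribute name is an assumption.
from sisyphus import tk
from i6_core.corpus.transform import MergeCorporaJob, MergeStrategy

merge_job = MergeCorporaJob(
    bliss_corpora=[
        tk.Path("/path/to/train.corpus.xml.gz"),
        tk.Path("/path/to/train_sp0.9.corpus.xml.gz"),
        tk.Path("/path/to/train_sp1.1.corpus.xml.gz"),
    ],
    name="train-merged",
    merge_strategy=MergeStrategy.FLAT,
)
tk.register_output("corpora/train-merged.xml.gz", merge_job.out_merged_corpus)  # assumed out_* name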
- class i6_core.corpus.transform.ReplaceTranscriptionFromCtmJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.transform.ShiftCorpusSegmentStartJob(*args, **kwargs)¶
Shifts the start time of a corpus to change the fft window offset
- Parameters:
bliss_corpus (Path) – path to a bliss corpus file
corpus_name (str) – name of the new corpus
shift (int) – shift in seconds
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.datasets.huggingface¶
https://huggingface.co/docs/datasets/
- class i6_core.datasets.huggingface.DownloadAndPrepareHuggingFaceDatasetJob(*args, **kwargs)¶
https://huggingface.co/docs/datasets/ https://huggingface.co/datasets
pip install datasets
Basically wraps
datasets.load_dataset(...).save_to_disk(out_dir)
Example for Librispeech:
DownloadAndPrepareHuggingFaceDatasetJob("librispeech_asr", "clean")
https://github.com/huggingface/datasets/issues/4179
- Parameters:
path – Path or name of the dataset, parameter passed to Dataset.load_dataset
name – Name of the dataset configuration, parameter passed to Dataset.load_dataset
data_files – Path(s) to the source data file(s), parameter passed to Dataset.load_dataset
revision – Version of the dataset script, parameter passed to Dataset.load_dataset
time_rqmt (float) –
mem_rqmt (float) –
cpu_rqmt (int) –
mini_task (bool) – the job should be run as mini_task
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
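Example (not part of the generated reference): a minimal sketch of the Librispeech example above as part of a Sisyphus config. The requirement values are placeholders and the output attribute name is an assumption (the save_to_disk target directory).
from sisyphus import tk
from i6_core.datasets.huggingface import DownloadAndPrepareHuggingFaceDatasetJob

ls_clean = DownloadAndPrepareHuggingFaceDatasetJob(
    path="librispeech_asr",
    name="clean",
    time_rqmt=24,
    mem_rqmt=8,
    cpu_rqmt=2,
)
tk.register_output("datasets/librispeech_asr_clean", ls_clean.out_dir)  # assumed output name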
i6_core.datasets.librispeech¶
- class i6_core.datasets.librispeech.DownloadLibriSpeechCorpusJob(*args, **kwargs)¶
Downloads a part of the LibriSpeech corpus from https://www.openslr.org/resources/12 and checks for file integrity via md5sum
(see also: https://www.openslr.org/12/)
To get the corpus metadata, use DownloadLibriSpeechMetadataJob
self.out_corpus_folder links to the root of the speaker_id/chapter/* folder structure
- Parameters:
corpus_key (str) – corpus identifier, e.g. “train-clean-100”
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.librispeech.DownloadLibriSpeechMetadataJob(*args, **kwargs)¶
Downloads the metadata file and checks for md5sum integrity
Defines outputs for “SPEAKERS.TXT, CHAPTERS.TXT and BOOKS.TXT”
- Parameters:
corpus_key (str) – corpus identifier, e.g. “train-clean-100”
- class i6_core.datasets.librispeech.LibriSpeechCreateBlissCorpusJob(*args, **kwargs)¶
Creates a Bliss corpus from a LibriSpeech corpus folder using the speaker information in addition
Outputs a single bliss .xml.gz file
- Parameters:
corpus_folder (Path) – Path to a LibriSpeech corpus folder
speaker_metadata (Path) – Path to the SPEAKERS.TXT file from DownloadLibriSpeechMetadataJob (out_speakers)
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
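Example (not part of the generated reference): a minimal sketch of the typical chain to obtain a LibriSpeech bliss corpus for one subset. out_corpus_folder and out_speakers are the outputs documented above; the bliss output attribute name is an assumption, and the SPEAKERS.TXT path is a placeholder.
from sisyphus import tk
from i6_core.datasets.librispeech import (
    DownloadLibriSpeechCorpusJob,
    LibriSpeechCreateBlissCorpusJob,
)

download_job = DownloadLibriSpeechCorpusJob(corpus_key="train-clean-100")
bliss_job = LibriSpeechCreateBlissCorpusJob(
    corpus_folder=download_job.out_corpus_folder,          # documented output above
    speaker_metadata=tk.Path("/path/to/SPEAKERS.TXT"),     # e.g. out_speakers of DownloadLibriSpeechMetadataJob
)
tk.register_output("corpora/train-clean-100.xml.gz", bliss_job.out_corpus)  # assumed out_* name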
i6_core.datasets.ljspeech¶
- class i6_core.datasets.ljspeech.DownloadLJSpeechCorpusJob(*args, **kwargs)¶
Downloads, checks and extracts the LJSpeech corpus.
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.ljspeech.LJSpeechCreateBlissCorpusJob(*args, **kwargs)¶
Generate a Bliss xml from the downloaded LJspeech dataset
- Parameters:
metadata (Path) – path to metadata.csv
audio_folder (Path) – path to the wavs folder
name – overwrite default corpus name
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.datasets.switchboard¶
Switchboard is conversational telephony speech with 8 kHz audio files. The training data consists of 300 hours. Reference: https://catalog.ldc.upenn.edu/LDC97S62
Number of recordings: 4876, number of segments: 249624, number of speakers: 2260
- class i6_core.datasets.switchboard.CreateFisherTranscriptionsJob(*args, **kwargs)¶
Create the compressed text data based on the fisher transcriptions which can be used for LM training
Part 1: https://catalog.ldc.upenn.edu/LDC2004T19 Part 2: https://catalog.ldc.upenn.edu/LDC2005T19
- Parameters:
fisher_transcriptions1_folder – path to unpacked LDC2004T19.tgz, usually named fe_03_p1_tran
fisher_transcriptions2_folder – path to unpacked LDC2005T19.tgz, usually named fe_03_p2_tran
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.switchboard.CreateHub5e00CorpusJob(*args, **kwargs)¶
Creates the switchboard hub5e_00 corpus based on LDC2002S09 No speaker information attached
- Parameters:
wav_audio_folder – output of SwitchboardSphereToWave called on extracted LDC2002S09.tgz
hub5_transcriptions – extracted LDC2002T43.tgz named “2000_hub5_eng_eval_tr”
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.switchboard.CreateHub5e01CorpusJob(*args, **kwargs)¶
Creates the switchboard hub5e_01 corpus based on LDC2002S13
This corpus provides no GLM; the same one as for Hub5e00 should be used
No speaker information attached
- Parameters:
wav_audio_folder – output of SwitchboardSphereToWave called on extracted LDC2002S13.tgz
hub5e01_folder – extracted LDC2002S13 named “hub5e_01”
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.switchboard.CreateLDCSwitchboardSpeakerListJob(*args, **kwargs)¶
This creates the speaker list according to the conversation and speaker table from the LDC documentation: https://catalog.ldc.upenn.edu/docs/LDC97S62
- The resulting file contains 520 speakers in the format of:
<speaker_id> <gender> <recording>
- Parameters:
caller_tab_file – caller_tab.csv from the Switchboard LDC documentation
conv_tab_file – conv_tab.csv from the Switchboard LDC documentation
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.switchboard.CreateRT03sCTSCorpusJob(*args, **kwargs)¶
Create the RT03 test set corpus, specifically the “CTS” subset of LDC2007S10
No speaker information attached
- Parameters:
wav_audio_folder – output of SwitchboardSphereToWave called on extracted LDC2007S10.tgz
rt03_folder – extracted LDC2007S10.tgz
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.switchboard.CreateSwitchboardBlissCorpusJob(*args, **kwargs)¶
Creates Switchboard bliss corpus xml
segment name format: sw2001B-ms98-a-<folder-name>
- Parameters:
audio_dir (tk.Path) – path for audio data
trans_dir (tk.Path) – path for transcription data. see DownloadSwitchboardTranscriptionAndDictJob
speakers_list_file (tk.Path) –
- path to a speakers list text file with format:
speaker_id gender recording<channel>, e.g. 1005 F 2452A
on each line. see CreateSwitchboardSpeakersListJob job
skip_empty_ldc_file (bool) – In the original corpus the sequence 2167B is mostly empty, thus exclude it from training (recommended, GMM will fail otherwise)
lowercase (bool) – lowercase the transcriptions of the corpus (recommended)
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.switchboard.CreateSwitchboardLexiconTextFileJob(*args, **kwargs)¶
This job creates the preprocessed SWB dictionary text file, consistent with the training corpus, given a raw dictionary text file downloaded within the transcription directory by the DownloadSwitchboardTranscriptionAndDictJob job. The resulting dictionary text file is passed as an argument to the LexiconFromTextFileJob job in order to create a bliss xml lexicon.
- Parameters:
raw_dict_file (tk.Path) – path containing the raw dictionary text file
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.switchboard.CreateSwitchboardSpeakersListJob(*args, **kwargs)¶
- Given some speakers statistics info, this job creates a text file having on each line:
speaker_id gender recording
- Parameters:
speakers_stats_file (tk.Path) – speakers stats text file
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.switchboard.CreateSwitchboardSpokenFormBlissCorpusJob(*args, **kwargs)¶
Creates a special spoken form version of switchboard-1 used for e.g. BPE or Sentencepiece based models. It includes:
make sure everything is lowercased
conversion of numbers to written form (using a given conversion table)
conversion of some short forms into spoken forms (also using the table)
making special tokens uppercase again
- Parameters:
switchboard_bliss_corpus – out_corpus of CreateSwitchboardBlissCorpusJob
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.switchboard.DownloadSwitchboardSpeakersStatsJob(*args, **kwargs)¶
Note that this does not contain the speaker info for all recordings. We assume later that each recording has a unique speaker and a unique id is used for those recordings with unknown speakers info
- Parameters:
url (str) –
target_filename (str|None) – explicit output filename, if None tries to detect the filename from the url
checksum (str|None) – A sha256 checksum to verify the file
- classmethod hash(parsed_args)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- class i6_core.datasets.switchboard.DownloadSwitchboardTranscriptionAndDictJob(*args, **kwargs)¶
Downloads switchboard training transcriptions and dictionary (or lexicon)
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.switchboard.SwitchboardSphereToWaveJob(*args, **kwargs)¶
Takes an audio folder from one of the switchboard LDC folders and converts dual channel .sph files with mulaw encoding to single channel .wav files with s16le encoding
- Parameters:
sph_audio_folder –
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.datasets.tedlium2¶
- class i6_core.datasets.tedlium2.CreateTEDLIUM2BlissCorpusJob(*args, **kwargs)¶
Processes stm files from TEDLIUM2 corpus folders and creates Bliss corpus files. Outputs an stm file and a bliss .xml.gz file for each of the train/dev/test sets
- Parameters:
corpus_folders (Dict) – {corpus_key: Path}
- load_stm_data(stm_file)¶
- Parameters:
stm_file (str) –
- make_corpus()¶
create bliss corpus from stm file (always include speakers)
- make_stm()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.datasets.tedlium2.DownloadTEDLIUM2CorpusJob(*args, **kwargs)¶
Download full TED-LIUM Release 2 corpus from https://projets-lium.univ-lemans.fr/wp-content/uploads/corpus/TED-LIUM/ (all train/dev/test/LM/dictionary data included)
- process_dict()¶
minor modification on the dictionary (see comments)
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.datasets.tf_datasets¶
This module adds jobs for TF datasets, as documented here: https://www.tensorflow.org/datasets
- class i6_core.datasets.tf_datasets.DownloadAndPrepareTfDatasetJob(*args, **kwargs)¶
This job downloads and prepares a TF dataset. The processed files are stored in a data_dir folder, from where it can be loaded again (see https://www.tensorflow.org/datasets/overview#load_a_dataset)
Install the dependencies:
pip install tensorflow-datasets
It further needs some extra dependencies, for example for ‘librispeech’:
pip install apache_beam
pip install pydub  # ffmpeg installed
See here for some more: https://github.com/tensorflow/datasets/blob/master/setup.py
Also maybe:
pip install datasets # for Huggingface community datasets
- Parameters:
dataset_name – Name of the dataset in the official TF catalog or community catalog. Available datasets can be found here: https://www.tensorflow.org/datasets/overview https://www.tensorflow.org/datasets/catalog/overview https://www.tensorflow.org/datasets/community_catalog/huggingface
max_simultaneous_downloads – simultaneous downloads for tfds.load, some datasets might not work with the internal defaults, so use e.g. 1 in the case of librispeech. (https://github.com/tensorflow/datasets/issues/3885)
max_workers – max workers for download extractor and Apache Beam, the default (cpu core count) might cause high memory load, so reduce this to a number smaller than the number of cores. (https://github.com/tensorflow/datasets/issues/3887)
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.deprecated.returnn_extract_prior¶
i6_core.deprecated.returnn_search¶
- class i6_core.deprecated.returnn_search.ReturnnSearchJob(*args, **kwargs)¶
Given a model checkpoint, run search task with RETURNN
- Parameters:
search_data (dict[str]) – dataset used for search
model_checkpoint (Checkpoint) – TF model checkpoint. see ReturnnTrainingJob.
returnn_config (ReturnnConfig) – object representing RETURNN config
output_mode (str) – “txt” or “py”
log_verbosity (int) – RETURNN log verbosity
device (str) – RETURNN device, cpu or gpu
time_rqmt (float|int) – job time requirement in hours
mem_rqmt (float|int) – job memory requirement in GB
cpu_rqmt (float|int) – job CPU requirement (number of CPUs)
returnn_python_exe (tk.Path|str|None) – path to the RETURNN executable (python binary or launch script)
returnn_root (tk.Path|str|None) – path to the RETURNN src folder
- create_files()¶
- classmethod create_returnn_config(search_data, model_checkpoint, returnn_config, output_mode, log_verbosity, device, **kwargs)¶
Creates search RETURNN config
- Parameters:
search_data (dict[str]) – dataset used for search
model_checkpoint (Checkpoint) – TF model checkpoint. see ReturnnTrainingJob.
returnn_config (ReturnnConfig) – object representing RETURNN config
output_mode (str) – “txt” or “py”
log_verbosity (int) – RETURNN log verbosity
device (str) – RETURNN device, cpu or gpu
- Return type:
ReturnnConfig
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.discriminative_training.lattice_generation¶
i6_core.features.common¶
- i6_core.features.common.add_derivatives(feature_net, derivatives=1)¶
- i6_core.features.common.add_linear_transform(feature_net, matrix_path)¶
- i6_core.features.common.basic_cache_flow(cache_files)¶
- i6_core.features.common.cepstrum_flow(normalize=True, outputs=16, add_epsilon=False, epsilon=1.175494e-38)¶
- i6_core.features.common.external_file_feature_flow(flow_file)¶
- i6_core.features.common.feature_extraction_cache_flow(feature_net, port_name_mapping, one_dimensional_outputs=None)¶
- Parameters:
feature_net (rasr.FlowNetwork) – feature flow to extract features from
port_name_mapping (dict[str,str]) – maps output ports to names of the cache files
one_dimensional_outputs (set[str]|None) – output ports that return one-dimensional features (e.g. energy)
- Return type:
rasr.FlowNetwork
- i6_core.features.common.fft_flow(preemphasis=1.0, window_type='hamming', window_shift=0.01, window_length=0.025)¶
- i6_core.features.common.make_first_feature_energy(feature_net)¶
- i6_core.features.common.normalize_features(feature_net, length='infinite', right='infinite', norm_type='mean-and-variance')¶
Add normalization of the specified type to the feature flow
- Parameters:
feature_net (rasr.FlowNetwork) – the unnormalized flow network, must have an output named ‘features’
length (int|str) – length of the normalization window in frames (or ‘infinite’)
right (int|str) – number of frames right of the current position in the normalization window (can also be ‘infinite’)
norm_type (str) – type of normalization, possible values are ‘level’, ‘mean’, ‘mean-and-variance’, ‘mean-and-variance-1D’, ‘divide-by-mean’, ‘mean-norm’
- Returns:
input FlowNetwork with a signal-normalization node before the output
- Return type:
rasr.FlowNetwork
- i6_core.features.common.raw_audio_flow(audio_format='wav')¶
- i6_core.features.common.samples_flow(audio_format='wav', dc_detection=True, dc_params={'max-dc-increment': 0.9, 'min-dc-length': 0.01, 'min-non-dc-segment-length': 0.021}, input_options=None, scale_input=None)¶
Create a flow to read samples from audio files, convert it to f32 and apply optional dc-detection.
Files that do not have a native input node will be opened with the ffmpeg flow node. Please check if scaling is needed.
- Native input formats are:
wav
nist
flac
mpeg (mp3)
gsm
htk
phondat
oss
For more information see: https://www-i6.informatik.rwth-aachen.de/rwth-asr/manual/index.php/Audio_Nodes
- Parameters:
audio_format (str) – the input audio format
dc_detection (bool) – enable dc-detection node
dc_params (dict) – optional dc-detection node parameters
input_options (dict) – additional options for the input node
scale_input (int|float|None) – scale the waveform samples, this might be needed to scale ogg inputs by 2**15 to support feature flows designed for 16-bit wav inputs
- Returns:
- i6_core.features.common.select_features(feature_net, select_range)¶
- i6_core.features.common.sync_energy_features(feature_net, energy_net)¶
- i6_core.features.common.sync_features(feature_net, target_net, feature_output='features', target_output='features')¶
i6_core.features.energy¶
- i6_core.features.energy.EnergyJob(crp: CommonRasrParameters, energy_options: Optional[Dict[str, Any]] = None, **kwargs) FeatureExtractionJob ¶
- i6_core.features.energy.energy_flow(without_samples: bool = False, samples_options: Optional[Dict[str, Any]] = None, fft_options: Optional[Dict[str, Any]] = None, normalization_type: str = 'divide-by-mean') FlowNetwork ¶
- Parameters:
without_samples –
samples_options – arguments to
sample_flow()
fft_options – arguments to
fft_flow()
normalization_type (str) –
i6_core.features.extraction¶
- class i6_core.features.extraction.FeatureExtractionJob(*args, **kwargs)¶
Runs feature extraction of a given corpus into cache files
The cache files can be accessed as bundle Path (out_feature_bundle) or as MultiOutputPath (out_feature_path)
- Parameters:
crp (rasr.crp.CommonRasrParameters) – common RASR parameters
feature_flow (rasr.flow.FlowNetwork) – feature flow for feature foraging
port_name_mapping (dict[str,str]) – mapping between output ports (key) and name of the features (value)
one_dimensional_outputs (set[str]|None) – set of output ports with one dimensional features
job_name (str) – name used in sisyphus visualization and job folder name
rtf (float) – real-time-factor of the feature-extraction
mem (int) – memory required for the job
parallel (int) – maximum number of tasks running in parallel
indirect_write (bool) – if true will write to temporary directory first before copying to output folder
extra_config (rasr.config.RasrConfig|None) – additional RASR config merged into the final config
extra_post_config (rasr.config.RasrConfig|None) – additional RASR config that will not be part of the hash
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, feature_flow, extra_config, extra_post_config, **kwargs)¶
- Parameters:
crp (rasr.crp.CommonRasrParameters) –
feature_flow –
extra_config (rasr.config.RasrConfig|None) –
extra_post_config (rasr.config.RasrConfig|None) –
- Returns:
config, post_config
- Return type:
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.features.filterbank¶
- i6_core.features.filterbank.FilterbankJob(crp, filterbank_options=None, **kwargs)¶
- Parameters:
crp (rasr.crp.CommonRasrParameters) –
filterbank_options (dict[str, Any]|None) –
- Returns:
Feature extraction job with filterbank flow
- Return type:
- i6_core.features.filterbank.filter_width_from_channels(channels, warping_function='mel', f_max=8000, f_min=0)¶
- Per default we use FilterBank::stretchToCover, which computes the number of filters as:
number_of_filters = (maximumFrequency_ - minimumFrequency_ - filterWidth_) / spacing_ + 1
- Parameters:
channels (int) – Number of channels of the filterbank
warping_function (str) – Warping function used by the filterbank. [‘mel’, ‘bark’]
f_max (float) – Filters are placed only below this frequency in Hz. The physical maximum is half of the audio sample rate, but lower values make possibly more sense.
f_min (float) – Filters are placed only over this frequency in Hz
- Returns:
filter-width
- Return type:
float
- i6_core.features.filterbank.filterbank_flow(warping_function='mel', filter_width=70, normalize=True, normalization_options=None, without_samples=False, samples_options=None, fft_options=None, apply_log=False, add_epsilon=False, add_features_output=False)¶
- Parameters:
warping_function (str) – “mel” or “bark”
filter_width (int) – filter width in Hz. Please use
filter_width_from_channels()
to get N filters.
normalize (bool) – add a final signal-normalization node
normalization_options (dict[str, Any]|None) – option dict for signal-normalization flow node
without_samples (bool) – creates the flow network without a sample flow, but expects “samples” as input
samples_options (dict[str, Any]|None) – parameter dict for
samples_flow()
fft_options (dict[str, Any]|None) – parameter dict for
fft_flow()
apply_log (bool) – adds a logarithm before normalization
add_epsilon (bool) – if a logarithm should be applied, add a small epsilon to prohibit zeros
add_features_output (bool) – Add the output port “features”. This should be set to True, default is False to not break existing hash.
- Returns:
filterbank flow network
- Return type:
rasr.FlowNetwork
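Example (not part of the generated reference): a minimal sketch building a log-mel filterbank flow where the filter width is derived from a desired number of channels, as recommended in the filter_width parameter description above. The channel count and frequency limit are arbitrary example values.
from i6_core.features.filterbank import filterbank_flow, filter_width_from_channels

# derive the filter width for 80 mel filters placed below 8 kHz
filter_width = filter_width_from_channels(channels=80, warping_function="mel", f_max=8000)
fb_flow = filterbank_flow(
    warping_function="mel",
    filter_width=filter_width,
    normalize=True,
    apply_log=True,
    add_epsilon=True,
)
# fb_flow is a rasr.FlowNetwork that can e.g. be passed to FeatureExtractionJob (see above)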
i6_core.features.gammatone¶
- i6_core.features.gammatone.GammatoneJob(crp: CommonRasrParameters, gt_options: Optional[Dict[str, Any]] = None, **kwargs) FeatureExtractionJob ¶
- i6_core.features.gammatone.gammatone_flow(minfreq: int = 100, maxfreq: int = 7500, channels: int = 68, warp_freqbreak: Optional[int] = None, tempint_type: str = 'hanning', tempint_shift: float = 0.01, tempint_length: float = 0.025, flush_before_gap: bool = True, do_specint: bool = True, specint_type: str = 'hanning', specint_shift: int = 4, specint_length: int = 9, normalize: bool = True, preemphasis: bool = True, legacy_scaling: bool = False, without_samples: bool = False, samples_options: Optional[Dict[str, Any]] = None, normalization_options: Optional[Dict[str, Any]] = None, add_features_output: bool = False) FlowNetwork ¶
- Parameters:
minfreq –
maxfreq –
channels –
warp_freqbreak –
tempint_type –
tempint_shift –
tempint_length –
flush_before_gap –
do_specint –
specint_type –
specint_shift –
specint_length –
normalize –
preemphasis –
legacy_scaling –
without_samples –
samples_options – arguments to
sample_flow()
normalization_options –
add_features_output –
i6_core.features.mfcc¶
- i6_core.features.mfcc.MfccJob(crp: CommonRasrParameters, mfcc_options: Optional[Dict[str, Any]] = None, **kwargs) FeatureExtractionJob ¶
- Parameters:
crp –
mfcc_options – Nested parameters for
mfcc_flow()
- i6_core.features.mfcc.mfcc_flow(warping_function: str = 'mel', filter_width: float = 268.258, normalize: bool = True, normalization_options: Optional[Dict[str, Any]] = None, without_samples: bool = False, samples_options: Optional[Dict[str, Any]] = None, fft_options: Optional[Dict[str, Any]] = None, cepstrum_options: Optional[Dict[str, Any]] = None, add_features_output: bool = False) FlowNetwork ¶
- Parameters:
warping_function –
filter_width –
normalize – whether to add or not a normalization layer
normalization_options –
without_samples –
samples_options – arguments to
sample_flow()
fft_options – arguments to
fft_flow()
cepstrum_options – arguments to
cepstrum_flow()
add_features_output – Add the output port “features” when normalize is True. This should be set to True, default is False to not break existing hash.
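Example (not part of the generated reference): a minimal sketch extracting MFCC features with MfccJob. The crp is only a placeholder here; in a real setup it is a fully configured rasr.crp.CommonRasrParameters (corpus, concurrency, etc.).
from i6_core.rasr.crp import CommonRasrParameters
from i6_core.features.mfcc import MfccJob

crp = CommonRasrParameters()  # placeholder; a real setup configures corpus, audio format, concurrency, ...
mfcc_job = MfccJob(
    crp=crp,
    mfcc_options={"normalize": True},  # nested parameters forwarded to mfcc_flow()
)
# MfccJob returns a FeatureExtractionJob; its cache files are available via
# out_feature_bundle / out_feature_path as documented above
mfcc_bundle = mfcc_job.out_feature_bundle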
i6_core.features.mrasta¶
- i6_core.features.mrasta.MrastaJob(crp, mrasta_options=None, **kwargs)¶
- i6_core.features.mrasta.mrasta_flow(temporal_size=101, temporal_right=50, derivatives=1, gauss_filters=6, warping_function='mel', filter_width=268.258, filterbank_outputs=20, samples_options={}, fft_options={})¶
i6_core.features.normalization¶
- class i6_core.features.normalization.CovarianceNormalizationJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, feature_flow, extra_config_estimate=None, extra_post_config_estimate=None, extra_config_normalization=None, extra_post_config_normalization=None)¶
- create_files()¶
- estimate()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- normalization()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.features.plp¶
- i6_core.features.plp.PlpJob(crp, sampling_rate, plp_options=None, **kwargs)¶
- i6_core.features.plp.plp_flow(warping_function='bark', num_features=20, sampling_rate=8000, filter_width=3.8, normalize=True, normalization_options=None, without_samples=False, samples_options=None, fft_options=None, add_features_output=False)¶
i6_core.features.sil_norm¶
- class i6_core.features.sil_norm.ExtractSegmentSilenceNormalizationMapJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.features.sil_norm.ExtractSilenceNormalizationMapJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.features.sil_norm.UnwarpTimesInCTMJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- i6_core.features.sil_norm.samples_with_silence_normalization_flow(audio_format='wav', dc_detection=True, dc_params=None, silence_params=None)¶
i6_core.features.tone¶
- class i6_core.features.tone.ToneJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- convert(task_id)¶
- classmethod create_convert_config(crp, timestamp_flow, timestamp_port, extra_convert_config, extra_convert_post_config, **kwargs)¶
- classmethod create_convert_flow(crp, timestamp_flow, timestamp_port, **kwargs)¶
- classmethod create_dump_config(crp, samples_flow, extra_dump_config, extra_dump_post_config, **kwargs)¶
- classmethod create_dump_flow(crp, samples_flow, **kwargs)¶
- create_files()¶
- dump(task_id)¶
- extract_pitch()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.features.voiced¶
- i6_core.features.voiced.VoicedJob(crp, voiced_options=None, **kwargs)¶
- i6_core.features.voiced.voiced_flow(window_shift=0.01, window_duration=0.04, min_pos=0.0025, max_pos=0.0167, without_samples=False, samples_options={}, add_voiced_output=False)¶
i6_core.g2p.apply¶
- class i6_core.g2p.apply.ApplyG2PModelJob(*args, **kwargs)¶
Apply a trained G2P on a word list file
- Parameters:
g2p_model (Path) –
word_list_file (Path) – text file with a word each line
variants_mass (float) –
variants_number (int) –
g2p_path (Optional[Path]) –
g2p_python (Optional[Path]) –
filter_empty_words (bool) – if True, creates a new lexicon file with no empty translated words
concurrent (int) – split up word list file to parallelize job into this many instances
- filter()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- merge()¶
- run(task_id)¶
- split_word_list()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.g2p.convert¶
- class i6_core.g2p.convert.BlissLexiconToG2PLexiconJob(*args, **kwargs)¶
Convert a bliss lexicon into a g2p compatible lexicon for training
- Parameters:
bliss_lexicon (Path) –
include_pronunciation_variants (bool) – In case of multiple phoneme representations for one lemma, when this is false it outputs only the first phoneme
include_orthography_variants (bool) – In case of multiple orthographic representations for one lemma, when this is false it outputs only the first orth
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.g2p.convert.G2POutputToBlissLexiconJob(*args, **kwargs)¶
Convert a g2p applied word list file (g2p lexicon) into a bliss lexicon
- Parameters:
iv_bliss_lexicon (Path) – bliss lexicon as reference for the phoneme inventory
g2p_lexicon (Path) – from ApplyG2PModelJob.out_g2p_lexicon
merge (bool) – merge the g2p lexicon into the iv_bliss_lexicon instead of only taking the phoneme inventory
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.g2p.train¶
- class i6_core.g2p.train.TrainG2PModelJob(*args, **kwargs)¶
Train a G2P model using Sequitur
see https://github.com/sequitur-g2p/sequitur-g2p
- Parameters:
g2p_lexicon (Path) – g2p_lexicon for training, use BlissLexiconToG2PLexiconJob to generate a g2p_lexicon from a bliss lexicon
num_ramp_ups (int) – number of global ramp-ups (n-gram-iness)
min_iter (int) – minimum iterations per ramp-up
max_iter (int) – maximum iterations per ramp-up
devel (str) – passed as -d argument, percent of train lexicon held out as validation set
size_constrains (str) – passed as -s argument, multigrams must have l1 … l2 left-symbols and r1 … r2 right-symbols
extra_args (list[str]) – extra cmd arguments that are passed to the g2p process
g2p_path (Optional[Path]) – path to the g2p installation. If None, searches for a global G2P_PATH, and uses the default binary path if not existing.
g2p_python (Optional[Path]) – path to the g2p python binary. If None, searches for a global G2P_PYTHON, and uses the default python binary if not existing.
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
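Example (not part of the generated reference): a minimal sketch of a Sequitur G2P pipeline built from the jobs above: convert a bliss lexicon to a G2P training lexicon, train a model, apply it to an OOV word list, and merge the result back into a bliss lexicon. Paths are placeholders and all output attribute names marked as assumed follow the usual i6_core out_* convention, not this page; only ApplyG2PModelJob.out_g2p_lexicon is referenced in the parameter descriptions above.
from sisyphus import tk
from i6_core.g2p.convert import BlissLexiconToG2PLexiconJob, G2POutputToBlissLexiconJob
from i6_core.g2p.train import TrainG2PModelJob
from i6_core.g2p.apply import ApplyG2PModelJob

bliss_lexicon = tk.Path("/path/to/lexicon.xml.gz")
oov_word_list = tk.Path("/path/to/oov_words.txt")  # e.g. from ExtractOovWordsFromCorpusJob

g2p_train_lexicon = BlissLexiconToG2PLexiconJob(bliss_lexicon=bliss_lexicon)
train_job = TrainG2PModelJob(g2p_lexicon=g2p_train_lexicon.out_g2p_lexicon)  # assumed out_* name
apply_job = ApplyG2PModelJob(
    g2p_model=train_job.out_best_model,  # assumed out_* name
    word_list_file=oov_word_list,
)
oov_lexicon_job = G2POutputToBlissLexiconJob(
    iv_bliss_lexicon=bliss_lexicon,
    g2p_lexicon=apply_job.out_g2p_lexicon,  # referenced in the parameter description above
    merge=True,
)
tk.register_output("lexicon/with_oov.xml.gz", oov_lexicon_job.out_oov_lexicon)  # assumed out_* name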
i6_core.lda.config¶
i6_core.lda.estimate¶
- class i6_core.lda.estimate.EstimateLDAMatrixJob(*args, **kwargs)¶
- cleanup_before_run(*args)¶
- classmethod create_config(crp, between_class_scatter_matrix, within_class_scatter_matrix, reduced_dimension, eigenvalue_problem_config, generalized_eigenvalue_problem_config, extra_config, extra_post_config)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lda.estimate.EstimateScatterMatricesJob(*args, **kwargs)¶
- accumulate(task_id)¶
- cleanup_before_run(cmd, retry, *args)¶
- classmethod create_accumulate_config(crp, alignment_flow, extra_config_accumulate, extra_post_config_accumulate, **kwargs)¶
- classmethod create_estimate_config(crp, extra_config_estimate, extra_post_config_estimate, **kwargs)¶
- create_files()¶
- classmethod create_merge_config(crp, extra_config_merge, extra_post_config_merge, **kwargs)¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- merge()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.lda.flow¶
- i6_core.lda.flow.add_context_flow(feature_net, max_size=9, right=4, margin_condition='present-not-empty', expand_timestamp=False)¶
i6_core.lexicon.allophones¶
- class i6_core.lexicon.allophones.DumpStateTyingJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, *args)¶
- classmethod create_config(crp, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lexicon.allophones.StoreAllophonesJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, *args)¶
- classmethod create_config(crp, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.lexicon.beep¶
- class i6_core.lexicon.beep.BeepToBlissLexiconJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lexicon.beep.DownloadBeepJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.lexicon.cmu
¶
- class i6_core.lexicon.cmu.CMUDictToBlissJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lexicon.cmu.DownloadCMUDictJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.lexicon.conversion
¶
- class i6_core.lexicon.conversion.FilterLexiconByWordListJob(*args, **kwargs)¶
Filter lemmata to given word list. Warning: case_sensitive parameter does the opposite. Kept for backwards-compatibility.
- Parameters:
bliss_lexicon (tk.Path) – lexicon file to be handled
word_list (tk.Path) – filter lexicon by this word list
case_sensitive (bool) – filter lemmata case-sensitive. Warning: parameter does the opposite.
check_synt_tok (bool) – keep also lemmata where the syntactic token matches word_list
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lexicon.conversion.GraphemicLexiconFromWordListJob(*args, **kwargs)¶
- default_transforms = {'+': 'PLUS', '.': 'DOT', '{': 'LBR', '}': 'RBR'}¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lexicon.conversion.LexiconFromTextFileJob(*args, **kwargs)¶
Create a bliss lexicon from a regular text file, where each line contains: <WORD> <PHONEME1> <PHONEME2> … separated by tabs or spaces. The lemmata will be added in the order they appear in the text file, the phonemes will be sorted alphabetically. Phoneme variants of the same word need to appear next to each other.
WARNING: No special lemmas or phonemes are added, so do not use this lexicon with RASR directly!
As the splitting is taken from RASR and not fully tested, it might not work in all cases so do not use this job without checking the output manually on new lexica.
- Parameters:
text_file (Path) –
compressed – save as .xml.gz
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
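A minimal usage sketch (the input path is a placeholder); remember that the resulting lexicon still lacks the special lemmata mentioned in the warning above:
    from sisyphus import tk
    from i6_core.lexicon.conversion import LexiconFromTextFileJob

    # each line of the input file: <WORD> <PHONEME1> <PHONEME2> ...
    lexicon_job = LexiconFromTextFileJob(
        text_file=tk.Path("/path/to/word_phoneme_list.txt"),
        compressed=True,
    )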
- class i6_core.lexicon.conversion.LexiconToWordListJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lexicon.conversion.LexiconUniqueOrthJob(*args, **kwargs)¶
Merge lemmata with the same orthography.
- Parameters:
bliss_lexicon (tk.Path) – lexicon file to be handled
merge_multi_orths_lemmata (bool) –
if True, also lemmata containing multiple orths are merged based on their primary orth. Otherwise they are ignored.
Merging strategy:
- orth/phon/eval: all orth/phon/eval elements are merged together
- synt: the synt element is only copied to the target lemma when the target lemma does not already have one and the remaining to-be-merged lemmata have a synt element (“having a synt” meaning synt is not None). This could lead to INFORMATION LOSS if there are several different synt token sequences in the to-be-merged lemmata.
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lexicon.conversion.SpellingConversionJob(*args, **kwargs)¶
Spelling conversion for lexicon.
- Parameters:
bliss_lexicon (Path) – input lexicon whose lemmata all have a unique PRIMARY orth; to fulfill this requirement apply LexiconUniqueOrthJob first
orth_mapping_file (str) – orthography mapping file (*.json, *.json.gz, *.txt, *.gz); in case of a plain text file the delimiter can be adjusted via mapping_file_delimiter, and a line starting with “#” is a comment line
mapping_file_delimiter (str) – delimiter of source and target orths in the mapping file, relevant only if the mapping is provided as a plain text file
mapping_rules (Optional[List[Tuple[str, str, str]]]) – a list of mapping rules, each rule represented by 3 strings (source orth-substring, target orth-substring, pos) where pos should be one of [“leading”, “trailing”, “any”]; e.g. the rule (“zation”, “sation”, “trailing”) converts orths ending with -zation to orths ending with -sation. Set this ONLY for clearly defined rules that cannot generate any kind of ambiguity.
invert_mapping (bool) – invert the input orth mapping. NOTE: this also affects the pairs which are inferred from mapping_rules
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
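A hedged usage sketch; the paths are placeholders and the single rule only illustrates the (source, target, position) format described above:
    from sisyphus import tk
    from i6_core.lexicon.conversion import SpellingConversionJob

    conversion_job = SpellingConversionJob(
        bliss_lexicon=tk.Path("/path/to/unique_orth_lexicon.xml.gz"),
        orth_mapping_file="orth_mapping.json",
        mapping_rules=[("zation", "sation", "trailing")],
        invert_mapping=False,
    )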
i6_core.lexicon.modification
¶
- class i6_core.lexicon.modification.AddEowPhonemesToLexiconJob(*args, **kwargs)¶
Extends phoneme set of a lexicon by additional end-of-word (eow) versions of all regular phonemes. Modifies lemmata to use the new eow-version of the final phoneme in each pronunciation.
- Parameters:
bliss_lexicon – Base lexicon to be modified.
nonword_phones – List of nonword-phones for which no eow-versions will be added, e.g. [noise]. Phonemes that occur in special lemmata are found automatically and do not need to be specified here.
boundary_marker – String that is appended to phoneme symbols to mark eow-version.
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
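A minimal usage sketch; the lexicon path, nonword phone, and marker value are placeholders:
    from sisyphus import tk
    from i6_core.lexicon.modification import AddEowPhonemesToLexiconJob

    eow_lexicon_job = AddEowPhonemesToLexiconJob(
        bliss_lexicon=tk.Path("/path/to/lexicon.xml.gz"),
        nonword_phones=["[noise]"],
        boundary_marker="#",
    )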
- class i6_core.lexicon.modification.MergeLexiconJob(*args, **kwargs)¶
Merge multiple bliss lexica into a single bliss lexicon.
Phonemes and lemmata can be individually sorted alphabetically or kept as is.
When merging a lexicon with a static lexicon, putting the static lexicon first and only sorting the phonemes will result in the “typical” lexicon structure.
Please be aware that the sorting or merging of lexica that were already used will create a new lexicon that might be incompatible with previously generated alignments.
- Parameters:
bliss_lexica (list[Path]) – list of bliss lexicon files (plain or gz)
sort_phonemes (bool) – sort phoneme inventory alphabetically
sort_lemmata (bool) – sort lemmata alphabetically based on first orth entry
compressed (bool) – compress final lexicon
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
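A sketch of the merge described above, putting the static lexicon first and sorting only the phonemes (the paths are placeholders):
    from sisyphus import tk
    from i6_core.lexicon.modification import MergeLexiconJob

    merge_job = MergeLexiconJob(
        bliss_lexica=[
            tk.Path("/path/to/static_lexicon.xml"),
            tk.Path("/path/to/g2p_lexicon.xml.gz"),
        ],
        sort_phonemes=True,
        sort_lemmata=False,
        compressed=True,
    )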
- class i6_core.lexicon.modification.WriteLexiconJob(*args, **kwargs)¶
Create a bliss lexicon file from a static Lexicon.
Supports optional sorting of phonemes and lemmata.
Example for a static lexicon:
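A reconstructed sketch of such a static lexicon, using the Lexicon and Lemma classes documented under i6_core.lib.lexicon below; Lexicon.add_lemma is assumed here and not listed in this section:
    from i6_core.lib import lexicon
    from i6_core.lexicon.modification import WriteLexiconJob

    static_lexicon = lexicon.Lexicon()
    static_lexicon.add_phoneme("si", variation="none")
    # add_lemma is assumed to be available on i6_core.lib.lexicon.Lexicon
    static_lexicon.add_lemma(
        lexicon.Lemma(orth=["[SILENCE]"], phon=["si"], special="silence")
    )

    write_job = WriteLexiconJob(
        static_lexicon=static_lexicon, sort_phonemes=True, sort_lemmata=False
    )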
- Parameters:
static_lexicon (lexicon.Lexicon) – A Lexicon object
sort_phonemes (bool) – sort phoneme inventory alphabetically
sort_lemmata (bool) – sort lemmata alphabetically based on first orth entry
compressed (bool) – compress final lexicon
- classmethod hash(parsed_args)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.lib.corpus
¶
Helper functions and classes for Bliss xml corpus loading and writing
- class i6_core.lib.corpus.Corpus¶
This class represents a corpus in the Bliss format. It is also used to represent subcorpora when the parent_corpus attribute is set. Corpora with include statements can be read but are written back as a single file.
- dump(path: str)¶
- Parameters:
path – target .xml or .xml.gz path
- filter_segments(filter_function: Callable[[Corpus, Recording, Segment], bool])¶
filter all segments (including in subcorpora) using filter_function
- Parameters:
filter_function – takes arguments corpus, recording, and segment; returns True if the segment should be kept
- fullname() → str¶
- load(path: str)¶
- Parameters:
path – corpus .xml or .xml.gz
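A small sketch of loading, filtering, and writing a corpus with the methods above; the paths are placeholders and segment.orth is assumed as an attribute of the Segment objects:
    from i6_core.lib.corpus import Corpus

    corpus = Corpus()
    corpus.load("/path/to/corpus.xml.gz")
    # keep only segments with a non-empty orthography (segment.orth is assumed)
    corpus.filter_segments(lambda c, r, segment: bool(segment.orth))
    corpus.dump("/path/to/filtered_corpus.xml.gz")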
- class i6_core.lib.corpus.CorpusSection¶
- class i6_core.lib.corpus.NamedEntity¶
i6_core.lib.hdf
¶
- i6_core.lib.hdf.get_input_dict_from_returnn_hdf(hdf_file: File) → Dict[str, ndarray]¶
Generate dictionary containing the “data” value as ndarray indexed by the sequence tag
- Parameters:
hdf_file – HDF file to extract data from
- Returns:
- i6_core.lib.hdf.get_returnn_simple_hdf_writer(returnn_root: Optional[str])¶
Get the RETURNN SimpleHDFWriter; this will add RETURNN to the python path, so only use at Job runtime
- Parameters:
returnn_root –
i6_core.lib.lexicon
¶
Library for the RASR Lexicon files
For format details visit: https://www-i6.informatik.rwth-aachen.de/rwth-asr/manual/index.php/Lexicon
- class i6_core.lib.lexicon.Lemma(orth: Optional[List[str]] = None, phon: Optional[List[str]] = None, synt: Optional[List[str]] = None, eval: Optional[List[List[str]]] = None, special: Optional[str] = None)¶
Represents a lemma of a lexicon
- Parameters:
orth – list of spellings used in the training data
phon – list of pronunciation variants. Each str should contain a space separated string of phonemes from the phoneme-inventory.
synt – list of LM tokens that form a single token sequence. This sequence is used as the language model representation.
eval – list of output representations. Each sublist should contain one possible transcription (token sequence) of this lemma that is scored against the reference transcription.
special – assigns special property to a lemma. Supported values: “silence”, “unknown”, “sentence-boundary”, or “sentence-begin” / “sentence-end”
- to_xml()¶
- Returns:
xml representation
- Return type:
ET.Element
- class i6_core.lib.lexicon.Lexicon¶
Represents a bliss lexicon, can be read from and written to .xml files
- add_phoneme(symbol, variation='context')¶
- Parameters:
symbol (str) – representation of one phoneme
variation (str) – possible values: “context” or “none”. Use none for context independent phonemes like silence and noise.
- load(path)¶
- Parameters:
path (str) – bliss lexicon .xml or .xml.gz file
- remove_phoneme(symbol)¶
- Parameters:
symbol (str) –
- to_xml()¶
- Returns:
xml representation, can be used with util.write_xml
- Return type:
ET.Element
i6_core.lib.lm
¶
- class i6_core.lib.lm.Lm(lm_path)¶
Interface to access the ngrams of an LM. Currently supports only LMs in arpa format.
- Parameters:
lm_path (str) – Path to the LM file, currently supports only arpa files
- get_ngrams(n)¶
returns all the ngrams of order n
- load_arpa()¶
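A minimal sketch for reading n-grams from an ARPA file; the path is a placeholder, and it is assumed here that load_arpa() must be called explicitly before get_ngrams():
    from i6_core.lib.lm import Lm

    lm = Lm("/path/to/lm.arpa")
    lm.load_arpa()
    bigrams = lm.get_ngrams(2)  # all n-grams of order 2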
- i6_core.lib.lm.not_ngrams(text: str)¶
i6_core.lib.rasr_cache
¶
This module is about reading (maybe later also writing) the Rasr archive format.
- class i6_core.lib.rasr_cache.AllophoneLabeling(silence_phone, allophone_file, phoneme_file=None, state_tying_file=None, verbose_out=None)¶
Allophone labeling.
- Parameters:
silence_phone (str) – e.g. “si”
allophone_file (str) – list of allophones
phoneme_file (str|None) – list of phonemes
state_tying_file (str|None) – allophone state tying (e.g. via CART). maps each allophone state to a class label
verbose_out (file) – stream to dump log messages
- get_label_idx(allo_idx, state_idx)¶
- Parameters:
allo_idx (int) –
state_idx (int) –
- Return type:
int
- get_label_idx_by_allo_state_idx(allo_state_idx)¶
- Parameters:
allo_state_idx (int) –
- Return type:
int
- class i6_core.lib.rasr_cache.FileArchive(filename, must_exists=False, encoding='ascii')¶
File archive.
- RasrCacheHeader = 'SP_ARC1\x00'¶
- addAttributes(filename, dim, duration)¶
- Parameters:
filename (str) –
dim (int) –
duration (float) –
- addFeatureCache(filename, features, times)¶
- Parameters:
filename (str) –
features –
times –
- end_recovery_tag = 1437226410¶
- file_list()¶
- Return type:
list[str]
- finalize()¶
Finalize.
- getState(mix)¶
- Parameters:
mix (int) –
- Returns:
(mix, state)
- Return type:
(int,int)
- has_entry(filename)¶
- Parameters:
filename (str) – argument for self.read()
- Returns:
True if we have this entry
- read(filename, typ)¶
- Parameters:
filename (str) – the entry-name in the archive
typ (str) – “str”, “feat” or “align”
- Returns:
depending on typ: “str” -> string, “feat” -> (time, data), “align” -> align, where
string is a str,
time is a list of time-stamp tuples (start-time, end-time) in milliseconds,
data is a list of features, each a numpy vector,
align is a list of (time, allophone, state) tuples, where time is an int from 0 to the length of the alignment, allophone is some int, and state is e.g. in [0, 1, 2].
- Return type:
str|(list[numpy.ndarray],list[numpy.ndarray])|list[(int,int,int)]
- readFileInfoTable()¶
Read file info table.
- read_S16()¶
- Return type:
float
- read_U32()¶
- Return type:
int
- read_U8()¶
- Return type:
int
- read_bytes(l)¶
- Return type:
bytes
- read_char()¶
- Return type:
int
- read_f32()¶
- Return type:
float
- read_f64()¶
- Return type:
float
- read_packed_U32()¶
- Return type:
int
- read_str(l, enc='ascii')¶
- Return type:
str
- read_u32()¶
- Return type:
int
- read_u64()¶
- Return type:
int
- read_v(typ, size)¶
- Parameters:
typ (str) – “f” for float (float32) or “d” for double (float64)
size (int) – number of elements to return
- Returns:
numpy array of shape (size,) of dtype depending on typ
- Return type:
numpy.ndarray
- scanArchive()¶
Scan archive.
- setAllophones(f)¶
- Parameters:
f (str) – allophone filename. line-separated. will ignore lines starting with “#”
- start_recovery_tag = 2857740885¶
- writeFileInfoTable()¶
Write file info table.
- write_U32(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_char(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_f32(i)¶
- Parameters:
i (float) –
- Return type:
int
- write_f64(i)¶
- Parameters:
i (float) –
- Return type:
int
- write_str(s, enc='ascii')¶
- Parameters:
s (str) –
- Return type:
int
- write_u32(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_u64(i)¶
- Parameters:
i (int) –
- Return type:
int
- class i6_core.lib.rasr_cache.FileArchiveBundle(filename, encoding='ascii')¶
File archive bundle.
- Parameters:
filename (str) – .bundle file
encoding (str) – encoding used in the files
- file_list()¶
- Return type:
list[str]
- Returns:
list of content-filenames (which can be used for self.read())
- has_entry(filename)¶
- Parameters:
filename (str) – argument for self.read()
- Returns:
True if we have this entry
- read(filename, typ)¶
- Parameters:
filename (str) – the entry-name in the archive
typ (str) – “str”, “feat” or “align”
- Returns:
depending on typ: “str” -> string, “feat” -> (time, data), “align” -> align, where
string is a str,
time is a list of time-stamp tuples (start-time, end-time) in milliseconds,
data is a list of features, each a numpy vector,
align is a list of (time, allophone, state) tuples, where time is an int from 0 to the length of the alignment, allophone is some int, and state is e.g. in [0, 1, 2].
- Return type:
str|(list[numpy.ndarray],list[numpy.ndarray])|list[(int,int,int)]
Uses FileArchive.read().
- setAllophones(filename)¶
- Parameters:
filename (str) – allophone filename
- class i6_core.lib.rasr_cache.FileInfo(name, pos, size, compressed, index)¶
File info.
- Parameters:
name (str) –
pos (int) –
size (int) –
compressed (bool|int) –
index (int) –
- class i6_core.lib.rasr_cache.MixtureSet(filename)¶
Mixture set.
- Parameters:
filename (str) –
- getCovByIdx(idx)¶
- Parameters:
idx (int) –
- Return type:
numpy.ndarray
- getMeanByIdx(idx)¶
- Parameters:
idx (int) –
- Return type:
numpy.ndarray
- getNumberMixtures()¶
- Return type:
int
- read_U32()¶
- Return type:
int
- read_char()¶
- Return type:
int
- read_f32()¶
- Return type:
float
- read_f64()¶
- Return type:
float
- read_str(l, enc='ascii')¶
- Parameters:
l (int) –
enc (str) –
- Return type:
str
- read_u32()¶
- Return type:
int
- read_u64()¶
- Return type:
int
- read_v(size, a)¶
- Parameters:
size (int) –
a (array.array) –
- Return type:
array.array
- write(filename)¶
- Parameters:
filename (str) –
- write_U32(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_char(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_f32(i)¶
- Parameters:
i (float) –
- Return type:
int
- write_f64(i)¶
- Parameters:
i (float) –
- Return type:
int
- write_str(s, enc='ascii')¶
- Parameters:
s (str) –
enc (str) –
- Return type:
int
- write_u32(i)¶
- Parameters:
i (int) –
- Return type:
int
- write_u64(i)¶
- Parameters:
i (int) –
- Return type:
int
- class i6_core.lib.rasr_cache.WordBoundaries(filename)¶
Word boundaries.
- Parameters:
filename (str) –
- read_str(l, enc='ascii')¶
- Return type:
str
- read_u16()¶
- Return type:
int
- read_u32()¶
- Return type:
int
- i6_core.lib.rasr_cache.is_rasr_cache_file(filename)¶
- Parameters:
filename (str) – file to check. must exist
- Returns:
True iff this is a rasr cache (which can be loaded with open_file_archive())
- Return type:
bool
- i6_core.lib.rasr_cache.open_file_archive(archive_filename, must_exists=True, encoding='ascii')¶
- Parameters:
archive_filename (str) –
must_exists (bool) –
encoding (str) –
- Return type:
FileArchive|FileArchiveBundle
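A short reading sketch for this module; the archive path and entry name are placeholders:
    from i6_core.lib.rasr_cache import open_file_archive

    archive = open_file_archive("/path/to/features.cache.bundle")
    print(archive.file_list())  # content names that can be passed to read()
    times, features = archive.read("corpus/recording/segment-name", "feat")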
i6_core.lm.lm_image
¶
- class i6_core.lm.lm_image.CreateLmImageJob(*args, **kwargs)¶
pre-compute LM image without generating global cache
- cleanup_before_run(cmd, retry, *args)¶
- classmethod create_config(crp, extra_config, extra_post_config, encoding, **kwargs)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.lm.perplexity
¶
- class i6_core.lm.perplexity.ComputePerplexityJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, *args)¶
- classmethod create_config(crp, text_file, encoding, renormalize, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.lm.reverse_arpa
¶
- class i6_core.lm.reverse_arpa.ReverseARPALmJob(*args, **kwargs)¶
Create a new LM in arpa format by reverting the n-grams of an existing Arpa LM.
- Parameters:
lm_path (Path) – Path to the existing arpa file
- static add_missing_backoffs(words, ngrams: List[Dict[str, Tuple]])¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.lm.srilm
¶
- class i6_core.lm.srilm.ComputeBestMixJob(*args, **kwargs)¶
Compute the best mixture weights for a combination of count LMs based on the given PPL logs
- Parameters:
ppl_logs – List of PPL Logs to compute the weights from
compute_best_mix_exe – Path to srilm compute_best_mix executable
- run()¶
Calls the srilm script, extracts the different weights from the log, then relinks the log to the output folder
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lm.srilm.ComputeNgramLmJob(*args, **kwargs)¶
Generate count based LM with SRILM
- Parameters:
ngram_order – Maximum n gram order
data – Either text file or counts file to read from, set data mode accordingly the counts file can come from the CountNgramsJob.out_counts
data_mode – Defines whether input format is text based or count based
vocab – Vocabulary file, one word per line
extra_ngram_args – Extra arguments for the execution call e.g. [‘-kndiscount’]
count_exe – Path to srilm ngram-count exe
mem_rqmt – Memory requirements of Job (not hashed)
time_rqmt – Time requirements of Job (not hashed)
cpu_rqmt – Amount of Cpus required for Job (not hashed)
fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)
Example options for ngram_args: -kndiscount -interpolate -debug <int> -addsmooth <int>
- compress()¶
executes the previously created compression script and relinks the lm from work folder to output folder
- create_files()¶
creates bash script for lm creation and compression that will be executed in the run Task
- classmethod hash(kwargs)¶
delete the queue requirements from the hashing
- run()¶
executes the previously created lm script and relinks the vocabulary from work folder to output folder
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lm.srilm.ComputeNgramLmPerplexityJob(*args, **kwargs)¶
Calculate the Perplexity of a Ngram LM via SRILM
- Parameters:
ngram_order – Maximum n gram order
lm – LM to evaluate
eval_data – Data to calculate PPL on
vocab – Vocabulary file
set_unknown_flag – sets unknown lemma
extra_ppl_args – Extra arguments for the execution call e.g. ‘-debug 2’
ngram_exe – Path to srilm ngram exe
mem_rqmt – Memory requirements of Job (not hashed)
time_rqmt – Time requirements of Job (not hashed)
cpu_rqmt – Amount of Cpus required for Job (not hashed)
fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)
- create_files()¶
creates bash script that will be executed in the run Task
- get_ppl()¶
extracts various outputs from the ppl.log file
- classmethod hash(kwargs)¶
delete the queue requirements from the hashing
- run()¶
executes the previously created script and relinks the log file from work folder to output folder
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lm.srilm.CountNgramsJob(*args, **kwargs)¶
Count ngrams with SRILM
- Parameters:
ngram_order – Maximum n gram order
data – Input data to be read as textfile
extra_count_args – Extra arguments for the execution call e.g. [‘-unk’]
count_exe – Path to srilm ngram-count executable
mem_rqmt – Memory requirements of Job (not hashed)
time_rqmt – Time requirements of Job (not hashed)
cpu_rqmt – Amount of Cpus required for Job (not hashed)
fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)
Example options/parameters for count_args: -unk
- create_files()¶
creates bash script that will be executed in the run Task
- classmethod hash(kwargs)¶
delete the queue requirements from the hashing
- run()¶
executes the previously created bash script and relinks outputs from work folder to output folder
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lm.srilm.InterpolateNgramLmJob(*args, **kwargs)¶
Uses SRILM to interpolate different LMs with previously calculated weights
- Parameters:
ngram_lms – List of language models to interpolate, format: ARPA, compressed ARPA
weights – Weights of different language models, has to be same order as ngram_lms
ngram_order – Maximum n gram order
extra_interpolation_args – Additional arguments for interpolation
ngram_exe – Path to srilm ngram executable
mem_rqmt – Memory requirements of Job (not hashed)
time_rqmt – Time requirements of Job (not hashed)
cpu_rqmt – Amount of Cpus required for Job (not hashed)
fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)
- classmethod hash(parsed_args)¶
delete the queue requirements from the hashing
- run()¶
delete the executable from the hashing
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lm.srilm.PruneLMWithHelperLMJob(*args, **kwargs)¶
Job that prunes the given LM with the help of a helper LM
- Parameters:
ngram_order – Maximum n gram order
lm – LM to be pruned
prune_thresh – Pruning threshold
helper_lm – helper/’Katz’ LM to prune the other LM with
ngram_exe – Path to srilm ngram-count executable
mem_rqmt – Memory requirements of Job (not hashed)
time_rqmt – Time requirements of Job (not hashed)
cpu_rqmt – Amount of Cpus required for Job (not hashed)
fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)
- create_files()¶
creates bash script that will be executed in the run Task
- classmethod hash(kwargs)¶
delete the queue requirements from the hashing
- run()¶
executes the previously created script and relinks the lm from work folder to output folder
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.lm.vocabulary
¶
- class i6_core.lm.vocabulary.LmIndexVocabulary(vocab: sisyphus.job_path.Path, vocab_size: sisyphus.job_path.Variable, unknown_token: Union[sisyphus.job_path.Variable, str])¶
- unknown_token: Union[Variable, str]¶
- vocab: Path¶
- vocab_size: Variable¶
- class i6_core.lm.vocabulary.LmIndexVocabularyFromLexiconJob(*args, **kwargs)¶
Computes a <word>: <index> vocabulary file from a bliss lexicon for Word-Level LM training
Sentence begin/end will point to index 0, unknown to index 1. Both are taken directly from the lexicon via the “special” marking:
<lemma special=”sentence-begin”> -> index 0
<lemma special=”sentence-end”> -> index 0
<lemma special=”unknown”> -> index 1
If <synt> tokens are provided in a lemma, they will be used instead of <orth>
CAUTION: Be aware of: https://github.com/rwth-i6/returnn/issues/1245 when using Returnn’s LmDataset
- Parameters:
bliss_lexicon – use the lemmas from the lexicon to define the indices
count_ordering_text – optional text that can be used to define the index order based on the lemma count
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.lm.vocabulary.VocabularyFromLmJob(*args, **kwargs)¶
Extract the vocabulary from an existing LM. Currently supports only arpa files for input.
- Parameters:
lm_file (Path) – path to the lm arpa file
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.meta.cart_lda
¶
i6_core.meta.mm_sequence
¶
i6_core.meta.system
¶
i6_core.meta.warping_sequence
¶
i6_core.mm.alignment
¶
- class i6_core.mm.alignment.AMScoresFromAlignmentLogJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.mm.alignment.AlignmentJob(*args, **kwargs)¶
- Parameters:
crp (rasr.crp.CommonRasrParameters) –
feature_flow –
feature_scorer (rasr.FeatureScorer) –
alignment_options (dict[str]) –
word_boundaries (bool) –
use_gpu (bool) –
rtf (float) –
extra_config –
extra_post_config –
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, feature_flow, feature_scorer, alignment_options, word_boundaries, extra_config, extra_post_config, **kwargs)¶
- Parameters:
crp (rasr.crp.CommonRasrParameters) –
feature_flow –
feature_scorer (rasr.FeatureScorer) –
alignment_options (dict[str]) –
word_boundaries (bool) –
extra_config –
extra_post_config –
- Returns:
config, post_config
- Return type:
(rasr.RasrConfig, rasr.RasrConfig)
- create_files()¶
- classmethod create_flow(feature_flow, **kwargs)¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.mm.alignment.DumpAlignmentJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod create_flow(feature_flow, original_alignment, **kwargs)¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.mm.confidence_based_alignment
¶
- class i6_core.mm.confidence_based_alignment.ConfidenceBasedAlignmentJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, feature_flow, feature_scorer, lattice_cache, global_scale, confidence_threshold, weight_scale, ref_alignment_path, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod create_flow(feature_flow, lattice_cache, global_scale, confidence_threshold, weight_scale, ref_alignment_path, **kwargs)¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.mm.flow
¶
- i6_core.mm.flow.alignment_flow(feature_net, alignment_cache_path=None)¶
- i6_core.mm.flow.cached_alignment_flow(feature_net, alignment_cache_path)¶
- i6_core.mm.flow.confidence_based_alignment_flow(feature_net, lattice_cache_path, alignment_cache_path=None, global_scale=1.0, confidence_threshold=0.75, weight_scale=1.0, ref_alignment_path=None)¶
- i6_core.mm.flow.dump_alignment_flow(feature_net, original_alignment, new_alignment)¶
- i6_core.mm.flow.linear_segmentation_flow(feature_energy_net, alignment_cache=None)¶
i6_core.mm.mixtures
¶
i6_core.mm.tdp
¶
i6_core.rasr.command
¶
- class i6_core.rasr.command.RasrCommand¶
Mixin for Job.
- NO_RETRY_AFTER_TIME = 600.0¶
- RETRY_WAIT_TIME = 5.0¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod default_exe(exe_name)¶
Extract executable path from the global sisyphus settings
- Parameters:
exe_name (str) –
- Return type:
str
- classmethod get_rasr_exe(exe_name, rasr_root, rasr_arch)¶
- Parameters:
exe_name (str) –
rasr_root (str) –
rasr_arch (str) –
- Returns:
path to a rasr binary with the default path pattern inside the repository
- Return type:
str
- log_file_output_path(name, crp, parallel)¶
- Parameters:
name (str) –
crp (rasr.crp.CommonRasrParameters) –
parallel (int|bool) –
- Return type:
Path
- run_cmd(cmd: str, args: Optional[List[str]] = None, retries: int = 2, cwd: Optional[str] = None)¶
- Parameters:
cmd –
args –
retries –
cwd – execute cmd in this dir
- run_script(task_id: int, log_file: Union[str, Path], cmd: str = './run.sh', args: Optional[List] = None, retries: int = 2, use_tmp_dir: bool = False, copy_tmp_ls: Optional[List] = None)¶
- classmethod select_exe(specific_exe, default_exe_name)¶
- Parameters:
specific_exe (str|None) –
default_exe_name (str) –
- Returns:
path to exe
- Return type:
str
- static write_config(config, post_config, filename)¶
- Parameters:
config (rasr.RasrConfig) –
post_config (rasr.RasrConfig) –
filename (str) –
- static write_run_script(exe, config, filename='run.sh', extra_code='', extra_args='')¶
- Parameters:
exe (str) –
config (str) –
filename (str) –
extra_code (str) –
extra_args (str) –
i6_core.rasr.config
¶
- class i6_core.rasr.config.ConfigBuilder(defaults)¶
- class i6_core.rasr.config.RasrConfig(prolog='', prolog_hash='', epilog='', epilog_hash='')¶
Used to store a Rasr configuration
- Parameters:
prolog (string) – A string that should be pasted as code at the beginning of the config file
epilog (string) – A string that should be pasted as code at the end of the config file
prolog_hash (string) – sets a specific hash for the prolog
epilog_hash (string) – sets a specific hash for the epilog
- html()¶
- class i6_core.rasr.config.StringWrapper(string, hidden=None)¶
Deprecated, please use e.g. DelayedFormat directly from Sisyphus.
Example for wrapping commands:
    command = DelayedFormat("{} -a -l en -no-escape", tokenizer_binary)
Example for wrapping/combining paths:
    pymod_config = DelayedFormat("epoch:{},action:forward,configfile:{}", model.epoch, model.returnn_config_file)
Example for wrapping even function calls:
    def cut_ending(path):
        return path[: -len(".meta")]

    def foo():
        […]
        config.loader.saved_model_file = DelayedFunction(returnn_model.model, cut_ending)
- Parameters:
string (str) – some string based on the hashing object
hidden (Any) – hashing object
- class i6_core.rasr.config.WriteRasrConfigJob(*args, **kwargs)¶
Write a RasrConfig object into a .config file
- Parameters:
config (RasrConfig) – RASR config part that is hashed
post_config (RasrConfig) – RASR config part that is not hashed
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- i6_core.rasr.config.build_config_from_mapping(crp, mapping, include_log_config=True, parallelize=False)¶
- Parameters:
crp (rasr.crp.CommonRasrParameters) –
mapping (dict[str,str|list[str]]) –
include_log_config (bool) –
parallelize (bool) –
- Returns:
config, post_config
- Return type:
(rasr.RasrConfig, rasr.RasrConfig)
i6_core.rasr.crp
¶
- class i6_core.rasr.crp.CommonRasrParameters(base=None)¶
This class holds often used parameters for Rasr.
- Parameters:
base (CommonRasrParameters|None) –
- html()¶
- set_executables(rasr_binary_path, rasr_arch='linux-x86_64-standard')¶
Set all executables to a specific binary folder path
- Parameters:
rasr_binary_path (tk.Path) – path to the rasr binary folder
rasr_arch (str) – RASR compile architecture suffix
- Returns:
- i6_core.rasr.crp.crp_add_default_output(crp, compress=False, append=False, unbuffered=False, compress_after_run=True)¶
- Parameters:
crp (CommonRasrParameters) –
compress (bool) –
append (bool) –
unbuffered (bool) –
compress_after_run (bool) –
- i6_core.rasr.crp.crp_set_corpus(crp, corpus)¶
- Parameters:
crp (CommonRasrParameters) –
corpus (meta.CorpusObject) – object with corpus_file, audio_dir, audio_format, duration
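A setup sketch combining the helpers above; the binary path is a placeholder and the corpus object is only indicated as a comment:
    from sisyphus import tk
    from i6_core.rasr.crp import CommonRasrParameters, crp_add_default_output

    crp = CommonRasrParameters()
    crp.set_executables(rasr_binary_path=tk.Path("/path/to/rasr/arch/linux-x86_64-standard"))
    crp_add_default_output(crp)
    # crp_set_corpus(crp, corpus_object)  # corpus_object: a meta.CorpusObject with corpus_file, audio_dir, audio_format, duration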
i6_core.rasr.feature_scorer
¶
- class i6_core.rasr.feature_scorer.DiagonalMaximumScorer(*args, **kwargs)¶
- class i6_core.rasr.feature_scorer.GMMFeatureScorer(mixtures, scale=1.0)¶
- class i6_core.rasr.feature_scorer.InvAlignmentPassThroughFeatureScorer(prior_mixtures, max_segment_length, mapping, priori_scale=0.0)¶
- class i6_core.rasr.feature_scorer.OnnxFeatureScorer(*, mixtures: Path, model: Path, io_map: Dict[str, str], label_log_posterior_scale: float = 1.0, label_prior_scale: float, label_log_prior_file: Optional[Path] = None, apply_log_on_output: bool = False, negate_output: bool = True, intra_op_threads: int = 1, inter_op_threads: int = 1, **kwargs)¶
- Parameters:
mixtures – path to a *.mix file e.g. output of either EstimateMixturesJob or CreateDummyMixturesJob
model – path of a model e.g. output of ExportPyTorchModelToOnnxJob
io_map – mapping between internal rasr identifiers and the model related input/output. Default key values are “features” and “output”, and optionally “features-size”, e.g. io_map = {“features”: “data”, “output”: “classes”}
label_log_posterior_scale – scales for the log probability of a label e.g. 1.0 is recommended
label_prior_scale – scale for the prior log probability of a label reasonable e.g. values in [0.1, 0.7] interval
label_log_prior_file – xml file containing log prior probabilities e.g. estimated from the model via povey method
apply_log_on_output – whether to apply the log-function on the output, useful if the model outputs softmax instead of log-softmax
negate_output – whether to negate the output (because the model outputs log softmax and not negative log softmax)
intra_op_threads – Onnxruntime session’s number of parallel threads within each operator
inter_op_threads – Onnxruntime session’s number of parallel threads between operators used only for parallel execution mode
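A hedged construction sketch using the io_map example from above; all paths are placeholders and the prior scale is just an illustrative value from the suggested interval:
    from sisyphus import tk
    from i6_core.rasr.feature_scorer import OnnxFeatureScorer

    scorer = OnnxFeatureScorer(
        mixtures=tk.Path("/path/to/dummy.mix"),
        model=tk.Path("/path/to/model.onnx"),
        io_map={"features": "data", "output": "classes"},
        label_log_posterior_scale=1.0,
        label_prior_scale=0.3,
        label_log_prior_file=tk.Path("/path/to/prior.xml"),
    )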
- class i6_core.rasr.feature_scorer.PrecomputedHybridFeatureScorer(prior_mixtures, scale=1.0, priori_scale=0.0, prior_file=None)¶
- class i6_core.rasr.feature_scorer.PreselectionBatchIntScorer(*args, **kwargs)¶
- class i6_core.rasr.feature_scorer.ReturnnScorer(feature_dimension, output_dimension, prior_mixtures, model, mixture_scale=1.0, prior_scale=1.0, prior_file=None, returnn_root=None)¶
- class i6_core.rasr.feature_scorer.SimdDiagonalMaximumScorer(*args, **kwargs)¶
i6_core.rasr.flow
¶
- class i6_core.rasr.flow.FlowNetwork(name='network')¶
- Parameters:
name (str) –
- add_flags(flags)¶
in case a Path has to be converted to a string that is then added to the network
- add_input(name)¶
- add_net(net)¶
- add_node(filter, name, attr=None, **kwargs)¶
- add_output(name)¶
- add_param(name)¶
- apply_config(path, config, post_config)¶
- contains_filter(filter_name)¶
- default_flags = {}¶
- get_input_links(input_port)¶
- get_input_ports()¶
- get_internal_links()¶
- get_node_names_by_filter(filter_name)¶
- get_output_links(output_port)¶
- Parameters:
output_port (str) –
- Returns:
list of from_name
- Return type:
list[str]
- get_output_ports()¶
- interconnect(a, node_mapping_a, b, node_mapping_b, mapping=None)¶
assuming a and b are FlowNetworks that have already been added to this net, the outputs of a are linked to the inputs of b, optionally a mapping between the ports can be specified
- interconnect_inputs(net, node_mapping, mapping=None)¶
assuming net has been added to self, link all of self’s inputs to net’s inputs, optionally a mapping between the ports can be specified
- interconnect_outputs(net, node_mapping, mapping=None)¶
assuming net has been added to self, link all of net’s outputs to self’s outputs, optionally a mapping between the ports can be specified
- link(from_name, to_name)¶
- Parameters:
from_name (str) –
to_name (str) –
- remove_node(name)¶
- subnet_from_node(node_name)¶
creates a new net where only nodes that follow the given node are retained. Nodes before the specified node are not included. Links between one retained node and one non-retained node are returned as well. This function is useful if a part of a net should be duplicated without copying the other part.
- unique_name(name)¶
- Parameters:
name (str) –
- Return type:
str
- unlink(from_name=None, to_name=None)¶
- write_to_file(file)¶
- class i6_core.rasr.flow.NodeMapping(mapping)¶
- Parameters:
mapping (dict) –
- class i6_core.rasr.flow.WriteFlowNetworkJob(*args, **kwargs)¶
Writes a flow network to a file
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.rasr.util
¶
- class i6_core.rasr.util.ClusterMapToSegmentListJob(*args, **kwargs)¶
Creates segment files in relation to a speaker cluster map
- WARNING: This job has broken (non-portable) hashes and is not really useful anyway,
please use this only for existing pipelines
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.rasr.util.MapSegmentsWithBundlesJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.rasr.util.RemapSegmentsJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.rasr.util.RemapSegmentsWithBundlesJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.recognition.advanced_tree_search
¶
- class i6_core.recognition.advanced_tree_search.AdvancedTreeSearchJob(*args, **kwargs)¶
- Parameters:
crp – Common Rasr parameters for recognition
feature_flow – Flow network for recognition
feature_scorer – Feature scorer used in recognition. For AdvancedTreeSearchLmImageAndGlobalCacheJob RASR requires setting the feature scorer while actually not using it.
search_parameters – Parameters for search e.g. beam-pruning, uses defaults defined below if not set
lm_lookahead – Whether to perform language model lookahead or not
lookahead_options – Options for the lm lookahead
create_lattice – Recognizer option to produce lattices. Stored as FST
eval_single_best – Extract the best path from the lattice
eval_best_in_lattice –
use_gpu – Flag to enable GPU decoding
rtf – Expected rtf value to predict time requirement for the job
mem – Memory requirement for the job
cpu – CPU requirement for the job
lmgc_mem – Memory requirement for the AdvancedTreeSearchLmImageAndGlobalCacheJob
lmgc_alias – Alias for the AdvancedTreeSearchLmImageAndGlobalCacheJob
lmgc_scorer – Dummy scorer for the AdvancedTreeSearchLmImageAndGlobalCacheJob which is required but unused
model_combination_config – Configuration for model combination
model_combination_post_config – Post config for model combination
extra_config – Additional Config for recognition
extra_post_config – Post config of additional config for recognition
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp: CommonRasrParameters, feature_flow: FlowNetwork, feature_scorer: FeatureScorer, search_parameters: Optional[Dict[str, Any]], lm_lookahead: bool, lookahead_options: Optional[Dict[str, Any]], create_lattice: bool, eval_single_best: bool, eval_best_in_lattice: bool, mem: float, cpu: int, lmgc_mem: float, lmgc_alias: Optional[str], lmgc_scorer: Optional[FeatureScorer], model_combination_config: Optional[RasrConfig], model_combination_post_config: Optional[RasrConfig], extra_config: Optional[RasrConfig], extra_post_config: Optional[RasrConfig], **kwargs)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- classmethod update_search_parameters(search_parameters)¶
- class i6_core.recognition.advanced_tree_search.AdvancedTreeSearchLmImageAndGlobalCacheJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, *args)¶
- classmethod create_config(crp, feature_scorer, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod find_arpa_lms(lm_config, lm_post_config=None)¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.recognition.advanced_tree_search.AdvancedTreeSearchWithRescoringJob(*args, **kwargs)¶
- Parameters:
crp – Common Rasr parameters for recognition
feature_flow – Flow network for recognition
feature_scorer – Feature scorer used in recognition. For AdvancedTreeSearchLmImageAndGlobalCacheJob RASR requires setting the feature scorer while actually not using it.
search_parameters – Parameters for search e.g. beam-pruning, uses defaults defined below if not set
lm_lookahead – Whether to perform language model lookahead or not
lookahead_options – Options for the lm lookahead
create_lattice – Recognizer option to produce lattices. Stored as FST
eval_single_best – Extract the best path from the lattice
eval_best_in_lattice –
use_gpu – Flag to enable GPU decoding
rtf – Expected rtf value to predict time requirement for the job
mem – Memory requirement for the job
cpu – CPU requirement for the job
lmgc_mem – Memory requirement for the AdvancedTreeSearchLmImageAndGlobalCacheJob
lmgc_alias – Alias for the AdvancedTreeSearchLmImageAndGlobalCacheJob
lmgc_scorer – Dummy scorer for the AdvancedTreeSearchLmImageAndGlobalCacheJob which is required but unused
model_combination_config – Configuration for model combination
model_combination_post_config – Post config for model combination
extra_config – Additional Config for recognition
extra_post_config – Post config of additional config for recognition
- classmethod create_config(rescorer_type, rescoring_lm_config, max_hypotheses, pruning_threshold, history_limit, rescoring_lookahead_scale, **kwargs)¶
- class i6_core.recognition.advanced_tree_search.BidirectionalAdvancedTreeSearchJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, feature_flow, feature_scorer, recognizer_parameters, search_parameters, lm_lookahead, lookahead_options, create_lattice, lattice_filter_type, eval_single_best, eval_best_in_lattice, mem, model_combination_config, model_combination_post_config, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- classmethod update_search_parameters(search_parameters)¶
- class i6_core.recognition.advanced_tree_search.BuildGlobalCacheJob(*args, **kwargs)¶
Standalone job to create the global-cache for advanced-tree-search
- Parameters:
crp (rasr.CommonRasrParameters) – common RASR params (required: lexicon, acoustic_model, language_model, recognizer)
extra_config (rasr.Configuration) – overlay config that influences the Job’s hash
extra_post_config (rasr.Configuration) – overlay config that does not influence the Job’s hash
- cleanup_before_run(cmd, retry, *args)¶
- classmethod create_config(crp, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.recognition.cn_decoding
¶
- class i6_core.recognition.cn_decoding.CNDecodingJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, lattice_path, lm_scale, pron_scale, write_cn, extra_config, extra_post_config)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- merge()¶
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.recognition.conversion
¶
- class i6_core.recognition.conversion.LatticeToCtmJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, lattice_cache, parallelize, encoding, fill_empty_segments, best_path_algo, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- merge()¶
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.recognition.optimize_parameters
¶
- class i6_core.recognition.optimize_parameters.OptimizeAMandLMScaleJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, *args)¶
- classmethod create_config(crp, lattice_cache, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.recognition.prune
¶
- class i6_core.recognition.prune.LatticePruningJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, lattice_path, pruning_threshold, phone_coverage, nonword_phones, max_arcs_per_second, max_arcs_per_segment, output_format, pronunciation_scale, extra_config, extra_post_config)¶
- create_files()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.recognition.scoring
¶
- class i6_core.recognition.scoring.AnalogJob(*args, **kwargs)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.recognition.scoring.Hub5ScoreJob(*args, **kwargs)¶
- Parameters:
ref – reference stm text file
glm – text file containing mapping rules for scoring
hyp – hypothesis ctm text file
sctk_binary_path – set an explicit binary path.
- calc_wer()¶
- run(move_files=True)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.recognition.scoring.KaldiScorerJob(*args, **kwargs)¶
Applies the Kaldi compute-wer binary. Requires gs.KALDI_PATH to be the path to the Kaldi bin folder.
- Parameters:
ref – Path to corpus file. This job will generate reference from it.
hyp – Path to CTM file. It will be converted to Kaldi format in this Job.
map – Dictionary with words to be replaced in hyp. Example: {‘[NOISE]’ : ‘’}
regex – String with groups used for regex the segment names. WER will be calculated for each group individually. Example: ‘.*(S..)(P..).*’
- calc_wer()¶
- run(report_path=None, ref_path=None, hyp_path=None)¶
- run_regex()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.recognition.scoring.QuaeroScorerJob(*args, **kwargs)¶
- calc_wer()¶
- run(move_files=True)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.recognition.scoring.ScliteJob(*args, **kwargs)¶
Run the Sclite scorer from the SCTK toolkit
- Outputs:
out_report_dir: contains the report files with detailed scoring information
out_*: the job also outputs many variables, please look in the init code for a list
- Parameters:
ref – reference stm text file
hyp – hypothesis ctm text file
cer – compute character error rate
sort_files – sort ctm and stm before scoring
additional_args – additional command line arguments passed to the Sclite binary call
sctk_binary_path – set an explicit binary path.
precision_ndigit – number of digits after decimal point for the precision of the percentages in the output variables. If None, no rounding is done. In sclite, the precision was always one digit after the decimal point (https://github.com/usnistgov/SCTK/blob/f48376a203ab17f/src/sclite/sc_dtl.c#L343), thus we recalculate the percentages here.
- calc_wer()¶
- run(output_to_report_dir=True)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
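A minimal scoring sketch; the STM/CTM paths are placeholders:
    from sisyphus import tk
    from i6_core.recognition.scoring import ScliteJob

    score_job = ScliteJob(
        ref=tk.Path("/path/to/reference.stm"),
        hyp=tk.Path("/path/to/hypothesis.ctm"),
        sort_files=True,
    )
    # detailed reports end up in score_job.out_report_dir (see “Outputs” above)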
i6_core.report.report
¶
- class i6_core.report.report.GenerateReportStringJob(*args, **kwargs)¶
Job to generate and output a report string
- Parameters:
report_values – Can be either directly callable or a dict which then is handled by report_template
report_template – Function to handle report_values of type _Report_Type
compress – Whether to zip the report
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.report.report.MailJob(*args, **kwargs)¶
Job that sends a mail upon completion of an output
- Parameters:
result – graph output that triggers sending the mail
subject – Subject of the mail
mail_address – Mail address of recipient (default: user)
send_contents – send the contents of result in body of the mail
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.returnn.compile
¶
- class i6_core.returnn.compile.CompileNativeOpJob(*args, **kwargs)¶
Compile a RETURNN native op into a shared object file.
- Parameters:
native_op (str) – Name of the native op to compile (e.g. NativeLstm2)
returnn_python_exe (Optional[Path]) – file path to the executable for running returnn (python binary or .sh)
returnn_root (Optional[Path]) – file path to the RETURNN repository root folder
search_numpy_blas (bool) – search for blas lib in numpy’s .libs folder
blas_lib (Path|str) – explicit path to the blas library to use
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.returnn.compile.CompileTFGraphJob(*args, **kwargs)¶
This Job is a wrapper around the RETURNN tool compile_tf_graph.py
- Parameters:
returnn_config (ReturnnConfig|Path|str) – Path to a RETURNN config file
train (int) –
eval (int) –
search (int) –
epoch (int|tk.Variable|None) – compile a specific epoch for networks that might change with every epoch
log_verbosity (int) – RETURNN log verbosity from 1 (least verbose) to 5 (most verbose)
device (str|None) – optimize graph for cpu or gpu. If None, defaults to cpu for current RETURNN. For any RETURNN version before cd4bc382, the behavior will depend on the device entry in the returnn_config, or on the availability of a GPU on the execution host if not defined at all.
summaries_tensor_name –
output_format (str) – graph output format, one of [“pb”, “pbtxt”, “meta”, “metatxt”]
returnn_python_exe (Optional[Path]) – file path to the executable for running returnn (python binary or .sh)
returnn_root (Optional[Path]) – file path to the RETURNN repository root folder
rec_step_by_step (Optional[str]) – name of rec layer for step-by-step graph
rec_json_info (bool) – whether to enable rec json info for step-by-step graph compilation
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.returnn.compile.TorchOnnxExportJob(*args, **kwargs)¶
Export an ONNX model using the appropriate RETURNN tool script.
Currently only supports PyTorch via tools/torch_export_to_onnx.py
- Parameters:
returnn_config – RETURNN config object
checkpoint – Path to the checkpoint for export
device – target device for graph creation
returnn_python_exe – file path to the executable for running returnn (python binary or .sh)
returnn_root – file path to the RETURNN repository root folder
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.returnn.config
¶
- class i6_core.returnn.config.CodeWrapper(code)¶
Can be used to insert direct “code” (not as quoted string) into the config dict
- class i6_core.returnn.config.ReturnnConfig(config, post_config=None, staged_network_dict=None, *, python_prolog=None, python_prolog_hash=None, python_epilog='', python_epilog_hash=None, hash_full_python_code=False, sort_config=True, pprint_kwargs=None, black_formatting=True)¶
An object that manages a RETURNN config.
It can be used to serialize python functions and class definitions directly from Sisyphus code and paste them into the RETURNN config file.
- Parameters:
config (dict) – dictionary of the RETURNN config variables that are hashed
post_config (dict) – dictionary of the RETURNN config variables that are not hashed
staged_network_dict (None|dict[int, str|dict[str, Any]]) – dictionary of network dictionaries or any string that defines a network with network = … (e.g. the return variable of get_ext_net_dict_py_code_str() from returnn_common), indexed by the desired starting epoch of the network stage. By enabling this, an additional “networks” folder will be created next to the config location.
python_prolog (None|str|Callable|Class|tuple|list|dict) – str or structure containing str/callables/classes that should be pasted as code at the beginning of the config file
python_prolog_hash (None|Any) – set a specific hash (str) or any type of hashable objects to overwrite the default hashing for python_prolog
python_epilog (None|str|Callable|Class|tuple|list|dict) – str or structure containing str/callables/classes that should be pasted as code at the end of the config file
python_epilog_hash (None|Any) – set a specific hash (str) or any type of hashable objects to overwrite the default hashing for python_epilog
hash_full_python_code (bool) – By default, function bodies are not hashed. If set to True, the full content of python pro-/epilog is parsed and hashed.
sort_config (bool) – If set to True, the dictionary part of the config is sorted by key
pprint_kwargs (dict|None) – kwargs for pprint, e.g. {“sort_dicts”: False} to print dicts in given order for python >= 3.8
black_formatting (bool) – if true, the written config will be formatted with black
- GET_NETWORK_CODE = 'import os\nimport sys\nsys.path.insert(0, os.path.dirname(__file__))\n\ndef get_network(epoch, **kwargs):\n from networks import networks_dict\n for epoch_ in sorted(networks_dict.keys(), reverse=True):\n if epoch_ <= epoch:\n return networks_dict[epoch_]\n assert False, "Error, no networks found"\n\n'¶
- PYTHON_CODE = '#!rnn.py\n\n${SUPPORT_CODE}\n\n${PROLOG}\n\n${REGULAR_CONFIG}\n\nlocals().update(**config)\n\n${EPILOG}\n'¶
- check_consistency()¶
Check that there is no config key overwritten by post_config. Also check for parameters that should never be hashed.
- get(key, default=None)¶
- update(other)¶
- updates a ReturnnConfig with another ReturnnConfig:
config, post_config, and pprint_kwargs use dict.update
prolog, epilog, and hashes are concatenated
staged_network_dict, sort_config, and black_formatting are overwritten
- Parameters:
other (ReturnnConfig) –
- write(path)¶
- Parameters:
path (str) –
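A small sketch of building and writing a config; the dictionary contents are placeholders, and only config (not post_config) influences the hash:
    from i6_core.returnn.config import ReturnnConfig

    returnn_config = ReturnnConfig(
        config={"task": "train", "batch_size": 10000},      # hashed
        post_config={"log_verbosity": 5, "device": "gpu"},  # not hashed
        python_epilog="import numpy\n",
    )
    returnn_config.write("returnn.config")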
- class i6_core.returnn.config.WriteReturnnConfigJob(*args, **kwargs)¶
Writes a ReturnnConfig into a .config file
- Parameters:
returnn_config (ReturnnConfig) –
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.returnn.dataset
¶
i6_core.returnn.extract_prior
¶
i6_core.returnn.flow
¶
i6_core.returnn.forward
¶
i6_core.returnn.hdf
¶
i6_core.returnn.oggzip
¶
i6_core.returnn.rasr_training
¶
i6_core.returnn.search
¶
i6_core.returnn.training
¶
- class i6_core.returnn.training.AverageTFCheckpointsJob(*args, **kwargs)¶
Compute the average of multiple specified Tensorflow checkpoints using the tf_avg_checkpoints script from Returnn
- Parameters:
model_dir – model dir from ReturnnTrainingJob
epochs – manually specified epochs or out_epoch from GetBestEpochJob
returnn_python_exe – file path to the executable for running returnn (python binary or .sh)
returnn_root – file path to the RETURNN repository root folder
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.returnn.training.Checkpoint(index_path)¶
Checkpoint object which holds the (Tensorflow) index file path as tk.Path, and will return the checkpoint path as common prefix of the .index/.meta/.data[…]
A checkpoint object should be directly assigned to a RasrConfig entry (do not call .ckpt_path) so that the hash will resolve correctly
- Parameters:
index_path (Path) –
- property ckpt_path¶
- exists()¶
- class i6_core.returnn.training.GetBestEpochJob(*args, **kwargs)¶
Provided a RETURNN model directory and a score key, finds the best epoch. The sorting is lower=better, so to access the model with the highest values use negative index values (e.g. -1 for the model with the highest score, error or “loss”)
- Parameters:
model_dir – model_dir output from a RETURNNTrainingJob
learning_rates – learning_rates output from a RETURNNTrainingJob
key – a key from the learning rate file that is used to sort the models, e.g. “dev_score_output/output_prob”
index – index of the sorted list to access, 0 for the lowest, -1 for the highest score/error/loss
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.returnn.training.GetBestPtCheckpointJob(*args, **kwargs)¶
Analogous to GetBestTFCheckpointJob, but for PyTorch checkpoints.
- Parameters:
model_dir (Path) – model_dir output from a ReturnnTrainingJob
learning_rates (Path) – learning_rates output from a ReturnnTrainingJob
key (str) – a key from the learning rate file that is used to sort the models e.g. “dev_score_output/output_prob”
index (int) – index of the sorted list to access, 0 for the lowest, -1 for the highest score
- run()¶
- class i6_core.returnn.training.GetBestTFCheckpointJob(*args, **kwargs)¶
Returns the best checkpoint given a training model dir and a learning-rates file. The best checkpoint will be HARD-linked if possible, so that no space is wasted and the model is not deleted in case the training folder is removed.
- Parameters:
model_dir (Path) – model_dir output from a RETURNNTrainingJob
learning_rates (Path) – learning_rates output from a RETURNNTrainingJob
key (str) – a key from the learning rate file that is used to sort the models e.g. “dev_score_output/output_prob”
index (int) – index of the sorted list to access, 0 for the lowest, -1 for the highest score
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
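A usage sketch (train_job is assumed to be a ReturnnTrainingJob; the name of the checkpoint output attribute is an assumption, as it is not listed above):

    from i6_core.returnn.training import GetBestTFCheckpointJob

    best_ckpt_job = GetBestTFCheckpointJob(
        model_dir=train_job.out_model_dir,
        learning_rates=train_job.out_learning_rates,
        key="dev_score_output/output_prob",
        index=0,  # 0 selects the lowest value of the key, i.e. the best model
    )
    best_checkpoint = best_ckpt_job.out_checkpoint  # assumed output attribute name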
- class i6_core.returnn.training.PtCheckpoint(path: Path)¶
Checkpoint object pointing to a PyTorch checkpoint .pt file
- Parameters:
path – .pt file
- exists()¶
- class i6_core.returnn.training.ReturnnModel(returnn_config_file, model, epoch)¶
Defines a RETURNN model as config, checkpoint meta file and epoch
This is deprecated, use Checkpoint instead.
- Parameters:
returnn_config_file (Path) – Path to a returnn config file
model (Path) – Path to a RETURNN checkpoint (only the .meta for Tensorflow)
epoch (int) –
- class i6_core.returnn.training.ReturnnTrainingFromFileJob(*args, **kwargs)¶
This Job allows executing RETURNN config files directly. The config files have to contain the lines ext_model = config.value(“ext_model”, None) and model = ext_model to correctly set the model path
If the learning rate file should be available, add ext_learning_rate_file = config.value(“ext_learning_rate_file”, None) and learning_rate_file = ext_learning_rate_file
Other externally controllable parameters may also be defined in the same way, and can be set by providing the parameter value in the parameter_dict. The “ext_” prefix is a naming convention only, but should be used for all external parameters to clearly mark them instead of simply overwriting a normal parameter.
Also make sure that task=”train” is set.
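For illustration, an excerpt of a RETURNN config file prepared for this job could look as follows (it just echoes the lines described above; the config object is provided by RETURNN when the file is loaded):

    # external model path and learning-rate file, controllable via parameter_dict
    ext_model = config.value("ext_model", None)
    model = ext_model
    ext_learning_rate_file = config.value("ext_learning_rate_file", None)
    learning_rate_file = ext_learning_rate_file

    task = "train"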
- Parameters:
returnn_config_file (tk.Path|str) – a returnn training config file
parameter_dict (dict) – provide external parameters to the rnn.py call
time_rqmt (int|str) –
mem_rqmt (int|str) –
returnn_python_exe (Optional[Path]) – file path to the executable for running returnn (python binary or .sh)
returnn_root (Optional[Path]) – file path to the RETURNN repository root folder
- create_files()¶
- get_parameter_list()¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- path_available(path)¶
Returns True if the given path is already available
- Parameters:
path – path to check
- Returns:
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.returnn.training.ReturnnTrainingJob(*args, **kwargs)¶
Train a RETURNN model using the rnn.py entry point.
Only returnn_config, returnn_python_exe and returnn_root influence the hash.
The outputs provided are:
out_returnn_config_file: the finalized Returnn config which is used for the rnn.py call
out_learning_rates: the file containing the learning rates and training scores (e.g. use to select the best checkpoint or generate plots)
- out_model_dir: the model directory, which can be used in succeeding jobs to select certain models or do combinations
note that the model dir is DIRECTLY AVAILABLE when the job starts running, so jobs that do not have other conditions need to implement an “update” method to check whether the required checkpoints already exist
- out_checkpoints: a dictionary containing all created checkpoints. Note that when using RETURNN’s automatic checkpoint cleaning, not all checkpoints are actually available.
- Parameters:
returnn_config –
log_verbosity – RETURNN log verbosity from 1 (least verbose) to 5 (most verbose)
device – “cpu” or “gpu”
num_epochs – number of epochs to run, will also set num_epochs in the config file. Note that this value is NOT HASHED, so that this number can be increased to continue the training.
save_interval – save a checkpoint each n-th epoch
keep_epochs – specify which checkpoints are kept, use None for the RETURNN default This will also limit the available output checkpoints to those defined. If you want to specify the keep behavior without this limitation, provide cleanup_old_models/keep in the post-config and use None here.
time_rqmt –
mem_rqmt –
cpu_rqmt –
horovod_num_processes – If used without multi_node_slots, then single node, otherwise multi node.
multi_node_slots – multi-node multi-GPU training. See Sisyphus rqmt documentation. Currently only with Horovod, and horovod_num_processes should be set as well, usually to the same value. See https://returnn.readthedocs.io/en/latest/advanced/multi_gpu.html.
returnn_python_exe – file path to the executable for running returnn (python binary or .sh)
returnn_root – file path to the RETURNN repository root folder
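A minimal construction sketch (all values are placeholders; requirement parameters are omitted and rely on defaults):

    from i6_core.returnn.training import ReturnnTrainingJob

    train_job = ReturnnTrainingJob(
        returnn_config=returnn_config,     # a ReturnnConfig object
        log_verbosity=5,
        device="gpu",
        num_epochs=250,                    # not hashed, can be increased to continue training
        save_interval=1,
        keep_epochs=None,                  # RETURNN default keep behavior
        returnn_python_exe=returnn_python_exe,
        returnn_root=returnn_root,
    )
    checkpoint = train_job.out_checkpoints[250]   # Checkpoint object, see above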
- check_blacklisted_parameters(returnn_config)¶
Check for parameters that should not be set in the config directly
- Parameters:
returnn_config (ReturnnConfig) –
- Returns:
- create_files()¶
- classmethod create_returnn_config(returnn_config, log_verbosity, device, num_epochs, save_interval, keep_epochs, horovod_num_processes, **kwargs)¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- info()¶
Returns information about the currently running job to be displayed on the web interface and the manager view
- Returns:
string to be displayed or None if not available
- Return type:
str
- path_available(path)¶
Returns True if the given path is already available
- Parameters:
path – path to check
- Returns:
- plot()¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.returnn.vocabulary
¶
i6_core.sat.clustering
¶
- class i6_core.sat.clustering.BayesianInformationClusteringJob(*args, **kwargs)¶
Generate a corpus-key-map based on the Bayesian information criterion. Each concurrent is clustered independently.
- classmethod create_config(crp, feature_flow, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod create_flow(feature_flow, **kwargs)¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- merge()¶
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.sat.flow
¶
- i6_core.sat.flow.add_cmllr_transform(feature_net: FlowNetwork, map_file: Path, transform_dir: Path, matrix_name: str = '$input(corpus-key).matrix') FlowNetwork ¶
- Parameters:
feature_net – flow network for feature extraction, e.g. one from i6_core.features
map_file – RASR corpus-key-map file, e.g. out_cluster_map_file from SegmentCorpusBySpeakerJob
transform_dir – Directory containing the transformation matrix files, e.g. EstimateCMLLRJob.out_transforms
matrix_name – Name pattern for the matrix files in the transform_dir
- Returns:
A new flow network with the CMLLR transformation added
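A wiring sketch (the feature flow, the speaker-segmentation job and the CMLLR estimation job are assumed to exist already):

    from i6_core.sat.flow import add_cmllr_transform

    sat_feature_flow = add_cmllr_transform(
        feature_net=mfcc_flow,                                  # e.g. a flow from i6_core.features
        map_file=segment_by_speaker_job.out_cluster_map_file,   # RASR corpus-key map
        transform_dir=estimate_cmllr_job.out_transforms,        # from EstimateCMLLRJob
    )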
- i6_core.sat.flow.segment_clustering_flow(feature_flow=None, file='cluster.map.$(TASK)', minframes=0, mincluster=2, maxcluster=100000, threshold=0, _lambda=1, minalpha=0.4, maxalpha=0.6, alpha=-1, amalgamation=0, infile=None, **kwargs)¶
- Parameters:
feature_flow – Flownetwork of features used for clustering
file – Name of the cluster output file
minframes – minimum number of frames in a segment to consider the segment for clustering
mincluster – minimum number of clusters
maxcluster – maximum number of clusters
threshold – Threshold for BIC which is added to the model-complexity based penalty
_lambda – Weight for the model-complexity-based penalty (only lambda=1 corresponds to the definition of BIC; decreasing lambda increases the number of segment clusters)
minalpha – Minimum Alpha scaling value used within distance scaling optimization
maxalpha – Maximum Alpha scaling value used within distance scaling optimization
alpha – Weighting Factor for correlation-based distance (default is automatic alpha estimation using minalpha and maxalpha values)
amalgamation – Amalgamation Rule 1=Max Linkage, 0=Concatenation
infile – Name of the input file of clusters
- Returns:
(FlowNetwork)
i6_core.sat.training
¶
- class i6_core.sat.training.EstimateCMLLRJob(*args, **kwargs)¶
- cleanup_before_run(cmd, retry, task_id, *args)¶
- classmethod create_config(crp, feature_flow, mixtures, alignment, cluster_map, estimation_iter, min_observation_weight, optimization_criterion, extra_config, extra_post_config, **kwargs)¶
- create_files()¶
- classmethod create_flow(feature_flow, alignment, **kwargs)¶
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- move_transforms()¶
- run(task_id)¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.summary.wer
¶
- class i6_core.summary.wer.KaldiSummaryJob(*args, **kwargs)¶
- Parameters:
data (dict) – contains strings at keys data[(col, row)]
header (str) – header of the first column
col_names ([str]) – list of columns in order of appearance
row_names ([str]) – list of rows in order of appearance
sort_rows (bool) – if true, the rows will be sorted alphanumerically
sort_cols (bool) – if true, the columns will be sorted alphanumerically
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- static wer(path)¶
- class i6_core.summary.wer.PrintTableJob(*args, **kwargs)¶
- Parameters:
data (dict) – contains strings at keys data[(col, row)]
header (str) – header of the first column
col_names ([str]) – list of columns in order of appearance
row_names ([str]) – list of rows in order of appearance
sort_rows (bool) – if true, the rows will be sorted alphanumerically
sort_cols (bool) – if true, the columns will be sorted alphanumerically
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
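The data layout shared by these summary jobs can be illustrated as follows (the keyword usage and the concrete column/row names are only an example):

    from i6_core.summary.wer import PrintTableJob

    data = {
        ("dev-other", "baseline"): "8.1",
        ("test-other", "baseline"): "8.9",
        ("dev-other", "+sat"): "7.6",
        ("test-other", "+sat"): "8.2",
    }
    table_job = PrintTableJob(
        data=data,
        header="system",
        col_names=["dev-other", "test-other"],
        row_names=["baseline", "+sat"],
    )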
- class i6_core.summary.wer.ScliteLurSummaryJob(*args, **kwargs)¶
Prints a table containing all sclite lur results
- Parameters:
data – {name: str, report_dir: str}
- dict2table(dicts)¶
Gets a list of dictionaries and creates a table
- Parameters:
dicts – [{name: str, data: {col: float, col: float, …}}, …]
- parse_lur(file_path)¶
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.summary.wer.ScliteSummaryJob(*args, **kwargs)¶
- Parameters:
data (dict) – contains strings at keys data[(col, row)]
header (str) – header of the first column
col_names ([str]) – list of columns in order of appearance
row_names ([str]) – list of rows in order of appearance
sort_rows (bool) – if true, the rows will be sorted alphanumerically
sort_cols (bool) – if true, the columns will be sorted alphanumerically
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- static wer(path)¶
i6_core.tests.job_tests.corpus.test_convert
¶
- i6_core.tests.job_tests.corpus.test_convert.test_corpus_to_stm()¶
- i6_core.tests.job_tests.corpus.test_convert.test_corpus_to_stm_non_speech()¶
- i6_core.tests.job_tests.corpus.test_convert.test_corpus_to_stm_none()¶
- i6_core.tests.job_tests.corpus.test_convert.test_corpus_to_stm_punctuation()¶
- i6_core.tests.job_tests.corpus.test_convert.test_corpus_to_stm_whitespace()¶
i6_core.tests.job_tests.rasr.test_config
¶
- i6_core.tests.job_tests.rasr.test_config.test_write_rasr_config()¶
This test can be used to test the writing of different variable types into a rasr config and check for correct serialization. Only dummy example for now.
i6_core.tests.job_tests.rasr.test_flow
¶
- i6_core.tests.job_tests.rasr.test_flow.test_deterministic_flow_serialization()¶
Check if the RASR flow network is serialized in a deterministic way by running multiple serializations of slightly different flow networks. Serialization used to be non-deterministic over different python interpreter runs.
i6_core.tests.job_tests.recognition.test_scoring
¶
- i6_core.tests.job_tests.recognition.test_scoring.compile_sctk(branch: Optional[str] = None, commit: Optional[str] = None, sctk_git_repository: str = 'https://github.com/usnistgov/SCTK.git') Path ¶
- Parameters:
branch – specify a specific branch
commit – specify a specific commit
sctk_git_repository – where to clone SCTK from, usually does not need to be altered
- Returns:
SCTK binary folder
- i6_core.tests.job_tests.recognition.test_scoring.test_sclite_job()¶
i6_core.tests.job_tests.returnn.test_convert
¶
- i6_core.tests.job_tests.returnn.test_convert.test_corpus_replace_orth_from_reference_corpus()¶
i6_core.tests.job_tests.returnn.test_search
¶
i6_core.tests.job_tests.returnn.test_vocabulary
¶
i6_core.text.label.sentencepiece.train
¶
- class i6_core.text.label.sentencepiece.train.SentencePieceType(value)¶
An enumeration.
- BPE = 'bpe'¶
- CHAR = 'char'¶
- UNIGRAM = 'unigram'¶
- WORD = 'word'¶
- class i6_core.text.label.sentencepiece.train.TrainSentencePieceJob(*args, **kwargs)¶
Train a sentence-piece model to be used with RETURNN
- Parameters:
training_text (tk.Path) – raw text or gzipped text
vocab_size (int) – target vocabulary size for the created model
model_type (SentencePieceType) – which sentence-piece model type to use; use UNIGRAM for a “typical” SPM
character_coverage (float) – the official default is 0.9995, but this can cause the least used characters to be dropped entirely
additional_options (dict|None) – additional trainer options, see https://github.com/google/sentencepiece/blob/master/doc/options.md
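A construction sketch (all values are placeholders):

    from i6_core.text.label.sentencepiece.train import (
        SentencePieceType,
        TrainSentencePieceJob,
    )

    spm_job = TrainSentencePieceJob(
        training_text=training_text,            # tk.Path to raw or gzipped text
        vocab_size=2000,
        model_type=SentencePieceType.UNIGRAM,   # "typical" SPM
        character_coverage=1.0,
        additional_options=None,
    )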
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.text.label.subword_nmt.apply
¶
- class i6_core.text.label.subword_nmt.apply.ApplyBPEModelToLexiconJob(*args, **kwargs)¶
Apply BPE codes to a Bliss lexicon file
- Parameters:
bliss_lexicon (Path) –
bpe_codes (Path) –
bpe_vocab (Path|None) –
subword_nmt_repo (Optional[Path]) –
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.text.label.subword_nmt.apply.ApplyBPEToTextJob(*args, **kwargs)¶
Apply BPE codes on a text file
- Parameters:
text_file – words text file to convert to bpe
bpe_codes – bpe codes file, e.g. ReturnnTrainBpeJob.out_bpe_codes
bpe_vocab – if provided, then merge operations that produce OOV are reverted, use e.g. ReturnnTrainBpeJob.out_bpe_dummy_count_vocab
subword_nmt_repo – subword nmt repository path. see also CloneGitRepositoryJob
gzip_output – use gzip on the output text
mini_task – whether the Job should run locally, e.g. when only a small (<1M lines) text is processed
- classmethod hash(parsed_args)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.text.label.subword_nmt.train
¶
- class i6_core.text.label.subword_nmt.train.ReturnnTrainBpeJob(*args, **kwargs)¶
Create BPE codes and vocab files compatible with RETURNN BytePairEncoding.
This job can be used to produce BPE codes compatible with legacy (non-sisyphus) RETURNN setups.
- Outputs:
bpe_codes: the codes file to apply BPE to any text
- bpe_vocab: the index vocab in the form of {“<token>”: <index>, …} that can be used e.g. for RETURNN
Will contain <s> and </s> pointing to index 0 and the unk_label pointing to index 1
- bpe_dummy_count_vocab: a text file containing all words, to be used with the ApplyBPEToTextJob
DOES NOT INCLUDE COUNTS; each count is just set to -1. This is used to avoid invalid merges when converting text to the BPE form.
vocab_size: variable containing the number of indices
- Parameters:
text_file – corpus text file, .gz compressed or uncompressed
bpe_size (int) – number of BPE merge operations
unk_label (str) – unknown label
subword_nmt_repo (Path|None) – subword nmt repository path. see also CloneGitRepositoryJob
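A typical train-and-apply sketch (the output attributes of ReturnnTrainBpeJob are assumed to follow the usual out_ prefix for the outputs listed above; all other values are placeholders):

    from i6_core.text.label.subword_nmt.train import ReturnnTrainBpeJob
    from i6_core.text.label.subword_nmt.apply import ApplyBPEToTextJob

    bpe_job = ReturnnTrainBpeJob(
        text_file=corpus_text,                 # .gz compressed or uncompressed
        bpe_size=10000,
        unk_label="<unk>",
        subword_nmt_repo=subword_nmt_repo,     # e.g. from CloneGitRepositoryJob
    )
    bpe_text_job = ApplyBPEToTextJob(
        text_file=corpus_text,
        bpe_codes=bpe_job.out_bpe_codes,
        bpe_vocab=bpe_job.out_bpe_dummy_count_vocab,  # reverts merges that would produce OOV
        subword_nmt_repo=subword_nmt_repo,
    )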
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.text.label.subword_nmt.train.TrainBPEModelJob(*args, **kwargs)¶
Create a bpe codes file using the official subword-nmt repo, either installed from pip or https://github.com/rsennrich/subword-nmt
This job is deprecated, to create BPE codes that are compatible with legacy (non-sisyphus) RETURNN setups using e.g. language models from Kazuki, please use the ReturnnTrainBpeJob.
Otherwise, please consider using the sentencepiece implementation.
- Parameters:
text_corpus (Path) –
symbols (int) –
min_frequency (int) –
dict_input (bool) –
total_symbols (bool) –
subword_nmt_repo (Optional[Path]) –
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.text.processing
¶
- class i6_core.text.processing.ConcatenateJob(*args, **kwargs)¶
Concatenate all given input files (gz or raw)
- Parameters:
text_files (list[Path]) – input text files
zip_out (bool) – apply gzip to the output
out_name (str) – user specific name
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.text.processing.HeadJob(*args, **kwargs)¶
Return the head of a text file, either absolute or as ratio (provide one)
- Parameters:
text_file (Path) – text file (gz or raw)
num_lines (int) – number of lines to extract
ratio (float) – ratio of lines to extract
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.text.processing.PipelineJob(*args, **kwargs)¶
Reads a text file and applies a list of piped shell commands
- Parameters:
text_files (iterable[Path]|Path) – text file (raw or gz) or list of files to be processed
pipeline (list[str|DelayedBase]) – list of shell commands to form the pipeline, can be empty to use the job for concatenation or gzip compression only.
zip_output (bool) – apply gzip to the output
check_equal_length (bool) – the line count of the input and output should match
mini_task (bool) – the pipeline should be run as mini_task
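A pipeline sketch (the shell commands are placeholders; check_equal_length is disabled here because the grep stage may drop lines):

    from i6_core.text.processing import PipelineJob

    pipeline_job = PipelineJob(
        text_files=input_text,                                    # Path or list of Paths
        pipeline=["tr '[:upper:]' '[:lower:]'", "grep -v '^$'"],  # lowercase, drop empty lines
        zip_output=True,
        check_equal_length=False,
        mini_task=True,
    )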
- classmethod hash(parsed_args)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.text.processing.SetDifferenceJob(*args, **kwargs)¶
Return the set difference of two text files, where one line is one element.
This job performs the set difference minuend - subtrahend. Unlike the bash utility comm, the two files do not need to be sorted.
- Parameters:
minuend (Path) – left-hand side of the set subtraction
subtrahend (Path) – right-hand side of the set subtraction
gzipped (bool) – whether the output should be compressed in gzip format
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
- class i6_core.text.processing.TailJob(*args, **kwargs)¶
Return the tail of a text file, either absolute or as ratio (provide one)
- Parameters:
text_file (Path) – text file (gz or raw)
num_lines (int) – number of lines to extract
ratio (float) – ratio of lines to extract
- run()¶
- class i6_core.text.processing.WriteToTextFileJob(*args, **kwargs)¶
Write a given content into a text file, one entry per line
- Parameters:
content (list|dict|str) – input which will be written into a text file
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.tools.compile
¶
- class i6_core.tools.compile.MakeJob(*args, **kwargs)¶
Executes a sequence of make commands in a given folder
- Parameters:
folder – folder in which the make commands are executed, e.g. a CloneGitRepositoryJob output
make_sequence – list of options that are given to the make calls. Defaults to [“all”], i.e. “make all” is executed
configure_opts – if given, runs ./configure with these options before make
num_processes – number of parallel running make processes
output_folder_name – name of the output path folder, if None, the repo is not copied as output
link_outputs – provide “output_name”: “local/repo/file_folder” pairs to link (or copy if output_folder_name=None) files or directories as output. This can be used to access single binaries or a binary folder instead of the whole repository.
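A build sketch using the SCTK repository mentioned in the test helper above (the output attribute name of CloneGitRepositoryJob, the make targets and the linked folder are assumptions; omitted arguments rely on defaults):

    from i6_core.tools.git import CloneGitRepositoryJob
    from i6_core.tools.compile import MakeJob

    repo_job = CloneGitRepositoryJob(url="https://github.com/usnistgov/SCTK.git")
    make_job = MakeJob(
        folder=repo_job.out_repository,   # assumed output attribute name
        make_sequence=["all"],            # project-specific targets may be required
        num_processes=4,
        link_outputs={"bin": "bin/"},     # expose only the binary folder as output
    )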
- classmethod hash(kwargs)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks() Iterator[Task] ¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.tools.download
¶
- class i6_core.tools.download.DownloadJob(*args, **kwargs)¶
Download an arbitrary file with optional checksum verification
If a checksum is provided, the url will not be hashed
- Parameters:
url (str) –
target_filename (str|None) – explicit output filename, if None tries to detect the filename from the url
checksum (str|None) – A sha256 checksum to verify the file
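A download sketch (the URL is a placeholder):

    from i6_core.tools.download import DownloadJob

    download_job = DownloadJob(
        url="https://example.com/data/corpus.tar.gz",
        target_filename=None,      # derive the file name from the URL
        checksum=None,             # set a sha256 string here to verify and skip URL hashing
    )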
- classmethod hash(parsed_args)¶
- Parameters:
parsed_args (dict[str]) –
- Returns:
hash for job given the arguments
- Return type:
str
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.tools.git
¶
- class i6_core.tools.git.CloneGitRepositoryJob(*args, **kwargs)¶
Clone a git repository given optional branch name and commit hash
- Parameters:
url (str) – git repository url
branch (str) – git branch name
commit (str) – git commit hash
checkout_folder_name (str) – name of the output path repository folder
- run()¶
- tasks()¶
- Returns:
yields Task’s
- Return type:
list[sisyphus.task.Task]
i6_core.util
¶
- class i6_core.util.MultiOutputPath(creator, path_template, hidden_paths, cached=False)¶
- class i6_core.util.MultiPath(path_template, hidden_paths, cached=False, path_root=None, hash_overwrite=None)¶
- i6_core.util.add_suffix(string: str, suffix: str) str ¶
- i6_core.util.backup_if_exists(file: str)¶
- i6_core.util.cached_path(path: Union[str, Path]) Union[str, bytes] ¶
- i6_core.util.check_file_sha256_checksum(filename: str, reference_checksum: str)¶
Validates the sha256sum for a file against the target checksum
- Parameters:
filename – a single file to be checked
reference_checksum – checksum to verify against
- i6_core.util.chunks(l: List, n: int) List[List] ¶
- Parameters:
l – list which should be split into chunks
n – number of chunks
- Returns:
yields n chunks
- i6_core.util.compute_file_sha256_checksum(filename: str) str ¶
Computes the sha256sum for a file
- Parameters:
filename – a single file to be checked
- Returns:
checksum
:rtype:str
- i6_core.util.create_executable(filename: str, command: List[str])¶
Create an executable .sh file calling a single command
- Parameters:
filename – executable name ending with .sh
command – list representing the command and parameters
- i6_core.util.delete_if_exists(file: str)¶
- i6_core.util.delete_if_zero(file: str)¶
- i6_core.util.get_executable_path(path: Optional[Path], gs_member_name: Optional[str], default_exec_path: Optional[Path] = None) Path ¶
Helper function that allows selecting a specific version of software while maintaining compatibility with different methods that were used in the past to select software versions. It will return a Path object for the first path found, checking the parameters below in order.
- Parameters:
path – Directly specify the path to be used
gs_member_name – get path from sisyphus.global_settings.<gs_member_name>
default_exec_path – general fallback if no specific version is given
- i6_core.util.get_g2p_path(g2p_path: Path) Path ¶
gets the path to the sequitur g2p script
- i6_core.util.get_g2p_python(g2p_python: Path) Path ¶
gets the path to a python binary or script that is used to run g2p
- i6_core.util.get_returnn_python_exe(returnn_python_exe: Path) Path ¶
gets the path to a python binary or script that is used to run RETURNN
- i6_core.util.get_returnn_root(returnn_root: Path) Path ¶
gets the path to the root folder of RETURNN
- i6_core.util.get_subword_nmt_repo(subword_nmt_repo: Path) Path ¶
gets the path to the root folder of subword-nmt repo
- i6_core.util.get_val(var: Any) Any ¶
- i6_core.util.instanciate_delayed(o: Any) Any ¶
Recursively traverses a structure and calls .get() on all existing Delayed Operations, especially Variables in the structure
- Parameters:
o – nested structure that may contain DelayedBase objects
- Returns:
- i6_core.util.num_cart_labels(path: Union[str, Path]) int ¶
- i6_core.util.partition_into_tree(l: List, m: int) List[List] ¶
Transforms the list l into a nested list where each sub-list has at most length m + 1
- i6_core.util.reduce_tree(func, tree)¶
- i6_core.util.relink(src: str, dst: str)¶
- i6_core.util.remove_suffix(string: str, suffix: str) str ¶
- i6_core.util.uopen(path: Union[str, Path], *args, **kwargs) Union[open, open] ¶
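A usage sketch of uopen, which transparently handles gzipped and plain files (the mode argument is assumed to be passed through as with the built-in open; the file name is a placeholder):

    from i6_core.util import uopen

    with uopen("corpus.txt.gz", "rt") as f:
        for line in f:
            # each line is handled as plain text regardless of compression
            print(line.strip())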
- i6_core.util.update_nested_dict(dict1: Dict[str, Any], dict2: Dict[str, Any])¶
updates dict 1 with all the items from dict2, both dict1 and dict2 can be nested dict
- i6_core.util.write_paths_to_file(file: Union[str, Path], paths: List[Union[str, Path]])¶
- i6_core.util.write_xml(filename: Union[Path, str], element_tree: Union[ElementTree, Element], prettify: bool = True)¶
Writes an element tree to an xml file
- Parameters:
filename – name of desired output file
element_tree – element tree which should be written to file
prettify – prettify the xml. Warning: be careful with this option if you care about whitespace in the xml.
- i6_core.util.zmove(src: Union[str, Path], target: Union[str, Path])¶
i6_core.vtln.features
¶
- i6_core.vtln.features.VTLNFeaturesJob(crp, feature_flow, map_file, extra_warp_args=None, extra_config=None, extra_post_config=None)¶
i6_core.vtln.flow
¶
- i6_core.vtln.flow.add_static_warping_to_filterbank_flow(feature_net, alpha_name='warping-alpha', omega_name='warping-omega', node_name='filterbank')¶
- i6_core.vtln.flow.label_features_with_map_flow(feature_net, map_file, map_key='$(id)', default_output=1.0)¶
- i6_core.vtln.flow.recognized_warping_factor_flow(feature_net, alphas_file, mixtures, filterbank_node='filterbank', amplitude_spectrum_node='amplitude-spectrum', omega=0.875)¶
- i6_core.vtln.flow.warp_filterbank_with_map_flow(feature_net, map_file, map_key='$(id)', default_output=1.0, omega=0.875, node_name='filterbank')¶