i6_core.corpus.segments
- class i6_core.corpus.segments.DynamicSplitSegmentFileJob(*args, **kwargs)
Split the segments into as many shares as given by concurrent. This is a variant of the existing SplitSegmentFileJob that requires a tk.Delayed variable (instead of an int) for the argument concurrent.
- Parameters:
segment_file (tk.Path|str) – segment file
concurrent (tk.Delayed) – number of splits
- run()
- tasks()
- Returns:
yields Tasks
- Return type:
list[sisyphus.task.Task]
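To make the splitting idea concrete, here is a minimal sketch in plain Python. It is not the job's implementation: it assumes contiguous shares of near-equal size, and `LateInt` is a hypothetical stand-in for a delayed value (such as tk.Delayed) that is only resolved when the split actually runs. The names `resolve`, `split_into_shares` and `LateInt` are illustrative only.

```python
from typing import List, Sequence


def resolve(value):
    """Return value.get() for delayed-style objects, else the value itself."""
    return value.get() if hasattr(value, "get") else value


def split_into_shares(segments: Sequence[str], concurrent) -> List[List[str]]:
    """Split segments into `concurrent` contiguous shares of near-equal size.

    Assumption: contiguous shares; the real job may distribute differently.
    """
    n = int(resolve(concurrent))
    if n < 1:
        raise ValueError("concurrent must be at least 1")
    base, extra = divmod(len(segments), n)
    shares, start = [], 0
    for i in range(n):
        size = base + (1 if i < extra else 0)  # the first `extra` shares get one more
        shares.append(list(segments[start:start + size]))
        start += size
    return shares


class LateInt:
    """Hypothetical stand-in for a value that is only known at run time."""

    def __init__(self, producer):
        self._producer = producer

    def get(self):
        return self._producer()


segments = [f"corpus/rec{i}/1" for i in range(7)]
shares = split_into_shares(segments, LateInt(lambda: 3))
print([len(s) for s in shares])  # → [3, 2, 2]
```

Resolving the number of shares inside the function, rather than when the argument is passed, is what lets the share count depend on the result of an earlier step.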
- class i6_core.corpus.segments.SegmentCorpusByRegexJob(*args, **kwargs)
- run()
- tasks()
- Returns:
yields Tasks
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.SegmentCorpusBySpeakerJob(*args, **kwargs)
- run()
- tasks()
- Returns:
yields Tasks
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.SegmentCorpusJob(*args, **kwargs)
- run()
- tasks()
- Returns:
yields Tasks
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.ShuffleAndSplitSegmentsJob(*args, **kwargs)
- default_split = {'dev': 0.1, 'train': 0.9}
- classmethod hash(kwargs)
- Parameters:
kwargs (dict[str]) – parsed arguments of the job
- Returns:
hash for job given the arguments
- Return type:
str
- run()
- tasks()
- Returns:
yields Tasks
- Return type:
list[sisyphus.task.Task]
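The default_split mapping above suggests that each key names an output subset and each value is the fraction of segments assigned to it. A minimal sketch of such a shuffle-and-split step, under that assumption, could look as follows. The function name, its parameters and the rounding rule are illustrative and not taken from the job itself.

```python
import random
from typing import Dict, List, Optional, Sequence

DEFAULT_SPLIT = {"dev": 0.1, "train": 0.9}  # mirrors default_split above


def shuffle_and_split(
    segments: Sequence[str],
    split: Optional[Dict[str, float]] = None,
    shuffle: bool = True,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Shuffle segments (optionally) and cut them into named fractions."""
    split = DEFAULT_SPLIT if split is None else split
    if abs(sum(split.values()) - 1.0) > 1e-9:
        raise ValueError("split fractions must sum to 1")
    items = list(segments)
    if shuffle:
        random.Random(seed).shuffle(items)  # seeded, so the split is reproducible
    names = list(split)
    result, start, cumulative = {}, 0, 0.0
    for i, name in enumerate(names):
        cumulative += split[name]
        # The last subset takes whatever remains, so no segment is lost to rounding.
        end = len(items) if i == len(names) - 1 else round(cumulative * len(items))
        result[name] = items[start:end]
        start = end
    return result


segments = [f"corpus/rec{i}/1" for i in range(20)]
parts = shuffle_and_split(segments)
print({name: len(segs) for name, segs in parts.items()})
```

Using cumulative fractions for the cut points keeps the subsets disjoint and guarantees that together they cover every segment exactly once.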
- class i6_core.corpus.segments.SortSegmentsByLengthAndShuffleJob(*args, **kwargs)
- Parameters:
crp – rasr.crp.CommonRasrParameters
shuffle_strength – float in [0, inf) that determines how strongly the segment length affects the sorting: 0 -> completely random; inf -> strictly sorted
shuffle_seed – random number seed
- run()
- tasks()
- Returns:
yields Tasks
- Return type:
list[sisyphus.task.Task]
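One way to picture the effect of shuffle_strength is a sort key that mixes the segment length with random noise. The sketch below is an assumed noise model chosen only to reproduce the two documented extremes (0 gives a random order, inf gives a strict sort); it is not the formula used by the job, and it works on plain (name, length) pairs instead of a crp.

```python
import math
import random
from typing import List, Sequence, Tuple


def sort_by_length_and_shuffle(
    segments: Sequence[Tuple[str, float]],
    shuffle_strength: float,
    shuffle_seed: int = 0,
) -> List[str]:
    """Order (name, length) pairs by length, perturbed by seeded noise."""
    if shuffle_strength < 0:
        raise ValueError("shuffle_strength must be non-negative")
    if math.isinf(shuffle_strength):
        # Limit case: no noise at all, strictly sorted by length.
        return [name for name, _ in sorted(segments, key=lambda s: s[1])]
    rng = random.Random(shuffle_seed)
    max_len = max((length for _, length in segments), default=1.0) or 1.0
    # Assumed model: scaled relative length plus uniform noise in [0, 1).
    # With strength 0 only the noise remains, so the order is random.
    keyed = [
        (shuffle_strength * length / max_len + rng.random(), name)
        for name, length in segments
    ]
    keyed.sort(key=lambda pair: pair[0])
    return [name for _, name in keyed]


segs = [("a", 3.0), ("b", 1.0), ("c", 2.0), ("d", 5.0)]
print(sort_by_length_and_shuffle(segs, math.inf))  # → ['b', 'c', 'a', 'd']
print(sort_by_length_and_shuffle(segs, 0.0, shuffle_seed=1))
```

Intermediate strengths yield an order that is only roughly sorted by length, and fixing shuffle_seed makes that order reproducible.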
- class i6_core.corpus.segments.SplitSegmentFileJob(*args, **kwargs)
- run()
- tasks()
- Returns:
yields Tasks
- Return type:
list[sisyphus.task.Task]
- class i6_core.corpus.segments.UpdateSegmentsWithSegmentMapJob(*args, **kwargs)
Update a segment file with a segment mapping file (e.g. from corpus compression)
- Parameters:
segment_file (Path) – path to the segment text file (uncompressed)
segment_map (Path) – path to the segment map (gz or uncompressed)
- run()
- tasks()
- Returns:
yields Tasks
- Return type:
list[sisyphus.task.Task]
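As an illustration of what such an update might involve, the sketch below renames segments according to a key/value map. Several details are assumptions and not confirmed by the text above: the map is taken to be XML with map-item elements carrying key and value attributes, the mapping direction is key to value, and segments without an entry are kept unchanged. The helper names are hypothetical.

```python
import gzip
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, List


def read_text(path: str) -> str:
    """Read a text file, choosing gzip or plain open by the .gz suffix."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        return f.read()


def parse_segment_map(xml_text: str) -> Dict[str, str]:
    """Collect key -> value pairs from all map-item elements (assumed format)."""
    root = ET.fromstring(xml_text)
    return {item.attrib["key"]: item.attrib["value"] for item in root.iter("map-item")}


def update_segments(segments: Iterable[str], mapping: Dict[str, str]) -> List[str]:
    """Replace each segment name by its mapped name; keep unmapped names as-is."""
    return [mapping.get(segment, segment) for segment in segments]


SEGMENT_MAP = """<segment-key-map>
  <map-item key="corpus/rec1/seg1" value="corpus/0/0"/>
  <map-item key="corpus/rec1/seg2" value="corpus/0/1"/>
</segment-key-map>"""

mapping = parse_segment_map(SEGMENT_MAP)
updated = update_segments(["corpus/rec1/seg2", "corpus/rec2/seg1"], mapping)
print(updated)
```

With files on disk, the same functions could be combined as `update_segments(read_text(segment_file).split(), parse_segment_map(read_text(segment_map)))`, where both paths are placeholders.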