`i6_core.datasets.switchboard`¶

Switchboard is conversational telephony speech with 8 kHz audio files. The training data consists of 300 hours of speech. Reference: https://catalog.ldc.upenn.edu/LDC97S62

number of recordings: 4876
number of segments: 249624
number of speakers: 2260

class i6_core.datasets.switchboard.CreateFisherTranscriptionsJob(*args, **kwargs)¶

Creates the compressed text data based on the Fisher transcriptions, which can be used for LM training.

Part 1: https://catalog.ldc.upenn.edu/LDC2004T19
Part 2: https://catalog.ldc.upenn.edu/LDC2005T19

Parameters:

fisher_transcriptions1_folder – path to unpacked LDC2004T19.tgz, usually named fe_03_p1_tran
fisher_transcriptions2_folder – path to unpacked LDC2005T19.tgz, usually named fe_03_p2_tran

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]
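
The following is a minimal sketch of the general idea behind turning transcription files into compressed LM text. It is not the job's implementation. It assumes that each utterance line has the form `<start> <end> <A|B>: <text>` and that other lines (headers, blanks) can be ignored; the function names and the sample lines are hypothetical.

```python
import gzip
import os
import re
import tempfile
from typing import Iterable, Iterator

# Assumed line layout: "<start> <end> <channel>: <text>", e.g. "7.73 9.15 B: hello there".
UTTERANCE_RE = re.compile(r"^\s*[\d.]+\s+[\d.]+\s+[AB]:\s*(.*)$")


def extract_utterances(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text part of every line that matches the assumed layout."""
    for line in lines:
        match = UTTERANCE_RE.match(line)
        if match and match.group(1).strip():
            yield match.group(1).strip()


def write_lm_text(utterances: Iterable[str], path: str) -> None:
    """Write one utterance per line into a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for utterance in utterances:
            f.write(utterance + "\n")


# Made-up sample lines, for illustration only.
sample = ["# some header line", "", "7.73 9.15 B: hello there", "9.20 10.00 A: "]

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "fisher.lm.txt.gz")
    write_lm_text(extract_utterances(sample), out_path)
    with gzip.open(out_path, "rt", encoding="utf-8") as f:
        roundtrip = f.read()
```

Lines without text after the channel marker are skipped here; whether the actual job does the same is not stated in this documentation.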

class i6_core.datasets.switchboard.CreateHub5e00CorpusJob(*args, **kwargs)¶

Creates the switchboard hub5e_00 corpus based on LDC2002S09

No speaker information attached

Parameters:

wav_audio_folder – output of SwitchboardSphereToWave called on extracted LDC2002S09.tgz
hub5_transcriptions – extracted LDC2002T43.tgz named “2000_hub5_eng_eval_tr”

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateHub5e01CorpusJob(*args, **kwargs)¶

Creates the switchboard hub5e_01 corpus based on LDC2002S13

This corpus provides no glm file; the one from Hub5e00 should be used instead

No speaker information attached

Parameters:

wav_audio_folder – output of SwitchboardSphereToWave called on extracted LDC2002S13.tgz
hub5e01_folder – extracted LDC2002S13 named “hub5e_01”

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateLDCSwitchboardSpeakerListJob(*args, **kwargs)¶

This creates the speaker list according to the conversation and speaker table from the LDC documentation: https://catalog.ldc.upenn.edu/docs/LDC97S62

The resulting file contains 520 speakers in the format: <speaker_id> <gender> <recording>

Parameters:

caller_tab_file – caller_tab.csv from the Switchboard LDC documentation
conv_tab_file – conv_tab.csv from the Switchboard LDC documentation

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateRT03sCTSCorpusJob(*args, **kwargs)¶

Create the RT03 test set corpus, specifically the “CTS” subset of LDC2007S10

No speaker information attached

Parameters:

wav_audio_folder – output of SwitchboardSphereToWave called on extracted LDC2007S10.tgz
rt03_folder – extracted LDC2007S10.tgz

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateSwitchboardBlissCorpusJob(*args, **kwargs)¶

Creates Switchboard bliss corpus xml

segment name format: sw2001B-ms98-a-<folder-name>

Parameters:

audio_dir (tk.Path) – path for audio data
trans_dir (tk.Path) – path for transcription data, see DownloadSwitchboardTranscriptionAndDictJob
speakers_list_file (tk.Path) – path to a speakers list text file with one entry per line in the format speaker_id gender recording<channel>, e.g. 1005 F 2452A. See CreateSwitchboardSpeakersListJob
skip_empty_ldc_file (bool) – In the original corpus the sequence 2167B is mostly empty, thus exclude it from training (recommended, GMM will fail otherwise)
lowercase (bool) – lowercase the transcriptions of the corpus (recommended)

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]
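
To make the speakers list format concrete, here is a small sketch of a parser for lines such as `1005 F 2452A`. It is only an illustration of the documented format, not code from the job; `SpeakerEntry` and `parse_speakers_list` are hypothetical names, and the second example line is made up.

```python
from typing import Dict, NamedTuple


class SpeakerEntry(NamedTuple):
    speaker_id: str
    gender: str
    recording: str  # recording number followed by the channel letter, e.g. "2452A"


def parse_speakers_list(text: str) -> Dict[str, SpeakerEntry]:
    """Map each recording (including channel) to its speaker entry."""
    entries: Dict[str, SpeakerEntry] = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(fields)}")
        entry = SpeakerEntry(*fields)
        entries[entry.recording] = entry
    return entries


# First line is the example from the parameter description, second line is made up.
example = "1005 F 2452A\n1007 M 2452B\n"
speakers = parse_speakers_list(example)
```

Keying the result by recording and channel fits the use case of looking up the speaker for a given side of a conversation.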

class i6_core.datasets.switchboard.CreateSwitchboardLexiconTextFileJob(*args, **kwargs)¶

This job creates a preprocessed SWB dictionary text file that is consistent with the training corpus. The input is the raw dictionary text file that DownloadSwitchboardTranscriptionAndDictJob downloads into the transcription directory. The resulting dictionary text file is meant to be passed to LexiconFromTextFileJob in order to create a bliss xml lexicon.

Parameters: raw_dict_file (tk.Path) – path containing the raw dictionary text file

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateSwitchboardSpeakersListJob(*args, **kwargs)¶

Given a speaker statistics file, this job creates a text file with one entry per line in the format: speaker_id gender recording

Parameters: speakers_stats_file (tk.Path) – speakers stats text file

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateSwitchboardSpokenFormBlissCorpusJob(*args, **kwargs)¶

Creates a special spoken form version of switchboard-1 used for e.g. BPE or Sentencepiece based models. It includes:

make sure everything is lowercased

conversion of numbers to written form (using a given conversion table)

conversion of some short forms into spoken forms (also using the table)

making special tokens uppercase again

Parameters: switchboard_bliss_corpus – out_corpus of CreateSwitchboardBlissCorpusJob

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]
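
The spoken form steps listed above can be sketched roughly as follows. This is a simplified illustration and not the job's code: the conversion table entries, the special token set and the function name are assumptions, and a single table stands in for both the number and the short form conversion.

```python
from typing import Dict, Set


def to_spoken_form(text: str, conversion_table: Dict[str, str], special_tokens: Set[str]) -> str:
    """Lowercase the text, apply the table word by word, keep special tokens uppercase."""
    converted = []
    for word in text.lower().split():
        if word.upper() in special_tokens:
            # special tokens are restored to their uppercase form
            converted.append(word.upper())
        else:
            # numbers and short forms are replaced if the table has an entry
            converted.append(conversion_table.get(word, word))
    return " ".join(converted)


# Hypothetical table and token set, for illustration only.
table = {"20": "twenty", "mr.": "mister"}
tokens = {"[LAUGHTER]", "[NOISE]"}

result = to_spoken_form("Mr. Smith paid 20 dollars [LAUGHTER]", table, tokens)
print(result)  # mister smith paid twenty dollars [LAUGHTER]
```

A word-by-word lookup cannot handle entries that span several words; a real conversion table may need a different matching strategy.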

class i6_core.datasets.switchboard.DownloadSwitchboardSpeakersStatsJob(*args, **kwargs)¶

Note that this does not contain the speaker info for all recordings. We assume later that each recording has a unique speaker, and a unique id is used for those recordings with unknown speaker info

Parameters:

url (str) –
target_filename (str|None) – explicit output filename, if None tries to detect the filename from the url
checksum (str|None) – A sha256 checksum to verify the file

classmethod hash(parsed_args)¶

Parameters: parsed_args (dict[str]) –
Returns: hash for job given the arguments
Return type: str
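
The `target_filename` and `checksum` parameters describe two common download checks. The sketch below shows one plausible way to implement them with the standard library; it is not taken from the job, and both helper names are hypothetical.

```python
import hashlib
import posixpath
from typing import Optional
from urllib.parse import urlparse


def filename_from_url(url: str) -> str:
    """Use the last path component of the URL as filename (query string is ignored)."""
    name = posixpath.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"cannot detect a filename from url: {url}")
    return name


def sha256_matches(data: bytes, checksum: Optional[str]) -> bool:
    """Return True if no checksum is given or if the sha256 hex digest matches."""
    if checksum is None:
        return True
    return hashlib.sha256(data).hexdigest() == checksum.lower()


# example.org is a placeholder, not the URL used by the job
name = filename_from_url("https://example.org/files/stats.txt?download=1")
```

For large files the digest would normally be computed in chunks with `hashlib.sha256().update(...)` instead of reading everything into memory.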

class i6_core.datasets.switchboard.DownloadSwitchboardTranscriptionAndDictJob(*args, **kwargs)¶

Downloads switchboard training transcriptions and dictionary (or lexicon)

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]

class i6_core.datasets.switchboard.SwitchboardSphereToWaveJob(*args, **kwargs)¶

Takes an audio folder from one of the switchboard LDC folders and converts dual channel .sph files with mulaw encoding to single channel .wav files with s16le encoding

Parameters: sph_audio_folder –

run()¶

tasks()¶

Returns: yields Tasks
Return type: list[sisyphus.task.Task]
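
To illustrate what the conversion from dual channel mu-law to single channel s16le means at the sample level, here is a pure Python sketch using the standard G.711 mu-law expansion. It does not reflect which tool the job uses internally. It assumes that the input is the raw sample payload (any file header already removed) with one byte per sample and the two channels interleaved; the function names are hypothetical.

```python
import struct
from typing import Tuple


def mulaw_byte_to_pcm16(code: int) -> int:
    """Expand one 8-bit G.711 mu-law code into a signed 16-bit linear sample."""
    code = ~code & 0xFF  # mu-law codes are stored bit-inverted
    sign = code & 0x80
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def split_and_decode(interleaved: bytes) -> Tuple[bytes, bytes]:
    """De-interleave two mu-law channels and return each as little-endian s16 bytes."""
    channels = []
    for offset in (0, 1):
        samples = [mulaw_byte_to_pcm16(b) for b in interleaved[offset::2]]
        channels.append(struct.pack(f"<{len(samples)}h", *samples))
    return channels[0], channels[1]


# Four interleaved bytes: channel A gets 0xFF, 0xFF and channel B gets 0x00, 0x80.
channel_a, channel_b = split_and_decode(bytes([0xFF, 0x00, 0xFF, 0x80]))
```

Each output channel could then be written to its own .wav file, for example with the standard `wave` module using one channel, a sample width of 2 bytes and a frame rate of 8000.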
