i6_core.datasets.switchboard

Switchboard is a conversational telephony speech corpus with 8 kHz audio files. The training data consists of 300 hours of speech. Reference: https://catalog.ldc.upenn.edu/LDC97S62

number of recordings: 4876

number of segments: 249624

number of speakers: 2260

class i6_core.datasets.switchboard.CreateFisherTranscriptionsJob(*args, **kwargs)

Creates compressed text data from the Fisher transcriptions, which can be used for LM training.

Part 1: https://catalog.ldc.upenn.edu/LDC2004T19

Part 2: https://catalog.ldc.upenn.edu/LDC2005T19

Parameters:
  • fisher_transcriptions1_folder – path to unpacked LDC2004T19.tgz, usually named fe_03_p1_tran

  • fisher_transcriptions2_folder – path to unpacked LDC2005T19.tgz, usually named fe_03_p2_tran

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]
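The internals of this job are not described here. As a rough illustration of turning transcription lines into compressed LM text, the sketch below assumes that each utterance line has the layout `<start> <end> <speaker>: <text>` and that lines starting with `#` are comments. Both the line layout and the helper names `extract_utterances` and `write_lm_text` are assumptions for illustration, not part of i6_core.

```python
import gzip
import re

# Assumed utterance layout: "<start> <end> <speaker>: <text>"
_UTTERANCE = re.compile(r"^\s*[\d.]+\s+[\d.]+\s+\S+:\s*(.*)$")


def extract_utterances(lines):
    """Yield the text part of each utterance line, skipping comments and blanks."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _UTTERANCE.match(line)
        if match and match.group(1):
            yield match.group(1)


def write_lm_text(utterances, out_path):
    """Write one utterance per line into a gzip-compressed text file."""
    with gzip.open(out_path, "wt", encoding="utf-8") as out:
        for utterance in utterances:
            out.write(utterance + "\n")
```

A real pipeline would iterate over all transcript files in both folders before writing the output file.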

class i6_core.datasets.switchboard.CreateHub5e00CorpusJob(*args, **kwargs)

Creates the Switchboard hub5e_00 corpus based on LDC2002S09

No speaker information attached

Parameters:
  • wav_audio_folder – output of SwitchboardSphereToWaveJob called on extracted LDC2002S09.tgz

  • hub5_transcriptions – extracted LDC2002T43.tgz named “2000_hub5_eng_eval_tr”

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateHub5e01CorpusJob(*args, **kwargs)

Creates the Switchboard hub5e_01 corpus based on LDC2002S13

This corpus provides no GLM file; the one from Hub5e00 should be used instead

No speaker information attached

Parameters:
  • wav_audio_folder – output of SwitchboardSphereToWaveJob called on extracted LDC2002S13.tgz

  • hub5e01_folder – extracted LDC2002S13 named “hub5e_01”

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateLDCSwitchboardSpeakerListJob(*args, **kwargs)

This creates the speaker list according to the conversation and speaker table from the LDC documentation: https://catalog.ldc.upenn.edu/docs/LDC97S62

The resulting file contains 520 speakers, one per line, in the following format:

<speaker_id> <gender> <recording>

Parameters:
  • caller_tab_file – caller_tab.csv from the Switchboard LDC documentation

  • conv_tab_file – conv_tab.csv from the Switchboard LDC documentation

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateRT03sCTSCorpusJob(*args, **kwargs)

Create the RT03 test set corpus, specifically the “CTS” subset of LDC2007S10

No speaker information attached

Parameters:
  • wav_audio_folder – output of SwitchboardSphereToWaveJob called on extracted LDC2007S10.tgz

  • rt03_folder – extracted LDC2007S10.tgz

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateSwitchboardBlissCorpusJob(*args, **kwargs)

Creates Switchboard bliss corpus xml

segment name format: sw2001B-ms98-a-<folder-name>

Parameters:
  • audio_dir (tk.Path) – path for audio data

  • trans_dir (tk.Path) – path for transcription data, see DownloadSwitchboardTranscriptionAndDictJob

  • speakers_list_file (tk.Path) –

    path to a speakers list text file where each line has the format:

    speaker_id gender recording<channel>, e.g. 1005 F 2452A

    See CreateSwitchboardSpeakersListJob.

  • skip_empty_ldc_file (bool) – In the original corpus the sequence 2167B is mostly empty, thus exclude it from training (recommended, GMM will fail otherwise)

  • lowercase (bool) – lowercase the transcriptions of the corpus (recommended)

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]
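To make the expected layout of speakers_list_file concrete, here is a minimal sketch of a reader for lines such as `1005 F 2452A`. The names `SpeakerEntry` and `parse_speakers_list` are hypothetical and only illustrate the three-field format; the job's own parsing may differ.

```python
from typing import Dict, Iterable, NamedTuple


class SpeakerEntry(NamedTuple):
    speaker_id: str
    gender: str


def parse_speakers_list(lines: Iterable[str]) -> Dict[str, SpeakerEntry]:
    """Map each recording (including channel, e.g. '2452A') to its speaker entry."""
    speakers = {}
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(fields)}")
        speaker_id, gender, recording = fields
        speakers[recording] = SpeakerEntry(speaker_id, gender)
    return speakers
```

Keying the result by recording makes it easy to look up the speaker while iterating over the audio files of a corpus.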

class i6_core.datasets.switchboard.CreateSwitchboardLexiconTextFileJob(*args, **kwargs)

This job creates a preprocessed SWB dictionary text file that is consistent with the training corpus, given the raw dictionary text file that DownloadSwitchboardTranscriptionAndDictJob downloads into the transcription directory. The resulting dictionary text file can be passed to LexiconFromTextFileJob to create a bliss XML lexicon.

Parameters:

raw_dict_file (tk.Path) – path containing the raw dictionary text file

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateSwitchboardSpeakersListJob(*args, **kwargs)

Given a speaker statistics file, this job creates a text file that has the following on each line:

speaker_id gender recording

Parameters:

speakers_stats_file (tk.Path) – speakers stats text file

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.CreateSwitchboardSpokenFormBlissCorpusJob(*args, **kwargs)

Creates a special spoken-form version of Switchboard-1, used e.g. for BPE or SentencePiece based models. It includes:

  • lowercasing everything

  • conversion of numbers to written form (using a given conversion table)

  • conversion of some short forms into spoken forms (also using the table)

  • making special tokens uppercase again

Parameters:

switchboard_bliss_corpus – out_corpus of CreateSwitchboardBlissCorpusJob

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]
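The four normalization steps listed for this job can be sketched on a single transcription string as follows. The conversion table entries and the special token `[LAUGHTER]` used in the usage example are made-up assumptions, and `to_spoken_form` is a hypothetical helper, not the job's implementation.

```python
def to_spoken_form(text, conversion_table, special_tokens):
    """Lowercase, apply table-based conversions, then restore special tokens."""
    converted = []
    for word in text.lower().split():
        # One table covers both numbers and short forms; an entry may
        # expand to several words (e.g. "100" -> "one hundred").
        converted.extend(conversion_table.get(word, word).split())
    # Special tokens were lowercased above, so map them back to uppercase.
    restore = {token.lower(): token for token in special_tokens}
    return " ".join(restore.get(word, word) for word in converted)
```

For example, with the assumed table `{"mr.": "mister", "2": "two"}` and special token `[LAUGHTER]`, the input `Mr. Smith has 2 cats [LAUGHTER]` becomes `mister smith has two cats [LAUGHTER]`.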

class i6_core.datasets.switchboard.DownloadSwitchboardSpeakersStatsJob(*args, **kwargs)

Note that this does not contain the speaker info for all recordings. Downstream, each recording without speaker info is assumed to have its own unique speaker, and a unique ID is used for it.

Parameters:
  • url (str) –

  • target_filename (str|None) – explicit output filename, if None tries to detect the filename from the url

  • checksum (str|None) – A sha256 checksum to verify the file

classmethod hash(parsed_args)
Parameters:

parsed_args (dict[str]) –

Returns:

hash for job given the arguments

Return type:

str
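The target_filename and checksum parameters above describe two generic behaviours: deriving a filename from the URL and verifying a SHA-256 checksum. A minimal standard-library sketch of both is given below; the helper names are hypothetical and the job's actual logic is not shown in this documentation.

```python
import hashlib
import posixpath
from urllib.parse import urlparse


def filename_from_url(url):
    """Use the last path component of the URL as filename (query string ignored)."""
    name = posixpath.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"cannot detect a filename from {url!r}")
    return name


def sha256_matches(path, expected_hex, chunk_size=1 << 20):
    """Return True if the file's SHA-256 hex digest equals expected_hex."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large downloads do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```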

class i6_core.datasets.switchboard.DownloadSwitchboardTranscriptionAndDictJob(*args, **kwargs)

Downloads the Switchboard training transcriptions and the dictionary (lexicon)

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.datasets.switchboard.SwitchboardSphereToWaveJob(*args, **kwargs)

Takes an audio folder from one of the Switchboard LDC folders and converts dual-channel .sph files with mu-law encoding to single-channel .wav files with s16le encoding

Parameters:

sph_audio_folder

run()
tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]
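To illustrate what the conversion means at the sample level, the sketch below decodes G.711 mu-law bytes to signed 16-bit little-endian PCM and splits an interleaved two-channel stream into two mono streams. It operates on a raw sample payload only: header parsing, possible compression of the .sph files, and writing the .wav container are left out. The tooling the job actually uses is not stated here, and the function names are hypothetical.

```python
import struct


def mulaw_to_pcm16(byte_value):
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~byte_value & 0xFF  # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def split_and_decode(interleaved):
    """Split interleaved two-channel mu-law bytes into two s16le byte strings."""
    channels = []
    for offset in (0, 1):
        samples = [mulaw_to_pcm16(b) for b in interleaved[offset::2]]
        # "<h" is a little-endian signed 16-bit integer, i.e. s16le.
        channels.append(struct.pack("<%dh" % len(samples), *samples))
    return channels[0], channels[1]
```

Each returned byte string could then be written as the frames of a mono 8 kHz file with the standard `wave` module.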