i6_core.lm.srilm

class i6_core.lm.srilm.ComputeBestMixJob(*args, **kwargs)

Compute the best mixture weights for a combination of count LMs based on the given PPL logs

Parameters:
  • ppl_logs – List of PPL Logs to compute the weights from

  • compute_best_mix_exe – Path to srilm compute_best_mix executable
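
A minimal usage sketch inside a Sisyphus config; all paths are placeholders, and the PPL logs would typically be produced by ComputeNgramLmPerplexityJob (whose output attribute names are not shown on this page):

    from sisyphus import tk
    from i6_core.lm.srilm import ComputeBestMixJob

    # one PPL log per LM to be mixed, all computed on the same evaluation data
    ppl_logs = [tk.Path("/path/to/lm_a.ppl.log"), tk.Path("/path/to/lm_b.ppl.log")]

    best_mix = ComputeBestMixJob(
        ppl_logs=ppl_logs,
        compute_best_mix_exe=tk.Path("/opt/srilm/bin/compute-best-mix"),  # placeholder install path
    )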

run()

Calls the SRILM script, extracts the different weights from the log, then relinks the log to the output folder

tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.lm.srilm.ComputeNgramLmJob(*args, **kwargs)

Generate count based LM with SRILM

Parameters:
  • ngram_order – Maximum n-gram order

  • data – Either a text file or a counts file to read from; set data_mode accordingly. The counts file can come from CountNgramsJob.out_counts

  • data_mode – Defines whether input format is text based or count based

  • vocab – Vocabulary file, one word per line

  • extra_ngram_args – Extra arguments for the execution call e.g. [‘-kndiscount’]

  • count_exe – Path to srilm ngram-count exe

  • mem_rqmt – Memory requirements of Job (not hashed)

  • time_rqmt – Time requirements of Job (not hashed)

  • cpu_rqmt – Number of CPUs required for the Job (not hashed)

  • fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)

Example options for extra_ngram_args: -kndiscount -interpolate -debug <int> -addsmooth <int>
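
A usage sketch for estimating a Kneser-Ney trigram LM directly from text, assuming a Sisyphus config; all paths are placeholders:

    from sisyphus import tk
    from i6_core.lm.srilm import ComputeNgramLmJob

    lm_job = ComputeNgramLmJob(
        ngram_order=3,
        data=tk.Path("/path/to/corpus.txt"),        # raw text input
        data_mode=ComputeNgramLmJob.DataMode.TEXT,  # use DataMode.COUNT for a counts file
        vocab=tk.Path("/path/to/vocab.txt"),        # one word per line
        extra_ngram_args=["-kndiscount", "-interpolate"],
        count_exe=tk.Path("/opt/srilm/bin/ngram-count"),  # placeholder install path
    )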

class DataMode(value)

Enumeration of the supported input data modes (see data_mode): text input or counts input.

COUNT = 2
TEXT = 1
compress()

Executes the previously created compression script and relinks the LM from the work folder to the output folder

create_files()

Creates the bash script for LM creation and compression that will be executed in the run Task

classmethod hash(kwargs)

Deletes the queue requirements from the hashing

run()

Executes the previously created LM script and relinks the vocabulary from the work folder to the output folder

tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.lm.srilm.ComputeNgramLmPerplexityJob(*args, **kwargs)

Calculate the Perplexity of a Ngram LM via SRILM

Parameters:
  • ngram_order – Maximum n-gram order

  • lm – LM to evaluate

  • eval_data – Data to calculate PPL on

  • vocab – Vocabulary file

  • set_unknown_flag – Whether to set the unknown-word flag

  • extra_ppl_args – Extra arguments for the execution call e.g. ‘-debug 2’

  • ngram_exe – Path to srilm ngram exe

  • mem_rqmt – Memory requirements of Job (not hashed)

  • time_rqmt – Time requirements of Job (not hashed)

  • cpu_rqmt – Number of CPUs required for the Job (not hashed)

  • fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)
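
A usage sketch, assuming a Sisyphus config; lm and eval_data are placeholder paths (in practice lm would typically be the output of ComputeNgramLmJob, whose output attribute names are not shown on this page):

    from sisyphus import tk
    from i6_core.lm.srilm import ComputeNgramLmPerplexityJob

    ppl_job = ComputeNgramLmPerplexityJob(
        ngram_order=3,
        lm=tk.Path("/path/to/lm.gz"),           # LM to evaluate
        eval_data=tk.Path("/path/to/dev.txt"),  # data to compute the PPL on
        vocab=tk.Path("/path/to/vocab.txt"),
        set_unknown_flag=True,
        extra_ppl_args="-debug 2",
        ngram_exe=tk.Path("/opt/srilm/bin/ngram"),  # placeholder install path
    )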

create_files()

Creates the bash script that will be executed in the run Task

get_ppl()

Extracts various outputs from the ppl.log file

classmethod hash(kwargs)

Deletes the queue requirements from the hashing

run()

Executes the previously created script and relinks the log file from the work folder to the output folder

tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.lm.srilm.CountNgramsJob(*args, **kwargs)

Count ngrams with SRILM

Parameters:
  • ngram_order – Maximum n-gram order

  • data – Input data to be read as a text file

  • extra_count_args – Extra arguments for the execution call e.g. [‘-unk’]

  • count_exe – Path to srilm ngram-count executable

  • mem_rqmt – Memory requirements of Job (not hashed)

  • time_rqmt – Time requirements of Job (not hashed)

  • cpu_rqmt – Number of CPUs required for the Job (not hashed)

  • fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)

Example options for extra_count_args: -unk
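
A usage sketch that chains the counts into a count-based LM estimation, assuming a Sisyphus config; paths are placeholders:

    from sisyphus import tk
    from i6_core.lm.srilm import CountNgramsJob, ComputeNgramLmJob

    count_job = CountNgramsJob(
        ngram_order=4,
        data=tk.Path("/path/to/corpus.txt"),
        extra_count_args=["-unk"],
        count_exe=tk.Path("/opt/srilm/bin/ngram-count"),  # placeholder install path
    )

    # out_counts is the counts output referenced in the ComputeNgramLmJob parameters above
    lm_from_counts = ComputeNgramLmJob(
        ngram_order=4,
        data=count_job.out_counts,
        data_mode=ComputeNgramLmJob.DataMode.COUNT,
        vocab=tk.Path("/path/to/vocab.txt"),
        count_exe=tk.Path("/opt/srilm/bin/ngram-count"),
    )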

create_files()

Creates the bash script that will be executed in the run Task

classmethod hash(kwargs)

Deletes the queue requirements from the hashing

run()

Executes the previously created bash script and relinks the outputs from the work folder to the output folder

tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.lm.srilm.InterpolateNgramLmJob(*args, **kwargs)

Uses SRILM to interpolate different LMs with previously calculated weights

Parameters:
  • ngram_lms – List of language models to interpolate; format: ARPA or compressed ARPA

  • weights – Weights of the different language models; must be in the same order as ngram_lms

  • ngram_order – Maximum n-gram order

  • extra_interpolation_args – Additional arguments for interpolation

  • ngram_exe – Path to srilm ngram executable

  • mem_rqmt – Memory requirements of Job (not hashed)

  • time_rqmt – Time requirements of Job (not hashed)

  • cpu_rqmt – Number of CPUs required for the Job (not hashed)

  • fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)
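
A usage sketch, assuming a Sisyphus config; the LM paths and weights are placeholders (the weights would usually come from ComputeBestMixJob, whose output attribute names are not shown on this page):

    from sisyphus import tk
    from i6_core.lm.srilm import InterpolateNgramLmJob

    interp_job = InterpolateNgramLmJob(
        ngram_lms=[tk.Path("/path/to/lm_a.gz"), tk.Path("/path/to/lm_b.gz")],  # ARPA or compressed ARPA
        weights=[0.7, 0.3],  # same order as ngram_lms
        ngram_order=3,
        ngram_exe=tk.Path("/opt/srilm/bin/ngram"),  # placeholder install path
    )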

classmethod hash(parsed_args)

Deletes the queue requirements from the hashing

run()

Runs the SRILM interpolation and relinks the interpolated LM from the work folder to the output folder

tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]

class i6_core.lm.srilm.PruneLMWithHelperLMJob(*args, **kwargs)

Job that prunes the given LM with the help of a helper LM

Parameters:
  • ngram_order – Maximum n-gram order

  • lm – LM to be pruned

  • prune_thresh – Pruning threshold

  • helper_lm – Helper ('Katz') LM to prune the given LM with

  • ngram_exe – Path to srilm ngram executable

  • mem_rqmt – Memory requirements of Job (not hashed)

  • time_rqmt – Time requirements of Job (not hashed)

  • cpu_rqmt – Number of CPUs required for the Job (not hashed)

  • fs_rqmt – Space on fileserver required for Job, example: “200G” (not hashed)
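
A usage sketch, assuming a Sisyphus config; the LM paths and the threshold are placeholders (the LM to be pruned would typically be an interpolated LM from InterpolateNgramLmJob):

    from sisyphus import tk
    from i6_core.lm.srilm import PruneLMWithHelperLMJob

    prune_job = PruneLMWithHelperLMJob(
        ngram_order=4,
        lm=tk.Path("/path/to/interpolated.lm.gz"),        # LM to be pruned
        prune_thresh=1e-8,                                # placeholder pruning threshold
        helper_lm=tk.Path("/path/to/katz_helper.lm.gz"),  # helper/'Katz' LM
        ngram_exe=tk.Path("/opt/srilm/bin/ngram"),        # placeholder install path
    )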

create_files()

Creates the bash script that will be executed in the run Task

classmethod hash(kwargs)

Deletes the queue requirements from the hashing

run()

Executes the previously created script and relinks the LM from the work folder to the output folder

tasks()
Returns:

yields Tasks

Return type:

list[sisyphus.task.Task]