




Note: this document analyzes the s5 recipe for TIMIT in Kaldi. For each script it lists the Kaldi functions the script involves and what they do.

Script: cmd.sh
Kaldi functions: none
Purpose: Sets environment variables (effectively global settings); three alternatives, labeled a, b, and c, are provided.

Script: run.sh
Kaldi functions: none
Purpose: The main script; it calls many other shell scripts.

Script: path.sh
Kaldi functions: none
Purpose: Sets the file paths the programs need.

Script: local/timit_data_prep.sh
Kaldi functions: none
Purpose: Data preparation.

Script: local/timit_prepare_dict.sh
Kaldi functions: none
Purpose: Data preparation.

Script: utils/prepare_lang.sh
Kaldi functions: none
Purpose: Data preparation.

Script: local/timit_format_data.sh
Kaldi functions: none
Purpose: Data preparation.

Script: steps/make_mfcc.sh
Script description: Get MFCC features.
Kaldi functions:
- extract-segments: Extract segments from a large audio file in WAV format.
- compute-mfcc-feats: Create MFCC feature files.
- copy-feats: Copy features.

Script: steps/compute_cmvn_stats.sh
Script description: Compute cepstral mean and variance statistics per speaker.
Kaldi functions:
- copy-matrix: Copy matrices, or archives of matrices.
- compute-cmvn-stats-two-channel: Compute cepstral mean and variance normalization statistics.
- compute-cmvn-stats: Compute cepstral mean and variance normalization statistics.
- modify-cmvn-stats: Copy cepstral mean/variance stats.

Script: steps/train_mono.sh
Script description: Flat start and monophone training, with delta-delta features. It trains the basic monophone hidden Markov model for 40 iterations and realigns the speech data every two iterations.
Kaldi functions (those marked in red in the original table run inside the 40-iteration loop):
- gmm-init-mono: Initialize monophone GMM.
- compile-train-graphs: Creates training graphs.
- align-equal-compiled: Write an equally spaced alignment.
- gmm-acc-stats-ali: Accumulate stats for GMM training.
- gmm-est: Do maximum likelihood re-estimation of GMM-based acoustic model.
- gmm-align-compiled: Align features given GMM-based models.

Script: utils/mkgraph.sh
Script description: Creates a fully expanded decoding graph (HCLG). It builds the complete recognition network, whose output is a finite-state transducer; the script has to be rerun to build the recognition network each time a new type of model is trained.
Kaldi functions: none

Script: steps/align_si.sh
Script description: Computes training alignments using a model with delta or LDA+MLLT features. It aligns the given data with the given model and is usually called before training of a new model starts, with the previously trained model as input.
Kaldi functions:
- gmm-align-compiled: Align features given GMM-based models.
- compile-train-graphs: Creates training graphs.

Script: steps/train_deltas.sh
Script description: Train tri1, which is deltas + delta-deltas, on the training data. It trains the context-dependent triphone acoustic model and needs the monophone acoustic model as input.
Kaldi functions:
- acc-tree-stats: Accumulate statistics for phonetic-context tree building.
- sum-tree-stats: Sum statistics for phonetic-context tree building.
- cluster-phones: Cluster phones (or sets of phones) into sets for various purposes.
- compile-questions: Compile questions.
- build-tree: Train decision tree.
- gmm-init-model: Initialize GMM from decision tree and tree stats.
- gmm-mixup: Does GMM mixing up (and Gaussian merging).
- convert-ali: Convert alignments from one decision-tree/model to another.
- compile-train-graphs: Creates training graphs (without transition-probabilities, by default).
- gmm-align-compiled: Align features given GMM-based models.
- gmm-acc-stats-ali: Accumulate stats for GMM training.
- gmm-est: Do maximum likelihood re-estimation of GMM-based acoustic model.

Script: utils/mkgraph.sh
Script description: Creates a fully expanded decoding graph (HCLG); rerun after each new type of model is trained, as described for the monophone stage.
Kaldi functions: none

Script: steps/align_si.sh
Script description: Computes training alignments using a model with delta or LDA+MLLT features; called before training of a new model starts, with the previously trained model as input.
Kaldi functions:
- gmm-align-compiled: Align features given GMM-based models.
- compile-train-graphs: Creates training graphs.

Script: steps/train_lda_mllt.sh
Script description: Trains the triphone acoustic model with LDA and MLLT added; it needs the basic triphone acoustic model as input.
Kaldi functions:
- ali-to-post: Convert alignments to posteriors.
- weight-silence-post: Apply weight to silences in posts.
- acc-lda: Accumulate LDA statistics based on pdf-ids.
- est-lda: Estimate LDA transform using stats obtained with acc-lda.
- acc-tree-stats: Accumulate statistics for phonetic-context tree building.
- sum-tree-stats: Sum statistics for phonetic-context tree building.
- cluster-phones: Cluster phones (or sets of phones) into sets for various purposes.
- compile-questions: Compile questions.
- build-tree: Train decision tree.
- gmm-init-model: Initialize GMM from decision tree and tree stats.
- gmm-init-model-flat: Initialize GMM, with Gaussians initialized to mean and variance of some provided example data.
- convert-ali: Convert alignments from one decision-tree/model to another.
- compile-train-graphs: Creates training graphs (without transition-probabilities, by default).
- gmm-align-compiled: Align features given GMM-based models.
- gmm-acc-mllt: Accumulate MLLT (global STC) statistics.
- est-mllt: Do MLLT update.
- gmm-transform-means: Transform GMM means with linear or affine transform.
- compose-transforms: Compose (affine or linear) feature transforms.
- gmm-acc-stats-ali: Accumulate stats for GMM training.
- gmm-est: Do maximum likelihood re-estimation of GMM-based acoustic model.

Script: utils/mkgraph.sh
Script description: Creates a fully expanded decoding graph (HCLG); rerun after each new type of model is trained.
Kaldi functions: none

Script: steps/decode.sh
Script description: The decoding script; once decoding finishes it can also report the WER. It takes the acoustic model and the test data as input.
Kaldi functions:
- gmm-latgen-faster: Generate lattices using GMM-based model.

Script: steps/align_si.sh
Script description: Computes training alignments using a model with delta or LDA+MLLT features; called before training of a new model starts, with the previously trained model as input.
Kaldi functions:
- gmm-align-compiled: Align features given GMM-based models.
- compile-train-graphs: Creates training graphs.

Script: steps/train_sat.sh
Script description: This does speaker adapted training (SAT), i.e. training on fMLLR-adapted features. It uses feature-space maximum likelihood linear regression (fMLLR) for speaker adaptive training and takes the triphone model as input.
Kaldi functions:
- ali-to-post: Convert alignments to posteriors.
- weight-silence-post: Apply weight to silences in posts.
- gmm-est-fmllr: Estimate global fMLLR transforms, either per utterance or for the supplied set of speakers.
- acc-tree-stats: Accumulate statistics for phonetic-context tree building.
- sum-tree-stats: Sum statistics for phonetic-context tree building.
- cluster-phones: Cluster phones (or sets of phones) into sets for various purposes.
- compile-questions: Compile questions.
- build-tree: Train decision tree.
- gmm-init-model: Initialize GMM from decision tree and tree stats.
- convert-ali: Convert alignments from one decision-tree/model to another.
- compile-train-graphs: Creates training graphs (without transition-probabilities, by default).
- gmm-align-compiled: Align features given GMM-based models.
- gmm-acc-stats-ali: Accumulate stats for GMM training.
- gmm-est: Do maximum likelihood re-estimation of GMM-based acoustic model.
- gmm-sum-accs: Sum multiple accumulated stats files for GMM training.
- gmm-acc-stats-twofeats: Accumulate stats for GMM training, computing posteriors with one set of features but accumulating statistics with another.

Script: utils/mkgraph.sh
Script description: Creates a fully expanded decoding graph (HCLG); rerun after each new type of model is trained.
Kaldi functions: none

Script: steps/decode_fmllr.sh
Script description: Decoding for models that have been speaker-adapted.
Kaldi functions:
- gmm-latgen-faster: Generate lattices using GMM-based model.
- compute-wer: Compute WER by comparing different transcriptions.
- lattice-to-post: Do forward-backward and collect posteriors over lattices.
- weight-silence-post: Apply weight to silences in posts.
- gmm-post-to-gpost: Convert state-level posteriors to Gaussian-level posteriors.
- gmm-est-fmllr-gpost: Estimate global fMLLR transforms.
- gmm-est-fmllr: Estimate global fMLLR transforms, either per utterance or for the supplied set of speakers.
- compose-transforms: Compose (affine or linear) feature transforms.
- gmm-rescore-lattice: Replace the acoustic scores on a lattice using a new model.
- lattice-determinize-pruned: Determinize lattices, keeping only the best path for each input-symbol sequence.

Script: steps/align_fmllr.sh
Script description: Computes training alignments; assumes features are (LDA+MLLT or delta+delta-delta) + fMLLR.
Kaldi functions:
- gmm-align-compiled: Align features given GMM-based models.
- ali-to-post: Convert alignments to posteriors.
- weight-silence-post: Apply weight to silences in posts.
- gmm-post-to-gpost: Convert state-level posteriors to Gaussian-level posteriors.
- gmm-est-fmllr-gpost: Estimate global fMLLR transforms.
- gmm-est-fmllr: Estimate global fMLLR transforms, either per utterance or for the supplied set of speakers.

Script: steps/train_ubm.sh
Script description: Trains the universal background model (UBM).
Kaldi functions:
- init-ubm: Cluster the Gaussians in a diagonal-GMM acoustic model.
- gmm-gselect: Precompute Gaussian indices for pruning.
- fgmm-global-est: Estimate a full-covariance GMM from the accumulated stats.

Script: steps/train_sgmm2.sh
Script description: SGMM training, with speaker vectors. It trains the subspace Gaussian mixture model and takes the triphone acoustic model as input.
Kaldi functions:
- acc-tree-stats: Accumulate statistics for phonetic-context tree building.
- sum-tree-stats: Sum statistics for phonetic-context tree building.
- cluster-phones: Cluster phones (or sets of phones) into sets for various purposes.
- compile-questions: Compile questions.
- build-tree-two-level: Trains two-level decision tree.
- sgmm2-init: Initialize an SGMM from a trained full-covariance UBM and a specified model topology.
- sgmm2-gselect: Precompute Gaussian indices for SGMM training.
- compile-train-graphs: Creates training graphs.
- convert-ali: Convert alignments from one decision-tree/model to another.
- sgmm2-align-compiled: Align features given SGMM-based models.
- ali-to-post: Convert alignments to posteriors.
- weight-silence-post: Apply weight to silences in posts.
- sgmm2-est-spkvecs: Estimate SGMM speaker vectors, either per utterance or for the supplied set of speakers (with spk2utt option).
- sgmm2-acc-stats: Accumulate stats for SGMM training.
- copy-vector: Copy vectors, or archives of vectors.
- sgmm2-est: Estimate SGMM model parameters from accumulated stats.
- sgmm2-post-to-gpost: Convert posteriors to Gaussian-level posteriors for SGMM training.

Script: utils/mkgraph.sh
Script description: Creates a fully expanded decoding graph (HCLG); rerun after each new type of model is trained.
Kaldi functions: none

Script: steps/decode_sgmm2.sh
Script description: Decoding with an SGMM system, with speaker vectors (the decoding script for the subspace Gaussian mixture model).
Kaldi functions:
- sgmm2-gselect: Precompute Gaussian indices for SGMM training.
- sgmm2-latgen-faster: Decode features using SGMM-based model.
- lattice-prune: Apply beam pruning to lattices.
- lattice-determinize-pruned: Determinize lattices.
- lattice-to-post: Do forward-backward and collect posteriors over lattices.
- weight-silence-post: Apply weight to silences in posts.
- sgmm2-post-to-gpost: Convert posteriors to Gaussian-level posteriors for SGMM training.
- sgmm2-rescore-lattice: Replace the acoustic scores on a lattice using a new model.
- weight-silence-post: Apply weight to silences in posts.
- sgmm2-est-spkvecs: Estimate SGMM speaker vectors.
- sgmm2-comp-prexform: Compute pre-transform parameters.
- sgmm2-est-fmllr: Estimate fMLLR transform for SGMMs.
- compute-wer: Compute WER.

Script: steps/align_sgmm2.sh
Script description: Computes training alignments and (if needed) speaker vectors, given an SGMM system.
Kaldi functions:
- compile-train-graphs: Creates training graphs.
- sgmm2-gselect: Precompute Gaussian indices for SGMM training.
- sgmm2-align-compiled: Align features given SGMM-based models.
- ali-to-post: Convert alignments to posteriors.
- weight-silence-post: Apply weight to silences in posts.
- sgmm2-post-to-gpost: Convert posteriors to Gaussian-level posteriors for SGMM training.
- sgmm2-est-spkvecs-gpost: Estimate SGMM speaker vectors.

Script: steps/make_denlats_sgmm2.sh
Script description: Create denominator lattices for MMI/MPE training, with SGMM models.
Kaldi functions:
- sgmm2-latgen-faster: Decode features using SGMM-based model.

Script: steps/train_mmi_sgmm2.sh
Script description: MMI training (or optionally boosted MMI, if you give the --boost option), for SGMMs.
Kaldi functions:
- sgmm2-rescore-lattice: Replace the acoustic scores on a lattice using a new model.
- lattice-to-post: Do forward-backward and collect posteriors over lattices.
- sum-post: Sum two sets of posteriors for each utterance, e.g. useful in fMMI.
- sgmm2-acc-stats2: Accumulate numerator and denominator stats for discriminative training of SGMMs.
- sgmm2-est-ebw: Estimate SGMM model parameters discriminatively using Extended Baum-Welch style of update.

Script: steps/decode_sgmm2_rescore.sh
Script description: Decoding with an SGMM system, by rescoring lattices generated from a previous SGMM system.
Kaldi functions:
- sgmm2-rescore-lattice: Replace the acoustic scores on a lattice using a new model.
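
The decoding scripts above report WER through compute-wer, which the tables describe as computing WER by comparing different transcriptions. The following Python sketch shows the textbook definition of that measure (edit distance between reference and hypothesis tokens, divided by the number of reference tokens). It is only an illustration, not Kaldi's implementation; the function names and the sample transcripts are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists.

    Substitutions, insertions and deletions each cost 1.
    """
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        cur = [i]  # distance from reference prefix to empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference token
                cur[j - 1] + 1,          # insertion of a hypothesis token
                prev[j - 1] + (r != h),  # substitution, or match at cost 0
            ))
        prev = cur
    return prev[-1]


def word_error_rate(refs, hyps):
    """WER in percent; refs and hyps map utterance ids to transcripts.

    An utterance missing from hyps counts as an empty hypothesis.
    """
    errors = sum(edit_distance(refs[u].split(), hyps.get(u, "").split())
                 for u in refs)
    words = sum(len(refs[u].split()) for u in refs)
    return 100.0 * errors / words


# Hypothetical reference and hypothesis transcripts keyed by utterance id.
refs = {"utt1": "sil ah b sil", "utt2": "k ae t"}
hyps = {"utt1": "sil ah p sil", "utt2": "k ae"}
wer = word_error_rate(refs, hyps)
```

Here one substitution and one deletion against seven reference tokens give a WER of 2/7, i.e. about 28.6%.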
Script: steps/nnet2/train_tanh.sh
Script description: Trains a fairly vanilla network with tanh nonlinearities.
Kaldi functions:
- extend-transform-dim: Read in transform from dimension
- nnet-train-transitions: Train the transition probabilities of a neural network acoustic model.
- nnet-compute-prob: Computes and prints the average log-prob per frame of the given data with a neural net.
- nnet-show-progress: Given an old and a new model and some training examples
- nnet-shuffle-egs: Copy examples (typically single frames) for neural network training.
- nnet-am-info: Print human-readable information
- nnet-am-average: This program averages (or sums) the parameters over a number of neural nets.
- nnet-am-copy: Copy a (nnet2) neural net and its associated transition model.
- nnet-modify-learning-rates: This program modifies the learning rates.
- nnet-subset-egs: Creates a random subset of the input examples.
- nnet-combine-fast: Compute an optimal combination of a number of neural nets.
- nnet-am-fix: Copy a (CPU-based) neural net and its associated transition model.
- nnet-am-mixup: Add mixture-components to a neural net, comparable to mixtures in a Gaussian mixture model.
- nnet-compute-from-egs: Does the neural net computation, taking as input the nnet-training examples.
- matrix-sum-rows: Sum the rows of an input table of matrices and output the corresponding table of vectors.
- vector-sum: Add vectors (e.g. weights, transition-accs; speaker vectors).
- nnet-adjust-priors: Set the priors of the neural net to the computed posteriors from the net, on typical data (e.g. training data).
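
The description of nnet-am-average above says it averages (or sums) the parameters over a number of neural nets. As a rough illustration of that idea only, the sketch below averages parameter vectors element-wise across several models. It assumes each model is a plain dictionary from a layer name to a list of floats, which is a hypothetical representation and not the nnet2 model format; `average_params` is likewise a made-up name.

```python
def average_params(models):
    """Element-wise mean of parameter vectors across models.

    Each model is a dict mapping a (hypothetical) layer name to a list
    of floats; all models are assumed to share names and vector sizes.
    """
    if not models:
        raise ValueError("need at least one model")
    averaged = {}
    for name in models[0]:
        vectors = [m[name] for m in models]
        # zip(*vectors) walks the same position in every model's vector.
        averaged[name] = [sum(vals) / len(vals) for vals in zip(*vectors)]
    return averaged


# Two toy "models" with a single layer of two parameters each.
net_a = {"layer1": [1.0, 2.0]}
net_b = {"layer1": [3.0, 4.0]}
combined = average_params([net_a, net_b])
```

With the two toy models shown, `combined` holds `[2.0, 3.0]` for `layer1`.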
Script: steps/nnet2/decode.sh
Script description: Decoding with a neural net.
Kaldi functions:
- copy-feats: Copy features.
- nnet-latgen-faster: Generate lattices using neural net model.

Script: local/score_combine.sh
Script description: Script for system combination using minimum Bayes risk decoding.
Kaldi functions:
- lattice-combine: Combine lattices generated by different systems by removing the total cost of all paths
- lattice-to-ctm-conf: Generate 1-best from lattices and convert into CTM with confidences.

Script: utils/parse_options.sh
Script description: Parse command-line options.
Kaldi functions: none

Script: steps/nnet/make_fmllr_feats.sh
Kaldi functions:
- copy-feats: Copy features.

Script: utils/subset_data_dir_tr_cv.sh
Script description: This script splits a dataset into two parts.
Kaldi functions: none

Script: steps/nnet/pretrain_dbn.sh
Kaldi functions:
- nnet-initialize: Copy neural network model.
- nnet-forward: Perform forward pass through neural network.
- compute-cmvn-stats: Compute cepstral mean and variance normalization statistics.
- nnet-concat: Concatenate neural networks.
- rbm-train-cd1-frmshuff: Train RBM by contrastive divergence alg. with 1 step of Markov chain mont
- cmvn-to-nnet
- rbm-convert-to-nnet

Script: steps/nnet/train.sh
Kaldi functions: analyze-counts, copy-transition-model, compose-transforms, nnet-concat, ali-to-post, weight-silence-post, est-lda, transf-to-nnet, nnet-forward, compute-cmvn-stats, nnet-initialize

Script: steps/nnet/align.sh
Kaldi functions: compile-train-graphs, align-compiled-mapped, latgen-faster-mapped

Script: steps/nnet/make_denlats.sh
Kaldi functions: latgen-faster-mapped

Script: steps/nnet/train_mpe.sh
Kaldi functions: nnet-train-mpe-sequential
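
compute-cmvn-stats is described above as computing cepstral mean and variance normalization statistics, and steps/compute_cmvn_stats.sh as doing so per speaker. The Python sketch below illustrates that idea under simple assumptions: features are lists of frames held in memory, every utterance has at least one frame, and the statistics kept per speaker are a frame count, a per-dimension sum and a per-dimension sum of squares. The function names and the data layout are hypothetical and do not reflect Kaldi's file formats or code.

```python
def cmvn_stats_per_speaker(feats, utt2spk):
    """Accumulate count, sum and sum of squares per speaker.

    feats maps utterance id -> list of frames (each a list of floats);
    utt2spk maps utterance id -> speaker id.
    """
    stats = {}
    for utt, frames in feats.items():
        dim = len(frames[0])  # assumes a non-empty utterance
        s = stats.setdefault(
            utt2spk[utt],
            {"count": 0, "sum": [0.0] * dim, "sumsq": [0.0] * dim},
        )
        for frame in frames:
            s["count"] += 1
            for d, x in enumerate(frame):
                s["sum"][d] += x
                s["sumsq"][d] += x * x
    return stats


def apply_cmvn(frames, s):
    """Normalize frames to zero mean and unit variance with one speaker's stats."""
    n = s["count"]
    mean = [v / n for v in s["sum"]]
    # Floor the variance so constant dimensions do not divide by zero.
    var = [max(sq / n - m * m, 1e-10) for sq, m in zip(s["sumsq"], mean)]
    return [[(x - m) / v ** 0.5 for x, m, v in zip(frame, mean, var)]
            for frame in frames]


# One hypothetical speaker with a single two-frame, two-dimensional utterance.
feats = {"spk1_utt1": [[1.0, 2.0], [3.0, 6.0]]}
utt2spk = {"spk1_utt1": "spk1"}
stats = cmvn_stats_per_speaker(feats, utt2spk)
normalized = apply_cmvn(feats["spk1_utt1"], stats["spk1"])
```

Keeping only counts, sums and sums of squares means statistics from several utterances of the same speaker can be accumulated in one pass before any normalization is applied.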