Next-Generation DNA Sequencing Methods
Chapter 1

Q1
- If you were given a DNA sequence, what kind of DNA sequence would you want?
- Once you have a DNA sequence, what would you do with it? (The answer differs by discipline.)
- What if you were suddenly handed millions of DNA sequences?
- Consider: if we could determine every DNA sequence in a given cell, would that mean we could decipher most of that cell's molecular mechanisms? If not, what else would we need to make that possible?

Sanger Sequencing

Roche 454
- Sample Input & Fragmentation
- Library Preparation
- One Fragment = One Bead
- emPCR: Emulsion PCR Amplification
- Sequencing: One Bead = One Read
- Pyrosequencing chemistry (single-nucleotide addition: pyrosequencing)

Features
However, the calibrated base calling cannot properly interpret long stretches (>6) of the same nucleotide (homopolymer run), so these areas are prone to base insertion and deletion errors during base calling. By contrast, because each incorporation step is nucleotide specific, substitution errors are rarely encountered in Roche/454 sequence reads.
The FLX instrument currently provides 100 flows of each nucleotide during an 8-h run, which produces an average read length of 250 nucleotides (an average of 2.5 bases per flow are incorporated).

Features
Although shorter than reads derived from capillary sequencers, FLX reads are of sufficient length to assemble small genomes such as bacterial and viral genomes to high quality and contiguity.

Preprocess
Raw reads pass through various quality filters that remove:
- poor-quality sequences
- mixed sequences (more than one initial DNA fragment per bead)
- sequences without the initiating TCGA sequence

P1
1) Write a program that randomly generates a 1000-bp DNA sequence.
2) Randomly draw 1000 sequences of 25 bp from it, recording the position of each.
3) Develop a function that aligns these 1000 25-bp sequences back to the original 1000-bp DNA sequence, computes their positions, and compares them with the recorded positions of origin to check that they agree.
4) Provide a flowchart of the program, its running time, and the configuration of the computer used.

Illumina/Solexa Genome Analyzer
Cyclic reversible termination.

Features
Substitutions are the most common error type, with a higher portion of errors occurring when the previous incorporated nucleotide is a G base.
Genome analysis of Illumina/Solexa data has revealed an underrepresentation of AT-rich and GC-rich regions, which is probably due to amplification bias during template preparation.
Sequence variants are called by aligning reads to a reference genome using bioinformatics tools such as MAQ or ELAND. Bentley and colleagues reported high concordance (99.5%) of single-nucleotide variant (SNV) calls with standard genotyping arrays using both alignment tools, and a false-positive rate of 2.5% with novel SNVs. Other reports have described a higher false-positive rate associated with novel SNV detection using these alignment tools.

Applied Biosystems SOLiD Sequencer
Sequencing by ligation

Features
The method uses two-base-encoded probes, which has the primary advantage of improved accuracy in colour calling and SNV calling, the latter of which requires an adjacent valid colour change.
Substitutions are the most common error type. Similar to the genome analysis of Illumina/Solexa reads, SOLiD data have also revealed an underrepresentation of AT-rich and GC-rich regions.

P2
In the previous example, randomly change 1, 2, or 3 bases in each 25-bp sequence and check whether the earlier alignment code can still locate it. If not, design an effective solution.
(Likewise, randomly delete or insert 1, 2, or 3 bases in each 25-bp sequence and check whether the earlier alignment code can still locate it. If not, design an effective solution.)

In summary: Keys
- The length of a sequence read from all current next generation platforms is much shorter than that from a capillary sequencer.
- Each next generation read type has a unique error model different from that already established for capillary sequence reads.
- In strain-to-reference comparisons (resequencing), the typical definition of repeat content must be revised in the context of the shorter read length.
- A much higher read coverage or sampling depth is required for comprehensive resequencing with short reads to adequately cover the reference sequence at the depth and low gap size needed.

In summary
- Image analysis, signal processing, background subtraction, base calling, and quality assessment to produce the final sequence reads for each run.
- Information technology (IT), computational, data storage, and laboratory information management system (LIMS) infrastructures.

In summary
- Although quality scores and accuracy estimates are provided by each manufacturer, there is no consensus that a quality base from one platform is equivalent to that from another platform.
- More importantly, these methods do not require PCR, which creates mutations in clonally amplified templates that masquerade as sequence variants. AT-rich and GC-rich target sequences may also show amplification bias in product yield.

In summary
... demand on the efficiency of the addition process, and incomplete extension of the template ensemble results in lagging-strand dephasing. Signal dephasing increases fluorescence noise, causing base-calling errors and shorter reads.

In summary
Single molecules, however, are susceptible to multiple nucleotide or probe additions in any given cycle. Here, deletion errors will occur owing to quenching effects between adjacent dye molecules, or no signal will be detected because of the incorporation of dark nucleotides or probes.
Dark nucleotides or probes: a nucleotide or probe that does not contain a fluorescent label. It can be generated from its cleavage and carry-over from the previous cycle or be hydrolysed in situ from its dye-labelled counterpart in the current cycle.

In summary
- ... reported that deletion errors in homopolymeric repeat regions were the most common error type (5% frequency) when using the primer-immobilized strategy. This is likely to be related to the incorporation of two or more Cy5-12ss-dNTPs in a given cycle. These errors can be greatly reduced with two-pass sequencing, which provides 25-base consensus reads using the template-immobilized strategy.
- Direct RNA sequencing, as it sequences RNA templates directly without the need to convert them into cDNAs.

Real-time sequencing

Features
To assess the accuracy of this method, a four-colour sequencing experiment was conducted using a known 150-bp linear template. Base calls from the real-time reads were determined from their corresponding fluorescence pulses. When the reads were compared to a known sequence, 27 errors consisting of deletions, insertions and mismatches were identified, corresponding to a read accuracy of approximately 83% (131/158).

Features
Given that most errors appear as stochastic events, the authors showed that repeated sequencing of the same template molecule 15 times or more could improve the consensus read accuracy to 99%.
At the 2009 AGBT meeting, Pacific Biosciences reported improvements to their platform; when it was used to sequence the E. coli genome at 38-fold base coverage, 99.3% genome coverage was obtained. The consensus accuracy reached was 99.999% for the entire genome, with read lengths averaging 964 bases.

Genome enrichment
Use NGS platforms to target specific regions of interest.
This strategy can be used to examine all of the exons in the genome, specific gene families that constitute known drug targets, or megabase-size regions that are implicated in disease or pharmacogenetics effects through genome-wide association studies.
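Exercise P1 (generate a random 1000-bp sequence, draw 1000 reads of 25 bp, map them back, and compare positions) can be sketched in Python as follows. This is one possible starting point, not a reference solution: the function names (`random_dna`, `sample_reads`, `find_all`), the fixed random seed, and the use of plain exact string matching are all assumptions made for illustration.

```python
import platform
import random
import time

def random_dna(length, rng):
    """Return a random DNA string of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def sample_reads(genome, n_reads, read_len, rng):
    """Draw n_reads substrings of read_len bp; keep (start, read) pairs."""
    reads = []
    for _ in range(n_reads):
        pos = rng.randrange(len(genome) - read_len + 1)
        reads.append((pos, genome[pos:pos + read_len]))
    return reads

def find_all(genome, read):
    """Return every start position where read occurs exactly in genome."""
    hits = []
    start = genome.find(read)
    while start != -1:
        hits.append(start)
        start = genome.find(read, start + 1)  # step by 1 to allow overlaps
    return hits

rng = random.Random(0)  # fixed seed (an arbitrary choice) for repeatability
genome = random_dna(1000, rng)
reads = sample_reads(genome, 1000, 25, rng)

t0 = time.perf_counter()
# A read is "consistent" if its recorded origin is among its exact hits.
n_consistent = sum(1 for pos, read in reads if pos in find_all(genome, read))
elapsed = time.perf_counter() - t0

print(f"{n_consistent} of {len(reads)} reads mapped back to their origin")
print(f"alignment time: {elapsed:.4f} s")
print(f"machine: {platform.platform()}, Python {platform.python_version()}")
```

Because the reads are error-free copies of the sequence, the recorded origin is always among the exact hits. Reporting all hits rather than only the first keeps the comparison valid even if a 25-mer happens to occur more than once.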
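For exercise P2, exact matching stops working as soon as a read carries a changed, deleted, or inserted base. One possible approach, sketched below under stated assumptions, is to scan for the window with the fewest mismatches (Hamming distance) when only substitutions are present, and to use a semi-global edit-distance table when deletions or insertions are allowed. The helper names, the seed, the 20 reads per setting, and the "end position within k bases" check are illustrative choices, not part of the exercise; production aligners such as MAQ or ELAND use indexed methods rather than this brute-force scan.

```python
import random

BASES = "ACGT"

def substitute(read, k, rng):
    """Change k distinct positions of read to a different base."""
    chars = list(read)
    for i in rng.sample(range(len(chars)), k):
        chars[i] = rng.choice([b for b in BASES if b != chars[i]])
    return "".join(chars)

def indel(read, k, rng):
    """Apply k random single-base deletions or insertions to read."""
    chars = list(read)
    for _ in range(k):
        if rng.random() < 0.5:
            del chars[rng.randrange(len(chars))]
        else:
            chars.insert(rng.randrange(len(chars) + 1), rng.choice(BASES))
    return "".join(chars)

def best_hamming_hit(genome, read):
    """Return (start, mismatches) of the first window with fewest mismatches."""
    best = (-1, len(read) + 1)
    for start in range(len(genome) - len(read) + 1):
        window = genome[start:start + len(read)]
        dist = sum(a != b for a, b in zip(read, window))
        if dist < best[1]:
            best = (start, dist)
    return best

def best_edit_hit(genome, read):
    """Return (end, edits): the lowest edit distance between read and any
    substring of genome, and the genome index just past that substring."""
    prev = [0] * (len(genome) + 1)      # an empty read prefix fits anywhere for free
    for i, r in enumerate(read, start=1):
        curr = [i] + [0] * len(genome)  # i read bases against an empty substring
        for j, g in enumerate(genome, start=1):
            curr[j] = min(prev[j - 1] + (r != g),  # match or substitution
                          prev[j] + 1,             # extra base in the read
                          curr[j - 1] + 1)         # base missing from the read
        prev = curr
    end = min(range(len(prev)), key=prev.__getitem__)
    return end, prev[end]

rng = random.Random(1)  # arbitrary seed for repeatability
genome = "".join(rng.choice(BASES) for _ in range(1000))
for k in (1, 2, 3):
    sub_ok = indel_ok = 0
    for _ in range(20):  # a small sample keeps the pure-Python table fast
        pos = rng.randrange(len(genome) - 25 + 1)
        read = genome[pos:pos + 25]
        sub_ok += best_hamming_hit(genome, substitute(read, k, rng))[0] == pos
        end, edits = best_edit_hit(genome, indel(read, k, rng))
        # Loose check: the alignment ends within k bases of the true read end.
        indel_ok += edits <= k and abs(end - (pos + 25)) <= k
    print(f"k={k}: substitutions placed {sub_ok}/20, indels placed {indel_ok}/20")
```

The Hamming scan costs about (sequence length × read length) comparisons per read, and the edit-distance table costs the same order per read, so both are practical only at this toy scale.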
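Several of the figures quoted above follow from simple arithmetic, which the short sketch below reproduces: the FLX read length from flows and bases per flow, the real-time read accuracy from 131 correct calls and 27 errors, and the fold coverage of exercise P1. The function names are hypothetical, and `expected_fraction_covered` relies on an idealised assumption (uniformly random read starts, Poisson approximation) that is not stated in the text and ignores the AT-rich and GC-rich bias described earlier.

```python
import math

def mean_read_length(flows_per_nucleotide, bases_per_flow):
    """Average read length = flows of each nucleotide x bases added per flow."""
    return flows_per_nucleotide * bases_per_flow

def read_accuracy(correct, errors):
    """Fraction of correct calls among all compared positions."""
    return correct / (correct + errors)

def fold_coverage(n_reads, read_len, genome_len):
    """Average number of reads covering each base."""
    return n_reads * read_len / genome_len

def expected_fraction_covered(coverage):
    """Idealised fraction of bases covered at least once (assumption:
    read starts are uniformly random, Poisson approximation)."""
    return 1.0 - math.exp(-coverage)

print(mean_read_length(100, 2.5))        # 100 flows x 2.5 bases -> 250.0
print(round(read_accuracy(131, 27), 3))  # 131/158 -> 0.829
print(fold_coverage(1000, 25, 1000))     # exercise P1 -> 25.0
print(expected_fraction_covered(fold_coverage(1000, 25, 1000)))
```

Under this idealised model, 25-fold coverage leaves essentially no base uncovered. Real data fall short of that, which is one reason the summary calls for a much higher sampling depth with short reads.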