"Amyloidosis" PPT courseware (《淀粉样变性病》PPT课件.ppt)

Document summary

A Field Guide, part 2: Genome resources; Sequence similarity
Apr 2007, Shandong University

Genome Resources
  LocusLink, Gene database, UniGene, Trace Archive, Map Viewer, HomoloGene
  Genomic Biology
  Genome Projects: microb

LocusLink
  A single query interface to:
    Sequences - RefSeqs - GenBank - HomoloGene
    Maps - MapViewer
    Entrez links
  LocusLink will be replaced by Entrez Gene on March 1, 2005. Check the Gene FAQ for current information.

Entrez Gene
  A single query interface to:
    Sequences - RefSeqs - GenBank - HomoloGene
    Maps - MapViewer
    Entrez links
  More organisms - all RefSeq genomes
  Entrez integration
  Gsnsym
  Amyloidosis
  Global Entrez: NADH2 (nadh2, 47)
  Entrez Gene: NADH2, 26 records
  Gene Record for Pongo NADH2
  Display Exons/Introns: Gene Table
  A Record With More Data: Human HFE (hemochromatosis)
  Gene Graphic Links
  Introns/Exons: Gene Table, links to sequence
  Entrez SNP: hfe[gene name] AND human[orgn], 52 (hemochromatosis)
  Linking to SNP: chromosome location, gene location, sequence location
  SNP in Structure
  Link to OMIM
  Variants in OMIM

UniGene
  Gene-oriented clusters of expressed sequences
  Automatic clustering using MegaBlast
  Each cluster represents a unique gene
  Informed by genome hits
  Information on tissue types and map locations
  Useful for gene discovery and selection of mapping reagents
  A Cluster of ESTs: query, 5 EST hits, 3 EST hits
  UniGene Collections
  Example UniGene Cluster
  Histogram of cluster sizes for UniGene Hs build 177
  UniGene Cluster Hs.95351
  UniGene Cluster Hs.95351: expression
  UniGene Cluster Hs.95351: seqs
  Download sequences: web page, ftp site

The New HomoloGene
  Automated detection of homologs among the annotated genes of completely sequenced eukaryotic genomes.
  No longer UniGene based
  Protein similarities first
  Guided by taxonomic tree
  Includes orthologs and paralogs
  Orthologs and paralogs are two types of homologous sequences. Orthologs are proteins in different species that arose by vertical descent (speciation) and typically retain the same function as the original protein. Paralogs are proteins within a given species that arose by gene duplication and may evolve new functions related to the original one. See the literature for more information.
  Paralogs vs Orthologs: gene duplication; early globin gene; A-chain gene, B-chain gene; paralogs; orthologs
  RAG1 HomoloGene: rag1, 12; recombination activating gene
  HomoloGene: RAG1 (Amniota)

MapViewer
  List View
  Human MapViewer
  MapViewer: Human ADAR (adenosine deaminase), 4
  MV Hs ADAR
  Maps & Options
    Sequence maps: Ab initio Assembly Repeats BES_Clone Clone NCI_Clone Contig Component CpG island dbSNP haplotype Fosmid GenBank_DNA Gene Phenotype SAGE_Tag STS TCAG_RNA Transcript (RNA) Hs_UniGene Hs_EST Mm_UniGene Mm_EST Rn_UniGene Rn_EST Ssc_UniGene Ssc_EST Bt_UniGene Bt_EST Gga_UniGene Gga_EST Variation
    Cytogenetic maps: Ideogram FISH Clone Gene_Cytogenetic Mitelman Breakpoint Morbid/Disease
    Genetic maps: deCODE Genethon Marshfield
    RH maps: GeneMap99-G3 GeneMap99-GB4 NCBI RH Stanford-G3 TNG Whitehead-RH Whitehead-YAC
  MapViewer: UniGene, Component, Repeats, Gene
  Master map: repeats; Gene, Phenotype, Variation

Trace Archive
  Strongylocentrotus purpuratus Traces

BLAST: Basic Local Alignment Search Tool
  Web Access: BLAST, VAST, Entrez (Text, Sequence, Structure)
  Outline: Why use sequence similarity? BLAST algorithm. BLAST statistics. BLAST output. Examples.

Why Do We Need Sequence Similarity Searching?
  To identify and annotate sequences
  To evaluate evolutionary relationships
  Other: model genomic structure (e.g., Spidey); check primer specificity in silico
  : NCBI's tool
  BLAST Website Stats

Global vs Local Alignment
  Seq1: WHEREISWALTERNOW (16 aa)
  Seq2: HEWASHEREBUTNOWISHERE (21 aa)

The Flavors of BLAST
  Standard BLAST
    traditional “contiguous” word hit
    position-independent scoring
    nucleotide, protein and translations (blastn, blastp, blastx, tblastn, tblastx)
  Megablast
    optimized for large batch searches
    can use discontiguous words
  PSI-BLAST
    constructs PSSMs automatically; uses as query
    very sensitive protein search
  RPS BLAST
    searches a database of PSSMs
    tool for conserved domain searches

Basic Local Alignment Search Tool
  Widely used similarity search tool
  Heuristic approach based on the Smith-Waterman algorithm
  Finds best local alignments
  Provides statistical significance
  All combinations (DNA/Protein) of query and database:
    DNA vs DNA: blastn
    DNA translation vs Protein: blastx
    Protein vs Protein: blastp
    Protein vs DNA translation: tblastn
    DNA translation vs DNA translation: tblastx
  www, standalone, and network clients

Translated BLAST
  Program   Query        Database
  blastx    Nucleotide   Protein
  tblastn   Protein      Nucleotide
  tblastx   Nucleotide   Nucleotide
  Particularly useful for nucleotide sequences without protein annotations, such as ESTs or genomic DNA

How BLAST Works
  Make lookup table of “words” for query
  Scan database for hits
  Ungapped extensions of hits (initial HSPs)
  Gapped extensions (no traceback)
  Gapped extensions (traceback; alignment details)

Nucleotide Words
  GTACTGGACAT TACTGGACATG ACTGGACATGG CTGGACATGGA TGGACATGGAC GGACATGGACC GACATGGACCC ACATGGACCCT
  Make a lookup table of words ...
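The first two steps listed under "How BLAST Works" (build a lookup table of query words, then scan for hits) can be illustrated with a small sketch. This is not NCBI's implementation: the function names `build_word_table` and `find_exact_hits` are hypothetical, only exact nucleotide word matches are handled, and no extension step is attempted. The word size of 11 simply mirrors the 11-letter words listed under "Nucleotide Words".

```python
from collections import defaultdict


def build_word_table(query, word_size=11):
    """Map every overlapping word of length word_size to its offsets in the query."""
    table = defaultdict(list)
    for i in range(len(query) - word_size + 1):
        table[query[i:i + word_size]].append(i)
    return table


def find_exact_hits(table, subject, word_size=11):
    """Return (query_offset, subject_offset) pairs where a subject word is in the table."""
    hits = []
    for j in range(len(subject) - word_size + 1):
        # .get() avoids creating empty entries in the defaultdict while scanning
        for i in table.get(subject[j:j + word_size], []):
            hits.append((i, j))
    return hits


# The query prefix implied by the word list on the slide.
query = "GTACTGGACATGGACCCT"
for word in build_word_table(query):
    print(word)

# The "one exact match" example from the slides: word vs. longer sequence.
print(find_exact_hits(build_word_table("CATGCTTAATT"), "ATCGCCATGCTTAATTGGGCTT"))
```

A dictionary keyed by word gives constant-time lookups while scanning the subject, which is the point of building the table once per query.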
Protein Words
  GTQ TQI QIT ITV TVE VED EDL DLF ...
  Make a lookup table of words
  -f 11 = blastp default

Minimum Requirements for a Hit
  Nucleotide BLAST requires one exact match
  Protein BLAST requires two neighboring matches within 40 aa
  Protein example: GTQITVEDLFYNI; neighborhood words SEI and YYN (two matches)
  Nucleotide example: ATCGCCATGCTTAATTGGGCTT; word CATGCTTAATT (one exact match)
  -A 40 = blastp default

BLASTP Summary
  High-scoring pair (HSP)

Scoring Systems - Nucleotides
  Identity matrix
       A    G    C    T
  A   +1   -3   -3   -3
  G   -3   +1   -3   -3
  C   -3   -3   +1   -3
  T   -3   -3   -3   +1

  CAGGTAGCAAGCTTGCATGTCA
    |            ||
  CACGTAGCAAGCTTG-GTGTCA
  raw score = 19 - 9 = 10 (19 matching columns at +1, 3 non-matching columns at -3)
  -r 1 -q -3

Scoring Systems - Proteins
  Position Independent Matrices
    PAM Matrices (Percent Accepted Mutation)
      Derived from observation; small dataset of alignments
      Implicit model of evolution
      All calculated from PAM1
      PAM250 widely used
    BLOSUM Matrices (BLOck SUbstitution Matrices)
      Derived from observation; large dataset of highly conserved blocks
      Each matrix derived separately from blocks with a defined percent identity cutoff
      BLOSUM62 - default matrix for BLAST
  Position Specific Score Matrices (PSSMs)
    PSI- and RPS-BLAST

BLOSUM62
  A  4
  R -1  5
  N -2  0  6
  D -2 -2  1  6
  C  0 -3 -3 -3  9
  Q -1  1  0  0 -3  5
  E -1  0  0  2 -4  2  5
  G  0 -2  0 -1 -3 -2 -2  6
  H -2  0  1 -1 -3  0  0 -2  8
  I -1 -3 -3 -3 -1 -3 -3 -4 -3  4
  L -1 -2 -3 -4 -1 -2 -3 -4 -3  2  4
  K -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5
  M -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5
  F -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6
  P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7
  S  1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4
  T  0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5
  W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11
  Y -2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7
  V  0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4
  X  0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2  0  0 -2 -1 -1 -1
     A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  X

Position-Specific Score Matrix
  DAF-1: Serine/Threonine protein kinases catalytic loop
          A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
  435 K  -1  0  0 -1 -2  3  0  3  0 -2 -2  1 -1 -1 -1 -1 -1 -1 -1 -2
  436 E   0  1  0  2 -1  0  2 -1  0 -1 -1  0  0  0 -1  0  0 -1 -1 -1
  437 S   0  0 -1  0  1  1  0  1  1  0 -1  0  0  0  2  0 -1 -1  0 -1
  438 N  -1  0 -1 -1  1  0 -1  3  3 -1 -1  1 -1  0  0 -1 -1  1  1 -1
  439 K  -2  1  1 -1 -2  0 -1 -2 -2 -1 -2  5  1 -2 -2 -1 -1 -2 -2 -1
  440 P  -2 -2 -2 -2 -3 -2 -2 -2 -2 -1 -2 -1  0 -3  7 -1 -2 -3 -1 -1
  441 A   3 -2  1 -2  0 -1  0  1 -2 -2 -2  0 -1 -2  3  1  0 -3 -3  0
  442 M  -3 -4 -4 -4 -3 -4 -4 -5 -4  7  0 -4  1  0 -4 -4 -2 -4 -1  2
  443 A   4 -4 -4 -4  0 -4 -4 -3 -4  4 -1 -4 -2 -3 -4 -1 -2 -4 -3  4
  444 H  -4 -2 -1 -3 -5 -2 -2 -4 10 -6 -5 -3 -4 -3 -2 -3 -4 -5  0 -5
  445 R  -4  8 -3 -4  0 -1 -2 -3 -2 -5 -4  0 -3 -2 -4 -3 -3  0 -4 -5
  446 D  -4 -4 -1  8 -6 -2  0 -3 -3 -5 -6 -3 -5 -6 -4 -2 -3 -7 -5 -5
  447 I  -4 -5 -6 -6 -3 -4 -5 -6 -5  3  5 -5  1  1 -5 -5 -3 -4 -3  1
  448 K   0  0  1 -3 -5 -1 -1 -3 -3 -5 -5  7 -4 -5 -3 -1 -2 -5 -4 -4
  449 S   0 -3 -2 -3  0 -2 -2 -3 -3 -4 -4 -2 -4 -5  2  6  2 -5 -4 -4
  450 K   0  3  0  1 -5  0  0 -4 -1 -4 -3  4 -3 -2  2  1 -1 -5 -4 -4
  451 N  -4 -3  8 -1 -5 -2 -2 -3 -1 -6 -6 -2 -4 -5 -4 -1 -2 -6 -4 -5
  452 I  -3 -5 -5 -6  0 -5 -5 -6 -5  6  2 -5  2 -2 -5 -4 -3 -5 -3  3
  453 M  -4 -4 -6 -6 -3 -4 -5 -6 -5  0  6 -5  1  0 -5 -4 -3 -4 -3  0
  454 V  -3 -3 -5 -6 -3 -4 -5 -6 -5  3  3 -4  2 -2 -5 -4 -3 -5 -3  5
  455 K  -2  1  1  4 -5  0 -1 -2  1 -4 -2  4 -3 -2 -3  0 -1 -5 -2 -3
  456 N   1  1  3  0 -4 -1  1  0 -3 -4 -4  3 -2 -5 -2  2 -2 -5 -4 -4
  457 D  -3 -2  5  5 -1 -1  1 -1  0 -5 -4  0 -2 -5 -1  0 -2 -6 -4 -5
  458 L  -3 -1  0 -3  0 -3 -2  3 -4 -2  3  0  1  1 -2 -2 -3  5 -1 -3
  ./blastpgp -i NP_499868.2 -d nr -j 3 -Q NP_499868.pssm

Local Alignment Statistics
  High scores of local alignments between two random sequences follow the Extreme Value Distribution
  [Figure: number of alignments plotted against score (S), marking your score and the expected number of random hits]
  Expect Value E = number of database hits you expect to find by chance with score ≥ S
  More info: /BLAST/tutorial/Altschul-1.html

Advanced BLAST Options: Nucleotide
  Example Entrez Queries
    nucleotide all[Filter] NOT mammalia[Organism]
    green plants[Organism]
    biomol mrna[Properties]
    gbdiv est[Properties] AND rat[organism]
  Other Advanced
    -e 10000   expect value
    -v 2000    descriptions
    -b 2000    alignments

Advanced BLAST Options: Protein
  Matrix Selection
    PAM30 - most stringent
    BLOSUM45 - least stringent
  Example Entrez Queries
    proteins all[Filter] NOT mammalia[Organism]
    green plants[Organism]
    srcdb refseq[Properties]
  Other Advanced
    -e 10000   expect value
    -v 2000    descriptions
    -b 2000    alignments
  Limit by taxon
    Mus musculus[Organism]
    Mammalia[Organism]
    Viridiplantae[Organism]

Low Complexity Filtering (Filtered, Unfiltered)
  sp|P27476|NSR1_YEAST NUCLEAR LOCALIZATION SEQUENCE BINDING PROTEIN (P67)
  Length = 414
  Score = 40.2 bits (92), Expect = 0.013
  Identities = 35/131 (26%), Positives = 56/131 (42%), Gaps = 4/131 (3%)
  Query: 362 STTSLTSSSTSGSSDKVYAHQMVRTDSREQKLDAFLQPLSKPLS-SQPQAIVTEDKTD 418
             S+S SSS+S SS + + +S + + S S S+ + E K
  Sbjct: 29 SSSSSESSSSSSSSSESESESESESESSSSSSSSDSESSSSSSSDSESEAETKKEESKDS 88

Other BLAST Algorithms
  Megablast
  Discontiguous Megablast
  PSI-BLAST
  PHI-BLAST

Megablast: NCBI's Genome Annotator
  Long alignments of similar DNA sequences
  Greedy algorithm
  Concatenation of query sequences
  Faster than blastn; less sensitive

MegaBLAST & Word Size
  Trade-off: sensitivity vs speed

Discontiguous Megablast
  Uses discontiguous word matches
  Better for cross-species comparisons

Templates for Discontiguous Words
  W    t    Type         Template
  11   16   coding       1101101101101101
  11   16   non-coding   1110010110110111
  12   16   coding       1111101101101101
  12   16   non-coding   1110110110110111
  11   18   coding       101101100101101101
  11   18   non-coding   111010010110010111
  12   18   coding       101101101101101101
  12   18   non-coding   111010110010110111
  11   21   coding       100101100101100101101
  11   21   non-coding   111010010100010010111
  12   21   coding       100101101101100101101
  12   21   non-coding   111010010110010010111
  W = word size (number of matches in the template); t = template length
  Reference: Ma, B, Tromp, J, Li, M. PatternHunter: faster and more sensitive homology search. Bioinformatics March, 2002; 18(3):440-5

Discontiguous (Cross-species) MegaBLAST
  Discontiguous Word Options

MegaBLAST vs Discontiguous MegaBLAST
  Query: NM_017460, Homo sapiens cytochrome P450, family 3, subfamily A, polypeptide 4 (CYP3A4), transcript variant 1, mRNA (2768 letters), vs Drosophila
  MegaBLAST = “No significant similarity found.”
  Discontiguous megaBLAST =

Another Example ...
  Query: NM_078651 Drosophila melanogaster CG18582-PA (mbt) mRNA (3244 bp); /note= mushroom bodies tiny; synonyms: Pak2, STE20, dPAK2
  Database: nr (nt), Mammalia[orgn]
  MegaBLAST = “No significant similarity found.”
  Discontiguous megaBLAST = numerous hits ...
  Ex: Discontiguous MegaBLAST
  Ex: BLASTN

PSI-BLAST (Position-specific Iterated BLAST)
  Example: Confirming relationships of purine nucleotide metabolism proteins
  gi|113340|sp|P03958|ADA_MOUSE ADENOSINE DEAMINASE (ADENOSINE
  MAQTPAFNKPKVELHVHLDGAIKPETILYFGKKRGIALPADTVEELRNIIGMDKPLSLPGF
  VIAGCREAIKRIAYEFVEMKAKEGVVYVEVRYSPHLLANSKVDPMPWNQTEGDVTPDDVVD
  EQAFGIKVRSILCCMRHQPSWSLEVLELCKKYNQKTVVAMDLAGDETIEGSSLFPGHVEAY
  RTVHAGEVGSPEVVREAVDILKTERVGHGYHTIEDEALYNRLLKENMHFEVCPWSSYLTGA
  VRFKNDKANYSLNTDDPLIFKSTLDTDYQMTKKDMGFTEEEFKRLNINAAKSSFLPEEEKK
  0.005: E value cutoff for PSSM
  RESULTS: Initial BLASTP
    Same results as protein-protein BLAST; different format
  Results of First PSSM Search
    Other purine nucleotide metabolizing enzymes not found by ordinary BLAST
  Tenth PSSM Search: Convergence
    Just below threshold, another nucleotide metabolism enzyme
  Reverse PSI-BLAST (RPS)-BLAST
    Adenosine/AMP Deaminase Domain ...
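PSI-BLAST and RPS-BLAST score residues against a position-specific score matrix, so the same amino acid can earn different scores at different positions. The sketch below illustrates that idea using rows 444-446 (H, R, D) of the DAF-1 PSSM listed in this document. It is only an illustration under simplifying assumptions: the helper name `pssm_score` is hypothetical, and gaps, iteration, and statistics are ignored.

```python
# Column order of the PSSM as given in the document.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Rows 444 (H), 445 (R) and 446 (D) of the DAF-1 PSSM, copied from the table.
PSSM_ROWS = [
    [-4, -2, -1, -3, -5, -2, -2, -4, 10, -6, -5, -3, -4, -3, -2, -3, -4, -5, 0, -5],
    [-4, 8, -3, -4, 0, -1, -2, -3, -2, -5, -4, 0, -3, -2, -4, -3, -3, 0, -4, -5],
    [-4, -4, -1, 8, -6, -2, 0, -3, -3, -5, -6, -3, -5, -6, -4, -2, -3, -7, -5, -5],
]


def pssm_score(rows, segment):
    """Sum the per-position scores of an ungapped segment against PSSM rows."""
    if len(segment) != len(rows):
        raise ValueError("segment length must equal the number of PSSM rows")
    return sum(row[AMINO_ACIDS.index(aa)] for row, aa in zip(rows, segment))


print(pssm_score(PSSM_ROWS, "HRD"))  # the query's own residues at 444-446
print(pssm_score(PSSM_ROWS, "HKD"))  # same segment with R replaced by K
```

Unlike a position-independent matrix such as BLOSUM62, the penalty for swapping R for K here comes from row 445 alone, so it reflects how conserved that particular position is.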
PHI-BLAST
  gi|231729|sp|P30429|CED4_CAEEL CELL DEATH PROTEIN 4
  MLCEIECRALSTAHTRLIHDFEPRDALTYLEGKNIFTEDHSELISKMSTRLERIANFLRIYRRQASE
  LIDFFNYNNQSHLADFLEDYIDFAINEPDLLRPVVIAPQFSRQMLDRKLLLGNVPKQMTCYIREYHV
  IKKLDEMCDLDSFFLFLHGRAGSGKSVIASQALSKSDQLIGINYDSIVWLKDSGTAPKSTFDLFTDI
  LKSEDDLLNFPSVEHVTSVVLKRMICNALIDRPNTLFVFDDVVQEETIRWAQELRLRCLVTTRDVEI
  ASQTCEFIEVTSLEIDECYDFLEAYGMPMPVGEKEEDVLNKTIELSSGNPATLMMFFKSCEPKTFEK
  Pattern: [GA]xxxxGK[ST]

What's New?

BLAST Databases
  Nucleotide ref
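PHI-BLAST restricts a search to sequences containing a user-supplied pattern. As a rough illustration, the sketch below looks for the PHI-BLAST pattern from the CED4 example in a fragment of that sequence using a regular expression. It rests on two assumptions: the pattern is read as `[GA]xxxxGK[ST]` (brackets denoting residue sets, `x` any residue), and the helper `pattern_to_regex` is a hypothetical converter for this compact notation only, not PHI-BLAST's actual pattern parser.

```python
import re


def pattern_to_regex(pattern):
    """Convert the compact notation ('x' = any residue, [..] = residue set) to a regex."""
    return re.compile(pattern.replace("x", "."))


# Third line of the CED4_CAEEL sequence as printed in the document.
CED4_FRAGMENT = (
    "IKKLDEMCDLDSFFLFLHGRAGSGKSVIASQALSKSDQLIGINYDSIVWLKDSGTAPKSTFDLFTDI"
)

regex = pattern_to_regex("[GA]xxxxGK[ST]")
match = regex.search(CED4_FRAGMENT)
if match:
    print(match.group(0), "at offset", match.start())
else:
    print("pattern not found")
```

A real PHI-BLAST run would then seed alignments at the pattern occurrence; this sketch stops at locating it.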
