Lecture on Amyloidosis

Document summary

Genome Resources

LocusLink
A single query interface to:
- Sequences: RefSeqs, GenBank, HomoloGene
- Maps: MapViewer
- Entrez links
LocusLink will be replaced by Entrez Gene on March 1, 2005. Check the Gene FAQ for current information.

Entrez Gene
A single query interface to:
- Sequences: RefSeqs, GenBank, HomoloGene
- Maps: MapViewer
- Entrez links
- More organisms: all RefSeq genomes
- Entrez integration

Example searches:
- Gsn[sym] (amyloidosis)
- nadh24726 records
- Homo sapiens
- Hemochromatosis: NM_ and NP_ links to sequence
- hfe[gene name] AND human[orgn]: 52
- Hemochromatosis: chromosome location, gene location, sequence location

UniGene
Gene-oriented clusters of expressed sequences:
- Automatic clustering using MegaBLAST
- Each cluster represents a unique gene
- Informed by genome hits
- Information on tissue types and map locations
- Useful for gene discovery and selection of mapping reagents
Figure labels: query, 5 EST hits, 3 EST hits. Access: UniGene web page, ftp site.

HomoloGene
Automated detection of homologs among the annotated genes of completely sequenced eukaryotic genomes.

Orthologs and paralogs are the two types of homologous sequences. Orthologs are proteins from different species that evolved by vertical descent (speciation) and typically retain the same function as the original protein.

Paralogs are proteins within a given species that arose by gene duplication and may evolve new functions related to the original one. See the literature for more information.

Figure: gene duplication of an early globin gene into an A-chain gene and a B-chain gene. Leaves: frog A, chick A, mouse A, mouse B, chick B, frog B. Labels: paralogs (A chain vs. B chain), orthologs (within the A chains, within the B chains).

HomoloGene Build 37.2 (table columns: Species; Number of genes: input, grouped; groups).
Example search: rag1. Result labels: 12; recombination activating gene RAG1; Amniota.

Map Viewer
Example: adar (adenosine deaminase). Figure labels: 4, 3' UTR, 5' UTR.

Available maps:
- Sequence maps: Ab initio, Assembly, Repeats, BES_Clone, Clone, NCI_Clone, Contig, Component, CpG island, dbSNP haplotype, Fosmid, GenBank_DNA, Gene, Phenotype, SAGE_Tag, STS, TCAG_RNA, Transcript (RNA), Hs_UniGene, Hs_EST
- Cytogenetic maps: Ideogram, FISH Clone, Gene_Cytogenetic, Mitelman Breakpoint, Morbid/Disease
- Genetic maps: deCODE, Genethon, Marshfield
- RH maps: GeneMap99-G3, GeneMap99-GB4, NCBI RH, Stanford-G3, TNG, Whitehead-RH, Whitehead-YAC
- Also listed: Mm_UniGene, Mm_EST, Rn_UniGene, Rn_EST, Ssc_UniGene, Ssc_EST, Bt_UniGene, Bt_EST, Gga_UniGene, Gga_EST, Variation (= SNP)
Maps & Options (maps displayed): UniGene, Component, Repeats, Gene, Phenotype, Variation.

Web access
- Entrez: text
- BLAST: sequence
- VAST: structure

BLAST outline
- Why use sequence similarity?
- BLAST algorithm
- BLAST statistics
- BLAST output
- Examples: NCBI's tool

Global alignment vs. local alignment
Seq1: WHEREISWALTERNOW (16 aa)
Seq2: HEWASHEREBUTNOWISHERE (21 aa)

Global
    Seq1:  1   W--HEREISWALTERNOW     16
               W  HERE
    Seq2:  1 HEWASHEREBUTNOWISHERE    21

Local
    Seq1:  1 W--HERE 5        Seq1:  1 W--HERE 5
             W  HERE                   W  HERE
    Seq2:  3 WASHERE 9        Seq2: 15 WISHERE 21

Translated BLAST programs (N = nucleotide, P = protein)
    Program   Query   Database
    blastx    N       P
    tblastn   P       N
    tblastx   N       N
All three compare sequences at the protein level (P vs. P). Particularly useful for nucleotide sequences without protein annotations, such as ESTs or genomic DNA.

Nucleotide words
Query: GTACTGGACATGGACCCTACAGGAA
Make a lookup table of words (11-mers):
    GTACTGGACAT
     TACTGGACATG
      ACTGGACATGG
       CTGGACATGGA
        TGGACATGGAC
         GGACATGGACC
          GACATGGACCC
           ACATGGACCCT
            . . .

WORD SIZE
    Program     Minimum   Default
    megablast   8         28
    blastn      7         11

Protein words
Query: GTQITVEDLFYNIATRRKALKN
Make a lookup table of words: GTQ, TQI, QIT, ITV, TVE, VED, EDL, DLF, ...
Neighborhood words (for ITV): LTV, MTV, ISV, LSV, etc.
- Word size = 3 (default)
- Word size can only be 2 or 3
- -f 11 = blastp default

Minimum requirements for a hit
- Nucleotide BLAST requires one exact match.
  Example: ATCGCCATGCTTAATTGGGCTT, exact match CATGCTTAATT.
- Protein BLAST requires two neighboring matches within 40 aa (-A 40 = blastp default).
  Example: GTQITVEDLFYNI, with neighborhood words SEI and YYN as the two matches.

Neighborhood words and score threshold
Query: IETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILEV
Example query words: YLS, HFL
Neighborhood score threshold T (-f) = 11

    Word  Score      Word  Score
    HFL   18         YLS   15
    HFV   15         YLT   12
    HFS   14         YVS   12
    HWL   13         YIT   10
    NFL   13         etc.
    DFL   12
    HWV   10
    etc.

High-scoring pair (HSP), seeded by the word hits YLS and HFL:
    Query   1 IETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESI 47
              +E  YA YL K    F+YLSL +SP+ +DVNVHP+K  VHFL+   I
    Sbjct 287 LEETYAKYLHKGASYFVYLSLNMSPEQLDVNVHPSKRIVHFLYDQEI 333

Gapped extension with trace back (final HSP):
    Query   1 IETVYAAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESI-LEV 50
              +E  YA YL K    F+YLSL +SP+ +DVNVHP+K  VHFL+   I + +
    Sbjct 287 LEETYAKYLHKGASYFVYLSLNMSPEQLDVNVHPSKRIVHFLYDQEIATSI 337

Nucleotide scoring: identity matrix (-r 1 -q -3)
         A   G   C   T
    A   +1  -3  -3  -3
    G   -3  +1  -3  -3
    C   -3  -3  +1  -3
    T   -3  -3  -3  +1

    CAGGTAGCAAGCTTGCATGTCA
    || ||||||||||||  |||||
    CACGTAGCAAGCTTG-GTGTCA
    raw score = 19 - 9 = 10

Position-independent matrices
PAM matrices (Percent Accepted Mutation):
- Derived from observation; small dataset of alignments
- Implicit model of evolution
- All calculated from PAM1
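As a quick check of the nucleotide identity-matrix example above, the following Python sketch recomputes the raw score of that alignment with +1 per match and -3 per mismatch (the -r 1 -q -3 settings). The function name `raw_score` is hypothetical, and charging the single gap column 3 points is an assumption made only to reproduce the slide's 19 - 9 = 10; BLAST itself sets gap opening and extension costs separately.

```python
def raw_score(top: str, bottom: str, match: int = 1,
              mismatch: int = -3, gap: int = -3) -> int:
    """Sum per-column scores of two aligned strings ('-' marks a gap)."""
    if len(top) != len(bottom):
        raise ValueError("aligned strings must have equal length")
    score = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            score += gap        # assumed flat cost per gap column
        elif a == b:
            score += match      # identity matrix diagonal
        else:
            score += mismatch   # identity matrix off-diagonal
    return score


# The two aligned rows from the slide.
TOP = "CAGGTAGCAAGCTTGCATGTCA"
BOTTOM = "CACGTAGCAAGCTTG-GTGTCA"

# 19 matches, 2 mismatches and 1 gap column: 19 - 6 - 3
print(raw_score(TOP, BOTTOM))
```

Changing `gap` shows how strongly the final score depends on the gap model, which is why the flat value here should be read as illustrative only.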

PAM250 is widely used.

BLOSUM matrices (BLOck SUbstitution Matrices):
- Derived from observation; large dataset of highly conserved blocks
- Each matrix derived separately from blocks with a defined percent identity cutoff
- BLOSUM62: default matrix for BLAST

Position Specific Score Matrices (PSSMs): PSI-BLAST and RPS-BLAST

BLOSUM62 substitution matrix (lower triangle)
    A  4
    R -1  5
    N -2  0  6
    D -2 -2  1  6
    C  0 -3 -3 -3  9
    Q -1  1  0  0 -3  5
    E -1  0  0  2 -4  2  5
    G  0 -2  0 -1 -3 -2 -2  6
    H -2  0  1 -1 -3  0  0 -2  8
    I -1 -3 -3 -3 -1 -3 -3 -4 -3  4
    L -1 -2 -3 -4 -1 -2 -3 -4 -3  2  4
    K -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5
    M -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5
    F -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6
    P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7
    S  1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4
    T  0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5
    W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11
    Y -2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7
    V  0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4
    X  0 -1 -1 -1 -2 -1 -1 -1 -1 -1 -1 -1 -1 -1 -2  0  0 -2 -1 -1 -1
       A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  X
- Negative for less likely substitutions (D, F)
- Positive for more likely substitutions (Y, F)

PSSM scores: DAF-1, serine/threonine protein kinases catalytic loop (other figure labels: 174, 54)
            A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
    435 K  -1  0  0 -1 -2  3  0  3  0 -2 -2  1 -1 -1 -1 -1 -1 -1 -1 -2
    436 E   0  1  0  2 -1  0  2 -1  0 -1 -1  0  0  0 -1  0  0 -1 -1 -1
    437 S   0  0 -1  0  1  1  0  1  1  0 -1  0  0  0  2  0 -1 -1  0 -1
    438 N  -1  0 -1 -1  1  0 -1  3  3 -1 -1  1 -1  0  0 -1 -1  1  1 -1
    439 K  -2  1  1 -1 -2  0 -1 -2 -2 -1 -2  5  1 -2 -2 -1 -1 -2 -2 -1
    440 P  -2 -2 -2 -2 -3 -2 -2 -2 -2 -1 -2 -1  0 -3  7 -1 -2 -3 -1 -1
    441 A   3 -2  1 -2  0 -1  0  1 -2 -2 -2  0 -1 -2  3  1  0 -3 -3  0
    442 M  -3 -4 -4 -4 -3 -4 -4 -5 -4  7  0 -4  1  0 -4 -4 -2 -4 -1  2
    443 A   4 -4 -4 -4  0 -4 -4 -3 -4  4 -1 -4 -2 -3 -4 -1 -2 -4 -3  4
    444 H  -4 -2 -1 -3 -5 -2 -2 -4 10 -6 -5 -3 -4 -3 -2 -3 -4 -5  0 -5
    445 R  -4  8 -3 -4  0 -1 -2 -3 -2 -5 -4  0 -3 -2 -4 -3 -3  0 -4 -5
    446 D  -4 -4 -1  8 -6 -2  0 -3 -3 -5 -6 -3 -5 -6 -4 -2 -3 -7 -5 -5
    447 I  -4 -5 -6 -6 -3 -4 -5 -6 -5  3  5 -5  1  1 -5 -5 -3 -4 -3  1
    448 K   0  0  1 -3 -5 -1 -1 -3 -3 -5 -5  7 -4 -5 -3 -1 -2 -5 -4 -4
    449 S   0 -3 -2 -3  0 -2 -2 -3 -3 -4 -4 -2 -4 -5  2  6  2 -5 -4 -4
    450 K   0  3  0  1 -5  0  0 -4 -1 -4 -3  4 -3 -2  2  1 -1 -5 -4 -4
    451 N  -4 -3  8 -1 -5 -2 -2 -3 -1 -6 -6 -2 -4 -5 -4 -1 -2 -6 -4 -5
    452 I  -3 -5 -5 -6  0 -5 -5 -6 -5  6  2 -5  2 -2 -5 -4 -3 -5 -3  3
    453 M  -4 -4 -6 -6 -3 -4 -5 -6 -5  0  6 -5  1  0 -5 -4 -3 -4 -3  0
    454 V  -3 -3 -5 -6 -3 -4 -5 -6 -5  3  3 -4  2 -2 -5 -4 -3 -5 -3  5
    455 K  -2  1  1  4 -5  0 -1 -2  1 -4 -2  4 -3 -2 -3  0 -1 -5 -2 -3
    456 N   1  1  3  0 -4 -1  1  0 -3 -4 -4  3 -2 -5 -2  2 -2 -5 -4 -4
    457 D  -3 -2  5  5 -1 -1  1 -1  0 -5 -4  0 -2 -5 -1  0 -2 -6 -4 -5
    458 L  -3 -1  0 -3  0 -3 -2  3 -4 -2  3  0  1  1 -2 -2 -3  5 -1 -3
Command used: ./blastpgp -i NP_499868.2 -d nr -j 3 -Q NP_499868.pssm

BLAST statistics
High scores of local alignments between two random sequences follow the Extreme Value Distribution (figure axes: Score (S), Alignments). This applies to ungapped alignments.

$E = K m n e^{-\lambda S}$ or $E = m n 2^{-S'}$
- $K$ = scale for search space
- $\lambda$ = scale for scoring system
- $S'$ = bit score = $(\lambda S - \ln K) / \ln 2$

Expect value
E = number of database hits you expect to find by chance with score $\ge S$ (figure labels: your score; expected number of random hits).
More info: /BLAST/tutorial/Altschul-1.html

Example Entrez queries (nucleotide)
- nucleotide all[Filter] NOT mammalia[Organism]
- green plants[Organism]
- biomol mrna[Properties]
- gbdiv est[Properties] AND rat[organism]
Other advanced options:
- -e 10000: expect value
- -v 2000: descriptions
- -b 2000: alignments

Matrix selection
- PAM30: most stringent
- BLOSUM45: least stringent

Example Entrez queries (protein)
- proteins all[Filter] NOT mammalia[Organism]
- green plants[Organism]
- srcdb refseq[Properties]
Other advanced options:
- -e 10000: expect value
- -v 2000: descriptions
- -b 2000: alignments
Limit by taxon:
- Mus musculus[Organism]
- Mammalia[Organism]
- Viridiplantae[Organism]

Low-complexity filtering (figure labels: Filtered, Unfiltered)
sp|P27476|NSR1_YEAST NUCLEAR LOCALIZATION SEQUENCE BINDING PROTEIN (P67)
Length = 414
Score = 40.2 bits (92), Expect = 0.013
Identities = 35/131 (26%), Positives = 56/131 (42%), Gaps = 4/131 (3%)
Query: 362 STTSLTSSSTSGSSDKVYAHQMVRTDSREQKLDAFLQPLSKPLS-SQPQAIVTEDKTD 418
           S+S SSS+S SS + + +S + + S S S+ + E K
Sbjct: 29 SSSSSESSSSSSSSSESESESESESESSSSSSSSDSESSSSSSSDSESEAETKKEESKDS 88

Other BLAST programs
- Megablast
- Discontiguous Megablast
- PSI-BLAST
- PHI-BLAST

Trade-off: sensitivity vs. speed

WORD SIZE
    Program     Minimum   Default
    blastp      2         3
    megablast   8         28
    blastn      7         11

Discontiguous megaBLAST templates
W = word size (number of matches in the template); t = template length.
    W = 11, t = 16, coding:     1101101101101101
    W = 11, t = 16, non-coding: 1110010110110111
    W = 12, t = 16, coding:     1111101101101101
    W = 12, t = 16, non-coding: 1110110110110111
    W = 11, t = 18, coding:     101101100101101101
    W = 11, t = 18, non-coding: 111010010110010111
    W = 12, t = 18, coding:     101101101101101101
    W = 12, t = 18, non-coding: 111010110010110111
    W = 11, t = 21, coding:     100101100101100101101
    W = 11, t = 21, non-coding: 111010010100010010111
    W = 12, t = 21, coding:     100101101101100101101
    W = 12, t = 21, non-coding: 111010010110010010111
Reference: Ma, B, Tromp, J, Li, M. PatternHunter: faster and more sensitive homology search. Bioinformatics March, 2002; 18(3):440-5.

Example: NM_017460, Homo sapiens cytochrome P450, family 3, subfamily A, polypeptide 4 (CYP3A4), transcript variant 1, mRNA (2768 letters) vs. Drosophila.
MegaBLAST = “No significant similarity found.”
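One way to see why contiguous megaBLAST words can miss a coding-region match that a discontiguous template still seeds is sketched below in Python. It uses the W = 11, t = 16 coding template listed above, which ignores every third position. This is a minimal illustration, not NCBI's implementation: the function names and the two toy sequences (identical except at every third base) are hypothetical.

```python
TEMPLATE = "1101101101101101"  # W = 11, t = 16, coding (from the list above)


def contiguous_words(seq: str, w: int) -> set:
    """All contiguous words of length w, as used for exact-match seeding."""
    return {seq[i:i + w] for i in range(len(seq) - w + 1)}


def template_word(window: str, template: str) -> str:
    """Keep only the bases under a '1' in the template."""
    return "".join(base for base, bit in zip(window, template) if bit == "1")


def template_hits(query: str, subject: str, template: str) -> list:
    """Return (query_offset, subject_offset) pairs whose masked words agree."""
    t = len(template)
    index = {}
    for i in range(len(query) - t + 1):
        index.setdefault(template_word(query[i:i + t], template), []).append(i)
    hits = []
    for j in range(len(subject) - t + 1):
        for i in index.get(template_word(subject[j:j + t], template), []):
            hits.append((i, j))
    return hits


# Hypothetical toy sequences: same codons except for the third base of each.
QUERY = "GCTGAAACCTTGGATCCA"
SUBJECT = "GCAGAGACATTAGACCCT"

shared_11mers = contiguous_words(QUERY, 11) & contiguous_words(SUBJECT, 11)
print(shared_11mers)                            # no contiguous 11-mer in common
print(template_hits(QUERY, SUBJECT, TEMPLATE))  # includes the in-frame pair (0, 0)
```

The template still requires 11 matching positions, so the seed is no shorter than a blastn word; it only spreads those positions over 16 bases so that third-position differences do not break it.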

Discontiguous megaBLAST = numerous hits . . .

Query: NM_078651 Drosophila melanogaster CG18582-PA (mbt) mRNA, (3244 bp)
/note= mushroom bodies tiny; synonyms: Pak2, STE20, dPAK2
Database: nr (nt), Mammalia[orgn]
MegaBLAST = “No significant similarity found.”

Position-specific Iterated BLAST (PSI-BLAST)
Query:
gi|113340|sp|P03958|ADA_MOUSE ADENOSINE DEAMINASE (ADENOSINE
MAQTPAFNKPKVELHVHLDGAIKPETILYFGKKRGIALPADTVEELRNIIGMDKPLSLPGFVIAGCREAIKRIAYEFVEMKAKEGVVYVEVRYSPHLLANSKVDPMPWNQTEGDVTPDDVVDEQAFGIKVRSILCCMRHQPSWSLEVLELCKKYNQKTVVAMDLAGDETIEGSSLFPGHVEAYRTVHAGEVGSPEVVREAVDILKTERVGHGYHTIEDEALYNRLLKENMHFEVCPWSSYLTGAVRFKNDKANYSLNTDDPLIFKSTLDTDYQMTKKDMGFTEEEFKRLNINAAKSSFLPEEEKK
- 0.005: E value cutoff for PSSM
- First iteration: same results as protein-protein BLAST; different format
- Later iterations: other purine nucleotide metabolizing enzymes not found by ordinary BLAST
- Just below threshold: another nucleotide metabolism enzyme; check to add to PSSM
- AMP deaminases . . . .

PHI-BLAST
Query:
gi|231729|sp|P30429|CED4_CAEEL CELL DEATH PROTEIN 4
MLCEIECRALSTAHTRLIHDFEPRDALTYLEGKNIFTEDHSELISKMSTRLERIANFLRIYRRQASELIDFFNYNNQSHLADFLEDYIDFAINEPDLLRPVVIAPQFSRQMLDRKLLLGNVPKQMTCYIREYHVIKKLDEMCDLDSFFLFLHGRAGSGKSVIASQALSKSDQLIGINYDSIVWLKDSGTAPKSTFDLFTDILKSEDDLLNFPSVEHVTSVVLKRMICNALIDRPNTLFVFDDVVQEETIRWAQELRLRCLVTTRDVEIASQTCEFIEVTSLEIDECYDFLEAYGMPMPVGEKEEDVLNKTIELSSGNPATLMMFFKSCEPKTFEK
Pattern: [GA]xxxxGK[ST]

What's New?
- Select lower case
- Select red
- Gray line = same database hit
- HSPs color-coded independently
- Low complexity sequence filtered
- Limit to Organism: protein all[filter] N

Example Entrez queries
- proteins all[Filter] NOT mammalia[Organism]
- ray finned fishes[Organism]
- srcdb refseq[Properties]
Nucleotide only:
- biomol mrna[Properties]
- biomol genomic[Properties]
Other advanced options:
- -e 10000: expect value
- -v 2000: descriptions
- -b 2000: alignments
Combined: -e 10000 -v 2000

Figure labels: Gene, “hemochromatosis”, nucleotide sequence, Genome BLAST, Map Viewer, SNP, Protein, Domains, text search, sequence search.

Human EST:
TGCCTCCTTTGGTGAAGGTGACACATCATGTGACCTCTTCAGTGACCACTCTACGGTGTCGGGCCTTGAACTACTACCCCCAGAACATCACCATGAAGTGGCTGAAGGATAAGCAGCCAATGGATGCCAAGGAGTTCGAACCTAAAGACGTATTGCCCAATGGGGATGGGACCTACCAGGGCTGGATAACCTTGGCTGTACCCCCTGGGGAAGAGC

Primer search (-W 7 -e 1000):
CCATGGCGACCCTGGAAAAGCNNNNNNNNNNCAGCAGCGGCTGTGCCTGCGG
Labels: forward primer, reverse primer; forward, reverse.
/refse
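The primer search above pairs a small word size (-W 7) with a high expect threshold (-e 1000). A rough feel for why a high threshold suits such short queries comes from the expect-value formulas given in the statistics section. The Python sketch below evaluates them for a few raw scores. The values of `LAMBDA` and `K` are assumptions (figures of the kind BLAST reports for ungapped +1/-3 nucleotide scoring), `DB_LEN` is a hypothetical database size, and the function names are illustrative.

```python
import math

LAMBDA = 1.374  # assumed scale for the scoring system (+1/-3, ungapped)
K = 0.711       # assumed scale for the search space (same scoring system)


def bit_score(raw: float, lam: float = LAMBDA, k: float = K) -> float:
    """S' = (lambda * S - ln K) / ln 2"""
    return (lam * raw - math.log(k)) / math.log(2)


def expect_from_raw(raw: float, m: float, n: float,
                    lam: float = LAMBDA, k: float = K) -> float:
    """E = K * m * n * exp(-lambda * S)"""
    return k * m * n * math.exp(-lam * raw)


def expect_from_bits(bits: float, m: float, n: float) -> float:
    """E = m * n * 2 ** (-S')"""
    return m * n * 2.0 ** (-bits)


# Primer query from the slide: forward primer, run of N, reverse primer.
PRIMER_QUERY = "CCATGGCGACCCTGGAAAAGCNNNNNNNNNNCAGCAGCGGCTGTGCCTGCGG"
QUERY_LEN = len(PRIMER_QUERY)
DB_LEN = 2e10  # hypothetical database size in letters

for raw in (21, 15, 11):
    bits = bit_score(raw)
    e_value = expect_from_bits(bits, QUERY_LEN, DB_LEN)
    print(f"raw={raw:2d}  bits={bits:5.1f}  E={e_value:.3g}")
```

Under these assumed numbers, a full-length exact primer match stays well below an expect threshold of 10, while a 15-base exact match has an E value in the hundreds and would be reported only with a raised threshold such as 1000.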
