An online system for functional relationship analysis of genome-wide gene products

Qiang Hu, ZhengGuo Zhang
Department of Biomedical Engineering
Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences
School of Basic Medicine, Peking Union Medical College
Beijing, China
Email: zhangzg126126.com

978-1-4244-4713-8/10/$25.00 ©2010 IEEE

Abstract: Although functional relationship analysis of gene products is useful, a convenient and user-friendly tool to measure the functional similarity of genome-wide gene products in multiple species is still not available. We computed the genome-wide functional similarity of gene products in human, mouse and rat based on our algorithm. Databases and web services were built on the precomputed similarity scores. Our system provides a group of tools to retrieve the functional similarity and analyze the functional relationship of gene products. The web service is freely available at http://bme.pumc.edu.cn/fsim/index.html.

I. INTRODUCTION

Measuring the functional similarity of gene products is a useful method for investigating their relationship. One important application of functional similarity analysis is to predict and assess protein-protein interactions [1], [2], [3]. Another application is to discover positional candidate genes of diseases [4]. Functional similarity can also be used to cluster gene expression data, because functionally related genes have similar expression profiles [5].

Most methods for measuring functional similarity are based on the annotation information of gene products. The Gene Ontology (GO) database [6] provides a controlled vocabulary of terms to annotate the functions of gene products. It is widely adopted by most algorithms and tools that measure functional similarity.

Although many tools have been developed to measure functional similarity, a convenient and user-friendly tool to analyze the relationship of genome-wide gene products is still not available. The GO tools web page lists a lot of software based on the database. For example, AmiGO [7] and QuickGO [8] provide an interface to search and browse the ontology and annotation data. Users can compare the relationship of gene products with them, but not automatically. GOTax [9], which integrates the annotation data of proteins and protein families, provides a functional similarity search tool (FSST) based on the Information Content (IC) of GO terms. The tool can be used to measure the functional similarity of proteins and protein families. G-SESAME [10] developed a new algorithm to measure functional similarity. The web tool it offers can only be used to measure the functional similarity of two gene products. FunSimMat [11] calculated the similarity of proteins in UniProtKB [12]. A web search engine was developed to retrieve the functional similarity of proteins.

It would be helpful if a tool could assist biologists in comparing the functional relationship of genes of interest with whole-genome gene products. However, genome-wide relationship analysis cannot be carried out on ordinary computing servers. It would cost dozens of hours even on a high-performance cluster.

We developed an online system for functional relationship analysis of genome-wide gene products. An all-against-all functional similarity comparison of genome-wide gene products in human, mouse and rat was computed in advance based on our algorithms. Three databases were built to integrate the similarity scores, one per species. Based on the precomputed similarity scores, a web search engine was developed to retrieve the similarity scores directly. Some other related tools were developed to extend the online web services. Biologists can use the system easily to analyze the functional relationship of genome-wide gene products.

II. CONSTRUCTION AND CONTENT

A. Data Sets

The raw data used to calculate the similarity were taken directly from the annotation packages of the R/Bioconductor project [13], [14]. For example, the packages org.Hs.eg.db, org.Mm.eg.db and org.Rn.eg.db contain the GO annotation data of gene products in human, mouse and rat respectively. The packages are described in Table I. All of these GO-related packages were built by the Bioconductor project according to the latest version of the GO database as of March 2009. The annotation data for probe IDs of different microarray platforms were also taken from the annotation packages in Bioconductor.

TABLE I. DATA SETS ADOPTED IN THE DATABASES

Annotation package   Species   Raw data
org.Hs.eg.db         Human     GO annotation; mapping information between distinct identifications
org.Mm.eg.db         Mouse     ditto
org.Rn.eg.db         Rat       ditto
org.Hs.sp.db         Human     Protein identifiers to Entrez IDs
org.Mm.sp.db         Mouse     ditto
org.Rn.sp.db         Rat       ditto
GO.db                -         GO terms relationship and annotation
KEGG.db              -         Annotation maps for KEGG database

B. Implementation

1) Algorithm

Three databases integrated all similarity scores of genome-wide gene products in human, mouse and rat respectively. We proposed a novel algorithm to measure the relationship. A statistical model was built according to the common information of the annotation terms between two gene products.

The GO provides three structured vocabularies (ontologies) to describe gene products in terms of their associated biological processes (BP), cellular components (CC) and molecular functions (MF). GO terms are connected to each other by child-parent relationships. The three ontologies are structured as Directed Acyclic Graphs (DAG). GO terms lie at different levels of the DAG. Terms located close to the leaves of the DAG describe more specific meanings. These terms contain more information than terms located close to the root.

We defined a parameter, the Level Coefficient (LC), to denote the weight of the information of a GO term. The LC values of leaves were defined as 1. From children to parents, the LC values gradually decreased according to the ratio of their levels in the DAG.

A gene is usually annotated by more than one term in the three ontologies. The information of a term should also contain the information of its ancestor terms. Thus, the common terms between two gene products could be summarized in a contingency table. The LC values, as information weights of the terms, were counted into the contingency table. Therefore, the relationship of two gene products could be measured by statistically testing the agreement of the contingency table. We adopted the Kappa value to test the agreement. Furthermore, the Z test was used to test the significance of the Kappa value. When two gene products were functionally related, the Kappa value would be close to 1.

2) Similarity Scores Computation

There are more than ten thousand gene products in each species. An all-against-all comparison of all gene products required such a large amount of computing power that ordinary computers could not finish the calculation. The computational task was separated into small tasks by dividing the input data. If the number of genome-wide gene products is n, the i-th calculation task was to calculate the similarity scores between the i-th gene product and the gene products from the first to the i-th. Different calculation tasks were assigned to different CPUs in a high-performance cluster. Then the computational results were assembled into a matrix of similarity scores. Parallel programs based on the R language were developed to carry out the computation. The R packages Rmpi [15] and snow [16] provided parallel interfaces to the MPI library of the cluster environment.

C. Databases

Three databases were created to integrate the precomputed similarity score matrices of all gene products in human, mouse and rat. The scores included Kappa values and Z scores between every two gene products. For example, there were 17482 human gene products, so a score matrix with the dimension of 17482 × 17482 had to be stored in the databases. The R language [13] was used to develop programs to perform the computation. The result matrices were so huge that they were difficult to store in a regular relational database. We split the large score matrices into hundreds of matrices with smaller dimensions. Then our system stored the matrix data directly in R binary files (Rdata). The volume of the database files was approximately 4 gigabytes. The file database could be imported by R scripts.

D. Web system

The system can be visited through the Internet to retrieve and analyze the functional relationship of gene products. The Apache HTTP server was used to serve the HTML web pages. Through the web server, users could submit their data to the system and the results would be returned on the web pages. The R environment was the base of the system, in charge of data analysis and interaction with the databases. Rapache [17], as a functional module of Apache, connected the web server and the R environment. The data and variables submitted by the users could be transferred to the R environment via Apache. The results from R programs could also be returned to the users through the web server.

III. UTILITY AND DISCUSSION

A. Web Interfaces

Web interfaces to the database and analysis tools were developed. As shown in Figure 1, our web tools were designed in a concise and user-friendly way. The system provided tools for functional similarity search and classification of gene products. Some other tools, such as gene enrichment analysis, identifier conversion and GO annotation, were added to the system to assist the data analysis. Documents were also written in the FAQ page to describe the tools and give examples.

Fig. 1. Functional similarity search for gene products.

B. Functional similarity search for a single gene product

The gFSim tool provides a function to search for the gene products in the genome most related to a single gene product (Figure 1A). Several identifiers of gene products, including Entrez ID, Symbol, Unigene and SwissProt ID, were supported. Gene products in three species (human, mouse and rat) could be searched in the tool. The number of gene products in the results could be specified. The top 100 functionally similar gene products would be returned by default. Entrez ID, annotated GO terms and Z scores would be shown in the search results (Figure 1B). Gene products annotated with the same GO terms would be put in the same row. The search results could also be downloaded as a CSV (comma-separated values) file.

C. Functional similarity analysis for a group of gene products

The gsFSim tool could be used to retrieve and analyze the functional relationship of a group of gene products (Figure 1C). The tool supported the same identifiers and species as gFSim. A group of gene products could be submitted using separators such as commas, semicolons, spaces and line breaks. A similarity score matrix of the input gene products with Kappa values was shown in the results. The similarity score matrix was also visualized graphically. A heat map (Figure 1D) showed the annotated GO terms of the gene products. Blue in the graph denoted that the GO terms were used to annotate the corresponding gene products. Black meant that the terms did not annotate the gene products. A dendrogram (Figure 1E) in the results showed the hierarchical clustering according to the similarity score matrix. Gene products were classified into different groups based on their functional relationship.

D. Enrichment Analysis

Gene enrichment analysis [18] is a useful method to discover the specific functional annotation in selected genes drawn from the total universe of genes. As shown in Figure 2A, the annotation database should be selected first. The BP, MF and CC ontologies of the GO database and the KEGG pathway database [19] were supported in the tool. Then the p-value threshold of the significance test in the enrichment analysis algorithm could be specified. The p-value was 0.05 by default. If an annotation term was more specific and important in the selected gene products, the term would get a smaller p-value. This value could be used to restrict the number of results. If there was no result in the enrichment analysis, a bigger p-value could be assigned. A group of gene products of interest could be submitted as the selected genes. The overall gene products should be submitted as the universe genes. The analysis results include the significantly enriched functions, p-values, odds ratios and annotated counts (Figure 2B). The results could also be downloaded as a CSV file. The enrichment analysis tool could be used to analyze the results of a functional search for a group of gene products (gsFSim).

Fig. 2. Online tools for functional relationship analysis.

E. Microarray Probe ID Conversion

The microarray probe ID conversion tool could convert probe IDs from different microarray platforms to Entrez IDs (Figure 2C). Most commercial gene chips, such as Affymetrix, Agilent, GE (General Electric) and Illumina, were supported. Microarray probe IDs could be converted to Entrez IDs, and the IDs could then be submitted to the other tools to analyze the functional relationship. Therefore, the tool extends the identifier types of gene products supported by the system.

F. GO Annotation

A set of GO terms could be submitted to the annotation tool to look up detailed information in batch. After a group of GO terms was submitted, the results would be returned, including the term names, definitions, synonyms and LC values, in descending order of LC value. LC denotes the weighted information of a GO term. Thus the terms with more specific biological meanings would be shown at the front of the results.

IV. CONCLUSION

To develop a powerful and user-friendly tool for analyzing the functional relationship of genome-wide gene products, we computed in advance the functional similarity scores of all gene products in human, mouse and rat based on our algorithm. An online system was developed on the basis of the precomputed similarity scores. The system provides a group of tools to retrieve the functional similarity and analyze the relationship of genome-wide gene products. Our web services are freely available at http://bme.pumc.edu.cn/fsim/index.html.

ACKNOWLEDGMENT

This work was partially supported by China Medical Board of New York, Inc. (03787). The computing tasks for the similarity score matrices were performed in the High Performance Computing Center, Peking Union Medical College.

REFERENCES

[1] L. J. Lu, Y. Xia, A. Paccanaro, H. Yu, and M. Gerstein, "Assessing the limits of genomic data integration for predicting protein networks." Genome Res, vol. 15, no. 7, pp. 945–953, Jul 2005.
[2] A. Schlicker, C. Huthmacher, F. Ramírez, T. Lengauer, and M. Albrecht, "Functional evaluation of domain-domain interactions and human protein interaction networks." Bioinformatics, vol. 23, no. 7, pp. 859–865, Apr 2007.
[3] M. E. Futschik, G. Chaurasia, and H. Herzel, "Comparison of human protein-protein interaction maps." Bioinformatics, vol. 23, no. 5, pp. 605–611, Mar 2007.
[4] E. A. Adie, R. R. Adams, K. L. Evans, D. J. Porteous, and B. S. Pickard, "SUSPECTS: enabling fast and effective prioritization of positional candidates." Bioinformatics, vol. 22, no. 6, pp. 773–774, Mar 2006.
[5] Y. Qu and S. Xu, "Supervised cluster analysis for microarray data based on multivariate Gaussian mixture." Bioinformatics, vol. 20, no. 12, pp. 1905–1913, Aug 2004.
[6] M. Ashburner, C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin, and G. Sherlock, "Gene ontology: tool for the unification of biology. The Gene Ontology Consortium." Nat Genet, vol. 25, no. 1, pp. 25–29, May 2000.
[7] S. Carbon, A. Ireland, C. J. Mungall, S. Shu, B. Marshall, S. Lewis, AmiGO Hub, and Web Presence Working Group, "AmiGO: online access to ontology and annotation data." Bioinformatics, vol. 25, no. 2, pp. 288–289, Jan 2009.
[8] D. Binns, E. Dimmer, R. Huntley, D. Barrell, C. O'Donovan, and R. Apweiler, "QuickGO: a web-based tool for Gene Ontology searching." Bioinformatics, vol. 25, no. 22, pp. 3045–3046, Nov 2009.
[9] A. Schlicker, J. Rahnenführer, M. Albrecht, T. Lengauer, and F. S. Domingues, "GOTax: investigating biological processes and biochemical activities along the taxonomic tree." Genome Biol, vol. 8, no. 3, p. R33, 2007.
[10] Z. Du, L. Li, C. F. Chen, P. S. Yu, and J. Z. Wang, "G-SESAME: web tools for GO-term-based gene similarity analysis and knowledge discovery." Nucleic Acids Res, vol. 37, no. Web Server issue, pp. W345–W349, Jul 2009.
[11] A. Schlicker and M. Albrecht, "FunSimMat: a comprehensive functional similarity database." Nucleic Acids Res, vol. 36, no. Database issue, pp. D434–D439, Jan 2008.
[12] UniProt Consortium, "The Universal Protein Resource (UniProt) 2009." Nucleic Acids Res, vol. 37, no. Database issue, pp. D169–D174, Jan 2009.
[13] R Development Core Team, R: A language and environment for statistical computing, 2009, ISBN 3-900051-07-0. [Online]. Available: http://www.R-project.org
[14] R. C. Gentleman, V. J. Carey, D. M. Bates, B. Bolstad, M. Dettling, S. Dudoit, B. Ellis, L. Gautier, Y. Ge, J. Gentry, K. Hornik, T. Hothorn, W. Huber, S. Iacus, R. Irizarry, F. Leisch, C. Li, M. Maechler, A. J. Rossini, G. Sawitzki, C. Smith, G. Smyth, L. Tierney, J. Y. H. Yang, and J. Zhang, "Bioconductor: Open software development for computational biology and bioinformatics," Genome Biology, vol. 5, p. R80, 2004.
[15] H. Yu, Rmpi: Interface (Wrapper) to MPI (Message-Passing Interface), 2007, R package version 0.55. [Online]. Available: http://www.stats.uwo.ca/faculty/yu/Rmpi
[16] L. Tierney, A. J. Rossini, N. Li, and H. Sevcikova, snow: Simple Network of Workstations, 2004, R package version 0.30. [Online]. Available: http://www.sfu.ca/~sblay/R/snow.html
[17] J. Horner, rapache: Web application development with R and Apache, 2009. [Online]. Available: http://biostat.mc.vanderbilt.edu/rapache/
[18] A. Alexa, J. Rahnenführer, and T. Lengauer, "Improved scoring of functional groups from gene expression data by decorrelating GO graph structure." Bioinformatics, vol. 22, no. 13, pp. 1600–1607, Jul 2006.
[19] M. Kanehisa, "The KEGG database." Novartis Found Symp, vol. 247, pp. 91–101; discussion 101–3, 119–28, 244–52, 2002.
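
Supplementary sketch. Section II.B describes the similarity score only in outline: LC weights are counted into a contingency table, agreement is measured with a Kappa value, and the i-th task compares the i-th gene product with the first through i-th. The Python below is one possible reading of that outline, not the authors' R implementation. The 2 × 2 layout of the table, the use of every key of `lc` as the term universe, the term identifiers and both function names are assumptions made for illustration. The Z test is omitted because the paper does not give its form.

```python
def lc_weighted_kappa(terms_a, terms_b, lc):
    """Cohen's kappa on an LC-weighted 2x2 table (assumed layout).

    terms_a, terms_b: sets of GO term IDs annotating each gene product,
                      ancestors included.
    lc: dict mapping every term in the (assumed) universe to its LC weight.
    """
    both = only_a = only_b = neither = 0.0
    for term, weight in lc.items():
        in_a, in_b = term in terms_a, term in terms_b
        if in_a and in_b:
            both += weight
        elif in_a:
            only_a += weight
        elif in_b:
            only_b += weight
        else:
            neither += weight

    total = both + only_a + only_b + neither
    if total <= 0:
        raise ValueError("lc must contain positive weights")

    observed = (both + neither) / total      # weighted observed agreement
    p_a = (both + only_a) / total            # weighted share annotating A
    p_b = (both + only_b) / total            # weighted share annotating B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 0.0                           # kappa undefined; treated as 0 here
    return (observed - expected) / (1 - expected)


def task_pairs(i):
    """Pairs handled by the i-th task (1-based): gene i against genes 1..i."""
    return [(i, j) for j in range(1, i + 1)]


# Hypothetical term IDs and LC weights, for illustration only.
lc = {"T1": 1.0, "T2": 0.5, "T3": 0.5, "T4": 0.25}
same = lc_weighted_kappa({"T1", "T2"}, {"T1", "T2"}, lc)
disjoint = lc_weighted_kappa({"T1"}, {"T2"}, lc)
n_pairs = sum(len(task_pairs(i)) for i in range(1, 6))
```

Under these assumptions, identical annotation sets give a Kappa of 1 and disjoint sets give a negative value, which matches the paper's statement that functionally related gene products score close to 1. Splitting the work by row of the lower triangle yields n(n + 1)/2 comparisons in total, so later tasks are larger than earlier ones.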