Introduction to Gene Ontology (GO)

Purpose: Biologists often spend a great deal of time searching for information relevant to a given research area. Worse, different biological databases may use different terminology, much like dialects, which makes searching harder still and leaves automated searching with no consistent rules to follow. Gene Ontology is a project launched to solve this problem.

The most basic concept in Gene Ontology is the term. Every entry in GO has a unique numeric identifier of the form GO:nnnnnnn and a term name, such as "cell", "fibroblast growth factor receptor binding", or "signal transduction". Each term belongs to one ontology. There are three ontologies: molecular function, cellular component, and biological process.

A gene product may be found in more than one cellular component, may take part in many biological processes, and may perform different molecular functions in them. For example, the gene product "cytochrome c" is described by the molecular function term "oxidoreductase activity", by the biological process terms "oxidative phosphorylation" and "induction of cell death", and by the cellular component terms "mitochondrial matrix" and "mitochondrial inner membrane".

Terms in an ontology are related in two ways: is_a and part_of. The is_a relation is simple inclusion: A is_a B means that A is a subset of B, for example nuclear chromosome is_a chromosome. The part_of relation is slightly more complex: C part_of D means that whenever C is present it is part of D, but C need not always be present. For example, nucleus part_of cell: a nucleus is always part of a cell, but some cells have no nucleus.

The structure of an ontology is a directed acyclic graph. It resembles a classification tree, except that a term may have more than one parent. For example, the biological process term hexose biosynthesis has two parents, hexose metabolism and monosaccharide biosynthesis, because biosynthesis is a kind of metabolism and a hexose is a kind of monosaccharide.

Toxicogenomics

MAPPFinder and GenMAPP

1. Problem statement

Solution

One way of accelerating the pace of data analysis is to approach the data from a higher level of organization. This can be done using data-driven methods, such as hierarchical clustering and self-organizing maps. A complementary approach is to view the data at the level of known biological processes or pathways.

We have developed a tool called MAPPFinder that dynamically links gene-expression data to the GO hierarchy. One tool that assists in the identification of important biological processes is GenMAPP (Gene MicroArray Pathway Profiler), a program for viewing and analyzing microarray data on microarray pathway profiles (MAPPs) representing biological pathways or any other functional grouping of genes.

Input data

Library files

Result output

GeneMerge

Problem: While freely available through public databases, different sets of genomic data are often difficult to integrate into a given study because no common platform exists for such analysis.

For example, a great deal of genomic and proteomic information is available. For many genes, something is known about their molecular and biological function, pathway membership, physical chromosomal location, level of polymorphism, RNAi phenotypes, disease phenotypes, and rate of molecular evolution.

Requirement: Simple and flexible software that can take advantage of diverse genomic and proteomic data for both data mining and hypothesis testing is required.

Software function: GeneMerge can perform analyses on a wide variety of data quickly and easily and facilitates both data mining and hypothesis testing. Given a set of study genes, GeneMerge retrieves functional genomic data for each gene and provides statistical rank scores for over-representation of functions or categories within the set of study genes.

GeneMerge uses 4 input files:
1. Study set gene file
2. Population set gene file
3. Gene-association file
4. Description file

Study set gene file
It consists of the genes that are currently under investigation.
Study set gene file format:
genename; genename;

Population set gene file
It consists of the genes from which the study set was drawn, often all genes on a given microarray.
Population set gene file format:
genename; genename;

Gene-association file
Gene-association file format:
genename <tab> functionID;
genename <tab> functionID;
genename <tab> functionID;functionID;

Gene-association data
KEGG metabolic pathway, KEGG developmental pathway, GO molecular function, GO biological process, GO cellular component, RNAi phenotype, chromosomal location, knock-out phenotype, disease phenotypes, local recombination rate, transcription binding site, transcription binding site family, DNA methylation, acetylation, GC content, male specific, female specific, ortholog in clade X, rate of molecular evolution, tissue-specific expression, over/under-expression in experiment X.

Description file
The description file contains human-readable descriptions of gene-association IDs.
Description file format:
functionID <tab> description_of_function
functionID <tab> description_of_function
functionID <tab> description_of_function

Output text file
GMRG_Term: GeneMerge term, for example a GO identifier such as GO:0001234
Pop_freq: fraction of genes in the population with this term
Pop_frac: fraction of genes in the population with this term (whole numbers)
Study_frac: fraction of genes in the study set with this term (whole numbers)
Raw_es: raw P-value
e-score: Bonferroni-corrected P-value
Description: English description of the GeneMerge term
Contributing_genes: all the genes in the study set that are associated with this term

Statistical method: hypergeometric distribution
n: the population set, the set from which k is drawn
k: always the study set of genes
p: the fraction of genes in the population n associated with the particular identifier under investigation
r: the number of genes in the study set with a particular identifier

Example data
Population gene set: yeast genome
Study gene sets: up-regulated in Snf/Swi mutants (minimal media)
Gene-association files: yeast GO biological process
Description files: GO biological process descriptions
Results

Online GeneMerge

GoMiner: (Zeeberg et al., Genome Biology, March 2003)

GoMiner is a tool for biological interpretation of "omic" data, including data from gene expression microarrays. Omic experiments often generate lists of dozens or hundreds of genes that differ in expression between samples, raising the question, "What the h-ll does it all mean biologically?"

To answer that question, GoMiner leverages the Gene Ontology (GO) to classify the genes into biologically coherent categories and assess those categories statistically.

For biological interpretation of a microarray experiment, the user enters the list of genes on the array and has the option of flagging them as up-regulated, down-regulated, or neither with respect to some index standard.

This animation presents a tour of the highlights of GoMiner's functionality and user interface.

An Unchanged Gene
An Overexpressed Gene
An Underexpressed Gene
Total Genes in Category
Fisher's Exact p-Value for All Changed Genes in Category
Relative Enrichment of Underexpressed Genes in Category
Relative Enrichment of All Changed Genes in Category
Relative Enrichment of Overexpressed G
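
The over-representation statistics described for GeneMerge above (and the closely related Fisher's exact p-value that GoMiner reports per category) can be illustrated with a minimal Python sketch. It is not GeneMerge's own code. It assumes the score is the upper tail of the hypergeometric distribution, i.e. the probability of seeing at least r annotated genes when k study genes are drawn without replacement from a population of n genes of which m = p·n carry the term. The function names, the dictionary keys, and the symbol m are hypothetical, and the Bonferroni factor used here (the number of terms found in the study set) is an assumption that may differ from the factor GeneMerge applies.

```python
from math import comb


def parse_associations(text):
    """Parse lines of the form 'genename<TAB>functionID;functionID;'
    into a dict mapping gene name -> set of function IDs."""
    assoc = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        gene, _, ids = line.partition("\t")
        found = {i.strip() for i in ids.split(";") if i.strip()}
        assoc.setdefault(gene.strip(), set()).update(found)
    return assoc


def hypergeom_tail(n, m, k, r):
    """P(X >= r): k genes drawn without replacement from n genes,
    m of which carry the term (m corresponds to p * n)."""
    upper = min(m, k)
    hits = sum(comb(m, i) * comb(n - m, k - i) for i in range(r, upper + 1))
    return hits / comb(n, k)


def over_representation(study, population, associations):
    """Score every term that occurs in the study set.
    Returns a list of dicts sorted by raw P-value."""
    population = set(population)
    study = set(study) & population  # the study set is drawn from the population
    n, k = len(population), len(study)

    # Collect, for each term, the population genes annotated with it.
    term_genes = {}
    for gene, terms in associations.items():
        if gene in population:
            for term in terms:
                term_genes.setdefault(term, set()).add(gene)

    # Only terms with at least one study gene are tested (assumption).
    tested = {t: g for t, g in term_genes.items() if g & study}

    rows = []
    for term, genes in tested.items():
        contributing = genes & study
        raw = hypergeom_tail(n, len(genes), k, len(contributing))
        rows.append({
            "term": term,
            "pop_frac": (len(genes), n),
            "study_frac": (len(contributing), k),
            "raw_p": raw,
            # Simple Bonferroni correction, capped at 1.
            "corrected_p": min(1.0, raw * len(tested)),
            "contributing_genes": sorted(contributing),
        })
    return sorted(rows, key=lambda row: row["raw_p"])
```

A call such as `over_representation(study_genes, population_genes, parse_associations(file_text))` returns one row per term, loosely mirroring the Pop_frac, Study_frac, raw P-value, corrected P-value, and Contributing_genes columns of the output file described above. Exact integer binomial coefficients are used so that no external statistics library is needed. For large populations a library routine for the hypergeometric survival function would be faster.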