分子文献作业RNA聚合酶II核心启动子转录的途径_第1页
分子文献作业RNA聚合酶II核心启动子转录的途径_第2页
分子文献作业RNA聚合酶II核心启动子转录的途径_第3页
分子文献作业RNA聚合酶II核心启动子转录的途径_第4页
分子文献作业RNA聚合酶II核心启动子转录的途径_第5页
已阅读5页,还剩12页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

1、 the rna polymerase ii core promoter the gateway totranscription rna聚合酶ii核心启动子转录的途径概要rna聚合酶ii(负责真核生物蛋白编码基因的转录(产物mrna),有7-10个亚基,最大亚基的羧基末端结构域(ctd)具有7个氨基酸(tyr-ser-pro-thr-ser -pro-set)的重复序列,其中有多个磷酸化位点ctd磷酸化对调控基因转录有重要作用)核心启动子(真核生物启动子中转录起始复合物组装的位置,是决定转录起始位置的关键序列,也是普通转录因子tfd的结合位点.)通常被定义为指导转录起始的一段序列。这个简单的定

2、义掩饰了转录模型的多样性和复杂性。核心启动子有两个主要的类型:集中启动子和分散启动子。集中启动子包括一个单一的转录起始位点或几个核苷酸上一群很明显的起始位点,然而分散的启动子包括50100个核苷酸上几个起始位点,最典型的在脊椎动物的cpg岛(人类基因组中位于大多数的基因上游一段富含gc的序列。)上被发现。集中启动子比分散启动子更古老,在自然界中分布更广泛。然而,在脊椎动物中分散启动子比集中启动子更常见。此外,核心启动子还包括许多不同的序列基序,如:tata 框、bre、 inr、 mte、 dpe、 dce 、和xcpe1,这些规定了转录的不同机制和对增强子的反应。因此,核心启动子是一个决定了

3、哪些信号可以起始转录的复杂的转录途径。rna聚合酶ii包括指导转录的起始序列。因此,大体上,核心启动子可以像一个作为通用转录起始位点的信号基序那样简单,也可以像每个启动子的唯一的指令序列那样复杂。从历史观点上来看:以前的模型被认为是正确的,但新出现的数据表明,核心启动子的结构和功能上有相当多的多样性。这些评论的客观性是为了提供一个与当前带有特别强调的序列的核心启动子相关主题的综述,此外,我们在两年后将出版的论文上注释了与核心启动子相关的数据。核心启动子的特性和它的同源因子不太可能严格绝对,应该被更深一层关注。因此,这篇随笔中描述的原则和思想只适用于当前的工作模式。集中启动子vs分散启动子对核心

4、启动子的大量研究致力于集中启动子的学习。集中启动子(也被叫做单峰胰岛素)有单一的起始位点或几个核苷酸上一群明显的起始位点。大多数的真核核心启动子是集中启动子。在脊椎动物中,然而,只有三分之一或更少的核心启动子是集中启动子。相反,大多数的基因(如众所周知的br 、mv、 pb)含有分散启动子,它们有许多起始位点分布在一个从50100个核苷酸的广阔区域的典型的变化范围内。(注释:分散启动子不能和交替启动子混淆,交替启动子是有区别的,它是典型的位于几百到几千个核苷酸旁有时起特异性调控作用的的启动子。)核心启动子原件,如:tata 框、bre、 inr、 mte、 dpe、 和dce,代表性的发现于集

5、中启动子上。这些核心启动子元件不是通用的而是每一个出现在核心启动子的一个小子集上,而且,有些核心启动子好像缺乏所有已知的核心启动子元件。很有趣来注意到,tata 框、bre是最古老的核心启动子。tata 框、bre,伴随着它们的同源蛋白因子,tbp, tf ii b,从古生菌到人类都是保守的。tata框也出现在植物的启动子中,mte 和dpe好像是在多细胞动物中保存着。相反,分散启动子有代表性的发现于脊椎动物的cpg岛上,普遍的缺乏tata 、dpe、 mte基序。因此,集中启动子比分散启动子更古老,在有机体中使用更广泛。此外,几个有助于集中启动子活性的关键序列基序已经被识别。另一方面,在脊椎

6、动物中,分散启动子比集中启动子更常见,而且对分散核心启动子中负责转录的序列和因子知之甚少。然而,很有趣注意到,分散启动子区tata缺乏的启动子通常是atp三联体缺陷性的。集中启动子和分散启动子的不同就在于不同的基础转录机制。启动子(promoter) 与基因转录启动有关的一组dna序列,一般位于rna poii转录起始点上游100-200bp以内,其功能是决定转录的起始点和调控转录频率启动子区域包括核心启动子和启动子上游近侧序列.启动子基序包括转录起始位点。基于功能分析,人类启动子的共有序列被认为是yyanwyy,在果蝇中是tcakty(退化核苷酸是预示着根据iupac核苷酸编码)在集中启动子

7、中,启动子中共有序l列“a”通常是“+1”起点。类似于启动子的序列在酿酒酵母中已被描述。起始子是集中启动子上最常见的序列,启动子是一个tf ii d结合的识别位点。虽然一些蛋白质被发现结合在启动子上,但tf ii d结合到启动子上特别重要,因为tf ii d结合到启动子的启动子区的专一性序列和启动子的保守序列完全相同。对成千上万的哺乳动物的转录起始位点的计算机注释表明,哺乳类启动子的共有序列是ir,r是相应的起点+1.相反,对成千上万的果蝇的核心启动子的计算机注释揭示了一个更为严格的共有序列“tcagty”。果蝇和哺乳类启动子共有序列专一性的显著不同表明哺乳类转录因子进化成为比果蝇转录因子在功

8、能上有一个更广的启动子序列范围。这个性质可能与分散启动子在哺乳类而不是果蝇中广泛分布有关。tata 框和bretata 框是是自然界中国最古老最广泛使用的核心启动子基序,是适当地可以识别的首个真核核心启动子。tata框有一个共有序列“tatawaar”,在启动子上游-31或-30到起点a+1,t核苷酸是最常见的。tata 框被tbp识别并结合,tbp在真核生物中是tf ii d的一个亚基,像上面提到的,tata 框出现在集中启动子的一个子集上。结果是,tata 框在脊椎动物中不常见,因为在脊椎动物中只有三分之一或更少的核心启动子是集中启动子并且只有一小部分集中启动子包含tata 框。联系上下文

9、,附带说明很重要的是:这是一次有点谨慎在解释核心启动子基序期望的频率的好的实践。tata 框出现的频率是根据参数估计的,这些参数被用来定义tata 框同时启动子所覆盖的那些用来分析的特殊的数据库的精确性。因此,虽然显而易见的是有许多tata 框缺失的启动子,但在脊椎动物转录中tata 框或它的同功序列的贡献是明确地起决定性作用的,甚至比我们现在任务为的更大。最后,有必要去断定每个核心启动子潜在基序实际的功能。tbe最初被作为直接的位于tata 框上游的子集 tf ii b结合序列来鉴别。后来发现tf ii b可以在breu或 bred序列结合到tata 框的上游和下游。breu的共有序列是ss

10、rcgcc,bred 位于tata 框的下游,共有序列rtdkkkk。依据启动子的内容breu和bred起正调控作用也可以起负调控作用。dpe和mtedpe(下游核心启动子元件)最初被作为下游tf ii b识别序列来鉴别,对于基础的转录活性很重要。从果蝇到人类都有dpe ,它位于启动子+28+33到a+1。果蝇中dpe的共有序列是“rgwyvt”.人类的dpe共有序列还未被断定,然而,哺乳类核心启动子包含的序列和果蝇中被发现的控制dpe活性的序列一致。dpe与启动子在功能上是协同的,启动子和dpe之间的距离对最佳转录是很关键的。光交联研究揭示dpe非常接近tf iid亚基taf6和taf9,包

11、括组蛋白折叠基序,与组蛋白h3,h4相关,很有可能tf iid的亚基 taf6-taf9 与dpe的相互作用方式与组蛋白 h3h4结合到dna核小体中类似。mte(10个基序的元件)通过计算机和生化研究的结合发现。在果蝇的启动子中mte中有一个从+18+27到+1的共有序列“csarcssaac”. +18+22核苷酸的突变可以使mte在试管和人工培养中失去活性。像dpe一样,mte与启动子共同作用有严格的inrmte间隔要求。此外,mte可以补偿由tata框和dpe发生突变造成基础转录活性的表达,而且,mte表现出与tata框和dpe基序的协同效应。mte和其他核心启动子基序的协同激发了包含

12、最佳版本的tata框,inr、mte、dpe的超级核心启动子scp的指导。scp是在试管中和人工培养的已知最强大的核心启动子。此外,scp展示了对tf iid结合的强大吸引力。可得到的证据表明mte出现在人类。首先果蝇的核心启动子中mte的突变通过试管或人工培养的细胞的人为因素上导致转录量的减少。因此,人类转录机制识别mte,第二,一个带有mte功能的核心启动子被识别,然而,mte通常不会出现在计算机注释的哺乳类启动子的数据库中超出比例的序列上。就像以上所描述的启动子,很有可能哺乳类gcg中 mte比果蝇有更宽广和更少限制的保守序列。在这里,很有趣注意到,对哺乳类核心启动子的分析揭示了一个位于

13、+20的“gcg基序”和位于+30的“ gcg重复基序”。这个 gcg基序和mte突变敏感的+18+22区域重叠, gcg基序也和dpe的位置重叠。因此,gcg基序可能是哺乳动物版的mte,而gcg重复基序可能与dpe相对应。dce和xcpe1基序dce(下游核心元件)最初在人类贝塔珠蛋白启动子上被发现,已被认为具有腺病毒主要后启动子的特征。dce和tata框一起出现的很频繁,和dpe相差很大。dce包括三条序列:s1,从+6到+1,cttc,s2,从+16到+21,ctgt,s3,从+30到+34,agc。光交联研究揭露了dce非常靠近taf1。xcpe1(x核心启动子元件1)基序位于8-+

14、2到+1为起点,有“dsgyggrasm”的保守序列。在人类的核心启动子上出现的概率是1%,大多数是tata框缺失的,xcpe1本身有很小的活性。相反,它与专一序列的激活蛋白如:nrf1,nf-1和sp1结合起作用。因此xcpe1可能是指导翻译起始的cpg岛中的序列专一性激活蛋白共同工作的基序大家庭的成员之一。在将来,调查不同核心启动子基序间功能相互作用也很重要。计算机注释揭露了各种各样的核心启动子基序结合同时出现。这个研究不仅确认了现有的已知核心启动子基序间的相互作用还鉴别了潜在的相互作用。核心启动子功能的多样性现存的不同的核心启动子元件导致了核心启动子功能的多样性。例如:增强子功能性地链接

15、到核心启动子上。一些转录增强子被发现对tata对抗dpe核心启动子基序展示出专一性。此外,不同因子调解不同类型核心启动子转录的基本进程。例如:一系列净化的转录因子(tfiid,tfb,tfiif,tfiie,tfiih,tfiia rna聚合酶 ii的普通转录因子(tf ii)包括tfiid,tfb,tfiif,tfiie,tfiih,tfiia等tfiid是由tata盒结合蛋白(tata binding protein,tbp)和8种tbp协同因子(tbp associated factor,taf)组成的复合物,tfiid可识别和结合核心启动子(tata盒和inr);tfb的c端与tfii

16、d和dna的复合物结合,n-端与tff协同作用募集rna聚合酶ii,再加上tfiie,tfiih形成完整的转录复合物tfiih有蛋白激酶活性,可使rna pol最大亚基ctd磷酸化,使转录起始过渡到转录延伸 tfiia有助于tfiid与tata盒核心启动子结合)对转录一个tata依赖性的核心启动子是充分的,但却被发现不能转录dpe 依赖性的核心启动子,而且,在转录反应执行粗核提取被发现nc2积累 dpe依赖性的转录和起始tata依赖性的转录,然而,nc2 的dpe依赖性的转录并没有在重建的净化系统中被看到,也许是由于缺失了在粗提中出现而在净化系统中未出现的额外的辅助因子。在分开的工作中也发现了

17、启动子元件导致了nc2tata依赖性的转录因子的抑制性反抗。因此,在 rna干扰损耗的研究中;taf1和taf4被观察到对于一个dpe 依赖的通告基因而不是一个tata 依赖的通告基因的转录很重要,现在,我们有一个不同类型核心启动子参与转录的因子的有趣的不完全图谱。这个项目是未来调查的一个重要领域。tbp相关因子 在tbp和tbp相关因子上可以看到转录机制功能的多样性。有三种trfs、trf2和trf3。trf1是在脊椎动物中缺失但在果蝇中出现,它结合到tc富集区,通过rna聚合酶ii,iii来调解转录。trf2不结合到tata框,涉及果蝇和脊椎动物中rna聚合酶ii的转录。果蝇的 trf2是

18、在一个多亚基复合物中包含iswi和dref蛋白。果蝇中trf2被发现有短的和长的形式存在,两种形式都与iswi相关。trf3在脊椎动物而不是果蝇中出现,正是 trf最接近tbp。trf3结合tata框序列上,通过rna聚合酶ii参与转录。更清楚的是,我们以前认为tbp是通用转录因子,对许多基因的转录是不需要的,相反,可得到的数据显示,tbp和trfs不同的功能涉及到许多不同的网络调控。例如:trf2对于tata缺失的组蛋白h1基因的转录来说是必须的但对于含有tata的s平面来说是不必要的。tbp在调控核心组蛋白的基因上展示了相反的作用。此外, trf2 和tbp的负面影响在tata缺失的1型多

19、发性神经纤维瘤启动子和含有tata的cfos启动子中被看见。惊人的,在成肌细胞分化为肌管时,含有trf3和trf3的异常复合物代替了tfiid复合物tbp。这个发现提供了一个tbp和trf3在细胞分化过程中交换的例子。在此,很有趣来注意到tbp和trf3在老鼠卵子发生时表达的不同。trf3也被发现对于斑马鱼的造血作用很重要。总结和展望核心启动子是多样性的,复杂的。我们仍旧需要获得对dna序列的更好的解释。揭示核心启动子的功能和对不同类型核心启动子有功能的蛋白质因子。特别重要是,我们需要投入更多的精力去研究分散核心启动子的转录机制,因为目前的证据表明分散核心启动子与集中核心启动子在转录机制和策略

20、上有根本性的不同。那将比较有趣来调查rna聚合酶ii结合到介导因子上是否影响本底性转录。例如:介导因子已被发现有助于含有dpe的启动子的转录。而且,一种新形式的叫做聚合酶(g)的rna聚合酶ii含有一个叫做g下调蛋白的传统蛋白质能够使rna聚合酶ii对介导因子有响应。此外,核染色质结构可能是核心启动子功能的组分,核染色质经常在h2a.z组蛋白的侧面,转录起点上被发现,它经常位于三甲基赖氨酸抗体的上游。如果有可能,例如,转录起始复合物的形成需要tfiid的亚基taf3结合到含有核小体的h3k4me上。从更广阔的远景上看,弄明白核心启动子在生物进化中如何行使功能也非常重要。例如:基因网络和发展。获

21、得和整合这个信息给我们提供这个事件的认识,反复考虑核心启动子转录的途径。 原文the rna polymerase ii core promoter the gateway to transcriptiontamar juven-gershon, jer-yuan hsu, joshua w. m. theisen, and james t. kadonagasection of molecular biology, university of california, san diego, 9500 gilman drive, la jolla, ca920930347, usasummary

22、 the rna polymerase ii core promoter is generally defined to be the sequence that directs theinitiation of transcription. this simple definition belies a diverse and complex transcriptional module.there are two major types of core promoters focused and dispersed. focused promoters containeither a si

23、ngle transcription start site or a distinct cluster of start sites over several nucleotides, whereasdispersed promoters contain several start sites over 50 to 100 nucleotides and are typically found in cpg islands in vertebrates. focused promoters are more ancient and widespread throughout nature th

24、an dispersed promoters; however, in vertebrates, dispersed promoters are more common thanfocused promoters. in addition, core promoters may contain many different sequence motifs, suchas the tata box, bre, inr, mte, dpe, dce, and xcpe1, that specify different mechanisms oftranscription and responses

25、 to enhancers. thus, the core promoter is a sophisticated gateway totranscription that determines which signals will lead to transcription initiation.introductionthe rna polymerase ii core promoter comprises the sequences that direct the initiation oftranscription (for reviews, see 1,2*,3*,4*,5*). t

26、hus, in principle, the core promoter could beas simple as a single motif that serves as a universal transcription start site, or as complex asa unique set of sequence instructions for each promoter. historically, the former model hasoften been presumed to be true, but emerging data indicate that the

27、re is considerable diversityin core promoter structure and function.the objective of this review is to provide an overview of current topics that relate to the corepromoter, with a particular emphasis on sequence motifs in core promoters. in addition, wehave annotated core promoter-related data in p

28、apers that were published in the past two years.it should further be noted that the properties of core promoters and their cognate factors arenot likely to be strictly absolute; hence, the principles and ideas described in this essay shouldbe taken only as current working models.focused versus dispe

29、rsed core promoter.page 2 short region of several nucleotides. most eukaryotic core promoters appear to be focused core promoters. in vertebrates, however, only about one-third or less of core promoters are focused core promoters; instead, the vast majority of genes appear to contain dispersed core

30、promoters (also known as br broad distribution, mu multimodal, or pb broad with dominant peak promoters), in which there are a number of transcription start sites distributed over a broad region that might typically range from 50 to 100 nucleotides (fig. 1). note that dispersed core promoters should

31、 not be confused with alternate promoters, which are distinct and sometimes differentially-regulated promoters that are typically located hundreds or thousands of nucleotides apart. core promoter elements such as the tata box, bre, inr, mte, dpe, and dce (fig. 2; discussed in greater detail below) a

32、re typically found in focused core promoters. these core promoter elements are not universal; rather, each is present in only a subset of core promoters. moreover, some core promoters appear to lack all of the known core promoter elements. it is interesting to note that the tata box and bre are the

33、most ancient of the core promoter motifs. the tata box and bre along with their cognate protein factors, tbp (tata box- binding protein) and tfiib (transcription factor iib), are conserved from archaea to humans (for review, see 6). the tata box is also present in plant promoters 7,8*. the mte and d

34、pe appear to be conserved among metazoans. in contrast, dispersed core promoters are typically found in cpg islands in vertebrates and generally lack tata, dpe, and mte motifs (see, for example 1,3*,5*,9,10*). thus, focused core promoters are more ancient and used in a much broader range of organism

35、s than dispersed promoters. in addition, several of the key sequence motifs that contribute to the activity of focused core promoters have been identified. on the other hand, in vertebrates, dispersed core promoters are more common than focused promoters. moreover, little is known about the sequence

36、s and factors that are responsible for transcription from dispersed core promoters. it is interesting to note, however, that the promoter region of dispersed tata-less promoters are generally deficient in atg triplets 11. there may be fundamental differences in the basic mechanisms of transcription

37、from dispersed versus focused core promoters.the initiator (inr) the initiator (inr) motif encompasses the transcription start site 1,12. based on functional assays, the inr consensus was determined to be yyanwyy in humans and tcakty in drosophila (degenerate nucleotides are indicated according to t

38、he iupac nucleotide code). the a nucleotide in the middle of the inr consensus is often the +1 start site in focused core promoters. inr-like sequences have also been described in saccharomyces cerevisiae (for example, see 13* and references therein). the inr is probably the most commonly occurring

39、sequence motif in focused core promoters (see, for example 14,15*,16*). the inr is a recognition site for the binding of tfiid. although a number of proteins have been found to bind to inr sequences, the binding of tfiid to the inr appears to be particularly important because the sequence specificit

40、y of tfiid binding to the inr region of the core promoter is identical to the inr consensus sequence 17. the computational analysis of thousands of mammalian transcription start sites suggests that the mammalian inr consensus is yr, where r corresponds to the +1 start site 10*,18*. in contrast, the

41、computational analysis of thousands of drosophila core promoters reveals a much more strict consensus sequence of tcagty 14,15*. this sharp difference in the specificity of the inr consensus between drosophila and mammals suggests that mammalian transcription factors have evolved to function with a

42、broader range of inr sequences than drosophila transcription factors. this property may be related to the prevalence of dispersed core promoters in mammals but not in drosophila.the tata box and bre the tata box, which is the most ancient and most widely-used core promoter motif throughout nature, w

43、as aptly the first eukaryotic core promoter element to be identified 19. the tata box has a consensus of tatawaar, where the upstream t nucleotide is most commonly at 31 or 30 relative to the a+1 (or g+1) in the inr (see, for instance 10*,20*). the tata box is recognized and bound by tbp, which is a

44、 subunit of the tfiid complex in eukaryotes. as noted above, the tata box is present in a subset of focused core promoters. consequently, the tata box appears to be uncommon in vertebrates (see, for example 10*, 21,22*), because only about one-third or less of vertebrate core promoters is focused an

45、d only a fraction of the focused core promoters contains a tata box. yet, in this context, it is useful to note parenthetically that it is probably a good practice to be somewhat cautious in the interpretation of expected frequencies of core promoter motifs. for instance, the frequency of occurrence

46、 of the tata box is an estimate that depends on the parameters (i.e., sequences and positions) that are used to define the tata box as well as the accuracy and promoter coverage of the particular database that is used in the analysis. thus, although it is clearly apparent that there are many tata-le

47、ss promoters, the contribution of the tata box or functionally equivalent sequences to vertebrate transcription remains to be determined unambiguously, and may be greater than is currently believed. ultimately, it will be necessary to determine the actual function of each potential motif in each cor

48、e promoter. the bre (tfiib recognition element) was originally identified as a tfiib-binding sequence that is immediately upstream of a subset of tata boxes 23. it was then found that tfiib can bind upstream or downstream of the tata box at the breu (upstream bre, which is the same as the original b

49、re) or bred (downstream bre) sequences 24,25*. the breu consensus is ssrcgcc 23. bred is located immediately downstream of the tata box and has a consensus of rtdkkkk 24. depending on the promoter context, the breu and bred can act in either a positive or negative manner 23,24,25*.the dpe and mte th

50、e dpe (downstream core promoter element) was identified as a downstream tfiid recognition sequence that is important for basal transcription activity 26. the dpe is conserved from drosophila to humans, and is located from +28 to +33 relative to the a+1 in the inr. the dpe consensus is rgwyvt in dros

51、ophila 27. the dpe consensus in humans has yet to be determined; however, mammalian core promoters containing sequences that conform to the drosophila consensus have been found to possess dpe activity. the dpe functions cooperatively with the inr, and the spacing between the inr and dpe is critical

52、for optimal transcription. photocrosslinking studies revealed that the dpe is in close proximity to the tfiid subunits taf6 (tafii60) and taf9 (tafii40), which contain histone fold motifs and are related to histones h4 and h3 28. it is thus possible that the taf6-taf9 subunits of tfiid interact with

53、 the dpe in a manner that is similar to binding of histones h3-h4 to dna in nucleosomes 29. the mte (motif ten element) was found by a combination of computational and biochemical studies 14,30. the mte has a consensus of csarcssaac from +18 to +27 relative to a +1 in the inr in drosophila. mutation of the nucleotides from +18 to +22 can

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论