




Organization of the Human Genome
Department of Medical Genetics
Yaping Liu (刘雅萍)
Human Molecular Genetics (4th Edition) by Tom Strachan and Andrew Read

Key Concepts
- The human genome is subdivided into a large nuclear genome and a very small circular mitochondrial genome.
- Human genes are usually not discrete entities.
- Many RNA genes make functional noncoding RNAs that can be involved in diverse regulatory functions.
- Some copies of a functional gene come to acquire mutations that prevent their expression.

Outline
- General organization of the human genome
- The mitochondrial genome
- The nuclear genome
- Protein-coding genes
- RNA genes
- Highly repetitive DNA: heterochromatin and transposon repeats

General Organization

Mitochondrial Genome
- The DNA sequence of human mtDNA was published in 1981.
- It consists of a single type of circular double-stranded DNA, 16.6 kb in length.
- The two mtDNA strands differ in base composition:
  - Heavy (H) strand: rich in G
  - Light (L) strand: rich in C
- It is maternally inherited.

Mitochondrial Genome (continued)
- Contains 37 genes (28 encoded by the H strand, 9 by the L strand), tightly packed, with no introns.
- Transcription starts from common promoters.
- 24 of the 37 genes specify noncoding RNA; 13 of the 37 are protein-coding.
- The genetic code differs slightly from the universal genetic code: 60 codons specify amino acids.

[Figure: MtDNA Mutation & Human Diseases]

The Human Nuclear and Mitochondrial Genomes

| Feature | Nuclear genome | Mitochondrial genome |
| --- | --- | --- |
| Size | 3.1 Gb | 16.6 kb |
| Number of different DNA molecules | 23 (in XX cells) or 24 (in XY cells); all linear | one circular DNA molecule |
| Total number of DNA molecules per cell | varies according to ploidy; 46 in diploid cells | often several thousand copies (but copy number varies in different cell types) |
| Associated protein | several classes of histone and nonhistone protein | largely free of protein |
| Number of protein-coding genes | 21,000 | 13 |
| Number of RNA genes | uncertain, but at least 6,000 | 24 |
| Gene density | 1/120 kb, but great uncertainty | 1/0.45 kb |
| Repetitive DNA | more than 50% of genome | very little |
| Transcription | genes are often independently transcribed | multigenic transcripts are produced from both the heavy and light strands |
| Introns | found in most genes | absent |
| Percentage of protein-coding DNA | 1.1% | 32% |
| Codon usage | 61 amino acid codons plus three stop codons | 60 amino acid codons plus four stop codons |
| Recombination | at least once for each pair of homologs at meiosis | not evident |
| Inheritance | Mendelian for X chromosome and autosomes; paternal for Y chromosome | exclusively maternal |

Nuclear Genome
- 3.1 Gb in size: 2.9 Gb of euchromatin and 200 Mb of heterochromatin DNA.
- 24 different types of linear chromosomal DNA molecules.

DNA Content of Human Chromosomes

| Chromosome | Total DNA (Mb) | Euchromatin (Mb) | Heterochromatin (Mb) |
| --- | --- | --- | --- |
| 1 | 249 | 224 | 19.5 |
| 2 | 243 | 240 | 2.9 |
| 3 | 198 | 197 | 1.5 |
| 4 | 191 | 188 | 3 |
| 5 | 181 | 178 | 3 |
| 6 | 171 | 168 | 2.3 |
| 7 | 159 | 156 | 4.6 |
| 8 | 146 | 143 | 2.2 |
| 9 | 141 | 120 | 18 |
| 10 | 136 | 133 | 2.5 |
| 11 | 135 | 131 | 4.8 |
| 12 | 134 | 131 | 4.3 |
| 13 | 115 | 96.3 | 17.2 |
| 14 | 107 | 88.3 | 17.2 |
| 15 | 103 | 82.1 | 18.3 |
| 16 | 90 | 79 | 10 |
| 17 | 81 | 78.7 | 7.5 |
| 18 | 78 | 74.6 | 1.4 |
| 19 | 59 | 60.8 | 0.3 |
| 20 | 63 | 60.6 | 1.8 |
| 21 | 48 | 34.2 | 11.6 |
| 22 | 51 | 35.1 | 14.3 |
| X | 155 | 151 | 3 |
| Y | 59 | 26.4 | 31.6 |

Nuclear Genome: Gene Number
- Contains at least 26,000 genes, but the exact gene number is difficult to determine.
- 21,000 protein-coding genes.
- At least 6,000 RNA genes, with huge uncertainty about the true number.

Difficulties in Estimating Gene Number in Complex Genomes
- RNA genes are hard to predict: they lack a sizeable ORF, are small, and are comparatively poorly conserved.
- Genes expressed at low levels and/or at unusual cellular locations and stages of development.
- Very large genes with widely dispersed exons.
- Pseudogenes.

Nuclear Genome: Gene Distribution
- Human genes are unevenly distributed between and within chromosomes.
- [Figure: constitutive heterochromatin regions are gene-poor (green); euchromatin regions are gene-rich (red)]
- Duplication of DNA segments has resulted in copy-number variation and gene families.

Protein-Coding Genes
- Protein-coding genes show enormous variation in size and internal organization.

| Human protein | Size of protein (no. of amino acids) | Size of gene (kb) | No. of exons | Coding DNA (%) | Average size of exon (bp) | Average size of intron (bp) |
| --- | --- | --- | --- | --- | --- | --- |
| SRY | 204 | 0.9 | 1 | 94 | 850 | - |
| β-Globin | 146 | 1.6 | 3 | 38 | 150 | 490 |
| p16 | 156 | 7.4 | 3 | 17 | 406 | 3064 |
| Serum albumin | 609 | 18 | 14 | 12 | 137 | 1100 |
| Type VII collagen | 2928 | 31 | 118 | 29 | 77 | 190 |
| p53 | 393 | 39 | 10 | 6 | 236 | 3076 |
| Complement C3 | 1641 | 41 | 29 | 8.6 | 122 | 900 |
| Apolipoprotein B | 4563 | 45 | 29 | 31 | 487 | 1103 |
| Phenylalanine hydroxylase | 452 | 90 | 26 | 3 | 375 | 7100 |
| Factor VIII | 2351 | 186 | 26 | 3 | 96 | 3500 |
| Huntingtin | 3144 | 189 | 67 | 8 | 201 | 2361 |
| RB1 retinoblastoma protein | 928 | 198 | 27 | 2.4 | 179 | 6668 |
| CFTR | 1480 | 250 | 27 | 2.4 | 227 | 9100 |
| Titin | 34350 | 283 | 363 | 40 | 315 | 466 |
| Utrophin | 3433 | 567 | 74 | 2.2 | 168 | 7464 |
| Dystrophin | 3685 | 2400 | 79 | 0.6 | 180 | 30770 |

Slide annotations on the table: average exon size is independent of gene size; average intron size is dependent on gene size.

Protein-Coding Genes: Overlapping Transcription Units
- Different proteins can be specified by overlapping transcription units.
- Overlapping genes: genes in the HLA complex are tightly packed and overlapping.
- Genes within genes: intron 27b of the NF1 gene contains three small internal genes.

Protein-Coding Genes: Gene Families
- Protein-coding genes often belong to families of genes that may be clustered or dispersed on multiple chromosomes.
- Genes in a cluster are often closely related in sequence and are typically transcribed from the same strand.
- Gene families can be recognized according to the extent of sequence and structural similarity of the protein products.
- Gene duplication events that give rise to multigene families also create pseudogenes and gene fragments.

Pseudogenes
- Pseudogenes are usually thought of as defective copies of a functional gene to which they show significant sequence homology.
- [Figure: origin of nonprocessed pseudogenes (gene duplication, then mutational constraint on one copy and mutational flexibility on the other) and of processed pseudogenes (transcription and processing, reverse transcriptase copying mRNA into cDNA, chromosome integration, mutation)]
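The two routes in the pseudogene figure can be illustrated with a toy string model. This is only a sketch under loud assumptions: the sequence `GENE` is invented, uppercase letters stand for exons and lowercase for introns, and the function names and the poly(A) length are hypothetical choices made for illustration, not anything taken from the slides.

```python
# Toy model only: GENE is an invented sequence, not a real locus.
# Uppercase = exon, lowercase = intron.
GENE = "ATGGCC" + "gtaagtcag" + "GATTAC" + "gtcctttag" + "TTGTAA"


def nonprocessed_copy(gene: str) -> str:
    """Gene duplication copies the genomic DNA as-is, introns included."""
    return gene


def processed_copy(gene: str, poly_a_length: int = 8) -> str:
    """Mimic the RNA route: splicing removes the introns and a poly(A) tail
    is added before the transcript is copied back into DNA and integrated.
    poly_a_length is an arbitrary illustrative value."""
    exons_only = "".join(base for base in gene if base.isupper())
    return exons_only + "A" * poly_a_length


def has_introns(sequence: str) -> bool:
    """True if any lowercase (intron) base is present."""
    return any(base.islower() for base in sequence)


if __name__ == "__main__":
    print(has_introns(nonprocessed_copy(GENE)))  # duplicated copy keeps introns
    print(has_introns(processed_copy(GENE)))     # retrotransposed copy does not
```

The model deliberately ignores the later inactivating mutations that both kinds of copy accumulate; it only shows why one class retains introns and the other carries a 3' poly(A) tract.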
Pseudogenes (continued)

Features of nonprocessed pseudogenes:
- They contain introns.
- They can be transcribed (if the promoter and other elements are intact).
- They usually lie on the same chromosome as the functional gene.
- They arise from gene duplication.

Features of processed pseudogenes:
- They lack introns.
- They are flanked by direct repeats.
- They show no homology to the regulatory sequences of the functional gene.
- They carry a poly(A) tail at the 3' end.

RNA Genes
- In the past decade, there has been a revolution in how we view RNA.
- Whole-genome analyses using microarrays and high-throughput transcript sequencing have shown that at least 85% of the human genome is transcribed.
- Genes whose final products are functional noncoding RNA (ncRNA) molecules are now recognized to serve a great diversity of functions.
- [Figure: Extensive transcriptional complexity of human genes]
- [Figure: Functional diversity of RNA]

Highly Repetitive DNA
- Heterochromatin: DNA sequences present at certain subchromosomal regions as large arrays of tandem repeats.
  - Remains highly condensed throughout the cell cycle.
  - Does not generally contain genes.
- Transposon repeats: DNA sequences interspersed throughout the human genome.
  - Derived by duplicative transposition.
  - Account for more than 40% of the total DNA.
  - Reside in extragenic regions, introns and UTRs, and even coding regions.

Heterochromatin
- Heterochromatin mostly consists of long arrays of high-copy-number tandemly repeated DNA sequences, known as satellite DNA.

Major classes of high-copy-number tandemly repeated human DNA

| Class | Total array size | Size or sequence of repeat unit | Major chromosomal location(s) |
| --- | --- | --- | --- |
| Satellite DNA | often hundreds of kilobases | | associated with heterochromatin |
| α (alphoid DNA) | | 171 bp | centromeric heterochromatin of all chromosomes |
| β (Sau3A family) | | 68 bp | notably the centromeric heterochromatin of 1, 9, 13, 14, 15, 21, 22, and Y |
| Satellite 1 | | 25-48 bp (AT-rich) | centromeric heterochromatin of most chromosomes and other heterochromatic regions |
| Satellite 2 | | diverged forms of ATTCC/GGAAT | most, possibly all, chromosomes |
| Satellite 3 | | ATTCC/GGAAT | 13p, 14p, 15p, 21p, 22p, and heterochromatin on 1q, 9q, and Yq12 |
| DYZ19 | | 125 bp | 400 kb at Yq11 |
| DYZ2 | | AT-rich | Yq12; higher periodicity of 2470 bp |
| Minisatellite DNA | 0.1-20 kb | | at or close to telomeres of all chromosomes |
| Telomeric minisatellite | | TTAGGG | all telomeres |
| Hypervariable minisatellite | | 9-64 bp | all chromosomes, associated with euchromatin, notably in sub-telomeric regions |
| Microsatellite DNA | <100 bp | often 1-4 bp | widely dispersed throughout all chromosomes |

Transposon Repeats
- Transposons are mobile DNA sequences that can migrate to different regions of the genome.
- Transposon-derived interspersed repeats make up more than 40% of the human genome and arose mostly through RNA intermediates.
- Two groups of repeats:
  - Retrotransposons: copy-and-paste mechanism; LINEs, SINEs, and retrovirus-like elements (LTR transposons).
  - DNA transposons: cut-and-paste mechanism.
- [Figure: Mammalian transposon families, divided into elements that transpose independently and elements that cannot transpose independently]

The Human LINE-1 and Alu Repeats
- LINE-1 repeat element: 6.1 kb; 17% of the human genome; the only LINE family that continues to have actively transposing members.
- The LINE-1 endonuclease cuts one strand of a DNA duplex, preferably within the sequence TTTT↓A, and the reverse transcriptase uses the released 3'-OH end to prime cDNA synthesis.
- New insertion sites are flanked by a small target-site duplication of 2-20 bp.
- Alu repeat dimer: about 280 bp; the most abundant sequence in the human genome; consists of two tandem repeats.

Genetic Variations
Department of Medical Genetics
Yaping Liu (刘雅萍)
Human Molecular Genetics (4th Edition) by Tom Strachan and Andrew Read

Outline
- Terminology of genetic variation
- Types of variation: single-base to multiple-base changes
- Methods of detecting (genotyping) genetic variations

Terminology of Genetic Variation
- Variation: any nucleotide change in the genome.
- Rare variant versus polymorphism: a polymorphism is a variation found at a frequency above 0.01 (an arbitrary value); variations below that frequency are rare.

Polymorphism
- Population geneticists: a variant common enough that the rarer genotype could not be maintained simply by recurrent mutation. Example: the commonest cystic fibrosis mutation, p.F508del, has a minor allele frequency (MAF) of 0.01-0.02 in EUC.
- Clinical geneticists: a nonpathogenic variant, regardless of its frequency.

Mutation
- The term can refer to either the process or the product.
- Process: an event that changes a DNA sequence, e.g. "UV radiation produced a mutation in the DNA".
- Product: a DNA sequence change that may have happened a long time ago, e.g. "She had inherited a mutation from her father".

Types of Polymorphisms
- Minisatellites or VNTRs (Variable Numbers of Tandem Repeats)
- Microsatellites or SSRs (Simple Sequence Repeats) or STRs (Short Tandem Repeats)
- Single Nucleotide Polymorphisms (SNPs) and Restriction Fragment Length Polymorphisms (RFLPs)
- Copy number variations (CNVs)

Detection of DNA Polymorphism
- Many DNA polymorphisms are found by genome projects: the up to 10-fold sequence coverage used to identify sequence errors and help with assembly also provides a resource for SNP discovery.
- Polymorphic markers of known location in the genome are used for:
  - Gene mapping
  - Association of genes to phenotypes
  - Genetic identity
  - Population genetics

Detection Methods: The Most Informative Types of Genetic Variation

| Type | Example | Detection methods |
| --- | --- | --- |
| VNTR | | Flanking restriction sites with Southern blots; use PCR if small enough |
| STR | CTCTCT vs. CTCTCTCT | Amplify with flanking primers and genotype allele length with capillary electrophoresis |
| SNP | G vs. A | Many: ASO, 1-nt primer extension, oligo ligation, TaqMan, bead array, etc. |
| CNV | | CGH |

VNTR
- Also known as minisatellite repeats.
- 5 to 50 alleles: very informative.
- Heterozygosity: 50-90%.
- Usually extragenic.
- Detection method: some PCR-based, others by Southern blot.
- [Figure: Detection of VNTR by Southern blot; lane genotypes 4/6, 3/3, 1/7, 2/6, 1/4]

STR
- STRs are very useful for linkage studies.
- STRs are also known as markers because they can be used to mark gene locations, since their chromosomal location is accurately known.
- If a gene is close enough to a marker, it will be inherited together with a particular allele at the marker locus in families.
- Recombination between such marker loci is rare if they are closer than 10 million base pairs.
- Thus only about 300 STR markers are needed to link a disease gene to a particular 10-megabase region in a few families with an extended pedigree.
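The figure of about 300 markers follows from simple division, sketched below. The genome size (3.1 Gb) and the 10 Mb spacing are the values quoted in these slides; the function name `markers_needed` and the assumption of perfectly even spacing are illustrative simplifications, not part of the original material.

```python
# Values quoted in the slides.
GENOME_SIZE_BP = 3_100_000_000    # nuclear genome, 3.1 Gb
MARKER_SPACING_BP = 10_000_000    # recombination is rare below 10 Mb


def markers_needed(genome_size_bp: int, spacing_bp: int) -> int:
    """Number of evenly spaced markers needed so that every position lies
    within one spacing interval (ceiling division). Even spacing is a
    simplifying assumption."""
    return -(-genome_size_bp // spacing_bp)


if __name__ == "__main__":
    print(markers_needed(GENOME_SIZE_BP, MARKER_SPACING_BP))  # 310
```

The result, 310, is consistent with the rounded "about 300" used on the slide.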
STRs Often Very Polymorphic - Hence Informative

| Allele | Genotype |
| --- | --- |
| 1 | CTCTCT |
| 2 | CTCTCTCT |
| 3 | CTCTCTCTCT |
| 4 | CTCTCTCTCTCT |
| ... | ... |
| 11 | CTCTCTCTCTCTCTCTCTCTCTCTCT |

- Amplify with flanking PCR primers and genotype allele length using capillary electrophoresis in a DNA sequencer.
- Multiplex by dye color and size of amplicon.
- [Figure: STRs used to link inheritance with disease phenotype]

SNPs - Less Informative But More Frequent
- Numerically the most abundant type of genetic variant: 1 in 100 to 300 bp, 12 million in total.
- Example sequences, with the variant position marked:
  - GCCTGTTTTATATTAC/TGATCCAATTTTTTCA
  - GAGACAGAGTTTCGC(T)TCTTGTTGCCCAGGCT
  - CCAAGCCTGGAGCTA/GGCCGTGGGCCAGGCAAG
- Slide annotations: 10% of all nucleotides fall in such palindromic sequences; 2 of 12 million fall in such type, typically one or two nonrepeated nucleotides.

Advantages of SNPs
- Easy to scale up for high-throughput and massively parallel genotyping.
- Phenotypic changes produced by SNPs (e.g., human diseases caused by SNPs) can be directly genotyped.
- SNPs seem to comprise the largest class of functional polymorphisms (i.e., those producing phenotypic effects).
- SNPs open the way to the development of ultra-high-density maps.
- One SNP per 100 bp in humans (12,000,000).
- Affymetrix and Illumina now sell assays with 1,000,000 SNPs per chip.
- Usually only two alleles, rarely 3 different base choices at one locus, so only 2 unique probes may be needed to genotype.
- Simplest: allele-specific oligos (ASO) used for specific hybridization.
- [Figure: Detection of SNP/RFLP; lane genotypes +/-, -/-, +/+]
- [Figure: Detection of SNP by PCR/ASO]

SNP Genotyping Methods
- Over 100 different approaches.
- Ideal SNP genotyping platform:
  - High-throughput capacity
  - Simple assay design
  - Robust
  - Affordable price
  - Automated genotype calling
  - Accurate and reliable results

SNP Genotyping Methods: Principles
- PCR
- Discrimination between alleles:
  - Allele-specific hybridization
  - Allele-specific primer extension
  - Allele-specific oligonucleotide ligation
  - Allele-specific enzymatic cleavage
- Detection of the allelic discrimination:
  - Light emitted by the products
  - Mass
  - Change in the electrical property

SNP Genotyping Methods: Platforms
- No one system has a clear advantage over all the others; each has unique capabilities for specific problems.
- ABI TaqMan and SNaPshot dominate low- and intermediate-level SNP assays.
- Regional assignment and confirmation studies are being done with SNPlex and other intermediate-throughput technologies.
- Illumina and Affymetrix dominate whole-genome associations.

Genome-wide Association Studies
- Complex diseases such as diabetes are caused by more than one gene defect.
- Complex diseases do not show strong family association.
- SNPs can be selected to tag haplotype blocks within populations.
- SNPs spaced out along all chromosomes can be associated with disease (use 10^6 SNPs).
- Such studies involve around 1,000 patients (cases) and 1,000 normal individuals (controls).

Duplication and Genome Evolution
- Britten: ancient duplications; 97% of proteins share sequence matches (PNAS, 2006).
- Susumu Ohno: genome duplication, 1970.

Definitions
- Segmental duplications
- Low copy repeats (LCRs)
- Gene duplications
- Copy number variation (CNV)

Segmental Duplications*
- Large recent duplications (35-40 Myr).
- 1 to 300 kb in length.
- At least 90% sequence identity.
- About 5% of the human sequence.
- Cluster in pericentromeric and subtelomeric regions (1/3); the remainder are dispersed.
- Also known as low copy repeats (LCRs).

* Eichler EE, Genome Research 11:653, 2001

What is the extent of copy number variation across the genomes of normal individuals?

Copy Number Polymorphisms (CNPs)
- 221 CNPs observed, 76 unique (Sebat et al., Science 305: 525, 2004).
- See also Nat Genet 37:727, 2005.

| Author | Platform | Resolution information | Sample size | Ethnically diverse? | No. of CNVs detected | Length of CNVs |
| --- | --- | --- | --- | --- | --- | --- |
| Sebat | ROMA | 1 fragment per 17 kb | 20 | Yes | 76 | 465 kb average |
| Iafrate | Array CGH | 1 BAC per 1 Mb | 55 | Yes | 255 | - |
| Sharp | Segmental duplication BAC array (130 targets) | | 47 | Yes | 119 | - |
| Tuzun | Fosmid end sequence | 8-50 kb | 2 | No | 297 | most 8-40 kb |
| Hinds | Perlegen oligo array, 200 Mb of genome | 60 bp | 24 | Yes | 215 deletions | 70 bp-7 kb, median 750 bp |
| McCarroll | HapMap data (deletions only) | 1,000,000 SNPs | 269 | Yes | 541 deletions | 1-745 kb, median 7 |
| Conrad | HapMap data (deletions only) | 1,000,000 SNPs | 269 | Yes | 586 deletions | CEU: 18 kb; YRI: 14.8 |
| Redon | 26,574-clone array CGH and 500K SNP chip | 40 kb and 24 kb | 270 | Yes | 1447 | 300 kb average |
| Wong | 26,363 duplicate clones, array CGH | 40 kb | 105 | 16 ethnic groups | 3654 | 200 kb average |

CNV Detection Platforms
- Hybridization-based:
  - ROMA (representational oligonucleotide microarray analysis; oligos)
  - Affymetrix (oligos)
  - BAC arrays (BACs)
- Genotype-based, e.g. Illumina
- Sequence-based, e.g. fosmid end sequences
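Hybridization-based platforms such as array CGH compare the signal from a test genome with that from a reference genome at each probe. The slides do not describe the calling step, so the following is only a sketch of one common way such data are summarized, a log2 ratio with a cutoff. The function name `call_copy_number_state` and the threshold of 0.3 are hypothetical choices for illustration, not values from the lecture or from any vendor's software.

```python
import math


def call_copy_number_state(test_signal: float,
                           reference_signal: float,
                           threshold: float = 0.3) -> str:
    """Classify one probe as gain, loss, or no change from the log2 ratio
    of test to reference signal. The threshold is an arbitrary
    illustrative cutoff."""
    log2_ratio = math.log2(test_signal / reference_signal)
    if log2_ratio > threshold:
        return "gain"
    if log2_ratio < -threshold:
        return "loss"
    return "no change"


if __name__ == "__main__":
    # Idealized signals proportional to copy number (reference = 2 copies).
    for copies in (1, 2, 3):
        print(copies, call_copy_number_state(copies, 2))
```

With idealized signals, one copy against a two-copy reference gives a log2 ratio of -1, and three copies give about +0.58; real data are noisier, which is why calls are usually made over runs of adjacent probes rather than single probes.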