Protein Structure Prediction (蛋白质结构预测.doc)

Uploaded by: 清*** | IP location: Henan | Uploaded: 2020-04-26 | Format: DOC | Pages: 27 | Size: 7.31 MB


Structure prediction for 14-3-3 protein 10

Protein sequence of 14-3-3 protein 10 (P93207)

MAALIPENLSREQCLYLAKLAEQAERYEEMVQFMDKLVLNSTPAGELTVEERNLLSVAYKNVIGSLRAAWRIVSSIEQKEESRKNEEHVHLVKEYRGKVENELSQVCAGILKLLESNLVPSATTSESKVFYLKMKGDYYRYLAEFKIGDERKQAAEDTMNSYKAAQEIALTDLPPTHPIRLGLALNFSVFYFEILNSSDKACSMAKQAFEEAIAELDTLGEESYKDSTLIMQLLRDNLTLWTSDAQDQLDES

Isoelectric point and relative molecular mass calculation

Compute pI/Mw
Theoretical pI/Mw (average) for the user-entered sequence:

        10         20         30         40         50         60
MAALIPENLS REQCLYLAKL AEQAERYEEM VQFMDKLVLN STPAGELTVE ERNLLSVAYK
        70         80         90        100        110        120
NVIGSLRAAW RIVSSIEQKE ESRKNEEHVH LVKEYRGKVE NELSQVCAGI LKLLESNLVP
       130        140        150        160        170        180
SATTSESKVF YLKMKGDYYR YLAEFKIGDE RKQAAEDTMN SYKAAQEIAL TDLPPTHPIR
       190        200        210        220        230        240
LGLALNFSVF YFEILNSSDK ACSMAKQAFE EAIAELDTLG EESYKDSTLI MQLLRDNLTL
       250
WTSDAQDQLD ES

Theoretical pI/Mw: 4.80 / 28624.47

Protein parameter prediction

Number of amino acids: 252
Molecular weight: 28624.4
Theoretical pI: 4.80

Amino acid composition:
Ala (A)  24   9.5%
Arg (R)  11   4.4%
Asn (N)  11   4.4%
Asp (D)  12   4.8%
Cys (C)   3   1.2%
Gln (Q)  11   4.4%
Glu (E)  30  11.9%
Gly (G)   8   3.2%
His (H)   3   1.2%
Ile (I)  11   4.4%
Leu (L)  33  13.1%
Lys (K)  17   6.7%
Met (M)   7   2.8%
Phe (F)   7   2.8%
Pro (P)   6   2.4%
Ser (S)  21   8.3%
Thr (T)  11   4.4%
Trp (W)   2   0.8%
Tyr (Y)  11   4.4%
Val (V)  13   5.2%
Pyl (O)   0   0.0%
Sec (U)   0   0.0%
    (B)   0   0.0%
    (Z)   0   0.0%
    (X)   0   0.0%

Total number of negatively charged residues (Asp + Glu): 42
Total number of positively charged residues (Arg + Lys): 28

Atomic composition:
Carbon    C  1265
Hydrogen  H  2012
Nitrogen  N   332
Oxygen    O   402
Sulfur    S    10

Formula: C1265H2012N332O402S10
Total number of atoms: 4021

Extinction coefficients:
Extinction coefficients are in units of M-1 cm-1, at 280 nm measured in water.

Ext. coefficient    27515
Abs 0.1% (=1 g/l)   0.961, assuming all pairs of Cys residues form cystines

Ext. coefficient    27390
Abs 0.1% (=1 g/l)   0.957, assuming all Cys residues are reduced

Estimated half-life:
The N-terminal of the sequence considered is M (Met).
The estimated half-life is:
30 hours (mammalian reticulocytes, in vitro).
20 hours (yeast, in vivo).
10 hours (Escherichia coli, in vivo).

Instability index:
The instability index (II) is computed to be 40.77
This classifies the protein as unstable.

Aliphatic index: 92.58
Grand average of hydropathicity (GRAVY): -0.356

Prediction of amino acid composition, charge distribution, hydrophobic regions, transmembrane regions, etc.

SAPS output for P93207
ISREC-Server Date: Sat Nov 27 6:42:14 Europe/Zurich 2010
SAPS. Version of April 11, 1996.
Date run: Sat Nov 27 06:42:15 2010

SAPS (Statistical Analysis of Protein Sequences) evaluates by statistical criteria a wide variety of protein sequence properties. A full description of the methods is given in the paper referred to below. The output is organized in the following sections: file name, sequence printout, compositional analysis, charge distributional analysis (charge clusters; high scoring (un)charged segments; charge runs and patterns), distribution of other amino acid types (high scoring hydrophobic and transmembrane segments; cysteine spacings), repetitive structures (in the amino acid alphabet and in an 11-letter reduced alphabet), multiplets (counts, spacings, and clusters in the amino acid and charge alphabets), periodicity analysis, spacing analysis. Each section is annotated below under its section title.

The SAPS program was developed in the group of Prof. Samuel Karlin at Stanford University. Correspondence relating to SAPS should be addressed to either Volker Brendel or Samuel Karlin at the Department of Mathematics, Stanford University, Stanford CA 94305, U.S.A.; phone: (415) 723-2209; fax: (415) 725-2040; email: . Users of the program should cite the following reference:

Brendel, V., Bucher, P., Nourbakhsh, I., Blaisdell, B.E., Karlin, S. (1992) Methods and algorithms for statistical analysis of protein sequences. Proc. Natl. Acad. Sci. USA 89: 2002-2006.

Protein 1 (File: wwwtmp/.SAPS.10079.6464.seq)

SWISS-PROT ANNOTATION:
ID P93207
DE P93207, 252 bases, 2EE29C11 checksum.

number of residues: 252; molecular weight: 28.6 kdal

  1 MAALIPENLS REQCLYLAKL AEQAERYEEM VQFMDKLVLN STPAGELTVE ERNLLSVAYK
 61 NVIGSLRAAW RIVSSIEQKE ESRKNEEHVH LVKEYRGKVE NELSQVCAGI LKLLESNLVP
121 SATTSESKVF YLKMKGDYYR YLAEFKIGDE RKQAAEDTMN SYKAAQEIAL TDLPPTHPIR
181 LGLALNFSVF YFEILNSSDK ACSMAKQAFE EAIAELDTLG EESYKDSTLI MQLLRDNLTL
241 WTSDAQDQLD ES

COMPOSITIONAL ANALYSIS (extremes relative to: swp23s.q)

The composition of the input sequence is evaluated relative to the residue usage quantile table specified with the -s species flag. Low usage in the 1% quantile is indicated by the label -- (e.g., Y-- means that the input sequence uses tyrosine as little as the 1% least tyrosine containing proteins in the reference set); low usage in the 5% quantile is indicated by the label - (e.g., L-); high usage above the 95% quantile point is indicated by the label + (e.g., A+); and high usage above the 99% quantile point is indicated by the label ++ (e.g., LIVFM++). The usage is evaluated for all 20 amino acids, positive (KR) and negative (ED) charge, total charge (KRED), net charge (KR-ED), major hydrophobics (LVIFM), and the groupings ST, AGP (encoded by CCN, GCN, and GGN codons), and FIKMNY (encoded by AAN, AUN, UAN, and UUN codons).

A  : 24( 9.5%); C  :  3( 1.2%); D  : 12( 4.8%); E+ : 30(11.9%); F  :  7( 2.8%)
G- :  8( 3.2%); H  :  3( 1.2%); I  : 11( 4.4%); K  : 17( 6.7%); L  : 33(13.1%)
M  :  7( 2.8%); N  : 11( 4.4%); P  :  6( 2.4%); Q  : 11( 4.4%); R  : 11( 4.4%)
S  : 21( 8.3%); T  : 11( 4.4%); V  : 13( 5.2%); W  :  2( 0.8%); Y  : 11( 4.4%)

KR      :  28 ( 11.1%);  ED      :  42 ( 16.7%);  AGP     :  38 ( 15.1%);
KRED    :  70 ( 27.8%);  KR-ED - : -14 ( -5.6%);  FIKMNY  :  64 ( 25.4%);
LVIFM   :  71 ( 28.2%);  ST      :  32 ( 12.7%).

CHARGE DISTRIBUTIONAL ANALYSIS

The distribution of charges in the protein sequence is evaluated in terms of clusters, high scoring segments, and runs and periodic patterns.
Clusters indicate regions of typically 30 to 60 residues exhibiting a relatively high charge concentration. For high scoring charge segments, positive scores are assigned to charge residues of the appropriate type and negative scores to all other residues. A significant cumulative positive score again indicates a region of high charge concentration. The cluster method and the scoring method will generally pick out the same segments (with the scoring method often delimiting the segment to a narrower range), conferring robustness to the results. Short segments of high charge concentration are displayed as runs (with errors). Periodic patterns focus on those with charges every second or third position, with possible relevance to amphipathic secondary structures; other periodic patterns are displayed in the general periodicity analysis section of the output.

  1 000000-000 +-000000+0 0-00-+0--0 0000-+0000 00000-000- -+0000000+
 61 000000+000 +00000-0+- -0++0--000 00+-0+0+0- 0-00000000 0+00-00000
121 00000-0+00 00+0+0-00+ 000-0+00-- ++000--000 00+000-000 0-0000000+
181 0000000000 00-00000-+ 00000+000- -000-0-000 --00+-0000 0000+-0000
241 000-00-00- -0

A. CHARGE CLUSTERS.

Positive, negative, and mixed charge clusters are distinguished. In each case, cmin indicates the minimum number of charges required for a significant charge cluster corresponding to the given window size; e.g., cmin = 9/30 or 12/45 or 15/60 means that significance requires at least 9 charges in a segment of 30 (or fewer) residues, or 12 charges in a segment of length 45, or 15 charges in a segment of length 60. In the case of positive and negative charge clusters, these counts refer to net charge, i.e., charges of the opposite sign within the window are counted as -1. The sizes of the clusters are optimized for display to indicate the segment of highest charge concentration, but a minimum size of 20 residues is required.
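The charge map shown for this protein can be regenerated from the sequence alone. The following is a minimal Python sketch, not SAPS code. It assumes K and R are written as +, D and E as -, and every other residue as 0 (the KR and ED classes of the compositional analysis). The helper names charge_map, blocks, and max_window_count are made up for illustration. The window scan only counts charged residues in a sliding window; it does not reproduce the t-value test that SAPS applies before reporting a cluster.

```python
# Sequence of P93207 as given in this document (252 residues).
SEQ = (
    "MAALIPENLSREQCLYLAKLAEQAERYEEMVQFMDKLVLNSTPAGELTVEERNLLSVAYK"
    "NVIGSLRAAWRIVSSIEQKEESRKNEEHVHLVKEYRGKVENELSQVCAGILKLLESNLVP"
    "SATTSESKVFYLKMKGDYYRYLAEFKIGDERKQAAEDTMNSYKAAQEIALTDLPPTHPIR"
    "LGLALNFSVFYFEILNSSDKACSMAKQAFEEAIAELDTLGEESYKDSTLIMQLLRDNLTL"
    "WTSDAQDQLDES"
)


def charge_map(seq):
    """Map each residue to '+', '-' or '0' (assumed classes: KR, ED, other)."""
    symbol = {"K": "+", "R": "+", "D": "-", "E": "-"}
    return "".join(symbol.get(aa, "0") for aa in seq)


def blocks(text, width=10):
    """Split a string into consecutive blocks of the given width."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def max_window_count(cmap, window, symbols="+-"):
    """Largest number of the given charge symbols in any window of that size."""
    flags = [c in symbols for c in cmap]
    count = sum(flags[:window])
    best = count
    for i in range(window, len(flags)):
        count += flags[i] - flags[i - window]  # slide the window by one residue
        best = max(best, count)
    return best


if __name__ == "__main__":
    cmap = charge_map(SEQ)
    rows = blocks(cmap, 60)
    for n, row in enumerate(rows):
        # Same layout as the SAPS printout: start position, then blocks of ten.
        print(f"{n * 60 + 1:>3} " + " ".join(blocks(row)))
    # Charged residues of either sign in the densest 30-residue window,
    # to compare against the mixed-cluster threshold cmin = 18/30.
    print("max charges in 30 residues:", max_window_count(cmap, 30))
```

Passing symbols="+" or symbols="-" to max_window_count gives a raw count of one sign only; the net-charge rule described above (opposite charges counted as -1) would need a signed sum instead.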
A mixed charge cluster that begins and ends within 15 residues of the endpoints of a pure charge cluster is not displayed (since its significance rests mostly on the charged residues comprising the displayed pure charge cluster), unless the -v (verbose output) flag is set, in which case both the pure and the mixed charge cluster are displayed. On the other hand, pure charge clusters that are embedded in mixed charge clusters are displayed separately (indicated by a * preceding the specification of location). For each cluster are given its location in the sequence (from, to), the quartile of the location (1st, 2nd, 3rd, or 4th quarter of the sequence), length, count, and t-value (standard deviations above the mean; to accommodate the multiple tests performed, the t-value significance threshold is set to 4.0 for sequences up to 750 residues, to 4.5 for sequences of length 750-1500 residues, and to 5.0 for longer sequences); also indicated are residues comprising at least 10% of the cluster.

Positive charge clusters (cmin = 10/30 or 13/45 or 16/60): none
Negative charge clusters (cmin = 13/30 or 17/45 or 21/60): none
Mixed charge clusters (cmin = 18/30 or 24/45 or 30/60): none

B. HIGH SCORING (UN)CHARGED SEGMENTS.

For each scoring scheme (scores assigned to residues as displayed), SAPS displays segments of the sequence with aggregate score exceeding the particular threshold values M_0.01 (1% significance level, segments labeled with **), M_0.05 (5% significance level, segments labeled *), or otherwise as indicated. A minimal segment length is set as shown.
The expected score/letter should be sufficiently large negative, and the average information per letter should be sufficiently large positive in order for the scoring statistics to apply properly (the program prints out when the conditions are not met and skips evaluations).

High scoring positive charge segments:
score=  2.00  frequency= 0.111  ( KR )
score=  0.00  frequency= 0.000  ( BZX )
score= -1.00  frequency= 0.722  ( LAGSVTIPNFQYHMCW )
score= -2.00  frequency= 0.167  ( ED )
Expected score/letter: -0.833; Average information/letter: 1.329
Minimal length of displayed segments set to: 20
M_0.01= 9.97 (cv= 6.10, lambda= 0.90659, k= 0.33539, x= 3.87; 90% confidence interval for segment length: 10 +- 9)
M_0.05= 8.17 (x= 2.07)
# of segments (>=20 residues) exceeding M_0.05: none

High scoring negative charge segments:
score=  2.00  frequency= 0.167  ( ED )
score=  0.00  frequency= 0.000  ( BZX )
score= -1.00  frequency= 0.722  ( LAGSVTIPNFQYHMCW )
score= -2.00  frequency= 0.111  ( KR )
Expected score/letter: -0.611; Average information/letter: 0.642
Minimal length of displayed segments set to: 20
M_0.01= 13.77 (cv= 8.85, lambda= 0.62481, k= 0.21809, x= 4.93; 90% confidence interval for segment length: 19 +- 18)
M_0.05= 11.17 (x= 2.32)
# of segments (>=20 residues) exceeding M_0.05: none

High scoring mixed charge segments:
score=  1.00  frequency= 0.278  ( KEDR )
score=  0.00  frequency= 0.000  ( BZX )
score= -1.00  frequency= 0.722  ( LAGSVTIPNFQYHMCW )
Expected score/letter: -0.444; Average information/letter: 0.613
Minimal length of displayed segments set to: 20
M_0.01= 9.24 (cv= 5.79, lambda= 0.95551, k= 0.27350, x= 3.46; 90% confidence interval for segment length: 21 +- 18)
M_0.05= 7.54 (x= 1.75)
# of segments (>=20 residues) exceeding M_0.05: none

High scoring uncharged segments:
score=  1.00  frequency= 0.722  ( LAGSVTIPNFQYHMCW )
score=  0.00  frequency= 0.000  ( BZX )
score= -8.00  frequency= 0.278  ( KEDR )
Expected score/letter: -1.500; Average information/letter: 0.334
Minimal length of displayed segments set to: 20
M_0.01= 28.42 (cv= 18.44, lambda= 0.29987, k= 0.20027, x= 9.98; 90% confidence interval for segment length: 37 +- 22)
M_0.05= 22.98 (x= 4.54)
# of segments (>=20 residues) exceeding M_0.05: none

C. CHARGE RUNS AND PATTERNS.

The table below shows the charge runs and patterns searched for (* stands for + or -) and the required minimum number of matches to the pattern allowing for at most 0 (lmin0), 1 (lmin1), or 2 (lmin2) mismatches or insertions/deletions (1% significance level). Occurrences are arranged in the order in which they appear in the sequence. For each run or pattern are displayed its length (number of matches) and a triplet giving the number of mismatches, insertions and deletions. 0-runs are further characterized by their composition (residues comprising more than 10% of the run). Run count statistics are compiled for runs of lengths at least 2/3 of the minimal significant length (lmin0); given are the number and locations of such runs.

pattern   (+)|  (-)|  (*)|  (0)| (+0)| (-0)| (*0)|(+00)|(-00)|(*00)| (H.)|(H..)|
lmin0      5 |   6 |   8 |  27 |   9 |  10 |  13 |  10 |  12 |  15 |   6 |   7 |
lmin1      6 |   7 |  10 |  33 |  11 |  12 |  16 |  12 |  14 |  18 |   7 |   9 |
lmin2      7 |   8 |  11 |  37 |  12 |  14 |  17 |  14 |  16 |  20 |   8 |  10 |
(Significance level: 0.010000; Minimal displayed length: 6)

There are no charge runs or patterns exceeding the given minimal lengths.

Run count statistics:
+ runs >= 3: 0
- runs >= 4: 0
* runs >= 5: 0
0 runs >= 18: 0

DISTRIBUTION OF OTHER AMINO ACID TYPES

Routinely, SAPS indicates high scoring hydrophobic and transmembrane segments. The display is as described above for high scoring charge segments. The scores for the hydrophobic segments correspond to a digitized hydropathy scale. The transmembrane scores were derived from target frequencies in putative transmembrane proteins (see the paper referred to above; note, however, that the scores used in the program have been rederived and differ from the ones given in the paper).
With the -a command line flag, the user can invoke a similar analysis for other residue types. In view of the special role of cysteines for protein structure, the spacings of the cysteine residues in the sequence are displayed separately, with particular emphasis on close pairs of cysteines and distances between such pairs.

1. HIGH SCORING SEGMENTS.

High scoring hydrophobic segments:
  2.00 (LVIFM)   1.00 (AGYCW)   0.00 (BZX)   -2.00 (PH)   -4.00 (STNQ)   -8.00 (KEDR)
Expected score/letter: -2.397; Average information/letter: 0.779
Minimal length of displayed segments set to: 15
M_0.01= 21.77 (cv= 13.35, lambda= 0.41428, k= 0.32934, x= 8.42; 90% confidence interval for segment length: 17 +- 10)
M_0.05= 17.84 (x= 4.49)
# of segments (>=15 residues) exceeding M_0.05: none

High scoring transmembrane segments:
  5.00 (LVIF)   2.00 (AGM)   0.00 (BZX)   -1.00 (YCW)   -2.00 (ST)   -6.00 (P)   -8.00 (H)   -10.00 (NQ)   -16.00 (KR)   -17.00 (ED)
Expected score/letter: -4.460; Average information/letter: 0.654
Minimal length of displayed segments set to: 15
M_0.01= 51.27 (cv= 32.51, lambda= 0.17009, k= 0.24449, x= 18.76; 90% confidence interval for segment length: 19 +- 13)
M_0.05= 41.69 (x= 9.18); M_0.30= 30.29 (x= -2.22)
# of segments (>=15 residues) exceeding M_0.30: none

2. SPACINGS OF C.

H2N-13-C-92-C-94-C-50-COOH

REPETITIVE STRUCTURES.

Repeats are indicated for two alphabets: the 20-letter amino acid alphabet, and a reduced 11-letter alphabet in which the major hydrophobics LVIF, the charged residues KR and ED, the small residues AG, the hydroxyl group residues ST, the amide group residues NQ, and the aromatics YW are treated as combined letters. For each alphabet, three classes of repeats are distinguished: separated repeats, simple tandem repeats, and periodic repeats. The separated repeats are largely non-overlapping.
They are displayed in groups of matching blocks (exceeding a given core block length of contiguous exact matches) and intervening spacer distances (which may be negative, signifying a partial overlap). The core block length in case of the amino acid alphabet is set to 4 for sequences up to 500 residues, to 5 for sequences between 500 and 2000 residues, and to 6 for longer sequences (same values increased by 4 for the reduced alphabet). Simple tandem repeats are displayed in similar layout, but separately. Sequence segments that are highly repetitive with relatively short repe
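Several of the ProtParam figures reported earlier in this document (GRAVY, aliphatic index, extinction coefficients) follow from the sequence by simple arithmetic. The sketch below is an illustration under stated assumptions, not ProtParam's own code. It assumes the Kyte-Doolittle hydropathy scale for GRAVY, the aliphatic index weights 2.9 (Val) and 3.9 (Ile, Leu) applied to mole percentages, and molar extinction values at 280 nm of 1490 (Tyr), 5500 (Trp), and 125 (cystine). None of these constants is stated in this document, and the function names are illustrative.

```python
from collections import Counter

# Sequence of P93207 as given in this document (252 residues).
SEQ = (
    "MAALIPENLSREQCLYLAKLAEQAERYEEMVQFMDKLVLNSTPAGELTVEERNLLSVAYK"
    "NVIGSLRAAWRIVSSIEQKEESRKNEEHVHLVKEYRGKVENELSQVCAGILKLLESNLVP"
    "SATTSESKVFYLKMKGDYYRYLAEFKIGDERKQAAEDTMNSYKAAQEIALTDLPPTHPIR"
    "LGLALNFSVFYFEILNSSDKACSMAKQAFEEAIAELDTLGEESYKDSTLIMQLLRDNLTL"
    "WTSDAQDQLDES"
)

# Kyte-Doolittle hydropathy values (assumed scale for GRAVY).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)), X in mole percent."""
    counts = Counter(seq)

    def mole_percent(aa):
        return 100.0 * counts[aa] / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def extinction_280(seq, cystines=True):
    """Molar extinction coefficient at 280 nm from Tyr, Trp and cystine counts."""
    counts = Counter(seq)
    value = 1490 * counts["Y"] + 5500 * counts["W"]
    if cystines:
        value += 125 * (counts["C"] // 2)  # each pair of Cys forms one cystine
    return value


if __name__ == "__main__":
    print("GRAVY:", round(gravy(SEQ), 3))
    print("Aliphatic index:", round(aliphatic_index(SEQ), 2))
    print("Ext. coefficient (cystines):", extinction_280(SEQ, cystines=True))
    print("Ext. coefficient (reduced):", extinction_280(SEQ, cystines=False))
```

Dividing either extinction coefficient by the reported molecular weight (28624.47) gives the corresponding "Abs 0.1% (=1 g/l)" value.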

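The high scoring segment sections of the SAPS output assign a score to every residue and look for the stretch with the largest aggregate score. A minimal sketch of that idea follows, using the hydrophobic scores listed in the output (2 for LVIFM, 1 for AGYCW, 0 for BZX, -2 for PH, -4 for STNQ, -8 for KEDR). It uses a plain maximum-sum scan (Kadane's algorithm), which is an assumption about how such a segment can be found, not SAPS's implementation. It does not compute the significance thresholds M_0.01 or M_0.05 and does not apply the minimal displayed length. The names HYDROPHOBIC and best_segment are hypothetical.

```python
# Sequence of P93207 as given in this document (252 residues).
SEQ = (
    "MAALIPENLSREQCLYLAKLAEQAERYEEMVQFMDKLVLNSTPAGELTVEERNLLSVAYK"
    "NVIGSLRAAWRIVSSIEQKEESRKNEEHVHLVKEYRGKVENELSQVCAGILKLLESNLVP"
    "SATTSESKVFYLKMKGDYYRYLAEFKIGDERKQAAEDTMNSYKAAQEIALTDLPPTHPIR"
    "LGLALNFSVFYFEILNSSDKACSMAKQAFEEAIAELDTLGEESYKDSTLIMQLLRDNLTL"
    "WTSDAQDQLDES"
)

# Hydrophobic scoring scheme copied from the SAPS output in this document.
HYDROPHOBIC = {}
for group, score in (("LVIFM", 2), ("AGYCW", 1), ("BZX", 0),
                     ("PH", -2), ("STNQ", -4), ("KEDR", -8)):
    for residue in group:
        HYDROPHOBIC[residue] = score


def best_segment(seq, scores):
    """Return (score, start, end) of the maximal-scoring contiguous segment.

    start and end are 0-based, with end exclusive. Unknown letters score 0.
    """
    best = (0, 0, 0)
    running, running_start = 0, 0
    for i, residue in enumerate(seq):
        if running <= 0:
            # A non-positive prefix can never help; restart the segment here.
            running, running_start = 0, i
        running += scores.get(residue, 0)
        if running > best[0]:
            best = (running, running_start, i + 1)
    return best


if __name__ == "__main__":
    score, start, end = best_segment(SEQ, HYDROPHOBIC)
    # Report 1-based, inclusive coordinates to match the sequence printout.
    print(f"best hydrophobic segment: {start + 1}-{end}, score {score}")
    print(SEQ[start:end])
```

The score found this way can be compared with the reported M_0.05 of 17.84 for hydrophobic segments; swapping in the transmembrane score table gives the analogous scan for the second scheme.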