Protein Secondary Structure (Chinese version being compiled) (reposted)

This is a collection of protein secondary structure analysis and information sites. Detailed information follows:

- PSIpred - Prediction of secondary structure from multiple sequences
- SAM-T99
- BCM Protein Secondary Structure Prediction
- Jnet - a neural network protein secondary structure prediction method
- Jpred - A consensus method for protein secondary structure prediction
- Prof (son of DSC) Secondary Structure Prediction
- Predator
- Nnpredict server
- SOPMA
- PROTINFO - Secondary structure prediction
- PredictProtein - sequence analysis and structure prediction
- HMMTOP - predict transmembrane helices and topology
- DAS (Prediction of transmembrane alpha-helices in prokaryotic membrane proteins)
- SOSUI - Secondary Structure Prediction of Membrane Proteins
- TopPred - Topology prediction of membrane proteins
- TMpred - prediction of transmembrane regions and orientation
- Coils - prediction of coiled coil regions
- Paircoil
- SignalP - predicts signal peptides of secretory proteins
- Sigfind - Signal Peptide Prediction Server (Human)
- ChloroP - Chloroplast Transit Peptide Prediction
- Helix-Turn-Helix (HTH)

Detailed information on the above options

PSIpred - Prediction of secondary structure from multiple sequences

- PSIpred - Prediction of secondary structure from multiple sequences
- MEMSAT 2 - Prediction of transmembrane topology from multiple sequences
- GenTHREADER - Fast and reliable protein fold recognition

SAM-T99

The best secondary structure predictor at CASP3 was clearly Jones's PSIPRED. A close second was this predictor. Its authors have since improved the predictor considerably and hope to beat PSIPRED at CASP4 with it. Currently, this predictor is about 77-78% correct, and does a good job of knowing when it is inaccurate.

BCM Protein Secondary Structure Prediction

This provides a rich set of programs for protein secondary structure determination.

- Coils - prediction of coiled coil regions
- nnPredict - uses a 2 layer neural network
- PSSP / SSP - segment-oriented prediction
- PSSP / NNSSP - nearest-neighbor prediction
- SAPS - statistical analysis of protein sequences
- TMpred - transmembrane region and orientation prediction
- PHDsec - profile network method
- PSA - for single domain globular proteins
- SOPM - self optimized prediction method
- SSPRED - with residue exchange statistics
- Swiss-Model - from alignment to crystallographic data

Jnet - a neural network protein secondary structure prediction method

Jnet is a neural network prediction algorithm that works by applying multiple sequence alignments, alongside PSIBLAST and HMM profiles. Consensus techniques are applied that predict the final secondary structure more accurately. It was written as part of a continuing study to improve protein secondary structure prediction. Jnet can also predict 2 state solvent exposure at 25, 5 and 0% relative exposure. Positions where the different prediction methods do not agree are marked as no jury positions. A separate network is applied for these positions, which improves the cross-validated accuracy. A reliability index indicates which residues are predicted with high confidence.

Jpred - A consensus method for protein secondary structure prediction

Jpred takes either a protein sequence or multiply aligned protein sequences, and predicts secondary structure. It works by combining a number of modern, high quality prediction methods to form a consensus. Jpred runs DSC, PHD, PREDATOR and NNSSP to build its consensus prediction, but predictions from the older algorithms Mulpred and Zpred are also included in the final output. The consensus method has been shown to be on average more accurate than any of the component methods, by ca. 1%. However, the strength of this server lies in the fact that it leaves the final decision to the user, who can use the supplied coloured HTML and Java viewer to decide where the best or most sensible consensus may be.

Prof (son of DSC) Secondary Structure Prediction

Submit a single amino acid sequence for secondary structure prediction.

Predator

Protein secondary structure prediction from a single sequence or from a set of sequences. PREDATOR takes as input a sequence file in FASTA, MSF or CLUSTAL format containing one or many protein sequences. By default, the prediction will be made for the first sequence in the set.

Nnpredict server

nnpredict is a program that predicts the secondary structure type for each residue in an amino acid sequence. The basis of the prediction is a two-layer, feed-forward neural network. nnpredict takes as input a sequence consisting of one-letter amino acid codes (A C D E F G H I K L M N P Q R S T V W Y) (NOTE: B and Z are not recognized as valid amino acid codes) or three-letter amino acid codes separated by spaces (ALA CYS ASP GLU PHE GLY HIS ILE LYS LEU MET ASN PRO GLN ARG SER THR VAL TRP TYR). The output is a secondary structure prediction for each position in the sequence. Multiple-chain proteins can be predicted either in pieces, or as a single sequence, with a '!' character between chains.

SOPMA

SOPMA (Self Optimized Prediction Method from Alignment) is a package to make secondary structure predictions of proteins.

PROTINFO - Secondary structure prediction

The goal of this website is to provide information about proteins. One can:

- assign 2D structure with PsiCSI using NMR chemical shift data and neural networks
- generate comparative models using RAMP software
- generate fold recognition models using RAMP software
- generate de novo models using RAMP software

PredictProtein - sequence analysis and structure prediction

An automatic service for protein database searches and the prediction of aspects of protein structure.

Database searches:

- generation of multiple sequence alignments (MaxHom)
- detection of functional motifs (PROSITE)
- detection of composition-bias (SEG)
- detection of protein domains (PRODOM)
- fold recognition by prediction-based threading (TOPITS)

Predictions of:

- secondary structure (PHDsec, and PROFsec)
- residue solvent accessibility (PHDacc, and PROFacc)
- transmembrane helix location and topology (PHDhtm, PHDtopology)
- protein globularity (GLOBE)
- coiled-coil regions (COILS)
- cysteine bonds (CYSPRED)
- structural switching regions (ASP)

Evaluation of secondary structure prediction accuracy (EvalSec)

See also: EVA : an automatic evaluation of prediction methods

HMMTOP - predict transmembrane helices and topology

HMMTOP is an automatic server for predicting the topology of transmembrane proteins. The method is based on the hypothesis that topology is determined by the maximum divergence of the amino acid distributions of the various structural parts in membrane proteins.

DAS (Prediction of transmembrane alpha-helices in prokaryotic membrane proteins)

The so-called Dense Alignment Surface (DAS) method was introduced in an attempt to improve sequence alignments in the G-protein coupled receptor family of transmembrane proteins. We have now generalized this method to predict transmembrane segments in any integral membrane protein. DAS is based on low-stringency dot-plots of the query sequence against a collection of non-homologous membrane proteins using a previously derived, special scoring matrix.

SOSUI - Secondary Structure Prediction of Membrane Proteins

The SOSUI system is a tool for secondary structure prediction of membrane proteins from a protein sequence. Prediction in this system is based on the physicochemical properties of amino acid sequences, such as hydrophobicity and charge. The system deals with three types of prediction: discrimination of membrane proteins from soluble ones, prediction of the existence of transmembrane helices, and determination of transmembrane helical regions. The accuracies of the system for these three tasks are about 99%, 96% and 85%, respectively.

TopPred - Topology prediction of membrane proteins

A new, simple method for predicting transmembrane segments in integral membrane proteins. It is based on low-stringency dot-plots of the query sequence against a collection of non-homologous membrane proteins using a previously derived scoring matrix. This so-called dense alignment surface (DAS) method is shown to perform on par with earlier methods that require extra information in the form of multiple sequence alignments or the distribution of positively charged residues outside the transmembrane segments, and thus improves prediction abilities when only single-sequence information is available or for classes of membrane proteins that do not follow the 'positive inside' rule.

TMpred - prediction of transmembrane regions and orientation

This program tries to find putative transmembrane domains in proteins and also speculates on the possible orientation of these segments. For its scoring, it uses a combination of multiple weight-matrices that have been extracted from a statistical analysis of TMbase, a collection of all annotated transmembrane proteins present in SwissProt.

Coils - prediction of coiled coil regions

This program predicts (2 stranded) coiled coil regions in proteins by the Lupas algorithm.

Paircoil

The Paircoil program predicts the location of coiled-coil regions in amino acid sequences.

SignalP - predicts signal peptides of secretory proteins

SignalP predicts signal peptides of secretory proteins. For cleaved signal peptides, the precise location of the cleavage site in the amino acid sequence is predicted. The prediction is optimised for three different types of organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes. The method incorporates a prediction of cleavage sites and a signal peptide/non-signal peptide prediction based on a combination of several artificial neural networks.

Sigfind - Signal Peptide Prediction Server (Human)

This software (SIGFIND) predicts signal peptides at the start of protein sequences. A novel neural network learning algorithm is used for prediction. Using the same fivefold cross-validation as SIGNALP, the 5 networks of SIGFIND (average Matthews correlation coefficient 0.98) perform better than SIGNALP (average Matthews correlation coefficient 0.96). It should be noted that the performances on the test sets in SIGNALP are used as a stop criterion during the training of the neural networks, whereas the test sets in SIGFIND are not used in any way during the training. The predictions of the 5 networks are combined into a jury decision.

ChloroP - Chloroplast Transit Peptide Prediction

The ChloroP www-server is able to predict two things:

1. cTP or no cTP: whether or not an amino acid sequence contains an N-terminal chloroplast transit peptide, cTP.
2. Cleavage site: the probable site for cleavage of the transit peptide (if it was predicted to exist in the first step).

Helix-Turn-Helix (HTH)

This predicts helix-turn-helix motifs.

(Reposted) Protecting bases for restriction sites in PCR primer design

Different restriction enzymes require different numbers of protecting bases next to the restriction site. In general, 3 extra bases beyond the restriction site are enough to meet the cleavage requirements of almost all restriction enzymes. For details, see the notes on the NEB website: http:/www.neb- For enzymes that cannot be found in the reference material, the usual practice is to add 3 arbitrary bases as protection.

Cleavage Close to the End of DNA Fragments (oligonucleotides)

To determine the minimum number of protecting bases that different restriction endonucleases require beyond the recognition site, NEB used a series of short double-stranded oligonucleotides containing the recognition sequence as
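cleavage substrates in its experiments. The results help in deciding the order of a double digest (for example, when the cleavage sites in a polylinker lie very close together), and are also useful when a cleavage site is close to the end of the DNA. For enzymes not listed in this table, at least 6 protecting bases should generally be added on each side of the recognition site to ensure that the cleavage reaction proceeds. If the substrate contains a long palindromic structure, cleavage efficiency may drop because a hairpin structure can form.

Enzyme | Oligo Sequence | Chain Length | % Cleavage, 2 hr | % Cleavage, 20 hr
------ | -------------- | ------------ | ---------------- | -----------------
AccI | GGTCGACC | 8 | 0 | 0
AccI | CGGTCGACCG | 10 | 0 | 0
AccI | CCGGTCGACCGG | 12 | 0 | 0
AflIII | CACATGTG | 8 | 0 | 0
AflIII | CCACATGTGG | 10 | >90 | >90
AflIII | CCCACATGTGGG | 12 | >90 | >90
AscI | GGCGCGCC | 8 | >90 | >90
AscI | AGGCGCGCCT | 10 | >90 | >90
AscI | TTGGCGCGCCAA | 12 | >90 | >90
AvaI | CCCCGGGG | 8 | 50 | >90
AvaI | CCCCCGGGGG | 10 | >90 | >90
AvaI | TCCCCCGGGGGA | 12 | >90 | >90
BamHI | CGGATCCG | 8 | 10 | 25
BamHI | CGGGATCCCG | 10 | >90 | >90
BamHI | CGCGGATCCGCG | 12 | >90 | >90
BglII | CAGATCTG | 8 | 0 | 0
BglII | GAAGATCTTC | 10 | 75 | >90
BglII | GGAAGATCTTCC | 12 | 25 | >90
BssHII | GGCGCGCC | 8 | 0 | 0
BssHII | AGGCGCGCCT | 10 | 0 | 0
BssHII | TTGGCGCGCCAA | 12 | 50 | >90
BstEII | GGGT(A/T)ACCC | 9 | 0 | 10
BstXI | AACTGCAGAACCAATGCATTGG | 22 | 0 | 0
BstXI | AAAACTGCAGCCAATGCATTGGAA | 24 | 25 | 50
BstXI | CTGCAGAACCAATGCATTGGATGCAT | 27 | 25 | >90
ClaI | CATCGATG | 8 | 0 | 0
ClaI | GATCGATC | 8 | 0 | 0
ClaI | CCATCGATGG | 10 | >90 | >90
ClaI | CCCATCGATGGG | 12 | 50 | 50
EcoRI | GGAATTCC | 8 | >90 | >90
EcoRI | CGGAATTCCG | 10 | >90 | >90
EcoRI | CCGGAATTCCGG | 12 | >90 | >90
HaeIII | GGGGCCCC | 8 | >90 | >90
HaeIII | AGCGGCCGCT | 10 | >90 | >90
HaeIII | TTGCGGCCGCAA | 12 | >90 | >90
HindIII | CAAGCTTG | 8 | 0 | 0
HindIII | CCAAGCTTGG | 10 | 0 | 0
HindIII | CCCAAGCTTGGG | 12 | 10 | 75
KpnI | GGGTACCC | 8 | 0 | 0
KpnI | GGGGTACCCC | 10 | >90 | >90
KpnI | CGGGGTACCCCG | 12 | >90 | >90
MluI | GACGCGTC | 8 | 0 | 0
MluI | CGACGCGTCG | 10 | 25 | 50
NcoI | CCCATGGG | 8 | 0 | 0
NcoI | CATGCCATGGCATG | 14 | 50 | 75
NdeI | CCATATGG | 8 | 0 | 0
NdeI | CCCATATGGG | 10 | 0 | 0
NdeI | CGCCATATGGCG | 12 | 0 | 0
NdeI | GGGTTTCATATGAAACCC | 18 | 0 | 0
NdeI | GGAATTCCATATGGAATTCC | 20 | 75 | >90
NdeI | GGGAATTCCATATGGAATTCCC | 22 | 75 | >90
NheI | GGCTAGCC | 8 | 0 | 0
NheI | CGGCTAGCCG | 10 | 10 | 25
NheI | CTAGCTAGCTAG | 12 | 10 | 50
NotI | TTGCGGCCGCAA | 12 | 0 | 0
NotI | ATTTGCGGCCGCTTTA | 16 | 10 | 10
NotI | AAATATGCGGCCGCTATAAA | 20 | 10 | 10
NotI | ATAAGAATGCGGCCGCTAAACTAT | 24 | 25 | 90
NotI | AAGGAAAAAAGCGGCCGCAAAAGGAAAA | 28 | 25 | >90
NsiI | TGCATGCATGCA | 12 | 10 | >90
NsiI | CCAATGCATTGGTTCTGCAGTT | 22 | >90 | >90
PacI | TTAATTAA | 8 | 0 | 0
PacI | GTTAATTAAC | 10 | 0 | 25
PacI | CCTTAATTAAGG | 12 | 0 | >90
PmeI | GTTTAAAC | 8 | 0 | 0
PmeI | GGTTTAAACC | 10 | 0 | 25
PmeI | GGGTTTAAACCC | 12 | 0 | 50
PmeI | AGCTTTGTTTAAACGGCGCGCCGG | 24 | 75 | >90
PstI | GCTGCAGC | 8 | 0 | 0
PstI | TGCACTGCAGTGCA | 14 | 10 | 10
PstI | AACTGCAGAACCAATGCATTGG | 22 | >90 | >90
PstI | AAAACTGCAGCCAATGCATTGGAA | 24 | >90 | >90
PstI | CTGCAGAACCAATGCATTGGATGCAT | 26 | 0 | 0
PvuI | CCGATCGG | 8 | 0 | 0
PvuI | ATCGATCGAT | 10 | 10 | 25
PvuI | TCGCGATCGCGA | 12 | 0 | 10
SacI | CGAGCTCG | 8 | 10 | 10
SacII | GCCGCGGC | 8 | 0 | 0
SacII | TCCCCGCGGGGA | 12 | 50 | >90
SalI | GTCGACGTCAAAAGGCCATAGCGGCCGC | 28 | 0 | 0
SalI | GCGTCGACGTCTTGGCCATAGCGGCCGCGG | 30 | 10 | 50
SalI | ACGCGTCGACGTCGGCCATAGCGGCCGCGGAA | 32 | 10 | 75
ScaI | GAGTACTC | 8 | 10 | 25
ScaI | AAAAGTACTTTT | 12 | 75 | 75
SmaI | CCCGGG | 6 | 0 | 10
SmaI | CCCCGGGG | 8 | 0 | 10
SmaI | CCCCCGGGGG | 10 | 10 | 50
SmaI | TCCCCCGGGGGA | 12 | >90 | >90
SpeI | GACTAGTC | 8 | 10 | >90
SpeI | GGACTAGTCC | 10 | 10 | >90
SpeI | CGGACTAGTCCG | 12 | 0 | 50
SpeI | CTAGACTAGTCTAG | 14 | 0 | 50
SphI | GGCATGCC | 8 | 0 | 0
SphI | CATGCATGCATG | 12 | 0 | 25
SphI | ACATGCATGCATGT | 14 | 10 | 50
StuI | AAGGCCTT | 8 | >90 | >90
StuI | GAAGGCCTTC | 10 | >90 | >90
StuI | AAAAGGCCTTTT | 12 | >90 | >90
XbaI | CTCTAGAG | 8 | 0 | 0
XbaI | GCTCTAGAGC | 10 | >90 | >90
XbaI | TGCTCTAGAGCA | 12 | 75 | >90
XbaI | CTAGTCTAGACTAG | 14 | 75 | >90
XhoI | CCTCGAGG | 8 | 0 | 0
XhoI | CCCTCGAGGG | 10 | 10 | 25
XhoI | CCGCTCGAGCGG | 12 | 10 | 75
XmaI | CCCCGGGG | 8 | 0 | 0
XmaI | CCCCCGGGGG | 10 | 25 | 75
XmaI | CCCCCCGGGGGG | 12 | 50 | >90
XmaI | TCCCCCCGGGGGGA | 14 | >90 | >90

R

The original is at: /jean/Presentation/IMSLAB.pdf

To make it easier for everyone to study, I translated that document into Chinese and added some brief related background. My experience is limited, so comments are welcome.

1. The R statistical analysis tool

The tutorial mainly uses R as its statistical analysis tool. For information about the software, see /.
English introduction: /doc/manuals/R-intro.pdf
Chinese introduction: /pages/newhtm/r/schtml

2. Bioconductor

Bioconductor is software built on R for the analysis of genomic data. For details, see http://。
To install Bioconductor, open the R command window and enter the following commands:

source (/biocLite.R)
biocLite()

3. Data

The data used in the tutorial come from a gene expression study of three types of acute leukemia: B-cell acute lymphoblastic leukemia (B-ALL), T-cell acute lymphoblastic leukemia (T-ALL) and acute myeloid leukemia (AML). Affymetrix high-density oligonucleotide arrays (hgu68a) containing 6817 human genes were used to measure gene expression levels in 38 B-ALL, 9 T-ALL and 25 AML tumor samples.

4. Data preprocessing

1) Thresholding: floor of 100 and ceiling of 16,000.
2) Filtering: remove genes with max/min ≤ 5 or (max − min) ≤ 500, where max and min are the maximum and minimum intensities of a gene across the mRNA samples.
3) Base-2 logarithmic transformation.

The data file GolubData.RData contains the gene expression levels and the gene names. The filtered gene expression levels are stored in the 3571 × 72 matrix golub, whose rows and columns correspond to genes and mRNA samples, respectively.

5. Exercises

There are two ways to complete these exercises. Users familiar with R or S-plus can write their own code. Users unfamiliar with R can use the vExplorer function in the tkWidgets package, which provides a graphical interface for browsing and executing code. Start R and load the tutorial with the following code:

> install.packages("IMSLAB",contriburl="/jean/software")
> library(IMSLAB)
> vExplorer( )

Then use the window that opens to select the IMSLAB package.

Getting started

Before starting the exercises, here are some important commands for getting help:

> help.start()
> apropos("mean")
> ? mean
> example("mean")

Loading the data package

> library(IMSLAB)
> data(GolubData)

Clustering

Cluster analysis is based on how similar the genes are, or in other words on the distance between genes. Use the hclust function to cluster the leukemia mRNA samples. Do the T-ALL, B-ALL and AML samples cluster together? Try different between-cluster distances by changing the method argument of hclust, and try different gene distances by changing the method argument of dist. The following questions can help you get started.

Q1: Perform hierarchical clustering of the mRNA samples using the correlation coefficient and complete linkage (maximum between-cluster distance).

> library(mva)
> clust.cor <- hclust(as.dist(1 - cor(golub)), method = "complete")
> plot(clust.cor, cex = 0.6)

Q2: Perform hierarchical clustering of the mRNA samples using Euclidean distance and average linkage.

> clust.euclid <- hclust(dist(t(golub)), method = "average")
> plot(clust.euclid, cex = 0.6)

Q3: The heatmap function in the mva package draws an image of the cluster analysis. Note that this function clusters both the genes and the samples, so it runs slowly when the number of genes is large. For illustration, we select only 100 genes.

> library(sma)
> golubvar <- apply(golub, 1, var, na.rm = TRUE)
> top100 <- stat.gnames(golubvar, 1:length(golubvar), crit = 100)$gnames
> heatmap(golub[top100, ])

Next we try different partitioning methods.

Q4: Perform K-means clustering of the mRNA samples using the correlation coefficient as the gene distance.

> clust.kmeans <- kmeans(as.dist(1 - cor(golub)), 3)
> names(clust.kmeans$cluster) <- colnames(golub)
> clust.kmeans$cluster[1:10]

Q5: Use the PAM function in the cluster package to perform a "Partition Around Medoids" analysis of the mRNA samples.

> library(cluster)
> clust.pam <- pam(as.dist(1 - cor(golub)), 3, diss = TRUE)
> clus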
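
The three preprocessing steps listed under data preprocessing (thresholding, filtering, base-2 log transformation) can be sketched in R as follows. This is only an illustration on simulated intensities, not the code that produced GolubData.RData. The function name preprocess, its argument names and the toy matrix raw are hypothetical. The sketch assumes a floor of 100, a ceiling of 16,000, and that a gene is removed when max/min ≤ 5 or (max − min) ≤ 500.

```r
# Hypothetical helper: applies the three preprocessing steps to a
# genes x samples matrix of raw intensities.
preprocess <- function(raw, lo = 100, hi = 16000,
                       min.ratio = 5, min.diff = 500) {
  # 1) Thresholding: clamp every intensity into [lo, hi].
  x <- pmin(pmax(raw, lo), hi)
  # 2) Filtering: keep only genes that vary enough across samples,
  #    i.e. drop genes with max/min <= min.ratio or max - min <= min.diff.
  gene.max <- apply(x, 1, max)
  gene.min <- apply(x, 1, min)
  keep <- (gene.max / gene.min > min.ratio) & (gene.max - gene.min > min.diff)
  # 3) Base-2 logarithmic transformation of the remaining genes.
  log2(x[keep, , drop = FALSE])
}

# Simulated data (not the leukemia data set): 3 genes x 4 samples.
raw <- matrix(c(  50,   80,   120,   90,   # nearly flat after the floor
                 200, 4000, 20000,  150,   # strongly variable
                1000, 1200,  1100, 1300),  # max/min well below 5
              nrow = 3, byrow = TRUE,
              dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:4)))

processed <- preprocess(raw)
print(processed)
```

With these toy values only the second gene passes the filter. The value 20000 is first capped at the ceiling, so the largest entry of the result is log2(16000).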