Experiment 4: Protein Sequence Analysis and Structure Prediction

Sequence analysis with BioEdit and other software

Open the sequence in FASTA format.

1. Sequence -> Protein -> Amino Acid Composition
Reports the molecular weight and the amino acid composition.

Protein: gi|725605238|ref|XM_010330964.1| PREDICTED: Saimiri boliviensis boliviensis interferon, lambda 3 (IFNL3), mRNA
Length = 248 amino acids
Molecular Weight = 27462.22 Daltons

Amino Acid   Number   Mol%
Ala A          25     10.08
Cys C           9      3.63
Asp D          13      5.24
Glu E          14      5.65
Phe F           6      2.42
Gly G          10      4.03
His H          10      4.03
Ile I           3      1.21
Lys K           8      3.23
Leu L          37     14.92
Met M           5      2.02
Asn N           4      1.61
Pro P          18      7.26
Gln Q          10      4.03
Arg R          22      8.87
Ser S          15      6.05
Thr T          16      6.45
Val V          15      6.05
Trp W           4      1.61
Tyr Y           2      0.81

Amino acid composition table
Helical wheel diagram
Hydrophobic Moment matrix with Eisenberg consensus scale

Hydrophobicity
Kyte & Doolittle Mean Hydrophobicity Profile
Eisenberg Scale Mean Hydrophobicity Profile
Cornette Scale Mean Hydrophobicity Profile
Parker HPLC Scale Mean Hydrophobicity Profile
Boyko Scale Mean Hydrophilicity Profile
Hopp & Woods Scale Mean Hydrophilicity

ProtParam tool
/protparam/

ProtParam is a tool which allows the computation of various physical and chemical parameters for a given protein stored in Swiss-Prot or TrEMBL or for a user entered sequence. The computed parameters include the molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY).

Enter the sequence in FASTA format.
Isoelectric point

Transmembrane region analysis
Go to CBS, then to TMHMM.
Welcome to CBS: http://www.cbs.dtu.dk/index.shtml
CBS Prediction Servers: http://www.cbs.dtu.dk/services/TMHMM/
Enter the sequence in FASTA format.

Results
Note: the residues listed below are only A, T, G and C across 1446 positions, which suggests that the mRNA sequence was submitted rather than the 248-residue protein; TMHMM expects an amino acid sequence.

Data (partial):
# WEBSEQUENCE

#     AA  inside   membr    outside
1     A   0.00271  0.00000  0.99729
2     T   0.00267  0.00004  0.99729
3     G   0.00265  0.00006  0.99729
4     A   0.00265  0.00008  0.99727
5     A   0.00252  0.00022  0.99726
6     A   0.00252  0.00023  0.99726
7     C   0.00172  0.00102  0.99726
8     T   0.00172  0.00102  0.99726
...
1403  C   0.00059  0.00002  0.99939
1404  G   0.00059  0.00002  0.99939
1405  C   0.00059  0.00002  0.99939
1406  G   0.00059  0.00002  0.99939
1407  A   0.00059  0.00002  0.99939
1408  G   0.00059  0.00002  0.99939
1409  A   0.00059  0.00002  0.99939
1410  C   0.00059  0.00002  0.99939
1411  C   0.00059  0.00002  0.99938
1412  T   0.00060  0.00005  0.99935
1413  G   0.00060  0.00009  0.99932
1414  A   0.00060  0.00012  0.99928
1415  A   0.00060  0.00014  0.99926
1416  T   0.00060  0.00016  0.99924
1417  T   0.00060  0.00018  0.99922
1418  G   0.00060  0.00019  0.9992
1419  T   0.00060  0.00023  0.99917
1420  G   0.00060  0.00023  0.99917
1421  T   0.00060  0.00023  0.99918
1422  T   0.00060  0.00023  0.99918
1423  G   0.00059  0.00024  0.99917
1424  C   0.00059  0.00024  0.99917
1425  C   0.00059  0.00024  0.99917
1426  A   0.00059  0.00024  0.99917
1427  G   0.00059  0.00024  0.99917
1428  C   0.00060  0.00024  0.99917
1429  G   0.00060  0.00024  0.99917
1430  G   0.00060  0.00024  0.99917
1431  G   0.00060  0.00023  0.99917
1432  G   0.00060  0.00023  0.99917
1433  A   0.00061  0.00023  0.99917
1434  C   0.00062  0.00021  0.99917
1435  C   0.00066  0.00017  0.99917
1436  T   0.00070  0.00013  0.99917
1437  G   0.00072  0.00011  0.99917
1438  T   0.00075  0.00009  0.99917
1439  G   0.00076  0.00008  0.99917
1440  T   0.00078  0.00006  0.99917
1441  G   0.00079  0.00004  0.99917
1442  T   0.00082  0.00001  0.99917
1443  C   0.00082  0.00001  0.99917
1444  T   0.00082  0.00001  0.99917
1445  G   0.00083  0.00000  0.99917

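1446  A   0.00083  0.00000  0.99917

2. Signal peptide and subcellular localization
Go to the SignalP 4.1 Server: http://www.cbs.dtu.dk/services/SignalP/
Enter the sequence in FASTA format.
Results:

Subcellular localization:
Go to the TargetP 1.1 Server: http://www.cbs.dtu.dk/services/TargetP/
Enter the sequence and submit.
Results:

3. Functional analysis
1) Protein function prediction based on sequence homology
Run NCBI BLAST, find the sequences with relatively high match scores, and view their details.
Sequence homology and protein function analysis: open the NCBI Gene entry and follow the related literature to learn about the function.
2) Protein function prediction based on motif, structural site and structural/functional domain databases
Motif: PROSITE
//cgi-bin/prosite/ScanView.cgi?scanfile=806498321699.scan.gz
Domains and motifs
MyHits: http://hits.isb-sib.ch/cgi-bin/PFSCAN
Enter the sequence.
Results:
Analysis of protein structural and functional domains
SMART: http://smart.embl-heidelberg.de/

II. Protein secondary structure prediction
1) NetTurnP - Prediction of Beta-turns in proteins
NetTurnP 1.0 - Prediction of Beta-turn regions in protein sequences
http://www.cbs.dtu.dk/services/NetTurnP/
Enter the sequence.
Results: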

NetTurnP - Prediction of Beta-turns in proteins
Technical University of Denmark

# For publication of results, please cite:
# NetTurnP - Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features.
# Petersen B, Lundegaard C, Petersen TN (2010)
# PLoS ONE 5(11): e15079 doi:10.1371/journal.pone.0015079
#
# Column 1: Amino acid
# Column 2: Sequence name
# Column 3: Amino acid number
# Column 4: Prediction for Beta-turn
# Column 5: Class assignment - "T" for Beta-turn
#
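The five-column layout described in the legend above can be read programmatically. The following is a minimal sketch, assuming the rows are whitespace-separated as in the legend; the function names `parse_netturnp` and `turn_segments` are hypothetical helpers introduced here for illustration, and the sample rows are the first 16 residues of this output. It groups consecutive residues labelled "T" into predicted beta-turn stretches.

```python
from typing import List, Tuple

# First 16 rows of the NetTurnP output, written with whitespace-separated columns
SAMPLE = """\
# sample rows
V Sequence 1 0.287 .
T Sequence 2 0.363 .
A Sequence 3 0.403 .
S Sequence 4 0.482 .
E Sequence 5 0.495 .
W Sequence 6 0.493 .
G Sequence 7 0.552 T
P Sequence 8 0.527 T
S Sequence 9 0.564 T
A Sequence 10 0.572 T
D Sequence 11 0.643 T
E Sequence 12 0.631 T
D Sequence 13 0.620 T
Q Sequence 14 0.612 T
R Sequence 15 0.497 .
S Sequence 16 0.518 T
"""


def parse_netturnp(text: str) -> List[Tuple[str, int, float, str]]:
    """Return (amino_acid, position, score, class) for every data row."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and legend/comment lines
        aa, _name, pos, score, cls = line.split()
        rows.append((aa, int(pos), float(score), cls))
    return rows


def turn_segments(rows: List[Tuple[str, int, float, str]]) -> List[Tuple[int, int]]:
    """Group consecutive residues labelled 'T' into (start, end) ranges.

    Assumes the rows are ordered by position without gaps.
    """
    segments = []
    start = None
    prev = None
    for _aa, pos, _score, cls in rows:
        if cls == "T" and start is None:
            start = pos  # a new turn stretch begins here
        elif cls != "T" and start is not None:
            segments.append((start, prev))  # stretch ended at the previous residue
            start = None
        prev = pos
    if start is not None:
        segments.append((start, prev))  # stretch runs to the end of the input
    return segments


if __name__ == "__main__":
    for first, last in turn_segments(parse_netturnp(SAMPLE)):
        print(f"predicted beta-turn stretch: {first}-{last}")
```

Reporting stretches rather than single residues makes it easier to compare the turn regions with the helix and coil regions predicted by the other servers in this section.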

V  Sequence    1  0.287  .
T  Sequence    2  0.363  .
A  Sequence    3  0.403  .
S  Sequence    4  0.482  .
E  Sequence    5  0.495  .
W  Sequence    6  0.493  .
G  Sequence    7  0.552  T
P  Sequence    8  0.527  T
S  Sequence    9  0.564  T
A  Sequence   10  0.572  T
D  Sequence   11  0.643  T
E  Sequence   12  0.631  T
D  Sequence   13  0.620  T
Q  Sequence   14  0.612  T
R  Sequence   15  0.497  .
S  Sequence   16  0.518  T
E  Sequence   17  0.515  T
M  Sequence   18  0.557  T
K  Sequence   19  0.582  T
R  Sequence   20  0.555  T
G  Sequence   21  0.561  T
M  Sequence   22  0.552  T
S  Sequence   23  0.559  T
R  Sequence   24  0.560  T
G  Sequence   25  0.533  T
C  Sequence   26  0.486  .
M  Sequence   27  0.351  .
A  Sequence   28  0.269  .
V  Sequence   29  0.233  .
L  Sequence   30  0.190  .
V  Sequence   31  0.177  .
L  Sequence   32  0.179  .
M  Sequence   33  0.184  .
A  Sequence   34  0.210  .
T  Sequence   35  0.236  .
V  Sequence   36  0.269  .
L  Sequence   37  0.319  .
T  Sequence   38  0.396  .
V  Sequence   39  0.448  .
T  Sequence   40  0.475  .
G  Sequence   41  0.505  T
A  Sequence   42  0.480  .
V  Sequence   43  0.449  .
P  Sequence   44  0.455  .
V  Sequence   45  0.463  .
T  Sequence   46  0.456  .
R  Sequence   47  0.467  .
P  Sequence   48  0.523  T
P  Sequence   49  0.504  T
R  Sequence   50  0.492  .
A  Sequence   51  0.488  .
L  Sequence   52  0.526  T
P  Sequence   53  0.568  T
D  Sequence   54  0.612  T
A  Sequence   55  0.650  T
R  Sequence   56  0.585  T
G  Sequence   57  0.497  .
C  Sequence   58  0.452  .
H  Sequence   59  0.380  .
I  Sequence   60  0.425  .
A  Sequence   61  0.452  .
Q  Sequence   62  0.457  .
F  Sequence   63  0.558  T
K  Sequence   64  0.524  T
S  Sequence   65  0.494  .
L  Sequence   66  0.482  .
S  Sequence   67  0.347  .
P  Sequence   68  0.280  .
Q  Sequence   69  0.259  .
E  Sequence   70  0.254  .
L  Sequence   71  0.181  .
Q  Sequence   72  0.153  .
A  Sequence   73  0.152  .
F  Sequence   74  0.167  .
K  Sequence   75  0.187  .
R  Sequence   76  0.192  .
A  Sequence   77  0.250  .
K  Sequence   78  0.269  .
D  Sequence   79  0.292  .
A  Sequence   80  0.304  .
L  Sequence   81  0.362  .
E  Sequence   82  0.382  .
E  Sequence   83  0.373  .
S  Sequence   84  0.401  .
L  Sequence   85  0.373  .
L  Sequence   86  0.414  .
L  Sequence   87  0.555  T
K  Sequence   88  0.547  T
D  Sequence   89  0.559  T
C  Sequence   90  0.576  T
R  Sequence   91  0.414  .
C  Sequence   92  0.424  .
R  Sequence   93  0.443  .
S  Sequence   94  0.442  .
R  Sequence   95  0.522  T
L  Sequence   96  0.531  T
F  Sequence   97  0.572  T
P  Sequence   98  0.632  T
R  Sequence   99  0.596  T
T  Sequence  100  0.572  T
W  Sequence  101  0.535  T
D  Sequence  102  0.394  .
L  Sequence  103  0.416  .
R  Sequence  104  0.404  .
Q  Sequence  105  0.398  .
L  Sequence  106  0.414  .
Q  Sequence  107  0.371  .
V  Sequence  108  0.453  .
R  Sequence  109  0.475  .
E  Sequence  110  0.472  .
R  Sequence  111  0.481  .
P  Sequence  112  0.371  .
V  Sequence  113  0.271  .
A  Sequence  114  0.240  .
L  Sequence  115  0.188  .
E  Sequence  116  0.182  .
A  Sequence  117  0.175  .
E  Sequence  118  0.164  .
L  Sequence  119  0.168  .
A  Sequence  120  0.150  .
L  Sequence  121  0.141  .
T  Sequence  122  0.142  .
L  Sequence  123  0.143  .
E  Sequence  124  0.151  .
V  Sequence  125  0.175  .
L  Sequence  126  0.242  .
E  Sequence  127  0.290  .
A  Sequence  128  0.358  .
T  Sequence  129  0.458  .
A  Sequence  130  0.479  .
D  Sequence  131  0.576  T
N  Sequence  132  0.572  T
D  Sequence  133  0.541  T
M  Sequence  134  0.512  T
A  Sequence  135  0.329  .
L  Sequence  136  0.275  .
G  Sequence  137  0.255  .
D  Sequence  138  0.253  .
V  Sequence  139  0.278  .
L  Sequence  140  0.373  .
D  Sequence  141  0.400  .
R  Sequence  142  0.395  .
P  Sequence  143  0.383  .
L  Sequence  144  0.308  .
H  Sequence  145  0.244  .
T  Sequence  146  0.202  .
L  Sequence  147  0.173  .
H  Sequence  148  0.152  .
H  Sequence  149  0.151  .
V  Sequence  150  0.149  .
L  Sequence  151  0.152  .
S  Sequence  152  0.162  .
Q  Sequence  153  0.173  .
L  Sequence  154  0.233  .
R  Sequence  155  0.280  .
A  Sequence  156  0.306  .
C  Sequence  157  0.354  .
V  Sequence  158  0.366  .
Q  Sequence  159  0.405  .
P  Sequence  160  0.406  .
Q  Sequence  161  0.403  .
P  Sequence  162  0.466  .
T  Sequence  163  0.517  T
A  Sequence  164  0.541  T
G  Sequence  165  0.588  T
P  Sequence  166  0.540  T
R  Sequence  167  0.493  .
P  Sequence  168  0.503  T
W  Sequence  169  0.433  .
G  Sequence  170  0.397  .
R  Sequence  171  0.341  .
L  Sequence  172  0.232  .
H  Sequence  173  0.198  .
H  Sequence  174  0.174  .
W  Sequence  175  0.166  .
L  Sequence  176  0.168  .
H  Sequence  177  0.183  .
R  Sequence  178  0.203  .
L  Sequence  179  0.253  .
Q  Sequence  180  0.273  .
E  Sequence  181  0.290  .
A  Sequence  182  0.447  .
P  Sequence  183  0.494  .
K  Sequence  184  0.517  T
K  Sequence  185  0.554  T
E  Sequence  186  0.472  .
S  Sequence  187  0.628  T
S  Sequence  188  0.604  T
G  Sequence  189  0.595  T
C  Sequence  190  0.593  T
L  Sequence  191  0.334  .
E  Sequence  192  0.306  .
A  Sequence  193  0.286  .
S  Sequence  194  0.243  .
V  Sequence  195  0.230  .
T  Sequence  196  0.194  .
F  Sequence  197  0.177  .
N  Sequence  198  0.185  .
L  Sequence  199  0.180  .
F  Sequence  200  0.181  .
R  Sequence  201  0.199  .
L  Sequence  202  0.191  .
L  Sequence  203  0.249  .
T  Sequence  204  0.462  .
R  Sequence  205  0.469  .
D  Sequence  206  0.466  .
L  Sequence  207  0.491  .
K  Sequence  208  0.304  .
C  Sequence  209  0.311  .
V  Sequence  210  0.393  .
A  Sequence  211  0.467  .
S  Sequence  212  0.554  T
G  Sequence  213  0.630  T
D  Sequence  214  0.634  T
L  Sequence  215  0.593  T
C  Sequence  216  0.566  T
A  Sequence  217  0.554  T
P  Sequence  218  0.579  T
S  Sequence  219  0.573  T
H  Sequence  220  0.577  T
L  Sequence  221  0.544  T
P  Sequence  222  0.483  .
A  Sequence  223  0.491  .
T  Sequence  224  0.535  T
H  Sequence  225  0.530  T
H  Sequence  226  0.479  .
A  Sequence  227  0.427  .
I  Sequence  228  0.362  .
D  Sequence  229  0.326  .
F  Sequence  230  0.303  .
I  Sequence  231  0.312  .
Y  Sequence  232  0.343  .
T  Sequence  233  0.420  .
S  Sequence  234  0.480  .
T  Sequence  235  0.499  .
T  Sequence  236  0.491  .
C  Sequence  237  0.509  T
L  Sequence  238  0.459  .
N  Sequence  239  0.472  .
L  Sequence  240  0.475  .
L  Sequence  241  0.412  .
P  Sequence  242  0.594  T
P  Sequence  243  0.599  T
N  Sequence  244  0.612  T
R  Sequence  245  0.650  T

Y  Sequence  246  0.368  .

2) GOR - Garnier et al, 1996
NPS@: GOR4 secondary structure prediction
https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html
Results:

3) NetSurfP-1.1 - Protein secondary structure and surface accessibility server
http://www.cbs.dtu.dk/services/NetSurfP/
Results:

NetSurfP - Protein Surface Accessibility and Secondary Structure Predictions
Technical University of Denmark

# For publication of results, please cite:
# A generic method for assignment of reliability scores applied to solvent accessibility predictions.
# Bent Petersen, Thomas Nordahl Petersen, Pernille Andersen, Morten Nielsen and Claus Lundegaard
# BMC Structural Biology 2009, 9:51 doi:10.1186/1472-6807-9-51
#
# Column 1: Class assignment - B for buried or E for Exposed - Threshold: 25% exposure, but not based on RSA
# Column 2: Amino acid
# Column 3: Sequence name
# Column 4: Amino acid number
# Column 5: Relative Surface Accessibility - RSA
# Column 6: Absolute Surface Accessibility
# Column 7: Z-fit score for RSA prediction
# Column 8: Probability for Alpha-Helix
# Column 9: Probability for Beta-strand
# Column 10: Probability for Coil

E  V  Sequence    1  0.752  115.644  -1.123  0.003  0.003  0.994
E  T  Sequence    2  0.500   69.378  -0.255  0.052  0.084  0.864
E  A  Sequence    3  0.434   47.882  -1.297  0.113  0.087  0.800
E  S  Sequence    4  0.585   68.527  -0.812  0.113  0.087  0.800
E  E  Sequence    5  0.613  107.109   0.159  0.113  0.087  0.800
B  W  Sequence    6  0.249   59.981  -0.639  0.052  0.084  0.864
E  G  Sequence    7  0.338   26.577  -0.814  0.053  0.043  0.903
E  P  Sequence    8  0.410   58.207  -1.117  0.053  0.043  0.903
E  S  Sequence    9  0.584   68.410  -1.020  0.053  0.043  0.903
E  A  Sequence   10  0.367   40.388  -1.062  0.058  0.017  0.925
E  D  Sequence   11  0.536   77.238  -0.648  0.053  0.043  0.903
E  E  Sequence   12  0.644  112.542  -0.710  0.184  0.043  0.773
E  D  Sequence   13  0.581   83.708  -1.977  0.184  0.043  0.773
E  Q  Sequence   14  0.508   90.693  -0.589  0.268  0.043  0.689
E  R  Sequence   15  0.464  106.302  -0.355  0.354  0.048  0.598
E  S  Sequence   16  0.414   48.533  -1.835  0.354  0.048  0.598
E  E  Sequence   17  0.592  103.370  -0.492  0.354  0.048  0.598
E  M  Sequence   18  0.400   80.020  -1.980  0.354  0.048  0.598
E  K  Sequence   19  0.526  108.198  -0.605  0.278  0.093  0.628
E  R  Sequence   20  0.472  108.180  -0.949  0.113  0.087  0.800
B  G  Sequence   21  0.272   21.391  -2.226  0.113  0.087  0.800
B  M  Sequence   22  0.197   39.440  -0.962  0.118  0.150  0.732
B  S  Sequence   23  0.281   32.875  -1.279  0.118  0.150  0.732
E  R  Sequence   24  0.291   66.593  -1.665  0.191  0.086  0.723
B  G  Sequence   25  0.158   12.458  -1.360  0.268  0.043  0.689
B  C  Sequence   26  0.026    3.678  -0.098  0.502  0.102  0.396
B  M  Sequence   27  0.143   28.634   0.257  0.725  0.163  0.112
B  A  Sequence   28  0.104   11.483  -0.200  0.725  0.163  0.112
B  V  Sequence   29  0.048    7.454   0.791  0.807  0.137  0.056
B  L  Sequence   30  0.041    7.507   0.219  0.870  0.077  0.053
B  V  Sequence   31  0.081   12.465  -0.059  0.886  0.090  0.024
B  L  Sequence   32  0.067   12.213   0.544  0.870  0.077  0.053
B  M  Sequence   33  0.073   14.667   0.432  0.870  0.077  0.053
B  A  Sequence   34  0.072    7.901  -0.058  0.831  0.044  0.125
B  T  Sequence   35  0.115   16.020  -0.434  0.831  0.044  0.125
B  V  Sequence   36  0.128   19.735  -0.312  0.831  0.044  0.125
B  L  Sequence   37  0.130   23.730   0.063  0.751  0.050  0.199
B  T  Sequence   38  0.266   36.964  -0.231  0.660  0.049  0.291
E  V  Sequence   39  0.339   52.104  -1.218  0.354  0.048  0.598
E  T  Sequence   40  0.409   56.770  -2.017  0.184  0.043  0.773
B  G  Sequence   41  0.313   24.625  -1.553  0.053  0.043  0.903
E  A  Sequence   42  0.370   40.752  -2.039  0.018  0.088  0.893
B  V  Sequence   43  0.186   28.542  -0.494  0.020  0.205  0.775
E  P  Sequence   44  0.337   47.806  -1.325  0.020  0.205  0.775
B  V  Sequence   45  0.170   26.206  -1.051  0.018  0.088  0.893
E  T  Sequence   46  0.381   52.803  -1.502  0.018  0.047  0.935
E  R  Sequence   47  0.526  120.362  -0.292  0.018  0.019  0.964
B  P  Sequence   48  0.241   34.127  -1.181  0.018  0.019  0.964
E  P  Sequence   49  0.395   56.079  -1.454  0.018  0.019  0.964
E  R  Sequence   50  0.649  148.621  -0.463  0.018  0.047  0.935
B  A  Sequence   51  0.234   25.831  -1.411  0.018  0.047  0.935
E  L  Sequence   52  0.335   61.265  -0.180  0.018  0.047  0.935
E  P  Sequence   53  0.340   48.232  -0.691  0.018  0.047  0.935
E  D  Sequence   54  0.732  105.424   0.275  0.018  0.019  0.964
E  A  Sequence   55  0.475   52.301  -1.315  0.018  0.019  0.964
E  R  Sequence   56  0.514  117.660  -0.150  0.018  0.047  0.935
E  G  Sequence   57  0.466   36.698  -0.497  0.019  0.141  0.840
B  C  Sequence   58  0.061    8.578  -0.417  0.021  0.279  0.699
E  H  Sequence   59  0.342   62.283   0.151  0.022  0.359  0.619
B  I  Sequence   60  0.110   20.368  -0.560  0.022  0.359  0.619
E  A  Sequence   61  0.325   35.848  -1.172  0.020  0.205  0.775
E  Q  Sequence   62  0.503   89.872   0.409  0.019  0.141  0.840
B  F  Sequence   63  0.126   25.348  -0.199  0.018  0.088  0.893
E  K  Sequence   64  0.564  116.077   0.135  0.018  0.088  0.893
E  S  Sequence   65  0.482   56.444  -1.479  0.018  0.047  0.935
B  L  Sequence   66  0.207   37.902  -0.776  0.018  0.019  0.964
E  S  Sequence   67  0.392   45.966   0.122  0.018  0.019  0.964
E  P  Sequence   68  0.386   54.802  -1.124  0.858  0.002  0.139
E  Q  Sequence   69  0.509   90.872  -0.427  0.923  0.002  0.076
B  E  Sequence   70  0.213   37.159  -0.370  0.923  0.002  0.076
B  L  Sequence   71  0.196   35.961   0.420  0.970  0.001  0.030
E  Q  Sequence   72  0.476   84.960   0.319  0.970  0.001  0.030
B  A  Sequence   73  0.118   13.048  -0.154  0.970  0.001  0.030
B  F  Sequence   74  0.061   12.263   0.168  0.970  0.001  0.030
E  K  Sequence   75  0.402   82.630   1.003  0.923  0.002  0.076
E  R  Sequence   76  0.407   93.249   1.034  0.923  0.002  0.076
B  A  Sequence   77  0.046    5.047   0.102  0.858  0.002  0.139
E  K  Sequence   78  0.339   69.732   0.957  0.858  0.002  0.139
E  D  Sequence   79  0.535   77.122   0.100  0.858  0.002  0.139
B  A  Sequence   80  0.222   24.497   0.325  0.858  0.002  0.139
B  L  Sequence   81  0.086   15.783   0.088  0.802  0.014  0.185
E  E  Sequence   82  0.421   73.479   0.113  0.802  0.014  0.185
E  E  Sequence   83  0.579  101.064  -0.635  0.717  0.014  0.269
B  S  Sequence   84  0.234   27.437  -1.170  0.622  0.015  0.363
B  L  Sequence   85  0.140   25.726  -0.141  0.522  0.016  0.462
B  L  Sequence   86  0.258   47.203  -0.156  0.455  0.046  0.498
B  L  Sequence   87  0.251   45.976  -0.887  0.268  0.043  0.689
E  K  Sequence   88  0.591  121.651  -0.038  0.191  0.086  0.723
E  D  Sequence   89  0.577   83.160  -0.834  0.052  0.084  0.864
B  C  Sequence   90  0.214   29.989   0.573  0.056  0.142  0.802
E  R  Sequence   91  0.462  105.752   0.703  0.066  0.296  0.638
B  C  Sequence   92  0.092   12.945  -0.868  0.066  0.296  0.638
E  R  Sequence   93  0.441  100.897  -0.588  0.064  0.216  0.721
E  S  Sequence   94  0.347   40.668  -1.463  0.019  0.141  0.840
E  R  Sequence   95  0.456  104.538  -0.134  0.020  0.205  0.775
B  L  Sequence   96  0.213   39.055  -1.115  0.021  0.279  0.699
B  F  Sequence   97  0.137   27.576   0.398  0.019  0.141  0.840
E  P  Sequence   98  0.373   52.957  -0.918  0.018  0.088  0.893
E  R  Sequence   99  0.402   92.150  -0.704  0.018  0.088  0.893
E  T  Sequence  100  0.543   75.370  -0.624  0.056  0.142  0.802
B  W  Sequence  101  0.197   47.354   0.333  0.125  0.227  0.648
E  D  Sequence  102  0.408   58.850   0.628  0.125  0.227  0.648
B  L  Sequence  103  0.135   24.664   0.252  0.216  0.235  0.548
E  R  Sequence  104  0.493  112.989   0.612  0.216  0.235  0.548
E  Q  Sequence  105  0.460   82.102   0.772  0.321  0.252  0.427
B  L  Sequence  106  0.109   19.995   0.672  0.216  0.235  0.548
E  Q  Sequence  107  0.423   75.548   0.333  0.199  0.152  0.649
B  V  Sequence  108  0.126   19.428   0.026  0.307  0.165  0.527
E  R  Sequence  109  0.384   88.005   0.285  0.278  0.093  0.628
E  E  Sequence  110  0.570   99.527  -0.787  0.354  0.048  0.598
B  R  Sequence  111  0.242   55.487   0.547  0.561  0.047  0.393
B  P  Sequence  112  0.212   30.111  -0.237  0.717  0.014  0.269
E  V  Sequence  113  0.264   40.608   0.527  0.831  0.044  0.125
B  A  Sequence  114  0.129   14.216  -0.416  0.911  0.033  0.057
B  L  Sequence  115  0.071   13.073   0.588  0.911  0.033  0.057
E  E  Sequence  116  0.312   54.576   0.365  0.938  0.007  0.055
B  A  Sequence  117  0.118   12.982  -0.203  0.938  0.007  0.055
B  E  Sequence  118  0.226   39.395   0.183  0.911  0.033  0.057
B  L  Sequence  119  0.058   10.638   0.730  0.911  0.033  0.057
E  A  Sequence  120  0.387   42.614   0.935  0.911  0.033  0.057
B  L  Sequence  121  0.109   20.013   0.598  0.831  0.044  0.125
B  T  Sequence  122  0.078   10.846   0.183  0.918  0.063  0.019
B  L  Sequence  123  0.077   14.117   0.561  0.911  0.033  0.057
E  E  Sequence  124  0.439   76.623   1.894  0.950  0.028  0.022
B  V  Sequence  125  0.081   12.388   0.564  0.950  0.028  0.022
B  L  Sequence  126  0.069   12.579   0.437  0.879  0.010  0.111
E  E  Sequence  127  0.476   83.210   0.447  0.879  0.010  0.111
E  A  Sequence  128  0.489   53.833  -0.563  0.622  0.015  0.363
B  T  Sequence  129  0.204   28.281  -0.526  0.339  0.016  0.645
E  A  Sequence  130  0.424   46.714  -0.865  0.109  0.005  0.886
E  D  Sequence  131  0.581   83.664   0.009  0.053  0.005  0.942
E  N  Sequence  132  0.499   73.112  -1.368  0.053  0.005  0.942
E  D  Sequence  133  0.550   79.255  -1.082  0.176  0.004  0.820
E  M  Sequence  134  0.529  105.773   0.296  0.502  0.002  0.495
E  A  Sequence  135  0.313   34.548   0.985  0.802  0.014  0.185
B  L  Sequence  136  0.053    9.778   0.183  0.923  0.002  0.076
B  G  Sequence  137  0.212   16.669  -0.022  0.970  0.001  0.030
E  D  Sequence
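
The ten-column NetSurfP layout can be summarised with a short script. The following is a minimal sketch, assuming whitespace-separated columns as described in the legend; `parse_netsurfp` and `assign_state` are hypothetical helper names, and the sample rows are residues 1, 26, 27, 29 and 58 from the output above plus the truncated final row. Taking the highest of the three probabilities as the secondary structure state is a convention assumed here for illustration, not something the server output itself specifies.

```python
from collections import Counter
from typing import Dict, List

SAMPLE = """\
# sample rows copied from the NetSurfP output above
E V Sequence 1 0.752 115.644 -1.123 0.003 0.003 0.994
B C Sequence 26 0.026 3.678 -0.098 0.502 0.102 0.396
B M Sequence 27 0.143 28.634 0.257 0.725 0.163 0.112
B V Sequence 29 0.048 7.454 0.791 0.807 0.137 0.056
B C Sequence 58 0.061 8.578 -0.417 0.021 0.279 0.699
E D Sequence
"""


def parse_netsurfp(text: str) -> List[Dict]:
    """Parse complete 10-column rows; skip legend lines and truncated rows."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if line.startswith("#") or len(fields) != 10:
            continue
        rsa, asa, z_fit, p_helix, p_strand, p_coil = map(float, fields[4:])
        records.append({
            "exposure": fields[0],       # B = buried, E = exposed
            "aa": fields[1],
            "pos": int(fields[3]),
            "rsa": rsa,                  # relative surface accessibility
            "asa": asa,                  # absolute surface accessibility
            "z_fit": z_fit,
            "p_helix": p_helix,
            "p_strand": p_strand,
            "p_coil": p_coil,
        })
    return records


def assign_state(rec: Dict) -> str:
    """Assumed convention: pick the state with the highest probability."""
    probs = {"H": rec["p_helix"], "E": rec["p_strand"], "C": rec["p_coil"]}
    return max(probs, key=probs.get)


if __name__ == "__main__":
    recs = parse_netsurfp(SAMPLE)
    states = "".join(assign_state(r) for r in recs)
    exposure = Counter(r["exposure"] for r in recs)
    print("states:", states)
    print("buried:", exposure["B"], "exposed:", exposure["E"])
```

Run on the full table, the same summary gives the helix, strand and coil content of the protein and the share of buried residues, which can then be compared with the GOR4 result.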
