NCBI Local BLAST Installation Method
IN HOUSE LOCAL BLAST SEARCH

To get started you need blastall.exe and formatdb.exe (from NCBI). The remaining Perl and batch programs, in which you might need to change the directory paths they point to or the blast options they use, can be downloaded from: /SGMD/software/blast/Blast.htm

For the programs to work without modifying the paths, the whole archive “Blast.zip” should be unzipped to a folder named Blast placed directly under the “C:” drive. For questions or comments please contact: Imed Ben Chouikha

I. Step one: Blasting

1) Download the database that you want to blast against, for example the NT database from NCBI. If you want to use a local database, store all the sequences in a text file. The file provided by NCBI is compressed (nt.gz), so you have to decompress it first.

2) At the DOS prompt (which you can get to from Windows by choosing Start, Run, then typing: command), run formatdb.exe to create a local database from that text file or the downloaded database.

Usage:
    formatdb -t databasename -i inputfile -p F

Examples:
    1) formatdb -t nt -i nt -p F
    2) formatdb -t snc -i inputfile -p F

databasename is the name you want to give to your database.
inputfile is the name of the text file that contains your sequences, or the name of the database that you downloaded from GenBank (technically also a text file of sequences).
The -p F option marks the input as nucleotide (not protein) sequences.

More information about formatdb.exe and its command options can be found here: /IEB/ToolBox/C_DOC/lxr/source/doc/formatdb.txt

3) Open the file BlastList.pl (using Notepad or your favorite text editor). Make the small changes as instructed in the file, then save it. These are the only two changes that should be made to run the program.

4) Run BlastList.pl as follows:
    c:\> cd Blast
    c:\Blast> perl BlastList.pl
BlastList.pl automatically creates a batch file “DosBlast.bat” based on the list of sequences to be blasted.

5) Run DosBlast.bat:
    c:\Blast> DosBlast.bat
DosBlast.bat is the actual file that does the blast search.

II. Step two: Extracting data from blast results

6) Move all the resulting .txt files to BlastOut.

7) Go to the directory BlastOut:
    c:\Blast> cd BlastOut

8) Run Hits.pl:
    c:\Blast\BlastOut> perl Hits.pl
That will move the files that returned no hits to a different directory.

9) Run DataExt.pl:
    c:\Blast\BlastOut> perl DataExt.pl
The output will be written to the file Blasted.txt. Open Blasted.txt in Excel as a tab-delimited file. It contains a summary of the blast results that you can save, edit, etc.

LIST OF PROGRAMS

The required programs are available in this directory, but here is the code for four of them in case you wish to make some modifications:
1) BlastList.pl
2) DosBlast.bat
3) Hits.pl
4) DataExt.pl

=BlastList.pl=

    #!/usr/local/bin/perl
    # file: BlastList.pl
    #
    # Imed Ben Chouikha
    # 04/24/03
    #
    # This file creates a batch file DosBlast.bat based on
    # the list of .seq sequence files.
    # The file DosBlast.bat is run separately. (For more information see ReadMe.doc)
    # send comments to:

    # CHANGES TO MAKE
    # 1) Replace SCN_seq.fas (in the first line of the program) with the name of the
    #    local database you are blasting against.
    # 2) Exclude (only if needed) any files other than the sequence files by adding
    #    them to the unless statement (below, in the middle of the code).

    use strict;
    use warnings;

    my $DBNAME   = "SCN_seq.fas";   # Replace SCN_seq.fas with local database name
    my $dirtoget = "C:/Blast";
    opendir(IMD, $dirtoget) || die("Cannot open directory");

    # Delete the old DosBlast.bat file that contains the list of sequences
    # to be blasted
    my $dosfile = "DosBlast.bat";
    unlink($dosfile);

    # Get the list of the new sequence files to blast
    my @thefiles = readdir(IMD);
    closedir(IMD);

    # Create a new file DosBlast.bat
    open(OUT, ">", $dosfile) || die "cannot open file for writing: $!";
    foreach my $f (@thefiles) {
        # Add to the list below everything other than the sequence files
        # Here is the unless statement:
        unless ( ($f eq ".") || ($f eq "..")
              || ($f eq "DosBlast.bat") || ($f eq "BlastList.pl") || ($f eq $DBNAME)
              || ($f eq "BlastOut") || ($f eq "blastall.exe") || ($f eq "formatdb.exe")
              || ($f eq "formatdb.log") || ($f eq "ReadMe.doc") ) {
            my @myarray   = split(/\./, $f);            # Old file name, split at the dots
            my $extension = ".txt";                     # This is the new file extension
            my $newname   = $myarray[0] . $extension;   # Output file name
            print(OUT "blastall -p blastn -d $DBNAME -i $f -o $newname -v 0 -b 1\n");
        }   # end of unless
    }   # end of foreach
    close(OUT);

=DosBlast.bat=

    # This file is now generated automatically, so you do not have to worry about it.
    # It contains lines of the form
    #   blastall -p blastn -d $DBNAME -i $f -o $newname -v 0 -b 1
    # where $DBNAME is the database name, and $f and $newname are the input and
    # output file names read from the directory by BlastList.pl

=Hits.pl=

    #!/usr/local/bin/perl
    # file: Hits.pl
    # Imed Ben Chouikha
    # 04/24/03
    # This file moves all the files that returned no hits to the directory called
    # NoHits, removing them from the current directory.
    # Send comments to:

    use strict;
    use warnings;

    my $dirtoget = "C:/Blast/BlastOut";
    opendir(IMD, $dirtoget) || die("Cannot open directory");
    my @thefiles = readdir(IMD);
    closedir(IMD);

    my $nohit = "No Hits Found";   # matched case-insensitively below
    my $odir  = "NoHits";          # must already exist inside BlastOut

    # loop over the files
    foreach my $f (@thefiles) {
        unless ( ($f eq ".") || ($f eq "..")
              || ($f eq "Blasted.txt") || ($f eq "DataExt.pl")
              || ($f eq "NoHits") || ($f eq "Hits.pl") ) {
            open(IN, "<", $f) || die "cannot open file for reading: $!";
            my $count = 0;
            while (my $line = <IN>) {
                $count += 1 if $line =~ /$nohit/i;   # count the "no hits" lines
            }   # end of while loop
            close(IN);   # close the file before renaming it
            if ($count == 1) {
                (-d $odir) || die("Cannot open directory");
                rename($f, "$odir/$f");
            }
        }   # end of unless
    }   # end of foreach

=DataExt.pl=

    #!/usr/local/bin/perl
    # file: DataExt.pl
    #
    # Imed Ben Chouikha
    # 04/24/03
    #
    # This file extracts the E-values, best hits, and other values from the Blast result.
    # send comments to:

    $dirtoget = "C:/Blast/BlastOut";
    opendir(IMD, $dirtoget) || die("Cannot open directory");
    @thefiles = readdir(IMD);
    closedir(IMD);

    # loop over the files
    open(OUT, ">", "Blasted.txt") || die "cannot open file for writing: $!";
    foreach $f (@thefiles) {
        unless ( ($f eq ".") || ($f eq "..") || ($
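As a minimal sketch of what BlastList.pl does in Step one (turning a directory listing into one blastall command per sequence file), the selection logic could be isolated in a small subroutine. The subroutine name blast_commands, its skip-list argument, and the file names clone01.seq and clone02.seq are assumptions made for illustration and are not part of the original scripts. The blastall options are the ones used in the guide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original scripts): build one blastall
# command line per sequence file, skipping ".", "..", the database file,
# and anything else named in the skip list.
sub blast_commands {
    my ($dbname, $files, $skip) = @_;
    my %skip = map { $_ => 1 } @$skip, ".", "..", $dbname;
    my @commands;
    foreach my $f (sort @$files) {
        next if $skip{$f};
        # Output name: the text before the first dot, plus ".txt"
        my ($base) = split(/\./, $f);
        push @commands,
            "blastall -p blastn -d $dbname -i $f -o $base.txt -v 0 -b 1";
    }
    return @commands;
}

# Example with made-up file names (assumptions, for illustration only)
my @files = (".", "..", "SCN_seq.fas", "BlastList.pl",
             "clone01.seq", "clone02.seq");
my @cmds = blast_commands("SCN_seq.fas", \@files, ["BlastList.pl"]);
print "$_\n" for @cmds;
```

Keeping the skip list in a hash rather than a chain of string comparisons makes it easier to exclude extra files, such as the index files that formatdb writes next to the database, without editing the loop itself.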
