




Installing 64-bit Ubuntu on 32-bit Windows XP (VirtualBox)

- In the VirtualBox system settings, set the processor count to 1; setting it to 2 causes an error.
- Enable VT-x/AMD-V in the BIOS: Advanced BIOS Features -> Virtualization, change Disabled (the default) to Enabled, save, and reboot.
- Boot Ubuntu in VirtualBox.

Sharing a Windows folder with Ubuntu

- Install the Guest Additions: VirtualBox -> Devices -> Install Guest Additions.
- Create the mount point in Ubuntu: sudo mkdir /media/shared
- Set a root password (sudo passwd root) and switch to root (sudo -s).
- Create the folder ubuntu1110_64sharefolder on the Windows E: drive and add it as a VirtualBox shared folder.
- Mount it: sudo mount.vboxsf ubuntu1110_64sharefolder /media/shared
- To mount automatically at boot, open fstab with sudo gedit /etc/fstab and append at the end:
      ubuntu1110_64sharefolder /media/shared vboxsf rw 0 0

Installing jdk-7u3-linux-x64.tar.gz (reference: /yang_hui1986527/article/details/6677450)

- From /media/shared run:
      sudo mkdir /usr/lib/jvm
      sudo tar zxvf ./jdk-7u3-linux-x64.tar.gz -C /usr/lib/jvm
  (tar flags: z filters through gzip, x extracts from the archive, v is verbose, f names the archive file.)
- Rename the extracted directory:
      cd /usr/lib/jvm
      sudo mv jdk1.7.0/ java-7-sun
- Install vim if needed: apt-get install vim
- Edit the environment variables (vim ~/.bashrc) and append:
      export JAVA_HOME=/usr/lib/jvm/java-7-sun
      export JRE_HOME=$JAVA_HOME/jre
      export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
      export PATH=$JAVA_HOME/bin:$PATH
  Save and quit (:wq), then apply immediately: source ~/.bashrc
- Configure the default JDK. Ubuntu may already ship a default JDK such as OpenJDK, so to make the newly installed JDK the default:
      sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/java-7-sun/bin/java 300
      sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/java-7-sun/bin/javac 300
      sudo update-alternatives --install /usr/bin/jar jar /usr/lib/jvm/java-7-sun/bin/jar 300
      sudo update-alternatives --config java
- Verify with java -version.

Installing hadoop-1.0.0

- mkdir /home/app
- sudo tar zxvf ./hadoop-1.0.0.tar.gz -C /home/app/
- cd /home/app/hadoop-1.0.0
- Edit conf/hadoop-env.sh to point at the JDK install path, then source conf/hadoop-env.sh to apply it.
- Edit conf/core-site.xml to set the HDFS address and port: hdfs://localhost:9000.
- Edit conf/hdfs-site.xml to set the replication factor: the default dfs.replication is 3, but this is a single-node install, so change it to 1.
- Edit conf/mapred-site.xml to set the JobTracker address and port: mapred.job.tracker = localhost:9001.
- Before starting Hadoop, format its file system HDFS from the Hadoop directory:
      bin/hadoop namenode -format
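The three XML edits above lost their markup in this copy of the notes. A reconstruction using only the values the notes give (hdfs://localhost:9000, dfs.replication = 1, mapred.job.tracker = localhost:9001); the property name fs.default.name is the standard Hadoop 1.x name and is assumed here:

```xml
<!-- conf/core-site.xml : HDFS address and port -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- conf/hdfs-site.xml : single-node install, so 1 replica instead of the default 3 -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

<!-- conf/mapred-site.xml : JobTracker address and port -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
```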
Format HDFS before the first start; otherwise the web UIs on ports 50060 and 50070 report errors. Then start all services:
      bin/start-all.sh
  Finally, verify the install in a browser:
      http://localhost:50030  (MapReduce web UI)
      http://localhost:50070  (HDFS web UI)
  If both pages load, the installation succeeded.

Passwordless ssh to localhost

- ssh-keygen -t rsa -P ""  and press Enter at the prompts.
- In ~/.ssh/, append id_rsa.pub to the authorized_keys file (authorized_keys does not exist at first):
      cat id_rsa.pub >> authorized_keys
- Log in with ssh localhost to verify, then exit.

Installing hbase-0.92.1

- cd /media/shared/
- sudo tar zxvf ./hbase-0.92.1.tar.gz -C /home/app/   (this creates /home/app/hbase-0.92.1)
- In the hbase-0.92.1 directory, edit conf/hbase-env.sh; the settings that must be changed are:
      export JAVA_HOME=/usr/lib/jvm/java-7-sun
      export HBASE_CLASSPATH=/home/app/hbase-0.92.1/conf
      export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
      export HBASE_MANAGES_ZK=true
  JAVA_HOME is the Java install path; HBASE_CLASSPATH points at the HBase conf directory.
- source conf/hbase-env.sh to apply.
- Edit conf/hbase-site.xml for pseudo-distributed mode (reference: /Linux/2012-03/56349.htm): set hbase.rootdir to hdfs://localhost:9000/hbase ("The directory shared by region servers"). The properties hbase.master.port (60000), hbase.cluster.distributed (true), hbase.zookeeper.property.dataDir, hbase.zookeeper.property.clientPort (2181), and hbase.zookeeper.quorum stay commented out in this setup.
- Start Hadoop first, then HBase:
      /home/app/hadoop-1.0.0/bin/start-all.sh
      root@zhuwei-VirtualBox:/home/app/hbase-0.92.1/bin# ./stop-hbase.sh
      root@zhuwei-VirtualBox:/home/app/hbase-0.92.1/bin# ./start-hbase.sh
      root@zhuwei-VirtualBox:/home/app/hbase-0.92.1/bin# ./hbase shell

Installing zookeeper-3.4.3.tar.gz

- sudo tar zxvf ./zookeeper-3.4.3.tar.gz -C /home/app/
- Rename zoo_sample.cfg under zookeeper-3.4.3/conf to zoo.cfg:
      sudo mv zoo_sample.cfg zoo.cfg
- Create a data directory under /home/app/zookeeper-3.4.3:
      sudo mkdir zookeeper_data
- In zoo.cfg, set the dataDir parameter to /home/app/zookeeper-3.4.3/zookeeper_data
- Start the server and connect a client:
      bin/zkServer.sh start
      bin/zkCli.sh -server 127.0.0.1:2181
  The terminal reports connection details, e.g.:
      Connecting to 127.0.0.1:2181
      2012-04-27 12:17:50,875 [myid:] - INFO [main:Environment@98] - Client environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT
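The hbase-site.xml described above arrived garbled here; a reconstruction from the property names and values the notes list (the commented-out block mirrors the referenced article and is only needed for a fully distributed setup):

```xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
    <description>The directory shared by region servers.</description>
  </property>
  <!-- Left commented out in this pseudo-distributed setup:
  <property><name>hbase.master.port</name><value>60000</value></property>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
  <property><name>hbase.zookeeper.property.dataDir</name>
    <value>/home/Hadooptest/zookeeper-3.4.3/zookeeperdir/zookeeper-data</value></property>
  <property><name>hbase.zookeeper.property.clientPort</name><value>2181</value></property>
  <property><name>hbase.zookeeper.quorum</name><value>zookeeper</value></property>
  -->
</configuration>
```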
      2012-04-27 12:17:50,890 [myid:] - INFO [main:Environment@98] - Client environment:host.name=zhuwei-VirtualBox
      ... Client environment:java.version=1.7.0_03
      ... Client environment:java.vendor=Oracle Corporation
      ... Client environment:java.home=/usr/lib/jvm/java-7-sun/jre
      ... Client environment:java.io.tmpdir=/tmp
      ... Client environment:os.name=Linux, os.arch=amd64, os.version=3.0.0-12-generic
      ... Client environment:user.name=root, user.home=/root, user.dir=/home/app/zookeeper-3.4.3
      (java.class.path and java.library.path lines omitted)
      2012-04-27 12:17:50,903 [myid:] - INFO [main:ZooKeeper@433] - Initiating client
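For reference, the zoo.cfg implied by the steps above. Only dataDir and clientPort come from the notes; the timing values are the zoo_sample.cfg defaults and are an assumption here:

```
# zoo.cfg (standalone mode)
# tickTime/initLimit/syncLimit: zoo_sample.cfg defaults, assumed
tickTime=2000
initLimit=10
syncLimit=5
# data directory created in the steps above
dataDir=/home/app/zookeeper-3.4.3/zookeeper_data
# port that zkCli.sh connects to
clientPort=2181
```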
      connection, connectString=127.0.0.1:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher
      Welcome to ZooKeeper!
      JLine support is enabled
      ... Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x136f1fa35960000, negotiated timeout = 30000
      WATCHER:
      WatchedEvent state:SyncConnected type:None path:null
  (A warning that the client will not SASL-authenticate because no JAAS "Client" section was found can be ignored when not using SASL.)

Installing hive-0.8.1 (data warehouse)

- cd /media/shared/
- sudo tar xzvf ./hive-0.8.1.tar.gz -C /home/app/
- Edit the environment variables (vim ~/.bashrc) and add:
      export HIVE_HOME=/home/app/hive-0.8.1
      export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH
- Edit hive-env.sh.template (vim hive-env.sh.template) and set:
      export HADOOP_HOME=/home/app/hadoop-1.0.0
- Copy hive-default.xml to create hive-site.xml:
      cp hive-default.xml.template hive-site.xml
- Set the main environment variable by hand as well (use your own Hadoop install path):
      export HADOOP_HOME=/home/app/hadoop-1.0.0
- Create HDFS directories to hold Hive data:
      $HADOOP_HOME/bin/hadoop fs -mkdir /tmp
      $HADOOP_HOME/bin/hadoop fs -mkdir /user/hive/warehouse
      $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
      $HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
- Run bin/hive, then:
      hive> SET mapred.job.tracker=localhost:50030;
      hive> SET -v;
      hive> show tables;
      hive> create table log_stat(ip STRING, time STRING, http_request STRING, uri STRING, http STRING, status int, code STRING);
      OK
      Time taken: 9.893 seconds
  This table was created without specifying field and line delimiters, so drop it and re-create it:
      hive> drop table log_stat;
      OK
      Time taken: 2.172 seconds
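A minimal sketch of the two ~/.bashrc exports above with a quick sanity check; the paths are the ones used throughout these notes and may differ on your machine:

```shell
# Hive environment for ~/.bashrc (paths from the install steps above)
export HIVE_HOME=/home/app/hive-0.8.1
export PATH="$HIVE_HOME/bin:$PATH"   # JAVA_HOME/bin was already prepended in the JDK step

# sanity check: Hive's bin directory should now appear on PATH
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "hive on PATH" ;;
  *) echo "hive missing from PATH" ;;
esac
# prints: hive on PATH
```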
      hive> create table log_stat(ip STRING, time STRING, http_request STRING, uri STRING, http STRING, status int, code STRING) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile;
  (Fields are tab-delimited, rows are newline-delimited, storage is plain text.)
      hive> load data local inpath '/home/app/pig-0.9.2/tutorial/scripts/load_result/part-m-00000' overwrite into table log_stat;
  (If the data is already on HDFS, drop the LOCAL keyword.)
      hive> dfs -ls /user/hive/warehouse;
      Found 1 items
      drwxr-xr-x   - root supergroup          0 2012-06-21 10:19 /user/hive/warehouse/log_stat
      hive> dfs -ls /user/hive/warehouse/log_stat;
      Found 1 items
      -rw-r--r--   1 root supergroup       1593 2012-06-21 10:19 /user/hive/warehouse/log_stat/part-m-00000
- Note: select count(*) in Hive takes a long time because it runs a MapReduce job.
- Export table data to a local file:
      hive> insert overwrite local directory '/media/shared/reg_3' select a.* from log_stat a;
- Import an HDFS file into a table (no LOCAL keyword):
      hive> load data inpath '/user/root/load_result/part-m-00000' overwrite into table tt;
      hive> load data inpath 'hdfs://localhost:9000/user/root/outlog/click_mini/20120628' into table click_mini;
- Bucketing:
      hive> set hive.enforce.bucketing=true;
      hive> set hive.enforce.bucketing;
  Creating buckets on an external table does not split its directory.
      hive> select * from ad_3rd tablesample(bucket 3 out of 3 on rand());
      hive> select count(*) from ad_3rd tablesample(bucket 2 out of 3 on access_date);   (this one fails)
      hive> select count(*) from ad_3rd tablesample(bucket 1 out of 3 on access_date);   (this one works)

Installing MySQL

- apt-get install mysql-client-5.1 mysql-server-5.1
- Check the service and log in:
      root@zhuwei-VirtualBox:/etc/mysql# service mysql status
      root@zhuwei-VirtualBox:/home# mysql -uroot -proot
      mysql> create user 'hive' identified by '123456';    (create user and password)
      mysql> grant all privileges on *.* to 'hive'@'%' with grant option;    (grant rights)
      mysql> select user();    (show the current user)
      mysql> create database hive;
- Note: plain sudo apt-get install mysql-server fetches an older version.
- To remove MySQL:
      sudo apt-get autoremove --purge mysql-server-5.1
      sudo apt-get remove mysql-server
      sudo apt-get autoremove mysql-server
      sudo apt-get remove mysql-common    (very important)
- To install MySQL-5.5.23-1.linux2.6.x86_64.tar instead (a plain tar, so no z flag):
      sudo tar xvf ./MySQL-5.5.23-1.linux2.6.x86_64.tar -C /home/app

Connecting Hive to MySQL

- Edit conf/hive-site.xml, replacing the Derby defaults with the MySQL values:
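The delimiter escapes in the re-created table got eaten in this copy of the notes; as a hedged illustration (the sample row and its field values are invented), this is the tab-separated layout that row format delimited fields terminated by '\t' expects, with each line splitting into the seven declared columns:

```shell
# Hypothetical sample row matching log_stat's 7 columns:
# ip, time, http_request, uri, http, status, code
printf '1.2.3.4\t21/Jun/2012:10:19:00\tGET\t/index.html\tHTTP/1.1\t200\tOK\n' > /tmp/log_stat_row.tsv

# awk splits on tabs exactly as Hive's delimited storage format will
awk -F'\t' '{print NF " fields; ip=" $1 "; status=" $6}' /tmp/log_stat_row.tsv
# prints: 7 fields; ip=1.2.3.4; status=200
```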
      javax.jdo.option.ConnectionURL: jdbc:derby:;databaseName=metastore_db;create=true  changed to  jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true   (JDBC connect string for a JDBC metastore)
      javax.jdo.option.ConnectionDriverName: org.apache.derby.jdbc.EmbeddedDriver  changed to  com.mysql.jdbc.Driver   (driver class name for a JDBC metastore)
      javax.jdo.option.ConnectionUserName: APP  changed to  hive   (username to use against metastore database)
      javax.jdo.option.ConnectionPassword: mine  changed to  123456   (password to use against metastore database)
- Copy the MySQL JDBC driver mysql-connector-java-5.1.19-bin.jar into Hive's lib directory.
- Verify:
      root@zhuwei-VirtualBox:/home/app/hive-0.8.1# bin/hive
      hive> show tables;
      OK
      Time taken: 5.852 seconds
- Inspect the Hive metadata stored in MySQL:
      mysql> use hive;
      Database changed
      mysql> select * from TBLS;

Installing pig-0.9.2 (Pig operates on HDFS files and manages MapReduce jobs)

- sudo tar zxvf ./pig-0.9.2.tar.gz -C /home/app/
- sudo vi /etc/profile and add:
      export JAVA_HOME=/usr    (required; without this line Pig does not start)
      export PIG_INSTALL=/home/app/pig-0.9.2
      export PATH=$PATH:$PIG_INSTALL/bin
      export PIG_HADOOP_VERSION=20
      export PIG_CLASSPATH=$HADOOP_INSTALL/conf    (for Pig's MapReduce mode)
- source /etc/profile
- Compile the tutorial's Java sources:
      root@zhuwei-VirtualBox:/home/app/pig-0.9.2/tutorial/src/org/apache/pig/tutorial# javac -classpath /home/app/pig-0.9.2/pig-0.9.2.jar *.java
- Package the org folder under tutorial/src so the example Pig scripts under scripts/ can run:
      root@zhuwei-VirtualBox:/home/app/pig-0.9.2/tutorial/src# jar -cvf tutorial.jar org
- Run a script (test.pig is a self-written script that takes a load_path parameter naming the input data file):
      root@zhuwei-VirtualBox:/home/app/pig-0.9.2/tutorial/scripts# pig -x local -param load_path=outlog/ipad_ads_error test.pig

Eclipse Hadoop development environment (Linux)

- Install the Hadoop Eclipse plugin:
  1. Copy <hadoop install dir>/contrib/eclipse-plugin/hadoop-0.20.2-eclipse-plugin.jar into <eclipse install dir>/plugins/.
  2. Restart Eclipse. If the plugin loaded, Window -> Preferences now shows a Hadoop Map/Reduce entry; set the Hadoop installation directory there and exit.
  3. Configure Map/Reduce Locations: open the Map/Reduce Locations view via Window -> Show View, then create a new Hadoop
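Reconstructed from the before/after values listed above (the slashes in the JDBC URL were lost in this copy), the MySQL-backed metastore section of conf/hive-site.xml looks roughly like:

```xml
<!-- Derby defaults replaced with the MySQL values from the notes -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>password to use against metastore database</description>
</property>
```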
  Location (right-click in the view -> New Hadoop Location). In the dialog, set a Location name (e.g. myubuntu) plus the Map/Reduce Master and DFS Master; their Host and Port match what you configured in mapred-site.xml and core-site.xml respectively:
      Map/Reduce Master: localhost, 9001
      DFS Master: localhost, 9000
  Exit the dialog and click DFS Locations -> myubuntu: if the folders display, the configuration is correct; "connection refused" means a configuration error.
- Create a project: File -> New -> Other -> Map/Reduce Project. Any name works, e.g. hadoop-test. Copy <hadoop install dir>/src/examples/org/apache/hadoop/examples/WordCount.java into the new project.
- root@zhuwei-VirtualBox:/home/app/hadoop-1.0.0# bin/hadoop dfs -mkdir pig    (creates a pig folder on HDFS)

Hadoop hello world

- cd /home/app/hadoop-1.0.0 and mkdir input
- vi input/test1.txt with the content: Hello World Bye World
- vi input/test2.txt with the content: Hello Hadoop Goodbye Hadoop
- From the Hadoop install directory:
      bin/hadoop fs -put input test1.txt    (uploads the input folder to HDFS under the name test1.txt)
      bin/hadoop dfs -rmr test1.txt         (removes that test1.txt folder)
      bin/hadoop dfs -put input in          (uploads the input folder to HDFS as "in")
      bin/hadoop jar hadoop-examples-1.0.0.jar wordcount in out    (runs wordcount over "in", writing results to "out")
- Job output:
      Warning: $HADOOP_HOME is deprecated.
      12/06/14 10:09:22 INFO input.FileInputFormat: Total input paths to process : 2
      12/06/14 10:09:23 INFO mapred.JobClient: Running job: job_201206131736_0001
      12/06/14 10:09:24 INFO mapred.JobClient:  map 0% reduce 0%
      12/06/14 10:10:00 INFO mapred.JobClient:  map 100% reduce 0%
      12/06/14 10:10:30 INFO mapred.JobClient:  map 100% reduce 100%
      12/06/14 10:10:36 INFO mapred.JobClient: Job complete: job_201206131736_0001
      12/06/14 10:10:37 INFO mapred.JobClient: Counters: 29
      12/06/14 10:10:37 INFO mapred.JobClient:   Job Counters
      ...   Launched reduce tasks=1
      ...   SLOTS_MILLIS_MAPS=49686
      ...   Total time spent by all reduces waiting after reserving slots (ms)=0
      ...   Total time spent by all maps waiting after reserving slots (ms)=0
      ...   Launched map tasks=2
      ...   Data-local map tasks=2
      ...   SLOTS_MILLIS_REDUCES=27436
      ...   File Output Format Counters
      ...   Bytes Written=40
      ...   FileSystemCounters
      ...   FILE_BYTES_READ=78
      ...   HDFS_BYTES_READ=267
      ...   FILE_BYTES_WRITTEN=64700
      ...   HDFS_BYTES_WRITTEN=40
      ...   File Input Format Counters
      ...   Bytes Read=49
      (log truncated)
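As a local sanity check on the job above (HDFS_BYTES_WRITTEN=40 is the size of the result file), the same word counts can be reproduced with coreutils alone; this sketch recreates the two sample inputs from the notes in /tmp (an assumed scratch location):

```shell
# Recreate the two sample input files from the notes
mkdir -p /tmp/wc-input
printf 'Hello World Bye World\n' > /tmp/wc-input/test1.txt
printf 'Hello Hadoop Goodbye Hadoop\n' > /tmp/wc-input/test2.txt

# Local equivalent of the wordcount job:
# split on whitespace, then count occurrences per word
cat /tmp/wc-input/*.txt | tr -s ' ' '\n' | sort | uniq -c | awk '{print $2 "\t" $1}'
# prints (word, tab, count), one line per distinct word:
# Bye     1
# Goodbye 1
# Hadoop  2
# Hello   2
# World   2
```

This matches the format WordCount writes to the out directory: one word per line followed by its count.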