Hadoop Tutorial 12: HBase, Hive, Pig, ZooKeeper Study Notes

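Contents

1 Hive structure: Hive architecture; Hive and Hadoop; Hive metadata store; Hive data storage; other Hive operations
2 Basic Hive operations: create table (overview, syntax, basic example, creating partitions, other examples); Alter Table (Add, Drop, Rename, Change, Add/Replace); Create View; Insert (inserting data into Hive tables from queries, writing data into the filesystem from queries); command line and interactive shell; other (Top k, REGEX column specification)
3 Hive Select: Group By; Order/Sort By
4 Hive Join
5 Hive parameter settings
6 Hive functions: basic functions (relational, arithmetic, and logical operators); built-in functions (math, collection, type conversion, date, conditional, and string functions)
Further topics: compression, subqueries, optimization and tips (total ordering, bucket join, map join, merging small files)

1 Hive structure

Hive defines a simple SQL-like query language, QL, so that users who already know SQL can query data. The same language also lets developers plug in custom mappers.

1.1 Hive architecture

User interfaces: CLI, Client, and WUI. The CLI is the most commonly used; starting the CLI also starts a Hive instance. Client mode connects to a Hive Server. WUI is the web interface.

Hadoop: Hive uses HDFS for storage and MapReduce for computation. Hive data is stored in HDFS, and most queries are carried out by MapReduce; a query such as select * from tbl does not generate a MapReduce task.

1.2 Hive and Hadoop

Hive is built on top of Hadoop. Query plans are executed in Hadoop as MapReduce tasks, although some queries run without one, for example select * from table.

1.3 Hive compared with a database

Hive uses HQL, a SQL-like query language, but it differs from a database in several respects.

Data format. Hive only needs to be told how to read the file data in HDFS (Hive has three default file formats: TextFile, SequenceFile, and RCFile). Loading data requires no conversion from the user's data format to a Hive-specific one.

Data updates. Data in a database usually needs to be modified often, so a database offers INSERT INTO ... VALUES to add rows and UPDATE ... SET to change them.

Execution. Most Hive queries are executed through the MapReduce framework provided by Hadoop (queries like select * from tbl need no MapReduce).

The remaining points of comparison:

    Data storage: Hive uses HDFS; a database uses a raw device or the local file system.
    Indexes: Hive has none; a database has them.
    Execution: Hive uses MapReduce.
    Latency: Hive is high; a database is low.

Data scale. A database handles comparatively small data sets; once the data grows beyond what the database can process, Hive has the advantage.

Scalability. Because Hive is built on Hadoop, its scalability is the same as Hadoop's. By comparison, the parallel database Oracle has a theoretical scaling limit of only about 100 machines.

1.4 Hive metadata store

1.4.1 Starting the metastore database

Go to the Hive installation directory and run startNetworkServer -h ... (the Derby network server script). To test the connection, look up the property described as "JDBC connect string for a JDBC ..." and enter the matching Connect command.

When an object is created, an ID is generated for it and written, together with the object's information (name, type, and so on), into the system tables of the metastore. The Hive metastore exposes these ids (oid, cid, and so on), whereas commercial systems such as Oracle hide them.

If the next recorded value is 21, the next Hive table created gets TBL_ID 21, and 271786 in SEQUENCE_TABLE is updated to 26.

1.5 Hive data storage

All Hive data is stored in HDFS. Hive has the following data models: Table, External Table, Partition, and Bucket.

Table. A Hive Table is conceptually similar to a database table, and each Table has a corresponding directory that holds its data. For example, the table xiaojun is stored in HDFS at /warehouse/xiaojun, where the warehouse directory is the one configured in hive-site.xml. All Table data (External Tables excluded) is kept under this directory.

Partition. Each Partition of a table corresponds to a directory under the table's directory, and all of that Partition's data is stored there. For example, if xiaojun is partitioned by dt and ctry, then dt = 20100801, ctry = US corresponds to the HDFS subdirectory /warehouse/xiaojun/dt=20100801/ctry=US, and dt = 20100801, ctry = CA gets its own subdirectory in the same way.

Buckets. Buckets split the data by the hash of a chosen column, and each Bucket corresponds to one file. To spread the user column over 32 buckets, Hive computes a hash of the user value; rows with hash value 0 are stored in /warehouse/xiaojun/dt=20100801/ctry=US/part-00000, and rows with other hash values, such as 20, go to their own files in the same way.

External Table. An External Table points to data that already exists in HDFS, and it can have Partitions. Creating an External Table is a single step: loading the data and creating the table happen together (CREATE EXTERNAL TABLE ... LOCATION), and the data stays at the HDFS path given after LOCATION. When an External Table is dropped, only its metadata is deleted.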
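The directory layout described for partitions and buckets can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not Hive code: the helper names `partition_dir` and `bucket_file` are hypothetical, the bucket index is assumed to be the hash value modulo the number of buckets, and the zero-padded file name is inferred from the part-00000 example.

```python
def partition_dir(warehouse, table, partition_spec):
    """Build the HDFS directory for one partition.

    partition_spec is an ordered list of (column, value) pairs.
    """
    parts = "/".join(f"{col}={val}" for col, val in partition_spec)
    return f"{warehouse}/{table}/{parts}"


def bucket_file(hash_value, num_buckets=32):
    """File name for rows whose bucketing column hashes to hash_value.

    Assumption: bucket index = hash value modulo the number of buckets.
    """
    return f"part-{hash_value % num_buckets:05d}"


spec = [("dt", "20100801"), ("ctry", "US")]
directory = partition_dir("/warehouse", "xiaojun", spec)
print(directory)
print(f"{directory}/{bucket_file(0)}")
```

Keeping the partition specification as an ordered list matters because the order of the partition columns determines the nesting of the directories.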

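1.6 Other Hive operations

    sh $HIVE_HOME/bin/hive --service ...
    hadoop fs -text ...

2 Basic Hive operations

2.1 create table

2.1.1 Overview

CREATE TABLE creates a table with the given name. If a table with that name already exists, an exception is raised; the IF NOT EXISTS option suppresses it.

The EXTERNAL keyword lets the user create an external table and, at creation time, give the path of the actual data (LOCATION).

LIKE lets the user copy the structure of an existing table without copying its data.

The user can supply a custom SerDe or use the built-in one. If no ROW FORMAT is given, or ROW FORMAT DELIMITED is used, the built-in SerDe is used. The user also specifies the table's columns when creating the table.

If the file data is plain text, use STORED AS TEXTFILE. If the data needs compression, use STORED AS SEQUENCEFILE.

A table can have one or more partitions, and each partition lives in its own directory. Tables and partitions can also be bucketed on a column with CLUSTERED BY, and the rows inside each bucket can be ordered with SORTED BY.

2.1.2 Syntax

    CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
      [(col_name data_type [COMMENT col_comment], ...)]
      [COMMENT table_comment]
      [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
      [CLUSTERED BY (col_name, col_name, ...)
        [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
      [ROW FORMAT row_format]
      [STORED AS file_format
        | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]]   (only available starting with a later release)
      [LOCATION hdfs_path]
      [TBLPROPERTIES (property_name=property_value, ...)]   (only available starting with a later release)

    CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
      LIKE existing_table_name
      [LOCATION hdfs_path]

    data_type
      : primitive_type
      | array_type
      | map_type
      | struct_type

    array_type
      : ARRAY<data_type>

    map_type
      : MAP<primitive_type, data_type>

    struct_type
      : STRUCT<col_name : data_type ...>

2.1.3 Basic example

The location of a table can be looked up in the metastore:

    select LOCATION from tbls a, sds b where a.sd_id = b.sd_id and ...;

Creating an external table, loading data into it, and dropping it:

    hive> create EXTERNAL table ... (..., cont string) row format delimited fields terminated by '\005' stored as ...;
    hive> LOAD DATA INPATH '/user/admin/xiaojun' INTO TABLE ...;
    Loading data to table ...
    hive> drop table ...;
    [admin@hadoop1 bin]$ ./hadoop fs -ls ...
    Found 1 items

2.1.4 Creating partitions

Hive partitions are enabled with PARTITIONED BY when the table is created. The partitioning dimension is not a column of the actual data. To query one partition, add a condition on tablename.partition_key to the where clause.

    CREATE TABLE page_view(viewTime INT, userid BIGINT,
        page_url STRING, referrer_url STRING,
        ip STRING COMMENT 'IP Address of the User')
    COMMENT 'This is the page view table'
    PARTITIONED BY(dt STRING, country STRING)
    CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
    ROW FORMAT DELIMITED
        FIELDS TERMINATED BY ...

A second example, with a single partition column:

    CREATE TABLE ... (yyyymmdd string, ... string, ...)
    PARTITIONED BY (dt string)
    row format delimited fields terminated by '\005' stored as ...;

    INSERT ... SELECT count(*) FROM c02_clicks_fatdt1 a WHERE a.dt >= 20101101 AND a.dt ...;

2.2 Alter Table

    hive> desc xi;
    Time taken: 0.061 seconds
    hive> create table xibak like xi;
    hive> alter table xibak replace columns (ins_date ...);
    hive> desc xibak;

2.3 Create View

    CREATE VIEW [IF NOT EXISTS] view_name [(column_name [COMMENT column_comment], ...)]
      [COMMENT view_comment]
      [TBLPROPERTIES (property_name = property_value, ...)]
      AS SELECT ...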

2.4 Show

    SHOW TABLES;
    SHOW TABLES 'page.*';
    SHOW PARTITIONS ...;
    DESCRIBE ...;
    DESCRIBE EXTENDED ... PARTITION (ds='2008-08-...');
    SELECT a.foo FROM invites a limit ...;
    SELECT a.foo FROM invites a WHERE a.ds='2008-08-...';

2.5 Load

    LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
      [PARTITION (partcol1=val1, partcol2=val2 ...)]

Load moves or copies data files into the location that belongs to the Hive table.

filepath can be:
  o a relative path
  o an absolute path
  o a complete URI that includes the scheme

filepath can refer to a file (Hive moves the file into the directory of the table) or to a directory (Hive moves all files in that directory into the directory of the table).

If LOCAL is specified:
  o load looks for filepath in the local file system. A relative path is interpreted relative to the user's current directory. The user can also give a complete URI for a local file.
  o load copies the files at filepath to the target file system.

If LOCAL is not specified and filepath is a complete URI, Hive uses that URI directly. Otherwise:
  o if no scheme or authority is given, Hive uses the ones defined in the Hadoop configuration, which names the URI of the Namenode;
  o if the path is not absolute, Hive interprets it relative to /user/...;
  o Hive moves the files at filepath into the path of the table (or partition).

If OVERWRITE is used, any existing content of the target table (or partition) is deleted first, and then the content of the file or directory at filepath is added to the table or partition.

Examples:

    LOAD DATA ... INPATH ... OVERWRITE INTO TABLE c02_clicks_fatdt1 PARTITION (...);
    LOAD DATA LOCAL INPATH ... INTO TABLE ...;
    LOAD DATA LOCAL INPATH '/tmp/pv_2008-06-...' INTO TABLE ... PARTITION(date='2008-06-08', ...);
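The path rules for LOAD DATA can be summarised as a small decision function. The sketch below is an illustration in Python, not Hive's implementation: `resolve_load_path` is a hypothetical helper, and the default file system URI, the user name, the current directory, and the assumption that relative HDFS paths resolve under /user/<user> are placeholders chosen for the example.

```python
import posixpath
from urllib.parse import urlparse


def resolve_load_path(filepath, local, default_fs="hdfs://namenode:9000",
                      user="hive", cwd="/home/hive"):
    """Return the URI a LOAD DATA filepath would refer to under the rules above."""
    if urlparse(filepath).scheme:
        # A complete URI (scheme included) is used as given.
        return filepath
    if local:
        # LOCAL: relative paths are taken relative to the user's current directory.
        path = filepath if filepath.startswith("/") else posixpath.join(cwd, filepath)
        return "file://" + path
    # No LOCAL: scheme and authority come from the Hadoop configuration,
    # and relative paths are assumed to resolve under /user/<user>.
    path = filepath if filepath.startswith("/") else posixpath.join("/user", user, filepath)
    return default_fs + path


print(resolve_load_path("project/data1", local=False))
print(resolve_load_path("project/data1", local=True))
```

The function only computes the resolved location; whether the files are then copied (LOCAL) or moved (no LOCAL) is a separate step.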

2.6 Insert

2.6.1 Inserting data into Hive Tables from queries

Standard syntax:

    INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)]
      select_statement1 FROM from_statement

Hive extension (multiple inserts):

    FROM from_statement
    INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1
    [INSERT OVERWRITE TABLE tablename2 [PARTITION ...] select_statement2] ...

Hive extension (dynamic partition inserts):

    INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...)
      select_statement FROM from_statement

To insert a row of constants, select the constants from an existing table B:

    insert overwrite table A select 1, 'abc' from B limit 1;

Examples:

    hive> FROM invites a INSERT OVERWRITE TABLE events SELECT a.bar, count(*) WHERE a.foo > 0 GROUP BY a.bar;
    hive> INSERT OVERWRITE TABLE events SELECT a.bar, count(*) FROM invites a WHERE a.foo > 0 GROUP BY a.bar;

    FROM src
    INSERT OVERWRITE TABLE dest1 SELECT src.* WHERE src.key < 100
    INSERT OVERWRITE TABLE dest2 SELECT src.key, src.value WHERE src.key >= 100 and src.key < 200
    INSERT OVERWRITE ... WHERE src.key >= 200 and src.key ...

    from ...
    insert overwrite table test2 select 1,2,3 limit 1
    insert overwrite table d select 4,5,6 limit 1;

Hive does not support inserting rows one at a time with insert statements.

2.6.2 Writing data into filesystem from queries

Standard syntax:

    INSERT OVERWRITE [LOCAL] DIRECTORY directory1 SELECT ... FROM ...

Hive extension (multiple inserts):

    FROM from_statement
    INSERT OVERWRITE [LOCAL] DIRECTORY directory1 select_statement1
    [INSERT OVERWRITE [LOCAL] DIRECTORY directory2 select_statement2] ...

Example:

    INSERT OVERWRITE LOCAL DIRECTORY '/tmp/local_out' SELECT a.* FROM ...;

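    INSERT OVERWRITE DIRECTORY ... SELECT a.* FROM c02_clicks_fatdt1 a WHERE ...;

2.7 Cli

2.7.1 Command line

    Usage: hive [-hiveconf x=y]* [<-i filename>]* [<-f filename>|<-e query-string>] [-S]

    -i <filename>             Initialization Sql from file (executed automatically and silently before any other commands)
    -e 'quoted query string'  Sql from command line
    -f <filename>             Sql from file
    -S                        Silent mode in interactive shell
    -hiveconf x=y             Use this to set hive/hadoop configuration variables.

-e and -f cannot be specified together. In the absence of these options, the interactive shell is started. However, -i can be used with any other options. Multiple instances of -i can be used to execute multiple init scripts. To see this usage help, run hive -h.

Example of running a query from the command line:

    $HIVE_HOME/bin/hive -e 'select count(*) from ...'

Example of setting hive configuration variables:

    $HIVE_HOME/bin/hive -e 'select a.col from tab1 a' -hiveconf hive.exec.scratchdir=/home/my/hive_scratch -hiveconf ...

Example of running an initialization script:

    $HIVE_HOME/bin/hive -i /home/my/hive-...

Further examples:

    $HIVE_HOME/bin/hive -f /home/my/hive-...
    $HIVE_HOME/bin/hive -S -e 'select count(*) from c02_clicks_fatdt1'

2.7.2 Interactive shell

    quit or exit         Leave the interactive shell.
    set <key>=<value>    Set the value of a configuration variable.
    set                  Print the configuration variables that have been overridden.
    set -v               This will give all possible hadoop/hive configuration variables.
    add FILE <value>     Adds a file to the list of resources.
    list FILE            List all the resources already added.
    list FILE <value>    Check whether the given resources are already added or not.
    ! <cmd>              Execute a shell command from the hive shell.
    dfs <dfs command>    Execute a dfs command from the hive shell.
    <query string>       Execute a hive query and print the results.

Examples:

    hive> set i=32;
    hive> set i;
    hive> select a.* from xiaojun a;
    hive> !ls;
    hive> dfs -ls;

2.7.3 Hive Resources

Hive can manage the addition of resources to a session where those resources need to be made available at query execution time. Any locally accessible file can be added to the session. Once a file is added to a session, a hive query can refer to this file by its name (in map/reduce/transform clauses) and this file is available locally at execution time on the entire hadoop cluster. Hive uses Hadoop's Distributed Cache to distribute the added files to all the machines in the cluster at query execution time.

Usage:

    ADD { FILE[S] | JAR[S] | ARCHIVE[S] } <filepath1> [<filepath2>]*
    LIST { FILE[S] | JAR[S] | ARCHIVE[S] } [<filepath1> <filepath2> ..]
    DELETE { FILE[S] | JAR[S] | ARCHIVE[S] } [<filepath1> <filepath2> ..]

FILE resources are just added to the distributed cache. Typically, this might be something like a transform script to be executed.
JAR resources are also added to the Java classpath. This is required in order to reference objects they contain such as UDFs.
ARCHIVE resources are automatically unarchived as part of distributing them.

Example:

    hive> add FILE /tmp/tt.py;
    hive> list FILES;
    /tmp/tt.py
    hive> from networks a MAP a.networkid USING 'python tt.py' as nn where a.ds = '2009-01-04' limit ...;

It is not necessary to add files to the session if the files used in a transform script are already available on all machines in the hadoop cluster using the same path name. For example:

    ... MAP a.networkid USING 'wc -l' ...: here wc is an executable available on all machines.
    ... MAP a.networkid USING '/home/nfsserv1/hadoopscripts/tt.py' ...: here tt.py may be accessible via an nfs mount point that is configured identically on all the cluster nodes.

2.7.4 Calling Python, shell, and other scripts

The following script reads tab-separated rows from standard input and replaces the unix timestamp with the day of the week:

    import sys
    import datetime

    for line in sys.stdin:
        line = line.strip()
        # Each input row is userid, movieid, rating, unixtime separated by tabs.
        userid, movieid, rating, unixtime = line.split('\t')
        weekday = datetime.datetime.fromtimestamp(float(unixtime)).isoweekday()
        print('\t'.join([userid, movieid, rating, str(weekday)]))

Using the script from Hive:

    CREATE TABLE u_data_new (
      userid INT,
      movieid INT,
      rating INT,
      weekday INT)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t';

    add FILE weekday_mapper.py;

    INSERT OVERWRITE TABLE u_data_new
    SELECT
      TRANSFORM (userid, movieid, rating, unixtime)
      USING 'python weekday_mapper.py'
      AS (userid, movieid, rating, weekday)
    FROM u_data;

A transform can also use any executable that is available on the cluster:

    FROM invites a INSERT OVERWRITE TABLE events SELECT TRANSFORM(a.foo, a.bar) AS (oof, rab) USING '/bin/cat' WHERE a.ds > '2008-08-...';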

2.9 Other

2.9.1 Limit

Limit restricts the number of rows a query returns. The following query returns 5 rows picked at random from t1:

    SELECT * FROM t1 LIMIT 5

2.9.2 Top k

The following query finds the 5 sales representatives with the largest sales amounts. Running with a single reducer makes the SORT BY order a total order:

    SET mapred.reduce.tasks = 1
    SELECT * FROM sales SORT BY amount DESC LIMIT 5

2.9.3 REGEX Column Specification

A SELECT statement can use a regular expression to choose columns, for example to select all columns except ds and hr.

3. Hive Select

    SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM table_reference
    [WHERE where_condition]
    [GROUP BY col_list]
    [ CLUSTER BY col_list
      | [DISTRIBUTE BY col_list] [SORT BY col_list]
    ]
    [LIMIT number]

3.1 Group By

    groupByClause: GROUP BY groupByExpression (, groupByExpression)*
    groupByExpression: expression
    groupByQuery: SELECT expression (, expression)* FROM src groupByClause?

One scan of the source can feed several aggregations, each written to a table or to a directory:

    FROM pv_users
    INSERT OVERWRITE TABLE pv_gender_sum
      SELECT pv_users.gender, count(DISTINCT pv_users.userid)
      GROUP BY pv_users.gender
    INSERT OVERWRITE DIRECTORY ...
      SELECT pv_users.age, count(DISTINCT pv_users.userid)
      GROUP BY pv_users.age;

3.2 Order/Sort By

Order by syntax:

    colOrder: ( ASC | DESC )
    orderBy: ORDER BY colName colOrder? (',' colName colOrder?)*
    query: SELECT expression (',' expression)* FROM src orderBy

Sort by syntax:

    colOrder: ( ASC | DESC )
    sortBy: SORT BY colName colOrder? (',' colName colOrder?)*
    query: SELECT expression (',' expression)* FROM src sortBy
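The reason the Top k query sets a single reducer can be illustrated with a small simulation. SORT BY orders rows within each reducer, so the combined output is only a total order when there is one reducer. The Python sketch below is an illustration under assumptions: `sort_by` is a hypothetical helper, the sample rows are made up, and round-robin assignment stands in for however rows are actually distributed to reducers.

```python
def sort_by(rows, key, num_reducers, descending=True):
    """Simulate SORT BY: each reducer sorts only the rows it received."""
    shares = [[] for _ in range(num_reducers)]
    for i, row in enumerate(rows):
        shares[i % num_reducers].append(row)  # round-robin is an assumption
    out = []
    for share in shares:
        out.extend(sorted(share, key=key, reverse=descending))
    return out


def is_descending(values):
    return all(x >= y for x, y in zip(values, values[1:]))


sales = [("ann", 50), ("bob", 90), ("cy", 10), ("dee", 70),
         ("eve", 30), ("fay", 80), ("gus", 60)]

one_reducer = [amount for _, amount in sort_by(sales, lambda r: r[1], 1)]
two_reducers = [amount for _, amount in sort_by(sales, lambda r: r[1], 2)]

print(one_reducer[:5])              # the five largest amounts, in order
print(is_descending(two_reducers))  # False: each share is sorted, the whole is not
```

ORDER BY, by contrast, asks for a single global order regardless of how many reducers are configured.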

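Map-side aggregation for a query such as the following is switched on with hive.map.aggr:

    set hive.map.aggr=true;
    SELECT COUNT(*) FROM table2;

4. Hive Join

    join_table:
        table_reference JOIN table_factor [join_condition]
      | table_reference {LEFT|RIGHT|FULL} [OUTER] JOIN table_reference join_condition
      | table_reference LEFT SEMI JOIN table_reference join_condition

    table_reference:
        table_factor
      | join_table

    table_factor:
        tbl_name [alias]
      | table_subquery alias
      | ( table_references )

    join_condition:
        ON equality_expression ( AND equality_expression )*

Hive supports equality joins and outer joins, and it supports joins of more than 2 tables. Keep the following points in mind when writing join queries.

1. Only equality joins are supported. These are valid:

    SELECT a.* FROM a JOIN b ON (a.id = b.id)
    SELECT a.* FROM a JOIN b ON (a.id = b.id AND a.department = b.department)

while a join on a non-equality condition between a.id and b.id is not.

2. More than 2 tables can be joined:

    SELECT a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key2)

If every table in the join uses the same join key, the join is converted into a single map/reduce task. For example,

    SELECT a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key1)

becomes one map/reduce task because only b.key1 is used as the join key, whereas

    SELECT a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key2)

becomes 2 map/reduce tasks, because b.key1 is used in the first join condition and b.key2 in the second.

3. In each map/reduce task of a join, the reducer buffers the rows of every table in the join sequence except the last one, and streams the last table through. In

    SELECT a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key1)

all tables use the same join key (1 map/reduce task). The reducer buffers the rows of a and b and computes a join result each time it reads a row of c. Similarly,

    SELECT a.val, b.val, c.val FROM a JOIN b ON (a.key = b.key1) JOIN c ON (c.key = b.key2)

uses 2 map/reduce tasks. The first buffers a and streams b; the second buffers the result of the first and streams c.

4. LEFT, RIGHT, and FULL OUTER control what happens to rows without a match. For example,

    SELECT a.val, b.val FROM a LEFT OUTER JOIN b ON (a.key=b.key)

outputs a row for every row of a. The output is a.val, b.val when a.key = b.key, and a.val, NULL when no b.key equals a.key. In "FROM a LEFT OUTER JOIN b", a is to the left of b, so all rows of a are kept.

The join happens before the WHERE clause. To restrict the output of a join, the condition can go either in the WHERE clause or in the join clause, and the difference matters for partitioned tables:

    SELECT a.val, b.val FROM a LEFT OUTER JOIN b ON (a.key=b.key)
    WHERE a.ds='2009-07-07' AND b.ds='2009-07-07'

joins a to b (OUTER JOIN) and lists a.val and b.val. When b has no row for a row of a, every column of b is NULL, including ds. The WHERE clause therefore removes all rows of a that have no matching join key in b, and the LEFT OUTER part of the query has no effect. The fix is to put the conditions in the OUTER JOIN itself:

    SELECT a.val, b.val FROM a LEFT OUTER JOIN b
    ON (a.key=b.key AND b.ds='2009-07-07' AND a.ds='2009-07-07')

The same reasoning applies to RIGHT and FULL joins.

Joins are not commutative. Whether LEFT or RIGHT, joins are left-associative. In

    SELECT a.val1, a.val2, b.val, c.val
    FROM a
    JOIN b ON (a.key = b.key)
    LEFT OUTER JOIN c ON (a.key = c.key)

a is first joined to b, discarding every row without a matching join key, and the intermediate result is then joined to c. If a key exists in a and c but not in b, the whole row (including a.val1, a.val2, and a.key) is dropped in the a JOIN b step. If the second join preserves the rows of c (a RIGHT OUTER JOIN), the result for that key is NULL, NULL, NULL, c.val.

5. LEFT SEMI JOIN is a more efficient implementation of IN/EXISTS subqueries. The Hive release described in these notes does not implement IN/EXISTS subqueries, so such queries can be rewritten with LEFT SEMI JOIN.

    SELECT a.key, a.value
    FROM a
    WHERE a.key in
      (SELECT b.key
       FROM B);

can be rewritten as:

    SELECT a.key, a.val
    FROM a LEFT SEMI JOIN b on (a.key = b.key)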
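The effect of placing partition filters in WHERE versus ON, and the behaviour of LEFT SEMI JOIN, can be checked with a small in-memory model. This Python sketch is not how Hive executes joins; `left_outer_join`, `left_semi_join`, and the sample rows are hypothetical and exist only to show the row-level semantics described above.

```python
def left_outer_join(a_rows, b_rows, on):
    """Keep every row of a; pair it with None when no row of b satisfies on(a, b)."""
    out = []
    for a in a_rows:
        matches = [b for b in b_rows if on(a, b)]
        if matches:
            out.extend((a, b) for b in matches)
        else:
            out.append((a, None))
    return out


def left_semi_join(a_rows, b_rows, on):
    """Keep each row of a at most once, if at least one row of b matches."""
    return [a for a in a_rows if any(on(a, b) for b in b_rows)]


DS = "2009-07-07"
a = [{"key": 1, "ds": DS, "val": "a1"}, {"key": 2, "ds": DS, "val": "a2"}]
b = [{"key": 1, "ds": DS, "val": "b1"}, {"key": 1, "ds": DS, "val": "b1x"}]

# Filters in WHERE: b.ds is NULL (None) for unmatched rows, so they are removed.
in_where = [
    (x["val"], y["val"])
    for x, y in left_outer_join(a, b, lambda x, y: x["key"] == y["key"])
    if x["ds"] == DS and y is not None and y["ds"] == DS
]

# Filters in ON: unmatched rows of a survive with NULL on the b side.
in_on = [
    (x["val"], y["val"] if y else None)
    for x, y in left_outer_join(
        a, b, lambda x, y: x["key"] == y["key"] and x["ds"] == DS and y["ds"] == DS
    )
]

semi = [x["val"] for x in left_semi_join(a, b, lambda x, y: x["key"] == y["key"])]

print(in_where)
print(in_on)
print(semi)
```

Note that the semi join returns a1 once even though two rows of b match it, which is the IN/EXISTS behaviour an ordinary inner join would not give.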

5 Hive parameter settings

When developing a Hive application you inevitably need to set Hive parameters. Parameters can be set on the command line (bin/hive -hiveconf ...) or by declaration inside HQL with the SET keyword (set ...). Some settings are given in the create table statement itself, through WITH SERDEPROPERTIES (...) and STORED AS ....

6. Hive functions

The available functions can be listed and described with:

    SHOW FUNCTIONS;
    DESCRIBE FUNCTION <function_name>;

6.1.1 Relational operators

    A = B          TRUE if expression A is equal to expression B, otherwise FALSE.
    A == B         Fails because of invalid syntax. SQL uses =, not ==.
    A <> B         NULL if A or B is NULL, TRUE if expression A is NOT equal to expression B, otherwise FALSE.
    A < B          NULL if A or B is NULL, TRUE if expression A is less than expression B, otherwise FALSE.
    A <= B         NULL if A or B is NULL, TRUE if expression A is less than or equal to expression B, otherwise FALSE.
    A > B          NULL if A or B is NULL, TRUE if expression A is greater than expression B, otherwise FALSE.
    A >= B         NULL if A or B is NULL, TRUE if expression A is greater than or equal to expression B, otherwise FALSE.
    A IS NULL      (all types) TRUE if expression A evaluates to NULL, otherwise FALSE.
    A IS NOT NULL  (all types) FALSE if expression A evaluates to NULL, otherwise TRUE.
    A LIKE B       NULL if A or B is NULL, TRUE if string A matches the SQL simple regular expression B, otherwise FALSE. The comparison is done character by character. The _ character in B matches any character in A (similar to . in posix regular expressions), while the % character in B matches an arbitrary number of characters in A (similar to .* in posix regular expressions). For example, 'foobar' like 'foo' evaluates to FALSE, whereas 'foobar' like 'foo___' evaluates to TRUE and so does 'foobar' like 'foo%'.
    A RLIKE B      NULL if A or B is NULL, TRUE if string A matches the Java regular expression B (see Java regular expressions syntax), otherwise FALSE. For example, 'foobar' rlike '^f.*r$' evaluates to TRUE. The notes list 'foobar' rlike 'foo' as FALSE; in Hive releases where RLIKE matches any substring, it evaluates to TRUE.
    A REGEXP B     Same as RLIKE.

6.1.2 Arithmetic operators

All of these return a number type. If any operand is NULL, the result is NULL.

    A + B    Gives the result of adding A and B. The type of the result is the same as the common parent (in the type hierarchy) of the types of the operands. For example, since every integer is a float, float is a containing type of integer, so the + operator on a float and an int results in a float.
    A - B    Gives the result of subtracting B from A. The type of the result is the same as the common parent (in the type hierarchy) of the types of the operands.
    A * B    Gives the result of multiplying A and B. The type of the result is the same as the common parent (in the type hierarchy) of the types of the operands. Note that if the multiplication causes overflow, you will have to cast one of the operands to a type higher in the type hierarchy.
    A / B    Gives the result of dividing A by B. The result is a double.
    A % B    Gives the remainder resulting from dividing A by B. The type of the result is the same as the common parent (in the type hierarchy) of the types of the operands.
    A & B    Gives the result of bitwise AND of A and B. The type of the result is the same as the common parent (in the type hierarchy) of the types of the operands.
    A | B    Gives the result of bitwise OR of A and B. The type of the result is the same as the common parent (in the type hierarchy) of the types of the operands.
    A ^ B    Gives the result of bitwise XOR of A and B. The type of the result is the same as the common parent (in the type hierarchy) of the types of the operands.
    ~A       Gives the result of bitwise NOT of A. The type of the result is the same as the type of A.
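The LIKE rules above (_ matches exactly one character, % matches any run of characters, and the whole string must match) can be expressed by translating the pattern into a regular expression. The following Python sketch is a stand-in for illustration, not Hive's implementation: `like_to_regex` and `sql_like` are hypothetical names, and escape sequences inside the pattern are deliberately ignored.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into an equivalent regular expression."""
    out = []
    for ch in pattern:
        if ch == "_":
            out.append(".")       # exactly one character
        elif ch == "%":
            out.append(".*")      # any number of characters, including none
        else:
            out.append(re.escape(ch))
    return "".join(out)


def sql_like(value, pattern):
    """Return None for NULL input, else whether the whole value matches."""
    if value is None or pattern is None:
        return None
    return re.fullmatch(like_to_regex(pattern), value, flags=re.DOTALL) is not None


print(sql_like("foobar", "foo"))
print(sql_like("foobar", "foo___"))
print(sql_like("foobar", "foo%"))
```

Using a full match rather than a search is what makes 'foobar' like 'foo' false: the pattern has to account for every character of the value.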

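Date functions

    from_unixtime(bigint unixtime)     Converts seconds since the unix epoch to a time string; from_unixtime(0) falls on 1970-01-01.
    to_date(string timestamp)          Returns the date part: to_date('1970-01-01 00:00:00') = '1970-01-01'.
    year(string date)                  year('1970-01-01 00:00:00') = 1970, year('1970-01-01') = 1970.
    datediff(string enddate, string startdate)
    date_add(string startdate, int days)    date_add('2008-12-31', 1) = '2009-01-01'.
    date_sub(string startdate, int days)    date_sub('2008-12-31', 1) = '2008-12-30'.

Complex type constructors

    map(key1, value1, ...)        Creates a map with the given key/value pairs.
    struct(val1, val2, val3, ...) Creates a struct with the given field values. Struct field names will be col1, col2, ...
    array(val1, val2, ...)        Creates an array with the given elements.

Conditional functions

    if(boolean testCondition, T valueTrue, T valueFalseOrNull)
        Returns valueTrue when testCondition is true, and valueFalseOrNull when testCondition is false or NULL.
    COALESCE(T v1, T v2, ...)
        Returns the first v that is not NULL, or NULL if all of them are NULL.
    CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END
        If a = b, returns c; if a = d, returns e; otherwise returns f.
    CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END
        If a is true, returns b; if c is true, returns d; otherwise returns e.

String functions

The following built-in String functions are supported in hive.

    substr(string A, int start), substring(string A, int start)
        Returns the substring from start to the end, for example substr('foobar', 4) = 'bar'.
    substr(string A, int start, int len), substring(string A, int start, int len)
        Returns a substring of limited length, for example substr('foobar', 4, 1) = 'b'.
    trim(string A)
    regexp_replace(string A, string B, string C)
        Returns the string resulting from replacing all substrings in A that match the Java regular expression B (see Java regular expressions syntax) with C. For example, regexp_replace('foobar', 'oo|ar', '') returns 'fb'. Note that some care is necessary in using predefined character classes: using '\s' as the second argument will match the letter s; '\\s' is necessary to match whitespace.
    regexp_extract(string subject, string pattern, int index)
        regexp_extract('foothebar', 'foo(.*?)(bar)', 2) = 'bar'. The same care with predefined character classes applies.
    parse_url(string urlString, string partToExtract)
        Parses a URL string. The options for partToExtract include HOST, PATH, QUERY, REF, PROTOCOL, FILE, AUTHORITY. With QUERY a key can be given in the form QUERY:<key>, for example QUERY:k1, to return the corresponding value.
    get_json_object(string json_string, string path)
        Extracts a json object from a json string. Returns NULL if the source json string is invalid. The path argument supports a subset of JSONPath, with the following tokens:
            $  : Root object
            [] : Subscript operator for array
            *  : Wildcard for []
            .  : Child operator
    repeat(string str, int n)
        Returns a string that repeats str n times.
    ascii(string str)
        Returns the ascii code of the first character of str.
    lpad(string str, int len, string pad)
        Pads str on the left to length len, using pad as the filler.
    rpad(string str, int len, string pad)
        Pads str on the right to length len, using pad as the filler.
    split(string str, string pat)
        Splits str around pat. For example, split('foobar', 'o')[2] = 'bar': the two adjacent o characters produce an empty string between them, so the pieces are 'f', '', and 'bar'.
    find_in_set(string str, string strList)
        Returns the first occurrence of str in strList where strList is a comma-delimited string. Returns null if either argument is null. Returns 0 if the first argument contains any commas.

explode

explode cannot be combined with other expressions in the SELECT list, so a query such as SELECT pageid, explode(adid_list) AS myCol ... is not supported, and it cannot be used with GROUP BY / CLUSTER BY / DISTRIBUTE BY / SORT BY.

6.2.1 Example

    2. insert OVERWRITE table test2 select * from (select array(1,2,3) from a union all select array(7,8,9) from d) c;
    3. hive> SELECT explode(myCol) AS myNewCol FROM test2;
       1
       2
       3
       7
       8
       9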
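A few of the string function examples above can be reproduced with Python stand-ins, which makes the less obvious results (such as the split example) easy to verify. These helpers are not Hive code: they use Python's `re` module rather than Java regular expressions, the function names simply mirror the Hive ones, and the no-match return value of `regexp_extract` is a placeholder because the notes do not cover it.

```python
import re


def regexp_replace(a, b, c):
    """Replace every substring of a that matches pattern b with c."""
    return re.sub(b, c, a)


def regexp_extract(subject, pattern, index):
    """Return capture group `index` of the first match (None here if no match)."""
    match = re.search(pattern, subject)
    return match.group(index) if match else None


def split(s, pat):
    """Split s around every match of the pattern pat."""
    return re.split(pat, s)


def find_in_set(s, str_list):
    """1-based position of s in a comma-delimited list, following the rules above."""
    if s is None or str_list is None:
        return None
    if "," in s:
        return 0
    items = str_list.split(",")
    return items.index(s) + 1 if s in items else 0


print(regexp_replace("foobar", "oo|ar", ""))
print(regexp_extract("foothebar", "foo(.*?)(bar)", 2))
print(split("foobar", "o"))
print(find_in_set("ab", "abc,b,ab,c,def"))
```

The split output shows the empty string between the two o characters, which is why index 2 of the result is 'bar'.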
