1、应用场景:面向结构化数据,即:结构清晰的数据2、CLASS_PATH有以下几种选择:选择一:CSV【简单类型】数据呈现:"1","2","Football""2","2","Soccer""3","2","Baseball & Softball"代码:createtableifnotexists TABLE_NAME(id string,page string,word string)row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'with serdeproperties('separatorChar'=',','quoteChar'='"','escapeChar'='\\')选择二:regex【正则】数据呈现:123,张三,16853210211116,true,26238.5,阅读;跑步;唱歌,java:98;mysql:54,province:南京;city:江宁代码:createtableifnotexists TABLE_NAME(id int,name string,timebigint,isPartyMember boolean,hobby array<string>,scores map<string,int>,address struct<province:string,city:string>)row format serde 'org.apache.hadoop.hive.serde2.RegexSerDe'with serdeproperties('input.regex'='^(//d+),(.*?),(//d+),(true|false),(\\d+\\.?\\d+?)$')选择三:JsonSerDe数据呈现:{"name":"henry","age":22,"gender":"male","phone":"18014499655"}代码:createtableifnotexists json(name string,age int,gender string,phone string)row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
*额外处理
1、store【存储】基本语法:stored as'存储格式'存储格式:textfile✔,orc,parquet,sequencefile,...案例:stored as textfile2、tblproperties【表属性】(通用):案例【实际情况具体分析】:tblproperties('skip.header.line.count'='1' 【跳过表头,即:第一行】...)
案例一:/*1|henry|1.81|1995-03-18|江苏,南京,玄武,北京东路68号|logicjava:88,javaoop:76,mysql:80,ssm:82|beauty,money,joke2|arill|1.59|1996-7-30|安徽,芜湖,南山,西湖东路68号|logicjava:79,javaoop:58,mysql:65,ssm:85|beauty,power,sleeping3|mary|1.72|1995-09-02|山东,青岛,长虹,天山东路68*/droptableifexists students;createtableifnotexists students(number int,name string,height decimal(3,2),birthday date,house struct<province:string,city:string,district:string,street:string>,scores map<string,int>,hobby array<string>)row format delimitedfieldsterminatedby"|"collection items terminatedby","map keysterminatedby":"stored as textfile;loaddata inpath '/zhou/students.txt'overwrite intotable zhou.students;案例二:/*user_id,auction_id,cat_id,cat1,property,buy_mount,day
786295544,41098319944,50014866,50022520,21458:86755362;13023209:3593274;10984217:21985;122217965:3227750;21477:28695579;22061:30912;122217803:3230095,2,123434123*/droptableifexists sam_mum_baby_trade;create external tableifnotexists sam_mum_baby_trade(user_id bigint,auction_id bigint,cat_id bigint,cat1 bigint,property map<bigint,bigint>,buy_mount int,daybigint)row format delimitedfieldsterminatedby","collection items terminatedby";"map keysterminatedby":"stored as textfiletblproperties ('skip.header.line.count'='1');loaddata inpath '/zhou/sam_mum_baby_trade.csv'intotable zhou.sam_mum_baby_trade;案例三:/*"1","2","Football""2","2","Soccer""3","2","Baseball & Softball"*/droptableifexists categories;createtableifnotexists categories(id string,page string,word string)row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'with serdeproperties('separatorChar'=',','quoteChar'='"','escapeChar'='\\')stored as textfile;loaddata inpath '/zhou/categories.csv'overwrite intotable zhou.categories;select*from categories;案例四:/*{"name":"henry","age":22,"gender":"male","phone":"18014499655"}*///Jsondroptableifexists json;createtableifnotexists json(name string,age int,gender string,phone string)row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'stored as textfile;loaddata inpath '/zhou/json.log'overwrite intotable zhou.json;案例五:/*125;男;2015-9-7 1:52:22;1521.84883;男;2014-9-18 5:24:42;6391.45652;女;2014-5-4 5:56:45;9603.79*/create external tableifnotexists test1w(user_id int,user_gender string,order_time timestamp,order_amount decimal(6,2))row format serde 'org.apache.hadoop.hive.serde2.RegexSerDe'with serdeproperties('input.regex'='(\\d+);(.*?);(\\d{4}-\\d{1,2}-\\d{1,2} \\d{1,2}:\\d{1,2}:\\d{1,2});(\\d+\.?\\d+?)')stored as textfilelocation '/zhou/test1w';select*from test1w;
二:hive建表【高阶语法】
1:CTAS
【本质】:在原有表的基础上查询并创建新表
基本语法:create table if not exists NEW_TABLE_NAME as select ... from OLD_TABLE_NAME ...
案例:原有的表:hive_ext_regex_test1w语句:create table if not exists hive_ext_test_before2015 asselect * from hive_ext_regex_test1wwhere year(order_time)<=2015;