When I create a table in Hive, I get an error saying that the table name may not contain a dot (state=42000,code=40000).
How can I solve this problem?
CREATE EXTERNAL TABLE `ods.a2`(
`key` string COMMENT 'k',
`value` string COMMENT 'v')
COMMENT '注释'
ROW FORMAT SERDE
'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES (
'field.delim'=',,',
'serialization.format'=',,')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs:///user/hive/warehouse/ods.db/a2';
Error message:
Error: Error while compiling statement: FAILED: SemanticException Line 1:22 Table or database name may not contain dot(.) character 'ods.a2' (state=42000,code=40000)
Quote the database name and the table name separately: CREATE EXTERNAL TABLE `ods`.`a2`
If any component of a multi-part name requires quoting, quote each component individually rather than quoting the name as a whole. For example, write `my-table`.`my-column`, not `my-table.my-column`.
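If the table names come from a script, the quoting rule above can be applied mechanically. The following is a minimal Python sketch, not part of Hive: `quote_hive_name` is a hypothetical helper that assumes the dot only ever separates name parts and that no part contains a backtick.

```python
def quote_hive_name(qualified_name: str) -> str:
    """Backtick-quote each dot-separated part of a name on its own."""
    parts = qualified_name.split(".")
    for part in parts:
        # Assumption: empty parts and embedded backticks are rejected
        # rather than escaped, to keep the sketch simple.
        if not part or "`" in part:
            raise ValueError(f"unsupported identifier part: {part!r}")
    return ".".join(f"`{part}`" for part in parts)


print(quote_hive_name("ods.a2"))  # prints `ods`.`a2`
```

The result can then be placed directly after CREATE EXTERNAL TABLE when the statement is assembled.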
I want to create an external table from a .csv file that I uploaded to the server earlier.
In Beeline (the shell for Hive), I tried running this script:
CREATE EXTERNAL TABLE c_fink_category_mapping (
trench_code string,
fink_code string
)
row format delimited fields terminated by '\073' stored as textfile
location '/appl/trench/dev/data/in/main/daily_wf/fink_category_mapping'
TBLPROPERTIES ('serialization.null.format' = '')
;
This creates the table without any error, but the table itself is empty, even though my text file is populated with data.
Help would be appreciated.
First, check if the location path is correct.
Then try with this configuration:
CREATE EXTERNAL TABLE c_fink_category_mapping (
trench_code string,
fink_code string
)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
'quoteChar'='"',
'separatorChar'=',')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'/appl/trench/dev/data/in/main/daily_wf/fink_category_mapping';
The answer above looks correct. It creates the table with a comma as the delimiter, which should parse a comma-separated file and populate the table with its data. If your data uses a different separator, set separatorChar accordingly, for example '\t' for tabs. Note that the '\073' in your original DDL is the octal escape for a semicolon, so a semicolon-separated file would need 'separatorChar'=';'.
I want to create a view with the following code, but I run into an error that I can't resolve.
The code:
create view default.table_all as
select *
from default.table_a
union select *
from default.table_b;
The Error:
Error while compiling statement: FAILED: SemanticException 3:20 Schema of both sides of union should match: Column column_42 is of type double on first table and type string on second table. Error encountered near token 'table_b'
Firstly, column_42 has the same data type, double, in both table_a and table_b (I have checked this numerous times). Secondly, when I union only column_42:
create view default.table_all as
select column_42
from default.table_a
union select column_42
from default.table_b;
It works like a charm and does not show any errors whatsoever.
I really can't figure out the problem.
@Mureinik, here are the DDLs of the two tables:
DDL - table_a:
createtab_stmt
CREATE TABLE `default.table_a`(`column_42` double, ...)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://ITEG-EG-DC2-DR01-ns/user/hive/warehouse/table_a'
TBLPROPERTIES (
'transient_lastDdlTime'='1635342670')
DDL - table_b:
createtab_stmt
CREATE TABLE `default.table_b`(`column_42` double,...)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://ITEG-EG-DC2-DR01-ns/user/hive/warehouse/table_b'
TBLPROPERTIES (
'transient_lastDdlTime'='1635410688')
Thank you very much!
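No answer to this question is recorded here. One possible cause, offered as an assumption and not as a confirmed diagnosis: UNION pairs columns by position, and select * expands each table's columns in that table's own order. A type mismatch can therefore be reported under the left table's column name even when the same-named columns agree. Here is a small Python sketch that compares two schemas by position. The column lists are hypothetical stand-ins for DESCRIBE output, since the real DDLs above are elided with "...".

```python
from typing import List, Optional, Tuple

Column = Tuple[str, str]  # (column name, Hive type)


def first_positional_mismatch(
    left: List[Column], right: List[Column]
) -> Optional[Tuple[int, Column, Column]]:
    """Return (index, left column, right column) for the first position
    whose types differ, or None if every compared position agrees."""
    for index, (l_col, r_col) in enumerate(zip(left, right)):
        if l_col[1].lower() != r_col[1].lower():
            return index, l_col, r_col
    return None


# Hypothetical schemas: the same columns, declared in a different order.
table_a = [("column_41", "string"), ("column_42", "double")]
table_b = [("column_42", "double"), ("column_41", "string")]
mismatch = first_positional_mismatch(table_a, table_b)
```

If a mismatch shows up, listing the columns explicitly and in the same order on both sides of the union, instead of using select *, would rule this cause in or out.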
I'm building some automated processes to create tables on Cloudera Hive.
For that I am using the show create table statement, which gives me (for example) the following DDL:
CREATE TABLE clsd_core.factual_player ( player_name STRING, number_goals INT ) PARTITIONED BY ( player_name STRING ) WITH SERDEPROPERTIES ('serialization.format'='1') STORED AS PARQUET LOCATION 'hdfs://nameservice1/factual_player'
What I need is to run the DDL in a different place to create a table with the same name.
However, when I run that code I get the following error:
Error while compiling statement: FAILED: ParseException line 1:123 missing EOF at 'WITH' near ')'
When I manually remove the part "WITH SERDEPROPERTIES ('serialization.format'='1')", the table is created successfully.
Is there a better way to retrieve the table DDLs without the SERDE information?
The first issue in your DDL is that a partition column should not be listed in the column spec, only in PARTITIONED BY. A partition is a directory named partition_column=value, and the column is not stored in the table's data files, only in the partition directory name. If you want the partition column to be in the data files as well, it must be given a different name.
The second issue is that SERDEPROPERTIES is part of the SERDE specification: if you do not specify a SERDE, there should be no SERDEPROPERTIES. See this manual: StorageFormat and SerDe
Fixed DDL:
CREATE TABLE factual_player (number_goals INT)
PARTITIONED BY (player_name STRING)
STORED AS PARQUET
LOCATION 'hdfs://nameservice1/factual_player';
STORED AS PARQUET already implies the SERDE, INPUTFORMAT and OUTPUTFORMAT.
If you want to specify the SERDE with its properties, use this syntax:
CREATE TABLE factual_player(number_goals int)
PARTITIONED BY (player_name string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
WITH SERDEPROPERTIES ('serialization.format'='1') --I believe you really do not need this
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 'hdfs://nameservice1/factual_player'
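For the automated case in the question, the manual removal step could be scripted. The following Python sketch is a hypothetical post-processing helper, not a Hive function. It assumes the property values contain no parentheses, and it leaves the DDL untouched whenever a ROW FORMAT SERDE clause is present, because SERDEPROPERTIES is valid there.

```python
import re

# Matches "WITH SERDEPROPERTIES (...)" plus the whitespace before it.
# Assumption: no ")" appears inside the property list.
_SERDE_PROPS = re.compile(r"\s*WITH\s+SERDEPROPERTIES\s*\([^)]*\)", re.IGNORECASE)
_ROW_FORMAT_SERDE = re.compile(r"ROW\s+FORMAT\s+SERDE", re.IGNORECASE)


def drop_orphan_serdeproperties(ddl: str) -> str:
    """Remove WITH SERDEPROPERTIES (...) when no ROW FORMAT SERDE precedes it."""
    if _ROW_FORMAT_SERDE.search(ddl):
        return ddl  # the clause belongs to an explicit SERDE; keep it
    return _SERDE_PROPS.sub("", ddl)
```

This only addresses the parse error. The duplicated partition column from the first issue would still need to be removed from the column list.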
I used the following syntax to create the Hive table:
Create table tablename (ColumnName Type)
row format SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with SERDEPROPERTIES ("separatorChar" = "\;")
lines terminated by '\n'
tblproperties ("skip.header.line.count" = "1");
But I am getting an error message
FAILED: ParseException line 1:361 missing EOF at 'lines' near ')'
I'm not sure what I'm doing wrong. Please help!
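LINES TERMINATED BY is part of ROW FORMAT DELIMITED and cannot follow ROW FORMAT SERDE, which is why the parser stops at 'lines'. Remove that clause. If you have a single column, you don't need a separatorChar. If you have multiple fields separated by ';', you don't need to escape the ';':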
SERDEPROPERTIES ("separatorChar" = ";")
STORED AS TEXTFILE
LOCATION '/path/yourfile.csv'
This article shows that we can use a multi-character delimiter in Hive.
But can we also specify the NULL value?
I tried the following Hive SQL, which returns an error:
CREATE TABLE temp
( a STRING, b STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="##")
NULL DEFINED AS 'NULL'
STORED AS TEXTFILE;
The error:
Error: Error while compiling statement: FAILED: ParseException line 5:0 missing EOF at 'NULL' near ')' (state=42000,code=40000)
NULL DEFINED AS 'NULL' is only available with ROW FORMAT DELIMITED. Here we are using ROW FORMAT SERDE, so the null representation has to be passed explicitly through the serialization.null.format property.
You can use the query below, which sets serialization.null.format:
CREATE TABLE temp
( a STRING, b STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="##",'serialization.null.format'='NULL')
STORED AS TEXTFILE;
For more information, refer to the Hive DDL reference guide and the MultiDelimitSerDe source code.
HIVE DDL GUIDE:
row_format
: DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
[MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
[NULL DEFINED AS char] -- (Note: Available in Hive 0.13 and later)
| SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]
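The two branches of this grammar can be illustrated with a small generator. This is a Python sketch under stated assumptions: `row_format_clause` is a hypothetical helper, and it refuses values that contain a single quote or a backslash instead of trying to escape them. With no SerDe it emits the DELIMITED form with NULL DEFINED AS. With a SerDe it moves the null marker into serialization.null.format.

```python
from typing import Dict, Optional


def _quote(value: str) -> str:
    # Assumption: values needing escapes are rejected, not escaped.
    if "'" in value or "\\" in value:
        raise ValueError(f"value needs escaping, not handled here: {value!r}")
    return f"'{value}'"


def row_format_clause(
    null_format: str,
    serde: Optional[str] = None,
    serde_properties: Optional[Dict[str, str]] = None,
) -> str:
    """Build a ROW FORMAT clause that carries the given null marker."""
    if serde is None:
        # DELIMITED branch: NULL DEFINED AS is allowed here.
        return f"ROW FORMAT DELIMITED NULL DEFINED AS {_quote(null_format)}"
    # SERDE branch: the null marker must travel as a SerDe property.
    props = dict(serde_properties or {})
    props["serialization.null.format"] = null_format
    rendered = ", ".join(f"{_quote(k)}={_quote(v)}" for k, v in props.items())
    return f"ROW FORMAT SERDE {_quote(serde)} WITH SERDEPROPERTIES ({rendered})"
```

Calling it with the MultiDelimitSerDe class name and {"field.delim": "##"} yields the same ROW FORMAT clause as the corrected query above, apart from the quote style of the property names.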