Getting Exception while using Timeline column as Hive Partition Field - sql

I am trying to load data from normal table to Hive partitioned table.
Here is my normal table syntax:
create table x(name string, date1 string);
Here is my new partitioned table syntax:
create table y (name string, date1 string) partitioned by (timestamp1 string);
Here is how I am trying to load data into y:
insert into table y PARTITION(SUBSTR(date1,0,2)) select name, date1 from x;
Here is my Exception:
FAILED: ParseException line 1:39 missing ) at '(' near ',' in column name
line 1:51 cannot recognize input near '0' ',' '2' in column name

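The PARTITION clause takes only partition column names (optionally with constant values), not expressions such as SUBSTR(...). Use dynamic partitioning and compute the partition value as the last column of the SELECT: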
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
insert into table y PARTITION(timestamp1)
select name, date1, SUBSTR(date1,0,2) as timestamp1 from x;
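To see what the dynamic-partition insert does with each row, here is a minimal Python sketch that emulates it outside Hive. It assumes Hive's SUBSTR is 1-based and treats a start position of 0 like 1, so SUBSTR(date1,0,2) returns the first two characters. The helpers hive_substr and route_rows are hypothetical names, and the sample rows are made up for illustration.

```python
from collections import defaultdict


def hive_substr(value, pos, length):
    """Simplified emulation of Hive's SUBSTR for pos >= 0 (assumption: 0 acts like 1)."""
    start = max(pos - 1, 0)  # Hive positions are 1-based
    return value[start:start + length]


def route_rows(rows):
    """Group (name, date1) rows by the value a dynamic partition insert would use.

    The partition value is the last expression of the SELECT:
    SUBSTR(date1,0,2) as timestamp1.
    """
    partitions = defaultdict(list)
    for name, date1 in rows:
        partitions[hive_substr(date1, 0, 2)].append((name, date1))
    return dict(partitions)


# Made-up sample rows standing in for table x.
rows = [("a", "12-01-2018"), ("b", "12-15-2018"), ("c", "11-30-2018")]
for key, members in sorted(route_rows(rows).items()):
    print(f"timestamp1={key}: {members}")
```

Each distinct key corresponds to one partition directory that Hive would create for table y.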

Related

Copying the structure of temp hive table to new table and adding additional table properties

I want to copy the structure of a full-load temp table and add additional table properties such as partitioned by (partition_col) and Format='ORC'.
Temp table :
Create table if not exists tmp.temp_table( id int,
name string,
datestr string )
temp table got created.
Final table :
CREATE TABLE IF NOT EXISTS tmp.{final_table_name} (
LIKE tmp.temp_table
)
WITH (
FORMAT = 'ORC'
partitioned by('datestr')
)
But I get this error: "Error: Error while compiling statement: FAILED: ParseException line 1:63 missing EOF at 'WITH' near 'temp_table' (state=42000,code=40000)"
Is there a way to achieve this?
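Do not use LIKE here. Use CREATE TABLE AS SELECT (CTAS) with a predicate that is never true, such as where 1=2, so that no rows are copied. In Presto/Athena syntax the WITH clause must come before AS, and the partition columns are given as partitioned_by = ARRAY[...]:
CREATE TABLE IF NOT EXISTS tmp.{final_table_name}
WITH (
FORMAT = 'ORC',
partitioned_by = ARRAY['datestr']
)
AS select * from tmp.temp_table where 1=2
LIKE creates an empty table with exactly the same definition and does not let you add properties. CTAS takes the column order and data types from the SELECT and combines them with your table properties; because 1=2 is never true, the new table is empty. Note that the error message in the question comes from Hive, which does not accept WITH (...) at all. There the equivalent clauses are PARTITIONED BY (datestr) and STORED AS ORC, placed before AS, and as far as I know a partitioned CTAS needs Hive 3.2.0 or later.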

How to create partitioned table from other tables in Amazon Athena?

I am looking to create a table from an existing table in Amazon Athena. The existing table is partitioned on partition_0, partition_1, and partition_2 (all strings) and I would like this partition to carry over. Here is my code:
CREATE TABLE IF NOT EXISTS newTable
AS
Select x, partition_0, partition_1, partition_2
FROM existingTable T
PARTITIONED BY (partition_0 string, partition_1 string, partition_2 string)
Trying to run this gives me an error at the FROM line, saying "mismatched input 'by'. expecting: '(', ',',".... Status code: 400; error code:invalidrequestexception
Not sure what syntax I am missing here.
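In Athena, the table properties go in a WITH clause before AS, and the partition columns are listed in partitioned_by (they must also be the last columns in the SELECT):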
CREATE TABLE new_table
WITH (
format = 'parquet',
external_location = 's3://example-bucket/output/',
partitioned_by = ARRAY['partition_0', 'partition_1', 'partition_2'])
AS SELECT * FROM existing_table
See the documentation for more examples: https://docs.aws.amazon.com/athena/latest/ug/ctas-examples.html#ctas-example-partitioned
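If you generate such statements from a script, a small helper can keep the partition columns at the end of the SELECT list. This is only a sketch: build_ctas is a hypothetical function, it does no quoting or validation of identifiers, and the table and bucket names are the placeholders from the example above.

```python
def build_ctas(new_table, source_table, data_columns, partition_columns,
               fmt="parquet", external_location=None):
    """Return an Athena CTAS statement with the partition columns selected last."""
    properties = [f"format = '{fmt}'"]
    if external_location:
        properties.append(f"external_location = '{external_location}'")
    quoted = ", ".join(f"'{col}'" for col in partition_columns)
    properties.append(f"partitioned_by = ARRAY[{quoted}]")

    # Athena expects the partition columns at the end of the SELECT list.
    select_list = ", ".join(list(data_columns) + list(partition_columns))

    lines = [
        f"CREATE TABLE {new_table}",
        "WITH (",
        ",\n".join(f"  {prop}" for prop in properties),
        ")",
        f"AS SELECT {select_list} FROM {source_table}",
    ]
    return "\n".join(lines)


sql = build_ctas(
    "new_table",
    "existing_table",
    ["x"],
    ["partition_0", "partition_1", "partition_2"],
    external_location="s3://example-bucket/output/",
)
print(sql)
```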

I was not able to insert data(which is declared by set) into day partitioned table

I have a table that is partitioned by day. I tried inserting data after setting
set hivevar:ds=2018-12-01;
and then using
INSERT OVERWRITE table XTABLE partition(day='${hivevar:ds}')
which works fine. But when I do the following
set hivevar:pd=date_add('${hivevar:ds}',-1);
and then
INSERT OVERWRITE table XTABLE partition(day='${hivevar:pd}')
it throws an error. I think the problem is caused by the extra quotes, but I cannot work out how to solve it.
The error is:
cannot recognize input near ''date_add('' '2018' '-' in constant
MYCODE:
set hivevar:ds=2018-12-01;
set hivevar:pd=date_add('${hivevar:ds}',-1);
set hive.exec.dynamic.partition.mode=nonstrict;
CREATE TABLE IF NOT EXISTS XTABLE (emp_id BIGINT, start_time STRING, end_time STRING)
PARTITIONED BY(day STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
--THIS IS WORKING FINE
INSERT OVERWRITE table XTABLE partition(day='${hivevar:ds}')
select distinct d.emp_id, d.start_time, d.end_time from
(
select emp_id, start_time, end_time from XTABLE where day='${hivevar:ds}'
) d;
--THIS IS THROWING AN ERROR cannot recognize input near ''date_add('' '2018' '-' in constant
--SEEMS PROBLEM IS WHILE SETTING THE VARIABLE
INSERT OVERWRITE table XTABLE partition(day='${hivevar:pd}')
select distinct d.emp_id, d.start_time, d.end_time from
(
select emp_id, start_time, end_time from XTABLE where day='${hivevar:pd}'
) d;
If it succeeds, it should print a message like this:
Loading data to table xtable partition (day=2018-12-01)
You are trying to insert into a static partition with a function in its specification. The variable is substituted as plain text, so Hive sees day='date_add('2018-12-01',-1)' and cannot parse it. You can use a dynamic partition insert instead, providing the partition column in the dataset:
set hivevar:ds=2018-12-01;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE table XTABLE partition(day)
select distinct d.emp_id, d.start_time, d.end_time, d.day --partition column must come last
from
(
select emp_id, start_time, end_time, day --partition present in dataset, also it can be date_sub('${hivevar:ds}',1) as day
from XTABLE where day=date_sub('${hivevar:ds}',1)
) d;
This will work, but it may cause a full table scan because partition pruning may not work when the filter uses a function. So the best solution is to calculate the previous day in the shell and pass it as a parameter to the HQL script:
ds=$(date +"%Y-%m-%d" --date " -1 day")
hive --hiveconf ds="$ds" -f your_script.hql
And inside your script use '${hiveconf:ds}'.
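If the job is launched from Python rather than a shell script, the same idea can be sketched as follows. The helper names previous_day and build_hive_command are hypothetical; the hive flags and the script name your_script.hql are the ones used above. Nothing is executed here, the command is only built and printed.

```python
from datetime import date, timedelta


def previous_day(ds):
    """Return the day before ds, both as YYYY-MM-DD strings."""
    return (date.fromisoformat(ds) - timedelta(days=1)).isoformat()


def build_hive_command(ds, script="your_script.hql"):
    """Build the argument list for the hive CLI; nothing is executed here."""
    return ["hive", "--hiveconf", f"ds={ds}", "-f", script]


pd_value = previous_day("2018-12-01")
print(pd_value)
print(build_hive_command(pd_value))
```

The list can be passed to subprocess.run(cmd, check=True). Passing arguments as a list avoids shell quoting, so the value reaches Hive as a plain string and the script still reads it as '${hiveconf:ds}'.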
@saicharan You cannot use a function when setting a variable; the value is stored as literal text.
I faced a similar issue.
set hivevar:ds='should always have a static value'
To solve this, create a simple shell script like the one below:
ds=$(date -d "-1 day" +"%Y-%m-%d")
echo "$ds"
hive --hivevar ds="${ds}" -e "INSERT OVERWRITE table XTABLE partition(day='\${hivevar:ds}') "
The \$ keeps the shell from expanding ${hivevar:ds} itself inside the double quotes, so Hive receives it intact.
This should solve your problem. Let me know if it works.

Getting Error 10293 while inserting a row to a hive table having array as one of the fields

I have a hive table created using the following query:
create table arraytbl (id string, model string, cost int, colors array <string>,size array <float>)
row format delimited fields terminated by ',' collection items terminated by '#';
Now , while trying to insert a row:
insert into mobilephones values
("AA","AAA",5600,colors("red","blue","green"),size(5.6,4.3));
I get the following error:
FAILED: SemanticException [Error 10293]: Unable to create temp file for insert values Expression of type TOK_FUNCTION not supported in insert/values
How can I resolve this issue?
The syntax for inserting values into complex datatypes is a bit odd, in my opinion. INSERT ... VALUES does not accept function calls, and arrays are built with the array() constructor rather than with the column name.
You need a dummy table (here a one-row subquery) to insert values into a Hive table with a complex datatype:
insert into arraytbl select "AA","AAA",5600, array("red","blue","green"), array(CAST(5.6 AS FLOAT),CAST(4.3 AS FLOAT)) from (select 'a') x;
And this is how it looks after the insert:
hive> select * from arraytbl;
OK
AA AAA 5600 ["red","blue","green"] [5.6,4.3]
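Because arraytbl is declared as a delimited table, another route is to write the row in the table's own text format and load the file. The sketch below only builds the line; it assumes the table is stored as plain text (the default when no STORED AS clause is given) and does not escape delimiters that occur inside values. format_row is a hypothetical helper.

```python
FIELD_DELIM = ","       # fields terminated by ','
COLLECTION_DELIM = "#"  # collection items terminated by '#'


def format_row(row):
    """Serialize one row; lists and tuples become '#'-joined array values."""
    cells = []
    for value in row:
        if isinstance(value, (list, tuple)):
            cells.append(COLLECTION_DELIM.join(str(item) for item in value))
        else:
            cells.append(str(value))
    return FIELD_DELIM.join(cells)


line = format_row(["AA", "AAA", 5600, ["red", "blue", "green"], [5.6, 4.3]])
print(line)
```

After writing such lines to a file (for example a hypothetical /tmp/arraytbl_rows.txt), the file can be loaded with LOAD DATA LOCAL INPATH '/tmp/arraytbl_rows.txt' INTO TABLE arraytbl;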

How do I insert data into a table containing a struct data type in BQ

While trying to insert data into the following table, I get the following error message.
--create table mydataset.struct_1(x struct<course string,id int64>)
insert into `mydataset.struct_1` (course,id) values("B.A",12)
Error: Column course is not present in table mydataset.struct_1 at [2:35]
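The table has a single column x of type STRUCT, so insert into x and build the value with STRUCT(...):
-- CREATE TABLE mydataset.struct_1(x STRUCT<course STRING,id INT64>)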
INSERT INTO `mydataset.struct_1` (x) VALUES(STRUCT("B.A",12))
If you want to create a STRUCT with a nested STRUCT named x that has two fields y and z, you should do this:
STRUCT<x STRUCT<y INT64, z INT64>>
So in your example:
create table mydataset.struct_1(STRUCT<x STRUCT<course string,id int64>>)
In Hive (rather than BigQuery), a similar table can be created like this:
CREATE TABLE STRUCT_1 (x STRUCT<course: STRING,id: int>)
comment 'demonstrating how to work-around to insert complex datatype unnested structs into a complex table '
Stored as parquet
location '/user/me/mestruct'
tblproperties ('created date: '=' 2019/01','done by: '='me');
Now let's do the insert:
insert into table STRUCT_1 select named_struct("course","B.A","id",12) from (select 't') s;