Dynamic partitioning with derived column - hive

I am trying to insert data into a Hive table through dynamic partitioning. The table is:
CREATE EXTERNAL TABLE target_tbl_wth_partition(
booking_id string,
code string,
txn_date timestamp,
logger string,
)
partition by (txn_date date,txn_hour int)
Values
txn_date=20160216
txn_hour=12
CREATE EXTERNAL TABLE stg_target_tbl_wth_partition(
booking_id string,
code string,
txn_date timestamp,
logger string,
)
insert overwrite table target_tbl_wth_partition partition(txn_date,hour(txn_date))
select booking_id,code,txn_date,logger from stg_target_tbl_wth_partition;
I am not able to insert with derived columns in a dynamic partition. Any help on how to proceed in such a case would be appreciated.
Regards,
Rakesh

I suggest you start from something like this...
CREATE TABLE blahblah (...)
PARTITIONED BY (aaa STRING, bbb STRING)
;
SET hive.exec.dynamic.partition = true
;
SET hive.exec.dynamic.partition.mode = nonstrict
;
INSERT INTO TABLE blahblah PARTITION (aaa, bbb)
SELECT ...,
SUBSTRING(aaabbb,1,5) as aaa,
SUBSTRING(aaabbb,7,2) as bbb
FROM sthg
;
...and make it work; then you can start experimenting with weird and unsupported syntax and see what works and what does not.
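Applied to the tables in the question, that pattern might look like the sketch below. It assumes the target table's timestamp column is renamed to txn_ts (a hypothetical name), because Hive does not allow a partition column to share its name with a regular column, so txn_date cannot be both. The derived values go at the end of the SELECT list, since Hive maps dynamic partition columns by position, not by name.

```sql
-- Assumption: the regular timestamp column is renamed to txn_ts (hypothetical)
-- so that it does not clash with the txn_date partition column.
CREATE EXTERNAL TABLE target_tbl_wth_partition (
  booking_id STRING,
  code       STRING,
  txn_ts     TIMESTAMP,
  logger     STRING
)
PARTITIONED BY (txn_date DATE, txn_hour INT);

SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- The partition clause lists only column names; the expressions that
-- compute them are the last two items of the SELECT list, in the same order.
INSERT OVERWRITE TABLE target_tbl_wth_partition PARTITION (txn_date, txn_hour)
SELECT booking_id,
       code,
       txn_date,                 -- staging timestamp, stored as txn_ts
       logger,
       CAST(txn_date AS DATE),   -- feeds partition column txn_date
       hour(txn_date)            -- feeds partition column txn_hour
FROM stg_target_tbl_wth_partition;
```

The original statement fails because PARTITION (...) accepts column names (optionally with constant values), not expressions such as hour(txn_date).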

Related

Copying the structure of temp hive table to new table and adding additional table properties

I want to copy the structure of a full-load temp table and add additional table properties such as partitioned by (partition_col) and Format='ORC'.
Temp table :
Create table if not exists tmp.temp_table( id int,
name string,
datestr string )
The temp table was created.
Final table :
CREATE TABLE IF NOT EXISTS tmp.{final_table_name} (
LIKE tmp.temp_table
)
WITH (
FORMAT = 'ORC'
partitioned by('datestr')
)
But I am getting the error as "Error: Error while compiling statement: FAILED: ParseException line 1:63 missing EOF at 'WITH' near 'temp_table' (state=42000,code=40000)"
Is there any way to achieve this?

You should not use LIKE; instead use CREATE TABLE AS SELECT (CTAS) with select * from mytab where 1=2.
CREATE TABLE IF NOT EXISTS tmp.{final_table_name}
As select * from tmp.temp_table where 1=2
WITH (
FORMAT = 'ORC'
partitioned by('datestr')
)
LIKE creates an empty table with exactly the same definition. CTAS takes the column sequence and data types/lengths from the SELECT, combines them with your own definition, and creates a new table that is empty because of the 1=2 predicate.
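Note that the error message in the question looks like it comes from Hive, whereas a WITH (...) property list is the Presto/Trino style, so the statement above may still fail to parse there. In HiveQL the same idea is usually written with PARTITIONED BY and STORED AS placed before AS SELECT. A minimal sketch, assuming a Hive release that supports partitioned CTAS (to my knowledge 3.2 and later) and using tmp.final_table as a hypothetical stand-in for the {final_table_name} placeholder:

```sql
-- Dynamic partitioning may need to be enabled for a partitioned CTAS.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- tmp.final_table is a hypothetical name.
-- The partition column (datestr) must come last in the SELECT list.
CREATE TABLE IF NOT EXISTS tmp.final_table
PARTITIONED BY (datestr)
STORED AS ORC
AS
SELECT id, name, datestr
FROM tmp.temp_table
WHERE 1 = 2;   -- always false: copies the structure, no rows
```

On older Hive versions, the fallback is a regular CREATE TABLE that spells out the columns, followed by PARTITIONED BY (datestr string) STORED AS ORC.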

String is too long and would be truncated

Query:
CREATE TABLE SRC(SRC_STRING VARCHAR(20))
CREATE OR REPLACE TABLE TGT(tgt_STRING VARCHAR(10))
INSERT INTO SRC VALUES('JKNHJYGHTFGRTYGHJ')
INSERT INTO TGT(TGT_STRING) SELECT SRC_STRING::VARCHAR(10) FROM SRC
Error: String 'JKNHJYGHTFGRTYGHJ' is too long and would be truncated
Is there any way to control length enforcement (outside the COPY command) when inserting data from a wider column into a narrower one?

I'd recommend using the SUBSTR() function to pick the piece of data you want. In the example below I take the first 10 characters (if the string had only 5 characters, it would return those 5).
CREATE OR REPLACE TEMPORARY TABLE SRC(
src_string VARCHAR(20));
CREATE OR REPLACE TEMPORARY TABLE TGT(
tgt_STRING VARCHAR(10));
INSERT INTO src
VALUES('JKNHJYGHTFGRTYGHJ');
INSERT INTO tgt(tgt_string)
SELECT SUBSTR(src_string, 1, 10)
FROM SRC;
SELECT * FROM tgt; --JKNHJYGHTF
Here's the documentation on the function:
https://docs.snowflake.com/en/sql-reference/functions/substr.html

How do I create a variable/parameter that is a string of values in SQL SSMS that I can use as a substitute in my where clause?

This may be a very basic question, but I have been struggling with this.
I have a SSMS query that I'll be using multiple times for a large set of client Ids. Its quite cumbersome to have to amend the parameters in all the where clauses every time I want to run it.
For simplicity, I want to convert a query like the one below:
SELECT
ID,
Description
From TestDb
Where ID in ('1-234908','1-345678','1-12345')
to a query of the format below so that I only need to change my variable field once and it can be applied across my query:
USE TestDb
DECLARE @ixns NVARCHAR(100)
SET @ixns = '''1-234908'',''1-345678'',''1-12345'''
SELECT
ID,
Description
From TestDb
Where ID IN @ixns
However, the above format doesn't work. Can anyone help me on how I can use a varchar/string variable in my "where" clause for my query so that I can query multiple IDs at the same time and only have to adjust/set my variable once?
Thanks in advance :D

The most appropriate solution would be to use a table variable:
DECLARE @ixns TABLE (id NVARCHAR(100));
INSERT INTO @ixns(id) VALUES
('1-234908'),
('1-345678'),
('1-12345');
SELECT ID, Description
FROM TestDb
WHERE ID IN (SELECT id FROM @ixns);
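If you would rather keep a single delimited string, as in the question, one option is to split it at query time. A minimal sketch, assuming SQL Server 2016 or later with database compatibility level 130 or higher, where the built-in STRING_SPLIT function is available:

```sql
-- Plain comma-separated list: no embedded quotes are needed.
DECLARE @ixns NVARCHAR(100) = N'1-234908,1-345678,1-12345';

SELECT ID, Description
FROM TestDb
-- STRING_SPLIT returns one row per item in a column named "value".
WHERE ID IN (SELECT value FROM STRING_SPLIT(@ixns, ','));
```

The variable is still set only once, and the list can be reused in every WHERE clause of the script.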

You can load the IDs into a temporary table variable and use it in the WHERE condition:
USE TestDb
DECLARE @tmpIDs TABLE
(
id VARCHAR(50)
)
insert into @tmpIDs values ('1-234908')
insert into @tmpIDs values ('1-345678')
insert into @tmpIDs values ('1-12345')
SELECT
ID,
Description
From TestDb
Where ID IN (select id from @tmpIDs)

The most appropriate way is to create a table type, because a table type can be passed as a parameter.
1) Creating the table type with the ID column.
create type MyListID as table
(
Id int not null
)
go
2) Creating the procedure that receives this type as a parameter.
create procedure MyProcedure
(
@MyListID as MyListID readonly
)
as
select
column1,
column2
...
from
MyTable
where
Id in (select Id from @MyListID)
3) This example shows how to fill the type from your application: https://stackoverflow.com/a/25871046/8286724

MACRO to create a table in SQL

Hi everyone thanks so much for taking the time to read this.
I'd like to create a macro in Teradata that will create a table from another table based on specific parameters.
My original table consists of three columns: patient_id, diagnosis_code and Date_of_birth
......
I'd like to build a macro that would allow me to specify a diagnosis code and it would then build the table consisting of data of all patients with that diagnosis code.
My current code looks like this
Create Macro All_pats (diag char) as (
create table pats as(
select *
from original_table
where diag = :diagnosis_code;)
with data primary index (patid);
I can't seem to get this to work. Any tips?
Thanks once again

Your code has a semicolon in the wrong place and is missing a closing bracket:
Create Macro All_pats (diag char) as (
create table pats as
(
select *
from original_table
where diag = :diagnosis_code
) with data primary index (patid);
);
Edit:
Passing multiple values as a delimited list is more complicated (unless you use Dynamic SQL in a Stored Procedure):
REPLACE MACRO All_lpats (diagnosis_codes VARCHAR( 1000)) AS
(
CREATE TABLE pats AS
(
SELECT *
FROM original_table AS t
JOIN TABLE (StrTok_Split_To_Table(1, :diagnosis_codes, ',')
RETURNS (outkey INTEGER,
tokennum INTEGER,
token VARCHAR(20) CHARACTER SET Unicode)
) AS dt
ON t.diag = dt.token
) WITH DATA PRIMARY INDEX (patid);
);
EXEC All_lpats('111,112,113');
As the name implies, StrTok_Split_To_Table splits a delimited string into a table. You might need to adjust the delimiter and the length of the resulting token.

Hive partition with wildcard

I am very new to partitioning.
Suppose I have the following table
table mytable(mytime timestamp, myname string)
where the column mytime looks like this: year-month-day hour:min:sec.msec (for example, 2014-12-05 08:55:59.3131)
I want to partition mytable based on the year-month-day of mytime.
For example, I want to make a partition for 2014-12-05.
A record whose mytime is like 2014-12-05 08:55:59.3131 will be in this partition.
So a query like select * from mytable where mytime='2014-12-05%' will search only that partition.
How can I do that in hive?
I already have data in mytable, do I need to recreate mytable and reload all the data?
Thank you

input
1997-12-31 23:59:59.999,kishore
2014-12-31 23:59:59.999999,manish
create table mytable_tmp(mytime string,myname string)
row format delimited
fields terminated by ',';
load data local inpath 'input.txt'
overwrite into table mytable_tmp;
create table mytable(myname string,mytimestamp string)
PARTITIONED BY (mydate string)
row format delimited
fields terminated by ',';
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
INSERT OVERWRITE TABLE mytable PARTITION(mydate)
SELECT myname,mytime,to_date(mytime) from mytable_tmp;
select * from mytable where mydate='2014-12-31';
manish 2014-12-31 23:59:59.999999 2014-12-31
The table is now partitioned by mydate, and each partition holds the myname and mytime values for that date, which is what your problem asks for.
