Migrate Aster PIVOT function to Hive - hive

I have an existing Aster script that I need to migrate to Hive. But I don't understand how the script defines the values in the columns, and I don't know how to change the function so that it can execute on Hive. Can anyone help me? Here is the script.
create table coll_ledger_cashfprofile_ta distribute by hash(it_ref_no) as (
SELECT * FROM pivot (
ON coll_ledger_cashfprofile_tb
PARTITION BY it_ref_no, file_type, comb_id, startyear
ORDER BY moorder
Partitions ('it_ref_no', 'file_type', 'comb_id', 'startyear')
Rows (45)
Metrics ('cum_cashflow', 'relcum_cf', 'relcum_cinf')
));
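Hive has no built-in PIVOT operator, so one common workaround is conditional aggregation: number the rows inside each partition with ROW_NUMBER() (mirroring Aster's ORDER BY moorder), then turn each row position into a column with MAX(CASE ...). A minimal sketch, assuming the table and column names from the question; you would repeat the CASE pattern up to rn = 45 for each of the three metrics (Aster's Rows(45)), and Hive's CTAS has no direct equivalent of DISTRIBUTE BY HASH, so that clause is simply dropped here:

```sql
-- Sketch of a Hive equivalent of the Aster pivot (names from the question;
-- extend the CASE expressions up to rn = 45 per metric).
CREATE TABLE coll_ledger_cashfprofile_ta AS
SELECT
  it_ref_no, file_type, comb_id, startyear,
  MAX(CASE WHEN rn = 1 THEN cum_cashflow END) AS cum_cashflow_1,
  MAX(CASE WHEN rn = 2 THEN cum_cashflow END) AS cum_cashflow_2,
  -- ... up to cum_cashflow_45 ...
  MAX(CASE WHEN rn = 1 THEN relcum_cf   END) AS relcum_cf_1,
  MAX(CASE WHEN rn = 1 THEN relcum_cinf END) AS relcum_cinf_1
  -- ... likewise for relcum_cf_2..45 and relcum_cinf_2..45 ...
FROM (
  SELECT t.*,
         ROW_NUMBER() OVER (
           PARTITION BY it_ref_no, file_type, comb_id, startyear
           ORDER BY moorder
         ) AS rn
  FROM coll_ledger_cashfprofile_tb t
) x
GROUP BY it_ref_no, file_type, comb_id, startyear;
```

Since writing 135 CASE expressions by hand is tedious, the SELECT list is usually generated with a small script or templating step.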

Related

How to create an auto increment id column using Spark (pure SQL no Python)

How can I generate an ID column in Spark SQL? In the Python interface, Spark has the monotonically_increasing_id() function, but I do not know how to express this in SQL syntax. I want to create a table from an old table with some string operations and add an id column to the new table.
This should do the trick:
ROW_NUMBER() over (ORDER BY myColumn)
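For context, a sketch of both options in pure Spark SQL (old_table, myColumn and the transformed column are placeholders for your actual names). Note the trade-off: ROW_NUMBER() over an unpartitioned window gives consecutive ids but pulls all rows into a single partition, while monotonically_increasing_id() - which is also registered as a SQL function - stays distributed but produces unique, non-consecutive ids:

```sql
-- Consecutive ids via a window function:
CREATE TABLE new_table AS
SELECT ROW_NUMBER() OVER (ORDER BY myColumn) AS id,
       upper(myColumn) AS my_col_upper
FROM old_table;

-- Distributed, non-consecutive ids; callable directly from SQL:
SELECT monotonically_increasing_id() AS id, myColumn
FROM old_table;
```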

Bigquery UDF that returns latest partition

I'm trying to build a BigQuery UDF that returns the latest partition DATE for a partitioned table, given the dataset name and table name as parameters to the UDF. I cannot use BQ scripting to get latest partition since I need to save my final query as a view definition (and views don't support scripting).
The UDF right now returns an error message 'Not found: Dataset my-project-id:dataset_name was not found in location northamerica-northeast1'.
The error message makes sense, but I don't want to hard-code my actual dataset's name in the UDF. How do I get around this problem?
CREATE FUNCTION `my-project-id.test_dataset`.get_latest_partition(dataset_name STRING, tab_name STRING)
RETURNS DATE
AS (
(SELECT PARSE_DATE('%Y%m%d', MAX(partition_id)) FROM dataset_name.INFORMATION_SCHEMA.PARTITIONS WHERE table_name = tab_name)
)
It's not possible to pass the dataset name dynamically to the function the way you do with tab_name, because BigQuery has to resolve the ".INFORMATION_SCHEMA" reference at function-creation time. You can't use concatenation or scripting (FORMAT, DECLARE) in that context.
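Given that constraint, the usual workaround is to hard-code the dataset in the UDF body and keep only the table name dynamic. A sketch, assuming the project/dataset names from the question (my_dataset is a placeholder for your real dataset); the partition_id filter guards against the sentinel values BigQuery stores for null/unpartitioned data:

```sql
-- One UDF per dataset; only the table name stays a parameter.
CREATE OR REPLACE FUNCTION `my-project-id.test_dataset`.get_latest_partition(tab_name STRING)
RETURNS DATE
AS ((
  SELECT PARSE_DATE('%Y%m%d', MAX(partition_id))
  FROM `my-project-id.my_dataset`.INFORMATION_SCHEMA.PARTITIONS
  WHERE table_name = tab_name
    AND partition_id NOT IN ('__NULL__', '__UNPARTITIONED__')
));
```

If you need this for several datasets, you end up creating one such function per dataset.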

BigQuery CREATE TABLE with dynamic table name [duplicate]

I am trying to write a BigQuery script that I can store as a procedure, I would like one of the arguments I pass to be used in the table name that is written out by the script, for example:
DECLARE id STRING;
SET id = '123';
CREATE OR REPLACE TABLE test.id AS(
SELECT * FROM dataset.table
)
However, in this example the table is created with the name id rather than the value of the "id" variable, 123. Is there any way I can dynamically create a table using the value of a declared variable in the BigQuery UI?
Why not just use EXECUTE IMMEDIATE with CONCAT, if you know the table schema?
EXECUTE IMMEDIATE CONCAT('CREATE TABLE `', id, '` (column_name STRING)');
BigQuery scripting (officially announced, still in beta at the time of writing) supports dynamic parameters (variables) as placeholders for values in SQL queries. However, according to the Parameterized queries in BigQuery documentation, query parameters can't be used for SQL object identifiers:
Parameters cannot be used as substitutes for identifiers, column
names, table names, or other parts of the query.
Maybe you can use a wildcard table. You would create a wildcard table with all subtables you want to query and use the WHERE clause to select any subtable you want. Just be careful, the tables must have the same schema.
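Applying the EXECUTE IMMEDIATE idea to the original example: build the whole statement as a string, so the variable's value becomes part of the table name. A sketch, assuming the test dataset and dataset.table source from the question:

```sql
DECLARE id STRING DEFAULT '123';

-- FORMAT splices the variable's value into the statement text;
-- backticks allow a table name that starts with a digit.
EXECUTE IMMEDIATE FORMAT("""
  CREATE OR REPLACE TABLE test.`%s` AS
  SELECT * FROM dataset.table
""", id);
```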

SSIS ( SQL Server Integration Services ) & Teradata Volatile Tables . ( Teradata SQL tuning )

I would like to understand which modalities SQL Server Integration Services (SSIS) uses to connect to Teradata 14:
ODBC
.NET
OLE DB
Are there more or fewer options than these? My main question is: how do I implement the Teradata volatile-table CREATE syntax in an SSIS package? Which of the above support it, and how is it done? Thank you.
As an alternative to using a SQL task, you could use a global temporary table. That way you still have a session-specific temp table, but you don't have to create it on the fly.
Shouldn't it be as easy as a SQL task with the table DDL?
CREATE VOLATILE TABLE table_1
(
  column1 datatype,
  column2 datatype,
  ...
  columnN datatype
) ON COMMIT PRESERVE ROWS;

Oracle SQL use variable partition name

I run a daily report that has to query another table which is updated separately. Due to the high volume of records in the source table (8M+ per day), each day is stored in its own partition. The partition name has a standard format: P, then 4-digit year, 2-digit month, 2-digit day, so yesterday's partition is P20140907.
At the moment I use this expression, but have to manually change the name of the partition each day:
select * from <source_table> partition (P20140907) where ....
Using SYSDATE, TO_CHAR and CONCAT, I have created another table called P_NAME2 that automatically generates and updates a string value holding the name of the partition I need to read. Now I need to update my main query so it does this:
select * from <source_table> partition (<string from P_NAME2>) where ....
You are working too hard. Oracle already does all of this for you. If you query the table using the correct date range, Oracle will perform the operation only on the relevant partitions - this is called partition pruning.
I suggest reading the docs on that.
If you're still skeptical, query ALL_TAB_PARTITIONS.HIGH_VALUE to get each partition's high value (for the table you created ... ).
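To illustrate the pruning approach: if the table is range-partitioned on a date column, a plain date filter is enough for the optimizer to touch only the matching partition, with no partition name in the query at all. A sketch, assuming a hypothetical partition-key column date_col on the source table:

```sql
-- The optimizer prunes to the partition covering today's date range;
-- no PARTITION (...) clause needed.
SELECT *
FROM source_table
WHERE date_col >= TRUNC(SYSDATE)
  AND date_col <  TRUNC(SYSDATE) + 1;
```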
I thought I'd pop back to share how I solved this in the end. The source database has a habit of leaking dates across partitions which is why queries for one day were going outside a single partition. I can't affect this, just work around it ...
begin
  execute immediate
    'create table LL_TEST as
     select *
     from SCHEMA.TABLE partition(P' || TO_CHAR(sysdate, 'YYYYMMDD') || ')
     where COLUMN_A = ''Something''
     and COLUMN_B = ''Something Else''';
end;
Using the PL/SQL script I create the partition name with TO_CHAR(sysdate,'YYYYMMDD') and concatenate the rest of the query around it.
Note that the values you are searching for in the WHERE clause require doubled apostrophes, so to send 'Something' to the query you write ''Something'' in the script.
It may not be pretty, but it works on the database that I have to use.