Table partitioning with procedure input parameter - sql

I'm trying to partitioning my table on ID which I got from procedure parameter.
For example my table ddl:
CREATE TABLE bigtable
( ID number )
As input procedure parameter I got eg. number: 130 , So I'm trying to create partition:
Alter table bigtable
add partition part_random_number values(random number);
Of course as random number I mean eg. 120,56 etc : )
But I got an error that object is not partitioned. So I tried to first defined partition clause in crate table statement:
CREATE TABLE bigtable
( ID number )
PARTITION BY list (ID)
But i doesn't work, It works when I defined some partition eg.
CREATE TABLE bigtable
( ID number )
PARTITION BY list (ID)
( partition type values(130);
)
But I would like to avoid it... Is there any other solution?
As result I would like to have table partitioned by procedure input parameterers.

A partitioned table has to have at least one partition. Just create it with a dummy partition and add the ones you actually need using your procedure.

Related

SQL Impala create partitioned table from a view

I am interested in turning a view into a table, but I want the table to be partitioned wrt to one variable:
My query is:
CREATE TABLE table_test AS (
SELECT
*
FROM view_test
And I want to have the table partitioned by the variable "time-period".
Thank you in advance.
You can do this directly like this
CREATE TABLE table_test
PARTITIONED BY (time_period)
as
SELECT col1,col2,
time_period -- Pls make sure this partition column as the last column in SELECT.
FROM schema.view_test;

Partition key failing error for postgres table

ERROR: no partition of relation "test_table" found for row DETAIL:
Partition key of the failing row contains (start_time) = (2021-04-25
00:00:00). SQL state: 23514
I am inserting a data where i have a column start time (2021-04-25 00:00:00)
This is my Schema
CREATE TABLE test_table (
start_time timestamp NULL,
)
PARTITION BY RANGE (start_time);
This sounds as if you have no partition-tables defined for this table.
You might need something like this:
CREATE TABLE test_table_2021 PARTITION OF test_table
FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');
After you defined this partition for your partitioned table, you should be able to insert the data (as long as start_time is anywhen in 2021).
See the docs: https://www.postgresql.org/docs/current/ddl-partitioning.html

Split Hive table on subtables by field value

I have a Hive table foo. There are several fields in this table. One of them is some_id. Number of unique values in this fields in range 5,000-10,000. For each value (in example it 10385) I need to perform CTAS queries like
CREATE TABLE bar_10385 AS
SELECT * FROM foo WHERE some_id=10385 AND other_id=10385;
What is the best way to perform this bunch of queries?
You can store all these tables in the single partitioned one. This approach will allow you to load all the data in single query. Query performance will not be compromised.
Create table T (
... --columns here
)
partitioned by (id int); --new calculated partition key
Load data using one query, it will read source table only once:
insert overwrite table T partition(id)
select ..., --columns
case when some_id=10385 AND other_id=10385 then 10385
when some_id=10386 AND other_id=10386 then 10386
...
--and so on
else 0 --default partition for records not attributed
end as id --partition column
from foo
where some_id in (10385,10386) AND other_id in (10385,10386) --filter
Then you can use this table in queries specifying partition:
select from T where id = 10385; --you can create a view named bar_10385, it will act the same as your table. Partition pruning works fast

Hive table creation with a default value

I have a table in RDBMS like so:
create table test (sno number, entry_date date default sysdate).
Now I want to create a table in hive with a structure as adding a default value to a column.
Hive currently doesn't support the feature of adding default value to any column while creating a table.
As a workaround load data into a temporary table and use the insert overwrite table statement to add the current date and time into the main table.
Create a temporary table:
create table test (sno number);
Load data into the table:
Create final table:
create table final_table (sno number, createDate string);
Finally load the data from temp test table to the final table:
insert overwrite table final_table select sno, FROM_UNIXTIME( UNIX_TIMESTAMP(), 'dd/MM/YYYY' ) from test;
Hive doesn't support DEFAULT fields
Doesn't mean you can't do it, though. Just a two step process of creating one "staging" table, then inserting into a second table and selecting that "default" value.
Adding a default value to a column while creating table in hive
Since you mention,
I've table in RDBMS
You could also use your existing table, and use Sqoop to import the data into Hive.

CREATE TABLE AS select * from partitioned table

I want to create a table using CTAS of partitioned table.
New table must have all the data and partitions, subpartitions of old table.
How to do this?
You need to first create the new table with all the partitions, there is no way you can add partition definitions to a CTAS. Once the table is created you can populate it using insert into .. select.
You can use dbms_metadata.get_ddl to get the definition of the old table.
select dbms_metadata.get_ddl('TABLE', 'NAME_OF_EXISTING_TABLE')
from dual;
Save the output of that into a script, do a search and replace to adjust the table name, then run the create table and then run the insert into ... select ...