Drop range partition in Oracle between two dates

In Teradata you can do something like:
DROP RANGE BETWEEN DATE FROM_DATE AND DATE TO_DATE, EACH INTERVAL '1' DAY;
Is there an equivalent way of doing this in Oracle?

In Oracle, you could use the PARTITION FOR clause.
For example,
alter table table_name drop partition for (TO_DATE('some date','date format'))
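One important thing to keep in mind: on an interval-partitioned table you cannot drop the last partition of the range section, i.e. the highest partition that was defined explicitly rather than created automatically. Attempting it throws an error: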
ORA-14758: Last partition in the range section cannot be dropped
Since you have not mentioned your exact Oracle version, I am sharing this 11gR2 documentation about Dropping partitions.
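To get closer to the Teradata range drop, one option is to generate one DROP PARTITION FOR statement per day and run them in turn. The following is only a sketch in Python, assuming the table is partitioned by day; the function name drop_partition_statements and the table name are hypothetical, and the script only builds the statements, it does not check that a partition exists for each day or that the ORA-14758 restriction above is respected.

```python
from datetime import date, timedelta

def drop_partition_statements(table_name, from_date, to_date):
    """Return one ALTER TABLE ... DROP PARTITION FOR statement per day (inclusive)."""
    if to_date < from_date:
        raise ValueError("to_date must not be earlier than from_date")
    statements = []
    day = from_date
    while day <= to_date:
        # DATE 'yyyy-mm-dd' is an ANSI date literal, so no format mask is needed.
        statements.append(
            f"ALTER TABLE {table_name} DROP PARTITION FOR (DATE '{day.isoformat()}')"
        )
        day += timedelta(days=1)
    return statements

for stmt in drop_partition_statements("table_name", date(2020, 1, 30), date(2020, 2, 1)):
    print(stmt)
```

Each generated statement can then be executed through whatever client you already use.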

Related

Performing Date Math on Hive Partition Columns

My data is partitioned by day in the standard Hive format:
/year=2020/month=10/day=01
/year=2020/month=10/day=02
/year=2020/month=10/day=03
/year=2020/month=10/day=04
...
I want to query all data from the last 60 days, using Amazon Athena (i.e. Presto). I want this query to use the partitioned columns (year, month, day) so that only the necessary partition files are scanned. Assuming I can't change the file partition format, what is the best approach to this problem?
You don't have to use year, month, day as the partition keys for the table. You can have a single partition key called date and add the partitions like this:
ALTER TABLE the_table ADD
PARTITION (`date` = '2020-10-01') LOCATION 's3://the-bucket/data/year=2020/month=10/day=01'
PARTITION (`date` = '2020-10-02') LOCATION 's3://the-bucket/data/year=2020/month=10/day=02'
...
With this setup you can even set the type of the partition key to date:
PARTITIONED BY (`date` date)
Now you have a table with a date column typed as a DATE, and you can use any of the date and time functions to do calculations on it.
What you won't be able to do with this setup is use MSCK REPAIR TABLE to load partitions, but you really shouldn't do that anyway – it's extremely slow and inefficient and really something you only do when you have a couple of partitions to load into a new table.
An alternative to the approach proposed by Theo is to use the following syntax, e.g.:
select ... from my_table where year||month||day between '20200630' and '20201010'
This works when the year, month and day columns are strings and month and day are zero-padded, so that string order matches date order. It's particularly useful for querying across months.
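The bounds for a "last 60 days" query can be computed rather than typed by hand. This is a rough Python sketch, assuming the partition columns are zero-padded strings as in the paths above; the function name last_n_days_predicate is made up for illustration, and whether the engine prunes partitions for this predicate should be verified against your own table.

```python
from datetime import date, timedelta

def last_n_days_predicate(today, n_days):
    """Build a BETWEEN predicate on the concatenated year/month/day partition columns."""
    start = today - timedelta(days=n_days)
    # %Y%m%d yields zero-padded strings such as '20200811',
    # so comparing them as strings gives the same order as comparing dates.
    return (
        "year||month||day BETWEEN "
        f"'{start.strftime('%Y%m%d')}' AND '{today.strftime('%Y%m%d')}'"
    )

print(last_n_days_predicate(date(2020, 10, 10), 60))
```

The returned string can be appended to the WHERE clause of the query.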

Creating a daily Oracle partition

I am creating an Oracle partition on a table for every day.
ALTER TABLE TAB_123 ADD PARTITION PART_9999 VALUES LESS THAN ('0001') TABLESPACE TS_1
Here I am getting an error because the boundary value has wrapped around to 0001, which is lower than the upper bound of the existing partitions.
You can have Oracle create partitions automatically by using PARTITION BY RANGE together with the INTERVAL clause (interval partitioning).
Sample DDL, assuming that the partition key is column my_date_column :
create table TAB_123
( ... )
partition by range(my_date_column) interval(/*numtoyminterval*/ NUMTODSINTERVAL(1,'day'))
( partition p_first values less than (to_date('2010-01-01', 'yyyy-mm-dd')) tablespace ts_1)
;
With this setup in place, Oracle will, if needed, create a partition on the fly when you insert data into the table. Interval partitioning needs at least one explicitly defined range partition to start from, such as p_first above.
This naming convention (last digit of year plus day number) won't support holding more than ten years' worth of data. Maybe you think that doesn't matter but I know databases which are well into their second decade. Be optimistic!
Also, that key is pretty much useless for querying. Most queries against partitioned tables want to get the benefit of partition elimination. But that only works if the query filters on the same value as the partition key. Developers really won't want to be casting a date to YDDD format every time they write a select on the table.
So. Use an actual date for defining the partition key and hence range. Also for naming the partition if it matters that much.
ALTER TABLE TAB_123
ADD PARTITION P20200101 VALUES LESS THAN (date '2020-01-02') TABLESPACE TS_1
/
Note that the range is defined by less than the next day. Otherwise the date of the partition name won't align with the date of the records in the actual partition.
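If you add the daily partitions from a script, the "less than the next day" rule is easy to encode. A minimal Python sketch, assuming the P plus yyyymmdd naming used above; the function name add_daily_partition_statement is hypothetical and the script only builds the DDL text.

```python
from datetime import date, timedelta

def add_daily_partition_statement(table_name, day, tablespace):
    """Build ADD PARTITION DDL whose upper bound is the day after `day`."""
    next_day = day + timedelta(days=1)
    return (
        f"ALTER TABLE {table_name} "
        f"ADD PARTITION P{day.strftime('%Y%m%d')} "
        # The bound is exclusive, so the partition named for `day` ends at next_day.
        f"VALUES LESS THAN (DATE '{next_day.isoformat()}') "
        f"TABLESPACE {tablespace}"
    )

print(add_daily_partition_statement("TAB_123", date(2020, 1, 1), "TS_1"))
```

Because the bound is computed with date arithmetic, month and year boundaries need no special handling.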

Creating dynamic partition in Range partitioning

I have the scenario below.
Suppose I have a table which has 3 partitions: one for 20190201, the next for 20190202 and one for 20190210.
I have been given a requirement that whichever date we pass, a partition should be created automatically.
If I use dynamic SQL, I am able to create a partition after the max partition, e.g. 20190211, but if I want to create a partition for 20190205 it gives an error.
Is there any way to create the partition at run time without data loss even when the max partition exists?
We have been told not to use interval partitioning.
This is very simple.
While creating the table itself, use interval partitioning on the date column.
You can choose the partition interval as hour/day/month, whichever you like.
Then any time you insert new data into the table, the row goes to the correct partition based on its date value, or a new partition is created for it.
Use the syntax below when creating your table:
partition by range ( date_col )
interval ( NUMTODSINTERVAL(1,'day') )
( partition p1 values less than ( date '2016-01-01' ))

Error when I try to create ordered SQL table

I'm trying to create a volatile table in SQL with an ORDER BY and I get an error.
CREATE VOLATILE TABLE orderd_dates AS
(SELECT * FROM date_table
ORDER BY id_date)
with data primary index (id_date) on commit preserve rows;
The error is: ORDER BY is not allowed in subqueries.
If I can't use order by, how can I create a volatile table that's ordered?
SQL tables are inherently unordered. You need to explicitly use an order by clause when querying the table, not when creating it.
You could add TOP 100 PERCENT to allow the ORDER BY, but the table would still be unordered, because a table is internally ordered by the hash of the Primary Index. And even if you used a NO PRIMARY INDEX table and the rows were actually stored in the specified order, the optimizer wouldn't know about it.
The closest thing you can get is to PARTITION BY RANGE_N(id_date BETWEEN DATE '2000-01-01' AND DATE '2050-12-31' EACH INTERVAL '1' DAY):
CREATE VOLATILE TABLE orderd_dates AS
(SELECT * FROM date_table
)
WITH DATA
PRIMARY INDEX (id_date)
PARTITION BY Range_N(id_date BETWEEN DATE '2000-01-01'
AND DATE '2050-12-31' EACH INTERVAL '1' DAY)
ON COMMIT PRESERVE ROWS;

BigQuery table partitioning by month

I can't find any documentation relating to this. Is time_partitioning_type=DAY the only way to partition a table in BigQuery? Can this parameter take any other values besides a date?
Note that even if you partition on day granularity, you can still write your queries to operate at the level of months using an appropriate filter on _PARTITIONTIME. For example,
#standardSQL
SELECT * FROM MyDatePartitionedTable
WHERE DATE_TRUNC(EXTRACT(DATE FROM _PARTITIONTIME), MONTH) = '2017-01-01';
This selects all rows from January 2017.
Unfortunately not. BigQuery currently only supports date-partitioned tables.
https://cloud.google.com/bigquery/docs/partitioned-tables
BigQuery offers date-partitioned tables, which means that the table is divided into a separate partition for each date
It seems like this would work:
#standardSQL
CREATE OR REPLACE TABLE `My_Partition_Table`
PARTITION BY event_month
OPTIONS (
description="this is a table partitioned by month"
) AS
SELECT
DATE_TRUNC(DATE(some_event_timestamp), month) as event_month,
*
FROM `TableThatNeedsPartitioning`
For those that run into the error "Too many partitions produced by query, allowed 4000, query produces at least X partitions", due to the 4000 partitions BigQuery limit as of 2023.02, you can do the following:
CREATE OR REPLACE TABLE `My_Partition_Table`
PARTITION BY DATE_TRUNC(date_column, MONTH)
OPTIONS (
description="This is a table partitioned by month"
) AS
-- Your query
Basically, take @david-salmela's answer, but move the DATE_TRUNC part to the PARTITION BY section.
It seems to work exactly like PARTITION BY date_column in terms of querying the table (e.g. WHERE date_column = "2023-02-20"), but my understanding is that you always retrieve data for a whole month in terms of cost.
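To see why moving to monthly granularity avoids the limit, it helps to count how many partitions a date range produces at each granularity. A small Python sketch, with a hypothetical helper count_partitions and an arbitrary example range; the 4000 figure is the limit quoted in the error message above.

```python
from datetime import date

def count_partitions(first_day, last_day):
    """Count the daily and monthly partitions covered by an inclusive date range."""
    if last_day < first_day:
        raise ValueError("last_day must not be earlier than first_day")
    days = (last_day - first_day).days + 1
    months = (last_day.year - first_day.year) * 12 + (last_day.month - first_day.month) + 1
    return days, months

days, months = count_partitions(date(2010, 1, 1), date(2022, 12, 31))
print(days, months)  # 4748 daily partitions versus 156 monthly partitions
```

For this range, daily partitioning exceeds 4000 partitions while monthly partitioning stays far below it.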