I'm trying to create a volatile table in SQL with an ORDER BY and I get an error.
CREATE VOLATILE TABLE orderd_dates AS
(SELECT * FROM date_table
ORDER BY id_date)
with data primary index (id_date) on commit preserve rows;
The error is: ORDER BY is not allowed in subqueries.
If I can't use order by, how can I create a volatile table that's ordered?
SQL tables are inherently unordered. You need to explicitly use an order by clause when querying the table, not when creating it.
You could add TOP 100 PERCENT to allow the ORDER BY, but the table would still be unordered, because a table is internally ordered by the hash of the Primary Index. And even if you used a NO PRIMARY INDEX table and the rows were actually stored in the specified order, the optimizer wouldn't know about it.
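For illustration, that TOP variant would look like this (a sketch, assuming your Teradata release accepts TOP ... PERCENT with ORDER BY here; it only silences the error, the stored order is still undefined):
CREATE VOLATILE TABLE orderd_dates AS
( SELECT TOP 100 PERCENT *
  FROM date_table
  ORDER BY id_date
) WITH DATA
PRIMARY INDEX (id_date)
ON COMMIT PRESERVE ROWS;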
The closest thing you can get is to PARTITION BY RANGE_N(id_date BETWEEN DATE '2000-01-01' AND DATE '2050-12-31' EACH INTERVAL '1' DAY):
CREATE VOLATILE TABLE orderd_dates AS
(SELECT * FROM date_table
)
WITH DATA
PRIMARY INDEX (id_date)
PARTITION BY Range_N(id_date BETWEEN DATE '2000-01-01'
AND DATE '2050-12-31' EACH INTERVAL '1' DAY)
ON COMMIT PRESERVE ROWS;
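Even with that partitioning, reading the rows back in a defined order still requires an explicit ORDER BY:
SELECT * FROM orderd_dates ORDER BY id_date;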
I have the following scenario.
Suppose I have a table with 3 partitions: one for 20190201, one for 20190202, and one for 20190210.
The requirement is that a partition should be created automatically for whichever date we pass.
Using dynamic SQL I am able to create a partition after the max partition, e.g. 20190211, but if I want to create a partition for 20190205 it gives an error.
Is there any way to create the partition at run time, without data loss, even when a higher partition already exists?
We have been told not to use interval partitioning.
This is very simple: when creating the table itself, use interval partitioning on the date column.
You can choose the partition interval as hour/day/month, whichever you like.
Then any time you insert new data into the table, based on the date value the data will either go to the correct partition or a new partition will be created.
Use the below syntax when creating your table:
partition by range ( date_col )
interval ( NUMTODSINTERVAL(1,'day') )
( partition p1 values less than ( date '2016-01-01' ))
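For context, a complete table definition using this clause might look like the following (table and column names are illustrative):
CREATE TABLE sales_data (
  id       NUMBER,
  date_col DATE
)
PARTITION BY RANGE (date_col)
INTERVAL (NUMTODSINTERVAL(1,'DAY'))
( PARTITION p1 VALUES LESS THAN (DATE '2016-01-01') );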
Not sure if this is possible in PostgreSQL 9.3+, but I'd like to create a unique index on a non-unique column. For a table like:
CREATE TABLE data (
id SERIAL
, day DATE
, val NUMERIC
);
CREATE INDEX data_day_val_idx ON data (day, val);
I'd like to be able to [quickly] query only the distinct days. I know I can use data_day_val_idx to help perform the distinct search, but it seems this adds extra overhead if the number of distinct values is substantially less than the number of rows the index covers. In my case, about 1 in 30 days is distinct.
Is my only option to create a relational table to only track the unique entries? Thinking:
CREATE TABLE days (
day DATE PRIMARY KEY
);
And update this with a trigger every time we insert into data.
An index can only index actual rows, not aggregated rows. So, yes, as far as the desired index goes, creating a table with unique values like you mentioned is your only option. Enforce referential integrity with a foreign key constraint from data.day to days.day. This might also be best for performance, depending on the complete situation.
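A minimal sketch of that setup (constraint, function, and trigger names are illustrative; on 9.5+ you could replace the NOT EXISTS check with INSERT ... ON CONFLICT DO NOTHING):
ALTER TABLE data
  ADD CONSTRAINT data_day_fkey FOREIGN KEY (day) REFERENCES days (day);

CREATE FUNCTION ins_day() RETURNS trigger AS $$
BEGIN
   -- register the day before the row lands in data, so the FK check passes;
   -- note the NOT EXISTS check can race under heavy concurrent inserts
   INSERT INTO days (day)
   SELECT NEW.day
   WHERE  NOT EXISTS (SELECT 1 FROM days WHERE day = NEW.day);
   RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER data_ins_day
BEFORE INSERT ON data
FOR EACH ROW EXECUTE PROCEDURE ins_day();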
However, since this is about performance, there is an alternative solution: you can use a recursive CTE to emulate a loose index scan:
WITH RECURSIVE cte AS (
( -- parentheses required
SELECT day FROM data ORDER BY 1 LIMIT 1
)
UNION ALL
SELECT (SELECT day FROM data WHERE day > c.day ORDER BY 1 LIMIT 1)
FROM cte c
WHERE c.day IS NOT NULL -- exit condition
)
SELECT day FROM cte;
Parentheses around the first SELECT are required because of the attached ORDER BY and LIMIT clauses. See:
Combining 3 SELECT statements to output 1 table
This only needs a plain index on day.
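That index could be as simple as (name is illustrative):
CREATE INDEX data_day_idx ON data (day);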
There are various variants, depending on your actual queries:
Optimize GROUP BY query to retrieve latest row per user
Unused index in range of dates query
Select first row in each GROUP BY group?
More in my answer to your follow-up question:
Counting distinct rows using recursive cte over non-distinct index
I would like to know how to order a table by date (the oldest date first) after the insert of a new value in the same table, using a trigger.
Thank you
RDBMSs (or at least the common ones) have no notion of ordering a table - a table is just a bunch of rows, which the database may return in any arbitrary order.
The way to control this order is to explicitly declare the order you want with an ORDER BY clause in your query:
SELECT *
FROM my_table
ORDER BY last_modification_time
Here is my SQL query:
select * from TABLE T where ROWNUM<=100
If I execute this and then re-execute it, I don't get the same result. Why?
Also, on a Sybase system, if I execute
set rowcount 100
select * from TABLE
even on re-execution i get the same result?
Can someone explain why, and provide a possible solution for ROWNUM?
Thanks
If you don't use ORDER BY in your query you get the results in natural order.
Natural order is whatever is fastest for the database at the moment.
A possible solution is to ORDER BY your primary key, if it's an INT:
SELECT TOP 100 START AT 0 * FROM TABLE
ORDER BY TABLE.ID;
If your primary key is not a sequentially incrementing integer and you don't have another column to order by (such as a timestamp), you may need to create an extra column SORT_ORDER INT and increment it automatically on insert, using either an autoincrement column or a sequence and an insert trigger, depending on the database.
Make sure to create an index on that column to speed up the query.
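A sketch of that in SQL Anywhere dialect (assuming DEFAULT AUTOINCREMENT is available; table and column names are illustrative):
-- add an automatically incremented sort column and index it
ALTER TABLE my_table ADD sort_order INT DEFAULT AUTOINCREMENT;
CREATE INDEX my_table_sort_order_idx ON my_table (sort_order);

-- now the first 100 rows are stable across executions
SELECT TOP 100 START AT 0 * FROM my_table
ORDER BY my_table.sort_order;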
You need to specify an ORDER BY. Queries without an explicit ORDER BY clause make no guarantee about the order in which the rows are returned, and from this result set you take the first 100 rows. As the order of the rows can be different every time, so can your first 100 rows.
You need to use ORDER BY first, followed by ROWNUM. You will get inconsistent results if you don't follow this order.
select * from
(
    select * from TABLE T ORDER BY rowid
)
where ROWNUM <= 100
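If you are on Oracle 12c or later (an assumption), the row-limiting clause gives the same result without the nested subquery:
SELECT * FROM TABLE T
ORDER BY rowid
FETCH FIRST 100 ROWS ONLY;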
I've just restructured my database to use partitioning in Postgres 8.2. Now I have a problem with query performance:
SELECT *
FROM my_table
WHERE time_stamp >= '2010-02-10' and time_stamp < '2010-02-11'
ORDER BY id DESC
LIMIT 100;
There are 45 million rows in the table. Prior to partitioning, this would use a reverse index scan and stop as soon as it hit the limit.
After partitioning (on time_stamp ranges), Postgres does a full index scan of the master table and the relevant partition and merges the results, sorts them, then applies the limit. This takes way too long.
I can fix it with:
SELECT * FROM (
SELECT *
FROM my_table_part_a
WHERE time_stamp >= '2010-02-10' and time_stamp < '2010-02-11'
ORDER BY id DESC
LIMIT 100) t
UNION ALL
SELECT * FROM (
SELECT *
FROM my_table_part_b
WHERE time_stamp >= '2010-02-10' and time_stamp < '2010-02-11'
ORDER BY id DESC
LIMIT 100) t
UNION ALL
... and so on ...
ORDER BY id DESC
LIMIT 100
This runs quickly. The partitions where the time stamps are out of range aren't even included in the query plan.
My question is: Is there some hint or syntax I can use in Postgres 8.2 to prevent the query-planner from scanning the full table but still using simple syntax that only refers to the master table?
Basically, can I avoid the pain of dynamically building the big UNION query over each partition that happens to be currently defined?
EDIT: I have constraint_exclusion enabled (thanks @Vinko Vrsalovic)
Have you tried Constraint Exclusion (section 5.9.4 in the document you've linked to)?
Constraint exclusion is a query optimization technique that improves performance for partitioned tables defined in the fashion described above. As an example:
SET constraint_exclusion = on;
SELECT count(*) FROM measurement WHERE logdate >= DATE '2006-01-01';
Without constraint exclusion, the above query would scan each of the partitions of the measurement table. With constraint exclusion enabled, the planner will examine the constraints of each partition and try to prove that the partition need not be scanned because it could not contain any rows meeting the query's WHERE clause. When the planner can prove this, it excludes the partition from the query plan. You can use the EXPLAIN command to show the difference between a plan with constraint_exclusion on and a plan with it off.
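For instance, with the query from the question, the plan should list only the partitions whose CHECK constraints overlap the date range:
EXPLAIN
SELECT *
FROM my_table
WHERE time_stamp >= '2010-02-10' AND time_stamp < '2010-02-11'
ORDER BY id DESC
LIMIT 100;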
I had a similar problem that I was able to fix by casting the conditions in the WHERE clause.
E.g. (assuming the time_stamp column is of type timestamptz):
WHERE time_stamp >= '2010-02-10'::timestamptz and time_stamp < '2010-02-11'::timestamptz
Also, make sure the CHECK condition on the table is defined the same way...
E.g.:
CHECK (time_stamp < '2010-02-10'::timestamptz)
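In an 8.2-era inheritance setup, that CHECK lives on each child table, e.g. (partition name from the question, date bounds illustrative):
CREATE TABLE my_table_part_a (
    CHECK (time_stamp >= '2010-02-01'::timestamptz
       AND time_stamp <  '2010-02-10'::timestamptz)
) INHERITS (my_table);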
I had the same problem, and it boiled down to two reasons in my case:
I had an indexed column of type timestamp WITH time zone and a partition constraint on this column with type timestamp WITHOUT time zone.
After fixing the constraints, an ANALYZE of all child tables was needed.
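With the partition names from the question, that second step is just:
ANALYZE my_table_part_a;
ANALYZE my_table_part_b;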
Edit: another bit of knowledge - it's important to remember that constraint exclusion (which allows PG to skip scanning some tables based on your partitioning criteria) doesn't work with, quote, "non-immutable functions such as CURRENT_TIMESTAMP".
I had queries with CURRENT_DATE, and that was part of my problem.
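A sketch of the difference (table name from the question; the literal stands in for a value computed on the client side):
-- constraint exclusion cannot prune partitions here,
-- because CURRENT_DATE is not immutable at plan time:
SELECT * FROM my_table WHERE time_stamp >= CURRENT_DATE;

-- with a constant, the planner can prove which partitions to skip:
SELECT * FROM my_table WHERE time_stamp >= DATE '2010-02-10';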