PostgreSQL limiting results by year - sql

I have a working PostgreSQL query. The column "code" is common to both tables, and table test.a has a date column. I want to limit the search results to a given year. The dates look like 2010-08-25.
SELECT *
FROM test.a
WHERE form IN ('xyz')
AND code IN (
SELECT code
FROM test.city)
Any help is appreciated.

To return rows with date_col values in the year 2010:
SELECT *
FROM test.a
WHERE form = 'xyz'
AND EXISTS (
SELECT 1
FROM test.city
WHERE code = a.code
)
AND date_col >= '2010-01-01'
AND date_col < '2011-01-01';
This way, the query can use an index on date_col (or, ideally on (form, date_col) or (form, code, date_col) for this particular query). And the filter works correctly for data type date and timestamp alike (you did not disclose data types, the "date format" is irrelevant).
If performance is of any concern, do not use an expression like EXTRACT(YEAR FROM dateColumn) = 2010. While that seems clean and simple to the human eye it kills performance in a relational DB. The left-hand expression has to be evaluated for every row of the table before the filter can be tested. What's more, simple indexes cannot be used. (Only an expression index on (EXTRACT(YEAR FROM dateColumn)) would qualify.) Not important for small tables, crucial for big tables.
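To make the half-open range concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than PostgreSQL. The table contents and the helper year_bounds are made up for illustration, dates are stored as ISO text (an SQLite convention), and strftime stands in for EXTRACT. It only shows that both filters select the same rows; it does not measure performance.

```python
import sqlite3


def year_bounds(year):
    """Return the half-open [start, end) bounds of a calendar year as ISO strings."""
    return f"{year:04d}-01-01", f"{year + 1:04d}-01-01"


# Hypothetical miniature version of test.a; only the date column matters here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (code TEXT, date_col TEXT);
    CREATE INDEX a_date_idx ON a (date_col);
    INSERT INTO a VALUES
        ('p', '2009-12-31'), ('q', '2010-01-01'),
        ('r', '2010-08-25'), ('s', '2010-12-31'), ('t', '2011-01-01');
""")

start, end = year_bounds(2010)

# Range filter: compares the bare column, so a plain index on date_col can apply.
range_rows = conn.execute(
    "SELECT code FROM a WHERE date_col >= ? AND date_col < ? ORDER BY code",
    (start, end),
).fetchall()

# Function filter: the expression is computed per row before it can be compared.
func_rows = conn.execute(
    "SELECT code FROM a WHERE strftime('%Y', date_col) = '2010' ORDER BY code"
).fetchall()

print(range_rows)
print(func_rows)
```

Both queries return the rows for 'q', 'r' and 's'; the boundary rows from 2009-12-31 and 2011-01-01 are excluded by either form.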
EXISTS can be faster than IN, except for simple cases where the query plan ends up being the same. The opposite NOT IN can be a trap if NULL values are involved, though:
Select rows which are not present in other table
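The NOT IN trap can be reproduced in a few lines. This is a sketch using Python's built-in sqlite3 module instead of PostgreSQL (the three-valued NULL logic shown here is standard SQL behavior); the tables and data are made up for illustration.

```python
import sqlite3

# Hypothetical miniature tables; names mirror the question, data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (code TEXT);
    CREATE TABLE city (code TEXT);
    INSERT INTO a VALUES ('x'), ('y');
    INSERT INTO city VALUES ('x'), (NULL);
""")

# NOT IN: 'y' NOT IN ('x', NULL) evaluates to NULL, so no row qualifies.
not_in_rows = conn.execute(
    "SELECT code FROM a WHERE code NOT IN (SELECT code FROM city)"
).fetchall()

# NOT EXISTS: the NULL row never matches, so 'y' is returned as expected.
not_exists_rows = conn.execute(
    "SELECT code FROM a WHERE NOT EXISTS "
    "(SELECT 1 FROM city WHERE city.code = a.code)"
).fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [('y',)]
```

A single NULL in the subquery result is enough to make NOT IN return nothing, which is why NOT EXISTS is usually the safer way to write an anti-join.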

If by "limit" you mean "filter", then I can give you an option
SELECT
*
FROM
test_a
WHERE
form IN ('xyz')
AND code IN (
SELECT code
FROM test_city
)
AND EXTRACT(YEAR FROM dateColumn) = 2010;
db-fiddle for you to run and play with it: https://www.db-fiddle.com/f/5ELU6xinJrXiQJ6u6VH5/6

Related

Dynamic start date from specific column in a table (sysdate)

I am pretty new in this field, trying to learn slowly so please be patient with me :)
My database contains a table called t_usage_interval. In this table there is a column name ID_Interval. Each month a new random 10 digit number is created in this column.
This is the query I am using
Is there a way to pull the latest interval using the DT_START column together with SYSDATE? I guess it would be a dynamic query that searches from SYSDATE and displays the latest ID_Interval.
Thank you,
A
This is how I understood the question.
A straightforward query returns the row(s) whose dt_start is the latest one in that table that is less than or equal to sysdate (you might also use trunc(sysdate) if you don't care about the time component). The drawback of this query is that it scans the t_usage_interval table twice.
select *
from t_usage_interval a
where a.dt_start = (select max(b.dt_start)
from t_usage_interval b
where b.dt_start <= sysdate
);
A somewhat less intuitive option is to rank the rows (whose dt_start is less than or equal to sysdate) by dt_start in descending order, and then return the row(s) ranked first. This option scans the table only once, so it should perform better.
with temp as
(select a.*,
rank() over (order by a.dt_start desc) rn
from t_usage_interval a
where a.dt_start <= sysdate
)
select t.*
from temp t
where t.rn = 1;

Getting peak value of a column in table till this date

I have an Oracle table with columns {date, id, profit, max_profit}.
I have data in date and profit, and I want max_profit to hold the highest profit up to each row's date. I am using the query below:
UPDATE MY_TABLE a SET a.MAX_PROFIT = (SELECT MAX(b.PROFIT)
FROM MY_TABLE b WHERE b.DATE <= a.DATE
AND a.id = b.id)
This gives me the correct result, but I have millions of rows and the query takes considerable time. Is there a faster way of doing it?
You can use a MERGE statement with an analytic function:
MERGE INTO my_table dst
USING (
SELECT ROWID rid,
MAX( profit ) OVER ( PARTITION BY id ORDER BY "DATE" ) AS max_profit
FROM my_table
) src
ON ( src.rid = dst.ROWID )
WHEN MATCHED THEN
UPDATE SET max_profit = src.max_profit;
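For readers unfamiliar with analytic functions, this is roughly what MAX(profit) OVER (PARTITION BY id ORDER BY "DATE") computes: a running maximum per id, in date order. The sketch below is plain Python with made-up rows and a hypothetical helper name, and it assumes at most one row per (id, date) and no NULL profits; it illustrates the result, not how Oracle executes it.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical rows: (date, id, profit). ISO date strings sort chronologically.
rows = [
    ("2020-01-01", 1, 10), ("2020-01-02", 1, 7), ("2020-01-03", 1, 12),
    ("2020-01-01", 2, 5), ("2020-01-02", 2, 9),
]


def running_max_by_id(rows):
    """Map (id, date) to the highest profit seen for that id up to that date."""
    groups = defaultdict(list)
    # Sort by id, then date, mirroring PARTITION BY id ORDER BY date.
    for date, id_, profit in sorted(rows, key=lambda r: (r[1], r[0])):
        groups[id_].append((date, profit))

    result = {}
    for id_, items in groups.items():
        # accumulate(..., max) yields the running maximum within one partition.
        maxima = accumulate((profit for _, profit in items), max)
        for (date, _), peak in zip(items, maxima):
            result[(id_, date)] = peak
    return result


peaks = running_max_by_id(rows)
for key in sorted(peaks):
    print(key, peaks[key])
```

Each partition is walked once, which is the reason the analytic version avoids the per-row subquery of the original UPDATE.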
When you do something like "SELECT MAX(...)", you're going to scan all the records matched by the "WHERE" part of the query, so you want to make getting all those records as easy on the database as possible.
Do you have an index on the table that includes the id and date columns?
Depending on the behavior of this application, if you're doing a lot fewer updates/inserts (as opposed to doing a ton of reads during reporting or some other process), a possible performance enhancement might be to keep the value you're storing in the max_profit column up to date somewhere while you're changing the data. Have you considered a separate table that just stores the profit calculation for each possible date?

having dates as column name while executing select statement

I want to display dates as column names while executing select statement, like
select number as sysdate from employee;
But this is not working. How can I do this?
SQL queries work on fixed columns, i.e. the result columns (and their names) are known before executing the query.
Of course you can work with column names like "today" and the like:
select
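-- trunc() drops the time part of sysdate; assumes mydate holds dates without a time component
sum(case when mydate = trunc(sysdate) then value end) as sum_today,
sum(case when mydate = trunc(sysdate) - 1 then value end) as sum_yesterday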
from mytable;
but SQL cannot change the column names.
Anyway, users usually don't work with some geek SQL IDE, but with a program or website written by us. So why bother? Have SQL get you the data and then care about the layout (often with a loop and a grid) in your app.
Simply
select mydate, sum(value) as sum_value
from mytable
group by mydate
order by mydate;
and do the rest in your app.
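As a sketch of "the rest", the snippet below takes rows shaped like the result of the grouped query (mydate, sum_value) and lays them out with the dates as column headers. It is plain Python with made-up data; the function name render_dates_as_columns is hypothetical, and a real application would more likely feed a grid or table widget than build a string.

```python
# Hypothetical result of the grouped query: (mydate, sum_value), ordered by date.
rows = [("2024-01-01", 10), ("2024-01-02", 25), ("2024-01-03", 5)]


def render_dates_as_columns(rows):
    """Turn (date, value) rows into two text lines: dates on top, values below."""
    headers = [str(d) for d, _ in rows]
    values = [str(v) for _, v in rows]
    # Pad each column to the wider of its header and its value.
    widths = [max(len(h), len(v)) for h, v in zip(headers, values)]
    header_line = " | ".join(h.ljust(w) for h, w in zip(headers, widths))
    value_line = " | ".join(v.ljust(w) for v, w in zip(values, widths))
    return header_line + "\n" + value_line


print(render_dates_as_columns(rows))
```

The query keeps its fixed column names, and the dates become headers only at display time.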

Filling gaps in DATE field

I am querying a DATE field:
SELECT DATE ,
FIELD2 ,
FIELD3
into Table_new
FROM Table_old
WHERE (criteria iLIKE '%xxxyyy%')
The DATE field runs from 10/1/2010 to present, but it has missing days along the way. When I export the data (in Tableau, for example), I need the data to line up with a calendar that DOES NOT have any missing dates. This means I need a space/holder for a date, even if no data exists for that date in the query. How can I achieve this?
Right now I am exporting the data, and manually creating a space where no data for a date exists, which is extremely inefficient.
Tableau can do this natively. No need to alter your data set. You just need to make sure that your DATE field is of the date type in Tableau and then show empty columns/rows.
[Screenshots omitted: the test data, the view before showing empty columns, how to show empty columns, and the end result.]
If you want to then restrict those dates, you can add the date field to the filter, select your date range, and Apply to Context.
In Postgres, you can easily generate the dates:
select d.date, t.field2, t.field3
from (select generate_series(mm.mindate, mm.maxdate, interval '1 day') as date
from (select min(date) as mindate, max(date) as maxdate
from table_old
where criteria ilike '%xxxyyy%'
) mm
) d left join
table_old t
on t.date = d.date and
t.criteria ilike '%xxxyyy%';
This returns all dates between the minimum and maximum for the criteria. If you have another date range in mind, just use that for the generate_series().
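Note: The final condition on criteria needs to go in the on clause, not a where clause. In a where clause it would filter out the generated dates that have no matching row, which turns the left join back into an inner join.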
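If the gap filling has to happen after export instead, the same idea (generate every day in the range, then left-join the data onto it) can be sketched in plain Python. The function name fill_date_gaps and the sample rows are hypothetical, and the sketch assumes at most one row per date; with several rows per date you would group them first.

```python
from datetime import date, timedelta


def fill_date_gaps(rows):
    """rows: list of (date, field2, field3). Returns one entry per calendar day
    between the earliest and latest date, with None for days that have no data."""
    if not rows:
        return []
    by_date = {d: (f2, f3) for d, f2, f3 in rows}
    start, end = min(by_date), max(by_date)

    filled = []
    day = start
    while day <= end:
        # Missing days get placeholder values, like the NULLs of a left join.
        f2, f3 = by_date.get(day, (None, None))
        filled.append((day, f2, f3))
        day += timedelta(days=1)
    return filled


# Hypothetical data with a two-day gap.
sample = [
    (date(2010, 10, 1), "a", 1),
    (date(2010, 10, 4), "b", 2),
]
for row in fill_date_gaps(sample):
    print(row)
```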

Oracle: SELECT where date is less if not equals null

I have a table of records, and one column holds the date on which a record becomes inactive.
Most of the records are still open, and therefore do not hold any value in the end_date column.
I want to select all of the records that are still active. One way to achieve this (off the top of my head):
select *
from table t
where nvl(t.end_date, to_date('2099-DEC-31', 'YYYY-MON-DD')) > sysdate
But it doesn't feel right. Is there a better way to achieve what I want?
EDIT: BTW, the table isn't huge, and isn't going to grow :)
select *
from table t
where nvl(t.end_date, to_date('2099-DEC-31', 'YYYY-MON-DD')) > sysdate
won't use a "normal", non function based index, so it may hurt performance.
You could query it like
select *
from table t
where t.end_date > sysdate OR t.end_date is null
instead
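The reason the IS NULL branch is needed is that a comparison with NULL is never true, so open records would silently drop out of a plain end_date > sysdate filter. A minimal sketch with Python's built-in sqlite3 module (not Oracle) shows the effect; the table t, its rows, and the fixed "now" value are made up, and dates are stored as ISO text.

```python
import sqlite3

# Hypothetical records: 1 is still open, 2 has ended, 3 ends in the future.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, end_date TEXT);
    INSERT INTO t VALUES (1, NULL), (2, '2000-01-01'), (3, '2999-01-01');
""")

now = "2024-06-01"  # fixed stand-in for sysdate so the result is reproducible

# Comparison only: the NULL row is lost because NULL > now is not true.
comparison_only = conn.execute(
    "SELECT id FROM t WHERE end_date > ? ORDER BY id", (now,)
).fetchall()

# Comparison plus explicit NULL check: open records are included.
with_null_check = conn.execute(
    "SELECT id FROM t WHERE end_date > ? OR end_date IS NULL ORDER BY id", (now,)
).fetchall()

print(comparison_only)   # [(3,)]
print(with_null_check)   # [(1,), (3,)]
```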