Related
I have a requirement where I need to retrieve Row(s) 60 days prior to their "Retest Date" which is a column present in the table. I have also attached the screenshot and the field "Retest Date" is highlighted.
reagentlotid
reagentlotdesc
u_retest
RL-0000004
NULL
2021-09-30 17:00:00.00
RL-0000005
NULL
2021-09-29 04:21:00.00
RL-0000006
NULL
2021-09-29 04:22:00.00
RL-0000007
Y-T4
2021-08-28 05:56:00.00
RL-0000008
NULL
2021-09-30 05:56:00.00
RL-0000009
NULL
2021-09-28 04:23:00.00
This is what I was trying to do in SQL Server:
select r.reagentlotid, r.reagentlotdesc, r.u_retestdt
from reagentlot r
where u_retestdt = DATEADD(DD,60,GETDATE());
But, it didn't work. The above query returning 0 rows.
Could please someone help me with this query?
Use a range, if you want all data from the day 60 days hence:
select r.reagentlotid, r.reagentlotdesc, r.u_retestdt
from reagentlot r
where
u_retestdt >= CAST(DATEADD(DD,60,GETDATE())
AS DATE) AND
u_retestdt < CAST(DATEADD(DD,61,GETDATE()) AS DATE)
Dates are like numbers; the time is like a decimal part. 12:00:00 is half way through a day so it's like x.5 - SQLServer even lets you manipulate datetime types by adding fractions of days etc (adding 0.5 is adding 12h)
If you had a column of numbers like 1.1, 1.5. 2.4 and you want all the one-point-somethings you can't get any of them by saying score = 1; you say score >= 1 and score < 2
Generally, you should try to avoid manipulating table data in a query's WHERE clause because it usually makes indexes unusable: if you want "all numbers between 1 and 2", use a range; don't chop the decimal off the table data in order to compare it to 1. Same with dates; don't chop the time off - use a range:
--yes
WHERE score >= 1 and score < 2
--no
WHERE CAST(score as INTEGER) = 1
--yes
WHERE birthdatetime >= '1970-01-01' and birthdatetime < '1970-01-02'
--no
WHERE CAST(birthdatetime as DATE) = '1970-01-01'
Note that I am using a CAST to cut the time off in my recommendation to you, but that's to establish a pair of constants of "midnight on the day 60 days in the future" and "midnight on 61 days in the future" that will be used in the range check.
Follow the rule of thumb of "avoid calling functions on columns in a where clause" and generally, you'll be fine :)
Try something like this. -60 days may be the current or previous year. HTH
;with doy1 as (
select DATENAME(dayofyear, dateadd(day,-60,GetDate())) as doy
)
, doy2 as (
select case when doy > 0 then doy
when doy < 0 then 365 - doy end as doy
, case when doy > 0 then year(getdate())
when doy < 0 then year(getdate())-1 end as yr
from doy1
)
select r.reagentlotid
, r.reagentlotdesc
, cast(r.u_retestdt as date) as u_retestdt
from reagentlot r
inner join doy2 d on DATENAME(dayofyear, r.u_retestdt) = d.doy
where DATENAME(dayofyear, r.u_retestdt) = doy
and year(r.u_retestdt) = d.yr
I need to write a Postgres script to loop over all the rows I have in a particular table, get the date column in that row, find all rows in a different table that have that date and assign that row the ID of the first row in its foreign key column. Explaining further, my data (simplified) looks like:
first_table:
id INT, run_date DATE
second table:
id INT, target_date TIMESTAMP, first_table_id INT
pseudocode:
first_table_array = SELECT * FROM first_table;
i, val = range first_table_array {
UPDATE second_table SET first_table_id={val.id} WHERE date={val.run_date}
}
But it's not actually that simple, because I'm comparing a date against a timestamp (which I presume will cause some problems?), and I need to refine the time period as well.
So instead of doing where date={val.run_date} I actually need to check whether the date is run_date starting at 4am, and going into 4am the following day.
So say run_date is 01/01/2015, I need to convert that value into a timestamp of 01/01/2015 04:00:00, and then create another timestamp of 01/02/2015 04:00:00, and do:
where date > first_date AND date < second_date.
So, something like this:
pseudocode:
first_table_array = SELECT * FROM first_table;
i, val = range first_table_array {
first_date = timestamp(val.run_date) + 4 hrs
second_date = timestamp(val.run_date + 1 day) + 4 hrs
UPDATE second_table SET first_table_id={val.id} WHERE date > first_date AND date < second_date
}
Is postgres capable of this sort of thing, or am I better off writing a script to do this? Do I need to use loops, or could I accomplish this stuff just using joins and standard SQL queries?
I did the best I could. A data sample would be great.
UPDATE second_table ST
SET ST.id = FT.id
FROM first_table FT
WHERE ST.date BETWEEN FT.run_date + interval '4 hour'
AND FT.run_date + interval '4 hour' + interval '1 day'
How to do an update + join in PostgreSQL?
http://www.postgresql.org/docs/9.1/static/functions-datetime.html
NOTE:
BETWEEN is equivalent to use <= and >=
Can I have a view with an infinite number of rows? I don't want to
select all the rows at once, but is it possible to have a view that
represents a repeating weekly schedule, with rows for any date?
I have a database with information about businesses, their hours on
different days of the week. Their names:
# SELECT company_name FROM company;
company_name
--------------------
Acme, Inc.
Amalgamated
...
(47 rows)
Their weekly schedules:
# SELECT days, open_time, close_time
FROM hours JOIN company USING(company_id)
WHERE company_name='Acme, Inc.';
days | open_time | close_time
---------+-----------+-----------
1111100 | 08:30:00 | 17:00:00
0000010 | 09:00:00 | 12:30:00
Another table, not shown, has holidays they're closed.
So I can trivially create a user-defined function in the form of a
stored procedure that takes a particular date as an argument and
returns the business hours of each company:
SELECT company_name,open_time,close_time FROM schedule_for(current_date);
But I want to do it as a table query, in order that any
SQL-compatible host-language library will have no problem
interfacing with it, like this:
SELECT company_name, open_time, close_time
FROM schedule_view
WHERE business_date=current_date;
Relational database theory tells me that tables (relations) are
functions in the sense of being a unique mapping from each
primary key to a row (tuple). Obviously if the WHERE clause on
the above query were omitted it would result in a table (view)
having an infinite number of rows, which would be a practical issue. But
I'm willing to agree never to query such a view without a WHERE
clause that restricts the number of rows.
How can such a view be created (in PostgreSQL)? Or is a view even the way to do what I want?
Update
Here are some more details about my tables. The days of the week are saved as bits, and I select the appropriate row using a bit mask that has a single bit shifted once for each day of the requested week. To wit:
The company table:
# \d company
Table "company"
Column | Type | Modifiers
----------------+------------------------+-----------
company_id | smallint | not null
company_name | character varying(128) | not null
timezone | timezone | not null
The hours table:
# \d hours
Table "hours"
Column | Type | Modifiers
------------+------------------------+-----------
company_id | smallint | not null
days | bit(7) | not null
open_time | time without time zone | not null
close_time | time without time zone | not null
The holiday table:
# \d holiday
Table "holiday"
Column | Type | Modifiers
---------------+----------+-----------
company_id | smallint | not null
month_of_year | smallint | not null
day_of_month | smallint | not null
The function I currently have that does what I want (besides invocation) is defined as:
CREATE FUNCTION schedule_for(requested_date date)
RETURNS table(company_name text, open_time timestamptz, close_time timestamptz)
AS $$
WITH field AS (
/* shift the mask as many bits as the requested day of the week */
SELECT B'1000000' >> (to_char(requested_date,'ID')::int -1) AS day_of_week,
to_char(requested_date, 'MM')::int AS month_of_year,
to_char(requested_date, 'DD')::int AS day_of_month
)
SELECT company_name,
(requested_date+open_time) AT TIME ZONE timezone AS open_time,
(requested_date+close_time) AT TIME ZONE timezone AS close_time
FROM hours INNER JOIN company USING (company_id)
CROSS JOIN field
CROSS JOIN holiday
/* if the bit-mask anded with the DOW is the DOW */
WHERE (hours.days & field.day_of_week) = field.day_of_week
AND NOT EXISTS (SELECT 1
FROM holiday h
WHERE h.company_id = hours.company_id
AND field.month_of_year = h.month_of_year
AND field.day_of_month = h.day_of_month);
$$
LANGUAGE SQL;
So again, my goal is to be able to get today's schedule by doing this:
SELECT open_time, close_time FROM schedule_view
wHERE company='Acme,Inc.' AND requested_date=CURRENT_DATE;
and also be able to get the schedule for any arbitrary date by doing this:
SELECT open_time, close_time FROM schedule_view
WHERE company='Acme, Inc.' AND requested_date=CAST ('2013-11-01' AS date);
I'm assuming this would require creating the view here referred to as schedule_view but maybe I'm mistaken about that. In any event I want to keep any messy SQL code hidden from usage at the command-line-interface and client-language database libraries, as it currently is in the user-defined function I have.
In other words, I just want to invoke the function I already have by passing the argument in a WHERE clause instead of inside parentheses.
You could create a view with infinite rows by using a recursive CTE. But even that needs a starting point and a terminating condition or it will error out.
A more practical approach with set returning functions (SRF):
WITH x AS (SELECT '2013-10-09'::date AS day) -- supply your date
SELECT company_id, x.day + open_time AS open_ts
, x.day + close_time AS close_ts
FROM (
SELECT *, unnest(arr)::bool AS open, generate_subscripts(arr, 1) AS dow
FROM (SELECT *, string_to_array(days::text, NULL) AS arr FROM hours) sub
) sub2
CROSS JOIN x
WHERE open
AND dow = EXTRACT(ISODOW FROM x.day);
-- AND NOT EXISTS (SELECT 1 FROM holiday WHERE holiday = x.day)
-> SQLfiddle demo. (with constant day)
Expanding SRFs side-by-side is generally frowned upon (and for good reason, it's not in the SQL standard and show surprising behavior if the number of elements is not the same). The new feature WITH ORDINALITY in the upcoming Postgres 9.4 will allow cleaner syntax. Consider this related answer on dba.SE or similarly:
PostgreSQL unnest() with element number
I am assuming bit(7) as most effective data type for days. To work with it, I am converting it to an array in the first subquery sub.
Note the difference between ISODOW and DOW as field pattern for EXTRACT().
Updated question
Your function looks good, except for this line:
CROSS JOIN holiday
Otherwise, if I take the bit-shifting route, I end up with a similar query:
WITH x AS (SELECT '2013-10-09'::date AS day) -- supply your date
,y AS (SELECT day, B'1000000' >> (EXTRACT(ISODOW FROM day)::int - 1) AS dow
FROM x)
SELECT c.company_name, y.day + open_time AT TIME ZONE c.timezone AS open_ts
, y.day + close_time AT TIME ZONE c.timezone AS close_ts
FROM hours h
JOIN company c USING (company_id)
CROSS JOIN y
WHERE h.days & y.dow = y.dow;
AND NOT EXISTS ...
EXTRACT(ISODOW FROM requested_date)::int is just a faster equivalent of to_char(requested_date,'ID')::int
"Pass" day in WHERE clause?
To make that work you would have to generate a huge temporary table covering all possible days before selecting rows for the day in the WHERE clause. Possible (I would employ generate_series()), but very expensive.
My answer to your first draft is a smaller version of this: I expand all rows only for a pattern week before selecting the day matching the date in the WHERE clause. The tricky part is to display timestamps built from the input in the WHERE clause. Not possible. You are back to the huge table covering all days. Unless you have only few companies and a decently small date range, I would not go there.
This is built off the previous answers.
The sample data:
CREATE temp TABLE company (company_id int, company text);
INSERT INTO company VALUES
(1, 'Acme, Inc.')
,(2, 'Amalgamated');
CREATE temp TABLE hours(company_id int, days bit(7), open_time time, close_time time);
INSERT INTO hours VALUES
(1, '1111100', '08:30:00', '17:00:00')
,(2, '0000010', '09:00:00', '12:30:00');
create temp table holidays(company_id int, month_of_year int, day_of_month int);
insert into holidays values
(1, 1, 1),
(2, 1, 1),
(2, 1, 12) -- this was a saturday in 2013
;
First, just matching a date's day of week against the hours table's day of week, using the logic you provided:
select *
from company a
left join hours b
on a.company_id = b.company_id
left join holidays c
on b.company_id = c.company_id
where (b.days & (B'1000000' >> (to_char(current_date,'ID')::int -1)))
= (B'1000000' >> (to_char(current_date,'ID')::int -1))
;
Postgres lets you create custom operators to simplify expressions like in that where clause, so you might want an operator that matches the day of week between a bit string and a date. First the function that performs the test:
CREATE FUNCTION match_day_of_week(bit, date)
RETURNS boolean
AS $$
select ($1 & (B'1000000' >> (to_char($2,'ID')::int -1))) = (B'1000000' >> (to_char($2,'ID')::int -1))
$$
LANGUAGE sql IMMUTABLE STRICT;
You could stop there make in your where clause look something like "where match_day_of_week(days, some-date)". The custom operator makes this look a little prettier:
CREATE OPERATOR == (
leftarg = bit,
rightarg = date,
procedure = match_day_of_week
);
Now you've got syntax sugar to simplify that predicate. Here I've also added in the next test (that the month_of_year and day_of_month of a holiday don't correspond with the supplied date):
select *
from company a
left join hours b
on a.company_id = b.company_id
left join holidays c
on b.company_id = c.company_id
where b.days == current_date
and extract(month from current_date) != month_of_year
and extract(day from current_date) != day_of_month
;
For simplicity I start by adding an extra type (another awesome postgres feature) to encapsulate the month and day of the holiday.
create type month_day as (month_of_year int, day_of_month int);
Now repeat the process above to make another custom operator.
CREATE FUNCTION match_day_of_month(month_day, date)
RETURNS boolean
AS $$
select extract(month from $2) = $1.month_of_year
and extract(day from $2) = $1.day_of_month
$$
LANGUAGE sql IMMUTABLE STRICT;
CREATE OPERATOR == (
leftarg = month_day,
rightarg = date,
procedure = match_day_of_month
);
Finally, the original query is reduced to this:
select *
from company a
left join hours b
on a.company_id = b.company_id
left join holidays c
on b.company_id = c.company_id
where b.days == current_date
and not ((c.month_of_year, c.day_of_month)::month_day == current_date)
;
Reducing that down to a view looks like this:
create view x
as
select b.days,
(c.month_of_year, c.day_of_month)::month_day as holiday,
a.company_id,
b.open_time,
b.close_time
from company a
left join hours b
on a.company_id = b.company_id
left join holidays c
on b.company_id = c.company_id
;
And you could use that like this:
select company_id, open_time, close_time
from x
where days == current_date
and not (holiday == current_date)
;
Edit: You'll need to work on this logic a bit, by the way - this was more about showing the idea of how to do it with custom operators. For starters, if a company has multiple holidays defined you'll likely get multiple results back for that company.
I posted a similar response on PostgreSQL mailing list. Basically, avoiding the use of a function-invocation API in this situation is likely a foolish decision. The function call is the best API for this use-case. If you have a concrete scenario that you need to support where a function will not work then please provide that and maybe that scenario can be solved without having to compromise the PostgreSQL API. All your comments so far are about planning for an unknown future that very well may never come to be.
I am trying to select dates that have an anniversary in the next 14 days. How can I select based on dates excluding the year? I have tried something like the following.
SELECT * FROM events
WHERE EXTRACT(month FROM "date") = 3
AND EXTRACT(day FROM "date") < EXTRACT(day FROM "date") + 14
The problem with this is that months wrap.
I would prefer to do something like this, but I don't know how to ignore the year.
SELECT * FROM events
WHERE (date > '2013-03-01' AND date < '2013-04-01')
How can I accomplish this kind of date math in Postgres?
TL/DR: use the "Black magic version" below.
All queries presented in other answers so far operate with conditions that are not sargable: they cannot use an index and have to compute an expression for every single row in the base table to find matching rows. Doesn't matter much with small tables. Matters a lot with big tables.
Given the following simple table:
CREATE TABLE event (
event_id serial PRIMARY KEY
, event_date date
);
Query
Version 1. and 2. below can use a simple index of the form:
CREATE INDEX event_event_date_idx ON event(event_date);
But all of the following solutions are even faster without index.
1. Simple version
SELECT *
FROM (
SELECT ((current_date + d) - interval '1 year' * y)::date AS event_date
FROM generate_series( 0, 14) d
CROSS JOIN generate_series(13, 113) y
) x
JOIN event USING (event_date);
Subquery x computes all possible dates over a given range of years from a CROSS JOIN of two generate_series() calls. The selection is done with the final simple join.
2. Advanced version
WITH val AS (
SELECT extract(year FROM age(current_date + 14, min(event_date)))::int AS max_y
, extract(year FROM age(current_date, max(event_date)))::int AS min_y
FROM event
)
SELECT e.*
FROM (
SELECT ((current_date + d.d) - interval '1 year' * y.y)::date AS event_date
FROM generate_series(0, 14) d
,(SELECT generate_series(min_y, max_y) AS y FROM val) y
) x
JOIN event e USING (event_date);
Range of years is deduced from the table automatically - thereby minimizing generated years.
You could go one step further and distill a list of existing years if there are gaps.
Effectiveness co-depends on the distribution of dates. It's better for few years with many rows each.
Simple db<>fiddle to play with here
Old sqlfiddle
3. Black magic version
Create a simple SQL function to calculate an integer from the pattern 'MMDD':
CREATE FUNCTION f_mmdd(date) RETURNS int LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT (EXTRACT(month FROM $1) * 100 + EXTRACT(day FROM $1))::int';
I had to_char(time, 'MMDD') at first, but switched to the above expression which proved fastest in new tests on Postgres 9.6 and 10:
db<>fiddle here
It allows function inlining because EXTRACT(xyz FROM date) is implemented with the IMMUTABLE function date_part(text, date) internally. And it has to be IMMUTABLE to allow its use in the following essential multicolumn expression index:
CREATE INDEX event_mmdd_event_date_idx ON event(f_mmdd(event_date), event_date);
Multicolumn for a number of reasons:
Can help with ORDER BY or with selecting from given years. Read here. At almost no additional cost for the index. A date fits into the 4 bytes that would otherwise be lost to padding due to data alignment. Read here.
Also, since both index columns reference the same table column, no drawback with regard to H.O.T. updates. Read here.
Basic query:
SELECT *
FROM event e
WHERE f_mmdd(e.event_date) BETWEEN f_mmdd(current_date)
AND f_mmdd(current_date + 14);
One PL/pgSQL table function to rule them all
Fork to one of two queries to cover the turn of the year:
CREATE OR REPLACE FUNCTION f_anniversary(_the_date date = current_date, _days int = 14)
RETURNS SETOF event
LANGUAGE plpgsql AS
$func$
DECLARE
d int := f_mmdd($1);
d1 int := f_mmdd($1 + $2 - 1); -- fix off-by-1 from upper bound
BEGIN
IF d1 > d THEN
RETURN QUERY
SELECT *
FROM event e
WHERE f_mmdd(e.event_date) BETWEEN d AND d1
ORDER BY f_mmdd(e.event_date), e.event_date;
ELSE -- wrap around end of year
RETURN QUERY
SELECT *
FROM event e
WHERE f_mmdd(e.event_date) >= d OR
f_mmdd(e.event_date) <= d1
ORDER BY (f_mmdd(e.event_date) >= d) DESC, f_mmdd(e.event_date), event_date;
-- chronological across turn of the year
END IF;
END
$func$;
Call using defaults: 14 days beginning "today":
SELECT * FROM f_anniversary();
Call for 7 days beginning '2014-08-23':
SELECT * FROM f_anniversary(date '2014-08-23', 7);
db<>fiddle here - comparing EXPLAIN ANALYZE
"February 29"
When dealing with anniversaries or "birthdays", you need to define how to deal with the special case "February 29" in leap years.
When testing for ranges of dates, Feb 29 is usually included automatically, even if the current year is not a leap year. The range of days is extended by 1 retroactively when it covers this day.
On the other hand, if the current year is a leap year, and you want to look for 15 days, you may end up getting results for 14 days in leap years if your data is from non-leap years.
Say, Bob is born on the 29th of February:
My query 1. and 2. include February 29 only in leap years. Bob has birthday only every ~ 4 years.
My query 3. includes February 29 in the range. Bob has birthday every year.
There is no magical solution. You have to define what you want for every case.
Test
To substantiate my point I ran an extensive test with all the presented solutions. I adapted each of the queries to the given table and to yield identical results without ORDER BY.
The good news: all of them are correct and yield the same result - except for Gordon's query that had syntax errors, and #wildplasser's query that fails when the year wraps around (easy to fix).
Insert 108000 rows with random dates from the 20th century, which is similar to a table of living people (13 or older).
INSERT INTO event (event_date)
SELECT '2000-1-1'::date - (random() * 36525)::int
FROM generate_series (1, 108000);
Delete ~ 8 % to create some dead tuples and make the table more "real life".
DELETE FROM event WHERE random() < 0.08;
ANALYZE event;
My test case had 99289 rows, 4012 hits.
C - Catcall
WITH anniversaries as (
SELECT event_id, event_date
,(event_date + (n || ' years')::interval)::date anniversary
FROM event, generate_series(13, 113) n
)
SELECT event_id, event_date -- count(*) --
FROM anniversaries
WHERE anniversary BETWEEN current_date AND current_date + interval '14' day;
C1 - Catcall's idea rewritten
Aside from minor optimizations, the major difference is to add only the exact amount of years date_trunc('year', age(current_date + 14, event_date)) to get this year's anniversary, which avoids the need for a CTE altogether:
SELECT event_id, event_date
FROM event
WHERE (event_date + date_trunc('year', age(current_date + 14, event_date)))::date
BETWEEN current_date AND current_date + 14;
D - Daniel
SELECT * -- count(*) --
FROM event
WHERE extract(month FROM age(current_date + 14, event_date)) = 0
AND extract(day FROM age(current_date + 14, event_date)) <= 14;
E1 - Erwin 1
See "1. Simple version" above.
E2 - Erwin 2
See "2. Advanced version" above.
E3 - Erwin 3
See "3. Black magic version" above.
G - Gordon
SELECT * -- count(*)
FROM (SELECT *, to_char(event_date, 'MM-DD') AS mmdd FROM event) e
WHERE to_date(to_char(now(), 'YYYY') || '-'
|| (CASE WHEN mmdd = '02-29' THEN '02-28' ELSE mmdd END)
,'YYYY-MM-DD') BETWEEN date(now()) and date(now()) + 14;
H - a_horse_with_no_name
WITH upcoming as (
SELECT event_id, event_date
,CASE
WHEN date_trunc('year', age(event_date)) = age(event_date)
THEN current_date
ELSE cast(event_date + ((extract(year FROM age(event_date)) + 1)
* interval '1' year) AS date)
END AS next_event
FROM event
)
SELECT event_id, event_date
FROM upcoming
WHERE next_event - current_date <= 14;
W - wildplasser
CREATE OR REPLACE FUNCTION this_years_birthday(_dut date)
RETURNS date
LANGUAGE plpgsql AS
$func$
DECLARE
ret date;
BEGIN
ret := date_trunc('year' , current_timestamp)
+ (date_trunc('day' , _dut)
- date_trunc('year' , _dut));
RETURN ret;
END
$func$;
Simplified to return the same as all the others:
SELECT *
FROM event e
WHERE this_years_birthday( e.event_date::date )
BETWEEN current_date
AND current_date + '2weeks'::interval;
W1 - wildplasser's query rewritten
The above suffers from a number of inefficient details (beyond the scope of this already sizable post). The rewritten version is much faster:
CREATE OR REPLACE FUNCTION this_years_birthday(_dut INOUT date)
LANGUAGE sql AS
$func$
SELECT (date_trunc('year', now()) + ($1 - date_trunc('year', $1)))::date
$func$;
SELECT *
FROM event e
WHERE this_years_birthday(e.event_date) BETWEEN current_date
AND (current_date + 14);
Test results
I ran this test with a temporary table on PostgreSQL 9.1.7.
Results were gathered with EXPLAIN ANALYZE, best of 5.
Results
Without index
C: Total runtime: 76714.723 ms
C1: Total runtime: 307.987 ms -- !
D: Total runtime: 325.549 ms
E1: Total runtime: 253.671 ms -- !
E2: Total runtime: 484.698 ms -- min() & max() expensive without index
E3: Total runtime: 213.805 ms -- !
G: Total runtime: 984.788 ms
H: Total runtime: 977.297 ms
W: Total runtime: 2668.092 ms
W1: Total runtime: 596.849 ms -- !
With index
E1: Total runtime: 37.939 ms --!!
E2: Total runtime: 38.097 ms --!!
With index on expression
E3: Total runtime: 11.837 ms --!!
All other queries perform the same with or without index because they use non-sargable expressions.
Conclusion
So far, #Daniel's query was the fastest.
#wildplassers (rewritten) approach performs acceptably, too.
#Catcall's version is something like the reverse approach of mine. Performance gets out of hand quickly with bigger tables.
The rewritten version performs pretty well, though. The expression I use is something like a simpler version of #wildplassser's this_years_birthday() function.
My "simple version" is faster even without index, because it needs fewer computations.
With index, the "advanced version" is about as fast as the "simple version", because min() and max() become very cheap with an index. Both are substantially faster than the rest which cannot use the index.
My "black magic version" is fastest with or without index. And it is very simple to call.
The updated version (after the benchmark) is a bit faster, yet.
With a real life table an index will make even greater difference. More columns make the table bigger, and sequential scan more expensive, while the index size stays the same.
I believe the following test works in all cases, assuming a column named anniv_date:
select * from events
where extract(month from age(current_date+interval '14 days', anniv_date))=0
and extract(day from age(current_date+interval '14 days', anniv_date)) <= 14
As an example of how it works when crossing a year (and also a month), let's say an anniversary date is 2009-01-04 and the date at which the test is run is 2012-12-29.
We want to consider any date between 2012-12-29 and 2013-01-12 (14 days)
age('2013-01-12'::date, '2009-01-04'::date) is 4 years 8 days.
extract(month...) from this is 0 and extract(days...) is 8, which is lower than 14 so it matches.
How about this?
select *
from events e
where to_char(e."date", 'MM-DD') between to_char(now(), 'MM-DD') and
to_char(date(now())+14, 'MM-DD')
You can do the comparison as strings.
To take year ends into account, we'll convert back to dates:
select *
from events e
where to_date(to_char(now(), 'YYYY')||'-'||to_char(e."date", 'MM-DD'), 'YYYY-MM-DD')
between date(now()) and date(now())+14
You do need to make a slight adjustment for Feb 29. I might suggest:
select *
from (select e.*,
to_char(e."date", 'MM-DD') as MMDD
from events
) e
where to_date(to_char(now(), 'YYYY')||'-'||(case when MMDD = '02-29' then '02-28' else MMDD), 'YYYY-MM-DD')
between date(now()) and date(now())+14
For convenience, I created two functions that yield the (expected or past) birsthday in the current year, and the upcoming birthday.
CREATE OR REPLACE FUNCTION this_years_birthday( _dut DATE) RETURNS DATE AS
$func$
DECLARE
ret DATE;
BEGIN
ret =
date_trunc( 'year' , current_timestamp)
+ (date_trunc( 'day' , _dut)
- date_trunc( 'year' , _dut)
)
;
RETURN ret;
END;
$func$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION next_birthday( _dut DATE) RETURNS DATE AS
$func$
DECLARE
ret DATE;
BEGIN
ret =
date_trunc( 'year' , current_timestamp)
+ (date_trunc( 'day' , _dut)
- date_trunc( 'year' , _dut)
)
;
IF (ret < date_trunc( 'day' , current_timestamp))
THEN ret = ret + '1year'::interval; END IF;
RETURN ret;
END;
$func$ LANGUAGE plpgsql;
--
-- call the function
--
SELECT date_trunc( 'day' , t.topic_date) AS the_date
, this_years_birthday( t.topic_date::date ) AS the_day
, next_birthday( t.topic_date::date ) AS next_day
FROM topic t
WHERE this_years_birthday( t.topic_date::date )
BETWEEN current_date
AND current_date + '2weeks':: interval
;
NOTE: the casts are needed because I only had timestamps available.
This should handle wrap-arounds at the end of the year as well:
with upcoming as (
select name,
event_date,
case
when date_trunc('year', age(event_date)) = age(event_date) then current_date
else cast(event_date + ((extract(year from age(event_date)) + 1) * interval '1' year) as date)
end as next_event
from events
)
select name,
next_event,
next_event - current_date as days_until_next
from upcoming
order by next_event - current_date
You can filter than on the expression next_event - current_date to apply the "next 14 days"
The case ... is only necessary if you consider events that would be "today" as "upcoming" as well. Otherwise, that can be reduced to the else part of the case statement.
Note that I "renamed" the column "date" to event_date. Mainly because reserved words shouldn't be used as an identifier but also because date is a terrible column name. It doesn't tell you anything about what it stores.
You can generate a virtual table of anniversaries, and select from it.
with anniversaries as (
select event_date,
(event_date + (n || ' years')::interval)::date anniversary
from events, generate_series(1,10) n
)
select event_date, anniversary
from anniversaries
where anniversary between current_date and current_date + interval '14' day
order by event_date, anniversary
The call to generate_series(1,10) has the effect of generating 10 years of anniversaries for each event_date. I wouldn't use the literal value 10 in production. Instead, I'd either calculate the right number of years to use in a subquery, or I'd use a large literal like 100.
You'll want to adjust the WHERE clause to fit your application.
If you have a performance problem with the virtual table (when you have a lot of rows in "events"), replace the common table expression with a base table having the identical structure. Storing anniversaries in a base table makes their values obvious (especially for, say, Feb 29 anniversaries), and queries on such a table can use an index. Querying an anniversary table of half a million rows using just the SELECT statement above takes 25ms on my desktop.
I found a way to do it.
SELECT EXTRACT(DAYS FROM age('1999-04-10', '2003-05-12')),
EXTRACT(MONTHS FROM age('1999-04-10', '2003-05-12'));
date_part | date_part
-----------+-----------
-2 | -1
I can then just check that the month is 0 and the days are less than 14.
If you have a more elegant solution, please do post it. I'll leave the question open for a bit.
I don't work with postgresql so I googled it's date functions and found this: http://www.postgresql.org/docs/current/static/functions-datetime.html
If I read it correctly, looking for events in the next 14 days is as simple as:
where mydatefield >= current_date
and mydatefield < current_date + integer '14'
Of course I might not be reading it correctly.
A series of dates with a specified interval can be generated using a variable and a static date as per the linked question that I asked earlier. However when there's a where clause to produce a start date, the dates generation seems to stop and only shows the first interval date. I also checked other posts, those that I found e.g. 1, e.g. 2, e.g. 3 are shown with a static date or using CTE.. I am looking for a solution without storedprocedures/functions...
This works:
SELECT DATE(DATE_ADD('2012-01-12',
INTERVAL #i:=#i+30 DAY) ) AS dateO
FROM members, (SELECT #i:=0) r
where #i < DATEDIFF(now(), date '2012-01-12')
;
These don't:
SELECT DATE_ADD(date '2012-01-12',
INTERVAL #j:=#j+30 DAY) AS dateO, #j
FROM `members`, (SELECT #j:=0) s
where #j <= DATEDIFF(now(), date '2012-01-12')
and mmid = 100
;
SELECT DATE_ADD(stdate,
INTERVAL #k:=#k+30 DAY) AS dateO, #k
FROM `members`, (SELECT #k:=0) t
where #k <= DATEDIFF(now(), stdate)
and mmid = 100
;
SQLFIDDLE REFERENCE
Expected Results:
Be the same as the first query results given it starts generating dates with stDate of mmid=100.
Preferably in ANSI SQL so it can be supported in MYSQL, SQL Server/MS Access SQL as Oracle has trunc and rownum given per this query with 14 votes and PostGres has generatge_Series function. I would like to know if this is a bug or a limitation in MYSQL?
PS: I have asked a similar quetion before. It was based on static date values where as this one is based on a date value from a table column based on a condition.
The simplest way to insure cross-platform compatibility is to use a calendar table. In its simplest form
create table calendar (
cal_date date primary key
);
insert into calendar values
('2013-01-01'),
('2013-01-02'); -- etc.
There are many ways to generate dates for insertion.
Instead of using a WHERE clause to generate rows, you use a WHERE clause to select rows. To select October of this year, just
select cal_date
from calendar
where cal_date between '2013-10-01' and '2013-10-31';
It's reasonably compact--365,000 rows to cover a period of 1000 years. That ought to cover most business scenarios.
If you need cross-platform date arithmetic, you can add a tally column.
drop table calendar;
create table calendar (
cal_date date primary key,
tally integer not null unique check (tally > 0)
);
insert into calendar values ('2012-01-01', 1); -- etc.
To select all the dates of 30-day intervals, starting on 2012-01-12 and ending at the end of the calendar year, use
select cal_date
from calendar
where ((tally - (select tally
from calendar
where cal_date = '2012-01-12')) % 30 ) = 0;
cal_date
--
2012-01-12
2012-02-11
2012-03-12
2012-04-11
2012-05-11
2012-06-10
2012-07-10
2012-08-09
2012-09-08
2012-10-08
2012-11-07
2012-12-07
If your "mmid" column is guaranteed to have no gaps--an unspoken requirement for a calendar table--you can use the "mmid" column in place of my "tally" column.