PostgreSQL query between date ranges - sql

I am trying to query my PostgreSQL database to return results where a date falls in a certain month and year. In other words, I would like all the values for a given month and year.
The only way I've been able to do it so far is like this:
SELECT user_id
FROM user_logs
WHERE login_date BETWEEN '2014-02-01' AND '2014-02-28'
The problem with this is that I have to calculate the first and last dates before querying the table. Is there a simpler way to do this?
Thanks

With dates (and times) many things become simpler if you use >= start AND < end.
For example:
SELECT
user_id
FROM
user_logs
WHERE
login_date >= '2014-02-01'
AND login_date < '2014-03-01'
In this case you still need to calculate the start date of the month you need, but that should be straightforward in any number of ways.
The end date is also simplified; just add exactly one month. No messing about with 28th, 30th, 31st, etc.
This structure also has the advantage of being able to maintain use of indexes.
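As a minimal sketch of computing the two boundaries on the application side, assuming the query is issued from Python (the helper name month_bounds is hypothetical and not part of the original answer):

```python
from datetime import date
from typing import Tuple


def month_bounds(year: int, month: int) -> Tuple[date, date]:
    """Return (start, end), where end is the first day of the following month."""
    start = date(year, month, 1)
    # After December, roll over to January of the next year.
    if month == 12:
        end = date(year + 1, 1, 1)
    else:
        end = date(year, month + 1, 1)
    return start, end


start, end = month_bounds(2014, 2)
# Bind these as parameters for: login_date >= start AND login_date < end
print(start, end)  # 2014-02-01 2014-03-01
```

Because the end bound is exclusive, the helper never needs to know how many days the month has.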
Many people may suggest a form such as the following, but they do not use indexes:
WHERE
date_part('year', login_date) = 2014
AND date_part('month', login_date) = 2
This involves evaluating the conditions for every single row in the table (a scan) rather than using an index to find the range of matching rows (a range seek).
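The scan-versus-seek difference can be observed with a small experiment. The sketch below uses SQLite (via Python's standard library) purely as a stand-in, with dates stored as ISO text and strftime in place of date_part; PostgreSQL's planner and EXPLAIN output differ in detail, but the principle is the same. The index name is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_logs (user_id INTEGER, login_date TEXT)")
conn.execute("CREATE INDEX idx_login_date ON user_logs (login_date)")


def plan(sql: str) -> str:
    """Return SQLite's query-plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


# Half-open range on the bare column: the index can be used.
range_plan = plan(
    "SELECT user_id FROM user_logs "
    "WHERE login_date >= '2014-02-01' AND login_date < '2014-03-01'"
)
# Function applied to the column: every row has to be evaluated.
function_plan = plan(
    "SELECT user_id FROM user_logs "
    "WHERE strftime('%Y', login_date) = '2014' "
    "AND strftime('%m', login_date) = '02'"
)
print(range_plan)
print(function_plan)
```

The first plan should report a SEARCH using idx_login_date and the second a SCAN of the table. In PostgreSQL, the equivalent check is to run EXPLAIN on both forms.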

From PostgreSQL 9.2, range types are supported, so you can write this as:
SELECT user_id
FROM user_logs
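WHERE '[2014-02-01, 2014-03-01)'::daterange @> login_date
The closing parenthesis makes the upper bound exclusive, so 2014-03-01 itself is not matched. Whether this is more efficient than the plain comparison depends on the available indexes (the literals there are coerced to dates, not compared as strings), so check the plan with EXPLAIN.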

Just in case somebody lands here... since 8.1 you can simply use:
SELECT user_id
FROM user_logs
WHERE login_date BETWEEN SYMMETRIC '2014-02-01' AND '2014-02-28'
From the docs:
BETWEEN SYMMETRIC is the same as BETWEEN except there is no
requirement that the argument to the left of AND be less than or equal
to the argument on the right. If it is not, those two arguments are
automatically swapped, so that a nonempty range is always implied.

SELECT user_id
FROM user_logs
WHERE login_date BETWEEN '2014-02-01' AND '2014-03-01'
The BETWEEN keyword works well with dates. A date without a time part is treated as midnight (00:00:00) of that day. Because BETWEEN includes both endpoints, this query also returns rows dated 2014-03-01.
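To make the endpoint behaviour concrete, here is a small sketch that runs both forms against four made-up rows. It uses SQLite with ISO date strings as a stand-in for PostgreSQL; BETWEEN is inclusive at both ends in either system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_logs (user_id INTEGER, login_date TEXT)")
# Hypothetical sample rows around the February boundaries.
conn.executemany(
    "INSERT INTO user_logs VALUES (?, ?)",
    [(1, "2014-01-31"), (2, "2014-02-01"), (3, "2014-02-28"), (4, "2014-03-01")],
)


def user_ids(where: str) -> list:
    """Return the matching user_id values for a WHERE clause, in order."""
    sql = "SELECT user_id FROM user_logs WHERE " + where + " ORDER BY user_id"
    return [row[0] for row in conn.execute(sql)]


inclusive = user_ids("login_date BETWEEN '2014-02-01' AND '2014-03-01'")
half_open = user_ids("login_date >= '2014-02-01' AND login_date < '2014-03-01'")
print(inclusive)  # [2, 3, 4]
print(half_open)  # [2, 3]
```

The row dated 2014-03-01 is matched by BETWEEN but not by the half-open range.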

Read the documentation.
http://www.postgresql.org/docs/9.1/static/functions-datetime.html
I used a query like this:
WHERE
(
date_trunc('day',table1.date_eval) = '2015-02-09'
)
or
WHERE (date_trunc('day', table1.date_eval) >= '2015-02-09' AND date_trunc('day', table1.date_eval) < '2015-02-10')

Related

Optimization on large tables

I have the following query that joins two large tables. I am trying to join on patient_id and records that are not older than 30 days.
select * from
chairs c
join data id
on c.patient_id = id.patient_id
and to_date(c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') >= 0
and to_date (c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') < 30
Currently, this query takes 2 hours to run. What indexes can I create on these tables to make it run faster?
I will take a shot in the dark, because as others said, it depends on the table structure, the indices, and the output of the planner.
The most obvious thing here is that as long as it is possible, you want to represent dates as some date datatype instead of strings. That is the first and most important change you should make here. No index can save you if you transform strings. Because very likely, the problem is not the patient_id, it's your date calculation.
Other than that, forcing hash joins on the patient_id and then doing the filtering could help if for some reason the planner decided to do nested loops for that condition. But that is for after you fixed your date representation AND you still have a problem AND you see that the planner does nested loops on that attribute.
Some observations if you are stuck with string fields for the dates:
YYYYMMDD date strings are ordered and can be used for <,> and =.
Building strings from the data in chairs to use to JOIN on data will make good use of an index like one on data for patient_id, from_date.
So my suggestion would be to write expressions that build the date strings you want to use in the JOIN. Or to put it another way: do not transform the child table data from a string to something else.
Example expression that takes 30 days off a string date and returns a string date:
select to_char(to_date('20200112', 'YYYYMMDD') - INTERVAL '30 DAYS','YYYYMMDD')
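For comparison, the same string-in, string-out arithmetic can be sketched in Python. This is only an illustration of what the expression above computes; the function name shift_yyyymmdd is hypothetical.

```python
from datetime import datetime, timedelta


def shift_yyyymmdd(value: str, days: int) -> str:
    """Parse a YYYYMMDD string, move it by `days`, and format it back."""
    parsed = datetime.strptime(value, "%Y%m%d")
    return (parsed + timedelta(days=days)).strftime("%Y%m%d")


lower = shift_yyyymmdd("20200112", -30)
print(lower)  # 20191213
# Fixed-width YYYYMMDD strings sort in the same order as the dates they encode.
print(lower < "20200112")  # True
```

Since the result is again a YYYYMMDD string, it can be compared directly with the stored string column.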
Untested:
select * from
chairs c
join data id
on c.patient_id = id.patient_id
and id.from_date between to_char(to_date(c.from_date, 'YYYYMMDD') - INTERVAL '29 DAYS','YYYYMMDD')
and c.from_date -- 29 days back, because the original condition is 0 <= difference < 30
For this query:
select *
from chairs c join data
id
on c.patient_id = id.patient_id and
to_date(c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') >= 0 and
to_date (c.from_date, 'YYYYMMDD') - to_date(id.from_date, 'YYYYMMDD') < 30;
You should start with indexes on (patient_id, from_date) -- you can put them in both tables.
The date comparisons are problematic. Storing the values as actual dates can help. But it is not a 100% solution because comparison operations are still needed.
Depending on what you are actually trying to accomplish there might be other ways of writing the query. I might encourage you to ask a new question, providing sample data, desired results, and a clear explanation of what you really want. For instance, this query is likely to return a lot of rows. And that just takes time as well.
Your query has a non-sargable predicate because it wraps the columns in functions that must be evaluated for every row. You need to drop those functions and compare the columns directly. For example (assuming from_date is stored as a date):
SELECT *
FROM chairs AS c
JOIN data AS id
ON c.patient_id = id.patient_id
AND c.from_date BETWEEN id.from_date AND id.from_date + INTERVAL '29 days'
This will run faster with these two indexes:
CREATE INDEX X_SQLpro_001 ON chairs (patient_id, from_date);
CREATE INDEX X_SQLpro_002 ON data (patient_id, from_date);
Also, try to avoid
SELECT *
and list only the necessary columns.

Timestamp to date in SQL

Here is what I did:
Select count(check_id)
From Checks
Where timestamp::date > '2012-07-31'
Group by 1
Is it right to do it like I did or is there a better way? Should/could I have used the DateDIFF function in my WHERE clause? Something like: DATEDIFF(day, timestamp, '2012/07/31') > 0
Also, I need to figure out how to calculate the total rate of acceptance for this time period. Can anyone provide their expertise with this?
Is it right to do it like I did or is there a better way?
Using a cast like that is a perfectly valid way to convert a timestamp to a date (I don't understand the reference to the non-existing datediff though - why would adding anything to a timestamp change it)
However, the cast has one drawback: if there is an index on the column "timestamp" it won't be used.
But as you just want a range after a certain date, there is no reason to cast the column to begin with.
The following will achieve the same thing as your query, but can make use of an index on the column "timestamp" in case there is one and using it is considered beneficial by the optimizer.
Select count(distinct check_id)
From Checks
Where "timestamp" >= date '2012-07-31' + 1
Note the + 1 which selects the day after, otherwise the query would include rows that are on that date but after midnight.
I removed the unnecessary group by from your query.
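To check that a cast-free predicate selects the same rows as the cast, the two can be compared on a few boundary values. The sketch below models the cast as taking the date part of a timestamp and the cast-free form as a comparison against midnight of the following day; the function names and sample timestamps are made up for illustration.

```python
from datetime import date, datetime, time, timedelta


def matches_cast(ts: datetime, day: date) -> bool:
    """Model of: ts::date > day."""
    return ts.date() > day


def matches_range(ts: datetime, day: date) -> bool:
    """Model of: ts >= day + 1, with the date taken as midnight."""
    next_midnight = datetime.combine(day + timedelta(days=1), time.min)
    return ts >= next_midnight


cutoff = date(2012, 7, 31)
samples = [
    datetime(2012, 7, 31, 0, 0),
    datetime(2012, 7, 31, 23, 59, 59),
    datetime(2012, 8, 1, 0, 0),      # exactly midnight of the next day
    datetime(2012, 8, 1, 0, 0, 1),
]
for ts in samples:
    print(ts, matches_cast(ts, cutoff), matches_range(ts, cutoff))
```

The two predicates agree on every sample. Note that a row stamped exactly at midnight on 2012-08-01 is selected by the cast, so the cast-free comparison needs >= rather than > to match it.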
If you want to get a count per day, then you will need to include the day in the SELECT list. In that case casting is a good way to do it:
Select "timestamp"::date, count(distinct check_id)
From Checks
Where "timestamp" >= date '2012-07-31' + 1
group by "timestamp"::date

Why do date_part and an exact date parameter give different results when querying a date period?

I'm trying to count distinct values in some columns of a table.
I wrote the same logic in two ways,
but I get different results from the two queries.
Can anyone help clarify this for me? I don't know whether the code or my reasoning is wrong.
SQL
select count(distinct membership_id) from members_membership m
where date_part(year,m.membership_expires)>=2019
and date_part(month,m.membership_expires)>=7
and date_part(day,m.membership_expires)>=1
and date_part(year,m.membership_creationdate)<=2019
and date_part(month,m.membership_creationdate)<=7
and date_part(day,m.membership_creationdate)<=1
;
select count(distinct membership_id) from members_membership m
where m.membership_expires>='2019-07-01'
and m.membership_creationdate<='2019-07-01'
;
I actually think that this is the query you intend to run:
SELECT
COUNT(DISTINCT membership_id)
FROM members_membership m
WHERE
m.membership_expires >= '2019-07-01' AND
m.membership_creationdate < '2019-07-01';
It doesn't make sense for a membership to expire at the same moment it gets created, so if it expires on midnight of 1st-July 2019, then it should have been created strictly before that point in time.
That being said, the problem with the first query is that, e.g., the restriction on the month being on or before July would apply to every year, not just 2019. It is difficult to write a date inequality using the year, month, and day terms separately. For this reason, the second version you used is preferable. It is also sargable, meaning that an index on membership_expires or membership_creationdate can be used.
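A quick way to see the difference is to evaluate both forms on one date. The sketch below mirrors the two conditions on membership_expires in plain Python; the function names and the sample date are made up for illustration.

```python
from datetime import date


def part_wise(expires: date) -> bool:
    """Mimics the three separate date_part conditions on membership_expires."""
    return expires.year >= 2019 and expires.month >= 7 and expires.day >= 1


def whole_date(expires: date) -> bool:
    """Mimics membership_expires >= '2019-07-01'."""
    return expires >= date(2019, 7, 1)


sample = date(2020, 1, 15)
print(part_wise(sample))   # False: month 1 fails the "month >= 7" test
print(whole_date(sample))  # True: the date is after 2019-07-01
```

A membership expiring in January 2020 is clearly on or after 2019-07-01, yet the part-by-part version rejects it.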
There is an issue with the first query:
select count(distinct membership_id) from members_membership m
where date_part(year,m.membership_expires)>=2019
and date_part(month,m.membership_expires)>=7
and date_part(day,m.membership_expires)>=1
and date_part(year,m.membership_creationdate)<=2019
and date_part(month,m.membership_creationdate)<=7
and date_part(day,m.membership_creationdate)<=1; -- do you think that any day is less than 1??
-- this condition will be satisfy by only 01-Jul-2019, But I think you need all the dates before 01-Jul-2019
The condition date_part(day,m.membership_creationdate)<=1 is the culprit.
Even membership_creationdate = 15-jan-1901 will not satisfy it.
Compare date columns against complete dates rather than their separate parts to avoid this kind of issue. (Your second query is perfectly fine.)
Cheers!!
The reason could be due to a time component.
The proper comparison for the first query is:
select count(distinct membership_id)
from members_membership m
where m.membership_expires >= '2019-07-01' and
m.membership_creationdate < '2019-07-02'
-- Note: < rather than <=, and the upper bound is the next day ('2019-07-02').
This logic should work regardless of whether or not the "date" has a time component.

Is there a way to turn these multiple SQL queries into one?

I have a query that looks like this:
SELECT COUNT(*)
FROM table
WHERE date_field >= '2018-04-08'
AND date_field <= '2018-04-14'
I need to do this 26 times, for the current week and for 25 previous weeks, with each result separated by a carriage return. Is this possible with a single SQL query or do I need to put it in a loop, as I'm now doing?
Note that this is in FileMaker. I don't think that's relevant, but now you know, just in case.
Look into using "group by". Assuming that you are looking at calendar weeks, grouping by the week of the date will give you the counts per week and an extra condition can limit the overall range.
FileMaker has a WeekOfYearFiscal() function. Its second parameter is the day on which the week starts.
SELECT WeekOfYearFiscal(date_field;2), COUNT(*)
FROM table
WHERE date_field >= '2018-01-01'
AND date_field <= '2018-04-14'
GROUP BY WeekOfYearFiscal(date_field;2)
See this documentation - http://www.filemaker.com/help/12/fmp/html/func_ref1.31.28.html
Give this a go. If the GROUP BY doesn't work on the function, you can nest the inner part and give an alias.
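The grouping idea itself can be sketched outside FileMaker. The Python below buckets dates into Sunday-to-Saturday weeks (matching the 2018-04-08 to 2018-04-14 range in the question) and counts each bucket; the sample dates and function names are invented for the example.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(day: date) -> date:
    """Return the Sunday that starts the week containing `day`."""
    # date.weekday() is 0 for Monday and 6 for Sunday.
    return day - timedelta(days=(day.weekday() + 1) % 7)


def counts_per_week(days):
    """Count how many dates fall in each Sunday-to-Saturday week."""
    return Counter(week_start(d) for d in days)


# Hypothetical rows standing in for date_field values.
rows = [date(2018, 4, 7), date(2018, 4, 8), date(2018, 4, 14), date(2018, 4, 15)]
for start, count in sorted(counts_per_week(rows).items()):
    print(start, count)
```

Each output line is one week's start date and its row count, which is the shape the GROUP BY query is meant to produce in a single pass.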

Specifying custom date range in SQL query

I want to write a query in which I specify a custom date range (instead of a hardcoded one), starting from the order day. The table I am using contains the order date.
As of now I have hardcoded the date range like:
where owh.order_day between TO_DATE('2016/07/15','YYYY/MM/DD') and TO_DATE('2017/01/17','YYYY/MM/DD')
where order_day is a date.
But rather I want something like:
where owh.order_day between TO_DATE(owh.order_day - 1,'YYYY/MM/DD') and TO_DATE(owh.order_day +3,'YYYY/MM/DD')
I am doing "-1" as it's "between", so it will take from order_day - order_day+2
For example, If the order_day is: "17/01/2016" then I want the condition to be where the date range is dynamically calculated as: "16/01/2016 - 20/01/2016" .
Is something like this possible? If yes, how can we achieve it in SQL?
The DB in question is Oracle
Any leads appreciated
Since you have not told us which RDBMS you are using, and since you are saying "any leads appreciated", I suppose we are free to give an answer for any RDBMS. The following will work for MySQL:
BETWEEN DATE_SUB( somedate, INTERVAL 1 DAY ) AND DATE_ADD( somedate, INTERVAL 1 DAY )
(source: https://www.tutorialspoint.com/sql/sql-date-functions.htm#function_date-add)
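Applied to the example in the question (one day before the order day to three days after), the per-row window works out as in this small Python sketch. The function name and its default offsets are assumptions taken from the question's example, not from the answer above.

```python
from datetime import date, timedelta


def order_window(order_day: date, before: int = 1, after: int = 3):
    """Return the inclusive (start, end) range around one order date."""
    start = order_day - timedelta(days=before)
    end = order_day + timedelta(days=after)
    return start, end


start, end = order_window(date(2016, 1, 17))
print(start, end)  # 2016-01-16 2016-01-20
```

The point is that both bounds are derived from the row's own date by plain day arithmetic, so no string conversion of the date is needed.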