How to determine largest resolution of an INTERVAL?

How can I determine the largest resolution of an INTERVAL value? For example:
INTERVAL '100 days and 3 seconds' => day
TIME '20:05' - TIME '12:01:01' => hour
AGE(NOW(), NOW() - INTERVAL '1 MONTH') => month

The question isn't 100% clear so the answer may or may not be exactly what you're looking for, but...
There is a justify_interval() function, which you might want to look into.
test=# select justify_interval(INTERVAL '100 days 3 seconds');
justify_interval
-------------------------
3 mons 10 days 00:00:03
(1 row)
test=# select justify_interval(TIME '20:05' - TIME '12:01:01');
justify_interval
------------------
08:03:59
(1 row)
test=# select justify_interval(AGE(NOW(), NOW() - INTERVAL '1 MONTH'));
justify_interval
------------------
1 mon
(1 row)
From there, extract the year, then the month, then the day, and so on, until you come up with a non-zero answer:
test=# select extract('mon' from interval '3 mons 10 days 00:00:03');
date_part
-----------
3
Re your other question in comments:
create function max_res(interval) returns interval as $$
select case
when extract('year' from justify_interval($1)) > 0 or
extract('mon' from justify_interval($1)) > 0 or
extract('day' from justify_interval($1)) > 0
then '1 day'
when extract('hour' from justify_interval($1)) > 0
then '1 hour'
when ...
end;
$$ language sql immutable strict;

INTERVAL is 16 bytes and is a struct containing months, days and microseconds, with a range of +/- 178000000 years. That range is fixed by the way the value is stored.
Be careful with your understanding of "a month", because a calendar month is not a constant in the same way that an hour or a minute is (e.g. how many days are in February? How many days are there in a year?). It's not always 30 or 365 in reality, and PostgreSQL handles this correctly. Per an interesting conversation on IRC, adding INTERVAL '1 month' to January 30th results in the last day of February, because it increments the tm_mon member of struct tm and, in this case, rolls back to the previous valid date.
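A quick way to see that rollback is the following minimal sketch. The dates are arbitrary examples chosen for illustration, not taken from the IRC conversation:

```sql
-- Adding one month clamps to the last valid day of the target month.
SELECT date '2015-01-30' + interval '1 month';  -- 2015-02-28 00:00:00
SELECT date '2016-01-30' + interval '1 month';  -- 2016-02-29 00:00:00 (leap year)

-- A fixed number of days is not the same thing as "1 month".
SELECT date '2015-01-30' + interval '30 days';  -- 2015-03-01 00:00:00
```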
Ah ha! I get the question now (or at least I think so). You're looking to determine the largest "non-zero integer unit" for a given INTERVAL.
PostgreSQL doesn't have a built-in function that returns that information. I think you're going to have to chain conditionals and return the unit name. Some example PL/pgSQL code:
-- Thresholds are in seconds: 360 days, 30 days, 7 days, 1 day, 1 hour, 1 minute.
t := EXTRACT(EPOCH FROM my_time_input);
IF t >= 31104000 THEN
    RETURN 'year';
ELSIF t >= 2592000 THEN
    RETURN 'month';
ELSIF t >= 604800 THEN
    RETURN 'week';
ELSIF t >= 86400 THEN
    RETURN 'day';
ELSIF t >= 3600 THEN
    RETURN 'hour';
ELSIF t >= 60 THEN
    RETURN 'minute';
ELSIF t > 1 THEN
    RETURN 'seconds';
ELSIF t = 1 THEN
    RETURN 'second';
ELSE
    -- Placeholder for whatever handles values below one second.
    RETURN resolve_largest_sub_second_unit(my_time_input);
END IF;
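An alternative that avoids second-based thresholds is to test the interval's own fields from largest to smallest. This is only a sketch: largest_unit is a made-up name, and it deliberately does not call justify_interval(), so 100 days stays "day" as in the question's examples.

```sql
-- largest_unit() is a hypothetical helper; PostgreSQL ships no such function.
CREATE FUNCTION largest_unit(interval) RETURNS text AS $$
    SELECT CASE
        WHEN extract(year   FROM $1) <> 0 THEN 'year'
        WHEN extract(month  FROM $1) <> 0 THEN 'month'
        WHEN extract(day    FROM $1) <> 0 THEN 'day'
        WHEN extract(hour   FROM $1) <> 0 THEN 'hour'
        WHEN extract(minute FROM $1) <> 0 THEN 'minute'
        WHEN extract(second FROM $1) <> 0 THEN 'second'
        -- A zero-length interval falls through and yields NULL.
    END;
$$ LANGUAGE sql IMMUTABLE STRICT;

SELECT largest_unit(interval '100 days 3 seconds');                    -- day
SELECT largest_unit(time '20:05' - time '12:01:01');                   -- hour
SELECT largest_unit(justify_interval(interval '100 days 3 seconds'));  -- month
```

Wrapping the argument in justify_interval(), as in the last call, changes the answer because every 30 days are folded into a month first.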

Related

How to Find the Interval Between Two Dates in PostgreSQL

I have a 'test_date' column in 'test' table in Postgres DB, ex - (2018-05-29)
I need to calculate difference between current date and that date column and return the result as days, months, years.
I tried -
select (current_date - test_date) from test;
but it returns the value in days, and I need the result as days, months and years.
How do I convert it properly?
The age() function returns the value as an interval rather than the number of days:
select age(current_date, test_date)
If you subtract from a timestamp instead, you'll get an interval back:
select justify_interval(date_trunc('day', current_timestamp) - test_date)
The date_trunc() is there to set the time part of the timestamp to 00:00:00. By default the subtraction returns an interval with only days in it. The justify_interval() will then "normalize" this to years, months and days, treating every month as 30 days.
E.g. 0 years 7 mons 28 days 0 hours 0 mins 0.0 secs
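The two approaches do not always agree, because age() follows the calendar while justify_interval() assumes 30-day months. A small sketch, where 2019-01-26 is an arbitrary stand-in for the current date:

```sql
SELECT age(date '2019-01-26', date '2018-05-29');
-- 7 mons 28 days   (calendar-aware)

SELECT justify_interval(timestamp '2019-01-26' - timestamp '2018-05-29');
-- 8 mons 2 days    (242 days, with every 30 days counted as one month)

-- Individual parts can be pulled out of the age() result with extract().
SELECT extract(year  FROM a) AS years,
       extract(month FROM a) AS months,
       extract(day   FROM a) AS days
FROM  (SELECT age(date '2019-01-26', date '2018-05-29') AS a) AS s;
```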

Get average for "last month" only

Pretty new to SQL and have hit a roadblock.
I have this query, which works fine:
SELECT
(COUNT(*)::float / (current_date - '2017-05-17'::date)) AS "avg_per_day"
FROM "table" tb;
I now want it to include only data from the last month, not all time.
I've tried doing something along the lines of:
SELECT
(COUNT(*)::float / (current_date - (current_date - '1 month' ::date)) AS "avg_per_day"
FROM "table" tb;
The syntax is clearly wrong, but I am not sure what the right answer is. Have googled around and tried various options to no avail.
I can't use a simple AVG because the number I require is an AVG per day for the last month of data. Thus I've done a count of rows divided by the number of days since the first occurrence to get my AVG per day.
I have a column which tells me the date of the occurrence, however there are multiple rows with the same date in the dataset. e.g.
created_at
----------------------------
Monday 27th June 2017 12:00
Monday 27th June 2017 13:00
Tuesday 28th June 2017 12:00
and so on.
I am counting the number of occurrences per day and then need to work out an average from that, for the last month of results only (they date back to May).
The answer depends on the exact definition of "last month" and the exact definition of "average count".
Assuming:
Your column is defined created_at timestamptz NOT NULL
You want the average number of rows per day - days without any rows count as 0.
Cover 30 days exactly, excluding today.
SELECT round(count(*)::numeric / 30, 2) -- simple now with a fixed number of days
FROM tbl
WHERE created_at >= (now()::date - 30)
AND created_at < now()::date -- excl. today
Rounding is optional, but you need numeric instead of float to use round() this way.
Not including the current day ("today"), which is ongoing and may result in a lower, misleading average.
If "last month" is supposed to mean something else, you need to define it exactly. Months have between 28 and 31 days, so this can mean various things. And since you obviously operate with timestamp or timestamptz, not date, you also need to be aware of possible implications of the time of day and the current time zone. The cast to date (or the definition of "day" in general) depends on your current timezone setting while operating with timestamptz.
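For instance, if "last month" were taken to mean the previous calendar month (an assumption, not something the question states), a sketch along these lines would divide by the actual number of days in that month. It reuses the tbl and created_at names from the query above; this_start and prev_start are illustrative aliases:

```sql
SELECT round((SELECT count(*)
              FROM   tbl
              WHERE  created_at >= m.prev_start
              AND    created_at <  m.this_start)::numeric
             / (m.this_start - m.prev_start), 2) AS avg_per_day
FROM  (
   SELECT date_trunc('month', now())::date                        AS this_start,
          (date_trunc('month', now()) - interval '1 month')::date AS prev_start
   ) AS m;
```

Subtracting two dates yields an integer number of days, so the divisor is 28 to 31 depending on the month. The same time zone caveat applies to the casts to date.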
Related:
Ignoring timezones altogether in Rails and PostgreSQL
Select today's (since midnight) timestamps only
Subtract hours from the now() function
I think you just need a where clause:
SELECT
(COUNT(*)::float / (current_date - (current_date - interval '1 month')::date)) AS "avg_per_day"
FROM "table" tb
WHERE created_at > (current_date - interval '1 month')
I believe PostgreSQL and other RDBMSs have AVG() to calculate an average.
SELECT AVG(tb.columnName) AS avg_per_month
FROM someTable tb
WHERE
tb.createdDate >= [start date of month] AND
tb.createdDate <= [end date of month]
Edit: I subtract an INTERVAL from the current date. I am on a mobile phone so I cannot test.
SELECT
(COUNT(*)::float / (current_date - (current_date - INTERVAL '1 month')::date)) AS "avg_per_day"
FROM "table" tb;

SQLite query for n-th day of month

Say, I have the following table:
CREATE TABLE Data (
Id INTEGER PRIMARY KEY,
Value DECIMAL,
Date DATE);
Since the application is finance-related, the user may choose which day is the first day of the month. For instance, if he receives his salary on the 10th of every month, he may set the first day of the month to the 10th.
I'd like to create a query which returns the average value for the n-th day of the month, as defined by the user. For instance:
Date | Value
---------------+------
10.01.2016 | 10
11.01.2016 | 15
10.02.2016 | 20
11.03.2016 | 10
Result of the query should be:
Day | Average
----+--------
1 | 15
2 | 12.5
Note that if the user sets the first day to the 10th, the 9th of the month may be the 28th, 29th, 30th or 31st day of the user's month (depending on which month we're talking about). So this is not as simple as extracting the day number from the date.
Assuming that the date values do not use the format dd.mm.yyyy but one of the supported date formats, you can use the built-in date functions to compute this.
To compute the difference, in days, between two dates, convert them into a date format that uses days as a number, i.e., Julian days.
To get the 'base' day for a month, we can use modifiers:
> SELECT julianday('2001-02-11') -
julianday('2001-02-11', 'start of month', '+10 days') + 2;
2.0
(The +2 is needed because we add to the 1st of the month, not the 0th, and we count beginning at 1, not 0.)
If the day is before the tenth, the computed value would become zero or negative, and we have to use the previous month instead:
> SELECT julianday('2001-02-09') -
julianday('2001-02-09', 'start of month', '-1 month', '+10 days') + 2;
31.0
Combining these results in this expression to compute the n for a date Date:
CASE
WHEN julianday(Date) -
julianday(Date, 'start of month', '+10 days') + 2 > 0
THEN julianday(Date) -
julianday(Date, 'start of month', '+10 days') + 2
ELSE julianday(Date) -
julianday(Date, 'start of month', '-1 month', '+10 days') + 2
END
You can then use this in your query:
SELECT CASE...END AS Day,
AVG(Value) AS Average
FROM Data
GROUP BY Day;
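Put together with the sample rows from the question, the whole thing might look like the sketch below. It assumes the dates are stored in ISO format (yyyy-mm-dd) and that the first day is the 10th. The CAST to INTEGER is an addition to make the day number print without a decimal point, and the multi-row INSERT needs a reasonably recent SQLite.

```sql
CREATE TABLE Data (
    Id INTEGER PRIMARY KEY,
    Value DECIMAL,
    Date DATE);

INSERT INTO Data (Value, Date) VALUES
    (10, '2016-01-10'),
    (15, '2016-01-11'),
    (20, '2016-02-10'),
    (10, '2016-03-11');

SELECT CAST(CASE
           WHEN julianday(Date) -
                julianday(Date, 'start of month', '+10 days') + 2 > 0
           THEN julianday(Date) -
                julianday(Date, 'start of month', '+10 days') + 2
           ELSE julianday(Date) -
                julianday(Date, 'start of month', '-1 month', '+10 days') + 2
       END AS INTEGER) AS Day,
       AVG(Value) AS Average
FROM Data
GROUP BY Day
ORDER BY Day;
-- Day 1 -> 15.0, Day 2 -> 12.5
```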
Seems like I found in parallel a different solution:
SELECT avg(Amount),
strftime('%d', Date, '-9 days') AS day
FROM (
SELECT sum(Amount) AS Amount,
strftime('%Y-%m-%d', Date, 'start of day') AS Date
FROM Operations
GROUP BY strftime('%Y-%m-%d', Date, 'start of day')
)
GROUP BY day;
Where "-9 days" is for 10th of the month (so first_day-1).

Adding months to date: PostgreSQL vs. Oracle

PostgreSQL and Oracle behaviour in adding/subtracting months to/from date differs.
Basically, if we add 1 month to some day which is not the last one of the month, they'll both return the same day number in the resulting month (or the last day of the resulting month if the day number we are adding to is greater, e.g. 28th of February when adding to 31st of January).
PostgreSQL:
# select '2015-01-12'::timestamptz + '1 month'::interval;
date
------------------------
2015-02-12 00:00:00+03
Oracle:
> select add_months('12-JAN-2015',1) from dual;
ADD_MONTH
---------
12-FEB-15
However.
If the day we are adding to is the last day of the month, Oracle will return the last day of the resulting month, even if that day number is higher, while PostgreSQL will still return the same day number (or a lower one if the resulting month is shorter). This can lead to some (even funny!) inconsistencies, especially when adding/subtracting multiple times or grouping operations, where the PostgreSQL result differs:
Oracle:
> select add_months('28-FEB-2015',1) from dual;
ADD_MONTH
---------
31-MAR-15
> select add_months('31-JAN-2015',4) from dual;
ADD_MONTH
---------
31-MAY-15
> select add_months(add_months(add_months(add_months('31-JAN-2015',1),1),1),1) from dual;
ADD_MONTH
---------
31-MAY-15
PostgreSQL:
-- Adding 4 months at once:
# select '2015-01-31'::timestamptz + '4 months'::interval;
date
-------------------------------
2015-05-31 00:00:00+03
-- Adding 4 months by one:
# select '2015-01-31'::timestamptz + '1 months'::interval + '1 months'::interval + '1 months'::interval + '1 months'::interval;
date
-------------------------------
2015-05-28 00:00:00+03
-- Adding 4 months by one with grouping operations:
# select '2015-01-31'::timestamptz + ('1 months'::interval + '1 months'::interval) + '1 months'::interval + '1 months'::interval;
date
-------------------------------
2015-05-30 00:00:00+03
-- And even adding 4 months and then subtracting them does not return the initial date!
# select '2015-01-31'::timestamptz + '1 months'::interval + '1 months'::interval + '1 months'::interval + '1 months'::interval - '4 months'::interval;
date
------------------------
2015-01-28 00:00:00+03
I know I could always use something like
SELECT (date_trunc('MONTH', now())+'1 month'::interval - '1 day'::interval);
to get the last day of month and use it when adding months in PostgreSQL, but
the question is: why both of them chose to implement different standards, which one is better/worse and why.
Oracle specifies that
If date is the last day of the month or if the resulting month has
fewer days than the day component of date, then the result is the last
day of the resulting month. Otherwise, the result has the same day
component as date.
PostgreSQL specifies that
Note there can be ambiguity in the months returned by age because
different months have a different number of days. PostgreSQL's
approach uses the month from the earlier of the two dates when
calculating partial months. For example, age('2004-06-01',
'2004-04-30') uses April to yield 1 mon 1 day, while using May would
yield 1 mon 2 days because May has 31 days, while April has only 30.
You might want to have a look at the justify_days(interval) function provided by PostgreSQL.
why both of them chose to implement different standards, which one is
better/worse and why ?
Neither is better than the other (that is mostly opinion-based); they are simply different. As for why they decided to implement different behaviour, honestly I don't think there really is a reason; it is probably just how things turned out.
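If the Oracle rule quoted above is what you want in PostgreSQL, it can be approximated with a small wrapper. This is a sketch only: oracle_add_months is a made-up name, and named parameters in SQL-language functions need PostgreSQL 9.2 or later.

```sql
-- oracle_add_months() is a hypothetical helper, not part of PostgreSQL.
CREATE FUNCTION oracle_add_months(d date, n integer) RETURNS date AS $$
    SELECT CASE
        -- d is the last day of its month: return the last day of the target month.
        WHEN d = (date_trunc('month', d::timestamp)
                  + interval '1 month' - interval '1 day')::date
        THEN (date_trunc('month', d::timestamp)
              + (n + 1) * interval '1 month' - interval '1 day')::date
        -- Otherwise PostgreSQL's own clamping already follows the same rule.
        ELSE (d + n * interval '1 month')::date
    END;
$$ LANGUAGE sql IMMUTABLE STRICT;

SELECT oracle_add_months(date '2015-02-28', 1);  -- 2015-03-31
SELECT oracle_add_months(date '2015-01-31', 4);  -- 2015-05-31
SELECT oracle_add_months(date '2015-01-12', 1);  -- 2015-02-12
```

The explicit cast to timestamp keeps date_trunc() independent of the session time zone, which is what allows the function to be declared IMMUTABLE.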

SQL Count days till first of the month

How would I count the days from a date till the first of the following month
Example:
--Start Date
07-07-2011
How many days till:
-- The 1st of the succeeding month of the start date above
08-01-2011
Expected Result (in days):
25
Counting the days by hand I get 25, so running this query gets me the desired timestamp:
SELECT CURRENT_DATE + INTERVAL '25 DAYS'
Results:
2011-08-01 00:00:00
I just can't think of a way to compute the number of days. Any suggestions?
Or a way to get the start date, the end date and the number of days between them?
I don't have a PostgreSQL server handy, so this is untested, but I would try:
SELECT (DATE_TRUNC('month', CURRENT_DATE) + INTERVAL '1 MONTH') - CURRENT_DATE
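Note that the expression above returns an interval rather than an integer, because date_trunc() produces a timestamp. A sketch that casts back to date first, using the start date from the question as a literal (d and days_until_first are just illustrative names):

```sql
-- date - date yields an integer number of days.
SELECT (date_trunc('month', d) + interval '1 month')::date - d AS days_until_first
FROM  (SELECT date '2011-07-07' AS d) AS s;
-- 25
```

Replacing the literal with CURRENT_DATE gives the count from today.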