How to group sum results by date with custom start time in PostgreSQL - sql

I am trying to group my sum results by a custom day in PostgreSQL.
A regular day starts at 00:00, but I would like mine to start at 04:00, so an entry with time 2019-01-03 02:23 would count toward '2019-01-02' instead.
Right now my code looks like this:
The bottom part works perfectly for a regular day (00:00 - 23:59); however, I would like to group by the date range generated in the upper part (the CTE). I just don't know how to connect those two parts.
with dateRange as(
SELECT
generate_series(
MIN(to_date(payments2.paymenttime,'DD Mon YYYY')) + interval '4 hour',
max(to_date(payments2.paymenttime,'DD Mon YYYY')),
'24 hour') as theday
from payments2
)
select
sum(cast(payments2.servicecharge as money)) as total,
to_date(payments2.paymenttime,'DD Mon YYYY') as date
from payments2
group by date
Result like this
+------------+------------+
| total | date |
+------------+------------+
| 20 | 2019-01-01 |
+------------+------------+
| 60 | 2019-01-02 |
+------------+------------+
| 35 | 2019-01-03 |
+------------+------------+
| 21 | 2019-01-04 |
+------------+------------+
Many thanks for your help.

If I didn't misunderstand your question, you just need to subtract 4 hours from the timestamp before casting to date; you don't even need the CTE.
Something like
select
sum(cast(payments2.servicecharge as money)) as total,
(to_timestamp(payments2.paymenttime,'DD Mon YYYY HH24:MI:SS') - interval '4 hours')::date as date
from payments2
group by date
You may need to use a different format in the to_timestamp function, depending on the format of the payments2.paymenttime string.
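The same shift can be checked outside the database. The following is a minimal Python sketch, assuming the 04:00 day start from the question; the rows, the `business_date` helper and the time format are made up for illustration and are not taken from the asker's table.

```python
from collections import defaultdict
from datetime import datetime, timedelta

DAY_START_OFFSET = timedelta(hours=4)  # assumed custom day start of 04:00


def business_date(ts):
    """Return the date of the custom day (04:00 to 03:59) that contains ts."""
    return (ts - DAY_START_OFFSET).date()


# Hypothetical (paymenttime, servicecharge) rows for illustration only.
rows = [
    ("03 Jan 2019 02:23:00", 20),  # before 04:00, so it belongs to 2019-01-02
    ("02 Jan 2019 23:10:00", 15),
    ("03 Jan 2019 04:00:00", 7),   # exactly 04:00 starts the new custom day
]

totals = defaultdict(int)
for raw_time, charge in rows:
    ts = datetime.strptime(raw_time, "%d %b %Y %H:%M:%S")
    totals[business_date(ts)] += charge

for day, total in sorted(totals.items()):
    print(day, total)
```

Subtracting the offset before taking the date mirrors the `- interval '4 hours'` followed by `::date` in the query above.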

Related

How can I split 1 row into multiple rows in SQL

I want to split something like this:
Value | Startdate | Enddate
XXXX | 2.July | 16 August
Into this:
Value | Startdate | Enddate
XXXX | 2.July | 31 July
XXXX | 1.August | 16 August
The value is not important for now.
If I understand correctly, you want to split your range into different months. A convenient method uses generate_series():
select value, greatest(startdate, gs.mon), least(enddate, gs.mon + interval '1 month - 1 day')
from t cross join lateral
generate_series(date_trunc('month', startdate), date_trunc('month', enddate), interval '1 month'
) gs(mon)
Here is a db<>fiddle.
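For readers who want to see the clamping logic step by step, here is a rough Python sketch of the same idea: walk month by month and clip each piece to the original range. The year 2021 and the `split_by_month` helper are assumptions, since the question gives no year.

```python
from datetime import date, timedelta


def split_by_month(start, end):
    """Yield (piece_start, piece_end) pairs, one per calendar month touched."""
    piece_start = start
    while piece_start <= end:
        # First day of the month after piece_start.
        if piece_start.month == 12:
            next_month = date(piece_start.year + 1, 1, 1)
        else:
            next_month = date(piece_start.year, piece_start.month + 1, 1)
        # A piece ends at the month's last day or at the range end, whichever is first.
        piece_end = min(end, next_month - timedelta(days=1))
        yield piece_start, piece_end
        piece_start = next_month


# Assumed year; the question only says "2 July" to "16 August".
pieces = list(split_by_month(date(2021, 7, 2), date(2021, 8, 16)))
for piece_start, piece_end in pieces:
    print(piece_start, piece_end)
```

The `min(...)` plays the role of `least(...)` in the query, and starting each later piece on the first of the month corresponds to `greatest(startdate, gs.mon)`.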

Get rolling 30 day count of users logging in to site

I have a table of login to my site in the format below:
logins
+---------+--------------------------+-----------------------+
| USER_ID | LOGIN_TIMESTAMP | LOGOUT_TIMESTAMP |
+---------+--------------------------+-----------------------+
| 274385 | 01-JAN-20 02.56.12 PM | 02-JAN-20 10.04.40 AM |
| 32498 | 01-JAN-20 05.12.14 PM | 01-JAN-20 08.26.43 PM |
| 981231 | 01-JAN-20 04.41.04 PM | 01-JAN-20 10.51.11 PM |
+---------+--------------------------+-----------------------+
I would like to calculate, per day, a unique count of users who logged in only once in the previous 30 days, to get something like the table below (note: USER_COUNT_LAST_30_DAYS counts only those users who logged in only once in the previous 30 days):
+-----------+-------------------------+
| DAY | USER_COUNT_LAST_30_DAYS |
+-----------+-------------------------+
| 01-JAN-20 | 14 |
| 02-JAN-20 | 23 |
| 03-JAN-20 | 29 |
+-----------+-------------------------+
My first thought would be a query as below, but I recognise this would just count all users who logged in the last 30 days, rather than those who only logged in once
SELECT
CAST(LOGIN_TIMESTAMP AS DATE),
COUNT(DISTINCT USER_ID)
FROM
logins
WHERE
LOGIN_TIMESTAMP > SYSDATE - 30
GROUP BY
CAST(LOGIN_TIMESTAMP AS DATE);
Would this query work for getting a count of users who logged in only once in the last 30 days if I added a rownum partition filter on user id? Or is there something else I would have to do to get a rolling 30-day count?
The date datatype still has a time component, even if the format mask doesn't show it. You can use the TRUNC function on either a date or a timestamp. If you really want your day to be limited to the day, you'll need to truncate the timestamp. You also need to use INTERVAL, as timestamp math and date math are not the same:
SELECT TRUNC(LOGIN_TIMESTAMP) LOGIN_DATE,
COUNT(DISTINCT USER_ID) USER_COUNT
FROM logins
WHERE TRUNC(LOGIN_TIMESTAMP) > TRUNC(SYSTIMESTAMP - INTERVAL '30' DAY)
GROUP BY TRUNC(LOGIN_TIMESTAMP)
ORDER BY TRUNC(LOGIN_TIMESTAMP) ASC;
Example:
alter session set nls_date_format='DD-MON-YY HH24.MI.SS';
SELECT
SYSTIMESTAMP raw_timestamp,
CAST(SYSTIMESTAMP AS DATE) raw_date,
TRUNC(CAST(SYSTIMESTAMP AS DATE)) trunc_date,
TRUNC(SYSTIMESTAMP) - INTERVAL '30' DAY
from dual;
RAW_TIMESTAMP RAW_DATE TRUNC_DATE TRUNC(SYSTIMESTAMP
-------------------------------------- ------------------ ------------------ ------------------
25-JUN-20 12.27.21.756299000 PM -04:00 25-JUN-20 12.27.21 25-JUN-20 00.00.00 26-MAY-20 00.00.00
For identifying users that have only logged in once, try this:
WITH user_logins as (
SELECT USER_ID,
COUNT(*) LOGIN_COUNT
FROM logins
WHERE TRUNC(LOGIN_TIMESTAMP) > TRUNC(SYSTIMESTAMP - INTERVAL '30' DAY)
GROUP BY USER_ID)
SELECT user_id, login_count
from user_logins
where login_count=1
order by user_id;
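The query above looks at a single 30-day window ending today, while the question asks for one count per day. As a rough illustration of that per-day logic (not a replacement for a SQL solution), here is a Python sketch. It assumes the window covers the day itself plus the 29 days before it, and the login rows and the `single_login_users_per_day` helper are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def single_login_users_per_day(logins, days, window=30):
    """For each day, count users with exactly one login in the trailing window."""
    result = {}
    for day in days:
        window_start = day - timedelta(days=window - 1)  # assumed inclusive window
        logins_per_user = Counter(
            user for user, login_day in logins
            if window_start <= login_day <= day
        )
        result[day] = sum(1 for n in logins_per_user.values() if n == 1)
    return result


# Hypothetical (user_id, login_date) rows, already truncated to the day.
logins = [
    (274385, date(2020, 1, 1)),
    (32498, date(2020, 1, 1)),
    (32498, date(2020, 1, 2)),
    (981231, date(2020, 1, 3)),
]
days = [date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3)]
counts = single_login_users_per_day(logins, days)
for day in days:
    print(day, counts[day])
```

User 32498 is counted on 1 January but drops out from 2 January onward, because a second login falls inside the window.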
Please use the query below. Since the stored value is a string containing PM, you cannot use the CAST function; instead, use TO_DATE to convert it to a date.
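-- RR matches the two-digit year in values such as '01-JAN-20';
-- TRUNC drops the time part so that rows group per day.
SELECT
TRUNC(TO_DATE(LOGIN_TIMESTAMP, 'DD-MON-RR HH.MI.SS PM')),
COUNT(DISTINCT USER_ID)
FROM
logins
WHERE
TO_DATE(LOGIN_TIMESTAMP, 'DD-MON-RR HH.MI.SS PM') >= SYSDATE - 30
GROUP BY
TRUNC(TO_DATE(LOGIN_TIMESTAMP, 'DD-MON-RR HH.MI.SS PM'));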

How can I aggregate values based on an arbitrary monthly cycle date range in SQL?

Given a table as such:
# SELECT * FROM payments ORDER BY payment_date DESC;
id | payment_type_id | payment_date | amount
----+-----------------+--------------+---------
4 | 1 | 2019-11-18 | 300.00
3 | 1 | 2019-11-17 | 1000.00
2 | 1 | 2019-11-16 | 250.00
1 | 1 | 2019-11-15 | 300.00
14 | 1 | 2019-10-18 | 130.00
13 | 1 | 2019-10-18 | 100.00
15 | 1 | 2019-09-18 | 1300.00
16 | 1 | 2019-09-17 | 1300.00
17 | 1 | 2019-09-01 | 400.00
18 | 1 | 2019-08-25 | 400.00
(10 rows)
How can I SUM the amount column based on an arbitrary date range, not simply a date truncation?
Taking the example of a date range beginning on the 15th of a month, and ending on the 14th of the following month, the output I would expect to see is:
payment_type_id | payment_date | amount
-----------------+--------------+---------
1 | 2019-11-15 | 1850.00
1 | 2019-10-15 | 230.00
1 | 2019-09-15 | 2600.00
1 | 2019-08-15 | 800.00
Can this be done in SQL, or is this something that's better handled in code? I would traditionally do this in code, but I'm looking to extend my knowledge of SQL (which at this stage isn't much!)
Click demo:db<>fiddle
You can use a combination of the CASE clause and the date_trunc() function:
SELECT
payment_type_id,
CASE
WHEN date_part('day', payment_date) < 15 THEN
date_trunc('month', payment_date) + interval '-1month 14 days'
ELSE date_trunc('month', payment_date) + interval '14 days'
END AS payment_date,
SUM(amount) AS amount
FROM
payments
GROUP BY 1,2
date_part('day', ...) returns the day of the month.
The CASE clause separates dates before the 15th of the month from those on or after it.
date_trunc('month', ...) converts every date in a month to the first of that month.
So, if a date is before the 15th of the current month, it should be grouped under the 15th of the previous month (this is what + interval '-1month 14 days' calculates: +14 because date_trunc() truncates to the 1st of the month, and 1 + 14 = 15). Otherwise it is grouped under the 15th of the current month.
After calculating these payment_date values, you can use them for simple grouping.
I would simply subtract 14 days, truncate the month, and add 14 days back:
select payment_type_id,
date_trunc('month', payment_date - interval '14 day') + interval '14 day' as month_15,
sum(amount)
from payments
group by payment_type_id, month_15
order by payment_type_id, month_15;
No conditional logic is actually needed for this.
Here is a db<>fiddle.
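To see why the subtract, truncate, add-back approach needs no conditional, here is a small Python sketch run against the payments from the question. The `cycle_start` helper and its `start_day` parameter are illustrative names, not part of either answer.

```python
from collections import defaultdict
from datetime import date, timedelta
from decimal import Decimal


def cycle_start(d, start_day=15):
    """Return the first day of the monthly cycle (starting on start_day) containing d."""
    offset = timedelta(days=start_day - 1)
    shifted = d - offset                      # payment_date - interval '14 day'
    return shifted.replace(day=1) + offset    # date_trunc('month', ...) + interval '14 day'


# (payment_date, amount) rows copied from the question's table.
payments = [
    (date(2019, 11, 18), Decimal("300.00")),
    (date(2019, 11, 17), Decimal("1000.00")),
    (date(2019, 11, 16), Decimal("250.00")),
    (date(2019, 11, 15), Decimal("300.00")),
    (date(2019, 10, 18), Decimal("130.00")),
    (date(2019, 10, 18), Decimal("100.00")),
    (date(2019, 9, 18), Decimal("1300.00")),
    (date(2019, 9, 17), Decimal("1300.00")),
    (date(2019, 9, 1), Decimal("400.00")),
    (date(2019, 8, 25), Decimal("400.00")),
]

totals = defaultdict(Decimal)
for payment_date, amount in payments:
    totals[cycle_start(payment_date)] += amount

for start, total in sorted(totals.items(), reverse=True):
    print(start, total)
```

The totals match the expected output in the question: the payment on 2019-09-01 lands in the cycle that starts on 2019-08-15.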
You can use the generate_series() function to produce the cycle start dates and inner join each payment to the cycle it falls in, like this:
SELECT specific_date_on_month, SUM(amount)
FROM (SELECT generate_series('2019-08-15'::date, '2019-11-15'::date, '1 month'::interval) AS specific_date_on_month) AS cycles
INNER JOIN payments
ON payment_date >= specific_date_on_month
AND payment_date < specific_date_on_month + interval '1 month'
GROUP BY specific_date_on_month;
The generate_series(<begin>, <end>, <interval>) function generates a series from begin to end, stepping by the given interval.

Can you define a custom "week" in PostgreSQL?

To extract the week of a given year we can use:
SELECT EXTRACT(WEEK FROM timestamp '2014-02-16 20:38:40');
However, I am trying to group weeks together in a bit of an odd format. My start of a week would begin on Mondays at 4am and would conclude the following Monday at 3:59:59am.
Ideally, I would like to create a query that provides a start and end date, then groups the total sales for that period by the weeks laid out above.
Example:
SELECT
(some custom week date),
SUM(sales)
FROM salesTable
WHERE
startDate BETWEEN 'DATE 1' AND 'DATE 2'
I am not looking to change the EXTRACT() function, rather create a query that would pull from the following sample table and output the sample results.
If 'DATE 1' in query was '2014-07-01' AND 'DATE 2' was '2014-08-18':
Sample Table:
itemID | timeSold | price
------------------------------------
1 | 2014-08-13 09:13:00 | 12.45
2 | 2014-08-15 12:33:00 | 20.00
3 | 2014-08-05 18:33:00 | 10.00
4 | 2014-07-31 04:00:00 | 30.00
Desired result:
weekBegin | priceTotal
----------------------------------
2014-07-28 04:00:00 | 30.00
2014-08-04 04:00:00 | 10.00
2014-08-11 04:00:00 | 32.45
Produces your desired output:
SELECT date_trunc('week', time_sold - interval '4h')
+ interval '4h' AS week_begin
, sum(price) AS price_total
FROM tbl
WHERE time_sold >= '2014-07-01 0:0'::timestamp
AND time_sold < '2014-08-19 0:0'::timestamp -- start of next day
GROUP BY 1
ORDER BY 1;
db<>fiddle here (extended with a row that actually shows the difference)
Old sqlfiddle
Explanation
date_trunc() is the superior tool here. You are not interested in week numbers, but in actual timestamps.
The "trick" is to subtract 4 hours from selected timestamps before extracting the week - thereby shifting the time frame towards the earlier bound of the ISO week. To produce the desired display, add the same 4 hours back to the truncated timestamps.
But apply the WHERE condition on unmodified timestamps. Also, never use BETWEEN with timestamps, which have fractional digits. Use the WHERE conditions like presented above. See:
Unexpected results from SQL query with BETWEEN timestamps
This operates on data type timestamp, i.e. with (shifted) "weeks" according to the current time zone. You might want to work with timestamptz instead. See:
Ignoring time zones altogether in Rails and PostgreSQL
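The shift-truncate-shift trick can be traced by hand with a short Python sketch. It uses the sample rows from the question and naive timestamps (no time zone handling); `week_begin` and `WEEK_SHIFT` are illustrative names.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from decimal import Decimal

WEEK_SHIFT = timedelta(hours=4)  # weeks start on Monday at 04:00


def week_begin(ts):
    """Return the Monday 04:00 that starts the custom week containing ts."""
    shifted = ts - WEEK_SHIFT
    # Like date_trunc('week', ...): back to Monday, then to midnight.
    monday = shifted - timedelta(days=shifted.weekday())
    midnight = monday.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + WEEK_SHIFT


# (timeSold, price) rows copied from the question's sample table.
sales = [
    (datetime(2014, 8, 13, 9, 13), Decimal("12.45")),
    (datetime(2014, 8, 15, 12, 33), Decimal("20.00")),
    (datetime(2014, 8, 5, 18, 33), Decimal("10.00")),
    (datetime(2014, 7, 31, 4, 0), Decimal("30.00")),
]

totals = defaultdict(Decimal)
for sold_at, price in sales:
    totals[week_begin(sold_at)] += price

for start, total in sorted(totals.items()):
    print(start, total)
```

A sale on Monday at 03:59 still belongs to the previous week, which is the case the plain `date_trunc('week', ...)` would get wrong.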

How to group by week no and get start date and end date for the week number in Sqlite?

I want to group the result by week number and get the start date and end date for each week. I have no idea how to do this in SQLite. Could somebody help me with this?
I am also really confused by the way SQLite works, because if I run the following query I get the week number as 00
SELECT strftime('%W','2012-01-01');
and the week number as 01 instead of 00 for the following query
SELECT strftime('%W','2012-01-02');
Could somebody explain why SQLite behaves like this?
Try this out:
select * from t1;
+------------+
| ADate |
+------------+
| 2012-01-04 |
| 2012-01-10 |
| 2012-01-19 |
| 2012-01-22 |
| 2012-01-01 |
| 2012-01-01 |
+------------+
select
strftime('%W', aDate) WeekNumber,
max(date(aDate, 'weekday 0', '-7 day')) WeekStart,
max(date(aDate, 'weekday 0', '-1 day')) WeekEnd,
count(*) as GroupedValues
from t1
group by WeekNumber;
+------------+------------+------------+---------------+
| WeekNumber | WeekStart  | WeekEnd    | GroupedValues |
+------------+------------+------------+---------------+
| 00         | 2011-12-25 | 2011-12-31 |             2 |
| 01         | 2012-01-01 | 2012-01-07 |             1 |
| 02         | 2012-01-08 | 2012-01-14 |             1 |
| 03         | 2012-01-15 | 2012-01-21 |             2 |
+------------+------------+------------+---------------+
To be honest... I don't know why 2012-01-01 is week 0 and 2012-01-02 is week 1. Sounds very weird, particularly if the week starts on Sundays! :s
Try something like this:
select
strftime('%W', aDate, 'weekday 1') WeekNumber,
max(date(aDate, 'weekday 1')) WeekStart,
max(date(aDate, 'weekday 1', '+6 day')) WeekEnd,
count(*) as GroupedValues
from t1
group by WeekNumber;
I know I'm very late to the party, but I can explain why. The strftime reference used comes from C. The way C calculates things is frustrating.
Days run from 0 to 6, with 0 being Sunday and 6 being Saturday.
However, weeks run from Monday to Sunday.
From the C documentation:
%w Weekday as a decimal number with Sunday as 0 (0-6)
%W Week number with the first Monday as the first day of week one (00-53)
So week 00 would be a Sunday before the first Monday of the year. If the year started on a Thursday, all the days until the first Monday would be week 00. I'm not sure what the ISO standard is; I think week 1 is the week containing the first Thursday of the year.
It took me a while to resolve this as I had to account for weeks running from Thursday to Wednesday.
In short, SQLite defines the week of the year (%W) as starting on Monday, according to https://sqlite.org/lang_datefunc.html and its link on the strftime function (http://opengroup.org/onlinepubs/007908799/xsh/strftime.html).
Thus, if anyone still needs to group weeks starting on Sunday, just advance a day as below:
SELECT strftime('%W','2012-01-01', '+1 day')
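The %W behaviour and the '+1 day' workaround can be checked quickly with Python's built-in sqlite3 module and an in-memory database. This is only a sketch; the `week_number` helper is an illustrative wrapper, not part of SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def week_number(day, modifier=None):
    """Return SQLite's strftime('%W', ...) for day, optionally with one modifier."""
    if modifier is None:
        row = conn.execute("SELECT strftime('%W', ?)", (day,)).fetchone()
    else:
        row = conn.execute("SELECT strftime('%W', ?, ?)", (day, modifier)).fetchone()
    return row[0]


# 2012-01-01 is a Sunday, so it falls before the first Monday of the year.
print(week_number("2012-01-01"))            # 00
print(week_number("2012-01-02"))            # 01 (first Monday)

# Advancing by one day makes Sunday share a week number with the Monday after it.
print(week_number("2012-01-01", "+1 day"))  # 01
print(week_number("2012-01-07", "+1 day"))  # 01 (Saturday of the same Sunday-based week)
print(week_number("2012-01-08", "+1 day"))  # 02 (next Sunday starts a new week)
```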
The accepted answer can be simplified a little and allow more flexibility. Note that this_sunday includes the day itself if that day is Sunday.
SELECT
DATE(created_at, 'weekday 0') AS this_sunday,
COUNT(*) AS rows_this_week
FROM my_table
GROUP BY this_sunday;
This technique is simpler than strftime('%W', created_at) and it's more powerful. For example if you want to aggregate weeks ending on Monday:
SELECT
DATE(created_at, 'weekday 1') AS this_monday,
COUNT(*) AS rows_this_week
FROM my_table
GROUP BY this_monday;
This works thanks to the modifier 'weekday N' which moves the date to the day of week specified if it's in the future or keeps it if it's the same day.
Finally, to verify which dates are included in your week range, I recommend this method:
SELECT
DATE(created_at, 'weekday 1') AS this_monday,
MIN(DATE(created_at)) AS earliest,
MAX(DATE(created_at)) AS latest,
COUNT(*) AS rows_this_week
FROM my_table
GROUP BY this_monday;
According to the SQLite documentation, Sunday == 0, and 2012-01-01 is a Sunday, so it's returning 00.
Take a look at the date and time functions for SQLite here:
http://www.sqlite.org/lang_datefunc.html
If the given date is the first day of the week, then DATE('2014-07-20', 'weekday 0', '-7 days') will return '2014-07-13', which is 7 days earlier than required.
This is the best I've come up with so far:
CASE DATE(myDate, 'weekday 0')
WHEN DATE(myDate) THEN DATE(myDate)
ELSE DATE(myDate, 'weekday 0', '-7 days')
END
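A quick way to convince yourself that the CASE expression handles the Sunday edge case is to run it through Python's sqlite3 module against an in-memory database. This is a sketch only; `sunday_week_start` and the named parameter `:d` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same CASE expression as above, with the column replaced by a bound parameter.
WEEK_START_SQL = """
SELECT CASE DATE(:d, 'weekday 0')
         WHEN DATE(:d) THEN DATE(:d)
         ELSE DATE(:d, 'weekday 0', '-7 days')
       END
"""


def sunday_week_start(day):
    """Return the Sunday that starts the week containing day (ISO date string)."""
    return conn.execute(WEEK_START_SQL, {"d": day}).fetchone()[0]


print(sunday_week_start("2014-07-20"))  # a Sunday stays on itself
print(sunday_week_start("2014-07-23"))  # Wednesday maps back to 2014-07-20
print(sunday_week_start("2014-07-19"))  # Saturday maps back to 2014-07-13

# For comparison, the unconditional version is off by a week on Sundays.
naive = conn.execute("SELECT DATE('2014-07-20', 'weekday 0', '-7 days')").fetchone()[0]
print(naive)
```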