When Using OVER with COUNT, What Does It Mean to Use Two Arguments With PARTITION BY? - sql

SELECT
    M.Listing_ID,
    COUNT(1) OVER (
        PARTITION BY M.User_ID, EXTRACT(MONTH FROM M.Start_Date)
        ORDER BY M.Start_Date, M.Listing_ID
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) X
FROM LISTINGS M
Here is LISTINGS:
Listings
User_ID Listing_ID Start_Date
A 1 2014-02-14
A 2 2014-03-10
A 3 2014-03-22
B 4 2014-06-08
B 5 2014-10-02
C 6 2014-09-04
C 7 2014-09-04
C 8 2014-09-04
C 9 2014-09-05
C 10 2014-10-03
I'm trying to decode what this code returns, but I don't really know what it means to partition by 2 categories. Can someone shed light?

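You will get a running count of rows per user and per month. PARTITION BY with two expressions forms one partition for each distinct (User_ID, month) combination, so the count resets whenever either value changes. Within a partition, the ORDER BY and the ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW frame make the count cumulative: rows are numbered 1, 2, 3, ... in order of Start_Date, then Listing_ID. Note that EXTRACT(MONTH ...) ignores the year, so the same month in different years would fall into the same partition.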
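To make the reset behaviour concrete, here is a minimal sketch that runs the query against the sample LISTINGS rows. It assumes Python's built-in sqlite3 module with a SQLite build recent enough for window functions (3.25 or later), and it swaps EXTRACT(MONTH FROM ...) for strftime('%m', ...) because SQLite has no EXTRACT; the lower-case table and column names are just for the demo.

```python
import sqlite3

# In-memory copy of the LISTINGS table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (user_id TEXT, listing_id INTEGER, start_date TEXT)")
conn.executemany(
    "INSERT INTO listings VALUES (?, ?, ?)",
    [
        ("A", 1, "2014-02-14"), ("A", 2, "2014-03-10"), ("A", 3, "2014-03-22"),
        ("B", 4, "2014-06-08"), ("B", 5, "2014-10-02"),
        ("C", 6, "2014-09-04"), ("C", 7, "2014-09-04"), ("C", 8, "2014-09-04"),
        ("C", 9, "2014-09-05"), ("C", 10, "2014-10-03"),
    ],
)

# Assumption: strftime('%m', ...) stands in for EXTRACT(MONTH FROM ...).
running_counts = conn.execute(
    """
    SELECT listing_id,
           COUNT(1) OVER (
               PARTITION BY user_id, strftime('%m', start_date)
               ORDER BY start_date, listing_id
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS x
    FROM listings
    ORDER BY listing_id
    """
).fetchall()

for listing_id, x in running_counts:
    print(listing_id, x)
```

Listings 6 to 9 (user C, September) come out as 1, 2, 3, 4, and listing 10 (user C, October) starts again at 1 because the month part of the partition key changed.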

Related

Calculating moving sum (or SUM OVER) for the last X months, but with irregular number of rows

I want to do a window function (like the SUM() OVER() function), but there are two catches:
I want to consider the last 3 months on my moving sum, but the number of rows are not consistent. Some months have 3 entries, others may have 2, 4, 5, etc;
There is also a "group" column, and the moving sum should sum only the amounts of the same group.
In summary, I have a table that has the following structure:
id  date     group    amount
1   2022-01  group A  1100
2   2022-01  group D  2500
3   2022-02  group A  3000
4   2022-02  group B  1000
5   2022-02  group C  2500
6   2022-03  group A  2000
7   2022-04  group C  1000
8   2022-05  group A  1500
9   2022-05  group D  2000
10  2022-06  group B  1000
So, I want to add a moving sum column containing the sum of the amount for each group over the last 3 months. The sum should not reset every 3 months; for each row it should consider only the values from the preceding 3 months, and only those of the same group.
The end result should look like:
id  date     group    amount  moving_sum_three_months
1   2022-01  group A  1100    1100
2   2022-01  group D  2500    2500
3   2022-02  group A  3000    4100
4   2022-02  group B  1000    1000
5   2022-02  group C  2500    2500
6   2022-03  group A  2000    6100
7   2022-04  group C  1000    3500
8   2022-05  group A  1500    3500
9   2022-05  group D  2000    2000
10  2022-06  group B  1200    1200
The best example to see how the sum works is line 8.
It considers only lines 8 and 6 for the sum, because they are the only ones that meet the criteria;
Lines 1 and 3 do not meet the criteria, because they are more than 3 months older than line 8's date;
All the other lines are not from group A, so they are also excluded from the sum.
Any ideas? Thanks in advance for the help!
Use SUM() as a window function, partitioning the window by group and framing it in RANGE mode. Set the frame to start 3 months before the current record using INTERVAL '3 months', e.g.
SELECT *, SUM(amount) OVER w AS moving_sum_three_months
FROM t
WINDOW w AS (PARTITION BY "group" ORDER BY "date"
             RANGE BETWEEN INTERVAL '3 months' PRECEDING AND CURRENT ROW)
ORDER BY id
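The frame width is worth checking against the expected table in the question. A frame of INTERVAL '3 months' PRECEDING also includes a row dated exactly three months earlier, whereas the expected value for line 8 (3500) only matches a window of the current month plus the two before it. The sketch below is a rough way to compare the two. It assumes Python's sqlite3 with SQLite 3.28 or later, and because SQLite has no INTERVAL it orders by a made-up numeric month index (year * 12 + month) so RANGE can take a plain integer offset; `moving_sums` is a hypothetical helper name. It uses the amounts from the input table, where line 10 is 1000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t (id INTEGER, "date" TEXT, "group" TEXT, amount INTEGER)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "2022-01", "group A", 1100), (2, "2022-01", "group D", 2500),
        (3, "2022-02", "group A", 3000), (4, "2022-02", "group B", 1000),
        (5, "2022-02", "group C", 2500), (6, "2022-03", "group A", 2000),
        (7, "2022-04", "group C", 1000), (8, "2022-05", "group A", 1500),
        (9, "2022-05", "group D", 2000), (10, "2022-06", "group B", 1000),
    ],
)


def moving_sums(months_back):
    """Sum amount per group over the current month and months_back months before it."""
    # Assumption: a numeric month index replaces the INTERVAL arithmetic of the answer.
    query = f"""
        SELECT id,
               SUM(amount) OVER (
                   PARTITION BY "group"
                   ORDER BY CAST(substr("date", 1, 4) AS INTEGER) * 12
                          + CAST(substr("date", 6, 2) AS INTEGER)
                   RANGE BETWEEN {int(months_back)} PRECEDING AND CURRENT ROW
               ) AS moving_sum
        FROM t
        ORDER BY id
    """
    return conn.execute(query).fetchall()


two_back = moving_sums(2)    # current month plus the two months before it
three_back = moving_sums(3)  # current month plus the three months before it

print(dict(two_back)[8], dict(three_back)[8])
```

With two months back, line 8 sums lines 6 and 8 (3500), as in the question; with three months back, line 3 (2022-02) is also inside the frame. Which width is right depends on how "last 3 months" is meant, so treat the offset as something to confirm against your own data.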

How to query data and its count in multiple range at same time

I have a table like below,
id  number  date
1   23      2020-01-01
2   12      2020-03-02
3   23      2020-09-02
4   11      2019-03-04
5   12      2019-03-23
6   23      2019-04-12
I want to know how many times each number appears per year, such as:
number  2019  2020
23      1     2
12      1     1
11      1     0
I'm kind of stuck. I tried a left join and a single select, but still cannot figure out how to do it. Please help, thank you!
SELECT C.NUMBER,
       -- ELSE 0 (not NULL) so a number with no rows in a year shows 0
       SUM(CASE
               WHEN C.DATE BETWEEN '20190101' AND '20191231'
               THEN 1 ELSE 0
           END) AS A_2019,
       SUM(CASE
               WHEN C.DATE BETWEEN '20200101' AND '20201231'
               THEN 1 ELSE 0
           END) AS A_2020
FROM I_have_a_table_like_below AS C
GROUP BY C.NUMBER
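As a quick check of this conditional-aggregation pattern, the sketch below loads the sample rows into an in-memory SQLite database through Python's sqlite3 module. The table name `t` and the hyphenated date literals are assumptions made for the demo. Each CASE yields 1 for a row in the given year and 0 otherwise, so SUM acts as a per-year counter and a number that never occurs in a year shows 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, number INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 23, "2020-01-01"), (2, 12, "2020-03-02"), (3, 23, "2020-09-02"),
        (4, 11, "2019-03-04"), (5, 12, "2019-03-23"), (6, 23, "2019-04-12"),
    ],
)

# One output column per year; each SUM counts the rows that fall in that year.
per_year = conn.execute(
    """
    SELECT number,
           SUM(CASE WHEN date BETWEEN '2019-01-01' AND '2019-12-31'
                    THEN 1 ELSE 0 END) AS a_2019,
           SUM(CASE WHEN date BETWEEN '2020-01-01' AND '2020-12-31'
                    THEN 1 ELSE 0 END) AS a_2020
    FROM t
    GROUP BY number
    ORDER BY number DESC
    """
).fetchall()

for row in per_year:
    print(row)
```

The rows come back as (23, 1, 2), (12, 1, 1), (11, 1, 0), matching the table in the question. The approach needs one CASE per year, so it suits a small, known set of years.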

Creating a new calculated column in SQL

Is there a way to get the result I need? June 24 appears 2 times, so that day has 2 UDs, while each of the remaining days appears only once.
I am showing the expected output here:
Primary key UD Date
-------------------------------------------
1 123 2015-06-24 00:00:00.000
6 456 2015-06-24 00:00:00.000
2 123 2015-06-25 00:00:00.000
3 658 2015-06-26 00:00:00.000
4 598 2015-06-27 00:00:00.000
5 156 2015-06-28 00:00:00.000
No of times Number of days
-----------------------------
4 1
2 2
The logic is that there are 4 users who used the application on 1 day and 2 users who used the application on 2 days.
You can use two levels of aggregation:
select cnt, count(*)
from (select date, count(*) as cnt
      from t
      group by date
     ) d
group by cnt
order by cnt desc;
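Here is a small sketch of that two-level aggregation run on the sample rows, using Python's sqlite3 module. The table name `t`, the column names, and the `num_dates` alias are assumptions for the demo. The inner query counts rows per date; the outer query counts how many dates share each of those counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pk INTEGER, ud INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 123, "2015-06-24"), (6, 456, "2015-06-24"),
        (2, 123, "2015-06-25"), (3, 658, "2015-06-26"),
        (4, 598, "2015-06-27"), (5, 156, "2015-06-28"),
    ],
)

# Level 1: rows per date. Level 2: number of dates having each row count.
histogram = conn.execute(
    """
    SELECT cnt, COUNT(*) AS num_dates
    FROM (SELECT date, COUNT(*) AS cnt
          FROM t
          GROUP BY date) d
    GROUP BY cnt
    ORDER BY cnt DESC
    """
).fetchall()

print(histogram)
```

On this data the query reports one date with 2 rows (June 24) and four dates with 1 row each. If the count is meant per user rather than per date, grouping the inner query by ud instead would be the analogous change.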

Getting date difference between consecutive rows in the same group

I have a database with the following data:
Group ID Time
1 1 16:00:00
1 2 16:02:00
1 3 16:03:00
2 4 16:09:00
2 5 16:10:00
2 6 16:14:00
I am trying to find the difference in times between the consecutive rows within each group. Using LAG() and DATEDIFF() (ie. https://stackoverflow.com/a/43055820), right now I have the following result set:
Group ID Difference
1 1 NULL
1 2 00:02:00
1 3 00:01:00
2 4 00:06:00
2 5 00:01:00
2 6 00:04:00
However I need the difference to reset when a new group is reached, as in below. Can anyone advise?
Group ID Difference
1 1 NULL
1 2 00:02:00
1 3 00:01:00
2 4 NULL
2 5 00:01:00
2 6 00:04:00
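The code would look something like:
select t.*,
       datediff(second,
                lag([time]) over (partition by [group] order by id),
                [time]) as diff_seconds
from t;
This returns the difference as a number of seconds, but you seem to know how to convert that to a time representation. Note that group is a SQL keyword, so as a column name it has to be quoted (shown here with SQL Server brackets, assuming SQL Server as the datediff(second, ...) form suggests) or, better, renamed.
Judging by your current result, you have put group in the order by clause of the lag(), not in the partition by. Only partition by makes lag() start over, returning NULL for the first row of each group.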
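The effect of PARTITION BY on LAG() can be checked with a short sketch using Python's sqlite3 module (SQLite 3.25 or later assumed). SQLite has no DATEDIFF, so the difference is computed from strftime('%s', ...) values instead, and the column is named `grp` here only to avoid quoting the keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (grp INTEGER, id INTEGER, time TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 1, "16:00:00"), (1, 2, "16:02:00"), (1, 3, "16:03:00"),
        (2, 4, "16:09:00"), (2, 5, "16:10:00"), (2, 6, "16:14:00"),
    ],
)

# LAG() looks back one row within the same grp only, so the first row of
# every group has no predecessor and its difference is NULL (None in Python).
diffs = conn.execute(
    """
    SELECT grp, id,
           CAST(strftime('%s', time) AS INTEGER)
         - CAST(strftime('%s', LAG(time) OVER (PARTITION BY grp ORDER BY id)) AS INTEGER)
           AS diff_seconds
    FROM t
    ORDER BY grp, id
    """
).fetchall()

for row in diffs:
    print(row)
```

Row 4, the first row of group 2, now yields None instead of the 6-minute gap to row 3.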

SQL - Datediff between rows with Rank Applied

I am trying to work out how to apply a datediff between rows where a rank is applied to the USER ID.
An example of the data is below:
UserID Order Number ScanDateStart ScanDateEnd Minute Difference Rank | Minute Difference Rank vs Rank+1
User1 10-24 10:20:00 10:40:00 20 1 | 5
User1 10-25 10:45:00 10:50:00 5 2 | 33
User1 10-26 11:12:00 11:45:00 33 3 | NULL
User2 10-10 00:09:00 00:09:20 20 1 | 4
User2 10-11 00:09:24 00:09:25 1 2 | 15
User2 10-12 00:09:40 00:10:12 32 3 | 3
User2 10-13 00:10:15 00:10:35 20 4 | NULL
What I'm looking for is how to code the final column of this table.
The rank is applied to UserID, ordered by ScanDateStart.
Basically, I want to know the time between the ScanDateEnd of Rank 1 and the ScanDateStart of Rank 2, and so on, but for each user (calculating the time between order processing, etc.).
Appreciate the help.
This can be achieved by performing a LEFT JOIN to the same table on the UserID column and the Rank column, plus 1.
The following (simplified) pseudo-code should illustrate how to achieve this:
SELECT R.UserID,
       R.Rank,
       R1.Diff
FROM Rank R
LEFT JOIN Rank R1 ON R1.UserID = R.UserID
                 AND R1.Rank = R.Rank + 1
Effectively, you are showing the UserID and Rank from the current row, but the Difference from the row of the same UserID with the Rank + 1.
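A runnable sketch of the Rank + 1 self-join follows, using Python's sqlite3 module (SQLite 3.25 or later assumed). It follows the prose description in the question, the gap from one row's ScanDateEnd to the next row's ScanDateStart for the same user, and reports it in seconds. The `scans` table, its column names, and the `ranked` CTE are hypothetical names for the demo; the rank is rebuilt with ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scans (user_id TEXT, order_number TEXT, scan_start TEXT, scan_end TEXT)"
)
conn.executemany(
    "INSERT INTO scans VALUES (?, ?, ?, ?)",
    [
        ("User1", "10-24", "10:20:00", "10:40:00"),
        ("User1", "10-25", "10:45:00", "10:50:00"),
        ("User1", "10-26", "11:12:00", "11:45:00"),
        ("User2", "10-10", "00:09:00", "00:09:20"),
        ("User2", "10-11", "00:09:24", "00:09:25"),
        ("User2", "10-12", "00:09:40", "00:10:12"),
        ("User2", "10-13", "00:10:15", "00:10:35"),
    ],
)

# Rank each user's rows by start time, then join every row to the row with
# rank + 1 for the same user; the last row per user has no match (NULL gap).
gaps = conn.execute(
    """
    WITH ranked AS (
        SELECT user_id, scan_start, scan_end,
               ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY scan_start) AS rnk
        FROM scans
    )
    SELECT r.user_id, r.rnk,
           CAST(strftime('%s', r1.scan_start) AS INTEGER)
         - CAST(strftime('%s', r.scan_end) AS INTEGER) AS gap_seconds
    FROM ranked r
    LEFT JOIN ranked r1 ON r1.user_id = r.user_id AND r1.rnk = r.rnk + 1
    ORDER BY r.user_id, r.rnk
    """
).fetchall()

for row in gaps:
    print(row)
```

On engines with window functions, LEAD(scan_start) OVER (PARTITION BY user_id ORDER BY scan_start) would give the next start time without the join; the self-join is kept here because it mirrors the approach described above.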