SQL Running total previous 3 months by date and id - sql

This is a simplification of the table q3 I'm working with:
Partno EndOfMonth AA AS EA ES
a 31.5.2017 5 1 0 1
b 31.5.2017 3 1 0 1
c 31.5.2017 2 2 0 1
a 31.6.2017 1 2 2 2
b 31.6.2017 1 0 1 2
c 31.6.2017 2 3 1 4
a 31.7.2017 4 3 2 0
b 31.7.2017 3 0 6 0
c 31.7.2017 4 1 0 0
I need to sum the numbers in the last four columns for each part in Partno so that the sum represents the running total of the last three months at each date in the EndOfMonth column.
The result I'm looking for is:
Partno EndOfMonth AA AS EA ES
a 31.5.2017 5 1 0 1
b 31.5.2017 3 1 0 1
c 31.5.2017 2 2 0 1
a 31.6.2017 6 3 2 3
b 31.6.2017 6 1 1 3
c 31.6.2017 4 5 1 5
a 31.7.2017 10 6 4 3
b 31.7.2017 7 1 7 3
c 31.7.2017 8 6 1 5
So, for example, for Partno a at 31.7.2017 the last three months' sum for the 'AA' column is 4+1+5=10.
I'm quite new to SQL and am well and truly stuck with this. I've tried something like the following to get just a simple rolling total (without even restricting the sum to the last 3 months). I'm also not sure the database supports all the functions in the code below, since it gives me the error "Incorrect Syntax near the keyword 'OVER'".
SELECT
Partno,
EndofMonth,
SUM(q3.AA) OVER (PARTITION BY q3.Partno ORDER BY EndofMonth ROWS UNBOUNDED PRECEDING) as 'AA'
FROM q3
Anyway, any help would be greatly appreciated!
Thanks
EDIT:
Thanks to Benjamin and with a little help from this post: https://dba.stackexchange.com/questions/114403/date-range-rolling-sum-using-window-functions
I was able to find the solution:
SELECT a.Partno, a.EndofMonth, SUM(b.AA) as 'AA', SUM(b.AS) as 'AS',...
FROM q3 a, q3 b
WHERE a.Partno = b.Partno AND a.endOfMonth >= b.endOfMonth
AND b.endOfMonth >= DATEADD(month,-2,a.endOfMonth)
GROUP BY a.Partno, a.endOfMonth

Something like this might work:
SELECT a.Partno, a.EndofMonth, SUM(b.AA) as AA
FROM q3 a, q3 b
WHERE a.Partno = b.Partno
AND DATEDIFF(month, b.endOfMonth, a.endOfMonth) BETWEEN 0 AND 2
GROUP BY a.Partno, a.EndofMonth
This assumes that endOfMonth is stored as a datetime; if it is not, you will have to use convert(). Note that you might have to replace DATEDIFF() depending on which implementation you are using.
I haven't tested this, so I might be way off. It has been a while since I worked with SQL. Hopefully you can get it working by messing around with it a bit, and if not then maybe it will inspire you to write something better. Let me know how it goes!
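The self-join idea can be tried out without a server. The following is only a minimal sketch using Python's built-in sqlite3 module, not the original database: SQLite has no DATEADD or DATEDIFF, so the date() function stands in for them, the dates are rewritten as ISO strings, and 30 June replaces the 31.6.2017 of the sample data. The function name rolling_three_month_sum is made up for illustration.

```python
import sqlite3

# Sample AA values from the question. Dates are ISO strings (an assumption)
# so that plain string comparison orders them chronologically.
ROWS = [
    ("a", "2017-05-31", 5), ("b", "2017-05-31", 3), ("c", "2017-05-31", 2),
    ("a", "2017-06-30", 1), ("b", "2017-06-30", 1), ("c", "2017-06-30", 2),
    ("a", "2017-07-31", 4), ("b", "2017-07-31", 3), ("c", "2017-07-31", 4),
]


def rolling_three_month_sum(rows):
    """Return {(partno, end_of_month): AA summed over that month and the two before}."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE q3 (Partno TEXT, EndOfMonth TEXT, AA INTEGER)")
    con.executemany("INSERT INTO q3 VALUES (?, ?, ?)", rows)
    # date(x, 'start of month', '-2 months') is the first day of the month
    # two months before x, which plays the role of DATEADD(month, -2, x).
    query = """
        SELECT a.Partno, a.EndOfMonth, SUM(b.AA)
        FROM q3 AS a
        JOIN q3 AS b
          ON b.Partno = a.Partno
         AND b.EndOfMonth <= a.EndOfMonth
         AND b.EndOfMonth >= date(a.EndOfMonth, 'start of month', '-2 months')
        GROUP BY a.Partno, a.EndOfMonth
    """
    result = {(partno, month): total for partno, month, total in con.execute(query)}
    con.close()
    return result


totals = rolling_three_month_sum(ROWS)
print(totals[("a", "2017-07-31")])
```

Grouping by both a.Partno and a.EndOfMonth is what gives one output row per part and month; each of those rows then sums every b row of the same part that falls inside its three-month window.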

Related

Creating 2 additional columns based on past dates - PostgresSQL

I'm seeking some help after spending a lot of time searching to no avail. I'm rather new to SQL, so any help is greatly appreciated. I've tried a few things (e.g. GROUP BY, BETWEEN) but can't seem to get it right.
On the PrestoSQL server, I have a table as shown below starting with columns Date, ID and COVID. Using GROUP BY ID, I would like to create a column EverCOVIDBefore which looks back at all past dates of the COVID column to see if there was ever COVID = 1 or not, as well as another column called COVID_last_2_mth which checks if there was ever COVID = 1 within the past 2 months.
(Highlighted columns are my expected outcomes)
Link to dataset: https://drive.google.com/file/d/1Sc5Olrx9g2A36WnLcCFMU0YTQ3-qWROU/view?usp=sharing
You can do:
select *,
max(covid) over(partition by id order by date) as ever_covid_before,
max(covid) over(partition by id order by date
range between interval '2 month' preceding and current row)
as covid_last_two_months
from t
Result:
date id covid ever_covid_before covid_last_two_months
----------- --- ------ ------------------ ---------------------
2020-01-15 1 0 0 0
2020-02-15 1 0 0 0
2020-03-15 1 1 1 1
2020-04-15 1 0 1 1
2020-05-15 1 0 1 1
2020-06-15 1 0 1 0
2020-01-15 2 0 0 0
2020-02-15 2 1 1 1
2020-03-15 2 0 1 1
2020-04-15 2 0 1 1
2020-05-15 2 0 1 0
2020-06-15 2 1 1 1
See running example at db<>fiddle.
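To check the two columns locally, here is a rough sketch with Python's sqlite3 module. It is not the query above verbatim: SQLite (3.25 or later is assumed for window functions) does not accept an interval in a RANGE frame, so the two-month lookback is swapped for a correlated subquery using date(..., '-2 months'). The function name covid_flags is hypothetical.

```python
import sqlite3

# (date, id, covid) rows copied from the result table above.
ROWS = [
    ("2020-01-15", 1, 0), ("2020-02-15", 1, 0), ("2020-03-15", 1, 1),
    ("2020-04-15", 1, 0), ("2020-05-15", 1, 0), ("2020-06-15", 1, 0),
    ("2020-01-15", 2, 0), ("2020-02-15", 2, 1), ("2020-03-15", 2, 0),
    ("2020-04-15", 2, 0), ("2020-05-15", 2, 0), ("2020-06-15", 2, 1),
]


def covid_flags(rows):
    """Return (id, date, ever_covid_before, covid_last_two_months) per row."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (date TEXT, id INTEGER, covid INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    # The running MAX is the same window function as in the answer. The
    # lookback is a correlated subquery over the same id, limited to rows
    # dated between two months before the current row and the current row.
    query = """
        SELECT t.id,
               t.date,
               MAX(t.covid) OVER (PARTITION BY t.id ORDER BY t.date)
                   AS ever_covid_before,
               (SELECT MAX(p.covid)
                  FROM t AS p
                 WHERE p.id = t.id
                   AND p.date BETWEEN date(t.date, '-2 months') AND t.date)
                   AS covid_last_two_months
        FROM t
        ORDER BY t.id, t.date
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


for row in covid_flags(ROWS):
    print(row)
```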

How to make one column match duplicates in another column

This problem is out of my ability range and I can’t get anywhere with it beyond knowing I can probably use LEAD, LAG or maybe a cursor?
Here is a breakdown of the table and question:
row_id is always an IDENTITY(1, 1) column.
The set_id column always starts out in groups of 3s (two 0s for the first set_id, don't worry about why).
The letter column is alphabetic. There are varying counts of duplicates.
Here's the original table:
row_id | set_id | letter
1 | 0 | A
2 | 0 | A
3 | 1 | A
4 | 1 | B
5 | 1 | B
6 | 2 | B
7 | 2 | B
8 | 2 | C
9 | 3 | C
10 | 3 | C
11 | 3 | D
12 | 4 | D
13 | 4 | D
14 | 4 | D
What I need is a code that: if there is a duplicate letter in the next row, then the set_id in the next row should be the same as the previous row (alt_set_id).
If that doesn't make sense, here is the result I want:
row_id | set_id | letter | alt_set_id
1 | 0 | A | 0
2 | 0 | A | 0
3 | 1 | A | 0
4 | 1 | B | 1
5 | 1 | B | 1
6 | 2 | B | 1
7 | 2 | B | 1
8 | 2 | C | 2
9 | 3 | C | 2
10 | 3 | C | 2
11 | 3 | D | 3
12 | 4 | D | 3
13 | 4 | D | 3
14 | 4 | D | 3
Here's where I am with code so far, I'm not really close but I think I am on the right path:
SELECT
*,
CASE
WHEN letter = [letter in next row]
THEN 'yes'
ELSE 'no'
END AS 'next row a duplicate?',
'tbd' AS alt_set_id
FROM
(SELECT
*,
LEAD(letter) OVER (ORDER BY row_id) AS 'letter in next row'
FROM
sort_test) AS dt
WHERE
row_id = row_id
That query has the below result set, which is something I think I can work with, but it doesn't feel very efficient and I'm not yet getting the result needed in the alt_set_id column:
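row_id | set_id | letter | letter in next row | next row a duplicate? | alt_set_id
1 | 0 | A | A | yes | tbd
2 | 0 | A | A | yes | tbd
3 | 1 | A | B | no | tbd
4 | 1 | B | B | yes | tbd
5 | 1 | B | B | yes | tbd
6 | 2 | B | B | yes | tbd
7 | 2 | B | C | no | tbd
8 | 2 | C | C | yes | tbd
9 | 3 | C | C | yes | tbd
10 | 3 | C | D | no | tbd
11 | 3 | D | D | yes | tbd
12 | 4 | D | D | yes | tbd
13 | 4 | D | D | yes | tbd
14 | 4 | D | NULL | no | tbd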
Thanks for any help!
Based on your example data, you want the minimum set_id for each letter. If so, use a window function:
select t.*, min(set_id) over (partition by letter) as alt_set_id
from sort_test t;
If I understand correctly, a simple correlated subquery will give you the desired result:
select *, (select Min(set_Id) from t t2 where t2.letter=t.letter) as alt_set_id
from t
See working DB Fiddle
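Both answers reduce to "the smallest set_id seen for each letter". A small sketch with Python's sqlite3 module (SQLite 3.25 or later assumed for window functions) reproduces the wanted alt_set_id column from the question's 14 rows; the function name alt_set_ids is made up for illustration.

```python
import sqlite3

# (row_id, set_id, letter) rows from the question's original table.
ROWS = [
    (1, 0, "A"), (2, 0, "A"), (3, 1, "A"),
    (4, 1, "B"), (5, 1, "B"), (6, 2, "B"), (7, 2, "B"),
    (8, 2, "C"), (9, 3, "C"), (10, 3, "C"),
    (11, 3, "D"), (12, 4, "D"), (13, 4, "D"), (14, 4, "D"),
]


def alt_set_ids(rows):
    """Return (row_id, set_id, letter, alt_set_id) ordered by row_id."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE sort_test (row_id INTEGER PRIMARY KEY, set_id INTEGER, letter TEXT)"
    )
    con.executemany("INSERT INTO sort_test VALUES (?, ?, ?)", rows)
    # MIN(...) OVER (PARTITION BY letter) attaches the per-letter minimum
    # to every row without collapsing the rows the way GROUP BY would.
    query = """
        SELECT row_id, set_id, letter,
               MIN(set_id) OVER (PARTITION BY letter) AS alt_set_id
        FROM sort_test
        ORDER BY row_id
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


for row in alt_set_ids(ROWS):
    print(row)
```

This relies on each letter forming a single contiguous run, as in the sample data. If a letter could reappear later in the table, the per-letter minimum would merge the separate runs.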

How to show the closest date to the selected one

I'm trying to extract the stock on a specific date. To do so, I'm computing a cumulative sum of stock movements by date, product and warehouse.
select m.codart AS REF,
m.descart AS 'DESCRIPTION',
m.codalm AS WAREHOUSE,
m.descalm AS WAREHOUSEDESCRIP,
m.unidades AS UNITS,
m.entran AS 'IN',
m.salen AS 'OUT',
m.entran*1 + m.salen*-1 as MOVEMENT,
(select sum(m1.entran*1 + m1.salen*-1)
from MOVSTOCKS m1
where m1.codart = m.codart and m1.codalm = m.codalm and m.fecdoc >= m1.fecdoc) as 'CUMULATIVE',
m.PRCMEDIO as 'VALUE',
m.FECDOC as 'DATE',
m.REFERENCIA as 'REF',
m.tipdoc as 'DOCUMENT'
from MOVSTOCKS m
where (m.entran <> 0 or m.salen <> 0)
and (select max(m2.fecdoc) from MOVSTOCKS m2) < '2020-11-30T00:00:00.000'
order by m.fecdoc
Without the and (select max(m2.fecdoc) from MOVSTOCKS m2) < '2020-11-30T00:00:00.000' it shows data like this, which is ok.
REF WAREHOUSE UNITS IN OUT MOVEMENT CUMULATIVE DATE
1 0 2 0 2 -2 -7 2020-11-25
1 1 3 0 3 -3 -3 2020-11-25
1 0 5 0 5 -5 -7 2020-11-25
1 0 9 9 0 9 2 2020-11-26
2 0 2 2 0 2 2 2020-11-26
1 0 1 1 0 1 3 2020-12-01
The problem is that with the subselect in the where clause it returns no results (I think this is because it just looks up the overall max date, which is later than 2020-11-30). I would like it to show the row closest to the selected date, in this case 2020-11-30, for each product and warehouse.
It should look like this:
REF WAREHOUSE UNITS IN OUT MOVEMENT CUMULATIVE DATE
1 1 3 0 3 -3 -3 2020-11-25
1 0 9 9 0 9 2 2020-11-26
2 0 2 2 0 2 2 2020-11-26
Sorry if I'm not clear. Ask me if I have to clarify anything
Thank you
I am guessing that you want something like this:
select t.*
from (select m.*,
sum(m.entran - m.salen) over (partition by m.codart, m.codalm order by m.fecdoc) as cumulative,
max(m.fecdoc) over (partition by m.codart, m.codalm) as max_fecdoc
from MOVSTOCKS m
where m.fecdoc < '2020-11-30'
) t
where t.fecdoc = t.max_fecdoc;
The subquery calculates the cumulative amount of stock using window functions and filters for records before the cutoff date. The outer query selects the most recent record for each combination of codart/codalm, which seems to be how you are identifying a product.
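As a quick check of this approach against the sample rows in the question, here is a sketch using Python's sqlite3 module (SQLite 3.25 or later assumed for window functions). Only the columns needed for the calculation are kept, and the function name stock_before is hypothetical.

```python
import sqlite3

# (codart, codalm, entran, salen, fecdoc) taken from the question's sample output.
ROWS = [
    (1, 0, 0, 2, "2020-11-25"),
    (1, 1, 0, 3, "2020-11-25"),
    (1, 0, 0, 5, "2020-11-25"),
    (1, 0, 9, 0, "2020-11-26"),
    (2, 0, 2, 0, "2020-11-26"),
    (1, 0, 1, 0, "2020-12-01"),
]


def stock_before(rows, cutoff):
    """Return (codart, codalm, cumulative, fecdoc) for the latest movement before cutoff."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE MOVSTOCKS (codart INTEGER, codalm INTEGER, "
        "entran INTEGER, salen INTEGER, fecdoc TEXT)"
    )
    con.executemany("INSERT INTO MOVSTOCKS VALUES (?, ?, ?, ?, ?)", rows)
    # Inner query: running stock and latest date per product and warehouse,
    # restricted to movements before the cutoff.
    # Outer query: keep only the rows that sit on that latest date.
    query = """
        SELECT t.codart, t.codalm, t.cumulative, t.fecdoc
        FROM (SELECT m.*,
                     SUM(m.entran - m.salen)
                         OVER (PARTITION BY m.codart, m.codalm ORDER BY m.fecdoc)
                         AS cumulative,
                     MAX(m.fecdoc)
                         OVER (PARTITION BY m.codart, m.codalm) AS max_fecdoc
              FROM MOVSTOCKS AS m
              WHERE m.fecdoc < ?) AS t
        WHERE t.fecdoc = t.max_fecdoc
        ORDER BY t.codart, t.codalm
    """
    result = con.execute(query, (cutoff,)).fetchall()
    con.close()
    return result


for row in stock_before(ROWS, "2020-11-30"):
    print(row)
```

If several movements for the same product and warehouse share the latest date, all of them are returned, each carrying the same cumulative value, because rows with equal fecdoc are peers in the default window frame.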

Customers who bought and not bought some product in last 90 days

I need a DAX measure which shows me which customers bought products B and C in the last 90 days,
and another one which shows me those who did not buy products B and C in the last 90 days
(based on my date filter context).
Below is like it should be:
Can someone help me?
Here is a sample data if needed:
FactSales
KeyDate KeyCustomer KeyProduct Total
1 1 1 12,9
1 2 2 13
1 3 1 156,4
1 4 1 564,8
2 1 1 894,8
2 2 1 56,5
3 1 2 564,85
3 2 3 564,8
4 1 1 1325,6
4 2 1 132,3
Customer
KeyCustomer Name
1 Jean
2 Mari
3 Lisa
4 Julian
5 Jhonny
Calendar
KeyDate Date
1 01/01/2018
2 02/01/2018
3 01/05/2018
4 01/08/2018
Product
KeyProduct Product
1 A
2 B
3 C
Try something along these lines:
IfBought = IF(
COUNTROWS(
FILTER(FactSales,
RELATED('Product'[Product]) IN {"B", "C"} &&
RELATED('Calendar'[Date]) > TODAY() - 90)
) > 0,
1, 0)
Note that May 1st is more than 90 days ago as of today, so you won't get the result you asked for unless you change 90 to 114 or greater.
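The flag logic can also be checked outside DAX. The sketch below uses Python's sqlite3 module on the sample data and rests on several assumptions: the Calendar dates are read as day/month/year (consistent with the "May 1st" remark), "B and C" is treated as "B or C" as in the measure above, and a hypothetical as_of parameter replaces TODAY() so the result does not depend on when it is run. The function name bought_b_or_c is made up for illustration.

```python
import sqlite3

# Sample data from the question; Total is omitted because the flag ignores it.
FACT_SALES = [(1, 1, 1), (1, 2, 2), (1, 3, 1), (1, 4, 1), (2, 1, 1),
              (2, 2, 1), (3, 1, 2), (3, 2, 3), (4, 1, 1), (4, 2, 1)]
CUSTOMERS = [(1, "Jean"), (2, "Mari"), (3, "Lisa"), (4, "Julian"), (5, "Jhonny")]
# Dates converted to ISO, assuming the original format is dd/mm/yyyy.
CALENDAR = [(1, "2018-01-01"), (2, "2018-01-02"), (3, "2018-05-01"), (4, "2018-08-01")]
PRODUCTS = [(1, "A"), (2, "B"), (3, "C")]


def bought_b_or_c(as_of):
    """Return {customer name: 1 or 0} for a B or C purchase in the 90 days up to as_of."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE FactSales (KeyDate INTEGER, KeyCustomer INTEGER, KeyProduct INTEGER)")
    con.execute("CREATE TABLE Customer (KeyCustomer INTEGER, Name TEXT)")
    con.execute("CREATE TABLE Calendar (KeyDate INTEGER, Date TEXT)")
    con.execute("CREATE TABLE Product (KeyProduct INTEGER, Product TEXT)")
    con.executemany("INSERT INTO FactSales VALUES (?, ?, ?)", FACT_SALES)
    con.executemany("INSERT INTO Customer VALUES (?, ?)", CUSTOMERS)
    con.executemany("INSERT INTO Calendar VALUES (?, ?)", CALENDAR)
    con.executemany("INSERT INTO Product VALUES (?, ?)", PRODUCTS)
    # EXISTS yields 1 when at least one qualifying sale is found, else 0,
    # mirroring IF(COUNTROWS(FILTER(...)) > 0, 1, 0) evaluated per customer.
    query = """
        SELECT c.Name,
               EXISTS (SELECT 1
                         FROM FactSales AS f
                         JOIN Product AS p ON p.KeyProduct = f.KeyProduct
                         JOIN Calendar AS d ON d.KeyDate = f.KeyDate
                        WHERE f.KeyCustomer = c.KeyCustomer
                          AND p.Product IN ('B', 'C')
                          AND d.Date > date(:as_of, '-90 days')
                          AND d.Date <= :as_of) AS bought
        FROM Customer AS c
        ORDER BY c.KeyCustomer
    """
    result = dict(con.execute(query, {"as_of": as_of}).fetchall())
    con.close()
    return result


print(bought_b_or_c("2018-07-01"))
```

Customers flagged 0 are the "did not buy" group, so a single flag covers both of the requested lists.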

SQL for MS Access: Another question about COUNT, JOIN, 0s and Dates

I asked a question regarding joins yesterday. Although that answered my initial question, I'm having more problems.
I have a telephony table
ID | Date | Grade
1 07/19/2010 Grade 1
2 07/19/2010 Grade 1
3 07/20/2010 Grade 1
4 07/20/2010 Grade 2
5 07/21/2010 Grade 3
I also have a Grade table
ID | Name
1 Grade 1
2 Grade 2
3 Grade 3
4 Grade 4
5 Grade 5
6 Grade 6
7 Grade 7
8 Grade 8
9 Grade 9
10 Grade 10
11 Grade 11
12 Grade 12
I use the following query to get the COUNT of every grade in the telephony table, it works great.
SELECT grade.ID, Count(telephony.Grade) AS Total
FROM grade LEFT JOIN telephony ON grade.ID=telephony.Grade
GROUP BY grade.ID
ORDER BY 1;
This returns
ID | Total
1 3
2 1
3 1
4 0
5 0
6 0
7 0
8 0
9 0
10 0
11 0
12 0
However, what I'm trying to do is the following:
Group by date and only return results between two dates
SELECT telephony.Date, grade.ID, Count(telephony.Grade) AS Total
FROM grade LEFT JOIN telephony ON grade.ID=telephony.Grade
WHERE telephony.Date BETWEEN #07/19/2010# AND #07/23/2010#
GROUP BY telephony.Date, grade.ID
ORDER BY 1;
I'm getting the following
Date | ID | Total
07/19/2010 1 2
07/20/2010 1 1
07/20/2010 2 1
07/21/2010 3 1
It's not returning all the grades with 0 entries between the two dates, only the entries that exist for those dates. What I'm looking for is something like this:
Date | ID | Total
07/19/2010 1 2
07/19/2010 2 0
07/19/2010 3 0
07/19/2010 4 0
07/19/2010 5 0
07/19/2010 6 0
07/19/2010 7 0
07/19/2010 8 0
07/19/2010 9 0
07/19/2010 10 0
07/19/2010 11 0
07/19/2010 12 0
07/20/2010 1 1
07/20/2010 2 1
07/20/2010 3 0
07/20/2010 4 0
07/20/2010 5 0
07/20/2010 6 0
07/20/2010 7 0
07/20/2010 8 0
07/20/2010 9 0
07/20/2010 10 0
07/20/2010 11 0
07/20/2010 12 0
07/21/2010 1 2
07/21/2010 2 0
07/21/2010 3 1
07/21/2010 4 0
07/21/2010 5 0
07/21/2010 6 0
07/21/2010 7 0
07/21/2010 8 0
07/21/2010 9 0
07/21/2010 10 0
07/21/2010 11 0
07/21/2010 12 0
07/22/2010 1 2
07/22/2010 2 0
07/22/2010 3 0
07/22/2010 4 0
07/22/2010 5 0
07/22/2010 6 0
07/22/2010 7 0
07/22/2010 8 0
07/22/2010 9 0
07/22/2010 10 0
07/22/2010 11 0
07/22/2010 12 0
07/23/2010 1 2
07/23/2010 2 0
07/23/2010 3 0
07/23/2010 4 0
07/23/2010 5 0
07/23/2010 6 0
07/23/2010 7 0
07/23/2010 8 0
07/23/2010 9 0
07/23/2010 10 0
07/23/2010 11 0
07/23/2010 12 0
I hope someone can help. I'm using Microsoft Access 2003.
Cheers
Create a separate query on telephony which uses your BETWEEN #07/19/2010# AND #07/23/2010# constraint.
qryTelephonyDateRange:
SELECT *
FROM telephony
WHERE [Date] BETWEEN #07/19/2010# AND #07/23/2010#;
Then, in your original query, use:
LEFT JOIN qryTelephonyDateRange ON grade.ID=qryTelephonyDateRange.Grade
instead of
LEFT JOIN telephony ON grade.ID=telephony.Grade
You could use a subquery instead of a separate named query for qryTelephonyDateRange.
Note Date is a reserved word, so I bracketed the name to avoid ambiguity ... Access' database engine will understand it is supposed to be looking for a field named Date instead of the VBA Date() function. However, if it were my project, I would rename the field to avoid ambiguity ... name it something like tDate.
Update: You asked to see a subquery approach. Try this:
SELECT g.ID, t.[Date], Count(t.Grade) AS Total
FROM
grade AS g
LEFT JOIN (
SELECT Grade, [Date]
FROM telephony
WHERE [Date] BETWEEN #07/19/2010# AND #07/23/2010#
) AS t
ON g.ID=t.Grade
GROUP BY g.ID, t.[Date]
ORDER BY 1, 2;
Try this:
SELECT grade.ID, Count(telephony.Grade) AS Total
FROM grade LEFT JOIN telephony ON grade.ID=telephony.Grade
GROUP BY grade.ID
HAVING COUNT(telephony.Grade) > 0
ORDER BY grade.ID;
That's completely different.
You want a range of individual dates joined with your first table, and the between clause isn't going to do that for you.
I think you'll need a table with all the dates you want, say from 1/1/2010 to 12/31/2010, or whatever range you need to support. One column, 365 or however many rows with one date value each.
Then join that table with the ones holding the dates and grades, and limit by your date range.
Then do the aggregation to count.
Take it one step at a time and it will be easier to figure out.
The way I got it to work was to:
Create a table named Dates with a single primary key date/time field named MyDate (I'm with HansUp on not using reserved words like "Date" for field names).
Fill the table with the date values I wanted (7/19/2010 to 7/23/2010, as in your example).
Write a query with the following SQL statement
SELECT x.MyDate AS [Date], x.ID, Count(t.ID) AS Total
FROM (SELECT Dates.MyDate, Grade.ID FROM Dates, Grade) AS x
LEFT JOIN Telephony AS t ON (x.MyDate = t.[Date]) AND (x.ID = t.Grade)
GROUP BY x.MyDate, x.ID;
That should get the results you asked for.
The subquery statement in the SQL creates a cross-join to get you every combination of date in the Dates table and grade in the Grade table.
(SELECT Dates.MyDate, Grade.ID FROM Dates, Grade) AS x
Once you have that, then it's just an outer join to the Telephony table to do the rest.
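The cross-join-then-outer-join steps can be sketched end to end with Python's sqlite3 module. This is an illustration rather than Access SQL: the telephony date column is renamed TDate (a hypothetical name, following the advice above to avoid the reserved word Date), grades are stored as their numeric IDs, and dates are ISO strings. The function name counts_per_date_and_grade is made up for illustration.

```python
import sqlite3

# (ID, TDate, Grade) rows from the question's telephony table.
TELEPHONY = [
    (1, "2010-07-19", 1), (2, "2010-07-19", 1), (3, "2010-07-20", 1),
    (4, "2010-07-20", 2), (5, "2010-07-21", 3),
]
DATES = ["2010-07-19", "2010-07-20", "2010-07-21", "2010-07-22", "2010-07-23"]


def counts_per_date_and_grade():
    """Return (date, grade_id, total) for every date and grade pair, zeros included."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Grade (ID INTEGER PRIMARY KEY)")
    con.execute("CREATE TABLE Dates (MyDate TEXT PRIMARY KEY)")
    con.execute("CREATE TABLE Telephony (ID INTEGER PRIMARY KEY, TDate TEXT, Grade INTEGER)")
    con.executemany("INSERT INTO Grade VALUES (?)", [(g,) for g in range(1, 13)])
    con.executemany("INSERT INTO Dates VALUES (?)", [(d,) for d in DATES])
    con.executemany("INSERT INTO Telephony VALUES (?, ?, ?)", TELEPHONY)
    # x holds every (date, grade) combination. The LEFT JOIN keeps the
    # combinations with no calls, and COUNT(t.ID) counts only matched rows,
    # so those combinations come out as 0.
    query = """
        SELECT x.MyDate, x.ID, COUNT(t.ID) AS Total
        FROM (SELECT Dates.MyDate, Grade.ID FROM Dates CROSS JOIN Grade) AS x
        LEFT JOIN Telephony AS t
               ON x.MyDate = t.TDate AND x.ID = t.Grade
        GROUP BY x.MyDate, x.ID
        ORDER BY x.MyDate, x.ID
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


rows = counts_per_date_and_grade()
print(len(rows))
print(rows[:3])
```

Counting t.ID rather than using COUNT(*) matters here: COUNT(*) would count the NULL-extended row of an unmatched pair and report 1 instead of 0.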