Generating columns for daily stats in SQL - sql

I have a table that currently looks like this (simplified to illustrate my issue):
Thing| Date
1 2022-12-12
2 2022-11-05
3 2022-11-18
4 2022-12-01
1 2022-11-02
2 2022-11-21
5 2022-12-03
5 2022-12-08
2 2022-11-18
1 2022-11-20
I would like to generate the following:
Thing| 2022-11 | 2022-12
1 2 1
2 3 0
3 1 0
4 0 1
5 0 2
I'm new to SQL and can't quite figure this out - would I use some sort of FOR loop equivalent in my SELECT clause? I'm happy to figure out the exact syntax myself, I just need someone to point me in the right direction.
Thank you!

You can use conditional aggregation, as follows:
Select Thing,
Count(Case When Date Between '2022-11-01' And '2022-11-30' Then 1 End) As '2022-11',
Count(Case When Date Between '2022-12-01' And '2022-12-31' Then 1 End) As '2022-12'
From table_name
Group By Thing
Order By Thing
See a demo.
The Count function counts only non-null values. The Case expression has no Else branch, so it returns null for every row that does not match the condition, and those rows are not counted.
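For anyone who wants to try this locally, here is a minimal sketch that runs the same conditional aggregation against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for whatever database the asker has (an assumption on my part), and the table name table_name is just the placeholder from the query above.

```python
import sqlite3

# Sample rows copied from the question: (Thing, Date).
rows = [
    (1, "2022-12-12"), (2, "2022-11-05"), (3, "2022-11-18"), (4, "2022-12-01"),
    (1, "2022-11-02"), (2, "2022-11-21"), (5, "2022-12-03"), (5, "2022-12-08"),
    (2, "2022-11-18"), (1, "2022-11-20"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (Thing INTEGER, Date TEXT)")
conn.executemany("INSERT INTO table_name VALUES (?, ?)", rows)

# CASE without ELSE yields NULL for non-matching rows, and COUNT skips NULLs.
query = """
SELECT Thing,
       COUNT(CASE WHEN Date BETWEEN '2022-11-01' AND '2022-11-30' THEN 1 END) AS "2022-11",
       COUNT(CASE WHEN Date BETWEEN '2022-12-01' AND '2022-12-31' THEN 1 END) AS "2022-12"
FROM table_name
GROUP BY Thing
ORDER BY Thing
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each month needs its own Count column, so for a long list of months the column list is usually generated by the calling code rather than written by hand.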

Related

Creating 2 additional columns based on past dates - PostgresSQL

I'm seeking some help after spending a lot of time searching to no avail. I'm rather new to SQL, so any help is greatly appreciated. I've tried a few things (e.g. GROUP BY, BETWEEN) but can't seem to get it right.
On the PrestoSQL server, I have a table as shown below, starting with the columns Date, ID and COVID. Using GROUP BY ID, I would like to create a column EverCOVIDBefore, which looks back at all past dates of the COVID column to see whether there was ever COVID = 1, as well as another column called COVID_last_2_mth, which checks whether there was ever COVID = 1 within the past 2 months.
(Highlighted columns are my expected outcomes)
Link to dataset: https://drive.google.com/file/d/1Sc5Olrx9g2A36WnLcCFMU0YTQ3-qWROU/view?usp=sharing
You can do:
select *,
max(covid) over(partition by id order by date) as ever_covid_before,
max(covid) over(partition by id order by date
range between interval '2 month' preceding and current row)
as covid_last_two_months
from t
Result:
date id covid ever_covid_before covid_last_two_months
----------- --- ------ ------------------ ---------------------
2020-01-15 1 0 0 0
2020-02-15 1 0 0 0
2020-03-15 1 1 1 1
2020-04-15 1 0 1 1
2020-05-15 1 0 1 1
2020-06-15 1 0 1 0
2020-01-15 2 0 0 0
2020-02-15 2 1 1 1
2020-03-15 2 0 1 1
2020-04-15 2 0 1 1
2020-05-15 2 0 1 0
2020-06-15 2 1 1 1
See running example at db<>fiddle.
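A rough way to reproduce the result table above without a server is the sketch below, using Python's built-in sqlite3 module (my assumption, not the asker's setup). The running max uses the same window function as the answer. SQLite does not accept an interval in a RANGE frame, so for the two-month lookback I swapped in a correlated subquery with date(..., '-2 months'), which is a different technique from the one in the answer. It needs an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# (date, id, covid) rows taken from the first three columns of the result table.
rows = [
    ("2020-01-15", 1, 0), ("2020-02-15", 1, 0), ("2020-03-15", 1, 1),
    ("2020-04-15", 1, 0), ("2020-05-15", 1, 0), ("2020-06-15", 1, 0),
    ("2020-01-15", 2, 0), ("2020-02-15", 2, 1), ("2020-03-15", 2, 0),
    ("2020-04-15", 2, 0), ("2020-05-15", 2, 0), ("2020-06-15", 2, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, id INTEGER, covid INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

query = """
SELECT t.date, t.id, t.covid,
       -- running max over all rows up to and including the current date
       MAX(t.covid) OVER (PARTITION BY t.id ORDER BY t.date) AS ever_covid_before,
       -- stand-in for the RANGE ... INTERVAL frame: look back two months
       (SELECT MAX(p.covid)
          FROM t AS p
         WHERE p.id = t.id
           AND p.date BETWEEN date(t.date, '-2 months') AND t.date
       ) AS covid_last_two_months
FROM t
ORDER BY t.id, t.date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```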

Calculate time difference from previous record

I have a set of data for which I want to determine the difference in days between the Begin_time and End_Time of every 2 records, to get the processing time. I'm familiar with DateDiff('d','End_Time','Begin_Time') to determine the processing time on the same row, but how do I determine this against the previous record? For example, something like DateDiff('Record2.Begin_time','Record1.End_Time'), then DateDiff('Record4.Begin_time','Record3.End_Time'), then DateDiff('Record6.Begin_time','Record5.End_Time'), etc. It doesn't have to use the DateDiff function; I'm just using that to illustrate my question. Thanks.
Record Begin_Time End_Time Processing_Time
1 11/23/2020 11/24/2020 1
2 11/23/2020 11/24/2020 1
3 11/30/2020 11/30/2020 0
4 11/30/2020 11/30/2020 0
5 11/2/2020 11/3/2020 1
6 11/2/2020 11/3/2020 1
7 11/3/2020 11/5/2020 2
8 11/3/2020 11/5/2020 2
An approach could be like this (each even record is joined to the odd record just before it):
Select DateDiff(YourTableEven.Begin_time, YourTableOdd.End_Time)
From YourTable AS YourTableEven
Join YourTable AS YourTableOdd ON YourTableOdd.Record = YourTableEven.Record - 1
Where YourTableEven.Record % 2 = 0
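Here is a small sketch of that pairing idea, run against the sample rows from the question with Python's built-in sqlite3 module (an assumption; the asker's database is not stated). SQLite has no DateDiff, so the day difference is computed with julianday() instead, and the table name your_table is hypothetical. Each even record is matched to the odd record immediately before it.

```python
import sqlite3

# (record, begin_time, end_time) from the question, rewritten as ISO dates.
rows = [
    (1, "2020-11-23", "2020-11-24"), (2, "2020-11-23", "2020-11-24"),
    (3, "2020-11-30", "2020-11-30"), (4, "2020-11-30", "2020-11-30"),
    (5, "2020-11-02", "2020-11-03"), (6, "2020-11-02", "2020-11-03"),
    (7, "2020-11-03", "2020-11-05"), (8, "2020-11-03", "2020-11-05"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (record INTEGER, begin_time TEXT, end_time TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)

# Even record's Begin_Time minus the preceding odd record's End_Time, in days.
query = """
SELECT e.record AS even_record,
       o.record AS odd_record,
       CAST(julianday(e.begin_time) - julianday(o.end_time) AS INTEGER) AS days
FROM your_table AS e
JOIN your_table AS o ON o.record = e.record - 1
WHERE e.record % 2 = 0
ORDER BY e.record
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this sample data the differences come out zero or negative, because every even record begins on or before the day the preceding odd record ends.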

How to show the closest date to the selected one

I'm trying to extract the stock on a specific date. To do so, I'm computing a cumulative sum of stock movements by date, product and warehouse.
select m.codart AS REF,
m.descart AS 'DESCRIPTION',
m.codalm AS WAREHOUSE,
m.descalm AS WAREHOUSEDESCRIP,
m.unidades AS UNITS,
m.entran AS 'IN',
m.salen AS 'OUT',
m.entran*1 + m.salen*-1 as MOVEMENT,
(select sum(m1.entran*1 + m1.salen*-1)
from MOVSTOCKS m1
where m1.codart = m.codart and m1.codalm = m.codalm and m.fecdoc >= m1.fecdoc) as 'CUMULATIVE',
m.PRCMEDIO as 'VALUE',
m.FECDOC as 'DATE',
m.REFERENCIA as 'REF',
m.tipdoc as 'DOCUMENT'
from MOVSTOCKS m
where (m.entran <> 0 or m.salen <> 0)
and (select max(m2.fecdoc) from MOVSTOCKS m2) < '2020-11-30T00:00:00.000'
order by m.fecdoc
Without the and (select max(m2.fecdoc) from MOVSTOCKS m2) < '2020-11-30T00:00:00.000' it shows data like this, which is ok.
REF WAREHOUSE UNITS IN OUT MOVEMENT CUMULATIVE DATE
1 0 2 0 2 -2 -7 2020-11-25
1 1 3 0 3 -3 -3 2020-11-25
1 0 5 0 5 -5 -7 2020-11-25
1 0 9 9 0 9 2 2020-11-26
2 0 2 2 0 2 2 2020-11-26
1 0 1 1 0 1 3 2020-12-01
The problem is, with the subselect in the where clause it returns no results (I think it is because it just looks for the max date and says it is bigger than 2020-11-30). I would like it to show the closest dates (all of them, for each product and warehouse) to the selected one, in this case 2020-11-30.
It should look like this:
REF WAREHOUSE UNITS IN OUT MOVEMENT CUMULATIVE DATE
1 1 3 0 3 -3 -3 2020-11-25
1 0 9 9 0 9 2 2020-11-26
2 0 2 2 0 2 2 2020-11-26
Sorry if I'm not clear. Ask me if I have to clarify anything
Thank you
I am guessing that you want something like this:
select t.*
from (select m.*,
             sum(m.entran - m.salen) over (partition by m.codart, m.codalm order by m.fecdoc) as cumulative,
             max(m.fecdoc) over (partition by m.codart, m.codalm) as max_fecdoc
      from MOVSTOCKS m
      where m.fecdoc < '2020-11-30'
     ) t
where t.fecdoc = t.max_fecdoc;
The subquery calculates the cumulative amount of stock using window functions and keeps only the records before the cutoff date. The outer query then selects the most recent record for each combination of codart/codalm, which seems to be how you are identifying a product in a warehouse.
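To check the idea against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module (an assumption; the asker's database looks like SQL Server). Only the columns the query needs are created, and it requires an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# (codart, codalm, entran, salen, fecdoc) from the question's first result table.
rows = [
    (1, 0, 0, 2, "2020-11-25"),
    (1, 1, 0, 3, "2020-11-25"),
    (1, 0, 0, 5, "2020-11-25"),
    (1, 0, 9, 0, "2020-11-26"),
    (2, 0, 2, 0, "2020-11-26"),
    (1, 0, 1, 0, "2020-12-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movstocks (codart INTEGER, codalm INTEGER, "
    "entran INTEGER, salen INTEGER, fecdoc TEXT)"
)
conn.executemany("INSERT INTO movstocks VALUES (?, ?, ?, ?, ?)", rows)

# Inner query: running stock and latest date per product/warehouse before the cutoff.
# Outer query: keep only the rows on that latest date.
query = """
SELECT t.codart, t.codalm, t.fecdoc, t.cumulative
FROM (SELECT m.*,
             SUM(m.entran - m.salen)
                 OVER (PARTITION BY m.codart, m.codalm ORDER BY m.fecdoc) AS cumulative,
             MAX(m.fecdoc) OVER (PARTITION BY m.codart, m.codalm) AS max_fecdoc
      FROM movstocks AS m
      WHERE m.fecdoc < '2020-11-30') AS t
WHERE t.fecdoc = t.max_fecdoc
ORDER BY t.codart, t.codalm
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If several movements share the latest date for a product and warehouse, all of them are returned, each with the same cumulative value.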

Creating min and max values and comparing them to timestamp values sql

I have a PostgreSQL database and I have a table that I am looking to query to determine which presses have been updated between the first cycle created_timestamp and the most recent cycle created_timestamp. Here is an example of the table, which is called event_log_summary.
press_id cycle_number created_timestamp
1 1 2020-02-07 16:07:52
1 2 2020-02-07 16:07:53
1 3 2020-02-07 16:07:54
1 4 2020-04-01 13:23:10
2 1 2020-01-13 8:33:23
2 2 2020-01-13 8:33:24
2 3 2020-01-13 8:33:25
3 1 2020-02-21 18:45:44
3 2 2020-02-21 18:45:45
3 3 2020-02-26 14:22:12
This is the query I used to get a three-column output of press_id, minCycle and maxCycle. I now want to compare the created_timestamp of the max cycle with the created_timestamp of the min cycle and check whether at least some amount of time, say 1 day, lies between the two. I am unsure how to implement that.
SELECT
press_id,
MIN(cycle_number) AS minCycle,
MAX(cycle_number) AS maxCycle
FROM
event_log_detail
GROUP BY
press_id
I have tried different things, like using WHERE (MAX(cycle_number) - MIN(cycle_number > 1), but I am pretty new to SQL and don't quite know how to implement this. For a difference of at least one day, the output I am looking for would be the following:
press_id
1
3
For presses 1 and 3, the created_timestamp of the maximum cycle is at least 1 day later than the created_timestamp of the minimum cycle. I am just looking for the press_ids whose first cycle and last cycle are at least 1 day apart. I don't need any other information in the output, just one column with the press_ids. Any help would be appreciated. Thanks.
You can use a HAVING clause:
select press_id,
max(created_timestamp) - min(created_timestamp) as diff
from event_log_detail
group by press_id
having max(created_timestamp) >= min(created_timestamp) + interval '1 day';
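As a quick check of the HAVING approach on the sample data, here is a sketch using Python's built-in sqlite3 module (an assumption; the question is about PostgreSQL). SQLite has no interval type, so the one-day gap is tested with julianday() instead, and the single-digit hours from the question are zero-padded so the text timestamps sort correctly.

```python
import sqlite3

# (press_id, cycle_number, created_timestamp) from the question.
rows = [
    (1, 1, "2020-02-07 16:07:52"), (1, 2, "2020-02-07 16:07:53"),
    (1, 3, "2020-02-07 16:07:54"), (1, 4, "2020-04-01 13:23:10"),
    (2, 1, "2020-01-13 08:33:23"), (2, 2, "2020-01-13 08:33:24"),
    (2, 3, "2020-01-13 08:33:25"),
    (3, 1, "2020-02-21 18:45:44"), (3, 2, "2020-02-21 18:45:45"),
    (3, 3, "2020-02-26 14:22:12"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log_detail "
    "(press_id INTEGER, cycle_number INTEGER, created_timestamp TEXT)"
)
conn.executemany("INSERT INTO event_log_detail VALUES (?, ?, ?)", rows)

# HAVING filters whole groups, so aggregates are allowed in the condition.
query = """
SELECT press_id
FROM event_log_detail
GROUP BY press_id
HAVING julianday(MAX(created_timestamp)) - julianday(MIN(created_timestamp)) >= 1
ORDER BY press_id
"""
result = [press_id for (press_id,) in conn.execute(query)]
print(result)
```

The WHERE attempt in the question fails because WHERE is evaluated per row before grouping, while HAVING is evaluated per group after the aggregates are computed.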

SQL Running total previous 3 months by date and id

This is a simplification of the table q3 I'm working with:
Partno EndOfMonth AA AS EA ES
a 31.5.2017 5 1 0 1
b 31.5.2017 3 1 0 1
c 31.5.2017 2 2 0 1
a 31.6.2017 1 2 2 2
b 31.6.2017 1 0 1 2
c 31.6.2017 2 3 1 4
a 31.7.2017 4 3 2 0
b 31.7.2017 3 0 6 0
c 31.7.2017 4 1 0 0
I need to sum the numbers in the last four columns for each part in Partno so that the sum represents the running total of the last three months at each date in the EndOfMonth column.
The result I'm looking for is:
Partno EndOfMonth AA AS EA ES
a 31.5.2017 5 1 0 1
b 31.5.2017 3 1 0 1
c 31.5.2017 2 2 0 1
a 31.6.2017 6 3 2 3
b 31.6.2017 4 1 1 3
c 31.6.2017 4 5 1 5
a 31.7.2017 10 6 4 3
b 31.7.2017 7 1 7 3
c 31.7.2017 8 6 1 5
So e.g. for partno a at 31.7.2017, the last three months' sum for the 'AA' column is 4+1+5=10.
I'm quite new to SQL and am well and truly stuck with this. I've tried something like the following to just get a simple rolling total (without even specifying the sum range to be the last 3 months). Also, I'm not sure if the database even supports all the functions in the below code, since it's giving me the error "Incorrect Syntax near the keyword 'OVER'"
SELECT
Partno,
EndofMonth,
SUM(q3.AA) OVER (PARTITION BY q3.Partno ORDER BY EndofMonth ROWS UNBOUNDED PRECEDING) as 'AA'
FROM q3
Anyway, any help would be greatly appreciated!
Thanks
EDIT:
Thanks to Benjamin and with a little help from this post: https://dba.stackexchange.com/questions/114403/date-range-rolling-sum-using-window-functions
I was able to find the solution:
SELECT a.Partno, a.EndofMonth, SUM(b.AA) as 'AA', SUM(b.AS) as 'AS',...
FROM q3 a, q3 b
WHERE a.Partno = b.Partno AND a.endOfMonth >= b.endOfMonth
AND b.endOfMonth >= DATEADD(month,-2,a.endOfMonth)
GROUP BY a.Partno, a.endOfMonth
Something like this might work:
SELECT a.Partno, a.EndofMonth, SUM(b.AA) as AA
FROM q3 a, q3 b
WHERE a.Partno = b.Partno
AND DATEDIFF(month, b.endOfMonth, a.endOfMonth) BETWEEN 0 AND 2
GROUP BY a.Partno, a.EndofMonth
This assumes that endOfMonth is in datetime format; if it is not, you will have to use convert(). Note that you might have to replace DATEDIFF() depending on which implementation you are using.
I haven't tested this, so I might be way off. It has been a while since I worked with SQL. Hopefully you can get it working by messing around with it a bit, and if not then maybe it will inspire you to write something better. Let me know how it goes!
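For reference, here is a minimal sketch of the self-join rolling sum on the AA column, run with Python's built-in sqlite3 module (an assumption; the asker appears to be on SQL Server). SQLite has no DATEDIFF, so each date is turned into a month number (year * 12 + month) and the join keeps the current month plus the two before it. The dates are rewritten as ISO month-end dates, which is my own normalisation of the sample data.

```python
import sqlite3

# (Partno, EndOfMonth, AA) from the question, with dates rewritten in ISO form.
rows = [
    ("a", "2017-05-31", 5), ("b", "2017-05-31", 3), ("c", "2017-05-31", 2),
    ("a", "2017-06-30", 1), ("b", "2017-06-30", 1), ("c", "2017-06-30", 2),
    ("a", "2017-07-31", 4), ("b", "2017-07-31", 3), ("c", "2017-07-31", 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE q3 (Partno TEXT, EndOfMonth TEXT, AA INTEGER)")
conn.executemany("INSERT INTO q3 VALUES (?, ?, ?)", rows)

# month_no makes "at most two months earlier" a simple integer comparison.
query = """
WITH m AS (
    SELECT Partno, EndOfMonth, AA,
           CAST(strftime('%Y', EndOfMonth) AS INTEGER) * 12
           + CAST(strftime('%m', EndOfMonth) AS INTEGER) AS month_no
    FROM q3
)
SELECT a.Partno, a.EndOfMonth, SUM(b.AA) AS AA
FROM m AS a
JOIN m AS b
  ON a.Partno = b.Partno
 AND a.month_no - b.month_no BETWEEN 0 AND 2
GROUP BY a.Partno, a.EndOfMonth
ORDER BY a.EndOfMonth, a.Partno
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The lower bound of 0 matters: without it, later months would also be joined in and added to earlier totals.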