How to filter out, in SQL, the leading NULL records of each group in a partitioned, ordered table? - sql

The Data
ROW YEAR PROD KEY DATE
1 2011 APPLE TIME 2011-11-18 00:00:00.000
2 2011 APPLE TIME 2011-11-19 00:00:00.000
3 2013 APPLE NULL 2011-11-18 00:00:00.000
4 2013 APPLE NULL 2011-11-19 00:00:00.000
5 2013 APPLE TIME 2014-04-08 00:00:00.000
6 2013 APPLE DIM 2014-04-09 00:00:00.000
7 2013 APPLE TIME 2014-11-10 10:50:14.113
8 2013 APPLE TIME 2014-11-12 10:46:04.947
9 2013 MELON JAK 2011-10-17 11:01:19.657
10 2013 MELON TIME 2014-11-18 11:19:35.547
11 2013 MELON NULL 2014-11-19 11:19:35.547
12 2013 MELON TIME 2014-11-21 10:32:36.017
13 2014 APPLE JAK 2003-04-10 00:00:00.000
14 2014 APPLE DIM 2003-04-11 00:00:00.000
15 2015 APPLE TIME 2002-09-27 00:00:00.000
16 2015 APPLE NULL 2004-09-28 00:00:00.000
ROW is not a column in the table; it is only there to show which records I want.
The question
The above data is partitioned by (YEAR, PROD) and ordered by DATE.
I need to keep all the rows except rows 3 and 4, based on the following logic:
if the first rows of a group (here (YEAR, PROD)) have a NULL KEY, discard them.
Rows 11 and 16 are NULL but we keep them because they are not the first of their group.
Each group has to start with a record whose KEY is not NULL
==> otherwise discard
In other words, I can have: not null, null, not null, null
But I cannot have: null, not null, null, not null
Expected result
ROW YEAR PROD KEY DATE
1 2011 APPLE TIME 2011-11-18 00:00:00.000
2 2011 APPLE TIME 2011-11-19 00:00:00.000
5 2013 APPLE TIME 2014-04-08 00:00:00.000
6 2013 APPLE DIM 2014-04-09 00:00:00.000
7 2013 APPLE TIME 2014-11-10 10:50:14.113
8 2013 APPLE TIME 2014-11-12 10:46:04.947
9 2013 MELON JAK 2011-10-17 11:01:19.657
10 2013 MELON TIME 2014-11-18 11:19:35.547
11 2013 MELON TIME 2014-11-19 11:19:35.547
12 2013 MELON TIME 2014-11-21 10:32:36.017
13 2014 APPLE JAK 2003-04-10 00:00:00.000
14 2014 APPLE DIM 2003-04-11 00:00:00.000
15 2015 APPLE TIME 2002-09-27 00:00:00.000
16 2015 APPLE TIME 2004-09-28 00:00:00.000
I want to do that so that later I always have a non-NULL key at the beginning of each group.
That way, I can always use the preceding row to fill subsequent records that have a NULL value (in this example, rows 11 and 16).
Any observation or suggestion would be much appreciated!

The following query gets the output you want. It takes MIN of the key column over the rows from unbounded preceding to the current row within each (year, prod) group. Because MIN ignores NULLs, min_val is NULL only while every row seen so far in the group has a NULL key; as soon as a non-NULL key has appeared, min_val is populated, so filtering on min_val IS NOT NULL drops exactly the leading NULL rows.
select * from (
select year,prod,key1,date1
,min(key1) over(partition by year,prod order by date1 asc) as min_val
from t
)x
where x.min_val is not null
+------+-------+------+-------------------------+---------+
| year | prod | key1 | date1 | min_val |
+------+-------+------+-------------------------+---------+
| 2011 | APPLE | TIME | 2011-11-18 00:00:00.000 | TIME |
| 2011 | APPLE | TIME | 2011-11-19 00:00:00.000 | TIME |
| 2013 | APPLE | TIME | 2014-04-08 00:00:00.000 | TIME |
| 2013 | APPLE | DIM | 2014-04-09 00:00:00.000 | DIM |
| 2013 | APPLE | TIME | 2014-11-10 10:50:14.113 | DIM |
| 2013 | APPLE | TIME | 2014-11-12 10:46:04.947 | DIM |
| 2013 | MELON | JAK | 2011-10-17 11:01:19.657 | JAK |
| 2013 | MELON | TIME | 2014-11-18 11:19:35.547 | JAK |
| 2013 | MELON | | 2014-11-19 11:19:35.547 | JAK |
| 2013 | MELON | TIME | 2014-11-21 10:32:36.017 | JAK |
| 2014 | APPLE | JAK | 2003-04-10 00:00:00.000 | JAK |
| 2014 | APPLE | DIM | 2003-04-11 00:00:00.000 | DIM |
| 2015 | APPLE | TIME | 2002-09-27 00:00:00.000 | TIME |
| 2015 | APPLE | | 2004-09-28 00:00:00.000 | TIME |
+------+-------+------+-------------------------+---------+
link
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=ae82f64802674aa60005b8e9f534a150
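The running-MIN idea can also be tried locally. The following is a minimal sketch, assuming Python with a bundled SQLite of version 3.25 or later (needed for window functions) and a shortened copy of the question's data; the table name t and columns key1/date1 follow the answer, and the time part of the dates is dropped for brevity.

```python
import sqlite3

# Subset of the question's rows: (year, prod, key1, date1); None stands for NULL.
rows = [
    (2011, "APPLE", "TIME", "2011-11-18"),
    (2011, "APPLE", "TIME", "2011-11-19"),
    (2013, "APPLE", None, "2011-11-18"),    # row 3: leading NULL
    (2013, "APPLE", None, "2011-11-19"),    # row 4: leading NULL
    (2013, "APPLE", "TIME", "2014-04-08"),
    (2013, "APPLE", "DIM", "2014-04-09"),
    (2013, "MELON", "JAK", "2011-10-17"),
    (2013, "MELON", "TIME", "2014-11-18"),
    (2013, "MELON", None, "2014-11-19"),    # row 11: NULL, but not leading
    (2015, "APPLE", "TIME", "2002-09-27"),
    (2015, "APPLE", None, "2004-09-28"),    # row 16: NULL, but not leading
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (year INT, prod TEXT, key1 TEXT, date1 TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# MIN ignores NULLs, so the running minimum stays NULL only while
# every row seen so far in the group has a NULL key.
kept = conn.execute("""
    SELECT year, prod, key1, date1
    FROM (
        SELECT t.*,
               MIN(key1) OVER (PARTITION BY year, prod ORDER BY date1) AS min_val
        FROM t
    ) AS x
    WHERE min_val IS NOT NULL
    ORDER BY year, prod, date1
""").fetchall()

for row in kept:
    print(row)
```

Only the two leading NULL rows of (2013, APPLE) are removed; the NULL rows that follow a non-NULL key stay in the result.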

There might be fancier solutions, but in essence (you can remove the square brackets if KEY, DATE, etc. are not reserved words in your product; I used T-SQL):
select *
from Tbl T1
where
/* Do not include if... */
NOT (
t1.[KEY] is null
/* This is part of the first KEY=NULL rows for this group
(no preceding record with KEY<>NULL) */
and not exists
(select 1
from Tbl T3
where T3.[YEAR]=T1.[YEAR]
and T3.PROD=T1.PROD
and T3.[DATE] < T1.[DATE]
and T3.[KEY] is not null
)
/* There are KEY<>NULL values further down */
and exists
(select 1
from Tbl T2
where T2.[YEAR]=T1.[YEAR]
and T2.PROD=T1.PROD
and T2.[DATE] > T1.[DATE]
and T2.[KEY] is not null
)
)

This kind of query could help:
select YEAR, PROD, KEY, DATE
from (
select YEAR, PROD, KEY, DATE,
MIN(CASE WHEN KEY IS NULL THEN DATE ELSE NULL END)
OVER(PARTITION BY YEAR, PROD) AS MIN_NULL_KEY_DATE,
ROW_NUMBER() OVER(PARTITION BY YEAR, PROD ORDER BY DATE ASC) RN
from your_table yt
)rpr
where 1 = 1
and CASE WHEN RN = 1 AND DATE = MIN_NULL_KEY_DATE THEN 0 ELSE 1 END = 1
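What I tried to achieve here: MIN_NULL_KEY_DATE is the earliest date in each (YEAR, PROD) group at which KEY is NULL, and RN numbers the rows of the group by DATE. If a row is the first of its group (RN = 1) and its date equals MIN_NULL_KEY_DATE, its KEY is NULL, so the CASE expression returns 0 and the row is filtered out. Note that only the first row of a group can be removed this way; if a group starts with several NULL rows (rows 3 and 4 in the question), the second one has RN = 2 and is kept.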
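A possible variant that also drops several consecutive leading NULL rows compares each row's date with the earliest non-NULL date of its group. This is a sketch rather than the answerer's query: the column names key1/date1, the alias first_key_date, and the use of SQLite (3.25 or later) through Python are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (year INT, prod TEXT, key1 TEXT, date1 TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (2013, "APPLE", None, "2011-11-18"),    # row 3: leading NULL
        (2013, "APPLE", None, "2011-11-19"),    # row 4: leading NULL
        (2013, "APPLE", "TIME", "2014-04-08"),
        (2013, "MELON", "JAK", "2011-10-17"),
        (2013, "MELON", None, "2014-11-19"),    # NULL, but not leading
    ],
)

# first_key_date is the earliest date with a non-NULL key in the group.
# Rows dated before it are the leading NULL rows. A group with no
# non-NULL key at all gets first_key_date = NULL and is dropped entirely.
result = conn.execute("""
    SELECT year, prod, key1, date1
    FROM (
        SELECT yt.*,
               MIN(CASE WHEN key1 IS NOT NULL THEN date1 END)
                   OVER (PARTITION BY year, prod) AS first_key_date
        FROM your_table AS yt
    ) AS rpr
    WHERE date1 >= first_key_date
    ORDER BY year, prod, date1
""").fetchall()

for row in result:
    print(row)
```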

Related

Postgresql Get Maximum value per day with corresponding time

I have the following table:
Date | Time | Value | ReceivedTime
2022-04-01| 00:59:59 | 5 | 00:30:15
2022-04-01| 13:59:59 | 15 | 13:30:00
2022-04-02| 21:59:59 | 5 | 21:30:15
2022-04-02| 22:59:59 | 25 | 22:25:15
2022-04-02| 23:59:59 | 25 | 23:00:15
2022-04-03| 14:59:59 | 50 | 00:30:15
2022-04-03| 15:59:59 | 555 | 00:30:15
2022-04-03| 16:59:59 | 56 | 00:30:15
I want to get the maximum value per day along with its Date and ReceivedTime.
Expected Result:
Date | Value | ReceivedTime
2022-04-01 | 15 | 13:30:00
2022-04-02 | 25 | 23:00:15
2022-04-03 | 555 | 00:30:15
This answer assumes that, in the event of two or more records being tied on a given day for the same highest value, you want to retain the single record with the most recent ReceivedTime. We can use DISTINCT ON here:
SELECT DISTINCT ON (Date) Date, Value, ReceivedTime
FROM yourTable
ORDER BY Date, Value DESC, ReceivedTime DESC;
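To see what DISTINCT ON keeps, the same "first row per Date after sorting" logic can be emulated outside the database. The sketch below is plain Python over the question's data; the variable names are illustrative only.

```python
# (Date, Value, ReceivedTime) rows from the question.
rows = [
    ("2022-04-01", 5, "00:30:15"),
    ("2022-04-01", 15, "13:30:00"),
    ("2022-04-02", 5, "21:30:15"),
    ("2022-04-02", 25, "22:25:15"),
    ("2022-04-02", 25, "23:00:15"),
    ("2022-04-03", 50, "00:30:15"),
    ("2022-04-03", 555, "00:30:15"),
    ("2022-04-03", 56, "00:30:15"),
]

# ORDER BY Date, Value DESC, ReceivedTime DESC, built from stable sorts
# applied from the least significant key to the most significant one.
ordered = sorted(rows, key=lambda r: r[2], reverse=True)
ordered = sorted(ordered, key=lambda r: r[1], reverse=True)
ordered = sorted(ordered, key=lambda r: r[0])

# DISTINCT ON (Date): keep only the first row seen for each date.
first_per_date = {}
for row in ordered:
    first_per_date.setdefault(row[0], row)

result = list(first_per_date.values())
for row in result:
    print(row)
```

For 2022-04-02 the two rows tied at 25 are separated by the ReceivedTime DESC key, so the 23:00:15 row wins.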

How do I use a historic value as at a particular month when there are no values for the given month?

I have 2 SQL Server tables.
PurchaseOrderReceivingLine (PORL) is a table that contains every receipt from a purchase order. This has hundreds of entries per month.
PartyRelationshipScore (PRS) is a table with a party (supplier) reference number (that is used to join to the PORL table) and a score out of 10 for relationship and price. It also has a date field for when the score is updated so we have a history of the updates.
What I want to achieve is a supplier summary for each month. So I would have Supplier #, TotalValue, LateParts etc. I'm fine with creating the code for that. What I'm struggling with is getting the score for the given month if there are no values for that month.
So, for example I might have a value of 5 on the 1st August. Then it doesn't change until the 1st October when it is increased to 6.
On the grouping, September will have a TotalValue & a LateParts value but because there are no records in September in the PRS table, it will return a NULL value. I need it to get the last value recorded and return that (in this case August's 5). So it will return;
Aug 2019 - 5
Sep 2019 - 5
Oct 2019 - 6
Thanks in advance.
PORL Table
+-------+----------------+-------+-------+
| PORL# | Date (UK) | Value | Party |
+-------+----------------+-------+-------+
| 1 | 1/8/2019 | 100 | 6 |
| 2 | 1/8/2019 | 250 | 6 |
| 3 | 1/9/2019 | 1000 | 6 |
| 4 | 1/10/2019 | 2000 | 6 |
+-------+----------------+-------+-------+
PRS Table
+------------------+-------+-------------------+------------+
| DateChanged (UK) | Party | RelationShipScore | PriceScore |
+------------------+-------+-------------------+------------+
| 1/8/2019         | 6     | 5                 | 5          |
| 1/10/2019        | 6     | 6                 | 7          |
+------------------+-------+-------------------+------------+
Preferred outcome
+----------+-------+------+------------+-------------------+------------+
| Supplier | Month | Year | TotalValue | RelationshipScore | PriceScore |
+----------+-------+------+------------+-------------------+------------+
| 6 | 8 | 2019 | 350 | 5 | 5 |
| 6 | 9 | 2019 | 1000 | 5 | 5 |
| 6 | 10 | 2019 | 2000 | 6 | 7 |
+----------+-------+------+------------+-------------------+------------+
The relationshipscore & pricescore for month 9 are based on it not changing from month 8.
I think this helps
select Supplier = T.Party
, Month = DATEPART(MONTH,T.[Date])
, Year = DATEPART(YEAR,T.[Date])
, T.TotalValue
, R.RelationShipScore
, R.PriceScore
from ( Select P.[Party],P.[Date],[TotalValue] = sum(P.[Value])
from PurchaseOrderReceivingLine P
group by P.[Party],P.[Date] ) T
outer apply ( select top 1 RelationShipScore , PriceScore
from PartyRelationshipScore
where Party = T.Party
and DateChanged <= T.[Date]
Order by DateChanged desc ) R
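The OUTER APPLY ... TOP 1 ... ORDER BY DateChanged DESC pattern is an "as of" lookup: for each date, take the most recent score changed on or before it. The sketch below reproduces that lookup in plain Python for party 6. Note two assumptions that differ from the query above: receipts are grouped by calendar month (as in the preferred outcome), and the score is looked up as of the last day of the month.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date, timedelta

# PRS history for party 6, sorted by DateChanged: (DateChanged, RelationShipScore, PriceScore).
score_history = [
    (date(2019, 8, 1), 5, 5),
    (date(2019, 10, 1), 6, 7),
]
# PORL receipts for party 6: (Date, Value).
receipts = [
    (date(2019, 8, 1), 100),
    (date(2019, 8, 1), 250),
    (date(2019, 9, 1), 1000),
    (date(2019, 10, 1), 2000),
]

changed_dates = [changed for changed, _, _ in score_history]

def score_as_of(day):
    """Latest (relationship, price) scores changed on or before `day`."""
    idx = bisect_right(changed_dates, day)
    if idx == 0:
        return (None, None)  # no score yet, like OUTER APPLY yielding NULLs
    _, relationship, price = score_history[idx - 1]
    return (relationship, price)

def month_end(year, month):
    """Last calendar day of the given month."""
    first_of_next = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return first_of_next - timedelta(days=1)

# Total receipt value per (year, month).
totals = defaultdict(int)
for day, value in receipts:
    totals[(day.year, day.month)] += value

# (Month, Year, TotalValue, RelationshipScore, PriceScore) per month.
summary = [
    (month, year, total) + score_as_of(month_end(year, month))
    for (year, month), total in sorted(totals.items())
]
for line in summary:
    print(line)
```

September has no PRS row of its own, so the lookup falls back to the August scores.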

How to subtract previous value in a column with calculation of other column on SQL server

I have a requirement for a table as shown below, with the columns mgt_year, tot_dflt_mgt and tot_accum_mgt. For year 2016, tot_dflt_mgt is 20 and tot_accum_mgt is 600. What I want is that when I compute
(tot_accum_mgt - tot_dflt_mgt)
the result is placed in the previous year's row, as shown in the table below. That result, i.e. 580, is then used for subtracting 9 (580 - 9) for year 2015, and so on for all earlier years. I have done this in Excel and also in Oracle thanks to @mathguy, but how can I achieve this result in SQL Server? I have tried to use this in SQL Server but it's not working.
Please forgive my bad English and noob formatting.
My table t:
line_seg  MGT_YEAR  TOT_DFLT_MGT  TOT_ACCUM_MGT
--------  --------  ------------  -------------
A         2013      10
A         2014      15
A         2015      9
A         2016      20            600
B         2013      10
B         2014      15
B         2015      8
B         2016      20            500
Oracle Solution:
select mgt_year, tot_dflt_mgt,
max(tot_accum_mgt) over () -
nvl( sum(tot_dflt_mgt) over
(order by mgt_year
rows between 1 following and unbounded following)
, 0 ) as tot_accum_mgt
from t;
but I am unable to use this in SQL Server.
Required output:
line_seg  MGT_YEAR  TOT_DFLT_MGT  TOT_ACCUM_MGT
--------  --------  ------------  -------------
A         2013      10            556
A         2014      15            571
A         2015      9             580
A         2016      20            600
B         2013      12            457
B         2014      15            472
B         2015      8             480
B         2016      20            500
select *,
(sum(TOT_ACCUM_MGT) over()) -
(sum(TOT_DFLT_MGT ) over (order by TOT_DFLT_MGT )) as somecolname
from
table
Use ROW_NUMBER() and self-join each row with the previous one on (a.ID = b.ID) and (a.row_num = b.row_num - 1)
OR
use the LAG() function.
Please try the following query. I assumed that you are using SQL Server 2012 or later. If not, please change FIRST_VALUE to SUM -
SELECT t1.line_seg, t1.mgt_year, t1.[tot_dflt_mgt]
, FIRST_VALUE(t1.tot_accum_mgt) OVER(PARTITION BY t1.[line_seg] ORDER BY t1.mgt_year DESC)
- ISNULL(SUM(t2.[tot_dflt_mgt]) OVER(PARTITION BY t2.[line_seg] ORDER BY t2.mgt_year DESC), 0) AS tot_accum_mgt
FROM [dbo].[t] AS t1
LEFT JOIN [dbo].[t] AS t2 ON (t2.line_seg = t1.line_seg AND t2.mgt_year = t1.mgt_year + 1)
ORDER BY t1.line_seg, t1.mgt_year ASC;
To do this, I first imagine the table sorted by mgt_year in descending order -
+------------+----------+--------------+---------------+
| line_seg | mgt_year | tot_dflt_mgt | tot_accum_mgt |
+------------+----------+--------------+---------------+
| A | 2016 | 20 | 600 |
| A | 2015 | 9 | NULL |
| A | 2014 | 15 | NULL |
| A | 2013 | 10 | NULL |
| B | 2016 | 20 | 500 |
| B | 2015 | 8 | NULL |
| B | 2014 | 15 | NULL |
| B | 2013 | 12 | NULL |
+------------+----------+--------------+---------------+
Then all I have to do is subtract the running total of tot_dflt_mgt over the PREVIOUS rows (in this descending order) from the latest year's tot_accum_mgt. This is equivalent to subtracting the previous row's tot_dflt_mgt from the previous row's computed tot_accum_mgt. To reach the previous row's fields, which belong to the following year, the table is LEFT JOINed to itself, resulting in the following table -
+------------+----------+--------------+---------------+------------+----------+--------------+---------------+
| line_seg | mgt_year | tot_dflt_mgt | tot_accum_mgt | line_seg | mgt_year | tot_dflt_mgt | tot_accum_mgt |
+------------+----------+--------------+---------------+------------+----------+--------------+---------------+
| A | 2013 | 10 | NULL | A | 2014 | 15 | NULL |
| A | 2014 | 15 | NULL | A | 2015 | 9 | NULL |
| A | 2015 | 9 | NULL | A | 2016 | 20 | 600 |
| A | 2016 | 20 | 600 | NULL | NULL | NULL | NULL |
| B | 2013 | 12 | NULL | B | 2014 | 15 | NULL |
| B | 2014 | 15 | NULL | B | 2015 | 8 | NULL |
| B | 2015 | 8 | NULL | B | 2016 | 20 | 500 |
| B | 2016 | 20 | 500 | NULL | NULL | NULL | NULL |
+------------+----------+--------------+---------------+------------+----------+--------------+---------------+
The AND t2.mgt_year = t1.mgt_year + 1 condition in the LEFT JOIN clause does the trick of fetching the previous row's values. Now all I have to do is calculate the running total over these previous rows (t2). Also, because subtracting NULL from anything results in NULL, ISNULL replaces any NULL with zero.
ISNULL(SUM(t2.[tot_dflt_mgt]) OVER(PARTITION BY t2.[line_seg] ORDER BY t2.mgt_year DESC), 0) AS tot_accum_mgt
Now that we have the previous running total of tot_dflt_mgt, all we have to do is subtract it from the latest (largest mgt_year) tot_accum_mgt. We get that value with the FIRST_VALUE function. SUM could also be used instead, I guess.
FIRST_VALUE(t1.tot_accum_mgt) OVER(PARTITION BY t1.[line_seg] ORDER BY t1.mgt_year DESC)
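The arithmetic behind the query can be checked with a small standalone sketch. The Python below walks each line_seg from the latest year backwards, starting from that year's tot_accum_mgt and subtracting tot_dflt_mgt as it goes. It assumes the latest year of each line_seg carries the known tot_accum_mgt, and it uses the answer's version of the data; the function name back_fill is made up for illustration.

```python
from itertools import groupby

# (line_seg, mgt_year, tot_dflt_mgt, tot_accum_mgt); None stands for NULL.
rows = [
    ("A", 2013, 10, None), ("A", 2014, 15, None), ("A", 2015, 9, None), ("A", 2016, 20, 600),
    ("B", 2013, 12, None), ("B", 2014, 15, None), ("B", 2015, 8, None), ("B", 2016, 20, 500),
]

def back_fill(rows):
    """Fill tot_accum_mgt backwards from the latest year of each line_seg."""
    filled = []
    # Latest year first within each line_seg, as in the descending-order table.
    ordered = sorted(rows, key=lambda r: (r[0], -r[1]))
    for seg, group in groupby(ordered, key=lambda r: r[0]):
        running = None
        for _, year, dflt, accum in group:
            if running is None:
                running = accum   # latest year's value (the FIRST_VALUE part)
            filled.append((seg, year, dflt, running))
            running -= dflt       # value carried to the preceding year
    return sorted(filled)

result = back_fill(rows)
for row in result:
    print(row)
```

For line_seg A this gives 600, 580, 571 and 556 for the years 2016 down to 2013.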

Postgres: Adjust monthly calculations based on goals set

Below is my table:
practice_id | practice_name | practice_location | practice_monthly_revenue | practice_no_of_patients | date
-------------+-------------------+-------------------+--------------------------+-------------------------+---------------------
6 | Practice Clinic 1 | Location1 | 10000 | 8 | 2016-01-12 00:00:00
7 | Practice Clinic 1 | Location1 | 12000 | 10 | 2016-02-12 00:00:00
8 | Practice Clinic 1 | Location1 | 8000 | 4 | 2016-03-12 00:00:00
9 | Practice Clinic 1 | Location1 | 15000 | 10 | 2016-04-12 00:00:00
10 | Practice Clinic 1 | Location1 | 7000 | 3 | 2016-05-12 00:00:00
11 | Practice Clinic 2 | Location2 | 15000 | 12 | 2016-01-13 00:00:00
12 | Practice Clinic 2 | Location2 | 9000 | 8 | 2016-02-13 00:00:00
13 | Practice Clinic 2 | Location2 | 5000 | 2 | 2016-03-03 00:00:00
14 | Practice Clinic 2 | Location2 | 12000 | 9 | 2016-04-13 00:00:00
----------------------------------------------------------------------------------------------------------------------------------
I am running the query below to get monthly revenue vs. monthly goal:
select [date:month], SUM(practice_monthly_revenue) as Monthly_Revenue, 100000/12 as Goals
from practice_info
where practice_name IN ('Practice Clinic 1')
group by [date:month], practice_name
ORDER BY [date:month] ASC
Here "Monthly_Revenue" is the actual revenue for each month, while "Goals" is the revenue that was expected to be generated.
Now I am having trouble writing a SQL query that adjusts the next month's goal if the current goal isn't met.
E.g. if the revenue generated in March is below 8k, which is the monthly goal, then the remaining amount should be added to the next month's goal.
Is it possible to achieve this with a SQL query, or will I have to write a SQL procedure for it?
EDIT: I forgot to add that the database is Postgres.
Goals can be computed as
with recursive goals(mon, val, rev) as
(select min([pinf.date:month]) as mon /* Starting month */, 8000 as val /* Starting goal value */, pinf.practice_monthly_revenue as rev
from practice_info pinf
where pinf.practice_name IN ('Practice Clinic 1')
union all
select goals.mon + 1 as mon, 8000 + greatest(0, goals.val - goals.rev) as val, pinf.practice_monthly_revenue as rev
from practice_info pinf, goals
where goals.mon + 1 = [pinf.date:month]
and pinf.practice_name IN ('Practice Clinic 1')
)
select * from goals;
Just integrate it with your query to compare goals and revenues. It may not be exactly what you want, but I do believe you'll get the main point.
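The recurrence used in the recursive part, next goal = 8000 + greatest(0, previous goal - previous revenue), can be tried on its own. Below is a minimal Python sketch of just that rule; the function name adjusted_goals and the second, shortfall example are hypothetical, and the 8000 base goal is the value used in the answer.

```python
def adjusted_goals(revenues, base_goal):
    """Goal per month, carrying any shortfall over to the next month."""
    goals = []
    goal = base_goal
    for revenue in revenues:
        goals.append(goal)
        # Unmet part of this month's goal is added to next month's base goal.
        goal = base_goal + max(0, goal - revenue)
    return goals

# Monthly revenue of Practice Clinic 1 from the question (January to May 2016).
clinic1 = [10000, 12000, 8000, 15000, 7000]
clinic1_goals = adjusted_goals(clinic1, 8000)

# Hypothetical revenues with a shortfall in the first month.
example_goals = adjusted_goals([5000, 9000, 8000], 8000)

print(clinic1_goals)
print(example_goals)
```

With the question's data every goal is met up to April, so the goal never rises above 8000; in the hypothetical series the 3000 shortfall of the first month raises the second goal to 11000.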

Query to use GROUP BY multiple columns

I have a table full of patient/responsible party/insurance carrier combinations (e.g. patient Jim Doe's responsible party is parent John Doe, who has insurance carrier Aetna Insurance). Each of these combinations has a contract with multiple payments. For this particular table, I need to write a query to find any patient/RP/carrier combo that has multiple contract dates in the same month. Is there any way to do this?
Example table:
ContPat | ContResp | ContIns | ContDue
------------------------------------------------------
53 | 13 | 27 | 2012-01-01 00:00:00.000
53 | 13 | 27 | 2012-02-01 00:00:00.000
53 | 15 | 27 | 2012-03-01 00:00:00.000
12 | 15 | 3 | 2011-05-01 00:00:00.000
12 | 15 | 3 | 2011-05-01 00:00:00.000
12 | 15 | 3 | 2011-06-01 00:00:00.000
12 | 15 | 3 | 2011-07-01 00:00:00.000
12 | 15 | 3 | 2011-08-01 00:00:00.000
12 | 15 | 3 | 2011-09-01 00:00:00.000
In this example, I would like to generate a list of all the duplicate months for any Patient/RP/Carrier combinations. The 12/15/3 combination would be the only row returned here, but I'm working with thousands of combinations.
Not sure if this is possible using a GROUP BY or similar functions. Thanks in advance for any advice!
If all you care about is multiple entries in the same calendar month:
SELECT
ContPat,
ContResp,
ContIns,
MONTH(ContDue) as Mo,
YEAR(ContDue) as Yr,
COUNT(*) as 'Records'
FROM
MyTable
GROUP BY
ContPat,
ContResp,
ContIns,
MONTH(ContDue),
YEAR(ContDue)
HAVING
COUNT(*) > 1
This will show you any Patient/Responsible Party/Insurer/Calendar month combination with more than one record for that month.
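The GROUP BY ... HAVING COUNT(*) > 1 logic amounts to counting rows per (patient, RP, carrier, month, year) key and keeping the keys seen more than once. A small Python sketch over the question's example table, with illustrative variable names, shows the same grouping:

```python
from collections import Counter
from datetime import date

# (ContPat, ContResp, ContIns, ContDue) rows from the example table.
contracts = [
    (53, 13, 27, "2012-01-01"),
    (53, 13, 27, "2012-02-01"),
    (53, 15, 27, "2012-03-01"),
    (12, 15, 3, "2011-05-01"),
    (12, 15, 3, "2011-05-01"),
    (12, 15, 3, "2011-06-01"),
    (12, 15, 3, "2011-07-01"),
    (12, 15, 3, "2011-08-01"),
    (12, 15, 3, "2011-09-01"),
]

# GROUP BY ContPat, ContResp, ContIns, MONTH(ContDue), YEAR(ContDue)
counts = Counter()
for pat, resp, ins, due in contracts:
    due_date = date.fromisoformat(due)
    counts[(pat, resp, ins, due_date.month, due_date.year)] += 1

# HAVING COUNT(*) > 1
duplicates = {combo: n for combo, n in counts.items() if n > 1}
print(duplicates)
```

Only the 12/15/3 combination for May 2011 appears twice, which matches the row the question expects.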