How to filter out rows based on information in other columns and rows - sql

I have a pricing policy table that determines the price a customer is given based on qty purchased as per below.
debtor_code  stock_code        min_quantity  default_price  contract_price
2393         GRBAG100GTALL-50  0             295            236
2393         GRBAG100GTALL-50  5             295            265.5
2393         GRBAG100GTALL-50  10            295            221.25
The price offered is the cheapest contract_price among the tiers whose min_quantity the order reaches. That makes the second row obsolete: the first row offers a cheaper price at a lower min_quantity, so it always overrides the second row.
How can I write a SQL Server query that filters out obsolete rows like this one, where an earlier row supersedes it by having a cheaper contract_price at a lower min_quantity?
The result should look like:
debtor_code  stock_code        min_quantity  default_price  contract_price
2393         GRBAG100GTALL-50  0             295            236
2393         GRBAG100GTALL-50  10            295            221.25

Use LEAD() to find the next tier's contract_price and compare it with the current tier's. Set a flag from that comparison and filter on it in the final query.
This is based on the assumption that the price at a higher tier (higher min_quantity value) should be cheaper than at the current tier.
with cte as
(
select *,
case when lead(contract_price) over (partition by debtor_code, stock_code
order by min_quantity) < contract_price
then 1
else 0
end as flag
from pricing
)
select *
from cte
where flag = 0
dbfiddle demo
EDIT :
The following query uses a recursive CTE to compare each row with the lowest valid price found in the preceding rows of the same debtor_code and stock_code, which determines whether its price is valid
with cte as
(
select *, rn = row_number() over (partition by debtor_code, stock_code
order by min_quantity)
from pricing
),
rcte as
(
select debtor_code, stock_code, rn, min_quantity, default_price,
contract_price,
valid_price = contract_price, valid = 1
from cte
where rn = 1
union all
select c.debtor_code, c.stock_code, c.rn, c.min_quantity, c.default_price,
c.contract_price,
-- carry the lowest price seen so far forward, not just the previous row's price
valid_price = case when c.contract_price < r.valid_price
then c.contract_price
else r.valid_price
end,
valid = case when c.contract_price < r.valid_price
then 1
else 0
end
from rcte r
inner join cte c on r.debtor_code = c.debtor_code
and r.stock_code = c.stock_code
and r.rn = c.rn - 1
)
select *
from rcte
where valid = 1
dbfiddle demo
Edit 2
A much simpler solution. First find the min(contract_price) over the rows up to and including the current one, in min_quantity order. Then compare the current contract_price with that running minimum. If it is equal (it can never be lower), the row is valid.
select *
from
(
select *, valid_price = min(contract_price)
over (partition by debtor_code, stock_code
order by min_quantity)
from pricing
) p
where contract_price <= valid_price
dbfiddle demo
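As a cross-check of the running-minimum idea, here is a minimal procedural sketch in Python. It is not part of the answer's SQL; the function name `drop_obsolete_tiers` and the tuple layout are assumptions made for illustration, and the data is the sample from the question.

```python
from itertools import groupby

# Sample rows from the question:
# (debtor_code, stock_code, min_quantity, default_price, contract_price)
rows = [
    (2393, "GRBAG100GTALL-50", 0, 295, 236.0),
    (2393, "GRBAG100GTALL-50", 5, 295, 265.5),
    (2393, "GRBAG100GTALL-50", 10, 295, 221.25),
]

def drop_obsolete_tiers(rows):
    """Keep a tier only if no lower-quantity tier is strictly cheaper."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    kept = []
    # groupby plays the role of PARTITION BY debtor_code, stock_code
    for _, group in groupby(ordered, key=lambda r: (r[0], r[1])):
        running_min = None
        for row in group:  # rows arrive in min_quantity order
            price = row[4]
            if running_min is None or price <= running_min:
                running_min = price
                kept.append(row)
    return kept

result = drop_obsolete_tiers(rows)
for row in result:
    print(row)
```

On the sample data this keeps the tiers with min_quantity 0 and 10, which matches the expected result in the question.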

Related

SQL Get closest value to a number

I need to find the closest value to each number in column Divide from the column Quantity and put the value found in the Value column for both Quantities.
Example:
In the column Divide the value of 5166 would be closest to Quantity column value 5000. To keep from using those two values more than once I need to place the value of 5000 in the value column for both numbers, like the example below. Also, is it possible to do this without a loop?
Quantity Divide Rank Value
15500 5166 5 5000
1250 416 5 0
5000 1666 5 5000
12500 4166 4 0
164250 54750 3 0
5250 1750 3 0
6250 2083 3 0
12250 4083 3 0
1750 583 2 0
17000 5666 2 0
2500 833 2 0
11500 3833 2 0
1250 416 1 0
There are a couple of answers here, but they both use CTEs or complex subqueries. There is a much simpler and faster way: a couple of self joins and a GROUP BY.
https://www.db-fiddle.com/f/rM268EYMWuK7yQT3gwSbGE/0
select
min(min.quantity) as minQuantityOverDivide
, t1.divide
, max(max.quantity) as maxQuantityUnderDivide
, case
when
(abs(t1.divide - coalesce(min(min.quantity),0))
<
abs(t1.divide - coalesce(max(max.quantity),0)))
then max(max.quantity)
else min(min.quantity) end as closestQuantity
from t1
left join (select quantity from t1) min on min.quantity >= t1.divide
left join (select quantity from t1) max on max.quantity < t1.divide
group by
t1.divide
If I understood the requirements, 5166 is not closest to 5000; it's closest to 5250 (a delta of 166 vs. 84).
The corresponding query, without loops, would be (fiddle here: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=be434e67ba73addba119894a98657f17).
(I added a Value_Rank as it's not clear whether you want Rank to be kept or recomputed.)
select
Quantity, Divide, Rank, Value,
dense_rank() over(order by Value) as Value_Rank
from
(
select
Quantity, Divide, Rank,
--
case
when abs(Quantity_let_delta) < abs(Quantity_get_delta) then Divide + Quantity_let_delta
else Divide + Quantity_get_delta
end as Value
from
(
select
so.Quantity, so.Divide, so.Rank,
-- There is no LessEqualThan, assume GreaterEqualThan
max(isnull(so_let.Quantity, so_get.Quantity)) - so.Divide as Quantity_let_delta,
-- There is no GreaterEqualThan, assume LessEqualThan
min(isnull(so_get.Quantity, so_let.Quantity)) - so.Divide as Quantity_get_delta
from
SO so
left outer join SO so_let
on so_let.Quantity <= so.Divide
--
left outer join SO so_get
on so_get.Quantity >= so.Divide
group by so.Quantity, so.Divide, so.Rank
) so
) result
Or, if by closest you mean the closest previous value, i.e. the largest Quantity that does not exceed Divide (fiddle here: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=b41fb1a3fc11039c7f82926f8816e270).
select
Quantity, Divide, Rank, Value,
dense_rank() over(order by Value) as Value_Rank
from
(
select
so.Quantity, so.Divide, so.Rank,
-- There is no LessEqualThan, assume 0
max(isnull(so_let.Quantity, 0)) as Value
from
SO so
left outer join SO so_let
on so_let.Quantity <= so.Divide
group by so.Quantity, so.Divide, so.Rank
) result
You don't need a loop. Basically, you need to find the lowest difference between the Divide value and all the quantities (first CTE). Then use this distance to find the corresponding record (second CTE), and finally join with your initial table to get the converted values (final select).
;with cte as (
select t.Divide, min(abs(t2.Quantity-t.Divide)) as ClosestQuantity
from #t1 as t
cross apply #t1 as t2
group by t.Divide
)
,cte2 as (
select distinct
t.Divide, t2.Quantity
from #t1 as t
cross apply #t1 as t2
where abs(t2.Quantity-t.Divide) = (select ClosestQuantity from cte as c where c.Divide = t.Divide)
)
select t.Quantity, cte2.Quantity as Divide, t.Rank, t.Value
from #t1 as t
left outer join cte2 on t.Divide = cte2.Divide
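To make the "smallest absolute difference" reading of closest concrete, here is a small Python sketch. It is only an illustration, not any answerer's method; `closest_quantity` is a hypothetical helper, and breaking ties toward the smaller quantity is an arbitrary assumption. The quantities are taken from the question's table and only the first four Divide values are checked.

```python
def closest_quantity(divide, quantities):
    """Return the quantity with the smallest absolute distance to divide.

    Ties go to the smaller quantity (an arbitrary choice for this sketch).
    """
    return min(quantities, key=lambda q: (abs(q - divide), q))

# Quantity column from the question
quantities = [15500, 1250, 5000, 12500, 164250, 5250, 6250,
              12250, 1750, 17000, 2500, 11500, 1250]

# First four Divide values from the question
divides = [5166, 416, 1666, 4166]

closest = {d: closest_quantity(d, quantities) for d in divides}
for divide, quantity in closest.items():
    print(divide, "->", quantity)
```

Under this reading 5166 maps to 5250 (distance 84) rather than 5000 (distance 166), which is the discrepancy pointed out in the second answer.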

SQL Implementation with Sliding-Window functions or Recursive CTEs

I have a problem that is very easy to solve in C# code, for example, but I have no idea how to write it as a SQL query with recursive CTEs or sliding-window functions.
Here is the situation: let's say I have a table with 3 columns (ID, Date, Amount), and here is some data:
ID Date Amount
-----------------------
1 01.01.2016 -500
2 01.02.2016 1000
3 01.03.2016 -200
4 01.04.2016 300
5 01.05.2016 500
6 01.06.2016 1000
7 01.07.2016 -100
8 01.08.2016 200
The result I want to get from the table is this (ID, Amount .... Order By Date):
ID Amount
-----------------------
2 300
4 300
5 500
6 900
8 200
The idea is to distribute the amounts into installments, for each client separately. The catch is that when a negative amount comes into play, you need to remove that amount from the last installment. I'm not sure how clear that is, so here is an example:
Let's say I have 3 invoices for one client with amounts 500, 200, -300.
When I start distributing these invoices, I first distribute the amount 500, then 200. But when I come to the third one, -300, I need to remove it from the last invoice. In other words, 200 - 300 = -100, so the amount from the second invoice disappears, but there is still -100 that needs to be subtracted from the first invoice. So 500 - 100 = 400. The result I need is a single row (the first invoice with amount 400).
Another example, where the first invoice has a negative amount (-500, 300, 500):
In this case, the first invoice (-500) makes the second disappear, and another 200 is subtracted from the third. So the result is the third invoice with amount 300.
This is something like a stack implementation in a programming language, but I need to do it with sliding-window functions in SQL Server.
I need an implementation with sliding-window functions or recursive CTEs.
Not with loops.
Thanks.
OK, I think this is what you want. There are two recursive queries: one for upward propagation and a second one for downward propagation.
with your_data_rn as
(
select *, row_number() over (order by date) rn
from your_data
), rcte_up(id, date, ammount, rn, running) as
(
select *, ammount as running
from your_data_rn
union all
select d.*,
d.ammount + rcte_up.running
from your_data_rn d
join rcte_up on rcte_up.running < 0 and d.rn = rcte_up.rn - 1
), data2 as
(
select id, date, min(running) ammount,
row_number() over (order by date) rn
from rcte_up
group by id, date, rn
having min(running) > 0 or rn = 1
), rcte_down(id, date, ammount, rn, running) as
(
select *, ammount as running
from data2
union all
select d.*, d.ammount + rcte_down.running
from data2 d
join rcte_down on rcte_down.running < 0 and d.rn = rcte_down.rn + 1
)
select id, date, min(running) ammount
from rcte_down
group by id, date
having min(running) > 0
demo
I can imagine using just the upward propagation in SQL and doing the downward propagation of the first row in some procedural language. Downward propagation is a single scan through the first few rows, so a recursive query for it may be like using a hammer on a mosquito.
I added a client ID to the table for a more general solution. Then I implemented the stack as XML stored in a query column, and emulated a program loop with a recursive CTE:
with Data as( -- Numbering rows for iteration on CTE
select Client, id, Amount,
cast(row_number() over(partition by Client order by Date) as int) n
from TabW
),
CTE(Client, n, stack) as( -- Recursive CTE
select Client, 1, cast(NULL as xml) from Data where n=1
UNION ALL
select D.Client, D.n+1, (
-- Stack operations to process current row (D)
select row_number() over(order by n) n,
-- Use calculated amount in first positive and oldest stack cell
-- Else preserve value stored in stack
case when n=1 or (n=0 and last=1) then new else Amount end Amount,
-- Set ID in stack cell for positive and new data
case when n=1 and D.Amount>0 then D.id else id end id
from (
select Y.Amount, Y.id, new,
-- Count positive stack entries
sum(case when new<=0 or (n=0 and Amount<0) then 0 else 1 end) over (order by n) n,
row_number() over(order by n desc) last -- mark oldest stack cell by 1
from (
select X.*,sum(Amount) over(order by n) new
from (
select case when C.stack.value('(/row/@Amount)[1]','int')<0 then -1 else 0 end n,
D.Amount, D.id -- Data from new record
union all -- And expand current stack in XML to table
select node.value('@n','int') n, node.value('@Amount','int'), node.value('@id','int')
from C.stack.nodes('//row') N(node)
) X
) Y where n>=0 -- Suppress new cell if the stack contained a negative amount
) Z
where n>0 or (n=0 and last=1)
for xml raw, type
)
from Data D, CTE C
where D.n=C.n and D.Client=C.Client
) -- Expand stack into result table
select CTE.Client, node.value('@id','int') id, node.value('@Amount','int')
from CTE join (select Client, max(n) max_n from Data group by Client) X on CTE.Client=X.Client and CTE.n=X.max_n+1
cross apply stack.nodes('//row') N(node)
order by CTE.Client, node.value('@n','int') desc
Test on sqlfiddle.com
I think this method is slower than @RadimBača's. It is shown to demonstrate that a sequential algorithm can be implemented in SQL.
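For reference, the sequential stack behaviour described in the question can be sketched in a few lines of Python. This is only a procedural model for checking the SQL results against, not a replacement for the queries above; the function name `distribute` and the idea of carrying an unpaid negative amount forward as `debt` are assumptions drawn from the question's examples.

```python
def distribute(invoices):
    """invoices: (id, amount) pairs in date order for one client.

    Positive amounts are pushed onto a stack. A negative amount is taken
    from the most recent entries first; whatever cannot be taken yet is
    carried forward and deducted from later positive amounts.
    """
    stack = []  # entries are [id, remaining amount]
    debt = 0    # negative amount not yet absorbed by any invoice
    for inv_id, amount in invoices:
        if amount >= 0:
            used = min(debt, amount)
            debt -= used
            amount -= used
            if amount > 0:
                stack.append([inv_id, amount])
        else:
            owed = -amount
            while owed > 0 and stack:
                top = stack[-1]
                taken = min(top[1], owed)
                top[1] -= taken
                owed -= taken
                if top[1] == 0:
                    stack.pop()
            debt += owed
    return [(inv_id, amount) for inv_id, amount in stack]

sample = [(1, -500), (2, 1000), (3, -200), (4, 300),
          (5, 500), (6, 1000), (7, -100), (8, 200)]
print(distribute(sample))
```

Run on the question's table, this yields the expected rows (2: 300, 4: 300, 5: 500, 6: 900, 8: 200), and it also reproduces both smaller examples from the question.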

How to display only few columns of a table based on the data

Consider the table below
Name partno. sch_date WO# owed panels
aa 1234 08/22/2017 121 22 26
aa 1234 08/22/2017 222 22 27
aa 1234 08/22/2017 242 22 27
aa 1234 08/29/2017 152 20 24
aa 1234 08/29/2017 167 20 24
aa 1234 08/29/2017 202 20 26
Is it possible to display the data in such a way that, when the number of panels is greater than owed, I don't display the other rows scheduled for the same partno. on the same date (sch_date)?
Expected Result
Name partno. sch_date WO# owed panels
aa 1234 08/22/2017 121 22 26
aa 1234 08/29/2017 152 20 24
CROSS APPLY may help here. (Note: this is why I asked about ordering in my earlier comment. The order of records in a table is not guaranteed, so we need to know in what order you want the records evaluated. The date alone isn't enough, unless it has a time component that is not displayed and differs between rows.)
WORKING example on Rextester: http://rextester.com/CAUK18185
Many assumptions made:
When owed is > panels you don't need to see the record.
You want to see the lowest WO# when owed is < panels and suppress all other records, including ones where owed > panels.
If there are no records for a date, name and partno. that have owed < panels, you want to see no records.
If these assumptions are incorrect, please provide a better sample set and expected results to test these types of situations and explain what you want to have happen.
SELECT Distinct B.*
FROM tblName Z
CROSS APPLY (SELECT TOP 1 A.*
FROM tblName A
WHERE A.owed < A.panels
and Z.Name = A.Name
and Z.[partno.] = a.[partno.]
and Z.sch_date = a.sch_date
ORDER by A.name, A.[partno.], A.sch_date, A.[wo#]) B
For each record in Z, the applied subquery returns the row with the lowest WO# for the same name, partno. and sch_date where owed < panels. DISTINCT then collapses the duplicates.
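The assumptions listed above can be checked against the sample data with a short Python sketch. It is an illustration of those assumptions only; the helper name `lowest_wo_per_group` and the tuple layout are made up for this example.

```python
# Sample rows from the question:
# (Name, partno., sch_date, WO#, owed, panels)
rows = [
    ("aa", 1234, "08/22/2017", 121, 22, 26),
    ("aa", 1234, "08/22/2017", 222, 22, 27),
    ("aa", 1234, "08/22/2017", 242, 22, 27),
    ("aa", 1234, "08/29/2017", 152, 20, 24),
    ("aa", 1234, "08/29/2017", 167, 20, 24),
    ("aa", 1234, "08/29/2017", 202, 20, 26),
]

def lowest_wo_per_group(rows):
    """Per (Name, partno., sch_date), keep the lowest WO# with owed < panels."""
    best = {}
    for row in rows:
        name, partno, sch_date, wo, owed, panels = row
        if owed < panels:
            key = (name, partno, sch_date)
            if key not in best or wo < best[key][3]:
                best[key] = row
    # Groups with no qualifying row produce no output at all
    return [best[key] for key in sorted(best)]

result = lowest_wo_per_group(rows)
for row in result:
    print(row)
```

With the sample data this leaves WO# 121 for 08/22/2017 and WO# 152 for 08/29/2017, the two rows in the expected result.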
UPDATED:
I see in a comment you want to keep records if owed > panels... if it's encountered first.... but what if it's not encountered first?
http://rextester.com/NXS51018
--First we get all the records w/ a owed < panels per group and assign the earliest row (that having the lowest WO) a RN of 1. then we return that set.
;WITH cte as (
Select A.*, row_number() over (Partition by Name, [partno.], sch_date ORDER BY [WO#]) RN
from tblName A
where owed < panels)
Select * from cte where RN =1
UNION ALL
--We then union in the records where owed >=panels and their WO# < the wo# from the CTE.
SELECT Z.*, 0 as rn FROM tblName Z where owed >=panels
and exists (Select * from cte
where Z.name = CTE.name
and Z.[partno.] = cte.[partno.]
and Z.sch_date = cte.sch_date
and CTE.[WO#] > Z.[WO#]) --Now this line may not be needed, depending on if you want all or just some of the WO#'s when owed >=panels.
ORDER BY name, [partno.], Sch_date, [Wo#]
After last comment update:
;WITH cte as (
Select A.*, row_number() over (Partition by Name, [partno.], sch_date ORDER BY [WO#]) RN
from tblName A
where owed < panels),
cte2 as (Select * from cte where RN =1
UNION ALL
SELECT Z.*, 0 as rn FROM tblName Z where owed >=panels
and exists (Select * from cte
where Z.name = CTE.name
and Z.[partno.] = cte.[partno.]
and Z.sch_date = cte.sch_date
and CTE.[WO#] > Z.[WO#]))
Select * into SOQ#45619304 from CTE2; --This line creates the table based on the 2nd cte results.
Select * from SOQ#45619304;
You can try this -
SELECT Name, [partno.], sch_date, [WO#], owed, panels
FROM YOUR_TABLE
WHERE panels < owed
UNION ALL
SELECT Name, [partno.], sch_date, MIN([WO#]), owed, MIN(panels)
FROM YOUR_TABLE
WHERE panels > owed
GROUP BY Name, [partno.], sch_date, owed
ORDER BY Name

In T-SQL How Can I Select Up To The 5 Most Recent Rows, Grouped By An Identifier, If They Contain A Specific Value?

Long title.
I am using T-SQL and attempting to find all accounts whose most recent transactions are ACHFAIL, and to determine how many they have had in a row, up to 5.
I already wrote a huge, insanely convoluted query to group and count all ACHFAIL transactions that have had x ACHFAILs in a row. Now the requirements are the simpler "only count the most recent transactions".
Below is what I have so far, but I cannot wrap my head around the next step to take. I was trying to simplify my task by only counting up to 5, but if I could provide an accurate count of all the ACHFAIL attempts in a row, that would be more ideal.
WITH grouped
AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY TRANSACTIONS.deal_id
ORDER BY TRANSACTIONS.deal_id, tran_date DESC) AS row_num
,TRANSACTIONS.tran_code
,TRANSACTIONS.tran_date
,TRANSACTIONS.deal_id
FROM TRANSACTIONS
)
SELECT TOP 1000 * FROM grouped
which returns rows such as:
row_num tran_code tran_date deal_id
1 ACHFAIL 2014-08-05 09:20:38.000 {01xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
2 ACHCLEAR 2014-08-04 16:27:17.473 {01xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
1 ACHCLEAR 2014-09-09 15:14:48.337 {02xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
2 ACHCLEAR 2014-09-08 14:23:00.737 {02xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
1 ACHFAIL 2014-07-18 14:35:38.037 {03xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
2 ACHFAIL 2014-07-18 13:58:52.000 {03xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
3 ACHCLEAR 2014-07-17 14:48:58.617 {03xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
4 ACHFAIL 2014-07-16 15:04:28.023 {03xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
01xxxxxx has 1 ACHFAIL
02xxxxxx has 0 ACHFAIL
03xxxxxx has 2 ACHFAIL
You are halfway there. One way to handle this kind of "consecutive rows" problem is a recursive CTE (that's TEMP2 below):
;WITH
TEMP1 AS
(
SELECT tran_code,
deal_id,
ROW_NUMBER() OVER (PARTITION BY deal_id ORDER BY tran_date DESC) AS tran_rank
FROM TRANSACTIONS
),
TEMP2 AS
(
SELECT tran_code,
deal_id,
tran_rank
FROM TEMP1
WHERE tran_rank = 1 -- last transaction for a deal
AND tran_code = 'ACHFAIL' -- failed transactions only
UNION ALL
SELECT curr.tran_code,
curr.deal_id,
curr.tran_rank
FROM TEMP1 curr
INNER JOIN TEMP2 prev ON curr.deal_id = prev.deal_id -- transaction must be for the same deal
AND curr.tran_rank = prev.tran_rank + 1 -- must be consecutive
WHERE curr.tran_code = 'ACHFAIL' -- must have failed
AND curr.tran_rank <= 5 -- up to 5 only
)
SELECT t.deal_id,
ISNULL(MAX(tran_rank),0) AS FailCount
FROM TRANSACTIONS t
LEFT JOIN TEMP2 t2 ON t.deal_id = t2.deal_id
GROUP BY t.deal_id
SQL Fiddle
If I understand correctly, you want the number of fails in the five most recent transactions for each deal. That would be something like:
WITH grouped AS (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY t.deal_id ORDER BY tran_date DESC
) AS seqnum
FROM TRANSACTIONS t
)
SELECT deal_id, sum(case when tran_code = 'ACHFAIL' then 1 else 0 end) as NuMFails
FROM grouped
WHERE seqnum <= 5
GROUP BY deal_id;
The CTE enumerates the rows. The where clause takes the 5 most recent rows for each deal. The group by then aggregates by deal_id.
Note that you do not need to include the partition by column(s) in the order by when you use over.
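To check query output against the question's expected counts, the "how many ACHFAILs in a row, starting from the most recent transaction" rule can be modelled in Python. This sketch follows the consecutive reading of the question, not the "fails among the last five" reading; the function name `leading_fail_streaks` and the shortened deal ids are assumptions for illustration.

```python
from collections import defaultdict

def leading_fail_streaks(transactions, cap=5):
    """transactions: (deal_id, tran_date, tran_code) tuples, ISO date strings.

    Returns, per deal, how many of the most recent transactions are
    ACHFAIL in an unbroken run, counting at most `cap`.
    """
    by_deal = defaultdict(list)
    for deal_id, tran_date, tran_code in transactions:
        by_deal[deal_id].append((tran_date, tran_code))
    streaks = {}
    for deal_id, trans in by_deal.items():
        count = 0
        for _, code in sorted(trans, reverse=True):  # newest first
            if code != "ACHFAIL" or count == cap:
                break
            count += 1
        streaks[deal_id] = count
    return streaks

transactions = [
    ("01", "2014-08-05 09:20:38.000", "ACHFAIL"),
    ("01", "2014-08-04 16:27:17.473", "ACHCLEAR"),
    ("02", "2014-09-09 15:14:48.337", "ACHCLEAR"),
    ("02", "2014-09-08 14:23:00.737", "ACHCLEAR"),
    ("03", "2014-07-18 14:35:38.037", "ACHFAIL"),
    ("03", "2014-07-18 13:58:52.000", "ACHFAIL"),
    ("03", "2014-07-17 14:48:58.617", "ACHCLEAR"),
    ("03", "2014-07-16 15:04:28.023", "ACHFAIL"),
]
streaks = leading_fail_streaks(transactions)
print(streaks)
```

Deal 03 shows the difference between the two readings: its unbroken run is 2, while a count of fails among its five most recent rows would be 3.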

Find the longest sequence of consecutive increasing numbers in SQL

For this example say I have a table with two fields, AREA varchar(30) and OrderNumber INT.
The table has the following data
AREA | OrderNumber
Fontana | 32
Fontana | 42
Fontana | 76
Fontana | 12
Fontana | 3
Fontana | 99
RC | 32
RC | 1
RC | 8
RC | 9
RC | 4
For each area, I would like to return the length of the longest run of consecutive increasing values. For Fontana it is 3 (32, 42, 76). For RC it is 2 (8, 9).
AREA | LongestLength
Fontana | 3
RC | 2
How would I do this on MS SQL Server 2005?
One way is to use a recursive CTE that steps over each row. If the row meets the criteria (increasing order number for the same area), you increase the chain length by one. If it doesn't, you start a new chain:
; with numbered as
(
select row_number() over (order by area, eventtime) rn
, *
from Table1
)
, recurse as
(
select rn
, area
, OrderNumber
, 1 as ChainLength
from numbered
where rn = 1
union all
select cur.rn
, cur.area
, cur.OrderNumber
, case
when cur.area = prev.area
and cur.OrderNumber > prev.OrderNumber
then prev.ChainLength + 1
else 1
end
from recurse prev
join numbered cur
on prev.rn + 1 = cur.rn
)
select area
, max(ChainLength)
from recurse
group by
area
Live example at SQL Fiddle.
An alternative way is to use a query to find "breaks", that is, rows that end a sequence of increasing order numbers for the same area. The number of rows between breaks is the length.
; with numbered as
(
select row_number() over (order by area, eventtime) rn
, *
from Table1 t1
)
-- Select rows that break an increasing chain
, breaks as
(
select row_number() over (order by cur.rn) rn2
, cur.rn
, cur.Area
from numbered cur
left join
numbered prev
on cur.rn = prev.rn + 1
where cur.OrderNumber <= prev.OrderNumber
or cur.Area <> prev.Area
or prev.Area is null
)
-- Add a final break after the last row
, breaks2 as
(
select *
from breaks
union all
select count(*) + 1
, max(rn) + 1
, null
from breaks
)
select series_start.area
, max(series_end.rn - series_start.rn)
from breaks2 series_start
join breaks2 series_end
on series_end.rn2 = series_start.rn2 + 1
group by
series_start.area
Live example at SQL Fiddle.
You do not explain why RC's longest sequence does not include 1 while Fontana's does include 32. I take it, the 1 is excluded because it is a decrease: it comes after 32. The Fontana's 32, however, is the first ever item in the group, and I've got two ideas how to explain why it is considered an increase. That's either exactly because it's the group's first item or because it is also positive (i.e. as if coming after 0 and, therefore, an increase).
For the purpose of this answer, I'm assuming the latter, i.e. a group's first item is an increase if it is positive. The below script implements the following idea:
1. Enumerate the rows in every AREA group in the order of the eventtime column you nearly forgot to mention.
2. Join the enumerated set to itself to link every row with its predecessor.
3. Get the sign of the difference between the row's value and its preceding value (defaulting the latter to 0). At this point the problem turns into a gaps-and-islands one.
4. Partition every AREA group by the signs determined in #3 and enumerate every subgroup's rows.
5. Find the difference between the row numbers from #1 and those found in #4. That would be a criterion to identify individual streaks (together with AREA).
6. Finally, group the results by AREA, the sign from #3 and the result from #5, count the rows and get the maximum count per AREA.
I implemented the above like this:
WITH enumerated AS (
SELECT
*,
row = ROW_NUMBER() OVER (PARTITION BY AREA ORDER BY eventtime)
FROM atable
),
signed AS (
SELECT
this.eventtime,
this.AREA,
this.row,
sgn = SIGN(this.OrderNumber - COALESCE(last.OrderNumber, 0))
FROM enumerated AS this
LEFT JOIN enumerated AS last
ON this.AREA = last.AREA
AND this.row = last.row + 1
),
partitioned AS (
SELECT
AREA,
sgn,
grp = row - ROW_NUMBER() OVER (PARTITION BY AREA, sgn ORDER BY eventtime)
FROM signed
)
SELECT DISTINCT
AREA,
LongestIncSeq = MAX(COUNT(*)) OVER (PARTITION BY AREA)
FROM partitioned
WHERE sgn = 1
GROUP BY
AREA,
grp
;
A SQL Fiddle demo can be found here.
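The sign-based reading used in this answer (a row counts as an increase if it exceeds its predecessor, with the first row compared against 0) can be sketched in Python as a sanity check. This sketch uses `itertools.groupby` to find the islands of equal sign instead of the row-number difference from the SQL, and the function name `longest_increase_streaks` is an assumption for illustration.

```python
from itertools import groupby

def longest_increase_streaks(rows):
    """rows: (area, order_number) pairs, in eventtime order within each area."""
    by_area = {}
    for area, number in rows:
        by_area.setdefault(area, []).append(number)
    result = {}
    for area, numbers in by_area.items():
        # Sign of the change from the predecessor; the first row is compared with 0
        signs = []
        prev = 0
        for n in numbers:
            signs.append((n > prev) - (n < prev))
            prev = n
        # Consecutive equal signs form one island; keep the longest island of +1
        result[area] = max(
            (len(list(g)) for sign, g in groupby(signs) if sign == 1),
            default=0,
        )
    return result

rows = [
    ("Fontana", 32), ("Fontana", 42), ("Fontana", 76),
    ("Fontana", 12), ("Fontana", 3), ("Fontana", 99),
    ("RC", 32), ("RC", 1), ("RC", 8), ("RC", 9), ("RC", 4),
]
streaks = longest_increase_streaks(rows)
print(streaks)
```

Assuming the rows are listed in eventtime order as in the question, this gives 3 for Fontana and 2 for RC, the counts the question asks for.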
You can do some math by ROW_NUMBER() to figure out where you have consecutive items.
Here's the code sample:
;WITH rownums AS
(
SELECT [area],
ROW_NUMBER() OVER(PARTITION BY [area] ORDER BY [ordernumber]) AS rid1,
ROW_NUMBER() OVER(PARTITION BY [area] ORDER BY [eventtime]) AS rid2
FROM SomeTable
),
differences AS
(
SELECT [area],
[calc] = rid1 - rid2
FROM rownums
),
summation AS
(
SELECT [area], [calc], COUNT(*) AS lengths
FROM differences
GROUP BY [area], [calc]
)
SELECT differences.[area], MAX(lengths) AS LongestLength
FROM differences
JOIN summation
ON differences.[calc] = summation.[calc]
AND differences.[area] = summation.[area]
GROUP BY differences.[area]
So if I do one set of row numbers ordered by my ordernumber and another set of row numbers by my event time, the difference between those two numbers will always be the same, so long as their order is the same.
You can then get a count grouped by those differences and then pull the largest count to get what you need.
EDIT: ...
Ignore the first edit, what I get for rushing.
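The row-number difference described in this answer can be reproduced in Python to see which rows end up sharing a difference. This is a sketch under assumptions: the `eventtime` values are hypothetical integers standing in for the real column, and `longest_by_rank_difference` is a name made up for the example. Note that rows sharing a difference are not guaranteed to be adjacent in time, so the count should be treated as a heuristic.

```python
from collections import Counter

def longest_by_rank_difference(rows):
    """rows: (area, eventtime, order_number) tuples."""
    by_area = {}
    for area, eventtime, number in rows:
        by_area.setdefault(area, []).append((eventtime, number))
    result = {}
    for area, items in by_area.items():
        by_time = sorted(items)  # row number by eventtime (rid2)
        # row number by ordernumber (rid1)
        rank_by_number = {
            item: i
            for i, item in enumerate(sorted(items, key=lambda x: x[1]), start=1)
        }
        diffs = Counter(
            rank_by_number[item] - i for i, item in enumerate(by_time, start=1)
        )
        result[area] = max(diffs.values())
    return result

# eventtime is a hypothetical 1..n sequence per area
rows = [
    ("Fontana", 1, 32), ("Fontana", 2, 42), ("Fontana", 3, 76),
    ("Fontana", 4, 12), ("Fontana", 5, 3), ("Fontana", 6, 99),
    ("RC", 1, 32), ("RC", 2, 1), ("RC", 3, 8), ("RC", 4, 9), ("RC", 5, 4),
]
lengths = longest_by_rank_difference(rows)
print(lengths)
```

For the sample data the largest group of equal differences has 3 rows for Fontana and 2 for RC.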