How to group this query so it behaves correctly?

I have a table called Stock and another called Listed, within the Stock Table is a Status Code that indicates when something is at the front of the queue of items of stock - I want to be able to find the most recently added item and set this to be the "front of queue" status.
For example to get all the items listed and then order them by the one most recently listed
I would use this query:
SELECT SKU FROM Stock
INNER JOIN Listed
ON Listed.ListingID = Stock.ListingID
WHERE Stock.StatusCode = 2
ORDER BY Listed.ListDate
However, I want to find all the items in my Stock table that need to be moved to the front of the queue, i.e. given a StatusCode of 1 because no SKU with the same ProductCode currently has a StatusCode of 1.
For example, I have several items with various ProductCodes in the Stock table, each with a StatusCode of 1 or 2: 1 marks the first item in the queue, and 2 marks the rest of the items with the same ProductCode.
How do I write a query that sets StatusCode to 1 on the right item for every ProductCode that currently has no item with a StatusCode of 1?
The item to set is the most recently listed one. I have to reset every item to 2 as part of a maintenance process and then need to restore the "front-of-queue" item.
Most Recently Added: ListDate
StatusCode: 1 (Front of Queue), 2 (Other Items in Stock of same Product Code)
Here is some sample Data
Stock Table
SKU ProductCode StatusCode
1 111111 1
2 111111 2
3 222222 1
4 222222 2
5 333333 2
6 333333 2
Listed Table
ListID SKU ListDate
01 1 01/01/2009
02 2 02/01/2009
03 3 03/01/2009
04 4 04/01/2009
05 5 05/01/2009
06 6 06/01/2009
In the Stock table, ProductCode 333333 has two items (SKUs 5 and 6) with the same StatusCode. I want to set the one with the most recent ListDate in the Listed table to StatusCode 1. The same applies to every other case like this: the most recently added item should get this StatusCode.

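UPDATE S1
SET S1.StatusCode = 1
FROM Stock S1
LEFT JOIN Stock S2
ON (S1.ProductCode = S2.ProductCode
AND S2.StatusCode = 1)
JOIN Listed L1
ON (S1.SKU = L1.SKU)
WHERE S2.StatusCode IS NULL -- no item of this product is at the front of the queue
AND L1.ListDate =
( SELECT MAX(L2.ListDate) -- most recent listing within the same product
FROM Listed L2
JOIN Stock S3 ON (S3.SKU = L2.SKU)
WHERE S3.ProductCode = S1.ProductCode )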
Sometimes you say you want to "find" such items (that I guess would be a SELECT) and sometimes you say you want to "set" their status code -- I've taken the latter operation because it seems a better match for the problem you describe, whence the UPDATE.
Also, it's not clear what you want to do when multiple otherwise-satisfactory items have identical dates and thus it's impossible to uniquely define the latest one; maybe other constraints in your situation make that impossible? Here I'm setting all of their status codes. Of course, it would also be possible to set none of them, or a somewhat arbitrarily chosen one (by ordering on some other criteria?).
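A minimal sketch of the same idea that can be run locally, using Python's built-in sqlite3 module rather than SQL Server. It loads the sample data from the question and breaks date ties by highest SKU, which is an assumption, not something the question specifies. The dates are stored as ISO text; the question's dd/mm vs mm/dd format is ambiguous, but the ordering is the same under either reading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Stock  (SKU INTEGER PRIMARY KEY, ProductCode INTEGER, StatusCode INTEGER);
CREATE TABLE Listed (ListID INTEGER PRIMARY KEY, SKU INTEGER, ListDate TEXT);
INSERT INTO Stock VALUES (1,111111,1),(2,111111,2),(3,222222,1),
                         (4,222222,2),(5,333333,2),(6,333333,2);
-- ISO date strings sort chronologically as plain text.
INSERT INTO Listed VALUES (1,1,'2009-01-01'),(2,2,'2009-01-02'),(3,3,'2009-01-03'),
                          (4,4,'2009-01-04'),(5,5,'2009-01-05'),(6,6,'2009-01-06');
""")

conn.execute("""
UPDATE Stock
SET StatusCode = 1
WHERE NOT EXISTS (              -- the product has no front-of-queue item
        SELECT 1 FROM Stock s2
        WHERE s2.ProductCode = Stock.ProductCode AND s2.StatusCode = 1)
  AND SKU = (                   -- this row is the most recently listed item
        SELECT s3.SKU
        FROM Stock s3 JOIN Listed l ON l.SKU = s3.SKU
        WHERE s3.ProductCode = Stock.ProductCode
        ORDER BY l.ListDate DESC, s3.SKU DESC   -- assumed tie-break: highest SKU
        LIMIT 1)
""")

status = conn.execute("SELECT SKU, StatusCode FROM Stock ORDER BY SKU").fetchall()
print(status)
```

Only SKU 6 changes: products 111111 and 222222 already have a front-of-queue item, so they are left alone.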

This is a variation of pick-a-winner... it's pick-all-losers.
Here's the gist. There are several records with a common value, but one record is special - it is the winner. The rest of the records with that common value are losers.
For example, this query picks a winner (per name) from Customers by using the lowest id. It does this by defining what a winner is in the subquery.
SELECT *
FROM Customers c1
WHERE
(
SELECT Min(CustomerID)
FROM Customers c2
WHERE c2.Name = c1.Name
GROUP BY c2.Name
) = c1.CustomerID
Then picking the losers is a simple change:
SELECT *
FROM Customers c1
WHERE
(
SELECT Min(CustomerID)
FROM Customers c2
WHERE c2.Name = c1.Name
GROUP BY c2.Name
) != c1.CustomerID
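A small runnable sketch of the winner and loser queries, using Python's built-in sqlite3 module instead of SQL Server. The Customers rows are invented for illustration, and the GROUP BY is dropped from the subquery because the WHERE clause already restricts it to a single name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT);
-- Hypothetical data: two customers named Ann, three named Bob.
INSERT INTO Customers VALUES (1,'Ann'),(2,'Ann'),(3,'Bob'),(4,'Bob'),(5,'Bob');
""")

# The subquery defines the winner: the lowest CustomerID for each name.
winner_subquery = "(SELECT MIN(CustomerID) FROM Customers c2 WHERE c2.Name = c1.Name)"

winners = [r[0] for r in conn.execute(
    f"SELECT CustomerID FROM Customers c1 WHERE {winner_subquery} = c1.CustomerID "
    "ORDER BY CustomerID")]
losers = [r[0] for r in conn.execute(
    f"SELECT CustomerID FROM Customers c1 WHERE {winner_subquery} != c1.CustomerID "
    "ORDER BY CustomerID")]

print(winners, losers)
```

Every row lands in exactly one of the two result sets, which is what makes the loser query usable for cleanup jobs such as removing duplicates.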

This is a variation on a common theme. A similar type of query is used to deal with duplicate rows; in that scenario you might want to delete all but one row of a set.
This query solves your problem:
DECLARE @Stock AS TABLE (SKU Bigint,ProductCode Bigint,StatusCode Bigint)
INSERT INTO @Stock VALUES (1,111111,1)
INSERT INTO @Stock VALUES (2,111111,2)
INSERT INTO @Stock VALUES (3,222222,1)
INSERT INTO @Stock VALUES (4,222222,2)
INSERT INTO @Stock VALUES (5,333333,2)
INSERT INTO @Stock VALUES (6,333333,2)
DECLARE @Listed AS TABLE (ListID Bigint,SKU Bigint,ListDate DateTime)
INSERT INTO @Listed VALUES (1,1,'01/01/2009')
INSERT INTO @Listed VALUES (2,2,'02/01/2009')
INSERT INTO @Listed VALUES (3,3,'03/01/2009')
INSERT INTO @Listed VALUES (4,4,'04/01/2009')
INSERT INTO @Listed VALUES (5,5,'05/01/2009')
INSERT INTO @Listed VALUES (6,6,'06/01/2009')
UPDATE T1
SET StatusCode = 1
FROM @Stock AS T1
INNER JOIN @Listed AS T2 ON T1.SKU = T2.SKU
WHERE
T1.SKU IN
-- most recently listed SKU for this product
(SELECT TOP 1 T3.SKU FROM @Stock AS T3
INNER JOIN @Listed AS T4 ON T3.SKU = T4.SKU
AND T3.ProductCode = T1.ProductCode ORDER BY T4.ListDate DESC)
AND T1.ProductCode IN
-- products that have no item with StatusCode 1
(SELECT DISTINCT ProductCode
FROM @Stock AS S1
WHERE 1 NOT IN (SELECT DISTINCT StatusCode FROM @Stock AS S2 WHERE S2.ProductCode = S1.ProductCode))

Related

Selecting top n matches without matching the same rows twice

I am given two tables. Table 1 contains a list of appointment entries and Table 2 contains a list of date ranges, where each date range has an acceptable number of appointments it can be matched with.
I need to match an appointment from table 1 (starting with an appointment with the lowest date) to a date range in table 2. Once we've matched N appointments (where N = Allowed Appointments), we can no longer consider that date range.
Moreover, once we've matched an appointment from table 1 we can no longer consider that appointment for other matches.
Based on the matches I return table 3, with a bit column telling me if there was a match.
I am able to successfully perform this using a cursor, however this solution is not scaling well with larger datasets. I tried to match top n groups using row_count() however, this allows the same appointment to be matched multiple times which is not what I'm looking for.
Would anyone have suggestions in how to perform this matching using a set based approach?
Table 1
ApptID | ApptDate
1 | 01-01-2022
2 | 01-04-2022
3 | 01-05-2022
4 | 01-20-2022
5 | 01-21-2022
Table 2
DateRangeId | Date From | Date To | Allowed Num Appointments
1 | 01-01-2020 | 01-05-2020 | 2
2 | 01-06-2020 | 01-11-2020 | 1
3 | 01-12-2020 | 01-18-2020 | 2
4 | 01-20-2020 | 01-25-2020 | 1
5 | 01-20-2020 | 01-26-2020 | 1
Table 3 (Expected Output):
ApptID | ApptDate | Matched | DateRangeId
1 | 01-01-2022 | 1 | 1
2 | 01-04-2022 | 1 | 1
3 | 01-05-2022 | 0 | NULL
4 | 01-20-2022 | 1 | 4
5 | 01-21-2022 | 1 | 5

Here's a set-based, iterative solution. Depending on the size of your data it might benefit from indexing on the temp table. It works by filling in appointment slots in order of appointment id and range id. You should be able to adjust that if something more optimal is important.
declare @r int = 0;
create table #T3 (ApptID int, ApptDate date, DateRangeId int, UsedSlot int);
insert into #T3 (ApptID, ApptDate, DateRangeId, UsedSlot)
select ApptID, ApptDate, null, 0
from T1;
set @r = @@rowcount;
-- keep iterating until a pass assigns no further appointments
while @r > 0
begin
    with ranges as (
        select r.DateRangeId, r.DateFrom, r.DateTo, s.ApptID, r.Allowed,
            coalesce(max(s.UsedSlot) over (partition by r.DateRangeId), 0) as UsedSlots
        from T2 r left outer join #T3 s on s.DateRangeId = r.DateRangeId
    ), appts as (
        select ApptID, ApptDate from #T3 where DateRangeId is null
    ), candidates as (
        select
            a.ApptID, r.DateRangeId, r.Allowed,
            UsedSlots + row_number() over (partition by r.DateRangeId
                order by a.ApptID) as CandidateSlot
        from appts a inner join ranges r
            on a.ApptDate between r.DateFrom and r.DateTo
        where r.UsedSlots < r.Allowed
    ), culled as (
        select ApptID, DateRangeId, CandidateSlot,
            row_number() over (partition by ApptID order by DateRangeId)
                as CandidateSequence
        from candidates
        where CandidateSlot <= Allowed
    )
    update #T3
    set DateRangeId = culled.DateRangeId,
        UsedSlot = culled.CandidateSlot
    from #T3 inner join culled on culled.ApptID = #T3.ApptID
    where culled.CandidateSequence = 1;
    set @r = @@rowcount;
end
select ApptID, ApptDate,
case when DateRangeId is null then 0 else 1 end as Matched, DateRangeId
from #T3 order by ApptID;
https://dbfiddle.uk/-5nUzx6Q
It also has occurred to me that you don't really need to store the UsedSlot column. Since it's looking for the maximum in the ranges CTE you might as well just use count(*) over . But it might still have some benefit in making sense of what's going on.
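For checking the set-based result on small inputs, a plain procedural baseline can help. The sketch below is not the set-based approach itself: it walks the appointments in date order and gives each one the lowest-numbered range that contains its date and still has capacity. It assumes the date ranges are meant to be in 2022 like the appointments (the question's Table 2 shows 2020), and `match_appointments` is a hypothetical helper name.

```python
from datetime import date

# (ApptID, ApptDate) from Table 1.
appointments = [
    (1, date(2022, 1, 1)), (2, date(2022, 1, 4)), (3, date(2022, 1, 5)),
    (4, date(2022, 1, 20)), (5, date(2022, 1, 21)),
]
# (DateRangeId, DateFrom, DateTo, Allowed) from Table 2.
# ASSUMPTION: years changed to 2022 so the ranges can overlap the appointments.
ranges = [
    (1, date(2022, 1, 1), date(2022, 1, 5), 2),
    (2, date(2022, 1, 6), date(2022, 1, 11), 1),
    (3, date(2022, 1, 12), date(2022, 1, 18), 2),
    (4, date(2022, 1, 20), date(2022, 1, 25), 1),
    (5, date(2022, 1, 20), date(2022, 1, 26), 1),
]

def match_appointments(appointments, ranges):
    """Greedy matching: earliest appointment first, lowest range id first."""
    remaining = {range_id: allowed for range_id, _, _, allowed in ranges}
    ordered_ranges = sorted(ranges, key=lambda r: r[0])
    result = []
    for appt_id, appt_date in sorted(appointments, key=lambda a: (a[1], a[0])):
        matched = None
        for range_id, start, end, _ in ordered_ranges:
            if start <= appt_date <= end and remaining[range_id] > 0:
                remaining[range_id] -= 1   # consume one slot of this range
                matched = range_id
                break
        result.append((appt_id, 1 if matched is not None else 0, matched))
    return result

matches = match_appointments(appointments, ranges)
for row in matches:
    print(row)   # (ApptID, Matched, DateRangeId)
```

Under that assumption the output lines up with the expected Table 3: appointment 3 is unmatched because range 1 is already full.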

SQL Select to return one line multiple times based on a number within the dataset

So I've tried many other attempts at answers around this topic from here and so far everything has either outright failed or not given me the result I'm after:
I have a select statement to use for a report that brings through delivery information. The result set is from a main table that only has one line per delivery number (the delivery header record) and within the dataset there is also a field called palletspaces which we use to indicate (you guessed it) how many pallets are needed for the delivery
What I now need to do is the following:
find that palletspaces number
return the single delivery record the same number of times as that palletspaces number
include a new column in the results that counts up to that palletspaces number
so for instance, my SQL will return every record from the deliveries table and would look something like this
id traderid toaddressid county postcode palletspaces
D-124597 2101 2 READING RG6 1AZ 3
D-124600 20060 12 MAGOR, GWENT NP26 3DF 1
D-124601 20060 13 RUGBY CV23 8YH 2
So now, I'd need to see that palletspaces number, then return the particular line that many times and then also have a new column that counts these instances:
id traderid toaddressid county postcode palletspaces LineCount
D-124597 2101 2 READING RG6 1AZ 3 1
D-124597 2101 2 READING RG6 1AZ 3 2
D-124597 2101 2 READING RG6 1AZ 3 3
D-124600 20060 12 MAGOR, GWENT NP26 3DF 1 1
D-124601 20060 13 RUGBY CV23 8YH 2 1
D-124601 20060 13 RUGBY CV23 8YH 2 2
The other thing to mention is that naturally I'll have hundreds of different delivery records (all returned as one line each) and all will have differing palletspaces numbers. And, stating the obvious, I need each line to replicate and count based only on its own palletspaces number.
The SQL in use is as below
select
deliveries.id,
deliveries.traderid,
customers.name,
deliveries.toaddressid,
deliveries.eutransportid,
deliveries.street,
deliveries.city,
deliveries.county,
deliveries.postcode,
delivery_custom.palletspaces,
ectransport.ectranspdesc
from deliveries
INNER JOIN customers ON
deliveries.traderid = customers.id
INNER JOIN delivery_custom ON
deliveries.id = delivery_custom.id
INNER JOIN ectransport ON
deliveries.eutransportid = ectransport.ectranspcode

Try like this:
select deliveries.id,
deliveries.traderid,
customers.name,
deliveries.toaddressid,
deliveries.eutransportid,
deliveries.street,
deliveries.city,
deliveries.county,
deliveries.postcode,
delivery_custom.palletspaces,
ectransport.ectranspdesc
INTO #MyTemp
from deliveries
INNER JOIN customers ON
deliveries.traderid = customers.id
INNER JOIN delivery_custom ON
deliveries.id = delivery_custom.id
INNER JOIN ectransport ON
deliveries.eutransportid = ectransport.ectranspcode
;WITH CTE AS(
SELECT id,traderid,toaddressid,county,postcode,palletspaces,1 AS LineCount
FROM #MyTemp
UNION ALL
SELECT id,traderid,toaddressid,county,postcode,palletspaces,LineCount+1
FROM CTE
WHERE LineCount<palletspaces
)
SELECT *
FROM CTE
ORDER BY id, LineCount;
DROP TABLE #MyTemp
Hope this time you get it.

Using Recursive CTE, we can achieve this:
DECLARE @TAB TABLE ([D Number] VARCHAR(20) ,customer INT, postcode VARCHAR(20), palletspaces INT)
INSERT INTO @TAB VALUES('D-123456' ,19114, 'DA12 1TF' , 4)
INSERT INTO @TAB VALUES('D-111111' ,19114, 'DDDD 1TF' , 3)
;WITH CTE AS(
-- anchor: one row per delivery, starting the counter at 1
SELECT [D Number],customer,postcode,palletspaces,1 AS LineCount
FROM @TAB
UNION ALL
-- recursive step: repeat the row until the counter reaches palletspaces
SELECT [D Number],customer,postcode,palletspaces,LineCount+1
FROM CTE
WHERE LineCount<palletspaces
)
SELECT *
FROM CTE
ORDER BY [D Number], LineCount;
Output:
D Number customer postcode palletspaces LineCount
D-111111 19114 DDDD 1TF 3 1
D-111111 19114 DDDD 1TF 3 2
D-111111 19114 DDDD 1TF 3 3
D-123456 19114 DA12 1TF 4 1
D-123456 19114 DA12 1TF 4 2
D-123456 19114 DA12 1TF 4 3
D-123456 19114 DA12 1TF 4 4
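The recursive CTE can be tried without a SQL Server instance. This is a minimal sketch using Python's built-in sqlite3 module and a cut-down deliveries table (only id, postcode and palletspaces, taken from the question's sample rows); SQLite needs the RECURSIVE keyword spelled out, which T-SQL does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Cut-down, assumed schema: just the columns needed for the demonstration.
CREATE TABLE deliveries (id TEXT PRIMARY KEY, postcode TEXT, palletspaces INTEGER);
INSERT INTO deliveries VALUES ('D-124597','RG6 1AZ',3),
                              ('D-124600','NP26 3DF',1),
                              ('D-124601','CV23 8YH',2);
""")

rows = conn.execute("""
WITH RECURSIVE expanded(id, postcode, palletspaces, LineCount) AS (
    -- anchor member: every delivery once, counter starts at 1
    SELECT id, postcode, palletspaces, 1 FROM deliveries
    UNION ALL
    -- recursive member: emit the row again while the counter is below palletspaces
    SELECT id, postcode, palletspaces, LineCount + 1
    FROM expanded
    WHERE LineCount < palletspaces
)
SELECT id, palletspaces, LineCount
FROM expanded
ORDER BY id, LineCount
""").fetchall()

for row in rows:
    print(row)
```

Two things to keep in mind when carrying this back to SQL Server: a delivery with palletspaces of 0 or NULL still appears once, because the anchor member emits it unconditionally, and the default recursion limit there is 100 levels, so deliveries with more than roughly 100 pallet spaces would need an OPTION (MAXRECURSION n) hint.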

showing items when 'x' is present

Ok, so hoping I can get some help here after searching with no joy.
So I have a key 'orderno' and each 'orderno' has multiple items. Each item has a status. I want to pull a Q that shows only the orderno's that contain an item that has status of 'x'
So If there are 3 items and only 1 is showing status 'x' I want to see all three items not just the one.
Essentially removing any order/items that do not show the x value.
So table1
orderno / Itemno / Itemstatus
1 1 y
1 2 x
2 1 z
3 1 y
3 2 x
3 3 y
4 1 y
4 1 y
EDIT:
So basically the letters represent open, closed, or in progress... I want to see only orders that have an item closed as well as an item in progress, so I can see why the order is only showing as partially complete. Still probably not making sense here. Grrrr.
I need to return the ORDER# and all item#'s for any order that contains an item with status of 'x'.

SELECT * FROM Order_Table
WHERE orderno IN
(SELECT orderno FROM Order_Table WHERE Itemstatus = 'x')
The Inner query returns all the orders with the status 'x' and the outer one return all details of those orders.

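I prefer EXISTS to the IN or JOIN versions. It is often as fast or faster, since the engine can stop at the first matching row, although many optimizers produce the same plan for IN.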
Added a sqlfiddle.
CREATE TABLE table1(orderno INT, Itemno INT, Itemstatus CHAR(1))
INSERT INTO table1 VALUES
(1,1,'y')
,(1,2,'x')
,(2,1,'z')
,(3,1,'y')
,(3,2,'x')
,(3,3,'y')
,(4,1,'y')
,(4,1,'y')
SELECT *
FROM table1 a
WHERE EXISTS(SELECT 1
FROM table1 b
WHERE b.OrderNo = a.OrderNo
AND b.Itemstatus='x')
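To see what the EXISTS query returns for the sample data, here is a small sketch that runs it through Python's built-in sqlite3 module (SQLite rather than SQL Server; the query itself is unchanged apart from an ORDER BY added to make the output deterministic).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (orderno INTEGER, Itemno INTEGER, Itemstatus TEXT);
INSERT INTO table1 VALUES
 (1,1,'y'),(1,2,'x'),(2,1,'z'),(3,1,'y'),(3,2,'x'),(3,3,'y'),(4,1,'y'),(4,1,'y');
""")

rows = conn.execute("""
SELECT a.orderno, a.Itemno, a.Itemstatus
FROM table1 a
WHERE EXISTS (SELECT 1              -- keep the row if its order has any 'x' item
              FROM table1 b
              WHERE b.orderno = a.orderno
                AND b.Itemstatus = 'x')
ORDER BY a.orderno, a.Itemno
""").fetchall()

for row in rows:
    print(row)
```

Orders 1 and 3 come back with all of their items; orders 2 and 4 are dropped entirely because none of their items has status 'x'.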

List Top Items and Whether Purchased (one SQL query)

Here are the key fields I'm working with: (They come from a join on two other tables and there are also some extra fields like date I'll be filtering on, but that stuff I can handle.)
Customer Type (text), Item ID (number), Sales $s (number), Customer ID (number)
I want to answer two questions with one query, if possible:
1) For a given list of customer type(s), what were the top 25 items (by sum of sales)
2) Using the list of 25 items generated in step 1, did a given list of customer IDs purchase each of the specified items?
So my final result would look something like this:
(header) Item # | Customer Purchased?
(row 01) Item 1123 | Yes
(row 02) Item 2452 | Yes
(row 03) Item 3354 | No
...
(row 25) Item 2554 | No
The item numbers would be listed in decreasing sales volume (within the specified customer category/categories) and I'd be testing whether sum of sales > 0 to trip the Yes / No flag on customer(s) purchased.
Thanks!

Assuming you have your columns in a table #Orders, and the "list of customer Ids" in a table #MyCustomers:
create table #Orders (CustomerId int, CustomerType varchar(10), ItemId int, Sales decimal);
create table #MyCustomers (CustomerId int);
... you could try something like this:
declare @CustomerType varchar(10) = 'Ugly';
with MarkedOrders as (
select
o.ItemId,
o.Sales,
-- 1 when the order belongs to one of the listed customers
case when mc.CustomerId is not null then 1 else 0 end IsMyCustomer
from
#Orders o
left join #MyCustomers mc
on mc.CustomerId = o.CustomerId
where
o.CustomerType = @CustomerType
)
select top 25
o.ItemId,
max(IsMyCustomer) IsPurchasedByMyCustomer
from MarkedOrders o
group by o.ItemId
order by sum(o.Sales) desc
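A scaled-down sketch of the same query shape, run through Python's built-in sqlite3 module so it can be tried without SQL Server. The rows are invented, the limit is 2 instead of 25 to keep the data small, and the flag is rendered as Yes/No as in the question. MyCustomers is given a primary key here because duplicate ids in that list would multiply the joined order rows and inflate the sales totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (CustomerId INTEGER, CustomerType TEXT, ItemId INTEGER, Sales REAL);
CREATE TABLE MyCustomers (CustomerId INTEGER PRIMARY KEY);
-- Hypothetical data for illustration only.
INSERT INTO Orders VALUES (1,'Ugly',1123,50.0),(2,'Ugly',1123,30.0),
                          (3,'Ugly',2452,60.0),(1,'Ugly',3354,10.0),
                          (4,'Pretty',9999,500.0);
INSERT INTO MyCustomers VALUES (3);
""")

top_items = conn.execute("""
SELECT o.ItemId,
       CASE WHEN MAX(mc.CustomerId IS NOT NULL) = 1 THEN 'Yes' ELSE 'No' END AS Purchased
FROM Orders o
LEFT JOIN MyCustomers mc ON mc.CustomerId = o.CustomerId
WHERE o.CustomerType = ?
GROUP BY o.ItemId
ORDER BY SUM(o.Sales) DESC      -- ranking uses sales from every customer of this type
LIMIT 2
""", ("Ugly",)).fetchall()

print(top_items)
```

Item 1123 ranks first on total sales (80) but was not bought by customer 3, while item 2452 was; item 9999 is excluded because it belongs to a different customer type.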

How to loop through a table and look for adjacent rows with identical values in one field and update another column conditionally in SQL?

I have a table with a field called 'group_quartile', which uses the SQL ntile() function to calculate which quartile each customer lies in on the basis of their activity score. However, with ntile() I find that some customers with the same activity score end up in different quartiles. I need to modify the 'group_quartile' column so that all customers with the same activity score lie in the same group_quartile.
A view of the table values :
Customer_id Product Activity_Score Group_Quartile
CH002 T 2328 1
CR001 T 268 1
CN001 T 178 1
MS006 T 45 2
ST001 T 21 2
CH001 T 0 2
CX001 T 0 3
KH001 T 0 3
MH002 T 0 4
SJ003 T 0 4
CN001 S 439 1
AC002 S 177 1
SC001 S 91 2
PV001 S 69 3
TS001 S 0 4
I used a CTE, but it did not work.
My query only updates(from the above example) :
CX001 T 0 3
modified to
CX001 T 0 2
So only the first repeating activity score is checked and that row’s group_quartile is updated to 2.
I need to update all the below rows as well.
CX001 T 0 3
KH001 T 0 3
MH002 T 0 4
SJ003 T 0 4
I cannot use DENSE_RANK() instead of NTILE() to segregate the records, as arranging the customers per product into approximately 4 quartiles is a business requirement.
From my understanding I need to loop through the table:
Find a row which has the same activity score and the same product as its predecessor but a different group_quartile.
Update the selected row's group_quartile to its predecessor's quartile value.
Then loop through the updated table again to look for any row meeting the above condition, and update that row similarly.
The loop continues until all rows with the same activity score (for the same product) are in the same group_quartile.
--
THIS IS THE TABLE STRUCTURE I AM WORKING ON:
CREATE TABLE #custs
(
customer_id NVARCHAR(50),
PRODUCT NVARCHAR(50),
ACTIVITYSCORE INT,
GROUP_QUARTILE INT,
RANKED int,
rownum int
)
INSERT INTO #custs
-- adding a column to give row numbers(unique id) for each row
SELECT customer_id, PRODUCT, ACTIVITYSCORE,GROUP_QUARTILE,RANKED,
Row_Number() OVER(partition by product ORDER BY activityscore desc) N
FROM
-- rows derived form a parent table based on 'segmentation' column value
(SELECT customer_id, PRODUCT, ACTIVITYSCORE,
DENSE_RANK() OVER (PARTITION BY PRODUCT ORDER BY ACTIVITYSCORE DESC) AS RANKED,
NTILE(4) OVER(PARTITION BY PRODUCT ORDER BY ACTIVITYSCORE DESC) AS GROUP_QUARTILE
FROM #parent_score_table WHERE (SEGMENTATION = 'Large')
) as temp
ORDER BY PRODUCT
The method I used to achieve this partially is as follows :
-- The query finds the rows that have the same activity score as the previous row but a different Group_Quartile value.
-- I need a query to update these rows.
-- Next, find any rows in the newly updated table that have the same activity score as the previous row but a different Group_Quartile value.
-- Continue updating the table in this manner until all rows with the same activity score have the same quartile value.
I managed to find only the rows that have the same activity score as the previous row but a different Group_Quartile value; I cannot loop through to find new rows that may match an updated row.
select t1.customer_id,t1.ACTIVITYSCORE,t1.PRODUCT, t1.RANKED, t1.GROUP_QUARTILE, t2.GROUP_QUARTILE as modified_quartile
from #custs t1, #custs t2
where (
t1.rownum = t2.rownum + 1
and t1.ACTIVITYSCORE = t2.ACTIVITYSCORE
and t1.PRODUCT = t2.PRODUCT
and not(t1.GROUP_QUARTILE = t2.GROUP_QUARTILE))
Can anyone help with what should be the t-sql statement for the above?
Cheers!

Assuming you've already worked out a base Group_Quartile as indicated above, you can update the table with a query similar to the following:
update a
set Group_Quartile = coalesce(topq.Group_Quartile, a.Group_Quartile)
from activityScores a
outer apply
(
select top 1 Group_Quartile
from activityScores topq
where a.Product = topq.Product
and a.Activity_Score = topq.Activity_Score
order by Group_Quartile
) topq
SQL Fiddle with demo.
Edit after comment:
I think you did a lot of the work already by getting the Group_Quartile working.
For each row in the table, the statement above will join another row to it using the outer apply statement. Only one row will be joined back to the original table due to the top 1 clause.
So for each row, we are returning one more row. The extra row will be matched on Product and Activity_Score, and will be the row with the lowest Group_Quartile (order by Group_Quartile). Finally, we update the original row with this lowest Group_Quartile value, so each row with the same Product and Activity_Score will now have the same, lowest possible Group_Quartile.
So SJ003, MH002, etc will all be matched to CH001 and be updated with the Group_Quartile value of CH001, i.e. 2.
It's hard to explain code! Another thing that might help is looking at the join without the update statement:
select a.*
, TopCustomer_id = topq.Customer_Id
, NewGroup_Quartile = topq.Group_Quartile
from activityScores a
outer apply
(
select top 1 *
from activityScores topq
where a.Product = topq.Product
and a.Activity_Score = topq.Activity_Score
order by Group_Quartile
) topq
SQL Fiddle without update.
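The effect of the update can be reproduced on the question's sample rows with a short sketch using Python's built-in sqlite3 module. SQLite has no OUTER APPLY, so this version swaps in a correlated MIN() subquery, which picks the same lowest Group_Quartile per Product and Activity_Score; the table name activityScores follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activityScores
    (Customer_id TEXT, Product TEXT, Activity_Score INTEGER, Group_Quartile INTEGER);
INSERT INTO activityScores VALUES
 ('CH002','T',2328,1),('CR001','T',268,1),('CN001','T',178,1),('MS006','T',45,2),
 ('ST001','T',21,2),('CH001','T',0,2),('CX001','T',0,3),('KH001','T',0,3),
 ('MH002','T',0,4),('SJ003','T',0,4),
 ('CN001','S',439,1),('AC002','S',177,1),('SC001','S',91,2),('PV001','S',69,3),
 ('TS001','S',0,4);
""")

# Give every row the lowest quartile found among rows with the same
# product and activity score. No loop is needed: one pass settles all ties.
conn.execute("""
UPDATE activityScores
SET Group_Quartile = (SELECT MIN(s.Group_Quartile)
                      FROM activityScores s
                      WHERE s.Product = activityScores.Product
                        AND s.Activity_Score = activityScores.Activity_Score)
""")

zero_scores_t = conn.execute("""
SELECT Customer_id, Group_Quartile FROM activityScores
WHERE Product = 'T' AND Activity_Score = 0
ORDER BY Customer_id
""").fetchall()

# Number of (Product, Activity_Score) groups still spread over several quartiles.
split_groups = conn.execute("""
SELECT COUNT(*) FROM (
    SELECT 1 FROM activityScores
    GROUP BY Product, Activity_Score
    HAVING COUNT(DISTINCT Group_Quartile) > 1)
""").fetchone()[0]

print(zero_scores_t, split_groups)
```

All five product T customers with a score of 0 end up in quartile 2, the value CH001 already had, and no group remains split across quartiles.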
