Find similar sales orders in SQL

This is my first post.
I work at a manufacturing company and most of the products we are making are custom made.
We believe we can find some commonalities in the products we sell.
To do this, we need to analyze sales orders and compare them to all the sales orders in our system to find identical ones.
Here's an example in the form of a SQL result:
+---------+-------------+-----+
| OrderId | ProductCode | Qty |
+---------+-------------+-----+
| SS1234  | Widget1     | 1   |
| SS1234  | Widget2     | 3   |
| SS1234  | Widget3     | 1   |
+---------+-------------+-----+
etc...
I would like to find orders similar to SS1234, i.e. orders with the same products (Widget1, Widget2 and Widget3) and the same quantities.
How do I do this in SQL Server 2008R2?
Thanks for your help!
Raf

I won't be able to test this before I go to bed for the evening. This is an overly verbose approach, but I wanted to grind this out as quickly as possible, so I used structure and syntax that I know well instead of trying to write more concise, efficient code that would require leaning on the documentation. Basically, we count the number of line items in each order, select a pair of order IDs every time we find two matching line items, and then count how many times each exact pair of order IDs appears. Inner joins then filter out pairs that matched fewer times than there are products in either order.
WITH
ProductCounts AS (
    -- number of line items in each order
    SELECT COUNT(OrderID) AS ProductCodesInOrder, OrderID
    FROM [Table]
    GROUP BY OrderID
), MatchingLineItems AS (
    -- one row per line item shared by two different orders
    -- (no ORDER BY here: SQL Server rejects it inside a CTE without TOP)
    SELECT A.OrderID AS FirstOrderID, B.OrderID AS SecondOrderID
    FROM [Table] AS A
    INNER JOIN [Table] AS B
        ON A.ProductCode = B.ProductCode AND A.Qty = B.Qty
        AND A.OrderID <> B.OrderID -- do not match an order with itself
), MatchTotals AS (
    SELECT
        COUNT(FirstOrderID) AS Matches, FirstOrderID, SecondOrderID
    FROM MatchingLineItems
    GROUP BY FirstOrderID, SecondOrderID
), FirstMatches AS (
    -- keep pairs where every line of the first order matched
    SELECT MatchTotals.FirstOrderID, MatchTotals.SecondOrderID, MatchTotals.Matches
    FROM MatchTotals
    INNER JOIN ProductCounts
        ON MatchTotals.FirstOrderID = ProductCounts.OrderID
    WHERE MatchTotals.Matches = ProductCounts.ProductCodesInOrder
)
-- keep pairs where every line of the second order matched as well
SELECT FirstMatches.FirstOrderID, FirstMatches.SecondOrderID
FROM FirstMatches
INNER JOIN ProductCounts
    ON FirstMatches.SecondOrderID = ProductCounts.OrderID
WHERE FirstMatches.Matches = ProductCounts.ProductCodesInOrder
ORDER BY FirstMatches.FirstOrderID, FirstMatches.SecondOrderID
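The pair-counting idea described above can be tried out quickly without a SQL Server instance. The following is only a sketch, run against an in-memory SQLite database from Python rather than SQL Server 2008 R2. The table name ord and the condensed CTE layout are my own assumptions, and the sample rows are borrowed from the setup in the next answer. It lists each identical pair once by requiring a.OrderId < b.OrderId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name "ord"; columns follow the question.
conn.execute("CREATE TABLE ord (OrderId TEXT, ProductCode TEXT, Qty INTEGER)")
conn.executemany("INSERT INTO ord VALUES (?, ?, ?)", [
    ("SS1234", "Widget1", 1), ("SS1234", "Widget2", 3), ("SS1234", "Widget3", 1),
    ("SS1234a", "Widget1", 1), ("SS1234a", "Widget2", 3), ("SS1234a", "Widget3", 1),
    ("xSS1234", "Widget1", 1), ("xSS1234", "Widget2", 3), ("xSS1234", "Widget3", 1),
    ("xSS1234", "Widget4", 1),
    ("zSS1234", "Widget2", 3), ("zSS1234", "Widget3", 1),
])

PAIR_QUERY = """
WITH ProductCounts AS (
    -- number of line items per order
    SELECT OrderId, COUNT(*) AS n FROM ord GROUP BY OrderId
), MatchTotals AS (
    -- number of matching line items per pair of different orders
    SELECT a.OrderId AS FirstOrderId, b.OrderId AS SecondOrderId, COUNT(*) AS Matches
    FROM ord AS a
    JOIN ord AS b
      ON a.ProductCode = b.ProductCode AND a.Qty = b.Qty AND a.OrderId < b.OrderId
    GROUP BY a.OrderId, b.OrderId
)
-- a pair is identical only if the match count equals both orders' line counts
SELECT m.FirstOrderId, m.SecondOrderId
FROM MatchTotals AS m
JOIN ProductCounts AS p1 ON p1.OrderId = m.FirstOrderId AND p1.n = m.Matches
JOIN ProductCounts AS p2 ON p2.OrderId = m.SecondOrderId AND p2.n = m.Matches
"""

identical_pairs = conn.execute(PAIR_QUERY).fetchall()
print(identical_pairs)
```

Orders that only share a subset of lines (xSS1234 has an extra Widget4, zSS1234 lacks Widget1) drop out because one of the two line counts differs from the number of matches.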

Setup:
CREATE TABLE #ord (
OrderId VARCHAR(20),
ProductCode VARCHAR(40),
qty int
)
INSERT INTO #ord (OrderId, ProductCode, Qty)
VALUES
('SS1234','Widget1',1)
,('SS1234','Widget2',3)
,('SS1234','Widget3',1)
,('SS1234a','Widget1',1)
,('SS1234a','Widget2',3)
,('SS1234a','Widget3',1)
,('xSS1234','Widget1',1)
,('xSS1234','Widget2',3)
,('xSS1234','Widget3',1)
,('xSS1234','Widget4',1)
,('ySS1234','Widget1',10)
,('ySS1234','Widget2',3)
,('ySS1234','Widget3',1)
,('zSS1234','Widget2',3)
,('zSS1234','Widget3',1)
;
Query:
with CTE as (
select distinct
o.OrderID, ca.ProductString, ca.QtyString
from #ord o
cross apply (
SELECT
STUFF((
SELECT
', ' + o2.ProductCode
FROM #ord o2
WHERE o.OrderID = o2.OrderID
ORDER BY o2.ProductCode
FOR XML PATH ('')
)
, 1, 2, '')
, STUFF((
SELECT
', ' + cast(o2.Qty as varchar)
FROM #ord o2
WHERE o.OrderID = o2.OrderID
ORDER BY o2.ProductCode
FOR XML PATH ('')
)
, 1, 2, '')
) ca (ProductString, QtyString)
)
select
ProductString, QtyString, count(*) Num_Orders
from CTE
group by
ProductString, QtyString
having
count(*) > 1
order by
Num_Orders DESC
, ProductString
Result:
ProductString               QtyString   Num_Orders
Widget1, Widget2, Widget3   1, 3, 1     2
See: http://rextester.com/DJEN59714
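The query above works by giving every order a "signature" (its sorted product and quantity strings) and grouping on it. Here is the same idea as a small Python sketch, assuming the rows have already been fetched into memory. The variable names are mine, and a frozenset of (product, qty) pairs stands in for the concatenated strings.

```python
from collections import defaultdict

rows = [
    ("SS1234", "Widget1", 1), ("SS1234", "Widget2", 3), ("SS1234", "Widget3", 1),
    ("SS1234a", "Widget1", 1), ("SS1234a", "Widget2", 3), ("SS1234a", "Widget3", 1),
    ("xSS1234", "Widget1", 1), ("xSS1234", "Widget2", 3), ("xSS1234", "Widget3", 1),
    ("xSS1234", "Widget4", 1),
    ("ySS1234", "Widget1", 10), ("ySS1234", "Widget2", 3), ("ySS1234", "Widget3", 1),
    ("zSS1234", "Widget2", 3), ("zSS1234", "Widget3", 1),
]

# Collect the (product, qty) lines of each order.
lines_by_order = defaultdict(set)
for order_id, product, qty in rows:
    lines_by_order[order_id].add((product, qty))

# Orders with the same set of lines share the same signature.
orders_by_signature = defaultdict(list)
for order_id, lines in lines_by_order.items():
    orders_by_signature[frozenset(lines)].append(order_id)

duplicates = [sorted(ids) for ids in orders_by_signature.values() if len(ids) > 1]
print(duplicates)
```

Like the SQL version, this treats an order as a set of lines, so it assumes a product appears at most once per order.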

Related

Only get rows until qty total is met

I have an order table (product, qty_required) and a stock/bin location table (product, bin_location, qty_free) which is a one->many (a product may be stored in multiple bins).
Please, please (pretty please!), does anybody know how to do the following?
When producing a picking report, I only want to return the first x bins for each ordered product that together satisfy the qty_required on the order.
For example
An order requires product 'ABC', QTY 10
Product 'ABC' is in the following locations (this is listed using FIFO rules so oldest first):
LOC1, 3 free
LOC2, 4 free
LOC3, 6 free
LOC4, 18 free
LOC5, 2 free
So, on the report, I'd ONLY want to see the first 3 locations, as the total of those (13) satisfies the order quantity of 10, i.e.:
LOC1, 3
LOC2, 4
LOC3, 6
Use sum(qty_free) over(partition by product order by placement_date asc, bin_location) to calculate a running sum, and filter rows by your threshold in an outer query (select from select). bin_location is included in the order by so that bins placed on the same day are not summed together in a single step. A bin is needed as long as the stock in the bins before it (rsum - qty_free) is still below the requested quantity.
with s as (
    select st.*,
        sum(qty_free) over(partition by product order by placement_date asc, bin_location) as rsum
    from stock st
)
select
    o.product,
    s.bin_location,
    s.qty_free,
    o.qty_requested
from orders o
left join s
    on o.product = s.product
    /* the earlier bins alone do not cover the order yet */
    and s.rsum - s.qty_free < o.qty_requested
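To check the running-sum filter against the numbers in the question, here is a minimal Python sketch (plain lists instead of tables; the variable names are my own). A bin is kept while the stock in the bins before it is still short of the order quantity.

```python
from itertools import accumulate

# Bins for product 'ABC' in FIFO order (oldest first), as listed in the question.
bins = [("LOC1", 3), ("LOC2", 4), ("LOC3", 6), ("LOC4", 18), ("LOC5", 2)]
qty_required = 10

# Running total including the current bin: 3, 7, 13, 31, 33
running = list(accumulate(qty for _, qty in bins))

picked = [
    (location, qty)
    for (location, qty), total in zip(bins, running)
    if total - qty < qty_required  # stock before this bin does not cover the order
]
print(picked)
```

Note that filtering on the running total including the current bin (total <= qty_required) would stop at LOC2, because LOC3 pushes the total past 10 even though it is still needed.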
UPD: Since it turned out that your SQL Server version is too old to support this analytic function, here's another, less performant way to do this (it may need some fixes; I didn't test it on real data).
And a fiddle with some setup.
with ord_key as (
select stock.*,
/*Generate order key for FIFO*/
row_number() over(order by
placement_date asc,
bin_location asc
) as sort_order_key
from stock
)
, rsum as (
/*Calculate sum of all the items before current*/
select
b.product,
b.bin_location,
b.placement_date,
b.qty_free,
coalesce(sum(sub.item_sum), 0) as rsum
from ord_key as b
left join (
/*Group by partition key and orderby key*/
select
product,
sort_order_key,
sum(qty_free) as item_sum
from ord_key
group by
product,
sort_order_key
) as sub
on b.product = sub.product
and b.sort_order_key > sub.sort_order_key
group by
b.product,
b.bin_location,
b.placement_date,
b.qty_free
)
, calc_quantities as (
select
o.product,
s.placement_date,
s.bin_location,
s.qty_free,
s.rsum,
o.qty_requested,
case
/*The whole bin is consumed*/
when o.qty_requested > s.rsum + s.qty_free
then s.qty_free
/*Last bin: take only what is still missing*/
else o.qty_requested - s.rsum
end as qty_to_retrieve
from orders o
left join rsum s
on o.product = s.product
and s.rsum < o.qty_requested
)
select
s.*,
qty_free - qty_to_retrieve as stock_left
from calc_quantities s
order by
product,
placement_date desc,
bin_location desc
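The quantity to retrieve from each bin and the stock left behind can be sanity-checked with a few lines of Python. This is only a sketch of the allocation logic, not a translation of the query. The function name allocate is hypothetical, and the bins are assumed to arrive already sorted oldest first.

```python
def allocate(bins, qty_required):
    """Walk the bins in FIFO order and take stock until the order is covered.

    Returns (location, qty_to_retrieve, stock_left) for every bin that is touched.
    """
    picks = []
    remaining = qty_required
    for location, qty_free in bins:
        if remaining <= 0:
            break  # order already covered by earlier bins
        take = min(qty_free, remaining)  # whole bin, or just what is still missing
        picks.append((location, take, qty_free - take))
        remaining -= take
    return picks


bins = [("LOC1", 3), ("LOC2", 4), ("LOC3", 6), ("LOC4", 18), ("LOC5", 2)]
print(allocate(bins, 10))
```

For the example order of 10, the first two bins are emptied and only 3 of the 6 items in LOC3 are taken.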

Creating columns from a select query

SELECT TOP 20
TMPPO.PurchaseOrder ,
TMPPO.LineItem ,
ASLD.SignatureDate ,
ASLD.SignatureTime ,
ASLD.Operator ,
ASLD.Variable ,
ASLD.VariableDesc ,
ASLD.VarNumericValue
FROM #POAMENDMENTS TMPPO
LEFT OUTER JOIN [SysproCompanyR].[dbo].[AdmSignatureLogDet] ASLD ON TMPPO.TransactionId = ASLD.TransactionId
AND TMPPO.SignatureDate = ASLD.SignatureDate
AND TMPPO.SignatureTime = ASLD.SignatureTime
WHERE YEAR(TMPPO.SignatureDate) = 2013
AND MONTH(TMPPO.SignatureDate) = 08
AND VariableDesc IN ( 'Previous foreign price', 'Previous price',
'Foreign price', 'Price' )
ORDER BY PurchaseOrder ,
LineItem
I have the following table but don't want to return the records as shown below.
Under the column heading VariableDesc I have Foreign price, Previous foreign price, Previous price and Price. I would like to make these the column headings that replace Variable, VariableDesc and VarNumericValue.
So, e.g., the first lines would be:
PurchaseOrder  LineItem  SignatureDate  SignatureTime  Operator  PrevFPrice  FPrice  PrevPrice  Price
002074         0001      2013-02-23     9523598        UPOFA0    19.68       21.51   19.68      21.51
004931         0001      2013-08-09     7485253        PVWYK0    980.00      840.00  980.00     840.00
Sorry, but it is difficult to put sample data here; I have no idea how to...
Is this possible?
@Bummi it provides me data like this. Why is purchase order 005331 duplicated so many times when, according to the original sample data, it changed only 2 times by date and time?
From what I understand, you are looking for a join over your first query:
;With CTE as
(
SELECT TOP 20 TMPPO.PurchaseOrder, TMPPO.LineItem, ASLD.SignatureDate,ASLD.SignatureTime,ASLD.Operator, ASLD.Variable, ASLD.VariableDesc, ASLD.VarNumericValue FROM #POAMENDMENTS TMPPO
LEFT OUTER JOIN [SysproCompanyR].[dbo].[AdmSignatureLogDet] ASLD ON TMPPO.TransactionId = ASLD.TransactionId and TMPPO.SignatureDate = ASLD.SignatureDate and TMPPO.SignatureTime = ASLD.SignatureTime
WHERE YEAR(TMPPO.SignatureDate) = 2013
and MONTH(TMPPO.SignatureDate) = 08
and VariableDesc IN ('Previous foreign price','Previous price','Foreign price','Price')
ORDER BY PurchaseOrder, LineItem
)
Select c1.PurchaseOrder,c1.LineItem,c1.SignatureDate,c1.SignatureTime,c1.Operator
,c1.VarNumericValue as [Previous foreign price]
,c2.VarNumericValue as [Previous price]
,c3.VarNumericValue as [Foreign price]
,c4.VarNumericValue as [Price]
FROM CTE c1
JOIN CTE c2 on c2.PurchaseOrder=c1.PurchaseOrder and c2.VariableDesc='Previous price'
and c2.LineItem=c1.LineItem and c2.SignatureDate=c1.SignatureDate and c2.SignatureTime=c1.SignatureTime
JOIN CTE c3 on c3.PurchaseOrder=c1.PurchaseOrder and c3.VariableDesc='Foreign price'
and c3.LineItem=c1.LineItem and c3.SignatureDate=c1.SignatureDate and c3.SignatureTime=c1.SignatureTime
JOIN CTE c4 on c4.PurchaseOrder=c1.PurchaseOrder and c4.VariableDesc='Price'
and c4.LineItem=c1.LineItem and c4.SignatureDate=c1.SignatureDate and c4.SignatureTime=c1.SignatureTime
Where c1.VariableDesc='Previous foreign price'
You can just use 'AS' to rename your columns
SELECT something AS somethingelse
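A different technique from the self-joins above is conditional aggregation: group by the key columns and pick each VariableDesc value with MAX(CASE ...). The sketch below runs it on SQLite from Python rather than on SQL Server. The table name amendments is hypothetical, and the single group of sample rows is built from the first line of the desired output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table standing in for the joined query result.
conn.execute("""
CREATE TABLE amendments (
    PurchaseOrder TEXT, LineItem TEXT, SignatureDate TEXT, SignatureTime INTEGER,
    Operator TEXT, VariableDesc TEXT, VarNumericValue REAL
)""")
key = ("002074", "0001", "2013-02-23", 9523598, "UPOFA0")
conn.executemany("INSERT INTO amendments VALUES (?, ?, ?, ?, ?, ?, ?)", [
    key + ("Previous foreign price", 19.68),
    key + ("Foreign price", 21.51),
    key + ("Previous price", 19.68),
    key + ("Price", 21.51),
])

# One output row per signature event; each CASE picks one description's value.
pivoted = conn.execute("""
SELECT PurchaseOrder, LineItem, SignatureDate, SignatureTime, Operator,
       MAX(CASE WHEN VariableDesc = 'Previous foreign price' THEN VarNumericValue END) AS PrevFPrice,
       MAX(CASE WHEN VariableDesc = 'Foreign price' THEN VarNumericValue END) AS FPrice,
       MAX(CASE WHEN VariableDesc = 'Previous price' THEN VarNumericValue END) AS PrevPrice,
       MAX(CASE WHEN VariableDesc = 'Price' THEN VarNumericValue END) AS Price
FROM amendments
GROUP BY PurchaseOrder, LineItem, SignatureDate, SignatureTime, Operator
""").fetchall()
print(pivoted)
```

Because the grouping key includes the signature date and time, each amendment yields exactly one row, which may also help with the duplicated rows mentioned in the comment above.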

SQL Server query troubles, many-many relationship

Not sure how to word this question in one line, apologies for the title...
I have 3 tables in my database, for example:
Shop
Item
ShopStock
Shop and Item have a many-many relationship and so the ShopStock table links them.
The fields in ShopStock are:
ID
ShopID
ItemID
CurrentStock
I want to list the items, showing how much stock each shop has, but I'm having trouble with the SQL. Something like this:
ITEM TESCO STOCK ASDA STOCK SAINSBURY STOCK
Apples 5 20 74
Pears 1000 32 250
How do I build the SQL query to display the data like this?
This would be easier to list as item,shop,currentstock in multiple rows. As is, unless you know the number of shops, you're going to need to use dynamic sql for this. If you know the number of potential shops, you can use PIVOT to return your results.
Something like this assuming you had 2 shops (shop1 and shop2):
select item_name, [Shop1], [Shop2]
from
(
select item_name, shop_name, currentstock
from item i
join shopstock ss on i.item_id = ss.item_id
join shop s on s.shop_id = ss.shop_id
) x
pivot
(
max(currentstock)
for shop_name in ([Shop1],[Shop2])
) p
Here is the dynamic sql approach as I suspect you don't know the number of possible shops:
DECLARE @cols AS NVARCHAR(MAX),
    @query AS NVARCHAR(MAX)
select @cols = stuff((select distinct ',' + quotename(shop_name)
        from shop
        FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)')
    ,1,1,'')
set @query = 'select item_name,' + @cols + '
from
(
select item_name, shop_name, currentstock
from item i
join shopstock ss on i.item_id = ss.item_id
join shop s on s.shop_id = ss.shop_id
) x
pivot
(
max(currentstock)
for shop_name in (' + @cols + ')
) p '
execute(@query)
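The same "build the column list from the shop table, then run the generated query" idea can be sketched from application code. The example below is an assumption-laden illustration: it uses SQLite from Python (which has no PIVOT operator, so the generated columns are MAX(CASE ...) expressions), the table and column names follow the answer, the sample figures come from the question, and quote_ident is a hypothetical helper playing the role of QUOTENAME.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shop (shop_id INTEGER PRIMARY KEY, shop_name TEXT);
CREATE TABLE item (item_id INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE shopstock (shop_id INTEGER, item_id INTEGER, currentstock INTEGER);
INSERT INTO shop VALUES (1, 'Tesco'), (2, 'Asda'), (3, 'Sainsbury');
INSERT INTO item VALUES (1, 'Apples'), (2, 'Pears');
INSERT INTO shopstock VALUES (1, 1, 5), (2, 1, 20), (3, 1, 74),
                             (1, 2, 1000), (2, 2, 32), (3, 2, 250);
""")


def quote_ident(name):
    """Quote a column alias so that shop names cannot break the generated SQL."""
    return '"' + name.replace('"', '""') + '"'


# Step 1: read the shops that exist right now.
shops = conn.execute("SELECT shop_id, shop_name FROM shop ORDER BY shop_id").fetchall()

# Step 2: generate one aggregated column per shop.
columns = ", ".join(
    f"MAX(CASE WHEN ss.shop_id = {int(shop_id)} THEN ss.currentstock END) AS {quote_ident(name)}"
    for shop_id, name in shops
)

# Step 3: run the generated query.
query = f"""
SELECT i.item_name, {columns}
FROM item AS i
JOIN shopstock AS ss ON ss.item_id = i.item_id
GROUP BY i.item_name
ORDER BY i.item_name
"""
pivot = conn.execute(query).fetchall()
print(pivot)
```

Adding a fourth shop to the shop table adds a fourth stock column on the next run without any change to the code.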
You might need to add JOINS to get specific names, but this is the idea you're after:
SELECT ItemID
, MAX(CASE WHEN ShopID = 'Tesco' THEN CurrentStock ELSE 0 END) AS 'Tesco Stock'
, MAX(CASE WHEN ShopID = 'ASDA' THEN CurrentStock ELSE 0 END) AS 'ASDA Stock'
, MAX(CASE WHEN ShopID = 'Sainsbury' THEN CurrentStock ELSE 0 END) AS 'Sainsbury Stock'
FROM ShopStock
GROUP BY ItemID
This assumes one entry per item per ShopID. If there are multiple entries, you would have to SUM() them instead, but the idea is the same.

select least row per group in SQL

I am trying to select the minimum price for each condition category. I did some searching and wrote the code below. However, it shows NULL for the selected fields. Any solution?
SELECT Sales.Sale_ID, Sales.Sale_Price, Sales.Condition
FROM Items
LEFT JOIN Sales ON ( Items.Item_ID = Sales.Item_ID
AND Sales.Expires_DateTime > NOW( )
AND Sales.Sale_Price = (
SELECT MIN( s2.Sale_Price )
FROM Sales s2
WHERE Sales.`Condition` = s2.`Condition` ) )
WHERE Items.ISBN =9780077225957
A little more complicated solution, but one that includes your Sale_ID is below.
SELECT TOP 1 Sale_Price, Sale_ID, Condition
FROM Sales
WHERE Sale_Price IN (SELECT MIN(Sale_Price)
FROM Sales
WHERE
Expires_DateTime > NOW()
AND
Item_ID IN
(SELECT Item_ID FROM Items WHERE ISBN = 9780077225957)
GROUP BY Condition )
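The 'TOP 1' is there in case more than one sale has the same minimum price and you only want one returned. (TOP is SQL Server syntax; in MySQL, which the backticks in the question suggest, the equivalent is a LIMIT 1 at the end of the query.)
(inner query taken directly from @Michael Ames's answer)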
If you don't need Sales.Sale_ID, this solution is simpler:
SELECT MIN(Sale_Price), Condition
FROM Sales
WHERE Expires_DateTime > NOW()
AND Item_ID IN
(SELECT Item_ID FROM Items WHERE ISBN = 9780077225957)
GROUP BY Condition
Good luck!
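If the Sale_ID of the cheapest sale in every condition is needed, one common pattern is to join the table back to the grouped minimum. The sketch below shows it on SQLite from Python with made-up rows. The item ID 10 and the prices are assumptions, and the expiry filter and the ISBN lookup from the question are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Sales (Sale_ID INTEGER, Item_ID INTEGER, Sale_Price REAL, "Condition" TEXT)'
)
# Made-up sample rows for a single item.
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?)", [
    (1, 10, 25.0, "New"),
    (2, 10, 20.0, "New"),
    (3, 10, 12.5, "Used"),
    (4, 10, 15.0, "Used"),
])

cheapest = conn.execute("""
SELECT s.Sale_ID, s.Sale_Price, s."Condition"
FROM Sales AS s
JOIN (
    -- minimum price per condition for the item
    SELECT "Condition", MIN(Sale_Price) AS MinPrice
    FROM Sales
    WHERE Item_ID = 10
    GROUP BY "Condition"
) AS m
  ON m."Condition" = s."Condition" AND m.MinPrice = s.Sale_Price
WHERE s.Item_ID = 10
ORDER BY s."Condition"
""").fetchall()
print(cheapest)
```

If two sales tie on the minimum price within a condition, both rows are returned, so a tie-breaker would still be needed when exactly one row per condition is required.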

Math with previous row in SQL, avoiding nested queries?

I want to do some math on the previous rows in an SQL request in order to avoid doing it in my code.
I have a table representing the sales of two entities (the data shown here doesn't make much sense; it's just an excerpt):
YEAR  ID  SALES        PURCHASE    MARGIN
2009  1   10796820,57  2662369,19  8134451,38
2009  2   2472271,53   2066312,34  405959,19
2008  1   9641213,19   1223606,68  8417606,51
2008  2   3436363,86   2730035,19  706328,67
I want to know how the sales, purchase, margin... have evolved and compare one year to the previous one.
In short I want an SQL result with the evolutions pre-computed like this :
YEAR  ID  SALES        SALES_EVOLUTION  PURCHASE    PURCHASE_EVOLUTION  MARGIN      MARGIN_EVOLUTION
2009  1   10796820,57  11,99            2662369,19  117,58              8134451,38  -3,36
2009  2   2472271,53   -28,06           2066312,34  -24,31              405959,19   -42,53
2008  1   9641213,19                    1223606,68                      8417606,51
2008  2   3436363,86                    2730035,19                      706328,67
I could do some ugly stuff :
SELECT *, YEAR, ID, SALES , (SALES/(SELECT SALES FROM TABLE WHERE YEAR = OUTER_TABLE.YEAR-1 AND ID = OUTER_TABLE.ID) -1)*100 as SALES_EVOLUTION (...)
FROM TABLE as OUTER_TABLE
ORDER BY YEAR DESC, ID ASC
But I have around 20 fields for which I would have to write a nested query, meaning I would end up with a huge and ugly query.
Is there a better way to do this, with less SQL?
Using SQL Server (though this should work in almost any SQL dialect) with the table provided, you can use a LEFT JOIN:
DECLARE @Table TABLE(
    [YEAR] INT,
    ID INT,
    SALES FLOAT,
    PURCHASE FLOAT,
    MARGIN FLOAT
)
INSERT INTO @Table ([YEAR],ID,SALES,PURCHASE,MARGIN) SELECT 2009,1,10796820.57,2662369.19,8134451.38
INSERT INTO @Table ([YEAR],ID,SALES,PURCHASE,MARGIN) SELECT 2009,2,2472271.53,2066312.34,405959.19
INSERT INTO @Table ([YEAR],ID,SALES,PURCHASE,MARGIN) SELECT 2008,1,9641213.19,1223606.68,8417606.51
INSERT INTO @Table ([YEAR],ID,SALES,PURCHASE,MARGIN) SELECT 2008,2,3436363.86,2730035.19,706328.67
SELECT cur.*,
    ((cur.Sales / prev.SALES) - 1) * 100
FROM @Table cur LEFT JOIN
    @Table prev ON cur.ID = prev.ID AND cur.[YEAR] - 1 = prev.[YEAR]
The LEFT JOIN will allow you to still see values from 2008, where an INNER JOIN would not.
Old skool solution:
SELECT c.YEAR, c.ID, c.SALES, c.PURCHASE, c.MARGIN
, p.YEAR, p.ID, p.SALES, p.PURCHASE, p.MARGIN
FROM tab AS c -- current
INNER JOIN tab AS p -- previous
ON c.year = p.year + 1
AND c.id = p.id
If you have a db with analytical functions (MS SQL, Oracle) you can use the LEAD or LAG analytical functions, see http://www.oracle-base.com/articles/misc/LagLeadAnalyticFunctions.php
I think this would be the correct application:
SELECT c.YEAR, c.ID, c.SALES, c.PURCHASE, c.MARGIN
, LAG(c.YEAR, 1, 0) OVER (PARTITION BY c.ID ORDER BY c.YEAR)
, LAG(c.ID, 1, 0) OVER (PARTITION BY c.ID ORDER BY c.YEAR)
, LAG(c.SALES, 1, 0) OVER (PARTITION BY c.ID ORDER BY c.YEAR)
, LAG(c.PURCHASE, 1, 0) OVER (PARTITION BY c.ID ORDER BY c.YEAR)
, LAG(c.MARGIN, 1, 0) OVER (PARTITION BY c.ID ORDER BY c.YEAR)
FROM tab AS c -- current
(not really sure, haven't played with this enough)
You can do it like this:
SELECT t1.*, t1.YEAR, t1.ID, t1.SALES , ((t1.sales/t2.sales) -1) * 100 as SALES_EVOLUTION
(...)
FROM Table t1 JOIN Table t2 ON t1.Year = (t2.Year + 1) AND t1.Id = t2.Id
ORDER BY t1.YEAR DESC, t1.ID ASC
Now, if you want to compare more years, you'd have to do more joins, so it is a slightly ugly solution.
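With around 20 measures, the repetitive part of the self-join query can be generated instead of typed. The sketch below is one possible way to do that, assuming the query is assembled in application code: it runs on SQLite from Python, the table name yearly is hypothetical, and the list of measures would be extended to cover the real columns. The LEFT JOIN keeps the 2008 rows, whose evolution columns come back as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the rows are the excerpt from the question.
conn.execute(
    "CREATE TABLE yearly (YEAR INTEGER, ID INTEGER, SALES REAL, PURCHASE REAL, MARGIN REAL)"
)
conn.executemany("INSERT INTO yearly VALUES (?, ?, ?, ?, ?)", [
    (2009, 1, 10796820.57, 2662369.19, 8134451.38),
    (2009, 2, 2472271.53, 2066312.34, 405959.19),
    (2008, 1, 9641213.19, 1223606.68, 8417606.51),
    (2008, 2, 3436363.86, 2730035.19, 706328.67),
])

# One "value, evolution in percent" pair per measure; add further column names here.
measures = ["SALES", "PURCHASE", "MARGIN"]
evolution_columns = ", ".join(
    f"cur.{m}, (cur.{m} / prev.{m} - 1) * 100 AS {m}_EVOLUTION" for m in measures
)

query = f"""
SELECT cur.YEAR, cur.ID, {evolution_columns}
FROM yearly AS cur
LEFT JOIN yearly AS prev
  ON prev.ID = cur.ID AND prev.YEAR = cur.YEAR - 1
ORDER BY cur.YEAR DESC, cur.ID ASC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each row has the layout (YEAR, ID, SALES, SALES_EVOLUTION, PURCHASE, PURCHASE_EVOLUTION, MARGIN, MARGIN_EVOLUTION), matching the desired output in the question. The column names are interpolated directly into the SQL, so they must come from a trusted, hard-coded list.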