Subtracting rows depending on values of another column - sql

I have a Purchases table and want to subtract purchase dates per Customer ID. Customer IDs repeat, so I want to subtract the purchase dates of customer 105 from each other, likewise for 108, and so on.
I have the following code, but it subtracts each purchase date from the next purchase date, regardless of the pairing I want:
SELECT DATEDIFF(DAY,P1.PURCHASEDATE,P2.PURCHASEDATE) AS "diff in days since last purchase"
FROM Purchases P1
JOIN Purchases P2
ON P1.CustomerID= P2.CustomerID

Try adding a not-equal condition to your ON clause, P1.PURCHASEID <> P2.PURCHASEID, like this:
SELECT DATEDIFF(DAY,P1.PURCHASEDATE,P2.PURCHASEDATE) AS "diff in days"
FROM Purchases P1
JOIN Purchases P2
ON P1.CustomerID = P2.CustomerID AND P1.PURCHASEID <> P2.PURCHASEID

You can use OUTER APPLY:
;WITH Purchases AS (
SELECT *
FROM (VALUES
(1,'2012-08-15',1,105,'a510'),
(2,'2012-08-15',2,102,'a510'),
(3,'2012-08-15',3,103,'a506'),
(4,'2012-08-16',1,105,'a510'),
(5,'2012-08-17',5,106,'a507'),
(6,'2012-08-17',5,107,'a509'),
(7,'2012-08-18',4,108,'a502'),
(8,'2012-08-19',2,108,'a510'),
(9,'2012-08-19',3,109,'a502'),
(10,'2012-08-20',3,110,'a503')
) as t(PurchaseID,PurchaseDate,Qty,CustomerID,ProductID)
)
SELECT p1.*,
DATEDIFF(DAY,P2.PurchaseDate,P1.PurchaseDate) as ddiff
FROM Purchases p1
OUTER APPLY (
SELECT TOP 1 *
FROM Purchases
WHERE p1.CustomerID = CustomerID
AND PurchaseDate < p1.PurchaseDate
ORDER BY PurchaseDate DESC
) p2
Will output:
PurchaseID PurchaseDate Qty CustomerID ProductID ddiff
1 2012-08-15 1 105 a510 NULL
2 2012-08-15 2 102 a510 NULL
3 2012-08-15 3 103 a506 NULL
4 2012-08-16 1 105 a510 1
5 2012-08-17 5 106 a507 NULL
6 2012-08-17 5 107 a509 NULL
7 2012-08-18 4 108 a502 NULL
8 2012-08-19 2 108 a510 1
9 2012-08-19 3 109 a502 NULL
10 2012-08-20 3 110 a503 NULL
Also you can use LAG (SQL Server 2012 and up):
SELECT *,
DATEDIFF(DAY,LAG(PurchaseDate,1,NULL) OVER (PARTITION BY CustomerID ORDER BY PurchaseDate),PurchaseDate) as ddiff
FROM Purchases
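For anyone who wants to try the LAG approach without a SQL Server instance, here is a minimal sketch in Python using the built-in sqlite3 module. It is a stand-in under stated assumptions, not the T-SQL above: SQLite has no DATEDIFF, so the day difference is computed with julianday(), and window functions need SQLite 3.25 or newer. Only a subset of the sample rows and columns is loaded.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Purchases (PurchaseID INTEGER, PurchaseDate TEXT, CustomerID INTEGER)"
)
conn.executemany(
    "INSERT INTO Purchases VALUES (?, ?, ?)",
    [
        (1, "2012-08-15", 105),
        (4, "2012-08-16", 105),
        (7, "2012-08-18", 108),
        (8, "2012-08-19", 108),
        (9, "2012-08-19", 109),
    ],
)

# LAG looks at the previous purchase of the same customer;
# julianday() differences replace DATEDIFF(DAY, ...).
rows = conn.execute(
    """
    SELECT PurchaseID,
           CAST(julianday(PurchaseDate)
                - julianday(LAG(PurchaseDate) OVER (
                      PARTITION BY CustomerID ORDER BY PurchaseDate))
                AS INTEGER) AS ddiff
    FROM Purchases
    ORDER BY PurchaseID
    """
).fetchall()

for row in rows:
    print(row)
```

The first purchase of each customer has no previous row in its partition, so its ddiff is NULL (None in Python), matching the OUTER APPLY output above.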

Related

Joining multiple tables and getting MAX value in subquery PostgreSQL

I have 4 Tables in PostgreSQL with the following structure as you can see below:
"Customers"
ID | NAME
101 Max
102 Peter
103 Alex
"orders"
ID | customer_id | CREATED_AT
1 101 2022-05-12
2 101 2022-06-14
3 101 2022-07-9
4 102 2022-02-14
5 102 2022-06-18
6 103 2022-05-22
"orderEntry"
ID | order_id | product_id |
1 3 10
2 3 20
3 3 30
4 5 20
5 5 40
6 6 20
"product"
ID | min_duration
10 P10D
20 P20D
30 P30D
40 P40D
50 P50D
First, I need to select the order with the max(created_at) date for each customer. This query does that (it works!):
SELECT c.id as customerId,
o.id as orderId,
o.created_at
FROM Customer c
INNER JOIN Orders o
ON c.id = o.customer_id
INNER JOIN
(
SELECT customer_id, MAX(created_at) Max_Date
FROM Orders
GROUP BY customer_id
) res ON o.customer_id = res.customer_id AND
o.created_at = res.Max_date
the result will look like this:
customer_id | order_id | CREATED_AT
101 3 2022-07-9
102 5 2022-06-18
103 6 2022-05-22
Secondly, for each order_id in the "orderEntry" table, I need to select the product with the max(min_duration). The result should be:
order_id | max(min_duration)
3 P30D
5 P40D
6 P20D
Then I need to join the results of queries 1) and 2) on "order_id". The final result I'm trying to get should look like this:
customer_name | customer_id | Order_ID | Order_CREATED_AT | Max_Duration
Max 101 3 2022-07-9 P30D
Peter 102 5 2022-06-18 P40D
Alex 103 6 2022-05-22 P20D
I'm struggling to write the query for 2) and then join it with the query from 1) to get this result. I would appreciate any help!
You could turn the first query into a CTE and join the remaining tables to it, like this:
WITH CTE AS ( SELECT c.id as customerId,
o.id as orderId,
o.created_at
FROM Customer c
INNER JOIN Orders o
ON c.id = o.customer_id
INNER JOIN
(
SELECT customer_id, MAX(created_at) Max_Date
FROM Orders
GROUP BY customer_id
) res ON o.customer_id = res.customer_id AND
o.created_at = res.Max_date)
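SELECT customerId, orderId, created_at, pr.min_duration
FROM CTE
-- MAX(product_id) matches the longest min_duration only because durations grow with product ID in the sample data
JOIN (SELECT order_id, MAX(product_id) AS product_id FROM "orderEntry" GROUP BY order_id) oe ON CTE.orderId = oe.order_id
JOIN "product" pr ON oe.product_id = pr.id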
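Picking MAX(product_id) finds the longest duration only when durations happen to grow with the product ID, as they do in the sample data. A hedged sketch of an alternative that compares the durations themselves is below, run through Python's sqlite3 module so it can be tried locally. The snake_case table names, the zero-padded date 2022-07-09, and the assumption that every min_duration has the form PnD are mine, not from the question. In PostgreSQL itself, casting min_duration to interval is probably the more natural way to compare.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, created_at TEXT);
    CREATE TABLE order_entry (id INTEGER, order_id INTEGER, product_id INTEGER);
    CREATE TABLE product (id INTEGER, min_duration TEXT);
    INSERT INTO customers VALUES (101,'Max'),(102,'Peter'),(103,'Alex');
    INSERT INTO orders VALUES
        (1,101,'2022-05-12'),(2,101,'2022-06-14'),(3,101,'2022-07-09'),
        (4,102,'2022-02-14'),(5,102,'2022-06-18'),(6,103,'2022-05-22');
    INSERT INTO order_entry VALUES (1,3,10),(2,3,20),(3,3,30),(4,5,20),(5,5,40),(6,6,20);
    INSERT INTO product VALUES (10,'P10D'),(20,'P20D'),(30,'P30D'),(40,'P40D'),(50,'P50D');
    """
)

rows = conn.execute(
    """
    WITH latest AS (   -- step 1: newest order per customer
        SELECT o.id AS order_id, o.customer_id, o.created_at
        FROM orders o
        JOIN (SELECT customer_id, MAX(created_at) AS max_date
              FROM orders GROUP BY customer_id) m
          ON m.customer_id = o.customer_id AND m.max_date = o.created_at
    ),
    longest AS (       -- step 2: longest min_duration per order, in days
        SELECT oe.order_id,
               -- strips the leading 'P' and trailing 'D' (assumes the PnD form)
               MAX(CAST(substr(p.min_duration, 2, length(p.min_duration) - 2)
                        AS INTEGER)) AS days
        FROM order_entry oe
        JOIN product p ON p.id = oe.product_id
        GROUP BY oe.order_id
    )
    SELECT c.name, c.id, l.order_id, l.created_at,
           'P' || g.days || 'D' AS max_duration
    FROM latest l
    JOIN customers c ON c.id = l.customer_id
    JOIN longest g ON g.order_id = l.order_id
    ORDER BY c.id
    """
).fetchall()

for row in rows:
    print(row)
```

Comparing the parsed day counts rather than the raw strings avoids a lexicographic trap: as text, 'P9D' would sort above 'P10D'.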

How to use a subselect in a LEFT JOIN ON clause?

I have a table t with
ORD_DATE | ORD_ID | ORD_REF | ORD_TYPE1 | ORD_TYPE2 | PRODNUM | PRODQUAL | PRICE
2020-09-01 | 101 | 101 | ORDER | ORDER | 456 | F | 555
2020-09-02 | 102 | 101 | CONF | ORDER | 456 | F | 555
2020-11-30 | 103 | 102 | ORDER | ORDER | 123 | K | 444
2020-12-01 | 104 | 102 | CONF | ORDER | 123 | K | 444
2020-12-01 | 105 | 103 | ORDER | ORDER | 123 | K | 444
2020-12-01 | 106 | 104 | ORDER | ORDER | 123 | K | 333
2020-12-02 | 107 | 104 | CONF | ORDER | 123 | K | 333
2020-12-08 | 108 | 104 | CONF | RETURN | 123 | K | -333
2020-12-01 | 109 | 105 | ORDER | ORDER | 123 | F | 222
2020-12-02 | 110 | 105 | CONF | ORDER | 123 | F | 222
and a table s with:
ORD_DATE | PROD_NUMBER | PROD_QUAL
2020-12-01-00.00.00.000000 | 123 | K
2020-12-01-00.00.00.000000 | 123 | L
Table t holds all sales per day.
A sale has two stages: first the order is generated when the customer buys something ("ORDER"/"ORDER"). It is then confirmed, normally the next day or within the next few days ("CONF"/"ORDER"). If a customer sends the product back, it is a return ("CONF"/"RETURN").
Table s holds the products that are "second hand".
If a product is in that table, all sales from table t with
ORDER_TYPE_1 = "ORDER"
AND ORDER_TYPE_2 = "ORDER"
AND t.ORD_DATE >= s.ORD_DATE
AND t.PROD_NUMBER = s.PROD_NUMBER
AND t.PROD_QUAL = s.PROD_QUAL
count as "second hand".
I need the sum of all "second hand" sales that are confirmed from the year 2021 and month 12. But only rows with CONF/ORDER or CONF/RETURN should be in the calculation. I have CAL_YEAR and CAL_MONTH in table t for that (omitted for less clutter).
From table t only ORD_REF 104 matches, and the sum would be 0 because only these 2 rows matter:
| 2020-12-02 | 107 | 104 | CONF | ORDER | 123 | K | 333
| 2020-12-08 | 108 | 104 | CONF | RETURN | 123 | K | -333
My code so far:
SELECT SUM(PRICE)
FROM t
--
LEFT JOIN s
ON t.PRODNUM = s.PRODNUM
AND t.PRODQUAL = s.PRODQUAL
AND (SELECT ORD_DATE FROM t WHERE ORDER_TYPE_1 = 'ORDER' AND ORDER_TYPE_2 = 'ORDER') >= s.ORD_DATE
--
WHERE CAL_YEAR = 2021
AND CAL_MONTH = 12
AND ORDER_TYPE_1 = 'CONF'
AND ORDER_TYPE_2 IN ('ORDER', 'RETURN')
--
GROUP BY PRICE
;
SQL-Error: "single-row subquery returns more than one row"
My problem is limiting the LEFT JOIN to ORDER/ORDER rows (so that ORD_REF 104 is in) while using only CONF/ORDER and CONF/RETURN rows for the sum (so that ORD_REF 102 is out).
Can anyone help?
The simplest way I can think of would be to do a self-join, where you join a second copy of table t aliased t2 to use for the CONF/ORDER and CONF/RETURN rows, while you use t for the ORDER/ORDER rows.
SELECT SUM(t2.PRICE)
FROM t
--
INNER JOIN t t2
ON t2.ORD_REF = t.ORD_REF
AND t2.ORDER_TYPE_1 = 'CONF'
AND t2.ORDER_TYPE_2 IN ('ORDER', 'RETURN')
--
-- INNER JOIN keeps only sales that match a "second hand" product in s
INNER JOIN s
ON t.PRODNUM = s.PRODNUM
AND t.PRODQUAL = s.PRODQUAL
AND t.ORD_DATE >= s.ORD_DATE
--
WHERE t.CAL_YEAR = 2021
AND t.CAL_MONTH = 12
AND t.ORDER_TYPE_1 = 'ORDER'
AND t.ORDER_TYPE_2 = 'ORDER'
;
If you need it to be more efficient, you could use analytic/window functions to pull the summed price from the CONF rows into the ORDER/ORDER row as a new column. This way it will only query table t once instead of twice.
SELECT SUM(t2.order_price_sum)
FROM (select t.*,
sum(case when ORDER_TYPE_1 = 'CONF'
AND ORDER_TYPE_2 IN ('ORDER', 'RETURN')
then t.price
else 0 end) over (partition by ord_ref) as order_price_sum
from t) t2
--
-- INNER JOIN keeps only sales that match a "second hand" product in s
INNER JOIN s
ON t2.PRODNUM = s.PRODNUM
AND t2.PRODQUAL = s.PRODQUAL
AND t2.ord_date >= s.ORD_DATE
--
WHERE CAL_YEAR = 2021
AND CAL_MONTH = 12
AND ORDER_TYPE_1 = 'ORDER'
AND ORDER_TYPE_2 = 'ORDER'
;

Getting latest price of different products from control table

I have a control table that tracks the price of each item number by date.
id ItemNo Price Date
---------------------------
1 a001 100 1/1/2003
2 a001 105 1/2/2003
3 a001 110 1/3/2003
4 b100 50 1/1/2003
5 b100 55 1/2/2003
6 b100 60 1/3/2003
7 c501 35 1/1/2003
8 c501 38 1/2/2003
9 c501 42 1/3/2003
10 a001 95 1/1/2004
This is the query I am running.
SELECT pr.*
FROM prices pr
INNER JOIN
(
SELECT ItemNo, max(date) max_date
FROM prices
GROUP BY ItemNo
) p ON pr.ItemNo = p.ItemNo AND
pr.date = p.max_date
order by ItemNo ASC
I am getting below values
id ItemNo Price Date
------------------------------
10 a001 95 2004-01-01
6 b100 60 2003-01-03
9 c501 42 2003-01-03
My question: is my query correct? It does return the result I want.
Your query does what you want, and is a valid approach to solve your problem.
An alternative option would be to use a correlated subquery for filtering:
select p.*
from prices p
where p.date = (select max(p1.date) from prices p1 where p1.itemno = p.itemno)
The upside of this query is that it can take advantage of an index on (itemno, date).
You can also use window functions:
select *
from (
select p.*, rank() over(partition by itemno order by date desc) rn
from prices p
) p
where rn = 1
I would recommend benchmarking the three options against your real data to assess which one performs better.
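As a starting point for that comparison, here is a small sketch that checks the correlated-subquery and window-function variants return the same rows on the sample data. It runs through Python's sqlite3 module, which is my assumption for portability (window functions need SQLite 3.25 or newer), and the dates are written in ISO form as in the result set above. It checks agreement only, not speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (id INTEGER, ItemNo TEXT, Price INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?)",
    [
        (1, "a001", 100, "2003-01-01"), (2, "a001", 105, "2003-01-02"),
        (3, "a001", 110, "2003-01-03"), (4, "b100", 50, "2003-01-01"),
        (5, "b100", 55, "2003-01-02"), (6, "b100", 60, "2003-01-03"),
        (7, "c501", 35, "2003-01-01"), (8, "c501", 38, "2003-01-02"),
        (9, "c501", 42, "2003-01-03"), (10, "a001", 95, "2004-01-01"),
    ],
)

# Variant 1: correlated subquery filtering on the latest date per item.
correlated = conn.execute(
    """
    SELECT p.id FROM prices p
    WHERE p.Date = (SELECT MAX(p1.Date) FROM prices p1 WHERE p1.ItemNo = p.ItemNo)
    ORDER BY p.ItemNo
    """
).fetchall()

# Variant 2: rank rows per item by date, newest first, and keep rank 1.
windowed = conn.execute(
    """
    SELECT id FROM (
        SELECT p.*, RANK() OVER (PARTITION BY ItemNo ORDER BY Date DESC) AS rn
        FROM prices p
    ) AS ranked
    WHERE rn = 1
    ORDER BY ItemNo
    """
).fetchall()

print(correlated == windowed, correlated)
```

If two rows of one item share the latest date, RANK keeps both, just like the join and the correlated subquery; ROW_NUMBER would keep exactly one of them.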

How to find the record which is not exists with some criteria in SQL Server?

I have two tables.
ItemRelation table having 30k records
ID ChildID1 ChildID2 ChildID3
------------------------------------------
9 null null null
49 43 50 //43 in childid1, don't want this record too
111 112 113 null
65 68 null null
222 221 223 224
79 null null null
5773 5834 5838 null
F_ItemDailySalesParent having millions of records
ItemID StoreId
-----------------
9 1001 //ItemID 9,41,5773 belongs to 1001 StoreID
41 1001
43 1400 //ItemID 43,45,65,5834 belongs to 1400 StoreID
45 1400
65 1400
68 2000 //ItemID 68,79 belongs to 2000 StoreID
79 2000
5773 1001
5834 1400
5838 2000
I want to show the IDs from the ItemRelation table, per store, whose item (or child items) are not present in F_ItemDailySalesParent for that store:
ItemID StoreID
-----------------
49 1001
111 1001
65 1001
222 1001
79 1001
9 1400
111 1400
222 1400
79 1400
9 2000
49 2000
111 2000
222 2000
5773 2000
I tried the following query. It works when StoreID is ignored, but I have no idea how to get the result above.
select ID from HQMatajer.dbo.ItemRelation ir
where not exists(
select ID,StoreID
from [HQWebMatajer].[dbo].[F_ItemDailySalesParent] Fid
where fid.ItemID=ir.ID
or fid.ItemID = ir.ChildID1
or Fid.ItemID=ir.ChildID2
or Fid.ItemID=ir.ChildID3
and time between '2017-01-01 00:00:00.000' and '2017-02-28 00:00:00.000'
group by ItemID,StoreID
)
Update
I have a table Hqmatajer.dbo.Store whose storeCode column corresponds to F_ItemDailySalesParent.StoreId.
Include a check that StoreId matches inside the not exists():
select ir.ID, s.storeCode
from HQMatajer.dbo.ItemRelation ir
cross join (select distinct storeCode from Hqmatajer.dbo.Store) s
where not exists(
select 1
from [HQWebMatajer].[dbo].[F_ItemDailySalesParent] Fid
where fid.StoreId = s.StoreCode
and [time] between '2017-01-01 00:00:00.000' and '2017-02-28 00:00:00.000'
and ( fid.ItemID=ir.ID
or fid.ItemID=ir.ChildID1
or Fid.ItemID=ir.ChildID2
or Fid.ItemID=ir.ChildID3
)
)
If I understand correctly, you want to start with a list of all stores and items and then filter out the ones that are present.
select i.id, s.storeId
from HQMatajer.dbo.ItemRelation i cross join
stores s -- assume this exists
where not exists (select 1
from [HQWebMatajer].[dbo].[F_ItemDailySalesParent] idsp
where idsp.ItemID = i.ID and idsp.storeId = s.storeId
) and
not exists (select 1
from [HQWebMatajer].[dbo].[F_ItemDailySalesParent] idsp
where idsp.ItemID = i.childID1 and idsp.storeId = s.storeId
) and
not exists (select 1
from [HQWebMatajer].[dbo].[F_ItemDailySalesParent] idsp
where idsp.ItemID = i.childID2 and idsp.storeId = s.storeId
) and
not exists (select 1
from [HQWebMatajer].[dbo].[F_ItemDailySalesParent] idsp
where idsp.ItemID = i.childID3 and idsp.storeId = s.storeId
);
I did not include the time condition. It is not in your sample data, so it is unclear where it fits.
First get a unique list of ItemIds and a unique list of StoreIDs. A left join to the cross-reference table, filtered on its id being null, then shows which combinations are missing. I'll do it in generic terms so you get the idea:
select s.StoreId, i.ItemId
from Stores s
cross apply Items i
left join ItemRelation ir
on s.StoreId = ir.StoreId
and i.ItemId = ir.ItemId
where ir.Id is null

Find Duplicates in a table

My table contains multiple lots (LOT_ID), each lot contains multiple products (PRODUCT_ID), and there are multiple orders (ORDER_ID) under each product. I would like to find the order IDs that are repeated across multiple products within a given lot.
S.NO LOT_ID Product_ID Order_ID
1 101 P108 90001
2 101 P109 90001
3 101 P110 80900
4 102 S189 10098
5 102 S234 10087
6 102 S465 10098
7 102 S342 10050
8 103 L109 20090
9 103 L110 20098
10 103 L111 20020
Desired result
S.NO LOT_ID Product_ID Order_ID
1 101 P108 90001
2 101 P109 90001
3 102 S189 10098
4 102 S465 10098
I think you should group by Order_ID first to get the result set. Check the query below; note that I haven't run it.
select LOT_ID, Product_ID, Order_ID
from <tableName>
where Order_ID IN (SELECT Order_ID FROM <tableName> where LOT_ID in (101,102)
GROUP BY Order_ID HAVING COUNT(*) > 1);
Count the repeats, then filter on the count you need:
select t.*, count(*) over (partition by t.LOT_ID, t.Product_ID, t.Order_ID) as c
, count(*) over (partition by t.LOT_ID, t.Order_ID) as c2
from t
The rows you want are those where c (rows per lot, product, and order) differs from c2 (rows per lot and order).
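A minimal sketch of the window-count idea on the sample data, run through Python's sqlite3 module (my choice for a self-contained test; window functions need SQLite 3.25 or newer). The table name t and the column name sno are placeholders, and the filter assumes each (lot, product, order) combination appears at most once, so more than one row per lot and order means more than one product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (sno INTEGER, lot_id INTEGER, product_id TEXT, order_id INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 101, "P108", 90001), (2, 101, "P109", 90001), (3, 101, "P110", 80900),
        (4, 102, "S189", 10098), (5, 102, "S234", 10087), (6, 102, "S465", 10098),
        (7, 102, "S342", 10050), (8, 103, "L109", 20090), (9, 103, "L110", 20098),
        (10, 103, "L111", 20020),
    ],
)

# Count rows sharing the same lot and order, then keep the repeated ones.
rows = conn.execute(
    """
    SELECT lot_id, product_id, order_id
    FROM (
        SELECT t.*,
               COUNT(*) OVER (PARTITION BY lot_id, order_id) AS rows_per_order
        FROM t
    ) AS counted
    WHERE rows_per_order > 1
    ORDER BY sno
    """
).fetchall()

for row in rows:
    print(row)
```

Partitioning by lot as well as order keeps an order ID that merely recurs in a different lot from being flagged.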