duplicates with condition - sql

I would like to get the number of duplicates for article_id for each merchant_id, where the zip_code is not identical. Please see example below:
Table
merchant_id article_id zip_code
1 4555 1000
1 4555 1003
1 4555 1002
1 3029 1000
2 7539 1005
2 7539 1005
2 7539 1002
2 1232 1006
3 5555 1000
3 5555 1001
3 5555 1002
3 5555 1003
Output Table
merchant_id count_duplicate
1 3
2 2
3 4
This is the query that I am currently using but I am struggling to include the zip_code condition:
SELECT merchant_id
,duplicate_count
FROM main_table mt
JOIN(select article_id, count(*) AS duplicate_count
from main_table
group by article_id
having count(article_id) >1) mt_1
ON mt_1.article_id = mt.article_id

If I understand correctly, you can use two levels of aggregation:
SELECT merchant_id, SUM(num_zips)
FROM (SELECT merchant_id, article_id, COUNT(DISTINCT zip_code) AS num_zips
FROM main_table
GROUP BY merchant_id, article_id
) ma
WHERE ma.num_zips > 1
GROUP BY merchant_id;
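To check the two-level aggregation against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the count_duplicate alias are assumptions for illustration; the table name and rows come from the question.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main_table (merchant_id INTEGER, article_id INTEGER, zip_code INTEGER)"
)
conn.executemany(
    "INSERT INTO main_table VALUES (?, ?, ?)",
    [
        (1, 4555, 1000), (1, 4555, 1003), (1, 4555, 1002), (1, 3029, 1000),
        (2, 7539, 1005), (2, 7539, 1005), (2, 7539, 1002), (2, 1232, 1006),
        (3, 5555, 1000), (3, 5555, 1001), (3, 5555, 1002), (3, 5555, 1003),
    ],
)

# Inner query: distinct zip codes per (merchant, article).
# Outer query: keep articles with more than one zip code and sum per merchant.
query = """
SELECT merchant_id, SUM(num_zips) AS count_duplicate
FROM (SELECT merchant_id, article_id, COUNT(DISTINCT zip_code) AS num_zips
      FROM main_table
      GROUP BY merchant_id, article_id
     ) ma
WHERE ma.num_zips > 1
GROUP BY merchant_id
ORDER BY merchant_id;
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 3), (2, 2), (3, 4)]
```

Merchant 2 comes out as 2 rather than 3 because COUNT(DISTINCT zip_code) counts the repeated zip code 1005 only once, which matches the expected output in the question.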

Related

Joins and/or Sub queries or Ranking functions

I have a table as follows:
Order_ID Ship_num Item_code Qty_to_pick Qty_picked Pick_date
1111 1 1 3000 0 Null
1111 1 2 2995 1965 2021-05-12
1111 2 1 3000 3000 2021-06-24
1111 2 2 1030 0 Null
1111 3 2 1030 1030 2021-08-23
2222 1 3 270 62 2021-03-18
2222 1 4 432 0 Null
2222 2 3 208 0 Null
2222 2 4 432 200 2021-05-21
2222 3 3 208 208 2021-08-23
2222 3 4 232 200 2021-08-25
From this table, I only want to show the rows that have the latest ship_num for each order, not the latest pick_date. (I was directed to a question about returning the rows with the latest entry time; that is not what I am looking for.) That is, I want the result to be as follows:
Order_ID Ship_num Item_code Qty_to_pick Qty_picked Pick_date
1111 3 2 1030 1030 2021-08-23
2222 3 3 208 208 2021-08-23
2222 3 4 232 200 2021-08-25
I tried the following query,
select order_id, max(ship_num), item_code, qty_to_pick, qty_picked, pick_date
from table1
group by order_id, item_code, qty_to_pick, qty_picked, pick_date
Any help would be appreciated.
Thanks in advance.
Using max(ship_num) is a good idea, but you should use the analytic version (with an OVER clause).
select *
from
(
select t.*, max(ship_num) over (partition by order_id) as orders_max_ship_num
from table1 t
) with_max
where ship_num = orders_max_ship_num
order by order_id, item_code;
You can get this using DENSE_RANK():
;with cte as (
select rnk = dense_rank()
over (Partition by order_id order by ship_num desc)
, *
from table_name
)
Select *
from cte
Where rnk =1;
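As a quick check that the MAX() OVER and DENSE_RANK() approaches pick the same rows, here is a small sketch using Python's sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later) and writes the rank column as `AS rnk`, since the `rnk = dense_rank()` alias form is SQL Server syntax. The table name table1 and the data are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (order_id INTEGER, ship_num INTEGER, item_code INTEGER, "
    "qty_to_pick INTEGER, qty_picked INTEGER, pick_date TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1111, 1, 1, 3000, 0, None), (1111, 1, 2, 2995, 1965, "2021-05-12"),
        (1111, 2, 1, 3000, 3000, "2021-06-24"), (1111, 2, 2, 1030, 0, None),
        (1111, 3, 2, 1030, 1030, "2021-08-23"), (2222, 1, 3, 270, 62, "2021-03-18"),
        (2222, 1, 4, 432, 0, None), (2222, 2, 3, 208, 0, None),
        (2222, 2, 4, 432, 200, "2021-05-21"), (2222, 3, 3, 208, 208, "2021-08-23"),
        (2222, 3, 4, 232, 200, "2021-08-25"),
    ],
)

cols = "order_id, ship_num, item_code, qty_to_pick, qty_picked, pick_date"

# Approach 1: compare each row's ship_num with the maximum for its order.
max_query = f"""
SELECT {cols}
FROM (SELECT t.*, MAX(ship_num) OVER (PARTITION BY order_id) AS orders_max_ship_num
      FROM table1 t) with_max
WHERE ship_num = orders_max_ship_num
ORDER BY order_id, item_code;
"""

# Approach 2: rank shipments per order, highest ship_num first, and keep rank 1.
rank_query = f"""
WITH cte AS (
    SELECT t.*, DENSE_RANK() OVER (PARTITION BY order_id ORDER BY ship_num DESC) AS rnk
    FROM table1 t
)
SELECT {cols}
FROM cte
WHERE rnk = 1
ORDER BY order_id, item_code;
"""

max_rows = conn.execute(max_query).fetchall()
rank_rows = conn.execute(rank_query).fetchall()
print(max_rows)
print(max_rows == rank_rows)
```

Both queries keep every row of the latest shipment, so an order with several items in its last shipment (order 2222 here) returns several rows.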

Aggregation and joining 2 tables or Sub Queries

I have the following tables.
Order_table
Order_ID Item_ID Qty_shipped
1111 11 4
1111 22 6
1111 33 6
1111 44 6
Shipping_det
Order_ID Ship_num Ship_cost
1111 1 16.84
1111 2 16.60
1111 3 16.60
I want my output to be as follows,
Order_ID Qty_shipped Ship_cost
1111 22 50.04
I wrote the following query,
select sum(O.qty_shipped) as Qty_shipped, sum(S.Ship_cost) as Total_cost
from Order_table O
join shipping_det S on O.Order_ID = S.Order_ID
and I got my output as
Qty_shipped Total_cost
66 200.16
As I understand it, because I joined the two tables, Qty_shipped was multiplied by 3 (one copy per shipping row) and Total_cost was multiplied by 4 (one copy per order row).
Any help would be appreciated.
Thanks in advance.
You need to aggregate before joining. Alternatively, union the tables together and then aggregate:
select order_id, sum(qty_shipped), sum(ship_cost)
from ((select order_id, qty_shipped, 0 as ship_cost
from order_table
) union all
(select order_id, 0, ship_cost
from shipping_det
)
) os
group by order_id;
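The first option mentioned above, aggregating before joining, is not shown in the answer. A minimal sketch of it, run through Python's sqlite3 module on the sample data, could look like this; the aliases o and s are chosen here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_table (order_id INTEGER, item_id INTEGER, qty_shipped INTEGER)")
conn.execute("CREATE TABLE shipping_det (order_id INTEGER, ship_num INTEGER, ship_cost REAL)")
conn.executemany(
    "INSERT INTO order_table VALUES (?, ?, ?)",
    [(1111, 11, 4), (1111, 22, 6), (1111, 33, 6), (1111, 44, 6)],
)
conn.executemany(
    "INSERT INTO shipping_det VALUES (?, ?, ?)",
    [(1111, 1, 16.84), (1111, 2, 16.60), (1111, 3, 16.60)],
)

# Collapse each table to one row per order first, so the join is one-to-one
# and neither sum is inflated by the other table's row count.
query = """
SELECT o.order_id, o.qty_shipped, s.ship_cost
FROM (SELECT order_id, SUM(qty_shipped) AS qty_shipped
      FROM order_table
      GROUP BY order_id) o
JOIN (SELECT order_id, SUM(ship_cost) AS ship_cost
      FROM shipping_det
      GROUP BY order_id) s
  ON s.order_id = o.order_id;
"""
rows = conn.execute(query).fetchall()
order_id, qty_shipped, ship_cost = rows[0]
print(order_id, qty_shipped, round(ship_cost, 2))
```

One difference worth noting: the inner join drops orders that appear in only one of the two tables, while the UNION ALL version keeps them with a zero in the missing column.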

Duplicates with condition (SQL)

I would like to get the number of duplicates for article_id for each merchant_id, where the zip_code is identical. Please see example below:
Table
merchant_id article_id zip_code
1 4555 1000
1 4555 1003
1 4555 1000
1 3029 1000
2 7539 1005
2 7539 1005
2 7539 1002
2 1232 1006
3 5555 1000
3 5555 1001
3 5555 1001
3 5555 1001
Output Table
merchant_id count_duplicate zip_code
1 2 1000
2 2 1005
3 3 1001
This is the query that I am currently using but I am struggling to include the zip_code condition:
SELECT merchant_id
,duplicate_count
FROM main_table mt
JOIN(select article_id, count(*) AS duplicate_count
from main_table
group by article_id
having count(article_id) =1) mt_1
ON mt_1.article_id = mt.article_id
This seems to return what you want. I'm not sure why article_id is not included in the result set:
select merchant_id, zip_code, count(*)
from main_table
group by merchant_id, article_id, zip_code
having count(*) > 1
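Here is a small sketch that runs this query on the sample data with Python's sqlite3 module. As an assumption for readability, it also selects article_id and names the count count_duplicate, neither of which is in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main_table (merchant_id INTEGER, article_id INTEGER, zip_code INTEGER)"
)
conn.executemany(
    "INSERT INTO main_table VALUES (?, ?, ?)",
    [
        (1, 4555, 1000), (1, 4555, 1003), (1, 4555, 1000), (1, 3029, 1000),
        (2, 7539, 1005), (2, 7539, 1005), (2, 7539, 1002), (2, 1232, 1006),
        (3, 5555, 1000), (3, 5555, 1001), (3, 5555, 1001), (3, 5555, 1001),
    ],
)

# One group per (merchant, article, zip code); keep groups that occur more than once.
query = """
SELECT merchant_id, article_id, zip_code, COUNT(*) AS count_duplicate
FROM main_table
GROUP BY merchant_id, article_id, zip_code
HAVING COUNT(*) > 1
ORDER BY merchant_id;
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 4555, 1000, 2), (2, 7539, 1005, 2), (3, 5555, 1001, 3)]
```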

Find Duplicates in a table

My table contains multiple lots (LOT_ID), each lot contains multiple products (PRODUCT_ID), and there are multiple orders (ORDER_ID) under each product. I would like to find the order IDs that are repeated across multiple products within a given lot.
S.NO LOT_ID Product_ID Order_ID
1 101 P108 90001
2 101 P109 90001
3 101 P110 80900
4 102 S189 10098
5 102 S234 10087
6 102 S465 10098
7 102 S342 10050
8 103 L109 20090
9 103 L110 20098
10 103 L111 20020
Desired result
S.NO LOT_ID Product_ID Order_ID
1 101 P108 90001
2 101 P109 90001
3 102 S189 10098
4 102 S465 10098
I think you should group by Order_ID first to find the repeated orders, and then select the matching rows. I haven't run this query:
select LOT_ID, Product_ID, Order_ID
from <tableName>
where Order_ID IN (SELECT Order_ID FROM <tableName> where LOT_ID in (101,102)
GROUP BY Order_ID HAVING COUNT(*) > 1);
Count the repeats with window functions, then filter on the counts you need:
select t.*, count(*) over (partition by t.LOT_ID, t.Product_ID, t.Order_ID) as c
, count(*) over (partition by t.LOT_ID, t.Order_ID) as c2
from t
The rows you want are those where c (the number of rows per lot, product, and order) is not equal to c2 (the number of rows per lot and order).
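The window-count idea can be turned into a filter by wrapping it in a subquery and comparing the two counts. This is a sketch under a few assumptions: it runs on SQLite 3.25 or later via Python's sqlite3 module, the S.NO column is renamed s_no so it is a plain identifier, and the table is called t as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (s_no INTEGER, lot_id INTEGER, product_id TEXT, order_id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 101, "P108", 90001), (2, 101, "P109", 90001), (3, 101, "P110", 80900),
        (4, 102, "S189", 10098), (5, 102, "S234", 10087), (6, 102, "S465", 10098),
        (7, 102, "S342", 10050), (8, 103, "L109", 20090), (9, 103, "L110", 20098),
        (10, 103, "L111", 20020),
    ],
)

# c  = rows sharing the same lot, product, and order
# c2 = rows sharing the same lot and order
# c <> c2 means the order also appears under another product in the same lot.
query = """
SELECT s_no, lot_id, product_id, order_id
FROM (SELECT t.*,
             COUNT(*) OVER (PARTITION BY lot_id, product_id, order_id) AS c,
             COUNT(*) OVER (PARTITION BY lot_id, order_id) AS c2
      FROM t) counted
WHERE c <> c2
ORDER BY s_no;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this keeps the rows numbered 1, 2, 4, and 6 in the original table, which are the four rows of the desired result.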

SQL Count/sum multiple columns

I want to count and sum multiple fields in a single query. Sample data and the desired result are listed below:
MemID claimNum ItemID PaidAmt
123 1234 4 5
123 2309 4 5
123 1209 4 5
123 1209 8 2.2
123 1210 8 2.2
Desired result
MemID count(claimNum) count(ItemID) sum(PaidAmt)
123 3 3 15
123 2 2 4.4
It looks like you want to group by both MemID and ItemID:
select MemID, count(claimNum), count(ItemID), sum(PaidAmt)
from the_table
group by MemID, ItemID
Use GROUP BY on ItemID (together with MemID):
select MemID, count(claimNum), count(ItemID), sum(PaidAmt)
from my_table
group by MemID, ItemID
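To confirm that grouping by both MemID and ItemID gives the desired result, here is a minimal sketch on the sample data using Python's sqlite3 module. The column aliases and the ORDER BY are additions for illustration; the table name my_table comes from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (MemID INTEGER, claimNum INTEGER, ItemID INTEGER, PaidAmt REAL)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (123, 1234, 4, 5), (123, 2309, 4, 5), (123, 1209, 4, 5),
        (123, 1209, 8, 2.2), (123, 1210, 8, 2.2),
    ],
)

# One output row per (MemID, ItemID) pair, with counts and the paid total.
query = """
SELECT MemID,
       COUNT(claimNum) AS claim_count,
       COUNT(ItemID)   AS item_count,
       SUM(PaidAmt)    AS paid_total
FROM my_table
GROUP BY MemID, ItemID
ORDER BY ItemID;
"""
rows = conn.execute(query).fetchall()
for mem_id, claim_count, item_count, paid_total in rows:
    print(mem_id, claim_count, item_count, round(paid_total, 2))
```

Within each group, COUNT(claimNum) and COUNT(ItemID) both count non-null rows, so they are always equal here; COUNT(DISTINCT claimNum) would be the variant to use if distinct claims were wanted instead.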