Select duplicates - sql

I have a table with 3 columns like the sample below.
All rows have a unique productid but there are duplicates of customerid and productname together. I want to select only 1 record of each duplicate into a new table with all 3 columns. So from the rows below I want row 1 and 3 into the new table.
productid(guid) customerid productname
4362C96D-B413-EA11-A811-000D3A25C7C2 12345678910 credit
C7EC397D-04BF-E611-80EE-005056A027F8 12345678910 credit
F796026C-B413-EA11-A811-000D3A25C942 24681012141 leasing
7490976F-B413-EA11-A811-000D3A25C7C6 24681012141 leasing
I use this SQL to select all the duplicate rows into a new table:
SELECT p.productid, p2.customerid, p2.productname
INTO tempTable
FROM products AS p
JOIN (SELECT customerid, productname
FROM products
GROUP BY customerid, productname
HAVING COUNT(productname)>1) AS p2
ON p.customerid = p2.customerid AND p.productname= p2.productname
ORDER BY p.customerid, p.productname
This SQL works without the productid, but it won't find the duplicates if I add productid, which is unique per row.
SELECT customerid, productname
FROM testtable
GROUP BY customerid, productname
HAVING COUNT(productname) > 1
ORDER BY customerid
| 12345678910 | credit |
| 24681012141 | leasing |
How can I query this data to select only 1 of each duplicate row?

CREATE TABLE MyTable (productid varchar(255),customerid bigint, productname varchar(50))
INSERT INTO MyTable (productid,customerid,productname) VALUES
('4362C96D-B413-EA11-A811-000D3A25C7C2',12345678910,'credit'),
('C7EC397D-04BF-E611-80EE-005056A027F8',12345678910,'credit'),
('F796026C-B413-EA11-A811-000D3A25C942',24681012141,'leasing'),
('7490976F-B413-EA11-A811-000D3A25C7C6',24681012141,'leasing');
WITH CTE AS (
SELECT
productid,
customerid,
productname,
ROW_NUMBER() OVER (PARTITION BY customerid, productname ORDER BY productid) AS rn
FROM MyTable
)
SELECT customerid,
productname FROM CTE WHERE rn =1
GO
customerid | productname
----------: | :----------
12345678910 | credit
24681012141 | leasing
SELECT customerid, productname
FROM MyTable
GROUP BY customerid, productname
HAVING COUNT(*) > 1
ORDER BY customerid
GO
customerid | productname
----------: | :----------
12345678910 | credit
24681012141 | leasing

You can add a ROW_NUMBER() windowing function to your result set to distinguish between the GUID values.
SELECT
productid,
customerid,
productname
INTO tempTable
FROM
(
SELECT
productid,
customerid,
productname,
ROW_NUMBER() OVER (PARTITION BY customerid, productname ORDER BY productid) AS rn
FROM testtable
) AS d
WHERE d.rn = 1
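For anyone who wants to try the ROW_NUMBER() idea end to end, here is a minimal sketch using Python's built-in sqlite3 module instead of SQL Server. That choice is an assumption made only to keep the example self-contained: SQLite uses CREATE TABLE ... AS where SQL Server uses SELECT ... INTO, and it needs SQLite 3.25 or newer for window functions. The extra COUNT(*) OVER column (cnt) is my own addition, so that only groups that really are duplicated get copied.

```python
import sqlite3

# In-memory database holding the four sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (productid TEXT, customerid INTEGER, productname TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("4362C96D-B413-EA11-A811-000D3A25C7C2", 12345678910, "credit"),
        ("C7EC397D-04BF-E611-80EE-005056A027F8", 12345678910, "credit"),
        ("F796026C-B413-EA11-A811-000D3A25C942", 24681012141, "leasing"),
        ("7490976F-B413-EA11-A811-000D3A25C7C6", 24681012141, "leasing"),
    ],
)

# Number the rows inside each (customerid, productname) group and count the
# group size, then keep row 1 of every group that has more than one member.
conn.execute(
    """
    CREATE TABLE tempTable AS
    SELECT productid, customerid, productname
    FROM (
        SELECT productid, customerid, productname,
               ROW_NUMBER() OVER (PARTITION BY customerid, productname
                                  ORDER BY productid) AS rn,
               COUNT(*) OVER (PARTITION BY customerid, productname) AS cnt
        FROM products
    )
    WHERE rn = 1 AND cnt > 1
    """
)

kept = conn.execute(
    "SELECT productid, customerid, productname FROM tempTable ORDER BY customerid"
).fetchall()
for row in kept:
    print(row)
```

Which row of each pair survives is decided by the ORDER BY inside the window. Here it is the smallest productid compared as text, so for "leasing" this keeps the 7490... row rather than the first row listed in the question. If a particular row matters, order by whatever column expresses that preference.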


Oracle SQL - Get records with latest date on non-unique data

Following is the example of records in the table -
ITEM_NAME STORAGE_CODE STOCK DATE
ABC 2233 170 27/09/2017
ABC 2233 270 15/09/2017
DEF 2233 120 23/09/2017
DEF 2233 110 11/09/2017
GHI 2233 50 15/09/2017
Expected result:
ITEM_NAME STORAGE_CODE STOCK DATE
ABC 2233 170 27/09/2017
DEF 2233 120 23/09/2017
GHI 2233 50 15/09/2017
I've tried using the below query:
Select ITEM_NAME, STORAGE_CODE, STOCK, MAX(DATE)
FROM ITEM_TABLE
WHERE ITEM_NAME IN ('ABC','DEF','GHI' .........)
GROUP BY ITEM_NAME, STORAGE_CODE, STOCK
This didn't work as the stock value is not unique.
Please note: I'm using ITEM_NAME IN (), because I need the output for some specific items.
You can add a select top and order by to your query if you just want to get the max date, like this:
SELECT TOP 1
ITEM_NAME,
STORAGE_CODE,
STOCK,
MAX(DATE)
FROM ITEM_TABLE
WHERE ITEM_NAME IN('ABC', 'DEF', 'GHI')
GROUP BY ITEM_NAME,
STORAGE_CODE,
STOCK
ORDER BY DATE DESC
EDIT: Using ROW_NUMBER()
SELECT T.ITEM_NAME,
T.STORAGE_CODE,
T.STOCK,
T.DATE
FROM
(
SELECT ITEM_NAME,
STORAGE_CODE,
STOCK,
DATE,
ROW_NUMBER() OVER(PARTITION BY ITEM_NAME,
STORAGE_CODE ORDER BY DATE DESC) AS part
FROM ITEM_TABLE
WHERE ITEM_NAME IN('ABC', 'DEF', 'GHI')
) T
WHERE part = 1;
I think a query like this should work:
select
*
from (
select i.*,
row_number() over (partition by ITEM_NAME, STORAGE_CODE order by DATE desc) as seq
from ITEM_TABLE i
where ITEM_NAME in ('ABC','DEF','GHI' .........)
) t
where seq = 1
You can use Oracle's row_number() over (partition by ...) like the following:
select ITEM_NAME
, STORAGE_CODE
, STOCK
, DATE
from
(
select ITEM_NAME
, STORAGE_CODE
, STOCK
, DATE
, row_number() over(partition by ITEM_NAME order by DATE desc) as rn
from ITEM_TABLE
) s
where rn = 1
One way could be to get the max value of the DATE1 column in an inner query and the corresponding stock value in the outer query using a join, as below.
SELECT t.*
,t1.STOCK
FROM (
SELECT ITEM_NAME
,STORAGE_CODE
,MAX(DATE1) AS DATE1
FROM table1
GROUP BY ITEM_NAME
,STORAGE_CODE
ORDER BY ITEM_NAME
) t
INNER JOIN table1 t1 ON t.ITEM_NAME = t1.ITEM_NAME
AND t.STORAGE_CODE = t1.STORAGE_CODE
AND t.DATE1 = t1.DATE1
Result:
ITEM_NAME STORAGE_CODE DATE1 STOCK
------------------------------------------------------
ABC 2233 27.09.2017 00:00:00 170
DEF 2233 23.09.2017 00:00:00 120
GHI 2233 15.09.2017 00:00:00 50
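To check the "latest row per group" pattern against the sample data, here is a small sketch run through Python's sqlite3 module rather than Oracle. Several things in it are assumptions for illustration: the column is named stock_date (DATE is a reserved word in Oracle), the dates are stored as ISO yyyy-mm-dd text so that they sort correctly as strings, and SQLite 3.25 or newer is needed for ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_table "
    "(item_name TEXT, storage_code INTEGER, stock INTEGER, stock_date TEXT)"
)
conn.executemany(
    "INSERT INTO item_table VALUES (?, ?, ?, ?)",
    [
        ("ABC", 2233, 170, "2017-09-27"),
        ("ABC", 2233, 270, "2017-09-15"),
        ("DEF", 2233, 120, "2017-09-23"),
        ("DEF", 2233, 110, "2017-09-11"),
        ("GHI", 2233, 50, "2017-09-15"),
    ],
)

# Rank the rows of each (item_name, storage_code) group from newest to oldest
# and keep only the newest one. STOCK is selected but never grouped on, so a
# non-unique stock value no longer splits the groups.
latest = conn.execute(
    """
    SELECT item_name, storage_code, stock, stock_date
    FROM (
        SELECT item_name, storage_code, stock, stock_date,
               ROW_NUMBER() OVER (PARTITION BY item_name, storage_code
                                  ORDER BY stock_date DESC) AS rn
        FROM item_table
        WHERE item_name IN ('ABC', 'DEF', 'GHI')
    )
    WHERE rn = 1
    ORDER BY item_name
    """
).fetchall()
for row in latest:
    print(row)
```

If two rows in a group share the same latest date, ROW_NUMBER() picks one of them arbitrarily. RANK() would return both, which is also what the MAX-and-join version does.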

Group function is nested too deeply SQL Error

I have a table which looks like this:
+-----------------+--------------+
| Field | Type |
+-----------------+--------------+
| orderNumber (PK)| int |
| orderDate | date |
| requiredDate | date |
| shippedDate | date |
| status | char(15) |
| comments | char(200) |
| customerNumber | int |
+-----------------+--------------+
I need to return the customerNumber with the maximum number of orders.
I tried the following command:
SELECT customerNumber FROM ORDERS WHERE customerNumber IN (SELECT customerNumber FROM ORDERS HAVING MAX(COUNT(customerNumber)) GROUP BY customerNumber);
I get an error: group function is nested too deeply
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE ORDERS (
orderNumber int PRIMARY KEY,
orderDate date,
requiredDate date,
shippedDate date,
status char(15),
comments char(200),
customerNumber int
);
INSERT INTO ORDERS ( ORDERNUMBER, CUSTOMERNUMBER ) VALUES ( 1, 1 );
INSERT INTO ORDERS ( ORDERNUMBER, CUSTOMERNUMBER ) VALUES ( 2, 1 );
INSERT INTO ORDERS ( ORDERNUMBER, CUSTOMERNUMBER ) VALUES ( 3, 2 );
INSERT INTO ORDERS ( ORDERNUMBER, CUSTOMERNUMBER ) VALUES ( 4, 2 );
INSERT INTO ORDERS ( ORDERNUMBER, CUSTOMERNUMBER ) VALUES ( 5, 3 );
INSERT INTO ORDERS ( ORDERNUMBER, CUSTOMERNUMBER ) VALUES ( 6, 4 );
Query 1 - If you only want to get a single customer:
SELECT CUSTOMERNUMBER
FROM (
SELECT CUSTOMERNUMBER,
COUNT( ORDERNUMBER ) AS num_orders
FROM ORDERS
GROUP BY CUSTOMERNUMBER
ORDER BY num_orders DESC
)
WHERE ROWNUM = 1
Results:
| CUSTOMERNUMBER |
|----------------|
| 1 |
Query 2 - If you want to get all customers with the highest number of orders:
SELECT CUSTOMERNUMBER
FROM (
SELECT CUSTOMERNUMBER,
RANK() OVER ( ORDER BY NUM_ORDERS DESC ) AS RNK
FROM (
SELECT CUSTOMERNUMBER,
COUNT( ORDERNUMBER ) AS num_orders
FROM ORDERS
GROUP BY CUSTOMERNUMBER
ORDER BY num_orders DESC
)
)
WHERE RNK = 1
Results:
| CUSTOMERNUMBER |
|----------------|
| 1 |
| 2 |
One way to do it is with CTEs: get the count of orders per customer in the first CTE, select the maximum count in the second, and finally join them to get the customer(s) with the most orders.
with ordercount as (select customernumber, count(distinct ordernumber) ordercount
from orders
group by customernumber)
,maxorders as (select max(ordercount) maxcount from ordercount)
select o.customernumber
from ordercount o
join maxorders m on m.maxcount = o.ordercount
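The common thread in these answers is that the two aggregates are never nested directly: COUNT runs in an inner query and MAX runs over its result. A minimal sketch of that idea, using Python's sqlite3 module in place of Oracle (an assumption, purely so it runs anywhere) and the same six sample orders:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (ordernumber INTEGER PRIMARY KEY, customernumber INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 4)],
)

# Step 1 (innermost): count the orders per customer.
# Step 2: take the MAX of those counts in a separate query level.
# Step 3: keep every customer whose own count equals that maximum.
top_customers = [
    row[0]
    for row in conn.execute(
        """
        SELECT customernumber
        FROM orders
        GROUP BY customernumber
        HAVING COUNT(*) = (
            SELECT MAX(num_orders)
            FROM (
                SELECT COUNT(*) AS num_orders
                FROM orders
                GROUP BY customernumber
            )
        )
        ORDER BY customernumber
        """
    )
]
print(top_customers)
```

With this data customers 1 and 2 tie at two orders each, so both come back, the same as Query 2 above.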

Get top n occurrences based on related table value

I have a table Orders (Id, OrderDate, CreatorId) and a table OrderLines (Id, OrderId, OwnerIdentity, ProductId, Amount)
The scenario is as follows: someone opens an Order and other users can then place their product orders on that order. Those users are the OwnerIdentity of OrderLines.
I need to retrieve the top 3 latest orders that a user has placed an order on and display all of his orders placed, to give him an insight in his personal recent orders.
So my end result would be something like
OrderId | ProductId | Amount
----------------------------
1 | 1 | 2
1 | 7 | 1
1 | 2 | 5
4 | 4 | 3
4 | 1 | 2
8 | 4 | 1
8 | 9 | 2
Select o.Id as OrderId, ol.ProductId, ol.Amount from Orders o
inner join OrderLines ol
on o.Id = ol.OrderId where o.Id in
(Select top 3 OrderId from Orders where OwnerId = #OwnerId)
Order By o.OrderDate desc
You can add a datetime column to the OrderLines table to query the latest personal orders, and then update the code by moving the "order by OrderDate desc" part into the subselect.
select * from
(
select OrderId, ProductId, Amount,
row_number() over (partition by OrderID order by Orders.OrderDate) as rn
from OrderLines
join Orders
on OrderLines.OrderId = Orders.Id
where OwnerIdentity = x
) lskdfj
where rn <= 3
Try the below query:
SELECT OL.OrderId, OL.ProductID, OL.Amount
FROM OrderLines OL WHERE OL.OrderId IN
(
SELECT TOP 3 O.Id FROM orders O LEFT JOIN OrderLines OL2
ON OL2.orderId=O.Id
WHERE OL2.OwnerIdentity =...
ORDER BY O.OrderDate DESC
) AND OL.OwnerIdentity =...
;WITH cte AS (
SELECT ol.OrderId, ol.ProductId, ol.Amount,
ROW_NUMBER()OVER (PARTITION BY ol.OrderId ORDER BY o.OrderDate DESC) rn
FROM OrderLines ol
JOIN Orders o ON ol.OrderId = o.Id
WHERE OwnerIdentity = #OwnerId
)
SELECT OrderId, ProductId, Amount
FROM cte
WHERE rn <= 3
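One way to read the requirement is in two steps: first find the three most recent orders on which the user has at least one line, then list all of that user's lines on those orders. Below is a hedged sketch of that reading using Python's sqlite3 module instead of SQL Server, so LIMIT 3 stands in for TOP 3. The sample rows and the owner value 7 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Id INTEGER PRIMARY KEY, OrderDate TEXT, CreatorId INTEGER)")
conn.execute(
    "CREATE TABLE OrderLines (Id INTEGER PRIMARY KEY, OrderId INTEGER, "
    "OwnerIdentity INTEGER, ProductId INTEGER, Amount INTEGER)"
)
# Hypothetical data: owner 7 has lines on orders 1, 2, 4 and 5.
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(1, "2020-01-01", 1), (2, "2020-01-02", 1), (3, "2020-01-03", 1),
     (4, "2020-01-04", 1), (5, "2020-01-05", 1)],
)
conn.executemany(
    "INSERT INTO OrderLines VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 7, 10, 1), (2, 2, 7, 11, 2), (3, 2, 7, 12, 1), (4, 3, 9, 10, 4),
     (5, 4, 7, 13, 3), (6, 5, 7, 10, 2), (7, 5, 9, 14, 1), (8, 5, 7, 15, 5)],
)

# Inner query: the 3 newest orders that contain at least one line of this owner.
# EXISTS avoids the duplicate order ids a plain join would produce.
# Outer query: every line of that owner on those orders, newest order first.
recent_lines = conn.execute(
    """
    SELECT ol.OrderId, ol.ProductId, ol.Amount
    FROM OrderLines ol
    JOIN Orders o ON o.Id = ol.OrderId
    WHERE ol.OwnerIdentity = :owner
      AND ol.OrderId IN (
          SELECT o2.Id
          FROM Orders o2
          WHERE EXISTS (SELECT 1 FROM OrderLines l
                        WHERE l.OrderId = o2.Id AND l.OwnerIdentity = :owner)
          ORDER BY o2.OrderDate DESC
          LIMIT 3
      )
    ORDER BY o.OrderDate DESC, ol.Id
    """,
    {"owner": 7},
).fetchall()
for row in recent_lines:
    print(row)
```

Note the difference from a ROW_NUMBER() partitioned by OrderId, which limits the number of lines per order rather than the number of orders. In SQL Server a DENSE_RANK() ordered by OrderDate over the user's lines would be another way to express "top 3 orders".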

Group By / Having correlated subquery

I am trying to solve this query, I'm really lost
least one order in which the quantity they ordered was greater than
the average quantity of all other orders.
My tables are:
Customer
Cust_ID | CustName | Region | Phone
Orders
Ordernum | Cust_ID | Item_ID | Quantity
Vendor
Vendor_ID | Item_ID | Costs | Region
stock
Item_ID | Description | Price | On_hand
Database schema is:
Customer (Cust_ID, CustName, Region, Phone)
Orders (Ordernum, Cust_ID, Item_ID, Quantity) Foreign key Cust_ID
references Customer Not Null, On Delete Restrict
Foreign key Item_ID references Stock Not Null, On Delete Restrict
Stock (Item_ID, Description, Price, On_hand)
Vendor (Vendor_ID, Item_ID, Cost, Region)
Foreign key Item_ID references Stock Not null, On Delete Restrict
Here is what I have tried so far. What am I doing wrong?
select custname, phone, count(distinct(item_id)), sum(quantity)
from customer c, orders o
Where c.cust_id= o.cust_id and count (o.ordernum) >= 2
group by cust_id
having sum (quantity) >
(select avg(quantity)
from orders o2
where o.item_id != o2.item_id)
order by custname;
I re-wrote my code and this is what I came up with. I'm really lost:
select c1.custname, phone, count(distinct(o.item_id)), sum(quantity) as quantity
from customer c1, orders o
Where c1.cust_id= o.cust_id
group by c1.cust_id, custname, phone, quantity
Having 2>=
(select count(o1.item_id)
From orders o1
where c1.cust_ID = o1.cust_ID)
AND sum(quantity) >
(select AVG(quantity)
from orders o2
Where c1.cust_ID != o2.cust_ID AND o.Item_ID = o2.Item_ID)
order by custname;
Try this query
select a.cust_id, count(distinct Item_ID) as itemOrderd,
sum(b.quantity) as sum
from
customer a
inner join
orders b
on a.cust_id=b.cust_id
group by a.cust_id
having count(distinct ordernum) > 1
and max(b.quantity) > avg(b.quantity)
| CUST_ID | ITEMORDERD | SUM |
|---------|------------|-----|
| 1 | 2 | 61 |
I think it will be:
SELECT MAX(c.CustName) as CustName,
MAX(c.Phone) as Phone,
COUNT(DISTINCT o.Item_ID) as CountOfDistinctItems,
SUM(o.Quantity) as SumOfQuantity
FROM Customer as c
JOIN Orders as o on c.Cust_ID=o.Cust_ID
JOIN (SELECT Ordernum, SUM(Quantity) as Order_q_sum
FROM Orders
GROUP BY Ordernum) as oq
ON o.Ordernum=oq.Ordernum
GROUP BY c.Cust_ID
HAVING COUNT(DISTINCT o.Ordernum)>=2
AND
MAX(oq.Order_q_sum)>
(
SELECT AVG(SumQuantity) FROM
(SELECT SUM(Quantity) SumQuantity FROM Orders GROUP BY OrderNum) as t1
)
Hope this is closer to your needs:
SELECT x.custname,
x.phone,
Sum(x.itemcnt),
Sum(x.quantity)
FROM (SELECT c.custname,
c.phone,
1 AS itemCnt,
Sum(o.quantity) AS quantity,
Isnull((SELECT Avg(y.quantity)
FROM orders y
WHERE y.cust_id = o.cust_id), 0.00) AvgQty
FROM customer c
LEFT OUTER JOIN orders o
ON c.cust_id = o.cust_id
GROUP BY c.custname,
c.phone,
o.item_id) x
GROUP BY x.custname,
x.phone
HAVING Sum(x.itemcnt) > 1
AND Sum(x.quantity) > x.avgqty
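Since the problem statement above is cut off, the following is only a sketch under my own reading of it: list each customer who has placed at least two orders and has at least one order whose quantity is greater than the average quantity of all other orders in the table. The "at least two orders" part is inferred from the attempts in the question, the sample rows are invented, and Python's sqlite3 module is used only to make it runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, custname TEXT, phone TEXT)")
conn.execute(
    "CREATE TABLE orders (ordernum INTEGER PRIMARY KEY, cust_id INTEGER, "
    "item_id INTEGER, quantity INTEGER)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Ann", "555-0101"), (2, "Bob", "555-0102"), (3, "Cy", "555-0103")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 1, 100, 50), (2, 1, 101, 11), (3, 2, 100, 5), (4, 2, 100, 6), (5, 3, 102, 40)],
)

# WHERE: keep customers owning at least one order that beats the average of
#        every other order (the correlated subquery excludes the order itself).
# HAVING: of those, keep customers with two or more orders.
# Aggregate tests such as COUNT(...) >= 2 belong in HAVING, not in WHERE.
result = conn.execute(
    """
    SELECT c.custname, c.phone,
           COUNT(DISTINCT o.item_id) AS items_ordered,
           SUM(o.quantity) AS total_quantity
    FROM customer c
    JOIN orders o ON o.cust_id = c.cust_id
    WHERE c.cust_id IN (
        SELECT o1.cust_id
        FROM orders o1
        WHERE o1.quantity > (SELECT AVG(o2.quantity)
                             FROM orders o2
                             WHERE o2.ordernum <> o1.ordernum)
    )
    GROUP BY c.cust_id, c.custname, c.phone
    HAVING COUNT(DISTINCT o.ordernum) >= 2
    ORDER BY c.custname
    """
).fetchall()
print(result)
```

With this data only Ann qualifies: Bob has two orders but neither is above the average of the others, and Cy has a large order but only one order in total.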

Multiple counts and group by

My table looks like this:
ID ProductName ProductCode
1 abc 123
2 abc 123
3 abc 456
4 def 789
5 ghi 246
6 jkl 369
7 jkl 369
8 jkl 369
9 jkl 468
10 jkl 468
And I wish to create a summary table that looks like this:
ProductName ProductCode Total
abc 123 2
abc 456 1
jkl 369 3
jkl 468 2
In other words I'm not interested in Products "def" and "ghi" because they only appear once in the original table. For everything else I want to do a group by ProductName and ProductCode and display counts.
I've tried playing about with group by clauses and where in (select...) but I've just ended up going round in circles.
The table has around 50,000 rows and is on SQL Server 2008 R2.
This is it:
SELECT
ProductName,
ProductCode,
COUNT(*) as Total
FROM Table1
WHERE ProductName IN (SELECT ProductName FROM Table1 GROUP BY ProductName HAVING COUNT(*) > 1)
GROUP BY ProductName, ProductCode
http://www.sqlfiddle.com/#!3/c79ad/9
To filter on an aggregate you need to use the HAVING clause:
SELECT
ProductName,
ProductCode,
COUNT(*) as Total
FROM T
GROUP BY ProductName, ProductCode
HAVING COUNT(*) > 1
;WITH cte AS
(
SELECT ID, ProductName, ProductCode,
COUNT(*) OVER (PARTITION BY ProductName, ProductCode) AS Total,
COUNT(*) OVER (PARTITION BY ProductName) AS PCount,
ROW_NUMBER() OVER (PARTITION BY ProductName, ProductCode ORDER BY ID) AS rn
FROM Products
)
SELECT ID, ProductName, ProductCode, Total
FROM cte
WHERE PCount > 1 AND rn = 1
You can't use WHERE on a column that you are aggregating on. You need to use HAVING instead.
SELECT ProductName,
ProductCode,
COUNT(*) AS Total
FROM Products p
GROUP BY p.ProductName,
p.ProductCode
HAVING COUNT(p.ProductName) > 1
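The answers differ in where the "more than once" test is applied, and that changes the result. Here is a quick sketch with the ten sample rows, run through Python's sqlite3 module rather than SQL Server 2008 R2 (an assumption for portability; both queries are plain GROUP BY and should behave the same there):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ID INTEGER PRIMARY KEY, ProductName TEXT, ProductCode INTEGER)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [(1, "abc", 123), (2, "abc", 123), (3, "abc", 456), (4, "def", 789),
     (5, "ghi", 246), (6, "jkl", 369), (7, "jkl", 369), (8, "jkl", 369),
     (9, "jkl", 468), (10, "jkl", 468)],
)

# Filter on the product NAME appearing more than once, then count per pair.
by_name = conn.execute(
    """
    SELECT ProductName, ProductCode, COUNT(*) AS Total
    FROM Products
    WHERE ProductName IN (SELECT ProductName FROM Products
                          GROUP BY ProductName HAVING COUNT(*) > 1)
    GROUP BY ProductName, ProductCode
    ORDER BY ProductName, ProductCode
    """
).fetchall()

# Filter on the (name, code) PAIR appearing more than once.
by_pair = conn.execute(
    """
    SELECT ProductName, ProductCode, COUNT(*) AS Total
    FROM Products
    GROUP BY ProductName, ProductCode
    HAVING COUNT(*) > 1
    ORDER BY ProductName, ProductCode
    """
).fetchall()

print(by_name)
print(by_pair)
```

The first query reproduces the summary table asked for in the question, including the abc / 456 row with a total of 1. The second drops that row because the HAVING test is applied to each name and code pair rather than to the product name alone.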