Dedupe records without DELETE - sql

I need to bring back only one of the records from a set of duplicated rows in SQL Server.
I have data like this:
-------------------------------------------
CustomerID, OrderID, ProductID, Title
-------------------------------------------
1,1001,131,orange
1,1002,131,orange
-------------------------------------------
These rows show up as two items ordered by the same person. Really they are a single product with a quantity of two chosen in the basket, which was stored as two records.
My question is: how can I retrieve only one of these rows?
Thanks

Maybe something like this:
First some test data:
CREATE TABLE #tbl (CustomerID INT,OrderID INT,ProductID INT,Title VARCHAR(100))
INSERT INTO #tbl
VALUES
(1,1001,131,'orange'),
(1,1002,131,'orange')
Then the query
;WITH CTE AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY tbl.CustomerID,tbl.ProductID,tbl.Title
ORDER BY tbl.OrderID) AS RowNbr,
tbl.CustomerID,
tbl.OrderID,
tbl.ProductID,
tbl.Title
FROM
#tbl AS tbl
)
SELECT
*
FROM
CTE
WHERE
CTE.RowNbr=1
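
For readers without a SQL Server instance at hand, here is a minimal sketch of the same ROW_NUMBER() technique run through Python's built-in sqlite3 module. SQLite (3.25 or newer, for window functions) is only a stand-in here, and the table name orders is an assumption made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (CustomerID INT, OrderID INT, ProductID INT, Title TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 1001, 131, "orange"), (1, 1002, 131, "orange")],
)

# Number the rows inside each (CustomerID, ProductID, Title) group,
# lowest OrderID first, then keep only the first row of each group.
dedupe_sql = """
WITH cte AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY CustomerID, ProductID, Title
               ORDER BY OrderID
           ) AS RowNbr,
           CustomerID, OrderID, ProductID, Title
    FROM orders
)
SELECT CustomerID, OrderID, ProductID, Title
FROM cte
WHERE RowNbr = 1
"""
first_rows = conn.execute(dedupe_sql).fetchall()
print(first_rows)
```

Switching the ORDER BY inside the window to OrderID DESC would keep the row with the highest OrderID instead.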

This way you get not only a single row per group but also the quantity ordered:
SELECT
CustomerID, ProductID, Title, max(OrderID) as orderID, COUNT(*) as quantity
FROM
TableName
GROUP BY
CustomerID,
ProductID,
Title

Using MAX will get you the most recent order (assuming OrderID increases with each new order):
SELECT CustomerID, MAX(OrderID) AS OrderID, ProductID, Title
FROM TableName
GROUP BY CustomerID, ProductID, Title
Or, using MIN will get you the first order:
SELECT CustomerID, MIN(OrderID) AS OrderID, ProductID, Title
FROM TableName
GROUP BY CustomerID, ProductID, Title

Provided that it's really what you want, you can get the first order in each group of orders with the same customer, product and title by grouping and using the MIN function (MAX would give you the last order):
SELECT CustomerID, MIN(OrderID) AS OrderID, ProductID, Title
FROM MyTable
GROUP BY CustomerID, ProductID, Title
If you want the number of duplicate orders (that would be the ordered quantity judging by your question) you can add a count:
SELECT CustomerID, MIN(OrderID) AS OrderID, ProductID, Title,
COUNT(*) AS Quantity
FROM MyTable
GROUP BY CustomerID, ProductID, Title
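
The grouping approach can be tried out the same way. This is a small sketch using Python's sqlite3 module as a stand-in engine; the third row (customer 2, 'apple') is made-up data added only to show that non-duplicated rows come back with a quantity of 1.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (CustomerID INT, OrderID INT, ProductID INT, Title TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        (1, 1001, 131, "orange"),
        (1, 1002, 131, "orange"),
        (2, 1003, 140, "apple"),  # hypothetical extra row, not from the question
    ],
)

# One row per (CustomerID, ProductID, Title): the lowest OrderID in the
# group, plus the number of rows that were collapsed into it.
grouped = conn.execute(
    """
    SELECT CustomerID, MIN(OrderID) AS OrderID, ProductID, Title,
           COUNT(*) AS Quantity
    FROM MyTable
    GROUP BY CustomerID, ProductID, Title
    ORDER BY CustomerID
    """
).fetchall()
print(grouped)
```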

Related

SQL-select two values that are the same in different rows but still be able to see a third value

I have three columns: orderdate, customerid and productid. I only want to see the rows where the combination of orderdate and customerid is repeated, i.e. when a customer orders more than once on the same day. But I also want to see the product ID.
SELECT orderdate, customerid, productid
FROM sales_orders_sheet
GROUP BY orderdate, customerid
HAVING COUNT(*)>1
When I use this SQL statement I get:
Field of aggregated query neither grouped nor aggregated: line 1, column 31
You can do:
select *
from (
select t.*, count(1) over(partition by orderdate, customerid) as cnt
from sales_orders_sheet t
) x
where cnt > 1
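
A quick way to see what the windowed count does is to run it on a few rows. The sketch below uses Python's sqlite3 module (SQLite 3.25+ assumed for window functions) as a stand-in engine, and the four sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_orders_sheet (orderdate TEXT, customerid INT, productid INT)"
)
conn.executemany(
    "INSERT INTO sales_orders_sheet VALUES (?, ?, ?)",
    [
        ("2024-01-01", 1, 10),  # hypothetical sample rows
        ("2024-01-01", 1, 11),
        ("2024-01-01", 2, 10),
        ("2024-01-02", 1, 12),
    ],
)

# COUNT(*) OVER (...) attaches the group size to every row without
# collapsing the rows, so productid stays available to the outer query.
repeated = conn.execute(
    """
    SELECT orderdate, customerid, productid
    FROM (
        SELECT t.*,
               COUNT(*) OVER (PARTITION BY orderdate, customerid) AS cnt
        FROM sales_orders_sheet AS t
    ) AS x
    WHERE cnt > 1
    ORDER BY productid
    """
).fetchall()
print(repeated)
```

Only the two rows for customer 1 on 2024-01-01 survive the filter, each with its own productid.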

Mysql query to PostgreSQL query

select storeID, itemID, custID, sum(price)
from Sales F
group by storeID, custID, itemID
with cube(storeID, custID);
I want to use this query in PostgreSQL, but it doesn't work there directly.
How can I convert it into a PostgreSQL query?
Your code will work without the with keyword:
select storeID, itemID, custID, sum(price)
from Sales F
group by itemID, cube(storeID, custID);
I prefer grouping sets for expressing groupings:
select storeID, itemID, custID, sum(price)
from Sales F
group by grouping sets ( (storeID, custID, itemID),
(custID, itemID),
(storeID, itemID),
(itemID)
);
If I understand what you want to do, this should be exactly the same.
You could also use cube and then filter:
select storeID, itemID, custID, sum(price)
from Sales F
group by cube(storeID, custID, itemID)
having itemId is not null
Try this:
select storeID, itemID, custID, sum(price)
from Sales F
group by cube (storeID, itemID, custID)
Check this post for more details on CUBE, GROUPING SETS and ROLLUP.
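
To make the grouping sets version above more concrete: each grouping set behaves like its own GROUP BY query, with the columns that are left out reported as NULL, and the results are concatenated. The sketch below spells that out as a UNION ALL, run through Python's sqlite3 module. SQLite has no GROUPING SETS or CUBE, so this is an emulation for illustration, not the PostgreSQL syntax, and the three sample rows are invented.

```python
import sqlite3

# In-memory SQLite database; it lacks GROUPING SETS, hence the emulation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (storeID INT, itemID INT, custID INT, price INT)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?)",
    [(1, 100, 7, 5), (1, 100, 8, 3), (2, 100, 7, 2)],  # hypothetical rows
)

# One SELECT per grouping set; omitted columns are filled with NULL.
expanded_sql = """
SELECT storeID, itemID, custID, SUM(price) AS total
FROM Sales GROUP BY storeID, custID, itemID      -- (storeID, custID, itemID)
UNION ALL
SELECT NULL, itemID, custID, SUM(price)
FROM Sales GROUP BY custID, itemID               -- (custID, itemID)
UNION ALL
SELECT storeID, itemID, NULL, SUM(price)
FROM Sales GROUP BY storeID, itemID              -- (storeID, itemID)
UNION ALL
SELECT NULL, itemID, NULL, SUM(price)
FROM Sales GROUP BY itemID                       -- (itemID)
"""
expanded = conn.execute(expanded_sql).fetchall()
for row in expanded:
    print(row)
```

In PostgreSQL the single GROUP BY GROUPING SETS query is the shorter way to express this, and it lets the engine compute all four groupings in one statement.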

Having count with case when in a temporary table in snowflake

I'm trying to write a query that returns rows based on a HAVING COUNT condition: if the row count is > 2, it should return only the row with the max value of a field and union that with another table; if it's equal to 1, it should just return everything from the table. It would look something like this, though I don't know the correct syntax for Snowflake:
WITH TEMP_SHIPMENTS AS (
SELECT
ORDERNUMBER,
POSITIONNUMBER,
ITEMCODE,
ITEMDESCRIPTION,
SHIPMENTNUMBER,
LOAD,
QUANTITY,
SERIALNUMBER,
CUSTOMERNAME,
SHIPTOADDRESS,
MAX(CUSTOMERORDER),
CUSTOMERLINE,
DELIVERYDATE
FROM
T_SHIPMENTS
GROUP BY
ORDERNUMBER,
POSITIONNUMBER,
ITEMCODE,
ITEMDESCRIPTION,
SHIPMENTNUMBER,
LOAD,
QUANTITY,
SERIALNUMBER,
CUSTOMERNAME,
SHIPTOADDRESS,
CUSTOMERORDER,
CUSTOMERLINE,
DELIVERYDATE
)
CASE WHEN HAVING COUNT FROM TEMP_SHIPMENTS.CUSTOMERORDER >2
THEN
SELECT ORDERNUMBER,
POSITIONNUMBER,
ITEMCODE,
ITEMDESCRIPTION,
SHIPMENTNUMBER,
LOAD,
QUANTITY,
SERIALNUMBER,
CUSTOMERNAME,
SHIPTOADDRESS,
CUSTOMERORDER,
CUSTOMERLINE,
MAX(DELIVERYDATE)
FROM TEMP_SHIPMENTS;
Any ideas on how I could achieve it?
SELECT
ORDERNUMBER,
POSITIONNUMBER,
ITEMCODE,
ITEMDESCRIPTION,
SHIPMENTNUMBER,
LOAD,
QUANTITY,
SERIALNUMBER,
CUSTOMERNAME,
SHIPTOADDRESS,
CUSTOMERORDER,
CUSTOMERLINE,
DELIVERYDATE
FROM
T_SHIPMENTS
WHERE SerialNumber = '012501003449' ;
Result table from query
I left the result here. As you can see, the same SerialNumber has two records, which is okay, but I only need one: the one with either the greatest datetime or the greatest customer order number. I tried querying with MAX on each of these fields but achieved nothing; I still get two records instead of just the one with the maximum value.
If you would like to return the row of the MAX(CUSTOMERORDER), I would try:
WITH TEMP_SHIPMENTS AS (
SELECT ROW_NUMBER() over (PARTITION BY SERIALNUMBER ORDER BY CUSTOMERORDER desc) row_num,
ORDERNUMBER,
POSITIONNUMBER,
ITEMCODE,
ITEMDESCRIPTION,
SHIPMENTNUMBER,
LOAD,
QUANTITY,
SERIALNUMBER,
CUSTOMERNAME,
SHIPTOADDRESS,
CUSTOMERORDER,
CUSTOMERLINE,
DELIVERYDATE
FROM
T_SHIPMENTS
)
SELECT *
FROM TEMP_SHIPMENTS
WHERE row_num = 1
It's unclear from your question whether this should be the max CUSTOMERORDER over the entire table or the max CUSTOMERORDER within a group. Based on the image you attached, it sounds like you want the max CUSTOMERORDER for each SERIALNUMBER, so that is what I partitioned over. However, you can partition by ITEMDESCRIPTION or CUSTOMERNAME instead if you'd like the max per item or per customer, respectively.
I'm afraid you can't really do conditional unions in pure SQL in any database, nor can you use the HAVING clause for anything other than filtering grouped results.
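
Since the question mentions both the greatest customer order and the greatest datetime, here is a minimal sketch that uses the second as a tie-breaker for the first. It runs on Python's sqlite3 module as a stand-in for Snowflake, keeps only three of the columns, and apart from the serial number quoted in the question all values are invented.

```python
import sqlite3

# In-memory SQLite database standing in for Snowflake (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T_SHIPMENTS (SERIALNUMBER TEXT, CUSTOMERORDER INT, DELIVERYDATE TEXT)"
)
conn.executemany(
    "INSERT INTO T_SHIPMENTS VALUES (?, ?, ?)",
    [
        ("012501003449", 5001, "2023-01-10"),  # hypothetical values
        ("012501003449", 5002, "2023-02-15"),
        ("012501003450", 5003, "2023-01-20"),
    ],
)

# Rank rows per serial number: highest CUSTOMERORDER first, and the
# latest DELIVERYDATE first when two rows share the same CUSTOMERORDER.
latest = conn.execute(
    """
    WITH ranked AS (
        SELECT ROW_NUMBER() OVER (
                   PARTITION BY SERIALNUMBER
                   ORDER BY CUSTOMERORDER DESC, DELIVERYDATE DESC
               ) AS row_num,
               SERIALNUMBER, CUSTOMERORDER, DELIVERYDATE
        FROM T_SHIPMENTS
    )
    SELECT SERIALNUMBER, CUSTOMERORDER, DELIVERYDATE
    FROM ranked
    WHERE row_num = 1
    ORDER BY SERIALNUMBER
    """
).fetchall()
print(latest)
```

Serial numbers with a single record pass through unchanged, so no separate "count equals 1" branch is needed. In Snowflake itself, the QUALIFY clause can filter on the window function directly, which avoids the CTE.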

Select subset from subset

Data has 1 table with 2 relevant fields:
OrderNumber
ProductID
How do I structure the SQL to do the following:
Select All OrderNumber where ProductID in (A,B)
Now, on this subset, Select all where ProductID in (A,B,C,D,E)
Show CustomerName, OrderNumber, ProductID, ProductPrice
Goal is to find all Orders that contain 2 specific products, then to measure sales of only 3 specific products related to A,B.
I'm not sure what you want, but I will take a stab.
This will show you the details of orders with any of the 5 product IDs:
SELECT CustomerName, OrderNumber, ProductID, ProductPrice
FROM yourTable
WHERE ProductId IN ('A','B','C','D','E')
This will count the orders for you
SELECT ProductID, COUNT(*) AS Count
FROM yourTable
WHERE ProductId IN ('A','B','C','D','E')
GROUP BY ProductId
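
If the goal is read as "orders that contain both A and B" (an assumption, since the question is ambiguous), one possible sketch is to find those order numbers in a subquery and then list the A to E lines of just those orders. It is shown here with Python's sqlite3 module, and all sample rows are invented.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable "
    "(CustomerName TEXT, OrderNumber INT, ProductID TEXT, ProductPrice INT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        ("Ann", 1, "A", 10), ("Ann", 1, "B", 20), ("Ann", 1, "C", 5),
        ("Bob", 2, "A", 10), ("Bob", 2, "C", 5),   # order 2 has no B
        ("Cid", 3, "A", 10), ("Cid", 3, "B", 20),
        ("Cid", 3, "D", 7), ("Cid", 3, "F", 9),    # F is outside A..E
    ],
)

# Inner query: orders containing BOTH 'A' and 'B'.
# Outer query: the A..E lines of exactly those orders.
subset = conn.execute(
    """
    SELECT CustomerName, OrderNumber, ProductID, ProductPrice
    FROM yourTable
    WHERE ProductID IN ('A', 'B', 'C', 'D', 'E')
      AND OrderNumber IN (
          SELECT OrderNumber
          FROM yourTable
          WHERE ProductID IN ('A', 'B')
          GROUP BY OrderNumber
          HAVING COUNT(DISTINCT ProductID) = 2
      )
    ORDER BY OrderNumber, ProductID
    """
).fetchall()
print(subset)
```

Narrowing the outer IN list to ('C', 'D', 'E') would leave only the related products whose sales are being measured.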

How to display products under Category in sql in a table

I have the following table:
where the products are in different categories, and I am expecting the output:
i.e. each product and its cost need to be displayed under its category (for the category row I want to display the total cost of its products). I tried different approaches using ROLLUP and grouping, but I am not getting the expected output.
Using Rollup you would do it like this.
SELECT COALESCE(product,category,'Total') Category,
SUM(VALUE) cost
FROM products
GROUP BY ROLLUP(category,product)
Here it goes:
Sample Data:
CREATE TABLE #product (ID INT, Category VARCHAR(50), Product VARCHAR(50), Value INT)
INSERT INTO #product
VALUES(1,'Non-veg','Chicken',150),
(2,'Non-veg','Mutton',200),
(3,'Non-veg','Fish',220),
(4,'Non-veg','Prawns',250),
(5,'Veg','Gobi',100),
(6,'Veg','Parota',45),
(7,'Veg','vegbirani',150)
Query using GROUP BY with ROLLUP
SELECT Category, Product,
SUM(Value) AS Value
FROM #product
GROUP BY Category, Product WITH ROLLUP
Results:
You can further manipulate the results:
SELECT COALESCE(product,category,'Total') Category,
SUM(Value) AS Value
FROM #product
GROUP BY Category, Product WITH ROLLUP
Result:
To answer the comment below: "is there any way to display Category first then Products" this seemed to work:
;WITH CTE AS (
SELECT Category, Product,
SUM(Value) AS Value,
ROW_NUMBER() OVER (PARTITION BY Category ORDER BY Product ) AS rn
FROM #product
GROUP BY Category, Product WITH ROLLUP)
SELECT Category = COALESCE(A.product,A.category,'Total') , A.Value
FROM CTE AS A
ORDER BY ISNULL(A.category,'zzzzzz') ,rn
Results:
Maybe something like this... doesn't give your exact output but it's close...
Select category, product, sum(value) as value
From TableName
group by grouping sets ((category),(category, product))
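
For engines without ROLLUP or GROUPING SETS, the "category row first, then its products" layout can be approximated with a UNION ALL and a sort key. The sketch below does that on the sample data from the answer above, using Python's sqlite3 module. The helper columns grp and lvl and the table name product are assumptions made for the example, and no grand-total row is produced.

```python
import sqlite3

# In-memory SQLite database; it has no ROLLUP, hence the UNION ALL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (ID INT, Category TEXT, Product TEXT, Value INT)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        (1, "Non-veg", "Chicken", 150), (2, "Non-veg", "Mutton", 200),
        (3, "Non-veg", "Fish", 220), (4, "Non-veg", "Prawns", 250),
        (5, "Veg", "Gobi", 100), (6, "Veg", "Parota", 45),
        (7, "Veg", "vegbirani", 150),
    ],
)

# lvl = 0 marks the category subtotal row, lvl = 1 the product rows;
# sorting by (grp, lvl, label) puts each subtotal above its products.
layout = conn.execute(
    """
    SELECT label, cost
    FROM (
        SELECT Category AS grp, 0 AS lvl, Category AS label, SUM(Value) AS cost
        FROM product
        GROUP BY Category
        UNION ALL
        SELECT Category, 1, Product, SUM(Value)
        FROM product
        GROUP BY Category, Product
    )
    ORDER BY grp, lvl, label
    """
).fetchall()
for row in layout:
    print(row)
```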