SQL Group By Question - sql

I have a table which tracks views of products.
TrackId ProductId CreatedOn
1 1 01/01/2011
2 4 01/01/2011
3 4 01/01/2011
4 10 01/01/2011
What I want to do is return a dataset that doesn't have the same ProductId in two adjacent rows. I.e., from the above data set I would want to return:
TrackId ProductId CreatedOn
1 1 01/01/2011
2 4 01/01/2011
4 10 01/01/2011
As far as I am aware, I can't use DISTINCT because it works on whole rows?
Help appreciated.

Generate a row number sequence per ProductID and take the first row:
;WITH cte AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY TrackID) AS rn
FROM
MyProductTable
)
SELECT
TrackId, ProductId, CreatedOn
FROM
cte
WHERE
rn = 1
Edit:
If you want to use an aggregate, you need a separate subquery first to ensure consistent results. A straight MIN won't work.
This is based on my comment to the question
"not having productid in two adjacent rows. Adjacent is defined by next/previous Trackid"
SELECT
M.*
FROM
myProductTable M
JOIN
( --gets the lowest TrackID for a ProductID
SELECT ProductID, MIN(TrackID) AS MinTrackID
FROM myProductTable
GROUP BY ProductID
) M2 ON M.ProductID= M2.ProductID AND M.TrackID= M2.MinTrackID
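As a rough check of the MIN/JOIN approach above, here is a minimal sketch that runs the same query against the question's sample rows. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, which is an assumption made only so the example is self-contained; the variable name first_views is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyProductTable (TrackId INTEGER, ProductId INTEGER, CreatedOn TEXT)"
)
conn.executemany(
    "INSERT INTO MyProductTable VALUES (?, ?, ?)",
    [
        (1, 1, "01/01/2011"),
        (2, 4, "01/01/2011"),
        (3, 4, "01/01/2011"),
        (4, 10, "01/01/2011"),
    ],
)

# Keep only the row with the lowest TrackId for each ProductId.
first_views = conn.execute(
    """
    SELECT M.TrackId, M.ProductId, M.CreatedOn
    FROM MyProductTable M
    JOIN (
        SELECT ProductId, MIN(TrackId) AS MinTrackId
        FROM MyProductTable
        GROUP BY ProductId
    ) M2 ON M.ProductId = M2.ProductId AND M.TrackId = M2.MinTrackId
    ORDER BY M.TrackId
    """
).fetchall()

for row in first_views:
    print(row)
```

On the sample data this keeps TrackIds 1, 2 and 4, which matches the output requested in the question.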

select min(TrackId), ProductId, CreatedOn
from YourTable
group by ProductId, CreatedOn;

You can GROUP BY TrackID and ProductID and take the MIN of CreatedOn if the date is not important.
SELECT TrackID ,ProductID ,MIN(CreatedOn)
FROM [table]
GROUP BY TrackID ,ProductID
If the date is the same you can group by all three
SELECT TrackID ,ProductID ,CreatedOn
FROM [table]
GROUP BY TrackID ,ProductID ,CreatedOn

Related

Getting the latest entry grouped by ID

I have a table with stock for products. The problem is that every time there is a stock change, the new value is stored, together with the new Quantity. Example:
ProductID | Quantity | LastUpdate
1 123 2019.01.01
2 234 2019.01.01
1 444 2019.01.02
2 222 2019.01.02
I therefore need to get the latest stock update for every Product and return this:
ProductID | Quantity
1 444
2 222
The following SQL works, but is slow.
SELECT ProductID, Quantity
FROM (
SELECT ProductID, Quantity
FROM Stock
WHERE LastUpdate
IN (SELECT MAX(LastUpdate) FROM Stock GROUP BY ProductID)
)
Since the query is slow and supposed to be left joined into another query, I really would like some input on how to do this better.
Is there another way?
Use analytic functions. row_number can be used in this case.
SELECT ProductID, Quantity
FROM (SELECT ProductID, Quantity, row_number() over(partition by ProductID order by LastUpdate desc) as rnum
FROM Stock
) s
WHERE RNUM = 1
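A minimal sketch of the row_number approach on the question's sample data, assuming the date column is named LastUpdate as in the question's table. SQLite (via Python's sqlite3, version 3.25 or later for window functions) is used here only as a convenient stand-in for SQL Server, and latest_by_rank is a hypothetical name.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Stock (ProductID INTEGER, Quantity INTEGER, LastUpdate TEXT)")
conn.executemany(
    "INSERT INTO Stock VALUES (?, ?, ?)",
    [
        (1, 123, "2019-01-01"),
        (2, 234, "2019-01-01"),
        (1, 444, "2019-01-02"),
        (2, 222, "2019-01-02"),
    ],
)

# Number the rows of each product from newest to oldest, then keep row 1.
latest_by_rank = conn.execute(
    """
    SELECT ProductID, Quantity
    FROM (
        SELECT ProductID, Quantity,
               ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY LastUpdate DESC) AS rnum
        FROM Stock
    ) s
    WHERE rnum = 1
    ORDER BY ProductID
    """
).fetchall()

print(latest_by_rank)
```

The dates are stored as ISO-formatted text so that string ordering and date ordering agree.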
Or with first_value.
SELECT DISTINCT ProductID, FIRST_VALUE(Quantity) OVER(partition by ProductID order by LastUpdate desc) as Quantity
FROM Stock
Another option is to use WITH TIES in concert with Row_Number().
Full Disclosure: Vamsi's answer will be a nudge more performant.
Example
Select Top 1 with ties *
From YourTable
Order by Row_Number() over (Partition By ProductID Order by LastUpdate Desc)
Returns
ProductID Quantity LastUpdate
1 444 2019-01-02
2 222 2019-01-02
You could use a CTE (common table expression).
Base Data:
SELECT 1 AS ProductID
,123 AS Quantity
,'2019-01-01' as LastUpdate
INTO #table
UNION
SELECT 2 AS ProductID
,234 AS Quantity
,'2019-01-01' as LastUpdate
UNION
SELECT 1 AS ProductID
,444 AS Quantity
,'2019-01-02' as LastUpdate
UNION
SELECT 2 AS ProductID
,222 AS Quantity
,'2019-01-02' as LastUpdate
Here is the code using a Common Table Expression.
WITH CTE (ProductID, Quantity, LastUpdate, Rnk)
AS
(
SELECT ProductID
,Quantity
,LastUpdate
,ROW_NUMBER() OVER(PARTITION BY ProductID ORDER BY LastUpdate DESC) AS Rnk
FROM #table
)
SELECT ProductID, Quantity, LastUpdate
FROM CTE
WHERE rnk = 1
This returns the latest row for each product. You could then join the CTE to whatever table you need.
The row_number() function might be the most efficient option, but the big slowdown in your query is the IN predicate applied to a subquery. A join is a little trickier to write, but it is faster. This query should get what you want and be much faster.
SELECT
a.ProductID
,a.Quantity
FROM stock as a
INNER JOIN (
SELECT
ProductID
,MAX(LastUpdate) as LastUpdate
FROM stock
GROUP BY ProductID
) b
ON a.ProductID = b.ProductId AND
a.LastUpdate = b.LastUpdate
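Beyond speed, the join also ties each maximum date to its own product, which the uncorrelated IN from the question does not do. The sketch below illustrates this under stated assumptions: SQLite through Python's sqlite3 stands in for SQL Server, and one extra row (product 2 updated on 2019-01-03) is added that is not in the question, so that the two products have different latest dates.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Stock (ProductID INTEGER, Quantity INTEGER, LastUpdate TEXT)")
conn.executemany(
    "INSERT INTO Stock VALUES (?, ?, ?)",
    [
        (1, 123, "2019-01-01"),
        (2, 234, "2019-01-01"),
        (1, 444, "2019-01-02"),
        (2, 222, "2019-01-02"),
        (2, 300, "2019-01-03"),  # hypothetical extra row, not in the question
    ],
)

# Uncorrelated IN: matches any row whose date is the latest date of *some* product.
in_rows = conn.execute(
    """
    SELECT ProductID, Quantity
    FROM Stock
    WHERE LastUpdate IN (SELECT MAX(LastUpdate) FROM Stock GROUP BY ProductID)
    ORDER BY ProductID, LastUpdate
    """
).fetchall()

# Join on both ProductID and the per-product maximum date.
join_rows = conn.execute(
    """
    SELECT a.ProductID, a.Quantity
    FROM Stock AS a
    INNER JOIN (
        SELECT ProductID, MAX(LastUpdate) AS LastUpdate
        FROM Stock
        GROUP BY ProductID
    ) b ON a.ProductID = b.ProductID AND a.LastUpdate = b.LastUpdate
    ORDER BY a.ProductID
    """
).fetchall()

print("IN:  ", in_rows)
print("JOIN:", join_rows)
```

With the extra row, the IN version also returns product 2's older 2019-01-02 row, because that date is the latest date of product 1. The join returns exactly one row per product.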

SQL Server Query for distinct rows

How do I query for distinct customers? Here's the table I have..
CustID DATE PRODUCT
=======================
1 Aug-31 Orange
1 Aug-31 Orange
3 Aug-31 Apple
1 Sept-24 Apple
4 Sept-25 Orange
This is what I want.
# of New Customers DATE
========================================
2 Aug-31
1 Sept-25
Thanks!
This is a bit tricky. You want to count the first date a customer appears and then do the aggregation:
select mindate, count(*) as NumNew
from (select CustId, min(Date) as mindate
from table t
group by CustId
) c
group by mindate
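A minimal sketch of the two-step aggregation above, run on the question's sample rows. Several things here are assumptions: SQLite (via Python's sqlite3) replaces SQL Server, the table is called Purchases, the date column is called PurchaseDate, and the dates are given a hypothetical year (2013) in ISO format so that MIN compares them chronologically.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchases (CustID INTEGER, PurchaseDate TEXT, Product TEXT)")
conn.executemany(
    "INSERT INTO Purchases VALUES (?, ?, ?)",
    [
        (1, "2013-08-31", "Orange"),
        (1, "2013-08-31", "Orange"),
        (3, "2013-08-31", "Apple"),
        (1, "2013-09-24", "Apple"),
        (4, "2013-09-25", "Orange"),
    ],
)

# Inner query: first purchase date per customer.
# Outer query: how many customers had their first purchase on each date.
new_customers = conn.execute(
    """
    SELECT mindate, COUNT(*) AS NumNew
    FROM (
        SELECT CustID, MIN(PurchaseDate) AS mindate
        FROM Purchases
        GROUP BY CustID
    ) c
    GROUP BY mindate
    ORDER BY mindate
    """
).fetchall()

print(new_customers)
```

Customer 1 is counted only once, on Aug-31, even though it appears on three rows; Sept-24 does not show up at all because no customer was new that day.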
You could use a simple common table expression to find the first time a customer ID appears:
WITH cte AS (
SELECT date, ROW_NUMBER() OVER (PARTITION BY custid ORDER BY date) rn
FROM customers
)
SELECT COUNT(*)[# of New Customers], date FROM cte
WHERE rn=1
GROUP BY date
ORDER BY date
An SQLfiddle to test with.

Display max of one column with multiple columns in output

How can I display maximum OrderId for a CustomerId with many columns?
I have a table with following columns:
CustomerId, OrderId, Status, OrderType, CustomerType
A customer with the same CustomerId can have many OrderIds (1, 2, 3, ...). I want to display the row with the max OrderId for each customer, along with the rest of its columns, in a SQL view. How can I achieve this?
Sample Data:
CustomerId OrderId OrderType
145042 1 A
110204 1 C
145042 2 D
162438 1 B
110204 2 B
103603 1 C
115559 1 D
115559 2 A
110204 3 A
I'd use a common table expression and ROW_NUMBER:
;With Ordered as (
select *,
ROW_NUMBER() OVER (PARTITION BY CustomerID
ORDER BY OrderId desc) as rn
from [Unnamed table from the question]
)
select * from Ordered where rn = 1
select * from table_name t
where orderid =
(select max(orderid) from table_name t2 where t2.customerid = t.customerid)
One way to do this is with not exists:
select t.*
from table t
where not exists (select 1
from table t2
where t2.CustomerId = t.CustomerId and
t2.OrderId > t.OrderId
);
This is saying: "get me all rows from t where there is no higher order id for the customer."
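Here is a small sketch of the NOT EXISTS form on the question's sample data. It assumes a table named Orders (the question does not name its table) and uses SQLite via Python's sqlite3 in place of SQL Server; max_orders is a hypothetical variable name.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (CustomerId INTEGER, OrderId INTEGER, OrderType TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        (145042, 1, "A"),
        (110204, 1, "C"),
        (145042, 2, "D"),
        (162438, 1, "B"),
        (110204, 2, "B"),
        (103603, 1, "C"),
        (115559, 1, "D"),
        (115559, 2, "A"),
        (110204, 3, "A"),
    ],
)

# Keep a row only if the same customer has no row with a higher OrderId.
max_orders = conn.execute(
    """
    SELECT t.CustomerId, t.OrderId, t.OrderType
    FROM Orders t
    WHERE NOT EXISTS (
        SELECT 1
        FROM Orders t2
        WHERE t2.CustomerId = t.CustomerId
          AND t2.OrderId > t.OrderId
    )
    ORDER BY t.CustomerId
    """
).fetchall()

for row in max_orders:
    print(row)
```

Because the subquery is correlated on CustomerId, customer 110204 contributes only its order 3, even though order 2 is the maximum for other customers.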

SQL: JOIN two tables with distinct rows from one table

This might be a very simple problem, but I haven't been able to get my head around it since last night.
I have 3 tables
VirtualLicense
VirtualLicenseId ProductName
-----------------------------------
1 Transaction
2 Query
3 Transaction
Product
ProductId Name
---------------------------
1 Transaction
2 Query
License
LicenseId ExpiryDate ProductId
-----------------------------------------
1 14/07/2013 1
2 13/07/2013 1
3 13/07/2013 2
4 14/07/2013 2
The VirtualLicense and License are joined using ProductName and ProductId mapping using the Product table.
I want to get combinations of VirtualLicenseId and LicenseId, so that each VirtualLicenseId is assigned a LicenseId. Once a LicenseId is assigned to a VirtualLicenseId, it should not be available for the following VirtualLicenseIds. Also, the license whose expiry date is nearer (smaller) should be assigned first.
So, the result for my example data set should be
VirtualLicenseId LicenseId
---------------------------------
1 2
2 3
3 1
I do not want to loop over any of the tables for this.
I hope my problem is clear from my description and data.
You can do something like this:
In the first CTE, assign rankings for VirtualLicenses within the Product groups.
In the second CTE, assign rankings for Licenses within the Product groups (ordered by expiry date).
At the end, just join the two CTEs on ProductId and ranking.
WITH CTE_VL AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY ProductId ORDER BY vl.VirtualLicenseId ASC) RN
FROM dbo.VirtualLicense vl
LEFT JOIN dbo.Product p ON vl.ProductName = p.Name
)
,CTE_License AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY ProductId ORDER BY ExpiryDate ASC) RN
FROM dbo.License
)
SELECT VirtualLicenseId, LicenseId
FROM CTE_VL vl
LEFT JOIN CTE_License l ON vl.ProductId = l.ProductID AND vl.RN = l.RN
SQLFiddle DEMO
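The rank-and-join idea can be tried locally with the sketch below. It is an approximation under stated assumptions: SQLite (3.25 or later, via Python's sqlite3) stands in for SQL Server, the dbo schema prefix is dropped, the expiry dates are rewritten in ISO format so they sort chronologically as text, and the CTEs select explicit columns instead of *.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE VirtualLicense (VirtualLicenseId INTEGER, ProductName TEXT);
    CREATE TABLE Product (ProductId INTEGER, Name TEXT);
    CREATE TABLE License (LicenseId INTEGER, ExpiryDate TEXT, ProductId INTEGER);

    INSERT INTO VirtualLicense VALUES (1, 'Transaction'), (2, 'Query'), (3, 'Transaction');
    INSERT INTO Product VALUES (1, 'Transaction'), (2, 'Query');
    INSERT INTO License VALUES
        (1, '2013-07-14', 1),
        (2, '2013-07-13', 1),
        (3, '2013-07-13', 2),
        (4, '2013-07-14', 2);
    """
)

# Rank virtual licenses and real licenses separately within each product,
# then pair them up where product and rank both match.
assignments = conn.execute(
    """
    WITH CTE_VL AS (
        SELECT vl.VirtualLicenseId, p.ProductId,
               ROW_NUMBER() OVER (PARTITION BY p.ProductId
                                  ORDER BY vl.VirtualLicenseId) AS RN
        FROM VirtualLicense vl
        LEFT JOIN Product p ON vl.ProductName = p.Name
    ),
    CTE_License AS (
        SELECT LicenseId, ProductId,
               ROW_NUMBER() OVER (PARTITION BY ProductId
                                  ORDER BY ExpiryDate) AS RN
        FROM License
    )
    SELECT vl.VirtualLicenseId, l.LicenseId
    FROM CTE_VL vl
    LEFT JOIN CTE_License l ON vl.ProductId = l.ProductId AND vl.RN = l.RN
    ORDER BY vl.VirtualLicenseId
    """
).fetchall()

print(assignments)
```

Each license has exactly one rank within its product, so it can be paired with at most one virtual license. License 4 stays unassigned because product Query has only one virtual license.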

T-Sql find duplicate row values

I want to write a stored procedure.
In that stored procedure, I want to find rows with duplicate values in a table, sum their quantities, and write the result back to the same table.
Let's say, I have a CustomerSales table;
ID SalesRepresentative Customer Quantity
1 Michael CustA 55
2 Michael CustA 10
and I need to turn table to...
ID SalesRepresentative Customer Quantity
1 Michael CustA 65
2 Michael CustA 0
When I find SalesRepresentative and Customer duplicates at the same time, I want to sum all Quantity values of these rows and assign to the first row of a table, and others will be '0'.
Could you help me?
To aggregate duplicates into one row:
SELECT min(ID) AS ID, SalesRepresentative, Customer
,sum(Quantity) AS Quantity
FROM CustomerSales
GROUP BY SalesRepresentative, Customer
ORDER BY min(ID)
Or, if you actually want those extra rows with 0 as Quantity in the result:
SELECT ID, SalesRepresentative, Customer
,CASE
WHEN (count(*) OVER (PARTITION BY SalesRepresentative,Customer)) = 1
THEN Quantity
WHEN (row_number() OVER (PARTITION BY SalesRepresentative,Customer
ORDER BY ID)) = 1
THEN sum(Quantity) OVER (PARTITION BY SalesRepresentative,Customer)
ELSE 0
END AS Quantity
FROM CustomerSales
ORDER BY ID
This makes heavy use of window functions.
Alternative version without window functions:
SELECT min(ID) AS ID, SalesRepresentative, Customer, sum(Quantity) AS Quantity
FROM CustomerSales
GROUP BY SalesRepresentative, Customer
UNION ALL
SELECT c.ID, c.SalesRepresentative, c.Customer, 0 AS Quantity
FROM CustomerSales c
LEFT JOIN (
SELECT min(ID) AS ID
FROM CustomerSales
GROUP BY SalesRepresentative, Customer
) x ON (x.ID = c.ID)
WHERE x.ID IS NULL
ORDER BY ID
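A minimal sketch of the UNION ALL variant, with the second branch written as a plain LEFT JOIN against the per-group minimum IDs (no GROUP BY before the join). As assumptions, SQLite via Python's sqlite3 stands in for SQL Server, and a third row (Anna, CustB) is added that is not in the question, to show that non-duplicated rows pass through unchanged.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CustomerSales "
    "(ID INTEGER, SalesRepresentative TEXT, Customer TEXT, Quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO CustomerSales VALUES (?, ?, ?, ?)",
    [
        (1, "Michael", "CustA", 55),
        (2, "Michael", "CustA", 10),
        (3, "Anna", "CustB", 7),  # hypothetical extra row, not in the question
    ],
)

# Branch 1: the first row of each group carries the group's total.
# Branch 2: every other row of the group is kept with Quantity 0.
consolidated = conn.execute(
    """
    SELECT min(ID) AS ID, SalesRepresentative, Customer, sum(Quantity) AS Quantity
    FROM CustomerSales
    GROUP BY SalesRepresentative, Customer
    UNION ALL
    SELECT c.ID, c.SalesRepresentative, c.Customer, 0 AS Quantity
    FROM CustomerSales c
    LEFT JOIN (
        SELECT min(ID) AS ID
        FROM CustomerSales
        GROUP BY SalesRepresentative, Customer
    ) x ON x.ID = c.ID
    WHERE x.ID IS NULL
    ORDER BY ID
    """
).fetchall()

for row in consolidated:
    print(row)
```

This only computes the desired result set. Writing it back to the table inside a stored procedure, as the question asks, would still need an UPDATE based on it.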