Multiple Top 1 based on count - sql-server-2012

I have a table with multiple customers and multiple transaction dates.
Cust_ID Trans_Date
------- ----------
C01 2012-02-18
C01 2012-02-27
C01 2012-03-09
C02 2012-02-15
C02 2012-03-09
C03 2012-03-30
C01 2013-01-14
C02 2013-02-21
C03 2013-01-15
C03 2013-03-07
I want to find the customer with the most transactions in each year, along with that customer's transaction count.
Below is the result I am expecting.
Year Cust_ID nTrans
---- ------- ------
2012 C01 3
2013 C03 2
Can anybody help with the script? SQL Server version 2012.
Thanking you in advance,
Thomas

This is the "greatest N per group" problem. It's usually solved with row_number().
;WITH CTE AS (
    SELECT YEAR(Trans_Date) AS Year,
           Cust_ID,
           COUNT(*) AS nTrans,
           ROW_NUMBER() OVER (PARTITION BY YEAR(Trans_Date) ORDER BY COUNT(*) DESC) AS rn
    FROM Transactions  -- placeholder table name
    GROUP BY YEAR(Trans_Date),
             Cust_ID
)
SELECT Year,
       Cust_ID,
       nTrans
FROM CTE
WHERE rn = 1
ORDER BY Year
Strictly speaking, the ROW_NUMBER() here isn't ordered in a deterministic way. As written, if there's a tie in the count, the query just returns one Cust_ID, but there's no guarantee which ID will be returned. It should either be ORDER BY COUNT(*) DESC, Cust_ID to make the results consistent, or you should use RANK() or DENSE_RANK() to allow for ties.
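As a minimal sketch of the tie-friendly option mentioned above, the same query can use RANK() so that every customer sharing the top count in a year is returned. The table name Transactions and the alias rnk are placeholders chosen for illustration, not names from the question.

```sql
-- Sketch: RANK() gives tied customers the same rank, so all of them pass rnk = 1.
-- Transactions is a placeholder table name; rnk is an illustrative alias.
WITH CTE AS (
    SELECT YEAR(Trans_Date) AS Year,
           Cust_ID,
           COUNT(*) AS nTrans,
           RANK() OVER (PARTITION BY YEAR(Trans_Date) ORDER BY COUNT(*) DESC) AS rnk
    FROM Transactions
    GROUP BY YEAR(Trans_Date), Cust_ID
)
SELECT Year, Cust_ID, nTrans
FROM CTE
WHERE rnk = 1
ORDER BY Year, Cust_ID;
```

With the sample data there are no ties, so this returns the same two rows as the ROW_NUMBER() version.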

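I haven't had the chance to test it, but your solution should look something like this. SQL Server does not allow the Year alias in GROUP BY, and HAVING needs a comparison rather than a bare MAX(nTrans), so the counts are computed first and then compared with each year's maximum:
WITH counts AS (
    SELECT YEAR(Trans_Date) AS Year, Cust_ID, COUNT(*) AS nTrans
    FROM Transactions
    GROUP BY YEAR(Trans_Date), Cust_ID
)
SELECT Year, Cust_ID, nTrans
FROM counts AS c
WHERE nTrans = (SELECT MAX(c2.nTrans) FROM counts AS c2 WHERE c2.Year = c.Year);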
Have a look at Group by functions in SQL.

You can use the MAX() function to find the largest value in a column. In this case you can apply MAX(nTrans) to the per-year counts, once they have been computed in a subquery or CTE.
E.g.:
SELECT MAX(column_name) FROM table_name;

Related

How to avoid sum(sum()) when writing this Postgres query with window functions?

Runnable query example at https://www.db-fiddle.com/f/ssrpQyyajYdZkkkAJBaYUp/0
I have a postgres table of sales; each row has a sale_id, product_id, salesperson, and price.
I want to write a query that returns, for each (salesperson, product_id) tuple with at least one sale:
The total of price for all of the sales made by that salesperson for that product (call this product_sales).
The total of price over all of that salesperson's sales (call this total_sales).
My current query is as follows, but I feel silly writing sum(sum(price)). Is there a more standard/idiomatic approach?
select
salesperson,
product_id,
sum(price) as product_sales,
sum(sum(price)) over (partition by salesperson) as total_sales
from sales
group by 1, 2
order by 1, 2
Writing sum(price) instead of sum(sum(price)) yields the following error:
column "sales.price" must appear in the GROUP BY clause or be used in an aggregate function
UPDATES
See this response for a nice approach using a WITH clause. I feel like I ought to be able to do this without a subquery or WITH.
Just stumbled on this response to a different question which proposes both sum(sum(...)) and a subquery approach. Perhaps these are the best options?
You can use a Common Table Expression to simplify the query and do it in two steps.
For example:
with
s as (
select
salesperson,
product_id,
sum(price) as product_sales
from sales
group by salesperson, product_id
)
select
salesperson,
product_id,
product_sales,
sum(product_sales) over (partition by salesperson) as total_sales
from s
order by salesperson, product_id
Result:
salesperson  product_id  product_sales  total_sales
-----------  ----------  -------------  -----------
Alice        1           2000           5400
Alice        2           2200           5400
Alice        3           1200           5400
Bobby        1           2000           4300
Bobby        2           1100           4300
Bobby        3           1200           4300
Chuck        1           2000           4300
Chuck        2           1100           4300
Chuck        3           1200           4300
See running example at DB Fiddle.
You can also try the query below, which uses only window functions and keeps one row per (salesperson, product_id):
select * from
(
select
salesperson,
product_id,
sum(price) over(partition by salesperson,product_id) as product_sales,
sum(price) over(partition by salesperson) as total_sales,
row_number() over(partition by salesperson,product_id order by sale_id) as rn
from sales s
)A where rn=1
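A variant of the window-only query above, offered as a sketch rather than something taken from the answers: both window sums are identical for every sale in the same (salesperson, product_id) pair, so SELECT DISTINCT can collapse the duplicates without the row_number() filter. It assumes the sales table from the question.

```sql
-- Sketch for PostgreSQL, using the sales table from the question.
-- No GROUP BY is involved, so no nested sum(sum(...)) is needed.
select distinct
       salesperson,
       product_id,
       sum(price) over (partition by salesperson, product_id) as product_sales,
       sum(price) over (partition by salesperson)             as total_sales
from sales
order by salesperson, product_id;
```

For comparison, the original sum(sum(price)) form works because the inner sum is the GROUP BY aggregate and the outer sum is a window function evaluated over the already grouped rows.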

SQL group by data with row separate

I would like to group by Customer & Date and generate count columns for 2 separate values (Flag=Y and Flag=N). Input table looks like this:
Customer Date Flag
------- ------- -----
001 201201 Y
001 201202 Y
001 201203 Y
001 201204 N
001 201205 N
001 201206 Y
001 201207 Y
001 201208 Y
001 201209 N
002 201201 N
002 201202 Y
002 201203 Y
002 201205 N
The output should look like this:
Customer  MinDate  MaxDate  Count_Y
--------  -------  -------  -------
001       201201   201203   3
001       201206   201208   3
002       201202   201203   2
How can I write the SQL query? Any kind of help is appreciated! Thanks!
You want to find consecutive values of "Y". This is a "gaps-and-islands" problem, and there are two basic approaches:
Determine the first "Y" in each group and use this information to define a group of consecutive "Y" values.
Use the difference of row_number() values for the calculation.
The first depends on SQL Server 2012+ and you haven't specified the version. So, the second looks like this:
select customer, min(date) as mindate, max(date) as maxdate,
       count(*) as numYs
from (select t.*,
             row_number() over (partition by customer order by date) as seqnum_cd,
             row_number() over (partition by customer, flag order by date) as seqnum_cfd
      from t
     ) t
where flag = 'Y'
group by customer, (seqnum_cd - seqnum_cfd), flag;
It is a little tricky to explain how this works. In my experience, though, if you run the subquery, you will see how the seqnum columns are calculated and "get it" by observing the results.
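To make the row_number() difference concrete, here is the inner query restricted to customer 001, with the values it should produce for the question's sample data written out by hand in the comments. It assumes the placeholder table t from the answer and that customer is stored as text.

```sql
-- Inner query only, limited to one customer for brevity.
select customer, date, flag,
       row_number() over (partition by customer order by date)       as seqnum_cd,
       row_number() over (partition by customer, flag order by date) as seqnum_cfd
from t
where customer = '001'
order by date;

-- date    flag  seqnum_cd  seqnum_cfd  seqnum_cd - seqnum_cfd
-- 201201  Y     1          1           0
-- 201202  Y     2          2           0
-- 201203  Y     3          3           0
-- 201204  N     4          1           3
-- 201205  N     5          2           3
-- 201206  Y     6          4           2
-- 201207  Y     7          5           2
-- 201208  Y     8          6           2
-- 201209  N     9          3           6
```

Rows that share both the flag and the difference form one island: (Y, 0) covers 201201 to 201203 and (Y, 2) covers 201206 to 201208, which is why the outer query groups by that difference.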
Note: This assumes that there is at most one record per day. If there are more, you can use dense_rank() instead of row_number() for the same effect.
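The first approach mentioned above (marking the first "Y" of each run) is not shown in the answer. A minimal sketch of it follows, assuming SQL Server 2012+ for lag() and the same placeholder table t. The names island_start and grp are chosen here for illustration only.

```sql
-- Sketch of the lag()-based approach (SQL Server 2012+).
-- island_start flags a 'Y' row whose previous row (per customer) is not 'Y';
-- the running sum of those flags numbers the islands within each customer.
select customer, min(date) as mindate, max(date) as maxdate, count(*) as numYs
from (select s.*,
             sum(island_start) over (partition by customer order by date) as grp
      from (select t.*,
                   case when flag = 'Y'
                         and coalesce(lag(flag) over (partition by customer order by date), 'N') <> 'Y'
                        then 1 else 0 end as island_start
            from t
           ) s
     ) g
where flag = 'Y'
group by customer, grp;
```

Like the row_number() version, this relies on there being at most one row per customer and date, so that the running sum is well defined.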
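Try the query below, which gives you exactly the output you want. Note that the sample values are loaded into a datetime column, where SQL Server reads a six-digit string such as '201201' as yyMMdd; FORMAT(..., 'yyMMdd') converts it back for display.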
DROP TABLE [GroupCustomer]
GO
CREATE TABLE [dbo].[GroupCustomer](
Customer VARCHAR(50),
[Date] [datetime] NULL,
Flag VARCHAR(1)
)
INSERT INTO [dbo].[GroupCustomer] (Customer ,[Date],Flag)
VALUES ('001','201201','Y'),('001','201202','Y'),
('001','201203','Y'),('001','201204','N'),
('001','201205','N'),('001','201206','Y'),
('001','201207','Y'),('001','201208','Y'),
('001','201209','N'),('002','201201','N'),
('002','201202','Y'),('002','201203','Y'),
('002','201205','N')
GO
;WITH cte_cnt
AS
(
SELECT Customer,Format(MIN([Date]),'yyMMdd') AS MinDate
,Format(MAX([Date]),'yyMMdd') AS MaxDate
, COUNT('A') AS Count_Y
FROM (
SELECT Customer,Flag,[Date],
ROW_NUMBER() OVER(Partition by customer ORDER BY [Date]) AS ROW_NUMBER,
DATEDIFF(D, ROW_NUMBER() OVER(Partition by customer ORDER BY [Date])
, [Date]) AS Diff
FROM [GroupCustomer]
WHERE Flag='Y') AS dt
GROUP BY Customer,Flag, Diff )
SELECT *
FROM cte_cnt c
ORDER BY Customer
GO

How to get multiple rows based on max date

I have a table SalePrices in SQL Server with data like this:
SPID ProductID Price Date
001 Pro01 10 2016-03-10
002 Pro01 20 2016-03-11
003 Pro02 10 2016-03-13
004 Pro02 20 2016-03-15
I want to create a view that shows, for each ProductID, only the Price I modified most recently. The result should look like this:
ProductID Price Date
Pro01 20 2016-03-11
Pro02 20 2016-03-15
There are a few different approaches for this, for example, using row_number():
;with cte as (
select
ProductID, Price, Date,
row_number() over(partition by ProductID order by Date desc) as rn
from <Table>
)
select
ProductID, Price, Date
from cte
where
rn = 1
sql fiddle demo
Another version with windowing functions, this one with FIRST_VALUE():
SELECT ProductID, price, date
FROM products
WHERE spid IN (
SELECT FIRST_VALUE(spid) OVER (PARTITION BY ProductID ORDER BY date DESC) spid
FROM products
)
An SQLfiddle to test with.
Note that Roman's version with ROW_NUMBER should work from SQL Server 2005 and newer, while this will only work for SQL Server 2012 and newer.
Try this:
SELECT A.ProductID,
       A.Price,
       A.Date
FROM tablename AS A
JOIN (SELECT ProductID, MAX(Date) AS Date
      FROM tablename
      GROUP BY ProductID
     ) AS B ON A.Date = B.Date AND A.ProductID = B.ProductID
One more approach:
select productid, price, date
from SalePrices t1
where date = (select max(date) from SalePrices t2 where t1.productid = t2.productid)
If SPID always increases with each price change, your last record will have the highest SPID:
select
ProductId, Price, Date
from
SalePrices sap
where
sap.spid =(
select
max(sap2.spid)
from
SalePrices sap2
where
sap2.productId = sap.productId)
This query will give you the desired result:
ProductID Price Date
Pro01 20 2016-03-11
Pro02 20 2016-03-15
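The question asks for a view, which none of the queries above actually creates. A minimal sketch of wrapping the row_number() approach in one follows. LatestSalePrices is a hypothetical view name, and using SPID as a tie-breaker for rows with the same Date is an assumption about the data, not something stated in the question.

```sql
-- Sketch only: LatestSalePrices is a hypothetical view name.
-- SalePrices, SPID, ProductID, Price and Date come from the question.
CREATE VIEW LatestSalePrices
AS
SELECT ProductID, Price, [Date]
FROM (
    SELECT ProductID, Price, [Date],
           -- SPID DESC only breaks ties between rows sharing the same Date (assumption)
           ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY [Date] DESC, SPID DESC) AS rn
    FROM SalePrices
) AS ranked
WHERE rn = 1;
```

Selecting from the view (SELECT * FROM LatestSalePrices) should then return one row per ProductID. The MAX(Date) join and correlated-subquery versions would instead return several rows for a product if two prices shared the same latest date.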

How can I generate the latest for an aggregate?

Hey stackoverflow community,
I have a table of Sales; a hypothetical sample is shown below.
Customer Revenue State Date
David $100 NY 2016-01-01
David $500 NJ 2016-01-03
Fred $200 CA 2016-01-01
Fred $200 CA 2016-01-02
I'm writing a simple query of revenue generated by customer. The output returns as such:
David $600
Fred $400
What I want to do now is add a column for the latest purchase date.
Desired result:
David $600 2016-01-03
Fred $400 2016-01-02
I would like to keep the SQL code as clean as possible. I also want to avoid doing a JOIN to a new query as this query can start to get complex. Any ideas as to how to do so?
You should sum revenues in your group and get the maximum of dates.
Something like this:
SELECT
Customer, SUM(Revenue) as RevenueSum, MAX([Date]) as [Date]
FROM Sales
GROUP BY Customer
I think it's what you need
select Customer,sum(Revenue), max(Date) from Sales group by Customer
One way to get the SUM of Revenue and also get the information from the record with the MAX Date is to use the ROW_NUMBER() and SUM() windowed functions.
The SUM() OVER() will apply the sum for the Customer to each row and the ROW_NUMBER() OVER() will give each row an order number by Customer and Date DESC.
Put this in a subquery and select only the records with a Row_Number of 1 (the max date):
SELECT [Customer],
[Revenue],
[State],
[Date]
FROM (SELECT [Customer],
SUM([Revenue]) OVER (PARTITION BY [Customer]) [Revenue],
[State],
[Date],
ROW_NUMBER() OVER (PARTITION BY [Customer] ORDER BY [Date] DESC) Rn
FROM Sales
) t
WHERE t.Rn = 1

SQL How to order each entry by date

I have a list of customer_ids, the date on which some information was changed, and the corresponding changes. I would like to number each change, by order of date, on each customer. So for example; I have something that looks like the following
Cust_id Date information
-----------------------------------------------------
12345 2015-04-03 blue hat
12345 2015-04-05 red scarf
54321 2015-04-12 yellow submarine
and I would like an output which looks something like this;
cust_id    change_number    Date          information
---------------------------------------------------------------
12345      1                2015-04-03    blue hat
12345      2                2015-04-05    red scarf
54321      1                2015-04-12    yellow submarine
This will be quite a big table, so it will need to be somewhat efficient.
There will be at most 1 entry per customer per day.
Any help you can give is appreciated.
If you want to order over a change number like that you need to use an inner select like this:
SELECT *
FROM (
SELECT *
, ROW_NUMBER() OVER (PARTITION BY Cust_id ORDER BY [Date]) As Change_Number
FROM yourTable) t
ORDER BY
Cust_id, Change_Number;
As Indian said, try this:
select cust_id,
Row_number() over(partition by cust_id order by date) change_number,
Date,
information
from tablename;
Simply use the ORDER BY clause:
SELECT *
FROM (
SELECT *
, ROW_NUMBER() OVER (PARTITION BY Cust_id ORDER BY [Date]) As Change_Number
FROM yourTable) t
ORDER BY
Cust_id, Change_Number;
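Since the question mentions that the table will be large, an index whose key matches the PARTITION BY and ORDER BY columns may let SQL Server read the rows already in the order the window function needs. This is a sketch under assumptions: IX_yourTable_Cust_Date is a hypothetical index name, yourTable is the placeholder used in the answers, and whether the optimizer actually uses the index depends on the table and the plan.

```sql
-- Sketch: hypothetical index name; column names come from the question.
-- INCLUDE lets the index cover the query; drop it if information is a
-- legacy text/ntext/image column, which cannot be an included column.
CREATE INDEX IX_yourTable_Cust_Date
    ON yourTable (Cust_id, [Date])
    INCLUDE (information);
```

Comparing the execution plan before and after creating the index is the simplest way to check whether the sort step disappears.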