Query to Show Max Sales for Each Seller - sql

I have a table like this (sample):
Name_Seller Month Value
---------------------------
Seller A Jan 200
Seller B Jan 100
Seller A Fev 300
Seller B Fev 100
Seller C Jan 400
Seller A Mar 200
Seller D Jan 300
SQL query:
SELECT Name_Seller, Month, Value
FROM SALES
WHERE Value = (SELECT MAX(Value) FROM SALES GROUP BY Name_Seller);
And I'd like to print for each seller which was his maximum sale and when it was.
Could you help me fix my query and explain why it does not work?
I tried:
select name_seller, month, max(value)
from sales
group by name_seller, month;
but this query returns:
+---------------+------------+------+
| NAME_SELLER | MAX(VALUE) | MONTH|
+---------------+------------+------+
| SELLER A | 4182.00 | Jan |
| SELLER A | 3261.00 | Fev |
| SELLER A | 4219.00 | Mar |
| SELLER B | 2123.00 | Jan |
| SELLER B | 2111.00 | Fev |
| SELLER B | 3918.00 | Mar |
| SELLER C | 3000.00 | Jan |
| SELLER C | 4000.00 | Fev |
| SELLER C | 1500.00 | Mar |
| SELLER D | 2819.00 | Jan |
| SELLER D | 3881.00 | Fev |
| SELLER D | 2012.00 | Mar |
+---------------+------------+------+
And I would like just THE TOP sale of each salesman and when it was.
So it should return just one sale for each salesman.

With ROW_NUMBER() window function:
SELECT t.Name_Seller, t.Month, t.Value
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Name_Seller ORDER BY Value DESC) rn
FROM SALES
) t
WHERE t.rn = 1
Replace ROW_NUMBER() with RANK() if you want ties returned.
Or with a correlated subquery in the WHERE clause:
SELECT s.* FROM SALES s
WHERE s.Value = (SELECT MAX(VALUE) FROM SALES WHERE Name_Seller = s.Name_Seller)
Or if your database supports it:
SELECT * FROM SALES
WHERE (Name_Seller, Value) IN (SELECT Name_Seller, MAX(VALUE) FROM SALES GROUP BY Name_Seller)
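To see how ROW_NUMBER() and RANK() differ on the sample data from the question, here is a minimal sketch that runs both in an in-memory SQLite database from Python. Using Python's sqlite3 module (with SQLite 3.25 or later for window functions) is an assumption made for illustration; the question does not name a database.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (name_seller TEXT, month TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Seller A", "Jan", 200), ("Seller B", "Jan", 100),
        ("Seller A", "Fev", 300), ("Seller B", "Fev", 100),
        ("Seller C", "Jan", 400), ("Seller A", "Mar", 200),
        ("Seller D", "Jan", 300),
    ],
)


def top_sales(ranking_function):
    """Return each seller's top rows using the given window function name."""
    query = f"""
        SELECT name_seller, month, value
        FROM (
            SELECT s.*,
                   {ranking_function}() OVER (
                       PARTITION BY name_seller ORDER BY value DESC
                   ) AS rn
            FROM sales s
        ) t
        WHERE rn = 1
        ORDER BY name_seller, month
    """
    return conn.execute(query).fetchall()


row_number_rows = top_sales("ROW_NUMBER")  # exactly one row per seller
rank_rows = top_sales("RANK")              # tied top sales are all kept

print(row_number_rows)
print(rank_rows)
```

Seller B has two sales of 100, so ROW_NUMBER() keeps one of them arbitrarily, while RANK() returns both.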

Your query can also return the right rows, but the "=" operator in the WHERE clause has to become "IN": the subquery returns more than one row (one maximum per seller), and "=" accepts only a single value. With your sample data the changed query returns the correct results, but be wary of using it in general. It compares each row's value against every seller's maximum, so a sale that is not a seller's best is still returned whenever it equals some other seller's maximum; see the example given by @forpas.
With the operator changed, your query runs:
SELECT Name_Seller, Month, Value FROM SALES
WHERE Value IN (Select MAX(Value) FROM SALES GROUP BY Name_Seller);
You can also use the rank() window function
SELECT Name_Seller, Month, VALUE
FROM (SELECT Name_Seller, Month, VALUE,
RANK() OVER (PARTITION BY Name_Seller ORDER BY VALUE DESC ) as RN
FROM SALES
) A
WHERE A.RN = 1
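The caveat about the uncorrelated IN can be reproduced with one extra row. The sketch below uses Python's sqlite3 module as an assumed test bed and adds a hypothetical sale (Seller C, Fev, 300) that is not in the question's data: it is not Seller C's best sale, but it equals the maximum of Sellers A and D.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (name_seller TEXT, month TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Seller A", "Jan", 200), ("Seller B", "Jan", 100),
        ("Seller A", "Fev", 300), ("Seller B", "Fev", 100),
        ("Seller C", "Jan", 400), ("Seller A", "Mar", 200),
        ("Seller D", "Jan", 300),
        ("Seller C", "Fev", 300),  # hypothetical row, added to expose the problem
    ],
)

# Uncorrelated: a row matches if its value equals ANY seller's maximum.
uncorrelated = conn.execute("""
    SELECT name_seller, month, value
    FROM sales
    WHERE value IN (SELECT MAX(value) FROM sales GROUP BY name_seller)
""").fetchall()

# Correlated: a row matches only if its value equals ITS OWN seller's maximum.
correlated = conn.execute("""
    SELECT s.name_seller, s.month, s.value
    FROM sales s
    WHERE s.value = (SELECT MAX(value) FROM sales
                     WHERE name_seller = s.name_seller)
""").fetchall()

# Rows that only the uncorrelated version returns, i.e. the false positives.
wrong_rows = sorted(set(uncorrelated) - set(correlated))
print(wrong_rows)
```

The only difference between the two result sets is the hypothetical 300 sale of Seller C, which the uncorrelated version reports as a top sale even though Seller C's maximum is 400.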

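Grouping by seller and month looks like this, but note that it returns one row per seller and month, not one row per seller: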
SELECT Name_Seller, Month, MAX(Value)
FROM SALES
GROUP BY Name_Seller, Month;

You can use the query below to get the maximum per seller and month:
select name_seller, month, max(value) from sales group by name_seller, month;
If you want one row per seller together with the month of that top sale, join back to the per-seller maximum:
select s1.name_seller, s1.month, s1.value from sales s1
inner join
(select name_seller, max(value) as value from sales
group by name_seller) s2
on (s1.name_seller = s2.name_seller and s1.value = s2.value);

Related

Running Total OVER clause, but Select Distinct instead of sum?

I have the following data set:
| EMAIL | SIGNUP_DATE |
| A@ABC.COM | 1/1/2021 |
| B@ABC.COM | 1/2/2021 |
| C@ABC.COM | 1/3/2021 |
In order to find the running total of email signups as of a certain day, I ran the following sql query:
select
signup_date,
count(email) OVER (order by signup_date ASC) as running_total_signups
I got the following results:
| SIGNUP_DATE | RUNNING_TOTAL_SIGNUPS |
| 1/1/21 | 1 |
| 1/2/21 | 2 |
| 1/3/21 | 3 |
However for my next step, I want to be able to see not just the running total signups, but the actual signup names themselves. Therefore I want to run the same window function (count(email) OVER (order by signup_date ASC)) but instead of a count(email) just a select distinct email. This would hopefully result in the following output:
| SIGNUP_DATE | RUNNING_TOTAL_SIGNUPS |
| 1/1/21 | a@abc.com |
| 1/2/21 | a@abc.com |
| 1/2/21 | b@abc.com |
| 1/3/21 | a@abc.com |
| 1/3/21 | b@abc.com |
| 1/3/21 | c@abc.com |
How would I do this? I'm getting an error on this code:
select
signup_date,
distinct email OVER (order by signup_date ASC) as running_total_signups
One way is to cross join the running counts with themselves and keep the pairs whose total is less than or equal to the current row's running total:
with counts as (
select *,
Count(*) over (order by SIGNUP_DATE asc) as tot
from t
)
select c2.EMAIL, c1.SIGNUP_DATE
from counts c1
cross join counts c2
where c2.tot <= c1.tot
"I want to run the same window function (count(email) OVER (order by signup_date ASC)) but instead of a count(email) just a select distinct email"
Why do you want the COUNT() window function?
It has nothing to do with your requirement.
All you need is a simple self join:
SELECT t1.SIGNUP_DATE, t2.EMAIL
FROM tablename t1 INNER JOIN tablename t2
ON t2.SIGNUP_DATE <= t1.SIGNUP_DATE
ORDER BY t1.SIGNUP_DATE, t2.EMAIL;
This works for your sample data, but if your table can contain more than one row per day, use:
SELECT t1.SIGNUP_DATE, t2.EMAIL
FROM (SELECT DISTINCT SIGNUP_DATE FROM tablename) t1 INNER JOIN tablename t2
ON t2.SIGNUP_DATE <= t1.SIGNUP_DATE
ORDER BY t1.SIGNUP_DATE, t2.EMAIL;
It's actually slightly simpler than Stu proposed:
select
x2.signup_date,
x1.email
from
signups x1
INNER JOIN signups x2 ON x1.signup_date <= x2.signup_date
order by x2.signup_date
If you join the table to itself on any date that is less than or equal to the other row's date, it causes a half cartesian explosion. The lowest dated row matches only itself. The next one matches itself and the earlier one, so one of the table aliases has its data repeated. This continues adding more rows to the explosion as the dates increase.
In the joined result set, the emails we want come from x1 and the dates from x2.
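The half cartesian join can be inspected on the three sample rows. This is a minimal sketch using Python's sqlite3 module; the table name signups follows the query above, while the ISO date strings are an assumption so that text comparison orders the dates correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (email TEXT, signup_date TEXT)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?)",
    [
        ("a@abc.com", "2021-01-01"),
        ("b@abc.com", "2021-01-02"),
        ("c@abc.com", "2021-01-03"),
    ],
)

# Each x2 row (the "as of" date) is paired with every x1 row signed up on or before it.
running_emails = conn.execute("""
    SELECT x2.signup_date, x1.email
    FROM signups x1
    INNER JOIN signups x2 ON x1.signup_date <= x2.signup_date
    ORDER BY x2.signup_date, x1.email
""").fetchall()

for signup_date, email in running_emails:
    print(signup_date, email)
```

Three input rows produce 1 + 2 + 3 = 6 output rows, which is the shape asked for in the question.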

Counting current items by month

I'm trying to build a monthly tally of active equipment, grouped by service area from a database log table. I think I'm 90% of the way there; I have a list of months, along with the total number of items that existed, and grouped by region.
However, I also need to know the state of each item as it was on the first of each month, and this is the part I'm stuck on. For instance, Item 1 is in region A in January, but moves to Region B in February. Item 2 is marked as 'inactive' in February, so shouldn't be counted. My existing query will always count item 1 in region A, and item 2 as 'active'.
I can correctly show that Item 3 is deleted in March, and Item 4 doesn't show up until the April count. I realize that I'm getting the first values because my query is specifying the min date, I'm just not sure how I need to change it to get what I want.
I think I'm looking for a way to group by Max(OperationDate) for each Month.
The Table looks like this:
| EQUIPID | EQUIPNAME | EQUIPACTIVE | DISTRICT | REGION | OPERATIONDATE | OPERATION |
|---------|-----------|-------------|----------|--------|----------------------|-----------|
| 1 | Item 1 | 1 | 1 | A | 2015-01-01T00:00:00Z | INS |
| 2 | Item 2 | 1 | 1 | A | 2015-01-01T00:00:00Z | INS |
| 3 | Item 3 | 1 | 1 | A | 2015-01-01T00:00:00Z | INS |
| 2 | Item 2 | 0 | 1 | A | 2015-02-10T00:00:00Z | UPD |
| 1 | Item 1 | 1 | 1 | B | 2015-02-15T00:00:00Z | UPD |
| 3 | (null) | (null) | (null) | (null) | 2015-02-21T00:00:00Z | DEL |
| 1 | Item 1 | 1 | 1 | A | 2015-03-01T00:00:00Z | UPD |
| 4 | Item 4 | 1 | 1 | B | 2015-03-10T00:00:00Z | INS |
There is also a subtable that holds attributes that I care about. Its structure is similar. Unfortunately, due to previous design decisions, there is no correlation to operations between the two tables. Any joins will need to be done using the EquipmentID, and have the overlapping states matched up for each date.
Current query:
--cte to build date list
WITH calendar (dt) AS
(SELECT &fromdate from dual
UNION ALL
SELECT Add_Months(dt,1)
FROM calendar
WHERE dt < &todate)
SELECT dt, a.district, a.region, count(*)
FROM
(SELECT EQUIPID, DISTRICT, REGION, OPERATION, MIN(OPERATIONDATE ) AS FirstOp, deleted.deldate
FROM Equipment_Log
LEFT JOIN
(SELECT EQUIPID,MAX(OPERATIONDATE) as DelDate
FROM Equipment_Log
WHERE OPERATION = 'DEL'
GROUP BY EQUIPID
) Deleted
ON Equipment_Log.EQUIPID = Deleted.EQUIPID
WHERE OPERATION <> 'DEL' --AND additional unimportant filters
GROUP BY EQUIPID,DISTRICT, REGION , OPERATION, deldate
) a
INNER JOIN calendar
ON (calendar.dt >= FirstOp AND calendar.dt < deldate)
OR (calendar.dt >= FirstOp AND deldate is null)
LEFT JOIN
( SELECT EQUIPID, MAX(OPERATIONDATE) as latestop
FROM SpecialEquip_Table_Log
--where SpecialEquip filters
group by EQUIPID
) SpecialEquip
ON a.EQUIPID = SpecialEquip.EQUIPID and calendar.dt >= SpecialEquip.latestop
GROUP BY dt, district, region
ORDER BY dt, district, region
Take only the last operation for each id in each month. This is what row_number() and where rn = 1 do.
You now have a calendar and the data, so make a partitioned outer join between them.
I assumed that you need to fill in values for months where an id has no entries. That is why nvl(lag() ignore nulls) is needed: if something appeared in January, it still exists in February and March, and we need the district and region values from the last non-empty row.
Now you have everything needed for the count. The part where you mention SpecialEquip_Table_Log is up to you: you left-joined this table but did not use it afterwards, so it is unclear what it is for. Join it if you need it; you have the id.
with
calendar(mth) as (
select date '2015-01-01' from dual union all
select add_months(mth, 1) from calendar where mth < date '2015-05-01'),
data as (
select id, dis, reg, dt, op, act
from (
select equipid id, district dis, region reg,
to_char(operationdate, 'yyyy-mm') dt,
row_number()
over (partition by equipid, trunc(operationdate, 'month')
order by operationdate desc) rn,
operation op, nvl(equipactive, 0) act
from t)
where rn = 1 )
select mth, dis, reg, sum(act) cnt
from (
select id, mth,
nvl(dis, lag(dis) ignore nulls over (partition by id order by mth)) dis,
nvl(reg, lag(reg) ignore nulls over (partition by id order by mth)) reg,
nvl(act, lag(act) ignore nulls over (partition by id order by mth)) act
from calendar
left join data partition by (id) on dt = to_char(mth, 'yyyy-mm') )
group by mth, dis, reg
having sum(act) > 0
order by mth, dis, reg
It may seem complicated, so please run subqueries separately at first to see what is going on. And test :) Hope this helps.
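If you want to experiment outside Oracle, here is a rough sketch of a different technique, not the partitioned outer join above: for every month start it picks each item's latest log row dated on or before that day, using a correlated subquery. It runs in SQLite through Python's sqlite3 module; the table name equipment_log comes from the question, while the ISO date strings, the calendar range and the "state as of the first of the month" reading are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE equipment_log (
        equipid INTEGER, equipname TEXT, equipactive INTEGER,
        district INTEGER, region TEXT, operationdate TEXT, operation TEXT
    )
""")
conn.executemany(
    "INSERT INTO equipment_log VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "Item 1", 1, 1, "A", "2015-01-01", "INS"),
        (2, "Item 2", 1, 1, "A", "2015-01-01", "INS"),
        (3, "Item 3", 1, 1, "A", "2015-01-01", "INS"),
        (2, "Item 2", 0, 1, "A", "2015-02-10", "UPD"),
        (1, "Item 1", 1, 1, "B", "2015-02-15", "UPD"),
        (3, None, None, None, None, "2015-02-21", "DEL"),
        (1, "Item 1", 1, 1, "A", "2015-03-01", "UPD"),
        (4, "Item 4", 1, 1, "B", "2015-03-10", "INS"),
    ],
)

monthly_counts = conn.execute("""
    WITH RECURSIVE calendar(mth) AS (
        SELECT '2015-01-01'
        UNION ALL
        SELECT date(mth, '+1 month') FROM calendar WHERE mth < '2015-04-01'
    )
    SELECT c.mth, l.district, l.region, COUNT(*) AS cnt
    FROM calendar c
    JOIN equipment_log l
      -- keep only the latest log row per item that is not after the month start
      ON l.operationdate = (SELECT MAX(l2.operationdate)
                            FROM equipment_log l2
                            WHERE l2.equipid = l.equipid
                              AND l2.operationdate <= c.mth)
    WHERE l.operation <> 'DEL'   -- deleted items no longer exist
      AND l.equipactive = 1      -- inactive items are not counted
    GROUP BY c.mth, l.district, l.region
    ORDER BY c.mth, l.district, l.region
""").fetchall()

for row in monthly_counts:
    print(row)
```

With this reading, Item 1's stay in region B (15 February to 1 March) never appears, because it is back in region A by the first of March; if you need month-end state instead, compare against the last day of each month.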

Select Top 20 Distinct Rows in Each Category

I have a database table in the following format.
Product | Date | Score
A | 01/01/18 | 99
B | 01/01/18 | 98
C | 01/01/18 | 97
--------------------------
A | 02/01/18 | 99
B | 02/01/18 | 98
C | 02/01/18 | 97
--------------------------
D | 03/01/18 | 99
A | 03/01/18 | 98
B | 03/01/18 | 97
C | 03/01/18 | 96
I want to pick the first from every month such that there are no repeat products. For example, the output of the above table should be
Product | Date | Score
A | 01/01/18 | 99
B | 02/01/18 | 98
D | 03/01/18 | 99
How do I get this result with a single sql query? The actual table is much bigger than this and I want top 20 from every month without repetition.
This is a hard problem -- a type of subgraph problem that isn't really suitable to SQL. There is a brute force approach:
with jan as (
select *
from t
where date = '2018-01-01'
order by score desc
limit 1
),
feb as (
select *
from t
where date = '2018-02-01' and
product not in (select product from jan)
order by score desc
limit 1
),
mar as (
select *
from t
where date = '2018-03-01' and
product not in (select product from jan) and
product not in (select product from feb)
order by score desc
limit 1
)
select *
from jan
union all
select *
from feb
union all
select *
from mar;
You can generalize this with additional CTEs. But there is no guarantee that a month will have a product -- even when it could have had one.
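Here is a sketch of the brute-force idea run against the sample table, with ORDER BY score DESC LIMIT 1 in each CTE so that every month contributes its highest-scoring product that has not been used yet. It uses Python's sqlite3 module; the table name t follows the query above, and storing the dates as ISO strings is an assumption. It picks one product per month rather than twenty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (product TEXT, date TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("A", "2018-01-01", 99), ("B", "2018-01-01", 98), ("C", "2018-01-01", 97),
        ("A", "2018-02-01", 99), ("B", "2018-02-01", 98), ("C", "2018-02-01", 97),
        ("D", "2018-03-01", 99), ("A", "2018-03-01", 98),
        ("B", "2018-03-01", 97), ("C", "2018-03-01", 96),
    ],
)

picks = conn.execute("""
    WITH jan AS (
        SELECT * FROM t
        WHERE date = '2018-01-01'
        ORDER BY score DESC LIMIT 1
    ),
    feb AS (
        SELECT * FROM t
        WHERE date = '2018-02-01'
          AND product NOT IN (SELECT product FROM jan)
        ORDER BY score DESC LIMIT 1
    ),
    mar AS (
        SELECT * FROM t
        WHERE date = '2018-03-01'
          AND product NOT IN (SELECT product FROM jan)
          AND product NOT IN (SELECT product FROM feb)
        ORDER BY score DESC LIMIT 1
    )
    SELECT * FROM jan
    UNION ALL SELECT * FROM feb
    UNION ALL SELECT * FROM mar
    ORDER BY date
""").fetchall()

for row in picks:
    print(row)
```

Each later CTE depends on all earlier ones, which is why the approach needs one CTE per month and does not scale gracefully.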
A per-month top 20 is possible with row_number(), although on its own this does not prevent a product from repeating across months:
select * from (
select row_number() over(partition by Date order by Score desc) as rno, * from
Products
) as t where t.rno<=20
I think you want the top 20 records for every month without repeating products; if so, the solution below should work.
select *
into #temp
from
(values
('A','01/01/18','99')
,('B','01/01/18','98')
,('C','01/01/18','97')
,('A','02/01/18','99')
,('B','02/01/18','98')
,('C','02/01/18','97')
,('D','03/01/18','99')
,('A','03/01/18','98')
,('B','03/01/18','97')
,('C','03/01/18','96')
) AS VTE (Product ,Date, Score )
select * from
(
select * , ROW_NUMBER() over (partition by date,product order by score ) as rn
from #TEMP
)
A where rn <= 20

How to return all rows with MAX value meeting a condition of another field in SQL?

I have the following costs table:
+--------+------+-----------+
| Year | ID | Amount |
+--------+------+-----------+
| 1960 | 1 | 100 |
| 1960 | 2 | 200 |
| 1960 | 3 | 200 |
| 1960 | 4 | 150 |
| 1961 | 1 | 300 |
| 1961 | 2 | 200 |
| 1961 | 3 | 100 |
| 1961 | 4 | 300 |
+--------+------+-----------+
I want all ID’s having the MAX Amount by Year. For example, for 1960, I want rows with ID's 2 and 3. For 1961, I want rows with ID's 1 and 4.
SELECT Year, ID, Amount FROM costs WHERE Amount = (SELECT MAX(Amount) FROM costs);
The above gets me the rows with the MAX value across all Years. But I want a condition that only gets me the max Amount values per year. How do I add a condition to only select records with Year = 1960?
Please try the query below. It has been tested and works.
SELECT
t1.*
FROM
costs t1
WHERE
t1.amount = (
SELECT
MAX(t2.amount)
FROM
costs t2
WHERE
t2.`year` = t1.`year`
);
Try this; it should work:
SELECT
*
FROM
costs
WHERE
(YEAR, amount) IN (
SELECT
YEAR,
max(amount)
FROM
costs
GROUP BY
YEAR
);
One option which should run on all major databases is to use a subquery which finds the max amounts for each year to select the records you want:
SELECT c1.*
FROM costs c1
INNER JOIN
(
SELECT Year, MAX(Amount) AS MaxAmount
FROM costs
GROUP BY Year
) c2
ON c1.Year = c2.Year AND
c1.Amount = c2.MaxAmount
Another way to do this would be to use a correlated subquery:
SELECT c1.*
FROM costs c1
WHERE c1.Amount = (SELECT MAX(c2.Amount) FROM costs c2 WHERE c2.Year = c1.Year)
I expect that joining (the first option) would be the fastest method for larger tables, especially if suitable indexes are available.
SELECT T1.Year, T1.ID, T1.Amount
FROM #Table T1
JOIN
(
SELECT MAX(Amount) Amount,Year
FROM #Table
GROUP BY Year
) A ON A.Year = T1.Year AND A.Amount = T1.Amount
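The join against the per-year maximum can be checked on the costs table from the question. This is a minimal sketch using Python's sqlite3 module as an assumed test bed; the question does not say which database is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE costs (year INTEGER, id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO costs VALUES (?, ?, ?)",
    [
        (1960, 1, 100), (1960, 2, 200), (1960, 3, 200), (1960, 4, 150),
        (1961, 1, 300), (1961, 2, 200), (1961, 3, 100), (1961, 4, 300),
    ],
)

# Join each row to its own year's maximum; tied rows are all returned.
max_rows = conn.execute("""
    SELECT c1.year, c1.id, c1.amount
    FROM costs c1
    INNER JOIN (
        SELECT year, MAX(amount) AS max_amount
        FROM costs
        GROUP BY year
    ) c2 ON c1.year = c2.year AND c1.amount = c2.max_amount
    ORDER BY c1.year, c1.id
""").fetchall()

for row in max_rows:
    print(row)
```

This returns IDs 2 and 3 for 1960 and IDs 1 and 4 for 1961, as requested.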

Select row that has max total value SQL Server

I have the following scheme (2 tables):
Customer (Id, Name) and
Sale (Id, CustomerId, Date, Sum)
How to select the following data ?
1) Best customer of all time (Customer, which has Max Total value in the Sum column)
For example, I have 2 tables (Customers and Sales respectively):
id CustomerName
---|--------------
1 | First
2 | Second
3 | Third
id CustomerId datetime Sum
---|----------|------------|-----
1 | 1 | 04/06/2013 | 50
2 | 2 | 04/06/2013 | 60
3 | 3 | 04/07/2013 | 30
4 | 1 | 03/07/2013 | 50
5 | 1 | 03/08/2013 | 50
6 | 2 | 03/08/2013 | 30
7 | 3 | 24/09/2013 | 20
Desired result:
CustomerName TotalSum
------------|--------
First | 150
2) Best customer of each month in the current year (the same as previous but for each month in the current year)
Thanks.
Try this for the best customer of all time:
SELECT TOP 1 WITH TIES c.CustomerName, SUM(s.SUM) AS TotalSum
FROM Customer c JOIN Sales s ON s.CustomerId = c.Id
GROUP BY c.Id, c.CustomerName
ORDER BY SUM(s.SUM) DESC
One option is to use RANK() combined with the SUM aggregate. This will get you the overall values.
select customername, sumtotal
from (
select c.customername,
sum(s.sum) sumtotal,
rank() over (order by sum(s.sum) desc) rnk
from customer c
join sales s on c.id = s.customerid
group by c.id, c.customername
) t
where rnk = 1
Grouping this by month and year should be trivial at that point.
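For the per-month variant, one possible sketch is to total the sales per customer and month first and then apply RANK() within each month. It runs in SQLite through Python's sqlite3 module rather than SQL Server, so strftime stands in for SQL Server's date functions. The column names sale_date and amount, the ISO date strings (the sample dates read as day/month/year) and the explicit year parameter are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE sales (id INTEGER, customer_id INTEGER, sale_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "First"), (2, "Second"), (3, "Third")],
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2013-06-04", 50), (2, 2, "2013-06-04", 60),
        (3, 3, "2013-07-04", 30), (4, 1, "2013-07-03", 50),
        (5, 1, "2013-08-03", 50), (6, 2, "2013-08-03", 30),
        (7, 3, "2013-09-24", 20),
    ],
)

year = "2013"  # assumed "current year" for the sample data

best_per_month = conn.execute("""
    WITH monthly AS (
        -- total per customer and month within the chosen year
        SELECT strftime('%Y-%m', s.sale_date) AS sale_month,
               c.name AS customer_name,
               SUM(s.amount) AS total
        FROM customer c
        JOIN sales s ON s.customer_id = c.id
        WHERE strftime('%Y', s.sale_date) = ?
        GROUP BY sale_month, c.id, c.name
    ),
    ranked AS (
        SELECT sale_month, customer_name, total,
               RANK() OVER (PARTITION BY sale_month ORDER BY total DESC) AS rnk
        FROM monthly
    )
    SELECT sale_month, customer_name, total
    FROM ranked
    WHERE rnk = 1
    ORDER BY sale_month
""", (year,)).fetchall()

for row in best_per_month:
    print(row)
```

Because RANK() is used, customers tied for the highest monthly total would all be returned for that month.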