Find the highest number of occurrences in a column in SQL - sql

Given this table:
Order
custName  description  to_char(price)
A         desa         $14
B         desb         $14
C         desc         $21
D         desd         $65
E         dese         $21
F         desf         $78
G         desg         $14
H         desh         $21
I am trying to display the whole row for the prices that occur most often, in this case $14 and $21.
I believe this needs a subquery, so I started out with this:
select max(count(price))
from orders
group by price
which gives me 3.
After some time I realized that was not helpful. I believe I need the values 14 and 21 rather than the count, so that I can put them in the WHERE clause, but I'm stuck on how to get them. Any help?
UPDATE: I got it to return 14 and 21 with this:
select price
from orders
group by price
having (count(price)) in
(select max(count(price))
from orders
group by price)
But I also need it to display the custname and description columns, and this gives me an error:
select custname, description, price
from orders
group by price
having (count(price)) in
(select max(count(price))
from orders
group by price)
SQL Error: ORA-00979: not a GROUP BY expression
Any help on this?

I guess you are pretty close. Since HAVING operates on the GROUPed result set, try
HAVING COUNT(price) IN
or
HAVING COUNT(price) =
replacing your current line.
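The ORA-00979 error comes from selecting custname and description in a query that groups only by price. A minimal sketch of one way around it, assuming the grouped query is kept as a subquery and the outer query filters the detail rows with WHERE price IN (...). It runs against an in-memory SQLite database from Python purely for illustration (the question is about Oracle); Oracle accepts the nested MAX(COUNT(...)), while this sketch takes the maximum over a derived table, which is more portable.

```python
import sqlite3

# Illustration only: SQLite stands in for Oracle, data copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (custname TEXT, description TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("A", "desa", 14), ("B", "desb", 14), ("C", "desc", 21), ("D", "desd", 65),
     ("E", "dese", 21), ("F", "desf", 78), ("G", "desg", 14), ("H", "desh", 21)],
)

# The grouped subquery yields only the most frequent prices.
# The outer query is not grouped, so it can select every column of the row.
most_common_rows = conn.execute("""
    SELECT custname, description, price
    FROM orders
    WHERE price IN (
        SELECT price
        FROM orders
        GROUP BY price
        HAVING COUNT(*) = (
            SELECT MAX(cnt)
            FROM (SELECT COUNT(*) AS cnt FROM orders GROUP BY price) AS counts
        )
    )
    ORDER BY custname
""").fetchall()

for row in most_common_rows:
    print(row)
```

With the sample data this returns the six rows priced at 14 or 21 and leaves out D and F.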

Since you tagged the question as oracle, you can use windowing functions to get aggregate and detail data within the same query.
SELECT COUNT (price) OVER (PARTITION BY price) count_at_this_price,
o.*
from orders o
order by 1 desc
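The query above shows the count next to every row but does not filter anything yet. A possible way to finish it, sketched here under the assumption that a second window function (MAX(...) OVER ()) is acceptable: compute the per-price count, compute the largest count across all rows, and keep the rows where the two match. The sketch uses SQLite (3.25 or later for window functions) from Python for illustration only; the aliases cnt and max_cnt are made up for this example.

```python
import sqlite3

# Illustration only: SQLite stands in for Oracle, data copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (custname TEXT, description TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("A", "desa", 14), ("B", "desb", 14), ("C", "desc", 21), ("D", "desd", 65),
     ("E", "dese", 21), ("F", "desf", 78), ("G", "desg", 14), ("H", "desh", 21)],
)

windowed_rows = conn.execute("""
    SELECT custname, description, price
    FROM (
        -- max_cnt: the largest per-price count, repeated on every row
        SELECT c.*, MAX(cnt) OVER () AS max_cnt
        FROM (
            -- cnt: how many rows share this row's price
            SELECT o.*, COUNT(*) OVER (PARTITION BY price) AS cnt
            FROM orders o
        ) AS c
    ) AS m
    WHERE cnt = max_cnt
    ORDER BY custname
""").fetchall()

for row in windowed_rows:
    print(row)
```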

select employee, count(employee)
from work
group by employee
having count(employee) =
( select max(cnt) from
( select employee, count(employee) cnt
from work
group by employee
)
);

You could try something like
select * from orders where price in (select top 2 price from orders group by price order by count(*) desc)
TOP is SQL Server syntax. Oracle does not support it; there you would limit rows with ROWNUM or, from 12c onward, FETCH FIRST n ROWS ONLY.

Related

SQL: Take 1 value per grouping

I have a very simplified table / view like below to illustrate the issue:
The stock column represents the current stock quantity of the style at the retailer. The reason the stock column is included is to avoid joins for reporting. (the table is created for reporting only)
I want to query the table to get what is currently in stock, grouped by stylenumber (across retailers). Like:
select stylenumber,sum(sold) as sold,Max(stock) as stockcount
from MGTest
I expect to get Stylenumber, Total Sold, Most Recent Stock Total:
A, 6, 15
B, 1, 6
But using Max(stock) I get 10, and with Sum(stock) I get 25.
I have also tried over(partition ...), without any luck.
How do I solve this?
I would answer this using window functions:
SELECT Stylenumber, Date, TotalStock
FROM (SELECT M.Stylenumber, M.Date, SUM(M.Stock) as TotalStock,
ROW_NUMBER() OVER (PARTITION BY M.Stylenumber ORDER BY M.Date DESC) as seqnum
FROM MGTest M
GROUP BY M.Stylenumber, M.Date
) m
WHERE seqnum = 1;
The query is a bit tricky since you want a cumulative total of the Sold column, but only the total of the Stock column for the most recent date. I didn't actually try running this, but something like the query below should work. However, because of the shape of your schema this isn't the most performant query in the world since it is scanning your table multiple times to join all of the data together:
SELECT MDate.Stylenumber, MDate.TotalSold, MStock.TotalStock
FROM (SELECT M.Stylenumber, MAX(M.Date) MostRecentDate, SUM(M.Sold) TotalSold
FROM [MGTest] M
GROUP BY M.Stylenumber) MDate
INNER JOIN (SELECT M.Stylenumber, M.Date, SUM(M.Stock) TotalStock
FROM [MGTest] M
GROUP BY M.Stylenumber, M.Date) MStock ON MDate.Stylenumber = MStock.Stylenumber AND MDate.MostRecentDate = MStock.Date
You can do something like this
SELECT B.Stylenumber,SUM(B.Sold),SUM(B.Stock) FROM
(SELECT Stylenumber AS 'Stylenumber',SUM(Sold) AS 'Sold',MAX(Stock) AS 'Stock'
FROM MGTest A
GROUP BY RetailerId,Stylenumber) B
GROUP BY B.Stylenumber
if you don't want to use joins
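To make the two-level grouping concrete, here is a small sketch run against SQLite from Python. The question's sample table did not survive, so the rows below are invented to reproduce the numbers it mentions (MAX gives 10, SUM gives 25, the wanted stock for A is 15). It assumes the stock value is repeated on every row of a retailer/style pair, so MAX(Stock) per retailer recovers it, and summing those values per style gives the total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE MGTest (RetailerId INTEGER, Stylenumber TEXT, "Date" TEXT, '
    "Sold INTEGER, Stock INTEGER)"
)
# Hypothetical rows, chosen only to match the totals quoted in the question.
conn.executemany(
    "INSERT INTO MGTest VALUES (?, ?, ?, ?, ?)",
    [(1, "A", "2016-01-01", 2, 10),
     (1, "A", "2016-01-02", 3, 10),
     (2, "A", "2016-01-01", 1, 5),
     (2, "B", "2016-01-01", 1, 6)],
)

per_style = conn.execute("""
    SELECT Stylenumber, SUM(Sold) AS Sold, SUM(Stock) AS Stock
    FROM (
        -- one row per retailer and style: total sold, stock counted once
        SELECT RetailerId, Stylenumber, SUM(Sold) AS Sold, MAX(Stock) AS Stock
        FROM MGTest
        GROUP BY RetailerId, Stylenumber
    ) AS per_retailer
    GROUP BY Stylenumber
    ORDER BY Stylenumber
""").fetchall()

print(per_style)
```

If the stock of a retailer can change between rows, MAX(Stock) is no longer the current stock, and one of the date-based queries is the safer choice.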
My solution, like that of Gordon Linoff, will use the window functions. But in my case, everything will turn around the RANK window function.
SELECT stylenumber, sold, SUM(stock) totalstock
FROM (
SELECT
stylenumber,
SUM(sold) OVER(PARTITION BY stylenumber) sold,
RANK() OVER(PARTITION BY stylenumber ORDER BY [Date] DESC) r,
stock
FROM MGTest
) T
WHERE r = 1
GROUP BY stylenumber, sold

Computed Column in Select Query?

I want to create a %Share column whose values are derived by dividing each customer's sale by the total sale. I'm using the query below, but I get an error that the column 'Sale' cannot be found. Is there a way to get the total of the Sale column, i.e. 600? Please help.
Select IsNull([Customer].[FirstName],'Total') as Customer,
format(Sum([MY_DB].[dbo].[Order].[TotalAmount]),'0.00') [Sale],
FORMAT(sum([MY_DB].[dbo].[Order].[TotalAmount])/ sum([Sale]),'0.00%') as 'Share%'
From Customer
INNER JOIN [MY_DB].[dbo].[Order]
ON [Customer].[Id]=[MY_DB].[dbo].[Order].[CustomerId]
Group By [Customer].[FirstName] with Rollup
Having (Sum([MY_DB].[dbo].[Order].[TotalAmount]) > (Select AVG([MY_DB].[dbo].[Order].[TotalAmount]) From [MY_DB].[dbo].[Order]))
Order By [Customer].[FirstName] Desc;
Desired Result:
Customer  Sale  %Share
Zbyszek   100   16.66 %
Yvonne    200   33.33 %
Yoshi     300   50.00 %
You have to modify the existing query quite a bit. I don't think you need with rollup.
For getting the total sales use sum() over(). Then divide each sale amount by the total to get the percentage. In the same way, using avg() over() you can compute the average and find customers with sales >= avgamount.
Select Customer,[Sale],[Share%]
from (
Select Distinct
c.[FirstName] as Customer,
format(sum(o.[TotalAmount]) over(partition by o.[CustomerId]),'0.00') as [Sale],
format(1.0*sum(o.[TotalAmount]) over(partition by o.[CustomerId]) / sum(o.[TotalAmount]) over(),'0.00%') as 'Share%', -- the % in the format string already multiplies by 100
AVG(o.[TotalAmount]) over() as 'AvgAmount'
From Customer c
INNER JOIN [MY_DB].[dbo].[Order] o ON c.[Id]=o.[CustomerId]
) t
Where Sale >= AvgAmount
Order By Customer Desc;
Without computing the average you can just check for customers with share >= 50%.
Select Customer,[Sale],format([Share],'0.00%') as [Share%] -- the % in the format string multiplies by 100
from (
Select Distinct
c.[FirstName] as Customer,
format(sum(o.[TotalAmount]) over(partition by o.[CustomerId]),'0.00') as [Sale],
1.0*sum(o.[TotalAmount]) over(partition by o.[CustomerId]) / sum(o.[TotalAmount]) over() as [Share] -- kept numeric so it can be filtered on
From Customer c
INNER JOIN [MY_DB].[dbo].[Order] o ON c.[Id]=o.[CustomerId]
) t
Where [Share] >= 0.5
Order By Customer Desc;
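The core of the SUM() OVER () idea can be tried on a toy data set. This is a sketch under assumptions: it uses SQLite from Python instead of SQL Server, so FORMAT is replaced by ROUND, and the order rows are made up so that the per-customer totals are 100, 200 and 300 as in the desired result. The names per_customer and SharePct are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Id INTEGER PRIMARY KEY, FirstName TEXT);
    CREATE TABLE "Order" (Id INTEGER PRIMARY KEY, CustomerId INTEGER, TotalAmount REAL);
    INSERT INTO Customer VALUES (1, 'Zbyszek'), (2, 'Yvonne'), (3, 'Yoshi');
    -- hypothetical orders; totals per customer are 100, 200 and 300
    INSERT INTO "Order" (CustomerId, TotalAmount)
    VALUES (1, 100), (2, 150), (2, 50), (3, 300);
""")

shares = conn.execute("""
    SELECT Customer,
           Sale,
           -- SUM(Sale) OVER () is the grand total, repeated on every row
           ROUND(100.0 * Sale / SUM(Sale) OVER (), 2) AS SharePct
    FROM (
        SELECT c.FirstName AS Customer, SUM(o.TotalAmount) AS Sale
        FROM Customer c
        INNER JOIN "Order" o ON c.Id = o.CustomerId
        GROUP BY c.FirstName
    ) AS per_customer
    ORDER BY Customer DESC
""").fetchall()

for row in shares:
    print(row)
```

Grouping first and applying the window function to the grouped rows avoids the need for DISTINCT. Note that ROUND gives 16.67 for the first customer, whereas the desired result in the question shows the truncated 16.66.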
You can't define [Sale] column and use it in a function in the same SQL.
Easiest way to achieve this is just to surround your SQL with an outer SQL statement, where you'll also calculate the total sum and make the division.
(My syntax might not be MS-SQL exact, but this is to illustrate the idea)
Select [data].[Customer],
format([data].[Sale],'0.00'),
format([data].[Sale] / [total].[totalSum],'0.00') FROM
(Select IsNull([Customer].[FirstName],'Total') as Customer,
Sum([MY_DB].[dbo].[Order].[TotalAmount]) as [Sale]
From Customer
INNER JOIN [MY_DB].[dbo].[Order]
ON [Customer].[Id]=[MY_DB].[dbo].[Order].[CustomerId]
Group By [Customer].[FirstName] with Rollup
Having (Sum([MY_DB].[dbo].[Order].[TotalAmount]) > (Select AVG([MY_DB].[dbo].[Order].[TotalAmount]) From [MY_DB].[dbo].[Order])) ) data,
(Select sum([MY_DB].[dbo].[Order].[TotalAmount]) as [totalSum] from [MY_DB].[dbo].[Order]) total
Order By [data].[Customer] Desc;

How do I proceed on this query

I want to know if there's a way to display more than one column on an aggregate result but without it affecting the group by.
I need to display the name alongside an aggregate result, but I have no idea what I am missing here.
This is the data I'm working with:
It is the result of the following query:
select * from Salesman, Sale,Buyer
where Salesman.ID = Buyer.Salesman_ID and Buyer.ID = sale.Buyer_ID
I need to find the salesman that sold the most stuff (total price) for a specific year.
This is what I have so far:
select DATEPART(year,sale.sale_date)'year', Salesman.First_Name,sum(sale.price)
from Salesman, Sale,Buyer
where Salesman.ID = Buyer.Salesman_ID and Buyer.ID = sale.Buyer_ID
group by DATEPART(year,sale.sale_date),Salesman.First_Name
This returns me the total sales made by each salesman.
How do I continue from here to get the top salesman of each year?
Maybe the query I am doing is completely wrong and there is a better way?
Any advice would be helpful.
Thanks.
This should work for you:
select *
from(
select DATEPART(year,s.sale_date) as SalesYear -- Avoid reserved words for object names
,sm.First_Name
,sum(s.price) as TotalSales
,row_number() over (partition by DATEPART(year,s.sale_date) -- Rank the data within the same year as this data row.
order by sum(s.price) desc -- Order by the sum total of sales price, with the largest first (Descending). This means that rank 1 is the highest amount.
) as SalesRank -- Orders your salesmen by the total sales within each year, with 1 as the best.
from Buyer b
inner join Sale s
on(b.ID = s.Buyer_ID)
inner join Salesman sm
on(sm.ID = b.Salesman_ID)
group by DATEPART(year,s.sale_date)
,sm.First_Name
) a
where SalesRank = 1 -- This means you only get the top salesman for each year.
First, never use commas in the FROM clause. Always use explicit JOIN syntax.
The answer to your question is to use window functions. If there is a tie and you want all values, then use RANK() or DENSE_RANK(). If you always want exactly one -- even if there are ties -- then use ROW_NUMBER().
select ss.*
from (select year(s.sale_date) as yyyy, sm.First_Name, sum(s.price) as total_price,
row_number() over (partition by year(s.sale_date)
order by sum(s.price) desc
) as seqnum
from Salesman sm join
Sale s
on sm.ID = s.Salesman_ID
group by year(s.sale_date), sm.First_Name
) ss
where seqnum = 1;
Note that the Buyers table is unnecessary for this query.
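For anyone who wants to try the ranking approach without a SQL Server instance, here is a rough sketch using SQLite from Python. The table layout follows the joins in the question (Sale -> Buyer -> Salesman); the rows are invented, strftime('%Y', ...) stands in for DATEPART/YEAR, and the yearly totals are computed in a CTE before ranking, which is a variation on the single-query form in the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Salesman (ID INTEGER PRIMARY KEY, First_Name TEXT);
    CREATE TABLE Buyer (ID INTEGER PRIMARY KEY, Salesman_ID INTEGER);
    CREATE TABLE Sale (ID INTEGER PRIMARY KEY, Buyer_ID INTEGER,
                       price INTEGER, sale_date TEXT);
    -- hypothetical sample data
    INSERT INTO Salesman VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Buyer VALUES (1, 1), (2, 2), (3, 1);
    INSERT INTO Sale (Buyer_ID, price, sale_date) VALUES
        (1, 100, '2015-03-01'), (2, 250, '2015-06-01'), (3, 100, '2015-09-01'),
        (1, 400, '2016-02-01'), (2, 150, '2016-05-01');
""")

top_per_year = conn.execute("""
    WITH yearly AS (
        -- total sales per salesman and year
        SELECT strftime('%Y', s.sale_date) AS SalesYear,
               sm.First_Name,
               SUM(s.price) AS TotalSales
        FROM Buyer b
        INNER JOIN Sale s ON b.ID = s.Buyer_ID
        INNER JOIN Salesman sm ON sm.ID = b.Salesman_ID
        GROUP BY strftime('%Y', s.sale_date), sm.First_Name
    ),
    ranked AS (
        -- RANK keeps ties; swap in ROW_NUMBER for exactly one row per year
        SELECT yearly.*,
               RANK() OVER (PARTITION BY SalesYear ORDER BY TotalSales DESC) AS SalesRank
        FROM yearly
    )
    SELECT SalesYear, First_Name, TotalSales
    FROM ranked
    WHERE SalesRank = 1
    ORDER BY SalesYear
""").fetchall()

print(top_per_year)
```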

Is it possible to create and use window function in the same query?

I'm using PostgreSQL and I have the following situation:
table of Sales (short version):
itemid quantity
5 10
5 12
6 1
table of stock (short version):
itemid stock
5 30
6 1
I have a complex query that also needs to present, in one of its columns, the SUM of quantity for each itemid.
So it's going to be:
Select other things,itemid,stock, SUM (quantity) OVER (PARTITION BY itemid) AS total_sales
from .....
sales
stock
This query is OK. However, it will return:
itemid stock total_sales
5 30 22
6 1 1
But I don't need to see itemid=6, because the whole stock was sold. That means I need a WHERE condition like:
WHERE total_sales<stock
but I can't do that, because total_sales is computed after the WHERE clause has been applied.
Is there a way to solve this without surrounding the whole query with another one? I'm trying to avoid it if I can.
You can use a subquery or CTE:
select s.*
from (Select other things,itemid,stock,
SUM(quantity) OVER (PARTITION BY itemid) AS total_sales
from .....
) s
where total_sales < stock;
You cannot use column aliases defined in a SELECT in the SELECT, WHERE, or FROM clauses for that SELECT. However, a subquery or CTE gets around this restriction.
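A runnable sketch of the CTE form, using the sales and stock rows from the question. It is executed against SQLite from Python here only so it is easy to try; the WITH syntax is the same in PostgreSQL. The join condition and the DISTINCT (to collapse the one-row-per-sale output) are assumptions, since the question elides the rest of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (itemid INTEGER, quantity INTEGER);
    CREATE TABLE stock (itemid INTEGER, stock INTEGER);
    INSERT INTO sales VALUES (5, 10), (5, 12), (6, 1);
    INSERT INTO stock VALUES (5, 30), (6, 1);
""")

not_sold_out = conn.execute("""
    WITH totals AS (
        -- the window function is computed here, inside the CTE
        SELECT DISTINCT st.itemid,
               st.stock,
               SUM(sa.quantity) OVER (PARTITION BY sa.itemid) AS total_sales
        FROM stock st
        INNER JOIN sales sa ON sa.itemid = st.itemid
    )
    -- by this point total_sales is an ordinary column, so WHERE can use it
    SELECT itemid, stock, total_sales
    FROM totals
    WHERE total_sales < stock
""").fetchall()

print(not_sold_out)
```

Item 6 is dropped because its total sales (1) are not less than its stock (1).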
You can also use an inner select in your WHERE statement like this:
SELECT *, SUM (quantity) OVER (PARTITION BY itemid) AS total_sales
FROM t
WHERE quantity <> (SELECT SUM(quantity) FROM t ti WHERE t.itemid = ti.itemid);

Product sales by month - SQL

I just created a small data warehouse with the following details.
Fact Table
Sales
Dimensions
Supplier
Products
Time (Range is one year)
Stores
I want to find which product has the highest sales in each month. The output should look like this:
Month - Product Code - Num_Of_Items
JAN xxxx xxxxx
FEB xxxx xxxxx
I tried the following query
with product_sales as(
SELECT dd.month,
fs.p_id,
dp.title,
SUM(number_of_items) Num
FROM fact_sales fs
INNER JOIN dim_products dp
ON fs.p_id = dp.p_id
INNER JOIN dim_date dd
ON dd.date_id = fs.date_id
GROUP BY dd.month,
fs.p_id,
dp.title
)
select distinct month,movie_id,max(num)
from product_sales
group by movie_id,title, month;
Instead of at most 12 rows, I get 132 records. I need guidance with this. Thanks.
There are a few things about your query that don't make sense, such as:
Where does movie_id come from?
What is from abc? Should it be from product_sales?
That said, if you need the maximum product sales by month and you need to include the product code (or movie ID or whatever), you need an analytical query. Yours would go something like this:
WITH product_sales AS (
SELECT
dd.month,
fs.p_id,
dp.title,
SUM(number_of_items) Num,
RANK() OVER (PARTITION BY dd.month ORDER BY SUM(number_of_items) DESC) NumRank
FROM fact_sales fs
INNER JOIN dim_products dp ON fs.p_id = dp.p_id
INNER JOIN dim_date dd ON dd.date_id = fs.date_id
GROUP BY dd.month, fs.p_id, dp.title
)
SELECT month, p_id, title, num
FROM product_sales
WHERE NumRank = 1
Note that if there's a tie for top sales in any month, this query will show all top sales for the month. In other words, if product codes AAAA and BBBB are tied for top sales in January, the query results will have a January row for both products.
If you want just one row per month even if there's a tie, use ROW_NUMBER instead of RANK(), but note that ROW_NUMBER will arbitrarily pick a winner unless you define a tie-breaker. For example, to have the lowest p_id be the tie-breaker, define the NumRank column like this:
ROW_NUMBER() OVER (
PARTITION BY dd.month
ORDER BY SUM(number_of_items) DESC, p_id
) NumRank
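The difference between the two functions on a tie can be seen in a small sketch. It assumes a hypothetical, already aggregated table monthly_sales(month, p_id, num) in place of the product_sales CTE, with AAAA and BBBB tied in January, and it runs on SQLite from Python for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_sales (month TEXT, p_id TEXT, num INTEGER)")
# Hypothetical totals: AAAA and BBBB tie for the top spot in JAN.
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?, ?)",
    [("JAN", "AAAA", 10), ("JAN", "BBBB", 10), ("JAN", "CCCC", 4),
     ("FEB", "AAAA", 3), ("FEB", "BBBB", 7)],
)

ranked = conn.execute("""
    SELECT month, p_id, num,
           RANK() OVER (PARTITION BY month ORDER BY num DESC) AS rnk,
           -- p_id is the tie-breaker, so the lowest p_id wins a tie
           ROW_NUMBER() OVER (PARTITION BY month ORDER BY num DESC, p_id) AS rn
    FROM monthly_sales
    ORDER BY month, p_id
""").fetchall()

by_rank = [(month, p_id) for month, p_id, _, rnk, _ in ranked if rnk == 1]
by_row_number = [(month, p_id) for month, p_id, _, _, rn in ranked if rn == 1]

print(by_rank)        # every tied product is kept
print(by_row_number)  # exactly one product per month
```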
You can use MAX() KEEP (DENSE_RANK FIRST ORDER BY ...) to select the movie_id with the max value of num:
...
select
month,
MAX(movie_id) KEEP (DENSE_RANK FIRST order by num desc) as movie_id,
MAX(num)
from
abc
group by month
;