I need to find the month in which the largest number of employees were hired.
Here is my Employee table:
I already have a solution for this:
select MM
from (
select *, dense_RANK() OVER(order by cnt desc) as rnk
from (
select month(doj) as MM,count(month(doj)) as CNT
from employee
group by month(doj)
)x
)y
where rnk=1
But I am not satisfied with what I have implemented and would like a simpler, more practical solution.
I think the simplest way is:
select top 1 year(doj), month(doj), count(*)
from employee
group by year(doj), month(doj)
order by count(*) desc;
Notes:
This interprets "month" as being "year/month". If you really do only want the month, then remove year() from both the select and group by.
This returns one row. If you want multiple rows when there are ties, then use select top (1) with ties.
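To make the first note concrete, here is a minimal sketch that runs both interpretations against an in-memory SQLite database from Python. The hire dates are invented for illustration (the original Employee table is not shown), and SQLite's strftime() and LIMIT 1 stand in for SQL Server's year()/month() and TOP 1, so treat it as an approximation of the query above rather than the same statement.

```python
import sqlite3

# Hypothetical sample data; the original Employee table is not shown.
hire_dates = [
    "2019-03-01", "2019-03-15",                # March 2019: 2 hires
    "2020-03-10", "2020-03-20",                # March 2020: 2 hires
    "2020-07-01", "2020-07-02", "2020-07-03",  # July 2020: 3 hires
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, doj TEXT)")
conn.executemany("INSERT INTO employee (doj) VALUES (?)", [(d,) for d in hire_dates])

# "Month" read as year/month. SQLite stand-ins: strftime() for year()/month(),
# LIMIT 1 for TOP 1.
by_year_month = conn.execute("""
    SELECT strftime('%Y', doj) AS yy, strftime('%m', doj) AS mm, COUNT(*) AS cnt
    FROM employee
    GROUP BY strftime('%Y', doj), strftime('%m', doj)
    ORDER BY cnt DESC
    LIMIT 1
""").fetchall()

# "Month" read as calendar month only, regardless of year.
by_month_only = conn.execute("""
    SELECT strftime('%m', doj) AS mm, COUNT(*) AS cnt
    FROM employee
    GROUP BY strftime('%m', doj)
    ORDER BY cnt DESC
    LIMIT 1
""").fetchall()

print(by_year_month)  # [('2020', '07', 3)]
print(by_month_only)  # [('03', 4)]
```

With this data the two readings disagree: July 2020 is the busiest single year/month, while March is the busiest calendar month once the years are pooled.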
I have a sample dataframe below that is over 500k rows:
|year|name|text|id|
|2001|foog|ltgn|01|
|2001|goof|ltg4|02|
|2002|tggr|ltg5|03|
|2002|wwwe|ltg6|04|
|2004|frgr|ltg7|05|
|2004|ggtg|ltg8|06|
|2003|hhyy|lt9n|07|
|2003|jjuu|l2gn|08|
|2005|fotg|l3gn|09|
I want to use SQL to select the most popular name for each year, i.e. the result should be a dataframe containing only the most popular name per year, for every year present in the 500k rows.
I can do this via 2 separate statements:
-- sql query that gives me the most popular name overall
select count(1), name from table_name group by name order by count(1) desc limit 1;
-- If I add a year filter, I can get the result for that particular year
select count(1), name from table_name where year = '2001' group by name order by count(1) desc limit 1;
However, how do I merge these into a single SQL query that returns just the most popular name for each year?
You can use aggregation and window functions:
select yn.*
from (select yn.*,
row_number() over (partition by year order by cnt desc) as seqnum
from (select year, name, count(*) as cnt
from table_name
group by year, name
) yn
) yn
where seqnum = 1;
The innermost subquery calculates the count for each name in each year. The middle subquery numbers the names within each year by count, with the highest count getting 1. The outer query then keeps only the row numbered 1, which is the name with the highest count for that year.
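The three layers can be inspected one at a time. The following is a rough sketch using Python's sqlite3 module with a handful of made-up rows (only the year and name columns from the sample are kept); the variable names INNER, MIDDLE and OUTER are just labels for this illustration, and it assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Made-up rows: only the columns the query needs (year, name).
rows = [
    (2001, "foog"), (2001, "foog"), (2001, "goof"),
    (2002, "tggr"), (2002, "wwwe"), (2002, "wwwe"), (2002, "wwwe"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (year INTEGER, name TEXT)")
conn.executemany("INSERT INTO table_name VALUES (?, ?)", rows)

# Layer 1: count per (year, name).
INNER = "SELECT year, name, COUNT(*) AS cnt FROM table_name GROUP BY year, name"

# Layer 2: number the names within each year, highest count first.
MIDDLE = (
    "SELECT yn.*, ROW_NUMBER() OVER (PARTITION BY year ORDER BY cnt DESC) AS seqnum "
    f"FROM ({INNER}) yn"
)

# Layer 3: keep only the first-numbered name per year.
OUTER = f"SELECT year, name, cnt FROM ({MIDDLE}) ranked WHERE seqnum = 1 ORDER BY year"

counts = conn.execute(INNER + " ORDER BY year, name").fetchall()
numbered = conn.execute(MIDDLE + " ORDER BY year, seqnum").fetchall()
top_names = conn.execute(OUTER).fetchall()

print(counts)     # [(2001, 'foog', 2), (2001, 'goof', 1), (2002, 'tggr', 1), (2002, 'wwwe', 3)]
print(numbered)   # [(2001, 'foog', 2, 1), (2001, 'goof', 1, 2), (2002, 'wwwe', 3, 1), (2002, 'tggr', 1, 2)]
print(top_names)  # [(2001, 'foog', 2), (2002, 'wwwe', 3)]
```

The sample data deliberately has no ties within a year; with ties, ROW_NUMBER() picks one of the tied names arbitrarily unless a tie-breaker is added to the ORDER BY.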
In most databases, you can simplify this to:
select yn.*
from (select year, name, count(*) as cnt,
row_number() over (partition by year order by count(*) desc) as seqnum
from table_name
group by year, name
) yn
where seqnum = 1;
I have a vague recollection that Spark SQL doesn't allow this syntax.
The whole table
USE Northwind
SELECT MAX(TotalOrder)
FROM vwEmployesAndMostSoldCategories
GROUP MAX(TotalOrder)
What I am only able to output
USE Northwind
SELECT
FullName
, MAX(TotalOrder) AS TheMaxSoldUnits
FROM vwEmployesAndMostSoldCategories
GROUP BY FullName
You could use ROW_NUMBER in a CTE here:
WITH cte AS (
SELECT FullName, CategoryName,
SUM(TotalOrder) AS SumTotalOrder,
ROW_NUMBER() OVER (PARTITION BY FullName
ORDER BY SUM(TotalOrder) DESC) rn
FROM vwEmployesAndMostSoldCategories
GROUP BY FullName, CategoryName
)
SELECT FullName, CategoryName, SumTotalOrder AS TotalOrder
FROM cte
WHERE rn = 1;
If a given employee might have two or more categories tied for the same order total, and you want to show all ties, then replace ROW_NUMBER with RANK.
If I follow you correctly, you can do this using with ties:
select top (1) with ties e.*
from vwEmployesAndMostSoldCategories e
order by rank() over(partition by fullname order by totalorder desc)
For each employee, this returns the row with the greatest totalorder (if there are ties for the top spot, all tied rows of that employee are returned). It doesn't look like you need aggregation here.
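The difference between keeping one row per employee and keeping all tied rows can be seen in a small sketch. It uses Python's sqlite3 with a plain table standing in for the vwEmployesAndMostSoldCategories view; the employees, categories and totals are invented, the helper top_categories is hypothetical, and the rank-in-a-CTE form is used because SQLite has no TOP ... WITH TIES.

```python
import sqlite3

# Invented rows; a plain table stands in for the Northwind view.
rows = [
    ("Ann", "Beverages", 10), ("Ann", "Dairy", 10), ("Ann", "Seafood", 4),
    ("Bob", "Dairy", 7), ("Bob", "Produce", 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vwEmployesAndMostSoldCategories "
    "(FullName TEXT, CategoryName TEXT, TotalOrder INTEGER)"
)
conn.executemany("INSERT INTO vwEmployesAndMostSoldCategories VALUES (?, ?, ?)", rows)


def top_categories(rank_fn):
    """Return each employee's top category rows using the given window function."""
    if rank_fn not in ("ROW_NUMBER", "RANK"):
        raise ValueError(f"unsupported window function: {rank_fn}")
    sql = f"""
        WITH cte AS (
            SELECT FullName, CategoryName, TotalOrder,
                   {rank_fn}() OVER (PARTITION BY FullName
                                     ORDER BY TotalOrder DESC) AS rn
            FROM vwEmployesAndMostSoldCategories
        )
        SELECT FullName, CategoryName, TotalOrder
        FROM cte
        WHERE rn = 1
        ORDER BY FullName, CategoryName
    """
    return conn.execute(sql).fetchall()


one_per_employee = top_categories("ROW_NUMBER")  # Ann's tie is broken arbitrarily
with_ties = top_categories("RANK")               # both of Ann's tied rows survive

print(len(one_per_employee))  # 2
print(with_ties)  # [('Ann', 'Beverages', 10), ('Ann', 'Dairy', 10), ('Bob', 'Dairy', 7)]
```

Which of Ann's two tied categories ROW_NUMBER keeps is not defined by the query, so only the row count is shown for that variant.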
I have a table with all the cars that have crossed one road during one week. Now I want to know which 10 cars were observed most often on that road.
My idea is:
1) Group the cars and count the number of times that they have crossed the road:
select nplate, count('x') from observations group by nplate;
I have to do this because I can have the same car observed multiple times in the same week.
2) Order this group by count from highest to lowest.
3) Take the first 10 of those results.
But I don't know how to do the last two steps.
Thank you.
This works for Oracle 12c and above:
SELECT nplate,
COUNT(*)
FROM observations
GROUP BY nplate
ORDER BY COUNT(*) DESC
FETCH FIRST 10 ROWS ONLY;
You can order by count(*) desc. Before Oracle 12c, the way to limit the result to 10 rows is to use a subquery followed by where rownum <= N:
SELECT *
FROM (
SELECT nplate
, count(*)
from observations
group by
nplate
order by
count(*) desc
) sub
WHERE rownum <= 10
Your example uses count('x'), which counts the number of rows where 'x' is not null. Because 'x' is a constant that is never null, this is the same as count(*); it doesn't hurt, but count(*) states the intent more clearly.
I'm surprised no one has given the answer using a window (analytic) function ROW_NUMBER():
SELECT nplate, observation_cnt FROM (
SELECT nplate, COUNT(*) AS observation_cnt
, ROW_NUMBER() OVER ( ORDER BY COUNT(*) DESC ) AS rn
FROM observations
GROUP BY nplate
) WHERE rn <= 10
ORDER BY rn;
If two or more values of nplate are tied for 10th place on number of observations, and it's important that you get all of them, then you'll want to use the window function RANK() instead:
SELECT nplate, observation_cnt FROM (
SELECT nplate, COUNT(*) AS observation_cnt
, RANK() OVER ( ORDER BY COUNT(*) DESC ) AS rn
FROM observations
GROUP BY nplate
) WHERE rn <= 10
ORDER BY rn;
(It's possible you'll want DENSE_RANK() as well!)
In short, window functions give you a flexibility that the ROWNUM and FETCH FIRST solutions do not.
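How the three functions behave at the cutoff is easiest to see on a tiny data set. The sketch below uses Python's sqlite3, invented plate names, and a cutoff of 3 instead of 10 to keep it short; top_plates is a hypothetical helper, and the counts are computed in a CTE first rather than ranking directly over COUNT(*), which is only a structural choice for this illustration.

```python
import sqlite3

# Invented sightings: A and B seen 3 times, C and D twice, E once.
sightings = ["A"] * 3 + ["B"] * 3 + ["C"] * 2 + ["D"] * 2 + ["E"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (nplate TEXT)")
conn.executemany("INSERT INTO observations VALUES (?)", [(p,) for p in sightings])


def top_plates(rank_fn, n):
    """Return plates whose rank (by the given window function) is <= n."""
    if rank_fn not in ("ROW_NUMBER", "RANK", "DENSE_RANK"):
        raise ValueError(f"unsupported window function: {rank_fn}")
    sql = f"""
        WITH counts AS (
            SELECT nplate, COUNT(*) AS observation_cnt
            FROM observations
            GROUP BY nplate
        ),
        ranked AS (
            SELECT nplate,
                   {rank_fn}() OVER (ORDER BY observation_cnt DESC) AS rn
            FROM counts
        )
        SELECT nplate FROM ranked WHERE rn <= ? ORDER BY nplate
    """
    return [row[0] for row in conn.execute(sql, (n,))]


print(len(top_plates("ROW_NUMBER", 3)))  # 3 (exactly n rows; ties cut arbitrarily)
print(top_plates("RANK", 3))             # ['A', 'B', 'C', 'D']
print(top_plates("DENSE_RANK", 3))       # ['A', 'B', 'C', 'D', 'E']
```

ROW_NUMBER() always returns exactly n rows, RANK() also keeps everything tied with the nth row, and DENSE_RANK() keeps every plate belonging to the n highest distinct counts.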
I have a simple assignment but I am stuck. I have a table and need to print out the ID with the maximum total of sales. I have managed to print a sorted list of the IDs based on the sum of each one's sales:
SELECT "COMPANY"."ID", SUM("COMPANY"."PRICE") As PriceSum
FROM "COMPANY"
WHERE "COMPANY"."DATEOFSALE" >= DATE '2016-01-01'
GROUP BY "COMPANY"."ID"
ORDER BY PriceSum DESC;
I just want to show the ID and the total sales of the top selling company.
TIA
This is in Oracle, so I can't be cheap and use LIMIT 1.
You can use a subquery instead:
SELECT c.*
FROM (SELECT "COMPANY"."ID", SUM("COMPANY"."PRICE") As PriceSum
FROM "COMPANY"
WHERE "COMPANY"."DATEOFSALE" >= DATE '2016-01-01'
GROUP BY "COMPANY"."ID"
ORDER BY PriceSum DESC
) c
WHERE rownum = 1;
In Oracle 12c+, you can use FETCH FIRST 1 ROW ONLY without the subquery. This is the ANSI standard equivalent of LIMIT.
EDIT:
If you want all companies with the maximum, use rank() or dense_rank():
SELECT c.*
FROM (SELECT "COMPANY"."ID", SUM("COMPANY"."PRICE") As PriceSum,
RANK() OVER (ORDER BY SUM("COMPANY"."PRICE") DESC) as seqnum
FROM "COMPANY"
WHERE "COMPANY"."DATEOFSALE" >= DATE '2016-01-01'
GROUP BY "COMPANY"."ID"
ORDER BY PriceSum DESC
) c
WHERE seqnum = 1;
You can replace RANK() with ROW_NUMBER() and get the previous result as well.
Suppose we have an accounts table along with the already given values
I want to find the type of account with the second highest number of accounts. In this case, the result should be 'FD'. If there is a tie for the second highest count, I need all of the tied types in the result.
I'm not getting any idea of how to do it. I've found numerous posts for finding second highest values, say salary, in a table. But not for second highest COUNT.
This can be done using CTEs. Get the counts for each type as the first step. Then use dense_rank (to get multiple rows with the same count in case of ties) to rank the types by count. Finally, select the rows ranked second.
with counts as (
select type, count(*) cnt
from yourtable
group by type)
, ranks as (
select type, dense_rank() over(order by cnt desc) rnk
from counts)
select type
from ranks
where rnk = 2;
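A small sketch shows why dense_rank() fits here better than rank(). It runs the same CTE through Python's sqlite3 on two invented sets of account types (the actual table contents are not shown in the question); second_ranked is a hypothetical helper written for this illustration.

```python
import sqlite3


def second_ranked(account_types, rank_fn="DENSE_RANK"):
    """Return the account types whose count is ranked 2 by the given function."""
    if rank_fn not in ("DENSE_RANK", "RANK"):
        raise ValueError(f"unsupported window function: {rank_fn}")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (type TEXT)")
    conn.executemany("INSERT INTO accounts VALUES (?)", [(t,) for t in account_types])
    sql = f"""
        WITH counts AS (
            SELECT type, COUNT(*) AS cnt
            FROM accounts
            GROUP BY type
        ),
        ranks AS (
            SELECT type, {rank_fn}() OVER (ORDER BY cnt DESC) AS rnk
            FROM counts
        )
        SELECT type FROM ranks WHERE rnk = 2 ORDER BY type
    """
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


# Invented data: a tie for second place (FD and RD both have 3 accounts).
tie_for_second = ["SB"] * 4 + ["FD"] * 3 + ["RD"] * 3 + ["CA"]
# Invented data: a tie for first place (SB and CA both have 4 accounts).
tie_for_first = ["SB"] * 4 + ["CA"] * 4 + ["FD"] * 3

print(second_ranked(tie_for_second))         # ['FD', 'RD']
print(second_ranked(tie_for_first))          # ['FD']
print(second_ranked(tie_for_first, "RANK"))  # []
```

When two types tie for first place, rank() assigns 1, 1, 3 and leaves no row with rank 2, whereas dense_rank() assigns 1, 1, 2 and still returns the second highest distinct count.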
One option is to use row_number() (or dense_rank(), depending on what "second" means when there are ties):
select a.*
from (select a.type, count(*) as cnt,
row_number() over (order by count(*) desc) as seqnum
from accounts a
group by a.type
) a
where seqnum = 2;
In Oracle 12c+, you can use offset/fetch:
select a.type, count(*) as cnt
from accounts a
group by a.type
order by count(*) desc
offset 1 row
fetch first 1 row only