Using max function on count - sql

I would like to return a single result: the year (ODATE is a datetime column) with the highest number of orders. I'm trying to apply the MAX function to my COUNT to get the value. Where have I gone wrong?
SELECT TO_CHAR(ODATE, 'YYYY') AS Year
, MAX(COUNT(*))
FROM ORDERS
GROUP BY TO_CHAR(ODATE, 'YYYY')
ORDER BY TO_CHAR(ODATE, 'YYYY');

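As far as I know, Oracle accepts MAX(COUNT(*)) together with a GROUP BY, but the nested aggregate collapses to a single row, so it cannot be selected next to the grouped year expression.
Instead, ORDER BY the COUNT(*) and use ROWNUM: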
SELECT * FROM
(
SELECT TO_CHAR(ODATE, 'YYYY') AS Year, count(*) AS cnt
FROM ORDERS
GROUP BY TO_CHAR(ODATE, 'YYYY')
ORDER BY cnt DESC
)
WHERE ROWNUM = 1
This ensures that you keep only the row with the highest count.
The nested query is needed because Oracle assigns ROWNUM before the ORDER BY is applied.
Note that on Oracle 12c and above you can use the FETCH FIRST n ROWS ONLY clause.
This does the same without a subquery, because the FETCH is applied after the ORDER BY:
SELECT TO_CHAR(ODATE, 'YYYY') AS Year, count(*) AS cnt
FROM ORDERS
GROUP BY TO_CHAR(ODATE, 'YYYY')
ORDER BY cnt DESC
FETCH FIRST 1 ROWS ONLY;
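The sort-then-keep-the-first-row idea can be tried without an Oracle instance. Below is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, strftime('%Y', ...) stands in for TO_CHAR(ODATE, 'YYYY'), and LIMIT 1 stands in for FETCH FIRST 1 ROWS ONLY, so treat it as an illustration of the technique rather than Oracle syntax.

```python
import sqlite3

# In-memory database with a hypothetical ORDERS table (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (odate TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [("2019-03-01",), ("2020-01-15",), ("2020-06-30",),
     ("2020-11-02",), ("2021-02-14",), ("2021-07-04",)],
)

# Count per year, sort by the count descending, keep only the first row.
top_year = conn.execute("""
    SELECT strftime('%Y', odate) AS year, COUNT(*) AS cnt
    FROM orders
    GROUP BY strftime('%Y', odate)
    ORDER BY cnt DESC
    LIMIT 1
""").fetchone()

print(top_year)
```

If two years share the highest count, which one is returned is arbitrary unless a second sort key is added.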

Related

Determine which year-month has the highest and lowest value [duplicate]

Here's my first query, which shows the number of customers added per year-month:
select count(name) AS CUSTOMER,
extract(year from create_date) as yr,
extract(month from create_date) as mon
from x
group by extract(year from create_date),
extract(month from create_date)
order by yr desc, mon desc;
CUSTOMER  YR    MON
3         2019  07
4         2015  02
100       2014  09
3         2014  04
I tried this query:
SELECT MAX(count(*))
FROM x
GROUP BY create_date;
The result is:
MAX(COUNT(*))
100
I need to see the year and month in the result as well.
How can I do this?
The way I understood the question, you'd use the RANK analytic function in a subquery (or a CTE) and fetch the rows whose count is either the minimum or the maximum. Something like this:
with temp as
(select to_char(create_date, 'yyyymm') yyyy_mm,
count(*) cnt,
--
rank() over (order by count(*) asc) rnk_min,
rank() over (order by count(*) desc) rnk_max
from x
group by to_char(create_date, 'yyyymm')
)
select yyyy_mm,
cnt
from temp
where rnk_min = 1
or rnk_max = 1;
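To see what the two ranks pick out, here is a small sketch in Python with the built-in sqlite3 module (window functions need SQLite 3.25 or newer). It rebuilds the counts from the question with invented customer names, uses strftime in place of to_char, and computes the counts in a first CTE before ranking them, which is a slight restructuring of the query above rather than a copy of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (name TEXT, create_date TEXT)")

# Counts per year-month taken from the question; the names are placeholders.
counts = {"2019-07": 3, "2015-02": 4, "2014-09": 100, "2014-04": 3}
for ym, n in counts.items():
    conn.executemany(
        "INSERT INTO x VALUES (?, ?)",
        [(f"customer {i}", f"{ym}-01") for i in range(n)],
    )

rows = conn.execute("""
    WITH per_month AS (
        SELECT strftime('%Y-%m', create_date) AS yyyy_mm, COUNT(*) AS cnt
        FROM x
        GROUP BY strftime('%Y-%m', create_date)
    ),
    ranked AS (
        SELECT yyyy_mm, cnt,
               RANK() OVER (ORDER BY cnt ASC)  AS rnk_min,
               RANK() OVER (ORDER BY cnt DESC) AS rnk_max
        FROM per_month
    )
    SELECT yyyy_mm, cnt
    FROM ranked
    WHERE rnk_min = 1 OR rnk_max = 1
    ORDER BY yyyy_mm
""").fetchall()

for row in rows:
    print(row)
```

Because RANK gives tied rows the same rank, both months with 3 customers come back as the minimum, alongside the single maximum.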
You can use two levels of aggregation and put the results all in one row using keep (which implements a "first" aggregation function):
select max(num_customers) as max_num_customers,
max(yyyymm) keep (dense_rank first order by num_customers desc) as max_yyyymm,
min(num_customers) as min_num_customers,
max(yyyymm) keep (dense_rank first order by num_customers asc) as min_yyyymm
from (select to_char(create_date, 'YYYY-MM') as yyyymm,
count(*) AS num_customers
from x
group by to_char(create_date, 'YYYY-MM')
) ym;
From Oracle 12, you can use FETCH FIRST ROW ONLY to get the row with the highest number of customers (and, in the case of ties, the latest date):
SELECT count(name) AS CUSTOMER,
extract(year from create_date) as yr,
extract(month from create_date) as mon
FROM x
GROUP BY
extract(year from create_date),
extract(month from create_date)
ORDER BY
customer DESC,
yr DESC,
mon DESC
FETCH FIRST ROW ONLY;
If you want to include ties for the highest number of customers then:
SELECT count(name) AS CUSTOMER,
extract(year from create_date) as yr,
extract(month from create_date) as mon
FROM x
GROUP BY
extract(year from create_date),
extract(month from create_date)
ORDER BY
customer DESC
FETCH FIRST ROW WITH TIES;

SQL order with equal group size

I have a table with columns month, name and transaction_id. I would like to count the number of transactions per month and name. However, for each month I want to have the top N names with the highest transaction counts.
The following query groups by month and name. However the LIMIT is applied to the complete result and not per month:
SELECT
month,
name,
COUNT(*) AS transaction_count
FROM my_table
GROUP BY month, name
ORDER BY month, transaction_count DESC
LIMIT N
Does anyone have an idea how I can get the top N results per month?
Use row_number():
SELECT month, name, transaction_count
FROM (SELECT month, name, COUNT(*) AS transaction_count,
ROW_NUMBER() OVER (PARTITION BY month ORDER BY COUNT(*) DESC) as seqnum
FROM my_table
GROUP BY month, name
) mn
WHERE seqnum <= N
ORDER BY month, transaction_count DESC
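The per-group numbering can be checked on a toy table. The sketch below uses Python's built-in sqlite3 module (SQLite 3.25 or newer for window functions) with invented rows and N = 2. It counts in an inner query and numbers the rows in an outer one, and it adds name as a tie-breaker so the output is deterministic; both choices are assumptions of this example, not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (month TEXT, name TEXT, transaction_id INTEGER)")

# Invented data: number of transactions per (month, name).
tx_counts = {
    ("2024-01", "alice"): 3, ("2024-01", "bob"): 2, ("2024-01", "carol"): 1,
    ("2024-02", "bob"): 4, ("2024-02", "carol"): 2, ("2024-02", "alice"): 2,
}
next_id = 1
for (month, name), n in tx_counts.items():
    for _ in range(n):
        conn.execute("INSERT INTO my_table VALUES (?, ?, ?)", (month, name, next_id))
        next_id += 1

top_n = 2  # the N from the question
top_rows = conn.execute("""
    SELECT month, name, transaction_count
    FROM (
        SELECT month, name, transaction_count,
               ROW_NUMBER() OVER (
                   PARTITION BY month
                   ORDER BY transaction_count DESC, name
               ) AS seqnum
        FROM (
            SELECT month, name, COUNT(*) AS transaction_count
            FROM my_table
            GROUP BY month, name
        ) AS counted
    ) AS mn
    WHERE seqnum <= ?
    ORDER BY month, seqnum
""", (top_n,)).fetchall()

for row in top_rows:
    print(row)
```

The numbering restarts at 1 for every month, which is what lets the filter on seqnum act per month instead of on the whole result.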

Same output in two different lateral joins

I'm working on a bit of PostgreSQL to grab the first 10 and last 10 invoices of every month between certain dates. I am having unexpected output in the lateral joins. Firstly the limit is not working, and each of the array_agg aggregates is returning hundreds of rows instead of limiting to 10. Secondly, the aggregates appear to be the same, even though one is ordered ASC and the other DESC.
How can I retrieve only the first 10 and last 10 invoices of each month group?
SELECT first.invoice_month,
array_agg(first.id) first_ten,
array_agg(last.id) last_ten
FROM public.invoice i
JOIN LATERAL (
SELECT id, to_char(invoice_date, 'Mon-yy') AS invoice_month
FROM public.invoice
WHERE id = i.id
ORDER BY invoice_date, id ASC
LIMIT 10
) first ON i.id = first.id
JOIN LATERAL (
SELECT id, to_char(invoice_date, 'Mon-yy') AS invoice_month
FROM public.invoice
WHERE id = i.id
ORDER BY invoice_date, id DESC
LIMIT 10
) last on i.id = last.id
WHERE i.invoice_date BETWEEN date '2017-10-01' AND date '2018-09-30'
GROUP BY first.invoice_month, last.invoice_month;
This can be done with a recursive query that generates the range of months for which we need to find the first and last 10 invoices.
WITH RECURSIVE all_months AS (
SELECT date_trunc('month','2018-01-01'::TIMESTAMP) as c_date, date_trunc('month', '2018-05-11'::TIMESTAMP) as end_date, to_char('2018-01-01'::timestamp, 'YYYY-MM') as current_month
UNION
SELECT c_date + interval '1 month' as c_date,
end_date,
to_char(c_date + INTERVAL '1 month', 'YYYY-MM') as current_month
FROM all_months
WHERE c_date + INTERVAL '1 month' <= end_date
),
invoices_with_month as (
SELECT *, to_char(invoice_date::TIMESTAMP, 'YYYY-MM') invoice_month FROM invoice
)
SELECT current_month, array_agg(first_10.id), 'FIRST 10' as type FROM all_months
JOIN LATERAL (
SELECT * FROM invoices_with_month
WHERE all_months.current_month = invoice_month AND invoice_date >= '2018-01-01' AND invoice_date <= '2018-05-11'
ORDER BY invoice_date ASC limit 10
) first_10 ON TRUE
GROUP BY current_month
UNION
SELECT current_month, array_agg(last_10.id), 'LAST 10' as type FROM all_months
JOIN LATERAL (
SELECT * FROM invoices_with_month
WHERE all_months.current_month = invoice_month AND invoice_date >= '2018-01-01' AND invoice_date <= '2018-05-11'
ORDER BY invoice_date DESC limit 10
) last_10 ON TRUE
GROUP BY current_month;
In the code above, '2018-01-01' and '2018-05-11' are the dates between which we want to find the invoices. Based on those dates, we generate the months (2018-01, 2018-02, 2018-03, 2018-04, 2018-05) for which we need to find the invoices.
We store this data in all_months.
After we get the months, we do a lateral join to attach the invoices for every month. We need 2 lateral joins to get the first and last 10 invoices.
Finally, the result is represented as:
current_month - the month
array_agg - ids of all selected invoices for that month
type - type of the selected invoices ('FIRST 10' or 'LAST 10').
So in the current implementation, you will have 2 rows for each month (if there is at least 1 invoice for that month). You can easily combine them into one row if you need to.
LIMIT is working fine. It's your query that's broken. JOIN is just 100% the wrong tool here; it doesn't even do anything close to what you need. By joining up to 10 rows with up to another 10 rows, you get up to 100 rows back. There's also no reason to self join just to combine filters.
Consider instead window queries. In particular, we have the dense_rank function, which can number every row in the result set according to groups:
SELECT
invoice_month,
time_of_month,
ARRAY_AGG(id) invoice_ids
FROM (
SELECT
id,
invoice_month,
-- Categorize as end or beginning of month
CASE
WHEN month_rank <= 10 THEN 'beginning'
WHEN month_reverse_rank <= 10 THEN 'end'
ELSE 'bug' -- Should never happen. Just a fall back in case of a bug.
END AS time_of_month
FROM (
SELECT
id,
invoice_month,
dense_rank() OVER (PARTITION BY invoice_month ORDER BY invoice_date) month_rank,
dense_rank() OVER (PARTITION BY invoice_month ORDER BY invoice_date DESC) month_reverse_rank
FROM (
SELECT
id,
invoice_date,
to_char(invoice_date, 'Mon-yy') AS invoice_month
FROM public.invoice
WHERE invoice_date BETWEEN date '2017-10-01' AND date '2018-09-30'
) AS fiscal_year_invoices
) ranked_invoices
-- Get first and last 10
WHERE month_rank <= 10 OR month_reverse_rank <= 10
) first_and_last_by_month
GROUP BY
invoice_month,
time_of_month
Don't be intimidated by the length. This query is actually very straightforward; it just needed a few subqueries.
This is what it does logically:
Fetch the rows for the fiscal year in question
Assign a "rank" to each row within its month, counting both from the beginning and from the end
Filter out everything that doesn't rank in the top 10 for its month (counting from either direction)
Add an indicator as to whether it was at the beginning or end of the month. (Note that if there are fewer than 20 rows in a month, it will categorize more of them as "beginning".)
Aggregate the IDs together
This is the tool set designed for the job you're trying to do. If really needed, you can adjust this approach slightly to get them into the same row, but you have to aggregate before joining the results together and then join on the month; you can't join and then aggregate.
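For a quick experiment with the rank-from-both-ends idea, here is a sketch using Python's built-in sqlite3 module (SQLite 3.25 or newer). It is not the PostgreSQL query above: it uses ROW_NUMBER instead of dense_rank so that ties on the date cannot push a group past the limit, it keeps the first and last 2 invoices instead of 10, and it groups the ids in Python rather than with ARRAY_AGG. The table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, invoice_date TEXT)")
conn.executemany(
    "INSERT INTO invoice VALUES (?, ?)",
    [(1, "2018-01-02"), (2, "2018-01-05"), (3, "2018-01-10"),
     (4, "2018-01-20"), (5, "2018-01-28"),
     (6, "2018-02-03"), (7, "2018-02-17")],
)

n = 2  # how many invoices to keep from each end of the month
rows = conn.execute("""
    SELECT invoice_month, id, rn_first, rn_last
    FROM (
        SELECT id,
               strftime('%Y-%m', invoice_date) AS invoice_month,
               ROW_NUMBER() OVER (
                   PARTITION BY strftime('%Y-%m', invoice_date)
                   ORDER BY invoice_date, id
               ) AS rn_first,
               ROW_NUMBER() OVER (
                   PARTITION BY strftime('%Y-%m', invoice_date)
                   ORDER BY invoice_date DESC, id DESC
               ) AS rn_last
        FROM invoice
    ) AS ranked
    WHERE rn_first <= ? OR rn_last <= ?
    ORDER BY invoice_month, rn_first
""", (n, n)).fetchall()

# Collect the ids per month; a row can land in both lists when a month is short.
by_month = {}
for month, invoice_id, rn_first, rn_last in rows:
    entry = by_month.setdefault(month, {"first": [], "last": []})
    if rn_first <= n:
        entry["first"].append(invoice_id)
    if rn_last <= n:
        entry["last"].append(invoice_id)

print(by_month)
```

Unlike the CASE expression in the query above, this version lists an invoice under both "first" and "last" when a month has fewer than 2 * n invoices.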

Find max value for each year

I have a question that is asking:
-List the max sales for each year?
I think I have the starter query but I can't figure out how to get all the years in my answer:
SELECT TO_CHAR(stockdate,'YYYY') AS year, sales
FROM sample_newbooks
WHERE sales = (SELECT MAX(sales) FROM sample_newbooks);
This query gives me the year with the max sales. I need max sales for EACH year. Thanks for your help!
Use group by and max if all you need is year and max sales of the year.
select
to_char(stockdate, 'yyyy') year,
max(sales) sales
from sample_newbooks
group by to_char(stockdate, 'yyyy')
If you need rows with all the columns with max sales for the year, you can use window function row_number:
select
*
from (
select
t.*,
row_number() over (partition by to_char(stockdate, 'yyyy') order by sales desc) rn
from sample_newbooks t
) t where rn = 1;
If you want to get the rows with ties on sales, use rank:
select
*
from (
select
t.*,
rank() over (partition by to_char(stockdate, 'yyyy') order by sales desc) rn
from sample_newbooks t
) t where rn = 1;
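The difference between row_number and rank only shows up when two rows tie for the top spot. A minimal sketch with Python's built-in sqlite3 module (SQLite 3.25 or newer) makes that visible. The table layout with a title column and the sample rows are assumptions made for this example, and strftime stands in for to_char.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_newbooks (title TEXT, stockdate TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sample_newbooks VALUES (?, ?, ?)",
    [("A", "2019-02-01", 10), ("B", "2019-08-15", 10),  # A and B tie in 2019
     ("C", "2019-11-30", 5), ("D", "2020-04-04", 7)],
)

def top_per_year(ranking_function):
    # The function name is checked against a fixed whitelist before it is
    # interpolated into the SQL text; it never comes from user input.
    assert ranking_function in ("ROW_NUMBER", "RANK")
    return conn.execute(f"""
        SELECT year, title, sales
        FROM (
            SELECT strftime('%Y', stockdate) AS year, title, sales,
                   {ranking_function}() OVER (
                       PARTITION BY strftime('%Y', stockdate)
                       ORDER BY sales DESC
                   ) AS rn
            FROM sample_newbooks
        ) AS t
        WHERE rn = 1
        ORDER BY year, title
    """).fetchall()

with_ties = top_per_year("RANK")           # keeps both tied books for 2019
without_ties = top_per_year("ROW_NUMBER")  # keeps exactly one row per year

print(with_ties)
print(without_ties)
```

With row_number, which of the tied rows survives is arbitrary unless the ORDER BY inside the window gets a tie-breaking column.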

Postgresql nested aggregate functions

I want to find the employees who have taken the maximum number of leaves in the current month.
I started with this query:
select MAX(TotalLeaves) as HighestLeaves
FROM (SELECT emp_id, count(adate) as TotalLeaves
from attendance
group by emp_id) AS HIGHEST;
But I am having trouble displaying the employee id and getting the result only for the current month. Please help me out.
If you just want to show the corresponding emp_id in your current query, you can sort the results and take the top row. To get only the current month, filter the data before grouping:
select
emp_id, TotalLeaves
from (
select emp_id, count(adate) as TotalLeaves
from attendance
where adate >= date_trunc('month', current_date)
group by emp_id
) as highest
order by TotalLeaves desc
limit 1;
Actually, you don't need a subquery at all here:
select emp_id, count(adate) as TotalLeaves
from attendance
where adate >= date_trunc('month', current_date)
group by emp_id
order by TotalLeaves desc
limit 1;
SELECT emp_id, count(adate) as TotalLeaves
from attendance
where adate >= date_trunc('month', NOW())
group by emp_id
order by 2 desc limit 1