PostgreSQL nested aggregate functions - sql

I want to find the employees who have taken the maximum number of leaves in the current month.
I started with this query:
select MAX(TotalLeaves) as HighestLeaves
FROM (SELECT emp_id, count(adate) as TotalLeaves
from attendance
group by emp_id) AS HIGHEST;
But I am having trouble displaying the employee id and restricting the result to the current month. Please help me out.

If you just want to show the corresponding emp_id in your current query, you can sort the results and take the top row. You also need to filter the data before grouping so that only the current month is counted:
select
emp_id, TotalLeaves
from (
select emp_id, count(adate) as TotalLeaves
from attendance
where adate >= date_trunc('month', current_date)
group by emp_id
) as highest
order by TotalLeaves desc
limit 1;
Actually, you don't need a subquery at all here:
select emp_id, count(adate) as TotalLeaves
from attendance
where adate >= date_trunc('month', current_date)
group by emp_id
order by TotalLeaves desc
limit 1;

SELECT emp_id, count(adate) as TotalLeaves
from attendance
where adate >= date_trunc('month', NOW())
group by emp_id
order by 2 desc limit 1
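The queries above return a single row even when several employees share the highest count, while the question asks for "employees" in the plural. As a rough sketch of one way to keep ties, the snippet below ranks the per-employee counts with rank() and keeps every row ranked first. It runs against an in-memory SQLite database from Python only so that it is self-contained; the sample rows, the helper name top_leave_takers and the strftime-based month filter (SQLite has no date_trunc) are assumptions, not part of the original answers.

```python
import sqlite3

def top_leave_takers(conn, month):
    """Return (emp_id, total_leaves) for every employee tied for the
    highest leave count in the given month ('YYYY-MM')."""
    sql = """
        with totals as (
            select emp_id, count(adate) as total_leaves
            from attendance
            where strftime('%Y-%m', adate) = ?   -- stand-in for date_trunc
            group by emp_id
        ),
        ranked as (
            select emp_id, total_leaves,
                   rank() over (order by total_leaves desc) as rnk
            from totals
        )
        select emp_id, total_leaves
        from ranked
        where rnk = 1
        order by emp_id
    """
    return conn.execute(sql, (month,)).fetchall()

# Hypothetical sample data: employees 1 and 2 tie in March 2024.
conn = sqlite3.connect(":memory:")
conn.execute("create table attendance (emp_id integer, adate text)")
conn.executemany(
    "insert into attendance values (?, ?)",
    [(1, "2024-03-04"), (1, "2024-03-05"),
     (2, "2024-03-11"), (2, "2024-03-12"),
     (3, "2024-03-20"),
     (3, "2024-02-01"), (3, "2024-02-02"), (3, "2024-02-05")],
)
result = top_leave_takers(conn, "2024-03")
print(result)
```

In PostgreSQL the same shape should work with the date_trunc filter from the answers in place of the strftime comparison.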

Related

How do I write a query to find highest earning day per quarter?

I need to write a SQL query to pull the single highest-earning day for a certain brand in each quarter of 2018. I have the following, but it does not pull a single day per quarter; it pulls the highest earnings for each day.
select distinct quarter, order_event_date, max(gc) as highest_day_gc
from (
select sum(commission) as gc, order_event_date,
extract(quarter from order_event_date) as quarter
from order_table
where advertiser_id ='123'
and event_year='2018'
group by 3,2
)
group by 1,2
order by 2 DESC
You can use window functions to find the highest earning day per quarter by using rank().
select rank() over (partition by quarter order by gc desc) as rank, quarter, order_event_date, gc
from (select sum(gross_commission) gc,
order_event_date,
extract(quarter from order_event_date) quarter
from order_aggregation
where advertiser_id = '123'
and event_year = '2018'
group by order_event_date, quarter) a
You could create the query above as a view (or wrap it in a subquery) and filter it with where rank = 1.
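A minimal sketch of that filter, wrapped in a subquery rather than a view, might look like the following. It uses SQLite from Python so it can be run as-is; the sample rows are invented, the quarter is derived with strftime because SQLite lacks extract(quarter ...), and a date range stands in for the event_year column, so treat those parts as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table order_aggregation "
    "(advertiser_id text, order_event_date text, gross_commission real)"
)
# Hypothetical rows: two orders on 2018-01-10 add up to 12.0.
conn.executemany(
    "insert into order_aggregation values (?, ?, ?)",
    [("123", "2018-01-10", 5.0), ("123", "2018-01-10", 7.0),
     ("123", "2018-02-03", 9.0),
     ("123", "2018-05-01", 4.0), ("123", "2018-06-15", 6.0),
     ("999", "2018-05-02", 50.0)],  # different advertiser, must be ignored
)

sql = """
    select quarter, order_event_date, gc
    from (
        select quarter, order_event_date, gc,
               rank() over (partition by quarter order by gc desc) as rnk
        from (
            -- daily totals; (month + 2) / 3 is integer division giving 1..4
            select (cast(strftime('%m', order_event_date) as integer) + 2) / 3
                       as quarter,
                   order_event_date,
                   sum(gross_commission) as gc
            from order_aggregation
            where advertiser_id = '123'
              and order_event_date between '2018-01-01' and '2018-12-31'
            group by 1, 2
        ) as daily
    ) as ranked
    where rnk = 1          -- keep only the best day of each quarter
    order by quarter
"""
best_days = conn.execute(sql).fetchall()
print(best_days)
```

Because rank() is used, two days that tie for the top spot in a quarter would both be returned; row_number() would pick just one of them.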
You could add a LIMIT clause at the end of the statement. Also, change the last ORDER BY clause to ORDER BY highest_day_gc. Note that LIMIT 1 returns only the single highest-earning day overall, not one day per quarter. Something like:
SELECT DISTINCT quarter
,order_event_date
,max(gc) as highest_day_gc
FROM (SELECT sum(gross_commission) as gc
,order_event_date
,extract(quarter from order_event_date) as quarter
FROM order_aggregation
WHERE advertiser_id ='123'
AND event_year='2018'
GROUP BY 3,2) as subquery
GROUP BY 1,2
ORDER BY 3 DESC
LIMIT 1

Data recurring in previous 90 days

I hope you can support me with a piece of code I'm writing. I'm working with the following query:
SELECT case_id, case_date, people_id FROM table_1;
and I have to find out how many times the same people_id is repeated in the DB (with a different case_id) within the 90 days before the case_date. Any advice on how to address that?
Data sample
Additional info: as a result I'm expecting the list of people_id values with the number of cases each received in the 90 days before their last case_date.
expected result sample:
The way I understood the question, it would be something like this:
select people_id,
case_id,
count(*)
from table_1
where case_date >= trunc(sysdate) - 90
group by people_id,
case_id
You want to filter WHERE the case_date is greater than or equal to 90 days before the start of today and then GROUP BY the people_id and COUNT the number of DISTINCT (different) case_id:
SELECT people_id,
COUNT( DISTINCT case_id ) AS number_of_cases
FROM table_1
WHERE case_date >= TRUNC( SYSDATE ) - INTERVAL '90' DAY
GROUP BY
people_id;
If you only want to count repeated case_id values per people_id then:
SELECT people_id,
COUNT(*) AS number_of_repeated_cases
FROM (
SELECT case_id,
people_id
FROM table_1
WHERE case_date >= TRUNC( SYSDATE ) - INTERVAL '90' DAY
GROUP BY
people_id,
case_id
HAVING COUNT(*) >= 2
)
GROUP BY
people_id;
I think you want window functions:
select t.*,
count(*) over (partition by people_id order by case_date
range between interval '90' day preceding and current row
) as person_count_90_day
from t;
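To see what the rolling 90-day window produces, here is a small sketch run through SQLite from Python. SQLite has no interval type, so the dates are converted with julianday() and the frame uses a plain numeric offset of 90; that conversion, the sample rows and the column alias cases_in_90_days are assumptions made for the demo. It also needs a SQLite build that supports RANGE frames with offsets (3.28 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table table_1 (case_id integer, case_date text, people_id integer)"
)
# Hypothetical data: person 10 has two cases 45 days apart, then one
# 107 days later; person 20 has a single case.
conn.executemany(
    "insert into table_1 values (?, ?, ?)",
    [(1, "2024-01-01", 10),
     (2, "2024-02-15", 10),
     (3, "2024-06-01", 10),
     (4, "2024-03-01", 20)],
)

sql = """
    select case_id,
           people_id,
           count(*) over (
               partition by people_id
               order by julianday(case_date)            -- days as a number
               range between 90 preceding and current row
           ) as cases_in_90_days
    from table_1
    order by people_id, case_date
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Each row carries the number of cases for the same person in the 90 days up to and including that case, so case 2 counts both itself and case 1, while case 3 counts only itself.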

Find rows with similar date values

I want to find customers for whom, for example, the system mistakenly registered duplicates of an order.
It's pretty easy if reg_date is EXACTLY the same, but I have no idea how to write the query so that orders also count as duplicates when, for example, there is up to 1 second difference between the transactions.
select * from
(select customer_id, reg_date, count(*) as cnt
from orders
group by 1,2
) x where cnt > 1
Here is example dataset:
https://www.db-fiddle.com/f/m6PhgReSQbVWVZhqe8n4mi/0
Currently only customer 104's orders are counted as duplicates because their reg_date is identical. I also want to count orders 1,2 and 4,5, as there's just 1 second difference.
SELECT
customer_id,
reg_date
FROM (
SELECT
*,
reg_date - lag(reg_date) OVER (PARTITION BY customer_id ORDER BY reg_date) <= interval '1 second' as is_duplicate
FROM
orders
) s
WHERE is_duplicate
Use the lag() window function. It lets you look at the previous record. With that value you can compute the difference and keep only the records where the time difference is at most one second.
Try the following script. It returns duplicates per day and customer.
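One thing to be aware of with the lag() approach is that it flags only the later row of each close pair. If both rows of a pair should be reported, a possible variation is to compare against the next row with lead() as well. The sketch below tries this in SQLite from Python; the sample orders are made up (they are not the fiddle data), and timestamps are turned into epoch seconds with strftime('%s', ...) because SQLite has no interval type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders (order_id integer, customer_id integer, reg_date text)"
)
# Hypothetical data: orders 1/2 are one second apart, 4/5 are identical.
conn.executemany(
    "insert into orders values (?, ?, ?)",
    [(1, 101, "2019-09-01 10:00:00"),
     (2, 101, "2019-09-01 10:00:01"),
     (3, 101, "2019-09-01 12:00:00"),
     (4, 104, "2019-09-02 08:00:00"),
     (5, 104, "2019-09-02 08:00:00")],
)

sql = """
    select order_id
    from (
        select order_id,
               -- seconds since the previous order of the same customer
               ts - lag(ts) over (partition by customer_id
                                  order by ts, order_id) as gap_prev,
               -- seconds until the next order of the same customer
               lead(ts) over (partition by customer_id
                              order by ts, order_id) - ts as gap_next
        from (
            select order_id, customer_id,
                   cast(strftime('%s', reg_date) as integer) as ts
            from orders
        ) as o
    ) as gaps
    where gap_prev <= 1 or gap_next <= 1   -- NULL gaps compare as not true
    order by order_id
"""
duplicate_ids = [row[0] for row in conn.execute(sql)]
print(duplicate_ids)
```

In PostgreSQL the epoch conversion should not be needed, since subtracting two timestamps already yields an interval that can be compared with interval '1 second'.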
SELECT
TO_CHAR(reg_date :: DATE, 'dd/mm/yyyy') reg_date,
customer_id,
count(*) as cnt
FROM orders
GROUP BY
TO_CHAR(reg_date :: DATE, 'dd/mm/yyyy'),
customer_id
HAVING count(*) >1

Using max function on count

I would like to return 1 result, the year (datetime format) with the highest number of orders, and I'm trying to apply the MAX function to my COUNT to get the value. Where have I gone wrong?
SELECT TO_CHAR(ODATE, 'YYYY') AS Year
, MAX(COUNT(*))
FROM ORDERS
GROUP BY TO_CHAR(ODATE, 'YYYY')
ORDER BY TO_CHAR(ODATE, 'YYYY');
I'm not sure MAX(COUNT(*)) is valid in this context.
Instead, order by COUNT(*) and use ROWNUM:
SELECT * FROM
(
SELECT TO_CHAR(ODATE, 'YYYY') AS Year, count(*) AS cnt
FROM ORDERS
GROUP BY TO_CHAR(ODATE, 'YYYY')
ORDER BY cnt DESC
)
WHERE ROWNUM = 1
This ensures that you keep only the row with the highest count. The nested query is needed because Oracle assigns ROWNUM before the ORDER BY is applied.
Note that on Oracle 12c and above you can use the FETCH FIRST x ROWS clause.
This lets you do the same without a subquery, because FETCH is applied after the ORDER BY:
SELECT TO_CHAR(ODATE, 'YYYY') AS Year, count(*) AS cnt
FROM ORDERS
GROUP BY TO_CHAR(ODATE, 'YYYY')
ORDER BY cnt DESC
FETCH FIRST 1 ROWS ONLY;
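If the goal is literally "the maximum of the counts", another option is to compute the counts in a subquery and compare each year's count against their maximum, which also keeps every year in case of a tie. The following is only a sketch: it runs on SQLite from Python so that it is self-contained, uses strftime('%Y', ...) in place of TO_CHAR, and the sample order dates are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (odate text)")
# Hypothetical data: 2 orders in 2019, 3 in 2020, 1 in 2021.
conn.executemany(
    "insert into orders values (?)",
    [("2019-03-01",), ("2019-07-15",),
     ("2020-01-10",), ("2020-05-20",), ("2020-11-30",),
     ("2021-02-02",)],
)

sql = """
    select strftime('%Y', odate) as year, count(*) as cnt
    from orders
    group by strftime('%Y', odate)
    having count(*) = (
        -- the aggregate of an aggregate, written as two query levels
        select max(cnt)
        from (
            select count(*) as cnt
            from orders
            group by strftime('%Y', odate)
        ) as per_year
    )
    order by year
"""
busiest_years = conn.execute(sql).fetchall()
print(busiest_years)
```

The inner query yields one count per year, max() reduces those to a single number, and the HAVING clause keeps the year or years that reach it.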

Get The Employee who has taken Minimum days of leave

Recently, in one of my interviews, I was asked to
write a query to find, for each department, the employee with the minimum days of leave in the previous 3 months.
The table structure was
EMPID | TO_DATE | FROM_DATE | LEAVE_TYPE | DEPT_NO
I was able to write the query
SELECT
min(days) FROM (SELECT id ,(sum((TO_DATE-FROM_DATE)+1) ) days ,dept
FROM emp_leave
WHERE to_date between ADD_MONTHS(sysdate,-3) AND sysdate group by id,dept)
group by dept ;
But when I try to select the emp_id, I have to add it to the GROUP BY clause.
I was stuck there.
I think the query should have been something like
select dept, min(id) keep (dense_rank first order by days)
from ( SELECT id ,
(sum((TO_DATE-FROM_DATE)+1) ) days ,
dept
FROM emp_leave
WHERE to_date between ADD_MONTHS(sysdate,-3)
AND sysdate group by id,dept)
group by dept
;
Of course, in SQL there are many different ways to do this, but when it comes to ranking, the FIRST/LAST functions are very useful.
Try this
SELECT id
,(sum((TO_DATE - FROM_DATE) + 1)) days
,dept
FROM emp_leave
WHERE to_date BETWEEN ADD_MONTHS(sysdate, - 3)
AND sysdate
GROUP BY id
,dept
ORDER BY days LIMIT 1
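Since the question asks for the result per department, one more way to express it is to rank the per-employee totals within each department and keep rank 1. The sketch below does this in SQLite from Python so it can be run directly; the sample rows are made up, julianday() replaces the date subtraction, and explicit start and end dates replace ADD_MONTHS(sysdate, -3) and sysdate, so those details are assumptions rather than the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table emp_leave (id integer, from_date text, to_date text, dept text)"
)
# Hypothetical data: in dept A, employee 1 took 3 days and employee 2 took 1;
# in dept B, employee 3 took 5 days and employee 4 took 2.
conn.executemany(
    "insert into emp_leave values (?, ?, ?, ?)",
    [(1, "2024-01-02", "2024-01-03", "A"),
     (1, "2024-02-10", "2024-02-10", "A"),
     (2, "2024-01-15", "2024-01-15", "A"),
     (3, "2024-02-01", "2024-02-05", "B"),
     (4, "2024-02-20", "2024-02-21", "B")],
)

sql = """
    select dept, id, days
    from (
        select dept, id, days,
               rank() over (partition by dept order by days) as rnk
        from (
            -- total leave days per employee and department
            select id, dept,
                   cast(sum(julianday(to_date) - julianday(from_date) + 1)
                        as integer) as days
            from emp_leave
            where to_date between ? and ?
            group by id, dept
        ) as totals
    ) as ranked
    where rnk = 1          -- fewest days within each department
    order by dept, id
"""
min_leave = conn.execute(sql, ("2024-01-01", "2024-03-31")).fetchall()
print(min_leave)
```

Employees with no leave rows at all in the period do not appear in emp_leave and so are not considered here; handling them would need a join against an employee table, which the question does not describe.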