ORACLE 12c - "not a single-group group function" - sql

I have a table of employees that contains names, dates of employment, and some more information.
I want to find out in which year the most employees were hired.
I wrote a query that counts the hires for each year:
SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date);
The result of this query is a set of tuples:
YEAR | EMPL_NUMBER
1993 | 3
1997 | 2
and so on...
And now I want to get max of EMPL_NUMBER:
SELECT YEAR, MAX(EMPL_NUMBER)
FROM (SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date));
And then I get an error:
ORA-00937: not a single-group group function
I don't understand why I get an error, because the subquery returns tuples with 2 columns.

You are using an aggregate function on the result of the subquery while also selecting the non-aggregated column YEAR, so if you need all the distinct YEAR values you must group by it:
SELECT T.YEAR, MAX(T.EMPL_NUMBER)
FROM (
SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date)
) T
GROUP BY T.YEAR ;
Otherwise, if you need the year that has the MAX(EMPL_NUMBER), you could use:
SELECT T.YEAR, T.EMPL_NUMBER
FROM (
SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date)
) T
WHERE (T.EMPL_NUMBER) IN (SELECT MAX(EMPL_NUMBER)
FROM (
SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date)
) T1 )
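To see how the second query behaves when two years tie for the highest count, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Oracle. The sample rows are invented for illustration, strftime replaces EXTRACT (which SQLite lacks), and a WITH clause replaces the repeated subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id_empl INTEGER PRIMARY KEY, empl_date TEXT)")
# Invented sample data: 1993 and 1997 both have two hires, 2001 has one.
conn.executemany(
    "INSERT INTO employees (id_empl, empl_date) VALUES (?, ?)",
    [(1, "1993-02-01"), (2, "1993-06-15"), (3, "1997-03-10"),
     (4, "1997-11-30"), (5, "2001-01-05")],
)

# Count hires per year once, then keep only the rows whose count equals the maximum.
query = """
WITH per_year AS (
    SELECT CAST(strftime('%Y', empl_date) AS INTEGER) AS year,
           COUNT(id_empl) AS empl_number
    FROM employees
    GROUP BY strftime('%Y', empl_date)
)
SELECT year, empl_number
FROM per_year
WHERE empl_number = (SELECT MAX(empl_number) FROM per_year)
ORDER BY year
"""
top_years = conn.execute(query).fetchall()
print(top_years)
```

With this data both tied years should come back, which is the behaviour to expect from the IN (SELECT MAX(...)) filter as well.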

In Oracle 12C, you can do:
SELECT EXTRACT(YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT(YEAR FROM e1.empl_date)
ORDER BY COUNT(e1.id_empl) DESC
FETCH FIRST 1 ROW ONLY;

One way to do this is to use an aggregate query as you were doing already, and then to use aggregate functions to their full extent. For example, using the FIRST/LAST function (and using the SCOTT schema, EMP table for illustration):
select min(extract(year from hiredate)) keep (dense_rank last order by count(empno)) as yr,
max(count(empno)) as emp_count
from emp
group by extract(year from hiredate)
;
YR EMP_COUNT
---- ---------
1981 10
There are two problems with this solution. First, many developers (including many experienced ones) seem unaware of the FIRST/LAST function, or otherwise unwilling to use it. The other, more serious problem is that in this problem it is possible that there are several years with the same, highest number of hires. The problem requirement must be more detailed than in the Original Post. What is the desired output when there are ties for first place?
The query above returns the earliest of all the different years when the max hires were achieved. Change MIN in the SELECT clause to MAX and you will get the most recent year when the highest number of hires happened. However, often we want a query that, in the case of ties, will return all the years tied for most hires. That cannot be done with the FIRST/LAST function.
For that, a compact solution would add an analytic function to your original query, to rank the years by number of hires. Then in an outer query just filter for the rows where rank = 1.
select yr, emp_count
from (
select extract(year from hiredate) as yr, count(empno) as emp_count,
rank() over (order by count(empno) desc) as rnk
from emp
group by extract(year from hiredate)
)
where rnk = 1
;
Or, using the max() analytic function in the SELECT clause of the subquery (instead of a rank-type analytic function):
select yr, emp_count
from (
select extract(year from hiredate) as yr, count(empno) as emp_count,
max(count(empno)) over () as max_count
from emp
group by extract(year from hiredate)
)
where emp_count = max_count
;
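The difference between "return every tied year" and "return a single row" can be tried out with a small sketch in Python's sqlite3 module, used here only as a stand-in for Oracle. It assumes SQLite 3.25 or later (for window functions); the rows are invented, and LIMIT 1 plays the role of FETCH FIRST 1 ROW ONLY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, hiredate TEXT)")
# Invented data: 1981 and 1987 tie with two hires each, 1980 has one.
conn.executemany(
    "INSERT INTO emp (empno, hiredate) VALUES (?, ?)",
    [(1, "1980-12-17"), (2, "1981-02-20"), (3, "1981-09-08"),
     (4, "1987-04-19"), (5, "1987-05-23")],
)

# Hires per year; reused as a derived table by both queries below.
per_year = """
    SELECT CAST(strftime('%Y', hiredate) AS INTEGER) AS yr,
           COUNT(empno) AS emp_count
    FROM emp
    GROUP BY strftime('%Y', hiredate)
"""

# Rank the years by count and keep rank 1: all tied years survive.
ranked = f"""
SELECT yr, emp_count
FROM (
    SELECT yr, emp_count, RANK() OVER (ORDER BY emp_count DESC) AS rnk
    FROM ({per_year})
)
WHERE rnk = 1
ORDER BY yr
"""
tied_years = conn.execute(ranked).fetchall()

# Sort and take one row: only the earliest of the tied years is returned.
first_only = conn.execute(
    f"SELECT yr, emp_count FROM ({per_year}) ORDER BY emp_count DESC, yr LIMIT 1"
).fetchall()

print(tied_years, first_only)
```

The ranked query should list both 1981 and 1987, while the single-row query keeps only 1981 because of the secondary sort on yr.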

I assume you want a single row showing the year in which the most people were hired:
SELECT * FROM (
SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR,
COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date)
ORDER BY COUNT(*) DESC)
WHERE ROWNUM=1;

Related

Determine which year-month has the highest and lowest value [duplicate]

Here's my first query, which shows the number of customers added per year-month:
select count(name) AS CUSTOMER,
extract(year from create_date) as yr,
extract(month from create_date) as mon
from x
group by extract(year from create_date),
extract(month from create_date)
order by yr desc, mon desc;
CUSTOMER | YR | MON
3 | 2019 | 07
4 | 2015 | 02
100 | 2014 | 09
3 | 2014 | 04
I tried the query
SELECT MAX(count(*))
FROM x
GROUP BY create_date;
In the results I get:
MAX(COUNT(*))
100
I also need to see the year and month in the result.
How can I do this?
The way I understood the question, you'd use the rank analytic function in a subquery (or a CTE) and fetch the rows whose count is either the minimum or the maximum. Something like this:
with temp as
(select to_char(create_date, 'yyyymm') yyyy_mm,
count(*) cnt,
--
rank() over (order by count(*) asc) rnk_min,
rank() over (order by count(*) desc) rnk_max
from x
group by to_char(create_date, 'yyyymm')
)
select yyyy_mm,
cnt
from temp
where rnk_min = 1
or rnk_max = 1;
You can use two levels of aggregation and put the results all in one row using keep (which implements a "first" aggregation function):
select max(num_customers) as max_num_customers,
max(yyyymm) keep (dense_rank first order by num_customers desc) as max_yyyymm,
min(num_customers) as min_num_customers,
max(yyyymm) keep (dense_rank first order by num_customers asc) as min_yyyymm
from (select to_char(create_date, 'YYYY-MM') as yyyymm,
count(*) AS num_customers
from x
group by to_char(create_date, 'YYYY-MM')
) ym;
From Oracle 12, you can use FETCH FIRST ROW ONLY to get the row with the highest number of customers (and, in the case of ties, the latest date):
SELECT count(name) AS CUSTOMER,
extract(year from create_date) as yr,
extract(month from create_date) as mon
FROM x
GROUP BY
extract(year from create_date),
extract(month from create_date)
ORDER BY
customer DESC,
yr DESC,
mon DESC
FETCH FIRST ROW ONLY;
If you want to include ties for the highest number of customers then:
SELECT count(name) AS CUSTOMER,
extract(year from create_date) as yr,
extract(month from create_date) as mon
FROM x
GROUP BY
extract(year from create_date),
extract(month from create_date)
ORDER BY
customer DESC
FETCH FIRST ROW WITH TIES;

SQL query to find employee with 3 year over year salary raises?

Here is the table. My initial idea is to query for the employees whose salary increases every year, but I am confused about how to do that. Employee 1 is the only employee with a three-year increase, but I'm not sure how to single them out. Thanks!
You can do this using lead()/lag() and aggregation:
select employee_id
from (select t.*,
lag(salary) over (partition by employee_id order by year) as prev_salary
from t
) t
group by employee_id
having min(salary - prev_salary) > 0 and
count(*) = 3;
This compares salaries in adjacent years and returns employees where the value is always increasing. It assumes that there are no gaps in the years -- as in your sample data.
The advantage of this approach is that you don't need to know the years in advance.
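The lag-and-aggregate idea can be checked on a toy data set. The sketch below uses Python's sqlite3 module (SQLite 3.25 or later is assumed for LAG); the table name t follows the answer, while the salary figures are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (employee_id INTEGER, year INTEGER, salary INTEGER)")
# Invented data: employee 1 gets a raise every year, employee 2 takes a cut in 2019.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 2018, 50000), (1, 2019, 55000), (1, 2020, 60000),
     (2, 2018, 50000), (2, 2019, 48000), (2, 2020, 52000)],
)

# prev_salary is NULL for each employee's first year; MIN ignores that NULL,
# so only the year-over-year differences are compared against zero.
query = """
SELECT employee_id
FROM (
    SELECT t.*,
           LAG(salary) OVER (PARTITION BY employee_id ORDER BY year) AS prev_salary
    FROM t
) AS s
GROUP BY employee_id
HAVING MIN(salary - prev_salary) > 0 AND COUNT(*) = 3
"""
raised = [row[0] for row in conn.execute(query)]
print(raised)
```

Only employee 1 should be returned: the smallest difference for employee 2 is negative, so the HAVING clause filters that group out.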
One option uses aggregation:
select employee_id
from mytable t
group by employee_id
having max(salary) filter(where year = 2020) > max(salary) filter(where year = 2019)
and max(salary) filter(where year = 2019) > max(salary) filter(where year = 2018)
This returns the employees whose 2020 salary is greater than their 2019 salary, and whose 2019 salary is greater than their 2018 salary, which is how I understood your question.
I guess you don't want to hardcode the years in your query.
The best choice is the use of the window function LAG() to get the salary of the previous 2 years, but you should also check that the 3 years that you check are consecutive:
SELECT DISTINCT employee_id
FROM (
SELECT *,
LAG(year, 1) OVER (PARTITION BY employee_id ORDER BY year) year1,
LAG(salary, 1) OVER (PARTITION BY employee_id ORDER BY year) salary1,
LAG(year, 2) OVER (PARTITION BY employee_id ORDER BY year) year2,
LAG(salary, 2) OVER (PARTITION BY employee_id ORDER BY year) salary2
FROM tablename t
)
WHERE year1 = year - 1 AND year2 = year - 2 AND salary > salary1 AND salary1 > salary2
If you want to check only for the current year and the 2 previous years then add 1 more condition in the WHERE clause:
...AND year = strftime('%Y', CURRENT_DATE)
so you don't need to hardcode the current year.

Use of Group BY in more than 1 column

I have made a table which has 3 columns:
Table
The table above has 3 columns, i.e. Transaction, EmpID, and Date, containing some data. I need a query whose result shows how many transactions each EmpID made in each month.
So the result table must be like:
So how to get month wise number of transactions per EmpID?
Get all employees (I assume you have an employee table).
Get all months (you could get them from the transaction table).
Cross join the two in order to get all combinations, because you want to show these in your result.
Outer join the transaction counts per month and employee (an aggregation subquery). SQL Server's date to string conversion is a bit awkward compared to other DBMS. You need to convert to a predefined format and use a substring of that.
Use COALESCE to turn null (for no count for the employee and month) to zero.
The query:
select m.month, e.empid, coalesce(t.cnt, 0) as transaction_count
from employees e
cross join (select distinct convert(varchar(7), date, 126) as month from transactions) m
left join
(
select convert(varchar(7), date, 126) as month, empid, count(*) as cnt
from transactions
group by convert(varchar(7), date, 126), empid
) t on t.empid = e.empid and t.month = m.month
order by m.month, e.empid;
If you don't want all employees, but only those that have at least one transaction in some month, then replace from employees e with from (select distinct empid from transactions) e.
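The steps above (cross join, outer join, COALESCE) can be sketched end to end with Python's sqlite3 module. This is an illustration under assumptions: substr on ISO-formatted text dates stands in for SQL Server's convert, the column name trans_date is hypothetical, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (empid INTEGER PRIMARY KEY);
CREATE TABLE transactions (
    trans_id INTEGER PRIMARY KEY, empid INTEGER, trans_date TEXT);
INSERT INTO employees VALUES (1), (2);
-- Invented data: employee 1 is active only in October, employee 2 only in November.
INSERT INTO transactions (empid, trans_date) VALUES
    (1, '2019-10-03'), (1, '2019-10-20'), (2, '2019-11-25');
""")

query = """
SELECT m.month, e.empid, COALESCE(t.cnt, 0) AS transaction_count
FROM employees e
CROSS JOIN (SELECT DISTINCT substr(trans_date, 1, 7) AS month
            FROM transactions) m
LEFT JOIN (
    SELECT substr(trans_date, 1, 7) AS month, empid, COUNT(*) AS cnt
    FROM transactions
    GROUP BY substr(trans_date, 1, 7), empid
) t ON t.empid = e.empid AND t.month = m.month
ORDER BY m.month, e.empid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every employee should appear once per month, with a zero where the outer join found no matching count.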
SELECT Date, EmpID, COUNT(transaction)
FROM your_table
GROUP BY Date, EmpID
To get the date in a format like "YYYY-MM", use:
For MySQL (replace the string with the date column):
DATE_FORMAT("2019-11-25 10:49:30.000", "%Y-%m")
For Oracle:
to_char(date, 'YYYY-MM')

Combining multiple SELECT queries in one

As the title says the query needs to combine multiple select queries. The question is as follows:
Display the total number of employees, and of that total the number of employees hired in 1995,1996,1997,1998.
My query:
select (select count(*) from employees) as "Total",
(select count(*) from employees where hire_date between 'JAN-1-0095' and 'DEC-1-0095')as "1995",
(select count(*) from employees where hire_date between 'JAN-1-0096' and 'DEC-1-0096') as "1996",
(select count(*) from employees where hire_date between 'JAN-1-0097' and 'DEC-1-0097') as "1997",
(select count(*) from employees where hire_date between 'JAN-1-0098' and 'DEC-1-0098') as "1998"
from employees
but the issue is instead of returning only single record this query is being executed for all the records in the table and hence producing the following output:
You can use conditional counting:
select count(*) as total_count,
count(case when extract(year from hire_date) = 1995 then 1 end) as "1995",
count(case when extract(year from hire_date) = 1996 then 1 end) as "1996",
count(case when extract(year from hire_date) = 1997 then 1 end) as "1997",
count(case when extract(year from hire_date) = 1998 then 1 end) as "1998"
from employees;
This makes use of the fact that aggregate functions ignore NULL values, and therefore count() will only count those rows where the case expression returns a non-null value.
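The conditional-counting technique is easy to verify on a few rows. Below is a minimal sketch using Python's sqlite3 module as a stand-in for Oracle; strftime replaces EXTRACT, and the hire dates are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, hire_date TEXT)")
# Invented data: two hires in 1995, one in 1996, none in 1997, one in 1998,
# and one outside the range of interest.
conn.executemany(
    "INSERT INTO employees (hire_date) VALUES (?)",
    [("1995-03-01",), ("1995-07-15",), ("1996-01-10",),
     ("1998-12-31",), ("2001-05-05",)],
)

# A CASE without ELSE yields NULL for non-matching rows, and COUNT skips NULLs.
query = """
SELECT COUNT(*) AS total_count,
       COUNT(CASE WHEN strftime('%Y', hire_date) = '1995' THEN 1 END) AS "1995",
       COUNT(CASE WHEN strftime('%Y', hire_date) = '1996' THEN 1 END) AS "1996",
       COUNT(CASE WHEN strftime('%Y', hire_date) = '1997' THEN 1 END) AS "1997",
       COUNT(CASE WHEN strftime('%Y', hire_date) = '1998' THEN 1 END) AS "1998"
FROM employees
"""
row = conn.execute(query).fetchone()
print(row)
```

The query returns a single row because the aggregates run over the whole table without any GROUP BY.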
Your query returns one row for each row in the employees table because you do not apply any grouping. Each select is a scalar sub-select that gets executed for each and every row in the employees table.
You could make it only return a single row if you replace the final from employees with from dual - but you'd still count over all rows within each sub-select.
You should also avoid implicit data type conversion like you did. 'JAN-1-0095' is a string and will implicitly be converted to a date depending on your NLS settings. Your query would not run if executed from my computer (because of different NLS settings).
As you are looking for a complete year, just comparing the year is a bit shorter to write and easier to understand (at least in my eyes).
Another option would be to use proper date literals, e.g. where hire_date between DATE '1995-01-01' and DATE '1995-12-31' or a bit more verbose using Oracle's to_date() function: where hire_date between to_date('1995-01-01', 'yyyy-mm-dd') and to_date('1995-12-31', 'yyyy-mm-dd')
Assuming the years are really what you want, the problem with your query is that you are selecting from employees, so you get a row for each one. You could use:
select (select count(*) from employees) as "Total",
(select count(*) from employees where hire_date between 'JAN-1-0095' and 'DEC-1-0095')as "1995",
(select count(*) from employees where hire_date between 'JAN-1-0096' and 'DEC-1-0096') as "1996",
(select count(*) from employees where hire_date between 'JAN-1-0097' and 'DEC-1-0097') as "1997",
(select count(*) from employees where hire_date between 'JAN-1-0098' and 'DEC-1-0098') as "1998"
from dual;
And I would use date '1998-01-01' for the date constants.
However, I prefer @a_horse_with_no_name's solution.
You should avoid using a lot of subqueries. You should try this:
SQL Server:
SELECT year(hire_date) AS hire_year, count(*) AS Total
FROM employees
WHERE year(hire_date) IN (1995, 1996, 1997, 1998)
GROUP BY year(hire_date) WITH CUBE
In Oracle:
SELECT extract(year from hire_date) AS hire_year, count(*) AS Total
FROM employees
WHERE extract(year from hire_date) IN (1995, 1996, 1997, 1998)
GROUP BY CUBE (extract(year from hire_date))
In addition to the per-year counts generated by the GROUP BY, the CUBE extension adds a grand-total row (with a NULL year). Note that this total covers only the employees hired in the four listed years, not all employees.

SQL Query to fetch number of employees joined over a calendar year, broken down per month

I'm trying to find the number of employees who joined over a calendar year, broken down on a monthly basis. So if 15 employees had joined in January, 30 in February and so on, the output I'd like would be
Month | Employees
------|-----------
Jan | 15
Feb | 30
I've come up with a query to fetch it for a particular month
SELECT * FROM (
SELECT COUNT(EMP_NO), EMP_JN_DT
FROM EMP_REG WHERE
EMP_JN_DT between '01-NOV-09' AND '30-NOV-09'
GROUP BY EMP_JN_DT )
ORDER BY 2
How do I extend this for the full calendar year?
SELECT Trunc(EMP_JN_DT,'MM') Emp_Jn_Mth,
Count(*)
FROM EMP_REG
WHERE EMP_JN_DT between date '2009-01-01' AND date '2009-12-31'
GROUP BY Trunc(EMP_JN_DT,'MM')
ORDER BY 1;
If nobody joined in a particular month, no row is returned for that month. To overcome this, you'd have to outer join the above to a list of the months in the required year.
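The outer join to a list of months could look roughly like the sketch below, written with Python's sqlite3 module as a stand-in for Oracle. The recursive CTE that generates the months 1 to 12 is an assumption of this sketch (in Oracle a row generator such as CONNECT BY LEVEL <= 12 on dual is a common alternative), and the join dates are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_reg (emp_no INTEGER PRIMARY KEY, emp_jn_dt TEXT)")
# Invented data: two joiners in January 2009, one in March 2009, one in 2010.
conn.executemany(
    "INSERT INTO emp_reg (emp_jn_dt) VALUES (?)",
    [("2009-01-12",), ("2009-01-30",), ("2009-03-02",), ("2010-01-04",)],
)

# The year filter lives in the ON clause so that months without joiners
# still survive the LEFT JOIN; COUNT(e.emp_no) then yields 0 for them.
query = """
WITH RECURSIVE months(mth) AS (
    SELECT 1
    UNION ALL
    SELECT mth + 1 FROM months WHERE mth < 12
)
SELECT m.mth, COUNT(e.emp_no) AS employees
FROM months m
LEFT JOIN emp_reg e
       ON CAST(strftime('%m', e.emp_jn_dt) AS INTEGER) = m.mth
      AND e.emp_jn_dt BETWEEN '2009-01-01' AND '2009-12-31'
GROUP BY m.mth
ORDER BY m.mth
"""
counts = conn.execute(query).fetchall()
print(counts)
```

All twelve months should appear, with zero for every month in which nobody joined.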
SELECT to_char(EMP_JN_DT,'MON') "Month", COUNT(EMP_NO) "Employees"
FROM EMP_REG
WHERE EMP_JN_DT between date '2009-01-01' AND date '2009-12-31'
GROUP BY to_char(EMP_JN_DT,'MON')
ORDER BY MIN(EMP_JN_DT);
http://www.techonthenet.com/oracle/functions/extract.php
There is a function that returns the month. All you need to do is put it in the GROUP BY clause.
The number of employees in January can be selected in the following way:
SELECT EXTRACT(MONTH FROM HIREDATE) AS MONTH1, COUNT(*)
FROM employee
WHERE EXTRACT(MONTH FROM HIREDATE)=1
GROUP BY EXTRACT(MONTH FROM HIREDATE)