SQL Grouping By - sql

Using ORACLE SQL.
I have a table 'Employees' with an attribute 'hire_date'. My task (a book exercise) is to write a SELECT that shows how many employees were hired in 1995, 1996, 1997 and 1998.
Something like:
TOTAL 1995 1996 1997 1998
-----------------------------------------
20 4 5 29 2
Counting the employees for a single year is easy, e.g.:
SELECT
COUNT(*)
FROM
employees e
WHERE
e.hire_date like '%95'
But I am having difficulty 'aggregating' the data into the required format.
Any suggestions?

I'm assuming your hire_date is a varchar2, since you are doing a "like" clause in your example.
Will a simple table with one row per year do?
If so, try this in Oracle:
select case grouping(hire_date)
when 0 then hire_date
else 'TOTAL'
end hire_date,
count(hire_date) as count_hire_date
from employees
group by rollup(hire_date);
That should give something like:
hire_date count_hire_date
1995 10
1996 20
1997 30
TOTAL 60
If you do need to pivot your results into something like you've shown in your question, then you can do the following if you know the distinct set of years prior to running the query. So for example, if you knew that you only had 1995, 1996 and 1997 in your table, then you could pivot the results using this:
SELECT
MAX(CASE WHEN hire_date = 'TOTAL' THEN ilv.count_hire_date END) total,
MAX(CASE WHEN hire_date = '1995' THEN ilv.count_hire_date END) count_1995,
MAX(CASE WHEN hire_date = '1996' THEN ilv.count_hire_date END) count_1996,
MAX(CASE WHEN hire_date = '1997' THEN ilv.count_hire_date END) count_1997
from (
select case grouping(hire_date)
when 0 then hire_date
else 'TOTAL'
end hire_date,
count(hire_date) as count_hire_date
from employees
group by rollup(hire_date)
) ilv;
This has the obvious disadvantage of you needing to add a new clause into the main select statement for each possible year.

The syntax is not intuitive. This leverages cut'n'paste coding:
SQL> select
sum(case when to_char(hiredate, 'YYYY') = '1980' then 1 else 0 end) as "1980"
, sum(case when to_char(hiredate, 'YYYY') = '1981' then 1 else 0 end) as "1981"
, sum(case when to_char(hiredate, 'YYYY') = '1982' then 1 else 0 end) as "1982"
, sum(case when to_char(hiredate, 'YYYY') = '1983' then 1 else 0 end) as "1983"
, sum(case when to_char(hiredate, 'YYYY') = '1987' then 1 else 0 end) as "1987"
, count(*) as total
from emp
/
1980 1981 1982 1983 1987 TOTAL
---------- ---------- ---------- ---------- ---------- ----------
1 10 1 0 2 20
Elapsed: 00:00:00.00
SQL>
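For anyone who wants to try the conditional-aggregation idea without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are made up, the helper name hires_by_year is hypothetical, and strftime('%Y', ...) is SQLite's stand-in for Oracle's to_char(hiredate, 'YYYY').

```python
import sqlite3


def hires_by_year(conn, years):
    """Return one row: a hire count per requested year, then the overall total."""
    # One SUM(CASE ...) column per year; the year values are bound as parameters.
    cols = ", ".join(
        "SUM(CASE WHEN strftime('%Y', hire_date) = ? THEN 1 ELSE 0 END)"
        for _ in years
    )
    sql = f"SELECT {cols}, COUNT(*) FROM employees"
    return conn.execute(sql, [str(y) for y in years]).fetchone()


conn = sqlite3.connect(":memory:")
# Assumption: hire_date is stored as ISO text (YYYY-MM-DD); the rows are invented.
conn.execute("CREATE TABLE employees (hire_date TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?)",
    [("1995-03-01",), ("1995-07-15",), ("1996-01-10",),
     ("1998-11-30",), ("2001-05-05",)],
)

row = hires_by_year(conn, [1995, 1996, 1997, 1998])
print(row)
```

Building the column list in a loop removes the cut'n'paste, but the set of years still has to be known before the query runs.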

Here's how I'd do it in MySQL; I don't know whether this applies to Oracle too:
SELECT
YEAR(hire_date), COUNT(*)
FROM
employees
GROUP BY
YEAR(hire_date)

SELECT NVL(hire_date,'Total'), count(hire_date)
FROM Employees GROUP BY rollup(hire_date);
If you need to PIVOT the data see A_M's answer. If you have years for which you have no data, yet still want the year to show up with a zero count you could do something like the following:
SELECT NVL(a.Year,b.Year), NVL2(a.Year,a.Count,0) FROM
(
SELECT NVL(hire_date,'Total') Year, count(hire_date) Count
FROM Employees GROUP BY rollup(hire_date)
) a
FULL JOIN
(
SELECT to_char(2000 + rownum,'FM0000') Year FROM dual CONNECT BY rownum<=9
) b ON a.Year = b.Year;
Here is some test data.
create table Employees (hire_date Varchar2(4));
insert into Employees values ('2005');
insert into Employees values ('2004');
insert into Employees values ('2006');
insert into Employees values ('2009');
insert into Employees values ('2009');
insert into Employees values ('2005');
insert into Employees values ('2004');
insert into Employees values ('2006');
insert into Employees values ('2006');
insert into Employees values ('2006');
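The zero-filling idea can be tried outside Oracle too. Below is a rough sketch with Python's sqlite3 module, using the test data above. SQLite has no CONNECT BY, so a recursive CTE generates the year list instead (that substitution is mine, not part of the answer), and the 'Total' row from ROLLUP is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (hire_date TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?)",
    [(y,) for y in ["2005", "2004", "2006", "2009", "2009",
                    "2005", "2004", "2006", "2006", "2006"]],
)

# Generate 2001..2009, then LEFT JOIN so that years without hires still appear.
sql = """
WITH RECURSIVE years(y) AS (
    SELECT 2001
    UNION ALL
    SELECT y + 1 FROM years WHERE y < 2009
)
SELECT CAST(years.y AS TEXT) AS year, COUNT(e.hire_date) AS hired
FROM years
LEFT JOIN Employees e ON e.hire_date = CAST(years.y AS TEXT)
GROUP BY years.y
ORDER BY years.y
"""
counts = dict(conn.execute(sql).fetchall())
print(counts)
```

COUNT(e.hire_date) counts only matched rows, which is what turns the unmatched years into zeros.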

Here's how I would do it in MS SQL - it will be similar in Oracle, but I don't want to try to give you Oracle code because I don't usually write it. This is just to get you a basic skeleton.
Select
Year(e.hire_date),
Count(1)
From
employees e
Group By
Year(e.hire_date)

I realize this question is 6 years old, but I found another way of doing this, using the DECODE function in Oracle.
select
count(decode(to_char(hire_date, 'yyyy') , '2005', hire_date, null)) hired_in_2005,
count(decode(to_char(hire_date, 'yyyy') , '2006', hire_date, null)) hired_in_2006,
count(decode(to_char(hire_date, 'yyyy') , '2007', hire_date, null)) hired_in_2007,
count(*) total_emp
from employees
where to_char(hire_date,'yyyy') IN ('2005','2006','2007')

Related

How do I Improve T-SQL query performance for retrieving the most recent date?

I have an employee table that contains the columns
employee_id, name, hire_date, termination_date, rehire_date, is_active
in SQL Server. I would like to retrieve the most recent date of hire, termination or rehire for each employee, but only if the employee is active.
The result should include the employee_id, name, and the most recent date. How can I achieve this with a single query?
I am able to do it using the below method:
SELECT
employee_id, name, MAX(date) as most_recent_date
FROM
(SELECT
employee_id, name, hire_date AS date
FROM
employee
UNION
SELECT
employee_id, name, termination_date
FROM
employee
UNION
SELECT
employee_id, name, rehire_date
FROM
employee) AS t
WHERE
employee_id IN (SELECT employee_id
FROM employee
WHERE is_active = 1)
GROUP BY
employee_id, name
This solution seems to work, but I am not sure if it's the most efficient way. I am also worried about the performance when the employee table is large.
Can anyone advise on a better and more efficient way to do this?
You can try this:
SELECT employee_id, name, (SELECT Max(v) FROM (VALUES (hire_date), (termination_date),(rehire_date)) AS value(v)) as most_recent_date
FROM employee
WHERE is_active = 1
Logically only one of those dates would be not null and thus it is a simple group by:
select employee_id, name,
max(coalesce(hire_date, termination_date, rehire_date)) as most_recent_date
from Employee
where is_Active = 1
group by employee_id, name;
But since the design already seems flawed (why not a single column for those dates, with another for the type?), we can't be sure that assumption holds. In that case you could use:
select employee_id, name, max(case
when
coalesce(hire_date,'00010101') > coalesce(termination_date,'00010101') and
coalesce(hire_date,'00010101') > coalesce(rehire_date,'00010101') then hire_date
when
coalesce(termination_date,'00010101') > coalesce(rehire_date,'00010101') then termination_date
else
coalesce(rehire_date,'00010101')
end) as most_recent_date
from Employee
where is_Active = 1
group by employee_id, name;
or one of the variations in other replies.
This is a little cheesy but I think it should work.
SELECT employee_id, name,
CASE
WHEN MAX(hire_date) > MAX(termination_date) AND
MAX(hire_date) > MAX(rehire_date) THEN MAX(hire_date)
WHEN MAX(termination_date) > MAX(hire_date) AND
MAX(termination_date) > MAX(rehire_date) THEN MAX(termination_date)
WHEN MAX(rehire_date) > MAX(hire_date) AND
MAX(rehire_date) > MAX(termination_date) THEN MAX(rehire_date)
END AS date
FROM employee
WHERE (is_active = 1)
GROUP BY employee_id, name
In Azure SQL Database and SQL Server 2022 we can use GREATEST. It will be something like this:
SELECT employee_id, name, GREATEST(hire_date,termination_date,rehire_date )
FROM employee
WHERE is_Active = 1
If you are not able to use this function, then do it the old way:
SELECT employee_id, name,
CASE
WHEN hire_date > ISNULL(termination_date, '1900-01-01') AND hire_date > ISNULL(rehire_date , '1900-01-01') THEN hire_date
WHEN termination_date > ISNULL(hire_date, '1900-01-01') AND termination_date > ISNULL(rehire_date , '1900-01-01') THEN termination_date
WHEN rehire_date > ISNULL(hire_date, '1900-01-01') AND rehire_date > ISNULL(termination_date , '1900-01-01') THEN rehire_date
END
FROM employee
WHERE is_Active = 1
There may be a tidier way to handle the NULLs.
In both cases, it should be better in terms of I/O, because the table is scanned once rather than once per date column.
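One way to make the NULL handling explicit is a low sentinel per column that is stripped again afterwards. The sketch below runs in SQLite through Python's sqlite3 module, not in SQL Server: SQLite's scalar multi-argument MAX plays the role of GREATEST, dates are assumed to be ISO text, and the three sample employees are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employee (
    employee_id INTEGER, name TEXT, hire_date TEXT,
    termination_date TEXT, rehire_date TEXT, is_active INTEGER)""")
conn.executemany("INSERT INTO employee VALUES (?,?,?,?,?,?)", [
    (1, "Ann", "2015-02-01", None, None, 1),
    (2, "Bob", "2012-06-10", "2018-03-31", "2020-09-01", 1),
    (3, "Cid", "2010-01-05", "2019-12-31", None, 0),
])

# COALESCE swaps each NULL for '' (sorts below any ISO date);
# NULLIF turns '' back into NULL if all three columns were NULL.
sql = """
SELECT employee_id, name,
       NULLIF(MAX(COALESCE(hire_date, ''),
                  COALESCE(termination_date, ''),
                  COALESCE(rehire_date, '')), '') AS most_recent_date
FROM employee
WHERE is_active = 1
ORDER BY employee_id
"""
latest = conn.execute(sql).fetchall()
print(latest)
```

The same shape should carry over to the CASE version above by choosing a sentinel date that is valid for the column's type.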

SQL get employees who salary increased more than $1000 from 1999 to 2000

Here is the salary table:
emp_id, salary, from_date, to_date.
from_date is the date on which a salary takes effect; to_date is the date on which it ends.
e.g.
emp_id, salary , from_date, to_date.
100, 1000, 2020-01-01, 2021-01-01
100, 2000, 2021-01-01, 2022-01-01
This person's salary from 2020 to 2021 was 1000; from 2021 to 2022 it was 2000.
This is what I have so far. Can someone double-check it, since something seems off?
Thanks in advance.
SELECT PrevSalaries1.emp_no, (CurrSalaries2.salary - PrevSalaries1.salary) as salarie_add
FROM
(
SELECT emp_no, salary
FROM salaries s1
WHERE YEAR(s1.to_date) = '1999'
GROUP BY s1.emp_no
HAVING s1.salary = min(s1.salary)
) PrevSalaries1
JOIN
(
SELECT emp_no, salary
FROM salaries s2
WHERE s2.to_date >= '2000-01-01' and s2.from_date < '2000-01-01'
GROUP BY s2.emp_no
HAVING s2.salary = max(s2.salary)
) CurrSalaries2
WHERE PrevSalaries1.emp_no = CurrSalaries2.emp_no
AND CurrSalaries2.salary - PrevSalaries1.salary > 1000;
Update 1:
I think it's important to mention that a person's salary may change (increase or decrease) multiple times in a year. That is why I use MIN to get the lowest salary in 1999 and MAX to get the highest salary in 2000. Hope this helps.
WITH prev AS
(
SELECT emp_no, salary
FROM frbi_exam.salaries s1
WHERE YEAR(s1.to_date) = '2000'
),
curr AS
(
SELECT emp_no, salary
FROM frbi_exam.salaries s1
WHERE YEAR(s1.to_date) = '2001'
)
SELECT curr.emp_no, curr.salary AS current_sal, prev.salary AS prev_sal
FROM curr
LEFT JOIN prev ON curr.emp_no = prev.emp_no
WHERE curr.salary - prev.salary > 1000
The curr CTE holds the current salary state and prev holds the previous one.
Then you only need to subtract one from the other.
Of course, this assumes that you only care about salary increases, hence the LEFT JOIN (people who left the company no longer have a current salary). Otherwise you could use a FULL OUTER JOIN and check the difference.
SELECT emp_no, salary
FROM frbi_exam.salaries s1
WHERE YEAR(s1.to_date) = '1999'
GROUP BY s1.emp_no
HAVING s1.salary = min(s1.salary)
This is not going to work, because your HAVING clause references a field that is neither in your GROUP BY clause nor inside an aggregate. Your SELECT doesn't make sense either. Select what you want: the emp_no and the minimum salary, right? So that's what you select. Aggregate functions (MIN, MAX, SUM, etc.) combine all the values passed to them (salary, in this case) for each combination of the GROUP BY fields (emp_no, in this case). So what you want is this:
SELECT emp_no, min(salary) as MinSalary
FROM salaries s1
WHERE YEAR(s1.to_date) = '1999'
GROUP BY s1.emp_no
I'll let you figure out how to fix the other sub query.
SELECT curr.emp_no, (curr.CurrSalary - prev.PrevSalary) AS Salary_Increment
FROM
( SELECT emp_no, max(salary) as CurrSalary -- Current Max salary
FROM salaries s1
WHERE YEAR(s1.to_date) = '2000'
GROUP BY s1.emp_no
) AS curr
LEFT JOIN
( SELECT emp_no, max(salary) as PrevSalary -- Previous year Max salary
FROM salaries s1
WHERE YEAR(s1.to_date) = '1999'
GROUP BY s1.emp_no
) AS prev
ON curr.emp_no = prev.emp_no
WHERE
(curr.CurrSalary - prev.PrevSalary) > 1000 -- Difference is more than 1000
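To check the aggregate-then-join shape end to end, here is a small sketch in SQLite via Python's sqlite3 module. It follows the asker's rule (lowest salary whose to_date falls in 1999, highest whose to_date falls in 2000). The sample rows are made up, and strftime('%Y', ...) stands in for MySQL's YEAR().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salaries (emp_no INTEGER, salary INTEGER, from_date TEXT, to_date TEXT)"
)
conn.executemany("INSERT INTO salaries VALUES (?,?,?,?)", [
    (100, 40000, "1998-06-01", "1999-06-01"),
    (100, 42500, "1999-06-01", "2000-06-01"),
    (101, 50000, "1998-09-01", "1999-09-01"),
    (101, 50500, "1999-09-01", "2000-09-01"),
])

# Aggregate each year down to one row per employee first, then join and compare.
sql = """
WITH prev AS (
    SELECT emp_no, MIN(salary) AS prev_salary
    FROM salaries
    WHERE strftime('%Y', to_date) = '1999'
    GROUP BY emp_no
),
curr AS (
    SELECT emp_no, MAX(salary) AS curr_salary
    FROM salaries
    WHERE strftime('%Y', to_date) = '2000'
    GROUP BY emp_no
)
SELECT curr.emp_no, curr.curr_salary - prev.prev_salary AS increase
FROM curr
JOIN prev ON prev.emp_no = curr.emp_no
WHERE curr.curr_salary - prev.prev_salary > 1000
"""
increases = conn.execute(sql).fetchall()
print(increases)
```

An inner join is used here because the WHERE condition on prev_salary would discard the unmatched rows of a LEFT JOIN anyway.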

Update table based on another table with date condition

UTILITYREADING table:
ROOMNUMBER  ELECTRICITYREADING  DATEOFREADING
----------  ------------------  -------------
N201        279.8               2/15/2022
N201        240.6               1/16/2022
N201        240.6               12/15/2021
N202        299.8               2/15/2022
N202        259.8               1/15/2022
UTILITYINVOICE table should look like this:
ROOMNUMBER  ELECCURRENT  ELECPREVIOUS  INVOICEDATE
----------  -----------  ------------  -----------
N201        279.8        240.6         2/19/2022
N202        299.8        259.8         2/19/2022
I was able to insert the ELECCURRENT value (or the current month February reading) by executing:
INSERT INTO UTILITYINVOICE (ROOMNUMBER, ELECCURRENT, INVOICEDATE)
SELECT
ROOMNUMBER, NVL(ELECTRICITYREADING, 0), SYSDATE
FROM UTILITYREADING
WHERE (extract(MONTH from DATEOFREADING) = extract(MONTH from SYSDATE))
AND (extract(YEAR from DATEOFREADING) = extract(YEAR from SYSDATE));
My problem is fetching the January (previous month's) reading. I tried this, but with no success:
UPDATE UTILITYINVOICE
SET ELECPREVIOUS = (SELECT NVL(ELECTRICITYREADING, 0)
FROM UTILITYREADING)
WHERE EXISTS (SELECT 1 FROM utilityreading
WHERE (extract(MONTH from UTILITYREADING.DATEOFREADING) = extract(MONTH from SYSDATE)-1)
AND (extract(YEAR from UTILITYREADING.DATEOFREADING) = extract(YEAR from SYSDATE))
AND utilityreading.roomnumber = utilityinvoice.roomnumber);
I'd say that you need the LAG analytic function (to select the previous electricity reading) and ROW_NUMBER (to rank the rows for each room number by date in descending order), and then fetch the row that ranks highest:
SQL> insert into utilityinvoice (roomnumber, eleccurrent, elecprevious, invoicedate)
with temp as
(select roomnumber,
electricityreading as eleccurrent,
lag(electricityreading) over (partition by roomnumber order by dateofreading) elecprevious,
row_number() Over (partition by roomnumber order by dateofreading desc) rn
from utilityreading
)
select roomnumber, eleccurrent, elecprevious, sysdate
from temp
where rn = 1;
2 rows created.
SQL> select * from utilityinvoice;
ROOM ELECCURRENT ELECPREVIOUS INVOICEDAT
---- ----------- ------------ ----------
N201 279,8 240,6 02/19/2022
N202 299,8 259,8 02/19/2022
SQL>
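The same LAG plus ROW_NUMBER pattern can be tried without Oracle. This is a sketch only, run in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). Dates are rewritten as ISO text so they sort correctly, and the CTE is named ranked, which is my own choice of name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE utilityreading "
    "(roomnumber TEXT, electricityreading REAL, dateofreading TEXT)"
)
conn.executemany("INSERT INTO utilityreading VALUES (?,?,?)", [
    ("N201", 279.8, "2022-02-15"),
    ("N201", 240.6, "2022-01-16"),
    ("N201", 240.6, "2021-12-15"),
    ("N202", 299.8, "2022-02-15"),
    ("N202", 259.8, "2022-01-15"),
])

# LAG looks one reading back within each room; rn = 1 marks the latest reading.
sql = """
WITH ranked AS (
    SELECT roomnumber,
           electricityreading AS eleccurrent,
           LAG(electricityreading) OVER
               (PARTITION BY roomnumber ORDER BY dateofreading) AS elecprevious,
           ROW_NUMBER() OVER
               (PARTITION BY roomnumber ORDER BY dateofreading DESC) AS rn
    FROM utilityreading
)
SELECT roomnumber, eleccurrent, elecprevious
FROM ranked
WHERE rn = 1
ORDER BY roomnumber
"""
invoice_rows = conn.execute(sql).fetchall()
print(invoice_rows)
```

Note that this picks the latest and second-latest readings per room, whatever month they fall in, rather than matching on the calendar month.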

Group by the Sub-Query which is derived from WITH CLAUSE [duplicate]

I'm getting an error from the query below when I try to group the sub-query. I built the sub-query using the WITH clause.
Please correct me.
WITH Student AS (SELECT * FROM CLASS WHERE SEX='M')
SELECT NAME, AGE, STATUS, SUM(TOTAL)
(SELECT
NAME,
'15' AS AGE,
CASE WHEN ATTENDANCE > 50 AND ATTENDANCE < 60 THEN 'GOOD'
WHEN ATTENDANCE > 60 'GREAT'
ELSE 'BAD' END AS STATUS
SUM (MARK) AS TOTAL
FROM STUDENT
GROUP BY NAME, ATTENDANCE ) A
GROUP BY NAME, AGE, STATUS
Error: SQL Query not properly ended
I think you are missing a from clause:
WITH Student AS (SELECT * FROM CLASS WHERE SEX='M')
SELECT NAME, AGE, STATUS, SUM(TOTAL)
FROM (SELECT NAME, '15' AS AGE,
(CASE WHEN ATTENDANCE > 50 AND ATTENDANCE < 60 THEN 'GOOD'
WHEN ATTENDANCE > 60 THEN 'GREAT' -- THEN was also missing in the original
ELSE 'BAD'
END) AS STATUS,
SUM(MARK) AS TOTAL
FROM STUDENT
GROUP BY NAME, ATTENDANCE
) A
GROUP BY NAME, AGE, STATUS
The query is much longer than it needs to be to achieve the output you want, but this seems to be the problem you have.
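As an illustration of the "shorter" version, here is one possible single-pass rewrite, sketched in SQLite through Python's sqlite3 module. The column types and sample rows are guesses, and the constant '15' AS AGE column is left out. The status is computed per row in a derived table and grouped once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed shape of CLASS; only the columns used by the query are included.
conn.execute("CREATE TABLE class (name TEXT, sex TEXT, attendance INTEGER, mark INTEGER)")
conn.executemany("INSERT INTO class VALUES (?,?,?,?)", [
    ("Al", "M", 55, 70),
    ("Al", "M", 55, 80),
    ("Al", "M", 65, 90),
    ("Bo", "M", 40, 60),
    ("Cy", "F", 70, 95),
])

# Label each row first, then group on the label; no second GROUP BY is needed.
sql = """
SELECT name, status, SUM(mark) AS total
FROM (SELECT name,
             CASE WHEN attendance > 50 AND attendance < 60 THEN 'GOOD'
                  WHEN attendance > 60 THEN 'GREAT'
                  ELSE 'BAD' END AS status,
             mark
      FROM class
      WHERE sex = 'M') s
GROUP BY name, status
ORDER BY name, status
"""
totals = conn.execute(sql).fetchall()
print(totals)
```

As in the original CASE, an attendance of exactly 60 still falls through to 'BAD'.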

Oracle: How can I determine if there are gaps records based on a date field?

I have an application that manages employee time sheets.
My tables look like:
TIMESHEET
TIMESHEET_ID
EMPLOYEE_ID
etc
TIMESHEET_DAY:
TIMESHEETDAY_ID
TIMESHEET_ID
DATE_WORKED
HOURS_WORKED
Each time sheet covers a 14 day period, so there are 14 TIMESHEET_DAY records for each TIMESHEET record. And if someone goes on vacation, they do not need to enter a timesheet if there are no hours worked during that 14 day period.
Now, I need to determine whether or not employees have a 7 day gap in the prior 6 months. This means I have to look for either 7 consecutive TIMESHEET_DAY records with 0 hours, OR a 7 day period with a combination of no records submitted and records submitted with 0 hours worked. I need to know the DATE_WORKED of the last TIMESHEET_DAY record with hours in that case.
The application is asp.net, so I could retrieve all of the TIMESHEET_DAY records and iterate through them, but I think there must be a more efficient way to do this with SQL.
-- Number each employee's worked days in date order, then pair every worked day (t1)
-- with the next worked day (t2) and keep the pairs that are at least 7 days apart.
-- t1.DATE_WORKED is then the last day with hours before the gap.
SELECT t1.EMPLOYEE_ID, t1.TIMESHEETDAY_ID, t1.DATE_WORKED
FROM (SELECT ROW_NUMBER() OVER(PARTITION BY EMPLOYEE_ID ORDER BY DATE_WORKED) AS RowNumber,
EMPLOYEE_ID, TIMESHEETDAY_ID, DATE_WORKED
FROM (SELECT EMPLOYEE_ID, TIMESHEETDAY_ID, DATE_WORKED
FROM TIMESHEET_DAY d
INNER JOIN TIMESHEET t ON t.TIMESHEET_ID = d.TIMESHEET_ID
GROUP BY EMPLOYEE_ID, TIMESHEETDAY_ID, DATE_WORKED
HAVING SUM(HOURS_WORKED) > 0) t ) t1
INNER JOIN
(SELECT ROW_NUMBER() OVER(PARTITION BY EMPLOYEE_ID ORDER BY DATE_WORKED) AS RowNumber,
EMPLOYEE_ID, TIMESHEETDAY_ID, DATE_WORKED
FROM (SELECT EMPLOYEE_ID, TIMESHEETDAY_ID, DATE_WORKED
FROM TIMESHEET_DAY d
INNER JOIN TIMESHEET t ON t.TIMESHEET_ID = d.TIMESHEET_ID
GROUP BY EMPLOYEE_ID, TIMESHEETDAY_ID, DATE_WORKED
HAVING SUM(HOURS_WORKED) > 0) t ) t2
ON t2.EMPLOYEE_ID = t1.EMPLOYEE_ID AND t2.RowNumber = t1.RowNumber + 1
WHERE t2.DATE_WORKED - t1.DATE_WORKED >= 7
I would do it at least partly with SQL by counting the days per timesheet and then do the rest of the logic in the program. This identifies time sheets with missing days:
SELECT * FROM
(SELECT s.timesheet_id, SUM(CASE WHEN d.hours_worked > 0 THEN 1 ELSE 0 END) AS days_worked
FROM
TimeSheet s
LEFT JOIN TimeSheet_Day d
ON s.timesheet_id = d.timesheet_id
GROUP BY
s.timesheet_id
HAVING SUM(CASE WHEN d.hours_worked > 0 THEN 1 ELSE 0 END) < 14) X
INNER JOIN TimeSheet
ON TimeSheet.timesheet_id = X.timesheet_id
LEFT JOIN TimeSheet_Day
ON TimeSheet.timesheet_id = TimeSheet_Day.timesheet_id
ORDER BY
TimeSheet.employee_id, TimeSheet.timesheet_id, TimeSheet_Day.date_worked
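If window functions are available, the gap check itself can stay in SQL. The sketch below uses SQLite through Python's sqlite3 module (3.25 or later), with invented rows and ISO text dates. It keeps only days with hours, pairs each with the previous worked day via LAG, and reports pairs more than 7 days apart. That threshold is my reading of "7 day gap". The 6-month window and a gap that runs up to today are not handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE timesheet (timesheet_id INTEGER, employee_id INTEGER)")
conn.execute(
    "CREATE TABLE timesheet_day "
    "(timesheetday_id INTEGER, timesheet_id INTEGER, date_worked TEXT, hours_worked REAL)"
)
conn.executemany("INSERT INTO timesheet VALUES (?,?)", [(10, 1), (20, 2)])
conn.executemany("INSERT INTO timesheet_day VALUES (?,?,?,?)", [
    (1, 10, "2023-01-02", 8), (2, 10, "2023-01-03", 8),
    (3, 10, "2023-01-05", 0),            # submitted, but zero hours
    (4, 10, "2023-01-12", 8),
    (5, 20, "2023-01-02", 8), (6, 20, "2023-01-06", 8),
])

# Zero-hour rows and missing rows are treated alike: both are simply not "worked".
sql = """
WITH worked AS (
    SELECT t.employee_id, d.date_worked
    FROM timesheet_day d
    JOIN timesheet t ON t.timesheet_id = d.timesheet_id
    WHERE d.hours_worked > 0
),
paired AS (
    SELECT employee_id,
           date_worked AS next_worked,
           LAG(date_worked) OVER
               (PARTITION BY employee_id ORDER BY date_worked) AS last_worked
    FROM worked
)
SELECT employee_id, last_worked, next_worked
FROM paired
WHERE julianday(next_worked) - julianday(last_worked) > 7
ORDER BY employee_id, last_worked
"""
gaps = conn.execute(sql).fetchall()
print(gaps)
```

In Oracle the date subtraction can be written directly as next_worked - last_worked, since subtracting two DATE values yields a number of days.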