I'm working on a homework problem, but we haven't learned about the DECODE function, only CASE. This week's unit is about using aggregate functions. Below is my homework question and what my professor wants as result-
"The following two questions are very challenging (you need to use DECODE function to complete them).
Create a query that will display the total number of employees and of that total
the number who were hired in the year 1980, 1981, 1982, and 1987. Give
appropriate column headings. (5 Points)
Total 1980 1981 1982 1987
----- ----- ----- ----- -----
14 1 10 1 2
Here is the query I ran on the server, along with its result. I only tried the year 1980 so I wouldn't waste time, but I also need 1981, 1982, and 1987-
SELECT COUNT(ename) AS "Total",
COUNT(DECODE(hiredate, '80', '1980'))
FROM emp;
Total COUNT(DECODE(HIREDATE,'80','1980'))
----- -----------------------------------
14 0
Here is the data in the 'hiredate' column-
HIREDATE
---------
17-NOV-81
01-MAY-81
09-JUN-81
02-APR-81
28-SEP-81
20-FEB-81
08-SEP-81
03-DEC-81
22-FEB-81
03-DEC-81
17-DEC-80
HIREDATE
---------
09-DEC-82
12-JAN-83
23-JAN-82
14 rows selected.
Thank you for anyone's help!
Assuming that the hiredate column has the data type DATE, let's do it without DECODE and use PIVOT instead:
Oracle Setup:
CREATE TABLE emp ( ename, hiredate ) AS
SELECT 'A', DATE '1980-01-01' FROM DUAL UNION ALL
SELECT 'B', DATE '1981-01-01' FROM DUAL UNION ALL
SELECT 'C', DATE '1981-02-01' FROM DUAL UNION ALL
SELECT 'D', DATE '1981-03-01' FROM DUAL UNION ALL
SELECT 'E', DATE '1981-04-01' FROM DUAL UNION ALL
SELECT 'F', DATE '1981-05-01' FROM DUAL UNION ALL
SELECT 'G', DATE '1981-06-01' FROM DUAL UNION ALL
SELECT 'H', DATE '1981-07-01' FROM DUAL UNION ALL
SELECT 'I', DATE '1981-08-01' FROM DUAL UNION ALL
SELECT 'J', DATE '1981-09-01' FROM DUAL UNION ALL
SELECT 'K', DATE '1981-10-01' FROM DUAL UNION ALL
SELECT 'L', DATE '1982-01-01' FROM DUAL UNION ALL
SELECT 'M', DATE '1987-01-01' FROM DUAL UNION ALL
SELECT 'N', DATE '1987-02-01' FROM DUAL;
Query:
SELECT *
FROM (
SELECT EXTRACT( YEAR FROM hiredate ) AS hireyear,
COUNT(*) OVER () AS "Total"
FROM emp
)
PIVOT ( COUNT(*) FOR hireyear IN ( 1980, 1981, 1982, 1987 ) )
Output:
Total | 1980 | 1981 | 1982 | 1987
----: | ---: | ---: | ---: | ---:
14 | 1 | 10 | 1 | 2
Query 2:
You can also do it with CASE:
SELECT COUNT(*) AS "Total",
COUNT(
CASE
WHEN hiredate >= DATE '1980-01-01' AND hiredate < DATE '1981-01-01'
THEN hiredate
END
) AS "1980",
COUNT(
CASE
WHEN hiredate >= DATE '1981-01-01' AND hiredate < DATE '1982-01-01'
THEN hiredate
END
) AS "1981",
COUNT(
CASE
WHEN hiredate >= DATE '1982-01-01' AND hiredate < DATE '1983-01-01'
THEN hiredate
END
) AS "1982",
COUNT(
CASE
WHEN hiredate >= DATE '1987-01-01' AND hiredate < DATE '1988-01-01'
THEN hiredate
END
) AS "1987"
FROM emp
Query 3:
Or with DECODE.
SELECT COUNT(*) AS "Total",
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1980, 'anything here' ) ) AS "1980",
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1981, 'anything here' ) ) AS "1981",
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1982, 'anything here' ) ) AS "1982",
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1987, 'anything here' ) ) AS "1987"
FROM emp
Note: COUNT only counts the rows where the value passed to it is not NULL. So the DECODE function can return any non-NULL value when it matches and the row will get counted. Conversely, when there is no match, DECODE returns NULL (its default when you don't supply a final default argument), so the row won't be counted.
So you could use any literal value. Like a string:
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1980, 'anything here' ) ) AS "1980"
or a number
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1980, 1 ) ) AS "1980"
or even the hiredate column (which if it has a year of 1980 then you know it isn't NULL):
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1980, hiredate ) ) AS "1980"
and if you want to be explicit about the return value from DECODE when it doesn't match then put in an extra NULL argument:
COUNT( DECODE( EXTRACT( YEAR FROM hiredate ), 1980, 42, NULL ) ) AS "1980"
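To see the NULL-skipping behaviour of COUNT in isolation, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite is an assumption made for portability: it has no DECODE, so a CASE without ELSE stands in for a DECODE without a default. The table and its four rows are made up for illustration.

```python
import sqlite3

def count_demo():
    """Return COUNT results showing that NULLs are never counted."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: the hire year is stored directly, one row has none.
    conn.execute("CREATE TABLE emp (ename TEXT, hireyear INTEGER)")
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?)",
        [("A", 1980), ("B", 1981), ("C", 1981), ("D", None)],
    )
    row = conn.execute(
        """
        SELECT COUNT(*),         -- every row
               COUNT(hireyear),  -- rows where hireyear is not NULL
               -- CASE without ELSE yields NULL when nothing matches,
               -- like DECODE without a default argument.
               COUNT(CASE hireyear WHEN 1981 THEN 'anything here' END),
               COUNT(CASE hireyear WHEN 1981 THEN 1 END)
        FROM emp
        """
    ).fetchone()
    conn.close()
    return row

print(count_demo())  # (4, 3, 2, 2)
```

The last two columns are equal, which is the point: the value returned on a match does not matter, only whether it is NULL.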
Your teacher wants you to do conditional aggregation:
SELECT
COUNT(ename) AS "Total",
SUM(DECODE(hire_date, 1980, 1, 0)) AS "1980",
SUM(DECODE(hire_date, 1981, 1, 0)) AS "1981",
SUM(DECODE(hire_date, 1982, 1, 0)) AS "1982",
SUM(DECODE(hire_date, 1987, 1, 0)) AS "1987"
FROM emp;
Within each SUM, DECODE() checks the hire_date against the target value, so each column counts only the records that have the relevant date.
This assumes that hire_date is actually a year, which seems counterintuitive. If it is a DATE, then you would need to extract the year part, like so:
SELECT
COUNT(ename) AS "Total",
SUM(DECODE(EXTRACT(YEAR from hire_date), 1980, 1, 0)) AS "1980",
SUM(DECODE(EXTRACT(YEAR from hire_date), 1981, 1, 0)) AS "1981",
SUM(DECODE(EXTRACT(YEAR from hire_date), 1982, 1, 0)) AS "1982",
SUM(DECODE(EXTRACT(YEAR from hire_date), 1987, 1, 0)) AS "1987"
FROM emp;
Please note that DECODE() is an Oracle function that is not part of standard SQL and is not supported by many other RDBMS. A more standard way to write this is to use CASE expressions, like:
SUM(CASE WHEN EXTRACT(YEAR from hire_date) = 1980 THEN 1 ELSE 0 END) as "1980"
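As a rough check of the SUM(CASE ... ELSE 0 END) form, here is a sketch in Python with the standard-library sqlite3 module. Using SQLite is an assumption: it has neither DECODE nor EXTRACT, so strftime('%Y', ...) takes the place of EXTRACT(YEAR FROM ...) and dates are stored as ISO text. The rows mirror the year distribution expected in the question (1, 10, 1 and 2 hires).

```python
import sqlite3

# Made-up hire dates with the same per-year counts as the question expects.
HIREDATES = (
    ["1980-01-01"]
    + [f"1981-{month:02d}-01" for month in range(1, 11)]
    + ["1982-01-01", "1987-01-01", "1987-02-01"]
)

def hires_per_year():
    """Return (total, 1980, 1981, 1982, 1987) via conditional aggregation."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (hire_date TEXT)")
    conn.executemany("INSERT INTO emp VALUES (?)", [(d,) for d in HIREDATES])
    row = conn.execute(
        """
        SELECT COUNT(*) AS total,
               SUM(CASE WHEN strftime('%Y', hire_date) = '1980' THEN 1 ELSE 0 END),
               SUM(CASE WHEN strftime('%Y', hire_date) = '1981' THEN 1 ELSE 0 END),
               SUM(CASE WHEN strftime('%Y', hire_date) = '1982' THEN 1 ELSE 0 END),
               SUM(CASE WHEN strftime('%Y', hire_date) = '1987' THEN 1 ELSE 0 END)
        FROM emp
        """
    ).fetchone()
    conn.close()
    return row

print(hires_per_year())  # (14, 1, 10, 1, 2)
```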
select total, "1980", "1981", "1982", "1987" from (select
count(*) total,
count(case to_char(hiredate,'YYYY') when '1980' then 1 end) "1980",
count(case to_char(hiredate,'YYYY') when '1981' then 1 end) "1981",
count(case to_char(hiredate,'YYYY') when '1982' then 1 end) "1982",
count(case to_char(hiredate,'YYYY') when '1987' then 1 end) "1987"
from emp);
I am creating a query to find the salary details of an employee whose date_to is '31-dec-4712' (the latest record).
But if two rows for an employee have a date_to of 31-dec-4712, then the one with status 'Approved' should be picked. In other cases, when only a single row comes back, that row should be returned as is.
I have created the below query for the salary details. I need help with the above scenario.
select distinct PAPF.EMPLOYEE_NUMBER ,
TO_CHAR (EMP_DOJ (PAPF.PERSON_ID),'DD-MON-YYYY' ) DOJ ,
TO_CHAR(HR_EMPLOYEE_ORIGINAL_DOJ(PAPF.EMPLOYEE_NUMBER,42) ,'DD-MON-YYYY' ) ORIGINAL_DOJ,
PPP.CHANGE_DATE,
PPP.DATE_TO,
PPP.PROPOSED_SALARY_N TOTAL_REMUN,
HR_GENERAL.DECODE_LOOKUP('PER_SAL_PROPOSAL_STATUS',APPROVED) status
from PER_ALL_ASSIGNMENTS_F PAAF,
PER_ALL_PEOPLE_F PAPF,
PER_PAY_PROPOSALS PPP
where 1 = 1
and PAPF.PERSON_ID = PAAF.PERSON_ID
and PAPF.BUSINESS_GROUP_ID = 21
and PAPF.CURRENT_EMPLOYEE_FLAG = 'Y'
and papf.employee_number = '109575'
and :P_DATE1 between PAAF.EFFECTIVE_START_DATE
and PAAF.EFFECTIVE_END_DATE
and :P_DATE1 between PAPF.EFFECTIVE_START_DATE
and PAPF.EFFECTIVE_END_DATE
and :P_DATE1 between PPP.CHANGE_DATE(+)
and NVL(PPP.DATE_TO, HR_GENERAL.END_OF_TIME)
and PPP.ASSIGNMENT_ID(+) = PAAF.ASSIGNMENT_ID
order by TO_NUMBER(PAPF.EMPLOYEE_NUMBER);
Emp_num DOJ ORIGINAL_DOJ CHANGE_DATE DATE_TO TOTAL_REMUN STATUS
109575 01-DEC-2016 24-JUL-2014 01-MAY-19 31-DEC-12 250000 Proposed
109575 01-DEC-2016 24-JUL-2014 01-APR-19 31-DEC-12 100000 Approved
You can use conditional ordering for each employee separately, like here:
-- sample rows
with salaries (emp_id, name, salary, date_to, status) as (
select 1001, 'Orange', 1400, date '4712-12-31', 'Rejected' from dual union all
select 1001, 'Orange', 1200, date '4712-12-31', 'Approved' from dual union all
select 1002, 'Red', 2500, date '4712-12-31', 'Approved' from dual union all
select 1003, 'Blue', 2700, date '4712-12-31', 'Proposed' from dual union all
select 1004, 'Green', 2200, date '2012-07-31', 'Approved' from dual union all
select 1005, 'White', 1200, date '4712-12-31', 'Approved' from dual union all
select 1005, 'White', 1300, date '4712-12-31', 'Rejected' from dual )
-- end of sample data
select emp_id, name, salary, date_to, status
from (
select s.*,
row_number() over (partition by emp_id
order by case status when 'Approved' then 1 end) rn
from salaries s
where date_to = date '4712-12-31')
where rn = 1
Result:
EMP_ID NAME SALARY DATE_TO STATUS
---------- ------ ---------- ----------- --------
1001 Orange 1200 4712-12-31 Approved
1002 Red 2500 4712-12-31 Approved
1003 Blue 2700 4712-12-31 Proposed
1005 White 1200 4712-12-31 Approved
If STATUS takes only two values, "Approved" and "Proposed", you can order by STATUS and fetch the first row. If you have (or will have in the future) more statuses and you want to define a priority, add a column to the select with a CASE expression that assigns each status its priority. Then order by this column and fetch the first row.
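The priority idea above can be sketched as follows, in Python with the standard-library sqlite3 module. Several things here are assumptions: SQLite instead of Oracle (window functions need SQLite 3.25 or later), dates stored as ISO text, and the priority order Approved, Proposed, then anything else, which is only a guess at what a real ranking might be.

```python
import sqlite3

SAMPLE = [
    (1001, "Orange", 1400, "4712-12-31", "Rejected"),
    (1001, "Orange", 1200, "4712-12-31", "Approved"),
    (1002, "Red", 2500, "4712-12-31", "Approved"),
    (1003, "Blue", 2700, "4712-12-31", "Proposed"),
    (1004, "Green", 2200, "2012-07-31", "Approved"),
    (1005, "White", 1200, "4712-12-31", "Approved"),
    (1005, "White", 1300, "4712-12-31", "Rejected"),
]

def latest_salaries():
    """Pick one open-ended row per employee, preferring higher-priority statuses."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE salaries (emp_id INTEGER, name TEXT, salary INTEGER, "
        "date_to TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO salaries VALUES (?, ?, ?, ?, ?)", SAMPLE)
    rows = conn.execute(
        """
        SELECT emp_id, name, salary, status
        FROM (SELECT s.*,
                     ROW_NUMBER() OVER (
                         PARTITION BY emp_id
                         -- Hypothetical priority: lower number wins.
                         ORDER BY CASE status
                                      WHEN 'Approved' THEN 1
                                      WHEN 'Proposed' THEN 2
                                      ELSE 3
                                  END) AS rn
              FROM salaries s
              WHERE date_to = '4712-12-31')
        WHERE rn = 1
        ORDER BY emp_id
        """
    ).fetchall()
    conn.close()
    return rows

for row in latest_salaries():
    print(row)
```

Giving every status an explicit rank avoids relying on how the database sorts NULLs, which differs between products.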
I have a table with emplid and end_date columns. For each emplid I want the max end_date. If at least one end_date is null, I want null to be returned as the max. So in this example:
emplid end_date
1 05/04/2019
1 05/10/2019
1 null
2 05/04/2019
2 05/10/2019
I want as result:
emplid end_date
1 null
2 05/10/2019
I tried something like
select emplid,
CASE
WHEN MAX(NVL(end_Date,'01/01/3000'))='01/01/3000' THEN null
ELSE end_date
END as end_dt
from people
group by emplid
but then I get a group-by error.
Maybe it is very easy, but I can't figure out how to get what I want.
with s(id, dt) as (
select 1, to_date('05/04/2019', 'dd/mm/yyyy') from dual union all
select 1, to_date('05/10/2019', 'dd/mm/yyyy') from dual union all
select 1, null from dual union all
select 2, to_date('05/04/2019', 'dd/mm/yyyy') from dual union all
select 2, to_date('05/10/2019', 'dd/mm/yyyy') from dual)
select id, decode(count(dt), count(*), max(dt)) max_dt
from s
group by id;
ID MAX_DT
---------- -----------------------------
1
2 2019-10-05 00:00:00
I would simply do:
select emplid,
(case when count(*) = count(end_date)
then max(end_date)
end) as max_end_date
from t
group by emplid;
There is no reason to introduce a "magic" maximum value (even if it is correct).
The first expression in the case is simply asking "do the number of non-NULL end-date values match the number of rows".
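A small sketch of that comparison, in Python with the standard-library sqlite3 module. SQLite and the ISO text dates are assumptions made so the example runs anywhere; the rows are the ones from the question, read as day/month/year.

```python
import sqlite3

def max_end_dates():
    """Return the max end_date per emplid, or None if any end_date is NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (emplid INTEGER, end_date TEXT)")
    conn.executemany(
        "INSERT INTO people VALUES (?, ?)",
        [
            (1, "2019-04-05"),
            (1, "2019-10-05"),
            (1, None),
            (2, "2019-04-05"),
            (2, "2019-10-05"),
        ],
    )
    rows = conn.execute(
        """
        SELECT emplid,
               -- COUNT(*) counts rows, COUNT(end_date) skips NULLs;
               -- they differ exactly when the group contains a NULL.
               CASE WHEN COUNT(*) = COUNT(end_date)
                    THEN MAX(end_date)
               END AS max_end_date
        FROM people
        GROUP BY emplid
        ORDER BY emplid
        """
    ).fetchall()
    conn.close()
    return rows

print(max_end_dates())  # [(1, None), (2, '2019-10-05')]
```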
Try this
SELECT
EMPLID,
CASE WHEN END_DATE='01/01/3000' THEN NULL ELSE END_DATE END AS END_DT
FROM
(
SELECT EMPLID, MAX(END_DATE) AS END_DATE FROM
(
SELECT EMPLID, NVL(END_DATE,'01/01/3000') AS END_DATE FROM PEOPLE
)
GROUP BY EMPLID
);
A CASE that references a non-aggregated column does not go with GROUP BY. You have to get the max value using GROUP BY first, then evaluate the null values. Try below.
select empid, CASE WHEN NVL(eDate,'01-DEC-3000')='01-DEC-3000' THEN null ELSE edate end end_dt from (
select empid, MAX(NVL(eDate,'01-DEC-3000')) eDate
from
(select 1 empid, sysdate-100 edate from dual union all
select 1 empid, sysdate-10 edate from dual union all
select 1 empid, null edate from dual union all
select 2 empid, sysdate-105 edate from dual union all
select 2 empid, sysdate-1 edate from dual ) datad
group by empid);
I have a problem with my employee_det table, where I am categorizing active employee status by year.
Example 1: if an employee joined on 01-01-2017 and was released from the company on 02-02-2018, then he/she falls under the 2017 bucket.
Example 2: if an employee joined on 01-02-2018 and was released on 01-15-2019, then he will be under the 2018 bucket.
If an employee joined on 01-01-2017 and is still with the company, then he must fall under 2019.
I have written the following query, which gives me accurate results, but next year I will need to add one more entry to the WHERE condition. Is there a generalized way to solve this instead?
select emp_id, ename, year(effective_start_date) as year_bucket
from employee_det
where worker_status = 'Active'
and manager_name like '%srinivas%'
and (
( date(effective_start_date) <= '2017-12-31'
and date(effective_end_date)>='2017-12-31' )
or
( date(effective_start_date) <= '2018-12-31'
and date(effective_end_date)>='2018-12-31' )
or
( date(effective_start_date) <= current_date()
and date(effective_end_date)>=current_date() )
)
You seem to want the start year for employees who have ended and the current year for active employees. So:
select emp_id, ename,
(case when effective_end_date > current_date
then year(current_date)
else year(effective_start_date)
end) as year_bucket
from employee_det
where worker_status = 'Active' and
manager_name like '%srinivas%';
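Here is a minimal sketch of that CASE, in Python with the standard-library sqlite3 module. SQLite is an assumption, so strftime('%Y', ...) replaces year(). The current date is passed in as a parameter (fixed at 2019-04-01 here, a made-up value) so that the result does not depend on when the code is run. The three employees are illustrative rows that match the examples in the question.

```python
import sqlite3

EMPLOYEES = [
    (1, "employee1", "2017-01-01", "2018-02-02", "Active", "srinivas"),
    (2, "employee2", "2018-01-02", "2019-01-15", "Active", "srinivas"),
    (3, "employee3", "2017-01-01", "2019-04-15", "Active", "srinivas"),
]

def year_buckets(today):
    """Bucket by start year, or by the current year if still employed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_det (emp_id INTEGER, ename TEXT, "
        "effective_start_date TEXT, effective_end_date TEXT, "
        "worker_status TEXT, manager_name TEXT)"
    )
    conn.executemany("INSERT INTO employee_det VALUES (?, ?, ?, ?, ?, ?)", EMPLOYEES)
    rows = conn.execute(
        """
        SELECT emp_id, ename,
               CAST(strftime('%Y',
                        CASE WHEN effective_end_date > :today
                             THEN :today
                             ELSE effective_start_date
                        END) AS INTEGER) AS year_bucket
        FROM employee_det
        WHERE worker_status = 'Active'
          AND manager_name LIKE '%srinivas%'
        ORDER BY emp_id
        """,
        {"today": today},
    ).fetchall()
    conn.close()
    return rows

print(year_buckets("2019-04-01"))
```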
Below is for BigQuery Standard SQL
#standardSQL
SELECT emp_id, ename,
EXTRACT(YEAR FROM IF(effective_end_date >= CURRENT_DATE, CURRENT_DATE, effective_start_date)) year_bucket
FROM `project.dataset.employee_det`
WHERE worker_status = 'Active'
AND manager_name LIKE '%srinivas%'
You can test, play with above using dummy data as in example below
#standardSQL
WITH `project.dataset.employee_det` AS (
SELECT 1 emp_id, 'employee1' ename, DATE '2017-01-01' effective_start_date, DATE '2018-02-02' effective_end_date, 'Active' worker_status, 'srinivas' manager_name UNION ALL
SELECT 2, 'employee2', '2018-01-02', '2019-01-15', 'Active', 'srinivas' UNION ALL
SELECT 3, 'employee3', '2017-01-01', '2019-04-15', 'Active', 'srinivas'
)
SELECT emp_id, ename,
EXTRACT(YEAR FROM IF(effective_end_date >= CURRENT_DATE, CURRENT_DATE, effective_start_date)) year_bucket
FROM `project.dataset.employee_det`
WHERE worker_status = 'Active'
AND manager_name LIKE '%srinivas%'
with result
Row emp_id ename year_bucket
1 1 employee1 2017
2 2 employee2 2018
3 3 employee3 2019
Update - excluding employees whose start and end YEAR is the same
You can just use one "generic" clause as below
WHERE EXTRACT(YEAR FROM effective_start_date) != EXTRACT(YEAR FROM effective_end_date)
so, the whole query now will be as in below example
#standardSQL
WITH `project.dataset.employee_det` AS (
SELECT 1 emp_id, 'employee1' ename, DATE '2017-01-01' effective_start_date, DATE '2018-02-02' effective_end_date, 'Active' worker_status, 'srinivas' manager_name UNION ALL
SELECT 2, 'employee2', '2018-01-02', '2019-01-15', 'Active', 'srinivas' UNION ALL
SELECT 3, 'employee3', '2017-01-01', '2019-04-15', 'Active', 'srinivas' UNION ALL
SELECT 4, 'employee4', '2017-01-01', '2017-04-15', 'Active', 'srinivas'
)
SELECT emp_id, ename,
EXTRACT(YEAR FROM IF(effective_end_date >= CURRENT_DATE, CURRENT_DATE, effective_start_date)) year_bucket
FROM `project.dataset.employee_det`
WHERE worker_status = 'Active'
AND manager_name LIKE '%srinivas%'
AND EXTRACT(YEAR FROM effective_start_date) != EXTRACT(YEAR FROM effective_end_date)
with result
Row emp_id ename year_bucket
1 1 employee1 2017
2 2 employee2 2018
3 3 employee3 2019
as you can see - employee4 is not included in any bucket
I have a table "Managers" that contains data as follows
I am expecting output like the below
or another output format I am expecting is
The conditions are
manager 1001 joined in 2018 and his end date is 9999, so he is active in 2018, 2019 and 2020
manager 1004 joined in 2018 and left the company in the same year, so he is active only in 2018
Please help me with how to achieve this.
Build a list of years and JOIN with it:
SELECT manager_id, yearnum, 'Active' AS status
FROM UNNEST(GENERATE_ARRAY(2018, 2020)) AS yearnum
JOIN managers ON yearnum BETWEEN EXTRACT(year FROM eff_start_date)
AND EXTRACT(year FROM eff_end_date)
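The same "list of years, then join" idea can be tried outside BigQuery. The sketch below uses Python with the standard-library sqlite3 module, which is an assumption: SQLite has no GENERATE_ARRAY, so a recursive CTE builds the years, and strftime('%Y', ...) replaces EXTRACT. The four managers are illustrative rows consistent with the conditions stated in the question.

```python
import sqlite3

MANAGERS = [
    (1001, "2018-02-10", "9999-12-31"),
    (1002, "2018-02-14", "2020-12-31"),
    (1003, "2018-02-16", "2019-02-15"),
    (1004, "2018-02-16", "2018-12-31"),
]

def active_years(first_year, last_year):
    """Return one (manager_id, year, status) row per year a manager was active."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE managers (manager_id INTEGER, eff_start_date TEXT, "
        "eff_end_date TEXT)"
    )
    conn.executemany("INSERT INTO managers VALUES (?, ?, ?)", MANAGERS)
    rows = conn.execute(
        """
        WITH RECURSIVE years(yearnum) AS (
            SELECT :first
            UNION ALL
            SELECT yearnum + 1 FROM years WHERE yearnum < :last
        )
        SELECT m.manager_id, y.yearnum, 'Active' AS status
        FROM years y
        JOIN managers m
          ON y.yearnum BETWEEN CAST(strftime('%Y', m.eff_start_date) AS INTEGER)
                           AND CAST(strftime('%Y', m.eff_end_date) AS INTEGER)
        ORDER BY m.manager_id, y.yearnum
        """,
        {"first": first_year, "last": last_year},
    ).fetchall()
    conn.close()
    return rows

for row in active_years(2018, 2020):
    print(row)
```

Because the year list is bounded, the open-ended 9999 end date simply matches every generated year rather than producing thousands of rows.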
Below is for BigQuery Standard SQL
#standardSQL
WITH years AS (
SELECT EXTRACT(YEAR FROM year) year
FROM ( SELECT
(SELECT MIN(eff_start_date) FROM `project.dataset.managers`) AS min_date,
(SELECT MAX(eff_end_date) FROM `project.dataset.managers` WHERE eff_end_date != '9999-12-31') max_date
), UNNEST(GENERATE_DATE_ARRAY(DATE_TRUNC(min_date, YEAR), DATE_TRUNC(max_date, YEAR), INTERVAL 1 YEAR)) year
), managers_list AS (
SELECT manager_id, status, EXTRACT(YEAR FROM eff_start_date) start_year, EXTRACT(YEAR FROM eff_end_date) end_year
FROM `project.dataset.managers`
)
SELECT manager_id, year, status
FROM years y, managers_list m
WHERE year BETWEEN start_year AND end_year
You can test, play with above using sample data from your question as in example below
#standardSQL
WITH `project.dataset.managers` AS (
SELECT 1001 manager_id, 'Active' status, DATE '2018-02-10' eff_start_date, DATE '9999-12-31' eff_end_date UNION ALL
SELECT 1002, 'Active', '2018-02-14', '2020-12-31' UNION ALL
SELECT 1003, 'Active', '2018-02-16', '2019-02-15' UNION ALL
SELECT 1004, 'Active', '2018-02-16', '2018-12-31'
), years AS (
SELECT EXTRACT(YEAR FROM year) year
FROM ( SELECT
(SELECT MIN(eff_start_date) FROM `project.dataset.managers`) AS min_date,
(SELECT MAX(eff_end_date) FROM `project.dataset.managers` WHERE eff_end_date != '9999-12-31') max_date
), UNNEST(GENERATE_DATE_ARRAY(DATE_TRUNC(min_date, YEAR), DATE_TRUNC(max_date, YEAR), INTERVAL 1 YEAR)) year
), managers_list AS (
SELECT manager_id, status, EXTRACT(YEAR FROM eff_start_date) start_year, EXTRACT(YEAR FROM eff_end_date) end_year
FROM `project.dataset.managers`
)
SELECT manager_id, year, status
FROM years y, managers_list m
WHERE year BETWEEN start_year AND end_year
-- ORDER BY manager_id, year
with result
Row manager_id year status
1 1001 2018 Active
2 1001 2019 Active
3 1001 2020 Active
4 1002 2018 Active
5 1002 2019 Active
6 1002 2020 Active
7 1003 2018 Active
8 1003 2019 Active
9 1004 2018 Active
I have below data
empid date amount
1 12-FEB-2017 10
1 12-FEB-2017 10
1 13-FEB-2017 10
1 14-FEB-2017 10
I need a query to return the total amount for a given id and date i.e, below result set
empid date amount
1 12-FEB-2017 20
1 13-FEB-2017 10
1 14-FEB-2017 10
But the thing is, I will be getting the date as input from the UI. If they pass the date, return the result for that date. If they don't pass the date, return the result for the most recent date.
Below is the query that I wrote, but it is only partially working.
SELECT sum(amount),empid,date
FROM employee emp,
where
((date= :ddd) OR aum_valutn_dt = (select max(date) from emp))
AND emp.id = '1'
group by (empid,date)
Please help..
I think you could do something like this, but it is pretty bad and you should try to do it some other way: it does extra work to get the most recent date.
select amt, empid, date
from
(
select amt, empid, date, rank() over (order by date desc) date_rank
from
(SELECT sum(amount) amt,empid,date
FROM employee emp
where emp.id = '1'
and (date = :ddd or :ddd is null)
group by empid, date)
)
where date = :ddd or (:ddd is null and date_rank=1)
Here's another option; scans TEST table twice so ... mind the performance.
SQL> with test (empid, datum, amount) as
2 (select 1, date '2017-02-12', 10 from dual union all
3 select 1, date '2017-02-12', 10 from dual union all
4 select 1, date '2017-02-13', 10 from dual union all
5 select 1, date '2017-02-14', 10 from dual
6 )
7 select t.empid, t.datum, sum(t.amount) sum_amount
8 from test t
9 where t.datum = (select max(t1.datum)
10 from test t1
11 where t1.empid = t.empid
12 and (t1.datum = to_date('&&par_datum', 'dd.mm.yyyy')
13 or '&&par_datum' is null)
14 )
15 group by t.empid, t.datum;
Enter value for par_datum: 13.02.2017
EMPID DATUM SUM_AMOUNT
---------- ---------- ----------
1 13.02.2017 10
SQL> undefine par_datum
SQL> /
Enter value for par_datum:
EMPID DATUM SUM_AMOUNT
---------- ---------- ----------
1 14.02.2017 10
SQL>
SELECT sum(amount),empid,date
FROM employee emp
where date = nvl(:ddd, (select max(date) from employee))
AND emp.id = '1'
group by (empid,date)
My solution is following:
with t (empid, datum, amount) as
(select 1, date '2017-02-12', 10 from dual union all
select 1, date '2017-02-12', 10 from dual union all
select 1, date '2017-02-13', 10 from dual union all
select 1, date '2017-02-14', 10 from dual
)
select empid, datum, s
from (select empid, datum, sum(amount) s, max(datum) over (partition by empid) md
from t
group by empid, datum)
where datum = nvl(to_date(:p, 'yyyy-mm-dd'), md);
Calculate the maximal date in the subquery and then, in the outer query, compare the date with nvl(to_date(:p, 'yyyy-mm-dd'), md). If the parameter is null, then the date field is compared with the maximal date.
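For anyone who wants to try the "parameter or else the maximal date" comparison without an Oracle instance, here is a sketch in Python with the standard-library sqlite3 module. The substitutions are assumptions: COALESCE stands in for NVL, dates are ISO text so no TO_DATE is needed, and the grouping is nested one level deeper than in the query above so that the window function runs over already-aggregated rows (window functions need SQLite 3.25 or later).

```python
import sqlite3

def amount_for(date_param):
    """Sum amounts for the given date, or for the latest date if None is passed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (empid INTEGER, datum TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [
            (1, "2017-02-12", 10),
            (1, "2017-02-12", 10),
            (1, "2017-02-13", 10),
            (1, "2017-02-14", 10),
        ],
    )
    rows = conn.execute(
        """
        SELECT empid, datum, s
        FROM (SELECT empid, datum, s,
                     -- Latest date per employee, attached to every row.
                     MAX(datum) OVER (PARTITION BY empid) AS md
              FROM (SELECT empid, datum, SUM(amount) AS s
                    FROM t
                    GROUP BY empid, datum))
        -- A NULL parameter falls back to the latest date.
        WHERE datum = COALESCE(:p, md)
        """,
        {"p": date_param},
    ).fetchall()
    conn.close()
    return rows

print(amount_for("2017-02-12"))  # [(1, '2017-02-12', 20)]
print(amount_for(None))          # [(1, '2017-02-14', 10)]
```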