Oracle SQL - how to select and combine data from multiple rows - sql

The scenario is that for a particular date range, I need to return all rows where a field value has changed (FROM and TO date). The changed field values will then be shown on screen in a different colour.
I can return all employee records where a change has occurred for the specific time period. However, if an employee's record has changed multiple times within the selected time period, I need to return a single employee record with a combined value for each of the different X_FLAG columns. 1 indicates that a change has occurred, 0 indicates no change.
Table DDL is:
CREATE TABLE "EMPLOYEE_DATA"
( "EMPLOYEE_ID" NUMBER(20,0),
"EMPLOYEE_NAME" VARCHAR2(100 BYTE),
"EMPLOYEE_NAME_FLAG" NUMBER(1,0),
"EMPLOYEE_ROLE" VARCHAR2(100 BYTE),
"EMPLOYEE_ROLE_FLAG" NUMBER(1,0),
"EMPLOYEE_SALARY" VARCHAR2(100 BYTE),
"EMPLOYEE_SALARY_FLAG" NUMBER(1,0),
"DATE_VALID_FROM" DATE,
"DATE_VALID_TO" DATE,
"HAS_RECORD_CHANGED" NUMBER(1,0),
"CURRENT_ROW_IND" NUMBER(1,0)
);
Mock data is:
Insert into EMPLOYEE_DATA (EMPLOYEE_ID,EMPLOYEE_NAME,EMPLOYEE_NAME_FLAG,EMPLOYEE_ROLE,EMPLOYEE_ROLE_FLAG,EMPLOYEE_SALARY,EMPLOYEE_SALARY_FLAG, DATE_VALID_FROM,DATE_VALID_TO,HAS_RECORD_CHANGED,CURRENT_ROW_IND) values (1,'John Smith',0,'Associate',0,'1',0,to_date('01-FEB-17','DD-MON-RR'),to_date('28-FEB-17','DD-MON-RR'),0,0);
Insert into EMPLOYEE_DATA (EMPLOYEE_ID,EMPLOYEE_NAME,EMPLOYEE_NAME_FLAG,EMPLOYEE_ROLE,EMPLOYEE_ROLE_FLAG,EMPLOYEE_SALARY,EMPLOYEE_SALARY_FLAG, DATE_VALID_FROM,DATE_VALID_TO,HAS_RECORD_CHANGED,CURRENT_ROW_IND) values (2,'Katy Brown',0,'Team Leader',0,'7',0, to_date('01-FEB-17','DD-MON-RR'),to_date('28-FEB-17','DD-MON-RR'),0,0);
Insert into EMPLOYEE_DATA (EMPLOYEE_ID,EMPLOYEE_NAME,EMPLOYEE_NAME_FLAG,EMPLOYEE_ROLE,EMPLOYEE_ROLE_FLAG,EMPLOYEE_SALARY,EMPLOYEE_SALARY_FLAG, DATE_VALID_FROM,DATE_VALID_TO,HAS_RECORD_CHANGED,CURRENT_ROW_IND) values (2,'Katy Brown',0,'Team Leader',0,'7',0, to_date('01-APR-17','DD-MON-RR'),to_date('31-DEC-99','DD-MON-RR'),1,1);
Insert into EMPLOYEE_DATA (EMPLOYEE_ID,EMPLOYEE_NAME,EMPLOYEE_NAME_FLAG,EMPLOYEE_ROLE,EMPLOYEE_ROLE_FLAG,EMPLOYEE_SALARY,EMPLOYEE_SALARY_FLAG, DATE_VALID_FROM,DATE_VALID_TO,HAS_RECORD_CHANGED,CURRENT_ROW_IND) values (3,'Ian Jones',1,'Delivery Manager',1,'3',1, to_date('01-MAR-17','DD-MON-RR'),to_date('31-DEC-99','DD-MON-RR'),1,1);
Insert into EMPLOYEE_DATA (EMPLOYEE_ID,EMPLOYEE_NAME,EMPLOYEE_NAME_FLAG,EMPLOYEE_ROLE,EMPLOYEE_ROLE_FLAG,EMPLOYEE_SALARY,EMPLOYEE_SALARY_FLAG, DATE_VALID_FROM,DATE_VALID_TO,HAS_RECORD_CHANGED,CURRENT_ROW_IND) values (1,'John Smith',0,'Analyst',1,'1',0, to_date('01-MAR-17','DD-MON-RR'),to_date('31-MAR-17','DD-MON-RR'),1,0);
Insert into EMPLOYEE_DATA (EMPLOYEE_ID,EMPLOYEE_NAME,EMPLOYEE_NAME_FLAG,EMPLOYEE_ROLE,EMPLOYEE_ROLE_FLAG,EMPLOYEE_SALARY,EMPLOYEE_SALARY_FLAG, DATE_VALID_FROM,DATE_VALID_TO,HAS_RECORD_CHANGED,CURRENT_ROW_IND) values (1,'John Smith',0,'Analyst',0,'2',1, to_date('01-APR-17','DD-MON-RR'),to_date('31-DEC-99','DD-MON-RR'),1,1);
My query is:
SELECT *
FROM EMPLOYEE_DATA
WHERE DATE_VALID_FROM <= TO_DATE('01/04/2017', 'dd/mm/yyyy')
AND DATE_VALID_TO >= TO_DATE('01/01/2017', 'dd/mm/yyyy')
AND HAS_RECORD_CHANGED = '1'
ORDER BY EMPLOYEE_ID ASC, DATE_VALID_FROM ASC;
In my final query I will add the line AND CURRENT_ROW_IND = '1' to bring back only the current record. I left it out here to show how the data for "John Smith" needs to be combined: EMPLOYEE_ROLE_FLAG and EMPLOYEE_SALARY_FLAG from his previous and current records have to be merged.
EDIT - Added original and target results. If possible, I would need to aggregate to get the max of the X_FLAG columns for each unique employee.
Original
EMPLOYEE_ID EMPLOYEE_NAME EMPLOYEE_NAME_FLAG EMPLOYEE_ROLE EMPLOYEE_ROLE_FLAG EMPLOYEE_SALARY EMPLOYEE_SALARY_FLAG DATE_VALID_FROM DATE_VALID_TO HAS_RECORD_CHANGED CURRENT_ROW_IND
1 John Smith 0 Associate 0 1 0 01-Feb-17 28-Feb-17 0 0
2 Katy Brown 0 Team Leader 0 7 0 01-Feb-17 28-Feb-17 0 0
2 Katy Brown 0 Team Leader 0 7 0 01-Apr-17 31-Dec-99 1 1
3 Ian Jones 1 Delivery Manager 1 3 1 01-Mar-17 31-Dec-99 1 1
1 John Smith 0 Analyst 1 0 0 01-Mar-17 31-Mar-17 1 0
1 John Smith 0 Analyst 0 1 1 01-Apr-17 31-Dec-99 1 1
Target
EMPLOYEE_ID EMPLOYEE_NAME EMPLOYEE_NAME_FLAG EMPLOYEE_ROLE EMPLOYEE_ROLE_FLAG EMPLOYEE_SALARY EMPLOYEE_SALARY_FLAG DATE_VALID_FROM DATE_VALID_TO HAS_RECORD_CHANGED CURRENT_ROW_IND
1 John Smith 0 Analyst 1 1 1 01-Apr-17 31-Dec-99 1 1
2 Katy Brown 0 Team Leader 0 7 0 01-Apr-17 31-Dec-99 1 1
3 Ian Jones 1 Delivery Manager 1 3 1 01-Mar-17 31-Dec-99 1 1

Consider derived tables, where a unit-level query is joined to an aggregate query that calculates the maximum flags:
SELECT emp.*
FROM
(SELECT *
FROM EMPLOYEE_DATA
WHERE DATE_VALID_FROM BETWEEN TO_DATE('01/01/2017', 'dd/mm/yyyy')
AND TO_DATE('01/04/2017', 'dd/mm/yyyy')
AND HAS_RECORD_CHANGED = 1
) emp
INNER JOIN
(SELECT EMPLOYEE_ID, MAX(EMPLOYEE_NAME_FLAG) AS MAX_NAME_FLAG,
MAX(EMPLOYEE_ROLE_FLAG) AS MAX_ROLE_FLAG,
MAX(EMPLOYEE_SALARY_FLAG) AS MAX_SALARY_FLAG
FROM EMPLOYEE_DATA
WHERE DATE_VALID_FROM BETWEEN TO_DATE('01/01/2017', 'dd/mm/yyyy')
AND TO_DATE('01/04/2017', 'dd/mm/yyyy')
AND HAS_RECORD_CHANGED = 1
GROUP BY EMPLOYEE_ID
) agg
ON emp.EMPLOYEE_ID = agg.EMPLOYEE_ID
AND emp.EMPLOYEE_NAME_FLAG = agg.MAX_NAME_FLAG
AND emp.EMPLOYEE_ROLE_FLAG = agg.MAX_ROLE_FLAG
AND emp.EMPLOYEE_SALARY_FLAG = agg.MAX_SALARY_FLAG
ORDER BY emp.EMPLOYEE_ID ASC, emp.DATE_VALID_FROM ASC
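One caveat worth checking with the mock data: matching a row on all three maxima only returns an employee if a single row carries every maximum, and John Smith's role and salary flags appear to be split across two rows. A minimal sketch of a variant that takes the current row and overlays the per-employee maxima is below. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle, with a trimmed-down table and ISO date strings; those simplifications are assumptions for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed-down copy of EMPLOYEE_DATA (assumption: only the columns needed here).
conn.execute("""
    CREATE TABLE employee_data (
        employee_id INTEGER, employee_name TEXT,
        employee_role TEXT, employee_role_flag INTEGER,
        employee_salary_flag INTEGER,
        date_valid_from TEXT, has_record_changed INTEGER,
        current_row_ind INTEGER)
""")
rows = [
    (1, "John Smith", "Associate", 0, 0, "2017-02-01", 0, 0),
    (1, "John Smith", "Analyst", 1, 0, "2017-03-01", 1, 0),
    (1, "John Smith", "Analyst", 0, 1, "2017-04-01", 1, 1),
    (2, "Katy Brown", "Team Leader", 0, 0, "2017-02-01", 0, 0),
    (2, "Katy Brown", "Team Leader", 0, 0, "2017-04-01", 1, 1),
    (3, "Ian Jones", "Delivery Manager", 1, 1, "2017-03-01", 1, 1),
]
conn.executemany("INSERT INTO employee_data VALUES (?,?,?,?,?,?,?,?)", rows)

# Current row per employee, with flags replaced by the maxima over the
# changed rows in the selected period.
query = """
    SELECT emp.employee_id, emp.employee_role,
           agg.max_role_flag, agg.max_salary_flag
    FROM employee_data emp
    JOIN (SELECT employee_id,
                 MAX(employee_role_flag) AS max_role_flag,
                 MAX(employee_salary_flag) AS max_salary_flag
          FROM employee_data
          WHERE has_record_changed = 1
            AND date_valid_from BETWEEN '2017-01-01' AND '2017-04-01'
          GROUP BY employee_id) agg
      ON emp.employee_id = agg.employee_id
    WHERE emp.current_row_ind = 1
    ORDER BY emp.employee_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this data John Smith comes back once, as Analyst with both the role and salary flags set to 1.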

You wrote: "If possible, I would need to aggregate to get the max of the X_FLAG columns for each unique employee."
For that requirement GROUP BY should work.
The query below produces your target:
SELECT EMPLOYEE_ID
, EMPLOYEE_NAME
, MIN(EMPLOYEE_NAME_FLAG) EMPLOYEE_NAME_FLAG
, EMPLOYEE_ROLE
, MAX(EMPLOYEE_ROLE_FLAG) EMPLOYEE_ROLE_FLAG
, MIN(EMPLOYEE_SALARY) EMPLOYEE_SALARY
, MAX(EMPLOYEE_SALARY_FLAG) EMPLOYEE_SALARY_FLAG
, MAX(DATE_VALID_FROM) DATE_VALID_FROM
, MAX(DATE_VALID_TO) DATE_VALID_TO
, HAS_RECORD_CHANGED
, MAX(CURRENT_ROW_IND) CURRENT_ROW_IND
FROM EMPLOYEE_DATA
WHERE HAS_RECORD_CHANGED = 1
AND DATE_VALID_FROM BETWEEN TO_DATE('01/01/2017', 'dd/mm/yyyy')
AND TO_DATE('01/04/2017', 'dd/mm/yyyy')
GROUP BY EMPLOYEE_ID
, EMPLOYEE_NAME
, EMPLOYEE_ROLE
, HAS_RECORD_CHANGED
ORDER BY EMPLOYEE_ID ASC
, DATE_VALID_FROM ASC;

Related

display the number of excluded days from holiday leave

I have a leaves_table which contains id, holiday_start and holiday_end. I have another table, leaves_holiday, which contains the public holiday name and its date. I want to add a new column to the leaves_table output that counts the days in each leave period that fall on a public holiday.
For example:
leaves_table
id. holiday_start. holiday_end
1. 09-Jul-2022. 13-Jul-2022
public holiday table
holiday_name. holiday_date
christmas 10-Jul-2022
The query should return the number of excluded days as 1:
id. holiday_start. holiday_end. excluded days
1 09-Jul-2022. 13-Jul-2022. 1
How do I do this?
Here are the CREATE TABLE and INSERT statements:
create table XX_LEAVES_EXCLUDES
(
exclude_id number not null primary key,
holiday_start date not null,
holiday_end date not null
);
create sequence seq_exclude_id MINVALUE 1
START WITH 1
INCREMENT BY 1
CACHE 2;
create or replace trigger trg_exclude_id
before insert
on XX_LEAVES_EXCLUDES
for each row
begin
:new.exclude_id:=seq_exclude_id.nextval;
end;
INSERT INTO XX_LEAVES_EXCLUDES (HOLIDAY_START, HOLIDAY_END) VALUES ('23-Jul-2022','20-Aug-2022');
INSERT INTO XX_LEAVES_EXCLUDES (HOLIDAY_START, HOLIDAY_END) VALUES ('01-Jul-2022','02-Aug-2022');
INSERT INTO XX_LEAVES_EXCLUDES (HOLIDAY_START, HOLIDAY_END) VALUES ('13-Jul-2022','29-Aug-2022');
INSERT INTO XX_LEAVES_EXCLUDES (HOLIDAY_START, HOLIDAY_END) VALUES ('12-Jul-2022','01-Aug-2022');
INSERT INTO XX_LEAVES_EXCLUDES (HOLIDAY_START, HOLIDAY_END) VALUES ('01-Jul-2022','29-Aug-2022');
INSERT INTO XX_LEAVES_EXCLUDES (HOLIDAY_START, HOLIDAY_END) VALUES ('08-Jul-2022','08-Aug-2022');
INSERT INTO XX_LEAVES_EXCLUDES (HOLIDAY_START, HOLIDAY_END) VALUES ('03-Jul-2022','20-Aug-2022');
2nd table (public holiday calendar table)
CREATE TABLE "XX_LEAVES_PUBLIC_HOLIDAYS"
( "PUBLIC_HOLIDAY_UAE_YEAR_2022" VARCHAR2(50) NOT NULL,
"HOLIDAY_DATE" DATE NOT NULL ENABLE
);
INSERT INTO XX_LEAVES_PUBLIC_HOLIDAYS (PUBLIC_HOLIDAY_UAE_YEAR_2022, HOLIDAY_DATE) VALUES ('National Day','10-Jul-2022');
Compare the leave date range with the holiday dates and return the number of matches as excluded_days:
select l.id, l.holiday_start, l.holiday_end,
(select Count(1) from leaves_holiday lh
where l.holiday_start<= lh.holiday_date and l.holiday_end >= lh.holiday_date) as excluded_days
from leaves_table l
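As a rough check of the correlated-subquery idea, here is a minimal sketch that runs it against the example from the question. It assumes SQLite (through Python's sqlite3 module) in place of Oracle and stores dates as ISO strings so that plain string comparison orders them correctly; the second leave row is a made-up case with no public holiday in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leaves_table (id INTEGER, holiday_start TEXT, holiday_end TEXT);
    CREATE TABLE leaves_holiday (holiday_name TEXT, holiday_date TEXT);
    INSERT INTO leaves_table VALUES (1, '2022-07-09', '2022-07-13');
    -- Hypothetical second leave period with no public holiday inside it.
    INSERT INTO leaves_table VALUES (2, '2022-05-25', '2022-05-30');
    INSERT INTO leaves_holiday VALUES ('christmas', '2022-07-10');
""")

# For every leave row, count the public holidays that fall inside its range.
query = """
    SELECT l.id, l.holiday_start, l.holiday_end,
           (SELECT COUNT(*)
            FROM leaves_holiday lh
            WHERE lh.holiday_date BETWEEN l.holiday_start AND l.holiday_end
           ) AS excluded_days
    FROM leaves_table l
    ORDER BY l.id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Leave 1 gets excluded_days = 1 and leave 2 gets 0, which matches the output asked for in the question.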
One option is to create a calendar of all the dates in each leave period (the leaves_calendar CTE in my example) and then join it to public_holiday, so that you know which dates to exclude.
Sample data:
with
leaves_table (id, holiday_start, holiday_end) as
  (select 1, date '2022-07-09', date '2022-07-13' from dual union all
   select 2, date '2022-05-25', date '2022-05-30' from dual
  ),
public_holiday (holiday_name, holiday_date) as
  (select 'Christmas' , date '2022-07-10' from dual union all
   select 'My holiday', date '2022-07-12' from dual),
--
-- Query begins here; first create a calendar ...
leaves_calendar as
  (select l.id, l.holiday_start + column_value - 1 as datum
   from leaves_table l cross join
     table(cast(multiset(select level from dual
                         connect by level <= l.holiday_end - l.holiday_start + 1
                        ) as sys.odcinumberlist))
  )
-- ... then return the result: start and end date, number of excluded dates
-- and holiday names (you didn't ask for that, but ... not a problem)
select c.id,
  min(c.datum) as holiday_start,
  max(c.datum) as holiday_end,
  sum(case when p.holiday_date = c.datum then 1 else 0 end) as excluded_days,
  listagg(p.holiday_name, ', ') within group (order by p.holiday_date) as excluded
from leaves_calendar c left join public_holiday p on p.holiday_date = c.datum
group by c.id;
ID HOLIDAY_START HOLIDAY_END EXCLUDED_DAYS EXCLUDED
---------- --------------- --------------- ------------- ------------------------------
1 09.07.2022 13.07.2022 2 Christmas, My holiday
2 25.05.2022 30.05.2022 0
SQL>
With sample data you provided:
SQL> select * from xx_leaves_excludes;
EXCLUDE_ID HOLIDAY_START HOLIDAY_END
---------- --------------- ---------------
1 23.07.2022 20.08.2022
2 01.07.2022 02.08.2022
3 13.07.2022 29.08.2022
4 12.07.2022 01.08.2022
5 01.07.2022 29.08.2022
6 08.07.2022 08.08.2022
7 03.07.2022 20.08.2022
7 rows selected.
SQL> select * from public_holiday;
HOLIDAY_NAME HOLIDAY_DATE
--------------- ---------------
Christmas 10.07.2022
My holiday 12.07.2022
Query looks like this:
with
leaves_calendar as
  (select l.exclude_id, l.holiday_start + column_value - 1 as datum
   from xx_leaves_excludes l cross join
     table(cast(multiset(select level from dual
                         connect by level <= l.holiday_end - l.holiday_start + 1
                        ) as sys.odcinumberlist))
  )
select c.exclude_id,
  min(c.datum) as holiday_start,
  max(c.datum) as holiday_end,
  sum(case when p.holiday_date = c.datum then 1 else 0 end) as excluded_days,
  listagg(p.holiday_name, ', ') within group (order by p.holiday_date) as excluded
from leaves_calendar c left join public_holiday p on p.holiday_date = c.datum
group by c.exclude_id;
EXCLUDE_ID HOLIDAY_START HOLIDAY_END EXCLUDED_DAYS EXCLUDED
---------- --------------- --------------- ------------- ----------------------------------------
1 23.07.2022 20.08.2022 0
2 01.07.2022 02.08.2022 2 Christmas, My holiday
3 13.07.2022 29.08.2022 0
4 12.07.2022 01.08.2022 1 My holiday
5 01.07.2022 29.08.2022 2 Christmas, My holiday
6 08.07.2022 08.08.2022 2 Christmas, My holiday
7 03.07.2022 20.08.2022 2 Christmas, My holiday
7 rows selected.
SQL>

Creating a new RANK based on delta of previous row

I've been working on an issue for a few days now, and I can't seem to find the right fix. Does anybody have an idea?
Case
We want to start a new sequence number whenever an employee has resigned for more than 1 day, i.e. whenever two consecutive employment records are more than 1 day apart. We have the delta between the current employment record and the previous one, so we can check the sequence. We want to calculate the min(Start) and max(End) over each run of employment records that are not separated by more than 1 day.
Data
Employee  Contract  Unit    Start       End         Delta
John Doe  1         Unit A  2014-01-01  2017-12-31  NULL
John Doe  2         Unit A  2018-02-01  2018-12-31  31
John Doe  3         Unit B  2019-01-01  2020-05-31  1
John Doe  4         Unit A  2020-06-01  NULL        1
With the query it should give back:
Employee  Contract  Unit    Start       End         Delta  Sequence
John Doe  1         Unit A  2014-01-01  2017-12-31  NULL   1
John Doe  2         Unit A  2018-02-01  2018-12-31  31     2
John Doe  3         Unit B  2019-01-01  2020-05-31  1      2
John Doe  4         Unit A  2020-06-01  NULL        1      2
That is because sequence 1 ends at 31-12-2017 and a new one starts in February 2018, so there has been more than 1 day of separation between the records. The following records all have a sequence of 2 because the employment continues without a gap.
Query
I've tried a few things already with lag() and lead(), but I keep working myself into a corner with the data sample that I have. When I run it on the full set it won't work.
SELECT
Employee,
Start,
End,
DeltaPrevious,
Delta,
DeltaNext,
case
when DeltaPrevious IS NULL AND Delta = 1 then 1
when DeltaPrevious = 1 AND Delta > 1 then min(Contract) OVER (PARTITION BY Employee ORDER BY Contract ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
when DeltaPrevious > 1 AND Delta = 1 then min(Contract) OVER (PARTITION BY Employee ORDER BY Contract ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
end as Sequence
FROM
Contracts
ORDER BY
Employee, Start ASC
Hope that someone has a great idea.
Thanks,
Basically, you want to use lag() to get the previous date and then do a cumulative sum. This looks like:
select c.*,
sum(case when prev_end >= dateadd(day, -1, [Start]) then 0 else 1
end) over (partition by employee order by [Start]) as ranking
from (select c.*,
lag([End]) over (partition by employee order by [Start]) as prev_end
from contracts c
) c;
You mention that you might want to recalculate the new start and end. You would just use the above as a subquery/CTE and aggregate on employee and ranking.
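The lag-plus-cumulative-sum idea, followed by the aggregation step, can be sketched end to end as below. This is only an illustration under stated assumptions: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module rather than SQL Server, the columns are renamed to the hypothetical start_date and end_date to avoid the reserved word END, and date(start_date, '-1 day') stands in for dateadd. Treating a group with an open-ended contract as having no end date is borrowed from the COUNT comparison used elsewhere in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts (employee TEXT, contract INTEGER, "
    "start_date TEXT, end_date TEXT)"
)
conn.executemany("INSERT INTO contracts VALUES (?,?,?,?)", [
    ("John Doe", 1, "2014-01-01", "2017-12-31"),
    ("John Doe", 2, "2018-02-01", "2018-12-31"),
    ("John Doe", 3, "2019-01-01", "2020-05-31"),
    ("John Doe", 4, "2020-06-01", None),
])

query = """
    WITH with_prev AS (
        -- End date of the previous contract of the same employee.
        SELECT c.*,
               LAG(end_date) OVER (PARTITION BY employee
                                   ORDER BY start_date) AS prev_end
        FROM contracts c
    ),
    ranked AS (
        -- Add 1 whenever the previous contract ended more than a day earlier
        -- (or there is no previous contract); the running sum is the ranking.
        SELECT w.*,
               SUM(CASE WHEN prev_end >= date(start_date, '-1 day')
                        THEN 0 ELSE 1 END)
                   OVER (PARTITION BY employee ORDER BY start_date) AS ranking
        FROM with_prev w
    )
    SELECT employee, ranking,
           MIN(start_date) AS period_start,
           -- NULL when any contract in the group is still open.
           CASE WHEN COUNT(*) = COUNT(end_date) THEN MAX(end_date) END
               AS period_end
    FROM ranked
    GROUP BY employee, ranking
    ORDER BY employee, ranking
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For the sample data this yields two periods for John Doe: one from 2014-01-01 to 2017-12-31, and an open-ended one starting 2018-02-01.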
If I understood correctly from the definition of Sequence in your second table, you are more interested in DeltaNext than in Delta(Previous). Here is an attempt, including the code to create sample input data with two more employees:
CREATE TABLE #input_table (Employee VARCHAR(255), [Contract] INT, Unit VARCHAR(6), [Start] DATE, [End] DATE)
INSERT INTO #input_table
VALUES
('John Doe', 1, 'Unit A', '2014-01-01', '2017-12-31'),
('John Doe', 2, 'Unit A', '2018-02-01', '2018-12-31'),
('John Doe', 3, 'Unit B', '2019-01-01', '2020-05-31'),
('John Doe', 4, 'Unit A', '2020-06-01', NULL),
('Alice', 1, 'Unit A', '2020-01-01', NULL),
('Bob', 1, 'Unit C', '2020-01-01', '2020-02-20')
First we create the deltas:
SELECT *
, DeltaPrev = DATEDIFF(DAY, LAG([End], 1, NULL) OVER(PARTITION BY Employee
ORDER BY [Start]), [Start]) -- Not relevant (?)
, DeltaNext = DATEDIFF(DAY, [End], LEAD([Start], 1, NULL) OVER(PARTITION BY Employee ORDER BY [Start]))
INTO #cte_delta -- I'll create a CTE at the end
FROM #input_table
Then we define Sequence:
SELECT *
, [Sequence] = CASE WHEN DeltaNext > 1 THEN 1 ELSE 2 END
INTO #cte_sequence
FROM #cte_delta
We then group the same Sequences by assigning a unique ROW_NUMBER for each employee with consecutive/ same Sequences:
SELECT *
, GRP = ROW_NUMBER() OVER(PARTITION BY Employee ORDER BY [Start]) - ROW_NUMBER() OVER(PARTITION BY Employee, [Sequence] ORDER BY [Start])
INTO #cte_grp
FROM #cte_sequence
Finally we calculate the min and max of the contract duration:
SELECT *
, MIN([Start]) OVER(PARTITION BY Employee, GRP) AS ContractStart
, CASE WHEN COUNT(*) OVER(PARTITION BY Employee, GRP) = COUNT([End])
OVER(PARTITION BY Employee, GRP) THEN MAX([End]) OVER(PARTITION BY Employee, GRP) ELSE NULL END AS ContractEnd
FROM #cte_grp
The comparison of COUNT(*) and COUNT([End]) is necessary; otherwise ContractEnd would be the maximum non-NULL End in the group (2020-05-31 for John Doe's open-ended group) instead of NULL.
The whole code with CTEs here:
WITH cte_delta AS (
SELECT *
, DeltaPrev = DATEDIFF(DAY, LAG([End], 1, NULL) OVER(PARTITION BY Employee ORDER BY [Start]), [Start]) -- Not relevant (?)
, DeltaNext = DATEDIFF(DAY, [End], LEAD([Start], 1, NULL) OVER(PARTITION BY Employee ORDER BY [Start]))
FROM #input_table
)
, cte_sequence AS (
SELECT *
, [Sequence] = CASE WHEN DeltaNext > 1 THEN 1 ELSE 2 END
FROM cte_delta
)
, cte_grp AS (
SELECT *
, GRP = ROW_NUMBER() OVER(PARTITION BY Employee ORDER BY [Start]) - ROW_NUMBER() OVER(PARTITION BY Employee, [Sequence] ORDER BY [Start])
FROM cte_sequence
)
SELECT *
, MIN([Start]) OVER(PARTITION BY Employee, GRP) AS ContractStart
, CASE WHEN COUNT(*) OVER(PARTITION BY Employee, GRP) = COUNT([End]) OVER(PARTITION BY Employee, GRP) THEN MAX([End]) OVER(PARTITION BY Employee, GRP) ELSE NULL END AS ContractEnd
FROM cte_grp
Here is the output:
Employee  Contract  Unit    Start       End         DeltaPrev  DeltaNext  Sequence  GRP  ContractStart  ContractEnd
Alice     1         Unit A  2020-01-01  NULL        NULL       NULL       2         0    2020-01-01     NULL
Bob       1         Unit C  2020-01-01  2020-02-20  NULL       NULL       2         0    2020-01-01     2020-02-20
John Doe  1         Unit A  2014-01-01  2017-12-31  NULL       32         1         0    2014-01-01     2017-12-31
John Doe  2         Unit A  2018-02-01  2018-12-31  32         1          2         1    2018-02-01     NULL
John Doe  3         Unit B  2019-01-01  2020-05-31  1          1          2         1    2018-02-01     NULL
John Doe  4         Unit A  2020-06-01  NULL        1          NULL       2         1    2018-02-01     NULL
Feel free to select DISTINCT records according to your needs.

query to retrieve changes in multiple columns in sql

I have the below table with the assignments details such as location, org, job, grade etc.
I want to build a query such that changes in the location, org are fetched for all system_person_type = 'EMP' only.
per_assignments
Person_id locat_id org_id job_id grade_id system_person_type START_DT END_DT
1 Toronto XYZ 1 GR1 EMP 01-JAN-2019 20-JAN-2019
1 US XYZ 1 GR1 EMP 21-JAN-2019 31-DEC-4712
2 Chicago ABC 2 GR1 EX-EMP 01-jul-2017 30-Nov-2017
2 Toronto XYZ 3 GR2 EMP 01-JAN-2019 03-JUL-2019
2 India GFH 3 GR2 EMP 04-JUL-2019 08-SEP-2019
2 India GFH 4 GR2 EMP 09-SEP-2019 31-DEC-4712
so in the above example the output should be :
person_id old_locat_id new_locat_id old_org_id new_org_id old_start_dt new_start_dt
1 Toronto US - - 01-jan-2019 21-jan-2019
2 Toronto India XYZ GFH 01-JAN-2019 04-JUL-2019
I created the query below, but it returns rows where old_start_dt > new_start_dt and it does not return all the required changes;
only the change in one column is retrieved. How can the query be changed to accommodate the above requirement?
SELECT DISTINCT paam_change_loc.person_id ,
to_char(paam_change1.start_date,'YYYY-MM-DD') AS old_effective_start_dt ,
to_char(paam_change_loc.start_date,'YYYY-MM-DD') AS new_effective_start_dt ,
paam_change1.location_id AS old_loc_value ,
paam_change_loc.location_id AS new_loc_value
FROM per_assignments paam_change_loc,
per_assignments paam_change1
WHERE paam_change_loc.person_id =paam_change1.person_id
AND (
paam_change_loc.location_id IS NOT NULL
AND paam_change_loc.location_id <> paam_change1.location_id )
AND paam_change_loc.system_person_type = 'EMP'
AND paam_change1.system_person_type = 'EMP'
AND to_char(to_date(paam_change_loc.start_date),'DD-MM-YYYY') BETWEEN ('05-08-2019') AND '05-12-2019'
AND (
to_char(to_date(paam_change_loc.start_date)-1,'DD-MM-YYYY') BETWEEN ('05-08-2019') AND '05-12-2019' )
'05-08-2019' and '05-12-2019' are the transfer dates that will be passed to the query; the start dates are to be compared against this range.
This query gives expected result:
select person_id, prev_start_dt, start_dt,
case loc_new when loc_old then ' - ' else loc_old end loc_old,
case loc_new when loc_old then ' - ' else loc_new end loc_new,
case org_new when org_old then ' - ' else org_old end org_old,
case org_new when org_old then ' - ' else org_new end org_new
from (
select person_id, locat_id loc_new, org_id org_new, start_dt,
lag(locat_id) over (partition by person_id order by start_dt) loc_old,
lag(org_id) over (partition by person_id order by start_dt) org_old,
lag(start_dt) over (partition by person_id order by start_dt) prev_start_dt,
case start_dt when 1 + lag(end_dt) over (partition by person_id order by start_dt)
then 1 end flag
from per_assignments)
where flag = 1 and (loc_new <> loc_old or org_new <> org_old)
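In the inner query, apply filters for system_person_type and dates as needed. The inner query uses lag() three times to fetch the previous location, organization and start date, and once more to mark continuous rows (those starting the day after the previous row ended) in the column flag. The outer query then shows only the flagged rows where the location or organization changed.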
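To see the lag() approach on the sample rows from the question, here is a minimal sketch with the EMP filter applied in the inner query. It assumes SQLite (3.25 or later) through Python's sqlite3 module instead of Oracle, ISO date strings, and date(prev_end_dt, '+1 day') in place of 1 + lag(end_dt); the ' - ' display substitution is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE per_assignments (person_id INTEGER, locat_id TEXT, "
    "org_id TEXT, system_person_type TEXT, start_dt TEXT, end_dt TEXT)"
)
conn.executemany("INSERT INTO per_assignments VALUES (?,?,?,?,?,?)", [
    (1, "Toronto", "XYZ", "EMP", "2019-01-01", "2019-01-20"),
    (1, "US", "XYZ", "EMP", "2019-01-21", "4712-12-31"),
    (2, "Chicago", "ABC", "EX-EMP", "2017-07-01", "2017-11-30"),
    (2, "Toronto", "XYZ", "EMP", "2019-01-01", "2019-07-03"),
    (2, "India", "GFH", "EMP", "2019-07-04", "2019-09-08"),
    (2, "India", "GFH", "EMP", "2019-09-09", "4712-12-31"),
])

query = """
    SELECT person_id, loc_old, loc_new, org_old, org_new,
           prev_start_dt, start_dt
    FROM (
        -- Previous values per person, ordered by start date, EMP rows only.
        SELECT person_id, locat_id AS loc_new, org_id AS org_new, start_dt,
               LAG(locat_id) OVER (PARTITION BY person_id ORDER BY start_dt) AS loc_old,
               LAG(org_id)   OVER (PARTITION BY person_id ORDER BY start_dt) AS org_old,
               LAG(start_dt) OVER (PARTITION BY person_id ORDER BY start_dt) AS prev_start_dt,
               LAG(end_dt)   OVER (PARTITION BY person_id ORDER BY start_dt) AS prev_end_dt
        FROM per_assignments
        WHERE system_person_type = 'EMP'
    )
    -- Keep rows that directly continue the previous one and changed something.
    WHERE start_dt = date(prev_end_dt, '+1 day')
      AND (loc_new <> loc_old OR org_new <> org_old)
    ORDER BY person_id, start_dt
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

This returns one change row per person: Toronto to US for person 1, and Toronto/XYZ to India/GFH for person 2.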
I am not sure about the data structure of your db. Treating the sample data as a table, you can achieve the expected output using analytic functions:
Select person_id,
Locat_id as old_locat_id,
New_locat_id,
org_id as old_org_id,
New_org_id,
Start_date as old_start_date,
New_start_date
From
(Select t.*,
Lead(org_id) over (partition by person_id order by start_date) as new_org_id,
Lead(start_date) over (partition by person_id order by start_date) as new_start_date,
Lead(locat_id) over (partition by person_id order by start_date) as new_locat_id
From your_table t where system_person_type = 'EMP')
Where locat_id <> new_locat_id or org_id <> new_org_id;
Cheers!!

Counting employees from one job level to another

I have a snapshot of a dataset as follows:
effective_date hire_date name job_level direct_report
01.01.2018 01.01.2018 xyz 5 null
01.02.2018 01.01.2018 xyz 5 null
01.03.2018 01.01.2018 xyz 5 null
01.04.2018 01.01.2018 xyz 6 null
01.05.2018 01.01.2018 xyz 6 null
01.01.2018 01.02.2018 abc 5 null
01.02.2018 01.02.2018 abc 5 null
01.03.2018 01.02.2018 abc 5 null
01.04.2018 01.02.2018 abc 5 null
01.05.2018 01.02.2018 abc 5 null
Effective date is an overview of info for each employee on a daily
basis.
Hire date is the date when an employee was hired
Job level is the level at which employee stands on that particular day
I want to find out how many employees were promoted from level 5 to level 6 over this whole period.
Here is one method that uses two levels of aggregation. You can get the employees that were promoted by comparing the minimum date for "5" to the maximum date of "6":
select name
from t
where job_level in (5, 6)
group by name
having min(case when job_level = 5 then effective_date end) < max(case when job_level = 6 then effective_date end);
To count them:
select count(*)
from (select name
from t
where job_level in (5, 6)
group by name
having min(case when job_level = 5 then effective_date end) < max(case when job_level = 6 then effective_date end)
) x;
Alternatively, you can use lag():
select count(distinct name)
from (select t.*, lag(job_level) over (partition by name order by effective_date) as prev_job_level
from t
) t
where prev_job_level = 5 and job_level = 6;
The two are subtly different, but within the range of the ambiguity of the question. For instance, the first would count 5 --> 4 --> 6, the second would not.
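The difference between the two methods can be checked with a small sketch. It assumes SQLite (3.25 or later) through Python's sqlite3 module, ISO date strings, and a shortened data set; the employee pqr, who goes 5 --> 4 --> 6, is a made-up row added only to show the divergence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (effective_date TEXT, name TEXT, job_level INTEGER)")
conn.executemany("INSERT INTO t VALUES (?,?,?)", [
    ("2018-01-01", "xyz", 5), ("2018-02-01", "xyz", 5), ("2018-03-01", "xyz", 6),
    ("2018-01-01", "abc", 5), ("2018-02-01", "abc", 5), ("2018-03-01", "abc", 5),
    # Hypothetical employee who drops to level 4 before reaching level 6.
    ("2018-01-01", "pqr", 5), ("2018-02-01", "pqr", 4), ("2018-03-01", "pqr", 6),
])

# Method 1: earliest level-5 date before latest level-6 date.
agg_count = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT name
          FROM t
          WHERE job_level IN (5, 6)
          GROUP BY name
          HAVING MIN(CASE WHEN job_level = 5 THEN effective_date END)
               < MAX(CASE WHEN job_level = 6 THEN effective_date END)
         ) x
""").fetchone()[0]

# Method 2: a level-6 row whose immediately preceding row is level 5.
lag_count = conn.execute("""
    SELECT COUNT(DISTINCT name)
    FROM (SELECT t.*,
                 LAG(job_level) OVER (PARTITION BY name
                                      ORDER BY effective_date) AS prev_job_level
          FROM t
         ) s
    WHERE prev_job_level = 5 AND job_level = 6
""").fetchone()[0]

print(agg_count, lag_count)
```

The aggregation method counts both xyz and pqr, while the lag() method counts only xyz, so which one fits depends on whether an indirect move should count as a promotion from 5 to 6.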
You can try this:
select count(distinct name) from employees e1
WHERE effective_date between '01.01.2018' and '01.05.2018'
And job_level = 5
and EXISTS (select * from employees e2 where e1.name = e2.name
and e2.effective_date > e1.effective_date
and e2.job_level = 6
)

SQL server full join query

I have a full join SQL query that retrieves data from the same table on both sides. The problem is that I am getting a NULL value where I expect the month name.
Example:
I have a table with two columns, TypeOfPost and DOB.
DOB TypeOfPost
--------- --------------
20/11/1998 Manager
1/1/2000 Sales
13/6/1999 Manager
20/1/1987 Manager
1/11/1985 Sales
Now when I am writing a join query like
select DATENAME(month,dob) as Red,count(TypeOfPost)
from tablename
where TypeOfPost='Manager'
group by DATENAME(month,dob) as A
full join
select DATENAME(month,dob) as Green,count(TypeOfPost)
from tablename
where TypeOfPost='Sales'
group by DATENAME(month,dob) as B on B.Green = A.Red
Output-- Expected Output--
--------------------- ---------------------
Month Man Sal Month Man Sal
-------- ----- ------ -------- ----- ------
January 1 1 January 1 1
NULL 1 NULL June 1 NULL
November 1 1 November 1 1
This is where the problem arises: I want 'June' in the Month column instead of the NULL value.
Is there any way to get that?
Help me out.
Thanks.
One option is to use a CASE expression in a subselect: determine for each record whether it is a Manager or a Sales post, substitute 1 or 0 accordingly, and then SELECT and GROUP the final results from this subselect.
SQL Statement
SELECT Month
, SUM(Man) AS Man
, SUM(Sal) AS Sal
FROM (
SELECT DATENAME(MONTH, DOB) AS Month
, CASE WHEN TypeOfPost = 'Manager' THEN 1 ELSE 0 END AS Man
, CASE WHEN TypeOfPost = 'Sales' THEN 1 ELSE 0 END AS Sal
FROM tableName
) g
GROUP BY
Month
or
SELECT Month
, SUM(Man)
, SUM(Sal)
FROM (
SELECT DATENAME(MONTH, DOB) AS Month
, COUNT(TypeOfPost) AS Man
, 0 AS Sal
FROM tableName
WHERE TypeOfPost = 'Manager'
GROUP BY
DATENAME(MONTH, DOB)
UNION ALL
SELECT DATENAME(MONTH, DOB) AS Month
, 0 AS Man
, COUNT(TypeOfPost) AS Sal
FROM tableName
WHERE TypeOfPost = 'Sales'
GROUP BY
DATENAME(MONTH, DOB)
) g
GROUP BY
Month
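The conditional-aggregation idea can be tried on the five rows from the question with the sketch below. It assumes SQLite through Python's sqlite3 module rather than SQL Server, so strftime('%m', dob) returns the month number in place of DATENAME(MONTH, DOB); the table name posts and the ISO date strings are likewise assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (dob TEXT, type_of_post TEXT)")
conn.executemany("INSERT INTO posts VALUES (?,?)", [
    ("1998-11-20", "Manager"),
    ("2000-01-01", "Sales"),
    ("1999-06-13", "Manager"),
    ("1987-01-20", "Manager"),
    ("1985-11-01", "Sales"),
])

# One pass over the table: each row contributes 1 to its own column, 0 to the other.
query = """
    SELECT strftime('%m', dob) AS month,
           SUM(CASE WHEN type_of_post = 'Manager' THEN 1 ELSE 0 END) AS man,
           SUM(CASE WHEN type_of_post = 'Sales'   THEN 1 ELSE 0 END) AS sal
    FROM posts
    GROUP BY strftime('%m', dob)
    ORDER BY strftime('%m', dob)
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the month comes from a single scan rather than from one side of a join, June (month 06) always has a label; note that its Sales count is 0 rather than NULL.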