Locating Historical records using Oracle PL/SQL - sql

I have the following Oracle employee table which tracks employee movement between various departments.
EMPID DEPARTMENT RECORD_DATE
123456 Technology 2019-01-01
123456 Technology 2019-02-25
123456 Finance 2019-03-01
123456 Finance 2019-09-28
123456 HR 2020-03-01
987654 HR 2019-04-01
987654 Finance 2019-09-01
987654 HR 2020-01-31
I need to write an Oracle PL/SQL script that lets the user specify a department name and a historical date, and then displays all employees who were assigned to that department at that specific point in time.
Example: If I wanted to know all the employees that worked in Finance on 2019-10-01, the query would return:
EMPID DEPARTMENT DEPARTMENT_START_DATE
123456 Finance 2019-03-01
987654 Finance 2019-09-01
(Note, a Department "leave" date would be nice, but optional)
Any ideas?

You can use window functions for this:
select empid, department, record_date
from (
    select
        t.*,
        lead(record_date) over(partition by empid order by record_date) lead_record_date
    from mytable t
) t
where
    department = :department_id
    and :target_date >= record_date
    and (:target_date < lead_record_date or lead_record_date is null)
:department_id and :target_date represent the parameters to the query.
In the subquery, lead() retrieves the "next" record_date of the same empid. The outer query uses that information as a filter parameter to locate the relevant records.
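One caveat with the sample data: employee 123456 has two consecutive Finance rows, so the query above reports 2019-09-28 rather than the expected department start of 2019-03-01. A possible way around this, sketched below for Oracle, is to collapse consecutive rows for the same department into one "stint" first and then apply lead() to the stints, which also yields the optional leave date. The CTE names (flagged, grouped, stints, ranged) and the column names is_new_stint, stint_no and department_leave_date are made up for illustration, and the literals stand in for the :department_id and :target_date bind variables.

```sql
-- Sample rows from the question, inlined so the sketch is self-contained.
with mytable as (
    select 123456 as empid, 'Technology' as department, date '2019-01-01' as record_date from dual union all
    select 123456, 'Technology', date '2019-02-25' from dual union all
    select 123456, 'Finance',    date '2019-03-01' from dual union all
    select 123456, 'Finance',    date '2019-09-28' from dual union all
    select 123456, 'HR',         date '2020-03-01' from dual union all
    select 987654, 'HR',         date '2019-04-01' from dual union all
    select 987654, 'Finance',    date '2019-09-01' from dual union all
    select 987654, 'HR',         date '2020-01-31' from dual
),
-- Flag each row whose department differs from the employee's previous row.
flagged as (
    select t.*,
           case when lag(department) over (partition by empid order by record_date) = department
                then 0 else 1 end as is_new_stint
    from mytable t
),
-- A running sum of the flags numbers each continuous stay in a department.
grouped as (
    select f.*,
           sum(is_new_stint) over (partition by empid order by record_date) as stint_no
    from flagged f
),
-- One row per stint, keeping its earliest record_date as the start date.
stints as (
    select empid, department, stint_no,
           min(record_date) as department_start_date
    from grouped
    group by empid, department, stint_no
),
-- The start of the next stint is treated as the leave date of the current one.
ranged as (
    select s.*,
           lead(department_start_date) over (partition by empid order by department_start_date) as department_leave_date
    from stints s
)
select empid, department, department_start_date, department_leave_date
from ranged
where department = 'Finance'
  and date '2019-10-01' >= department_start_date
  and (date '2019-10-01' < department_leave_date or department_leave_date is null)
order by empid;
```

For the sample data this should return 123456 with a start of 2019-03-01 and a leave date of 2020-03-01, and 987654 with a start of 2019-09-01 and a leave date of 2020-01-31. An employee who is still in the department gets a null leave date.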

From your example, it seems you need the list of employees with the date each one first appears in each department (if they worked there). This can be achieved with a simple subquery:
select empid, department, record_date
from (
    select empid, department, record_date,
           row_number() over (partition by empid, department order by record_date) as id
    from employee
) a
where id = 1
order by record_date

Related

Select earliest date and count rows in table with duplicate IDs

I have a table called table1:
id created_date
1001 2020-06-01
1001 2020-01-01
1001 2020-07-01
1002 2020-02-01
1002 2020-04-01
1003 2020-09-01
I'm trying to write a query that provides me a list of distinct IDs with the earliest created_date they have, along with the count of rows each id has:
id created_date count
1001 2020-01-01 3
1002 2020-02-01 2
1003 2020-09-01 1
I managed to write a window function to grab the earliest date, but I'm having trouble figuring out where to fit the count statement in one:
SELECT
    id,
    created_date
FROM (
    SELECT
        id,
        created_date,
        row_number() OVER (PARTITION BY id ORDER BY created_date) AS row_num
    FROM table1
) AS a
WHERE row_num = 1
You would use aggregation:
select id, min(created_date), count(*)
from table1
group by id;
I find it amusing that you want to use window functions -- which are considered more advanced -- when lowly aggregation suffices.
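If you would rather keep the window-function query from the question, the count can sit next to row_number() as count(*) over (partition by id). A minimal sketch follows. The inline data uses SELECT without FROM, which is an assumption about the database (it works in PostgreSQL, MySQL 8, SQL Server and SQLite, but not Oracle), and row_count is an illustrative column name chosen to avoid the keyword count.

```sql
-- Sample rows from the question, inlined so the sketch is self-contained.
with table1 as (
    select 1001 as id, '2020-06-01' as created_date union all
    select 1001, '2020-01-01' union all
    select 1001, '2020-07-01' union all
    select 1002, '2020-02-01' union all
    select 1002, '2020-04-01' union all
    select 1003, '2020-09-01'
)
select id, created_date, row_count
from (
    select id,
           created_date,
           -- 1 marks the earliest row of each id
           row_number() over (partition by id order by created_date) as row_num,
           -- no ORDER BY here, so every row carries the full count for its id
           count(*) over (partition by id) as row_count
    from table1
) a
where row_num = 1
order by id;
-- 1001  2020-01-01  3
-- 1002  2020-02-01  2
-- 1003  2020-09-01  1
```

The aggregation answer is still the shorter route; the window version mainly pays off when other columns from the earliest row are needed as well.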

Not able to get exact latest records with two columns having same value - in SQL Server

I am trying to get distinct records for a specific department from the table employee.
I have tried with this code in SQL Server, and I'm getting this error:
Error: employeeId is invalid in the select list because it is not contained in either aggregate function or the GROUP BY clause.
My code:
SELECT
name, department, MAX(jointime) LatestDate, employeeId
FROM
employee
WHERE
department = 'Mechanical'
GROUP BY
name
Records in DB:
name department joinTime EmployeeId
-----------------------------------------------------------
Erik Mechanical 2019-07-06 11:59:59 456
Tom Mechanical 2019-07-06 11:59:59 789
Erik Computer 2019-07-05 11:59:59 222
Erik Computer 2019-07-04 11:59:59 111
Erik Mechanical 2019-07-01 11:59:59 123
I want to achieve the result when a query for 'Mechanical' is executed. The latest record should be fetched from DB for a particular department.
name department joinTime EmployeeId
-----------------------------------------------------------
Erik Mechanical 2019-07-06 11:59:59 456
Tom Mechanical 2019-07-06 11:59:59 789
Assuming the key is [Name] and not [EmployeeId]
One option is the WITH TIES clause, and thus no need for aggregation
Example
Select Top 1 with ties *
From employee
Where department='Mechanical'
Order By Row_Number() over (Partition By [Name] order by joinTime Desc)
Returns
name department joinTime EmployeeId
Erik Mechanical 2019-07-06 11:59:59.000 456
Tom Mechanical 2019-07-06 11:59:59.000 789
You can use EXISTS:
SELECT e.*
FROM employee e
WHERE e.department='Mechanical'
AND NOT EXISTS (
SELECT 1 FROM employee
WHERE department = e.department
AND name = e.name AND joinTime > e.joinTime
)
Results:
> name | department | joinTime | EmployeeId
> :--- | :--------- | :------------------ | ---------:
> Erik | Mechanical | 2019-07-06 11:59:59 | 456
> Tom | Mechanical | 2019-07-06 11:59:59 | 789
You can use ROW_NUMBER to mark the latest row for each employee, or CROSS APPLY to run a correlated subquery for each employee.
with q as
(
SELECT name, department, jointime, employeeId,
row_number() over (partition by name order by joinTime desc) rn
FROM employee where department='Mechanical'
)
select name, department, jointime, employeeId
from q
where rn = 1
or
with emp as
(
select distinct name from employee where department='Mechanical'
)
select e.*
from emp q
cross apply
(
select top 1 *
from employee e2
where e2.name = q.name
and e2.department='Mechanical'
order by joinTime desc
) e
Just add department,employeeId to the GROUP BY
SELECT name , department, MAX(jointime) LatestDate , employeeId
FROM employee where department='Mechanical'
GROUP BY name, department, employeeId
You need to use aggregate functions for every field in the SELECT list that is not in the GROUP BY clause:
SELECT name,
MIN(department)
, MAX(jointime) LatestDate
, MIN(employeeId)
FROM employee where department='Mechanical'
GROUP BY name
SQL Server finds all records with the names Tom or Erik, but it does not know which single value to pick from multiple rows for fields such as department or employeeId. By using aggregate functions, you tell SQL Server to return the MIN, MAX, SUM, or COUNT of those columns.
Or add those columns to the GROUP BY clause to get all unique rows:
SELECT name
, department
, jointime
, employeeId
FROM employee where department='Mechanical'
GROUP BY name
, department
, jointime
, employeeId

SQL: How do you make a second grouping based off non-sequential dates?

I'm using SQL Server 2008 R2. I have a table called Name with the following structure and sample data:
ID Date Name DOD
10001 200911 Kevin H 06/17/2000
10001 200912 Kevin 06/20/2000
10001 201001 Kevin 06/20/2000
10001 201012 K 06/20/2000
10001 201101 K 06/20/2000
10001 201406 Kevin 06/20/2000
Notice that employee 10001 has changed names 3 times and DOD once over time. What I am trying to do is group by ID, Name, and DOD such that the data is consistent between dates. I also need to grab the min and max date for these groups and ensure that the dates are in sequential order. If the name or DOD changes and then changes back to what it was previously, a new group needs to be created.
So, the output will look like this:
EmployeeID MinDate MaxDate Name DOD
10001 200911 200911 Kevin H 06/17/2000
10001 200912 201001 Kevin 06/20/2000
10001 201012 201101 K 06/20/2000
10001 201406 201406 Kevin 06/20/2000
The Name table is quite large so there will be instances where Data is consistent for 20 months, then inconsistent for 1 month, and then back to being consistent for 20 months.
Thank you in advance and please let me know if you need additional information.
You can use the difference of row numbers approach:
select id as employeeid, min(date) as mindate, max(date) as maxdate,
       name, dod
from (select t.*,
             row_number() over (partition by id order by date) as seqnum_d,
             row_number() over (partition by id, name, dod order by date) as seqnum_dnd
      from Name t
     ) t
group by id, name, dod, (seqnum_d - seqnum_dnd);
To see how this works, run the subquery. If you stare at the results for a while, you'll get why the difference of row numbers works in this case.
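To save some staring, here is roughly what the subquery produces for the sample rows. This is a sketch for SQL Server that inlines the data with a VALUES table constructor. The alias t and the extra grp column are only for illustration, and Date is kept as an integer in YYYYMM form as in the question.

```sql
select id, [date], name, dod,
       row_number() over (partition by id order by [date]) as seqnum_d,
       row_number() over (partition by id, name, dod order by [date]) as seqnum_dnd,
       row_number() over (partition by id order by [date])
         - row_number() over (partition by id, name, dod order by [date]) as grp
from (values
        (10001, 200911, 'Kevin H', '06/17/2000'),
        (10001, 200912, 'Kevin',   '06/20/2000'),
        (10001, 201001, 'Kevin',   '06/20/2000'),
        (10001, 201012, 'K',       '06/20/2000'),
        (10001, 201101, 'K',       '06/20/2000'),
        (10001, 201406, 'Kevin',   '06/20/2000')
     ) as t (id, [date], name, dod)
order by [date];
-- date    name     seqnum_d  seqnum_dnd  grp
-- 200911  Kevin H  1         1           0
-- 200912  Kevin    2         1           1
-- 201001  Kevin    3         2           1
-- 201012  K        4         1           3
-- 201101  K        5         2           3
-- 201406  Kevin    6         3           3
```

Within one (name, dod) combination, grp stays constant for as long as the rows are adjacent in date order and changes once another value interrupts the run: Kevin has grp 1 for 200912 to 201001 and grp 3 for 201406. Note that the K rows and the last Kevin row happen to share grp 3, so the difference only identifies a group together with name and dod, which is why those columns belong in the GROUP BY as well.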

How to create a oracle view of the max sum of a sum of values of a column based on the values of another

I need to create a view in Oracle 11g that would take these tables:
employees
FirstName | LastName | EmployeeID
-----------------------------------
joe | shmo | 1
bob | moll | 2
salesData
Employee ID | commission on sale
----------------------------------
1 | $20
1 | $30
2 | $50
2 | $60
and then sum up the total commission each employee earned and return the employee who earned the most commission.
So using the sample data the view will contain the employee id :: 2 or bob moll.
This should get you what you need
create view someviewname as
select employees.EmployeeID, sum(commission) as total_commission
from employees
left outer join salesData on salesData.EmployeeID = employees.EmployeeID
group by employees.EmployeeID
order by total_commission desc
SELECT employeeID
FROM
(SELECT employeeID,
SUM(commission)
FROM sales
GROUP BY employeeID
ORDER BY SUM(commission) DESC
)
WHERE rownum = 1
Not sure why you want a view of that, but hopefully, you can figure that out.
In Oracle 12c and later, you can use fetch:
select employeeid
from sales
group by employeeid
order by sum(commission) desc
fetch first 1 row only;
In earlier versions, one method is to use rownum:
select s.*
from (select employeeid
from sales
group by employeeid
order by sum(commission) desc
) s
where rownum = 1;
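Since the question asks for a view, one way to package this for Oracle 11g is sketched below. The view name top_commission_employee is made up, and the sketch assumes that salesData has a numeric commission column and an employeeid column matching employees.employeeid; the question only shows display headers, so adjust the names to the real schema. It uses rank() instead of rownum, which is a swapped-in technique, so that employees tied for the highest total are all returned.

```sql
-- Hypothetical view name; table and column names are assumed from the question.
create or replace view top_commission_employee as
select e.employeeid,
       e.firstname,
       e.lastname,
       s.total_commission
from employees e
join (
    -- Total commission per employee, ranked from highest to lowest.
    select employeeid,
           sum(commission) as total_commission,
           rank() over (order by sum(commission) desc) as rnk
    from salesData
    group by employeeid
) s on s.employeeid = e.employeeid
where s.rnk = 1;
```

With the sample data, select * from top_commission_employee should return bob moll (employee 2) with a total of 110, assuming the commissions are stored as the numbers 20, 30, 50 and 60 rather than as text with a dollar sign.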

Obtain count for duplicates in two tables oracle sql

I have a database and I will primarily focus on two tables. Table 1 has emp_id as its primary key.
Table 1 stores access info for each employee. I am tasked with counting how many times an employee goes into a room.
Table 1
emp_id time_in time_out, other columns etc
1111 3:00 3.30
2222 1:00 1:10
3333 2:00 2:45
4444 7:00 5:00
table 2
sequence_no, emp_id, time, access type, other columns etc
Table 2 has multiple entries per employee:
sequence_no, emp_id, time, access type
10000 1111 3:00 granted
10221 1111 3:23 granted
19911 2222 x
12122 1111 x
23232 3333
I have written a SQL query that joins the two tables, but at the moment I am trying to add a column that sums the total entries per employee (because of the sequence number, my query returns multiple rows):
select e.emp_id,a.sequence_no,count(sequence_no) from employee, access a where e.emp_id = a.emp_id
group by e.emp_id having count(1) > 1
output should look like
emp_id, sequence number, time in/out , total_counts
1111 10000 3:30 5
1111 12122 3:30 5
2222 19911 2:20 19
Within the results, I need the sequence number, which will cause duplicate emp_id rows, but the total for each ID should be the same across all of its rows.
You don't need to group anything:
select
e.emp_id,
a.sequence_no,
count(sequence_no) over (partition by e.emp_id) as total_counts
from employee, access a
where e.emp_id = a.emp_id
If you want to filter out those employees with fewer than two entries:
select *
from
(
select
e.emp_id,
a.sequence_no,
count(sequence_no) over (partition by e.emp_id) as total_counts
from employee, access a
where e.emp_id = a.emp_id
)
where total_counts >= 2;
If you want to group by employee, in Oracle (I don't know whether this syntax works in SQL Server) you can use keep:
select
e.emp_id,
max(a.sequence_no) keep (dense_rank first order by time desc), --last sequence
count(sequence_no)
from employee, access a
where e.emp_id = a.emp_id
group by e.emp_id
having count(*) > 1;