Select distinct records based on max(date) or NULL date - sql

I am trying to get a list of employees based on their employee status or their most recent termination date. If the employee is active, the termination date will be NULL. Some employees have worked in multiple companies within our organization; for those I only want the record from the most recent company, whether active or terminated. An employee may also have different employee numbers in the different companies, so the selection has to be based on the SSN (Fica) number.
Here is an original data set:
company employee Fica First_name emp_status Term_date
5 7026 Jason T1 2013-09-16 00:00:00.000
500 7026 Jason T1 2010-11-30 00:00:00.000
7 7026 Jason T1 2009-07-31 00:00:00.000
2 90908 Jason A1 NULL
505 293866 William T1 2008-05-23 00:00:00.000
7 7243 Ashley T1 2010-07-11 00:00:00.000
2 90478 Michael T1 2013-01-11 00:00:00.000
500 90478 Michael T1 2011-09-26 00:00:00.000
500 311002 Andreas A1 NULL
3 365463 Matthew A1 NULL
500 248766 Chris T1 2007-04-23 00:00:00.000
500 90692 Kaitlyn T1 2012-03-13 00:00:00.000
2 90692 Kaitlyn A5 NULL
500 90236 Jeff T1 2011-09-26 00:00:00.000
2 90236 Jeff A1 NULL
2 90433 Nathan T1 2012-03-26 00:00:00.000
500 90433 Nathan T1 2011-09-26 00:00:00.000
Here are the results I am trying to get:
company employee Fica First_name emp_status Term_date
2 90908 Jason A1 NULL
505 293866 William T1 2008-05-23 00:00:00.000
7 7243 Ashley T1 2010-07-11 00:00:00.000
2 90478 Michael T1 2013-01-11 00:00:00.000
500 311002 Andreas A1 NULL
3 365463 Matthew A1 NULL
500 248766 Chris T1 2007-04-23 00:00:00.000
2 90692 Kaitlyn A5 NULL
2 90236 Jeff A1 NULL
2 90433 Nathan T1 2012-03-26 00:00:00.000
Thanks for any help you are able to give. I need to run this on a SQL2005 server which will be connecting to an Oracle server via ODBC.

If the dates were all populated, you could do this with a "standard" not exists query. The NULLs introduce a problem, but that problem can be solved using coalesce():
select t.*
from table t
where not exists (select 1
from table t2
where t2.fica = t.fica and -- match on Fica, because employee numbers differ between companies
coalesce(t2.term_date, '9999-01-01') > coalesce(t.term_date, '9999-01-01')
);
NOTE: If you need this to work on Oracle, the date constant needs a different format, for example the ANSI date literal DATE '9999-01-01'.
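For the Oracle side, the same idea might look like the sketch below. It assumes a hypothetical table name employees (the answer only uses the placeholder table) and compares on fica, as the question asks. DATE '9999-01-01' is an ANSI date literal used as a stand-in for "still active".

```sql
-- Sketch only: "employees" is a hypothetical table name.
-- A NULL term_date is treated as a far-future date, so active rows win.
select t.*
from employees t
where not exists (select 1
                  from employees t2
                  where t2.fica = t.fica
                    and coalesce(t2.term_date, date '9999-01-01')
                        > coalesce(t.term_date, date '9999-01-01')
                 );
```

If one person could have two active rows, both would be returned, because neither is strictly later than the other.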
EDIT:
Another way to solve this uses row_number():
select t.*
from (select t.*,
row_number() over (partition by fica
order by (case when term_date is null then 0 else 1 end),
term_date desc
) as seqnum
from table t
) t
where seqnum = 1;
The rules for choosing the "last" row are embedded in the order by clause: put the NULL value first, followed by the term_date in descending order.
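To try the row_number() approach in isolation, something like the following sketch could be used. The table variable @emp and the fica values 'F1' and 'F2' are made up for illustration (the sample data in the question does not show Fica values), and the rows are a subset of the question's data.

```sql
-- Sketch only: @emp and the fica values 'F1', 'F2' are hypothetical.
declare @emp table (company int, employee int, fica varchar(11),
                    first_name varchar(20), emp_status char(2), term_date datetime);

-- SQL Server 2005 has no multi-row VALUES, so the rows are built with UNION ALL.
insert into @emp
select   5,  7026, 'F1', 'Jason',   'T1', '20130916' union all
select 500,  7026, 'F1', 'Jason',   'T1', '20101130' union all
select   2, 90908, 'F1', 'Jason',   'A1', null       union all
select   2, 90478, 'F2', 'Michael', 'T1', '20130111' union all
select 500, 90478, 'F2', 'Michael', 'T1', '20110926';

select company, employee, fica, first_name, emp_status, term_date
from (select e.*,
             row_number() over (partition by fica
                                order by (case when term_date is null then 0 else 1 end),
                                         term_date desc
                               ) as seqnum
      from @emp e
     ) e
where seqnum = 1;
-- Returns Jason's active row (company 2, employee 90908)
-- and Michael's 2013-01-11 row (company 2, employee 90478).
```

Partitioning by fica rather than employee is what collapses Jason's two employee numbers into one person.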

Related

T-SQL get values for specific group

I have a table EmployeeContract similar to this:
ContractId  EmployeeId  ValidFrom   ValidTo     Salary
----------  ----------  ----------  ----------  ------
12          5           2018-02-01  2019-06-31  x
25          8           2015-01-01  2099-12-31  x
50          5           2019-07-01  2021-05-31  x
52          6           2011-08-01  2021-12-31  x
72          8           2010-08-01  2014-12-31  x
52          6           2011-08-01  2021-12-31  x
The table holds the contract history for each employee in the company. I need the date each employee started work and the end date of their last contract. Some records are duplicated.
For example, based on data from above:
EmployeeId  ValidFrom   ValidTo
----------  ----------  ----------
5           2018-02-01  2021-05-31
8           2010-08-01  2099-12-31
6           2011-08-01  2021-12-31
Based on this article: https://www.techcoil.com/blog/sql-statement-for-selecting-the-latest-record-in-each-group/
I prepared a query like this:
select minv.*, maxv.maxvalidto from
(select distinct con.[EmployeeId], mvt.maxvalidto
from [EmployeeContract] con
join (select [EmployeeId], max(validto) as maxvalidto
FROM [EmployeeContract]
group by [EmployeeId]) mvt
on con.[EmployeeId] = mvt.[EmployeeId] and mvt.maxvalidto = con.validto) maxv
join
(select distinct con.[EmployeeId], mvf.minvalidfrom
from [EmployeeContract] con
join (select [EmployeeId], min(validfrom) as minvalidfrom
FROM [EmployeeContract]
group by [EmployeeId]) mvf
on con.[EmployeeId] = mvf.[EmployeeId] and mvf.minvalidfrom = con.validfrom) minv
on minv.[EmployeeId] = maxv.[EmployeeId]
order by 1
But I'm not satisfied: I think it's hard to read, and it is probably poorly optimized. How can I do it better?
I think you want group by:
select employeeid, min(validfrom), max(validto)
from employeecontract
group by employeeid
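As a quick check against the sample rows, the group by can be run on a table variable like the one sketched here. @EmployeeContract is a hypothetical stand-in for the real table, the Salary column is left out, and the question's ValidTo of 2019-06-31 is not a real date, so 2019-06-30 is assumed instead.

```sql
-- Sketch only: @EmployeeContract stands in for the real table.
-- 2019-06-31 from the question is not a valid date; 2019-06-30 is assumed.
declare @EmployeeContract table (ContractId int, EmployeeId int,
                                 ValidFrom date, ValidTo date);

insert into @EmployeeContract values
    (12, 5, '2018-02-01', '2019-06-30'),
    (25, 8, '2015-01-01', '2099-12-31'),
    (50, 5, '2019-07-01', '2021-05-31'),
    (52, 6, '2011-08-01', '2021-12-31'),
    (72, 8, '2010-08-01', '2014-12-31'),
    (52, 6, '2011-08-01', '2021-12-31');   -- duplicate row, as in the question

select EmployeeId,
       min(ValidFrom) as ValidFrom,
       max(ValidTo)   as ValidTo
from @EmployeeContract
group by EmployeeId
order by EmployeeId;
-- 5  2018-02-01  2021-05-31
-- 6  2011-08-01  2021-12-31
-- 8  2010-08-01  2099-12-31
```

Duplicate rows do not change the result, because min() and max() are unaffected by repeated values.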

Join records only on first match

I'm trying to join two tables. I only want the first matching row to be joined; the others should be null.
One of the tables contains daily records per User, and the second table contains the goal for each user and day.
The joined result should only join the first occurrence of each User and Day and set the others to null. The Goal in the joined table can be interpreted as DailyGoal.
Example:
Table1 Table2
Id Day User Value Id Day User Goal
================================ ============================
01 01/01/2020 Bob 100 01 01/01/2020 Bob 300
02 01/01/2020 Bob 150 02 02/01/2020 Carl 170
03 01/01/2020 Bob 50
04 02/01/2020 Carl 200
05 02/01/2020 Carl 30
ResultTable
Day User Value Goal
============================================
01/01/2020 Bob 100 300
01/01/2020 Bob 150 (null)
01/01/2020 Bob 50 (null)
02/01/2020 Carl 200 170
02/01/2020 Carl 30 (null)
I tried TOP 1, DISTINCT, and subqueries, but I can't find a way to do it. Is this possible?
One option uses window functions:
select t1.*, t2.goal
from (
select t1.*,
row_number() over(partition by day, user order by id) as rn
from table1 t1
) t1
left join table2 t2 on t2.day = t1.day and t2.user = t1.user and t1.rn = 1
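A case expression is even simpler, although it still needs the join to table2:
select t1.*,
case when row_number() over(partition by t1.day, t1.user order by t1.id) = 1
then t2.goal
end as goal
from table1 t1
left join table2 t2 on t2.day = t1.day and t2.user = t1.user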
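The window-function version can be tried against the sample rows with a sketch like this one. The table variables @table1 and @table2 are hypothetical, day is kept as text because the question's date format is ambiguous, and the column names are bracketed because USER is a reserved word in T-SQL.

```sql
-- Sketch only: @table1/@table2 are hypothetical table variables holding the sample rows.
declare @table1 table (id int, [day] varchar(10), [user] varchar(20), [value] int);
declare @table2 table (id int, [day] varchar(10), [user] varchar(20), goal int);

insert into @table1 values
    (1, '01/01/2020', 'Bob',  100),
    (2, '01/01/2020', 'Bob',  150),
    (3, '01/01/2020', 'Bob',   50),
    (4, '02/01/2020', 'Carl', 200),
    (5, '02/01/2020', 'Carl',  30);

insert into @table2 values
    (1, '01/01/2020', 'Bob',  300),
    (2, '02/01/2020', 'Carl', 170);

-- Number the rows per (day, user); only row 1 is allowed to match a goal.
select t1.[day], t1.[user], t1.[value], t2.goal
from (select t.*,
             row_number() over (partition by t.[day], t.[user] order by t.id) as rn
      from @table1 t
     ) t1
left join @table2 t2
       on t2.[day] = t1.[day] and t2.[user] = t1.[user] and t1.rn = 1
order by t1.id;
-- 01/01/2020  Bob   100  300
-- 01/01/2020  Bob   150  NULL
-- 01/01/2020  Bob    50  NULL
-- 02/01/2020  Carl  200  170
-- 02/01/2020  Carl   30  NULL
```

Putting rn = 1 in the ON clause rather than in WHERE keeps the unmatched rows, with a NULL goal.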

SQL query to check if the next row value is same or different

I am joining two tables on a common date column. However, the column I am trying to get from one of the tables (cmg in this case) should only return a row's value if it is different from the previous row's value.
Table A
Date comp.no
-----------------------
2019-03-08 5
2019-02-26 5
2019-01-17 5
2019-01-10 5
2018-12-27 5
Table B
Date cmg
-----------------
2019-07-17 NULL
2019-04-20 NULL
2019-02-26 RHB
2019-01-19 NULL
2019-01-17 RHB
2019-01-10 RMB
2018-12-28 NULL
2018-12-27 RHB
2018-12-12 RUB
2018-11-28 RUB
2018-10-20 NULL
2018-07-21 NULL
2018-04-21 NULL
2018-01-20 NULL
2017-10-21 NULL
2017-07-29 NULL
2017-05-07 NULL
2017-02-13 NULL
2016-11-22 NULL
2016-08-29 NULL
2016-06-07 NULL
2016-04-06 RUB
2016-03-21 RUB
2016-03-07 RUB
You can use the lag function to compare with the previous value. For the first row you'll need an isnull() check, since the first row has no previous value.
;with cte as(
select case
when isnull(lag(t2.cmg) over (order by t2.date),'') <> t2.cmg then 1 else 0 end as isresult
,t2.date,t2.cmg
from TableA t1
inner join TableB t2
on t1.date=t2.date
)
select date,cmg from cte where isresult=1
Use lag():
select date, cmg
from (select b.date, b.cmg, lag(b.cmg) over (order by b.date) as prev_cmg
from a join
b
on a.date = b.date
) b
where prev_cmg is null or prev_cmg <> cmg
order by date;
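To see what the lag() query returns for the sample data, a sketch along these lines could be used. The table variables @a and @b and the column name comp_no are hypothetical, only part of Table B is loaded, and LAG() needs SQL Server 2012 or later.

```sql
-- Sketch only: @a/@b are hypothetical stand-ins for Table A and Table B.
declare @a table ([date] date, comp_no int);
declare @b table ([date] date, cmg varchar(10));

insert into @a values
    ('2019-03-08', 5), ('2019-02-26', 5), ('2019-01-17', 5),
    ('2019-01-10', 5), ('2018-12-27', 5);

insert into @b values
    ('2019-02-26', 'RHB'), ('2019-01-19', null), ('2019-01-17', 'RHB'),
    ('2019-01-10', 'RMB'), ('2018-12-28', null), ('2018-12-27', 'RHB'),
    ('2018-12-12', 'RUB');

-- lag() is evaluated over the joined rows, ordered by date.
select [date], cmg
from (select b.[date], b.cmg,
             lag(b.cmg) over (order by b.[date]) as prev_cmg
      from @a a
      join @b b on a.[date] = b.[date]
     ) x
where prev_cmg is null or prev_cmg <> cmg
order by [date];
-- 2018-12-27  RHB
-- 2019-01-10  RMB
-- 2019-01-17  RHB
```

The 2019-02-26 row is dropped because its cmg (RHB) repeats the previous joined row, and 2019-03-08 has no match in Table B.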

Sql query to assign value to a column having null value from other row based on different scenarios

I have the below real production data scenario and I am trying to get the desired output. I have to populate all the NULL values for the Worker from other rows (next or previous based on data).
Sample Input
PK Id Status Worker Created Date
--- --- ----------- ----------- -------------
1 101 Waiting NULL 1/1/2019 8:00
2 101 Assigned Jon Doe 1/1/2019 8:10
3 101 Initiated Jon Doe 1/1/2019 8:15
4 102 Waiting NULL 1/1/2019 8:00
5 102 Waiting NULL 1/1/2019 8:12
6 102 Assigned Jane Doe 1/1/2019 8:15
7 103 Waiting NULL 1/1/2019 8:00
9 103 Initiated Jon Doe 1/1/2019 8:15
11 103 Waiting NULL 1/1/2019 8:17
12 103 Assigned Jane Doe 1/1/2019 8:20
13 103 Assigned NULL 1/1/2019 8:22
14 103 Initiated NULL 1/1/2019 8:25
Desired Output
PK Id Status Worker Created Date
--- --- ----------- ----------- -------------
1 101 Waiting Jon Doe 1/1/2019 8:00
2 101 Assigned Jon Doe 1/1/2019 8:10
3 101 Initiated Jon Doe 1/1/2019 8:15
4 102 Waiting Jane Doe 1/1/2019 8:00
5 102 Waiting Jane Doe 1/1/2019 8:12
6 102 Assigned Jane Doe 1/1/2019 8:15
7 103 Waiting Jon Doe 1/1/2019 8:00
9 103 Initiated Jon Doe 1/1/2019 8:15
11 103 Waiting Jane Doe 1/1/2019 8:17
12 103 Assigned Jane Doe 1/1/2019 8:20
13 103 Assigned Jane Doe 1/1/2019 8:22
14 103 Initiated Jane Doe 1/1/2019 8:25
SQL:
select tl.*, RANK() OVER (ORDER BY tl.[Id],tl.[Created Date]) rnk
into #temp
from table tl
select tl.*,
case when tl.[Worker] is null then t2.[Worker] else tl.[Worker] end as [Worker Updated]
from #temp tl
left join #temp t2 on tl.[Id]=t2.[Id] and tl.rnk=t2.rnk-1
I am only able to get the correct result for scenario Id 101 in the Input Data Sample. I am not sure how to handle scenario 102 (two consecutive rows having NULL on Worker column) and 103 (Last 2 rows having NULL on Worker).
Can someone please help me on this?
I think what you need is ISNULL() and MAX() OVER(), so your query would look something like this:
SELECT
tl.PK
, tl.Id
, tl.Status
, ISNULL(tl.Worker, MAX(tl.Worker) OVER(PARTITION BY tl.Id) ) Worker
, tl.[Created Date]
FROM #temp tl
ISNULL() checks the value and, if it is NULL, replaces it with the second argument. It does the same job as the CASE in your query.
MAX(tl.Worker) OVER(PARTITION BY tl.Id)
Since aggregate functions ignore NULLs, we can take advantage of that with the OVER() clause: partition the rows by Id and use an aggregate to pick up the value we need. Note that this picks a single worker per Id, so it matches the desired output only when an Id has one worker (as with 101 and 102, but not 103).
Possibly the simplest way is outer apply, looking within the same Id for the next row (by PK) that has a worker:
select t.pk, t.id, t.status, tnext.worker, t.[Created Date]
from t outer apply
(select top (1) t2.*
from t t2
where t2.worker is not null and t2.id = t.id and t2.pk >= t.pk
order by t2.pk asc
) tnext;
What you really want is the IGNORE NULLS option on LEAD(). However, SQL Server did not support that before SQL Server 2022.
To also fill trailing NULLs (rows with no later worker) with the preceding value, follow the same logic with another apply:
select t.pk, t.id, t.status,
coalesce(tnext.worker, tprev.worker) as worker, t.[Created Date]
from t outer apply
(select top (1) t2.*
from t t2
where t2.worker is not null and t2.id = t.id and t2.pk >= t.pk
order by t2.pk asc
) tnext outer apply
(select top (1) t2.*
from t t2
where t2.worker is not null and t2.id = t.id and t2.pk <= t.pk
order by t2.pk desc
) tprev;
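The two-apply fill can be checked against the hardest case in the question, Id 103, with a sketch like the following. The table variable @t is hypothetical, the Created Date column is left out, and PK is assumed to reflect row order within an Id.

```sql
-- Sketch only: @t is a hypothetical table variable with the Id 103 rows.
declare @t table (pk int, id int, status varchar(20), worker varchar(20));

insert into @t values
    ( 7, 103, 'Waiting',   null),
    ( 9, 103, 'Initiated', 'Jon Doe'),
    (11, 103, 'Waiting',   null),
    (12, 103, 'Assigned',  'Jane Doe'),
    (13, 103, 'Assigned',  null),
    (14, 103, 'Initiated', null);

select t.pk, t.id, t.status,
       coalesce(tnext.worker, tprev.worker) as worker
from @t t
outer apply (select top (1) n.worker          -- next non-NULL worker at or after this row
             from @t n
             where n.worker is not null and n.id = t.id and n.pk >= t.pk
             order by n.pk asc
            ) tnext
outer apply (select top (1) p.worker          -- fallback: last non-NULL worker at or before it
             from @t p
             where p.worker is not null and p.id = t.id and p.pk <= t.pk
             order by p.pk desc
            ) tprev
order by t.pk;
--  7  103  Waiting    Jon Doe
--  9  103  Initiated  Jon Doe
-- 11  103  Waiting    Jane Doe
-- 12  103  Assigned   Jane Doe
-- 13  103  Assigned   Jane Doe
-- 14  103  Initiated  Jane Doe
```

The forward lookup takes priority, and the backward lookup is only used for rows 13 and 14, which have no later worker.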

insert duplicate rows in temp table

I'm new to SQL and to this forum, and I need help with inserting duplicate rows into a temp table. I would like to create a view as the result.
View - Employee:
Name Empid Status Manager Dept StartDate EndDate
AAA 111 Active A111 Cashier 2015-01-01 2015-05-01
AAA 111 Active A222 Sales 2015-05-01 NULL
I don't know how to write a function, but do have a DATE table.
Date Table: (365 days) goes up to 2018
Date Fiscal_Wk Fiscal_Mon Fiscal_Yr
2015-01-01 1 1 2015
Result inquiry
How do I duplicate each record from Employee for every calendar date, beginning at its start date, for the entire calendar year?
Result:
Name Empid Status Manager Dept Date FW FM FY
AAA 111 Active A111 Cashier 2015-01-01 1 1 2015
AAA 111 Active A111 Cashier 2015-01-02 1 1 2015
... and so on
AAA 111 Active A222 Sales 2015-05-01 18 5 2015
AAA 111 Active A222 Sales 2015-05-02 18 5 2015
... and so on
Thanks in advance,
Quinn
select * from Employee cross join Calendar
This will essentially join every record in Calendar to every record in Employee.
So, if there are 2 records in Employee and 10 in Calendar, you'll end up with 20 in total, 10 for each employee record.
What you are looking for is a join operation. However, the condition for the join is not equality, because you want all rows that match between two values.
The basic idea is:
select e.*, c.date
from employee e join
calendar c
on e.startdate <= c.date and
(e.enddate is null or c.date <= e.enddate);
Modified query, which yields results for both the previous and the most recent records:
select e.*, c.date, c.FW, c.FM, c.FY
from employee e
join calendar c
on e.startdate <= c.date and
ISNULL(e.enddate,GETDATE()) > c.date
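A small sketch of the range join, using hypothetical table variables @employee and @calendar with only a few columns and dates, shows how the boundary date behaves. It assumes the end date is exclusive, so that 2015-05-01 belongs only to the Sales row that starts that day.

```sql
-- Sketch only: @employee/@calendar are hypothetical, trimmed-down stand-ins.
declare @employee table (name varchar(10), empid int, dept varchar(20),
                         startdate date, enddate date);
declare @calendar table ([date] date);

insert into @employee values
    ('AAA', 111, 'Cashier', '2015-01-01', '2015-05-01'),
    ('AAA', 111, 'Sales',   '2015-05-01', null);

insert into @calendar values ('2015-04-30'), ('2015-05-01'), ('2015-05-02');

-- One output row per calendar date that falls inside the employee row's range.
select e.name, e.empid, e.dept, c.[date]
from @employee e
join @calendar c
  on e.startdate <= c.[date]
 and (e.enddate is null or c.[date] < e.enddate)   -- end date treated as exclusive
order by c.[date];
-- AAA  111  Cashier  2015-04-30
-- AAA  111  Sales    2015-05-01
-- AAA  111  Sales    2015-05-02
```

With c.date <= e.enddate instead, 2015-05-01 would appear for both Cashier and Sales. Using "enddate is null" lets open-ended rows run to the end of the calendar, whereas ISNULL(e.enddate, GETDATE()) stops them at today.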