Effective Date Employee Entered Pay Group - sql

I need to find the effective date when the employee entered the pay group. It either occurred at the hire date, the rehire date, or Transfer date, whichever is latest. I think what I want to do is create a temp table of most recent effective dates where C1.ACTION=('XFR') AND C1.PAYGROUP=A.PAYGROUP, when the associate is not in that table, give me most recent hire date.
A is Top of Stack Employee Data
B is Top of Stack Personal Data
C is entire employee record
Most Recent Hire Date is
CASE WHEN A.HIRE_DT<=A.REHIRE_DT THEN A.REHIRE_DT
ELSE A.HIRE_DT END MOST_REC_HIREDT
FYI I know this query is really messed up, that's why I'm asking for help.
SELECT DISTINCT
A.EMPLID
A.FIRST_NAME||' '||A.LAST_NAME WORKERNAME,
CASE
WHEN(Select Max(C1.EFFDT) FROM JOB C1
WHERE (C.EMPLID=C1.EMPLID
AND C1.ACTION=('TAF')
AND C1.PAYGROUP=A.PAYGROUP
AND C1.EFFDT>=(CASE WHEN A.HIRE_DT<A.REHIRE_DT THEN =A.REHIRE_DT
ELSE A.HIRE_DT END MR_HIRE_DT)))
WHEN A.EMPLID NOT IN JOB C1
THEN (CASE WHEN A.HIRE_DT<=A.REHIRE_DT
THEN A.REHIRE_DT
ELSE A.HIRE_DT END MR_HIRE_DT2)
ELSE 'Null' END EFFDT,
A.PAYGROUP
FROM EMPLOYEES A, PERSONAL_DATA B, JOB C
WHERE
A.EMPLID=B.EMPLID
AND
B.EMPLID=C.EMPLID
AND
A.PAYGROUP=C.PAYGROUP
AND
C.EMPL_STATUS in ('A','L','P','S')

It really is important to use ANSI join syntax as it aids (a lot) in working through the logic of how the tables relate. Here we only have 2 tables but in the example query there are 4 table aliases in use (A, B, C and C1). Additionally it helps to use table aliases that relate to the table's name such as E for Employee, J for Job.
What you are seeking is "the latest" date from table JOB, and an extremely useful function row_number() can be used for this. It is used in conjunction with an over() clause which contains a partition by (which is a little similar to group by) and an order by. When ordered by date descending then the row number is 1 for the most recent date (per employee due to the partition used). So, if we filter the subquery below by is_latest = 1 we get one row per employee with the latest effective date. Note this also removes the need to use select distinct now.
SELECT
E.EMPLID
, (E.FIRST_NAME || ' ' || E.LAST_NAME) WORKERNAME
, J.EFFDT PAYGROUP_EFFDT
, E.PAYGROUP
FROM EMPLOYEES E
INNER JOIN (
SELECT
JOB.*
, ROW_NUMBER() OVER (PARTITION BY EMPLID
ORDER BY EFFDT DESC) AS is_latest
FROM JOB
WHERE EMPL_STATUS IN ('A','L','P','S')
) J ON E.EMPLID=J.EMPLID AND J.is_latest = 1
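To see the row_number() pattern in action, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. SQLite only stands in for the real database here, and the sample rows, names and dates are made up for illustration; the table and column names are the ones used in the query above.

```python
import sqlite3

# Hypothetical sample data; SQLite is a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEES (EMPLID TEXT, FIRST_NAME TEXT, LAST_NAME TEXT, PAYGROUP TEXT);
CREATE TABLE JOB (EMPLID TEXT, EFFDT TEXT, EMPL_STATUS TEXT);
INSERT INTO EMPLOYEES VALUES ('100', 'Ann', 'Lee', 'PG1'), ('200', 'Bob', 'Ray', 'PG2');
INSERT INTO JOB VALUES
  ('100', '2019-01-15', 'A'),
  ('100', '2021-06-01', 'A'),
  ('200', '2020-03-10', 'A'),
  ('200', '2022-02-01', 'T');  -- status 'T' is excluded by the status filter
""")

# Number each employee's JOB rows newest-first and keep only row 1.
latest_rows = conn.execute("""
SELECT E.EMPLID, E.FIRST_NAME || ' ' || E.LAST_NAME AS WORKERNAME,
       J.EFFDT AS PAYGROUP_EFFDT, E.PAYGROUP
FROM EMPLOYEES E
INNER JOIN (
    SELECT JOB.*,
           ROW_NUMBER() OVER (PARTITION BY EMPLID ORDER BY EFFDT DESC) AS is_latest
    FROM JOB
    WHERE EMPL_STATUS IN ('A','L','P','S')
) J ON E.EMPLID = J.EMPLID AND J.is_latest = 1
ORDER BY E.EMPLID
""").fetchall()

for row in latest_rows:
    print(row)
```

Each employee comes back exactly once, with the most recent EFFDT among the rows that pass the status filter.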

I may be over-simplifying the task here, as I don't fully understand how we get to the dates in question. But well, what I am doing is:
get the greater of the two hire_dt and rehire_dt from the employee record
get the job dates for the employee
from these intermediate results get the latest date per employee
The query:
select emplid, max(dt)
from
(
select emplid, greatest(nvl(hire_dt,rehire_dt),nvl(rehire_dt,hire_dt)) as dt from employees
union all
select emplid, effdt as dt from job where action = 'TAF' and empl_status in ('A','L','P','S')
)
group by emplid
order by emplid;
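A rough way to try this out is the sketch below, which runs the same idea on SQLite from Python. Two things are assumptions on my side: SQLite has no greatest() or nvl(), so the two-argument max() and coalesce() are swapped in for them, and the sample rows are invented.

```python
import sqlite3

# Hypothetical sample data; SQLite is a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emplid TEXT, hire_dt TEXT, rehire_dt TEXT);
CREATE TABLE job (emplid TEXT, effdt TEXT, action TEXT, empl_status TEXT);
INSERT INTO employees VALUES
  ('100', '2015-04-01', NULL),          -- never rehired
  ('200', '2012-09-01', '2018-02-15');  -- rehired after the transfer below
INSERT INTO job VALUES
  ('100', '2020-07-01', 'TAF', 'A'),
  ('200', '2016-05-01', 'TAF', 'A');
""")

# Stack the latest (re)hire date and the transfer dates, then take the max per employee.
entry_dates = conn.execute("""
SELECT emplid, MAX(dt) AS entered_dt
FROM (
    SELECT emplid,
           -- scalar max() plays the role of greatest(), coalesce() the role of nvl()
           MAX(COALESCE(hire_dt, rehire_dt), COALESCE(rehire_dt, hire_dt)) AS dt
    FROM employees
    UNION ALL
    SELECT emplid, effdt AS dt
    FROM job
    WHERE action = 'TAF' AND empl_status IN ('A','L','P','S')
)
GROUP BY emplid
ORDER BY emplid
""").fetchall()

for row in entry_dates:
    print(row)
```

Employee 100 gets the transfer date because it is later than the hire date, while employee 200 gets the rehire date because the transfer happened before the rehire.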


Use SQL to exclude rows with no change in specific column from query

How can I exclude the yellow highlighted rows? The requirement of the query is to return only rows where the job title changed, but the raw data includes extraneous rows with different effective dates which we want to be able to automatically exclude with SQL.
This is because the source system has a record effective date column that is common to several columns. We cannot change this architecture and so need the ability to exclude records from the output.
Edit to include error image from suggested answer:
select
a.*
FROM
jobtitles a
LEFT JOIN jobtitles b
ON a.id = b.id AND
a.effdate < b.effdate
WHERE b.id IS NULL
Something like that would get the latest job title, anyway.
You might be able to use that "pseudo table" in further queries.
On considering your question further, how about:
select
employeeid, jobtitle, MIN(effdate) as effdate
FROM jobtitles
group by employeeid, jobtitle
(I'm making the assumption they don't change job titles back and forth, if so you're basically screwed, so be aware of that)
If an employee's JobTitle never reverts to a previous job title, use the following query:
SELECT EmployeeID,
JobTitle,
MAX(Name) AS Name,
MIN(EffectiveDate) AS EffectiveDate
FROM jobtitles
GROUP BY EmployeeID, JobTitle
ORDER BY EmployeeID ASC, EffectiveDate DESC
If an employee's JobTitle can revert or change to a title they have already held in the past, use the following query:
Edit: Update query according to table schema provided in question
SELECT ASSOCIATE_ID,
JOB_TITLE_DESCRIPTION,
POSITION_EFFECTIVE_DATE
FROM (
SELECT
ASSOCIATE_ID,
JOB_TITLE_DESCRIPTION,
JOB_TITLE_CODE,
POSITION_EFFECTIVE_DATE,
LEAD(JOB_TITLE_CODE,1, '0') OVER (PARTITION BY ASSOCIATE_ID ORDER BY POSITION_EFFECTIVE_DATE DESC) AS PREV_TITLE_ID -- partition so rows are never compared across associates
FROM EMP_JOB_HISTORY
) AS tmp
WHERE tmp.PREV_TITLE_ID <> tmp.JOB_TITLE_CODE
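Here is a small sketch of the LEAD approach on SQLite (driven from Python), with made-up rows that include a title reverting to an earlier one. It assumes the window is partitioned by ASSOCIATE_ID so that one associate's rows are never compared with another's; the sample data and the PREV_TITLE_CODE alias are only for illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite is a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP_JOB_HISTORY (
  ASSOCIATE_ID TEXT, JOB_TITLE_CODE TEXT, POSITION_EFFECTIVE_DATE TEXT);
INSERT INTO EMP_JOB_HISTORY VALUES
  ('A1', 'ENG', '2018-01-01'),
  ('A1', 'ENG', '2018-06-01'),  -- same title, unrelated change: should be dropped
  ('A1', 'MGR', '2019-01-01'),
  ('A1', 'ENG', '2020-01-01'),  -- reverted to an earlier title: should be kept
  ('A2', 'ENG', '2019-03-01');
""")

# With rows ordered newest-first, LEAD() looks at the chronologically previous row.
title_changes = conn.execute("""
SELECT ASSOCIATE_ID, JOB_TITLE_CODE, POSITION_EFFECTIVE_DATE
FROM (
    SELECT ASSOCIATE_ID, JOB_TITLE_CODE, POSITION_EFFECTIVE_DATE,
           LEAD(JOB_TITLE_CODE, 1, '0') OVER (
               PARTITION BY ASSOCIATE_ID
               ORDER BY POSITION_EFFECTIVE_DATE DESC) AS PREV_TITLE_CODE
    FROM EMP_JOB_HISTORY
) AS tmp
WHERE tmp.PREV_TITLE_CODE <> tmp.JOB_TITLE_CODE
ORDER BY ASSOCIATE_ID, POSITION_EFFECTIVE_DATE
""").fetchall()

for row in title_changes:
    print(row)
```

The 2018-06-01 row disappears because the title did not change, while the 2020 move back to ENG is kept, which a plain GROUP BY on the title would not manage.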

SQL How to pull in all records that don't contain

This is a bit of a tricky question to explain, but I'll try my best.
The essence of the question is that I have an employee salary table with the following columns: Employee ID, Month of Salary, Salary (Currency).
I want to run a select that will show me all of the employees that don't have a record for X month.
I have attached an image to assist in visualising this, and here is an example of what I would want from this data:
Let's say from this small example that I want to see all of the employees that weren't paid on the 1st October 2021. From looking I know that employee 3 was the only one paid and 1 and 2 were not paid. How would I be able to query this on a much larger range of data without knowing which month it could be that they weren't paid?
You need to join your EmployeeSalary table against a list of expected EmployeeID/MonthOfSalary values and determine the gaps, i.e. the instances where there is no matching record in the EmployeeSalary table. A LEFT OUTER JOIN can be used here: whenever there is no matching record in your EmployeeSalary table, the LEFT OUTER JOIN gives you NULL.
The following query shows how to perform the LEFT OUTER JOIN, however note that I've joined your table on itself to get the list of EmployeeID and MonthOfSalary values. You would be better to join these from other tables, i.e. I assume you have an Employee table with all the IDs in it, which would be more efficient (and more accurate) to use, than building the ID list from the EmployeeSalary table (like I've done).
SELECT EmployeeList.EmployeeID, MonthList.MonthOfSalary
FROM (SELECT DISTINCT MonthOfSalary FROM EmployeeSalary) MonthList
CROSS JOIN (SELECT DISTINCT EmployeeID FROM EmployeeSalary) EmployeeList
LEFT OUTER JOIN EmployeeSalary
ON MonthList.MonthOfSalary = EmployeeSalary.MonthOfSalary
AND EmployeeList.EmployeeID = EmployeeSalary.EmployeeID
WHERE EmployeeSalary.EmployeeID IS NULL
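A minimal sketch of this gap-finding join, run on SQLite from Python with made-up rows that mirror the example in the question (only employee 3 is paid for October 2021). The every-employee-by-every-month pairing is written as an explicit CROSS JOIN; the salary figures are invented.

```python
import sqlite3

# Hypothetical sample data; SQLite is a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeSalary (EmployeeID INTEGER, MonthOfSalary TEXT, Salary REAL);
INSERT INTO EmployeeSalary VALUES
  (1, '2021-08-01', 1000), (1, '2021-09-01', 1000),
  (2, '2021-08-01', 1200), (2, '2021-09-01', 1200),
  (3, '2021-08-01', 900),  (3, '2021-09-01', 900), (3, '2021-10-01', 900);
""")

# Build every expected (employee, month) pair, then keep the pairs with no salary row.
missing = conn.execute("""
SELECT EmployeeList.EmployeeID, MonthList.MonthOfSalary
FROM (SELECT DISTINCT MonthOfSalary FROM EmployeeSalary) MonthList
CROSS JOIN (SELECT DISTINCT EmployeeID FROM EmployeeSalary) EmployeeList
LEFT OUTER JOIN EmployeeSalary
  ON MonthList.MonthOfSalary = EmployeeSalary.MonthOfSalary
 AND EmployeeList.EmployeeID = EmployeeSalary.EmployeeID
WHERE EmployeeSalary.EmployeeID IS NULL
ORDER BY EmployeeList.EmployeeID, MonthList.MonthOfSalary
""").fetchall()

for row in missing:
    print(row)
```

Employees 1 and 2 are reported as missing October. Note that a month in which nobody at all was paid would not show up, because the month list is built from the salary table itself.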
You first need to get the latest paid month for each employee, then calculate the difference from today and filter on it. The filter can be done in the where clause of the outer query.
I propose the following starting point, which you may need to adapt, at least to cast some values according to your column types.
with latest_pay as (
-- For each employee, get the latest paid month
-- (Salary is left out: it cannot be selected next to max() without also being grouped)
select Employee_ID, max(Month) as latest_pay_month
from your_table
group by Employee_ID
)
-- Look for employees not paid for more than 'your_threshold' months
select Employee_ID, latest_pay_month, datediff(month, latest_pay_month, getdate()) as latest_paid_month_delay
from latest_pay
where datediff(month, latest_pay_month, getdate()) > your_threshold
Btw, I know it's an example, but avoid column names such as Month, which lead to confusion and errors with SQL keywords.
This is ideally where you would use a calendar table. Having one available is handy for tasks such as this, where you need to find missing dates.
You can build one on the fly, as I have done in this example, but you would normally have a permanent table to use.
In order to determine which rows are missing you need to generate a list of expected rows; an outer join to your actual data will then reveal the missing rows.
So here we have a CTE that generates a list of dates (based on a date range you can set), followed by another that gives a list of all the EmployeeId values.
You expect each EmployeeId to have a row for each month, so we do a cross join to generate the list of expected results. We then outer join with the actual data and filter to the null rows; these are the employees who have not been paid for that month.
See example DB<>Fiddle
declare @from date='20210101', @to date='20211001';
with dates as (
select DateAdd(month,n,@from) dt from (
select top(100) Row_Number() over(order by (select null))-1 n from master.dbo.spt_values
)v
), e as (select distinct employeeId from t)
select dt, e.EmployeeId
from dates d cross join e
left join t on DatePart(month,d.dt)=DatePart(month,t.PaidDate) and t.EmployeeId=e.EmployeeId
where d.dt<=@to
and t.EmployeeId is null
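For anyone who wants to experiment without SQL Server, here is a rough equivalent on SQLite, driven from Python. It is a sketch under a few assumptions: a recursive CTE is swapped in for the spt_values numbers trick, the match is on year and month rather than month alone, and the table t with its rows is invented sample data.

```python
import sqlite3

# Hypothetical sample data; SQLite is a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (EmployeeId INTEGER, PaidDate TEXT);
INSERT INTO t VALUES
  (1, '2021-08-25'), (1, '2021-10-25'),
  (2, '2021-08-25');
""")

date_from, date_to = "2021-08-01", "2021-10-01"

# dates: one row per month in the range; e: every known employee.
# The cross join gives the expected rows, the left join reveals the missing ones.
unpaid = conn.execute("""
WITH RECURSIVE dates(dt) AS (
    SELECT ?
    UNION ALL
    SELECT date(dt, '+1 month') FROM dates WHERE dt < ?
),
e AS (SELECT DISTINCT EmployeeId FROM t)
SELECT d.dt, e.EmployeeId
FROM dates d
CROSS JOIN e
LEFT JOIN t
  ON strftime('%Y-%m', t.PaidDate) = strftime('%Y-%m', d.dt)
 AND t.EmployeeId = e.EmployeeId
WHERE t.EmployeeId IS NULL
ORDER BY d.dt, e.EmployeeId
""", (date_from, date_to)).fetchall()

for row in unpaid:
    print(row)
```

September shows up for both employees even though nobody was paid that month, which is the case a month list derived from the payment data would miss.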

SELECT rows containing latest values

How do I SELECT the column1 rows whose column2 has the latest date and is not null?
For example, I need to return just the line five (employee3).
How about this?
SELECT Employee, MAX(Resignation) Resignation
FROM table
WHERE Resignation IS NOT NULL
GROUP BY Employee
Or, if your table has more columns than you've shown,
SELECT a.*
FROM table a
JOIN (
SELECT Employee, MAX(Resignation) Resignation
FROM table
WHERE Resignation IS NOT NULL
GROUP BY Employee
) b ON a.Employee = b.Employee AND a.Resignation = b.Resignation
This is the "find detail rows with extreme values" query pattern.
Updated with a re-interpretation of the question:
I think you mean:
Return the most recent resignation date for all employees who are currently Resigned. Currently resigned is defined as "having all the same employee records with a resignation date populated for that employee". A single employee record with a NULL resignation date means the employee is still employed; regardless of how many times they have resigned!
This can be accomplished with NOT EXISTS and a correlated subquery, along with a MAX and a GROUP BY.
First we get a list of all the employees who are not resigned.
Then we compare our full set to that list and keep only the employees who are not in it. Next we group by employee and get the max resignation.
SELECT Employee, Max(Resignation) Resignation
FROM Table A
WHERE NOT EXISTS (SELECT 1
FROM Table B
WHERE A.Employee = B.Employee -- correlation occurs here: A.Employee refers to the outer query
and B.Resignation is null)
GROUP BY Employee
Correlation occurs on the line WHERE A.Employee = B.Employee, as A.Employee refers to Table A, which is one level removed from Table B.
Pretty sure there would be a way to do this with an apply join as well; but I'm not as familiar with that syntax yet.
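To make the NOT EXISTS version concrete, here is a minimal sketch on SQLite from Python. The table name Employment is hypothetical (the real table name is not given), and the rows are invented so that only employee3 is currently resigned.

```python
import sqlite3

# Hypothetical table name and sample data; SQLite is a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employment (Employee TEXT, Resignation TEXT);
INSERT INTO Employment VALUES
  ('employee1', '2015-03-01'),
  ('employee1', NULL),          -- rehired and still employed
  ('employee2', NULL),          -- never resigned
  ('employee3', '2016-01-10'),
  ('employee3', '2017-05-20');  -- every record has a resignation date
""")

# Keep only employees with no open (NULL) record, then take their latest resignation.
resigned = conn.execute("""
SELECT A.Employee, MAX(A.Resignation) AS Resignation
FROM Employment A
WHERE NOT EXISTS (SELECT 1
                  FROM Employment B
                  WHERE A.Employee = B.Employee  -- correlated with the outer query
                    AND B.Resignation IS NULL)
GROUP BY A.Employee
ORDER BY A.Employee
""").fetchall()

for row in resigned:
    print(row)
```

employee1 is filtered out despite having a resignation date, because one of their records is still open.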

sql server - join 2 tables based on earliest date in 2nd table

I'm looking for query advice to gather data on the following.
Table 1 'Case' - contains columns: Id, Customer, Product, Reported Date
Table 2 'Activity' - contains columns: Case Id, Date Created, Created By
There can be many activities linked to the same case. What I'd like to do is write a query to return the following.
Case.Id, Case.Customer, Case.Product, Case.ReportedDate,
Activity.DateCreated, Activity.CreatedBy,
datediff(hour, Case.ReportedDate, Activity.DateCreated)
BUT ONLY for the activity with the earliest date. Basically showing the time difference between when the case was first created and the first activity was created.
I'd really appreciate any advice on how to accomplish this join. I tried a few things but it ended up returning multiple rows per case. Thanks very much!
Try this...
SELECT C.ID
,C.Customer
,C.Product
,C.ReportedDate
,DATEDIFF(HOUR, C.ReportedDate, A.DateCreated) AS [TimePassed]
,A.CreatedBy
FROM [Case] C INNER JOIN
(SELECT *,
ROW_NUMBER() OVER (PARTITION BY CaseId ORDER BY DateCreated ASC) AS rn
FROM [Activity]) A
ON C.ID = A.CaseId
WHERE A.rn = 1
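The same shape can be tried on SQLite from Python, as in the sketch below. It is only an approximation of the SQL Server query: SQLite has no DATEDIFF, so elapsed whole hours are computed from epoch seconds instead (SQL Server's DATEDIFF counts hour boundaries crossed, which can differ), and the cases and activities are invented.

```python
import sqlite3

# Hypothetical sample data; SQLite is a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [Case] (ID INTEGER, Customer TEXT, Product TEXT, ReportedDate TEXT);
CREATE TABLE Activity (CaseId INTEGER, DateCreated TEXT, CreatedBy TEXT);
INSERT INTO [Case] VALUES
  (1, 'Acme', 'Widget', '2023-01-01 08:00:00'),
  (2, 'Globex', 'Gadget', '2023-02-10 09:00:00');
INSERT INTO Activity VALUES
  (1, '2023-01-02 09:00:00', 'lee'),
  (1, '2023-01-01 13:00:00', 'kim'),  -- earliest activity for case 1
  (2, '2023-02-10 11:00:00', 'lee');
""")

# Number each case's activities oldest-first and join only row 1 back to the case.
first_activity = conn.execute("""
SELECT C.ID, C.Customer, A.DateCreated, A.CreatedBy,
       (CAST(strftime('%s', A.DateCreated) AS INTEGER)
        - CAST(strftime('%s', C.ReportedDate) AS INTEGER)) / 3600 AS HoursPassed
FROM [Case] C
INNER JOIN (
    SELECT Activity.*,
           ROW_NUMBER() OVER (PARTITION BY CaseId ORDER BY DateCreated ASC) AS rn
    FROM Activity
) A ON C.ID = A.CaseId
WHERE A.rn = 1
ORDER BY C.ID
""").fetchall()

for row in first_activity:
    print(row)
```

Each case appears once, paired with its earliest activity, so the multiple-rows-per-case problem goes away.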

How to get the MAX result even if the MAX is different

I have a query that pulls language knowledge. Each employee has a plan year, but they do not all complete one each year, so to get their most recent one I use MAX on the plan year. One of the criteria is whether or not they are willing to move overseas. The issue is that the query returns both their most recent YES and their most recent NO, and I just need their most recent plan year, whether it is yes or no. I am having difficulty troubleshooting this. The code is as follows:
SELECT Employee_ID, Accept_International_Assignment, MAX(Plan_Year) AS Expr1
FROM dbo.v_sc08_CD_Employee_Availabilities
GROUP BY Employee_ID, Accept_International_Assignment
I suspect this will be more efficient than the accepted answer, at scale...
;WITH x AS
(
SELECT Employee_ID, Accept_International_Assignment, Plan_Year,
rn = ROW_NUMBER() OVER (PARTITION BY Employee_ID ORDER BY Plan_Year DESC)
FROM dbo.v_sc08_CD_Employee_Availabilities -- who comes up with these names?
)
SELECT Employee_ID, Accept_International_Assignment, Plan_Year
FROM x WHERE rn = 1;
SELECT a.Employee_ID, a.Accept_International_Assignment, a.Plan_Year
FROM dbo.v_sc08_CD_Employee_Availabilities a
INNER JOIN (SELECT Employee_ID, MAX(Plan_Year) maxPlanYear
from dbo.v_sc08_CD_Employee_Availabilities
GROUP BY Employee_ID) m
ON a.Plan_Year = m.maxPlanYear AND a.Employee_ID = m.Employee_ID
I'm not sure if you want only the most recent decision and year as Raphael posted or if you want the yes's and the no's but always with the Max plan year for that employee.
Here is a query for the yes's and no's but Max plan year is always the max for the employee.
select main.Employee_ID, Accept_International_Assignment, Expr1
from (
SELECT Employee_ID, Accept_International_Assignment
FROM #v_sc08_CD_Employee_Availabilities
GROUP BY Employee_ID, Accept_International_Assignment
) main
inner join
(
select Employee_ID, MAX(Plan_Year) as Expr1
from #v_sc08_CD_Employee_Availabilities
group by Employee_ID) empPlanYear
on main.Employee_ID = empPlanYear.Employee_ID
You need a subquery on the max() and join against that.
SQL Group by & Max