SQL Server one-to-many relational IF query - sql

I have an employees table that has many employee_records. The employee_records table has a column named event_type, an enum that can be hire-date, promotion, termination, title-change, or rehire. I am trying to calculate an employee's total time employed and make some calculations based on how long they have been employed.
How can I add a column that gives me the total days that they have been employed?
Essentially, I need to see if they have a record with the event_type = termination and if they do, then I need to see if they have a rehire date, and if they do, then I need to use their rehire date as the first day of their employment and calculate their time of employment that way.
As for a result, I simply need a days_employed column that reflects the actual amount of days they have been employed.
Here is what I have so far.
SELECT
employees.id,
first_name,
last_name,
email,
event_type,
CASE
WHEN DATEDIFF(DAY, employee_records.created_at, SYSDATETIME()) < 365 * 5
THEN 1
WHEN DATEDIFF(DAY, employee_records.created_at, SYSDATETIME()) < 365 * 10
THEN 2
ELSE 3
END AS benfits_type,
DATEDIFF(DAY, employee_records.created_at, SYSDATETIME()) AS days_employed,
employee_records.created_at AS hire_date
FROM
employees
JOIN
employee_records ON employees.id = employee_records.employee_id
ORDER BY
employees.id ASC;

Here is an example of how you could do this. I'll post the query first and then walk through my explanation. If I understood you correctly, you were not looking for total days the employee was hired, but rather the total days of the employee's most recent employment at the company (Max hire date to max termination date or today).
;WITH hired AS (
    -- Latest hire or rehire date per employee
    SELECT employee_records.employee_id AS id,
           MAX(employee_records.created_at) AS created_at
    FROM employee_records
    WHERE event_type IN ('hire-date', 'rehire')
    GROUP BY employee_records.employee_id
),
latest AS (
    -- Latest termination date per employee, or today if never terminated
    SELECT employees.id AS id,
           ISNULL(CAST(MAX(employee_records.created_at) AS DATE), GETDATE()) AS created_at
    FROM employees
    LEFT JOIN employee_records
           ON employees.id = employee_records.employee_id
          AND employee_records.event_type = 'termination'
    GROUP BY employees.id
)
SELECT *,
       DATEDIFF(DAY, Hired, LatestDay) AS DaysEmployed
FROM (
    SELECT hired.id,
           hired.created_at AS Hired,
           CASE
               WHEN hired.created_at > latest.created_at THEN CAST(GETDATE() AS DATE)
               ELSE latest.created_at
           END AS LatestDay
    FROM hired
    INNER JOIN latest
            ON hired.id = latest.id
) JoinedCTEs
First of all, I know you mentioned event_type is an integer, but for ease of explanation I used a varchar.
The query starts with two CTEs.
First there is "hired", which gets the latest hire date. So if an employee has multiple hire dates, it grabs the latest one.
Second there is "latest", which is the employee's latest termination date, with today's date used as a placeholder if the employee has never been terminated.
The final query joins the two CTEs and does a datediff by day to determine how many days an employee has been at the company. If the termination date is earlier than the hire date (an employee who was hired, terminated, rehired and is still with the company), it takes today's date as the latest date to count.
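To make the rule above easier to check by hand, here is a minimal Python sketch of the same logic for a single employee. It is an illustration, not the answer's code: the function name days_employed and the (event_type, date) tuple layout are made up for this example, and it assumes that both 'hire-date' and 'rehire' events can start an employment stint.

```python
from datetime import date

def days_employed(events, today):
    """events: list of (event_type, date) tuples for one employee (hypothetical layout)."""
    # Assumption: the latest hire or rehire marks the start of the current stint.
    starts = [d for kind, d in events if kind in ("hire-date", "rehire")]
    if not starts:
        return None  # no hire record at all
    start = max(starts)

    # The latest termination, if any, is the candidate end; otherwise use today.
    ends = [d for kind, d in events if kind == "termination"]
    end = max(ends) if ends else today

    # A termination earlier than the latest (re)hire means the employee
    # came back and is still employed, so count up to today.
    if end < start:
        end = today
    return (end - start).days
```

For an employee hired, terminated, and then rehired, only the days since the rehire are counted, which mirrors the CASE expression in the final SELECT.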

Related

Data value on a given date

This time I have a table in a PostgreSQL database that contains the employee name, the date the employee started working, and the date the employee left the company. If the employee is still with the company, this last field is null.
Knowing this, I would like to know how many people were working on a given date. For example:
I would like to know how many people worked at the company in January 2021.
I don't know where to start. In some attempts I got the number of hires and layoffs per month, but I need to show the accumulated value per month in another column.
I hope I made myself understood. I'll leave the last SQL I got here.
select reference, sum(hires) from
(
select
date_trunc('month', date_hires) as reference,
count(*) as hires
from
ponto_mais_relatorio_colaboradores
group by
date_hires
union all
select
date_trunc('month', date_layoff) as reference,
count(*)*-1 as layoffs
from
ponto_mais_relatorio_colaboradores
group by
date_layoff
) as reference
join calendar_aux on calendar_aux.ano_mes = reference
group by reference
order by reference
Break the requirement down. The question is: how many people are employed on any given date? That includes everyone hired on or before that date who has no layoff date, plus everyone hired on or before that date whose layoff date is later than the date you are interested in. For example, if you are interested in January, you still want to count an employee with a layoff date in February. With that in place, convert it into SQL. The condition above can be written as a select that compares dates. The other issue is that January is not a date but a range of dates, so you need each date in it. You can use generate_series to create each day in January, then join the generated dates with the selection from your table. The resulting query:
with jan_dates( jdate ) as
( select generate_series( date '2021-01-01'
, date '2021-01-31'
, interval '1' day
)::date
)
select jdate "Date", count(*) "Employees"
from jan_dates j
join employees e
on ( e.date_hires <= j.jdate
and ( e.date_layoff is null
or e.date_layoff > j.jdate
)
)
group by j.jdate
order by j.jdate;
Note: Not tested.
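Since the query is untested, the same counting rule can be tried out on a few rows in plain Python first. This is only a sketch under stated assumptions: the function name headcount_by_day and the (date_hires, date_layoff) tuple layout are hypothetical, and None stands in for a null layoff date.

```python
from datetime import date, timedelta

def headcount_by_day(employees, first_day, last_day):
    """employees: list of (date_hires, date_layoff) tuples; date_layoff may be None."""
    counts = {}
    day = first_day
    while day <= last_day:
        # Same condition as the join: hired on or before the day,
        # and either never laid off or laid off after the day.
        counts[day] = sum(
            1
            for hired, laid_off in employees
            if hired <= day and (laid_off is None or laid_off > day)
        )
        day += timedelta(days=1)
    return counts
```

Unlike the inner join in the query, this sketch also reports days on which nobody was employed, with a count of zero.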

Count numbers of days worked for Employees in SQL Server

I have a database containing transactional data. I am trying to count the number of days all employees worked between '2016-03-01' and '2017-03-01'. What I am using to determine the days worked per employee is by doing a count(distinct day(datecompleted)) where datecompleted resembles a day that an employee completed an activity as the driver.
The issue is there can be 50-100 transactions a day each having a datecompleted time stamp, however I just want to check if the employee worked that day by grabbing a distinct day(datecompleted) and then counting how many between the given time frame...
I used this query:
select
count(distinct day(dateCompleted)),
repName
from
DATABASE_view_Final
where
datecompleted between '2016-03-01' and '2017-03-01'
group by
repName
For some reason I keep getting the result : 31....
which doesn't make any sense...someone please help
You're counting distinct days of the month (1-31), not distinct dates, so the result can never exceed 31. Perhaps you could try:
select count(distinct cast(dateCompleted as date)),
repName
from DATABASE_view_Final
where datecompleted between '2016-03-01' and '2017-03-01'
group by repName
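The difference between the two counts is easy to see on a handful of timestamps. The following Python sketch uses made-up sample values; ts.day plays the role of day(dateCompleted) and ts.date() the role of cast(dateCompleted as date).

```python
from datetime import datetime

# Hypothetical completion timestamps for one rep.
stamps = [
    datetime(2016, 3, 5, 9, 0),
    datetime(2016, 3, 5, 14, 30),  # same date as above, later transaction
    datetime(2016, 4, 5, 8, 15),   # same day-of-month, different month
    datetime(2016, 4, 6, 10, 0),
]

# Day-of-month only: March 5 and April 5 collapse into the same value.
distinct_days_of_month = {ts.day for ts in stamps}

# Full calendar dates: each worked date is counted once.
distinct_dates = {ts.date() for ts in stamps}

print(len(distinct_days_of_month), len(distinct_dates))
```

Here the day-of-month count is 2 while the date count is 3, which is why the original query tops out at 31 over a full year.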

Average Group size per month Over previous ten years

I need to find the average size (average number of employees) of all the groups (employers) that we do business with per month for the last ten years.
So I have no problem getting the average group size for each month. For the Current month I can use the following:
Select count(*)
from Employees EE
join Employers ER on EE.employerid = ER.employerid
group by ER.EmployerName
This will give me a list of how many employees are in each group. I can then copy and paste the column into excel get the average for the current month.
For the previous month, I want exclude any employees that were added after that month. I have a query for this too:
Select count(*)
from Employees EE
join Employers ER on EE.employerid = ER.employerid
where EE.dateadded <= DATEADD(month, -1,GETDATE())
group by ER.EmployerName
That will exclude all employees that were added this month. I can continue to this all the way back ten years, but I know there is a better way to do this. I have no problem running this query 120 times, copying and pasting the results into excel to compute the average. However, I'd rather learn a more efficient way to do this.
Another Question, I can't do the following, anyone know a way around it:
Select avg(count(*))
Thanks in advance guys!!
Edit: Employees that have been terminated can be found like this. NULL are employees that are currently employed.
Select count(*)
from Employees EE
join Employers ER on EE.employerid = ER.employerid
join Gen_Info gne on gne.id = EE.newuserid
where EE.dateadded <= DATEADD(month, -1,GETDATE())
and (gne.TerminationDate is NULL OR gne.TerminationDate < DATEADD(day, -14,GETDATE()))
group by ER.EmployerName
Are you after a query that shows the count by the year and month they were added? If so, this seems pretty straightforward.
This uses the MySQL date functions YEAR and MONTH.
Select AVG(cnt) FROM (
Select count(*) cnt, Year(dateAdded), Month(dateAdded)
from System_Users su
join system_Employers se on se.employerid = su.employerid
group by Year(dateAdded), Month(dateAdded)) B
The inner query breaks the counts out by year and month. We then wrap that in an outer query to get the average.
--2nd attempt but I'm Brain FriDay'd out.
This uses a Common table Expression (CTE) to generate a set of data for the count by Year, Month of the employees, and then averages out by month.
If this isn't what you're after, sample data with expected results would help frame the question better, and I can stop making assumptions about what you need/want.
With CTE AS (
Select Year(dateAdded) YR , Month(DateAdded) MO, count(*) over (partition by Year(dateAdded), Month(dateAdded) order by DateAdded Asc) as RunningTotal
from System_Users su
join system_Employers se on se.employerid = su.employerid)
Select avg(RunningTotal), MO from CTE
group by MO;
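As for the avg(count(*)) that the question could not write directly: the first query in this answer gets around it by counting in an inner step and averaging in an outer step. A rough Python equivalent of that two-step shape is below. The function name average_monthly_additions is made up for illustration, and it averages the number of rows added per (year, month), which may or may not be the figure the asker ultimately wants.

```python
from collections import Counter
from datetime import date
from statistics import mean

def average_monthly_additions(dates_added):
    """Average number of rows added per (year, month) bucket."""
    # Inner step: count rows per year and month (the GROUP BY).
    per_month = Counter((d.year, d.month) for d in dates_added)
    # Outer step: average those counts (the AVG over the derived table).
    return mean(list(per_month.values()))
```

Months with no additions do not appear in the counter, so they are not part of the average, just as they produce no row in the grouped query.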

Calculating day difference between 2 date columns where dates are "messy"

Query:
SELECT EmployeeId,
HireDate,
TerminationDate
FROM dbo.Employment
WHERE EmployeeId = 318312
ORDER BY HireDate,
TerminationDate;
Result:
I need to get the number of days this person worked. The problem is that the termination date is "messy" ... meaning, I might not get a termination date for every hire date.
So basically I need to put the dates in "order" ... and then figure out how many days the person had of employment.
In this scenario, it goes as follows:
Person is hired on 2012-12-19, has no termination date and then was re-hired on 2012-12-27.
Person terminates on 2014-03-01 and then is re-hired on 2014-06-05.
Person has no termination date after 2014-06-05 so it is assumed he was re-hired on 2014-06-06 rather than 2014-06-05.
How do I go about creating a query that captures the number of EMPLOYMENT days (excluding gaps), in this scenario?
I would be grouping this by EmployeeID as I'm running this for multiple employees.
This problem is really kicking my butt and I need some help.
This is rather complex but uses LAG to get the previous row, put that in a CTE and then pick out the data with a CASE:
;WITH dataCTE AS
(SELECT EmployeeID,
LAG(HireDate, 1) OVER (PARTITION BY EmployeeID ORDER BY HireDate) PreviousHireDate,
LAG(TerminationDate, 1) OVER (PARTITION BY EmployeeID ORDER BY HireDate) PreviousTerminationDate,
HireDate, TerminationDate
FROM Employment)
SELECT EmployeeID,
CASE WHEN PreviousTerminationDate IS NULL THEN PreviousHireDate ELSE HireDate END AS HireDate,
TerminationDate,
DATEDIFF(DAY, CASE WHEN PreviousTerminationDate IS NULL THEN PreviousHireDate ELSE HireDate END, TerminationDate) AS NumberOfDays
FROM dataCTE
WHERE TerminationDate IS NOT NULL
Example fiddle here: http://sqlfiddle.com/#!6/1f839e/22

employee database with varying salary over time

I have the following tables:
PROJECTS - project_id, name
EMPLOYEES - employee_id, name
SALARY - employee_id, date, per_hour
HOURS - log_id, project_id, employee_id, date, num_hours
I need to query how much a project is costing. Problem is that Salary can vary. For example, a person can get a raise.
The SALARY table logs the per_hour charge for an employee. With every change in cost being recorded with its date.
How can I query this information to make sure that each log entry from the HOURS table is always matched to the right entry from the SALARY table? The right match is: depending on the date of the hours log, get the row from the SALARY table with the highest date before the log's date.
For example, if the work was performed on Feb 14th, get the row for this employee from the SALARY table with the highest date that is still before the 14th.
Thank you,
What you need is an end date on SALARY. When a new record is inserted into SALARY for an employee, the previous record with the highest date (or better yet, a current flag set to 'Y' as recommended by cletus) should have its end date column set to the same date as the start date for the new record.
This should work with your current schema but be aware that it may be slow.
SELECT
SUM(h.num_hours * s.per_hour) AS cost
FROM PROJECTS p
INNER JOIN HOURS h
ON p.project_id = h.project_id
INNER JOIN (
SELECT
s1.employee_id,
s1.date AS start_date,
MIN(s2.date) AS end_date
FROM SALARY s1
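-- LEFT JOIN keeps the most recent salary row, which has no later row to close it
LEFT JOIN SALARY s2
ON s1.employee_id = s2.employee_id
AND s1.date < s2.date
GROUP BY
s1.employee_id,
s1.date) s
ON h.employee_id = s.employee_id
AND h.date >= s.start_date
AND (s.end_date IS NULL OR h.date < s.end_date)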
In the 'Hours' table actually log the value of the salary that you use (don't link it based on ID). This will give you more flexibility in the future.
I have found the easiest way to handle queries spanning dates like this is to store a StartDate and an EndDate, where the EndDate is NULL for the current salary. I use a trigger to make sure there is only ever one NULL value for EndDate, and that there are no overlapping date ranges or gaps between the ranges. StartDate is made not nullable, since NULL is never a valid value for it.
Then your join is pretty simple:
select h.num_hours, s.per_hour
from hours h
inner join salary s on h.employee_id = s.employee_id
and h.date >= s.StartDate and (h.date <= s.EndDate or s.EndDate is null)
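The matching rule itself (take the salary row with the highest date that is not after the work date) can also be sketched outside SQL. The Python below is a minimal illustration, not any answerer's implementation: the names rate_on and project_cost and the tuple layouts are hypothetical, and it assumes a rate takes effect on its own date, as the >= comparisons in the joins above do.

```python
from bisect import bisect_right
from datetime import date

def rate_on(salary_history, work_date):
    """salary_history: list of (effective_date, per_hour), sorted by effective_date."""
    dates = [d for d, _ in salary_history]
    # Index of the first effective date strictly after work_date;
    # the entry just before it is the rate in force on that day.
    i = bisect_right(dates, work_date)
    if i == 0:
        raise ValueError("no salary record on or before the work date")
    return salary_history[i - 1][1]

def project_cost(hours_log, salaries):
    """hours_log: list of (employee_id, date, num_hours) rows for one project.
    salaries: dict mapping employee_id to that employee's sorted salary history."""
    return sum(
        num_hours * rate_on(salaries[emp], worked_on)
        for emp, worked_on, num_hours in hours_log
    )
```

With a salary history of 20.0 from 2020-01-01 and 25.0 from 2020-02-10, eight hours logged on 2020-02-01 are charged at 20.0 and four hours on 2020-02-14 at 25.0.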