Employee database with varying salary over time - SQL

I have the following tables:
PROJECTS - project_id, name
EMPLOYEES - employee_id, name
SALARY - employee_id, date, per_hour
HOURS - log_id, project_id, employee_id, date, num_hours
I need to query how much a project is costing. The problem is that salary can vary; for example, a person can get a raise.
The SALARY table logs the per_hour charge for an employee, with every change in cost recorded along with its date.
How can I query this so that each log in the HOURS table is always matched to the right entry in the SALARY table? The right match is the row from the SALARY table with the highest date before the date of the hours log.
That is, if the work was performed on Feb 14th, get the row for this employee from the SALARY table with the highest date that is still before the 14th.
Thank you.

What you need is an end date on SALARY. When a new record is inserted into SALARY for an employee, the previous record with the highest date (or better yet, a current flag set to 'Y' as recommended by cletus) should have its end date column set to the same date as the start date for the new record.
This should work with your current schema but be aware that it may be slow.
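SELECT
p.project_id,
SUM(h.num_hours * s.per_hour) AS cost
FROM PROJECTS p
INNER JOIN HOURS h
ON p.project_id = h.project_id
INNER JOIN (
-- One row per salary record: valid from start_date up to (not including) end_date
SELECT
s1.employee_id,
s1.per_hour,
s1.date AS start_date,
MIN(s2.date) AS end_date
FROM SALARY s1
-- LEFT JOIN keeps the latest salary record, which has no later record (end_date is NULL)
LEFT JOIN SALARY s2
ON s1.employee_id = s2.employee_id
AND s1.date < s2.date
GROUP BY
s1.employee_id,
s1.per_hour,
s1.date) s
ON h.employee_id = s.employee_id
AND h.date >= s.start_date
AND (s.end_date IS NULL OR h.date < s.end_date)
GROUP BY p.project_id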
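The matching rule from the question (the latest SALARY row dated on or before the HOURS row) can also be written as a correlated subquery. The following is a minimal sketch, not a tuned implementation: it uses Python's built-in sqlite3 module with made-up sample rows so it can be run as is, and it treats a raise dated the same day as the work as already in force, which is an assumption.

```python
import sqlite3


def project_costs(conn):
    """Return {project_id: cost}, pricing each HOURS row at the rate in force that day."""
    sql = """
        SELECT h.project_id, SUM(h.num_hours * s.per_hour) AS cost
        FROM HOURS h
        JOIN SALARY s
          ON s.employee_id = h.employee_id
         -- pick the latest salary change on or before the day the hours were logged
         AND s.date = (SELECT MAX(s2.date)
                       FROM SALARY s2
                       WHERE s2.employee_id = h.employee_id
                         AND s2.date <= h.date)
        GROUP BY h.project_id
    """
    return dict(conn.execute(sql).fetchall())


# Sample data is invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SALARY (employee_id INTEGER, date TEXT, per_hour REAL);
    CREATE TABLE HOURS (log_id INTEGER, project_id INTEGER, employee_id INTEGER,
                        date TEXT, num_hours REAL);
    INSERT INTO SALARY VALUES (1, '2010-01-01', 10.0), (1, '2010-02-10', 20.0);
    INSERT INTO HOURS VALUES (1, 100, 1, '2010-02-01', 2.0),
                             (2, 100, 1, '2010-02-14', 3.0);
""")
costs = project_costs(conn)
```

Here the hours logged on Feb 1st are priced at the January rate and the hours logged on Feb 14th at the rate from Feb 10th. Hours logged before an employee's first SALARY row match nothing and are silently left out of the total.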

In the 'Hours' table actually log the value of the salary that you use (don't link it based on ID). This will give you more flexibility in the future.

I have found the easiest way to handle queries spanning dates like this is to store a StartDate and an EndDate, where the EndDate is NULL for the current salary. I use a trigger to make sure there is only ever one NULL value for EndDate per employee, and that there are no overlapping date ranges or gaps between the ranges. StartDate is made not nullable, since NULL is never a valid start date.
Then your join is pretty simple:
select h.num_hours, s.per_hour
from hours h
inner join salary s on h.employee_id = s.employee_id
and h.date >= s.StartDate and (h.date <= s.EndDate or s.EndDate is null)
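To show how the ranges could be kept contiguous, here is a rough sketch in Python with sqlite3. The `give_raise` helper is hypothetical and does in application code what the trigger described above would enforce in the database. Because the join compares with `h.date <= s.EndDate`, the sketch assumes EndDate is inclusive and sets it to the day before the new StartDate.

```python
import sqlite3


def give_raise(conn, employee_id, start_date, per_hour):
    """Close the open salary row (EndDate IS NULL), then open a new one."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE salary SET EndDate = date(?, '-1 day') "
            "WHERE employee_id = ? AND EndDate IS NULL",
            (start_date, employee_id),
        )
        conn.execute(
            "INSERT INTO salary (employee_id, StartDate, EndDate, per_hour) "
            "VALUES (?, ?, NULL, ?)",
            (employee_id, start_date, per_hour),
        )


# Sample data is invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salary (employee_id INTEGER, StartDate TEXT NOT NULL,
                         EndDate TEXT, per_hour REAL);
    CREATE TABLE hours (employee_id INTEGER, date TEXT, num_hours REAL);
    INSERT INTO hours VALUES (1, '2010-02-09', 2.0), (1, '2010-02-10', 3.0);
""")
give_raise(conn, 1, "2010-01-01", 10.0)
give_raise(conn, 1, "2010-02-10", 20.0)

# Same join as above; each hours row should match exactly one salary row.
rows = conn.execute("""
    select h.date, s.per_hour
    from hours h
    inner join salary s on h.employee_id = s.employee_id
    and h.date >= s.StartDate and (h.date <= s.EndDate or s.EndDate is null)
    order by h.date
""").fetchall()
```

If EndDate were instead set equal to the next StartDate, work logged on the day of a raise would match both rows with this join, so the comparison would need to become `h.date < s.EndDate`.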

Related

Data value on a given date

This time I have a table in a PostgreSQL database that contains the employee name, the date the employee started working, and the date the employee left the company. If the employee is still with the company, this field is null.
Knowing this, I would like to know how many people were working on a given date, for example:
I would like to know how many people worked at the company in January 2021.
I don't know where to start. In some attempts I got the number of hires and layoffs per month, but I need to show the accumulated value per month in another column.
I hope I made myself understood. Here is the last SQL I got:
select reference, sum(hires) from
(
select
date_trunc('month', date_hires) as reference,
count(*) as hires
from
ponto_mais_relatorio_colaboradores
group by
date_hires
union all
select
date_trunc('month', date_layoff) as reference,
count(*)*-1 as layoffs
from
ponto_mais_relatorio_colaboradores
group by
date_layoff
) as reference
join calendar_aux on calendar_aux.ano_mes = reference
group by reference
order by reference
Break the requirement down. The question is: how many people are employed on any given date? That includes everyone hired on or before that date who has no layoff date, plus everyone hired on or before it whose layoff date is later than the date you are interested in. For example, if you are interested in January, you still want to count an employee with a layoff date in February. With that in place, convert it into SQL: the condition above is a plain date comparison in a select. The other issue is that January is not a date but a range of dates, so you need each date in it. You can use generate_series to create each day in January, then join the generated dates with the selection from your table. The resulting query:
with jan_dates( jdate ) as
( select generate_series( date '2021-01-01'
, date '2021-01-31'
, interval '1' day
)::date
)
select jdate "Date", count(*) "Employees"
from jan_dates j
join employees e
on ( e.date_hires <= j.jdate
and ( e.date_layoff is null
or e.date_layoff > j.jdate
)
)
group by j.jdate
order by j.jdate;
Note: Not tested.
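Since the query above is marked as untested, here is a small runnable check of the same idea. It is only a sketch: it uses Python's sqlite3 module, where a recursive CTE stands in for PostgreSQL's generate_series, and the two employee rows are invented.

```python
import sqlite3


def headcount_by_day(conn, first_day, last_day):
    """Return [(day, employees)] for every day in the inclusive range."""
    sql = """
        WITH RECURSIVE days(d) AS (
            SELECT date(:first)
            UNION ALL
            SELECT date(d, '+1 day') FROM days WHERE d < date(:last)
        )
        SELECT days.d, COUNT(*)
        FROM days
        JOIN employees e
          ON e.date_hires <= days.d
         AND (e.date_layoff IS NULL OR e.date_layoff > days.d)
        GROUP BY days.d
        ORDER BY days.d
    """
    return conn.execute(sql, {"first": first_day, "last": last_day}).fetchall()


# Sample data is invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, date_hires TEXT, date_layoff TEXT);
    INSERT INTO employees VALUES ('A', '2020-06-01', NULL),
                                 ('B', '2021-01-10', '2021-01-20');
""")
january = dict(headcount_by_day(conn, "2021-01-01", "2021-01-31"))
```

With this data the count is 1 up to January 9th, 2 from the 10th through the 19th, and 1 again from the 20th, because the layoff day itself is not counted as a working day. Note that an inner join drops any day on which nobody is employed.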

Retrieving active employees by month in Postgres

In my employee database, I have a column hire_date which has the hiring date of an employee, and deactivate_date which has the date on which the employee was dismissed (or null if employee is still active).
To find the number of active employees at the beginning of any month, I can run the following query. For example, to see my active employees on 1st January 2019 -
SELECT count(*)
FROM employees
WHERE hire_date <= '2019-01-01' AND
(deactivate_date IS NULL OR deactivate_date > '2019-01-01')
Now, what I would like to know is the number of active employees on the 1st of every month of 2018. I can obviously run this query 12 times, but would like to know if there is a more efficient solution possible. It seems like the CROSSTAB and generate_series functions of pg will be useful, but I haven't been able to form the proper query.
Use generate_series():
SELECT gs.dte, count(e.hire_date)
FROM generate_series('2018-01-01'::date, '2018-12-01'::date, interval '1 month') gs(dte) LEFT JOIN
employees e
ON e.hire_date <= gs.dte AND
(e.deactivate_date IS NULL OR e.deactivate_date > gs.dte)
GROUP BY gs.dte
ORDER BY gs.dte;
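One detail in the query above is easy to miss: with a LEFT JOIN, `count(e.hire_date)` and `count(*)` differ for months with no active employees. The sketch below illustrates this with Python's sqlite3 module, using a recursive CTE as a stand-in for generate_series and a single invented employee.

```python
import sqlite3


def active_by_month(conn, first_month, last_month):
    """Return [(month_start, active_employees, joined_rows)] per month start."""
    sql = """
        WITH RECURSIVE months(dte) AS (
            SELECT date(:first)
            UNION ALL
            SELECT date(dte, '+1 month') FROM months WHERE dte < date(:last)
        )
        SELECT m.dte,
               COUNT(e.hire_date) AS active,      -- ignores the NULL-padded row
               COUNT(*)           AS joined_rows  -- counts it, so never below 1
        FROM months m
        LEFT JOIN employees e
          ON e.hire_date <= m.dte
         AND (e.deactivate_date IS NULL OR e.deactivate_date > m.dte)
        GROUP BY m.dte
        ORDER BY m.dte
    """
    return conn.execute(sql, {"first": first_month, "last": last_month}).fetchall()


# Sample data is invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (hire_date TEXT, deactivate_date TEXT);
    INSERT INTO employees VALUES ('2018-03-15', '2018-06-20');
""")
result = active_by_month(conn, "2018-01-01", "2018-12-01")
```

Months before the hire and after the deactivation still appear in the result, with an active count of 0; `count(*)` would report 1 for them.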

Is there a way to select rows in database per day?

I have to create a SQL query which returns the employees with 0 absences and 0 hours worked per day within an interval chosen by a user. I have my view v_personnel (rows about the employees), exp_mat_abs (rows about the absences of my employees) and exp_mat_mo (rows about the hours worked by the employees). I am stuck: I can't find a way to get the employees with 0 absences and 0 hours worked per day within the chosen interval.
Can someone help me with it?
So you have a record in the employees table for each employee, but there will be no record in hours worked or hours absent if the non-working employee was not absent. Aside from the slight contradiction in how someone can not work but also not be absent :) (I assume you mean they're on some kind of no-work-no-pay deal), what you want is:
employees
left join hoursworked on ...
left join absent on ...
Employees with zero absences/hours will have NULL in every column from each of the hours/absence tables.
If you just want these employees, you can filter with:
WHERE hoursworkedtable.primarykeycolumnname IS NULL
If you want to turn the null into zero you can:
SELECT COALESCE(somenullcolumn, 0) as somenullcolumn
The only other thing I wanted to point out is that hours worked and hours absent aren't necessarily related concepts. You might want to group each of them up into, e.g., a time period before you join, and join on both the employee and the day; otherwise you'll end up with a Cartesian product. Basically, find some way of reducing the rows coming out of the worked and absence tables so there is only one row from each per employee (per time period) before you join:
employees
left join (select ... from hoursworked group by...) on ...
left join (select ... from absent group by...) on ...
If you don't, then multiple rows from either of them, per employee, will cause data from the other table to duplicate.
Edit: so you need a row per employee per day:
employee e
cross join
generate_series('2019-01-10'::timestamp, '2019-01-21'::timestamp, '1 day'::interval) d
left outer join (select empid, thedate, ... from hoursworked group by empid, thedate) w
on w.empid = e.empid and w.thedate = d.d
left outer join (select empid, thedate, ... from absent group by empid, thedate) a
on a.empid = e.empid and a.thedate = d.d
We ask generate_series to make us a list of dates between the 10th and the 21st, and cross join the employees to it. This gives us one row per employee per day. Now we can left join our hours and absences onto this and get hours worked per day (or null if they didn't work) and absences per day (or null if they weren't absent).
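Putting the pieces together, the sketch below builds the employee-per-day grid, left joins the pre-aggregated hours and absences, and keeps only the days where both are missing. It is a hedged illustration in Python with sqlite3: the table and column names (`employee`, `hoursworked`, `absent`, `empid`, `thedate`, `hours`) are placeholders rather than the asker's real schema, and a recursive CTE stands in for generate_series.

```python
import sqlite3


def idle_days(conn, first_day, last_day):
    """Return [(empid, day)] where the employee has no hours and no absence."""
    sql = """
        WITH RECURSIVE days(d) AS (
            SELECT date(:first)
            UNION ALL
            SELECT date(d, '+1 day') FROM days WHERE d < date(:last)
        ),
        -- collapse each table to one row per employee per day before joining
        w AS (SELECT empid, thedate, SUM(hours) AS hours
              FROM hoursworked GROUP BY empid, thedate),
        a AS (SELECT empid, thedate, SUM(hours) AS hours
              FROM absent GROUP BY empid, thedate)
        SELECT e.empid, days.d
        FROM employee e
        CROSS JOIN days
        LEFT JOIN w ON w.empid = e.empid AND w.thedate = days.d
        LEFT JOIN a ON a.empid = e.empid AND a.thedate = days.d
        WHERE w.empid IS NULL AND a.empid IS NULL
        ORDER BY e.empid, days.d
    """
    return conn.execute(sql, {"first": first_day, "last": last_day}).fetchall()


# Sample data is invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (empid INTEGER);
    CREATE TABLE hoursworked (empid INTEGER, thedate TEXT, hours REAL);
    CREATE TABLE absent (empid INTEGER, thedate TEXT, hours REAL);
    INSERT INTO employee VALUES (1), (2);
    INSERT INTO hoursworked VALUES (1, '2019-01-10', 4), (1, '2019-01-10', 4),
                                   (2, '2019-01-11', 8);
    INSERT INTO absent VALUES (1, '2019-01-11', 8);
""")
idle = idle_days(conn, "2019-01-10", "2019-01-12")
```

Employee 1 has two hours rows on the 10th; grouping first means they count as a single worked day instead of duplicating the absence columns.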

SQL Server one-to-many relational IF query

I have an employees table that has many employee_records. The employee_records table has a column named event_type, an enum that can be either hire-date, promotion, termination, title-change, or rehire. I am attempting to calculate an employee's total time employed and make some calculations based on how long they have been employed.
How can I add a column that gives me the total days that they have been employed?
Essentially, I need to see if they have a record with the event_type = termination and if they do, then I need to see if they have a rehire date, and if they do, then I need to use their rehire date as the first day of their employment and calculate their time of employment that way.
As for the result, I simply need a days_employed column that reflects the actual number of days they have been employed.
Here is what I have so far.
SELECT
employees.id,
first_name,
last_name,
email,
event_type,
CASE
WHEN DATEDIFF(DAY, employee_records.created_at, SYSDATETIME()) < 365 * 5
THEN 1
WHEN DATEDIFF(DAY, employee_records.created_at, SYSDATETIME()) < 365 * 10
THEN 2
ELSE 3
END AS benfits_type,
DATEDIFF(DAY, employee_records.created_at, SYSDATETIME()) AS days_employed,
employee_records.created_at AS hire_date
FROM
employees
JOIN
employee_records ON employees.id = employee_records.employee_id
ORDER BY
employees.id ASC;
Here is an example of how you could do this. I'll post the query first and then walk through my explanation. If I understood you correctly, you were not looking for total days the employee was hired, but rather the total days of the employee's most recent employment at the company (Max hire date to max termination date or today).
;WITH hired
AS (SELECT employee_records.employee_id id,
Max(employee_records.created_at) created_at
FROM employee_records
WHERE event_type IN ('hire-date', 'rehire')
GROUP BY employee_records.employee_id),
latest
AS (SELECT employees.id
id,
Isnull(Cast(Max(employee_records.created_at) AS DATE), Getdate()
)
created_at
FROM employees
LEFT JOIN employee_records
ON employees.id = employee_records.employee_id
AND employee_records.event_type = 'termination'
GROUP BY employees.id)
SELECT *,
Datediff(day, hired, latestday) DaysEmployeed
FROM (SELECT hired.id,
hired.created_at AS Hired,
CASE
WHEN hired.created_at > latest.created_at THEN Cast(
Getdate() AS DATE)
ELSE latest.created_at
END AS LatestDay
FROM hired
INNER JOIN latest
ON hired.id = latest.id) JoinedCTEs
First of all, I know you mentioned event_type is an enum, but for ease of explanation I used a varchar.
Two CTEs to start.
First there is "hired" which will get you the latest hire date. So if an employee has multiple hire dates, it grabs the latest date.
Second there is "latest" which is the latest date an employee has a termination date, but also uses today's date as a placeholder date if an employee has never been terminated.
The final query joins the two CTEs and does a datediff by day to determine how many days an employee has been at the company. If the termination date is earlier than the hire date (An employee who was hired, terminated, rehired and is still with the company), it will take today's date as the latest date to count.
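The decision logic described above can be restated outside SQL, which makes the edge cases easier to check. The following Python sketch is an illustration only: the `days_employed` function is hypothetical, and counting a `rehire` event as the start of a new stint is an assumption about how rehires are recorded.

```python
from datetime import date

# Assumption: a rehire starts a new stint, just like an initial hire.
START_EVENTS = {"hire-date", "rehire"}


def days_employed(events, today):
    """Days in the most recent stint; events is [(event_type, date)] for one employee."""
    starts = [d for kind, d in events if kind in START_EVENTS]
    if not starts:
        return None  # no hire record; the INNER JOIN would drop this employee
    hired = max(starts)  # the "hired" CTE: latest start date
    terminations = [d for kind, d in events if kind == "termination"]
    # the "latest" CTE: latest termination, or today if never terminated
    latest = max(terminations) if terminations else today
    if hired > latest:  # terminated earlier, then hired again: still employed
        latest = today
    return (latest - hired).days


today = date(2020, 1, 31)
still_here = days_employed([("hire-date", date(2020, 1, 1))], today)
left = days_employed(
    [("hire-date", date(2020, 1, 1)), ("termination", date(2020, 1, 11))], today
)
came_back = days_employed(
    [
        ("hire-date", date(2019, 1, 1)),
        ("termination", date(2019, 6, 1)),
        ("rehire", date(2020, 1, 1)),
    ],
    today,
)
```

The three calls cover an employee who never left, one who was terminated, and one who was terminated and later rehired.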

How to return a dataset for next 30 days group by each day

I have a query like this that returns today's salary for an employee:
select salary
from employee_salary
where sysdate between start_date and end_date
I want to write a query which returns the salary of each employee, grouped by day, for the next 30 days (the salary shown for each day comes from the number of hours worked). The real question is how to evaluate
sysdate+1 between start_date and end_date
sysdate+2 between start_date and end_date
in the same query, so that it generates the sum of salary for each day worked.
So the dataset will look like:
date name salary
sysdate+1 emp1 100
sysdate+1 emp2 90
sysdate+2 emp1 30
...................
sysdate+30 emp1 130
Please note that sysdate+x actually returns a date. How can I modify my query to return data like this for the next 30 days?
You need to generate a list of the next thirty days. Join the calendar to the employee_salary table, filtering by the date range. Sum the salaries for each day using the aggregate syntax.
So:
with cal as (
select sysdate+level as dt
from dual
connect by level <= 30
)
select cal.dt as "date"
, sum(es.salary) as sal_daily_total
from cal
left outer join employee_salary es
on es.start_date <= cal.dt
and es.end_date >= cal.dt
group by cal.dt
The obvious snag is that the total will include salary calculated for weekends and public holidays, if the employee_salary date range spans them. If that is a problem, please edit your question to clarify your requirements.
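If weekends do turn out to be a problem, one possible approach is to filter them out of the generated calendar before aggregating. This is a sketch under assumptions, not the Oracle solution itself: it runs on Python's sqlite3 module, a recursive CTE replaces `connect by level`, the sample rows are invented, and public holidays are not handled (they would need a holiday table).

```python
import sqlite3


def weekday_totals(conn, today, days=30):
    """Return [(date, total_salary)] for the next `days` days, skipping Sat/Sun."""
    sql = """
        WITH RECURSIVE cal(dt, n) AS (
            SELECT date(:today, '+1 day'), 1
            UNION ALL
            SELECT date(dt, '+1 day'), n + 1 FROM cal WHERE n < :days
        )
        SELECT cal.dt, SUM(es.salary)
        FROM cal
        LEFT JOIN employee_salary es
          ON es.start_date <= cal.dt
         AND es.end_date >= cal.dt
        -- strftime('%w') is the day of week, with 0 = Sunday and 6 = Saturday
        WHERE strftime('%w', cal.dt) NOT IN ('0', '6')
        GROUP BY cal.dt
        ORDER BY cal.dt
    """
    return conn.execute(sql, {"today": today, "days": days}).fetchall()


# Sample data is invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_salary (name TEXT, salary INTEGER,
                                  start_date TEXT, end_date TEXT);
    INSERT INTO employee_salary VALUES
        ('emp1', 100, '2024-01-01', '2024-01-05'),
        ('emp2',  90, '2024-01-05', '2024-01-31');
""")
# 2024-01-04 is a Thursday, so the next four days are Fri, Sat, Sun, Mon.
totals = weekday_totals(conn, "2024-01-04", days=4)
```

Only the Friday and the Monday survive the filter. A weekday with no matching salary rows is still returned, with a NULL total, because of the left join.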