How to sum the hours using two Date Fields and group them by the user id in SQL - sql

The task seems straightforward, but I am having a hard time getting the query to do what I want.
Here is a table in my database:
ID |Empl_Acc_ID |CheckIn |CheckOut |WeekDay
----------------------------------------------------------------------------
1 | 1 | 2017-09-24 08:03:02.143 | 2017-09-24 12:00:00.180 | Sun
2 | 1 | 2017-09-24 13:02:23.457 | 2017-09-24 17:01:02.640 | Sun
3 | 2 | 2017-09-24 08:05:23.457 | 2017-09-24 13:01:02.640 | Mon
4 | 2 | 2017-09-24 14:05:23.457 | 2017-09-24 17:00:02.640 | Mon
5 | 3 | 2017-09-24 07:05:23.457 | 2017-09-24 11:30:02.640 | Tue
6 | 3 | 2017-09-24 12:31:23.457 | 2017-09-24 16:01:02.640 | Tue
and so on....
I want to group Empl_Acc_ID by the same date and sum up the total hours each employee worked that day. Each employee could have either one or more records per day depending on how many breaks he/she took that day.
For example if Empl_Acc_ID (2) worked 3 different days with one break, the table will contain 6 records for that person but in my query I want to see 3 records with the total hours they worked each day.
Here is how I constructed the query:
select distinct w.Empl_Acc_ID, ws.fullWorkDayHours
from Work_Schedule as w
INNER JOIN (
SELECT Empl_Acc_ID, fullWorkDayHours = Sum(DATEDIFF(hour, w.CheckIn, w.CheckOut))
from Work_Schedule w
GROUP BY Empl_Acc_ID
) ws on w.Empl_Acc_ID = ws.Empl_Acc_ID
This query does not quite give me what I need. It returns only the total hours per employee across all the days they worked. Also, this query has only 2 columns, but I want to see more. When I tried adding more columns, the records were no longer distinct by Empl_Acc_ID.
What is wrong with the query?
Thank you

You do not need to self-join the table in this case; just group by the CheckIn datetime cast to a date.
create table Work_Schedule (
ID TINYINT,
Empl_Acc_ID TINYINT,
CheckIn DATETIME,
CheckOut DATETIME,
WeekDay CHAR(3)
);
INSERT INTO Work_Schedule VALUES (1, 1,'2017-09-24 08:03:02.143','2017-09-24 12:00:00.180','Sun');
INSERT INTO Work_Schedule VALUES (2, 1,'2017-09-24 13:02:23.457','2017-09-24 17:01:02.640','Sun');
INSERT INTO Work_Schedule VALUES (3, 2,'2017-09-24 08:05:23.457','2017-09-24 13:01:02.640','Mon');
INSERT INTO Work_Schedule VALUES (4, 2,'2017-09-24 14:05:23.457','2017-09-24 17:00:02.640','Mon');
INSERT INTO Work_Schedule VALUES (5, 3,'2017-09-24 07:05:23.457','2017-09-24 11:30:02.640','Tue');
INSERT INTO Work_Schedule VALUES (6, 3,'2017-09-24 12:31:23.457','2017-09-24 16:01:02.640','Tue');
SELECT w.Empl_Acc_ID,
CAST(CheckIn AS DATE) [date],
SUM(DATEDIFF(hour, w.CheckIn, w.CheckOut)) fullWorkDayHours
FROM Work_Schedule w
GROUP BY w.Empl_Acc_ID, CAST(CheckIn AS DATE)
DROP TABLE Work_Schedule;
Empl_Acc_ID date fullWorkDayHours
1 2017-09-24 8
2 2017-09-24 8
3 2017-09-24 8
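One caveat on the result above: DATEDIFF(hour, ...) counts the hour boundaries crossed, not elapsed time, so each shift is rounded to whole hours. If fractional hours matter, one option in SQL Server is to sum DATEDIFF(minute, ...) and divide by 60.0. The sketch below shows the same idea in a runnable form using Python's built-in sqlite3 module. SQLite and its julianday() function are stand-ins chosen for illustration, not the SQL Server syntax, and only employee 1's two rows are loaded.

```python
import sqlite3

# In-memory SQLite database used purely for illustration (assumption:
# the real table lives in SQL Server, where DATEDIFF would be used).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Work_Schedule ("
    "ID INTEGER, Empl_Acc_ID INTEGER, CheckIn TEXT, CheckOut TEXT)"
)
conn.executemany(
    "INSERT INTO Work_Schedule VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2017-09-24 08:03:02.143", "2017-09-24 12:00:00.180"),
        (2, 1, "2017-09-24 13:02:23.457", "2017-09-24 17:01:02.640"),
    ],
)

# julianday() returns fractional days, so the difference times 24 is the
# elapsed time in hours; grouping by employee and calendar date gives one
# row per employee per day.
rows = conn.execute(
    """
    SELECT Empl_Acc_ID,
           date(CheckIn) AS work_date,
           SUM((julianday(CheckOut) - julianday(CheckIn)) * 24.0) AS hours_worked
    FROM Work_Schedule
    GROUP BY Empl_Acc_ID, date(CheckIn)
    """
).fetchall()

for empl_id, work_date, hours_worked in rows:
    print(empl_id, work_date, round(hours_worked, 2))
```

For employee 1 this gives a little under 8 hours, whereas the boundary-counting version reports exactly 8.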

Try this. You just have to group by date and employee account.
select Employee.Empl_Acc_ID, FirstName, LastName, Username,
convert(varchar(10), checkin, 101) as checkin,
convert(varchar(10), checkout, 101) as checkout,
sum(datediff(hour, checkin, checkout)) as hours
from Employee
inner join Employee_Account on Employee.Empl_Acc_ID = Employee_Account.Empl_Acc_ID
inner join Work_Schedule on Employee_Account.Empl_Acc_ID = Work_Schedule.Empl_Acc_ID
group by convert(varchar(10), checkin, 101), convert(varchar(10), checkout, 101),
Employee.Empl_Acc_ID, FirstName, LastName, Username
order by Employee.Empl_Acc_ID

You do not group by date, that's the issue:
SELECT DISTINCT w.Empl_Acc_ID, ws.fullWorkDayHours, ws.CheckInDate
FROM Work_Schedule as w
INNER JOIN (
SELECT Empl_Acc_ID, CAST(w.CheckIn AS DATE) AS [CheckInDate],
fullWorkDayHours = Sum(DATEDIFF(hour, w.CheckIn, w.CheckOut))
from Work_Schedule w
GROUP BY Empl_Acc_ID, CAST(w.CheckIn AS DATE)
) ws on w.Empl_Acc_ID = ws.Empl_Acc_ID

No need for a self-join; it works fine without one:
Select distinct Empl_Acc_ID, Sum(DATEDIFF(hour,CheckIN,CheckOut)) As FullDayWorkHours
from EMP2
where DATEPART(day,CheckIn)=DATEPART(day,CheckOut)
Group By Empl_Acc_ID

Related

Find missing rows between three related tables

I have three tables:
Person
person_id
-------------
10001
10002
10003
10004
Dates
date_type date
-------------- -----------------------
PUBLIC_HOLIDAY 2020-04-10 00:00:00.000
PUBLIC_HOLIDAY 2020-04-13 00:00:00.000
Absence
person_id date absence_type
--------- ----------------------- ------------
10001 2020-04-10 00:00:00.000 HOLIDAY
10001 2020-04-13 00:00:00.000 HOLIDAY
10002 2020-04-10 00:00:00.000 HOLIDAY
10003 2020-04-13 00:00:00.000 HOLIDAY
I need to find every person_id in the Person table, together with each date from the Dates table, for which no absence has been booked that matches the following criteria:
Dates.date_type = 'PUBLIC_HOLIDAY'
Absence.absence_type = 'HOLIDAY'
Basically, I need to find the people and the public-holiday dates for which they have not booked a HOLIDAY absence.
You can try the logic below:
SELECT Person.person_id, Dates.dat, ISNULL(CONVERT(varchar(23), Absence.dat, 121), 'Not Booked') AS booked
FROM Dates
CROSS JOIN Person
LEFT JOIN Absence ON Person.person_id = Absence.person_id AND Dates.dat = Absence.dat
WHERE Dates.date_type = 'PUBLIC_HOLIDAY'
If you want only the rows that are not booked, add the line below to the script:
AND Absence.dat IS NULL
I think that you want a cross join to generate all combinations of persons and dates, and then not exists to filter on those that do not exist in the absence table:
select p.*, d.*
from person p
cross join dates d
where
d.date_type = 'PUBLIC_HOLIDAY'
and not exists (
select 1
from absence a
where a.person_id = p.person_id and a.date = d.date and a.absence_type = 'HOLIDAY'
)
Try:
select distinct person_id from absence a
where absence_type = 'HOLIDAY'
and not exists (select 1 from dates
where date = a.date
and date_type = 'PUBLIC_HOLIDAY')
union all
select person_id from person p
where not exists ( select 1 from absence
where p.person_id = person_id)
If you want to have them with dates, use below query:
select person_id, date from absence a
where absence_type = 'HOLIDAY'
and not exists (select 1 from dates
where date = a.date
and date_type = 'PUBLIC_HOLIDAY')
union all
-- the person table has no dates
select person_id, null from person p
where not exists ( select 1 from absence
where p.person_id = person_id)
You can use the query below. I have tested it in SQL Server 2014.
CREATE TABLE #Person(person_id INT)
INSERT INTO #person
values
(10001),
(10002),
(10003),
(10004);
CREATE TABLE #Dates (date_Type VARCHAR(50), [datevalue] datetime)
INSERT INTO #Dates
VALUES
('PUBLIC_HOLIDAY','2020-04-10 00:00:00.000'),
('PUBLIC_HOLIDAY','2020-04-13 00:00:00.000');
CREATE TABLE #Absence (person_id int, datevalue datetime, absence_type VARCHAR(50))
INSERT INTO #Absence
VALUES
(10001,'2020-04-10 00:00:00.000','HOLIDAY'),
(10001,'2020-04-13 00:00:00.000','HOLIDAY'),
(10002,'2020-04-10 00:00:00.000','HOLIDAY'),
(10003,'2020-04-13 00:00:00.000','HOLIDAY');
SELECT p.person_id, od.datevalue
FROM #Person AS p
CROSS JOIN (SELECT * FROM #Dates WHERE date_type ='PUBLIC_HOLIDAY') AS od
WHERE NOT EXISTS
(
SELECT 1 FROM
#Absence AS a
INNER JOIN #Dates AS d
ON a.datevalue = d.datevalue
WHERE a.absence_type = 'Holiday' AND d.date_type = 'PUBLIC_HOLIDAY'
AND a.person_id = p.person_id and d.datevalue = od.datevalue)
Below is the resultset:
+-----------+-------------------------+
| person_id | datevalue |
+-----------+-------------------------+
| 10003 | 2020-04-10 00:00:00.000 |
| 10004 | 2020-04-10 00:00:00.000 |
| 10002 | 2020-04-13 00:00:00.000 |
| 10004 | 2020-04-13 00:00:00.000 |
+-----------+-------------------------+

Avoid cartesian product using sum

I want to sum up the stake from tickets table, grouping it by customer_id and date_trunc('day') from bonus table.
The problem is that rows are being multiplied and I don't know how to solve it.
https://www.db-fiddle.com/f/yWCvFamMAY9uGtoZupiAQ/4
CREATE TABLE tickets (
ticket_id integer,
customer_id integer,
stake integer,
reg_date date
);
CREATE TABLE bonus (
bonus_id integer,
customer_id integer,
reg_date date
);
insert into tickets
values
(1,100, 12,'2019-01-10 11:00'),
(2,100, 10,'2019-01-10 12:00'),
(3,100, 30,'2019-01-10 13:00'),
(4,100, 10,'2019-01-11 14:00'),
(5,100, 15,'2019-01-11 15:00'),
(6,102, 25,'2019-01-10 10:00'),
(7,102, 25,'2019-01-10 11:10'),
(8,102, 13,'2019-01-11 12:40'),
(9,102, 9,'2019-01-12 15:00'),
(10,102, 7,'2019-01-13 18:00'),
(13,103, 15,'2019-01-12 19:00'),
(14,103, 11,'2019-01-12 22:00'),
(15,103, 11,'2019-01-14 02:00'),
(16,103, 11,'2019-01-14 10:00')
;
insert into bonus
values
(200,100,'2019-01-10 05:00'),
(201,100,'2019-01-10 06:00'),
(202,100,'2019-01-10 15:00'),
(203,100,'2019-01-10 15:50'),
(204,100,'2019-01-10 16:10'),
(205,100,'2019-01-10 16:15'),
(206,100,'2019-01-10 16:22'),
(207,100,'2019-01-11 10:10'),
(208,100,'2019-01-11 16:10'),
(209,102,'2019-01-10 10:00'),
(210,102,'2019-01-10 11:00'),
(211,102,'2019-01-10 12:00'),
(212,102,'2019-01-10 13:00'),
(213,103,'2019-01-11 11:00'),
(214,103,'2019-01-11 18:00'),
(215,103,'2019-01-12 15:00'),
(216,103,'2019-01-12 16:00'),
(217,103,'2019-01-14 02:00')
select
customer_id,
date_trunc('day', b.reg_date),
sum(t.stake)
from tickets t
join bonus b using (customer_id)
where date_trunc('day', b.reg_date) = date_trunc('day', t.reg_date)
group by 1,2
order by 1
Output for customer 102 should be:
102,2019-01-10, 50
If I understand correctly, you want the sum of stake from the tickets table for each (customer_id, reg_date) pair that also appears in the bonus table, and bonus_id plays no part in the result. Is that right? The (customer_id, reg_date) pairs in bonus are duplicated, so you need a DISTINCT on them before joining to the summed data from tickets. The complete SQL and its result are below:
with stake_sum as (
select
customer_id,
reg_date,
sum(stake)
from
tickets
group by
customer_id,
reg_date
)
,bonus_date_distinct as (
select
distinct customer_id,
reg_date
from
bonus
)
select
a.*
from
stake_sum a
join
bonus_date_distinct b on a.customer_id = b.customer_id and a.reg_date = b.reg_date order by customer_id, reg_date;
customer_id | reg_date | sum
-------------+------------+-----
100 | 2019-01-10 | 52
100 | 2019-01-11 | 25
102 | 2019-01-10 | 50
103 | 2019-01-12 | 26
103 | 2019-01-14 | 22
(5 rows)
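To see why the original join inflates the totals, here is a small runnable sketch using Python's built-in sqlite3 module (SQLite stands in for PostgreSQL here, and only customer 102's rows are loaded). It compares the plain join with a variant that filters through EXISTS, which is an alternative to the DISTINCT approach above rather than the same query: the bonus table is only used to test for a match, so its duplicate rows cannot multiply the tickets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tickets (ticket_id INTEGER, customer_id INTEGER,
                          stake INTEGER, reg_date TEXT);
    CREATE TABLE bonus (bonus_id INTEGER, customer_id INTEGER, reg_date TEXT);
    INSERT INTO tickets VALUES
        (6, 102, 25, '2019-01-10'),
        (7, 102, 25, '2019-01-10'),
        (8, 102, 13, '2019-01-11');
    INSERT INTO bonus VALUES
        (209, 102, '2019-01-10'),
        (210, 102, '2019-01-10'),
        (211, 102, '2019-01-10'),
        (212, 102, '2019-01-10');
    """
)

# Plain join: each of the 2 tickets on 2019-01-10 matches all 4 bonus rows,
# so every stake is counted 4 times.
naive = conn.execute(
    """
    SELECT t.customer_id, t.reg_date, SUM(t.stake)
    FROM tickets t
    JOIN bonus b ON b.customer_id = t.customer_id AND b.reg_date = t.reg_date
    GROUP BY t.customer_id, t.reg_date
    """
).fetchall()

# Semi-join: EXISTS only asks whether a matching bonus row exists, so each
# ticket contributes to the sum once.
fixed = conn.execute(
    """
    SELECT t.customer_id, t.reg_date, SUM(t.stake)
    FROM tickets t
    WHERE EXISTS (SELECT 1 FROM bonus b
                  WHERE b.customer_id = t.customer_id
                    AND b.reg_date = t.reg_date)
    GROUP BY t.customer_id, t.reg_date
    """
).fetchall()

print(naive)  # [(102, '2019-01-10', 200)]
print(fixed)  # [(102, '2019-01-10', 50)]
```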

Selecting the most recent date

I have data structured like this:
ID | Enrolment_Date | Appointment1_Date | Appointment2_Date | .... | Appointment150_Date |
112 01/01/2015 01/02/2015 01/03/2018 01/08/2018
113 01/06/2018 01/07/2018 NULL NULL
114 01/04/2018 01/05/2018 01/06/2018 NULL
I need a new variable that counts the number of months between the enrolment_date and the most recent appointment. The challenge is that each individual has a different number of appointments.
Update: I agree with the comments that this is poor table design and it needs to be reformatted. Could proposed solutions please include suggested code on how to transform the table?
Since the OP is currently stuck with this bad design, I will point out a temporary solution. As others have suggested, you really must change the structure here. For now, this will suffice:
SELECT '['+ NAME + '],' FROM sys.columns WHERE OBJECT_ID = OBJECT_ID ('TableA') -- find all columns, last one probably max appointment date
SELECT ID,
Enrolment_Date,
CASE WHEN Appointment150_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment150_Date)
WHEN Appointment149_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment149_Date)
WHEN Appointment148_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment148_Date)
WHEN Appointment147_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment147_Date)
WHEN Appointment146_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment146_Date)
WHEN Appointment145_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment145_Date)
WHEN Appointment144_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment144_Date) -- and so on
END AS NumberOfMonths
FROM TableA
This is a very ugly temporary solution and should be considered as such.
You will need to restructure your data, the given structure is poor database design. Create two separate tables - one called users and one called appointments. The users table contains the user id, enrollment date and any other specific user information. Each row in the appointments table contains the user's unique id and a specific appointment date. Structuring your tables like this will make it easier to write a query to get days/months since last appointment.
For example:
Users Table:
ID, Enrollment_Date
1, 2018-01-01
2, 2018-03-02
3, 2018-05-02
Appointments Table:
ID, Appointment_Date
1, 2018-01-02
1, 2018-02-02
1, 2018-02-10
2, 2018-05-01
You would then be able to write a query that joins the two tables and calculates the difference between the enrollment date and the max value of the appointment date (the most recent appointment).
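A minimal sketch of that two-table layout, using Python's built-in sqlite3 module so it can be run as-is. The table and column names (users, appointments, user_id) are assumptions for illustration, and SQLite has no DATEDIFF, so the month count is built from strftime() year and month parts, which mirrors how DATEDIFF(month, ...) counts month boundaries. Since the question asks for the most recent appointment, the sketch takes MAX of the appointment date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, enrollment_date TEXT);
    CREATE TABLE appointments (user_id INTEGER, appointment_date TEXT);
    INSERT INTO users VALUES
        (1, '2018-01-01'), (2, '2018-03-02'), (3, '2018-05-02');
    INSERT INTO appointments VALUES
        (1, '2018-01-02'), (1, '2018-02-02'), (1, '2018-02-10'),
        (2, '2018-05-01');
    """
)

# LEFT JOIN keeps users with no appointments; for them MAX() is NULL and
# the month count comes back as NULL (None in Python).
months = conn.execute(
    """
    SELECT u.id,
           (CAST(strftime('%Y', MAX(a.appointment_date)) AS INTEGER)
              - CAST(strftime('%Y', u.enrollment_date) AS INTEGER)) * 12
         + (CAST(strftime('%m', MAX(a.appointment_date)) AS INTEGER)
              - CAST(strftime('%m', u.enrollment_date) AS INTEGER)) AS months_diff
    FROM users u
    LEFT JOIN appointments a ON a.user_id = u.id
    GROUP BY u.id, u.enrollment_date
    ORDER BY u.id
    """
).fetchall()

print(months)  # [(1, 1), (2, 2), (3, None)]
```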
It is better if you can create two tables.
Enrolment Table (dbo.Enrolments)
ID | EnrolmentDate
1 | 2018-08-30
2 | 2018-08-31
Appointments Table (dbo.Appointments)
ID | EnrolmentID | AppointmentDate
1 | 1 | 2018-09-02
2 | 1 | 2018-09-03
3 | 2 | 2018-09-01
4 | 2 | 2018-09-03
Then you can try something like this.
If you want the count of months from the enrolment date to the final appointment date, use the query below.
SELECT E.ID, E.EnrolmentDate, A.NoOfMonths
FROM dbo.Enrolments E
OUTER APPLY
(
SELECT DATEDIFF(mm, E.EnrolmentDate, MAX(A.AppointmentDate)) AS NoOfMonths
FROM dbo.Appointments A
WHERE A.EnrolmentId = E.ID
) A
If you want the count of months from the enrolment date to the nearest appointment date instead, use the query below.
SELECT E.ID, E.EnrolmentDate, A.NoOfMonths
FROM dbo.Enrolments E
OUTER APPLY
(
SELECT DATEDIFF(mm, E.EnrolmentDate, MIN(A.AppointmentDate)) AS NoOfMonths
FROM dbo.Appointments A
WHERE A.EnrolmentId = E.ID
) A
You have a lousy data structure, as others have noted. You really want a table with one row per appointment. After all, what happens after the 150th appointment?
select t.id, t.Enrolment_Date,
datediff(month, t.Enrolment_Date, m.max_Appointment_Date) as months_diff
from t cross apply
(select max(Appointment_Date) as max_Appointment_Date
from (values (Appointment1_Date),
(Appointment2_Date),
. . .
(Appointment150_Date)
) v(Appointment_Date)
) m;

How to apply randomly selected values to distinct dates in SQL Server

I have a table showing available dates for some staff, with two fields (staffid and date) and data that looks like this:
staffid date
1 2016-01-01
1 2016-01-02
1 2016-01-03
2 2016-01-03
3 2016-01-01
3 2016-01-03
I need to generate a list of DISTINCT available dates from this table, where the staff member assigned to each date is chosen randomly. I know how to select rows based on one distinct field, but that always picks the row according to a given order in the table (for example, staff 1 for January 1). I need the selection to be random, so that sometimes staff 1 is selected for that date and sometimes staff 3.
The result needs to be ordered by date.
Try this:
-- test data
create table your_table (staffid int, [date] date);
insert into your_table values
(1, '2016-01-01'),
(1, '2016-01-02'),
(1, '2016-01-03'),
(2, '2016-01-03'),
(3, '2016-01-01'),
(3, '2016-01-03');
-- query
select *
from (
select distinct [date] [distinct_date] from your_table
) as d
outer apply (
select top 1 staffid
from your_table
where d.[distinct_date] = [date]
order by newid()
) as x
-- result 1
distinct_date staffid
-----------------------
2016-01-01 3
2016-01-02 1
2016-01-03 1
-- result 2
distinct_date staffid
-----------------------
2016-01-01 1
2016-01-02 1
2016-01-03 2
hope it helps :)
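For anyone who wants to try the pattern without a SQL Server instance, here is a rough equivalent using Python's built-in sqlite3 module. SQLite has no OUTER APPLY or newid(), so this sketch swaps in a correlated subquery with ORDER BY random() LIMIT 1; the table name availability and column name avail_date are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE availability (staffid INTEGER, avail_date TEXT)")
conn.executemany(
    "INSERT INTO availability VALUES (?, ?)",
    [
        (1, "2016-01-01"), (1, "2016-01-02"), (1, "2016-01-03"),
        (2, "2016-01-03"), (3, "2016-01-01"), (3, "2016-01-03"),
    ],
)

# One row per distinct date; the correlated subquery shuffles the staff
# available on that date and keeps the first one.
picks = conn.execute(
    """
    SELECT d.avail_date,
           (SELECT a.staffid
            FROM availability a
            WHERE a.avail_date = d.avail_date
            ORDER BY random()
            LIMIT 1) AS staffid
    FROM (SELECT DISTINCT avail_date FROM availability) AS d
    ORDER BY d.avail_date
    """
).fetchall()

for avail_date, staffid in picks:
    print(avail_date, staffid)
```

The staff chosen for 2016-01-01 and 2016-01-03 can differ from run to run, while 2016-01-02 always gets staff 1 because nobody else is available that day.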

Finding a sql query to get the latest associated date for each grouping

I have a sql table of payroll data that has wage rates and effective dates associated with those wage rates, as well as hours worked on various dates. It looks somewhat like this:
EMPID DateWorked Hours WageRate EffectiveDate
1 1/1/2010 10 7.00 6/1/2009
1 1/1/2010 10 7.25 6/10/2009
1 1/1/2010 10 8.00 2/1/2010
1 1/10/2010 ...
2 1/1/2010 ...
...
And so on. Basically, the data has been combined in such a way that for every day worked, all of the employee's wage history is joined together, and I want to grab the wage rate associated with the LATEST effective date that is not later than the date worked. So in the example above, the rate of 7.25 that became effective on 6/10/2009 is what I want.
What kind of query can I put together for this? I can use MAX(EffectiveDate) along with a criterion that it falls on or before the work date, but that only gives me the latest date itself, and I want the associated wage. I am using SQL Server for this.
Alternatively, I have the original tables that were used to create this data. One of them contains the dates worked, and the hours as well as EMPID, the other contains the list of wage rates and effective dates. Is there a way to join these instead that would correctly apply the right wage rate for each work day?
I was thinking that I'd want to group by EMPID and then DateWorked, and do something from there. I want a result that gives me the wage rate that is actually the latest effective rate for each date worked.
select p.*
from (
select EMPID, DateWorked, Max(EffectiveDate) as MaxEffectiveDate
from Payroll
where EffectiveDate <= DateWorked
group by EMPID, DateWorked
) pm
inner join Payroll p on pm.EMPID = p.EMPID and pm.DateWorked = p.DateWorked and pm.MaxEffectiveDate = p.EffectiveDate
Output:
EMPID DateWorked Hours WageRate EffectiveDate
----------- ----------------------- ----------- --------------------------------------- -----------------------
1 2010-01-01 00:00:00.000 10 7.25 2009-06-10 00:00:00.000
Try this:
DECLARE @YourTable table (EMPID int, DateWorked datetime, Hours int
,WageRate numeric(6,2), EffectiveDate datetime)
INSERT INTO @YourTable VALUES (1,'1/1/2010' ,10, 7.00, '6/1/2009')
INSERT INTO @YourTable VALUES (1,'1/1/2010' ,10, 7.25, '6/10/2009')
INSERT INTO @YourTable VALUES (1,'1/1/2010' ,10, 8.00, '2/1/2010')
INSERT INTO @YourTable VALUES (1,'1/10/2010',10, 20.00,'12/1/2010')
INSERT INTO @YourTable VALUES (2,'1/1/2010' ,8 , 12.00, '2/1/2009')
SELECT
e.EMPID,e.WageRate,e.EffectiveDate
FROM @YourTable e
INNER JOIN (SELECT
EMPID,MAX(EffectiveDate) AS EffectiveDate
FROM @YourTable
WHERE EffectiveDate<GETDATE()+1
GROUP BY EMPID
) dt ON e.EMPID=dt.EMPID AND e.EffectiveDate=dt.EffectiveDate
ORDER BY e.EMPID
OUTPUT
EMPID WageRate EffectiveDate
----------- --------------------------------------- -----------------------
1 8.00 2010-02-01 00:00:00.000
2 12.00 2009-02-01 00:00:00.000
(2 row(s) affected)
Something like this ought to work:
SELECT T.* FROM T
INNER JOIN (
SELECT EMPID, MAX(EFFECTIVEDATE) EFFECTIVEDATE
FROM T
WHERE EFFECTIVEDATE <= DATEWORKED
GROUP BY EMPID) t2
ON T2.EMPID = T.EMPID
AND T2.EFFECTIVEDATE = T.EFFECTIVEDATE
SELECT TOP 1 EMPID, WageRate
FROM wages
WHERE ......
ORDER BY EffectiveDate DESC
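The "latest row first, take one" idea can also be applied per work day against the two original tables the question mentions, which avoids the combined table entirely. The sketch below is one way to do that, run through Python's built-in sqlite3 module. The table names work_days and wage_rates and their columns are assumptions (the question does not name them), and SQLite's LIMIT 1 stands in for SQL Server's TOP 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical versions of the two source tables from the question.
    CREATE TABLE work_days (empid INTEGER, date_worked TEXT, hours INTEGER);
    CREATE TABLE wage_rates (empid INTEGER, wage_rate REAL, effective_date TEXT);
    INSERT INTO work_days VALUES
        (1, '2010-01-01', 10),
        (1, '2010-02-15', 8);
    INSERT INTO wage_rates VALUES
        (1, 7.00, '2009-06-01'),
        (1, 7.25, '2009-06-10'),
        (1, 8.00, '2010-02-01');
    """
)

# For each work day, pick the rate with the latest effective date that is
# not later than the date worked. ISO-formatted dates compare correctly
# as text.
rates = conn.execute(
    """
    SELECT w.empid, w.date_worked, w.hours,
           (SELECT r.wage_rate
            FROM wage_rates r
            WHERE r.empid = w.empid
              AND r.effective_date <= w.date_worked
            ORDER BY r.effective_date DESC
            LIMIT 1) AS wage_rate
    FROM work_days w
    ORDER BY w.empid, w.date_worked
    """
).fetchall()

print(rates)  # [(1, '2010-01-01', 10, 7.25), (1, '2010-02-15', 8, 8.0)]
```

In SQL Server the same shape can be written with OUTER APPLY (SELECT TOP 1 ... ORDER BY EffectiveDate DESC) in place of the scalar subquery.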