Multiple rows of dates between using custom calendar - sql

I've been banging my head against the wall on this and can't see the wood for the trees...
I've got two tables:
1. ID field, start date and end date columns.
2. Date and Workday columns.
I just need to count the workdays between the start and end dates for each row, using the dates in the second (calendar) table. Googling found plenty of examples without a dates table, and plenty where it's based on just one start and end date.
Table_1 - Contains an entry for every id
id start_date end_date
123 01/01/2013 03/01/2013
456 02/01/2013 08/01/2013
789 06/01/2013 07/01/2013
Table_2 - Contains an entry for every day
e_day workday
01/01/2013 1
02/01/2013 0
03/01/2013 1
04/01/2013 1
05/01/2013 0
06/01/2013 1
07/01/2013 0
08/01/2013 0
Results
id start_date end_date days_between
123 01/01/2013 03/01/2013 2
456 02/01/2013 08/01/2013 3
789 06/01/2013 07/01/2013 1
I can find out the value for one id:
SELECT COUNT(workday) FROM table_2
WHERE workday = 1 AND e_day >= '01/01/2013'
AND e_day <= '03/01/2013';
I'm just not sure how to apply this logic to every row of table_1.
For example (clearly not correct):
SELECT
table_1.id,
table_1.start_date,
table_1.end_date,
(COUNT(table_2.workday) FROM table_2 WHERE table_2.workday = 1
AND table_2.e_day >= table_1.start_date
AND table_2.e_day <= table_2.end_date) AS days_between
FROM table_1
Code to generate bodged example tables;
CREATE TABLE #table_1(id INT, start_date SMALLDATETIME, end_date SMALLDATETIME);
CREATE TABLE #table_2(e_day SMALLDATETIME, workday BIT);
INSERT #table_1 VALUES (123,'01/01/2013','03/01/2013')
INSERT #table_1 VALUES (456,'02/01/2013','08/01/2013')
INSERT #table_1 VALUES (789,'06/01/2013','07/01/2013')
INSERT #table_2 VALUES ('01/01/2013',1)
INSERT #table_2 VALUES ('02/01/2013',0)
INSERT #table_2 VALUES ('03/01/2013',1)
INSERT #table_2 VALUES ('04/01/2013',1)
INSERT #table_2 VALUES ('05/01/2013',0)
INSERT #table_2 VALUES ('06/01/2013',1)
INSERT #table_2 VALUES ('07/01/2013',0)
INSERT #table_2 VALUES ('08/01/2013',0)
SELECT * FROM #table_1
SELECT * FROM #table_2
Code to remove tables;
DROP TABLE #table_1 DROP TABLE #table_2;
Thanks all for your help in advance :)

Try this:
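-- idTable and workdaytable stand for the question's table_1 and table_2
select a.id, a.start_date, a.end_date,
sum(cast(b.workday as tinyint)) as NumWorkDays, -- workday is a BIT, so cast it before summing
count(*) as Total_days
from idTable a
join workdaytable b on b.e_day between a.start_date and a.end_date
group by a.id, a.start_date, a.end_date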
To visualize what is happening:
select a.id, a.start_date, a.end_date
from idTable a
where a.id = 123
id start_date end_date
123 1/1/2013 3/1/2013
This returns one row for id=123.
Now, when we do the join, we add the e_day and workday flag columns, and we get one row for each e_day in the second table that falls in the range:
id start_date end_date e_day workday
123 1/1/2013 3/1/2013 1/1/2013 1
123 1/1/2013 3/1/2013 2/1/2013 0
123 1/1/2013 3/1/2013 3/1/2013 1
Now we have a big "table" with 5 columns and one row for each day in the second table that falls between 1/1/2013 and 3/1/2013. The sum operation simply adds up the workday flags from the "table" we created by the join. If you run the query without the GROUP BY (and with the sum and count replaced by b.e_day and b.workday), you can see the "table" that gets created...
Hope this helps a bit...
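To check the join-and-sum idea against the question's sample data, here is a rough, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, which is my assumption and not part of the original answer. The dates are rewritten as ISO strings (reading the question's dates as dd/mm/yyyy) so that BETWEEN compares them in calendar order.

```python
import sqlite3

# SQLite stands in for SQL Server here (an assumption for this sketch).
# Dates are stored as ISO strings so BETWEEN compares them chronologically.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (id INTEGER, start_date TEXT, end_date TEXT);
CREATE TABLE table_2 (e_day TEXT, workday INTEGER);
INSERT INTO table_1 VALUES
  (123, '2013-01-01', '2013-01-03'),
  (456, '2013-01-02', '2013-01-08'),
  (789, '2013-01-06', '2013-01-07');
INSERT INTO table_2 VALUES
  ('2013-01-01', 1), ('2013-01-02', 0), ('2013-01-03', 1), ('2013-01-04', 1),
  ('2013-01-05', 0), ('2013-01-06', 1), ('2013-01-07', 0), ('2013-01-08', 0);
""")

WORKDAY_COUNT_SQL = """
SELECT a.id, a.start_date, a.end_date,
       SUM(b.workday) AS days_between,  -- workdays inside the range
       COUNT(*)       AS total_days     -- all calendar days inside the range
FROM table_1 AS a
JOIN table_2 AS b ON b.e_day BETWEEN a.start_date AND a.end_date
GROUP BY a.id, a.start_date, a.end_date
ORDER BY a.id
"""

workday_rows = conn.execute(WORKDAY_COUNT_SQL).fetchall()
for row in workday_rows:
    print(row)
```

With the sample data this gives days_between values of 2, 3 and 1 for ids 123, 456 and 789, matching the results the question asks for.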

Related

Calculate the difference in minutes between 'First-row' and 'Second-row' with 2 different columns in SQL

I have the following table in which I have to calculate the difference in minutes between 'First-row' Emp_Out and 'Second-row' Emp_IN for each room and for each day starting from 12 am.
Table:
Date EMP_ID Room Emp_IN Emp_OUT Difference(In Min)
----- ------ ------ ------ ------- ------------------
9/1/22 001 Room 1 04:30 05:00 270 (First diff is calculated from 12am - Emp_IN)
9/1/22 002 Room 1 05:25 05:42 7
9/1/22 003 Room 1 05:48 06:13 6
9/1/22 001 Room 2 05:00 05:17 300 (First diff is calculated from 12am - Emp_IN)
9/1/22 002 Room 2 05:36 05:48 19
9/1/22 003 Room 2 05:51 06:05 3
Can LAG be used for this, or am I missing some logic that would help?
Use LAG to get the previous row's value, partitioning the rows by Date and Room.
The first entry of the day for a room gets NULL from LAG; replace that with '00:00', which means 12 AM.
SELECT *,
DATEDIFF(MINUTE,ISNULL(LAG(Emp_out) OVER(PARTITION BY Date, Room ORDER BY Emp_In),'00:00'),Emp_In) [Difference in Min]
FROM Your_table
My sample scripts and the result
create table #temp(empdate date, empId int, room varchar(100), InTime time, outTime time)
insert into #temp Values ('2022-09-02','101','Room 1','04:30','05:00')
insert into #temp Values ('2022-09-02','102','Room 1','05:25','05:42')
insert into #temp Values ('2022-09-02','103','Room 1','05:48','07:00')
insert into #temp Values ('2022-09-02','101','Room 2','05:00','05:17')
insert into #temp Values ('2022-09-02','102','Room 2','05:36','05:48')
insert into #temp Values ('2022-09-02','103','Room 2','05:51','06:00')
SELECT *,
DATEDIFF(MINUTE,ISNULL(LAG(outTime) OVER(PARTITION BY empdate, Room ORDER BY InTime),'00:00'),InTime) [Difference in Min]
FROM #temp
DROP TABLE #temp
Output:
empdate empId room InTime outTime Difference in Min
---------- ----------- ---------- ---------------- ---------------- -----------------
2022-09-02 101 Room 1 04:30:00.0000000 05:00:00.0000000 270
2022-09-02 102 Room 1 05:25:00.0000000 05:42:00.0000000 25
2022-09-02 103 Room 1 05:48:00.0000000 07:00:00.0000000 6
2022-09-02 101 Room 2 05:00:00.0000000 05:17:00.0000000 300
2022-09-02 102 Room 2 05:36:00.0000000 05:48:00.0000000 19
2022-09-02 103 Room 2 05:51:00.0000000 06:00:00.0000000 3
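The same LAG logic can be tried outside SQL Server. The sketch below is only an illustration under my own assumptions: it uses Python's sqlite3 (window functions need SQLite 3.25 or later), the table and column names are made up, and the last row is a hypothetical second day added to show that partitioning by date restarts the calculation from 00:00.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE room_log (empdate TEXT, emp_id INTEGER, room TEXT,
                       in_time TEXT, out_time TEXT);
INSERT INTO room_log VALUES
  ('2022-09-01', 1, 'Room 1', '04:30', '05:00'),
  ('2022-09-01', 2, 'Room 1', '05:25', '05:42'),
  ('2022-09-01', 3, 'Room 1', '05:48', '06:13'),
  ('2022-09-01', 1, 'Room 2', '05:00', '05:17'),
  ('2022-09-01', 2, 'Room 2', '05:36', '05:48'),
  ('2022-09-01', 3, 'Room 2', '05:51', '06:05'),
  ('2022-09-02', 1, 'Room 1', '03:00', '03:30');  -- hypothetical extra day
""")

# LAG(out_time, 1, '00:00') returns the previous row's out_time within the
# same (date, room) partition, or '00:00' for the first row of the partition.
# strftime('%s', ...) converts a time string to seconds, so the difference
# divided by 60 is the gap in whole minutes.
GAP_SQL = """
SELECT empdate, room, emp_id,
       (strftime('%s', in_time)
        - strftime('%s', LAG(out_time, 1, '00:00')
                         OVER (PARTITION BY empdate, room ORDER BY in_time))
       ) / 60 AS gap_minutes
FROM room_log
ORDER BY empdate, room, in_time
"""

gap_rows = conn.execute(GAP_SQL).fetchall()
for row in gap_rows:
    print(row)
```

Passing the default as the third argument of LAG plays the role of the ISNULL wrapper in the T-SQL version; SQL Server's LAG accepts a default argument as well.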
Create the table
CREATE TABLE DEMOLAG (Date Datetime , EMP_ID INT , Room Varchar(50), EMP_IN TIME(0) , EMP_OUT TIME(0))
Insert the Records
INSERT INTO DEMOLAG VALUES ('9/1/22',001,'ROOM 1', '04:30','05:00')
INSERT INTO DEMOLAG VALUES ('9/1/22',002,'ROOM 1', '05:25','05:42')
INSERT INTO DEMOLAG VALUES ('9/1/22',003,'ROOM 1', '05:48','06:13')
INSERT INTO DEMOLAG VALUES ('9/1/22',001,'ROOM 2', '05:00','05:17')
INSERT INTO DEMOLAG VALUES ('9/1/22',002,'ROOM 2', '05:36','05:48')
INSERT INTO DEMOLAG VALUES ('9/1/22',003,'ROOM 2', '05:51','06:05')
Use the LAG function to achieve this:
SELECT *,
ISNULL((LAG(EMP_OUT,1) OVER(Partition by Room ORDER BY EMP_OUT,Room,Emp_ID ASC)),'00:00') AS Prvempout,
DATEDIFF(Minute,ISNULL((LAG(EMP_OUT,1) OVER(Partition by Room ORDER BY EMP_OUT,Room,Emp_ID ASC)),'00:00'),EMP_IN) AS Timediff
FROM DEMOLAG
For a more detailed explanation, you can refer to Use LAG Function in SQL.

Selecting the most recent date

I have data structured like this:
ID | Enrolment_Date | Appointment1_Date | Appointment2_Date | .... | Appointment150_Date |
112 01/01/2015 01/02/2015 01/03/2018 01/08/2018
113 01/06/2018 01/07/2018 NULL NULL
114 01/04/2018 01/05/2018 01/06/2018 NULL
I need a new variable which counts the number of months between the enrolment_date and the most recent appointment. The challenge is that all individuals have a different number of appointments.
Update: I agree with the comments that this is poor table design and it needs to be reformatted. Could proposed solutions please include suggested code on how to transform the table?
Since the OP is currently stuck with this bad design, I will point out a temporary solution. As others have suggested, you really must change the structure here. For now, this will suffice:
SELECT '['+ NAME + '],' FROM sys.columns WHERE OBJECT_ID = OBJECT_ID ('TableA') -- find all columns, last one probably max appointment date
SELECT ID,
Enrolment_Date,
CASE WHEN Appointment150_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment150_Date)
WHEN Appointment149_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment149_Date)
WHEN Appointment148_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment148_Date)
WHEN Appointment147_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment147_Date)
WHEN Appointment146_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment146_Date)
WHEN Appointment145_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment145_Date)
WHEN Appointment144_Date IS NOT NULL THEN DATEDIFF (MONTH, Enrolment_Date, Appointment144_Date) -- and so on
END AS NumberOfMonths
FROM TableA
This is a very ugly temporary solution and should be considered as such.
You will need to restructure your data, the given structure is poor database design. Create two separate tables - one called users and one called appointments. The users table contains the user id, enrollment date and any other specific user information. Each row in the appointments table contains the user's unique id and a specific appointment date. Structuring your tables like this will make it easier to write a query to get days/months since last appointment.
For example:
Users Table:
ID, Enrollment_Date
1, 2018-01-01
2, 2018-03-02
3, 2018-05-02
Appointments Table:
ID, Appointment_Date
1, 2018-01-02
1, 2018-02-02
1, 2018-02-10
2, 2018-05-01
You would then be able to write a query that joins the two tables and calculates the difference between the enrollment date and the max value of the appointment date (the most recent appointment).
It is better if you can create two tables.
Enrolment Table (dbo.Enrolments)
ID | EnrolmentDate
1 | 2018-08-30
2 | 2018-08-31
Appointments Table (dbo.Appointments)
ID | EnrolmentID | AppointmentDate
1 | 1 | 2018-09-02
2 | 1 | 2018-09-03
3 | 2 | 2018-09-01
4 | 2 | 2018-09-03
Then you can try something like this.
If you want the count of months from the enrolment date to the final appointment date, use the query below.
SELECT E.ID, E.EnrolmentDate, A.NoOfMonths
FROM dbo.Enrolments E
OUTER APPLY
(
SELECT DATEDIFF(mm, E.EnrolmentDate, MAX(A.AppointmentDate)) AS NoOfMonths
FROM dbo.Appointments A
WHERE A.EnrolmentId = E.ID
) A
And if you want the count of months from the enrolment date to the nearest (earliest) appointment date, use the query below.
SELECT E.ID, E.EnrolmentDate, A.NoOfMonths
FROM dbo.Enrolments E
OUTER APPLY
(
SELECT DATEDIFF(mm, E.EnrolmentDate, MIN(A.AppointmentDate)) AS NoOfMonths
FROM dbo.Appointments A
WHERE A.EnrolmentId = E.ID
) A
Try this on sqlfiddle
You have a lousy data structure, as others have noted. You really want a table with one row per appointment. After all, what happens after the 150th appointment?
select t.id, t.Enrolment_Date,
datediff(month, t.Enrolment_Date, m.max_Appointment_Date) as months_diff
from t cross apply
(select max(Appointment_Date) as max_Appointment_Date
from (values (Appointment1_Date),
(Appointment2_Date),
. . .
(Appointment150_Date)
) v(Appointment_Date)
) m;
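Since the question's update asks how to transform the table, here is a rough sketch of one way to turn the wide layout into one row per appointment and then compute the months to the latest appointment. Everything in it is an assumption made for illustration: it uses Python's sqlite3 and not SQL Server, only 3 appointment columns stand in for the 150, the new Appointments table name is invented, and the sample dates are read as dd/mm/yyyy and stored as ISO strings.

```python
import sqlite3

# Hypothetical stand-in for the wide table: 3 appointment columns, not 150.
N_APPOINTMENTS = 3
appt_cols = [f"Appointment{i}_Date" for i in range(1, N_APPOINTMENTS + 1)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableA (ID INTEGER, Enrolment_Date TEXT, "
    + ", ".join(f"{c} TEXT" for c in appt_cols) + ")"
)
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?, ?, ?)",
    [
        (112, "2015-01-01", "2015-02-01", "2018-03-01", "2018-08-01"),
        (113, "2018-06-01", "2018-07-01", None, None),
        (114, "2018-04-01", "2018-05-01", "2018-06-01", None),
    ],
)

# Unpivot: copy each appointment column into a long table, skipping NULLs.
conn.execute("CREATE TABLE Appointments (ID INTEGER, Appointment_Date TEXT)")
for col in appt_cols:
    conn.execute(
        f"INSERT INTO Appointments SELECT ID, {col} FROM TableA "
        f"WHERE {col} IS NOT NULL"
    )

# Month difference counted as month boundaries crossed, like DATEDIFF(MONTH, ...).
MONTHS_SQL = """
SELECT t.ID,
       (strftime('%Y', MAX(a.Appointment_Date)) - strftime('%Y', t.Enrolment_Date)) * 12
     + (strftime('%m', MAX(a.Appointment_Date)) - strftime('%m', t.Enrolment_Date))
       AS months_diff
FROM TableA AS t
LEFT JOIN Appointments AS a ON a.ID = t.ID
GROUP BY t.ID, t.Enrolment_Date
ORDER BY t.ID
"""

months_rows = conn.execute(MONTHS_SQL).fetchall()
appointment_count = conn.execute("SELECT COUNT(*) FROM Appointments").fetchone()[0]
print(months_rows, appointment_count)
```

In SQL Server itself the unpivot step would more likely be a single INSERT ... SELECT using CROSS APPLY (VALUES ...) as in the query above, or UNPIVOT; the loop here is just the simplest way to show the idea in SQLite.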

How to apply randomly selected values to distinct dates in SQL Server

I have a table showing available dates for some staff, with two fields, staffid and date, with information that looks like this:
staffid date
1 2016-01-01
1 2016-01-02
1 2016-01-03
2 2016-01-03
3 2016-01-01
3 2016-01-03
I need to generate a list of DISTINCT available dates from this table, where the staff member assigned to each date is selected randomly. I know how to select rows based on one distinct field (see, for example, the answer here), but this will always select the rows based on a given order in the table (so, for example, staff 1 for January 1), while I need the selection to be random, so that sometimes staff 1 is selected as the distinct row and sometimes staff 3.
The result needs to be ordered by date.
Try this:
-- test data
create table your_table (staffid int, [date] date);
insert into your_table values
(1, '2016-01-01'),
(1, '2016-01-02'),
(1, '2016-01-03'),
(2, '2016-01-03'),
(3, '2016-01-01'),
(3, '2016-01-03');
-- query
select *
from (
select distinct [date] [distinct_date] from your_table
) as d
outer apply (
select top 1 staffid
from your_table
where d.[distinct_date] = [date]
order by newid()
) as x
-- result 1
distinct_date staffid
-----------------------
2016-01-01 3
2016-01-02 1
2016-01-03 1
-- result 2
distinct_date staffid
-----------------------
2016-01-01 1
2016-01-02 1
2016-01-03 2
hope it helps :)
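For anyone who wants to play with the idea without a SQL Server instance, here is a small sketch of the same "one random row per distinct date" pattern. It is an approximation under my own assumptions: Python's sqlite3 replaces SQL Server, a correlated subquery with ORDER BY RANDOM() LIMIT 1 replaces OUTER APPLY with TOP 1 ... ORDER BY NEWID(), and the table and column names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE availability (staffid INTEGER, avail_date TEXT);
INSERT INTO availability VALUES
  (1, '2016-01-01'), (1, '2016-01-02'), (1, '2016-01-03'),
  (2, '2016-01-03'), (3, '2016-01-01'), (3, '2016-01-03');
""")

# For each distinct date, the correlated subquery shuffles the staff who are
# available on that date and keeps one of them.
RANDOM_PICK_SQL = """
SELECT d.avail_date,
       (SELECT a.staffid
        FROM availability AS a
        WHERE a.avail_date = d.avail_date
        ORDER BY RANDOM()
        LIMIT 1) AS staffid
FROM (SELECT DISTINCT avail_date FROM availability) AS d
ORDER BY d.avail_date
"""

def pick_staff():
    """Return one (date, staffid) pair per distinct date, ordered by date."""
    return conn.execute(RANDOM_PICK_SQL).fetchall()

print(pick_staff())
```

Each call returns the three dates in order; the staff id for 2016-01-02 is always 1 (the only candidate), while the other two dates can vary from run to run.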

Add a counter by date, user per day to a query

I have a table of data which stores scans into a building, and it contains well over a million rows. I am attempting to add a temporary status column within this query, which counts the scans on a daily basis. For the purpose of this question, let's use this as the main data table:
CREATE TABLE DataTable (DataTableID INT IDENTITY(1,1) NOT NULL,
[User] VARCHAR(50), -- USER is a reserved word in SQL Server, so it needs brackets
EventTime DATETIME)
From this I have narrowed it down to show only the scans for today:
SELECT * FROM DataTable
WHERE CONVERT(DATE,EventTime) = CONVERT(DATE, SYSDATETIME())
At this point I want to add a Status column to the query above. The Status column:
WHEN ODD - will mean that the person is in the building
WHEN EVEN - will mean that the person is not in the building
(This is simply an integer field which starts at 1 and increments by 1 per scan on that day, PER USER.) How would I go about doing this?
I want to make this a view afterwards, so it's worth mentioning in case this affects the query syntax.
It's also worth mentioning that I can't add a status column to the main table, as this would prevent the door access program from working; otherwise I would add something there to control that.
EXAMPLE DATA:
DataTableID User EventTime Status
1 Joe 30/08/2016 09:00:00 1
2 Alan 30/08/2016 08:45:00 1
3 John 30/08/2016 09:02:00 1
4 Steven 30/08/2016 07:30:00 1
5 Joe 30/08/2016 11:00:00 2
6 Mike 30/08/2016 17:30:00 1
7 Joe 30/08/2016 12:00:00 3
You want a simple windowing function for this. Take a look at the query below and let me know if you have any questions. The window is ordered by EventTime rather than DataTableID, and the final result is then ordered by DataTableID. This makes sure you don't have any issues if your data isn't in the correct order in the table.
Temp table for testing;
CREATE TABLE #DataTable
(DataTableID INT IDENTITY(1,1) NOT NULL,
[User] VARCHAR(50),
EventTime DATETIME)
Fill it with sample data;
INSERT INTO #DataTable
VALUES
('Joe', '2016-08-30 09:00:00')
,('Alan', '2016-08-30 08:45:00')
,('John', '2016-08-30 09:02:00')
,('Steven', '2016-08-30 07:30:00')
,('Joe', '2016-08-30 11:00:00')
,('Mike', '2016-08-30 17:30:00')
,('Joe', '2016-08-30 12:00:00')
Query
SELECT
DataTableID
,[User]
,EventTime
,ROW_NUMBER() OVER(PARTITION BY [User] ORDER BY EventTime) Status
FROM #DataTable
WHERE CONVERT(DATE,EventTime) = CONVERT(DATE, SYSDATETIME())
ORDER BY DataTableID
Output
DataTableID User EventTime Status
1 Joe 2016-08-30 09:00:00.000 1
2 Alan 2016-08-30 08:45:00.000 1
3 John 2016-08-30 09:02:00.000 1
4 Steven 2016-08-30 07:30:00.000 1
5 Joe 2016-08-30 11:00:00.000 2
6 Mike 2016-08-30 17:30:00.000 1
7 Joe 2016-08-30 12:00:00.000 3
Something like:
select *, row_number() over(partition by [User], cast(EventTime as date) order by EventTime) as Status
from datatable
should do the trick.
However, for performance reasons I'd suggest creating a computed column defined as cast(eventtime as date), plus a compound index on that column, the user column and the original eventtime column.
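To see why partitioning by both user and calendar day matters, here is a quick sketch that can be run without SQL Server. The setup is assumed for illustration only: Python's sqlite3 stands in for SQL Server, the table and column names are invented, and the last row is a hypothetical next-day scan added to show the counter restarting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scans (id INTEGER PRIMARY KEY, user_name TEXT, event_time TEXT);
INSERT INTO scans (user_name, event_time) VALUES
  ('Joe',    '2016-08-30 09:00:00'),
  ('Alan',   '2016-08-30 08:45:00'),
  ('John',   '2016-08-30 09:02:00'),
  ('Steven', '2016-08-30 07:30:00'),
  ('Joe',    '2016-08-30 11:00:00'),
  ('Mike',   '2016-08-30 17:30:00'),
  ('Joe',    '2016-08-30 12:00:00'),
  ('Joe',    '2016-08-31 08:00:00');  -- hypothetical scan on the next day
""")

# Number each user's scans within a calendar day, in time order.
# date(event_time) plays the role of CAST(EventTime AS DATE).
STATUS_SQL = """
SELECT id, user_name, event_time,
       ROW_NUMBER() OVER (PARTITION BY user_name, date(event_time)
                          ORDER BY event_time) AS status
FROM scans
ORDER BY id
"""

scan_rows = conn.execute(STATUS_SQL).fetchall()
for scan_id, user_name, event_time, status in scan_rows:
    # Odd status means the person is in the building after this scan.
    print(scan_id, user_name, event_time, status, "in" if status % 2 else "out")
```

Joe's three scans on 30 August get 1, 2 and 3, and his scan on the 31st starts again at 1, which is what the date part of the partition is for.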

Finding a sql query to get the latest associated date for each grouping

I have a sql table of payroll data that has wage rates and effective dates associated with those wage rates, as well as hours worked on various dates. It looks somewhat like this:
EMPID DateWorked Hours WageRate EffectiveDate
1 1/1/2010 10 7.00 6/1/2009
1 1/1/2010 10 7.25 6/10/2009
1 1/1/2010 10 8.00 2/1/2010
1 1/10/2010 ...
2 1/1/2010 ...
...
And so on. Basically, the data has been combined in such a way that for every day worked, all of the employee's wage history is joined together, and I want to grab the wage rate associated with the LATEST effective date that is not later than the date worked. So in the example above, the rate of 7.25 that became effective on 6/10/2009 is what I want.
What kind of query can I put together for this? I can use MAX(EffectiveDate) along with a criterion that it is on or before the work date, but that only gives me the latest date itself; I want the associated wage. I am using SQL Server for this.
Alternatively, I have the original tables that were used to create this data. One of them contains the dates worked, and the hours as well as EMPID, the other contains the list of wage rates and effective dates. Is there a way to join these instead that would correctly apply the right wage rate for each work day?
I was thinking that I'd want to group by EMPID and then DateWorked, and do something from there. I want to get a result that gives me the wage rate that actually is the latest effective rate for each date worked
select p.*
from (
select EMPID, DateWorked, Max(EffectiveDate) as MaxEffectiveDate
from Payroll
where EffectiveDate <= DateWorked
group by EMPID, DateWorked
) pm
inner join Payroll p on pm.EMPID = p.EMPID and pm.DateWorked = p.DateWorked and pm.MaxEffectiveDate = p.EffectiveDate
Output:
EMPID DateWorked Hours WageRate EffectiveDate
----------- ----------------------- ----------- --------------------------------------- -----------------------
1 2010-01-01 00:00:00.000 10 7.25 2009-06-10 00:00:00.000
Try this:
DECLARE @YourTable table (EMPID int, DateWorked datetime, Hours int
,WageRate numeric(6,2), EffectiveDate datetime)
INSERT INTO @YourTable VALUES (1,'1/1/2010' ,10, 7.00, '6/1/2009')
INSERT INTO @YourTable VALUES (1,'1/1/2010' ,10, 7.25, '6/10/2009')
INSERT INTO @YourTable VALUES (1,'1/1/2010' ,10, 8.00, '2/1/2010')
INSERT INTO @YourTable VALUES (1,'1/10/2010',10, 20.00,'12/1/2010')
INSERT INTO @YourTable VALUES (2,'1/1/2010' ,8 , 12.00, '2/1/2009')
SELECT
e.EMPID,e.WageRate,e.EffectiveDate
FROM @YourTable e
INNER JOIN (SELECT
EMPID,MAX(EffectiveDate) AS EffectiveDate
FROM @YourTable
WHERE EffectiveDate<GETDATE()+1
GROUP BY EMPID
) dt ON e.EMPID=dt.EMPID AND e.EffectiveDate=dt.EffectiveDate
ORDER BY e.EMPID
OUTPUT
EMPID WageRate EffectiveDate
----------- --------------------------------------- -----------------------
1 8.00 2010-02-01 00:00:00.000
2 12.00 2009-02-01 00:00:00.000
(2 row(s) affected)
Something like this ought to work:
SELECT T.* FROM T
INNER JOIN (
SELECT EMPID, DATEWORKED, MAX(EFFECTIVEDATE) EFFECTIVEDATE
FROM T
WHERE EFFECTIVEDATE <= DATEWORKED -- only rates already in effect on the day worked
GROUP BY EMPID, DATEWORKED) T2
ON T2.EMPID = T.EMPID
AND T2.DATEWORKED = T.DATEWORKED
AND T2.EFFECTIVEDATE = T.EFFECTIVEDATE
SELECT TOP 1 EMPID, WageRate
FROM wages
WHERE ......
ORDER BY EffectiveDate DESC
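For the alternative the question raises (going back to the two original tables), the TOP 1 ... ORDER BY EffectiveDate DESC idea can be applied per work day with a correlated subquery. The sketch below is only an illustration under my own assumptions: it runs on Python's sqlite3 and not SQL Server, the table and column names work_log and wage_history are invented, dates are ISO strings, and the 2010-02-05 work day is a hypothetical row added to show the rate changing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work_log (empid INTEGER, date_worked TEXT, hours INTEGER);
CREATE TABLE wage_history (empid INTEGER, wage_rate REAL, effective_date TEXT);
INSERT INTO work_log VALUES
  (1, '2010-01-01', 10),
  (1, '2010-02-05', 10);  -- hypothetical later work day
INSERT INTO wage_history VALUES
  (1, 7.00, '2009-06-01'),
  (1, 7.25, '2009-06-10'),
  (1, 8.00, '2010-02-01');
""")

# For each work day, take the rate with the latest effective date that is
# not later than the day worked (LIMIT 1 here corresponds to TOP 1 in T-SQL).
WAGE_SQL = """
SELECT w.empid, w.date_worked, w.hours,
       (SELECT h.wage_rate
        FROM wage_history AS h
        WHERE h.empid = w.empid
          AND h.effective_date <= w.date_worked
        ORDER BY h.effective_date DESC
        LIMIT 1) AS wage_rate
FROM work_log AS w
ORDER BY w.empid, w.date_worked
"""

wage_rows = conn.execute(WAGE_SQL).fetchall()
for row in wage_rows:
    print(row)
```

In SQL Server the same shape is usually written with OUTER APPLY and SELECT TOP 1 ... ORDER BY EffectiveDate DESC, which also lets you return several columns from the wage row at once.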