SQL procedure to show how many hours a worker has worked - sql

+-----------+-------------------------------+-------+
| Worker ID | Time(MM/DD/YYYY Hour:Min:Sec) | InOut |
+-----------+-------------------------------+-------+
| 1 | 12/04/2017 10:00:00 | In |
| 2 | 12/04/2017 10:00:00 | In |
| 2 | 12/04/2017 18:40:02 | Out |
| 3 | 12/04/2017 10:00:00 | In |
| 1 | 12/04/2017 12:01:00 | Out |
| 3 | 12/04/2017 19:40:05 | Out |
+-----------+-------------------------------+-------+
Hi! I have a problem with my project and thought some of you might be able to help. I have a table like the one above. It is a simple table that records workers entering and leaving the company. I need to write a procedure that takes a worker ID and a day as input parameters and shows how many hours and minutes that worker worked on that day. Thanks for your help.

Yeah, I had to do a number of queries like this at my old job. Here's the approach I used, and it worked out pretty well:
For each "Out" record, get the MAX(TIME) on "In" records with a time earlier than the OUT record
Does that make sense? You're basically joining the table against itself, looking for the record that represents the "clock in" time for any particular "clock out" time.
So here's the backbone:
select
*
, (
SELECT MAX(tim) from #tempTable subQ
where subQ.id = main.id
and subQ.tim <= main.tim
and subQ.InOut = 'In'
) as correspondingInTime
from #tempTable main
where InOut = 'Out'
... from here, you can get the data you need. Either by manipulating the query above, or using it as a subquery itself (which is my favored way of doing it) - something like:
select id as workerID, sum(DATEDIFF(s, correspondingInTime, tim)) as totalSecondsWorked
from
(
select
*
, (
SELECT MAX(tim) from #tempTable subQ
where subQ.id = main.id
and subQ.tim <= main.tim
and subQ.InOut = 'In'
) correspondingInTime
from #tempTable main
where InOut = 'Out'
) mainQuery
group by id
EDIT: Removed the 'as' before correspondingInTime. Oracle rejects 'as' only for table aliases; column aliases work with or without it, so the change is harmless.
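As a rough sketch of the procedure the question asks for, the same pairing logic can be wrapped in a function that takes a worker ID and a day. The example uses Python's sqlite3 module only so that it runs anywhere; the table name punches, its column names and the function hours_worked are illustrative assumptions, and the timestamps are stored as ISO text rather than in the MM/DD/YYYY format shown in the question.

```python
import sqlite3

# In-memory copy of the question's sample data (table and column names are assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE punches (id INTEGER, tim TEXT, inout TEXT)")
conn.executemany(
    "INSERT INTO punches VALUES (?, ?, ?)",
    [
        (1, "2017-12-04 10:00:00", "In"),
        (2, "2017-12-04 10:00:00", "In"),
        (2, "2017-12-04 18:40:02", "Out"),
        (3, "2017-12-04 10:00:00", "In"),
        (1, "2017-12-04 12:01:00", "Out"),
        (3, "2017-12-04 19:40:05", "Out"),
    ],
)


def hours_worked(conn, worker_id, day):
    """Return (hours, minutes) worked by worker_id on day ('YYYY-MM-DD')."""
    (seconds,) = conn.execute(
        """
        SELECT COALESCE(SUM(
                   CAST(strftime('%s', o.tim) AS INTEGER)
                 - CAST(strftime('%s', (SELECT MAX(i.tim)
                                        FROM punches i
                                        WHERE i.id = o.id
                                          AND i.inout = 'In'
                                          AND i.tim <= o.tim)) AS INTEGER)
               ), 0)
        FROM punches o
        WHERE o.inout = 'Out' AND o.id = ? AND date(o.tim) = ?
        """,
        (worker_id, day),
    ).fetchone()
    # Whole minutes worked, split into hours and leftover minutes.
    return divmod(seconds // 60, 60)


print(hours_worked(conn, 1, "2017-12-04"))  # (2, 1)
```

An "Out" row with no earlier "In" contributes nothing to the sum, because the subquery returns NULL and SUM skips NULLs. Shifts are attributed to the day of the "Out" record, which is one possible reading of "that day".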

Maybe something similar to
select sum( time1 - prev_time1 ) from (
select InOut, time1,
lag(time1) over (partition by worker_id order by time1) prev_time1,
lag(InOut) over (partition by worker_id order by time1) prev_inOut
from MyTABLE
where time1 between trunc(:date1) and trunc( :date1 + 1 )
and worker_id = :workerId
) t1
where InOut = 'Out' and prev_InOut = 'In'
would go.
:workerId and :date1 are bind variables that restrict the query to one worker and one date, as required. lag() returns the value from the previous row in the partition. If time1 is a DATE column, the sum comes back in days, so multiply by 24 to get hours.

Oracle has supported CROSS APPLY since 12c, so something like this should work:
SELECT yt."Worker ID", yt.Time - ca.Time
FROM YourTable yt
CROSS APPLY (SELECT MAX(Time) AS Time
FROM YourTable
WHERE "Worker ID" = yt."Worker ID" AND Time < yt.Time AND InOut = 'In') ca
WHERE yt.InOut = 'Out'

Related

SQL: Calculate number of days since last success

Following table represents results of given test.
Every result for the same test is either pass ( error_id=0) or fail ( error_id <> 0)
I need help to write a query, that returns the number of runs since last good run ( error_id= 0) and the date.
| Date       | test_id | error_id |
-----------------------------------
| 2019-12-20 | 123     | 23       |
| 2019-12-19 | 123     | 23       |
| 2019-12-17 | 123     | 22       |
| 2019-12-18 | 123     | 0        |
| 2019-12-16 | 123     | 11       |
| 2019-12-15 | 123     | 11       |
| 2019-12-13 | 123     | 11       |
| 2019-12-12 | 123     | 0        |
So the result for this example should be:
| 2019-12-18 | 123 | 4
as the test 123 was PASS on 2019-12-18 and this happened 4 runs ago.
I have a query to determine whether given run is error or not, but I have trouble applying appropriate window function to it to get the wanted result
select test_id, Date, error_id, (CASE WHEN error_id <> 0 THEN 1 ELSE 0 END) as is_error
from testresults
You can generate a row number, in reverse order from the sorting of the query itself:
SELECT test_date, test_id, error_code,
(row_number() OVER (ORDER BY test_date asc) - 1) as runs_since_last_pass
FROM tests
WHERE test_date >= (SELECT MAX(test_date) FROM tests WHERE error_code=0)
ORDER BY test_date DESC
LIMIT 1;
Note that this will run into issues if test_date is not unique. Better use a timestamp (precise to the millisecond) instead of a date.
Here's a DBFiddle: https://www.db-fiddle.com/f/8gSHVcXMztuRiFcL8zLeEx/0
If there's more than one test_id, you'll want to add a PARTITION BY clause to the row number function, and the subquery would become a bit more complex. It may be more efficient to come up with a way to do this by a JOIN instead of a subquery, but it would be more cognitively complex.
I think you just want aggregation and some filtering:
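select t.test_id, count(*) as runs_since_success,
(select max(t2.date) from t t2
where t2.test_id = t.test_id and t2.error_id = 0) as last_success_date
from t
where t.date > (select max(t2.date) from t t2
where t2.test_id = t.test_id and t2.error_id = 0)
group by t.test_id;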
You have to use the Maximum date of the good runs for every test_id in your query. You can try this query:
select tr2.Date_error, tr.test_id, count(tr.error_id) from
testresults tr inner join (select max(Date_error) as Date_error, test_id
from testresults where error_id=0 group by test_id) tr2 on
tr.test_id=tr2.test_id and tr.date_error >=tr2.date_error
group by tr.test_id, tr2.Date_error
This should do the trick:
select count(*) from table t,
(select max(date) date from table where error_id = 0) good
where t.date >= good.date
Basically you are counting the rows that have a date >= the date of the last success.
Please note: if you need the number of days instead, that is a completely different query:
select now()::date - max(test_date) last_valid from tests
where error_code = 0;
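The per-test variant mentioned in the answers above (last good run and number of runs since, for every test_id) could be sketched as follows. This is only an illustration run through Python's sqlite3 module; the column name run_date and the sample rows are hypothetical, not the asker's data. A LEFT JOIN is used so that a test whose latest run passed still appears, with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testresults (test_id INTEGER, run_date TEXT, error_id INTEGER)")
conn.executemany(
    "INSERT INTO testresults VALUES (?, ?, ?)",
    [
        # Hypothetical runs: error_id = 0 means the run passed.
        (1, "2019-12-12", 0), (1, "2019-12-13", 11), (1, "2019-12-15", 11),
        (2, "2019-12-12", 7), (2, "2019-12-13", 0), (2, "2019-12-15", 23),
        (3, "2019-12-12", 0),
    ],
)

result = conn.execute(
    """
    SELECT g.test_id, g.last_pass, COUNT(r.test_id) AS runs_since
    FROM (SELECT test_id, MAX(run_date) AS last_pass
          FROM testresults
          WHERE error_id = 0
          GROUP BY test_id) g
    LEFT JOIN testresults r
           ON r.test_id = g.test_id
          AND r.run_date > g.last_pass   -- only runs strictly after the last pass
    GROUP BY g.test_id, g.last_pass
    ORDER BY g.test_id
    """
).fetchall()

for row in result:
    print(row)
```

Whether the passing run itself should be counted (>= instead of >) depends on how "runs ago" is defined; the sketch counts only later runs.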

SQL Server Query In and Out

This data comes from a DTR device and is stored in an MS SQL database:
ID | Employee_ID | Date | InOutMode
-------+-------------+---------------------+-----------
70821 | 104 | 2019-10-11 19:00:00 | 0
70850 | 104 | 2019-10-12 07:01:00 | 1
If I separate the IN and OUT, the result is supposed to look like this:
ID | Employee_ID | IN | OUT
-------+-------------+---------------------+-----------
70821 | 104 | 2019-10-11 19:00:00 | 2019-10-12 07:01:00
I don't know whether my query is wrong, but the time-out comes back as 2019-10-11, the same date as the time-in, instead of 2019-10-12. It looks like this:
ID | Employee_ID | IN | OUT
-------+-------------+---------------------+-----------
70821 | 104 | 2019-10-11 19:00:00 | 2019-10-11 07:01:00
Try this,
DECLARE @Temp_Table Table
(
Empoyee_id int,
[Date] datetime,
[InOutMode] bit
)
INSERT INTO @Temp_Table
(
Empoyee_id,[Date],[InOutMode]
)
SELECT 104,'20191011 09:30',1
UNION ALL
SELECT 104,'20191011 19:30',0
UNION ALL
SELECT 104,'20191012 09:30',1
UNION ALL
SELECT 104,'20191012 12:30',0
UNION ALL
SELECT 104,'20191012 19:00',0
UNION ALL
SELECT 104,'20191013 09:30',1
UNION ALL
SELECT 104,'20191013 07:30',0
UNION ALL
SELECT 104,'20191014 09:30',1
SELECT Empoyee_id,[Date],[In],IIF([In]>[Out],null,[Out]) as [Out]
FROM
(
SELECT Empoyee_id,CAST([Date] AS DATE) AS [Date],
MIN(IIF(InOutMode=1,[Date],NULL)) AS [In] ,
MAX(IIF(InOutMode=0,[Date],NULL)) AS [Out]
FROM @Temp_Table
GROUP BY Empoyee_id,CAST([Date] AS DATE)
)A
Try this:
;
WITH Ins as (
Select *
FROM HR_DTR_Device
WHERE InOutMode = 0
),
Outs as (
Select *
FROM HR_DTR_Device
WHERE InOutMode = 1
)
SELECT Ins.ID,
Ins.Employee_ID,
Ins.Date as [In],
(
SELECT Min(Outs.Date)
FROM Outs
WHERE Ins.Employee_ID = Outs.Employee_ID
AND Outs.Date > Ins.Date
) as [Out]
FROM Ins
WHERE Ins.Employee_ID = '104'
What this does:
Separates the Ins and the Outs, as if they were separate data sources. Using Common Table Expressions allows you, in effect, to pre-define subqueries and give them names.
For each record in the Ins, looks for the smallest date from the Outs that is still larger than the In date. (This assumes that your records are complete, and that you can't ever have two Ins in a row because someone forgot to clock out.)
Doesn't make any assumptions about when the Out date happens, just that it's later than the In date (by definition). That way, you don't have to worry about whether the employee left later the same day or early the next day (if you have employees working different shifts.)
Will also show any entries where the employee clocked in but has not yet clocked out.
I think your big error was here:
(SELECT MAX(Date) FROM HR_DTR_Device XX
WHERE InOutMode = 1
AND XX.Employee_ID = AA.Employee_ID
AND CAST(XX.Date AS DATE) = CAST(AA.Date AS DATE)) AS 'Out'
You are returning the largest date for that employee that is on the same calendar date (and is an Out). But, if the employee works until the next morning, the date will have changed!
You could fix this by changing your test to this:
CAST(DATEADD(d, -1, XX.Date) AS DATE) = CAST(AA.Date AS DATE)
... but then it will ONLY work for employees who worked overnight, whereas my solution simply finds the next time the employee clocked out after they clocked in, regardless of whether it's the same day, the next day, or the next week!
If you like this solution, please mark it as your accepted solution. Thank you.
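To see the "next Out after each In" idea on the overnight shift from the question, here is a small sketch run through Python's sqlite3 module. The table name dtr and the column names are assumptions, and the third row (an In with no matching Out) is hypothetical, added only to show how an open shift appears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dtr (id INTEGER, employee_id INTEGER, ts TEXT, in_out_mode INTEGER)"
)
conn.executemany(
    "INSERT INTO dtr VALUES (?, ?, ?, ?)",
    [
        (70821, 104, "2019-10-11 19:00:00", 0),  # In (from the question)
        (70850, 104, "2019-10-12 07:01:00", 1),  # Out (from the question)
        (70900, 104, "2019-10-12 19:00:00", 0),  # hypothetical In, not yet clocked out
    ],
)

pairs = conn.execute(
    """
    WITH ins  AS (SELECT * FROM dtr WHERE in_out_mode = 0),
         outs AS (SELECT * FROM dtr WHERE in_out_mode = 1)
    SELECT ins.id, ins.employee_id, ins.ts AS time_in,
           (SELECT MIN(outs.ts)
            FROM outs
            WHERE outs.employee_id = ins.employee_id
              AND outs.ts > ins.ts) AS time_out   -- earliest Out after this In
    FROM ins
    ORDER BY ins.ts
    """
).fetchall()

for row in pairs:
    print(row)
```

The overnight shift pairs 2019-10-11 19:00:00 with 2019-10-12 07:01:00, and the open shift comes back with time_out as None.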

Find difference between two consecutive rows from a result in SQL server 2008

I want to fetch the difference in "Data" column between two consecutive rows. For example, need Row2-Row1 ( 1902.4-1899.66) , Row 3-Row 2 and so on. The difference should be stored in a new column.
+----------+---------+-------+-----------------------+
| Name     | Data    | meter | Time                  |
+----------+---------+-------+-----------------------+
| Boiler-1 | 1899.66 | 1     | 5/16/2019 12:00:00 AM |
| Boiler-1 | 1902.4  | 1     | 5/16/2019 12:15:00 AM |
| Boiler-1 | 1908.1  | 1     | 5/16/2019 12:15:00 AM |
| Boiler-1 | 1911.7  | 6     | 5/16/2019 12:15:00 AM |
| Boiler-1 | 1926.4  | 6     | 5/16/2019 12:15:00 AM |
+----------+---------+-------+-----------------------+
The table shown above is not a stored table: it is the result of a SELECT that joins two different tables, along the lines of "select name, data, unitId, Timestamp from table t1 join table t2.....". So is there any way to calculate the difference in the "data" column between consecutive rows without first storing this result in a table?
I use SQL 2008, so Lead/Lag functionality cannot be used.
The equivalent in SQL Server 2008 uses apply -- and it can be expensive:
with t as (
<your query here>
)
select t.*,
(t.data - tprev.data) as diff
from t outer apply
(select top (1) tprev.*
from t tprev
where tprev.name = t.name and
tprev.boiler = t.boiler and
tprev.time < t.time
order by tprev.time desc
) tprev;
This assumes that you want the previous row when the name and boiler are the same. You can adjust the correlation clause if you have different groupings in mind.
I'm not claiming this is the best approach; it is just another option for SQL Server versions before 2012. From SQL Server 2012 onward it is easy to do the same with LEAD and LAG. For small and medium data sets, you can consider the script below as well :)
Note: this is just an idea for you.
WITH CTE(Name,Data)
AS
(
SELECT 'Boiler-1' ,1899.66 UNION ALL
SELECT 'Boiler-1',1902.4 UNION ALL
SELECT 'Boiler-1',1908.1 UNION ALL
SELECT 'Boiler-1',1911.7 UNION ALL
SELECT 'Boiler-1',1926.4
--Replace above select statement with your query
)
SELECT A.Name,A.Data,A.Data-B.Data AS [Diff]
FROM
(
--ORDER BY (SELECT NULL) numbers the rows in whatever order the engine returns them.
--That order is not guaranteed, so order by a real column (e.g. Time) if you have one.
SELECT *,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) RN FROM CTE
)A
LEFT JOIN
(
SELECT *,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) RN FROM CTE
) B
--Join each row (A) to the previous row (B); Diff is NULL for the first row
ON A.RN = B.RN+1
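The row-number self-join can be tried out without a SQL Server instance. The sketch below runs the same idea through Python's sqlite3 module (window functions need SQLite 3.25 or later). The table name readings is an assumption, and ordering by rowid (insertion order) stands in for a real ordering column, which the sample data lacks because several rows share a timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (name TEXT, data REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("Boiler-1", 1899.66),
        ("Boiler-1", 1902.4),
        ("Boiler-1", 1908.1),
        ("Boiler-1", 1911.7),
        ("Boiler-1", 1926.4),
    ],
)

rows = conn.execute(
    """
    WITH numbered AS (
        -- Assumption: insertion order (rowid) is the intended row order.
        SELECT name, data, ROW_NUMBER() OVER (ORDER BY rowid) AS rn
        FROM readings
    )
    SELECT cur.name, cur.data, cur.data - prev.data AS diff
    FROM numbered cur
    LEFT JOIN numbered prev ON cur.rn = prev.rn + 1   -- prev is the row before cur
    ORDER BY cur.rn
    """
).fetchall()

# Round away floating-point noise for display; the first row has no predecessor.
diffs = [None if d is None else round(d, 2) for _, _, d in rows]
print(diffs)  # [None, 2.74, 5.7, 3.6, 14.7]
```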

SQL query to turn change log into intervals

So let me describe the problem:
-I have a task table with an assignee column, a created column and a resolved column
(both created and resolved are timestamps)
+---------+----------+------------+------------+
| task_id | assignee | created | resolved |
+---------+----------+------------+------------+
| tsk1 | him | 2000-01-01 | 2018-01-03 |
+---------+----------+------------+------------+
-I have a change log table with a task_id, a from column, a to column and a date column that records each time the assignee is changed
+---------+----------+------------+------------+
| task_id | from | to | date |
+---------+----------+------------+------------+
| tsk1 | me | you | 2017-04-06 |
+---------+----------+------------+------------+
| tsk1 | you | him | 2017-04-08 |
+---------+----------+------------+------------+
I want to select a table that shows a list of all the assignees that worked on a task within an interval
+---------+----------+------------+------------+
| task_id | assignee | from | to |
+---------+----------+------------+------------+
| tsk1 | me | 2000-01-01 | 2017-04-06 |
+---------+----------+------------+------------+
| tsk1 | you | 2017-04-06 | 2017-04-08 |
+---------+----------+------------+------------+
| tsk1 | him | 2017-04-08 | 2018-01-03 |
+---------+----------+------------+------------+
I'm having trouble with the first(/last) row, where the from(/to) should be set as created(/resolved), I don't know how to make a column with data from two different tables...
I've tried making them in their own select and then merging all rows with union, but I don't think this is a very good solution...
Hmmm . . . This is trickier than it seems. The idea is to use lead() to get the next date, but you need to "augment" the data with information from the tasks table:
select task_id, "to", date as fromdate,
coalesce(lead(date) over (partition by task_id order by date),
max(resolved) over (partition by task_id)
) as todate
from ((select task_id, "to", date, null::timestamp as resolved
from log l
) union all
(select distinct on (t.task_id) t.task_id, l."from", t.created, t.resolved
from task t join
log l
on t.task_id = l.task_id
order by t.task_id, l.date
)
) t;
demo:db<>fiddle
SELECT
l.task_id,
assignee_from as assignee,
COALESCE(
lag(assign_date) OVER (ORDER BY assign_date),
created
) as date_from,
assign_date as date_to
FROM
log l
JOIN
task t
ON l.task_id = t.task_id
UNION ALL
SELECT * FROM (
SELECT DISTINCT ON (l.task_id)
l.task_id, assignee_to, assign_date, resolved
FROM
log l
JOIN
task t
ON l.task_id = t.task_id
ORDER BY l.task_id, assign_date DESC
) s
ORDER BY task_id, date_from
UNION consists of two parts: The part from the log and finally the last row from the task table.
The first part uses the LAG() window function to get the date of the previous row. Because "me" has no previous row, that would result in a NULL value, so COALESCE falls back to the created date from the task table.
The second part is to get the last row: Here I am getting the last row of the log by DISTINCT and ORDER BY assign_date DESC. So I know the last assignee_to. The rest is similar to the first part: Getting the resolved value from the task table.
Thanks to the answer from S-Man and Gordon Linoff, I was able to come up with this solution:
SELECT t.task_id,
t.item_from AS assignee,
COALESCE(lag(t.changelog_created) OVER (
PARTITION BY t.task_id ORDER BY t.changelog_created),
max(t.creationdate) OVER (PARTITION BY t.task_id)) AS fromdate,
t.changelog_created as todate
FROM ( SELECT ch.task_id,
ch.item_from,
ch.changelog_created,
NULL::timestamp without time zone AS creationdate
FROM changelog_generic_expanded_view ch
WHERE ch.field::text = 'assignee'::text
UNION ALL
( SELECT DISTINCT ON (t_1.id_task) t_1.id_task,
t_1.assigneekey,
t_1.resolutiondate,
t_1.creationdate
FROM task_jira t_1
ORDER BY t_1.id_task)) t;
Note: this is the final version so the names are a bit different, but the idea stays the same.
This is basically the same code as Gordon Linoff, but I go through changelog in the opposite direction.
I use the 2nd part of UNION ALL to generate the last assignee instead of the first (this is to handle the case where there is no changelog at all, the last assignee is generated without involving changelogs)
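A compact way to experiment with this approach is sketched below using Python's sqlite3 module (SQLite 3.25 or later for LAG). The table and column names (changelog, assignee_from, assignee_to, change_date) are assumptions chosen to avoid reserved words, and tsk2 is a hypothetical task with no change log, added to show that the last assignee is still produced from the task table alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE task (task_id TEXT, assignee TEXT, created TEXT, resolved TEXT);
    CREATE TABLE changelog (task_id TEXT, assignee_from TEXT,
                            assignee_to TEXT, change_date TEXT);
    INSERT INTO task VALUES ('tsk1', 'him', '2000-01-01', '2018-01-03');
    INSERT INTO task VALUES ('tsk2', 'sam', '2001-01-01', '2001-02-01');  -- hypothetical
    INSERT INTO changelog VALUES ('tsk1', 'me',  'you', '2017-04-06');
    INSERT INTO changelog VALUES ('tsk1', 'you', 'him', '2017-04-08');
    """
)

intervals = conn.execute(
    """
    -- Part 1: every change closes the interval of the assignee it replaces.
    SELECT l.task_id AS task_id, l.assignee_from AS assignee,
           COALESCE(LAG(l.change_date) OVER (PARTITION BY l.task_id
                                             ORDER BY l.change_date),
                    t.created) AS date_from,
           l.change_date AS date_to
    FROM changelog l
    JOIN task t ON t.task_id = l.task_id
    UNION ALL
    -- Part 2: the current assignee runs from the last change (or creation) to resolution.
    SELECT t.task_id, t.assignee,
           COALESCE((SELECT MAX(l.change_date) FROM changelog l
                     WHERE l.task_id = t.task_id), t.created),
           t.resolved
    FROM task t
    ORDER BY task_id, date_from
    """
).fetchall()

for row in intervals:
    print(row)
```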

How to subtract two consecutive rows in MS SQL Server?

A table looks like:
id | location | datetime
------| ---------| --------
CD123 | loc001 | 2010-10-21 13:30:15
ZY123 | loc001 | 2010-10-21 13:40:15
YU333 | loc001 | 2010-10-21 13:41:00
AB456 | loc002 | 2011-1-21 14:30:30
FG121 | loc002 | 2011-1-21 14:31:00
BN010 | loc002 | 2011-1-21 14:32:00
Assume the table has been sorted by ascending datetime. I am trying to find the elapse (in seconds) between two consecutive rows within a location.
The result table is supposed to be:
| location | elapse
| loc001 | 600
| loc001 | 45
| loc002 | 30
| loc002 | 60
Since the id is randomly generated, it is difficult to write something like a.id = b.id + 1 in a query. And only rows within the same location is consecutively subtracted, not across different locations.
How should I write a query in MS SQL Server to accomplish it?
In SQL Server 2012 and later you can use LEAD or LAG
SELECT location, Elapse
FROM (
SELECT
location,
DATEDIFF(SECOND, DateTime,
LEAD(DateTime, 1) OVER(PARTITION BY location ORDER BY DateTime)) AS Elapse
FROM
tableName
) s
WHERE Elapse IS NOT NULL
with Result as
(Select *, ROW_NUMBER() Over (order by location,datetime) RowID from table_name )
Select R1.location,DATEDIFF(SECOND,R2.datetime,R1.datetime) from Result R1 Inner join Result R2 on (R1.RowID=R2.RowID+1 and R1.location=r2.location)
You have two options:
Add a new row number column and then self join the table on it, e.g. Table1.[NEW ID] = Table2.[NEW ID] - 1. You can then subtract the datetime values of the two joined rows, e.g. DATEDIFF(SECOND, Table1.datetime, Table2.datetime).
Use the LAG function which is a shortcut for the above method. As long as you are using SQL2012+
You can try this way:
select s.location,
s.datetime,
datediff(ss, s.datetime, s.next_datetime) as elapse
from (
select location,
datetime,
lead(datetime) over (partition by location order by datetime ) next_datetime
from Table1
) s
where s.next_datetime is not null
order by s.location,
s.datetime desc
Create cte and use lead to get datetime and next_datetime at same row.
Then calculate with datediff using this cte
WITH cte
AS
(
SELECT location
, datetime
, lead(datetime,1) OVER (PARTITION BY location ORDER BY datetime asc) next_datetime
from tbl)
SELECT location
, datediff(ss,datetime,next_datetime) Elapse
FROM cte
WHERE next_datetime IS NOT NULL
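The LEAD-based approach can be checked against the expected output from the question with the sketch below, run through Python's sqlite3 module (SQLite 3.25 or later). The table name visits and column name dt are assumptions, the dates are written in zero-padded ISO form so they sort correctly as text, and strftime('%s', ...) stands in for SQL Server's DATEDIFF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (id TEXT, location TEXT, dt TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        ("CD123", "loc001", "2010-10-21 13:30:15"),
        ("ZY123", "loc001", "2010-10-21 13:40:15"),
        ("YU333", "loc001", "2010-10-21 13:41:00"),
        ("AB456", "loc002", "2011-01-21 14:30:30"),
        ("FG121", "loc002", "2011-01-21 14:31:00"),
        ("BN010", "loc002", "2011-01-21 14:32:00"),
    ],
)

elapsed = conn.execute(
    """
    SELECT location, next_s - cur_s AS elapse
    FROM (
        SELECT location,
               CAST(strftime('%s', dt) AS INTEGER) AS cur_s,
               -- Epoch seconds of the next row within the same location.
               LEAD(CAST(strftime('%s', dt) AS INTEGER))
                   OVER (PARTITION BY location ORDER BY dt) AS next_s
        FROM visits
    )
    WHERE next_s IS NOT NULL   -- the last row of each location has no successor
    ORDER BY location, cur_s
    """
).fetchall()

print(elapsed)  # [('loc001', 600), ('loc001', 45), ('loc002', 30), ('loc002', 60)]
```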