Merge two consecutive rows into a column - sql

Suppose I have a table like this:
EmployeeCode EntryType TIME
ABC200413    IN        8:48AM
ABC200413    OUT       4:09PM
ABC200413    IN        4:45PM
ABC200413    OUT       6:09PM
ABC200413    IN        7:45PM
ABC200413    OUT       10:09PM
Now I want to convert my data to something like this:
EmployeeCode IN_TIME OUT_TIME
ABC200413    8:48AM  4:09PM
ABC200413    4:45PM  6:09PM
ABC200413    7:45PM  10:09PM
Is there any way I can achieve this with a SQL Server query?
Thanks in advance.

Provided mytable contains only valid pairs of IN/OUT events:
select EmployeeCode,
       max(case EntryType when 'IN' then TIME end) IN_TIME,
       max(case EntryType when 'OUT' then TIME end) OUT_TIME
from (
    select EmployeeCode, EntryType, TIME,
           row_number() over(partition by EmployeeCode, EntryType order by TIME) rn
    from mytable
) t
group by EmployeeCode, rn
order by EmployeeCode, rn
Otherwise, some clean-up is required first.
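The pairing by row number can be tried out with a small harness. This is a minimal sketch, not SQL Server itself: it assumes Python's bundled sqlite3 module with SQLite 3.25 or later (needed for window functions), and the sample times are rewritten as 24-hour 'HH:MM' text so that ordering by TIME is chronological. The names ROWS, PAIR_SQL and pair_in_out are hypothetical.

```python
import sqlite3

# Hypothetical sample data; 24-hour 'HH:MM' text sorts chronologically.
ROWS = [
    ("ABC200413", "IN", "08:48"),
    ("ABC200413", "OUT", "16:09"),
    ("ABC200413", "IN", "16:45"),
    ("ABC200413", "OUT", "18:09"),
    ("ABC200413", "IN", "19:45"),
    ("ABC200413", "OUT", "22:09"),
]

# Number the IN rows and the OUT rows separately per employee,
# then put the n-th IN and the n-th OUT on the same output row.
PAIR_SQL = """
select EmployeeCode,
       max(case EntryType when 'IN'  then TIME end) as IN_TIME,
       max(case EntryType when 'OUT' then TIME end) as OUT_TIME
from (
    select EmployeeCode, EntryType, TIME,
           row_number() over (partition by EmployeeCode, EntryType order by TIME) as rn
    from mytable
) t
group by EmployeeCode, rn
order by EmployeeCode, rn
"""


def pair_in_out(rows):
    """Load rows into an in-memory table and return (code, in, out) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytable (EmployeeCode text, EntryType text, TIME text)")
    conn.executemany("insert into mytable values (?, ?, ?)", rows)
    result = conn.execute(PAIR_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pair_in_out(ROWS):
        print(row)
```

Because the n-th IN is matched with the n-th OUT regardless of their relative order, a missing OUT shifts every later pair, which is why the clean-up matters.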

One solution that may work uses a correlated subquery, since you may have multiple IN and OUT records.
Select A.EmployeeCode,
       Min(A.TIME) IN_TIME,
       (Select Max(A2.TIME) From Attendance A2 Where A2.EmployeeCode = A.EmployeeCode And A2.EntryType = 'OUT') OUT_TIME
From Attendance A
Where A.EntryType = 'IN'
Group By A.EmployeeCode
The main query gets the earliest IN time for each employee and the subquery gets the latest OUT time, so the result is one row per employee rather than one row per IN/OUT pair. This solution assumes that each employee has at least one IN record.

Assuming there are always pairs of IN/OUT, you can use LEAD to get the next value
select
EmployeeCode,
TIME as IN_TIME,
nextTime AS OUT_TIME
from (
select *,
lead(case EntryType when 'OUT' then TIME end) over (partition by EmployeeCode order by TIME) nextTime
from mytable
) t
where EntryType = 'IN';
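To see what the LEAD query does when the data is not strictly paired, here is a small sketch. It assumes Python's sqlite3 module with SQLite 3.25 or later rather than SQL Server, and the rows, the employee code E1 and the helper name lead_pairs are hypothetical.

```python
import sqlite3

# Look only at the row immediately after each IN row:
# its TIME is used if it is an OUT row, otherwise the result is NULL.
LEAD_SQL = """
select EmployeeCode,
       TIME     as IN_TIME,
       nextTime as OUT_TIME
from (
    select *,
           lead(case EntryType when 'OUT' then TIME end)
               over (partition by EmployeeCode order by TIME) as nextTime
    from mytable
) t
where EntryType = 'IN'
order by EmployeeCode, IN_TIME
"""


def lead_pairs(rows):
    """Return (code, in_time, out_time) tuples; out_time is None when no OUT follows directly."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytable (EmployeeCode text, EntryType text, TIME text)")
    conn.executemany("insert into mytable values (?, ?, ?)", rows)
    result = conn.execute(LEAD_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Hypothetical data with one unmatched IN at 08:00.
    sample = [("E1", "IN", "08:00"), ("E1", "IN", "09:00"), ("E1", "OUT", "10:00")]
    for row in lead_pairs(sample):
        print(row)
```

With this data the unmatched 08:00 entry comes back with a NULL OUT_TIME instead of being paired with the 10:00 exit, so broken pairs stay visible in the output.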

Related

Create partitions based on column values in sql

I am very new to SQL and query writing, and after a lot of trying I am asking for help.
As shown in the picture, I want to partition the data based on is_late = 1 and show its count (that is, 2), but at the same time capture the value of last_status where is_late = 0 so that everything is displayed in a single row.
The task is to calculate how many times the rider was late and the time he took from the first occurrence of the estimated time to the last_status.
Desired output:
You can use the following query:
SELECT
rider_id,
task_created_time,
expected_time_to_arrive,
is_late,
last_status,
task_count,
CONVERT(VARCHAR(5), DATEADD(MINUTE, DATEDIFF(MINUTE, expected_time_to_arrive, last_status), 0), 114) AS time_delayed
FROM
(SELECT
rider_id,
task_created_time,
expected_time_to_arrive,
is_late,
SUM(CASE WHEN is_late = 1 THEN 1 ELSE 0 END) OVER(PARTITION BY rider_id ORDER BY rider_id) AS task_count,
ROW_NUMBER() OVER(PARTITION BY rider_id ORDER BY rider_id) AS num,
MAX(last_status) OVER(PARTITION BY rider_id ORDER BY rider_id) AS last_status
FROM myTestTable) t
WHERE num = 1

Oracle - SQL query to bin the data from one table to another based on a column value?

What SQL query can I use to generate Table2 from Table1 (tables shown below)?
Table2 needs to bin the timestamps from Table1 into start and end times according to the Table1.Status field. As a result, Table2.Status should toggle between 0 and 1 in every row.
Table2.StartTime = first value of Table1.TimeStamp whenever Table1.Status changes from 1 to 0 or vice versa.
Table2.EndTime = (first value of Table1.TimeStamp at the next change of Table1.Status) - 1 millisecond
The first record in Table1 can start from any Status - 1 or 0.
Table1
TimeStamp Status
11-NOV-11 12.45.45.445000000 AM 1
11-NOV-11 12.46.45.445000000 AM 1
11-NOV-11 12.47.45.445000000 AM 1
11-NOV-11 12.48.45.445000000 AM 0
11-NOV-11 12.49.45.445000000 AM 0
11-NOV-11 12.50.45.445000000 AM 1
11-NOV-11 12.51.45.445000000 AM 0
11-NOV-11 12.52.45.445000000 AM 0
11-NOV-11 12.53.45.445000000 AM 1
...
Table2
StartTime EndTime Status
11-NOV-11 12.45.45.445000000 AM 11-NOV-11 12.48.45.444000000 AM 1
11-NOV-11 12.48.45.445000000 AM 11-NOV-11 12.50.45.444000000 AM 0
11-NOV-11 12.50.45.445000000 AM 11-NOV-11 12.51.45.444000000 AM 1
11-NOV-11 12.51.45.445000000 AM 11-NOV-11 12.53.45.444000000 AM 0
...
Any help would be appreciated.
select min(t2.TimeStamp) Start_time
, lead(min(t2.TimeStamp), 1)over(order by min(t2.TimeStamp)) - INTERVAL '0.001' SECOND End_time
, t2.status
from (
select t1.TimeStamp TimeStamp, STATUS, group_tag, first_value(group_tag ignore nulls)over(order by t1.TimeStamp ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) group_tag_gap_filling
from (
select t.TimeStamp, Status, case when lead(status, 1, 2)over(order by t.TimeStamp) != status then row_number()over(order by t.TimeStamp)end group_tag
from Table1 t
)t1
)t2
group by t2.group_tag_gap_filling, t2.status
;
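First, in inline view "t1", I create temporary groups by tagging the last row of each run of equal Status values with its row number.
Second, in "t2", I fill in the gaps so that every row carries the tag of its group.
Finally, I extract what is needed using the min aggregate function and the lead analytic function, and I subtract 0.001 second to get End_time.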
This is a gaps and islands problem, with a slight twist. The twist being that you don't want to report the most recent timestamp record for each island, but rather you want to report the record immediately following after that most recent record. So, we can use the difference in row numbers method, and also compute the lead of the timestamp column.
WITH cte AS (
SELECT t.*, ROW_NUMBER() OVER (ORDER BY ts) rn1,
ROW_NUMBER() OVER (PARTITION BY Status ORDER BY ts) rn2,
LEAD(ts) OVER (ORDER BY ts) AS ts_lead
FROM Table1 t
)
SELECT MIN(ts) AS StartTime, MAX(ts_lead) AS EndTime
FROM cte
GROUP BY Status, rn1 - rn2
HAVING MAX(ts_lead) IS NOT NULL
ORDER BY MIN(ts);
You can achieve it using the analytical functions and GROUP BY as follows:
SELECT MIN(TIMESTAMP) AS STARTTIME, MAX(TIMESTAMP) AS ENDTIME, STATUS
FROM
(SELECT T.TIMESTAMP, T.STATUS, SUM(CHANGE_POINT) OVER (ORDER BY TIMESTAMP) AS SM
FROM
(SELECT T.TIMESTAMP, T.STATUS,
CASE WHEN LAG(STATUS) OVER (ORDER BY TIMESTAMP) <> STATUS THEN 1 END AS CHANGE_POINT
FROM YOUR_TABLE T
) T
)
GROUP BY SM, STATUS
You can also use the MATCH_RECOGNIZE clause as follows:
SELECT * FROM YOUR_TABLE
MATCH_RECOGNIZE
(
ORDER BY TIMESTAMP
MEASURES
FIRST( TIMESTAMP ) AS STARTDATE,
LAST( TIMESTAMP ) AS ENDDATE,
FIRST( STATUS ) AS STATUS
ONE ROW PER MATCH
PATTERN( same_field+ )
DEFINE same_field AS FIRST(STATUS) = STATUS
);
There are many ways to solve gaps-and-islands problems. In this case, the simplest method is two window functions and filtering:
select status, ts as start_time, lead(ts) over (order by ts) as end_time
from (select t1.*, lag(status) over (order by ts) as prev_status
from table1 t1
) t1
where prev_status is null or prev_status <> status
order by start_time;
The logic is simple:
Keep each row where the status changes, filtering out all other rows.
Use lead() to get the ending timestamp.
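The two steps above can be sketched as a runnable example. It assumes Python's sqlite3 module with SQLite 3.25 or later instead of Oracle, stores the timestamps as ISO text in a column named ts, and uses hypothetical names (READINGS, ISLANDS_SQL, status_islands). The 1 millisecond subtraction from the question is left out, so each end_time equals the next start_time.

```python
import sqlite3

# Table1 from the question, with timestamps rewritten as sortable ISO text.
READINGS = [
    ("2011-11-11 00:45:45.445", 1),
    ("2011-11-11 00:46:45.445", 1),
    ("2011-11-11 00:47:45.445", 1),
    ("2011-11-11 00:48:45.445", 0),
    ("2011-11-11 00:49:45.445", 0),
    ("2011-11-11 00:50:45.445", 1),
    ("2011-11-11 00:51:45.445", 0),
    ("2011-11-11 00:52:45.445", 0),
    ("2011-11-11 00:53:45.445", 1),
]

# Step 1: lag() exposes the previous status; the where clause keeps change rows only.
# Step 2: lead() runs over the surviving rows, so it yields the next change time.
ISLANDS_SQL = """
select status, ts as start_time, lead(ts) over (order by ts) as end_time
from (select r.*, lag(status) over (order by ts) as prev_status
      from table1 r
     ) changes
where prev_status is null or prev_status <> status
order by start_time
"""


def status_islands(readings):
    """Return (status, start_time, end_time) per run of equal status values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table table1 (ts text, status integer)")
    conn.executemany("insert into table1 values (?, ?)", readings)
    result = conn.execute(ISLANDS_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in status_islands(READINGS):
        print(row)
```

The last island has no following change, so its end_time is NULL; the question's Table2 omits that open-ended row.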

SQL query get first punch and lastpunch for every employee

I have the following data in SQL Server:
What I need is, for every day and employee (employeeId), the following data:
The AccessCode column means I = PunchIn and O = PunchOut, and we have to filter by lunchtype = 'N'.
So basically the result should return only one row per day, and all the punch-ins and punch-outs between the first entrance and the last exit shouldn't be considered.
Any clue?
You can do conditional aggregation :
select employeeid, punch_in, punch_out,
       dateadd(second, datediff(second, punch_in, punch_out), 0) as Hours
from (select employeeid,
             -- assumes the punch column is named times, as in the group by below
             min(case when AccessCode = 'I' then times end) as punch_in,
             max(case when AccessCode = 'O' then times end) as punch_out
      from myTable t
      where lunchtype = 'N'
      group by employeeid, convert(date, times)
     ) t;
You can try this
with cte as
(select
    *,
    cast(times as date) as myda
 from myTable
 where lunchtype = 'N'
)
select
    employeeid,
    mn as punch_in,
    mx as punch_out,
    datediff(minute, mn, mx)/60.0 as hours
from
    (select
        employeeid,
        -- partition by employee as well as day, so employees are not mixed together
        min(times) over (partition by employeeid, myda) as mn,
        max(times) over (partition by employeeid, myda) as mx
     from cte
    ) t
group by
    employeeid, mn, mx
Try this:
select employeeId,
min(case when accessCode = 'I' then timestamp end) punchIn,
max(case when accessCode = 'O' then timestamp end) punchOut
from myTable
where lunchtype = 'N'
group by employeeId

SUM of time spent in a State

Please consider the table below for call center agent states.
What I need is to calculate the sum of time Bryan spent in "Break" for the whole day.
This is what I'm trying to execute but it returns some inaccurate values:
select sum (CASE
WHEN State = 'Not Working' and Reason = 'Break'
THEN Datediff(SECOND, [Time_Stamp], CURRENT_TIMESTAMP)
else '' END) as Break_Overall
from MyTable
where Agent = 'Bryan'
Use lead():
select agent,
       sum(datediff(second, time_stamp, next_timestamp)) as Break_Overall
from (select t.*,
             lead(time_stamp) over (partition by agent order by time_stamp) as next_timestamp
      from mytable t
     ) t
where state = 'Not Working' and reason = 'Break'
group by agent;
If the agent can currently be on break, you might want a default value:
select agent,
       sum(datediff(second, time_stamp, next_timestamp)) as Break_Overall
from (select t.*,
             lead(time_stamp, 1, current_timestamp) over (partition by agent
                                                          order by time_stamp) as next_timestamp
      from mytable t
     ) t
where state = 'Not Working' and reason = 'Break'
group by agent;
I'm a little uncomfortable with this logic, because current_timestamp has a date component, but your times don't.
EDIT:
In SQL Server 2008, you can do:
select agent,
       sum(datediff(second, time_stamp, coalesce(next_timestamp, current_timestamp))) as Break_Overall
from (select t.*, t2.time_stamp as next_timestamp
      from mytable t outer apply
           (select top 1 t2.*
            from mytable t2
            where t2.agent = t.agent and t2.time_stamp > t.time_stamp
            order by t2.time_stamp  -- earliest later row is the next state change
           ) t2
     ) t
where state = 'Not Working' and reason = 'Break'
group by agent;
As it is, you're getting the difference between the record's Time_Stamp and CURRENT_TIMESTAMP. That's probably not correct - you probably want to get the difference between the record's Time_Stamp and the next Time_Stamp for the same "Agent".
(Note that "Agent" will also present problems if you have multiple Agents with the same name; you probably want to store Agents in a different table and use a unique identifier as a foreign key.)
So, for Bryan, you'd get the sum of the "total time" for both the 8:30:21 record and the 11:34:58 record, which is the right set of records. The problem is that "total time" is calculated incorrectly, so what you actually get is the sum of the time elapsed since 8:30:21 and the time elapsed since 11:34:58.
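The difference between "time until now" and "time until the next record" can be checked with a short sketch. It assumes Python's sqlite3 module with SQLite 3.25 or later instead of SQL Server, so datediff is replaced by subtracting strftime('%s', ...) epoch seconds. The rows are hypothetical (only the 8:30:21 and 11:34:58 break starts come from the discussion above), as are the 'Ready' state, the second agent and the name break_seconds.

```python
import sqlite3

# Hypothetical state log: (agent, time_stamp, state, reason).
STATE_ROWS = [
    ("Bryan", "2024-01-01 08:00:00", "Ready", ""),
    ("Bryan", "2024-01-01 08:30:21", "Not Working", "Break"),
    ("Bryan", "2024-01-01 08:45:21", "Ready", ""),
    ("Bryan", "2024-01-01 11:34:58", "Not Working", "Break"),
    ("Bryan", "2024-01-01 11:44:58", "Ready", ""),
    ("Alice", "2024-01-01 09:00:00", "Not Working", "Break"),
    ("Alice", "2024-01-01 09:05:00", "Ready", ""),
]

# lead() is computed over every row of the agent first; the break filter is
# applied afterwards, so each break is measured up to the next state change.
BREAK_SQL = """
select agent,
       sum(cast(strftime('%s', next_time_stamp) as integer)
         - cast(strftime('%s', time_stamp) as integer)) as break_seconds
from (select t.*,
             lead(time_stamp) over (partition by agent order by time_stamp) as next_time_stamp
      from mytable t
     ) t
where state = 'Not Working' and reason = 'Break'
group by agent
order by agent
"""


def break_seconds(rows):
    """Return (agent, total seconds spent in Break) per agent."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytable (agent text, time_stamp text, state text, reason text)")
    conn.executemany("insert into mytable values (?, ?, ?, ?)", rows)
    result = conn.execute(BREAK_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in break_seconds(STATE_ROWS):
        print(row)
```

A break that is still open has no next row, so its difference is NULL and the sum ignores it; that is the case the current_timestamp default in the earlier answer is meant to cover.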

SQL Question: Getting Datediff in days elapsed for each record in a group

Given this table:
How can I get the datediff in days between each status_date for each group of ID_Number? In other words I need to find the number of elapsed days for each status that the ID_Number has been given.
Some things to know:
All ID_Number will have a received_date which should be the earliest date for each ID_Number (but app doesn't enforce)
For each ID_Number there will be a status with a corresponding status_date which is the date that the ID_Number was given that particular status.
The status column doesn't always necessarily go in the same order every time (app doesn't enforce)
All ID_Number will have a closed_date which should be the latest date (but app doesn't enforce)
Sample output:
So for ID_Number 2001, the first date (received_date) is 2009-05-02 and the next date you encounter has a status of 'open' and is 2009-05-02 so elapsed days is 0. Moving on to the next date encountered is 2009-05-10 with a status of 'invest' and the elapsed days is 8 counting from the prior date. The next date encountered is 2009-07-11 and the elapsed days is 62 counting from the previous date.
Edited to add:
Is it possible to have the elapsed days end up as a column on this table/view?
I also forgot to add that this is SQL Server 2000.
What I understand is that you need the difference between the first status_date and the next status_date for the same id and so on up to the closed_date.
This will only work in SQL 2005 and up.
;with test as (
    select
        [key],
        id_number,
        status,
        received_date,
        status_date,
        closed_date,
        row_number() over (partition by id_number order by status_date, [key]) as rownum
    from #test
)
select
    t1.[key],
    t1.id_number,
    t1.status,
    t1.status_date,
    t1.received_date,
    t1.closed_date,
    datediff(d, case when t1.rownum = 1
                     then t1.received_date
                     else
                         case when t2.status_date is null
                              then t1.closed_date
                              else t2.status_date
                         end
                end,
             t1.status_date
    ) as days
from test t1
left outer join test t2
    on t1.id_number = t2.id_number
    and t2.rownum = t1.rownum - 1
This solution will work with SQL 2000, but I am not sure how well it will perform:
select *,
datediff(d,
case when prev_date is null
then closed_date
else prev_date
end,
status_date )
from (
select *,
isnull( ( select top 1 t2.status_date
from #test t2
where t1.id_number = t2.id_number
and t2.status_date < t1.status_date
order by t2.status_date desc
),received_date) as prev_date
from #test t1
) a
order by id_number, status_date
Note: Replace the #Test table with the name of your table.
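The "previous date via correlated subquery" idea can be checked against the worked example for ID_Number 2001. This is a sketch under stated assumptions: it runs on Python's sqlite3 module rather than SQL Server 2000, so `top 1 ... order by ... desc` becomes `max(...)` and datediff becomes a julianday difference. The table name status_log, the 'closed' status on the last row, the closed_date value and the function name elapsed_days are hypothetical.

```python
import sqlite3

# (id_number, status, received_date, status_date, closed_date); dates as ISO text.
STATUS_ROWS = [
    (2001, "open", "2009-05-02", "2009-05-02", "2009-07-11"),
    (2001, "invest", "2009-05-02", "2009-05-10", "2009-07-11"),
    (2001, "closed", "2009-05-02", "2009-07-11", "2009-07-11"),
]

# For each row, find the latest earlier status_date of the same id_number;
# fall back to received_date for the first status of that id_number.
ELAPSED_SQL = """
select id_number, status,
       cast(julianday(status_date) - julianday(prev_date) as integer) as elapsed_days
from (
    select t1.*,
           coalesce((select max(t2.status_date)
                     from status_log t2
                     where t2.id_number = t1.id_number
                       and t2.status_date < t1.status_date),
                    t1.received_date) as prev_date
    from status_log t1
) a
order by id_number, status_date
"""


def elapsed_days(rows):
    """Return (id_number, status, days since the previous status date)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table status_log (id_number integer, status text, "
        "received_date text, status_date text, closed_date text)"
    )
    conn.executemany("insert into status_log values (?, ?, ?, ?, ?)", rows)
    result = conn.execute(ELAPSED_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in elapsed_days(STATUS_ROWS):
        print(row)
```

Rows that share a status_date do not see each other as "previous" because the comparison is strict, so both are measured from the same earlier date.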
Some sample output would really help, but this is a guess at what you mean, assuming you want that information for each ID_Number/Status combination:
select ID_Number, Status, datediff(d, StartDate, EndDate) as DaysElapsed
from (
    select ID_Number, Status,
           min(coalesce(received_date, status_date)) as StartDate,
           max(coalesce(closed_date, status_date)) as EndDate
    from Table1
    group by ID_Number, Status
) a
The tricky bit is determining the previous status and putting it on the same row as the current status. It would be simplified a little if there were a correlation between Key and StatusDate (i.e. that Key(x) > Key(y) always implies StatusDate(x) >= StatusDate(y)). Unfortunately, that doesn't seem to be the case.
PS: I am assuming Key is a unique identifier on your table; you haven't said anything to indicate otherwise.
SELECT [Key],
       ID_Number,
       (
           SELECT TOP 1 [Key]
           FROM StatusUpdates prev
           WHERE (prev.ID_Number = cur.ID_Number)
             AND (    (prev.StatusDate < cur.StatusDate)
                   OR (     prev.StatusDate = cur.StatusDate
                        AND prev.[Key] < cur.[Key]
                      )
                 )
           /* Descending, so TOP 1 is the closest earlier row. Consider index on (ID_Number, StatusDate, Key) */
           ORDER BY StatusDate DESC, [Key] DESC
       ) PrevKey
FROM StatusUpdates cur
Once you have this as a basis, it's easy to extrapolate to any other info you need from the current or previous StatusUpdate. E.g.
SELECT c.*,
       p.Status AS PrevStatus,
       p.StatusDate AS PrevStatusDate,
       DATEDIFF(d, p.StatusDate, c.StatusDate) AS DaysElapsed
FROM (
         SELECT [Key],
                ID_Number,
                Status,
                StatusDate,
                (
                    SELECT TOP 1 [Key]
                    FROM StatusUpdates prev
                    WHERE (prev.ID_Number = cur.ID_Number)
                      AND (    (prev.StatusDate < cur.StatusDate)
                            OR (     prev.StatusDate = cur.StatusDate
                                 AND prev.[Key] < cur.[Key]
                               )
                          )
                    ORDER BY StatusDate DESC, [Key] DESC
                ) PrevKey
         FROM StatusUpdates cur
     ) c
JOIN StatusUpdates p ON
     p.[Key] = c.PrevKey