Given a table with a consecutive run of data: a number that always increases while a task is in progress and resets back to zero when the next task starts, how do you select the maximum of each run of data?
Each consecutive run can have any number of rows, and the runs of data are marked by a a "start" and "end" row, eg the data might look like
user_id, action, qty, datetime
1, start, 0, 2017-01-01 00:00:01
1, record, 0, 2017-01-01 00:00:01
1, record, 4, 2017-01-01 00:00:02
1, record, 5, 2017-01-01 00:00:03
1, record, 6, 2017-01-01 00:00:04
1, end, 0, 2017-01-01 00:00:04
1, start, 0, 2017-01-01 00:00:05
1, record, 0, 2017-01-01 00:00:05
1, record, 2, 2017-01-01 00:00:06
1, record, 3, 2017-01-01 00:00:07
1, end, 0, 2017-01-01 00:00:07
2, start, 0, 2017-01-01 00:00:08
2, record, 0, 2017-01-01 00:00:08
2, record, 3, 2017-01-01 00:00:09
2, record, 8, 2017-01-01 00:00:10
2, end, 0, 2017-01-01 00:00:10
And the results would be the maximum value of each run:
user_id, action, qty, datetime
1, record, 6, 2017-01-01 00:00:04
1, record, 3, 2017-01-01 00:00:07
2, record, 8, 2017-01-01 00:00:10
Using any postgres sql syntax (9.3)? Its some kind of grouping then selecting max from each group, but I don't see how to do the grouping part.
If theres no overlapping for a single user and the next run always starts at a later time, then you can use LAG() window function.
with the_table(user_id, action, qty, datetime) as (
select 1,'start', 0, '2017-01-01 00:00:01'::timestamp union all
select 1,'record', 0, '2017-01-01 00:00:01'::timestamp union all
select 1,'record', 4, '2017-01-01 00:00:02'::timestamp union all
select 1,'record', 5, '2017-01-01 00:00:03'::timestamp union all
select 1,'record', 6, '2017-01-01 00:00:04'::timestamp union all
select 1,'end', 0, '2017-01-01 00:00:04'::timestamp union all
select 1,'start', 0, '2017-01-01 00:00:05'::timestamp union all
select 1,'record', 0, '2017-01-01 00:00:05'::timestamp union all
select 1,'record', 2, '2017-01-01 00:00:06'::timestamp union all
select 1,'record', 3, '2017-01-01 00:00:07'::timestamp union all
select 1,'end', 0, '2017-01-01 00:00:07'::timestamp union all
select 2,'start', 0, '2017-01-01 00:00:08'::timestamp union all
select 2,'record', 0, '2017-01-01 00:00:08'::timestamp union all
select 2,'record', 3, '2017-01-01 00:00:09'::timestamp union all
select 2,'record', 8, '2017-01-01 00:00:10'::timestamp union all
select 2,'end', 0, '2017-01-01 00:00:10'::timestamp
)
select n_user_id, n_action, n_qty, n_datetime from (
select action,
lag(user_id) over(partition by user_id order by datetime, case when action = 'start' then 0 when action = 'record' then 1 else 2 end, qty) as n_user_id,
lag(action) over(partition by user_id order by datetime, case when action = 'start' then 0 when action = 'record' then 1 else 2 end, qty) as n_action,
lag(qty) over(partition by user_id order by datetime, case when action = 'start' then 0 when action = 'record' then 1 else 2 end, qty) as n_qty,
lag(datetime) over(partition by user_id order by datetime, case when action = 'start' then 0 when action = 'record' then 1 else 2 end, qty) as n_datetime
from the_table
)t
where action = 'end'
Because some action = record rows have same datetime as start and end rows, I use CASE in ORDER BY, to be clear that start is first, then is record and then end.
Quick and dirty, assuming runs do not overlap
with bounds as (select starts.rn, starts.datetime as s, ends.datetime as e from
(select datetime,ROW_NUMBER() OVER () as rn from runs where action = 'start' order by datetime) as starts
join
(select datetime,ROW_NUMBER() OVER () as rn from runs where action = 'end' order by datetime) as ends
on starts.rn = ends.rn)
,with_run as (SELECT *, (select rn from bounds where s <= r.datetime and e >= r.datetime) as run
from runs as r)
,max_qty as (
SELECT run,max(qty) as qty
from with_run
GROUP BY run)
SELECT s.user_id,s.action,s.qty,s.datetime from with_run as s join max_qty as f on s.run = f.run AND s.qty = f.qty;
-- TEST DATA --
create table runs (user_id int, action text, qty int, datetime TIMESTAMP);
insert INTO runs VALUES
(1, 'start', 0, '2017-01-01 00:00:01')
,(1, 'record', 0, '2017-01-01 00:00:01')
,(1, 'record', 4, '2017-01-01 00:00:02')
,(1, 'record', 5, '2017-01-01 00:00:03')
,(1, 'record', 6, '2017-01-01 00:00:04')
,(1, 'end', 0, '2017-01-01 00:00:04')
,(1, 'start', 0, '2017-01-01 00:00:05')
,(1, 'record', 0, '2017-01-01 00:00:05')
,(1, 'record', 2, '2017-01-01 00:00:06')
,(1, 'record', 3, '2017-01-01 00:00:07')
,(1, 'end', 0, '2017-01-01 00:00:07')
,(2, 'start', 0, '2017-01-01 00:00:08')
,(2, 'record', 0, '2017-01-01 00:00:08')
,(2, 'record', 3, '2017-01-01 00:00:09')
,(2, 'record', 8, '2017-01-01 00:00:10')
,(2, 'end', 0, '2017-01-01 00:00:10');
UPDATE
#Oto Shavadze answer can be shortened
with lookup as (select action,lag(t.*) over(order by datetime, case when action = 'start' then 0 when action = 'record' then 1 else 2 end) as r from runs t)
select (r::runs).user_id
,(r::runs).action
,(r::runs).qty
,(r::runs).datetime
from lookup where action = 'end';
I think OP unclear about what considers maximum, last record before end or highest qty in run.
Related
My Table Structure is like below:
Carrier Terminal timestamp1
1 1 21-Mar-17
2 101 21-Mar-17
3 2 21-Mar-17
4 202 21-Mar-17
5 3 21-Mar-17
6 303 21-Mar-17
where carrier
flight 1,2 = Delta
flight 3,4 = Air France
flight 5,6 = Lufthanse
and
Terminal 1,101 = T1
terminal 2,202 = T2
terminal 3,303 = T3
I am trying output like below:
count(Delta), count(Air France), count(Lufthansa), terminal as column output
2, 0, 0, T1
0, 2, 0, T2
0, 0, 2, T3
I have started like this
select count(Delta), count(Air France), count(Lufthansa), terminal
from table_name
where timestamp between '01-Mar-18 07.00.00.000000 AM' and '30-Mar-18 07.59.59.999999 AM'
I am trying to write a query to have a count of different carriers flown through a particular day for each terminal
Any Advise will be highly appreciated
I'm making a whole lot of assumptions for this to work... I've extracted all the rules you've mentioned in your text and I've assumed that those structures are are already in place. Otherwise, let us know :)
with flights(carrier, terminal, departure) as(
select 1, 1, timestamp '2017-03-01 01:00:00' from dual union all
select 2, 101, timestamp '2017-03-01 02:00:00' from dual union all
select 3, 2, timestamp '2017-03-01 03:00:00' from dual union all
select 4, 202, timestamp '2017-03-01 04:00:00' from dual union all
select 5, 3, timestamp '2017-03-01 05:00:00' from dual union all
select 6, 303, timestamp '2017-03-01 06:00:00' from dual
)
,carriers(carrier, carrier_name) as(
select 1, 'Delta' from dual union all
select 2, 'Delta' from dual union all
select 3, 'Air France' from dual union all
select 4, 'Air France' from dual union all
select 5, 'Lufthanse' from dual union all
select 6, 'Lufthanse' from dual
)
,terminals(terminal, terminal_name) as(
select 1, 'T1' from dual union all
select 101, 'T1' from dual union all
select 2, 'T2' from dual union all
select 202, 'T2' from dual union all
select 3, 'T3' from dual union all
select 303, 'T3' from dual
)
select terminal_name
,count(case when carrier_name = 'Delta' then 1 end) as "Delta"
,count(case when carrier_name = 'Air France' then 1 end) as "Air France"
,count(case when carrier_name = 'Lufthanse' then 1 end) as "Lufthanse"
from flights f
join carriers c using(carrier)
join terminals t using(terminal)
where departure >= timestamp '2017-03-01 00:00:00'
and departure < timestamp '2017-04-01 00:00:00'
group by terminal_name
order by terminal_name;
with
t ( flight, gate, ts ) as (
select 1, 1, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual union all
select 2, 101, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual union all
select 3, 2, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual union all
select 4, 202, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual union all
select 5, 3, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual union all
select 6, 303, to_timestamp('21-Mar-17', 'dd-Mon-rr') from dual
)
-- End of simulated inputs (for testing only, not part of the solution).
-- SQL query begins below this line. Use your actual table and column names.
select count (case when flight in (1, 2) then 1 end) as delta
, count (case when flight in (3, 4) then 1 end) as air_france
, count (case when flight in (5, 6) then 1 end) as lufthansa
, case when gate in (1, 101) then 'T1'
when gate in (2, 202) then 'T2'
when gate in (3, 303) then 'T3' end as terminal
from t
where ts between '21-Mar-17 02.00.00.000000 AM' and '21-Mar-17 10.00.00.000000 AM'
group by case when gate in (1, 101) then 'T1'
when gate in (2, 202) then 'T2'
when gate in (3, 303) then 'T3' end
order by terminal
;
DELTA AIR_FRANCE LUFTHANSA TERMINAL
---------- ---------- ---------- --------
2 0 0 T1
0 2 0 T2
0 0 2 T3
I am working with a blood pressure database in SQL Server which contains patient_id, timestamp (per minute) and systolicBloodPressure.
My goals are to find:
the number of episodes in which a patient is under a certain blood pressure threshold
An episode consists of the timestmap where the patient drops below a certain threshold until the timestamp where the patient comes above the threshold.
the mean blood pressure per episode per patient
the duration of the episode per episode per patient
What I have tried so far:
I am able to identify episodes by just making a new column which sets to 1 if threshold is reached.
select *
, CASE
when sys < threshold THEN '1'
from BPDATA
However , I am not able to 'identify' different episodes within the patient; episode1 episode 2 with their relative timestamps.
Could someone help me with this? Or is there someone with a better different solution?
EDIT: Sample data with example threshold 100
ID Timestamp SysBP below Threshold
----------------------------------------------------
1 9:38 110 Null
1 9:39 105 Null
1 9:40 96 1
1 9:41 92 1
1 9:42 102 Null
2 12:23 95 1
2 12:24 98 1
2 12:25 102 Null
2 12:26 104 Null
2 12:27 94 1
2 12:28 88 1
2 12:29 104 Null
Thanks for the sample data.
This should work:
declare #t table (ID int, Timestamp time, SysBP int, belowThreshold bit)
insert #t
values
(1, '9:38', 110, null),
(1, '9:39', 105, null),
(1, '9:40', 96, 1),
(1, '9:41', 92, 1),
(1, '9:42', 102, null),
(2, '12:23', 95, 1),
(2, '12:24', 98, 1),
(2, '12:25', 102, null),
(2, '12:26', 104, null),
(2, '12:27', 94, 1),
(2, '12:28', 88, 1),
(2, '12:29', 104, null)
declare #treshold int = 100
;with y as (
select *, case when lag(belowThreshold, 1, 0) over(partition by id order by timestamp) = belowThreshold then 0 else 1 end epg
from #t
),
z as (
select *, sum(epg) over(partition by id order by timestamp) episode
from y
where sysbp < #treshold
)
select id, episode, count(episode) over(partition by id) number_of_episodes_per_id, avg(sysbp) avg_sysbp, datediff(minute, min(timestamp), max(timestamp))+1 episode_duration
from z
group by id, episode
This answer relies on LEAD() and LAG() functions so only works on 2012 or later:
Setup:
CREATE TABLE #bloodpressure
(
Patient_id int,
[TimeStamp] SmallDateTime,
SystolicBloodPressure INT
)
INSERT INTO #bloodpressure
VALUES
(1, '2017-01-01 09:01', 60),
(1, '2017-01-01 09:02', 55),
(1, '2017-01-01 09:03', 60),
(1, '2017-01-01 09:04', 70),
(1, '2017-01-01 09:05', 72),
(1, '2017-01-01 09:06', 75),
(1, '2017-01-01 09:07', 60),
(1, '2017-01-01 09:08', 50),
(1, '2017-01-01 09:09', 52),
(1, '2017-01-01 09:10', 53),
(1, '2017-01-01 09:11', 65),
(1, '2017-01-01 09:12', 71),
(1, '2017-01-01 09:13', 73),
(1, '2017-01-01 09:14', 74),
(2, '2017-01-01 09:01', 70),
(2, '2017-01-01 09:02', 75),
(2, '2017-01-01 09:03', 80),
(2, '2017-01-01 09:04', 70),
(2, '2017-01-01 09:05', 72),
(2, '2017-01-01 09:06', 75),
(2, '2017-01-01 09:07', 60),
(2, '2017-01-01 09:08', 50),
(2, '2017-01-01 09:09', 52),
(2, '2017-01-01 09:10', 53),
(2, '2017-01-01 09:11', 65),
(2, '2017-01-01 09:12', 71),
(2, '2017-01-01 09:13', 73),
(2, '2017-01-01 09:14', 74),
(3, '2017-01-01 09:12', 71),
(3, '2017-01-01 09:13', 60),
(3, '2017-01-01 09:14', 74)
Now using Lead And Lag to find the previous rows values, to find whether this is the beginning or end of a sequence of low blood pressures, in combination with a common table expression. Using a UNION of start and end events ensures that an event which covers just one minute is recorded as both a start and an end event.
;WITH CTE
AS
(
SELECT *,
LAG(SystolicBloodPressure,1)
OVER (PaRTITION BY Patient_Id ORDER BY TimeStamp) As PrevValue,
Lead(SystolicBloodPressure,1)
OVER (PaRTITION BY Patient_Id ORDER BY TimeStamp) As NextValue
FROM #bloodpressure
),
CTE2
AS
(
-- Get Start Events (EventType 1)
SELECT 1 As [EventType], Patient_id, TimeStamp,
ROW_NUMBER() OVER (ORDER BY Patient_id, TimeStamp) AS RN
FROM CTE
WHERE (PrevValue IS NULL AND SystolicBloodPressure < 70) OR
(PrevValue >= 70 AND SystolicBloodPressure < 70)
UNION
-- Get End Events (EventType 2)
SELECT 2 As [EventType], Patient_id, TimeStamp,
ROW_NUMBER() OVER (ORDER BY Patient_id, TimeStamp) AS RN
FROM CTE
WHERE (NextValue IS NULL AND SystolicBloodPressure < 70 ) OR
(NextValue >= 70 AND SystolicBloodPressure < 70)
)
SELECT C1.Patient_id, C1.TimeStamp As EventStart, C2.TimeStamp As EventEnd
FROM CTE2 C1
INNER JOIN CTE2 C2
ON C1.Patient_id = C2.Patient_id AND C1.RN = C2.RN
WHERE C1.EventType = 1 AND C2.EventType = 2
ORDER BY C1.Patient_id, C1.TimeStamp
Simplified structure.
I need the two dates between a record that has an action type of 4 and an action type of 1.
The record could be in that state multiple times and I would need separate rows for their times
For example for IncidentId = 1
Row 1 - StartTime = 2017-01-01 14:00 (id:3) - End Time = 2017-01-01 20:00 (id: 5)
Row 2 - StartTime = 2017-01-01 21:00 (id:6) - End Time = 2017-01-02 11:00 (id: 9)
CREATE TABLE #returntable
(
[incidentid] INT,
[starttime] DATETIME,
[endtime] DATETIME
)
CREATE TABLE #testtableofdoom
(
[incidentlogid] INT,
[incidentid] INT,
[timestamp] DATETIME,
[actiontypeid] INT
)
INSERT INTO #testtableofdoom
( incidentlogid, incidentid, timestamp, actiontypeid )
VALUES ( 1, 1, '2017-01-01 09:00', 1 )
, ( 2, 1, '2017-01-01 11:00', 1 )
, ( 3, 1, '2017-01-01 14:00', 4 )
, ( 4, 1, '2017-01-01 16:00', 4 )
, ( 5, 1, '2017-01-01 20:00', 1 )
, ( 6, 1, '2017-01-01 21:00', 4 )
, ( 7, 1, '2017-01-02 09:00', 4 )
, ( 8, 2, '2017-01-02 10:00', 1 )
, ( 9, 1, '2017-01-02 11:00', 1 )
, ( 10, 1, '2017-01-02 14:00', 1 )
, ( 11, 2, '2017-01-02 15:00', 4 )
, ( 12, 1, '2017-01-02 16:00', 1 )
, ( 13, 1, '2017-01-02 17:00', 1 )
, ( 14, 1, '2017-01-02 18:00', 1 )
, ( 15, 2, '2017-01-02 15:00', 1 );
DROP TABLE #testtableofdoom
DROP TABLE #returntable
I used table variables instead of temp tables, and shorter column names than you, but this works:
declare #tt TABLE (
logId INT, iId INT,
dt DATETIME, atId INT
INSERT #tt (logId, iId,
dt, atId) values
(1, 1, '2017-01-01 09:00', 1),
(2, 1, '2017-01-01 11:00', 1),
(3, 1, '2017-01-01 14:00', 4),
(4, 1, '2017-01-01 16:00', 4),
(5, 1, '2017-01-01 20:00', 1),
(6, 1, '2017-01-01 21:00', 4),
(7, 1, '2017-01-02 09:00', 4),
(8, 2, '2017-01-02 10:00', 1),
(9, 1, '2017-01-02 11:00', 1),
(10, 1, '2017-01-02 14:00', 1),
(11, 2, '2017-01-02 15:00', 4),
(12, 1, '2017-01-02 16:00', 1),
(13, 1, '2017-01-02 17:00', 1),
(14, 1, '2017-01-02 18:00', 1),
(15, 2, '2017-01-02 15:00', 1)
Select s.logId startLogid, e.logId endLogId,
s.iID, s.dt startTime, e.dt endTime
from #tt s join #tt e
on e.logId =
(Select min(logId) from #tt
where iId = s.iID
and atId = 1
and logId > s.logId)
where s.aTid = 4
and ((Select atId from #tt
Where logId =
(Select Max(logId) from #tt
where logId < s.LogId
and iId = s.iId)) = 1
or Not Exists
(Select * from #tt
Where logId < s.LogId
and iId = s.iID))
This produces the following:
startLogid endLogId iID startTime endTime
----------- ----------- ---- ---------------- ----------------
3 5 1 2017-01-01 14:00 2017-01-01 20:00
6 9 1 2017-01-01 21:00 2017-01-02 11:00
11 15 2 2017-01-02 15:00 2017-01-02 15:00
it uses a self-join. s represents the first (start) record with actionType 4, and e represents end record with action type 1. Since logId increments, the end record must have higher logId than the start record, and it must be the lowest logId higher than the start records that has same iId and an atId = 1.
Select s.iID, s.dt startTime, e.dt endTime
from #tt s join #tt e
on e.logId =
(Select min(logId) from #tt -- lowest log greater than start logId
where iId = s.iID -- same iId
and atId = 1 -- with atId = 1
and logId > s.logId) -- greater than start logId
finally, the start record must be restricted to those "4" records which either have no other same incident records before it or have a "1" record immediately prior to it.
where s.aTid = 4
and ((Select atId from #tt -- atId of immed prior = 1
Where logId =
(Select Max(logId) from #tt
where logId < s.LogId
and iId = s.iId)) = 1
or Not Exists -- or there is no prior record
(Select * from #tt
Where logId < s.LogId
and iId = s.iID))
something like this?
select
d.[timestamp] as StartDate,
(select top 1 [timestamp]
from #testTableOfDoom d2
where d2.incidentid = 1 and d2.[timestamp] > d.[timestamp] and actiontypeid = 1
order by d2.[timestamp] asc
) as EndDate
from
(select
p.[timestamp],
LAG(p.actiontypeid) OVER (ORDER BY incidentlogid asc) PrevValue,
p.actiontypeid
from #testTableOfDoom p
where p.incidentid = 1) d
where d.actiontypeid = 4
and d.PrevValue <> 4
I can use DATEDIFF to find the difference between one set of dates like this
DATEDIFF(MINUTE, #startdate, #enddate)
but how would I find the total time span between multiple sets of dates? I don't know how many sets (stops and starts) I will have.
The data is on multiple rows with start and stops.
ID TimeStamp StartOrStop TimeCode
----------------------------------------------------------------
1 2017-01-01 07:00:00 Start 1
2 2017-01-01 08:15:00 Stop 2
3 2017-01-01 10:00:00 Start 1
4 2017-01-01 11:00:00 Stop 2
5 2017-01-01 10:30:00 Start 1
6 2017-01-01 12:00:00 Stop 2
This code would work assuming that your table only store data from one person, and they should be of the order Start/Stop/Start/Stop
WITH StartTime AS (
SELECT
TimeStamp
, ROW_NUMBER() PARTITION BY (ORDER BY TimeStamp) RowNum
FROM
<<table>>
WHERE
TimeCode = 1
), StopTime AS (
SELECT
TimeStamp
, ROW_NUMBER() PARTITION BY (ORDER BY TimeStamp) RowNum
FROM
<<table>>
WHERE
TimeCode = 2
)
SELECT
SUM (DATEDIFF( MINUTE, StartTime.TimeStamp, StopTime.TimeStamp )) As TotalTime
FROM
StartTime
JOIN StopTime ON StartTime.RowNum = StopTime.RowNum
This will work if your starts and stops are reliable. Your sample has two starts in order - 10:00 and 10:30 starts. I assume in production you will have an employee id to group on, so I added this to the sample data in place of the identity column.
Also in production, the CTE sets will be reduced by using a parameter on date. If there are overnight shifts, you would want your stops CTE to use dateadd(day, 1, #startDate) as your upper bound when retrieving end date.
Set up sample:
declare #temp table (
EmpId int,
TimeStamp datetime,
StartOrStop varchar(55),
TimeCode int
);
insert into #temp
values
(1, '2017-01-01 07:00:00', 'Start', 1),
(1, '2017-01-01 08:15:00', 'Stop', 2),
(1, '2017-01-01 10:00:00', 'Start', 1),
(1, '2017-01-01 11:00:00', 'Stop', 2),
(2, '2017-01-01 10:30:00', 'Start', 1),
(2, '2017-01-01 12:00:00', 'Stop', 2)
Query:
;with starts as (
select t.EmpId,
t.TimeStamp as StartTime,
row_number() over (partition by t.EmpId order by t.TimeStamp asc) as rn
from #temp t
where Timecode = 1 --Start time code?
),
stops as (
select t.EmpId,
t.TimeStamp as EndTime,
row_number() over (partition by t.EmpId order by t.TimeStamp asc) as rn
from #temp t
where Timecode = 2 --Stop time code?
)
select cast(min(sub.StartTime) as date) as WorkDay,
sub.EmpId as Employee,
min(sub.StartTime) as ClockIn,
min(sub.EndTime) as ClockOut,
sum(sub.MinutesWorked) as MinutesWorked
from
(
select strt.EmpId,
strt.StartTime,
stp.EndTime,
datediff(minute, strt.StartTime, stp.EndTime) as MinutesWorked
from starts strt
inner join stops stp
on strt.EmpId = stp.EmpId
and strt.rn = stp.rn
)sub
group by sub.EmpId
This works assuming your table has an incremental ID and interleaving start/stop records
--Data sample as provided
declare #temp table (
Id int,
TimeStamp datetime,
StartOrStop varchar(55),
TimeCode int
);
insert into #temp
values
(1, '2017-01-01 07:00:00', 'Start', 1),
(2, '2017-01-01 08:15:00', 'Stop', 2),
(3, '2017-01-01 10:00:00', 'Start', 1),
(4, '2017-01-01 11:00:00', 'Stop', 2),
(5, '2017-01-01 10:30:00', 'Start', 1),
(6, '2017-01-01 12:00:00', 'Stop', 2)
--let's see every pair start/stop and discard stop/start
select start.timestamp start, stop.timestamp stop,
datediff(mi,start.timestamp,stop.timestamp) minutes
from #temp start inner join #temp stop
on start.id+1= stop.id and start.timecode=1
--Sum all for required result
select sum(datediff(mi,start.timestamp,stop.timestamp) ) totalMinutes
from #temp start inner join #temp stop
on start.id+1= stop.id and start.timecode=1
Results
+-------------------------+-------------------------+---------+
| start | stop | minutes |
+-------------------------+-------------------------+---------+
| 2017-01-01 07:00:00.000 | 2017-01-01 08:15:00.000 | 75 |
| 2017-01-01 10:00:00.000 | 2017-01-01 11:00:00.000 | 60 |
| 2017-01-01 10:30:00.000 | 2017-01-01 12:00:00.000 | 90 |
+-------------------------+-------------------------+---------+
+--------------+
| totalMinutes |
+--------------+
| 225 |
+--------------+
Maybe the tricky part is the join clause. We need to join #table with itself by deferring 1 ID. Here is where on start.id+1= stop.id did its work.
In the other hand, for excluding stop/start couple we use start.timecode=1. In case we don't have a column with this information, something like stop.id%2=0 works just fine.
I am trying to group by ticker, date and display their value in different columns for different tickers.
But I don't know the exact ticker name which are in my table.
original table(tickers may have other symbols besides A, B, C):
date | ticker | value
-------------------------
1 | A | 5
1 | B | 3
1 | C | 2
2 | A | 5
2 | B | 3
2 | C | 2
3 | A | 5
3 | B | 3
3 | C | 2
.......
Write the SQL query to get the result dataframe:
date | A | B | C
--------------------------------
1 | 5 | 3 | 2
2 | 5 | 3 | 2
3 | 5 | 3 | 2
As sgeddes states, this is accomplished with a pivot. You can create a pivot dynamically when you don't know all the values. I gave an example of doing just that here
https://dba.stackexchange.com/questions/98776/dynamic-select-and-place-result-in-variable-columns/98809#98809
create Table Questions
(
id int identity,
question_id int,
question_name varchar(255)
)
go
with CTEquestion
as
(
select 1 QID
union all
Select QID+1
from CTEquestion
where QID < 11
)
insert questions
select QID, 'Question'+cast(QID as varchar(50))
from CTEquestion
insert QuestionAnswers
values ('2015-04-23', 'a1', 1, 'Canswer1')
, ('2015-04-23', 'a1', 2, 'Ianswer2')
, ('2015-04-23', 'a1', 3, 'Canswer3')
, ('2015-04-23', 'a1', 4, 'Canswer4')
, ('2015-04-23', 'a1', 5, 'Ianswer5')
, ('2015-04-23', 'a1', 6, 'Ianswer6')
, ('2015-04-23', 'a1', 7, 'Canswer7')
, ('2015-04-23', 'a1', 8, 'Canswer8')
, ('2015-04-23', 'a1', 9, 'Canswer9')
, ('2015-04-23', 'a1', 10,'Canswer10')
insert QuestionAnswers
values (CONVERT(DATE, GETDATE()), 'b2', 1, 'Canswer1')
, (CONVERT(DATE, GETDATE()), 'b2', 2, 'Canswer2')
, (CONVERT(DATE, GETDATE()), 'b2', 3, 'Canswer3')
, (CONVERT(DATE, GETDATE()), 'b2', 4, 'Canswer4')
, (CONVERT(DATE, GETDATE()), 'b2', 5, 'Canswer5')
, (CONVERT(DATE, GETDATE()), 'b2', 6, 'Ianswer6')
, (CONVERT(DATE, GETDATE()), 'b2', 7, 'Canswer7')
, (CONVERT(DATE, GETDATE()), 'b2', 8, 'Canswer8')
, (CONVERT(DATE, GETDATE()), 'b2', 9, 'Canswer9')
, (CONVERT(DATE, GETDATE()), 'b2', 10, 'Ianswer10')
insert QuestionAnswers
values (CONVERT(DATE, GETDATE()), 'c3', 1, 'Ianswer1')
, (CONVERT(DATE, GETDATE()), 'c3', 2, 'Ianswer2')
, (CONVERT(DATE, GETDATE()), 'c3', 3, 'Canswer3')
, (CONVERT(DATE, GETDATE()), 'c3', 4, 'Ianswer4')
, (CONVERT(DATE, GETDATE()), 'c3', 5, 'Canswer5')
, (CONVERT(DATE, GETDATE()), 'c3', 6, 'Ianswer6')
, (CONVERT(DATE, GETDATE()), 'c3', 7, 'Canswer7')
, (CONVERT(DATE, GETDATE()), 'c3', 8, 'Canswer8')
, (CONVERT(DATE, GETDATE()), 'c3', 9, 'Canswer9')
, (CONVERT(DATE, GETDATE()), 'c3', 10, 'Ianswer10')
insert QuestionAnswers
values (CONVERT(DATE, GETDATE()), 'a1', 1, 'Canswer1')
, (CONVERT(DATE, GETDATE()), 'a1', 2, 'Ianswer2')
, (CONVERT(DATE, GETDATE()), 'a1', 3, 'Canswer3')
, (CONVERT(DATE, GETDATE()), 'a1', 4, 'Canswer4')
, (CONVERT(DATE, GETDATE()), 'a1', 5, 'Canswer5')
, (CONVERT(DATE, GETDATE()), 'a1', 6, 'Canswer6')
, (CONVERT(DATE, GETDATE()), 'a1', 7, 'Canswer7')
, (CONVERT(DATE, GETDATE()), 'a1', 8, 'Canswer8')
, (CONVERT(DATE, GETDATE()), 'a1', 9, 'Canswer9')
, (CONVERT(DATE, GETDATE()), 'a1', 10, 'Ianswer10')
-->End test data creation
--straight join
select qa.user_id, qa.question_set, q.question_id, qa.answer
from Questions q
join QuestionAnswers qa on qa.question_id=q.question_id
order by qa.user_id
--dynamic pivot
DECLARE
#questionList varchar(max)
, #maxQID int
, #qid int
select #questionList='',#maxQID = MAX(question_id), #qid= MIN(question_id)
FROM Questions
while #qid <= #maxQID
begin
set #questionList=#questionList+'['+cast(#qid as varchar(10))+']'
select #qid=min(question_id)
from Questions
where question_id > #qid
if #qid<=#maxQID
set #questionList=#questionList+', '
end
DECLARE #SQL NVARCHAR(MAX)
SET #SQL = N'
select user_id, '+#questionList+'
from
(select q.question_id, qa.question_set, qa.user_id, qa.answer
from Questions q
join QuestionAnswers qa on qa.question_id=q.question_id) x
PIVOT
(
max(answer)
FOR question_id in ('+#questionList+')
) pvt
order by user_id'
exec sp_executesql #SQL
If you know the ticker values you can do a GROUP BY. Use CASE to do conditional SUM:
select date,
SUM(case when ticker = 'A' then value end) as A,
SUM(case when ticker = 'B' then value end) as B,
SUM(case when ticker = 'C' then value end) as C,
...
from tablename
group by date
Core ANSI SQL-99.
If the different ticker values are unknown, and may change between executions, you need product specific functionality. (The general behavior is that a SELECT always return the same number of columns - independent of table data!)