Date difference between rows - sql

I have the following SQL query:
SELECT t.trans_id, t.business_process_id, tsp.status, tsp.timestamp
FROM tran_stat_p tsp, tran t
WHERE t.trans_id = tsp.trans_id
AND tsp.timestamp BETWEEN '1-jan-2008' AND SYSDATE
AND t.business_process_id = 'ABC01'
It outputs data like this:
trans_ID business_process_id status timestamp
14444400 ABC01 F 6/5/2008 12:37:36 PM
14444400 ABC01 W 6/6/2008 1:37:36 PM
14444400 ABC01 S 6/7/2008 2:37:36 PM
14444400 ABC01 P 6/8/2008 3:37:36 PM
14444401 ABC01 F 6/5/2008 12:37:36 PM
14444401 ABC01 W 6/6/2008 1:37:36 PM
14444401 ABC01 S 6/7/2008 2:37:36 PM
14444401 ABC01 P 6/8/2008 3:37:36 PM
In addition to the above, I'd like to add a column which calculates the time difference (in days) between statuses W&F, S&W, P&S for every unique trans_id.
The idea is to figure out how long transactions sit in the various statuses before they are finally processed to status "P". The life cycle of a transaction follows this order: F -> W -> S -> P, where F is the first status and P is the final status.
Can anyone help? Thanks in advance.

The actual query would use LAG, which will give you a value from a prior row.
Your status codes won't sort as F -> W -> S -> P, which is why the query below has the big CASE statement for the LAG function's ORDER BY - it translates the status codes into a value that follows your transaction life cycle.
SELECT
t.trans_id,
t.business_process_id,
tsp.status,
tsp.timestamp,
tsp.timestamp - LAG(timestamp) OVER (
PARTITION BY tsp.trans_id
ORDER BY
CASE tsp.Status
WHEN 'F' THEN 1
WHEN 'W' THEN 2
WHEN 'S' THEN 3
WHEN 'P' THEN 4
END) AS DaysBetween
FROM tran t
INNER JOIN tran_stat_p tsp ON t.trans_id = tsp.trans_id
WHERE tsp.timestamp BETWEEN DATE '2008-01-01' AND SYSDATE
AND t.business_process_id = 'ABC01';
A couple more notes:
The query is untested. If you have trouble please post some sample data and I'll test it.
I used DATE '2008-01-01' to define January 1, 2008 because that's how Oracle (and ANSI) likes a date constant to look. When you use 1-jan-2008 you're relying on Oracle's default date format, and that's a session value which can be changed. If it's changed your query will stop working.
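To see the LAG idea run end to end, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. That is an assumption made purely so the example is self-contained: it needs SQLite 3.25 or later for window functions, the timestamp column is renamed ts, and julianday() stands in for Oracle's native date subtraction. The sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tran_stat_p (trans_id INTEGER, status TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO tran_stat_p VALUES (?, ?, ?)",
    [
        (14444400, "F", "2008-06-05 12:37:36"),
        (14444400, "W", "2008-06-06 13:37:36"),
        (14444400, "S", "2008-06-07 14:37:36"),
        (14444400, "P", "2008-06-08 15:37:36"),
    ],
)

# SQLite cannot subtract dates directly the way Oracle does, so julianday()
# turns each timestamp into a fractional day number before subtracting.
query = """
SELECT status,
       julianday(ts) - julianday(LAG(ts) OVER (
           PARTITION BY trans_id
           ORDER BY CASE status
                        WHEN 'F' THEN 1
                        WHEN 'W' THEN 2
                        WHEN 'S' THEN 3
                        WHEN 'P' THEN 4
                    END)) AS days_between
FROM tran_stat_p
ORDER BY trans_id, ts
"""
rows = conn.execute(query).fetchall()
for status, days in rows:
    # The first status of each transaction has no prior row, so LAG yields NULL.
    print(status, None if days is None else round(days, 4))
```

Each step in the sample data is one day and one hour apart, so every non-NULL difference comes out at roughly 1.0417 days.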

You can use LEAD to retrieve the next timestamp value and calculate the time spent in each status (F, W and S), and TRUNC to express the days between as an integer:
SELECT t."trans_ID", t."business_process_id", tsp."status", tsp."timestamp",
LEAD("timestamp", 1) OVER (
PARTITION BY tsp."trans_ID"
ORDER BY "timestamp") AS "next_timestamp",
trunc(LEAD("timestamp", 1) OVER (
PARTITION BY tsp."trans_ID"
ORDER BY "timestamp")) - trunc(tsp."timestamp") as "Days"
FROM tran t
INNER JOIN tran_stat_p tsp ON t."trans_ID" = tsp."trans_ID"
AND tsp."timestamp" BETWEEN '01-jan-2008 12:00:00 AM' AND SYSDATE
WHERE t."business_process_id" = 'ABC01'
See SQLFIDDLE : http://www.sqlfiddle.com/#!4/04633/49/0

Look into oracle window analytics.
http://www.orafaq.com/node/55
You'll want to do a diff of your current row date and the lag of that date.
Hope that makes sense.

Related

Group Timestamps into intervals of 5 minutes, take value that's closest to timestamp and always give out a value

I'm new to SQL coding and would heavily appreciate help for a problem I'm facing. I have the following SQL script, that gives me the following output (see picture 1):
WITH speicher as(
select a.node as NODE_ID, d.name_0 as NODE_NAME, d.parent as PARENT_ID, c.time_stamp as ZEITSTEMPEL, c.value_num as WERT, b.DESCRIPTION_0 as Beschreibung, TO_CHAR(c.time_stamp, 'HH24:MI:SS') as Uhrzeit
from p_value_relations a, l_nodes d, p_values b, p_value_archive c
where a.node in (select sub_node from l_node_relations r where r.node in (
50028,
49989,
49848
))
and a.node = d.id
and (b."DESCRIPTION_0" like 'Name1' OR b."DESCRIPTION_0" like 'Name2')
and c.time_stamp between SYSDATE-30 AND SYSDATE-1
and a.value = b.id and b.id = c.value)
SELECT WERT as Value, NODE_NAME, ZEITSTEMPEL as Timestamp, Uhrzeit as Time, Beschreibung as Category
FROM speicher
I would like to create time intervals of 5 minutes and output one value per interval. For each interval timestamp it should always choose the closest value above that timestamp. If there is no value inside a given 5-minute interval, it should still output the last value it finds, since the value has not changed in that case. To see what I mean, please see the following picture. Any help would be greatly appreciated. This data is from an Oracle database.
Result until now
Result I would like
Since I do not understand your data, and can't test with it, I present something I could test with. My data has a table which tracks when folks login to a system.
This is not intended as a complete answer, but as something to potentially point you in the right direction;
with time_range
as
(
select rownum, sysdate - (1/288)*rownum time_stamp
from dual
connect By Rownum <= 288*30
)
select time_stamp, min(LOGIN_TIME)
from time_range
left outer join WEB_SECURITY_LOGGED_IN on LOGIN_TIME >= time_stamp
group by time_stamp
order by 1;
Good luck...
Edit:
The WITH part of the query builds a time_stamp column that has one row for every 5 minutes over the last 30 days. The main query then left joins to my login log table and, for each time_stamp, returns the earliest login at or after that time_stamp.
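The same logic can be sketched outside the database. This is only an illustration in plain Python, not a replacement for the query: the function name first_event_per_slot and the sample timestamps are made up for the example. It builds the 5-minute grid and, for each slot, looks up the earliest event at or after the slot, which is what the LEFT JOIN plus MIN does above.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def first_event_per_slot(events, start, end, step=timedelta(minutes=5)):
    """For each slot in [start, end], return (slot, earliest event >= slot or None)."""
    events = sorted(events)
    result = []
    slot = start
    while slot <= end:
        # bisect_left finds the first event that is not earlier than the slot.
        i = bisect_left(events, slot)
        result.append((slot, events[i] if i < len(events) else None))
        slot += step
    return result


# Hypothetical sample data: two logins within a 15-minute range.
logins = [datetime(2024, 1, 1, 0, 2), datetime(2024, 1, 1, 0, 11)]
slots = first_event_per_slot(
    logins, datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 0, 15)
)
for slot, login in slots:
    print(slot, login)
```

A slot with no later event gets None, just as the outer join returns NULL.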

Max Date & Min Date in sql Server

Anyone please help me
How can I use SQL to find yesterday's max time and today's min time? For each date I need that day's max time as Time_in and the next day's min time as Time_out.
Like this:
I have a table like this:
|--ID--|--Time-------|--Date------|
|--1---|08:03:00 PM|04/01/2014|
|--1---|08:07:00 AM|04/02/2014|
|--1---|08:11:00 PM|04/02/2014|
|--1---|08:02:00 AM|04/03/2014|
|--1---|08:10:00 PM|04/03/2014|
|--1---|08:05:00 AM|04/04/2014|
|--1---|08:10:00 PM|04/04/2014|
|--1---|08:06:00 AM|04/05/2014|
|--1---|08:15:00 PM|04/05/2014|
|--1---|08:01:00 AM|04/06/2014|
|--1---|08:08:00 PM|04/06/2014|
I need these results:
|--ID--|--Date------|--Time_in----|--Time_out--|
|--1---|04/01/2014|08:03:00 PM|08:07:00 AM|
|--1---|04/02/2014|08:11:00 PM|08:02:00 AM|
|--1---|04/03/2014|08:10:00 PM|08:05:00 AM|
|--1---|04/04/2014|08:10:00 PM|08:06:00 AM|
|--1---|04/05/2014|08:15:00 PM|08:01:00 AM|
|--1---|04/06/2014|08:08:00 PM|00:00:00 ----|
For example, the MAX time on 04/02/2014 is the Time_in for 04/02/2014, and the MIN time on 04/03/2014 is the Time_out for 04/02/2014.
Thanks..
Not sure how you are querying for this (stored proc, view, etc.), but here is some generic advice:
For the purpose of your query, combine the Date and Time columns into one column (of type datetime or smalldatetime). Now find the min and max of that. You should be set!
I may have misunderstood the task, but you can do a self join as in:
with t(t) as (
values '2014-04-01-08.03.00'
,'2014-04-02-08.07.00'
,'2014-04-02-08.11.00'
,'2014-04-03-08.02.00'
)
select max(t1.t), min(t2.t)
from t t1
join t t2
on date(t1.t) = date(t2.t) - 1 day
group by date(t1.t)
FWIW you will get more answers if you provide sample data as either insert statements or cte (like above).
You can use group by with min and max
select
[Date],
Min([Time]) As Time_In,
Max([Time]) As Time_Out
from Table1
group by
[Date]
Also note that Time and Date are reserved words in SQL server. You should consider renaming those columns.
For each date the latest time and the next day's earliest time:
WITH cte AS
( SELECT
d,
MIN(t) AS mint,
MAX(t) AS maxt
FROM tab
GROUP BY d
)
SELECT
t1.d,
t1.maxt,
COALESCE(t2.mint, '00:00:00')
FROM cte AS t1
LEFT JOIN cte AS t2
ON t1.d = t2.d-1
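A runnable sketch of this last approach, using Python's sqlite3 module instead of SQL Server so it is self-contained. The assumptions: dates are stored as ISO text and times as 24-hour text so that MIN and MAX sort correctly, and SQLite's date(d, '+1 day') replaces the d-1 arithmetic. Only the first few sample rows from the question are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, d TEXT, t TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (1, ?, ?)",
    [
        ("2014-04-01", "20:03:00"),
        ("2014-04-02", "08:07:00"),
        ("2014-04-02", "20:11:00"),
        ("2014-04-03", "08:02:00"),
        ("2014-04-03", "20:10:00"),
    ],
)

query = """
WITH cte AS (
    SELECT d, MIN(t) AS mint, MAX(t) AS maxt
    FROM tab
    GROUP BY d
)
SELECT t1.d,
       t1.maxt AS time_in,
       COALESCE(t2.mint, '00:00:00') AS time_out
FROM cte AS t1
LEFT JOIN cte AS t2
       ON t2.d = date(t1.d, '+1 day')   -- pair each date with the following day
ORDER BY t1.d
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The last date has no following day in the table, so its time_out falls back to '00:00:00'.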

Count unique results in one column which have a given value in another column

I'm currently working with a database of product tests. Each tested device has a unique identifier, a datetime stamp, and a batch of tests with a result for each. The final test is named 'Finished', with a pass or fail reflecting the batch overall.
I'm trying to write a query in SQL Server 2008 which will give me a count of the number of devices which have passed overall first time, grouped by date.
Limiting my results to November onwards, I can list how many times each device has been tested with:
SELECT UID, COUNT (UID) AS Attempts
FROM [dbase].[dbo].[tbl_results]
where TestName = 'Finished'
and Stamp > '2013-11-01'
GROUP BY UID
ORDER BY Attempts
However, this doesn't give me a count, a grouping by date, or any idea whether that batch was a pass or a fail. On my first attempt I included "where Pass = 'P'", but then realised that just excluded all the failures resulting in false information.
I THINK I need to find a way to make this query show only products where Attempts = 1 (with a subquery?), then join that output with the results table and counting Pass = 'P', but I can't figure out the syntax of the join command.
I'm not looking for code - I'd much rather someone could just give me some advice. I've only been using SQL for about two days... Please could someone give me a hand?
Thanks!
Tom
Sample Data:
UID Pass TestName Stamp
97292 P Finished 02/12/2013 07:43
97567 F Finished 03/12/2013 13:21
97567 P Finished 03/12/2013 13:25
97568 P Finished 03/12/2013 12:42
97569 P Finished 03/12/2013 12:28
97570 P Finished 03/12/2013 11:56
97571 F Finished 03/12/2013 11:40
97571 P Finished 03/12/2013 11:44
97572 F Finished 03/12/2013 11:23
This data is already ordered by UID - it shows that 97292 passed first time (single result, value P), 97567 passed second time (two results, F then P), and 97572 was only tested once but failed.
Ideal Output:
Date Passed First Time
02/12/2013 45
03/12/2013 37
04/12/2013 62
Your subquery for products with only one attempt could look like:
Select UID
from tbl_results
Where TestName='Finished'
Group by UID
Having Count(*)=1
The subquery can be joined with tbl_results by using an alias and joining on UID, filtered on TestName, Pass and Stamp.
For the grouping you will have to extract the date part from Stamp:
On SQL Server 2008 and later you can use CONVERT(date, Stamp)
on prior versions you will have to use e.g. CAST(CONVERT(VARCHAR(10), STAMP, 112) as DATETIME), since the DATE type was only introduced in SQL Server 2008.
So the complete SQL could look like:
Select CAST(CONVERT(VARCHAR(10), STAMP, 112) as Date) as [Date]
,COUNT(*) as [Passed First Time]
from tbl_results
join
(
Select UID
from tbl_results
Where TestName='Finished'
Group by UID
Having Count(*)=1
) x on x.UID=tbl_results.UID
Where TestName='Finished' and Pass='P' and Stamp > '2013-11-01'
Group by CAST(CONVERT(VARCHAR(10), STAMP, 112) as Date)
ORDER BY CAST(CONVERT(VARCHAR(10), STAMP, 112) as Date)
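To check the logic against the sample data from the question, here is a small sketch that runs the same join in SQLite through Python's sqlite3 module. Swapping SQL Server for SQLite is an assumption made only so the example runs anywhere: the stamps are rewritten as ISO text and date(Stamp) stands in for the CONVERT/CAST expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_results (UID INTEGER, Pass TEXT, TestName TEXT, Stamp TEXT)"
)
sample = [
    (97292, "P", "2013-12-02 07:43"),
    (97567, "F", "2013-12-03 13:21"),
    (97567, "P", "2013-12-03 13:25"),
    (97568, "P", "2013-12-03 12:42"),
    (97569, "P", "2013-12-03 12:28"),
    (97570, "P", "2013-12-03 11:56"),
    (97571, "F", "2013-12-03 11:40"),
    (97571, "P", "2013-12-03 11:44"),
    (97572, "F", "2013-12-03 11:23"),
]
conn.executemany("INSERT INTO tbl_results VALUES (?, ?, 'Finished', ?)", sample)

query = """
SELECT date(r.Stamp) AS test_date,
       COUNT(*)      AS passed_first_time
FROM tbl_results AS r
JOIN (
    -- devices with exactly one 'Finished' result, i.e. a single attempt
    SELECT UID
    FROM tbl_results
    WHERE TestName = 'Finished'
    GROUP BY UID
    HAVING COUNT(*) = 1
) AS x ON x.UID = r.UID
WHERE r.TestName = 'Finished' AND r.Pass = 'P'
GROUP BY date(r.Stamp)
ORDER BY test_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Devices 97567 and 97571 drop out because they needed two attempts, and 97572 drops out because its single attempt failed.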
Finally sorted it! Now works like a charm. Thanks to bummi and Kiril for your advice.
In case anyone needs this in future - once I got the hang of things I added a few extra features as well.
SELECT u.UID
,TestRig
,CONVERT (Date, [Stamp]) AS Date
,Description
FROM (
SELECT UID
,COUNT (UID) AS Attempts
FROM [dbase].[dbo].[tbl_results]
where TestRig IN ('RigOne', 'RigTwo')
and TestName = 'Finished'
and Stamp > '2013-12-01'
GROUP BY UID) AS u
JOIN (
SELECT UID
,Pass
,TestName
,Stamp
,TestRig
,Description
FROM [dbase].[dbo].[tbl_results]
) AS r
ON (u.UID = r.UID)
where u.Attempts = 1
and r.Pass = 'P'
and r.TestRig IN ('RigOne', 'RigTwo')
and r.TestName = 'Finished'
and r.Stamp > '2013-12-01'

issue with complex query involving returning 0 for no data in query result

I have a table defined in Oracle11g schema like this
Txn_summ_dec
=================
id
currentdate
resource_id
user_id
trans_id
eventdescptn
Each resource has different event descriptions.
I supply a date range (one month at most) and a resource_id, and I want the distinct count of users for that resource, grouped by currentdate and eventdescptn.
So I have the following query
SELECT COUNT(DISTINCT(txn_summ_dec.user_id)) as dusers, currentDate, eventdescptn
FROM Txn_summ_dec
WHERE resource_id = 1
AND currentdate BETWEEN TO_DATE('2011-12-01', 'YYYY-MM-DD')
AND TO_DATE('2011-12-31', 'YYYY-MM-DD')
GROUP BY currentdate, eventdescptn
and it correctly gives me the result below
dusers currentdate eventdescptn
182 12/01/2011 00:00:00 Save
33 12/04/2011 00:00:00 Save
98 12/01/2011 00:00:00 Read
22 12/30/2011 00:00:00 Write
I want the result in the following format:
Given a date range, for example from the 5th of one month to the 5th of the next (or less), I want rows for all dates in the range and for all eventdescptn values of the resource. If there is no data for a particular date and event description, that row should still appear in the result set with a dusers value of 0.
So if a resource has 3 different eventdescptns (Save, Read, Write) and the date range is the 5th to the 30th of a month, there should be a total of 26 × 3 = 78 records in the result set.
How do I write a query for that?
Also I will need to convert it to hibernate later.. but Sql to start with is fine
Thanks in advance
Check the accepted answer here:
generate days from date range
If I understand you correctly you don't necessarily have an event for every date in the range in your log. That answer gives you a way to materialize a list of dates in the range. If you can modify it to include 1 of each of those dates per event, you would just have to join back to the results you have already aggregated here and set the null dUsers to zero.
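A rough sketch of that approach, run in SQLite through Python's sqlite3 module so it is self-contained. That is an assumption: on Oracle 11g you would generate the dates differently, for example with CONNECT BY. The sample rows are invented for illustration. The query materializes every date in the range, cross joins the dates with the resource's event descriptions, and left joins the data so that missing combinations count as 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE txn_summ_dec ("
    "currentdate TEXT, resource_id INTEGER, user_id INTEGER, eventdescptn TEXT)"
)
# Hypothetical sample rows for resource 1.
conn.executemany(
    "INSERT INTO txn_summ_dec VALUES (?, 1, ?, ?)",
    [
        ("2011-12-05", 10, "Save"),
        ("2011-12-05", 11, "Save"),
        ("2011-12-05", 10, "Save"),  # same user again: counted once
        ("2011-12-06", 10, "Read"),
        ("2011-12-30", 12, "Write"),
    ],
)

query = """
WITH RECURSIVE days(d) AS (
    SELECT '2011-12-05'
    UNION ALL
    SELECT date(d, '+1 day') FROM days WHERE d < '2011-12-30'
),
events AS (
    SELECT DISTINCT eventdescptn FROM txn_summ_dec WHERE resource_id = 1
)
SELECT days.d, events.eventdescptn, COUNT(DISTINCT t.user_id) AS dusers
FROM days
CROSS JOIN events
LEFT JOIN txn_summ_dec AS t
       ON t.currentdate = days.d
      AND t.eventdescptn = events.eventdescptn
      AND t.resource_id = 1
GROUP BY days.d, events.eventdescptn
ORDER BY days.d, events.eventdescptn
"""
rows = conn.execute(query).fetchall()
print(len(rows))  # 26 days x 3 event descriptions
```

COUNT(DISTINCT ...) ignores the NULLs produced by the outer join, so date and event combinations with no data come back as 0 without needing COALESCE.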
I haven't tried this, but I wonder if you could you do:
WITH the_query AS (
... your query here ...
)
SELECT dusers, currentdate, eventdescptn
FROM the_query
WHERE 0 != ( SELECT COUNT(*) FROM the_query )
UNION
SELECT 0, NULL, NULL
FROM the_query
WHERE 0 = ( SELECT COUNT(*) FROM the_query )

Window moving average in sql server

I am trying to create a function that computes a windowed moving average in SQLServer 2008. I am quite new to SQL so I am having a fair bit of difficulty. The data that I am trying to perform the moving average on needs to be grouped by day (it is all timestamped data) and then a variable moving average window needs to be applied to it.
I already have a function that groups the data by day (and @id) which is shown at the bottom. I have a few questions:
Would it be better to call the grouping function inside the moving average function or should I do it all at once?
Is it possible to get the moving average for the dates input into the function, but go back n days to begin the moving average so that the first n days of the returned data will not have 0 for their average? (ie. if they want a 7 day moving average from 01-08-2011 to 02-08-2011 that I start the moving average calculation on 01-01-2011 so that the first day they defined has a value?)
I am in the process of looking into how to do the moving average, and know that a moving window seems to be the best option (currentSum = prevSum + todayCount - nthDayAgoCount, then average = currentSum / nDays), but I am still working on figuring out the SQL implementation of this.
I have a grouping function that looks like this (some variables removed for visibility purposes):
SELECT
'ALL' as GeogType,
CAST(v.AdmissionOn as date) as dtAdmission,
CASE WHEN @id IS NULL THEN 99 ELSE v.ID END,
COUNT(*) as nVisits
FROM dbo.Table1 v INNER JOIN dbo.Table2 t ON v.FSLDU = t.FSLDU5
WHERE v.AdmissionOn >= '01-01-2010' AND v.AdmissionOn < DATEADD(day,1,'02-01-2010')
AND v.ID = Coalesce(@id,v.ID)
GROUP BY
CAST(v.AdmissionOn as date),
CASE WHEN @id IS NULL THEN 99 ELSE v.ID END
ORDER BY 2,3,4
Which returns a table like so:
ALL 2010-01-01 1 103
ALL 2010-01-02 1 114
ALL 2010-01-03 1 86
ALL 2010-01-04 1 88
ALL 2010-01-05 1 84
ALL 2010-01-06 1 87
ALL 2010-01-07 1 82
EDIT: To answer the first question I asked:
I ended up creating a function which declared a temporary table and inserted the results from the count function into it, then used the example from user662852 to compute the moving average.
Take the hardcoded date range out of your query. Write the output (like your sample at the end) to a temp table (I called it #visits below).
Try this self join to the temp table:
Select list.dtadmission
, AVG(data.nvisits) as Avg
, SUM(data.nvisits) as sum
, COUNT(data.nvisits) as RollingDayCount
, MIN(data.dtadmission) as Verifymindate
, MAX(data.dtadmission) as Verifymaxdate
from #visits as list
inner join #visits as data
on list.dtadmission between data.dtadmission and DATEADD(DD,6,data.dtadmission) group by list.dtadmission
EDIT: I didn't have enough room in Comments to say this in response to your question:
My join is "kinda cartesian" because it uses a BETWEEN in the join constraint. Each record in list is matched against every record in data, and I keep only the ones where the data date falls between the list date minus 6 days and the list date itself (a 7-day window). Every data date is available to every list date; this is the key to your question. I could have written the join condition as
data.dtadmission between DATEADD(DD,-6,list.dtadmission) and list.dtadmission
But what really happened was I tested it as
list.dtadmission between DATEADD(DD,6,data.dtadmission) and data.dtadmission
Which returns no records because the syntax is "Between LOW and HIGH". I facepalmed on 0 records and swapped the arguments, that's all.
Try the following, see what I mean: This is the cartesian join for just one listdate:
SELECT
list.[dtAdmission] as listdate
,data.[dtAdmission] as datadate
,data.nVisits as datadata
,DATEADD(dd,6,list.dtadmission) as listplus6
,DATEADD(dd,6,data.dtAdmission ) as dataplus6
from [sandbox].[dbo].[admAvg] as list inner join [sandbox].[dbo].[admAvg] as data
on
1=1
where list.dtAdmission = '5-Jan-2011'
Compare this to the actual join condition
SELECT
list.[dtAdmission] as listdate
,data.[dtAdmission] as datadate
,data.nVisits as datadata
,DATEADD(dd,6,list.dtadmission) as listplus6
,DATEADD(dd,6,data.dtAdmission ) as dataplus6
from [sandbox].[dbo].[admAvg] as list inner join [sandbox].[dbo].[admAvg] as data
on
list.dtadmission between data.dtadmission and DATEADD(DD,6,data.dtadmission)
where list.dtAdmission = '5-Jan-2011'
See how list date is between datadate and dataplus6 in all the records?
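For comparison, the running-sum update mentioned in the question (add today's count, drop the count that fell out of the window) can be sketched in plain Python. This only illustrates the arithmetic and is not the SQL implementation: the function name moving_average is made up, and it assumes one row per consecutive day with no gaps. Like the self join, it averages over however many days are available at the start of the series.

```python
from collections import deque


def moving_average(counts, n_days):
    """Trailing moving average over at most n_days values, via a running sum."""
    window = deque()
    running_sum = 0
    averages = []
    for count in counts:
        window.append(count)
        running_sum += count
        if len(window) > n_days:
            # Drop the value that just left the window.
            running_sum -= window.popleft()
        averages.append(running_sum / len(window))
    return averages


# nVisits for 2010-01-01 .. 2010-01-07 from the sample output in the question.
visits = [103, 114, 86, 88, 84, 87, 82]
averages = moving_average(visits, 7)
print(averages[-1])  # average over all seven days
```

To avoid the partial averages on the first days of the requested range, feed the function n_days - 1 extra days before the start date and discard those leading results, which is what question 2 above is asking about.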