How can I select the difference between two rows? - sql

Here's an example of what I'm looking for:
I have data that comes in as a lifetime total in gallons. I want to be able to display the data as a running total over the time period I am selecting for rather than as a lifetime total. For example:
timestamp  lifetimeTotal  runningTotal
1:30       3000           0
1:31       3001           1
1:32       3005           5
1:33       3010           10
I'm not sure how to go about doing this. I was looking at examples like this one using over but it's not quite what I'm looking for: I don't want to add the rows together every time, rather I want to add the difference between the two rows. Right now I am simply selecting the lifetime totals and displaying that.
Any ideas? I will add code if necessary but there's not much to show besides my select statement; I am having trouble thinking of a way to do this conceptually.

This should give the difference between the current lifetimeTotal and the minimum lifetimeTotal in the selected period:
SELECT timestamp,
lifetimeTotal,
lifetimeTotal - MIN(lifetimeTotal) OVER () as runningTotal
FROM Table

This can be easily done using window functions:
SELECT [timestamp], lifetimeTotal,
COALESCE(SUM(diff) OVER (ORDER BY [timestamp]), 0) AS runningTotal
FROM (
SELECT [timestamp],
lifetimeTotal,
lifetimeTotal - LAG(lifetimeTotal) OVER (ORDER BY [timestamp]) AS diff
FROM mytable ) t
The above query uses LAG to calculate the difference between current and previous row. SUM OVER is then used in an outer query to calculate the running total of the difference.
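As a rough, runnable sketch of both answers above, the following uses Python's built-in sqlite3 module instead of SQL Server. The table name mytable, the column name ts, and the sample rows are assumptions taken from the question's example, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical in-memory table mirroring the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ts TEXT, lifetimeTotal INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("01:30", 3000), ("01:31", 3001), ("01:32", 3005), ("01:33", 3010)],
)

# Approach 1: LAG gives the per-row difference, SUM OVER accumulates it.
lag_query = """
SELECT ts, lifetimeTotal,
       COALESCE(SUM(diff) OVER (ORDER BY ts), 0) AS runningTotal
FROM (
    SELECT ts, lifetimeTotal,
           lifetimeTotal - LAG(lifetimeTotal) OVER (ORDER BY ts) AS diff
    FROM mytable
) t
ORDER BY ts
"""
running_totals = [row[2] for row in conn.execute(lag_query)]

# Approach 2: subtract the smallest lifetime total in the selected rows.
min_query = """
SELECT lifetimeTotal - MIN(lifetimeTotal) OVER () AS runningTotal
FROM mytable
ORDER BY ts
"""
offsets = [row[0] for row in conn.execute(min_query)]

print(running_totals)
print(offsets)
```

Both queries agree on this data because the lifetime total never decreases. If the counter could reset, the two approaches would diverge.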

Related

Average time between two consecutive events in SQL

I have a table as shown below.
time                 Event
2021-03-19T17:15:05  A
2021-03-19T17:15:11  B
2021-03-19T17:15:11  C
2021-03-19T17:15:12  A
2021-03-19T17:15:14  C
I want to find the average time between event A and the event following it.
How do I find it using an SQL query?
Here the desired output is 4 seconds.
I really appreciate any help you can provide.
The basic idea is to use lead() to get the time from the next row. Then you need to calculate the difference. So for all rows:
select t.*,
(to_unix_timestamp(lead(time) over (order by time)) -
to_unix_timestamp(time)
) as diff_seconds
from t;
Then use a subquery, filter for just the A rows, and take the average:
select avg(diff_seconds)
from (select t.*,
(to_unix_timestamp(lead(time) over (order by time)) -
to_unix_timestamp(time)
) as diff_seconds
from t
) t
where event = 'A';
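The lead() approach above can be tried out with a small sketch in Python's built-in sqlite3 module. This is an adaptation, not the answer's exact query: SQLite has no to_unix_timestamp, so strftime('%s', ...) is swapped in to get epoch seconds, and the table name t and column name event_time are assumptions. Window functions require SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical table holding the question's five sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (event_time TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2021-03-19T17:15:05", "A"),
        ("2021-03-19T17:15:11", "B"),
        ("2021-03-19T17:15:11", "C"),
        ("2021-03-19T17:15:12", "A"),
        ("2021-03-19T17:15:14", "C"),
    ],
)

# LEAD fetches the next row's time; strftime('%s', ...) converts to epoch seconds.
query = """
SELECT AVG(diff_seconds)
FROM (
    SELECT event,
           strftime('%s', LEAD(event_time) OVER (ORDER BY event_time))
             - strftime('%s', event_time) AS diff_seconds
    FROM t
) x
WHERE event = 'A'
"""
avg_seconds = conn.execute(query).fetchone()[0]
print(avg_seconds)
```

The two A rows are followed by gaps of 6 and 2 seconds, which average to the 4 seconds the question asks for. The filter on event has to sit in the outer query, because filtering first would make lead() look only at other A rows.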

sql count all items that day until start of database isn't working because of time

I am trying to count each item in a database table, that is deployments. I have counted the total number of items 3879, by doing this:
use Bamboo_Version6
go
SELECT Count(STARTED_DATE)
FROM [Bamboo_Version6].[dbo].[DEPLOYMENT_RESULT]
But I have been struggling to get the number of items each day until the start. I have tried using some of the other similar answers to this like:
select STARTED_Date, count(deploymentID)
from [Bamboo_Version6].[dbo].[DEPLOYMENT_RESULT]
WHERE STARTED_Date>=dateadd(day,datediff(day,0,STARTED_Date)- 7,0)
GROUP BY STARTED_Date
But this returns every id with a 1 beside it, because the dates include times, which makes each value unique. I tried CONVERT(varchar(12),STARTED_DATE,110) to fix the problem, but it still happens. How can I count per day without getting a separate row with a count of 1 for every id?
Remove the time component:
select cast(STARTED_Date as date) as dte, count(deploymentID)
from [Bamboo_Version6].[dbo].[DEPLOYMENT_RESULT]
group by cast(STARTED_Date as date)
order by dte;
I'm not sure what the WHERE clause is supposed to be doing, so I just removed it. If it is useful, add it back in.
Another option is to use an OVER clause. Note that this returns one row per deployment, with a running count within each day, rather than one row per day:
SELECT cast(STARTED_DATE AS DATE) AS Deployment_date,
COUNT(deploymentID) OVER ( PARTITION BY cast(STARTED_DATE AS DATE) ORDER BY STARTED_DATE) AS NumberofDeployments
FROM [Bamboo_Version6].[dbo].[DEPLOYMENT_RESULT]
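To see how the two answers above differ, here is a small sketch using Python's built-in sqlite3 module. It is an adaptation rather than the SQL Server code: date(...) stands in for cast(... as date), and the table name deployment_result and the three sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample: two deployments on one day, one on the next.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deployment_result (deploymentID INTEGER, started_date TEXT)"
)
conn.executemany(
    "INSERT INTO deployment_result VALUES (?, ?)",
    [
        (1, "2021-03-01 09:15:00"),
        (2, "2021-03-01 16:40:00"),
        (3, "2021-03-02 11:05:00"),
    ],
)

# GROUP BY on the date part: one row per day.
per_day = conn.execute("""
    SELECT date(started_date) AS dte, COUNT(deploymentID)
    FROM deployment_result
    GROUP BY date(started_date)
    ORDER BY dte
""").fetchall()

# Window COUNT with ORDER BY: one row per deployment, counting up within the day.
windowed = conn.execute("""
    SELECT date(started_date) AS dte,
           COUNT(deploymentID) OVER (
               PARTITION BY date(started_date) ORDER BY started_date
           ) AS n
    FROM deployment_result
    ORDER BY started_date
""").fetchall()

print(per_day)
print(windowed)
```

The grouped query collapses the rows to one per day, while the windowed query keeps every row, so it only answers the question if a running count is what you want.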

postgres select aggregate timespans

I have a table with the following structure:
timestamp-start, timestamp-stop
1,5
6,10
25,30
31,35
...
I am only interested in continuous timespans, i.e. those where the break between a timestamp-stop and the following timestamp-start is less than 3.
How could I get the aggregated covered timespans as a result:
timestamp-start,timestamp-stop
1,10
25,35
The reason I am considering this is that a user may request a timespan that would need to return several thousand rows. However, most records are continuous, and using the above method could potentially reduce many thousands of rows to just a dozen. Or is the added computation not worth the savings in bandwidth and latency?
You can group the time stamps in three steps:
Add a flag to determine where a new period starts (that is, a gap greater than 3).
Cumulatively sum the flag to assign groupings.
Re-aggregate with the new groupings.
The code looks like:
select min(ts_start) as ts_start, max(ts_end) as ts_end
from (select t.*,
sum(flag) over (order by ts_start) as grouping
from (select t.*,
(coalesce(ts_start - lag(ts_end) over (order by ts_start),0) > 3)::int as flag
from t
) t
) t
group by grouping;
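The three steps above can be sketched with Python's built-in sqlite3 module. This is an adaptation under stated assumptions: the Postgres ::int cast is dropped because SQLite comparisons already evaluate to 0 or 1, the alias grp replaces grouping, and the table t holds the four sample rows from the question. Window functions require SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical table with the question's sample timespans.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ts_start INTEGER, ts_end INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 5), (6, 10), (25, 30), (31, 35)],
)

query = """
SELECT MIN(ts_start) AS ts_start, MAX(ts_end) AS ts_end
FROM (
    -- Step 2: a cumulative sum of the flag numbers each period.
    SELECT ts_start, ts_end,
           SUM(flag) OVER (ORDER BY ts_start) AS grp
    FROM (
        -- Step 1: flag rows whose gap to the previous end exceeds 3.
        SELECT ts_start, ts_end,
               (COALESCE(ts_start - LAG(ts_end) OVER (ORDER BY ts_start), 0) > 3)
                   AS flag
        FROM t
    ) flagged
) grouped
-- Step 3: collapse each period to its first start and last end.
GROUP BY grp
ORDER BY ts_start
"""
spans = conn.execute(query).fetchall()
print(spans)
```

Only the third row has a gap larger than 3 (25 minus 10), so it opens the second period and the four rows collapse into two spans.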

SQL to return From/To dates with gaps

I have a temp table that has data like this:
I need to come up with t-SQL that will show the dates in/out for the lot like this:
Since the lot went empty on 6/12/15, I need to show 2 separate rows to allow for the gap in the date range when the lot had no qty. I've tried using MIN and MAX but I can't seem to figure out how to allow for the time gap. Any help would be greatly appreciated. I'm using SQL Server 2012.
Thanks.
You want to split the rows into groups wherever the balance has dropped to zero. So, you can define the groups by doing a cumulative count of the rows with a running balance of 0. The grouping is more accurate if you do this in reverse date order, because each zero-balance row then falls in the same group as the rows that led up to it.
This provides a grouping, which you can use for aggregation:
select lot, min(trandate), max(trandate)
from (select t.*,
sum(case when runbal = 0 then 1 else 0 end) over
(partition by lot order by trandate desc) as grp
from t
) t
group by lot, grp
order by min(trandate);
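Since the question's sample data did not survive, here is a sketch of the reverse cumulative count on made-up rows, using Python's built-in sqlite3 module rather than SQL Server. The lot name, dates, and balances are all hypothetical. The column names lot, trandate, and runbal follow the answer, and lot is included in the GROUP BY.

```python
import sqlite3

# Hypothetical data: lot A empties on 2015-06-12 and is refilled on 2015-06-15.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (lot TEXT, trandate TEXT, runbal INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("A", "2015-06-10", 5),
        ("A", "2015-06-11", 3),
        ("A", "2015-06-12", 0),
        ("A", "2015-06-15", 4),
        ("A", "2015-06-16", 2),
    ],
)

# Counting zero balances from the latest date backwards puts the zero row
# in the same group as the rows that preceded it.
query = """
SELECT lot, MIN(trandate), MAX(trandate)
FROM (
    SELECT lot, trandate,
           SUM(CASE WHEN runbal = 0 THEN 1 ELSE 0 END) OVER (
               PARTITION BY lot ORDER BY trandate DESC
           ) AS grp
    FROM t
) x
GROUP BY lot, grp
ORDER BY MIN(trandate)
"""
ranges = conn.execute(query).fetchall()
print(ranges)
```

With the count taken in ascending order instead, the zero-balance row would be grouped with the rows after it, and the first range would end a day too early.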

SQLite query to get the closest datetime

I am trying to write an SQLite statement to get the datetime closest to a user input (from a WPF datepicker). I have a table IRquote(rateId, quoteDateAndTime, quoteValue).
For example, if the user enters 10/01/2000 and the database only has fixings stored for 08/01/2000, 07/01/2000 and 14/01/2000, it would return 08/01/2000, that being the closest date to 10/01/2000.
Of course, I'd like it to work not only with dates but also with time.
I tried with this query, but it returns the row with the furthest date, and not the closest one:
SELECT quoteValue FROM IRquote
WHERE rateId = '" + pRefIndexTicker + "'
ORDER BY abs(datetime(quoteDateAndTime) - datetime('" + DateTimeSQLite(pFixingDate) + "')) ASC
LIMIT 1;
Note that I have a function DateTimeSQLite to transform user input to the right format.
I don't get why this does not work.
How could I do it? Thanks for your help
To get the closest date, you will need to use the strftime('%s', datetime) SQLite function.
With this example, you will get the date closest to your given date.
Note that the date 2015-06-25 10:00:00 is the input datetime that the user selected.
select t.ID, t.Price, t.PriceDate,
abs(strftime('%s','2015-06-25 10:00:00') - strftime('%s', t.PriceDate)) as 'ClosestDate'
from Test t
order by abs(strftime('%s','2015-06-25 10:00:00') - strftime('%s', PriceDate))
limit 1;
SQL explanation:
We use strftime('%s') - strftime('%s') to calculate the difference, in seconds, between the two dates (note: it has to be '%s', not '%S'). Since this can be either positive or negative, we also need the abs function to make it positive, so that the order by and the subsequent limit 1 work correctly.
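A small sketch with Python's built-in sqlite3 module, applied to the question's IRquote table, shows both why the original query misbehaves and how the strftime('%s') version fixes it. The rateId value EUR3M and the quote values are made up for illustration, and bound parameters are used in place of string concatenation.

```python
import sqlite3

# Hypothetical rows matching the dates in the question's example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IRquote (rateId TEXT, quoteDateAndTime TEXT, quoteValue REAL)"
)
conn.executemany(
    "INSERT INTO IRquote VALUES (?, ?, ?)",
    [
        ("EUR3M", "2000-01-07 00:00:00", 3.30),
        ("EUR3M", "2000-01-08 00:00:00", 3.35),
        ("EUR3M", "2000-01-14 00:00:00", 3.40),
    ],
)
target = "2000-01-10 00:00:00"

# datetime() returns text, and subtracting text keeps only the leading
# number of each string (the year), so every difference here is 0.
naive_diff = conn.execute(
    "SELECT datetime('2000-01-10') - datetime('2000-01-08')"
).fetchone()[0]

# strftime('%s', ...) yields epoch seconds, which subtract meaningfully.
closest = conn.execute(
    """
    SELECT quoteDateAndTime
    FROM IRquote
    WHERE rateId = ?
    ORDER BY abs(strftime('%s', ?) - strftime('%s', quoteDateAndTime))
    LIMIT 1
    """,
    ("EUR3M", target),
).fetchone()[0]

print(naive_diff, closest)
```

Because the naive difference ties at 0 for all rows in the same year, the row that comes back first is effectively arbitrary, which would explain the original query returning an unexpected date.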
If the table is big, and there is an index on the datetime column, this will use the index to get the 2 closest rows (above and below the supplied value) and will be more efficient:
select *
from
( select *
from
( select t.ID, t.Price, t.PriceDate
from Test t
where t.PriceDate <= datetime('2015-06-23 10:00:00')
order by t.PriceDate desc
limit 1
) d
union all
select * from
( select t.ID, t.Price, t.PriceDate
from Test t
where t.PriceDate > datetime('2015-06-23 10:00:00')
order by t.PriceDate asc
limit 1
) a
) x
order by abs(julianday('2015-06-23 10:00:00') - julianday(PriceDate))
limit 1 ;
Tested in SQLfiddle.
Another useful solution is to use the BETWEEN operator, if you can determine upper and lower bounds for your time/date query. I came across this solution only recently. This is what I've used in my application on a time column named t (adapting the code to a date column and the date function is not difficult):
select *
from myTable
where t BETWEEN '09:35:00' and '09:45:00'
order by ABS(strftime('%s',t) - strftime('%s','09:40:00')) asc
limit 1
Also, I must correct my comment on the post above. I ran a simple speed comparison of the 3 approaches proposed by #BerndLinde, #ypercubeᵀᴹ and me. I have around 500 tables with 150 rows in each, and medium hardware in my PC. The results are:
Solution 1 (using strftime) takes around 12 seconds.
Adding an index on column t to Solution 1 improves speed by around 30%, to around 8 seconds. I didn't see any improvement from using an index on time(t).
Solution 2 also has around 30% of speed improvement over Solution 1 and takes around 8 seconds
Finally, Solution 3 has around 50% improvement and takes around 5.5 seconds. Adding index of column t gives a little more improvement and takes around 4.8 seconds. Index of time(t) has no effect in this solution.
Note: I'm a simple programmer and this is a simple test in .NET code. A real performance test must consider more professional aspects, which I'm not aware of. There were also some computations in my code after querying and reading from the database. Also, as #ypercubeᵀᴹ states, this result may not hold for large amounts of data.