Find the difference between rows within a column - SQL

I want to create a new table where the difference in weight is displayed as
weight_diff. For example, on the first day the difference is 0; on the second day, for the same id, the difference should appear as +.. for a gain and -.. for a loss.

You seem to want lag():
select t.*,
       (weight -
        lag(weight, 1, weight) over (partition by id order by date)
       ) as weight_diff
from t;
Your image is really hard to read so I just used the names given in the description.
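The lag() query above can be sketched end to end with SQLite (3.25 or later for window functions), driven from Python's built-in sqlite3 module. The table layout follows the description; the ids, dates, and weights are invented sample data:

```python
import sqlite3

# Assumes SQLite 3.25+ (window function support); sample data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date TEXT, weight REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "2023-01-01", 70.0), (1, "2023-01-02", 71.5), (1, "2023-01-03", 70.5)],
)

# lag(weight, 1, weight) falls back to the current row's weight when there is
# no previous row, so the first day's difference comes out as 0 rather than NULL.
rows = conn.execute("""
    SELECT t.*,
           (weight - lag(weight, 1, weight) OVER (PARTITION BY id ORDER BY date))
               AS weight_diff
    FROM t
""").fetchall()
for row in rows:
    print(row)
```

The third lag() argument is the default value; without it the first row per id would get a NULL difference instead of 0.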

Related

How to compare the value of one row with the upper row in one column of an ordered table?

I have a table in PostgreSQL that contains GPS points from cell phones. It has an integer column that stores epoch time (the number of seconds since 1970). I want to order the table by time (the epoch column), then break the trips into sub-trips wherever there is no GPS record for more than 2 minutes.
I did this with GeoPandas, but it is too slow, so I want to do it inside PostgreSQL. How can I compare each row of the ordered table with the previous row (to see whether the epoch values differ by 2 minutes or more)?
In short, I do not know how to compare each row with the row above it.
You can use lag():
select t.*
from (select t.*,
             lag(timestamp_epoch) over (partition by trip order by timestamp_epoch) as last_timestamp_epoch
      from t
     ) t
where last_timestamp_epoch < timestamp_epoch - 120
I want to order the table by time (the epoch column), then break the trips into sub-trips when there is no GPS record for more than 2 minutes.
After comparing to the previous (or next) row, with the window function lag() (or lead()), form groups based on the gaps to get sub trip numbers:
SELECT *, count(*) FILTER (WHERE step) OVER (PARTITION BY trip ORDER BY timestamp_epoch) AS sub_trip
FROM  (
   SELECT *
        , (timestamp_epoch - lag(timestamp_epoch) OVER (PARTITION BY trip ORDER BY timestamp_epoch)) > 120 AS step
   FROM   tbl
   ) sub;
Further reading:
Select longest continuous sequence
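The gaps-and-islands step above can be sketched in SQLite (3.25+) from Python's stdlib sqlite3 module. Because FILTER on window aggregates is PostgreSQL syntax that older SQLite builds lack, the sketch uses the equivalent running sum over a CASE expression; the trip and epoch values are invented:

```python
import sqlite3

# Assumes SQLite 3.25+; SUM(CASE ...) OVER (...) stands in for Postgres's
# count(*) FILTER (WHERE step) OVER (...). Sample rows are invented, with
# gaps of more than 120 seconds at 60->300 and 330->600.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (trip INTEGER, timestamp_epoch INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, 0), (1, 60), (1, 300), (1, 330), (1, 600)],
)

rows = conn.execute("""
    SELECT trip, timestamp_epoch,
           SUM(CASE WHEN step THEN 1 ELSE 0 END)
               OVER (PARTITION BY trip ORDER BY timestamp_epoch) AS sub_trip
    FROM (
        SELECT *,
               (timestamp_epoch
                - lag(timestamp_epoch) OVER (PARTITION BY trip
                                             ORDER BY timestamp_epoch)) > 120
                   AS step
        FROM tbl
    ) sub
""").fetchall()
for row in rows:
    print(row)
```

The first row of each trip has a NULL lag, so its step is NULL and the CASE falls through to 0; each gap over 120 seconds then increments the running sub-trip number.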

Group by and calculation from value on the next row

I'm quite new to SQL Server. I can't seem to figure this out. I have a table that looks like this.
I need to calculate the percentage change in the number for each name, for each year, in the column p. So the end result should look like this.
You can easily calculate the % difference using lag():
select name, date, number,
       cast(((number * 1.0) - lag(number, 1) over (partition by name order by date))
            / lag(number, 1) over (partition by name order by date) * 100 as int) as p
from t;
Note that lag() returns null on the first row for each name, so p is null there.
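The percentage-change calculation can be sketched in SQLite (3.25+) via Python's stdlib sqlite3 module; the table name, names, dates, and numbers are invented sample data:

```python
import sqlite3

# Assumes SQLite 3.25+; the sales table and its contents are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (name TEXT, date TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("a", "2019", 100), ("a", "2020", 150), ("a", "2021", 120)],
)

# number * 1.0 forces floating-point division; CAST(... AS INTEGER) truncates
# the percentage, matching the cast(... as int) in the answer.
rows = conn.execute("""
    SELECT name, date, number,
           CAST((number * 1.0 - lag(number) OVER (PARTITION BY name ORDER BY date))
                / lag(number) OVER (PARTITION BY name ORDER BY date) * 100
                AS INTEGER) AS p
    FROM sales
""").fetchall()
for row in rows:
    print(row)
```

The first row per name has no previous value, so lag() yields NULL and p is NULL; the remaining rows show +50% then -20%.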

Z-Score in SQL based on last 1 year

I have daily data structured in the below format. Please note this is just a subset of the data and I had to make some modifications to be able to share it.
The first column is the [DataValue] for which I need to find the Z-score by IndexValue, [Qualifier], [QualifierCode] and [QualifierType]. I also have the [Date] column in there.
I essentially need to find the Z-score value for each data point by IndexValue, [Qualifier], [QualifierCode] and [QualifierType]. The main point of focus here is that I have data for the last 3 years but in order to calculate Z-score, I only want to take the average and standard deviation for the last one year.
Z-Score = ([DataValue] - (Avg in last 1 year)) / (Std Dev in last 1 year)
I am struggling with how to get average for the last one year. Would anybody be able to help me with this?
SELECT [IndexValue]
,[Qualifier]
,[QualifierCode]
,[QualifierType],[Date]
,[Month]
,[Year]
,[Z-Score] = ([DataValue] - ROUND(AVG([DataValue]),3))/ ROUND(STDEV([DataValue]),3)
FROM [TABLEA]
GROUP BY [IndexValue]
,[Qualifier]
,[QualifierCode]
,[QualifierType]
,[Date]
,[Month]
,[Year]
order by [IndexValue]
,[Qualifier]
,[QualifierCode]
,[QualifierType]
,[Date] desc
You need window functions for this:
SELECT a.*,
( (DataValue - AVG(DataValue) OVER ()) /
STDEV(DataValue) OVER ()
) as z_score
FROM [TABLEA] a;
Note: if data_value is an integer, you will need to convert it to a number with digits:
SELECT a.*,
( (DataValue - AVG(DataValue * 1.0) OVER ()) /
STDEV(DataValue) OVER ()
) as z_score
FROM [TABLEA] a;
Rounding for the calculation seems to be way off base, unless your intention is to produce a z-like score that isn't really a z-score.
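The whole-table z-score the window-function query computes can be checked in plain Python. SQL Server's STDEV is the sample standard deviation, which statistics.stdev matches; the data values below are invented:

```python
from statistics import mean, stdev

# Invented sample values; mean is 3.0 and the sample standard deviation
# (matching SQL Server's STDEV) is 2.0, so the z-scores come out evenly.
data_values = [1.0, 3.0, 5.0]
avg = mean(data_values)
sd = stdev(data_values)

# Same shape as (DataValue - AVG(...) OVER ()) / STDEV(...) OVER ():
z_scores = [(v - avg) / sd for v in data_values]
print(z_scores)  # [-1.0, 0.0, 1.0]
```

Restricting the average and standard deviation to the last year would mean adding a date filter (or a framed window) before aggregating, which the query in the answer does not yet do.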

selecting percentage of time on/off for groupings by id

I'm looking to summarize some information into a kind of report, and the crux of it is similar to the following problem. I'm looking for the approach in any sql-like language.
consider a schema containing the following:
id - int, on - bool, time - datetime
This table is basically a log that specifies when a thing of id changes state between 'on' and 'off'.
What I want is a table with the percentage of time 'on' for each id seen. So a result might look like this
id, percent 'on'
1, 50
2, 45
3, 67
I would expect the overall time to be
now - (time first seen in the log)
Programmatically, I understand how to do this. For each id, I just want to add up all of the segments of time for which the item was 'on' and express this as a percentage of the total time. I'm not quite seeing how to do this in SQL, however.
You can use lead() and some date/time arithmetic (which varies by database).
In pseudo-code this looks like:
select id,
       sum(case when status = 'on'
                then coalesce(next_datetime, current_datetime) - datetime
           end) / (current_datetime - min(datetime))
from (select t.*,
             lead(datetime) over (partition by id order by datetime) as next_datetime
      from t
     ) t
group by id;
Date/time functions vary by database, so this is just to give an idea of what to do.
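The lead() approach can be made concrete in SQLite (3.25+) via Python's stdlib sqlite3 module. Storing times as Unix epoch seconds makes the date arithmetic plain subtraction, and "now" is pinned to 1000 seconds so the example is reproducible; all values are invented:

```python
import sqlite3

# Assumes SQLite 3.25+. Timestamps are epoch seconds and "now" is fixed at
# 1000; the id/status/time rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, status TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "on", 0), (1, "off", 500), (1, "on", 800)],
)

# Each row's state lasts until the next row's timestamp (or until "now" for
# the last row); only 'on' segments count toward the numerator.
rows = conn.execute("""
    SELECT id,
           100.0 * SUM(CASE WHEN status = 'on'
                            THEN COALESCE(next_ts, 1000) - ts
                            ELSE 0 END)
                 / (1000 - MIN(ts)) AS pct_on
    FROM (SELECT t.*,
                 lead(ts) OVER (PARTITION BY id ORDER BY ts) AS next_ts
          FROM t) sub
    GROUP BY id
""").fetchall()
print(rows)
```

Here id 1 is on from 0-500 and from 800-1000, i.e. 700 of 1000 seconds, so pct_on is 70.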

Every 10th row based on timestamp

I have a table with signal name, value and timestamp. These signals were recorded at a sampling rate of 1 sample/sec. Now I want to plot a graph of the values over months, and it is becoming very heavy for the system to render within seconds. So my question is: is there any way to view 1 value per minute, in other words, to see every 60th row?
You can use the row_number() function to enumerate the rows, and then use modulo arithmetic to get the rows:
select signalname, value, timestamp
from (select t.*,
row_number() over (order by timestamp) as seqnum
from table t
) t
where seqnum % 60 = 0;
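The row_number() sampling can be sketched in SQLite (3.25+) via Python's stdlib sqlite3 module, shrunk to 10 rows and every 3rd row so the result is easy to check; the table and values are invented:

```python
import sqlite3

# Assumes SQLite 3.25+; 10 invented one-per-second samples, keeping every
# 3rd row instead of every 60th to keep the example small.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signals (signalname TEXT, value REAL, ts INTEGER)")
conn.executemany(
    "INSERT INTO signals VALUES (?, ?, ?)",
    [("s1", float(i), i) for i in range(10)],
)

# row_number() numbers the rows 1..10 by timestamp; the modulo filter then
# keeps rows 3, 6 and 9.
rows = conn.execute("""
    SELECT signalname, value, ts
    FROM (SELECT s.*,
                 row_number() OVER (ORDER BY ts) AS seqnum
          FROM signals s) sub
    WHERE seqnum % 3 = 0
""").fetchall()
print(rows)
```

With real data the same query with `% 60` keeps one row per minute.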
If your data really is regular, you can also extract the seconds value and check when that is 0:
select signalname, value, timestamp
from table t
where datepart(second, timestamp) = 0
This assumes that timestamp is stored in an appropriate date/time format.
Instead of sampling, you could use the one minute average for your plot:
select name
, min(timestamp)
, avg(value)
from Yourtable
group by
name
, datediff(minute, '2013-01-01', timestamp)
If you are charting months, even the hourly average might be detailed enough.
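The one-minute average can be sketched in SQLite via Python's stdlib sqlite3 module; with epoch-second timestamps, integer division by 60 plays the role of SQL Server's datediff(minute, ...). The table name and sample rows are invented:

```python
import sqlite3

# Invented sample data: two rows in minute 0 and two rows in minute 1,
# stored as epoch seconds so ts / 60 buckets them by minute.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (name TEXT, ts INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [("s1", 0, 1.0), ("s1", 30, 3.0), ("s1", 60, 10.0), ("s1", 90, 20.0)],
)

# One output row per name per minute: the earliest timestamp in the minute
# and the average value over that minute.
rows = conn.execute("""
    SELECT name, MIN(ts), AVG(value)
    FROM yourtable
    GROUP BY name, ts / 60
    ORDER BY MIN(ts)
""").fetchall()
print(rows)
```

Swapping `ts / 60` for `ts / 3600` would give the hourly average suggested for month-scale charts.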