Calculate delta (difference of current and previous row) in SQL

I have a table like this (trans is the name of the table, for example):
Id | Trans_Date | Account_Id | Amount | Delta
------------------------------------------------------
1 | 2011-02-20 00:00:00 | 2 | 1200 | NULL
------------------------------------------------------
2 | 2011-03-21 00:00:00 | 2 | 2000 | NULL
------------------------------------------------------
3 | 2011-04-22 00:00:00 | 2 | 4500 | NULL
------------------------------------------------------
4 | 2011-02-20 00:00:00 | 4 | 1000 | NULL
------------------------------------------------------
5 | 2011-03-21 00:00:00 | 4 | 2400 | NULL
------------------------------------------------------
6 | 2011-04-22 00:00:00 | 4 | 3000 | NULL
------------------------------------------------------
I have to update the Delta column. Its value is the difference between the current row and the preceding row of the same account, assuming there is one transaction per month.
Here is a dummy SQL query that generates the delta value:
select tt1.id, tt1.amount , tt1.AccountId,(tt1.amount-tt2.amount) as delta
from trans tt1 left outer JOIN trans tt2
on tt1.accountid = tt2.accountid
where month(tt1.date1)-month(tt2.date1)=1 ;
The result of this query is
id | amount | AccountId | delta |
-------------------------------------
2 | 2000 | 2 | 800 |
-------------------------------------
3 | 4500 | 2 | 2500 |
-------------------------------------
5 | 2400 | 4 | 1400 |
-------------------------------------
6 | 3000 | 4 | 600 |
-------------------------------------
But the delta of a row that has no preceding row should be its own amount, such as
1 | 1200 | 2 | 1200 |
-----------------------------------------
4 | 1000 | 4 | 1000 |
-----------------------------------------
These rows are missing from the result.
Please help me fix this query.

Here's your original query modified accordingly:
select
tt1.id,
tt1.amount,
tt1.AccountId,
(tt1.amount-ifnull(tt2.amount, 0)) as delta
from trans tt1
left outer JOIN trans tt2 on tt1.accountid = tt2.accountid
and month(tt1.date1)-month(tt2.date1)=1;
The month comparison is moved from where to on, which makes a difference for left join, and tt2.amount is replaced with ifnull(tt2.amount, 0).
The UPDATE version of the script:
update tt1
set delta = (tt1.amount-ifnull(tt2.amount, 0))
from trans tt1
left outer JOIN trans tt2 on tt1.accountid = tt2.accountid
and month(tt1.date1)-month(tt2.date1)=1;
The correct MySQL syntax for the above update should actually be:
update trans tt1
left outer JOIN trans tt2
on tt1.accountid = tt2.accountid
and month(tt1.date1)-month(tt2.date1)=1
set tt1.delta = (tt1.amount-ifnull(tt2.amount, 0));
(Thanks @pinkb.)
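The effect of moving the month comparison from WHERE to ON can be checked with a small sketch. It runs against SQLite through Python's sqlite3 module, which is an assumption made only to keep the example self-contained (the thread targets MySQL). SQLite has no MONTH() function, so strftime('%m', ...) stands in for it, and the column names date1 and accountid follow the queries rather than the table listing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trans (id INTEGER PRIMARY KEY, date1 TEXT, accountid INTEGER, amount INTEGER);
INSERT INTO trans VALUES
  (1, '2011-02-20', 2, 1200), (2, '2011-03-21', 2, 2000), (3, '2011-04-22', 2, 4500),
  (4, '2011-02-20', 4, 1000), (5, '2011-03-21', 4, 2400), (6, '2011-04-22', 4, 3000);
""")

# Stand-in for MySQL's month(tt1.date1) - month(tt2.date1) = 1 (SQLite-specific assumption).
month_diff = ("CAST(strftime('%m', tt1.date1) AS INTEGER)"
              " - CAST(strftime('%m', tt2.date1) AS INTEGER) = 1")

# Condition in WHERE: rows without a predecessor are filtered out after the join.
in_where = conn.execute(f"""
    SELECT tt1.id, tt1.amount - tt2.amount
    FROM trans tt1 LEFT JOIN trans tt2 ON tt1.accountid = tt2.accountid
    WHERE {month_diff}
    ORDER BY tt1.id""").fetchall()

# Condition in ON: unmatched rows survive with NULLs, which ifnull turns into 0.
in_on = conn.execute(f"""
    SELECT tt1.id, tt1.amount - ifnull(tt2.amount, 0)
    FROM trans tt1 LEFT JOIN trans tt2
      ON tt1.accountid = tt2.accountid AND {month_diff}
    ORDER BY tt1.id""").fetchall()

print(in_where)  # four rows: ids 2, 3, 5, 6
print(in_on)     # six rows: ids 1 and 4 now appear with delta = amount
```

The month arithmetic keeps the question's assumption of one transaction per month within a single year; a December to January step would yield -11 rather than 1 and would not match.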

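You can use a correlated subquery, though it is not necessarily the most efficient query. The subquery has to be limited to the same account, and COALESCE supplies 0 for the first row of each account:
UPDATE trans
SET Delta = Amount - COALESCE(
(SELECT t1.Amount FROM trans t1
WHERE t1.Account_Id = trans.Account_Id
AND t1.Trans_Date < trans.Trans_Date
ORDER BY t1.Trans_Date DESC LIMIT 1), 0);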

Can you "union all" your query with a query that simply selects the first item for each account with the initial balance set as the delta, and the ID of that record as the id for the delta record? The result would be ordered by ID. Dirty but is it applicable?
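A minimal sketch of that UNION ALL idea follows, again run against SQLite via Python's sqlite3 as an assumption for the sake of a runnable example. The first branch is the asker's join for rows that have a previous-month row; the second branch picks each account's earliest row with NOT EXISTS and uses its amount as the delta. The strftime call is a SQLite stand-in for MONTH().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trans (id INTEGER PRIMARY KEY, date1 TEXT, accountid INTEGER, amount INTEGER);
INSERT INTO trans VALUES
  (1, '2011-02-20', 2, 1200), (2, '2011-03-21', 2, 2000), (3, '2011-04-22', 2, 4500),
  (4, '2011-02-20', 4, 1000), (5, '2011-03-21', 4, 2400), (6, '2011-04-22', 4, 3000);
""")

rows = conn.execute("""
    -- rows that have a previous-month row for the same account
    SELECT tt1.id, tt1.amount, tt1.accountid, tt1.amount - tt2.amount AS delta
    FROM trans tt1
    JOIN trans tt2
      ON tt1.accountid = tt2.accountid
     AND CAST(strftime('%m', tt1.date1) AS INTEGER)
       - CAST(strftime('%m', tt2.date1) AS INTEGER) = 1
    UNION ALL
    -- earliest row of each account: the delta is the amount itself
    SELECT t.id, t.amount, t.accountid, t.amount
    FROM trans t
    WHERE NOT EXISTS (SELECT 1 FROM trans p
                      WHERE p.accountid = t.accountid AND p.date1 < t.date1)
    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

Compared with the LEFT JOIN plus ifnull variant, this reads the table more often, so it is mainly useful where the outer join form is not an option.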

Related

SQL insert/update to a table from another table based on some condition

I have a table like below named table1:
+-----------+-------+
| productid | stock |
+-----------+-------+
| 1 | 10 |
| 2 | 20 |
| 3 | 30 |
+-----------+-------+
I need to insert / update to the above table from another table named table2:
+-----+-------+
| PId | stock |
+-----+-------+
| 1 | 20 |
| 2 | 40 |
| 4 | 10 |
+-----+-------+
I would like to execute an SQL query with below condition:
If PId from table2 exists as productid in table1, then the value of stock needs to be updated.
If PId from table2 doesn't exist as productid in table1, then the row needs to be inserted into table1 from table2 as a new row.
So after executing query output in table1 would be like below:
+-----------+-------+
| productid | stock |
+-----------+-------+
| 1 | 20 |
| 2 | 40 |
| 3 | 30 |
| 4 | 10 |
+-----------+-------+
Help me to get the query as I am new to SQL. Thanks in advance!

You want a MERGE statement. Many database management systems support it nowadays (for example Oracle, SQL Server, DB2, and PostgreSQL 15 and later), though MySQL and SQLite do not:
MERGE INTO table1
USING table2
ON table1.productid=table2.pid
WHEN MATCHED THEN UPDATE SET
stock = table2.stock
WHEN NOT MATCHED THEN INSERT VALUES (
table2.pid
, table2.stock
)
;
SELECT * FROM table1 ORDER BY 1;
-- out OUTPUT
-- out --------
-- out 3
-- out (1 row)
-- out
-- out Time: First fetch (1 row): 27.090 ms. All rows formatted: 27.155 ms
-- out productid | stock
-- out -----------+-------
-- out 1 | 20
-- out 2 | 40
-- out 3 | 30
-- out 4 | 10
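Where MERGE is unavailable, the same result can usually be reached with an UPDATE followed by an INSERT. The sketch below is an assumption-laden illustration rather than the answer's method: it uses SQLite through Python's sqlite3 and relies only on EXISTS / NOT EXISTS, which most engines accept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (productid INTEGER PRIMARY KEY, stock INTEGER);
CREATE TABLE table2 (pid INTEGER PRIMARY KEY, stock INTEGER);
INSERT INTO table1 VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO table2 VALUES (1, 20), (2, 40), (4, 10);

-- step 1: overwrite stock for products present in both tables
UPDATE table1
SET stock = (SELECT t2.stock FROM table2 t2 WHERE t2.pid = table1.productid)
WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.pid = table1.productid);

-- step 2: add the products that exist only in table2
INSERT INTO table1 (productid, stock)
SELECT t2.pid, t2.stock
FROM table2 t2
WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.productid = t2.pid);
""")

result = conn.execute(
    "SELECT productid, stock FROM table1 ORDER BY productid").fetchall()
print(result)
```

Unlike a single MERGE, the two statements are not atomic on their own, so on a shared database they would normally be wrapped in one transaction.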

SQL interpolating missing values for a specific date range - with some conditions

There are some similar questions on the site, but I believe mine warrants a new post because there are specific conditions that need to be incorporated.
I have a table with monthly intervals, structured like this:
+----+--------+--------------+--------------+
| ID | amount | interval_beg | interval_end |
+----+--------+--------------+--------------+
| 1 | 10 | 12/17/2017 | 1/17/2018 |
| 1 | 10 | 1/18/2018 | 2/18/2018 |
| 1 | 10 | 2/19/2018 | 3/19/2018 |
| 1 | 10 | 3/20/2018 | 4/20/2018 |
| 1 | 10 | 4/21/2018 | 5/21/2018 |
+----+--------+--------------+--------------+
I've found that sometimes there is a month of data missing around the end/beginning of the year where I know it should exist, like this:
+----+--------+--------------+--------------+
| ID | amount | interval_beg | interval_end |
+----+--------+--------------+--------------+
| 2 | 10 | 10/14/2018 | 11/14/2018 |
| 2 | 10 | 11/15/2018 | 12/15/2018 |
| 2 | 10 | 1/17/2019 | 2/17/2019 |
| 2 | 10 | 2/18/2019 | 3/18/2019 |
| 2 | 10 | 3/19/2019 | 4/19/2019 |
+----+--------+--------------+--------------+
What I need is a statement that will:
1) Identify where this year-end period is missing (but not find missing months that aren't at the beginning/end of the year).
2) Create this interval by using the length of an existing interval for that ID (maybe using the mean interval length for the ID to do it?). I could create the interval from the "gap" between the previous and next interval, except that won't work if I'm missing an interval at the beginning or end of the ID's record (i.e. if the record starts at, say, 1/16/2015, I need the amount for 12/15/2014-1/15/2015).
3) Interpolate an 'amount' for this interval using the mean daily 'amount' from the closest existing interval.
The end result for the sample above should look like:
+----+--------+--------------+--------------+
| ID | amount | interval_beg | interval_end |
+----+--------+--------------+--------------+
| 2 | 10 | 10/14/2018 | 11/14/2018 |
| 2 | 10 | 11/15/2018 | 12/15/2018 |
| 2 | 10 | 12/16/2018 | 1/16/2019 |
| 2 | 10 | 1/17/2019 | 2/17/2019 |
| 2 | 10 | 2/18/2019 | 3/18/2019 |
+----+--------+--------------+--------------+
A 'nice to have' would be a flag indicating that this value is interpolated.
Is there a way to do this efficiently in SQL? I have written a solution in SAS, but have a need to move it to SQL, and my SAS solution is very inefficient (optimization isn't a goal, so any statement that does what I need is fantastic).
EDIT: I've made an SQLFiddle with my example table here:
http://sqlfiddle.com/#!18/8b16d

You can use a sequence of CTEs to build up the data for the missing periods. In this query, the first CTE (EOYS) generates all the end-of-year dates (YYYY-12-31) relevant to the table; the second (INTERVALS) computes the average interval length for each ID; and the third (MISSING) attempts to find start (from t2) and end (from t3) dates of adjoining intervals for any missing (indicated by t1.ID IS NULL) end-of-year interval. The output of this CTE is then used in an INSERT ... SELECT query to add missing interval records to the table, generating missing dates by adding/subtracting the interval length to the end/start date of the adjacent interval as necessary.
First though we add the interp column to indicate if a row was interpolated:
ALTER TABLE Table1 ADD interp TINYINT NOT NULL DEFAULT 0;
This sets interp to 0 for all existing rows. Then we can do the INSERT, setting interp for all those rows to 1:
WITH EOYS AS (
SELECT DISTINCT DATEFROMPARTS(DATEPART(YEAR, interval_beg), 12, 31) AS eoy
FROM Table1
),
INTERVALS AS (
SELECT ID, AVG(DATEDIFF(DAY, interval_beg, interval_end)) AS interval_len
FROM Table1
GROUP BY ID
),
MISSING AS (
SELECT e.eoy,
ids.ID,
i.interval_len,
COALESCE(t2.amount, t3.amount) AS amount,
DATEADD(DAY, 1, t2.interval_end) AS interval_beg,
DATEADD(DAY, -1, t3.interval_beg) AS interval_end
FROM EOYS e
CROSS JOIN (SELECT DISTINCT ID FROM Table1) ids
JOIN INTERVALS i ON i.ID = ids.ID
LEFT JOIN Table1 t1 ON ids.ID = t1.ID
AND e.eoy BETWEEN t1.interval_beg AND t1.interval_end
LEFT JOIN Table1 t2 ON ids.ID = t2.ID
AND DATEADD(MONTH, -1, e.eoy) BETWEEN t2.interval_beg AND t2.interval_end
LEFT JOIN Table1 t3 ON ids.ID = t3.ID
AND DATEADD(MONTH, 1, e.eoy) BETWEEN t3.interval_beg AND t3.interval_end
WHERE t1.ID IS NULL
)
INSERT INTO Table1 (ID, amount, interval_beg, interval_end, interp)
SELECT ID,
amount,
COALESCE(interval_beg, DATEADD(DAY, -interval_len, interval_end)) AS interval_beg,
COALESCE(interval_end, DATEADD(DAY, interval_len, interval_beg)) AS interval_end,
1 AS interp
FROM MISSING
This adds the following rows to the table:
ID amount interval_beg interval_end interp
2 10 2017-12-05 2018-01-04 1
2 10 2018-12-16 2019-01-16 1
2 10 2019-12-28 2020-01-27 1

How can I return rows that meet criteria for occurring in one day, but over a date range?

I have a query (shown below) that returns all rows for UserID that have :
a JOIN,
a subsequent CANCEL, and then
a subsequent JOIN
But I need to return the UserIDs that meet these criteria of having a JOIN, CANCEL, then JOIN in sequence ON THE SAME DAY, across a date range: for example BETWEEN 2016-11-01 and 2016-11-30. So in the example table below, UserIDs 12345, 9876, and 33445 would be returned.
I'm not sure how this is achieved. Would it involve some sort of grouping on the timestamp date? Would a stored procedure that iterates over conditional tests for UserID and ActionType be a viable solution?
+--------+--------+----------------------+------------+------------------+
| rownum | UserID | Timestamp | ActionType | Return in query? |
+--------+--------+----------------------+------------+------------------+
| 1 | 12345 | 2016-11-01 08:25:39 | JOIN | yes |
| 2 | 12345 | 2016-11-01 08:27:00 | NULL | yes |
| 3 | 12345 | 2016-11-01 08:28:20 | DOWNGRADE | yes |
| 4 | 12345 | 2016-11-01 08:31:34 | NULL | yes |
| 5 | 12345 | 2016-11-01 08:32:44 | CANCEL | yes |
| 6 | 12345 | 2016-11-01 08:45:51 | NULL | yes |
| 7 | 12345 | 2016-11-01 08:50:57 | JOIN | yes |
| 1 | 9876 | 2016-11-01 16:05:42 | JOIN | yes |
| 2 | 9876 | 2016-11-01 16:07:33 | CANCEL | yes |
| 3 | 9876 | 2016-11-01 16:09:09 | JOIN | yes |
| 1 | 56565 | 2016-11-01 18:15:16 | JOIN | no |
| 2 | 56565 | 2016-11-01 19:22:25 | CANCEL | no |
| 3 | 56565 | 2016-11-01 20:05:05 | CANCEL | no |
| 1 | 34343 | 2016-11-01 05:32:56 | JOIN | no |
NEXT DAY
| 1 | 7878 | 2016-11-02 10:05:04 | JOIN | no |
| 2 | 7878 | 2016-11-02 10:06:06 | JOIN | no |
| 1 | 33445 | 2016-11-02 02:33:34 | JOIN | yes |
| 2 | 33445 | 2016-11-02 02:33:34 | NULL | yes |
| 3 | 33445 | 2016-11-02 02:37:56 | CANCEL | yes |
| 4 | 33445 | 2016-11-02 02:38:01 | JOIN | yes |
+--------+--------+----------------------+------------+------------------+
Here is a link to the question which led me to the query that pulls data for exactly one day (not a range): How can I return rows that meet a specific sequence of events?
Here is the query:
SELECT *
FROM T
WHERE USERID IN (
select distinct first_join.userid
from t first_join
inner join t cancel
on first_join.tmstmp < cancel.tmstmp
and first_join.userid = cancel.userid
inner join t second_join
on second_join.tmstmp > cancel.tmstmp
and second_join.userid = cancel.userid
where first_join.actiontype = 'JOIN'
and cancel.actiontype = 'CANCEL'
and second_join.actiontype = 'JOIN'
)
Clarification of comments/questions:
vkp:
QUESTION: can join,cancel,join be on different days with other values in between? ANSWER: No, I need to find the join>cancel>join that occur in one day only. If there is a join on 11/1 and a cancel on 11/2, that UserID does not need to be returned.
QUESTION: if a particular date satisfies cancel,join,cancel in the date range, will that be enough for a user to be included in the results? ANSWER: No, I am specifically looking at rows that meet the ActionType sequence in one day, not over a range of days.
Thank you!

To get all the users and the days when they have the specified sequence of events happen, use
select distinct userid,tmstamp::date
from t
where ActionType = 'CANCEL' and tmstamp::date between date '2016-11-01' and date '2016-11-30' and
exists (select 1
from t t2
where t2.userId = t.userId and
t2.actiontype = 'JOIN' and
t2.tmstamp < t.tmstamp and
t2.tmstamp::date = t.tmstamp::date
) and
exists (select 1
from t t3
where t3.userId = t.userId and
t3.actiontype = 'JOIN' and
t3.tmstamp > t.tmstamp and
t3.tmstamp::date = t.tmstamp::date
)
To get all the rows for such users on those days, wrap the previous query as a subquery against the original table.
select * from t where (userid,tmstamp::date) in (
select distinct userid,tmstamp::date
from t
where ActionType = 'CANCEL'
and tmstamp::date between date '2016-11-01' and date '2016-11-30' and
exists (select 1
from t t2
where t2.userId = t.userId and
t2.actiontype = 'JOIN' and
t2.tmstamp < t.tmstamp and
t2.tmstamp::date = t.tmstamp::date
) and
exists (select 1
from t t3
where t3.userId = t.userId and
t3.actiontype = 'JOIN' and
t3.tmstamp > t.tmstamp and
t3.tmstamp::date = t.tmstamp::date
)
)
Note that this is a minor tweak to @Gordon's query from your previous question (to check for this sequence of events on a particular day), which I felt was the best.
Edit: An alternate approach with window functions
select * from t where (userid,tmstamp::date) in (
select distinct userid,tmstamp::date from (
select t.*
,min(case when actiontype = 'JOIN' then 1 else 2 end) over(partition by t.userid,t.tmstamp::date order by t.tmstamp rows between unbounded preceding and 1 preceding) min_before
,min(case when actiontype = 'JOIN' then 1 else 2 end) over(partition by t.userid,t.tmstamp::date order by t.tmstamp rows between 1 following and unbounded following) min_after
from (select userid,tmstamp from t where actiontype='CANCEL') tc
join t on t.userid=tc.userid and t.tmstamp::date=tc.tmstamp::date
) x
where min_before=1 and min_after=1 and actiontype='CANCEL'
)
1) Using a case expression we designate all actiontype JOIN rows as 1 and 2 for all other actiontypes.
2) We join it with the actiontype CANCEL rows.
3) Then we check for the minimum value before CANCEL and minimum value after CANCEL for each date and userid combination. Per the case expression defined, it should be 1.
4) Get all such dates and userid's and fetch the corresponding rows.
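The same-day JOIN, CANCEL, JOIN check can be tried out with a small sketch. It uses SQLite through Python's sqlite3, which is an assumption made for portability: SQLite's date() replaces the PostgreSQL-style tmstamp::date cast used in the answer. The rows are a subset of the question's sample plus one hypothetical user (11111, not in the original data) whose second JOIN falls on the next day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (userid INTEGER, tmstamp TEXT, actiontype TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (9876, "2016-11-01 16:05:42", "JOIN"),
    (9876, "2016-11-01 16:07:33", "CANCEL"),
    (9876, "2016-11-01 16:09:09", "JOIN"),
    (56565, "2016-11-01 18:15:16", "JOIN"),
    (56565, "2016-11-01 19:22:25", "CANCEL"),
    (56565, "2016-11-01 20:05:05", "CANCEL"),
    (7878, "2016-11-02 10:05:04", "JOIN"),
    (7878, "2016-11-02 10:06:06", "JOIN"),
    # hypothetical user: JOIN and CANCEL on one day, second JOIN after midnight
    (11111, "2016-11-03 23:50:00", "JOIN"),
    (11111, "2016-11-03 23:55:00", "CANCEL"),
    (11111, "2016-11-04 00:05:00", "JOIN"),
])

# For every CANCEL in the range, require a JOIN earlier and a JOIN later on the same day.
matches = conn.execute("""
    SELECT DISTINCT c.userid, date(c.tmstamp) AS day
    FROM t c
    WHERE c.actiontype = 'CANCEL'
      AND date(c.tmstamp) BETWEEN '2016-11-01' AND '2016-11-30'
      AND EXISTS (SELECT 1 FROM t j
                  WHERE j.userid = c.userid AND j.actiontype = 'JOIN'
                    AND j.tmstamp < c.tmstamp
                    AND date(j.tmstamp) = date(c.tmstamp))
      AND EXISTS (SELECT 1 FROM t j
                  WHERE j.userid = c.userid AND j.actiontype = 'JOIN'
                    AND j.tmstamp > c.tmstamp
                    AND date(j.tmstamp) = date(c.tmstamp))
""").fetchall()
print(matches)
```

Only user 9876 qualifies here: 56565 never rejoins, 7878 never cancels, and the hypothetical 11111 rejoins on a different day.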

sql join two tables with 0 values

I'm having some trouble joining the contents of two tables. Here's the current situation and the table that should be shown as the result:
Table a
id | income
----------
1 | 100
2 | 200
3 | 300
4 | 400
5 | 500
Table b
id | outcome
----------
1 | 10
2 | 20
6 | 60
7 | 70
3 | 30
Result table
id | income | outcome | balance
--------------------------------
1 | 100 | 10 | 100-10=90
2 | 200 | 20 | 200-20=180
3 | 300 | 30 | 300-30=270
4 | 400 | 0 | 400-0=400
5 | 500 | 0 | 500-0=500
6 | 0 | 60 | 0-60=-60
7 | 0 | 70 | 0-70=-70
Every id should be distinct in the result table.
Income and outcome should both be shown in the result table. If an id is missing from one of the tables, its income or outcome should be 0.
The balance column is calculated as income - outcome.
Please provide code without using the statement
where id not in (select id from ...)
because the NOT IN statement is not supported by the system I am using.
Thank you very much for your help.

You can do what you want using a full outer join:
select coalesce(a.id, b.id) as id,
coalesce(a.income, 0) as income,
coalesce(b.outcome, 0) as outcome,
(coalesce(a.income, 0) - coalesce(b.outcome, 0)) as balance
from tablea a full outer join
tableb b
on a.id = b.id;
You don't specify the database. Full outer join is ANSI-standard and supported by most databases, although MySQL is a notable exception.
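If the system at hand lacks FULL OUTER JOIN, one possible workaround (a swapped-in technique, not the answer's query) is to stack both tables with UNION ALL and aggregate per id. The sketch below assumes SQLite via Python's sqlite3 and the table names tablea and tableb from the answer; it uses neither NOT IN nor an outer join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tablea (id INTEGER, income INTEGER);
CREATE TABLE tableb (id INTEGER, outcome INTEGER);
INSERT INTO tablea VALUES (1, 100), (2, 200), (3, 300), (4, 400), (5, 500);
INSERT INTO tableb VALUES (1, 10), (2, 20), (6, 60), (7, 70), (3, 30);
""")

rows = conn.execute("""
    SELECT id,
           SUM(income)  AS income,
           SUM(outcome) AS outcome,
           SUM(income) - SUM(outcome) AS balance
    FROM (SELECT id, income, 0 AS outcome FROM tablea
          UNION ALL
          SELECT id, 0 AS income, outcome FROM tableb) AS combined
    GROUP BY id   -- one output row per id, whichever table it came from
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

The padding zeros play the role that COALESCE plays in the outer join version, so ids present in only one table still get a 0 in the other column.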

SQL Query to Join Two Tables Based On Closest Timestamp

I need to retrieve the records from dbo.transaction (transactions of all users, with more than one transaction per user) whose timestamp is closest to the time in dbo.bal (current balance details of each user, with only one record per user).
That is, the number of resulting records should equal the number of records in dbo.bal.
I tried the query below, but I get only the records with a timestamp less than the time in dbo.bal. There are also some records with a timestamp greater than, and closest to, dbo.bal.time.
SELECT dbo.bal.uid,
dbo.bal.userId,
dbo.bal.balance,
dbo.bal.time,
(SELECT TOP 1 transactionBal
FROM dbo.transaction
WHERE TIMESTAMP <= dbo.bal.time
ORDER BY TIMESTAMP DESC) AS newBal
FROM dbo.bal
WHERE dbo.bal.time IS NOT NULL
ORDER BY dbo.bal.time DESC
here is my table structure,
dbo.transaction
---------------
| uid| userId | description| timestamp | credit | transactionBal
-------------------------------------------------------------------------
| 1 | 101 | buy credit1| 2012-01-25 03:23:31.624 | 100 | 500
| 2 | 102 | buy credit5| 2012-01-18 03:13:12.657 | 500 | 700
| 3 | 103 | buy credit3| 2012-01-15 02:16:34.667 | 300 | 300
| 4 | 101 | buy credit2| 2012-01-13 05:34:45.637 | 200 | 300
| 5 | 101 | buy credit1| 2012-01-12 07:45:21.457 | 100 | 100
| 6 | 102 | buy credit2| 2012-01-01 08:18:34.677 | 200 | 200
dbo.bal
-------
| uid| userId | balance | time |
-----------------------------------------------------
| 1 | 101 | 500 | 2012-01-13 05:34:45.645 |
| 2 | 102 | 700 | 2012-01-01 08:18:34.685 |
| 3 | 103 | 300 | 2012-01-15 02:16:34.672 |
And the result should be like,
| Id | userId | balance | time | credit | transactionBal
-----------------------------------------------------------------------------
| 1 | 101 | 500 | 2012-01-13 05:34:45.645 | 200 | 300
| 2 | 102 | 700 | 2012-01-01 08:18:34.685 | 200 | 200
| 3 | 103 | 300 | 2012-01-15 02:16:34.672 | 300 | 300
Please help me. Any help is much appreciated. Thank you.

It would be helpful if you posted your table structures, but ...
I think your inner query needs a join condition (one is not actually in your question).
Your ORDER BY clause in the inner query could be ABS(TIMESTAMP - dbo.bal.time). That should give you the smallest difference between the two.
Does that help?
Based on the following SQL Fiddle http://sqlfiddle.com/#!3/7a900/15 I came up with ...
SELECT
bal.uid,
bal.userId,
bal.balance,
bal.time,
trn.timestamp,
trn.description,
datediff(ms, bal.time, trn.timestamp)
FROM
money_balances bal
JOIN money_transaction trn on
trn.userid = bal.userid and
trn.uid =
(
select top 1 uid
from money_transaction trn2
where trn2.userid = trn.userid
order by abs(datediff(ms, bal.time, trn2.timestamp))
)
WHERE
bal.time IS NOT NULL
ORDER BY
bal.time DESC
I cannot vouch for its performance because I know nothing of your data, but I believe it works.
I have simplified my answer - I believe what you need is
SELECT
bal.uid as baluid,
(
select top 1 uid
from money_transaction trn2
where trn2.userid = bal.userid
order by abs(datediff(ms, bal.time, trn2.timestamp))
) as tranuid
FROM
money_balances bal
and from that you can derive all the datasets you need.
for example :
with matched_credits as
(
SELECT
bal.uid as baluid,
(
select top 1 uid
from money_transaction trn2
where trn2.userid = bal.userid
order by abs(datediff(ms, bal.time, trn2.timestamp))
) as tranuid
FROM
money_balances bal
)
select
*
from
matched_credits mc
join money_balances mb on
mb.uid = mc.baluid
join money_transaction trn on
trn.uid = mc.tranuid
Try:
SELECT dbo.bal.uid,
dbo.bal.userId,
dbo.bal.balance,
dbo.bal.time,
(SELECT TOP 1 transactionBal
FROM dbo.transaction
WHERE dbo.transaction.userId = dbo.bal.userId
ORDER BY abs(datediff(ms, dbo.bal.time, TIMESTAMP))) AS newBal
FROM dbo.bal
WHERE dbo.bal.time IS NOT NULL
ORDER BY dbo.bal.time DESC