Extracting account's transactions with no other transactions with this counterparty - sql

I have a database with transactions of accounts. The relevant columns for me are: Account, Amount, Date, Description and Transaction_Code.
My goal is to extract the rows for a given account that meet my trigger points.
The trigger points I've succeeded in writing are Amount greater than 200 and Transaction_Code in ('1','2','3').
The only trigger point I'm struggling with is this: the account has no other transactions with this counterparty in the last 21 days. I've only succeeded in selecting the range of dates I need.
Example of the dataset:
Account  Amount  Date        Description  Transaction_Code
555      280     2019-10-06  amt_fee      1
555      700     2019-09-20  refund       2
555      250     2019-10-01  amt_fee      1
Here is a snippet of the SQL I wrote for the example, for better understanding:
select Account, Amount, Date, Description
from MyTable
where Account = '555' and Date between '2019-09-15' and '2019-10-06'
and Amount >= 200
and Transaction_Code in ('1','2','3')
The problem I have is how to express the condition "the account has no other transactions with this counterparty in the last 21 days". Counterparty refers to the Description or Transaction_Code columns.
How should I write that condition for my real, larger dataset? With GROUP BY and COUNT(DISTINCT)?

You could add a not exists condition with a correlated subquery that ensures that the same Account did not have a transaction with the same Description or Transaction_Code within the last 21 days.
select Account, Amount, Date, Description
from MyTable t
where
    Account = '555' and Date between '2019-09-15' and '2019-10-06'
    and Amount >= 200
    and Transaction_Code in (1, 2, 3)
    and not exists (
        select 1
        from MyTable t1
        where
            t1.Account = t.Account
            and (t1.Description = t.Description or t1.Transaction_Code = t.Transaction_Code)
            and t1.date < t.date
            and t1.date >= dateadd(day, -21, t.date)
    )
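To try the NOT EXISTS idea without a SQL Server instance, here is a minimal sketch that runs the same logic against an in-memory SQLite database from Python. It assumes ISO-formatted date strings and swaps SQL Server's dateadd for SQLite's date() modifier; the function name flag_isolated_transactions is made up for this illustration.

```python
import sqlite3

def flag_isolated_transactions(rows, account, window_days=21):
    """Return qualifying rows that have no earlier row with the same
    Description or Transaction_Code within the previous window_days."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE MyTable (Account TEXT, Amount REAL, Date TEXT,"
        " Description TEXT, Transaction_Code TEXT)"
    )
    con.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT t.Account, t.Amount, t.Date, t.Description
        FROM MyTable t
        WHERE t.Account = ?
          AND t.Amount >= 200
          AND t.Transaction_Code IN ('1', '2', '3')
          AND NOT EXISTS (
              SELECT 1
              FROM MyTable t1
              WHERE t1.Account = t.Account
                AND (t1.Description = t.Description
                     OR t1.Transaction_Code = t.Transaction_Code)
                AND t1.Date < t.Date
                -- SQLite equivalent of dateadd(day, -21, t.date)
                AND t1.Date >= date(t.Date, ?)
          )
        ORDER BY t.Date
    """
    result = con.execute(query, (account, f"-{window_days} days")).fetchall()
    con.close()
    return result

# The three sample rows from the question.
sample = [
    ("555", 280, "2019-10-06", "amt_fee", "1"),
    ("555", 700, "2019-09-20", "refund", "2"),
    ("555", 250, "2019-10-01", "amt_fee", "1"),
]
isolated = flag_isolated_transactions(sample, "555")
```

With the sample data, the 2019-10-06 row is dropped because the same account already has an amt_fee transaction five days earlier; the other two rows have no earlier match in their 21-day window.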

Related

Return data where the running total of amounts 30 days from another row is greater than or equal to the amount for that row

Let's say I have a table that contains the date, account, product, purchase type, and amount like below:
Looking at this table, you can see that for any particular account/product combination, there are buys and sells. Essentially, what I'd like to write is a SQL query that flags the following: Are there accounts that bought at a certain amount and then sold the same aggregate amount or more 30 days from that buy?
So for example, we can see account 1 bought product A for 20k on 8/1. If we look at the running sum of sells by account 1 for product A over the next 30 days, we see they sold a total of 20k - the same as the initial buy:
Ideally, the query would return results that flag all of these instances: for each individual buy, find all sells for that product/account 30 days from that buy, and only return rows where the running total of sells is greater than or equal to that initial buy.
EDIT: Using the sample data provided, the desired output should look more or less like the following:
You'll see that the buy on 8/2 for product B/account 2 is not returned because the running sum of sells for that product/account/buy combination over the next 30 days does not equal or exceed the buy amount of 35k but it does return rows for the buy on 8/3 for product B/ account 2 because the sells do exceed the buy amount of 10k.
I know I need to self join the sells against the buys, where the accounts/products equal and the datediff is less than or equal 30 and I basically have that part structured. What I can't seem to get working is the running total part and only returning data when that total is greater than or equal to that buy. I know I likely need to use the over/partition by clauses for the running sum but I'm struggling to produce the right results/optimize properly. Any help on this would be greatly appreciated - just looking for some general direction on how to approach this.
Bonus: Would be even more powerful to stop returning the sells once the running total passes the buy, so for example, the last two rows in the desired output I provided aren't technically needed - since the first two sells following the buy had already eclipsed the buy amount.
In SQL Server, one option uses a lateral join (cross apply):
select
    t.*,
    case when x.amount >= t.amount then 1 else 0 end as is_returned
from mytable t
cross apply (
    select sum(amount) amount
    from mytable t1
    where
        t1.purchase_type = 'Sell'
        and t1.account = t.account
        and t1.product = t.product
        and t1.date >= t.date
        and t1.date <= dateadd(day, 30, t.date)
) x
where t.purchase_type = 'Buy'
The lateral join sums the amount of "sells" of the same account and product within the following 30 days, which you can then compare with the amount of the buy. The query gives you one row per buy, with a flag that indicates whether the sells reach or exceed the buy amount.
In databases that support the range specification to window functions, this would be more efficiently expressed with a window sum:
select *
from (
    select
        t.*,
        case when amount <= sum(case when purchase_type = 'Sell' then amount end) over(
            partition by account, product
            order by date
            range between current row and interval '30' day following
        ) then 1 else 0 end as is_returned
    from mytable t
) t
where purchase_type = 'Buy'
Edit: this would generate a resultset similar to the third table in your question:
select t.*, x.*
from mytable t
cross apply (
    select
        t1.date sale_date,
        t1.amount sell_amount,
        sum(t1.amount) over(order by t1.date) running_sell_amount,
        sum(t1.amount) over() total_sell_amount
    from mytable t1
    where
        t1.purchase_type = 'Sell'
        and t1.account = t.account
        and t1.product = t.product
        and t1.date >= t.date
        and t1.date <= dateadd(day, 30, t.date)
) x
where t.purchase_type = 'Buy' and t.amount <= x.total_sell_amount
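For the "bonus" part of the question (stop returning sells once the running total has passed the buy), one possible approach is to keep only the sells whose running total before that sell was still below the buy amount. The sketch below runs on SQLite from Python and assumes SQLite 3.25 or later for window functions; the sample rows, the year, and the helper name sells_covering_buys are invented for illustration, since the original sample table is not shown.

```python
import sqlite3

def sells_covering_buys(rows):
    """For each buy fully covered by sells in the next 30 days, return
    only the sells needed to reach the buy amount."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mytable (date TEXT, account INTEGER, product TEXT,"
        " purchase_type TEXT, amount INTEGER)"
    )
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        WITH paired AS (
            -- every buy joined to its sells in the following 30 days
            SELECT b.rowid AS buy_id, b.date AS buy_date, b.account, b.product,
                   b.amount AS buy_amount,
                   s.date AS sell_date, s.amount AS sell_amount,
                   SUM(s.amount) OVER (
                       PARTITION BY b.rowid ORDER BY s.date, s.rowid
                   ) AS running_sell
            FROM mytable b
            JOIN mytable s
              ON s.purchase_type = 'Sell'
             AND s.account = b.account
             AND s.product = b.product
             AND s.date >= b.date
             AND s.date <= date(b.date, '+30 days')
            WHERE b.purchase_type = 'Buy'
        ),
        flagged AS (
            SELECT paired.*,
                   MAX(running_sell) OVER (PARTITION BY buy_id) AS total_sell
            FROM paired
        )
        SELECT buy_date, account, product, buy_amount,
               sell_date, sell_amount, running_sell
        FROM flagged
        WHERE total_sell >= buy_amount                 -- sells cover the buy
          AND running_sell - sell_amount < buy_amount  -- not yet covered before this sell
        ORDER BY buy_id, sell_date
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical data: account 1 sells 21k against a 20k buy,
# account 2 sells only 10k against a 35k buy.
trades = [
    ("2020-08-01", 1, "A", "Buy", 20000),
    ("2020-08-05", 1, "A", "Sell", 5000),
    ("2020-08-10", 1, "A", "Sell", 15000),
    ("2020-08-20", 1, "A", "Sell", 1000),
    ("2020-08-02", 2, "B", "Buy", 35000),
    ("2020-08-15", 2, "B", "Sell", 10000),
]
covered = sells_covering_buys(trades)
```

Here the third sell of account 1 is omitted because the first two already reach 20k, and the buy of account 2 produces no rows at all.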

SQL Server Interest calculations on transactions

I'm looking for advice on the best way to build a compound interest module in SQL Server. Basic setup:
Table: Transactions {TransID, MemberID, Trans_Date, Trans_Code, Trans_Value}.
Table: Interest {IntID, Int_eff_Date, Int_Rate}
In the interest table, there may be different rates, each with an effective date; the dates can never overlap, though. For example:
Int_Eff_Date Int_Rate
01/01/2016 7%
01/10/2016 7.5%
10/01/2017 8%
I want to calculate the interest based on the transaction date and transaction value, where the correct interest rate is applied relative to transaction date.
So if Table transaction had:
TransID MemberID Trans_Date Trans_Value
1 1 15/04/2016 150
2 1 18/10/2016 200
3 1 24/11/2016 200
4 1 15/01/2017 250
For TransID 1 it would use 7% from 15/04/2016 until 30/09/2016 (168 days), from 01/10/2016 to 09/01/2017 it would use 7.5%, and then from 10/01/2017 to the calculation date (an input parameter) it would use 8%.
It would apply similar methodology for all transactions, add them up and display the interest value.
I'm not sure if I should use cursors, UDF, etc.
This should provide an outline of what you're trying to do.
--Build Test Data
CREATE TABLE #Rates(Int_Eff_Date DATE
, Int_Rate FLOAT)
CREATE TABLE #Transactions(TransID INT
,MemberID INT
,Trans_Date DATE
,Trans_Value INT)
INSERT INTO #Rates
VALUES ('20160101',7)
,('20161001',7.5)
,('20170110',8)
INSERT INTO #Transactions
VALUES
(1,1,'20160415',150)
,(2,1,'20161018',200)
,(3,1,'20161124',200)
,(4,1,'20170115',250)
;WITH cte_Date_Rates
AS
(
SELECT
S.Int_Eff_Date
,ISNULL(E.Int_Eff_Date,'20490101') AS "Expire"
,S.Int_Rate
FROM
#Rates S
OUTER APPLY (SELECT TOP 1 Int_Eff_Date
FROM #Rates E
WHERE E.Int_Eff_Date > S.Int_Eff_Date
ORDER BY E.Int_Eff_Date) E
)
SELECT
T.*
,R.Int_Rate
FROM
#Transactions T
LEFT JOIN cte_Date_Rates R
ON
T.Trans_Date >= R.Int_Eff_Date
AND
T.Trans_Date < R.Expire
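The query above only picks the rate in force on the transaction date. To go one step further and split each transaction across every rate period up to the calculation date, something along these lines might work. This is a sketch run on SQLite from Python (window functions need SQLite 3.25 or later), and it makes two assumptions that are not in the question: interest accrues as simple daily interest on a 365-day year (no compounding), and each period ends the day before the next effective date. The function name interest_segments is hypothetical.

```python
import sqlite3

def interest_segments(calc_date):
    """Return (TransID, Int_Rate, days, interest) for every rate period
    a transaction spans between its date and calc_date."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Rates (Int_Eff_Date TEXT, Int_Rate REAL);
        CREATE TABLE Transactions (TransID INTEGER, MemberID INTEGER,
                                   Trans_Date TEXT, Trans_Value INTEGER);
        INSERT INTO Rates VALUES
            ('2016-01-01', 7), ('2016-10-01', 7.5), ('2017-01-10', 8);
        INSERT INTO Transactions VALUES
            (1, 1, '2016-04-15', 150), (2, 1, '2016-10-18', 200),
            (3, 1, '2016-11-24', 200), (4, 1, '2017-01-15', 250);
    """)
    query = """
        WITH periods AS (
            -- each rate runs until the next effective date (or calc date)
            SELECT Int_Eff_Date AS start_date,
                   COALESCE(LEAD(Int_Eff_Date) OVER (ORDER BY Int_Eff_Date),
                            :calc) AS end_date,
                   Int_Rate
            FROM Rates
        ),
        segments AS (
            -- overlap of [Trans_Date, calc) with each rate period, in days
            SELECT t.TransID, t.Trans_Value, p.start_date, p.Int_Rate,
                   julianday(MIN(p.end_date, :calc))
                   - julianday(MAX(p.start_date, t.Trans_Date)) AS days
            FROM Transactions t
            JOIN periods p
              ON p.start_date < :calc
             AND p.end_date > t.Trans_Date
        )
        SELECT TransID, Int_Rate, days,
               Trans_Value * Int_Rate / 100.0 * days / 365.0 AS interest
        FROM segments
        WHERE days > 0
        ORDER BY TransID, start_date
    """
    result = con.execute(query, {"calc": calc_date}).fetchall()
    con.close()
    return result

segments = interest_segments("2017-03-01")
total_interest = sum(row[3] for row in segments)
```

Because the end of a period is treated as exclusive, the first segment for TransID 1 comes out one day longer than the 168 days counted in the question; adjust the boundary convention to match your own rules. A truly compounding calculation would carry the accrued balance from one period into the next instead of summing independent segments.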

Count number of transactions for first 30 days of account creation for all accounts

I want to count the number of transactions for the first 30 days from an account's creation for all accounts. The issue is not all accounts were created at the same time.
Example: [Acct_createdTable]
Acct Created_date
909099 01/02/2015
878787 02/03/2003
676767 09/03/2013
I can't declare a datetime variable, since it can only hold one datetime,
and I can't do:
Select acctnumber,min,count(*)
from transaction_table
where transactiondate between (
select Created_date from Acct_createdTable where Acct = 909099)
and (
select Created_date from Acct_createdTable where Acct = 909099)+30
since then it'll only count the number of transactions for one account.
What I want for my output is:
Acct First_30_days_count
909099 23
878787 190
676767 23
I think what you're looking for is a basic GROUP BY query.
SELECT
    ac.Acct,
    COUNT(td.acctnumber) AS First_30_days_count
FROM Acct_createdTable ac
LEFT JOIN transaction_table td ON
    td.acctnumber = ac.Acct
    AND td.transactiondate BETWEEN ac.Created_date AND DATEADD(DAY, 30, ac.Created_date)
GROUP BY
    ac.Acct
This should return the number of transactions within the first 30 days for each account. You didn't state your database platform, so treat this as a sketch: DATEADD is SQL Server syntax, and the table and column names are taken from your question. The left join ensures that accounts with no transactions in that period are still displayed, with a count of 0.
An alternative solution would be to use outer apply like this:
select a.acct, o.First_30_days_count
from acct_createdtable a
outer apply (
select count(*) First_30_days_count
from transaction_table
where acctnumber = a.acct
and transactiondate between a.created_date and dateadd(day, 30, a.created_date)
) o;
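Here is a small, runnable sketch of the per-account 30-day window, using SQLite from Python in place of SQL Server (date(x, '+30 days') stands in for DATEADD). The transaction rows and the ISO reading of the creation dates are made up for illustration, and first_30_day_counts is a hypothetical helper name.

```python
import sqlite3

def first_30_day_counts(accounts, transactions):
    """Count each account's transactions in the 30 days after creation."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Acct_createdTable (Acct INTEGER, Created_date TEXT)")
    con.execute("CREATE TABLE transaction_table (acctnumber INTEGER, transactiondate TEXT)")
    con.executemany("INSERT INTO Acct_createdTable VALUES (?, ?)", accounts)
    con.executemany("INSERT INTO transaction_table VALUES (?, ?)", transactions)
    query = """
        SELECT a.Acct, COUNT(t.acctnumber) AS First_30_days_count
        FROM Acct_createdTable a
        LEFT JOIN transaction_table t
          ON t.acctnumber = a.Acct
         AND t.transactiondate BETWEEN a.Created_date
                                   AND date(a.Created_date, '+30 days')
        GROUP BY a.Acct
        ORDER BY a.Acct
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical data: each account has its own creation date.
accounts = [(909099, "2015-01-02"), (878787, "2003-02-03"), (676767, "2013-09-03")]
transactions = [
    (909099, "2015-01-02"),
    (909099, "2015-01-20"),
    (909099, "2015-03-01"),  # outside the first 30 days
    (878787, "2003-02-10"),
]
counts = first_30_day_counts(accounts, transactions)
```

Counting t.acctnumber rather than * is what lets an account with no matching transactions show up with 0 instead of 1.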

Join two Queries so that the second query becomes a row in the results of query 1

I have two queries that I would like to combine so I can make a chart out of the results.
The results have to be very specific or the chart will not display the information properly.
I am using MS SQL in Crystal Reports 11.
Below are the results I am looking for.
Date Invoice Type Amount
2012/08 Customer Payment 500
2012/08 Customer Invoice 1000
2012/08 Moving Balance 1500
2012/09 Customer Invoice 400
2012/09 Moving Balance 1900
2012/10 Interest 50
2012/10 Moving Balance 1950
So the First query returns the following results
Date Invoice Type Amount
2012/08 Customer Payment 500
2012/08 Customer Invoice 1000
2012/09 Customer Invoice 400
2012/10 Interest 50
and the second query returns
Date Invoice Type Amount
2012/08 Moving Balance 1500
2012/09 Moving Balance 1900
2012/10 Moving Balance 1950
The second query is very long and complicated, and includes a join.
What is the best way of joining these two queries
so that I have one column called Invoice Type (as the chart is based on this field)
that covers all the invoice types plus the moving balance?
I assume that the place of the Moving Balance rows inside the result set is important.
You can do something like this:
select date, invoice_type, amount
from
(
    select date, invoice_type, amount from query1
    union all
    select date, invoice_type, amount from query2
) as combined
order by date, case invoice_type when 'Moving Balance' then 1 else 0 end
This first appends the results of the second query to the results of the first query and then reorders the resulting list first by date and then by the invoice type in such a way that the row with Moving balance will come last.
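As a quick check of the ordering trick, the sketch below loads the two result sets from the question into SQLite tables named query1 and query2 (stand-ins for the real queries) and runs the UNION ALL from Python.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE query1 (date TEXT, invoice_type TEXT, amount INTEGER);
    CREATE TABLE query2 (date TEXT, invoice_type TEXT, amount INTEGER);
    INSERT INTO query1 VALUES
        ('2012/08', 'Customer Payment', 500),
        ('2012/08', 'Customer Invoice', 1000),
        ('2012/09', 'Customer Invoice', 400),
        ('2012/10', 'Interest', 50);
    INSERT INTO query2 VALUES
        ('2012/08', 'Moving Balance', 1500),
        ('2012/09', 'Moving Balance', 1900),
        ('2012/10', 'Moving Balance', 1950);
""")

# The CASE expression sorts 'Moving Balance' after every other type
# within the same date.
rows = con.execute("""
    SELECT date, invoice_type, amount
    FROM (
        SELECT date, invoice_type, amount FROM query1
        UNION ALL
        SELECT date, invoice_type, amount FROM query2
    ) AS combined
    ORDER BY date, CASE invoice_type WHEN 'Moving Balance' THEN 1 ELSE 0 END
""").fetchall()
con.close()
```

The relative order of the two non-balance rows for 2012/08 is not fixed by this ORDER BY; add a further sort key if the chart depends on it.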
With the actual queries you have given, it should look something like this:
select date, invoice_type, amount
from
(
SELECT
CONVERT(VARCHAR(7),case_createddate, 111) AS Date,
case_invoicetype as invoice_type,
Sum(case_totalexvat) as amount
FROM cases AS ca
WHERE case_primaryCompanyid = 2174 and
datediff(m,case_createddate,getDate())
union all
select
CONVERT(VARCHAR(7),ca.case_createddate, 111) AS Date,
'Moving Balance' as Invoice_Type,
sum(mb.Amount) as Amount
from
cases as ca
left join (
select
case_primaryCompanyId as ID,
case_createdDate,
case_TotalExVat as Amount
from
cases
) mb
on ca. case_primaryCompanyId = mb.ID
and ca.case_createdDate >= mb.case_CreatedDate
where
ca.case_primaryCompanyId = 2174 and
ca.case_createdDate > DATEADD(m, -12, current_timestamp)
group by
case_primaryCompanyId,
CONVERT(VARCHAR(7),ca.case_createddate, 111)
) as combined
order by date, case invoice_type when 'Moving Balance' then 1 else 0 end
You can combine the two queries with UNION and sort the result with an ORDER BY clause:
Select * from (Query 1
Union
Query 2
) as a Order by a.Date Asc

SQL Query: Calculating the deltas in a time series

For a development aid project I am helping a small town in Nicaragua improve their water network administration.
There are about 150 households, and every month a person checks the meter and charges the household according to the water consumed (the reading from this month minus the reading from last month). Today everything is done on paper, and I would like to digitize the administration to avoid calculation errors.
I have an MS Access table in mind, e.g.:
*HousholdID* *Date* *Meter*
0 1/1/2013 100
1 1/1/2013 130
0 1/2/2013 120
1 1/2/2013 140
...
From this data I would like to create a query that calculates the consumed water (the meter-difference of one household between two months)
*HouseholdID* *Date* *Consumption*
0 1/2/2013 20
1 1/2/2013 10
...
Please, how would I approach this problem?
This query returns every date with previous date, even if there are missing months:
SELECT TabPrev.*, Tab.Meter as PrevMeter, TabPrev.Meter-Tab.Meter as Diff
FROM (
SELECT
Tab.HousholdID,
Tab.Data,
Max(Tab_1.Data) AS PrevData,
Tab.Meter
FROM
Tab INNER JOIN Tab AS Tab_1 ON Tab.HousholdID = Tab_1.HousholdID
AND Tab.Data > Tab_1.Data
GROUP BY Tab.HousholdID, Tab.Data, Tab.Meter) As TabPrev
INNER JOIN Tab
ON TabPrev.HousholdID = Tab.HousholdID
AND TabPrev.PrevData=Tab.Data
Here's the result:
HousholdID  Data        PrevData    Meter  PrevMeter  Diff
----------------------------------------------------------
0           01/02/2013  01/01/2013  120    100        20
1           01/02/2013  01/01/2013  140    130        10
The query above will return every delta, for every household, for every month (or for every interval). If you are just interested in the last delta, you could use this query:
SELECT
MaxTab.*,
TabCurr.Meter as CurrMeter,
TabPrev.Meter as PrevMeter,
TabCurr.Meter-TabPrev.Meter as Diff
FROM ((
SELECT
Tab.HousholdID,
Max(Tab.Data) AS CurrData,
Max(Tab_1.Data) AS PrevData
FROM
Tab INNER JOIN Tab AS Tab_1
ON Tab.HousholdID = Tab_1.HousholdID
AND Tab.Data > Tab_1.Data
GROUP BY Tab.HousholdID) As MaxTab
INNER JOIN Tab TabPrev
ON TabPrev.HousholdID = MaxTab.HousholdID
AND TabPrev.Data=MaxTab.PrevData)
INNER JOIN Tab TabCurr
ON TabCurr.HousholdID = MaxTab.HousholdID
AND TabCurr.Data=MaxTab.CurrData
and (depending on what you are after) you could only filter current month:
WHERE
DateSerial(Year(CurrData), Month(CurrData), 1)=
DateSerial(Year(DATE()), Month(DATE()), 1)
This way, if you miss a check for a particular household, it won't show.
Or you might be interested in showing the last month present in the table (which can be different from the current month):
WHERE
DateSerial(Year(CurrData), Month(CurrData), 1)=
(SELECT MAX(DateSerial(Year(Data), Month(Data), 1))
FROM Tab)
(here I am taking in consideration the fact that checks might be on different days)
I think the best approach is to use a correlated subquery to get the previous date for the same household and join back to the original table. This ensures that you get the previous record, even if there is more or less than a 1 month lag.
So the right query looks like:
select t.*, tprev.date, tprev.meter
from (select t.*,
             (select top 1 date
              from t t2
              where t2.HousholdID = t.HousholdID and t2.date < t.date
              order by date desc
             ) as prevDate
      from t
     ) t join
     t tprev
     on tprev.HousholdID = t.HousholdID and tprev.date = t.prevDate
In an environment such as the one you describe, it is very important not to make assumptions about the frequency of reading the meter. Although they may be read on average once per month, there will always be exceptions.
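To see how the correlated-subquery approach copes with a skipped month, here is a small sketch on SQLite, driven from Python. The table name readings, the column name ReadingDate (chosen to avoid Date as a field name), the helper consumption_per_reading, and the extra April reading are all assumptions for illustration; MAX() replaces TOP 1 ... ORDER BY, which SQLite does not support.

```python
import sqlite3

def consumption_per_reading(rows):
    """Pair every reading with the same household's previous reading."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE readings (HousholdID INTEGER, ReadingDate TEXT, Meter INTEGER)")
    con.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    query = """
        SELECT r.HousholdID, r.ReadingDate, r.Meter - p.Meter AS Consumption
        FROM readings r
        JOIN readings p
          ON p.HousholdID = r.HousholdID
         AND p.ReadingDate = (
                 -- latest earlier reading of the same household
                 SELECT MAX(r2.ReadingDate)
                 FROM readings r2
                 WHERE r2.HousholdID = r.HousholdID
                   AND r2.ReadingDate < r.ReadingDate
             )
        ORDER BY r.HousholdID, r.ReadingDate
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Household 0 has no March reading; April is compared with February.
readings = [
    (0, "2013-01-01", 100),
    (1, "2013-01-01", 130),
    (0, "2013-02-01", 120),
    (1, "2013-02-01", 140),
    (0, "2013-04-01", 170),
]
deltas = consumption_per_reading(readings)
```

The first reading of each household has no predecessor, so the inner join leaves it out.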
Testing with the following data:
HousholdID Date Meter
0 01/12/2012 100
1 01/12/2012 130
0 01/01/2013 120
1 01/01/2013 140
0 01/02/2013 120
1 01/02/2013 140
The following query:
SELECT a.housholdid,
a.date,
b.date,
a.meter,
b.meter,
a.meter - b.meter AS Consumption
FROM (SELECT *
FROM water
WHERE Month([date]) = Month(Date())
AND Year([date])=year(Date())) a
LEFT JOIN (SELECT *
FROM water
WHERE DateSerial(Year([date]),Month([date]),Day([date]))
=DateSerial(Year(Date()),Month(Date())-1,Day([date])) ) b
ON a.housholdid = b.housholdid
The above query selects the records for this month (Month([date]) = Month(Date())) and compares them to the records for last month (Month([date]) = Month(Date()) - 1).
Please do not use Date as a field name.
It returns the following result:
housholdid a.date b.date a.meter b.meter Consumption
0 01/02/2013 01/01/2013 120 100 20
1 01/02/2013 01/01/2013 140 130 10
Try
select t.householdID
, max(s.theDate) as billingMonth
, max(s.meter)-max(t.meter) as waterUsed
from myTbl t join (
select householdID, max(theDate) as theDate, max(meter) as meter
from myTbl
group by householdID ) s
on t.householdID = s.householdID and t.theDate <> s.theDate
group by t.householdID
This works in SQL; I'm not sure about Access.
You can use the LAG() function in certain SQL dialects. I found this to be much faster and easier to read than joins.
Source: http://blog.jooq.org/2015/05/12/use-this-neat-window-function-trick-to-calculate-time-differences-in-a-time-series/
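A minimal sketch of the LAG() variant, run on SQLite from Python (SQLite 3.25 or later is assumed for window functions; MS Access itself has no LAG). The table name readings, the column name ReadingDate, and the helper consumption_with_lag are placeholders for illustration; the rows are the four sample readings from the question with the dates written in ISO form.

```python
import sqlite3

def consumption_with_lag(rows):
    """Compute each reading's delta to the household's previous reading."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE readings (HousholdID INTEGER, ReadingDate TEXT, Meter INTEGER)")
    con.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    query = """
        SELECT HousholdID, ReadingDate, Consumption
        FROM (
            SELECT HousholdID, ReadingDate,
                   Meter - LAG(Meter) OVER (
                       PARTITION BY HousholdID ORDER BY ReadingDate
                   ) AS Consumption
            FROM readings
        ) AS d
        WHERE Consumption IS NOT NULL   -- first reading has no predecessor
        ORDER BY HousholdID, ReadingDate
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

readings = [
    (0, "2013-01-01", 100),
    (1, "2013-01-01", 130),
    (0, "2013-02-01", 120),
    (1, "2013-02-01", 140),
]
monthly = consumption_with_lag(readings)
```

PARTITION BY keeps each household's readings separate, so LAG() never mixes meters from different houses.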