I have a problem writing a query.
The raw data is as follows:
DATE CUSTOMER_ID AMOUNT
20170101 1 150
20170201 1 50
20170203 1 200
20170204 1 250
20170101 2 300
20170201 2 70
I want to know when (on which date) the running sum of amount for each customer_id becomes more than 350.
How can I write a query that gives the following result?
CUSTOMER_ID MAX_DATE
1 20170203
2 20170201
Thanks,
Simply use ANSI/ISO standard window functions to calculate the running sum:
select t.*
from (select t.*,
sum(t.amount) over (partition by t.customer_id order by t.date) as running_amount
from t
) t
where running_amount - amount < 350 and
running_amount >= 350;
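To make the running-sum filter concrete, here is a minimal sketch that runs the same query shape against the question's sample rows. It assumes SQLite (3.25 or later, for window functions) through Python's sqlite3 module, which is not the database used in this thread; the table name t follows the answer above.

```python
import sqlite3

# In-memory table with the question's sample rows (SQLite is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date INTEGER, customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(20170101, 1, 150), (20170201, 1, 50), (20170203, 1, 200),
     (20170204, 1, 250), (20170101, 2, 300), (20170201, 2, 70)],
)

# Keep only the row on which the running sum first reaches the threshold:
# before this row the total was below 350, with this row it is not.
query = """
SELECT customer_id, date
FROM (SELECT t.*,
             SUM(amount) OVER (PARTITION BY customer_id ORDER BY date) AS running_amount
      FROM t) AS r
WHERE running_amount - amount < 350
  AND running_amount >= 350
ORDER BY customer_id
"""
crossing_rows = conn.execute(query).fetchall()
print(crossing_rows)  # [(1, 20170203), (2, 20170201)]
```

For customer 1 the running totals are 150, 200, 400 and 650, so only the 20170203 row passes both conditions.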
If for some reason, your database doesn't support this functionality, you can use a correlated subquery:
select t.*
from (select t.*,
(select sum(t2.amount)
from t t2
where t2.customer_id = t.customer_id and
t2.date <= t.date
) as running_amount
from t
) t
where running_amount - amount < 350 and
running_amount >= 350;
ANSI SQL
Used for the test: TSQL and MS SQL Server 2012
select
"CUSTOMER_ID",
min("DATE")
FROM
(
select
"CUSTOMER_ID",
"DATE",
(
SELECT
sum(T02."AMOUNT") AMOUNT
FROM "TABLE01" T02
WHERE
T01."CUSTOMER_ID" = T02."CUSTOMER_ID"
AND T02."DATE" <= T01."DATE"
) "AMOUNT"
from "TABLE01" T01
) T03
where
T03."AMOUNT" > 350
group by
"CUSTOMER_ID"
GO
CUSTOMER_ID | (No column name)
----------: | :------------------
1 | 03/02/2017 00:00:00
2 | 01/02/2017 00:00:00
SELECT
tmp.`CUSTOMER_ID`,
MIN(tmp.`DATE`) as MAX_DATE
FROM
(
SELECT
`DATE`,
`CUSTOMER_ID`,
`AMOUNT`,
(
SELECT SUM(`AMOUNT`) FROM tbl t2 WHERE t2.`DATE` <= t1.`DATE` AND `CUSTOMER_ID` = t1.`CUSTOMER_ID`
) AS SUM_UP
FROM
`tbl` t1
ORDER BY
`DATE` ASC
) tmp
WHERE
tmp.`SUM_UP` > 350
GROUP BY
tmp.`CUSTOMER_ID`
Explanation:
First, for each row, a correlated subquery sums AMOUNT over all rows of the same customer whose DATE is less than or equal to the current row's DATE. From this derived table I select the MIN date among the rows whose running sum is > 350.
I think this is not an easy calculation, so I want to work through it step by step. It may look a little convoluted, but once it works for your scenario I believe the performance can be improved. If anybody can improve my query, please edit my post.
Unfortunately I could not test the solution below on a computer, but I expect it to give you the result you want:
-- Get the start date of each customer
SELECT CUSTOMER_ID
,MIN([DATE]) AS [DATE]
INTO #table
FROM [TABLE]
GROUP BY CUSTOMER_ID
-- For every date of the customer, sum the amounts from the start date up to that date
SELECT t1.CUSTOMER_ID
,t2.[DATE] AS [DATE]
,(SELECT SUM(t3.AMOUNT)
FROM [TABLE] t3
WHERE t3.CUSTOMER_ID = t1.CUSTOMER_ID
AND t3.[DATE] BETWEEN t1.[DATE] AND t2.[DATE]) AS total
INTO #tableCalculated
FROM #table t1
INNER JOIN [TABLE] t2 ON t1.CUSTOMER_ID = t2.CUSTOMER_ID
-- Select the earliest date per CUSTOMER_ID on which the total exceeds 350
SELECT CUSTOMER_ID, MIN([DATE]) AS [DATE]
FROM #tableCalculated
WHERE total > 350
GROUP BY CUSTOMER_ID
SELECT CUSTOMER_ID, MIN(DATE) AS GOALDATE
FROM ( SELECT cd1.*, (SELECT SUM(AMOUNT)
FROM CustData cd2
WHERE cd2.CUSTOMER_ID = cd1.CUSTOMER_ID
AND cd2.DATE <= cd1.DATE) AS RUNNINGTOTAL
FROM CustData cd1) AS custdata2
WHERE RUNNINGTOTAL >= 350
GROUP BY CUSTOMER_ID
I have a transactions table for a single year, where a negative amount indicates a debit transaction and a positive amount a credit transaction.
In a given month, if the number of debit records is less than 3 or the sum of debits is less than 100, I want to charge a fee of 5.
I want to build an SQL query for this in PostgreSQL:
select sum(amount), count(1), date_part('month', date) as month from transactions where amount < 0 group by month;
I am able to get records at the month level, but I am stuck on how to proceed further and get the result.
You can start by generating the series of months with generate_series(). Then join that with an aggregate query on transactions, and finally implement the business logic in the outer query:
select sum(t.balance)
- 5 * count(*) filter(where coalesce(t.cnt, 0) < 3 or coalesce(t.debit, 0) < 100) as balance
from generate_series(date '2020-01-01', date '2020-12-01', '1 month') as d(dt)
left join (
select date_trunc('month', date) as dt, count(*) cnt, sum(amount) as balance,
sum(-amount) filter(where amount < 0) as debit
from transactions t
group by date_trunc('month', date)
) t on t.dt = d.dt
Demo on DB Fiddle:
| balance |
| ------: |
| 2746 |
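The pattern above (generate all twelve months, left join the monthly aggregates, then apply the fee rule) can be checked on a small data set. The sketch below is an assumption-laden illustration: it uses SQLite through Python's sqlite3 instead of PostgreSQL, replaces generate_series() with a recursive CTE, counts only debit rows for the "fewer than 3 debits" rule, and the sample transactions are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("2020-01-05", 1000),                                           # credit
     ("2020-01-10", -50), ("2020-01-15", -40), ("2020-01-20", -30),  # 3 debits, total 120: no fee
     ("2020-02-03", -200),                                           # 1 debit: fee
     ("2020-03-01", -10), ("2020-03-02", -20), ("2020-03-03", -30)], # 3 debits, total 60: fee
)

query = """
WITH RECURSIVE months(m) AS (
    SELECT 1 UNION ALL SELECT m + 1 FROM months WHERE m < 12
),
monthly AS (
    SELECT CAST(strftime('%m', date) AS INTEGER) AS m,
           SUM(amount) AS balance,
           SUM(CASE WHEN amount < 0 THEN 1 ELSE 0 END) AS debit_count,
           SUM(CASE WHEN amount < 0 THEN -amount ELSE 0 END) AS debit_total
    FROM transactions
    GROUP BY strftime('%m', date)
)
SELECT COALESCE(SUM(monthly.balance), 0)
       - 5 * SUM(CASE WHEN COALESCE(monthly.debit_count, 0) < 3
                        OR COALESCE(monthly.debit_total, 0) < 100
                      THEN 1 ELSE 0 END) AS balance
FROM months
LEFT JOIN monthly ON monthly.m = months.m
"""
balance = conn.execute(query).fetchone()[0]
print(balance)  # 565
```

The amounts sum to 620. January is the only month with at least 3 debits totalling at least 100, so the other 11 months (including the 9 months with no rows at all) are charged 5 each, giving 620 - 55 = 565.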
How about this approach?
SELECT
SUM(
CASE
WHEN -usage.amount_s >= 100
AND usage.event_c >= 3
THEN 0
ELSE 5
END
) AS YEAR_FEE
FROM (SELECT 1 AS month UNION
SELECT 2 UNION
SELECT 3 UNION
SELECT 4 UNION
SELECT 5 UNION
SELECT 6 UNION
SELECT 7 UNION
SELECT 8 UNION
SELECT 9 UNION
SELECT 10 UNION
SELECT 11 UNION
SELECT 12
) months
LEFT OUTER JOIN
(
SELECT
sum(amount) AS amount_s,
count(1) event_c,
date_part('month', date) AS month
FROM transactions
WHERE amount < 0
GROUP BY month
) usage ON months.month = usage.month;
First you must use a result set that returns all the months (1-12) and LEFT join it to your table.
Then aggregate to get the sum of each month's amount and, with conditional aggregation, subtract 5 for the months that meet your conditions.
Finally use the SUM() window function to sum the results of all months:
SELECT DISTINCT SUM(
COALESCE(SUM(t.Amount), 0) -
CASE
WHEN SUM((t.Amount < 0)::int) < 3
OR SUM(CASE WHEN t.Amount < 0 THEN -t.Amount ELSE 0 END) < 100 THEN 5
ELSE 0
END
) OVER () total
FROM generate_series(1, 12, 1) m(month) LEFT JOIN transactions t
ON m.month = date_part('month', t.date) AND date_part('year', t.date) = 2020
GROUP BY m.month
See the demo.
Results:
> | total |
> | ----: |
> | 2746 |
I think you can use the HAVING clause.
Select ( max(a.total) - (12- count(b.cnt ))*5 ) as result From
(Select sum(amount) as total , 'A' as name from transactions ) as a left join
(Select count(amount) as cnt , 'A' as name
From transactions
where amount <0
group by date_part('month', date)
having not(count(amount) <3 or sum(amount) >-100) ) as b
on a.name = b.name
select
sum(amount) - 5*(12-(
select count(*)
from (select date_part('month', date) as month, count(amount), sum(amount)
from transactions
where amount<0
group by date_part('month', date)
having count(amount)>=3 and sum(amount)<=-100) as m)) as balance
from transactions;
I have a table with 2 columns, date and sales. From this I need to pick the dates on which sales increased compared with the previous date. Below is a sample table:
Date Sales
-------------------
1/8/2020 10
1/9/2020 12
1/10/2020 8
1/11/2020 7
1/12/2020 13
Output should be as below:
Date
---------
1/9/2020
1/12/2020
Query:
Select date
from table
where sales > sales of previous day
You can use LAG to calculate this:
with cte
as (select date_c
, sales
, lag(sales) over (order by date_c) sales2
from Test)
select date_c, sales from cte
where sales > sales2;
Here is a DEMO
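As a quick check of the LAG approach, the following sketch runs it on the question's sample rows. It assumes SQLite (3.25 or later) through Python's sqlite3, and it stores the dates as ISO strings so that ORDER BY sorts them chronologically; both are assumptions of this example, not details from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (date_c TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO Test VALUES (?, ?)",
    [("2020-01-08", 10), ("2020-01-09", 12), ("2020-01-10", 8),
     ("2020-01-11", 7), ("2020-01-12", 13)],
)

# LAG(sales) is NULL on the first row, so that row never passes the comparison.
query = """
WITH cte AS (
    SELECT date_c, sales,
           LAG(sales) OVER (ORDER BY date_c) AS prev_sales
    FROM Test
)
SELECT date_c
FROM cte
WHERE sales > prev_sales
ORDER BY date_c
"""
increase_dates = [row[0] for row in conn.execute(query)]
print(increase_dates)  # ['2020-01-09', '2020-01-12']
```

Because LAG looks at the previous row in date order, this also works when some calendar days are missing from the table.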
If there are gaps between days, you can use the following logic with a subquery:
WITH CTE AS
(
SELECT Date,Sales,
(
SELECT Sales
FROM your_table
WHERE Date = (SELECT MAX(Date) FROM your_table WHERE Date < A.Date)
) Last_day_sales
FROM your_table A
)
SELECT Date,Sales
FROM CTE
WHERE Sales > Last_day_sales
You can use self join for this requirement.
select
A.date
from TableA as A
inner join TableA as B on B.date = (A.date - interval '1 day')
and A.sales > B.sales;
I have the following two columns.
Date | Market Value
------------------------------
2016-09-08 | 100
2016-09-07 | 130
2016-09-06 | 140
2016-09-05 | 180
I want to add a column that calculates the difference in Market Value between each date and the previous date.
Date | Market Value | Delta
------------------------------------------
2016-09-08 | 100 | -30
2016-09-07 | 130 | -10
2016-09-06 | 140 | -40
2016-09-05 | 180 |
100 (2016-09-08) minus 130 (2016-09-07) = -30
How do I write that function?
In SQL Server 2012+ the most efficient and simple way is to use the built-in LEAD function.
SELECT
[Date]
,[Market Value]
,[Market Value] - LEAD([Market Value]) OVER (ORDER BY [Date] DESC) AS Delta
FROM YourTable
;
LEAD returns the value of the next row as specified by its ORDER BY clause.
All other methods that self-join the table are less efficient.
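To see which way round the subtraction has to go, here is a small sketch on the question's four rows. It assumes SQLite (3.25 or later) via Python's sqlite3 rather than SQL Server, so identifiers are double-quoted instead of bracketed. With ORDER BY "Date" DESC, LEAD returns the value of the earlier date, and the question's expected Delta is the current value minus that earlier value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE YourTable ("Date" TEXT, "Market Value" INTEGER)')
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [("2016-09-08", 100), ("2016-09-07", 130),
     ("2016-09-06", 140), ("2016-09-05", 180)],
)

# Current value minus the value on the previous date; the earliest date has no
# previous row, so its Delta is NULL (None in Python).
query = """
SELECT "Date", "Market Value",
       "Market Value" - LEAD("Market Value") OVER (ORDER BY "Date" DESC) AS Delta
FROM YourTable
ORDER BY "Date" DESC
"""
deltas = conn.execute(query).fetchall()
for row in deltas:
    print(row)
```

The first row printed is ('2016-09-08', 100, -30), matching the 100 minus 130 example in the question.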
If your dates are continuous you can do
select t1.date, t1.market_value, t1.market_value-t2.market_value from data_table t1 left join data_table t2 on t1.date-1=t2.date
If your dates are not continuous (for example, to calculate the difference between Monday and the previous Friday) you can number the rows in date order with rownum, for example like this
select t1.date, t1.market_value, t1.market_value-t2.market_value from (select rownum rn, d.* from (select date, market_value from data_table order by date) d) t1 left join (select rownum rn, d.* from (select date, market_value from data_table order by date) d) t2 on t1.rn-1=t2.rn
CREATE PROCEDURE UPDATE_DELTA
@START_DATE DATETIME,
@END_DATE DATETIME
AS BEGIN
UPDATE T
SET DELTA = MARKET_VALUE - (SELECT MARKET_VALUE
FROM YOURTABLE
WHERE [DATE] = T.[DATE] - 1)
FROM YOURTABLE T
WHERE [DATE] BETWEEN @START_DATE AND @END_DATE
END
And then to execute:
EXEC UPDATE_DELTA '2016-09-05', '2016-09-08'
This works as long as you have sequenced dates.
For SQL-Server below 2012 you could try this:
with cte as
(SELECT
ROW_NUMBER() OVER (ORDER BY [Date] DESC) row,
[Date],
[Market Value]
FROM [YourTable])
SELECT
a.[Date] ,
a.[Market Value] - b.[Market Value] AS Delta
FROM
cte a
LEFT JOIN cte b
on b.row = a.row+1
The original post is from here: SQL difference between rows
For SQL-Server 2012 and above you can use the recommended LEAD-Function.
Add column and update in the following way:
UPDATE t SET t.Delta = t.Market_Value-t2.Market_Value
FROM yourtable t
INNER JOIN yourtable t2 ON DATEADD(DD,-1,t.Date) = t2.Date
I have a table with more than 5 million rows of sales transactions. I would like to find the sum of the date intervals between each customer's three most recent purchases.
Suppose my table looks like this:
CustomerID ProductID ServiceStartDate ServiceExpiryDate
A X1 2010-01-01 2010-06-01
A X2 2010-08-12 2010-12-30
B X4 2011-10-01 2012-01-15
B X3 2012-04-01 2012-06-01
B X7 2012-08-01 2013-10-01
A X5 2013-01-01 2015-06-01
The result that I'm looking for may look like this:
CustomerID IntervalDays
A 802
B 135
I know the query needs to first retrieve the 3 most recent transactions of each customer (based on ServiceStartDate) and then calculate the intervals between the ServiceExpiryDate of one transaction and the ServiceStartDate of the next.
You want to calculate the difference between the previous row's ServiceExpiryDate and the current row's ServiceStartDate based on descending dates and then sum up the last two differences:
with cte as
(
select tab.*,
row_number()
over (partition by customerId
order by ServiceStartDate desc
, ServiceExpiryDate desc -- don't know if this 2nd column is necessary
) as rn
from tab
)
select t2.customerId,
sum(datediff(day, t1.ServiceExpiryDate, t2.ServiceStartDate)) as Intervaldays
,count(*) as purchases
from cte as t2 left join cte as t1
on t1.customerId = t2.customerId
and t1.rn = t2.rn+1 -- previous and current row
where t2.rn <= 3 -- last three rows
group by t2.customerId;
Same result using LEAD:
with cte as
(
select tab.*,
row_number()
over (partition by customerId
order by ServiceStartDate desc) as rn
,lead(ServiceExpiryDate)
over (partition by customerId
order by ServiceStartDate desc
) as prevEnd
from tab
)
select customerId,
sum(datediff(day, prevEnd, ServiceStartDate)) as Intervaldays
,count(*) as purchases
from cte
where rn <= 3
group by customerId;
Neither query returns the expected result unless you subtract purchases (or max(rn)) from Intervaldays. But since only two differences are summed, that adjustment does not seem correct to me either...
Additional logic must be applied based on your rules regarding:
customers with fewer than 3 purchases
overlapping intervals
Assuming there are no overlaps, I think you want this:
select customerId,
sum(datediff(day, ServiceStartDate, ServiceExpiryDate)) as Intervaldays
from (select t.*, row_number() over (partition by customerId
order by ServiceStartDate desc) as seqnum
from table t
) t
where seqnum <= 3
group by customerId;
Try this:
SELECT dt.CustomerID,
SUM(DATEDIFF(DAY, dt.PrevExpiry, dt.ServiceStartDate)) As IntervalDays
FROM (
SELECT *
, ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY ServiceStartDate DESC) AS rn
, (SELECT Max(ti.ServiceExpiryDate)
FROM yourTable ti
WHERE t.CustomerID = ti.CustomerID
AND ti.ServiceStartDate < t.ServiceStartDate) As PrevExpiry
FROM yourTable t )dt
WHERE dt.rn <= 3
GROUP BY dt.CustomerID
Result will be:
CustomerId | IntervalDays
-----------+--------------
A | 805
B | 138
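The 805 and 138 figures can be reproduced on the question's sample rows with a short sketch. It assumes SQLite (3.25 or later) through Python's sqlite3, so julianday() stands in for DATEDIFF, and it restricts each customer to the three most recent purchases before looking up the previous expiry date; the CTE names are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (CustomerID TEXT, ProductID TEXT, "
    "ServiceStartDate TEXT, ServiceExpiryDate TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [("A", "X1", "2010-01-01", "2010-06-01"),
     ("A", "X2", "2010-08-12", "2010-12-30"),
     ("B", "X4", "2011-10-01", "2012-01-15"),
     ("B", "X3", "2012-04-01", "2012-06-01"),
     ("B", "X7", "2012-08-01", "2013-10-01"),
     ("A", "X5", "2013-01-01", "2015-06-01")],
)

# Step 1: rank purchases per customer, newest first.
# Step 2: among the 3 newest, measure the gap from the previous expiry to this start.
# Step 3: sum the gaps; the oldest of the three has no previous row, so its gap is NULL.
query = """
WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY CustomerID
                              ORDER BY ServiceStartDate DESC) AS rn
    FROM yourTable
),
gaps AS (
    SELECT CustomerID,
           julianday(ServiceStartDate)
           - julianday(LAG(ServiceExpiryDate) OVER (PARTITION BY CustomerID
                                                    ORDER BY ServiceStartDate)) AS gap_days
    FROM ranked
    WHERE rn <= 3
)
SELECT CustomerID, CAST(SUM(gap_days) AS INTEGER) AS IntervalDays
FROM gaps
GROUP BY CustomerID
ORDER BY CustomerID
"""
interval_days = conn.execute(query).fetchall()
print(interval_days)  # [('A', 805), ('B', 138)]
```

For customer A the two gaps are 72 days (2010-06-01 to 2010-08-12) and 733 days (2010-12-30 to 2013-01-01), which is where 805 comes from.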
I would like to know how to make intersections or concatenations of adjacent date ranges in sql.
I have a list of customer start and end dates, for example (in dd/mm/yyyy format, where 31/12/9999 means the customer is still a current customer).
CustID | StartDate | Enddate |
1 | 01/08/2011|19/06/2012|
1 | 20/06/2012|07/03/2012|
1 | 03/05/2012|31/12/9999|
2 | 09/03/2009|16/08/2009|
2 | 16/01/2010|10/10/2010|
2 | 11/10/2010|31/12/9999|
3 | 01/08/2010|19/08/2010|
3 | 20/08/2010|26/12/2011|
Although the dates in different rows don't overlap, I would consider some of the ranges as one contiguous period of time, e.g. when the start date comes one day after an end date (for a given customer). Hence I would like a query that returns just the combined date ranges:
CustID | StartDate | Enddate |
1 | 01/08/2011|07/03/2012|
1 | 03/05/2012|31/12/9999|
2 | 09/03/2009|16/08/2009|
2 | 16/01/2010|31/12/9999|
3 | 01/08/2010|26/12/2011|
I've looked at CTE tables, but I can't figure out how to return just one row for one contiguous block of dates.
This should work in 2005 forward:
;WITH cte2 AS (SELECT 0 AS Number
UNION ALL
SELECT Number + 1
FROM cte2
WHERE Number < 10000)
SELECT CustID, Min(GroupStart) StartDate, MAX(EndDate) EndDate
FROM (SELECT *
, DATEADD(DAY,b.number,a.StartDate) GroupStart
, DATEADD(DAY,1- DENSE_RANK() OVER (PARTITION BY CustID ORDER BY DATEADD(DAY,b.number,a.StartDate)),DATEADD(DAY,b.number,a.StartDate)) GroupDate
FROM Table1 a
JOIN cte2 b
ON b.number <= DATEDIFF(d, startdate, EndDate)
) X
GROUP BY CustID, GroupDate
ORDER BY CustID, StartDate
OPTION (MAXRECURSION 0)
Demo: SQL Fiddle
You can build a permanent table of numbers, from 0 to something large enough to cover the spread of dates in your ranges, and use it in place of the CTE so the numbers are not regenerated on every run; indexed properly, it will run quickly.
You can do this with a recursive common table expression:
with cte as (
select t.CustID, t.StartDate, t.EndDate, t2.StartDate as NextStartDate
from Table1 as t
left outer join Table1 as t2 on t2.CustID = t.CustID and t2.StartDate = case when t.EndDate < '99991231' then dateadd(dd, 1, t.EndDate) end
), cte2 as (
select c.CustID, c.StartDate, c.EndDate, c.NextStartDate
from cte as c
where c.NextStartDate is null
union all
select c.CustID, c.StartDate, c2.EndDate, c2.NextStartDate
from cte2 as c2
inner join cte as c on c.CustID = c2.CustID and c.NextStartDate = c2.StartDate
)
select CustID, min(StartDate) as StartDate, EndDate
from cte2
group by CustID, EndDate
order by CustID, StartDate
option (maxrecursion 0);
sql fiddle demo
Quick performance tests:
Results on 750 rows, small periods of 2 days length:
My query: 300 ms
Goat CO query with CTE: 10804 ms
Goat CO query with table of fixed numbers: 7 ms
Results on 5 rows, large periods:
My query: 1 ms
Goat CO query with CTE: 700 ms
Goat CO query with table of fixed numbers: 36 ms
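For comparison, where window functions such as LAG are available (SQL Server 2012 and later, not 2005), adjacent ranges can also be merged without expanding days or recursing. This is a different technique from the two answers above, shown only as a sketch: it assumes SQLite (3.25 or later) through Python's sqlite3, ISO date strings, and only customers 2 and 3 from the question, because customer 1's sample rows contain a start date later than its end date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (CustID INTEGER, StartDate TEXT, EndDate TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [(2, "2009-03-09", "2009-08-16"),
     (2, "2010-01-16", "2010-10-10"),
     (2, "2010-10-11", "9999-12-31"),
     (3, "2010-08-01", "2010-08-19"),
     (3, "2010-08-20", "2011-12-26")],
)

# flagged: a row starts a new block unless it begins the day after the previous row ended.
# grouped: a running sum of those flags numbers the blocks within each customer.
query = """
WITH flagged AS (
    SELECT CustID, StartDate, EndDate,
           CASE WHEN date(StartDate, '-1 day')
                     = LAG(EndDate) OVER (PARTITION BY CustID ORDER BY StartDate)
                THEN 0 ELSE 1 END AS new_block
    FROM Table1
),
grouped AS (
    SELECT *,
           SUM(new_block) OVER (PARTITION BY CustID ORDER BY StartDate) AS block_no
    FROM flagged
)
SELECT CustID, MIN(StartDate) AS BlockStart, MAX(EndDate) AS BlockEnd
FROM grouped
GROUP BY CustID, block_no
ORDER BY CustID, BlockStart
"""
merged_ranges = conn.execute(query).fetchall()
for row in merged_ranges:
    print(row)
```

On these rows the result is three blocks: 2009-03-09 to 2009-08-16 and 2010-01-16 to 9999-12-31 for customer 2, and 2010-08-01 to 2011-12-26 for customer 3, which matches the question's expected output for those customers.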