Identify two rows with 1 year or more of difference - sql

I have a table called finance in which I store all customer payments. The main columns are: ID, COSTUMERID, DATEPAID, AMOUNTPAID.
What I need is a list of dates per COSTUMERID containing the first payment and any other payment that is more than 1 year after the last one. Example:
+----+------------+------------+------------+
| ID | COSTUMERID | DATEPAID | AMOUNTPAID |
+----+------------+------------+------------+
| 1 | 1 | 2015-01-10 | 10 |
| 2 | 1 | 2016-01-05 | 30 |
| 2 | 1 | 2017-02-20 | 30 |
| 3 | 2 | 2016-03-15 | 100 |
| 4 | 2 | 2017-02-15 | 100 |
| 5 | 3 | 2017-05-01 | 25 |
+----+------------+------------+------------+
What I expect as result:
+------------+------------+
| COSTUMERID | DATEPAID |
+------------+------------+
| 1 | 2015-01-10 |
| 1 | 2017-02-20 |
| 2 | 2016-03-15 |
| 3 | 2017-05-01 |
+------------+------------+
Customer 1 has 2 dates: the first one, plus one more that is more than 1 year after the previous one.
I hope I have made myself clear.

I think you just want lag():
select t.*
from (select t.*,
lag(datepaid) over (partition by customerid order by datepaid) as prev_datepaid
from t
) t
where prev_datepaid is null or
datepaid > dateadd(year, 1, prev_datepaid);

Gordon's solution is correct as long as you only compare each payment with the previous row (the previous payment). But I wonder whether Antonio wants payments more than one year after the last payment that itself qualified, in which case the problem becomes more complex. Take the following example:
CREATE TABLE #Test (
CustomerID smallint
,DatePaid date
,AmountPaid smallint )
INSERT INTO #Test
SELECT 1, '2015-1-10', 10
INSERT INTO #Test
SELECT 1, '2016-1-05', 30
INSERT INTO #Test
SELECT 1, '2017-2-20', 30
INSERT INTO #Test
SELECT 1, '2017-6-30', 50
INSERT INTO #Test
SELECT 1, '2018-3-5', 50
INSERT INTO #Test
SELECT 1, '2018-5-15', 50
INSERT INTO #Test
SELECT 2, '2016-3-15', 100
INSERT INTO #Test
SELECT 2, '2017-6-15', 100
-- A CTE that is not the first statement in the batch must be preceded by a semicolon
;WITH CTE AS (
SELECT
CustomerID
,DatePaid
,LAG(DatePaid) OVER (PARTITION BY CustomerID ORDER BY DatePaid) AS PreviousPaidDate
,AmountPaid
FROM #Test )
SELECT
*
,-DATEDIFF(DAY, DatePaid, PreviousPaidDate) AS DayDiff
,CASE WHEN DATEDIFF(DAY, PreviousPaidDate, DatePaid) >= 365 THEN 1 ELSE 0 END AS Paid
FROM CTE
Row 5 (2018-03-05) is more than 1 year after the last qualifying payment (2017-02-20), but comparing each row only with the previous row (2017-06-30) does not detect this. This may or may not matter, but I wanted to point it out in case that is what he means.
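To make the difference between the two readings concrete, here is a minimal sketch in Python rather than SQL. The function names and the `add_one_year` helper are my own, and I am assuming "more than 1 year" means strictly after the same calendar day one year later. The first function mimics the LAG comparison; the second compares against the last payment that was kept.

```python
from datetime import date

def add_one_year(d):
    """Same calendar day one year later (Feb 29 maps to Feb 28). Assumed rule."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)

def payments_vs_previous_row(dates):
    """LAG-style rule: keep the first date and any date more than a year after the row before it."""
    dates = sorted(dates)
    kept = []
    for i, d in enumerate(dates):
        if i == 0 or d > add_one_year(dates[i - 1]):
            kept.append(d)
    return kept

def payments_vs_last_kept(dates):
    """Anchor rule: keep the first date and any date more than a year after the last kept date."""
    kept = []
    for d in sorted(dates):
        if not kept or d > add_one_year(kept[-1]):
            kept.append(d)
    return kept

# Customer 1 from the #Test table above
customer_1 = [date(2015, 1, 10), date(2016, 1, 5), date(2017, 2, 20),
              date(2017, 6, 30), date(2018, 3, 5), date(2018, 5, 15)]

print(payments_vs_previous_row(customer_1))  # 2015-01-10, 2017-02-20
print(payments_vs_last_kept(customer_1))     # 2015-01-10, 2017-02-20, 2018-03-05
```

The anchor rule depends on which rows were kept earlier, which is why it does not reduce to a single LAG over the raw rows.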

recursive moving average with sql

Suppose we have the following table (created by the script at the end of this question),
and what I need is:
First iteration: calculate the moving average over the last 5 days, including the last day: (2+1+2+3+4)/5 = 2.4. "Save" this result; it is the prediction for the next day.
Second iteration: calculate the moving average over the last 5 days again, where the last day's basal_sell is the value calculated in the previous iteration: (1+2+3+4+2.4)/5 = 2.48.
..
And so on. The recursion stops at a given future day, for example 2022-12-09.
Desired output for the future day 2022-12-09:
| date_ | art_id | basal_sell |
| ------------| -----------|------------|
| 2022-12-01 | 1 | 2 |
| 2022-12-02 | 1 | 1 |
| 2022-12-03 | 1 | 2 |
| 2022-12-04 | 1 | 3 |
| 2022-12-05 | 1 | 4 |
| 2022-12-06 | 1 | 2.4 |
| 2022-12-07 | 1 | 2.48 |
| 2022-12-08 | 1 | 2.776 |
| 2022-12-09 | 1 | 2.9312 |
This is a partial problem; the real problem has many art_ids, but I think the idea for this partial problem will carry over to the full problem (with a few small changes).
What I have tried:
I thought of a recursive CTE whose recursive part unions the temporary table with the newly calculated row.
Something like:
with MiCte as (
select *
from sells
union all
(
select * from MiCte
)
union
(
select dateadd(day, 1, date_), art_id, basal_sell
from(
select top 1 c.date_, c.art_id,
AVG(c.basal_sell) OVER (partition by c.art_id
ORDER BY c.date_
rows BETWEEN 4 PRECEDING AND current row) basal_sell
from MiCte c
order by c.date_ desc
) as tmp
)
) select * from MiCte
Obviously if I contemplate having more than one art_id I have to take this into account when making top 1 (which I still couldn't think of how to solve).
the example table:
CREATE TABLE sells
(date_ DATETIME,
art_id int,
basal_sell int)
;
INSERT INTO sells
(date_, art_id , basal_sell)
VALUES ('2022-12-1', 1, 2),
('2022-12-2', 1, 1),
('2022-12-3', 1, 2),
('2022-12-4', 1, 3),
('2022-12-5', 1, 4);
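A minimal sketch of the iteration described in this question, written in Python rather than SQL so the arithmetic is easy to check. The function name and parameters are hypothetical; it assumes a single art_id and a history with one row per consecutive day. For several art_ids the same loop would be run once per art_id.

```python
from datetime import date, timedelta

def forecast_moving_average(history, until, window=5):
    """Extend `history` (list of (date, value), sorted by date) one day at a time up to `until`.

    Each new value is the mean of the last `window` values, so earlier
    forecasts feed into later ones.
    """
    rows = list(history)
    while rows[-1][0] < until:
        last_values = [value for _, value in rows[-window:]]
        next_day = rows[-1][0] + timedelta(days=1)
        rows.append((next_day, sum(last_values) / len(last_values)))
    return rows

# The five rows inserted into the sells table above
sells = [(date(2022, 12, d), v) for d, v in [(1, 2), (2, 1), (3, 2), (4, 3), (5, 4)]]

for day, value in forecast_moving_average(sells, date(2022, 12, 9)):
    print(day, round(value, 4))
```

Note that the forecasts are fractional, so a basal_sell column declared as int would truncate them; a decimal or float column is probably what is wanted for the output.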

How do I summarize sales data in SQL by month for last 24months?

I have a large number of rows with sales for different products on various days.
I want to retrieve the sum for each product per month, for the last 24 months.
How do I write a WHERE clause that covers the last 24 months (based on the latest date in the table, not the current date)?
How can the result be summarized and shown by month instead of by individual days like 2018-01-24?
**Sample Data Table**
| SalesDate | Product | SLSqty |
| 2018-01-24 | Product A | 25 |
| 2019-06-10 | Product B | 10 |
| 2019-10-07 | Product C | 4 |
| 2020-03-05 | Product A | 20 |
| 2021-09-01 | Product A | 50 |
| 2021-09-01 | Product B | 10 |
| 2021-09-02 | Product C | 3 |
| 2021-09-04 | Product A | 50 |
| 2021-09-07 | Product B | 10 |
**Expected Result**
| SalesMONTH | Product | SLSqty |
| 2019-10-31 | Product C | 4 |
| 2020-03-31 | Product A | 20 |
| 2021-09-30 | Product A | 100|
| 2021-09-30 | Product B | 20 |
| 2021-09-30 | Product C | 3 |
I would declare a variable that stores the latest date in your table. Then you can use that variable in your WHERE clause.
IF OBJECT_ID('TEMPDB..#TEMP') IS NOT NULL
DROP TABLE #TEMP
CREATE TABLE #TEMP(
[SalesDate] DATE
,[product] NVARCHAR(20)
,[SLSqty] INT
)
INSERT INTO #TEMP([SalesDate],[product],[SLSqty])
VALUES('2018-01-24','Product A',25)
,('2019-06-10','Product B',10)
,('2019-10-07','Product C',4 )
,('2020-03-05','Product A',20)
,('2021-09-01','Product A',50)
,('2021-09-01','Product B',10)
,('2021-09-02','Product C',3 )
,('2021-09-04','Product A',50)
,('2021-09-07','Product B',10)
DECLARE @DATEVAR AS DATE = (SELECT MAX(#TEMP.SalesDate) FROM #TEMP)
The last line declares the variable. If you select @DATEVAR, you get a single date: the latest SalesDate in the table (2021-09-07 for the sample data).
Then you use it in a WHERE clause. Since you want the 24 months prior to the latest date, I would use the DATEDIFF(MONTH,,) function in your WHERE clause. It returns the number of month boundaries crossed between the two dates as an integer, and you simply constrain it to be 24 or less.
SELECT #TEMP.SalesDate
,#TEMP.product
,#TEMP.SLSqty
,DATEDIFF(MONTH,#TEMP.SalesDate,@DATEVAR) [# of months Diff]
FROM #TEMP
WHERE DATEDIFF(MONTH,#TEMP.SalesDate,@DATEVAR) <= 24
Now you have to aggregate the sales grouped by year-month and product.
I compute the year-month as an integer like 202109 (Sept. 2021).
SELECT --#TEMP.SalesDate --(YOU HAVE TO TAKE THIS OUT FOR THE GROUP BY)
YEAR(#TEMP.SalesDate)*100+MONTH(#TEMP.SalesDate) [year-month for GROUP BY]
,#TEMP.product
,SUM(#TEMP.SLSqty) SLSqty
-- ,DATEDIFF(MONTH,#TEMP.SalesDate,@DATEVAR) [# of months Diff] --(YOU HAVE TO TAKE THIS OUT FOR THE GROUP BY)
FROM #TEMP
WHERE DATEDIFF(MONTH,#TEMP.SalesDate,@DATEVAR) <= 24
GROUP BY YEAR(#TEMP.SalesDate)*100+MONTH(#TEMP.SalesDate)
,#TEMP.product
Here is an Oracle SQL version (using DATE literals so it does not depend on the session's NLS_DATE_FORMAT):
With data (SalesDate, Product, SLSqty) as (
Select date '2018-01-24','Product A',25 from dual union all
Select date '2019-06-10','Product B',10 from dual union all
Select date '2019-10-07','Product C',4 from dual union all
Select date '2020-03-05','Product A',20 from dual union all
Select date '2021-09-01','Product A',50 from dual union all
Select date '2021-09-01','Product B',10 from dual union all
Select date '2021-09-02','Product C',3 from dual union all
Select date '2021-09-04','Product A',50 from dual union all
Select date '2021-09-07','Product B',10 from dual),
theLatest(SalesDate) as(
select max(SalesDate) from data
)
select to_char(d.SalesDate,'YYYY-MM'),d.Product, sum(SLSqty)
from data d
Join theLatest on d.SalesDate >= add_months(theLatest.SalesDate,-24)
group by to_char(d.SalesDate,'YYYY-MM'),d.Product
order by to_char(d.SalesDate,'YYYY-MM')
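For comparison, the same steps (find the latest date in the data, keep the last 24 months, sum by month and product) can be sketched outside the database. This is only an illustration in Python with the sample rows from the question; `month_index` and `monthly_totals` are hypothetical names, and the filter counts month boundaries the way DATEDIFF(MONTH, ...) does rather than comparing exact days as add_months does.

```python
from collections import defaultdict
from datetime import date

sales = [
    (date(2018, 1, 24), "Product A", 25),
    (date(2019, 6, 10), "Product B", 10),
    (date(2019, 10, 7), "Product C", 4),
    (date(2020, 3, 5), "Product A", 20),
    (date(2021, 9, 1), "Product A", 50),
    (date(2021, 9, 1), "Product B", 10),
    (date(2021, 9, 2), "Product C", 3),
    (date(2021, 9, 4), "Product A", 50),
    (date(2021, 9, 7), "Product B", 10),
]

def month_index(d):
    """Number of months since year 0, so differences count month boundaries."""
    return d.year * 12 + d.month

def monthly_totals(rows, months_back=24):
    """Sum quantity per (year-month, product) for rows within `months_back` months of the latest date."""
    latest = max(d for d, _, _ in rows)
    totals = defaultdict(int)
    for d, product, qty in rows:
        if month_index(latest) - month_index(d) <= months_back:
            totals[(d.strftime("%Y-%m"), product)] += qty
    return dict(sorted(totals.items()))

for (month, product), qty in monthly_totals(sales).items():
    print(month, product, qty)
```

For the sample data this keeps October 2019 onward and drops the 2018-01 and 2019-06 rows.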

SQL Query to apply a command to multiple rows

I am new to SQL and am stuck trying to write a statement similar to a 'for loop' in other languages. I want to keep only the rows of the table where, for all rows sharing attribute1, attribute2 = attribute3, without using functions.
For example:
| Year | Month | Day|
| 1 | 1 | 1 |
| 1 | 2 | 2 |
| 1 | 4 | 4 |
| 2 | 3 | 4 |
| 2 | 3 | 3 |
| 2 | 4 | 4 |
| 3 | 4 | 4 |
| 3 | 4 | 4 |
| 3 | 4 | 4 |
I would only want the row
| Year | Month | Day|
|:---- |:------:| -----:|
| 3 | 4 | 4 |
because it is the only one where month and day are equal for all of the rows that share its year.
So far I have
select year, month, day from dates
where month=day
but I am unsure how to apply the constraint across all rows of a year.
-- month/day need to appear in aggregate functions (since they are not in the GROUP BY clause),
-- but the HAVING clause ensure we only have 1 month/day value (per year) here, so MIN/AVG/SUM/... would all work too
SELECT year, MAX(month), MAX(day)
FROM my_table
GROUP BY year
HAVING COUNT(DISTINCT (month, day)) = 1;
| year | max | max |
| 3 | 4 | 4 |
So one way would be
select distinct [year], [month], [day]
from [Table] t
where [month]=[day]
and not exists (
select * from [Table] x
where t.[year]=x.[year] and (t.[month] <> x.[month] or t.[day] <> x.[day])
)
And another way would be
select distinct [year], [month], [day] from (
select *,
Lead([month],1) over(partition by [year] order by [month])m2,
Lead([day],1) over(partition by [year] order by [day])d2
from [table]
)x
where [month]=m2 and [day]=d2
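As a cross-check of what the queries above are meant to return, here is a small Python sketch of the rule as I read it from the expected output: a year qualifies only if all of its rows share one (month, day) pair and that pair has month equal to day. The `uniform_years` name is hypothetical, and the explicit month == day check is my assumption; the GROUP BY / HAVING query above only checks for a single distinct pair.

```python
from collections import defaultdict

# (year, month, day) rows from the question
rows = [(1, 1, 1), (1, 2, 2), (1, 4, 4),
        (2, 3, 4), (2, 3, 3), (2, 4, 4),
        (3, 4, 4), (3, 4, 4), (3, 4, 4)]

def uniform_years(rows):
    """Return (year, month, day) for years whose rows all share one pair with month == day."""
    pairs_by_year = defaultdict(set)
    for year, month, day in rows:
        pairs_by_year[year].add((month, day))
    result = []
    for year, pairs in sorted(pairs_by_year.items()):
        if len(pairs) == 1:
            month, day = next(iter(pairs))
            if month == day:
                result.append((year, month, day))
    return result

print(uniform_years(rows))  # [(3, 4, 4)]
```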

Get value for the first date in month

I have weekly stock data for each product. I want to group it by year-month and get the first value of each month. In other words, I want the opening stock of each month, regardless of the day of the month.
+------------+---------+
| MyDate | MyValue |
+------------+---------+
| 2018-01-06 | 2 |*
| 2018-01-13 | 7 |
| 2018-01-20 | 5 |
| 2018-01-27 | 2 |
| 2018-02-03 | 3 |*
| 2018-02-10 | 10 |
| 2018-02-17 | 6 |
| 2018-02-24 | 4 |
| 2018-03-03 | 7 |*
| 2018-03-10 | 5 |
| 2018-03-17 | 3 |
| 2018-03-24 | 4 |
| 2018-03-31 | 6 |
+------------+---------+
Desired results:
+----------------+---------+
| FirstDayOfMonth| MyValue |
+----------------+---------+
| 2018-01-01 | 2 |
| 2018-02-01 | 3 |
| 2018-03-01 | 7 |
+----------------+---------+
I thought this might work, but it doesn't.
select
[product],
datefromparts(year([MyDate]), month([MyDate]), 1),
FIRST_VALUE(MyValue) OVER (PARTITION BY [Product], YEAR([MyDate]), MONTH([MyDate]) ORDER BY [MyDate] ASC) AS MyValue
from
MyTable
group by
[Product],
YEAR([MyDate]), MONTH([MyDate])
Edit. Thank you. The focus of my question is not how to get the first day of the month; I know there are different techniques for that.
The focus is how to get the FIRST value in the month (the opening stock). If there is a way to get the closing stock in the same query, that would be great. The answers based on ROW_NUMBER do not return the closing stock in the same query; they would require two joins.
Edit after accepting answer
Please consider John Cappelletti's answer as an alternative to the accepted one: https://stackoverflow.com/a/53559750/1903793
You don't really need the GROUP BY if you have chosen the window function route:
SELECT Product, DATEADD(DAY, 1, EOMONTH(MyDate, -1)) AS Month, MyValue
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Product, DATEADD(DAY, 1, EOMONTH(MyDate, -1)) ORDER BY MyDate) AS rn
FROM t
) AS x
WHERE rn = 1
UPDATE
To get the last row for the month just do a UNION ALL <above query> but change the order by clause to ORDER BY MyDate DESC. This will give you two rows per product-month.
You can use apply & eomonth to find the last day of the previous month & add one day:
select distinct dateadd(day, 1, eomonth(t1.mydate, -1)) as FirstDayOfMonth, t1.myvalue
from MyTable t cross apply
( select top (1) t1.mydate, t1.myvalue
from MyTable t1
where t1.product = t.product and
year(t1.MyDate) = year(t.MyDate) and month(t1.MyDate) = month(t.MyDate)
order by t1.mydate
) t1;
Could also use a rowNumber and cte.
WITH CTE as (
SELECT '2018-01-06' myDate, 2 Myvalue UNION ALL
SELECT '2018-01-13', 7 UNION ALL
SELECT '2018-01-20', 5 UNION ALL
SELECT '2018-01-27', 2 UNION ALL
SELECT '2018-02-03', 3 UNION ALL
SELECT '2018-02-10', 10 UNION ALL
SELECT '2018-02-17', 6 UNION ALL
SELECT '2018-02-24', 4 UNION ALL
SELECT '2018-03-03', 7 UNION ALL
SELECT '2018-03-10', 5 UNION ALL
SELECT '2018-03-17', 3 UNION ALL
SELECT '2018-03-24', 4 UNION ALL
SELECT '2018-03-31', 6),
CTE2 as (SELECT *
, Row_Number() over (partition by DATEADD(month, DATEDIFF(month, 0, MyDate), 0) order by myDate) RN
FROM CTE)
SELECT DATEADD(month, DATEDIFF(month, 0, MyDate), 0), MyValue
FROM cte2
WHERE RN = 1
Giving us:
+----+---------------------+---------+
| | (No column name) | MyValue |
+----+---------------------+---------+
| 1 | 01.01.2018 00:00:00 | 2 |
| 2 | 01.02.2018 00:00:00 | 3 |
| 3 | 01.03.2018 00:00:00 | 7 |
+----+---------------------+---------+
Just another option is using the WITH TIES, and then a little cheat for the date
Example
Select top 1 with ties
MyDate = convert(varchar(7),MyDate,120)+'-01'
,MyValue
from YourTable
Order By Row_Number() over (Partition By convert(varchar(7),MyDate,120) Order By MyDate)
Returns
MyDate MyValue
2018-01-01 2
2018-02-01 3
2018-03-01 7
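Since the question also asks about getting the closing stock in the same pass, here is a minimal Python sketch of that idea: sort by date, then remember the first and the last value seen for each month. It uses the sample rows from the question and assumes a single product (the sample has no product column); `opening_and_closing` is a hypothetical name, and this illustrates the logic rather than replacing the SQL above.

```python
from datetime import date

stock = [
    (date(2018, 1, 6), 2), (date(2018, 1, 13), 7), (date(2018, 1, 20), 5),
    (date(2018, 1, 27), 2), (date(2018, 2, 3), 3), (date(2018, 2, 10), 10),
    (date(2018, 2, 17), 6), (date(2018, 2, 24), 4), (date(2018, 3, 3), 7),
    (date(2018, 3, 10), 5), (date(2018, 3, 17), 3), (date(2018, 3, 24), 4),
    (date(2018, 3, 31), 6),
]

def opening_and_closing(rows):
    """Map first-day-of-month -> (opening value, closing value) in one pass over sorted rows."""
    result = {}
    for d, value in sorted(rows):
        month_start = d.replace(day=1)
        if month_start not in result:
            result[month_start] = [value, value]  # first row of the month
        else:
            result[month_start][1] = value        # latest row seen so far
    return {month: tuple(values) for month, values in result.items()}

for month, (opening, closing) in opening_and_closing(stock).items():
    print(month, opening, closing)
```

For the sample data this gives opening values 2, 3 and 7, matching the desired results, with closing values 2, 4 and 6.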

SQL Server: how to do 3 months' partition?

I want to count the number of rows in a 0-3 month partition. Months are given by MYMONTH in a format such as 201601 for January 2016. I am using SQL Server 2014. How can I partition over 3 months?
SELECT COUNT(*),
COUNT(*)
/
(COUNT(*) OVER (PARTITION
BY MYMONTH RANGE BETWEEN 3 MONTH PRECEDING AND CURRENT MONTH))
FROM myData
Sample
| Month | Value | ID |
-------------------------|
| 201601 | 1 | X |
| 201601 | 1 | Y |
| 201601 | 1 | Y |
| 201602 | 1 | Z |
| 201603 | 1 | A |
| 201604 | 1 | B |
| 201605 | 1 | C |
| 201607 | 1 | E |
| 201607 | 10 | EE |
| 201607 | 100 | EEE|
Counts
| Month | Count | Count3M | Count/Count3M |
-------------------------------------------
| 201601| 3 | 3 | 3/3 |
| 201602| 1 | 4 | 1/4 |
| 201603| 1 | 5 | 1/5 |
| 201604| 1 | 6 | 1/6 |
| 201605| 1 | 4 | 1/4 |
| 201607| 3 | 5 | 3/5 |
You can try this (MSSQL 2012):
Sample data
CREATE TABLE mytable(
MONT INTEGER NOT NULL
,Value INTEGER NOT NULL
,ID VARCHAR(5) NOT NULL
);
INSERT INTO mytable(MONT,Value,ID) VALUES (201601,1,'X');
INSERT INTO mytable(MONT,Value,ID) VALUES (201601,1,'Y');
INSERT INTO mytable(MONT,Value,ID) VALUES (201601,1,'Y');
INSERT INTO mytable(MONT,Value,ID) VALUES (201602,1,'Z');
INSERT INTO mytable(MONT,Value,ID) VALUES (201603,1,'A');
INSERT INTO mytable(MONT,Value,ID) VALUES (201604,1,'B');
INSERT INTO mytable(MONT,Value,ID) VALUES (201605,1,'C');
INSERT INTO mytable(MONT,Value,ID) VALUES (201607,1,'E');
INSERT INTO mytable(MONT,Value,ID) VALUES (201607,10,'EE');
INSERT INTO mytable(MONT,Value,ID) VALUES (201607,100,'EEE');
Query 1
SELECT MONT, RC,
RC + LAG(RC,3,0) OVER (ORDER BY MONT)
+ LAG(RC,2,0) OVER (ORDER BY MONT)
+ LAG(RC,1,0) OVER (ORDER BY MONT) AS RC_3M_PREC
-- + COALESCE( LEAD(RC) OVER ( ORDER BY MONT),0) AS RC_3M
FROM (SELECT MONT
, COUNT(*) RC
FROM mytable
GROUP BY MONT
) A
Output:
MONT RC RC_3M_PREC
----------- ----------- -----------
201601 3 3
201602 1 4
201603 1 5
201604 1 6
201605 1 4
201607 3 6
Or using what you proposed (option ROWS ... PRECEDING):
Query 2:
SELECT MONT, RC
, COALESCE(SUM(RC) OVER (ORDER BY MONT ROWS BETWEEN 3 PRECEDING AND CURRENT ROW),0) AS RC_3M
FROM (SELECT MONT
, COUNT(*) RC
FROM mytable
GROUP BY MONT
) A
Output:
MONT RC RC_3M
----------- ----------- -----------
201601 3 3
201602 1 4
201603 1 5
201604 1 6
201605 1 4
201607 3 6
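One thing worth noting about both queries: they look back 3 rows, not 3 calendar months, so with 201606 missing they return 6 for 201607, whereas the Counts table in the question shows 5. Here is a minimal Python sketch of a calendar-based window for comparison; `month_index` and `rolling_counts` are hypothetical names, and I am assuming the window is the current month plus the 3 calendar months before it.

```python
from collections import Counter

# MONT values from the sample data, one entry per row
months = [201601, 201601, 201601, 201602, 201603,
          201604, 201605, 201607, 201607, 201607]

def month_index(yyyymm):
    """Convert a yyyymm integer to a running month number so gaps are measurable."""
    return (yyyymm // 100) * 12 + (yyyymm % 100) - 1

def rolling_counts(months, preceding=3):
    """Return (month, count, count over the current month and the `preceding` calendar months)."""
    per_month = Counter(months)
    out = []
    for m in sorted(per_month):
        window_total = sum(
            count for other, count in per_month.items()
            if 0 <= month_index(m) - month_index(other) <= preceding
        )
        out.append((m, per_month[m], window_total))
    return out

for row in rolling_counts(months):
    print(row)
```

For the sample data the last row is (201607, 3, 5), because only 201604, 201605 and 201607 fall inside the window.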
If you want to count rows in the previous three months, just use conditional aggregation. You do need a way to enumerate the months:
SELECT COUNT(*),
SUM(CASE WHEN yyyymm_counter <= 3 THEN 1 ELSE 0 END)
FROM (SELECT md.*,
DENSE_RANK() OVER (ORDER BY MYMONTH DESC) as yyyymm_counter
FROM myData md
) md;
Another way without the subquery converts the month value to an actual date. Let me assume that it is a string:
SELECT COUNT(*),
SUM( CASE WHEN DATEDIFF(month, CAST(MYMONTH + '01' as DATE), GETDATE()) <= 3
THEN 1 ELSE 0
END)
FROM MyData;
I've left the / out of the answer. Be aware that SQL Server does integer division, so you may not get the results you want unless you convert the values to a non-integer number (I would suggest multiplying by 1.0 or using 1.0 instead of 1 in the queries).