Cumulative sum reset by change of year (SQL Server)

I'm using SQL Server 2017. I would like to compute a running total of the monthly budget for each factory.
The running total should reset at the start of each new year.
Table schema:
CREATE TABLE [TABLE_1]
(
FACTORY varchar(50) Null,
DATE_YM int Null,
BUDGET int NULL,
);
INSERT INTO TABLE_1 (FACTORY, DATE_YM, BUDGET)
VALUES ('A', 202111, 1),
('A', 202112, 1),
('A', 202201, 10),
('A', 202202, 100),
('A', 202203, 1000),
('B', 202111, 2),
('B', 202112, 2),
('B', 202201, 20),
('B', 202202, 200),
('B', 202203, 2000),
('C', 202111, 3),
('C', 202112, 3),
('C', 202201, 30),
('C', 202202, 300),
('C', 202203, 3000);
Desired result:
FACTORY  DATE_YM  C_BUDGET_SUM
-------  -------  ------------
A        202111   1
A        202112   2
A        202201   10
A        202202   110
A        202203   1110
B        202111   2
B        202112   4
B        202201   20
B        202202   220
B        202203   2220
C        202111   3
C        202112   6
C        202201   30
C        202202   330
C        202203   3330
My approach:
WITH data AS
(
SELECT
T1.FACTORY,
T1.DATE_YM,
T1.BUDGET
FROM
TABLE_1 AS T1
)
SELECT
FACTORY,
DATE_YM,
SUM(BUDGET) OVER (ORDER BY FACTORY, DATE_YM ASC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS 'C_BUDGET_SUM'
FROM
data
This query totals across year ends. How can the year break be implemented dynamically?

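The CTE is not necessary, but I'm assuming this is a simplified version.
To expand on my comment: add a PARTITION BY on the factory and the year (the first four characters of DATE_YM) so that the running total restarts for each factory and each year.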
with data as (
select
T1.FACTORY,
T1.DATE_YM,
T1.BUDGET
from TABLE_1 as T1
)
select
FACTORY,
DATE_YM,
sum(BUDGET) over (partition by Factory,left(Date_YM,4) order by DATE_YM asc rows between unbounded preceding and current row) as 'C_BUDGET_SUM'
from data
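To make the effect of the partition clause easier to follow, here is a minimal Python sketch of the same logic. It is only an illustration, not part of the original answer: the function name cumulative_by_year and the tuple layout are assumptions, and the year is taken as DATE_YM // 100, which corresponds to left(Date_YM,4) for six-digit values.

```python
from itertools import accumulate, groupby

# (FACTORY, DATE_YM, BUDGET) rows taken from the question's sample data
rows = [
    ("A", 202111, 1), ("A", 202112, 1), ("A", 202201, 10),
    ("A", 202202, 100), ("A", 202203, 1000),
    ("B", 202111, 2), ("B", 202112, 2), ("B", 202201, 20),
]

def cumulative_by_year(rows):
    """Running BUDGET total that restarts for each (factory, year) pair."""
    result = []
    # Sorting plays the role of ORDER BY DATE_YM within each partition.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    # The groupby key mirrors PARTITION BY Factory, left(Date_YM, 4).
    for _, group in groupby(ordered, key=lambda r: (r[0], r[1] // 100)):
        group = list(group)
        totals = accumulate(budget for _, _, budget in group)
        result.extend((f, ym, t) for (f, ym, _), t in zip(group, totals))
    return result

result = cumulative_by_year(rows)
```

For factory A this gives 1 and 2 for 2021, then 10, 110 and 1110 for 2022, matching the desired result in the question.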

Denormalize column

I have data in my database like this:
Code  meta   meta_ID  date
----  -----  -------  -------------------
A     1,2    1        01/01/2022 08:08:08
B     1,2    2        01/01/2022 02:00:00
B     null   2        01/01/1900 02:00:00
C     null   3        01/01/2022 02:00:00
D     8      8        01/01/2022 02:00:00
E     5,6,7  5        01/01/2022 02:00:00
F     1,2    2        01/01/2022 02:00:00
I want to get the following result, keeping the latest date for each code (compared by day, month and year):
Code  meta   meta_ID  list_Code  date
----  -----  -------  ---------  -------------------
A     2,3    1        A,B,F      01/01/2022 08:08:08
B     1,3    2        A,B,F      01/01/2022 02:00:00
C     null   3        C          01/01/2022 02:00:00
D     8      8        D          01/01/2022 02:00:00
E     5,6,7  5        E          01/01/2022 02:00:00
F     1,2    3        A,B,F      01/01/2022 02:00:00
For each row, list_Code should contain the codes that share the same meta group. How can I do this in SQL Server?
The code below inputs the 1st table and outputs the 2nd table exactly. The Meta and Date columns had duplicate values, so in the CTE I took the MAX for both fields. Different logic can be applied if needed.
It uses FOR XML PATH to concatenate the matching rows into a single value for the List_Code column. The STUFF function removes the leading comma (,) delimiter.
CREATE TABLE MetaTable
(
Code VARCHAR(5),
Meta VARCHAR(100),
Meta_ID INT,
Date DATETIME
)
GO
INSERT INTO MetaTable
VALUES
('A', '1,2', '1', '01/01/2022 08:08:08'),
('B', '1,2','2', '01/01/2022 02:00:00'),
('B', NULL,'2', '01/01/1900 02:00:00'),
('C', NULL,'3', '01/01/2022 02:00:00'),
('D', '8','8', '01/01/2022 02:00:00'),
('E', '5,6,7', '5', '01/01/2022 02:00:00'),
('F', '1,2','2', '01/01/2022 02:00:00')
GO
WITH CTE_Meta
AS
(
SELECT
Code,
MAX(Meta) AS 'Meta',
Meta_ID,
MAX(Date) AS 'Date'
FROM MetaTable
GROUP BY
Code,
Meta_ID
)
SELECT
T1.Code,
T1.Meta,
T1.Meta_ID,
STUFF
(
(
SELECT ',' + Code
FROM CTE_Meta T2
WHERE ISNULL(T1.Meta, '') = ISNULL(T2.Meta, '')
FOR XML PATH('')
), 1, 1, ''
) AS 'List_Code',
T1.Date
FROM CTE_Meta T1
ORDER BY 1
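The correlated subquery above can be read as "for each row, join the codes of all rows with the same Meta value". A rough Python sketch of that idea follows. It is an illustration only: the function name list_codes is made up, and the input is assumed to be the (Code, Meta) pairs left after the CTE has collapsed the duplicate rows.

```python
from collections import defaultdict

# (Code, Meta) pairs, assumed to be the CTE output for the sample data
rows = [
    ("A", "1,2"), ("B", "1,2"), ("C", None),
    ("D", "8"), ("E", "5,6,7"), ("F", "1,2"),
]

def list_codes(rows):
    """Map each code to the comma-separated codes sharing its Meta value."""
    by_meta = defaultdict(list)
    for code, meta in rows:
        # `meta or ""` mirrors ISNULL(Meta, ''): NULL and '' compare as equal.
        by_meta[meta or ""].append(code)
    # str.join adds no leading delimiter, so nothing like STUFF is needed here.
    return {code: ",".join(by_meta[meta or ""]) for code, meta in rows}

codes = list_codes(rows)
```

Here the codes appear in input order. The SQL version has no ORDER BY inside the subquery, so the order of codes in its list is not guaranteed.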
I like the first answer using XML. It's very concise. This is more verbose, but might be more flexible if the data can have different meta values spread about in different records. The CAST to varchar(12) in various places is just for the display. I use STRING_AGG and STRING_SPLIT instead of XML.
WITH TestData as (
SELECT t.*
FROM (
Values
('A', '1,2', '1', '01/01/2022 08:08:08'),
('B', '1,2', '2', '01/01/2022 02:00:00'),
('B', null, '2', '01/01/1900 02:00:00'),
('C', null, '3', '01/01/2022 02:00:00'),
('D', '8', '8', '01/01/2022 02:00:00'),
('E', '5,6,7', '5', '01/01/2022 02:00:00'),
('F', '1,2', '2', '01/01/2022 02:00:00'),
('G', '16', '17', '01/01/2022 02:00:00'),
('G', null, '17', '01/02/2022 03:00:00'),
('G', '19', '18', '01/03/2022 04:00:00'),
('G', '19', '18', '01/03/2022 04:00:00'),
('G', '20', '19', '01/04/2022 05:00:00'),
('G', '20', '20', '01/05/2022 06:00:00')
) t (Code, meta, meta_ID, date)
), CodeLookup as ( -- used to find the Code from the meta_ID
SELECT DISTINCT meta_ID, Code
FROM TestData
), Normalized as ( -- split out the meta values, one per row
SELECT t.Code, s.Value as [meta], meta_ID, [date]
FROM TestData t
OUTER APPLY STRING_SPLIT(t.meta, ',') s
), MetaLookup as ( -- used to find the distinct list of meta values for a Code
SELECT n.Code, CAST(STRING_AGG(n.meta, ',') WITHIN GROUP ( ORDER BY n.meta ASC ) as varchar(12)) as [meta]
FROM (
SELECT DISTINCT Code, meta
FROM Normalized
WHERE meta is not NULL
) n
GROUP BY n.Code
), MetaIdLookup as ( -- used to find the distinct list of meta_ID values for a Code
SELECT n.Code, CAST(STRING_AGG(n.meta_ID, ',') WITHIN GROUP ( ORDER BY n.meta_ID ASC ) as varchar(12)) as [meta_ID]
FROM (
SELECT DISTINCT Code, meta_ID
FROM Normalized
) n
GROUP BY n.Code
), ListCodeLookup as ( -- for every code, get all codes for the meta values
SELECT l.Code, CAST(STRING_AGG(l.lookupCode, ',') WITHIN GROUP ( ORDER BY l.lookupCode ASC ) as varchar(12)) as [list_Code]
FROM (
SELECT DISTINCT n.Code, c.Code as [lookupCode]
FROM Normalized n
INNER JOIN CodeLookup c
ON c.meta_ID = n.meta
UNION -- every record needs its own code in the list_code?
SELECT DISTINCT n.Code, n.Code as [lookupCode]
FROM Normalized n
) l
GROUP BY l.Code
)
SELECT t.Code, m.meta, mi.meta_ID, lc.list_Code, t.[date]
FROM (
SELECT Code, MAX([date]) as [date]
FROM TestData
GROUP BY Code
) t
LEFT JOIN MetaLookup m
ON m.Code = t.Code
LEFT JOIN MetaIdLookup mi
ON mi.Code = t.Code
LEFT JOIN ListCodeLookup lc
ON lc.Code = t.Code
Code meta meta_ID list_Code date
---- ------------ ------------ ------------ -------------------
A 1,2 1 A,B,F 01/01/2022 08:08:08
B 1,2 2 A,B,F 01/01/2022 02:00:00
C NULL 3 C 01/01/2022 02:00:00
D 8 8 D 01/01/2022 02:00:00
E 5,6,7 5 E 01/01/2022 02:00:00
F 1,2 2 A,B,F 01/01/2022 02:00:00
G 16,19,20 17,18,19,20 G 01/05/2022 06:00:00

Identify rows subsequent to other rows based on criteria?

I am fairly new to DB2 and SQL. There is a table of customers and their visits. I need to write a query that finds visits by the same customer that occur after, and within 24 hours of, a visit where Sale = 'Y'.
Based on this example data:
CustomerId  VisitID  Sale  DateTime
----------  -------  ----  --------------------------
1           1        Y     2021-04-23 20:16:00.000000
2           2        N     2021-04-24 20:16:00.000000
1           3        N     2021-04-23 21:16:00.000000
2           4        Y     2021-04-25 20:16:00.000000
3           5        Y     2021-04-23 20:16:00.000000
2           6        N     2021-04-25 23:16:00.000000
3           7        N     2021-05-23 20:16:00.000000
The query results should return:
VisitID
3
6
How do I do this?
Try this. You may uncomment the commented out block to run this statement as is.
/*
WITH MYTAB (CustomerId, VisitID, Sale, DateTime) AS
(
VALUES
(1, 1, 'Y', '2021-04-23 20:16:00'::TIMESTAMP)
, (1, 3, 'N', '2021-04-23 21:16:00'::TIMESTAMP)
, (2, 2, 'N', '2021-04-24 20:16:00'::TIMESTAMP)
, (2, 4, 'Y', '2021-04-25 20:16:00'::TIMESTAMP)
, (2, 6, 'N', '2021-04-25 23:16:00'::TIMESTAMP)
, (3, 5, 'Y', '2021-04-23 20:16:00'::TIMESTAMP)
, (3, 7, 'N', '2021-05-23 20:16:00'::TIMESTAMP)
)
*/
SELECT VisitID
FROM MYTAB A
WHERE EXISTS
(
SELECT 1
FROM MYTAB B
WHERE B.CustomerID = A.CustomerID
AND B.Sale = 'Y'
AND B.VisitID <> A.VisitID
AND A.DateTime BETWEEN B.DateTime AND B.DateTime + 24 HOUR
)
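The EXISTS condition can be checked against the sample rows with a small Python sketch. This is an illustration under assumptions, not part of the answer: the function name visits_after_sale and the tuple layout are made up, and the 24-hour window is inclusive on both ends, as BETWEEN is.

```python
from datetime import datetime, timedelta

# (CustomerId, VisitID, Sale, DateTime) rows from the answer's sample data
visits = [
    (1, 1, "Y", datetime(2021, 4, 23, 20, 16)),
    (1, 3, "N", datetime(2021, 4, 23, 21, 16)),
    (2, 2, "N", datetime(2021, 4, 24, 20, 16)),
    (2, 4, "Y", datetime(2021, 4, 25, 20, 16)),
    (2, 6, "N", datetime(2021, 4, 25, 23, 16)),
    (3, 5, "Y", datetime(2021, 4, 23, 20, 16)),
    (3, 7, "N", datetime(2021, 5, 23, 20, 16)),
]

def visits_after_sale(visits, window=timedelta(hours=24)):
    """Visit ids that fall within `window` after another visit with a sale."""
    sales = [(c, v, t) for c, v, s, t in visits if s == "Y"]
    return sorted(
        v for c, v, _, t in visits
        # Same customer, a different visit, and inside [sale, sale + window].
        if any(sc == c and sv != v and st <= t <= st + window
               for sc, sv, st in sales)
    )

matching = visits_after_sale(visits)
```

With the sample data this selects visits 3 and 6, the result the question asks for.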

How to find all those Sellers from the table who had increase in sales in at least 3 months consecutively in SQL?

How to find all those Sellers from below table who had increase in sales in at least 3 months consecutively?
Record  Seller_id  Months  Sales_amount
------  ---------  ------  ------------
0       121        Feb     100
1       121        Jan     87
2       121        Mar     95
3       121        May     105
4       121        Apr     100
5       321        Jan     100
6       321        Feb     87
7       321        Mar     95
8       321        Apr     105
9       321        May     110
10      597        Jan     100
11      597        Feb     105
12      597        Mar     95
13      597        Apr     100
14      597        May     110
It is curious that you have no year and that the months are three-letter codes. You can do it with LAG and a lookup table of months:
With tbl as (
select * from (values
-- source data
(0 , 121,'Feb',100)
,(1 , 121,'Jan',87 )
,(2 , 121,'Mar',95 )
,(3 , 121,'May',105)
,(4 , 121,'Apr',100)
,(5 , 321,'Jan',100)
,(6 , 321,'Feb',87 )
,(7 , 321,'Mar',95 )
,(8 , 321,'Apr',105)
,(9 , 321,'May',110)
,(10, 597,'Jan',100)
,(11, 597,'Feb',105)
,(12, 597,'Mar',95 )
,(13, 597,'Apr',100)
,(14, 597,'May',110)
) t(id, Seller_id, Months, Sales_amount)
), months as (
select * from ( values
(1, 'Jan')
,(2, 'Feb')
,(3, 'Mar')
,(4, 'Apr')
,(5, 'May')
-- , etc
) t(id,name)
)
select *
from (
select t.*,
lag(Sales_amount,1) over (partition by Seller_id order by m.id) m1,
lag(Sales_amount,2) over (partition by Seller_id order by m.id) m2
from tbl t
join months m on m.name=t.Months
) t
where Sales_amount > m1 and m1 > m2;
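The LAG comparison above looks for three months in a row with strictly rising sales. The same check can be sketched in Python as follows. This is an illustration only: the function name sellers_with_streak and the MONTH_NO mapping are assumptions, and like the query it reads "at least 3 months consecutively" as three rising months, that is, two consecutive increases.

```python
# Month lookup, playing the role of the `months` table in the query
MONTH_NO = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5}

# (Seller_id, Months, Sales_amount) rows from the question
sales = [
    (121, "Feb", 100), (121, "Jan", 87), (121, "Mar", 95),
    (121, "May", 105), (121, "Apr", 100),
    (321, "Jan", 100), (321, "Feb", 87), (321, "Mar", 95),
    (321, "Apr", 105), (321, "May", 110),
    (597, "Jan", 100), (597, "Feb", 105), (597, "Mar", 95),
    (597, "Apr", 100), (597, "May", 110),
]

def sellers_with_streak(sales, length=3):
    """Sellers with `length` consecutive months of strictly rising sales."""
    by_seller = {}
    for seller, month, amount in sales:
        by_seller.setdefault(seller, []).append((MONTH_NO[month], amount))
    winners = []
    for seller, entries in by_seller.items():
        amounts = [a for _, a in sorted(entries)]  # order by month number
        run = 1  # length of the current rising run, counted in months
        for prev, cur in zip(amounts, amounts[1:]):
            run = run + 1 if cur > prev else 1
            if run >= length:
                winners.append(seller)
                break
    return winners

streak_sellers = sellers_with_streak(sales)
```

On the sample data all three sellers qualify, because each has a run such as Mar, Apr, May with rising amounts.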
WITH a
AS (SELECT *
FROM
(
VALUES -- source data
(0, 121, 'Feb', 100),
(1, 121, 'Jan', 87),
(2, 121, 'Mar', 95),
(3, 121, 'May', 105),
(4, 121, 'Apr', 100),
(5, 321, 'Jan', 100),
(6, 321, 'Feb', 87),
(7, 321, 'Mar', 95),
(8, 321, 'Apr', 105),
(9, 321, 'May', 110),
(10, 597, 'Jan', 100),
(11, 597, 'Feb', 105),
(12, 597, 'Mar', 95),
(13, 597, 'Apr', 100),
(14, 597, 'May', 110)
) t (id, Seller_id, Months, Sales_amount) ),
b
AS (SELECT *
FROM
(
VALUES
(1, 'Jan'),
(2, 'Feb'),
(3, 'Mar'),
(4, 'Apr'),
(5, 'May') -- , etc
) t (id, name) ),
c
AS (SELECT a.*,
b.id id2,
ROW_NUMBER() OVER (PARTITION BY a.Seller_id ORDER BY b.id ASC) rnk
FROM a
LEFT JOIN b
ON a.Months = b.name),
d
AS (SELECT --c1.*
c1.Seller_id,
c1.Months AS m1,
c2.Months AS m2,
c3.Months AS m3,
c1.Sales_amount AS sa1,
c2.Sales_amount AS sa2,
c3.Sales_amount AS sa3
FROM c c1
LEFT JOIN c c2
ON c1.id2 = c2.id2 - 1
AND c1.Seller_id = c2.Seller_id
LEFT JOIN c c3
ON c2.id2 = c3.id2 - 1
AND c2.Seller_id = c3.Seller_id)
SELECT *,
CASE
WHEN sa1 < sa2
AND sa2 < sa3 THEN
1
ELSE
0
END is_con
FROM d;

Select rows until running sum reaches specific value

I have the following data:
DECLARE @t TABLE (usr VARCHAR(100), dt DATE, amount INT);
INSERT INTO @t VALUES
('a', '2018-01-01', 100), -- 100
('a', '2018-02-01', 100), -- 200
('a', '2018-03-01', 100), -- 300
('a', '2018-04-01', 100), -- 400
('a', '2018-05-01', 100), -- 500
('b', '2018-01-01', 150), -- 150
('b', '2018-02-01', 150), -- 300
('b', '2018-03-01', 150), -- 450
('b', '2018-04-01', 150), -- 600
('b', '2018-05-01', 150); -- 750
And a value such as 300 or 301 (a user variable or column). I want to select rows until running total of amount reaches the specified value, with the following twist:
For 300 I want to select first 3 rows for a and first 2 rows for b
For 301 I want to select first 4 rows for a and first 3 rows for b
This is supposed to be simple but the solutions I found do not handle the second case.
DECLARE @t TABLE (usr VARCHAR(100), dt DATE, amount INT);
INSERT INTO @t VALUES
('a', '2018-01-01', 100), -- 100
('a', '2018-02-01', 100), -- 200
('a', '2018-03-01', 100), -- 300
('a', '2018-04-01', 100), -- 400
('a', '2018-05-01', 100), -- 500
('b', '2018-01-01', 150), -- 150
('b', '2018-02-01', 150), -- 300
('b', '2018-03-01', 150), -- 450
('b', '2018-04-01', 150), -- 600
('b', '2018-05-01', 150); -- 750
DECLARE @Total INT = 301;
WITH cte AS
(
SELECT *, SUM(amount) OVER (PARTITION BY usr ORDER BY dt) AS RunTotal
FROM @t
)
SELECT *
FROM cte
WHERE cte.RunTotal - cte.amount < @Total -- if the running total up to the previous row is
-- less than @Total, include the current row
DECLARE @t TABLE (usr VARCHAR(100), dt DATE, amount INT);
INSERT INTO @t VALUES
('a', '2018-01-01', 100), -- 100
('a', '2018-02-01', 100), -- 200
('a', '2018-03-01', 100), -- 300
('a', '2018-04-01', 100), -- 400
('a', '2018-05-01', 100), -- 500
('b', '2018-01-01', 150), -- 150
('b', '2018-02-01', 150), -- 300
('b', '2018-03-01', 150), -- 450
('b', '2018-04-01', 150), -- 600
('b', '2018-05-01', 150); -- 750
declare @target int = 300;
with cte_RunningTotal as
(
select
usr,
dt,
amount,
sum(amount) over (partition by usr order by dt rows unbounded preceding) as runningTotal
from @t
)
select *
from cte_RunningTotal
where runningTotal < @target + amount
order by usr, dt
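Both answers rely on the same idea: keep a row as long as the running total of the rows before it is still below the target. A minimal Python sketch of that rule is shown here. It is an illustration only: the function name rows_until is made up, and the early break assumes amounts are non-negative, so that once the target is reached no later row can qualify.

```python
from itertools import groupby

# (usr, dt, amount) rows from the question
rows = [
    ("a", "2018-01-01", 100), ("a", "2018-02-01", 100),
    ("a", "2018-03-01", 100), ("a", "2018-04-01", 100),
    ("a", "2018-05-01", 100),
    ("b", "2018-01-01", 150), ("b", "2018-02-01", 150),
    ("b", "2018-03-01", 150), ("b", "2018-04-01", 150),
    ("b", "2018-05-01", 150),
]

def rows_until(rows, target):
    """Per user, keep rows until the running total reaches `target`."""
    selected = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for _, group in groupby(ordered, key=lambda r: r[0]):
        running = 0  # total of the rows before the current one
        for usr, dt, amount in group:
            if running >= target:  # same test as RunTotal - amount < target
                break
            running += amount
            selected.append((usr, dt, amount, running))
    return selected

def count_for(selection, usr):
    return sum(1 for row in selection if row[0] == usr)

picked_300 = rows_until(rows, 300)
picked_301 = rows_until(rows, 301)
```

For a target of 300 this keeps three rows for a and two for b. For 301 it keeps four rows for a and three for b, which is the "twist" described in the question.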

PostgreSQL/plpython: how compare two columns from different table in loop

I have a problem with a loop in which I must compare columns from different tables.
I have two tables, year2004 and year2005. Both contain month numbers and an amount for each month. I want to compare the amounts from both tables and produce a third table, year, with the month number and the greatest amount for that month.
For example, if a month has 100 in 2004 and 200 in 2005, I must return the values (2005, number_of_month, 200). Do you have any ideas for solving this problem?
PS. Sorry for my writing errors, I learned English only few years ago :)
I'm guessing that you're trying to find the greatest amount for each month across the two years.
This would be much, much easier if your data was all in one table monthly_statistics with a date column. Then it'd just be a simple aggregate function or a window.
So let's turn the two tables into one.
Given sample data:
CREATE TABLE year2004 ( month int primary key, amount int);
INSERT INTO year2004 (month, amount)
VALUES (1, 50), (2, 40), (3, 60), (4, 80), (5, 100), (6, 800), (7, 20), (8, 40), (9, 30), (10, 40), (11, 50), (12, 99);
CREATE TABLE year2005 ( month int primary key, amount int);
INSERT INTO year2005 (month, amount)
VALUES (1, 88), (2, 44), (3, 11), (4, 123), (5, 12), (6, 88), (7, 21), (8, 19), (9, 44), (10, 89), (11, 4), (12, 42);
We could either join the tables, or convert them into a single table keyed by date and then filter it. Here's how we might generate a single table with the contents:
SELECT DATE '2004-01-01' + (month - 1) * INTERVAL '1' MONTH AS sampledate, amount
FROM year2004
UNION ALL
SELECT DATE '2005-01-01' + (month - 1) * INTERVAL '1' MONTH, amount
FROM year2005;
That's what you'd use if you were going to create a new table, but if you don't care about the actual dates, only the months, you can simply union all the two tables:
WITH samples AS (
SELECT month, amount
FROM year2004
UNION ALL
SELECT month, amount
FROM year2005
)
SELECT month, max(amount) AS amount
FROM samples
GROUP BY 1
ORDER BY month;
 month | amount
-------+--------
     1 |     88
     2 |     44
     3 |     60
     4 |    123
     5 |    100
     6 |    800
     7 |     21
     8 |     40
     9 |     44
    10 |     89
    11 |     50
    12 |     99
(12 rows)
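Since the question mentions plpython, the UNION ALL plus max() aggregation can also be sketched in plain Python. This is an illustration only: the function name monthly_max and the dictionary layout (month number mapped to amount) are assumptions, filled with the sample data from above.

```python
# month -> amount, copied from the sample INSERT statements
year2004 = {1: 50, 2: 40, 3: 60, 4: 80, 5: 100, 6: 800,
            7: 20, 8: 40, 9: 30, 10: 40, 11: 50, 12: 99}
year2005 = {1: 88, 2: 44, 3: 11, 4: 123, 5: 12, 6: 88,
            7: 21, 8: 19, 9: 44, 10: 89, 11: 4, 12: 42}

def monthly_max(*years):
    """Greatest amount per month across any number of year tables."""
    best = {}
    for year in years:  # iterating over all tables acts like UNION ALL
        for month, amount in year.items():
            # max() per key is the equivalent of GROUP BY month
            best[month] = max(amount, best.get(month, amount))
    return dict(sorted(best.items()))  # ORDER BY month

greatest = monthly_max(year2004, year2005)
```

For example, month 4 yields 123 (from 2005) and month 6 yields 800 (from 2004).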