Rolling Sum (4 Months) - sql

I have been struggling to build a query in Access that calculates a "rolling 4 months" of sales data. I have been experimenting with DSUM, but I only seem to be able to get the subtotal or running total for a specific group (not a moving total). I have tried to illustrate what I am trying to do below.
Date Product Value Rolling_4_Month_Sum
January A 100 100
February A 200 300
March A 300 600
April A 300 900
May A 200 1000
June A 400 1200
July A 500 1400
August A 700 1800
Is it possible to make a running total for 4 rows/months only?

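SELECT
    a.[Date],
    a.Product,
    a.[Value],
    SUM(b.[Value]) AS Rolling_4_Month_Sum
FROM
    [Table] AS a
    INNER JOIN [Table] AS b
        ON (a.Product = b.Product
        AND b.[Date] <= a.[Date]
        AND b.[Date] >= DateAdd("m", -3, a.[Date]))
GROUP BY
    a.[Date], a.Product, a.[Value]
This should work, in my opinion, provided [Date] is a real date column (for example the first day of each month) rather than a month name.
Table a supplies the row for a single month.
Table b is a self-join that retrieves that month plus the three preceding months. This is done by adding b.[Date] >= DateAdd("m", -3, a.[Date]) as a self-join criterion. Date, Value and Table are reserved words in Access, hence the square brackets.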
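To check the self-join idea against the numbers in the question, here is a minimal sketch that runs the same kind of query in SQLite through Python's sqlite3 module. SQLite is only a stand-in for Access here, and the table name sales, the column names, and the first-of-month dates in 2023 are assumptions made for the example. The lower bound of the window is "three months before the current row", which in Access would be written with DateAdd("m", -3, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; months are stored as first-of-month ISO dates.
conn.execute("CREATE TABLE sales (sale_date TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2023-01-01", "A", 100),
        ("2023-02-01", "A", 200),
        ("2023-03-01", "A", 300),
        ("2023-04-01", "A", 300),
        ("2023-05-01", "A", 200),
        ("2023-06-01", "A", 400),
        ("2023-07-01", "A", 500),
        ("2023-08-01", "A", 700),
    ],
)

# Self-join: b holds the current month and the three months before it.
rolling_sql = """
SELECT a.sale_date, a.product, a.amount, SUM(b.amount) AS rolling_4_month_sum
FROM sales AS a
JOIN sales AS b
  ON b.product = a.product
 AND b.sale_date <= a.sale_date
 AND b.sale_date >= date(a.sale_date, '-3 months')
GROUP BY a.sale_date, a.product, a.amount
ORDER BY a.sale_date
"""
rolling = [row[3] for row in conn.execute(rolling_sql)]
print(rolling)  # [100, 300, 600, 900, 1000, 1200, 1400, 1800]
```

The sums match the Rolling_4_Month_Sum column in the question.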

Here is a nice example of how these kinds of things work.
Data:
OrderDetailID OrderID ProductID Price
1 1234 1 $5.00
2 1234 2 ($2.00)
3 1234 3 $4.00
4 1235 1 $5.00
5 1235 3 $4.00
6 1235 5 $12.00
7 1235 2 ($2.00)
SQL:
SELECT OD.OrderDetailID, OD.OrderID, OD.ProductID, OD.Price,
       (SELECT Sum(Price)
        FROM tblOrderDetails
        WHERE OrderDetailID <= OD.OrderDetailID) AS RunningSum
FROM tblOrderDetails AS OD;

Related

Grouping and Summarize SQL

My table looks like the following:
income  date        productid  invoiceid  customerid
300     2015-01-01  A          1234551    1
300     2016-01-02  A          1234552    1
300     2016-01-03  B          1234553    2
300     2016-01-03  A          1234553    2
300     2016-01-04  C          1234554    3
300     2016-01-04  C          1234554    3
300     2016-01-08  A          1234556    3
300     2016-01-08  B          1234556    3
300     2016-01-11  C          1234557    3
I need to know, for each number of invoices per customer, how many customers have that number of invoices (for example: one invoice = several customers, two invoices = two customers, three invoices = three customers, and so on).
What is the syntax for this query?
In my sample data above, customer 1 has two invoices, customer 2 one invoice and customer 3 three invoices. So there is one customer each with a count of 1, 2, and 3 invoices in my example.
Expected result:
invoice_count  customers_with_this_invoice_count
1              1
2              1
3              1
I tried this syntax and I'm still stuck:
select * from
(
select CustomerID,count(distinct InvoiceID) as 'Total Invoices'
from exam
GROUP BY CustomerID
) a
Select Count(customerID),CustomerID From a
Group By customerID
Having Count(customerID) > 1
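One way to get the expected result is two levels of aggregation: first count the distinct invoices per customer, then count the customers per invoice count. The sketch below is only an illustration run in SQLite through Python's sqlite3 module (an assumption, since the question does not name the database); it reuses the table name exam from the attempt above and loads just the two columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exam (invoiceid INTEGER, customerid INTEGER)")
conn.executemany(
    "INSERT INTO exam VALUES (?, ?)",
    [
        (1234551, 1), (1234552, 1),
        (1234553, 2), (1234553, 2),
        (1234554, 3), (1234554, 3),
        (1234556, 3), (1234556, 3),
        (1234557, 3),
    ],
)

# Inner query: distinct invoices per customer.
# Outer query: how many customers share each invoice count.
distribution_sql = """
SELECT invoice_count, COUNT(*) AS customers_with_this_invoice_count
FROM (
    SELECT customerid, COUNT(DISTINCT invoiceid) AS invoice_count
    FROM exam
    GROUP BY customerid
) AS per_customer
GROUP BY invoice_count
ORDER BY invoice_count
"""
distribution = conn.execute(distribution_sql).fetchall()
print(distribution)  # [(1, 1), (2, 1), (3, 1)]
```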

Showing particular row data as column

My table is as follows:
Date        Code       Price  MA5    MA20
2022-01-01  APPLE      1000   1080   1090
2022-01-02  APPLE      1100   1084   1100
2022-01-03  APPLE      1200   1090   1100
2022-01-01  MICROSOFT  7      9      10
2022-01-02  MICROSOFT  7.5    8      9.5
2022-01-03  MICROSOFT  8      8.5    9
...         ...        ...    ...    ...
2022-01-01  NASDAQ     14400  15600  16700
2022-01-02  NASDAQ     14500  15200  16100
2022-01-03  NASDAQ     14600  15000  16000
I'm currently saving NASDAQ values and stock data on the same table using MariaDB.
However, I want to show NASDAQ's MA values as new columns on the rest of the rows, as NASDAQ_MA5 and NASDAQ_MA20.
My question is, how do I select NASDAQ's MA5 and MA20 values and attach them to the rows with matching dates? My desired output is as follows:
Date        Code       Price  MA5   MA20  NASDAQ_MA5  NASDAQ_MA20
2022-01-01  APPLE      1000   1080  1090  15600       16700
2022-01-02  APPLE      1100   1084  1100  15200       16100
2022-01-03  APPLE      1200   1090  1100  15000       16000
2022-01-01  MICROSOFT  7      9     10    15600       16700
2022-01-02  MICROSOFT  7.5    8     9.5   15200       16100
2022-01-03  MICROSOFT  8      8.5   9     15000       16000
I've been trying the following:
SELECT *,
(PARTITION BY DATE, case when (code='NASDAQ') then MA5 else NULL end) as 'NASDAQ_MA5',
(PARTITION BY DATE, case when (code='NASDAQ') then MA20 else NULL end) as 'NASDAQ_MA20'
FROM TABLE
Your help will be very appreciated.
You need to join the two distinct sets of data. It probably makes sense to define a CTE to keep the definitions clear, then join with your main table - either an inner join if there's always a corresponding date or left join if there might not be.
This assumes there's only a single NASDAQ row for each date; if that's not the case, you can aggregate in the CTE as required.
with nasdaq as (
select date, MA5 NASDAQ_MA5, MA20 NASDAQ_MA20
from t
where code = 'NASDAQ'
)
select t.*, n.NASDAQ_MA5, n.NASDAQ_MA20
from t
left join nasdaq n on n.date=t.date
where t.code != 'NASDAQ';
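As a rough illustration of the "aggregate in the CTE" remark, the sketch below deliberately loads a duplicate NASDAQ row for one date and collapses it with GROUP BY before joining. It runs in SQLite through Python's sqlite3 module as a stand-in for MariaDB, and the choice of MAX as the aggregate is purely an assumption; use whichever aggregate fits your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, code TEXT, price REAL, MA5 REAL, MA20 REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        ("2022-01-01", "APPLE", 1000, 1080, 1090),
        ("2022-01-02", "APPLE", 1100, 1084, 1100),
        ("2022-01-01", "NASDAQ", 14400, 15600, 16700),
        ("2022-01-01", "NASDAQ", 14400, 15600, 16700),  # hypothetical duplicate row
        ("2022-01-02", "NASDAQ", 14500, 15200, 16100),
    ],
)

# Grouping by date inside the CTE guarantees one NASDAQ row per date,
# so the join cannot multiply the stock rows.
nasdaq_sql = """
WITH nasdaq AS (
    SELECT date, MAX(MA5) AS NASDAQ_MA5, MAX(MA20) AS NASDAQ_MA20
    FROM t
    WHERE code = 'NASDAQ'
    GROUP BY date
)
SELECT t.date, t.code, n.NASDAQ_MA5, n.NASDAQ_MA20
FROM t
LEFT JOIN nasdaq AS n ON n.date = t.date
WHERE t.code != 'NASDAQ'
ORDER BY t.code, t.date
"""
nasdaq_rows = conn.execute(nasdaq_sql).fetchall()
for row in nasdaq_rows:
    print(row)
```

Without the GROUP BY, the APPLE row for 2022-01-01 would come back twice.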
I'm sure there's a more efficient way of doing this, but here's a way to solve your issue using 2 subqueries:
SELECT t1.*, (SELECT t2.MA5
FROM tableA t2
WHERE t1.date = t2.date
AND code = 'NASDAQ') as NASDAQ_MA5,
(SELECT t2.MA20
FROM tableA t2
WHERE t1.date = t2.date
AND code = 'NASDAQ') as NASDAQ_MA20
FROM tableA t1
WHERE code != 'NASDAQ'
You can do a self join on the date column, separating out the data where code = 'NASDAQ'. Here is a way of doing this:
select a.*, b.ma5 as NASDAQ_MA5, b.ma20 as NASDAQ_MA20 from table1 a
left outer join (select date, ma5,ma20 from table1 where code = 'NASDAQ') b
on a.date = b.date where a.code <> 'NASDAQ'

group by of one column and having count of another

I have a table 'customer' which contains 4 columns
name day product price
A 2021-04-01 p1 100
B 2021-04-01 p1 100
C 2021-04-01 p2 120
A 2021-04-01 p2 120
A 2021-04-02 p1 100
B 2021-04-02 p3 80
C 2021-04-03 p2 120
D 2021-04-03 p2 120
C 2021-04-04 p1 100
With a command
SELECT COUNT(name)
FROM (SELECT name
FROM customer
WHERE day > '2021-03-28'
AND day < '2021-04-09'
GROUP BY name
HAVING COUNT(name) > 2)
I can count the number of customers who bought something more than twice in a period of time.
I would like to know, for each day (GROUP BY over day), how many customers bought something, counting only customers who bought something more than twice in the period.
Suggested Edit:
For the above example, A and C are the valid customers under this condition.
The desired output will be:
day how_many
2021-04-01 2
2021-04-02 1
2021-04-03 1
2021-04-04 1
I interpret your question as wanting to know how many customers made more than one purchase on each day. If so, one method uses two levels of aggregation:
select day,
sum(case when day_count >= 2 then 1 else 0 end)
from (select c.name, c.day, count(*) as day_count
from customer c
group by c.name, c.day
) nc
group by day
order by day;
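If the other reading is meant (a customer qualifies by buying more than twice over the whole period, and qualifying customers are then counted per day), a query along the following lines reproduces the desired output from the question. This is only a sketch, run in SQLite through Python's sqlite3 module as a stand-in for whatever database is in use; the parameter names period_start and period_end are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, day TEXT, product TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        ("A", "2021-04-01", "p1", 100), ("B", "2021-04-01", "p1", 100),
        ("C", "2021-04-01", "p2", 120), ("A", "2021-04-01", "p2", 120),
        ("A", "2021-04-02", "p1", 100), ("B", "2021-04-02", "p3", 80),
        ("C", "2021-04-03", "p2", 120), ("D", "2021-04-03", "p2", 120),
        ("C", "2021-04-04", "p1", 100),
    ],
)

# Subquery: customers with more than two purchases in the period.
# Outer query: distinct qualifying customers per day.
per_day_sql = """
SELECT c.day, COUNT(DISTINCT c.name) AS how_many
FROM customer AS c
WHERE c.day > :period_start AND c.day < :period_end
  AND c.name IN (
        SELECT name
        FROM customer
        WHERE day > :period_start AND day < :period_end
        GROUP BY name
        HAVING COUNT(*) > 2
      )
GROUP BY c.day
ORDER BY c.day
"""
period = {"period_start": "2021-03-28", "period_end": "2021-04-09"}
per_day = conn.execute(per_day_sql, period).fetchall()
print(per_day)
# [('2021-04-01', 2), ('2021-04-02', 1), ('2021-04-03', 1), ('2021-04-04', 1)]
```

COUNT(DISTINCT ...) matters here because A bought twice on 2021-04-01 and should still be counted once for that day.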

Aggregate payments per year per customer per type

Please consider the following payment data:
customerID paymentID paymentType paymentDate paymentAmount
---------------------------------------------------------------------
1 1 A 2015-11-28 500
1 2 A 2015-11-29 -150
1 3 B 2016-03-07 300
2 4 A 2015-03-03 200
2 5 B 2016-05-25 -100
2 6 C 2016-06-24 700
1 7 B 2015-09-22 110
2 8 B 2016-01-03 400
I need to tally per year, per customer, the sum of the diverse payment types (A = invoice, B = credit note, etc), as follows:
year customerID paymentType paymentSum
-----------------------------------------------
2015 1 A 350 : paymentID 1 + 2
2015 1 B 110 : paymentID 7
2015 1 C 0
2015 2 A 200 : paymentID 4
2015 2 B 0
2015 2 C 0
2016 1 A 0
2016 1 B 300 : paymentID 3
2016 1 C 0
2016 2 A 0
2016 2 B 300 : paymentID 5 + 8
2016 2 C 700 : paymentId 6
It is important that there are values for every category (so for 2015, customer 1 has 0 payment value for type C, but still it is good to see this).
In reality, there are over 10 payment types and about 30 customers. The total date range is 10 years.
Is this possible to do in only SQL, and if so could somebody show me how? If possible by using relatively easy queries so that I can learn from it, for instance by storing intermediary result into a #temptable.
Any help is greatly appreciated!
A simple GROUP BY with SUM() on paymentAmount will give you what you want:
select year = datepart(year, paymentDate),
customerID,
paymentType,
paymentSum = sum(paymentAmount)
from payment_data
group by datepart(year, paymentDate), customerID, paymentType
This is a simple query that generates the required 0s. Note that it may not be the most efficient way to generate this result set. If you already have lookup tables for customers or payment types, it would be preferable to use those rather than the CTEs[1] I use here:
create table #t (customerID int, paymentID int, paymentType char(1), paymentDate date,
                 paymentAmount int)
insert into #t(customerID,paymentID,paymentType,paymentDate,paymentAmount) values
(1,1,'A','20151128', 500),
(1,2,'A','20151129',-150),
(1,3,'B','20160307', 300),
(2,4,'A','20150303', 200),
(2,5,'B','20160525',-100),
(2,6,'C','20160624', 700),
(1,7,'B','20150922', 110),
(2,8,'B','20160103', 400)
;With Customers as (
select DISTINCT customerID from #t
), PaymentTypes as (
select DISTINCT paymentType from #t
), Years as (
select DISTINCT DATEPART(year,paymentDate) as Yr from #t
), Matrix as (
select
customerID,
paymentType,
Yr
from
Customers
cross join
PaymentTypes
cross join
Years
)
select
m.customerID,
m.paymentType,
m.Yr,
COALESCE(SUM(paymentAmount),0) as Total
from
Matrix m
left join
#t t
on
m.customerID = t.customerID and
m.paymentType = t.paymentType and
m.Yr = DATEPART(year,t.paymentDate)
group by
m.customerID,
m.paymentType,
m.Yr
Result:
customerID paymentType Yr Total
----------- ----------- ----------- -----------
1 A 2015 350
1 A 2016 0
1 B 2015 110
1 B 2016 300
1 C 2015 0
1 C 2016 0
2 A 2015 200
2 A 2016 0
2 B 2015 0
2 B 2016 300
2 C 2015 0
2 C 2016 700
(We may also want to play games with a numbers table and/or generate actual start and end dates for years if the date processing above needs to be able to use an index)
Note also how similar the top of my script is to the sample data in your question - except it's actual code that generates the sample data. You may wish to consider presenting sample code in such a way in the future since it simplifies the process of actually being able to test scripts in answers.
[1] CTEs - Common Table Expressions. They may be thought of as conceptually similar to temp tables - except we don't actually (necessarily) materialize the results. They also are incorporated into the single query that follows them and the whole query is optimized as a whole.
Your suggestion to use temp tables means that you'd be breaking this into multiple separate queries that then necessarily force SQL to perform the task in an order that we have selected rather than letting the optimizer choose the best approach for the above single query.
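The remark about generating actual start and end dates for years can be sketched as follows. Instead of wrapping paymentDate in a year function inside the join, each year gets a half-open date range and the join compares the bare column against it, which is the shape of predicate an index on paymentDate can typically serve. This is a hedged illustration only: it runs in SQLite through Python's sqlite3 module rather than SQL Server, the table name payment_data is borrowed from the first answer, the YearRanges CTE and its column names are invented for the example, and whether an index is actually used depends on the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payment_data (customerID INTEGER, paymentID INTEGER, "
    "paymentType TEXT, paymentDate TEXT, paymentAmount INTEGER)"
)
conn.executemany(
    "INSERT INTO payment_data VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "A", "2015-11-28", 500), (1, 2, "A", "2015-11-29", -150),
        (1, 3, "B", "2016-03-07", 300), (2, 4, "A", "2015-03-03", 200),
        (2, 5, "B", "2016-05-25", -100), (2, 6, "C", "2016-06-24", 700),
        (1, 7, "B", "2015-09-22", 110), (2, 8, "B", "2016-01-03", 400),
    ],
)

matrix_sql = """
WITH Customers AS (SELECT DISTINCT customerID FROM payment_data),
PaymentTypes AS (SELECT DISTINCT paymentType FROM payment_data),
Years AS (
    SELECT DISTINCT CAST(strftime('%Y', paymentDate) AS INTEGER) AS Yr
    FROM payment_data
),
YearRanges AS (  -- half-open range [YearStart, YearEnd) for each year
    SELECT Yr,
           printf('%04d-01-01', Yr)     AS YearStart,
           printf('%04d-01-01', Yr + 1) AS YearEnd
    FROM Years
)
SELECT y.Yr, c.customerID, p.paymentType,
       COALESCE(SUM(t.paymentAmount), 0) AS Total
FROM Customers AS c
CROSS JOIN PaymentTypes AS p
CROSS JOIN YearRanges AS y
LEFT JOIN payment_data AS t
       ON t.customerID = c.customerID
      AND t.paymentType = p.paymentType
      AND t.paymentDate >= y.YearStart
      AND t.paymentDate <  y.YearEnd
GROUP BY y.Yr, c.customerID, p.paymentType
ORDER BY y.Yr, c.customerID, p.paymentType
"""
totals = conn.execute(matrix_sql).fetchall()
for row in totals:
    print(row)
```

The twelve rows it returns carry the same totals as the expected table in the question, zeros included.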

Joining to another table only on the first occurrence of a field

(Note: I have tried to simplify the below to make it easier both for me and for anyone else to understand; the tables I reference below are in fact sub-queries joining a lot of different data together from different sources.)
I have a table of purchased items:
Items
ItemSaleID CustomerID ItemCode
1 100 A
2 100 B
3 100 C
4 200 A
5 200 C
I also have transaction header and detail tables coming from a till system:
TranDetail
TranDetailID TranHeaderID ItemSaleID Cost
11 51 1 $10
12 51 2 $10
13 51 3 $10
14 52 4 $20
15 52 5 $10
TranHeader
TranHeaderID CustomerID Payment Time
51 100 $100 11:00
52 200 $50 12:00
53 100 $20 13:00
I want to get to a point where I have a table like:
ItemSaleID  CustomerID  ItemCode  Cost  Payment  Time
1           100         A         $10   $120     11:00
2           100         B         $10            11:00
3           100         C         $10            11:00
4           200         A         $20   $50      12:00
5           200         C         $10            12:00
I have a query which produces the results, but when I add in the CASE expression with ROW_NUMBER(), the run time goes from 2 minutes to 30+ minutes.
The query is further complicated because I need to supply the earliest date relating to the list of transactions and the total price paid (there could be many transactions throughout the day for upgrades etc.).
Query below:
SELECT ItemSaleID
, CustomerID
, ItemCode
, Cost
, CASE WHEN ROW_NUMBER() OVER (PARTITION BY TranHeaderID ORDER BY ItemSaleID) = 1
THEN TRN.Payment ELSE NULL END AS Payment
FROM Items I
OUTER APPLY (
SELECT TOP 1 SUB.Payment, Time
FROM TranHeader H
INNER JOIN TranDetail D ON H.TranHeaderID = D.TranHeaderID
OUTER APPLY (SELECT SUM(Payment) AS Payment
FROM TranHeader H2
WHERE H2.CustomerID = Items.CustomerID
) SUB
WHERE D.CustomerID = I.CustomerID
) TRN
WHERE ...
Is there a way that I can show the payment only on the first occurrence of each customer ID whilst maintaining performance?
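One possible shape for this, offered as a sketch rather than a tested fix for the real sub-queries: aggregate the payments once per customer (total paid and earliest time), find each customer's first ItemSaleID once, and join both back, so that no per-row correlated lookup or window function is needed. The example runs in SQLite through Python's sqlite3 module as a stand-in for SQL Server; the CTE names Pay and FirstItem and their column aliases are made up, amounts are stored as plain integers, and whether this is faster on the real data is an assumption that would need measuring.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items (ItemSaleID INTEGER, CustomerID INTEGER, ItemCode TEXT);
CREATE TABLE TranDetail (TranDetailID INTEGER, TranHeaderID INTEGER,
                         ItemSaleID INTEGER, Cost INTEGER);
CREATE TABLE TranHeader (TranHeaderID INTEGER, CustomerID INTEGER,
                         Payment INTEGER, Time TEXT);
INSERT INTO Items VALUES (1,100,'A'),(2,100,'B'),(3,100,'C'),(4,200,'A'),(5,200,'C');
INSERT INTO TranDetail VALUES (11,51,1,10),(12,51,2,10),(13,51,3,10),
                              (14,52,4,20),(15,52,5,10);
INSERT INTO TranHeader VALUES (51,100,100,'11:00'),(52,200,50,'12:00'),
                              (53,100,20,'13:00');
""")

first_row_sql = """
WITH Pay AS (        -- one row per customer: total paid and earliest time
    SELECT CustomerID, SUM(Payment) AS TotalPayment, MIN(Time) AS FirstTime
    FROM TranHeader
    GROUP BY CustomerID
),
FirstItem AS (       -- one row per customer: the item that shows the payment
    SELECT CustomerID, MIN(ItemSaleID) AS FirstItemSaleID
    FROM Items
    GROUP BY CustomerID
)
SELECT i.ItemSaleID, i.CustomerID, i.ItemCode, d.Cost,
       CASE WHEN i.ItemSaleID = f.FirstItemSaleID
            THEN p.TotalPayment END AS Payment,
       p.FirstTime AS Time
FROM Items AS i
JOIN TranDetail AS d ON d.ItemSaleID = i.ItemSaleID
JOIN FirstItem AS f ON f.CustomerID = i.CustomerID
LEFT JOIN Pay AS p ON p.CustomerID = i.CustomerID
ORDER BY i.ItemSaleID
"""
item_rows = conn.execute(first_row_sql).fetchall()
for row in item_rows:
    print(row)
```

On the sample data, customer 100 shows a payment of 120 (100 + 20) at 11:00 on the first item only, and customer 200 shows 50 at 12:00 on the first item only.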