Sum a subquery and group by customer info - sql

I have three tables something like the following:
Customer (CustomerID, AddressState)
Account (AccountID, CustomerID, OpenedDate)
Payment (AccountID, Amount)
The Payment table can contain multiple payments for an Account and a Customer can have multiple accounts.
What I would like to do is retrieve the total amount of all payments on a State by State and Month by Month basis. E.g.
Opened Date| State | Total
--------------------------
2009-01-01 | CA | 2,500
2009-01-01 | GA | 1,000
2009-01-01 | NY | 500
2009-02-01 | CA | 1,500
2009-02-01 | NY | 2,000
In other words, I'm trying to find out what States paid the most for each month. I'm only interested in the month of the OpenedDate but I get it as a date for processing afterwards. I was trying to retrieve all the data I needed in a single query.
I've been trying something along the lines of:
select
dateadd (month, datediff(month, 0, a.OpenedDate), 0) as 'Date',
c.AddressState as 'State',
(
select sum(x.Amount)
from (
select p.Amount
from Payment p
where p.AccountID = a.AccountID
) as x
)
from Account a
inner join Customer c on c.CustomerID = a.CustomerID
where ***
group by
dateadd(month, datediff(month, 0, a.OpenedDate), 0),
c.AddressState
The where clause includes some general filters on the Account table. The query won't work because a.AccountID is referenced in the subquery but is neither in the group by clause nor inside an aggregate function.
Am I approaching this the right way? How can I retrieve the data I require in order to calculate which States' customers pay the most?

If you want the data grouped by month, you need to group by month:
SELECT AddressState, DATEPART(mm, OpenedDate), SUM(Amount)
FROM Customer c
INNER JOIN Account a ON a.CustomerID = c.CustomerID
INNER JOIN Payment p ON p.AccountID = a.AccountID
GROUP BY AddressState, DATEPART(mm, OpenedDate)
This shows you the month number (1-12) and the total amount per state. Note that this example doesn't include years: all amounts for month 1 are summed regardless of year. Add DATEPART(yy, OpenedDate) to both the SELECT list and the GROUP BY if you need to keep years apart.
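As a rough, runnable sketch of grouping by year and month together, here is the same join run against an in-memory SQLite database from Python. The sample rows are made up, and strftime('%Y-%m-01', ...) is SQLite's stand-in for the DATEPART/DATEADD expressions in the SQL Server queries, so treat it as an illustration rather than a drop-in query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER, AddressState TEXT);
    CREATE TABLE Account  (AccountID INTEGER, CustomerID INTEGER, OpenedDate TEXT);
    CREATE TABLE Payment  (AccountID INTEGER, Amount INTEGER);
    -- Hypothetical sample data: one CA customer with accounts in two different years.
    INSERT INTO Customer VALUES (1, 'CA'), (2, 'NY');
    INSERT INTO Account  VALUES (10, 1, '2009-01-15'), (11, 1, '2010-01-20'), (20, 2, '2009-01-05');
    INSERT INTO Payment  VALUES (10, 100), (10, 50), (11, 70), (20, 30);
""")

# The month key keeps the year, so Jan 2009 and Jan 2010 are not merged.
monthly_totals = conn.execute("""
    SELECT strftime('%Y-%m-01', a.OpenedDate) AS OpenedMonth,
           c.AddressState,
           SUM(p.Amount) AS Total
    FROM Customer c
    JOIN Account a ON a.CustomerID = c.CustomerID
    JOIN Payment p ON p.AccountID = a.AccountID
    GROUP BY OpenedMonth, c.AddressState
    ORDER BY OpenedMonth, c.AddressState
""").fetchall()

for row in monthly_totals:
    print(row)
```

Because the joins already bring every payment row next to its account and customer, a plain SUM with GROUP BY is enough and no correlated subquery is needed.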

In other words, I'm trying to find out what States paid the most for each month
This one will select the most profitable state for each month:
SELECT *
FROM (
SELECT yr, mon, AddressState, amt, ROW_NUMBER() OVER (PARTITION BY yr, mon ORDER BY amt DESC) AS rn
FROM (
SELECT YEAR(OpenedDate) AS yr, MONTH(OpenedDate) AS mon, AddressState, SUM(Amount) AS amt
FROM Customer c
JOIN Account a
ON a.CustomerID = c.CustomerID
JOIN Payment p
ON p.AccountID = a.AccountID
GROUP BY
YEAR(OpenedDate), MONTH(OpenedDate), AddressState
) t
) q
WHERE rn = 1
Replace the last condition with ORDER BY yr, mon, amt DESC to get the list of all states like in your resultset:
SELECT *
FROM (
SELECT yr, mon, AddressState, amt, ROW_NUMBER() OVER (PARTITION BY yr, mon ORDER BY amt DESC) AS rn
FROM (
SELECT YEAR(OpenedDate) AS yr, MONTH(OpenedDate) AS mon, AddressState, SUM(Amount) AS amt
FROM Customer c
JOIN Account a
ON a.CustomerID = c.CustomerID
JOIN Payment p
ON p.AccountID = a.AccountID
GROUP BY
YEAR(OpenedDate), MONTH(OpenedDate), AddressState
) t
) q
ORDER BY
yr, mon, amt DESC
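To see the ranking idea in action, here is a small sketch that runs it in SQLite from Python, using the totals from the question's sample output as made-up payments. It assumes an SQLite build with window functions (3.25 or later) and uses strftime in place of YEAR()/MONTH(). The row numbers restart for each month only, not for each state, so rn = 1 picks one state per month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER, AddressState TEXT);
    CREATE TABLE Account  (AccountID INTEGER, CustomerID INTEGER, OpenedDate TEXT);
    CREATE TABLE Payment  (AccountID INTEGER, Amount INTEGER);
    -- Hypothetical rows chosen to reproduce the totals in the question's example.
    INSERT INTO Customer VALUES (1, 'CA'), (2, 'GA'), (3, 'NY');
    INSERT INTO Account  VALUES (10, 1, '2009-01-03'), (20, 2, '2009-01-10'),
                                (30, 3, '2009-01-21'), (11, 1, '2009-02-02'),
                                (31, 3, '2009-02-14');
    INSERT INTO Payment  VALUES (10, 2000), (10, 500), (20, 1000), (30, 500),
                                (11, 1500), (31, 2000);
""")

top_states = conn.execute("""
    SELECT mon, AddressState, amt
    FROM (
        SELECT mon, AddressState, amt,
               -- Partition by month only; adding the state here would make every row rn = 1.
               ROW_NUMBER() OVER (PARTITION BY mon ORDER BY amt DESC) AS rn
        FROM (
            SELECT strftime('%Y-%m', a.OpenedDate) AS mon,
                   c.AddressState,
                   SUM(p.Amount) AS amt
            FROM Customer c
            JOIN Account a ON a.CustomerID = c.CustomerID
            JOIN Payment p ON p.AccountID = a.AccountID
            GROUP BY mon, c.AddressState
        ) AS totals
    ) AS ranked
    WHERE rn = 1
    ORDER BY mon
""").fetchall()

print(top_states)
```

Dropping the rn = 1 filter and ordering by month and amount gives the full per-state list instead of only the top state.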

select
AddressState,
year(OpenedDate) as Yr,
month(OpenedDate) as Mnth,
sum(p.Amount) as SumPayment
from Customer c
inner join Account a
on c.CustomerID=a.CustomerID
inner join Payment p
on a.AccountID=p.AccountID
group by AddressState, year(OpenedDate), month(OpenedDate)

Related

How to get value from a query of another table to create a new column (postgresql)

I am new to postgres and I want to be able to set value to Y if order (order table) is a first month order (first month order table)
first month order table is as per below. It will only show the order placed by user the first time in the month:
customer_id | order_date | order_id
--------------------------------------------------
a1 | December 6, 2015, 8:30 PM | orderA1
order table is as per below. It shows all the order records:
customer_id | order_date | order_id
-----------------------------------------------------
a1 | December 6, 2020, 8:30 PM | orderA1
a1 | December 7, 2020, 8:30 PM | orderA2
a2 | December 11, 2020, 8:30 PM | orderA3
To get the first month order column in the order table, I tried using case as below. But then it will give the error more than one row returned by a subquery.
SELECT DISTINCT ON (order_id) order_id, customer_id,
(CASE when (select distinct order_id from first_month_order_table) = order_id then 'Y' else 'N'
END)
FROM order_table
ORDER BY order_id;
I also tried using count, but I understand that this is quite inefficient and, I think, overworks the database.
SELECT DISTINCT ON (order_id) order_id, customer_id,
(CASE when (select count order_id from first_month_order_table) then 'Y' else 'N'
END)
FROM order_table
ORDER BY order_id;
How can I determine if the order is first month order and set the value as Y for every order in the order table efficiently?
Use the left join as follows:
SELECT o.order_id, o.customer_id,
CASE when f.order_id is not null then 'Y' else 'N' END as flag
FROM order_table o left join first_month_order_table f
on f.order_id = o.order_id
ORDER BY o.order_id;
If you have all orders in the orders table, you don't need the second table. Just use window functions. The following returns a boolean, which I find much more convenient than a character flag:
select o.*,
(row_number() over (partition by customer_id, date_trunc('month', order_date) order by order_date) = 1) as flag
from orders o;
If you want a character flag, then you need case:
select o.*,
(case when row_number() over (partition by customer_id, date_trunc('month', order_date) order by order_date) = 1
then 'Y' else 'N'
end) as flag
from orders o;
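For a quick check of the window-function approach, the sketch below runs it on the question's three orders in SQLite from Python. strftime('%Y-%m', ...) is used here as SQLite's rough equivalent of Postgres date_trunc('month', ...), the dates are rewritten in ISO format, and window functions need SQLite 3.25 or later; those are assumptions of the sketch, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id TEXT, order_date TEXT, order_id TEXT);
    INSERT INTO orders VALUES
        ('a1', '2020-12-06 20:30', 'orderA1'),
        ('a1', '2020-12-07 20:30', 'orderA2'),
        ('a2', '2020-12-11 20:30', 'orderA3');
""")

# The earliest order of each customer within each calendar month gets 'Y'.
flags = conn.execute("""
    SELECT order_id,
           CASE WHEN ROW_NUMBER() OVER (
                    PARTITION BY customer_id, strftime('%Y-%m', order_date)
                    ORDER BY order_date
                ) = 1
                THEN 'Y' ELSE 'N'
           END AS flag
    FROM orders
    ORDER BY order_id
""").fetchall()

print(flags)
```

Customer a1 has two orders in December, so only the earlier one is flagged; a2's single order is flagged as well.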

add missing month in sales

I have a sales table with below values.
TransactionDate,CustomerID,Quantity
2020-01-01,1234,5
2020-07-01,1234,9
2020-03-01,3241,8
2020-07-01,3241,4
As you can see, the first purchase for CustomerID = 1234 was in Jan 2020 and for CustomerID = 3241 in Mar 2020.
I want an output in which every missing month is filled in with a quantity of 0.
That is, if there is no sale in a month between Jan and July, the output should be as below.
TransactionDate,CustomerID,Quantity
2020-01-01,1234,5
2020-02-01,1234,0
2020-03-01,1234,0
2020-04-01,1234,0
2020-05-01,1234,0
2020-06-01,1234,0
2020-07-01,1234,9
2020-03-01,3241,8
2020-04-01,3241,0
2020-05-01,3241,0
2020-06-01,3241,0
2020-07-01,3241,4
You can use a recursive query to create the missing dates per customer.
with recursive dates (customerid, transactiondate, max_transactiondate) as
(
select customerid, min(transactiondate), max(transactiondate)
from sales
group by customerid
union all
select customerid, dateadd(month, 1, transactiondate), max_transactiondate
from dates
where transactiondate < max_transactiondate
)
select
d.customerid,
d.transactiondate,
coalesce(s.quantity, 0) as quantity
from dates d
left join sales s on s.customerid = d.customerid and s.transactiondate = d.transactiondate
order by d.customerid, d.transactiondate;
This is a convenient place to use a recursive CTE. Assuming all your dates are on the first of the month:
with cr as (
select customerid, min(transactiondate) as mindate, max(transactiondate) as maxdate
from t
group by customerid
union all
select customerid, dateadd(month, 1, mindate), maxdate
from cr
where mindate < maxdate
)
select cr.customerid, cr.mindate as transactiondate, coalesce(t.quantity, 0) as quantity
from cr left join
t
on cr.customerid = t.customerid and
cr.mindate = t.transactiondate;
Note that if you have more than 100 months to fill in, then you will need option (maxrecursion 0).
Also, this can easily be adapted if the dates are not all on the first of the month. But you would need to explain what the result set should look like in that case.
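As a sketch of the recursive approach on the question's data, the following runs the same idea in SQLite from Python. SQLite spells the keyword WITH RECURSIVE and uses date(x, '+1 month') where SQL Server uses dateadd(month, 1, x); the column names months, MonthStart and LastMonth are just names picked for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (TransactionDate TEXT, CustomerID INTEGER, Quantity INTEGER);
    INSERT INTO sales VALUES
        ('2020-01-01', 1234, 5), ('2020-07-01', 1234, 9),
        ('2020-03-01', 3241, 8), ('2020-07-01', 3241, 4);
""")

filled = conn.execute("""
    WITH RECURSIVE months (CustomerID, MonthStart, LastMonth) AS (
        -- Anchor: first and last purchase month per customer.
        SELECT CustomerID, MIN(TransactionDate), MAX(TransactionDate)
        FROM sales
        GROUP BY CustomerID
        UNION ALL
        -- Step: advance one month until the last purchase month is reached.
        SELECT CustomerID, date(MonthStart, '+1 month'), LastMonth
        FROM months
        WHERE MonthStart < LastMonth
    )
    SELECT m.MonthStart, m.CustomerID, COALESCE(s.Quantity, 0)
    FROM months m
    LEFT JOIN sales s
           ON s.CustomerID = m.CustomerID
          AND s.TransactionDate = m.MonthStart
    ORDER BY m.CustomerID, m.MonthStart
""").fetchall()

for row in filled:
    print(row)
```

Each customer gets their own range, so 1234 runs from January to July and 3241 from March to July, with zero quantities for the months that have no sale.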
[EDIT] Based on what others posted, I updated the code.
;with
min_date_cte(MinTransactionDate, MaxTransactionDate) as (
select min(TransactionDate), max(TransactionDate) from tsales),
unq_yrs_cte(year_int) as (
select distinct year(TransactionDate) from tsales),
unq_cust_cte(CustomerID) as (
select distinct CustomerID from tsales)
select datefromparts(uyc.year_int, v.month_int, 1) TransactionDate,
ucc.CustomerID,
isnull(t.Quantity, 0) Quantity
from min_date_cte mdc
cross join unq_yrs_cte uyc
cross join unq_cust_cte ucc
cross join (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) v(month_int)
left join tsales t on datefromparts(uyc.year_int, v.month_int, 1)=t.TransactionDate
and ucc.CustomerID=t.CustomerId
where
datefromparts(uyc.year_int, v.month_int, 1)>=mdc.MinTransactionDate
and datefromparts(uyc.year_int, v.month_int, 1)<=mdc.MaxTransactionDate;
Results
TransactionDate CustomerID Quantity
2020-01-01 1234 5
2020-01-01 3241 0
2020-02-01 1234 0
2020-02-01 3241 0
2020-03-01 1234 0
2020-03-01 3241 8
2020-04-01 1234 0
2020-04-01 3241 0
2020-05-01 1234 0
2020-05-01 3241 0
2020-06-01 1234 0
2020-06-01 3241 0
2020-07-01 1234 9
2020-07-01 3241 4
You can make use of recursive query:
WITH cte1 as
(
select customerid, min([TransactionDate]) as Monthly_date, max([TransactionDate]) as end_date from calender_table
group by customerid
union all
select customerid, dateadd(month, 1, Monthly_date), end_date from cte1
where Monthly_date < end_date
)
select a.Monthly_date, a.customerid,coalesce(b.quantity, 0) from cte1 a left outer join calender_table b
on (a.Monthly_date = b.[TransactionDate] and a.customerid = b.customerid)
order by a.customerid, a.Monthly_date;

How to query total sum of sales by month by teams?

What is the best way to get the best selling teams by each month when I have tables like these:
The results should be something like this (group by total price of orders):
Month | Team | Sales
____________________
March | 2 | 3453
April | 3 | 1353
May | 2 | 5341
I have joined two tables before but for some reason joining 4 tables and grouping them by month seems difficult.
Thank you.
In Postgres, you can use distinct on if you want exactly one row per month:
select distinct on (date_trunc('month', o.Created)) date_trunc('month', o.Created) as month, e.Team_nr, sum(p.price) as Sales
from employee e join
orders o
on e.id = o.employee_ID join
products p
on p.id = o.product_id
group by e.Team_nr, date_trunc('month', o.Created)
order by date_trunc('month', o.Created), sum(p.price) desc
This should do it. I added the year too. It uses a CTE:
with cte as
(
select Team_nr, sum(price) as Sales, date_part('month', Created) as _Month, date_part('year', Created) as _Year
from employee e
inner join orders o on e.id = o.employee_ID
inner join products p on p.id = o.product_id
Group by Team_nr, date_part('month', Created), date_part('year', Created)
)
select Team_nr, Sales, _Month, _Year
from cte a
where not exists(select 1 from cte b where
a._Month = b._Month and a._Year = b._Year and a.Team_nr <> b.Team_nr and a.Sales < b.Sales )
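The question's table definitions did not survive, so the sketch below guesses a minimal schema from the column names used in the answers (employee.id, employee.Team_nr, orders.employee_ID, orders.product_id, orders.Created, products.id, products.price). With that assumption, and with made-up rows, the NOT EXISTS idea can be tried in SQLite from Python; strftime('%Y-%m', ...) replaces the separate month and year parts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema, inferred from the queries above; the real tables may differ.
    CREATE TABLE employee (id INTEGER, Team_nr INTEGER);
    CREATE TABLE products (id INTEGER, price INTEGER);
    CREATE TABLE orders   (employee_ID INTEGER, product_id INTEGER, Created TEXT);
    INSERT INTO employee VALUES (1, 2), (2, 3);
    INSERT INTO products VALUES (1, 100), (2, 250);
    INSERT INTO orders VALUES
        (1, 2, '2020-03-05'), (2, 1, '2020-03-09'),
        (2, 2, '2020-04-02'), (2, 1, '2020-04-11'), (1, 2, '2020-04-20');
""")

best_teams = conn.execute("""
    WITH cte AS (
        SELECT strftime('%Y-%m', o.Created) AS month,
               e.Team_nr,
               SUM(p.price) AS Sales
        FROM employee e
        JOIN orders o   ON o.employee_ID = e.id
        JOIN products p ON p.id = o.product_id
        GROUP BY month, e.Team_nr
    )
    -- Keep a team only if no other team sold more in the same month.
    SELECT a.month, a.Team_nr, a.Sales
    FROM cte a
    WHERE NOT EXISTS (
        SELECT 1 FROM cte b
        WHERE b.month = a.month AND b.Sales > a.Sales
    )
    ORDER BY a.month
""").fetchall()

print(best_teams)
```

Unlike distinct on, this form returns every tied team when two teams share the top sales figure for a month.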

Grouping data with step down summation

I have a table with OrderDate and TotalAmount columns. I want to display each month with a running total: the month's TotalAmount plus the totals of all previous months.
e.g.
OrderDate TotalAmount
---------- -----------
13.01.1998--- 10
15.01.1998--- 11
01.02.1998--- 12
18.02.1998--- 10
12.03.1998--- 09
Output should be
Month TotalSum
------ --------
1--- 21
2--- 43
3--- 52
If your data only covers a single calendar year, you could use:
with g as
( select month(orderdate) as ordermonth,
sum( totalamount ) as sales
from orders
group by month(orderdate)
)
select m.ordermonth, sum(t.sales) as totalsales
from g as m
join g as t on m.ordermonth >= t.ordermonth
group by m.ordermonth
order by m.ordermonth
But if there is ANY chance that your data could span more than one year, then you need the year in there as well, so construct your month key to include the year.
with g as
( select format(orderdate, 'yyyy-MM') as ordermonth,
sum( totalamount ) as sales
from orders
group by format(orderdate, 'yyyy-MM')
)
select m.ordermonth, sum(t.sales) as totalsales
from g as m
join g as t on m.ordermonth >= t.ordermonth
group by m.ordermonth
order by m.ordermonth
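Here is a small sketch that checks the self-join against the question's numbers, run in SQLite from Python. The dates are rewritten in ISO format and strftime('%Y-%m', ...) stands in for format(orderdate, 'yyyy-MM'). It also shows a windowed SUM as an alternative, which is a different technique from the self-join above and assumes a database with window functions (SQLite 3.25 or later here).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (OrderDate TEXT, TotalAmount INTEGER);
    INSERT INTO orders VALUES
        ('1998-01-13', 10), ('1998-01-15', 11),
        ('1998-02-01', 12), ('1998-02-18', 10),
        ('1998-03-12', 9);
""")

# Monthly totals, shared by both queries below.
monthly_cte = """
    WITH g AS (
        SELECT strftime('%Y-%m', OrderDate) AS ordermonth,
               SUM(TotalAmount) AS sales
        FROM orders
        GROUP BY ordermonth
    )
"""

# Self-join: each month is joined to itself and to every earlier month.
self_join = conn.execute(monthly_cte + """
    SELECT m.ordermonth, SUM(t.sales) AS totalsales
    FROM g AS m
    JOIN g AS t ON m.ordermonth >= t.ordermonth
    GROUP BY m.ordermonth
    ORDER BY m.ordermonth
""").fetchall()

# Alternative: a windowed running sum over the monthly totals.
windowed = conn.execute(monthly_cte + """
    SELECT ordermonth, SUM(sales) OVER (ORDER BY ordermonth) AS totalsales
    FROM g
    ORDER BY ordermonth
""").fetchall()

print(self_join)
print(windowed)
```

Both queries give 21, 43 and 52 for the three months, matching the expected output in the question.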

SQL Join Not Returning What I Expect

I have a temp table, in the following format, that I am building a query from. It contains a record for every CustomerID, Year, and Month for several years.
#T
Customer | CustomerID | Year | Month
ex.
Foo | 12345 | 2008 | 12
Foo | 12345 | 2008 | 11
Bar | 11224 | 2007 | 7
When I join this temp table to another table of the following format I get many more results than I am expecting.
Event
EventID | CustomerID | DateOpened
ex.
1100 | 12345 | '2008-12-11 10:15:43'
1100 | 12345 | '2008-12-11 11:25:17'
I am trying to get a result set of the count of events along with the Customer, Year, and Month like this.
SELECT COUNT(EventID), Customer, Year, Month
FROM [Event]
JOIN #T ON [Event].CustomerID = #T.CustomerID
WHERE [Event].DateOpened BETWEEN '2008-12-01' AND '2008-12-31'
GROUP BY Customer, Year, Month
ORDER BY Year, Month
I am getting a record for every Year and Month instead of only for December 2008.
You're specifying the date on the event table but not on the join -- so it's joining all records from the temp table with a matching customerid.
Try this:
SELECT COUNT(e.EventID), T.Customer, T.Year, T.Month
FROM [Event] e
INNER JOIN #T T ON (
T.CustomerID = e.CustomerID and
T.Year = year(e.DateOpened) and
T.Month = month(e.DateOpened)
)
WHERE T.Year = 2008
and T.Month = 12
GROUP BY T.Customer, T.Year, T.Month
ORDER BY T.Year, T.Month
Perhaps what you mean is:
SELECT COUNT(EventID)
,Customer
,Year
,Month
FROM [Event]
INNER JOIN #T
ON [Event].CustomerID = #T.CustomerID
AND YEAR([Event].DateOpened) = #T.YEAR
AND MONTH([Event].DateOpened) = #T.MONTH
WHERE [Event].DateOpened >= '2008-12-01'
AND [Event].DateOpened < '2009-01-01'
GROUP BY Customer
,Year
,Month
ORDER BY Year
,Month
Note, I've fixed another latent bug in your code: your BETWEEN is going to exclude datetimes like '2008-12-31 10:15:43'. You can use this or a similar technique.
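The BETWEEN pitfall is easy to reproduce. The sketch below uses SQLite from Python with three made-up event timestamps; SQLite compares these values as text rather than as datetimes, which is an assumption of the demo, but the outcome is the same as with a SQL Server datetime column: '2008-12-31' means the very start of that day, so anything later on the 31st falls outside the BETWEEN range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Event (EventID INTEGER, DateOpened TEXT);
    -- Hypothetical events: mid-December, last day of December, first instant of January.
    INSERT INTO Event VALUES
        (1, '2008-12-11 10:15:43'),
        (2, '2008-12-31 10:15:43'),
        (3, '2009-01-01 00:00:00');
""")

# BETWEEN stops at the start of Dec 31, so event 2 is missed.
between_count = conn.execute("""
    SELECT COUNT(*) FROM Event
    WHERE DateOpened BETWEEN '2008-12-01' AND '2008-12-31'
""").fetchone()[0]

# Half-open range: from Dec 1 inclusive up to, but not including, Jan 1.
half_open_count = conn.execute("""
    SELECT COUNT(*) FROM Event
    WHERE DateOpened >= '2008-12-01' AND DateOpened < '2009-01-01'
""").fetchone()[0]

print(between_count, half_open_count)
```

The half-open form also avoids having to know how many days the month has or how precise the time part is.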
The problem is that there are two rows in #T with CustomerID = 12345. Each of those rows joins with each of the rows in Event. If you only want the CustomerID in December, then you need to filter #T too:
SELECT COUNT(EventID), Customer, Year, Month
FROM [Event]
JOIN #T ON [Event].CustomerID = #T.CustomerID
WHERE [Event].DateOpened BETWEEN '2008-12-01' AND '2008-12-31'
AND #T.Year = 2008
AND #T.Month = 12
GROUP BY Customer, Year, Month
ORDER BY Year, Month
If you have some other expectation, you'd better clarify your question.
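To make the fan-out visible, here is a sketch in SQLite from Python using the sample rows from the question, with the temp table #T modeled as an ordinary table named T (SQLite has no # temp-table syntax). Joining on CustomerID alone pairs each event with every month row of that customer; adding the year and month to the join condition leaves only the matching month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (Customer TEXT, CustomerID INTEGER, Year INTEGER, Month INTEGER);
    CREATE TABLE Event (EventID INTEGER, CustomerID INTEGER, DateOpened TEXT);
    INSERT INTO T VALUES ('Foo', 12345, 2008, 12), ('Foo', 12345, 2008, 11),
                         ('Bar', 11224, 2007, 7);
    INSERT INTO Event VALUES (1100, 12345, '2008-12-11 10:15:43'),
                             (1100, 12345, '2008-12-11 11:25:17');
""")

# Join on CustomerID only: both December events attach to November as well.
loose = conn.execute("""
    SELECT COUNT(e.EventID), tt.Customer, tt.Year, tt.Month
    FROM Event e
    JOIN T tt ON e.CustomerID = tt.CustomerID
    GROUP BY tt.Customer, tt.Year, tt.Month
    ORDER BY tt.Year, tt.Month
""").fetchall()

# Join on CustomerID plus the event's own year and month.
strict = conn.execute("""
    SELECT COUNT(e.EventID), tt.Customer, tt.Year, tt.Month
    FROM Event e
    JOIN T tt ON e.CustomerID = tt.CustomerID
             AND tt.Year  = CAST(strftime('%Y', e.DateOpened) AS INTEGER)
             AND tt.Month = CAST(strftime('%m', e.DateOpened) AS INTEGER)
    GROUP BY tt.Customer, tt.Year, tt.Month
    ORDER BY tt.Year, tt.Month
""").fetchall()

print(loose)
print(strict)
```

The loose join reports two events for November 2008 even though none happened then, which is the extra output described in the question.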
It seems that #T has redundant information if you only want the customer name.
You can solve it with a subquery
SELECT COUNT(EventID), (select TOP 1 #T.Customer from #T Where #T.CustomerID = [Event].CustomerID ), YEAR(DateOpened) AS Year, MONTH(DateOpened) AS Month
FROM [Event]
WHERE [Event].DateOpened BETWEEN '2008-12-01' AND '2008-12-31'
GROUP BY CustomerID, YEAR(DateOpened), MONTH(DateOpened)
ORDER BY YEAR(DateOpened), MONTH(DateOpened)