JOIN CTE's Grouped by Month/Year

I have multiple CTEs which result in the following common table structure:
Year | Month | Total_Purchases_Product_Line_X
These represent purchases grouped by month & year across several product lines.
For example:
SELECT * FROM cte_line_x
Year | Month | Total_Purchases_Product_Line_X
2018 01 256
2018 02 192
SELECT * FROM cte_line_y
Year | Month | Total_Purchases_Product_Line_Y
2018 01 76
2018 02 59
I'd like to create something like the following
Year | Month | Total_Purchases_Line_X | Total_Purchases_Line_Y | Total_Purchases_Line_Z
2018 01 256 76
2018 02 192 59
where the total purchases of each product line are joined on year and month. However, I'm running into issues grouping the dates from each CTE after I have joined them together.
Here is what I've tried:
SELECT
cte_product_x.Month,
cte_product_x.Year,
cte_product_x.total as Total_X,
cte_product_y.total as Total_Y,
cte_product_z.total as Total_Z
FROM
cte_product_x
LEFT JOIN
cte_product_y ON
cte_product_y.year = cte_product_x.year
AND
cte_product_y.month = cte_product_x.month
LEFT JOIN
cte_product_z ON
cte_product_z.year = cte_product_x.year
AND
cte_product_z.month = cte_product_x.month
GROUP BY
cte_product_x.Month,
cte_product_x.Year
ORDER BY
cte_product_x.Month,
cte_product_x.Year
I tried changing my SELECT to:
SELECT
cte_product_x.Month,
cte_product_x.Year,
MAX(cte_product_x.total) as Total_X,
MAX(cte_product_y.total) as Total_Y,
MAX(cte_product_z.total) as Total_Z
However, it only worked for "Total_X". The other columns showed the maximum grouped total found across all months instead of the value for each month. I don't understand why.

Doesn't this work?
SELECT x.Month, x.Year, x.total as Total_X,
y.total as Total_Y, z.total as Total_Z
FROM cte_product_x x JOIN
cte_product_y y
ON y.year = x.year AND y.month = x.month JOIN
cte_product_z z
ON z.year = x.year AND z.month = x.month
ORDER BY x.Month, x.Year;
At least it works for your sample data.
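As a minimal sketch of why no GROUP BY is needed after the join: each CTE already returns one row per year and month, so joining on both columns keeps one row per month. The `purchases` table, its columns, and the sample quantities are assumptions made up for illustration; the sketch uses SQLite through Python, and LEFT JOIN (as in the question) so that a month missing from one product line shows up as NULL rather than disappearing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source table; the question does not show what the CTEs read from.
conn.executescript("""
CREATE TABLE purchases (line TEXT, year INTEGER, month INTEGER, qty INTEGER);
INSERT INTO purchases VALUES
  ('x', 2018, 1, 200), ('x', 2018, 1, 56), ('x', 2018, 2, 192),
  ('y', 2018, 1, 76),  ('y', 2018, 2, 59),
  ('z', 2018, 1, 10);
""")

query = """
WITH cte_product_x AS (
    SELECT year, month, SUM(qty) AS total
    FROM purchases WHERE line = 'x' GROUP BY year, month),
cte_product_y AS (
    SELECT year, month, SUM(qty) AS total
    FROM purchases WHERE line = 'y' GROUP BY year, month),
cte_product_z AS (
    SELECT year, month, SUM(qty) AS total
    FROM purchases WHERE line = 'z' GROUP BY year, month)
-- Each CTE is already one row per (year, month), so a plain join is enough.
SELECT x.year, x.month,
       x.total AS total_x, y.total AS total_y, z.total AS total_z
FROM cte_product_x x
LEFT JOIN cte_product_y y ON y.year = x.year AND y.month = x.month
LEFT JOIN cte_product_z z ON z.year = x.year AND z.month = x.month
ORDER BY x.year, x.month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Product line z has no purchases in month 2 here, so its total comes back as NULL (None in Python) for that row.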

Related

SQL: How to return revenue for specific year

I would like to show the revenue for a specific year for all customers, regardless of whether they have revenue data for that year. (Where a customer has no data for that year, a filler like 'no data' would work.)
Sample Data looks like:
Table 1
Customer | Price | Quantity | Order Date
xxx | 12 | 5 | 1990/03/25
yyy | 15 | 7 | 1991/05/35
xxx | 34 | 2 | 1990/08/21
Desired Output would look a little something like this:
Customer | Revenue (for 1990)
xxx | 128
yyy | no data
Getting the total revenue for each would be:
SELECT Customer,
SUM(quantity*price) AS Revenue
But how would I go about listing it for a specific year for all customers (including customers that don't have data for that year)?

We can use a CTE or a sub-query to build a list of all customers and another to get all years, cross join them, and then left join the result onto revenue.
This gives a row for each customer for each year. If you add a WHERE condition on y, you will only get the requested year.
CREATE TABLE revenue(
Customer varchar(10),
Price int,
Quantity int,
OrderDate date);
insert into revenue values
('xxx', 12,5,'2021-03-25'),
('yyy', 15,7,'2021-05-15'),
('xxx', 34,2,'2022-08-21');
with cust as
(select distinct customer c from revenue),
years as
(select distinct year(OrderDate) y from revenue)
select
y "year",
c customer ,
sum(price*quantity) revenue
from years
cross join cust
left join revenue r
on cust.c = r.customer and years.y = year(OrderDate)
group by
c,y,
year(OrderDate)
order by y,c
year | customer | revenue
---: | :------- | ------:
2021 | xxx | 60
2021 | yyy | 105
2022 | xxx | 68
2022 | yyy | null

You would simply use GROUP BY, do the sum in a subquery, and left join it to your customers table, i.e.:
select customers.Name, totals.Revenue
from Customers
Left join
( select customerId, sum(quantity*price) as revenue
from myTable
where year(orderDate) = 1990
group by customerId) totals on customers.CustomerId = totals.customerId;
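To get the 'no data' filler the question asks for, one option is to wrap the left-joined total in COALESCE. The following is a small sketch in SQLite through Python, not a definitive solution: the table and column names are assumptions, the customer list is derived from the same table with DISTINCT because no separate customers table is shown, and the second sample date is replaced by a valid hypothetical one (1991-05-25).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows modeled on the question; order_date is stored as ISO text.
conn.executescript("""
CREATE TABLE revenue (customer TEXT, price INTEGER, quantity INTEGER, order_date TEXT);
INSERT INTO revenue VALUES
  ('xxx', 12, 5, '1990-03-25'),
  ('yyy', 15, 7, '1991-05-25'),
  ('xxx', 34, 2, '1990-08-21');
""")

query = """
WITH cust AS (
    SELECT DISTINCT customer FROM revenue),
totals AS (
    -- Revenue per customer for the requested year only
    SELECT customer, SUM(price * quantity) AS revenue
    FROM revenue
    WHERE strftime('%Y', order_date) = ?
    GROUP BY customer)
SELECT c.customer,
       -- Cast to text so numbers and the filler share one column type
       COALESCE(CAST(t.revenue AS TEXT), 'no data') AS revenue
FROM cust c
LEFT JOIN totals t ON t.customer = c.customer
ORDER BY c.customer
"""
rows = conn.execute(query, ("1990",)).fetchall()
for row in rows:
    print(row)
```

Filtering inside the `totals` subquery rather than in an outer WHERE matters: a WHERE on the joined rows would drop customers with no orders in that year.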

PostgreSQL - How to get month/year even if there are no records within that date?

What I'm trying to do in this case is to get the latest ("most future") record of a Bills table and then all the records from the 13 months before that record. What I've tried is something like this:
SELECT
users.name,
EXTRACT(month from priority_date) as month,
EXTRACT(year from priority_date) as year,
SUM("money_balance") as "money_balance"
FROM bills
JOIN users on users.id = bills.user_id
WHERE priority_date >= ( SELECT
DATE_TRUNC('month', MAX(debts.priority_date))
FROM bills
INNER JOIN users ON bills.property_id = users.id
WHERE users.company_id = 15
AND users.active = true
AND bills.paid = false ) - interval '13 month'
AND priority_date <= ( SELECT
MAX(bills.priority_date)
FROM bills
INNER JOIN users ON bills.property_id = users.id
WHERE users.community_id = 15
AND users.active = true
AND debts.paid = false )
AND users.company_id = 15
AND bills.paid = false
AND users.active = true
GROUP BY 1,2,3
ORDER BY year, month
So, for instance, let's say the latest date for a created bill is December 2022; this query will give me the data from November 2021 to December 2022.
The data looks something like this:
name | month | year | money_balance
Joshua.. | 11 | 2021 | 300
Joshua.. | 1 | 2022 | 111
Mark.. | 1 | 2022 | 200
... | ... | ... | ...
John | 12 | 2022 | 399
In the case of Joshua, because he had no bills to pay in December 2021, it doesn't return anything for that month/year.
Is it possible to return the months/year where there are no records for that month, for each user?
Something like
name | month | year | money_balance
Joshua.. | 11 | 2021 | 300
Joshua.. | 12 | 2021 | 0
Joshua.. | 1 | 2022 | 111
other users | ... | ... | ...
Thank you so much!

We can use a CTE to create the list of months, using the minimum and maximum dates from bills, and then cross join it onto users to get a row for every user for every month. We then left join onto bills to populate the last column.
The problem with this approach is that we can end up with a lot of rows with no value. Note also that "Mon-YY" is text, so the ORDER BY in the query sorts the months alphabetically rather than chronologically.
create table bills(user_id int,priority_date date, money_balance int);
create table users(id int, name varchar(25));
insert into users values(1,'Joshua'),(2,'Mark'),(3,'John');
insert into bills values(1,'2021-11-01',300),(1,'2022-01-01',111),(2,'2022-01-01',200),(3,'2021-12-01',399);
;with months as
(SELECT to_char(generate_series(min(priority_date), max(priority_date), '1 month'), 'Mon-YY') AS "Mon-YY"
from bills)
SELECT
u.name,
"Mon-YY",
--EXTRACT(month from "Mon-YY") as month,
--EXTRACT(year from "Mon-YY") as year,
SUM("money_balance") as "money_balance"
FROM months m
CROSS JOIN users u
LEFT JOIN bills b
ON u.id = b.user_id
AND to_char(priority_date,'Mon-YY') = m."Mon-YY"
GROUP BY
u.name,
"Mon-YY"
ORDER BY "Mon-YY", u.name
name | Mon-YY | money_balance
:----- | :----- | ------------:
John | Dec-21 | 399
Joshua | Dec-21 | null
Mark | Dec-21 | null
John | Jan-22 | null
Joshua | Jan-22 | 111
Mark | Jan-22 | 200
John | Nov-21 | null
Joshua | Nov-21 | 300
Mark | Nov-21 | null
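The question asked for 0 rather than null in the empty months, and for chronological order. A variation that does both is sketched here in SQLite through Python, using the same sample rows as above. SQLite has no generate_series built in, so a recursive CTE stands in for it; that substitution, and the helper names `bounds`, `months` and `ym`, are assumptions of this sketch. In PostgreSQL the same idea would be to keep generate_series and group by the truncated date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, name TEXT);
CREATE TABLE bills (user_id INTEGER, priority_date TEXT, money_balance INTEGER);
INSERT INTO users VALUES (1, 'Joshua'), (2, 'Mark'), (3, 'John');
INSERT INTO bills VALUES
  (1, '2021-11-01', 300), (1, '2022-01-01', 111),
  (2, '2022-01-01', 200), (3, '2021-12-01', 399);
""")

query = """
WITH RECURSIVE
bounds(lo, hi) AS (
    -- First day of the earliest and latest billed months
    SELECT date(MIN(priority_date), 'start of month'),
           date(MAX(priority_date), 'start of month')
    FROM bills),
months(m) AS (
    SELECT lo FROM bounds
    UNION ALL
    SELECT date(m, '+1 month') FROM months, bounds WHERE m < hi)
SELECT u.name,
       strftime('%Y-%m', months.m) AS ym,
       COALESCE(SUM(b.money_balance), 0) AS money_balance
FROM months
CROSS JOIN users u
LEFT JOIN bills b
       ON b.user_id = u.id
      AND strftime('%Y-%m', b.priority_date) = strftime('%Y-%m', months.m)
GROUP BY u.name, months.m
ORDER BY months.m, u.name  -- sorts by real date, not by month name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the spine is kept as a date until the final SELECT, November 2021 sorts before December 2021 and January 2022, and Joshua's missing December shows as 0.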

Is there a way to select sum on one column based on other DISTINCT column, while grouping by third column(date) only

I have three columns
year | money | id
2020 100 01
2020 100 01
2019 50 02
2018 50 03
2020 40 04
results should be
Year | Money | total people
2020 | 240 | 4
Since the first two rows have the same id, they should be counted only once. I tried it as below:
select year, sum(money), Count( Distinct id) from table
group by year
But the result shows 4 people, which is correct, but the wrong sum, as it counts the money from every row, including the duplicate.

You can aggregate and then aggregate again:
select max(year), sum(money), count(*)
from (select distinct year, money, id
from t
) t;
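A quick way to see the difference is to run both versions against the sample rows from the question. This is a small sketch using SQLite through Python; the table name `t` follows the answer above, and the ids are stored as text to keep the leading zeros.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (year INTEGER, money INTEGER, id TEXT);
INSERT INTO t VALUES
  (2020, 100, '01'), (2020, 100, '01'),
  (2019, 50, '02'), (2018, 50, '03'), (2020, 40, '04');
""")

# Naive version: the duplicated row for id 01 is summed twice.
naive = conn.execute(
    "SELECT SUM(money), COUNT(DISTINCT id) FROM t"
).fetchone()

# Deduplicate first, then aggregate the distinct rows.
dedup = conn.execute("""
    SELECT MAX(year), SUM(money), COUNT(*)
    FROM (SELECT DISTINCT year, money, id FROM t) AS d
""").fetchone()

print(naive)
print(dedup)
```

The inner SELECT DISTINCT collapses the two identical rows for id 01 into one, so the outer SUM only sees that 100 once.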

You can use SUM() and COUNT(DISTINCT x).
For example:
select
year,
sum(money) as money,
(select count(distinct id) from t) as total_people
from t
where year = 2020
group by year;
Result:
YEAR MONEY TOTAL_PEOPLE
----- ------ ------------
2020 240 4

Not the most performant, but if you wish to avoid a derived table, you can do
select distinct
max(year) over (),
sum(money) over (),
count(*) over ()
from t
group by year, money, id;
And if you want this grouped by year, you can define the partitions in the OVER clause.

How to avoid transition between column-organized data processing and row-organized data processing

I'm working on DB2 BLU with column-organized tables.
My dataset is the following:
Day | month | year | value
20200101 | 202001 | 2020 | 100
20200102 | 202001 | 2020 | 110
...
20200215 | 202002 | 2020 | 120
I want to aggregate by week, month and year for this result :
Id | value
2020 | 12000
202001 | 4000 (January)
202002 | 4000 (February)
2020001 | 700 (first week of 2020)
In order to do this, I also have the table d_tps
Type Id week month year
J 20200101 2020001 202001 2020
J 20200102 2020001 202001 2020
...
J 20200215 2020007 202002 2020
M 202001 null 202001 2020
M 202002 null 202002 2020
Y 2020 null null 2020
My approach is the following
select d.id, sum(value) from tab1
Inner join d_tps d
On d.id = tab1.year
Or d.id = tab1.month
Or d.id = tab1.year
group by d.id
It works and returns the expected result. Unfortunately, in the query plan, the join with the OR condition causes the CTQ operator to appear early, so most of the query (which is in reality more complex) is processed as rows instead of columns.
How can I optimize it?

It looks like one join condition is sufficient along with aggregation:
select d.week, sum(value)
from tab1 Inner join
d_tps d
On d.id = tab1.day
group by d.week
If you want to aggregate by multiple time levels, then use grouping sets:
select d.week, d.month, d.year, sum(value)
from tab1 Inner join
d_tps d
On d.id = tab1.day
group by grouping sets ((d.week), (d.month), (d.year))

You should use GROUP BY GROUPING SETS & GROUPING function to achieve what you want.
WITH T (day, month, year, value) AS
(
values
(20200101, 202001, 2020, 100)
, (20200102, 202001, 2020, 110)
, (20200215, 202002, 2020, 120)
)
SELECT
CASE
WHEN GROUPING(DAY) = 0 THEN DAY
WHEN GROUPING(MONTH) = 0 THEN MONTH
WHEN GROUPING(YEAR ) = 0 THEN YEAR
END AS ID
, SUM(VALUE) AS VALUE
FROM T
GROUP BY GROUPING SETS (DAY, MONTH, YEAR);
The result is:
|ID |VALUE |
|-----------|-----------|
|2020 |330 |
|202001 |210 |
|202002 |120 |
|20200101 |100 |
|20200102 |110 |
|20200215 |120 |
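To make it concrete what the grouping sets produce, the same result can be reproduced as three separate aggregations glued together with UNION ALL. The sketch below does that in SQLite through Python, only because SQLite has no GROUPING SETS; it is an illustration of the result, not a recommendation for Db2, where the single GROUPING SETS query above is the intended form. The table name `tab1` is taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab1 (day INTEGER, month INTEGER, year INTEGER, value INTEGER);
INSERT INTO tab1 VALUES
  (20200101, 202001, 2020, 100),
  (20200102, 202001, 2020, 110),
  (20200215, 202002, 2020, 120);
""")

# One aggregation per time level, stacked into a single (id, value) result.
query = """
SELECT day   AS id, SUM(value) AS value FROM tab1 GROUP BY day
UNION ALL
SELECT month AS id, SUM(value) AS value FROM tab1 GROUP BY month
UNION ALL
SELECT year  AS id, SUM(value) AS value FROM tab1 GROUP BY year
ORDER BY 1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each branch corresponds to one grouping set, and the CASE on GROUPING(...) in the answer above plays the role of the shared `id` column here.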

SQL | Aggregation

My objective is to get a table such as this:
Request:
Wards where the annual cost of drugs prescribed exceeds 25
ward_no | year | cost
w2 | 2007 | 34
w4 | 2007 | 160
w5 | 2006 | 26
w5 | 2007 | 33
I would input a picture but lack reputation points.
Here is what I have done so far:
select w.ward_no,
year(pn.start_date) as year,
pn.quantity * d.price as cost
from ward w,
patient pt,
prescription pn,
drug d
where w.ward_no = pt.ward_no
and
pn.drug_code = d.drug_code
and
pn.patient_id = pt.patient_id
group by w.ward_no,
year,
cost
having cost > 25
order by w.ward_no, year
My current output is such:
ward_no|year|cost
'w2' |2006|0.28
'w2' |2007|3.20
'w2' |2007|9.50
'w2' |2007|21.60
'w3' |2006|10.08
'w3' |2007|4.80
'w4' |2006|4.41
'w4' |2007|101.00
'w4' |2007|58.80
'w5' |2006|25.20
'w5' |2006|0.56
'w5' |2007|20.16
'w5' |2007|12.60
How would I reduce each ward_no to have only a single row of year (say 2006 or 2007) instead of having x of them?
Any help would be very much appreciated I am really stuck and have no clue what to do.

You need to group by ward and year, and sum up the costs:
select w.ward_no,
year(pn.start_date) as year,
sum(pn.quantity * d.price) as cost
from ward w
inner join patient pt
on pt.ward_no = w.ward_no
inner join prescription pn
on pn.patient_id = pt.patient_id
inner join drug d
on d.drug_code = pn.drug_code
group by w.ward_no,
year(pn.start_date)
having sum(pn.quantity * d.price) > 25
order by w.ward_no, year
What flavor of SQL is this supposed to be for?
Share and enjoy.

You are grouping by cost, so each distinct cost gets its own row. Remove cost from your grouping and sum it instead:
select w.ward_no,
year(pn.start_date) as year,
sum(pn.quantity * d.price) as cost
from ward w,
patient pt,
prescription pn,
drug d
where w.ward_no = pt.ward_no
and
pn.drug_code = d.drug_code
and
pn.patient_id = pt.patient_id
group by w.ward_no,
year
having sum(pn.quantity * d.price) > 25
order by w.ward_no, year
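As a check that summing per ward and year gives the table the question asks for, the line-level costs from the question's current output can be loaded into a scratch table and aggregated. This is a sketch in SQLite through Python; the table name `line_costs` is an assumption, standing in for the joined ward, patient, prescription and drug tables, which are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Rows copied from the "current output" in the question (one per prescription).
conn.executescript("""
CREATE TABLE line_costs (ward_no TEXT, year INTEGER, cost REAL);
INSERT INTO line_costs VALUES
  ('w2', 2006, 0.28), ('w2', 2007, 3.20), ('w2', 2007, 9.50), ('w2', 2007, 21.60),
  ('w3', 2006, 10.08), ('w3', 2007, 4.80),
  ('w4', 2006, 4.41), ('w4', 2007, 101.00), ('w4', 2007, 58.80),
  ('w5', 2006, 25.20), ('w5', 2006, 0.56), ('w5', 2007, 20.16), ('w5', 2007, 12.60);
""")

rows = conn.execute("""
    SELECT ward_no, year, ROUND(SUM(cost), 2) AS cost
    FROM line_costs
    GROUP BY ward_no, year        -- one row per ward and year
    HAVING SUM(cost) > 25         -- filter on the annual total, not per line
    ORDER BY ward_no, year
""").fetchall()
for row in rows:
    print(row)
```

Rounded to whole numbers, the four remaining rows match the desired output in the question (34, 160, 26 and 33).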
