Using CTE to create pivot table - sql

I've a table:
Task:
Create a pivot table using CTE.
Count the orders placed for each month for several years: from 2011 to 2013. The final table should include four fields: invoice_month, year_2011, year_2012, year_2013. The month field must store the month as a number between 1 and 12.
If no orders were placed in a given month, that month's number must still appear in the table.
I was able to solve this task with this query:
WITH year11
AS (
SELECT EXTRACT(MONTH FROM invoice.invoice_date::TIMESTAMP) AS invoice_month
,COUNT(*) AS orders
FROM invoice
WHERE EXTRACT(YEAR FROM invoice.invoice_date::TIMESTAMP) = 2011
GROUP BY invoice_month
)
,year12
AS (
SELECT EXTRACT(MONTH FROM invoice.invoice_date::TIMESTAMP) AS invoice_month
,COUNT(*) AS orders
FROM invoice
WHERE EXTRACT(YEAR FROM invoice.invoice_date::TIMESTAMP) = 2012
GROUP BY invoice_month
)
,year13
AS (
SELECT EXTRACT(MONTH FROM invoice.invoice_date::TIMESTAMP) AS invoice_month
,COUNT(*) AS orders
FROM invoice
WHERE EXTRACT(YEAR FROM invoice.invoice_date::TIMESTAMP) = 2013
GROUP BY invoice_month
)
SELECT year11.invoice_month
,year11.orders AS year_2011
,year12.orders AS year_2012
,year13.orders AS year_2013
FROM year11
INNER JOIN year12 ON year11.invoice_month = year12.invoice_month
INNER JOIN year13 ON year11.invoice_month = year13.invoice_month
But this query looks too long (or does it?).
What can I improve in my use of CTEs, if anything?
Are there other tools that solve this task more quickly and cleanly?

I find using filtered aggregation a lot easier to generate pivot tables:
SELECT extract(month from inv.invoice_date) AS invoice_month,
COUNT(*) filter (where extract(year from inv.invoice_date) = 2011) AS orders_2011,
COUNT(*) filter (where extract(year from inv.invoice_date) = 2012) AS orders_2012,
COUNT(*) filter (where extract(year from inv.invoice_date) = 2013) AS orders_2013
FROM invoice inv
WHERE inv.invoice_date >= date '2011-01-01'
AND inv.invoice_date < date '2014-01-01'
GROUP BY invoice_month
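The task also asks for months without any orders to appear in the result, which a plain GROUP BY over invoice cannot guarantee. A minimal sketch of one way to cover that, assuming SQLite through Python's sqlite3 module as a stand-in engine: a recursive CTE generates the months 1 to 12 and the invoices are left-joined onto it. The function name monthly_pivot and the sample dates are hypothetical, strftime replaces EXTRACT, and SUM(CASE ...) replaces FILTER. In PostgreSQL, generate_series(1, 12) could play the role of the recursive CTE.

```python
import sqlite3

def monthly_pivot(invoice_dates):
    """Return 12 rows of (invoice_month, year_2011, year_2012, year_2013)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE invoice (invoice_date TEXT)")
    conn.executemany("INSERT INTO invoice VALUES (?)",
                     [(d,) for d in invoice_dates])
    rows = conn.execute("""
        WITH RECURSIVE months(invoice_month) AS (
            SELECT 1
            UNION ALL
            SELECT invoice_month + 1 FROM months WHERE invoice_month < 12
        )
        SELECT m.invoice_month,
               SUM(CASE WHEN strftime('%Y', i.invoice_date) = '2011'
                        THEN 1 ELSE 0 END) AS year_2011,
               SUM(CASE WHEN strftime('%Y', i.invoice_date) = '2012'
                        THEN 1 ELSE 0 END) AS year_2012,
               SUM(CASE WHEN strftime('%Y', i.invoice_date) = '2013'
                        THEN 1 ELSE 0 END) AS year_2013
        FROM months AS m
        -- LEFT JOIN keeps a month even when no invoice matches it
        LEFT JOIN invoice AS i
               ON CAST(strftime('%m', i.invoice_date) AS INTEGER) = m.invoice_month
              AND i.invoice_date >= '2011-01-01'
              AND i.invoice_date <  '2014-01-01'
        GROUP BY m.invoice_month
        ORDER BY m.invoice_month
    """).fetchall()
    conn.close()
    return rows

# Hypothetical sample data: ISO-formatted dates, one per order.
sample_dates = ["2011-01-15", "2011-01-20", "2012-01-03",
                "2013-03-09", "2010-05-01"]
for row in monthly_pivot(sample_dates):
    print(row)
```

The date range filter sits in the join condition rather than in WHERE, so that rows for empty months are not filtered away after the outer join.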

Related

Selecting month in each year with maximum number of projects

I have the following scenario
img
For each year I would like to display the month with the highest number of projects that have ended
I have tried the following so far:
SELECT COUNT(proj.projno) nr_proj, extract(month from proj.end_date) month
, extract(year from proj.end_date) year
FROM PROJ
GROUP BY extract(month from proj.end_date)
,extract(year from proj.end_date)
I am getting the number of projects per month, per year.
Could anyone give me hints on how to select, for each year, only the records with the highest count of projects?
You can use the max analytic function to get the maximum nr_proj value per year (via the partition by clause), then keep only the rows where nr_proj = mx.
select t.nr_proj, t.month, t.year
from (
SELECT COUNT(proj.projno) nr_proj
, extract(month from proj.end_date) month
, extract(year from proj.end_date) year
, max( COUNT(proj.projno) ) over(partition by extract(year from proj.end_date)) mx
FROM PROJ
GROUP BY extract(month from proj.end_date), extract(year from proj.end_date)
) t
where nr_proj = mx
;
I think the following will give you what you are after (if I understood the requirements). It first counts the projects for each month, then ranks the months within each year, and finally selects the top-ranked month.
select dt "Most Projects Month", cnt "Monthly Projects"
from ( -- Rank month Count by Year
select to_char( dt, 'yyyy-mm') dt
, cnt
, rank() over (partition by extract(year from dt)
order by cnt desc) rnk
from (-- count the projects that ended in each month
select trunc(end_date,'mon') dt, count(*) cnt
from proj
group by trunc(end_date,'mon')
)
)
where rnk = 1
order by dt;
NOTE: Not tested, no data supplied. In future do not post images, see Why No Images.
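Since neither answer came with data, here is a small runnable sketch of the ranking idea, assuming SQLite (3.25 or later, for window functions) through Python's sqlite3 module rather than Oracle. The function name busiest_months and the sample end dates are hypothetical, and strftime stands in for extract and trunc. Because RANK is used, months that tie for the highest count within a year are all returned.

```python
import sqlite3

def busiest_months(end_dates):
    """Return (year, month, nr_proj) for the top month(s) of each year."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE proj (projno INTEGER PRIMARY KEY, end_date TEXT)")
    conn.executemany("INSERT INTO proj (end_date) VALUES (?)",
                     [(d,) for d in end_dates])
    rows = conn.execute("""
        WITH monthly AS (          -- projects ended per year and month
            SELECT CAST(strftime('%Y', end_date) AS INTEGER) AS yr,
                   CAST(strftime('%m', end_date) AS INTEGER) AS mon,
                   COUNT(*) AS nr_proj
            FROM proj
            GROUP BY yr, mon
        ),
        ranked AS (                -- rank months inside each year
            SELECT yr, mon, nr_proj,
                   RANK() OVER (PARTITION BY yr
                                ORDER BY nr_proj DESC) AS rnk
            FROM monthly
        )
        SELECT yr, mon, nr_proj
        FROM ranked
        WHERE rnk = 1
        ORDER BY yr, mon
    """).fetchall()
    conn.close()
    return rows

# Hypothetical sample: 2019 has a clear winner, 2020 has a tie.
sample_end_dates = ["2019-03-01", "2019-03-15", "2019-07-02",
                    "2020-01-10", "2020-02-11"]
print(busiest_months(sample_end_dates))
```

If exactly one row per year is wanted even when months tie, ROW_NUMBER with an extra ORDER BY key could replace RANK.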

Why do two SQL queries that should return the same result behave differently?

I need to get the number of new buyers who came in 1990.
The first query says it's 17, but the second says it's 29? So which one is wrong and why?
SELECT DISTINCT COUNT(customer_id) FROM SALES_ORDER WHERE
EXTRACT(YEAR FROM order_date) = 1990
AND
customer_id NOT IN (SELECT customer_id FROM SALES_ORDER WHERE EXTRACT(YEAR FROM order_date) < 1990);
SELECT DISTINCT COUNT(customer_id) FROM SALES_ORDER WHERE
customer_id IN (SELECT customer_id FROM SALES_ORDER WHERE EXTRACT(YEAR FROM order_date) = 1990)
AND
customer_id NOT IN (SELECT customer_id FROM SALES_ORDER WHERE EXTRACT(YEAR FROM order_date) < 1990);
Here is my data schema:
Neither query does what you want. You are using SELECT DISTINCT COUNT(customer_id), which counts every matching order row and only then applies DISTINCT to that single count, while you probably want SELECT COUNT(DISTINCT customer_id).
I find the logic simpler to express with two levels of aggregation:
select count(*)
from (
select customer_id
from sales_order
group by customer_id
having extract(year from min(order_date)) = 1990
) t
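To see why the two original queries can disagree, it may help to run them on a tiny data set. The sketch below assumes SQLite through Python's sqlite3 module, with strftime in place of EXTRACT; the function name new_buyer_counts and the six sample orders are hypothetical. The first query counts the 1990 order rows of new customers, the second counts all of their order rows from any year, and only the COUNT(DISTINCT ...) and the aggregation forms count customers.

```python
import sqlite3

def new_buyer_counts(orders):
    """Run the four query variants and return their counts in a dict."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_order (customer_id INTEGER, order_date TEXT)")
    conn.executemany("INSERT INTO sales_order VALUES (?, ?)", orders)
    old = ("SELECT customer_id FROM sales_order "
           "WHERE strftime('%Y', order_date) < '1990'")
    queries = {
        # counts 1990 order rows of customers with no earlier order
        "first": f"""SELECT DISTINCT COUNT(customer_id) FROM sales_order
                     WHERE strftime('%Y', order_date) = '1990'
                       AND customer_id NOT IN ({old})""",
        # counts ALL order rows (any year) of those same customers
        "second": f"""SELECT DISTINCT COUNT(customer_id) FROM sales_order
                      WHERE customer_id IN (SELECT customer_id FROM sales_order
                                            WHERE strftime('%Y', order_date) = '1990')
                        AND customer_id NOT IN ({old})""",
        # counts customers, not rows
        "count_distinct": f"""SELECT COUNT(DISTINCT customer_id) FROM sales_order
                              WHERE strftime('%Y', order_date) = '1990'
                                AND customer_id NOT IN ({old})""",
        # two levels of aggregation: first order year per customer
        "two_level": """SELECT COUNT(*) FROM (
                            SELECT customer_id FROM sales_order
                            GROUP BY customer_id
                            HAVING strftime('%Y', MIN(order_date)) = '1990'
                        ) AS t""",
    }
    counts = {name: conn.execute(sql).fetchone()[0]
              for name, sql in queries.items()}
    conn.close()
    return counts

# Hypothetical sample: customer 1 is an old buyer, 2 and 3 are new in 1990.
sample_orders = [(1, "1989-06-01"), (1, "1990-02-01"),
                 (2, "1990-03-01"), (2, "1990-04-01"), (2, "1991-01-01"),
                 (3, "1990-05-01")]
print(new_buyer_counts(sample_orders))
```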

T-SQL query to summarize total per month per year, and cumulative amounts to date

I have a database table that captures every Sales Transaction:
Transactions
(
ID INT,
TransactionDate DATETIME,
SalesAmount MONEY
)
I want to write a T-SQL query that returns a report (snapshot sample below). The first column lists the month, the next column the total sales for that month within the year, and the last column the cumulative amount for that year up to that month. The report is only for 2018.
Any thoughts or solutions? Thank you.
Try this:
;with cte as
(
Select
YEAR(TransactionDate) as [Year],
MONTH(TransactionDate) as [Month],
SUM (SalesAmount) as [MonthlySales],
DATEPART(m, TransactionDate) as [MonthNumber]
from Transactions
group by YEAR(TransactionDate), MONTH(TransactionDate)
)
select
a.[Month], a.MonthlySales as [MonthlySales 2018], SUM(b.MonthlySales) as [Cumulative 2018]
from cte a inner join cte b on a.MonthNumber >= b.MonthNumber
WHERE (a.[Year]) = 2018 AND (b.[Year]) = 2018
group by a.[Month], a.MonthlySales
ORDER by a.[Month]
Try this one:
With Q
as
(
Select DatePart(yyyy,TransactionDate) 'Year',DatePart(m,TransactionDate) 'Month', sum(SalesAmount) 'Sales'
From Transactions
Group by DatePart(yyyy,TransactionDate),DatePart(m,TransactionDate)
)
Select q.Year,q.Month,( Select sum(q1.Sales)
From Q q1
Where q1.Year=q.Year
And q1.Month <= q.Month
) 'Cumulative Sale'
From Q q
Order by q.Year,q.Month
You would use aggregation and window functions:
select datename(month, TransactionDate) as mon,
sum(SalesAmount) as monthly_sales,
sum(sum(SalesAmount)) over (order by min(TransactionDate)) as running_amount
from Transactions t
where t.TransactionDate >= '2018-01-01' and
t.TransactionDate < '2019-01-01'
group by datename(month, TransactionDate)
order by min(TransactionDate);
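The window-function approach can be tried out without a SQL Server instance. The following is a sketch only, assuming SQLite (3.25 or later) through Python's sqlite3 module instead of T-SQL; the function name monthly_running_total, the snake_case column names, and the sample rows are hypothetical. The monthly sums are computed in a CTE first, and the running total is then a windowed SUM over those monthly rows.

```python
import sqlite3

def monthly_running_total(transactions, year):
    """Return (month_no, monthly_sales, cumulative_sales) rows for one year."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE transactions (
                        id INTEGER PRIMARY KEY,
                        transaction_date TEXT,
                        sales_amount INTEGER)""")
    conn.executemany(
        "INSERT INTO transactions (transaction_date, sales_amount) VALUES (?, ?)",
        transactions)
    rows = conn.execute("""
        WITH monthly AS (          -- one row per month of the chosen year
            SELECT CAST(strftime('%m', transaction_date) AS INTEGER) AS month_no,
                   SUM(sales_amount) AS monthly_sales
            FROM transactions
            WHERE transaction_date >= ? AND transaction_date < ?
            GROUP BY month_no
        )
        SELECT month_no,
               monthly_sales,
               -- running total over the months seen so far
               SUM(monthly_sales) OVER (ORDER BY month_no) AS cumulative_sales
        FROM monthly
        ORDER BY month_no
    """, (f"{year}-01-01", f"{year + 1}-01-01")).fetchall()
    conn.close()
    return rows

# Hypothetical sample: the 2017 row must not leak into the 2018 report.
sample_transactions = [("2018-01-05", 100), ("2018-01-20", 50),
                       ("2018-03-02", 200), ("2017-12-31", 999)]
print(monthly_running_total(sample_transactions, 2018))
```

Months without any transaction (February here) simply do not appear; a calendar of months would have to be joined in if they should show up with zero sales.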

SQL Query to group ID overlap (via inner join) by month

I'm trying to find a query that will give me the number of customers that have transacted with 2 different entities in the same month. In other words, customer_ids that transacted with company_a and company_b within the same month. Here is what I have so far:
SELECT Extract(year FROM company_a_customers.transaction_date)
|| Extract(month FROM company_a_customers.transaction_date) AS
payment_month,
Count(UNIQUE(company_a_customers.customer_id))
FROM (SELECT *
FROM my_table
WHERE ( merchant_name LIKE '%company_a%' )) AS company_a_customers
INNER JOIN (SELECT *
FROM my_table
WHERE ( merchant_name = 'company_b' )) AS
company_b_customers
ON company_a_customers.customer_id =
company_b_customers.customer_id
GROUP BY Extract(year FROM company_a_customers.transaction_date)
|| Extract(month FROM company_a_customers.transaction_date)
The problem is that this is giving me a running total of all customers that transacted with company A on a month-by-month basis who also ever transacted with company B.
If I whittle it down to a specific month, it will obviously give me the correct overlap, because the query is only getting IDs for that month:
SELECT Extract(year FROM company_a_customers.transaction_date)
|| Extract(month FROM company_a_customers.transaction_date) AS
payment_month,
Count(UNIQUE(company_a_customers.customer_id))
FROM (SELECT *
FROM my_table
WHERE ( merchant_name LIKE '%company_a%' )
AND transaction_date >= '2017-06-01'
AND transaction_date <= '2017-06-30') AS company_a_customers
INNER JOIN (SELECT *
FROM my_table
WHERE ( merchant_name = 'company_b' )
AND transaction_date >= '2017-06-01'
AND transaction_date <= '2017-06-30') AS
company_b_customers
ON company_a_customers.customer_id =
company_b_customers.customer_id
GROUP BY Extract(year FROM company_a_customers.transaction_date)
|| Extract(month FROM company_a_customers.transaction_date)
How can I do this in one query to get monthly totals for customers who transacted with both companies within the given month?
Desired result: Output of second query, but for every month that is in the database. In other words:
January 2017: xx,xxx overlapping customers
February 2017: xx,xxx overlapping customers
March 2017: xx,xxx overlapping customers
Thanks very much.
You could simply calculate the year/month for both sides and add it as a join condition, but this is not very efficient, as it might create a huge intermediate result.
It is better to check, for each month and customer, whether there were transactions with both merchants using conditional aggregation, and then count by month:
SELECT payment_month, count(*)
FROM
( SELECT Extract(year FROM transaction_date)
|| Extract(month FROM transaction_date) AS payment_month,
customer_id
FROM my_table
WHERE ( merchant_name LIKE '%company_a%' )
OR ( merchant_name = 'company_b' )
GROUP BY payment_month,
customer_id
-- both merchants within the same months
HAVING SUM(CASE WHEN merchant_name LIKE '%company_a%' THEN 1 ELSE 0 END) > 0
AND SUM(CASE WHEN merchant_name = 'company_b' THEN 1 ELSE 0 END) > 0
) AS dt
GROUP BY 1
Your payment_month calculation is too complicated (and the returned string is not nicely formatted).
To get year/month as string:
TO_CHAR(transaction_date, 'YYYYMM')
as number:
EXTRACT(YEAR FROM transaction_date) * 100
+ EXTRACT(MONTH FROM transaction_date)
or calculate the first of month:
TRUNC(transaction_date, 'mon')
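The conditional aggregation described in this answer can be checked on a few rows. This is a minimal sketch, assuming SQLite through Python's sqlite3 module as a stand-in for the original database; the function name overlap_by_month and the sample transactions are hypothetical, and strftime('%Y%m', ...) plays the role of TO_CHAR(transaction_date, 'YYYYMM').

```python
import sqlite3

def overlap_by_month(transactions):
    """Return (payment_month, customers who used both merchants that month)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE my_table (
                        customer_id INTEGER,
                        merchant_name TEXT,
                        transaction_date TEXT)""")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", transactions)
    rows = conn.execute("""
        SELECT payment_month, COUNT(*) AS overlapping_customers
        FROM (
            SELECT strftime('%Y%m', transaction_date) AS payment_month,
                   customer_id
            FROM my_table
            WHERE merchant_name LIKE '%company_a%'
               OR merchant_name = 'company_b'
            GROUP BY payment_month, customer_id
            -- keep a customer/month only if both merchants occur in it
            HAVING SUM(CASE WHEN merchant_name LIKE '%company_a%'
                            THEN 1 ELSE 0 END) > 0
               AND SUM(CASE WHEN merchant_name = 'company_b'
                            THEN 1 ELSE 0 END) > 0
        ) AS dt
        GROUP BY payment_month
        ORDER BY payment_month
    """).fetchall()
    conn.close()
    return rows

# Hypothetical sample: customer 2 uses the two merchants in different months.
sample_transactions = [
    (1, "company_a_store", "2017-06-03"), (1, "company_b", "2017-06-20"),
    (2, "company_a_store", "2017-06-05"), (2, "company_b", "2017-07-01"),
    (3, "company_a_online", "2017-07-02"), (3, "company_b", "2017-07-15"),
    (3, "company_a_store", "2017-07-20"),
]
print(overlap_by_month(sample_transactions))
```

Customer 3 has two different names matching '%company_a%' in July but is still counted once, because the inner query groups by customer and month before the outer count.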
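You should be able to get your desired results in one query just by counting the number of distinct merchant_names per month per customer_id. Using HAVING COUNT(DISTINCT merchant_name) > 1 will show you only customers with transactions with both merchants (or with more than two merchant names, if several names match LIKE '%company_a%'). Note that a customer with two different names matching '%company_a%' and no company_b transaction would also pass this filter.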
SELECT
EXTRACT(Year from transaction_date)||EXTRACT(Month from transaction_date) as payment_month
,customer_id
,COUNT(DISTINCT merchant_name) as CompanyCount
FROM my_table
WHERE transaction_date >= '2017-06-01' AND transaction_date <= '2017-06-30'
AND (merchant_name = 'company_b' or merchant_name LIKE '%company_a%')
GROUP BY
EXTRACT(Year from transaction_date)||EXTRACT(Month from transaction_date)
,customer_id
HAVING COUNT(DISTINCT merchant_name) > 1

How can I return a row for each group even if there were no results?

I'm working with a database containing customer orders. These orders contain the customer id, order month, order year, order half month (either first half 'FH' or last half 'LH' of the month), and quantity ordered.
I want to query monthly totals for each customer for a given month. Here's what I have so far.
SELECT id, half_month, month, year, SUM(nbr_ord)
FROM Orders
WHERE month = 7
AND year = 2015
GROUP BY id, half_month, year, month
The problem with this is that if a customer did not order anything during one half_month there will not be a row returned for that period.
I want there to be a row for each customer for every half month. If they didn't order anything during a half month then a row should be returned with their id, the month, year, half month, and 0 for number ordered.
First, generate all the rows, which you can do with a cross join of the customers and the time periods. Then, bring in the information for the aggregation:
select i.id, t.half_month, t.month, t.year, coalesce(sum(nbr_ord), 0)
from (select distinct id from orders) i cross join
(select distinct half_month, month, year
from orders
where month = 7 and year = 2015
) t left join
orders o
on o.id = i.id and o.half_month = t.half_month and
o.month = t.month and o.year = t.year
group by i.id, t.half_month, t.month, t.year;
Note: you might have other sources for the id and date parts. This pulls them from orders.
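A runnable sketch of the cross join plus left join idea follows, assuming SQLite through Python's sqlite3 module. It is a variant rather than the exact query above: the two half-month labels 'FH' and 'LH' are listed explicitly instead of being read from orders, so a row appears even if nobody ordered in one of the halves. The function name half_month_totals and the sample rows are hypothetical.

```python
import sqlite3

def half_month_totals(orders, month, year):
    """Return (id, half_month, total) for every customer and both halves."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE orders (
                        id INTEGER, half_month TEXT,
                        month INTEGER, year INTEGER, nbr_ord INTEGER)""")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", orders)
    rows = conn.execute("""
        SELECT i.id, t.half_month, COALESCE(SUM(o.nbr_ord), 0) AS total
        FROM (SELECT DISTINCT id FROM orders) AS i      -- every customer
        CROSS JOIN (SELECT 'FH' AS half_month
                    UNION ALL SELECT 'LH') AS t         -- every period
        LEFT JOIN orders AS o                           -- matching orders, if any
               ON o.id = i.id
              AND o.half_month = t.half_month
              AND o.month = ?
              AND o.year = ?
        GROUP BY i.id, t.half_month
        ORDER BY i.id, t.half_month
    """, (month, year)).fetchall()
    conn.close()
    return rows

# Hypothetical sample: customer 3 only ordered in June, so July shows zeros.
sample_orders = [(1, "FH", 7, 2015, 3), (1, "FH", 7, 2015, 2),
                 (1, "LH", 7, 2015, 4), (2, "LH", 7, 2015, 1),
                 (3, "FH", 6, 2015, 9)]
for row in half_month_totals(sample_orders, 7, 2015):
    print(row)
```

The month and year conditions belong in the ON clause of the left join; moving them to WHERE would discard the all-NULL rows that represent periods without orders.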
If you know the entire dataset has an occurrence of each half_month, month, year combination, you could use the list of those three values as the left side of a left join. That would look like this:
Select t1.half_month, t1.month, t1.year, t2.ID, t2.nbr_ord from
(Select distinct half_month, month, year from Orders)t1
Left Join
(SELECT id, half_month, month, year, SUM(nbr_ord)nbr_ord
FROM Orders
WHERE month = 7
AND year = 2015
GROUP BY id, half_month, year, month)t2
on t1.half_month = t2.half_month
and t1.month = t2.month
and t1.year = t2.year
SELECT m.id, m.half_month, m.year, t.nbr_order
FROM (
SELECT Id, sum(nbr_ord) AS nbr_order
FROM Orders
GROUP BY id
) t
INNER JOIN Orders m
ON t.Id = m.id
WHERE m.month = 7
AND m.year = 2015;