Challenging PostgreSQL SELECT Statement for Northwind Database (NEED HELP - BEGINNER) - sql

I began learning SQL about a month ago and my dad has been giving me practice queries to run
with the Northwind database to help practice my DML. This most recent one he gave me was as follows:
-- Return Month, Product name, SalesForMonth for highest selling product in each
-- month in 1997. (see issues with creating said query below)
(I am currently using PostgreSQL on pgAdmin4)
I was able to come up with the following query that returns the required columns with the correct information for ONLY a single month:
SELECT CAST( EXTRACT( MONTH FROM o.orderdate) AS integer) AS Month, p.productname,
ROUND(CAST(SUM(od.quantity * od.unitprice) AS numeric), 2) SalesForMonth
FROM order_details od
INNER JOIN orders o ON od.orderid = o.orderid
INNER JOIN products p ON od.productid = p.productid
WHERE EXTRACT( YEAR FROM o.orderdate) = 1997 AND EXTRACT( MONTH FROM o.orderdate) = 1
GROUP BY Month, p.productname
ORDER BY salesformonth DESC
LIMIT 1
By making 12 of these queries and changing the extract-month bit in the WHERE statement from 1-12 and UNIONing them all together, I can produce the desired result but I wondered if there was an easier way that I was missing to display the same result but only using 1 query. Interested to see what y'all can come up with.
SIDE NOTE: My initial thought is that it has something to do with subqueries because what you're effectively trying to do is display the MAX(SUM(values)) but can't actually do that since you can't nest aggregate function.

Your query gives you the top selling product for a given month, and you want the same logic for multipe months at once.
Starting from your existing and working query, a simple approach is to use WITH TIES:
SELECT DATE_TRUNC('month', o.orderdate) DateMonth,
p.productname,
ROUND(CAST(SUM(od.quantity * od.unitprice) AS numeric), 2) SalesForMonth
FROM order_details od
INNER JOIN orders o ON od.orderid = o.orderid
INNER JOIN products p ON od.productid = p.productid
WHERE o.orderdate >= DATE '1997-01-01' AND o.orderdate < DATE '1998-01-01'
GROUP BY DateMonth, p.productname
ORDER BY RANK() OVER(
PARTITION BY DATE_TRUNC('month', o.orderdate)
ORDER BY SUM(od.quantity * od.unitprice) DESC
)
FETCH FIRST ROW WITH TIES
We can also use DISTINCT ON:
SELECT DISTINCT ON (DateMonth)
DATE_TRUNC('month', o.orderdate) DateMonth,
p.productname,
ROUND(CAST(SUM(od.quantity * od.unitprice) AS numeric), 2) SalesForMonth
FROM order_details od
INNER JOIN orders o ON od.orderid = o.orderid
INNER JOIN products p ON od.productid = p.productid
WHERE o.orderdate >= DATE '1997-01-01' AND o.orderdate < DATE '1998-01-01'
GROUP BY DateMonth, p.productname
ORDER BY DateMonth, SalesForMonth DESC

Related

SQL -percent calculation

Make a report on the sales in 2015 of the products by categories (total value and
quantity sold). Also determine what% of the value of sales
for a given category represent the sales of each of the products in the category.
My query so far:
WITH sales AS
(SELECT t1.category_name
, t2.product_name
, (t3.unit_price*t3.quantity) Total_sales
, EXTRACT (YEAR FROM order_date) Year
FROM categories t1
INNER JOIN
products t2
ON t2.category_id=t1.category_id
INNER JOIN
order_details t3
ON t3.product_id=t2.product_id
INNER JOIN
orders t4
ON t4.order_id=t3.order_id
WHERE EXTRACT (YEAR FROM order_date) = '2015'
GROUP BY t1.category_name
, t2.product_name
, (t3.unit_price*t3.quantity)
, EXTRACT (YEAR FROM order_date)
ORDER BY 1
)
SELECT s.category_name
, s.product_name
, SUM (Total_sales)
FROM sales s
GROUP BY s.category_name
, s.product_name
ORDER BY 1
How to calculate %? Thank you
I think that you want window functions - if your database, that you did not specify, supports them:
SELECT
c.category_name
p.product_name
SUM(od.unit_price * od.quantity) as total_sales
1.0 * SUM(od.unit_price * od.quantity)
/ SUM(SUM(od.unit_price * od.quantity)) OVER(PARTITION BY c.category_id)
as category_sales_ratio
FROM categories c
INNER JOIN products t2 p ON p.category_id = c.ategory_id
INNER JOIN order_details od ON od.product_id = p.product_id
INNER JOIN orders o ON o.order_id = od.order_id
WHERE o.order_date >= '2015-01-01' AND o.order_date < '2016-01-01'
GROUP BY c.category_id, c.ategory_name, p.product_id, p.product_name
ORDER BY c.category_name, p.product_name
The window sum computes the total sales for the whole category, that you can divide the sales of the current product with.
Note that I changed your query in serveral ways:
meaningful table aliases make the query easier to write, read and maintain
filtering dates without transformation is much more efficient that using date functions
there is no need for a subquery
it is always a good idea to put the relevant primary keys in the GROUP BY clause (in case two different products or categories have the same name) - on the other hand, you also had additiona uneeded columns in that clause

joining tables , data between dates

I am practicing with northwind database: I am quite new to sql.
Question I am trying to solve is :
Q. Total Sales for each customer in October 1996 (based on OrderDate). Show the result in CustomerID, CompanyName, and [total sales], sorted in [total sales] in Decending order.
I have used this code but doesn't seems to be correct , please advise.
select c.customerid , c.companyname , o.orderdate , sum(od.unitprice *od.Quantity*1-od.Discount) as totalsales
from customers as c , orders as o , [Order Details] as od
where o.customerid = c.CustomerID
and o.OrderID = od.OrderID
and o.OrderDate >= '1996/10/01' and o.orderdate <= '1996/10/31'
group by c.customerid , c.companyname, o.orderdate
order by totalsales desc
;
*******************************
I suspect is it just the method of calculation inside the sum function
SELECT
c.customerid
, c.companyname
, o.orderdate
, SUM((od.unitprice * od.Quantity) * (1 - od.Discount)) AS totalsales
FROM customers AS c
INNER JOIN orders AS o ON o.customerid = c.CustomerID
INNER JOIN [Order Details] AS od ON o.OrderID = od.OrderID
WHERE o.OrderDate >= '1996-10-01'
AND o.orderdate < '1996-11-01' -- move up one day, use less than
GROUP BY
c.customerid
, c.companyname
, o.orderdate
ORDER BY
totalsales DESC
;
(od.unitprice * od.Quantity) provides total discounted price, then
the discount rate is (1 - od.Discount)
multiply those (od.unitprice * od.Quantity) * (1 - od.Discount) for total discounted price
Please note I have changed the syntax of the joins! PLEASE learn this more modern syntax. Don't use commas between table names in the from clause, then conditions such as AND o.customerid = c.CustomerID move to after ON instead of within the where clause..
Also, the most reliable date literals in SQL Server are yyyymmdd and the second best is yyyy-mm-dd. It's good to see you using year first, but I would suggest using dashes not slashes, or (even better) no delimiter. e.g.
WHERE o.OrderDate >= '19961001'
AND o.orderdate < '19961101'
Also note that I have removed the <= and replaced it with < and moved that higher date to the first of the next month. It is actually easier this way as every month has a day 1, just use less than this higher date.

Find the most popular month for customers to order a certain product

I am trying to find out which month has the most orders for a certain product (Product HHYDP). This is my code so far, but each time I try to use GROUP BY and SORT BY functions related to my problem, I get an error. There are three years in the database (2006,2007,2008) and they are formatted as (YYYY-MM-DD). I am trying to find which month has the highest total order volume across the three years, quantity is irrelevant.
SELECT p.productname, o.orderdate
FROM [Sales].[Orders] as o
JOIN [Sales].[OrderDetails] as od
ON o.orderid = od.orderid
JOIN [Production].[Products] as p
ON od.productid = p.productid
WHERE p.productname like '%hhydp%'
I am using microsoft SQL management server.
You can try to use year and month with COUNT aggregate function function and add them in group by
SELECT p.productname,
year(o.orderdate) yr,
month(o.orderdate) mn,
COUNT(*) cnt
FROM [Sales].[Orders] as o
JOIN [Sales].[OrderDetails] as od
ON o.orderid = od.orderid
JOIN [Production].[Products] as p
ON od.productid = p.productid
WHERE p.productname like '%hhydp%'
GROUP BY p.productname, year(o.orderdate) ,month(o.orderdate)

Creating an SQL query that eliminates duplicate months/year

Hello Stack Overflow community - hopefully i'm on the right track with this one, but i'm trying to write a query where a report out shows the number of orders placed by month/year. The report currently brings up all the days where i'm trying to join them all by month/year collectively. Hopefully this makes sense, i'm pretty new to this, be gentle please ;)
select distinct month(o.orderdate) 'Month',
year(o.orderdate) 'Year', sum(od.Quantity) as Orders
from OrderDetails od
join Products p
on od.ProductID = p.ProductID
join Orders o
on od.OrderID = o.OrderID
group by o.orderdate
Order by year, month asc;
You need to group by what you want to define each row. In your case, that is the year and month:
select year(orderdate) as yyyy, month(o.orderdate) as mm,
sum(od.Quantity) as Orders
from OrderDetails od join
Products p
on od.ProductID = p.ProductID join
Orders o
on od.OrderID = o.OrderID
group by year(o.orderdate), month(o.orderdate)
Order by yyyy, mm asc;
Notes:
I changed the column names to yyyy and mm so they do not conflict with the reserved words year and month.
Don't use single quotes for column aliases. This is a bad habit that will eventually cause problems in your query.
I always use as for column aliases (to help prevent missing comma mistakes), but never for table aliases.
The product table is not needed for this query.
Edit: If you want a count of orders, which your query suggests, then this might be more appropriate:
select year(o.orderdate) as yyyy, month(o.o.orderdate) as mm,
count(*) as Orders
from orders o
group by year(o.orderdate), month(o.orderdate)
Order by yyyy, mm asc;
You have to group by month and year
select distinct month(o.orderdate) 'Month',
year(o.orderdate) 'Year', sum(od.Quantity) as Orders
from OrderDetails od
join Products p
on od.ProductID = p.ProductID
join Orders o
on od.OrderID = o.OrderID
group by month(o.orderdate), year(o.orderdate)
Order by [Year],[month]

Getting max value before given date

I am pretty new to using MS SQL 2012 and I am trying to create a query that will:
Report the order id, the order date and the employee id that processed the order
report the maximum shipping cost among the orders processed by the same employee prior to that order
This is the code that I've come up with, but it returns the freight of the particular order date. Whereas I am trying to get the maximum freight from all the orders before the particular order.
select o.employeeid, o.orderid, o.orderdate, t2.maxfreight
from orders o
inner join
(
select employeeid, orderdate, max(freight) as maxfreight
from orders
group by EmployeeID, OrderDate
) t2
on o.EmployeeID = t2.EmployeeID
inner join
(
select employeeid, max(orderdate) as mostRecentOrderDate
from Orders
group by EmployeeID
) t3
on t2.EmployeeID = t3.EmployeeID
where o.freight = t2.maxfreight and t2.orderdate < t3.mostRecentOrderDate
Step one is to read the order:
select o.employeeid, o.orderid, o.orderdate
from orders o
where o.orderid = #ParticularOrder;
That gives you everything you need to go out and get the previous orders from the same employee and join each one to the row you get from above.
select o.employeeid, o.orderid, o.orderdate, o2.freight
from orders o
join orders o2
on o2.employeeid = o.employeeid
and o2.orderdate < o.orderdate
where o.orderid = #ParticularOrder;
Now you have a whole bunch of rows with the first three values the same and the fourth is the freight cost of each previous order. So just group by the first three fields and select the maximum of the previous orders.
select o.employeeid, o.orderid, o.orderdate, max( o2.freight ) as maxfreight
from orders o
join orders o2
on o2.employeeid = o.employeeid
and o2.orderdate < o.orderdate
where o.orderid = #ParticularOrder
group by o.employeeid, o.orderid, o.orderdate;
Done. Build your query in stages and many times it will turn out to be much simpler than you at first thought.
It is unclear why you are using t3. From the question it doesn't sound like the employee's most recent order date is relevant at all, unless I am misunderstanding (which is absolutely possible).
I believe the issue lies in t2. You are grouping by orderdate, which will return the max freight for that date and employeeid, as you describe. You need to calculate a maximum total from all orders that occurred before the date that the order occurred on, for that employee, for every row you are returning.
It probably makes more sense to use a subquery for this.
SELECT o.employeeid, o.orderid, o.orderdate, m.maxfreight
FROM
orders o LEFT OUTER JOIN
(SELECT max(freight) as maxfreight
FROM orders AS f
WHERE f.orderdate <= o.orderdate AND f.employeeid = o.employeeid
) AS m
Hoping this is syntactically correct as I'm not in front of SSMS right now. I also included a left outer join as your previous query with an inner join would have excluded any rows where an employee had no previous orders (i.e. first order ever).
You can do what you want with a correlated subquery or apply. Here is one way:
select o.employeeid, o.orderid, o.orderdate, t2.maxfreight
from orders o outer apply
(select max(freight) as maxfreight
from orders o2
where o2.employeeid = o.employeid and
o2.orderdate < o.orderdate
) t2;
In SQL Server 2012+, you can also do this with a cumulative maximum:
select o.employeeid, o.orderid, o.orderdate,
max(freight) over (partition by employeeid
order by o.orderdate rows between unbounded preceding and 1 preceding
) as maxfreight
from orders o;