query with subquery with 1 result(max) for each year - sql

I have to write a query that shows, for each year, which shipper had the maximum total cost.
My query currently shows, for each year, the total cost of each shipper. The result should instead list each year once, with the shipper that had the highest total and that total cost.
Thanks in advance.
select year(OrderDate), s.ShipperID, sum(freight)
from orders o
join shippers s on o.ShipVia = s.ShipperID
group by year(OrderDate),s.ShipperID

Select a.FreightYear, a.ShipperID, a.FreightValue
from
(
select year(OrderDate) FreightYear, s.ShipperID, sum(freight) FreightValue
from orders o
join shippers s on o.ShipVia = s.ShipperID
group by year(OrderDate),s.ShipperID
) a
inner join
(
select FreightYear, max(FreightTotal) MaxFreight
from
(
select year(OrderDate) FreightYear, s.ShipperID, sum(freight) FreightTotal
from orders o
join shippers s on o.ShipVia = s.ShipperID
group by year(OrderDate),s.ShipperID
) x
group by FreightYear
) max on max.FreightYear = a.FreightYear and max.MaxFreight = a.FreightValue
order by FreightYear
Inner query a is your original query, which gets the total freight for each shipper in each year.
Inner query max gets the maximum of those totals for each year. It is then joined to query a, restricting the rows in a to those whose total equals the maximum for that year. If two shippers tie for a year's maximum, both rows are returned.
Cheers -

It's marginally shorter if you use windowing functions.
select shippers_ranked.OrderYear as OrderYear,
shippers_ranked.ShipperId as ShipperId,
shippers_ranked.TotalFreight as TotalFreight
from
(
select shippers_freight.*, row_number() over (partition by shippers_freight.OrderYear order by shippers_freight.TotalFreight desc) as Ranking
from
(
select year(OrderDate) as OrderYear,
s.ShipperID as ShipperId,
sum(freight) as TotalFreight
from orders o
inner join shippers s on o.ShipVia = s.ShipperID
group by year(OrderDate), s.ShipperID
) shippers_freight
) shippers_ranked
where shippers_ranked.Ranking = 1
order by shippers_ranked.OrderYear
;
You need to decide what you would like to happen if two shippers have the same TotalFreight for a year - as the code above stands you will get one row (non-deterministically). If you would like one row, I would add ShipperId to the order by in the over() clause so that you always get the same row. If in the same TotalFreight case you would like multiple rows returned, use dense_rank() rather than row_number().
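To make the tie behaviour concrete, here is a small sketch on made-up data (the CTE rows are hypothetical, not from the Northwind tables), where shippers 1 and 2 tie in 1997. It shows the two rankings side by side rather than filtering on them:

```sql
-- Hypothetical sample data: shippers 1 and 2 have the same total in 1997.
with shippers_freight (OrderYear, ShipperId, TotalFreight) as
(
    select 1996, 1, 100.00 union all
    select 1996, 2,  80.00 union all
    select 1997, 1, 250.00 union all
    select 1997, 2, 250.00 union all
    select 1997, 3,  90.00
)
select OrderYear, ShipperId, TotalFreight,
       -- ShipperId as a tie-breaker makes the numbering deterministic
       row_number() over (partition by OrderYear
                          order by TotalFreight desc, ShipperId) as RowNum,
       -- tied totals share the same rank
       dense_rank() over (partition by OrderYear
                          order by TotalFreight desc) as DenseRank
from shippers_freight
order by OrderYear, RowNum;
```

For 1997, only shipper 1 gets RowNum = 1, while shippers 1 and 2 both get DenseRank = 1, so filtering on one or the other gives you one row or both rows for that year.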

Related

Calculating the average of order value without using a WITH statement

I am trying to add a new column to my table which will be the average value calculated as the division of two existing columns. Therefore Average value = Total Sales / Number of Orders.
My data looks like this: [picture]
I don't understand why Example Code A does not work but Example Code B does. Please can someone explain?
Example Code A
%%sql
SELECT
c.country,
count(distinct c.customer_id) customer_num,
count(i.invoice_id) order_num,
ROUND(SUM(i.total),2) total_sales,
order_num / total_sales avg_order_value
FROM customer c
LEFT JOIN invoice i ON c.customer_id = i.customer_id
GROUP BY 1
ORDER BY 4 DESC;
Example Code B
%%sql
WITH
customer_sales AS
(
SELECT
c.country,
count(distinct c.customer_id) customer_num,
count(i.invoice_id) order_num,
ROUND(SUM(i.total),2) total_sales
FROM customer c
LEFT JOIN invoice i ON c.customer_id = i.customer_id
GROUP BY 1
ORDER BY 4 DESC
)
SELECT
country,
customer_num,
order_num,
total_sales,
total_sales / order_num avg_order_value
FROM customer_sales;
Thank you!
Depending on the DBMS, some allow you to reference a column alias elsewhere in the same SELECT list, while others require you to either move the calculation into an outer query or repeat the underlying aggregate expressions, such as the counts or sums.
SELECT
c.country,
count(distinct c.customer_id) customer_num,
count(i.invoice_id) order_num,
ROUND(SUM(i.total),2) total_sales,
ROUND(SUM(i.total),2) / count(i.invoice_id) avg_order_value
FROM customer c
LEFT JOIN invoice i ON c.customer_id = i.customer_id
GROUP BY 1
ORDER BY 4 DESC;
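If you would rather not repeat the aggregates and also want to avoid WITH, the "outer query" option can be sketched with a derived table. This reuses the customer and invoice tables from the question; the NULLIF guard is my own addition, on the assumption that some customers may have no invoices because of the LEFT JOIN:

```sql
SELECT
    country,
    customer_num,
    order_num,
    total_sales,
    -- the aliases are visible here because they come from the inner query;
    -- NULLIF turns a zero order count into NULL instead of dividing by zero
    total_sales / NULLIF(order_num, 0) AS avg_order_value
FROM (
    SELECT
        c.country,
        COUNT(DISTINCT c.customer_id) AS customer_num,
        COUNT(i.invoice_id) AS order_num,
        ROUND(SUM(i.total), 2) AS total_sales
    FROM customer c
    LEFT JOIN invoice i ON c.customer_id = i.customer_id
    GROUP BY c.country
) AS customer_sales
ORDER BY total_sales DESC;
```

This is the same shape as Example Code B, with the CTE body moved into the FROM clause and the ORDER BY moved to the outer query.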

Why use GROUP BY in WINDOW FUNCTION

I'm currently working with the Northwind database and want to see which companies placed the most orders in 1997. I'm being asked to use window functions, so I wrote this:
select c.customerid,
c.companyname,
rank() over (order by count(orderid) desc )
from customers c
inner join orders o on c.customerid = o.customerid
where date_part('year',orderdate) = 1997;
However, this code asks me to use GROUP BY with c.customerid, and I simply don't understand why. Supposedly this code will give me all the customer ids and names, and after that the window function kicks in, giving them a rank based on the number of orders. So why group them?
Here:
rank() over (order by count(orderid) desc )
You have an aggregate function in the over() clause of the window function (count(orderid)), so you do need a group by clause. Your idea is to put in the same group all orders of the same customer:
select c.customerid,
c.companyname,
rank() over (order by count(*) desc) as rn
from customers c
inner join orders o on c.customerid = o.customerid
where o.orderdate >= date '1997-01-01' and o.orderdate < date '1998-01-01'
group by c.customerid;
Notes:
Filtering on literal dates is much more efficient than applying a date function on the date column
count(orderid) is equivalent to count(*) in the context of this query
Postgres understands functionally dependent columns: assuming that customerid is the primary key of customers, it is sufficient to put just that column in the group by clause
It is a good practice to give aliases to expressions in the select clause
Another good practice is to prefix all columns with the (alias of) table they belong to
You would use it correctly in an aggregation query. That would be:
select c.customerid, c.companyname, count(*) as num_orders,
rank() over (order by count(*) desc) as ranking
from customers c inner join
orders o
on c.customerid = o.customerid
where date_part('year',orderdate) = 1997
group by c.customerid, c.companyname;
This counts the number of orders per customer in 1997. It then ranks the customers based on the number of orders.
I would advise you to use:
where orderdate >= '1997-01-01' and
orderdate < '1998-01-01'
For the filtering by year. This allows Postgres to use an index if one is available.
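If it helps to see why the GROUP BY is needed, the same result can be sketched in two explicit steps: aggregate first in a derived table, then rank the aggregated rows. This assumes the Northwind customers and orders tables from the question; the alias t is just a name I picked:

```sql
select t.customerid, t.companyname, t.num_orders,
       -- the window function only sees one row per customer here
       rank() over (order by t.num_orders desc) as ranking
from (
    -- step 1: collapse the orders to one row per customer
    select c.customerid, c.companyname, count(*) as num_orders
    from customers c
    inner join orders o on c.customerid = o.customerid
    where o.orderdate >= date '1997-01-01'
      and o.orderdate < date '1998-01-01'
    group by c.customerid, c.companyname
) t;
```

Window functions are evaluated after grouping and aggregation, so writing rank() over (order by count(*) desc) in a grouped query is effectively shorthand for this two-step form.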

Adventure Works SQL Server 2014 Enquiry

I am writing a SQL query using the AdventureWorks 2014 database. I want to show which employee has sold for the highest order value.
I tried to write each select statement by itself (see below), but I'd like to be able to combine both queries into one:
select
s.SalesOrderID, s.SalesPersonID, COUNT(sd.SalesOrderID) as count
from
Sales.SalesOrderHeader s, Sales.SalesOrderDetail sd
where
s.SalesOrderID = sd.SalesOrderID
group by
sd.SalesOrderID, s.SalesOrderID, s.SalesPersonID
order by
sd.SalesOrderID
select
sd.SalesOrderID, sd.LineTotal, count (sd.SalesOrderID) as count
from
Sales.SalesOrderDetail sd
group by
sd.SalesOrderID, sd.LineTotal
order by
sd.SalesOrderID
Are you looking for something like this?
select top 1
s.SalesPersonID
,sum(sd.LineTotal) as orderTotal
,s.SalesOrderID
from
Sales.SalesOrderHeader s
inner join Sales.SalesOrderDetail sd
on s.SalesOrderID = sd.SalesOrderID
group by
s.SalesPersonID
,s.SalesOrderID
order by
orderTotal desc
In SQL Server you can limit the number of rows returned with the TOP clause; combined with the right ORDER BY, this gives you the highest order value. Here it is used with a GROUP BY that sums the line totals of all rows sharing the same values in the grouped columns.
This is what I would do, to get the total by each Sales Person. Order by the Sum(sd.LineTotal) descending to get the Highest Value.
select s.SalesPersonID ,COUNT(sd.SalesOrderID) as count,sum(sd.LineTotal ) as orderTotal
from Sales.SalesOrderHeader s
Inner Join Sales.SalesOrderDetail sd ON s.SalesOrderID=sd.SalesOrderID
group by s.SalesPersonID
order by 3 Desc
We appreciate your effort. However, there is a simpler way to get your answer.
SELECT
-- TOP 1 //to get highest order total
s.SalesPersonID
,COUNT(sd.SalesOrderID) as Total_Count
,sum(sd.LineTotal) as orderTotal
FROM
Sales.SalesOrderHeader s Inner Join Sales.SalesOrderDetail sd
ON s.SalesOrderID=sd.SalesOrderID
GROUP BY s.SalesPersonID
HAVING s.SalesPersonID IS NOT NULL
ORDER BY sum(sd.LineTotal) desc
What I am doing is joining SalesOrderHeader with SalesOrderDetail on SalesOrderID and using aggregate functions to get the desired result.
ORDER BY is used to put the highest value first. TOP 1 is used to return only that row.
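One thing TOP 1 does not tell you is whether two sales people share the highest total. If that matters, SQL Server's TOP ... WITH TIES may be worth a look. A minimal sketch on made-up rows (the IDs and amounts below are hypothetical, not AdventureWorks data):

```sql
select top (1) with ties
       t.SalesPersonID,
       sum(t.LineTotal) as orderTotal
from (values (274, 100.00),
             (274, 200.00),
             (275, 300.00),
             (276,  50.00)) as t (SalesPersonID, LineTotal)
group by t.SalesPersonID
-- WITH TIES requires an ORDER BY; rows tying with the last returned row are kept
order by sum(t.LineTotal) desc;
```

Here sales people 274 and 275 both total 300.00, so both rows come back, whereas plain TOP 1 would return just one of them.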

Getting max value before given date

I am pretty new to using MS SQL 2012 and I am trying to create a query that will:
Report the order id, the order date and the employee id that processed the order
Report the maximum shipping cost among the orders processed by the same employee prior to that order
This is the code that I've come up with, but it returns the freight of the particular order date. Whereas I am trying to get the maximum freight from all the orders before the particular order.
select o.employeeid, o.orderid, o.orderdate, t2.maxfreight
from orders o
inner join
(
select employeeid, orderdate, max(freight) as maxfreight
from orders
group by EmployeeID, OrderDate
) t2
on o.EmployeeID = t2.EmployeeID
inner join
(
select employeeid, max(orderdate) as mostRecentOrderDate
from Orders
group by EmployeeID
) t3
on t2.EmployeeID = t3.EmployeeID
where o.freight = t2.maxfreight and t2.orderdate < t3.mostRecentOrderDate
Step one is to read the order:
select o.employeeid, o.orderid, o.orderdate
from orders o
where o.orderid = #ParticularOrder;
That gives you everything you need to go out and get the previous orders from the same employee and join each one to the row you get from above.
select o.employeeid, o.orderid, o.orderdate, o2.freight
from orders o
join orders o2
on o2.employeeid = o.employeeid
and o2.orderdate < o.orderdate
where o.orderid = #ParticularOrder;
Now you have a whole bunch of rows with the first three values the same and the fourth is the freight cost of each previous order. So just group by the first three fields and select the maximum of the previous orders.
select o.employeeid, o.orderid, o.orderdate, max( o2.freight ) as maxfreight
from orders o
join orders o2
on o2.employeeid = o.employeeid
and o2.orderdate < o.orderdate
where o.orderid = #ParticularOrder
group by o.employeeid, o.orderid, o.orderdate;
Done. Build your query in stages and many times it will turn out to be much simpler than you at first thought.
It is unclear why you are using t3. From the question it doesn't sound like the employee's most recent order date is relevant at all, unless I am misunderstanding (which is absolutely possible).
I believe the issue lies in t2. You are grouping by orderdate, which will return the max freight for that date and employeeid, as you describe. You need to calculate a maximum total from all orders that occurred before the date that the order occurred on, for that employee, for every row you are returning.
It probably makes more sense to use a subquery for this.
SELECT o.employeeid, o.orderid, o.orderdate,
(SELECT max(f.freight)
FROM orders AS f
WHERE f.orderdate < o.orderdate AND f.employeeid = o.employeeid
) AS maxfreight
FROM orders o
Hoping this is syntactically correct as I'm not in front of SSMS right now. The subquery returns NULL when there are no earlier orders, so rows where an employee had no previous orders (i.e. first order ever) are kept, whereas your previous query with an inner join would have excluded them.
You can do what you want with a correlated subquery or apply. Here is one way:
select o.employeeid, o.orderid, o.orderdate, t2.maxfreight
from orders o outer apply
(select max(freight) as maxfreight
from orders o2
where o2.employeeid = o.employeeid and
o2.orderdate < o.orderdate
) t2;
In SQL Server 2012+, you can also do this with a cumulative maximum:
select o.employeeid, o.orderid, o.orderdate,
max(freight) over (partition by employeeid
order by o.orderdate rows between unbounded preceding and 1 preceding
) as maxfreight
from orders o;
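To see what the cumulative maximum returns, here is a small sketch on made-up rows (sample_orders and its values are hypothetical, standing in for the real orders table):

```sql
with sample_orders (employeeid, orderid, orderdate, freight) as
(
    select 1, 10, cast('2020-01-01' as date), 5.00 union all
    select 1, 11, cast('2020-01-05' as date), 9.00 union all
    select 1, 12, cast('2020-01-09' as date), 3.00 union all
    select 2, 13, cast('2020-01-02' as date), 7.00
)
select o.employeeid, o.orderid, o.orderdate, o.freight,
       -- the frame stops one row before the current row,
       -- so the current order's own freight is never included
       max(o.freight) over (partition by o.employeeid
                            order by o.orderdate
                            rows between unbounded preceding and 1 preceding
                           ) as maxfreight
from sample_orders o
order by o.employeeid, o.orderdate;
```

Orders 10 and 13 are each employee's first order, so maxfreight is NULL for them; order 11 gets 5.00 and order 12 gets 9.00. One caveat: if an employee has several orders on the same date, the rows frame puts them in an arbitrary order, so the result can differ from the strict orderdate < comparison in the apply version.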

SQL Query for counting number of orders per customer and Total Dollar amount

I have two tables
Order with columns:
OrderID,OrderDate,CID,EmployeeID
And OrderItem with columns:
OrderID,ItemID,Quantity,SalePrice
I need to return the CustomerID(CID), number of orders per customer, and each customers total amount for all orders.
So far I have two separate queries. One gives me the count of customer orders....
SELECT CID, Count(Order.OrderID) AS TotalOrders
FROM [Order]
Where CID = CID
GROUP BY CID
Order BY Count(Order.OrderID) DESC;
And the other gives me the total sales. I'm having trouble combining them...
SELECT CID, Sum(OrderItem.Quantity*OrderItem.SalePrice) AS TotalDollarAmount
FROM OrderItem, [Order]
WHERE OrderItem.OrderID = [Order].OrderID
GROUP BY CID
I'm doing this in Access 2010.
You would use COUNT(DISTINCT ...) in other SQL engines:
SELECT CID,
Count(DISTINCT O.OrderID) AS TotalOrders,
Sum(OI.Quantity*OI.SalePrice) AS TotalDollarAmount
FROM [Order] O
INNER JOIN [OrderItem] OI
ON O.OrderID = OI.OrderID
GROUP BY CID
Order BY Count(DISTINCT O.OrderID) DESC
Access unfortunately does not support COUNT(DISTINCT ...). Instead, you can first compute the dollar amount of each order in a subquery and then join it to the orders before counting:
SELECT CID,
COUNT([Order].OrderID) AS TotalOrders,
SUM(OrderAmounts.DollarAmount) AS TotalDollarAmount
FROM [Order]
INNER JOIN (SELECT OrderID, Sum(Quantity*SalePrice) AS DollarAmount
FROM OrderItem GROUP BY OrderID) AS OrderAmounts
ON [Order].OrderID = OrderAmounts.OrderID
GROUP BY CID
ORDER BY Count([Order].OrderID) DESC
If you need to include Customers that have orders with no items (unusual but possible), change INNER JOIN to LEFT OUTER JOIN.
Create a query which uses your 2 existing queries as subqueries, and join the 2 subqueries on CID. Define your ORDER BY in the parent query instead of in a subquery.
SELECT
sub1.CID,
sub1.TotalOrders,
sub2.TotalDollarAmount
FROM
(
SELECT
CID,
Count([Order].OrderID) AS TotalOrders
FROM [Order]
GROUP BY CID
) AS sub1
INNER JOIN
(
SELECT
CID,
Sum(OrderItem.Quantity*OrderItem.SalePrice)
AS TotalDollarAmount
FROM OrderItem INNER JOIN [Order]
ON OrderItem.OrderID = [Order].OrderID
GROUP BY CID
) AS sub2
ON sub1.CID = sub2.CID
ORDER BY sub1.TotalOrders DESC;