Find the maximum average over a specific period, two tables - sql

My task reads: "Select sales territory (name) with sales in May 2013 higher than the average monthly sales per sales territory (Use SalesTerritory, SalesHeader tables)." As I understand it, I need to find which territory had the highest average for May 2013, and I need to join two tables (the "name" field is in the "salesterritory" table and the rest of the data is in the second one, but the name must appear in the result).
I tried to split the task into parts and first find the territory by id, without the name. Here is my code:
SELECT TerritoryID, MAX(avga.sal)
from (select YEAR(OrderDate) AS 'Year', MONTH(OrderDate) AS 'Month', TerritoryID, AVG(TotalDue) AS 'sal'
FROM Sales.SalesOrderHeader
GROUP BY YEAR(OrderDate), MONTH(OrderDate), TerritoryID
having YEAR(OrderDate)=2013) as avga
group by TerritoryID
This result does not appear to be correct even at this stage. How do I do this correctly, at least without the second table?

Can you try these steps:
Separate this query into small queries that each collect part of the data you want and make sense to you. For example: one query to select the sales territory (name) with sales in May 2013, another query that returns the average monthly sales by sales territory, and so on. This will help you understand the parts of the main query that you will create.
You can then try to combine them into one query. Common table expressions (CTEs) are perhaps an easier approach for that.

I believe you need both the average per territory in May 2013 and the average across all territories for the same month. Note the use of OVER() in the query below. This clause enables the calculation of an average across multiple rows, which is ideal in this situation because we only need to return the territories whose figures are higher than the overall average.
select
    yyyy
  , mm
  , TerritoryID
  , territory_av
  , month_av
from (
    SELECT
        yyyy
      , mm
      , TerritoryID
      , territory_av
      , AVG(territory_av) OVER() AS month_av  -- average of the per-territory averages
    FROM (
        SELECT
            YEAR(OrderDate) AS yyyy
          , MONTH(OrderDate) AS mm
          , TerritoryID
          , AVG(TotalDue) AS territory_av
        FROM Sales.SalesOrderHeader
        WHERE YEAR(OrderDate) = 2013
          AND MONTH(OrderDate) = 5
        GROUP BY
            YEAR(OrderDate)
          , MONTH(OrderDate)
          , TerritoryID
    ) AS derive1
) AS derive2
WHERE territory_av > month_av
;
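To also return the territory name that the task asks for, the same shape can be joined to the territory table. Below is a minimal sketch using Python's built-in sqlite3 module so it can be run anywhere. The table layout and the sample rows are made up for illustration, SQLite has no `Sales.` schema prefix, and the window function needs SQLite 3.25 or later. It is one possible reading of the task (territory average compared with the average of all territory averages for May 2013), not necessarily the intended one.

```python
import sqlite3

# Same structure as the query above, plus a join to get the territory name.
QUERY = """
SELECT name, territory_av, month_av
FROM (
    SELECT name,
           territory_av,
           AVG(territory_av) OVER () AS month_av  -- average of per-territory averages
    FROM (
        SELECT t.Name AS name, AVG(h.TotalDue) AS territory_av
        FROM SalesOrderHeader AS h
        JOIN SalesTerritory AS t ON t.TerritoryID = h.TerritoryID
        WHERE h.OrderDate >= '2013-05-01' AND h.OrderDate < '2013-06-01'
        GROUP BY t.TerritoryID, t.Name
    ) AS per_territory
) AS with_overall
WHERE territory_av > month_av
ORDER BY name
"""


def build_sample_db():
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE SalesTerritory (TerritoryID INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE SalesOrderHeader (
            SalesOrderID INTEGER PRIMARY KEY,
            TerritoryID INTEGER,
            OrderDate TEXT,
            TotalDue REAL
        );
        INSERT INTO SalesTerritory VALUES (1, 'Northwest'), (2, 'Canada'), (3, 'France');
        INSERT INTO SalesOrderHeader (TerritoryID, OrderDate, TotalDue) VALUES
            (1, '2013-05-03', 100.0),
            (1, '2013-05-20', 300.0),
            (2, '2013-05-11', 50.0),
            (3, '2013-05-15', 90.0),
            (3, '2013-04-30', 9000.0);  -- outside May 2013, ignored by the filter
    """)
    return conn


rows = build_sample_db().execute(QUERY).fetchall()
for name, territory_av, month_av in rows:
    print(name, territory_av, round(month_av, 2))
```

With this sample data the per-territory averages are 200, 50 and 90, so only the first territory is above the overall average of roughly 113.33.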
Don't use having as an alternative for where. Use where to filter table data which reduces the data processed by group by. Use having to filter aggregated values which happens after group by.
Regarding filtering for May 2013, it is more efficient to NOT use functions on data to assist filtering in a where clause. A more generic way to select a date range (that does not require changing data via functions) is like this:
WHERE OrderDate >= '2013-05-01'
AND OrderDate < '2013-06-01'
Syntax for dates differs among databases, so you might need to convert the date literals into a date (or timestamp):
WHERE OrderDate >= to_date('2013-05-01','yyyy-mm-dd')
AND OrderDate < to_date('2013-06-01','yyyy-mm-dd')
or, in SQL Server you could use this:
WHERE OrderDate >= '20130501'
AND OrderDate < '20130601'
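A small sketch of why the half-open range (>= first day, < first day of the next month) is the safer pattern when the column also carries a time of day. It uses Python's sqlite3 module with a made-up `orders` table. SQLite stores these values as ISO-8601 text, so the comparison here is a string comparison, but the same effect shows up with native datetime columns in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, OrderDate TEXT);
    INSERT INTO orders (OrderDate) VALUES
        ('2013-04-30 23:59:59'),   -- April, must be excluded
        ('2013-05-01 00:00:00'),   -- first moment of May
        ('2013-05-31 18:30:00'),   -- last day of May, with a time part
        ('2013-06-01 00:00:00');   -- June, must be excluded
""")

# Half-open range: includes everything in May, whatever the time of day.
half_open = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE OrderDate >= ? AND OrderDate < ?",
    ("2013-05-01", "2013-06-01"),
).fetchone()[0]

# Closed range ending on the last day: silently drops orders placed
# after midnight on 31 May.
closed = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE OrderDate BETWEEN ? AND ?",
    ("2013-05-01", "2013-05-31"),
).fetchone()[0]

print(half_open, closed)
```

The half-open version finds both May orders, while the BETWEEN version misses the one placed on the afternoon of the 31st.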

Related

SQL - Aggregate dates from different columns into Month/Year table

So I have an 'Orders' table that lists the 'Ordered' and 'Shipped' dates for each order.
These are custom products and it takes 1 week to fill orders.
This is pretty representative of the table I have:
I want to aggregate this into a table so that I can see how many orders were ordered and shipped for each month during the date range specified when the report is run, and I want the Months and years to automatically populate without me having to hardcode for each month and year:
What's the best way to do this with SQL?
I eventually want to place the aggregated table into an SSRS report so that you can expand/collapse each year, if needed.
Date/time functions are notoriously database dependent. Here is a typical approach, though:
select yyyy, mm, sum(num_ordered), sum(num_shipped)
from ((select year(ordered) as yyyy, month(ordered) as mm, count(*) as num_ordered, 0 as num_shipped
from orders
group by year(ordered), month(ordered)
) union all
(select year(shipped) as yyyy, month(shipped) as mm, 0 as num_ordered, count(*) as num_shipped
from orders
group by year(shipped), month(shipped)
)
) ym
group by yyyy, mm;
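Here is a runnable sketch of the same UNION ALL idea using Python's sqlite3 module. The sample rows are invented, and because SQLite has no year() or month() functions, strftime() is used as a stand-in; in SQL Server the year()/month() form above applies. Orders that have not shipped yet are filtered out of the second branch so they do not produce a NULL month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, ordered TEXT, shipped TEXT);
    INSERT INTO orders (ordered, shipped) VALUES
        ('2015-01-10', '2015-01-17'),
        ('2015-01-28', '2015-02-04'),
        ('2015-02-05', '2015-02-12'),
        ('2015-02-27', NULL);          -- not shipped yet
""")

MONTHLY = """
SELECT yyyy, mm,
       SUM(num_ordered) AS num_ordered,
       SUM(num_shipped) AS num_shipped
FROM (
    -- one row per month in which something was ordered
    SELECT strftime('%Y', ordered) AS yyyy, strftime('%m', ordered) AS mm,
           COUNT(*) AS num_ordered, 0 AS num_shipped
    FROM orders
    GROUP BY strftime('%Y', ordered), strftime('%m', ordered)
    UNION ALL
    -- one row per month in which something was shipped
    SELECT strftime('%Y', shipped), strftime('%m', shipped),
           0, COUNT(*)
    FROM orders
    WHERE shipped IS NOT NULL
    GROUP BY strftime('%Y', shipped), strftime('%m', shipped)
) AS ym
GROUP BY yyyy, mm
ORDER BY yyyy, mm
"""

monthly = conn.execute(MONTHLY).fetchall()
for row in monthly:
    print(row)
```

Each month appears once, with the ordered and shipped counts side by side, and months show up automatically as data arrives.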

How to group by year with the year showing only once

I have tried using the following query
select distinct Year (SaleDate) AS SaleYear,Max(SalePrice)
from Sale
group by SaleDate
The years 2010 and 2014 are showing twice, even though I used DISTINCT and GROUP BY. The maximum prices shown for them are different as well. Am I doing something wrong here?
You need to repeat year() in the group by:
select Year(SaleDate) AS SaleYear, Max(SalePrice)
from Sale
group by year(SaleDate);
SELECT DISTINCT with GROUP BY is almost never correct. All that your query does is aggregate by SaleDate and in the result set extract the year. That is why you see duplicates.
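To see the difference on a small data set, here is a sketch using Python's sqlite3 module. The rows are made up, and strftime('%Y', ...) stands in for Year() because SQLite does not have that function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sale (SaleDate TEXT, SalePrice REAL);
    INSERT INTO Sale VALUES
        ('2010-03-01', 100.0),
        ('2010-09-15', 250.0),
        ('2014-01-20', 400.0),
        ('2014-06-30', 300.0);
""")

# Grouping by the full date: one group per day, so each year can repeat
# with a different "maximum". DISTINCT cannot merge them because the
# (year, price) pairs differ.
by_date = conn.execute("""
    SELECT DISTINCT strftime('%Y', SaleDate) AS SaleYear, MAX(SalePrice) AS MaxPrice
    FROM Sale
    GROUP BY SaleDate
    ORDER BY SaleYear, MaxPrice
""").fetchall()

# Grouping by the year expression: exactly one row per year.
by_year = conn.execute("""
    SELECT strftime('%Y', SaleDate) AS SaleYear, MAX(SalePrice) AS MaxPrice
    FROM Sale
    GROUP BY strftime('%Y', SaleDate)
    ORDER BY SaleYear
""").fetchall()

print(by_date)
print(by_year)
```

The first query returns four rows (two per year), the second returns one row per year with the true maximum.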

How do I sum over multiple criteria in T-SQL?

I'm trying to improve on my very basic SQL querying skills and am using the AdventureWorks2012 sample database in SQL Server 2012. I have used SUM() OVER(PARTITION BY) like this:
SELECT DISTINCT
SUM(SubTotal) OVER (PARTITION BY CustomerID), CustomerID
FROM
Sales.SalesOrderHeader
To get the total sales value for each customer, however I'd like to sum the SubTotal by customer & year using YEAR(OrderDate) to extract just the year portion of the order date.
Firstly it appears that I can't use the year portion of the order date to sum by year independently of customer so this approach isn't going to work anyhow.
Secondly I can't see any way to use multiple partition criteria.
I suspect that my inexperience is leading me to think about this in the wrong way so a theoretical approach would be as useful as a specific solution.
I guess I'm looking for something that is functionally similar to Excel's SUMIFS() function
First, the correct way to write your query is:
SELECT CustomerID, SUM(SubTotal)
FROM Sales.SalesOrderHeader
GROUP BY CustomerID;
Using SELECT DISTINCT with window functions is clever. But, it overcomplicates the query, can have poorer performance, and is confusing to anyone reading it.
To get the information by year (for each customer), just add that to the SELECT and GROUP BY:
SELECT CustomerID, YEAR(OrderDate) as yyyy, SUM(SubTotal)
FROM Sales.SalesOrderHeader
GROUP BY CustomerID, YEAR(OrderDate)
ORDER BY CustomerId, yyyy;
If you actually want to get separate rows with subtotals, then study up on GROUPING SETS and ROLLUP. These are options to the GROUP BY.
You should use GROUP BY instead of PARTITION BY whenever you need an aggregate (sum/count/max) per value of a specific column such as customerId, as follows:
select customerId, sum(subTotal)
FROM sales.salesOrderHeader
group by customerId
Edit : including missing requirement of date (response to comment)
If you want the calculation against more than one column, you can still do it the same way. Just add the date to the GROUP BY clause, as in group by customerId, OrderDate:
select customerId, sum(subTotal)
,OrderDate -- you can omit the date from the select list if you want to
FROM sales.salesOrderHeader
group by customerId, OrderDate

Multiple Counts Over Multiple Dates

I am essentially doing the following query (edited):
Select count(orders)
From Orders_Table
Where Order_Open_Date<=##/##/####
and Order_Close_Date>=##/##/####
Where ##/##/#### is the same date in both conditions, so in essence this is the number of 'open' orders for any given day. However, I want this same count for every single day of a year, and I don't want to write a separate query for each day. I'm sorry, this is probably really simple, but I am new to SQL and I don't know how to search for an answer to this question; my searches have come up with nothing. Thanks for any help you can offer.
why not
select Order_Date, count(orders) from Orders_Table group by Order_Date
and for last year
select Order_Date, count(orders) from Orders_Table where Order_Date > DATE_SUB(CURDATE(), INTERVAL 1 YEAR) group by Order_Date;
SELECT CONVERT(VARCHAR, Order_Date, 110), count(orders)
FROM Orders_Table
WHERE Order_Date BETWEEN @A AND @B
GROUP BY CONVERT(VARCHAR, Order_Date, 110)
If you want to have every day of the year, including those with no orders, you will need to generate a temporary table or similar containing every date in the range and left/right join it to the Orders_Table data. This depends upon which RDBMS you're using. In SQL Server I have done this using a user defined function which returns a table variable.
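One way to sketch the "calendar table plus left join" idea is a recursive CTE that generates the dates. The example below uses Python's sqlite3 module; the `order_id` column and the sample rows are assumptions, since the question does not show the table definition. The syntax for generating dates is database specific (SQL Server, for instance, omits the RECURSIVE keyword and would use DATEADD), so treat this as an illustration of the shape rather than a drop-in query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders_Table (
        order_id INTEGER PRIMARY KEY,
        Order_Open_Date TEXT,
        Order_Close_Date TEXT
    );
    INSERT INTO Orders_Table (Order_Open_Date, Order_Close_Date) VALUES
        ('2020-01-01', '2020-01-03'),
        ('2020-01-02', '2020-01-02'),
        ('2020-01-05', '2020-01-06');
""")

OPEN_PER_DAY = """
WITH RECURSIVE calendar(day) AS (
    SELECT :start
    UNION ALL
    SELECT date(day, '+1 day') FROM calendar WHERE day < :end
)
SELECT c.day, COUNT(o.order_id) AS open_orders  -- counts 0 when nothing matches
FROM calendar AS c
LEFT JOIN Orders_Table AS o
       ON o.Order_Open_Date <= c.day
      AND o.Order_Close_Date >= c.day
GROUP BY c.day
ORDER BY c.day
"""

open_per_day = conn.execute(
    OPEN_PER_DAY, {"start": "2020-01-01", "end": "2020-01-07"}
).fetchall()
for day, open_orders in open_per_day:
    print(day, open_orders)
```

Because the calendar drives the join, days with no open orders still appear with a count of 0, and the date range is controlled by the two parameters instead of one query per day.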

COUNT and GROUP BY over time

I have a need to create sales reports by day, week, month, etc. in PostgreSQL. I have the following tables setup:
tbl_products:
id INT
name VARCHAR
tbl_purchase_order:
id INT
order_timestamp TIMESTAMP
tbl_purchase_order_items:
id INT
product_id INT (FK to tbl_products.id)
order_id (FK to tbl_purchase_order.id)
I need to create a SQL query that returns the number of times a given product has been purchased within a given time frame. That is, I need to query the number of times a given product ID appears in a purchase order item in a specific month, day, year, etc. In an earlier question I learned how to use date_trunc() to truncate my TIMESTAMP column to the period of time I'm concerned about. Now I'm faced with how to perform the COUNT and GROUP BY properly.
I've tried several queries using various combinations of COUNT(XXX) and GROUP BY XXX but never seem to come up with what I'm expecting. Can someone give me guidance as to how to construct this query? I'm more of a Java developer, so I'm still getting up to speed on SQL queries. Thanks for any help you can provide.
Count per year:
SELECT oi.product_id,
       extract(year from po.order_timestamp) as order_year,
       count(*)
FROM tbl_purchase_order_items oi
JOIN tbl_purchase_order po ON po.id = oi.order_id
GROUP BY oi.product_id,
         extract(year from po.order_timestamp)
Count per month:
SELECT oi.product_id,
       extract(month from po.order_timestamp) as order_month,
       extract(year from po.order_timestamp) as order_year,
       count(*)
FROM tbl_purchase_order_items oi
JOIN tbl_purchase_order po ON po.id = oi.order_id
GROUP BY oi.product_id,
         extract(year from po.order_timestamp),
         extract(month from po.order_timestamp)
See the postgres datetime functions http://www.postgresql.org/docs/8.1/static/functions-datetime.html
I would suggest that you use the extract function to split the year, month and day into discrete columns in the result set, and then group by them as your requirements dictate.