How to group by year with the year showing only once - sql

I have tried using the following query
select distinct Year (SaleDate) AS SaleYear,Max(SalePrice)
from Sale
group by SaleDate
The years 2010 and 2014 show up twice, even though I used DISTINCT and GROUP BY. The amounts in the Max(SalePrice) column are different as well. Am I doing something wrong here?

You need to repeat year() in the group by:
select Year(SaleDate) AS SaleYear, Max(SalePrice)
from Sale
group by year(SaleDate);
SELECT DISTINCT with GROUP BY is almost never correct. All that your query does is aggregate by SaleDate and in the result set extract the year. That is why you see duplicates.
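To make the difference visible, here is a minimal sketch that runs both queries against a tiny in-memory table. It assumes SQLite (through Python's sqlite3 module) purely so it can run anywhere, so strftime('%Y', ...) stands in for Year(); the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sale (SaleDate TEXT, SalePrice REAL)")
conn.executemany(
    "INSERT INTO Sale VALUES (?, ?)",
    [
        ("2010-03-01", 100.0),  # made-up sample rows
        ("2010-09-15", 250.0),
        ("2014-01-20", 300.0),
        ("2014-06-05", 120.0),
    ],
)

# Grouping by the full date: one group per distinct date, so each year
# appears once per date and DISTINCT cannot collapse the differing maxima.
by_date = conn.execute(
    "SELECT DISTINCT strftime('%Y', SaleDate) AS SaleYear, MAX(SalePrice) "
    "FROM Sale GROUP BY SaleDate"
).fetchall()

# Grouping by the year expression: one group, and one row, per year.
by_year = conn.execute(
    "SELECT strftime('%Y', SaleDate) AS SaleYear, MAX(SalePrice) "
    "FROM Sale GROUP BY strftime('%Y', SaleDate) ORDER BY SaleYear"
).fetchall()

print(sorted(by_date))
print(by_year)
```

With this data the first query returns four rows (two per year), while the second returns one row per year with the yearly maximum.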

Related

Find the maximum average over a specific period, two tables

My task sounds like this: "Select sales territory (name) with sales in May 2013 higher than the average monthly sales per sales territory (Use SalesTerritory, SalesHeader tables)." As I understand it, logically, I need to find what territory was the maximum average for May 2013, while I need to link two tables (the "name" field in the "salesterritory" table, the rest of the data in the second, but the "name" must be present).
I tried to divide the task into parts, and find at least a territory by id without a name, here is my code:
SELECT TerritoryID, MAX(avga.sal)
from (select YEAR(OrderDate) AS 'Year', MONTH(OrderDate) AS 'Month', TerritoryID, AVG(TotalDue) AS 'sal'
FROM Sales.SalesOrderHeader
GROUP BY YEAR(OrderDate), MONTH(OrderDate), TerritoryID
having YEAR(OrderDate)=2013) as avga
group by TerritoryID
This result does not appear to be correct even at this stage. How do I do this correctly, at least without the second table?
Can you try these steps:
Separate this query into small queries that collect part of the data you want and make sense to you, for example: A query to select the sales territory (name) with sales in May 2013; another query that brings the average monthly sales by sales territory etc. This will help you understand parts of the main query that you will create.
You can now try this in one query. Common table expressions (CTEs) may be an easier approach.
I believe you need both the average per territory in May 2013 and the average across all territories for the same month. Note the use of OVER() in the query below. This clause enables the calculation of an average across multiple rows, which is ideal in this situation because we only need to return those territories whose figures are higher than the overall average.
select
    yyyy
  , mm
  , TerritoryID
  , territory_av
  , month_av
from (
    SELECT
        yyyy
      , mm
      , TerritoryID
      , territory_av
      , AVG(territory_av) OVER() AS month_av  -- average of the per-territory averages
    FROM (
        SELECT
            YEAR(OrderDate) AS yyyy
          , MONTH(OrderDate) AS mm
          , TerritoryID
          , AVG(TotalDue) AS territory_av
        FROM Sales.SalesOrderHeader
        WHERE YEAR(OrderDate) = 2013
          AND MONTH(OrderDate) = 5
        GROUP BY
            YEAR(OrderDate)
          , MONTH(OrderDate)
          , TerritoryID
    ) AS derive1
) AS derive2
WHERE territory_av > month_av
;
Don't use having as an alternative for where. Use where to filter table data which reduces the data processed by group by. Use having to filter aggregated values which happens after group by.
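As a rough illustration of that split, the sketch below filters rows with WHERE and groups with HAVING in the same query. It assumes SQLite via Python's sqlite3 module, and the table name `orders`, the sample rows, and the threshold of 100 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (TerritoryID INTEGER, OrderDate TEXT, TotalDue REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2013-05-03", 100.0),
        (1, "2013-05-20", 300.0),
        (2, "2013-05-10", 50.0),
        (2, "2012-05-10", 900.0),  # outside May 2013; WHERE removes it before grouping
    ],
)

rows = conn.execute(
    """
    SELECT TerritoryID, AVG(TotalDue) AS territory_av
    FROM orders
    WHERE OrderDate >= '2013-05-01' AND OrderDate < '2013-06-01'  -- filters rows
    GROUP BY TerritoryID
    HAVING AVG(TotalDue) > 100                                     -- filters groups
    ORDER BY TerritoryID
    """
).fetchall()

print(rows)
```

Territory 2 drops out because its only May 2013 order averages 50. Had the 2012 row reached the GROUP BY, its average would have been 475 and it would have passed the HAVING test.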
Regarding filtering for May 2013, it is more efficient to NOT use functions on data to assist filtering in a where clause. A more generic way to select a date range (that does not require changing data via functions) is like this:
WHERE OrderDate >= '2013-05-01'
AND OrderDate < '2013-06-01'
The syntax for dates differs among databases, so you might need to convert the date literals into a date (or timestamp):
WHERE OrderDate >= to_date('2013-05-01','yyyy-mm-dd')
AND OrderDate < to_date('2013-06-01','yyyy-mm-dd')
or, in SQL Server you could use this:
WHERE OrderDate >= '20130501'
AND OrderDate < '20130601'
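One reason the half-open range (>= first day, < first day of the next month) is worth the habit is that it still catches rows carrying a time of day on the last day of the month. The sketch below shows this under stated assumptions: SQLite via Python's sqlite3 module, timestamps stored as ISO-8601 text (so comparisons are plain string comparisons), and a hypothetical `orders` table with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (OrderID INTEGER, OrderDate TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "2013-04-30 23:59:59"),
        (2, "2013-05-01 00:00:00"),
        (3, "2013-05-31 14:30:00"),  # afternoon of the last day of May
        (4, "2013-06-01 00:00:00"),
    ],
)

# Half-open range: lower bound inclusive, upper bound exclusive.
half_open = [
    r[0]
    for r in conn.execute(
        "SELECT OrderID FROM orders "
        "WHERE OrderDate >= ? AND OrderDate < ? ORDER BY OrderID",
        ("2013-05-01", "2013-06-01"),
    )
]

# Closed range ending on the last day: misses anything after midnight on May 31.
between = [
    r[0]
    for r in conn.execute(
        "SELECT OrderID FROM orders "
        "WHERE OrderDate BETWEEN ? AND ? ORDER BY OrderID",
        ("2013-05-01", "2013-05-31"),
    )
]

print(half_open, between)
```

The half-open filter returns orders 2 and 3, while the BETWEEN version silently loses order 3.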

How do you use having for multiple conditions?

In the following code, how do you exclude members whose spending is larger than $500 for each year (instead of total spending across all years)?
select
Year
,month
,memberkey
,sum(spending) as spending
from table1
group by
1,2,3
A HAVING clause won't work here since you really want to aggregate at the YEAR level to determine which records should be included. Traditionally you would do this with a correlated subquery, but in Teradata you can make use of the QUALIFY clause:
SELECT "Year"
,"Month"
,MemberKey
,spending
from table1
QUALIFY sum(spending) OVER (PARTITION BY "Year", MemberKey) < 500
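For databases without QUALIFY, the correlated-subquery route mentioned above might look like the following sketch. It assumes SQLite via Python's sqlite3 module, uses the MemberKey column name from the question, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE table1 ("Year" INTEGER, "Month" INTEGER, MemberKey INTEGER, spending REAL)'
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (2020, 1, 1, 300.0),  # member 1 spends 600 in 2020: excluded
        (2020, 2, 1, 300.0),
        (2021, 1, 1, 100.0),  # member 1 spends 100 in 2021: kept
        (2020, 1, 2, 200.0),  # member 2 spends 300 in 2020: kept
        (2020, 2, 2, 100.0),
    ],
)

# The inner query recomputes the yearly total for the outer row's member and year.
kept = conn.execute(
    """
    SELECT t."Year", t."Month", t.MemberKey, t.spending
    FROM table1 AS t
    WHERE (SELECT SUM(s.spending)
           FROM table1 AS s
           WHERE s."Year" = t."Year"
             AND s.MemberKey = t.MemberKey) < 500
    ORDER BY t."Year", t."Month", t.MemberKey
    """
).fetchall()

print(kept)
```

The filter is applied per member and per year, so member 1 is dropped for 2020 but still appears for 2021.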

How to use an sum() function without group by?

I just have to omit those records whose sum of sales in all 53 weeks is 0 and would need the output without group by
You cannot really get that in one query.
To find the years that have any sales, you have to sum the sales per year.
That is:
Firstly:
select YEAR(date) from YourTable group by YEAR(date) having sum(sales) > 0
Then:
select * from YourTable where YEAR(date) in (<firstquery>)
order by <anydatecolumn>
If you are using mssql you can do that in one query using the OVER clause and partitioning
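A sketch of that single-query idea: compute the yearly total with SUM() OVER (PARTITION BY ...) in a derived table, then filter on it in the outer query. This is an illustration under assumptions, not the answerer's exact query: it runs on SQLite (3.25 or later, for window functions) via Python's sqlite3 module, uses strftime('%Y', ...) in place of YEAR(), and the column name `sale_date` and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (sale_date TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [
        ("2019-01-07", 0.0),  # 2019 sums to 0: every 2019 row is omitted
        ("2019-01-14", 0.0),
        ("2020-01-06", 0.0),  # 2020 sums to 5: both 2020 rows are kept
        ("2020-01-13", 5.0),
    ],
)

# The window function attaches the yearly total to every row without
# collapsing rows, so no GROUP BY is needed in the output.
rows = conn.execute(
    """
    SELECT sale_date, sales
    FROM (
        SELECT sale_date,
               sales,
               SUM(sales) OVER (PARTITION BY strftime('%Y', sale_date)) AS year_total
        FROM YourTable
    ) AS t
    WHERE year_total > 0
    ORDER BY sale_date
    """
).fetchall()

print(rows)
```

Note that the zero-sales week in 2020 is kept, because the filter looks at the yearly total rather than at the individual row.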

How do I sum over multiple criteria in T-SQL?

I'm trying to improve on my very basic SQL querying skills and am using the AdventureWorks2012 sample database in SQL Server 2012. I have used SUM() OVER(PARTITION BY) like this:
SELECT DISTINCT
SUM(SubTotal) OVER (PARTITION BY CustomerID), CustomerID
FROM
Sales.SalesOrderHeader
To get the total sales value for each customer, however I'd like to sum the SubTotal by customer & year using YEAR(OrderDate) to extract just the year portion of the order date.
Firstly it appears that I can't use the year portion of the order date to sum by year independently of customer so this approach isn't going to work anyhow.
Secondly I can't see any way to use multiple partition criteria.
I suspect that my inexperience is leading me to think about this in the wrong way so a theoretical approach would be as useful as a specific solution.
I guess I'm looking for something that is functionally similar to Excel's SUMIFS() function
First, the correct way to write your query is:
SELECT CustomerID, SUM(SubTotal)
FROM Sales.SalesOrderHeader
GROUP BY CustomerID;
Using SELECT DISTINCT with window functions is clever. But, it overcomplicates the query, can have poorer performance, and is confusing to anyone reading it.
To get the information by year (for each customer), just add that to the SELECT and GROUP BY:
SELECT CustomerID, YEAR(OrderDate) as yyyy, SUM(SubTotal)
FROM Sales.SalesOrderHeader
GROUP BY CustomerID, YEAR(OrderDate)
ORDER BY CustomerId, yyyy;
If you actually want to get separate rows with subtotals, then study up on GROUPING SETS and ROLLUP. These are options to the GROUP BY.
You should use GROUP BY instead of PARTITION BY whenever you need an aggregate (sum/count/max) per value of a specific column such as customerId, as follows:
select customerId, sum(subTotal)
FROM sales.salesOrderHeader
group by customerId
Edit : including missing requirement of date (response to comment)
If you want the calculation against more than one column, you can still do it the same way. Just add the date to the GROUP BY clause, for example group by customerId, OrderDate (use YEAR(OrderDate) in both places if you want yearly totals):
select customerId, sum(subTotal)
,OrderDate -- optional: you can leave the date out of the select list
FROM sales.salesOrderHeader
group by customerId, OrderDate

Logical error in my SqlCode

I need to run a report that gives the count of records matching specific criteria per month. For each month, my query displays the month more than once. Could it be that I am doing something wrong? My script:
Select DATEPART(mm,DatePrinted),COUNT(ReceiptNo)As CardPrinted
from mytble where ReceiptNo like'990%'
Group by DatePrinted
Possible receipts: 800, 75.
I am expecting something like:
January totalcount
Feb totalcount etc.
Use Group by DATEPART(month,DatePrinted).
Select DATEPART(month,DatePrinted) As MyMonth, COUNT(ReceiptNo) As CardPrinted
From mytble
Where ReceiptNo like '990%'
Group by DATEPART(month,DatePrinted)
If you need name of the month, then use DATENAME() function:
Select DATENAME(month,DatePrinted) As MyMonth, COUNT(ReceiptNo) As CardPrinted
From mytble
Where ReceiptNo like '990%'
Group by DATENAME(month,DatePrinted)
Note: You may also need to group by year to get correct results. Otherwise, you will get one count per month name that combines that month from every year. If you are looking for a particular year, add this filter to the WHERE clause: Year(DatePrinted) = yourYear
Your group by statement is wrong, it must be on DATEPART(mm,DatePrinted)
SELECT DATEPART(mm, DatePrinted) AS [Month], COUNT(ReceiptNo) As CardPrinted
FROM mytble
WHERE ReceiptNo LIKE '990%'
GROUP BY DATEPART(mm, DatePrinted)
You can also replace COUNT(ReceiptNo) by COUNT(*).
Also note that, as it is right now, the same month from different years will be grouped together.
If that isn't the desired behaviour, you can SELECT and GROUP BY DATEPART(yyyy, DatePrinted), DATEPART(mm, DatePrinted).
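The effect of adding the year to the grouping can be seen in the small sketch below. It assumes SQLite via Python's sqlite3 module, so strftime('%Y', ...) and strftime('%m', ...) stand in for DATEPART(yyyy, ...) and DATEPART(mm, ...), and the receipt rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytble (ReceiptNo TEXT, DatePrinted TEXT)")
conn.executemany(
    "INSERT INTO mytble VALUES (?, ?)",
    [
        ("990001", "2012-01-05"),
        ("990002", "2012-01-20"),
        ("990003", "2013-01-07"),
        ("800001", "2013-01-08"),  # does not match '990%'
        ("990004", "2013-02-01"),
    ],
)

# Month only: January 2012 and January 2013 land in the same group.
by_month = conn.execute(
    """
    SELECT strftime('%m', DatePrinted) AS mm, COUNT(*) AS CardPrinted
    FROM mytble
    WHERE ReceiptNo LIKE '990%'
    GROUP BY strftime('%m', DatePrinted)
    ORDER BY mm
    """
).fetchall()

# Year and month: each calendar month gets its own row.
by_year_month = conn.execute(
    """
    SELECT strftime('%Y', DatePrinted) AS yyyy,
           strftime('%m', DatePrinted) AS mm,
           COUNT(*) AS CardPrinted
    FROM mytble
    WHERE ReceiptNo LIKE '990%'
    GROUP BY strftime('%Y', DatePrinted), strftime('%m', DatePrinted)
    ORDER BY yyyy, mm
    """
).fetchall()

print(by_month)
print(by_year_month)
```

Grouping by month alone reports three cards for January, which is really two from 2012 plus one from 2013.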