Distinct with order by clause - sql

I want to get the distinct Category values and order the result by the curdate column.
select distinct(Category)'Category' from sizes order by curdate desc
But this simple query is generating errors.
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.

I'm afraid you have the same constraint for SELECT DISTINCT as for GROUP BY clauses: namely, you cannot make use of a field that's not declared in the fields list, because it simply doesn't know which curdate to use when sorting in case there are several rows with different curdate values for the same Category.
EDIT: try something like:
SELECT Category FROM sizes GROUP BY Category ORDER BY MAX(curdate) DESC
Replace MAX with MIN or whatever suits you.
EDIT2: In this case, curdate doesn't even have to be present in the field list, since it's only used inside an aggregate function.
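As a rough illustration of the GROUP BY query above, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here (the question appears to target SQL Server), and the sample rows and category names are made up for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sizes (Category TEXT, curdate TEXT)")
conn.executemany(
    "INSERT INTO sizes VALUES (?, ?)",
    [
        ("shirts", "2011-01-05"),
        ("shoes", "2011-02-10"),
        ("shirts", "2011-03-15"),
        ("hats", "2011-01-20"),
    ],
)

# One row per Category, sorted by each category's latest curdate, newest first.
rows = conn.execute(
    "SELECT Category FROM sizes GROUP BY Category ORDER BY MAX(curdate) DESC"
).fetchall()
categories = [row[0] for row in rows]
print(categories)
```

Each category appears once, and its position is decided by its most recent date, which is exactly the choice that plain SELECT DISTINCT leaves open.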

with cte as
(
    select
        Category,
        [CurDate],
        row_number() over(partition by Category order by [CurDate]) as rn
    from sizes
)
select
    Category
from cte
where rn = 1
order by [CurDate]

You look to be after a list of all the categories, with a date associated with each one. Whether you want the earliest first or latest first, you should be able to do one of the following:
SELECT Category, MAX(curdate) FROM sizes GROUP BY Category
Or:
SELECT Category, MIN(curdate) FROM sizes GROUP BY Category
Depending on whether you want the most recent or earliest dates associated with each category. If you need the list to then be ORDERed by the dates, add one of the following onto the end:
ORDER BY MAX(curdate)
ORDER BY MIN(curdate)

curdate must also be in your SELECT list; right now you are only selecting Category.

It is exactly what the error says: it cannot order a distinct list if the sort field is not part of the select. The reason is that there may be multiple sort values for each of the distinct values selected. If the data looks like this:
Category CurDate
AAA 1/1/2011
BBB 2/1/2011
AAA 3/1/2011
Should AAA be before or after BBB in the distinct list? If you just ordered by the date without the distinct you would get it in both positions. Since SQL doesn't know which date should be associated with the distinct category, it will not let you sort by the date.
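The ambiguity can be made concrete with the three sample rows above. The sketch below is an assumption-laden illustration, not the answerer's code: it loads the rows into SQLite via Python's sqlite3, rewrites the dates in ISO format so they sort as text, and uses a hypothetical helper, ordered_categories, to sort the grouped categories by either the earliest or the latest date.

```python
import sqlite3

# SQLite stand-in; dates rewritten as ISO strings (an assumption) so they sort correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sizes (Category TEXT, CurDate TEXT)")
conn.executemany(
    "INSERT INTO sizes VALUES (?, ?)",
    [("AAA", "2011-01-01"), ("BBB", "2011-02-01"), ("AAA", "2011-03-01")],
)


def ordered_categories(aggregate):
    """Return the distinct categories sorted by MIN or MAX of CurDate (hypothetical helper)."""
    if aggregate not in ("MIN", "MAX"):
        raise ValueError("aggregate must be 'MIN' or 'MAX'")
    sql = (
        "SELECT Category FROM sizes "
        f"GROUP BY Category ORDER BY {aggregate}(CurDate)"
    )
    return [row[0] for row in conn.execute(sql)]


by_earliest = ordered_categories("MIN")  # AAA's earliest date precedes BBB's
by_latest = ordered_categories("MAX")    # AAA's latest date follows BBB's
print(by_earliest, by_latest)
```

The two orderings disagree, which is why the engine wants you to say which date represents each category.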

As the error message says, you can't order by a column that isn't selected when using SELECT DISTINCT (the same problem as with GROUP BY). Change your query to this:
SELECT DISTINCT category, curdate FROM sizes ORDER BY curdate DESC
EDIT: replying to your comment:
If you want to select each distinct category with the last date for that category, you'll have to change your query a bit. I can think of two possibilities for this: using MAX() like Costi Ciudatu posted, or doing some crazy stuff with subselects. The first one would be the better approach.

Related

Why does MAX statement require a Group By?

I understand why the first query needs a GROUP BY, as it doesn't know which date to apply the sum to, but I don't understand why this is the case with the second query. The value that ultimately is the max amount is already contained in the table - it is not calculated like SUM is. thank you
-- First Query
select
sum(OrderSales),OrderDates
From Orders
-- Second Query
select
max(FilmOscarWins),FilmName
From tblFilm
It is not the SUM and MAX that require the GROUP BY, it is the unaggregated column.
If you just write this, you will get a single row, for the maximum value of the FilmOscarWins column across the whole table:
select
max(FilmOscarWins)
From
tblFilm
If the most Oscars any film won was 12, that one row will say 12. But there could be multiple films, all of which won 12 Oscars, so if we ask for the FilmName alongside that 12, there is no single answer.
By adding the Group By, we fundamentally change the query: instead of returning one number for the whole table, it will return one row for each group - which in this case, means one row for each film.
If you do want to get a list of all those films which had the maximum 12 Oscars, you have to do something more complicated, such as using a sub-query to first find that single number (12) and then find all the rows matching it:
select
FilmOscarWins,
FilmName
From
tblFilm
Where FilmOscarWins = (
select
max(FilmOscarWins)
From
tblFilm
)
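To see the difference between the bare MAX and the sub-query version, here is a small sketch using Python's sqlite3 with an in-memory SQLite database as a stand-in. The film names and win counts are invented for the example, with a deliberate tie at the top.

```python
import sqlite3

# SQLite stand-in; film names and win counts are hypothetical, with a tie for first place.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblFilm (FilmName TEXT, FilmOscarWins INTEGER)")
conn.executemany(
    "INSERT INTO tblFilm VALUES (?, ?)",
    [("Film A", 11), ("Film B", 4), ("Film C", 11), ("Film D", 0)],
)

# MAX on its own collapses the whole table to a single value.
max_only = conn.execute("SELECT MAX(FilmOscarWins) FROM tblFilm").fetchone()[0]

# The sub-query finds that value first, then returns every film that matches it.
top_films = conn.execute(
    """
    SELECT FilmName, FilmOscarWins
    FROM tblFilm
    WHERE FilmOscarWins = (SELECT MAX(FilmOscarWins) FROM tblFilm)
    ORDER BY FilmName
    """
).fetchall()
print(max_only, top_films)
```

Because two films share the maximum, the sub-query returns both of them, which is the "no single answer" situation described above.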
If you want the film with the most Oscar wins, then use select top:
select top (1) f.*
From tblFilm f
order by FilmOscarWins desc;
In an aggregation query, the select columns need to be consistent with the group by columns -- the unaggregated columns in the select must match the group by.

how to extract MAX value with multiple conditions in SQL

I'm dumping an SQL sales/order table and running the following Excel array command to find the highest value, for a particular order:
{ =MAX(IF([ORDER]=[#[ORDER]];IF([PRODUCT]=[#PRODUCT];[QTY]))) }
This checks, for any rows belonging to the same order, for the same product, what the highest QTY listed is. But being an array formula, it freezes my Excel for many minutes.
Can I do something similar directly in SQL?
You can get the maximum value with the aggregate function MAX and apply the conditions with WHERE, like below.
SELECT MAX(QTY)
FROM TABLE
WHERE [ORDER] = #ORDER AND [PRODUCT] = #PRODUCT
And if you want the maximum QTY for every ORDER and PRODUCT combination, use GROUP BY, selecting the grouping columns too so that each result row is labeled:
SELECT [ORDER], [PRODUCT], MAX(QTY)
FROM TABLE
GROUP BY [ORDER], [PRODUCT]
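A minimal sketch of the grouped version, run against SQLite through Python's sqlite3 as a stand-in for the real database. The table name sales and the sample rows are assumptions made for the example; the order column is double-quoted because it is a reserved word.

```python
import sqlite3

# SQLite stand-in; table name "sales" and the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE sales ("order" INTEGER, product TEXT, qty INTEGER)')
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "bolt", 5), (1, "bolt", 8), (1, "nut", 2), (2, "bolt", 3), (2, "bolt", 1)],
)

# Highest qty for each (order, product) pair, like the Excel MAX(IF(...)) formula.
max_qty = conn.execute(
    """
    SELECT "order", product, MAX(qty)
    FROM sales
    GROUP BY "order", product
    ORDER BY "order", product
    """
).fetchall()
print(max_qty)
```

This returns one row per order and product pair rather than repeating the maximum on every source row.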
If you want the value per row, then you would use window functions:
select t.*, max(qty) over (partition by order, product)
from t;
Note: order is a very bad name for a column because it is a SQL keyword. If that is the real name, you need to escape it.

Combine multiple date fields into one on query

I have a requirement to create a report that counts a total from 2 date fields into one. A simplified example of the table I'm querying is:
ID, FirstName, LastName, InitialApplicationDate, UpdatedApplicationDate
I need to query the two date fields in a way that creates similar output to the following:
Date | TotalApplications
I would need the date output to include both the InitialApplicationDate and UpdatedApplicationDate fields, and the TotalApplications output to be a count of the total for both types of date fields. Originally I thought a UNION might work; however, that returns 2 separate records for each date. Any ideas how I might accomplish this?
The simplest way, I think, is to unpivot using apply and then aggregate:
select v.thedate, count(*)
from t cross apply
(values (InitialApplicationDate), (UpdatedApplicationDate)) v(thedate)
group by v.thedate;
You might want to add where thedate is not null if either column could be NULL.
Note that the above will count the same application twice, once for each date. That appears to be your intention.
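SQLite has no CROSS APPLY, so the sketch below swaps in a different technique to show the same idea: it stacks the two date columns with UNION ALL inside a derived table and then aggregates the combined column. It runs through Python's sqlite3; the table name applications and the sample rows are assumptions for the example.

```python
import sqlite3

# SQLite stand-in; table name "applications" and the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE applications ("
    "ID INTEGER, InitialApplicationDate TEXT, UpdatedApplicationDate TEXT)"
)
conn.executemany(
    "INSERT INTO applications VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", "2020-01-02"),
        (2, "2020-01-02", None),
        (3, "2020-01-02", "2020-01-03"),
    ],
)

# Stack both date columns into one column, drop NULLs, then count per date.
totals = conn.execute(
    """
    SELECT thedate, COUNT(*) AS TotalApplications
    FROM (
        SELECT InitialApplicationDate AS thedate FROM applications
        UNION ALL
        SELECT UpdatedApplicationDate FROM applications
    ) AS v
    WHERE thedate IS NOT NULL
    GROUP BY thedate
    ORDER BY thedate
    """
).fetchall()
print(totals)
```

Aggregating outside the UNION ALL is what merges the two records per date into a single total.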

Is it possible to not SELECT the column you wish to GROUP BY?

Do I have to select the column I want to group by?
I want to select sales numbers and group them by the month they were in, but I don't actually want the month in my results, just the sales numbers. Can I do this? I can't seem to get it to work:
SELECT Line_Item_Total
FROM CUSTOMER
GROUP BY MONTH(Actual_Setup_Date), YEAR(Actual_Setup_Date)
I should add this is for a delimited data chart in Filemaker.
You need an aggregation function on the rest of the columns, that is all:
SELECT SUM(Line_Item_Total )
FROM CUSTOMER
GROUP BY MONTH(Actual_Setup_Date), YEAR(Actual_Setup_Date)
You do not need to include the expressions in the group by in the select.
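Here is a small sketch of grouping by month and year without selecting them, using Python's sqlite3 as a stand-in. SQLite has no MONTH() or YEAR() functions, so the example uses strftime instead; the sample rows are made up, and FileMaker's own SQL dialect may differ.

```python
import sqlite3

# SQLite stand-in; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER (Line_Item_Total INTEGER, Actual_Setup_Date TEXT)"
)
conn.executemany(
    "INSERT INTO CUSTOMER VALUES (?, ?)",
    [
        (100, "2012-01-05"),
        (50, "2012-01-20"),
        (75, "2012-02-03"),
        (25, "2013-01-10"),
    ],
)

# The year and month drive the grouping and ordering but are not selected.
rows = conn.execute(
    """
    SELECT SUM(Line_Item_Total)
    FROM CUSTOMER
    GROUP BY strftime('%Y', Actual_Setup_Date), strftime('%m', Actual_Setup_Date)
    ORDER BY strftime('%Y', Actual_Setup_Date), strftime('%m', Actual_Setup_Date)
    """
).fetchall()
monthly_totals = [row[0] for row in rows]
print(monthly_totals)
```

Grouping on both year and month keeps January 2012 and January 2013 as separate totals.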

How to produce a distinct count of records that are stored by day by month

I have a table with several "ticket" records in it. Each ticket is stored by day (i.e. 2011-07-30 00:00:00.000). I would like to count the unique records in each month by year. I have used the following SQL statement:
SELECT DISTINCT
YEAR(TICKETDATE) as TICKETYEAR,
MONTH(TICKETDATE) AS TICKETMONTH,
COUNT(DISTINCT TICKETID) AS DAILYTICKETCOUNT
FROM
NAT_JOBLINE
GROUP BY
YEAR(TICKETDATE),
MONTH(TICKETDATE)
ORDER BY
YEAR(TICKETDATE),
MONTH(TICKETDATE)
This does produce a count but it is wrong as it picks up the unique tickets for every day. I just want a unique count by month.
Try combining Year and Month into one field, and grouping on that new field.
You may have to cast them to varchar to ensure that they don't simply get added together. Or you could multiply the year by 100 and add the month:
SELECT
(YEAR(TICKETDATE) * 100) + MONTH(TICKETDATE),
count(*) AS DAILYTICKETCOUNT
FROM NAT_JOBLINE GROUP BY
(YEAR(TICKETDATE) * 100) + MONTH(TICKETDATE)
Presuming that TICKETID is not a primary or unique key, but does appear multiple times in table NAT_JOBLINE, that query should work. If it is unique (does not occur in more than 1 row per value), you will need to select on a different column, one that uniquely identifies the "entity" that you want to count, if not each occurrence/instance/reference of that entity.
(As ever, it is hard to tell without working with the actual data.)
I think you need to remove the first DISTINCT. You already have the GROUP BY. If I were the first DISTINCT, I would be confused as to what I was supposed to do.
SELECT
YEAR(TICKETDATE) as TICKETYEAR,
MONTH(TICKETDATE) AS TICKETMONTH,
COUNT(DISTINCT TICKETID) AS DAILYTICKETCOUNT
FROM NAT_JOBLINE
GROUP BY YEAR(TICKETDATE), MONTH(TICKETDATE)
ORDER BY YEAR(TICKETDATE), MONTH(TICKETDATE)
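The sketch below shows how COUNT(DISTINCT TICKETID) and COUNT(*) differ when grouping by month. It uses Python's sqlite3 as a stand-in, with strftime replacing YEAR() and MONTH() (which SQLite lacks); the sample rows are invented, and the extra ROW_COUNT column is added only for comparison.

```python
import sqlite3

# SQLite stand-in; the sample rows are hypothetical. Ticket 1 appears on two
# July days, and ticket 3 appears twice on the same August day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NAT_JOBLINE (TICKETID INTEGER, TICKETDATE TEXT)")
conn.executemany(
    "INSERT INTO NAT_JOBLINE VALUES (?, ?)",
    [
        (1, "2011-07-01 00:00:00"),
        (1, "2011-07-02 00:00:00"),
        (2, "2011-07-02 00:00:00"),
        (1, "2011-08-01 00:00:00"),
        (3, "2011-08-05 00:00:00"),
        (3, "2011-08-05 00:00:00"),
        (4, "2011-08-09 00:00:00"),
    ],
)

# Per month: distinct tickets versus raw row count.
monthly = conn.execute(
    """
    SELECT
        CAST(strftime('%Y', TICKETDATE) AS INTEGER) AS TICKETYEAR,
        CAST(strftime('%m', TICKETDATE) AS INTEGER) AS TICKETMONTH,
        COUNT(DISTINCT TICKETID) AS TICKETCOUNT,
        COUNT(*) AS ROW_COUNT
    FROM NAT_JOBLINE
    GROUP BY TICKETYEAR, TICKETMONTH
    ORDER BY TICKETYEAR, TICKETMONTH
    """
).fetchall()
print(monthly)
```

A ticket that shows up on several days within one month is counted once for that month by COUNT(DISTINCT TICKETID), while COUNT(*) counts every row.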
From what I understand from your comments to Phillip Kelley's solution:
SELECT TICKETDATE, COUNT(*) AS DAILYTICKETCOUNT
FROM NAT_JOBLINE
GROUP BY TICKETDATE
should do the trick, but I suggest you update your question.