SQL MAX(COUNT(*)) GROUP BY Alternatives? - sql

I've seen many topics about this and none of them is what I'm looking for.
Say we have this simple table:
CREATE TABLE A (
id INT,
date DATETIME
);
I want to retrieve the MAX value after grouping.
So I do it as follows:
DECLARE @tmpTable TABLE(id INT, count INT);
INSERT INTO @tmpTable SELECT id, COUNT(*) FROM A GROUP BY id;
SELECT MAX(count) FROM @tmpTable;
Is there a better way of doing that?
I've seen a solution in a book that I'm reading that they do it as follows:
SELECT MAX(count) FROM (SELECT COUNT(*) AS count FROM A GROUP BY id);
But this won't work :/ Could be that it works in newer T-SQL servers? Currently I'm using 2008 R2.

You can make use of TOP
SELECT TOP 1 Id,COUNT(*) AS MAXCOUNT
FROM A
GROUP BY Id
ORDER BY MAXCOUNT DESC
If you want every row that shares the same max count, use TOP WITH TIES
SELECT TOP 1 WITH TIES Id,COUNT(*) AS MAXCOUNT
FROM A
GROUP BY Id
ORDER BY MAXCOUNT DESC

Is there a better way of doing that?
We could try using analytic functions:
WITH cte AS (
SELECT id, COUNT(*) cnt, ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) rn
FROM A
GROUP BY id
)
SELECT cnt
FROM cte
WHERE rn = 1;
This approach generates a row number, ordered by the count descending, as part of your original aggregation query by id. The id with the highest count is then the first record (and the returned count is still correct even if more than one id is tied for the highest count).
Regarding your original max query, see the answer by @apomene; you are just missing an alias.

You also need to add an alias for your subquery. Try:
SELECT MAX(sub.count1) FROM (SELECT COUNT(*) AS count1 FROM A GROUP BY id) sub;
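
For anyone who wants to try the aliased subquery without creating table A, here is a minimal sketch. It uses a table variable with made-up rows; the sample values and the names @A, cnt and sub are assumptions for illustration, not part of the question. The multi-row VALUES syntax needs SQL Server 2008 or later.

```sql
-- Hypothetical sample data standing in for table A from the question.
DECLARE @A TABLE (id INT, [date] DATETIME);

INSERT INTO @A (id, [date])
VALUES (1, '20140101'),
       (1, '20140102'),
       (1, '20140103'),
       (2, '20140101'),
       (3, '20140101'),
       (3, '20140102');

-- T-SQL requires the derived table to have an alias (here: sub).
SELECT MAX(sub.cnt) AS max_count
FROM (
    SELECT COUNT(*) AS cnt
    FROM @A
    GROUP BY id
) AS sub;
-- With the rows above, id 1 occurs three times, so max_count is 3.
```

This does the same work as the temp-table version in the question, just in a single statement.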

Related

Distinct rows in a table in sql

I have a table with multiple rows of the same member id. I need only distinct rows based on 2 unique columns
Ex: there are 100 different customers, the table has 1000 rows because every customer has multiple cities and segments assigned to him.
I need 100 distinct rows for these customers depending on a unique segment and city combination. There is no specific requirement for this combination, just the first from the table is fine.
So, currently the table is somewhat like this,
Hope this helps.
use row_number()
select * from (select *,row_number() over(partition by memberid order by sales) rn
from table_name
) a where a.rn=1
Handy sql-server top(1) with ties syntax for that
select top(1) with ties t.*
from table_name t
order by row_number() over(partition by memberid order by sales)
As you have no particular requirement for exactly which row to select, any column will do in the ORDER BY; it can be NULL as well
select top(1) with ties t.*
from table_name t
order by row_number() over(partition by memberid order by (select null))
The simplest way to do this is to use the ROW_NUMBER() OVER (PARTITION BY ...) syntax. You don't care about the ordering, since you want an arbitrary row, but only one, for each member. SQL Server still requires an ORDER BY inside OVER for ROW_NUMBER(), so ORDER BY (SELECT NULL) serves as a placeholder.
Since you need only the expected data, and not the Row_Number value, make sure that you list the fields returned, like below:
SELECT
MemberId,
city,
segment,
sales
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY MemberId ORDER BY (SELECT NULL)) as Seq
FROM [Status]
) src
WHERE Seq = 1
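
The sample table from the question did not survive, so here is a small self-contained sketch of the one-row-per-member idea using PARTITION BY. All rows are made up, and the table variable @Status is an assumption that only borrows the column names used in the answer above.

```sql
-- Hypothetical rows; only the column names come from the thread.
DECLARE @Status TABLE (MemberId INT, city VARCHAR(50), segment VARCHAR(50), sales INT);

INSERT INTO @Status (MemberId, city, segment, sales)
VALUES (1, 'Austin',  'Retail',    100),
       (1, 'Boston',  'Online',    250),
       (2, 'Chicago', 'Retail',    300),
       (2, 'Chicago', 'Wholesale',  50),
       (3, 'Denver',  'Online',     75);

-- One row per MemberId. ORDER BY sales DESC makes the pick deterministic
-- (the row with the highest sales) instead of arbitrary.
SELECT MemberId, city, segment, sales
FROM (
    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY MemberId ORDER BY sales DESC) AS Seq
    FROM @Status AS s
) AS src
WHERE Seq = 1;
-- Returns three rows, in no guaranteed order:
-- (1, Boston, Online, 250), (2, Chicago, Retail, 300), (3, Denver, Online, 75).
```

If any row per member really is fine, swapping the ORDER BY for ORDER BY (SELECT NULL) should also work, at the cost of not knowing which row comes back.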

Aggregate function like MAX for most common cell in column?

Grouping by the highest number in a column worked great with MAX(), but what if I would like to get the value that is most common?
As example:
ID
100
250
250
300
200
250
So I would like to group by ID and, instead of getting the lowest (MIN) or highest (MAX) number, get the most common one (that would be 250, because it occurs 3 times).
Is there an easy way in SQL Server 2012 or am I forced to add a second SELECT where I COUNT(DISTINCT ID) and add that somehow to my first SELECT statement?
You can use dense_rank to return all the ids with the highest count. This also handles ties for the highest count.
select id from
(select id, dense_rank() over(order by count(*) desc) as rnk from tablename group by id) t
where rnk = 1
A simple way to do what you want uses top and order by:
SELECT top 1 id
FROM t
GROUP BY id
ORDER BY COUNT(*) DESC;
This is a statistic called the mode. Getting the mode and max is a bit challenging in SQL Server. I would approach it as:
WITH cte AS (
SELECT t.id, COUNT(*) AS cnt,
row_number() OVER (ORDER BY COUNT(*) DESC) AS seqnum
FROM t
GROUP BY id
)
SELECT MAX(id) AS themax, MAX(CASE WHEN seqnum = 1 THEN id END) AS MODE
FROM cte;
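
To check the TOP approach against the six values from the question, a minimal sketch could look like this. The table variable @t is an assumption standing in for the real table; WITH TIES is added so that every id sharing the highest count is returned.

```sql
-- The six ID values listed in the question.
DECLARE @t TABLE (id INT);
INSERT INTO @t (id) VALUES (100), (250), (250), (300), (200), (250);

-- WITH TIES keeps every id that shares the highest count.
SELECT TOP (1) WITH TIES id, COUNT(*) AS cnt
FROM @t
GROUP BY id
ORDER BY COUNT(*) DESC;
-- Returns one row: id = 250, cnt = 3.
```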

how to get the distinct records based on maximum date?

I'm working with SQL Server 2008. I have a table that contains the following columns:
Id,
Name,
Date
This table contains more than one record for the same Id. I want to get the distinct Ids having the maximum date. How can I write a SQL query for this?
Use the ROW_NUMBER() function and PARTITION BY clause. Something like this:
SELECT Id, Name, Date FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Date desc) AS ROWNUM
FROM [MyTable]
) x WHERE ROWNUM = 1
If you need only ID column and other columns are NOT required, then you don't need to go with ROW_NUMBER or MAX or anything else. You just do a Group By over ID column, because whatever the maximum date is you will get same ID.
SELECT ID FROM table GROUP BY ID
--OR
SELECT DISTINCT ID FROM table
If you need ID and Date columns with maximum date, then simply do a Group By on ID column and select the Max Date.
SELECT ID, Max(Date) AS Date
FROM table
GROUP BY ID
If you need all the columns but only the one row having the max date, then you can go with ROW_NUMBER or MAX as mentioned in other answers.
SELECT *
FROM table AS M
WHERE Exists(
SELECT 1
FROM table
WHERE ID = M.ID
HAVING M.Date = Max(Date)
)
One way, using ROW_NUMBER:
With CTE As
(
SELECT Id, Name, Date, Rn = Row_Number() Over (Partition By Id
Order By Date DESC)
FROM dbo.TableName
)
SELECT Id --, Name, Date
FROM CTE
WHERE Rn = 1
If multiple rows can share the max date and you want all of them, you could use DENSE_RANK instead.
Here's an overview of SQL Server's ranking functions: http://technet.microsoft.com/en-us/library/ms189798.aspx
By the way, CTE stands for common table expression, which is similar to a named subquery. I'm using it to be able to filter by the row number. This approach lets you select all columns if you want.
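
The DENSE_RANK variant mentioned above is not spelled out, so here is a rough sketch of how it could look. The table variable @T and its rows are made up for illustration; Id 1 deliberately has two rows on its latest date.

```sql
-- Hypothetical rows: Id 1 has two rows sharing its latest date.
DECLARE @T TABLE (Id INT, Name VARCHAR(50), [Date] DATETIME);

INSERT INTO @T (Id, Name, [Date])
VALUES (1, 'A',  '20140420'),
       (1, 'A',  '20140428'),
       (1, 'A2', '20140428'),
       (2, 'B',  '20140422'),
       (2, 'B',  '20140424');

WITH CTE AS
(
    SELECT Id, Name, [Date],
           Rnk = DENSE_RANK() OVER (PARTITION BY Id ORDER BY [Date] DESC)
    FROM @T
)
SELECT Id, Name, [Date]
FROM CTE
WHERE Rnk = 1;
-- Returns both 2014-04-28 rows for Id 1 and the 2014-04-24 row for Id 2.
```

With ROW_NUMBER in place of DENSE_RANK, only one of the two tied rows for Id 1 would come back, and which one is not defined.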
select Max(Date) as "Max Date"
from table
group by Id
order by Id
Try with Max(Date) and GROUP BY the other two columns (the ones with repeating data).
SELECT ID, Max(Date) as date, Name
FROM YourTable
GROUP BY ID, Name
You may try with this
DECLARE @T TABLE(ID INT, NAME VARCHAR(50),DATE DATETIME)
INSERT INTO @T VALUES(1,'A','2014-04-20'),(1,'A','2014-04-28')
,(2,'A2','2014-04-22'),(2,'A2','2014-04-24')
,(3,'A3','2014-04-20'),(3,'A3','2014-04-28')
,(4,'A4','2014-04-28'),(4,'A4','2014-04-28')
,(5,'A5','2014-04-28'),(5,'A5','2014-04-28')
SELECT T.ID FROM @T T
WHERE T.DATE=(SELECT MAX(A.DATE)
FROM @T A
WHERE A.ID=T.ID
GROUP BY A.ID )
GROUP BY T.ID
select id, max(date) from NameOfYourTable group by id;

How to select a row based on its row number?

I'm working on a small project in which I'll need to select a record from a temporary table based on the actual row number of the record.
How can I select a record based on its row number?
A couple of the other answers touched on the problem, but this might explain it. There really isn't an order implied in SQL (set theory). So referring to the "fifth row" requires you to introduce the concept of an order yourself:
Select *
From
(
Select
Row_Number() Over (Order By SomeField) As RowNum
, *
From TheTable
) t2
Where RowNum = 5
In the subquery, a row number is "created" by defining the order you expect. Now the outer query is able to pull the fifth entry out of that ordered set.
Technically SQL Rows do not have "RowNumbers" in their tables. Some implementations (Oracle, I think) provide one of their own, but that's not standard and SQL Server/T-SQL does not. You can add one to the table (sort of) with an IDENTITY column.
Or you can add one (for real) in a query with the ROW_NUMBER() function, but unless you specify your own unique ORDER for the rows, the ROW_NUMBERS will be assigned non-deterministically.
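
As a rough illustration of the IDENTITY option mentioned above, here is a minimal sketch. The temp table #Work and the columns RowId and SomeField are hypothetical names chosen for the example.

```sql
-- Hypothetical temp table; #Work, RowId and SomeField are made-up names.
CREATE TABLE #Work
(
    RowId     INT IDENTITY(1, 1) NOT NULL,  -- numbered automatically on insert
    SomeField VARCHAR(50) NOT NULL
);

-- Separate single-row inserts, so the numbering follows insertion order.
INSERT INTO #Work (SomeField) VALUES ('first');
INSERT INTO #Work (SomeField) VALUES ('second');
INSERT INTO #Work (SomeField) VALUES ('third');

-- The identity value can now be used like a stored row number.
SELECT RowId, SomeField
FROM #Work
WHERE RowId = 2;   -- returns the 'second' row

DROP TABLE #Work;
```

This is only "sort of" a row number because identity values are not renumbered: deletes and rolled-back inserts leave gaps.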
What you're looking for is the row_number() function, as Kaf mentioned in the comments.
Here is an example:
WITH MyCte AS
(
SELECT employee_id,
RowNum = row_number() OVER ( order by employee_id )
FROM V_EMPLOYEE
)
SELECT employee_id
FROM MyCte
WHERE RowNum > 0
-- ORDER BY is not allowed inside a CTE without TOP, so it goes on the outer query
ORDER BY employee_id
There are 3 ways of doing this (the syntax below is Oracle's).
Suppose you have an employee table with the columns emp_id, emp_name, salary. You need the top 10 employees who have the highest salary.
Using row_number() analytic function
Select * from
( select emp_id,emp_name,row_number() over (order by salary desc) rank
from employee)
where rank<=10
Using rank() analytic function
Select * from
( select emp_id,emp_name,rank() over (order by salary desc) rank
from employee)
where rank<=10
Using rownum
select * from
(select * from employee order by salary desc)
where rownum<=10;
This will give you the rows of the table without being re-ordered by some set of values:
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT '1')) AS RowID, * FROM #table
If using SQL Server 2012 you can now use offset/fetch:
declare @rowIndexToFetch int
set @rowIndexToFetch = 0
select
*
from
dbo.EntityA ea
order by
ea.Id
offset @rowIndexToFetch rows
fetch next 1 rows only

Select the first instance of a record

I have a table, myTable, that has two fields in it: ID and patientID. The same patientID can be in the table more than once with a different ID. How can I make sure that I get only ONE instance of every patientID?
EDIT: I know this isn't perfect design, but I need to get some info out of the database today and then fix it later.
You could use a CTE with ROW_NUMBER function:
WITH CTE AS(
SELECT myTable.*
, RN = ROW_NUMBER()OVER(PARTITION BY patientID ORDER BY ID)
FROM myTable
)
SELECT * FROM CTE
WHERE RN = 1
It sounds like you're looking for DISTINCT:
SELECT DISTINCT patientID FROM myTable
You can get the same "effect" with GROUP BY:
SELECT patientID FROM myTable GROUP BY patientID
The simple way would be to add LIMIT 1 to the end of your query. This will ensure only a single row is returned in the result set.
WITH CTE AS
(
SELECT tableName.*,ROW_NUMBER() OVER(PARTITION BY patientID ORDER BY patientID) As 'Position' FROM tableName
)
SELECT * FROM CTE
WHERE
Position = 1