Select distinct name with random id - sql

I have a table with an id and a name (and a bunch of other stuff not relevant for this query). Now I need an SQL statement that returns one row per distinct name, and in that row I need the name and one id (can be any id).
The table looks something like this:
id | name
---+-----
1 | a2
2 | a2
3 | a4
4 | a4
5 | a2
6 | a3
By the way, I am using Postgres 8.4.
I tried various combinations of grouping and joining the table with itself. Is this even possible without creating extra tables?

Arbitrarily choosing to return the minimum id per name.
SELECT name, MIN(id)
FROM YourTable
GROUP BY name
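As a quick check of this approach, here is a minimal sketch that runs the same query against the sample rows from the question. It uses Python's built-in sqlite3 module with an in-memory database purely for illustration (the question itself targets Postgres 8.4), and YourTable is the placeholder name from the answer.

```python
import sqlite3

# In-memory SQLite database, used only for illustration; the question is about Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO YourTable (id, name) VALUES (?, ?)",
    [(1, "a2"), (2, "a2"), (3, "a4"), (4, "a4"), (5, "a2"), (6, "a3")],
)

# One row per distinct name, paired with the smallest id seen for that name.
rows = conn.execute(
    "SELECT name, MIN(id) FROM YourTable GROUP BY name ORDER BY name"
).fetchall()
print(rows)  # [('a2', 1), ('a3', 6), ('a4', 3)]
```

The ORDER BY name is only there to make the printed output stable; it is not needed to get one row per name.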

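You may look at the PostgreSQL wiki. It shows how to select random rows.
You can use the random() function in the ORDER BY clause of a SELECT to return rows in random order. Example:
SELECT id FROM mytable ORDER BY random()
GROUP BY alone cannot return an arbitrary id per name, because every selected column must be either aggregated or listed in the GROUP BY clause, and LIMIT 1 would cut the result down to a single row. In PostgreSQL you can use DISTINCT ON instead, which keeps the first row per name according to the ORDER BY. So the query looks something like this:
SELECT DISTINCT ON (name) id, name FROM table_name ORDER BY name, random()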
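For a portable way to pick a truly random id per name, one option is a window function: number the rows of each name in random order and keep row 1. The sketch below is an illustration under assumptions, not the answer's own method. It runs on an in-memory SQLite database (version 3.25 or later is assumed for window-function support), and the alias names numbered and rn are made up for the example. PostgreSQL has supported window functions since 8.4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO mytable (id, name) VALUES (?, ?)",
    [(1, "a2"), (2, "a2"), (3, "a4"), (4, "a4"), (5, "a2"), (6, "a3")],
)

# Number the rows of each name in random order, then keep only row 1 per name.
query = """
    SELECT name, id
    FROM (
        SELECT name, id,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY random()) AS rn
        FROM mytable
    ) AS numbered
    WHERE rn = 1
    ORDER BY name
"""
picked = conn.execute(query).fetchall()
print(picked)  # one (name, id) pair per name; the ids vary from run to run
```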

select min(ID) as id, name from table_name group by name;

Related

How to select distinct result set in SQL Server?

I have a table, sample data like this. There are names associated with multiple ids
Id Name
-------
1 A
1 B
2 A
2 B
Result needed
ID Name
--------
1 A
1 B
How to get this? Order of ID doesn't matter
TL;DR
SELECT min(id) as id, name from mytable GROUP BY name
When you need to summarize over multiple records in SQL, each column you include should either be aggregated using a function like min(), max() or sum(); or included in the GROUP BY clause. Here you are looking to pick an arbitrary ID, so why not use the "min()" ID? Then we want each unique name, so we add it to the GROUP BY clause.

SQL Query to group text based on numeric column

I have a table 'TEST' as shown below
Number | Seq | Name
-------+-------+------
123 | 1 | Hello
123 | 2 | Hi
123 | 3 | Greetings
234 | 1 | Goodbye
234 | 2 | Bye
I want to write a query, to group the table by 'Number', and select the rows with the maximum sequence number (MAX(Seq)). The output of the query would be
Number | Seq | Name
-------+-------+------
123 | 3 | Greetings
234 | 2 | Bye
How do I go about this?
EDIT: TEST is actually a table that is the result from a long query (joining multiple tables) that I have already written. I already have a (SELECT ...) statement to get the values I need. Is there a way to remove duplicate rows (with the same 'Number' as shown above) and select only the one with maximum 'Seq' value.
I am on Microsoft SQL Server 2008 (SP2)
I was hoping there would be a way to achieve this by
SELECT * FROM (SELECT ...) TEST <condition to group>
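You can use a SELECT with an IN clause on the pair (number, seq). Note that SQL Server does not support this row-value form of IN; it works in databases such as Oracle, PostgreSQL and MySQL.
select * from test
where (number, seq) in (select number, max(seq) from test group by number)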
Another option is to use a windowed ROW_NUMBER() function with a partition on the number:
With Cte As
(
Select *,
Row_Number() Over (Partition By Number Order By Seq Desc) RN
From TEST
)
Select Number, Seq, Name
From Cte
Where RN = 1
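To see the ROW_NUMBER() approach at work on the sample data from the question, here is a minimal sketch. It runs against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server (an assumption made only so the example is self-contained; SQLite 3.25 or later is needed for window functions), and it uses the Seq column from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST (Number INTEGER, Seq INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO TEST (Number, Seq, Name) VALUES (?, ?, ?)",
    [
        (123, 1, "Hello"),
        (123, 2, "Hi"),
        (123, 3, "Greetings"),
        (234, 1, "Goodbye"),
        (234, 2, "Bye"),
    ],
)

# Within each Number, rows are numbered from the highest Seq downwards,
# so RN = 1 marks the row with the maximum Seq.
query = """
    WITH Cte AS (
        SELECT Number, Seq, Name,
               ROW_NUMBER() OVER (PARTITION BY Number ORDER BY Seq DESC) AS RN
        FROM TEST
    )
    SELECT Number, Seq, Name
    FROM Cte
    WHERE RN = 1
    ORDER BY Number
"""
top_rows = conn.execute(query).fetchall()
print(top_rows)  # [(123, 3, 'Greetings'), (234, 2, 'Bye')]
```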
SELECT *
FROM (SELECT test.*, MAX (seq) OVER (PARTITION BY num) max_seq
FROM test) t
WHERE seq = max_seq
I changed the column name from number to num because NUMBER is a reserved word in some databases (Oracle, for example). This is pretty much the same as the other answers, except that it explicitly gets the maximum sequence number for each NUM. The derived table needs an alias (t here) on SQL Server.
You want to use an ANALYTIC function together with a conditional clause to get you only the rows of TEST that you desire.
WITH TEST AS (
...your really complex query that generates TEST...
),
Ranked AS (
SELECT
Number, Seq, Name,
RANK() OVER (PARTITION BY Number ORDER BY Seq DESC) AS aRank
FROM TEST
)
SELECT Number, Seq, Name, aRank
FROM Ranked
WHERE aRank = 1
;
This returns the Number, Seq, Name for each Number grouping where the Seq is maximum. Yes, it also returns a column named aRank with all '1' in it...hopefully it can be ignored.
The solution to this is to do a self join on only the MAX(Seq) values.
This answer can be found at SQL Select only rows with Max Value on a Column
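The self-join idea can be sketched as follows: compute MAX(Seq) per Number in a grouped subquery, then join it back to the table so that only the matching rows survive. This is an illustrative sketch, not the linked answer's exact code. It runs on an in-memory SQLite database via Python's sqlite3 module, and the aliases t, m and MaxSeq are made up for the example. The same join form uses no window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST (Number INTEGER, Seq INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO TEST (Number, Seq, Name) VALUES (?, ?, ?)",
    [
        (123, 1, "Hello"),
        (123, 2, "Hi"),
        (123, 3, "Greetings"),
        (234, 1, "Goodbye"),
        (234, 2, "Bye"),
    ],
)

# The subquery m holds one row per Number with its maximum Seq;
# joining on both columns keeps only the rows that carry that maximum.
query = """
    SELECT t.Number, t.Seq, t.Name
    FROM TEST AS t
    JOIN (
        SELECT Number, MAX(Seq) AS MaxSeq
        FROM TEST
        GROUP BY Number
    ) AS m
      ON m.Number = t.Number AND m.MaxSeq = t.Seq
    ORDER BY t.Number
"""
joined_rows = conn.execute(query).fetchall()
print(joined_rows)  # [(123, 3, 'Greetings'), (234, 2, 'Bye')]
```

If two rows of the same Number tie on the maximum Seq, this form returns both of them, whereas ROW_NUMBER() would return only one.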

Count items in column SQL query

Let's say I have a table that looks like,
id
2
2
3
4
5
5
5
How do I get something like,
id count
2 2
3 1
4 1
5 3
where the count column is just the count of each id in the id column?
You want to use the GROUP BY operation
SELECT id, COUNT(id)
FROM table
GROUP BY id
select id, count(id) from table_name group by id
or
select id, count(*) from table_name group by id
This is your query:
SELECT id, COUNT(id)
FROM table
GROUP BY id
What the GROUP BY clause does is this:
It splits your table based on the ids, i.e. all your 2's are put together, then the 3's, the 4's and so on. You can think of it as creating new tables, where all the 2's are stored in one table, the 3's in another, the 4's in yet another and so on.
Then after that the SELECT query is applied on each of these separate tables and the result is returned for each of these "groups".
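The splitting described above can be mimicked by hand, which may make the mental model more concrete. The sketch below is only an illustration: it first buckets the ids in plain Python and counts each bucket, then runs the GROUP BY query on an in-memory SQLite database and compares the two results. The table name table_name is the placeholder used in the answers.

```python
import sqlite3
from collections import defaultdict

ids = [2, 2, 3, 4, 5, 5, 5]

# Step 1: split the rows into one bucket per distinct id (the "separate tables").
groups = defaultdict(list)
for value in ids:
    groups[value].append(value)

# Step 2: apply the aggregate (here COUNT) to each bucket.
manual_counts = [(key, len(bucket)) for key, bucket in sorted(groups.items())]

# The same result from SQL, where GROUP BY does both steps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER)")
conn.executemany("INSERT INTO table_name (id) VALUES (?)", [(v,) for v in ids])
sql_counts = conn.execute(
    "SELECT id, COUNT(*) FROM table_name GROUP BY id ORDER BY id"
).fetchall()

print(manual_counts)  # [(2, 2), (3, 1), (4, 1), (5, 3)]
print(sql_counts == manual_counts)  # True
```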
Good luck!
Kudos! :)

SQL Separating Distinct Values using single column

Does anyone happen to know a way of basically taking the 'Distinct' command but only using it on a single column. For lack of example, something similar to this:
Select (Distinct ID), Name, Term from Table
So it would get rid of row with duplicate ID's but still use the other column information. I would use distinct on the full query but the rows are all different due to certain columns data set. And I would need to output only the top most term between the two duplicates:
ID Name Term
1 Suzy A
1 Suzy B
2 John A
2 John B
3 Pete A
4 Carl A
5 Sally B
Any suggestions would be helpful.
select t.ID, t.Name, min(t.Term) as Term
from Table t
group by t.ID, t.Name
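You can use ROW_NUMBER for this. The filter on rn has to go in the outer query, because the alias is not visible inside the subquery that defines it:
Select ID, Name, Term from (
Select ID, Name, Term, ROW_NUMBER ( )
OVER ( PARTITION BY ID order by Term) as rn from Table
) as tbl
Where rn = 1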
The ORDER BY inside OVER determines which row of each ID group is numbered 1 and is therefore picked.

How to find first duplicate row in a table sql server

I am working on SQL Server. I have a table that contains around 75000 records. Among them there are several duplicate records. So I wrote a query to find out how many times each record is repeated:
SELECT [RETAILERNAME],COUNT([RETAILERNAME]) as Repeated FROM [Stores] GROUP BY [RETAILERNAME]
It gives me result like,
---------------------------
RETAILERNAME | Repeated
---------------------------
X | 4
---------------------------
Y | 6
---------------------------
Z | 10
---------------------------
Among the 4 records for X, I need to take only the first one.
So here I want to retrieve all fields from the first row of each set of duplicate records. That is, taking all records whose RETAILERNAME='X' gives several duplicates, and I need only the first row from them.
Please guide me.
You could try using ROW_NUMBER.
Something like
;WITH Vals AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY [RETAILERNAME] ORDER BY [RETAILERNAME]) RowID
FROM [Stores]
)
SELECT *
FROM Vals
WHERE RowID = 1
You can then also remove the duplicates if need be (BUT BE CAREFUL THIS IS PERMANENT)
;WITH Vals AS (
SELECT [RETAILERNAME],
ROW_NUMBER() OVER(PARTITION BY [RETAILERNAME] ORDER BY [RETAILERNAME]) RowID
FROM Stores
)
DELETE
FROM Vals
WHERE RowID > 1;
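To try the effect of such a clean-up safely, it can help to run it on a throwaway copy first. The sketch below is an illustration under assumptions: it uses an in-memory SQLite database, a hypothetical StoreId key column, and made-up sample rows. SQLite cannot delete through a CTE the way SQL Server can, so it swaps in a different technique with the same outcome: keep the row with the smallest StoreId per RETAILERNAME and delete the rest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# StoreId is a hypothetical key column; the question does not show the real schema.
conn.execute("CREATE TABLE Stores (StoreId INTEGER PRIMARY KEY, RETAILERNAME TEXT)")
conn.executemany(
    "INSERT INTO Stores (StoreId, RETAILERNAME) VALUES (?, ?)",
    [(1, "X"), (2, "Y"), (3, "X"), (4, "Z"), (5, "Y"), (6, "X")],
)

# Keep the first row (smallest StoreId) of each retailer and delete every other row.
conn.execute(
    """
    DELETE FROM Stores
    WHERE StoreId NOT IN (
        SELECT MIN(StoreId) FROM Stores GROUP BY RETAILERNAME
    )
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT StoreId, RETAILERNAME FROM Stores ORDER BY StoreId"
).fetchall()
print(remaining)  # [(1, 'X'), (2, 'Y'), (4, 'Z')]
```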
You can write the query as follows, filtering on the retailer you want:
SELECT TOP 1 * FROM [Stores]
WHERE [RETAILERNAME] = 'X'
WITH cte
AS (SELECT [retailername],
Row_number()
OVER(
partition BY [retailername]
ORDER BY [retailername])'RowRank'
FROM [Stores])
SELECT *
FROM cte