Find MAX LEN name against duplicate IDs - sql

Being a beginner at SQL, I'm stuck.
I have a table structure like this:
+----+----------+---------+
| id | name     | content |
+----+----------+---------+
| 1  | Jack     | ...     |
| 2  | Dan      | ...     |
| 1  | Joe      | ...     |
| 1  | Jeoffery | ...     |
+----+----------+---------+
What I want is to select the distinct IDs, each with the longest name recorded against that ID.
For example, for ID 1 it should return Jeoffery, and for ID 2, Dan.
Any help would be much appreciated.

You can use ROW_NUMBER():
;WITH CTE AS
(
SELECT id,
name,
RN = ROW_NUMBER() OVER(PARTITION BY id ORDER BY LEN(name) DESC)
FROM YourTable -- placeholder: replace YourTable with your table's name
)
SELECT id,
name
FROM CTE
WHERE RN = 1;
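To try the idea without a SQL Server instance, here is a minimal sketch that runs the same pattern through Python's built-in sqlite3 module. It rests on several assumptions: the table name people is made up, SQLite's LENGTH() stands in for T-SQL's LEN(), the extra ORDER BY on name is only there to make ties deterministic, and the bundled SQLite must be 3.25 or newer for window functions.

```python
import sqlite3

def longest_name_per_id(rows):
    """Return one (id, name) pair per id, keeping the longest name."""
    conn = sqlite3.connect(":memory:")
    # "people" is a hypothetical table name used only for this sketch.
    conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
    query = """
        WITH cte AS (
            SELECT id,
                   name,
                   ROW_NUMBER() OVER (
                       PARTITION BY id
                       ORDER BY LENGTH(name) DESC, name  -- name breaks length ties
                   ) AS rn
            FROM people
        )
        SELECT id, name FROM cte WHERE rn = 1 ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

sample = [(1, "Jack"), (2, "Dan"), (1, "Joe"), (1, "Jeoffery")]
print(longest_name_per_id(sample))  # [(1, 'Jeoffery'), (2, 'Dan')]
```

Without a tie-breaker, two names of equal length for the same id could come back in either order, so it is worth deciding which one you want.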

Related

How to Number Each Group in PostgreSQL

How would I use a window function or similar, to number each group or partition of rows, based on certain shared characteristics?
For example:
I have a list of names ordered alphabetically that I wish to group and identify using IDs that describe the group that they belong to and position within each group.
-------------------------------------------
| outer_id | inner_id | src_id | name |
|----------|----------|--------|----------|
| 1 | 1 | 88129 | albert |
| 1 | 2 | 88130 | albrecht |
| 1 | 3 | 88131 | allan |
| 2 | 1 | 88132 | barnaby |
| 2 | 2 | 88133 | barry |
| 2 | 3 | 88134 | bart |
-------------------------------------------
I can achieve inner_id, src_id and name using a query similar to the following:
WITH cte (src_id, name) AS (
VALUES
(88129, 'albert'),
(88130, 'albrecht'),
(88131, 'allan'),
(88132, 'barnaby'),
(88133, 'barry'),
(88134, 'bart')
)
SELECT row_number() OVER (partition by left(name, 1) ORDER BY name DESC) AS inner_id, src_id, name
FROM cte;
How would I go about adding an outer_id column as shown, to represent each window (or group)?
You can use dense_rank(). The inner_id is ordered by name ascending here so that it matches the numbering in your table:
select dense_rank() over (order by left(name, 1)) as outer_id,
row_number() over (partition by left(name, 1) order by name) as inner_id,
src_id, name
from cte;
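As a rough check of the dense_rank() approach, the sketch below runs it against the sample names using Python's built-in sqlite3 module rather than PostgreSQL. The assumptions: the table name names is made up, SUBSTR(name, 1, 1) stands in for left(name, 1) because SQLite has no left() function, names are sorted ascending, and SQLite 3.25+ is available.

```python
import sqlite3

def number_groups(rows):
    """Return (outer_id, inner_id, src_id, name) for each row, grouped by first letter."""
    conn = sqlite3.connect(":memory:")
    # "names" is a hypothetical table name used only for this sketch.
    conn.execute("CREATE TABLE names (src_id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO names VALUES (?, ?)", rows)
    query = """
        SELECT DENSE_RANK() OVER (ORDER BY SUBSTR(name, 1, 1)) AS outer_id,
               ROW_NUMBER() OVER (
                   PARTITION BY SUBSTR(name, 1, 1)
                   ORDER BY name
               ) AS inner_id,
               src_id,
               name
        FROM names
        ORDER BY outer_id, inner_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

names = [
    (88129, "albert"), (88130, "albrecht"), (88131, "allan"),
    (88132, "barnaby"), (88133, "barry"), (88134, "bart"),
]
for row in number_groups(names):
    print(row)
```

DENSE_RANK() suits the outer number because every row in a group ties on the first letter, so the whole group shares one rank and the next group gets the next consecutive number.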

SQL: Select only one row of table with same value

I'm a bit new to SQL, and for my project I need to do some database sorting and filtering.
Let's assume my database looks like this:
==============================
| id | email        | name
==============================
| 1  | 123@test.com | John
| 2  | 234@test.com | Peter
| 3  | 234@test.com | Steward
| 4  | 123@test.com | Ethan
| 5  | 542@test.com | Bob
| 6  | 123@test.com | Patrick
==============================
What should I do so that only the last row for each email is returned?
==============================
| id | email        | name
==============================
| 3  | 234@test.com | Steward
| 5  | 542@test.com | Bob
| 6  | 123@test.com | Patrick
==============================
Thanks in advance!
SQL Query:
SELECT * FROM test.test1 WHERE id IN (
SELECT MAX(id) FROM test.test1 GROUP BY email
);
Hope this solves your problem. Thanks.
A generic way to do this in SQL is to use the ANSI standard row_number() function:
select t.*
from (select t.*, row_number() over (partition by email order by id desc) as seqnum
from t
) t
where seqnum = 1;
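If you only need a single row in total, rather than one row per email, a shorter query (using MySQL-style LIMIT, with mytable standing in for your table name) is:
SELECT *
FROM mytable
ORDER BY email DESC
LIMIT 1;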
You can use following query to get the MAX id value per email:
SELECT email, MAX(id)
FROM mytable
GROUP BY email
Using the above query as a derived table you can obtain the whole record:
SELECT t1.*
FROM mytable AS t1
JOIN (
SELECT email, MAX(id) AS id
FROM mytable
GROUP BY email
) AS t2 ON t1.id = t2.id
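To compare the row_number() approach with the MAX(id) join, here is a small sketch using Python's built-in sqlite3 module. It loads the sample rows (with the addresses written in the usual user@host form) and runs both queries. The table name mytable comes from the answer above; everything else, including the function name and the use of SQLite 3.25+ in place of the asker's database, is an assumption made for illustration.

```python
import sqlite3

def last_row_per_email(rows):
    """Run both approaches and return (window_result, join_result)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER, email TEXT, name TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    # Approach 1: number the rows of each email from newest to oldest id.
    window_query = """
        SELECT id, email, name
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY email ORDER BY id DESC) AS seqnum
            FROM mytable AS t
        ) AS ranked
        WHERE seqnum = 1
        ORDER BY id
    """
    # Approach 2: find MAX(id) per email, then join back for the full row.
    join_query = """
        SELECT t1.id, t1.email, t1.name
        FROM mytable AS t1
        JOIN (
            SELECT email, MAX(id) AS id
            FROM mytable
            GROUP BY email
        ) AS t2 ON t1.id = t2.id
        ORDER BY t1.id
    """
    window_result = conn.execute(window_query).fetchall()
    join_result = conn.execute(join_query).fetchall()
    conn.close()
    return window_result, join_result

people = [
    (1, "123@test.com", "John"), (2, "234@test.com", "Peter"),
    (3, "234@test.com", "Steward"), (4, "123@test.com", "Ethan"),
    (5, "542@test.com", "Bob"), (6, "123@test.com", "Patrick"),
]
by_window, by_join = last_row_per_email(people)
print(by_window)
print(by_join)
```

Both queries should agree as long as id is unique, which is what makes joining on id alone safe here.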

SQL subquery to return rank 2

I have a question about writing a subquery in Microsoft T-SQL. From the original table I need to return the name of the person with the second most pets. I am able to write a query that returns the number of pets per person, but I'm not sure how to write a subquery to return rank #2.
Original table:
+-------+--------+
| Name  | Pet    |
+-------+--------+
| Kathy | dog    |
| Kathy | cat    |
| Nick  | gerbil |
| Bob   | turtle |
| Bob   | cat    |
| Bob   | snake  |
+-------+--------+
I have the following query:
SELECT Name, COUNT(Pet) AS NumPets
FROM PetTable
GROUP BY Name
ORDER BY NumPets DESC
Which returns:
+-------+---------+
| Name  | NumPets |
+-------+---------+
| Bob   | 3       |
| Kathy | 2       |
| Nick  | 1       |
+-------+---------+
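Since you are using T-SQL, you can rank the per-person counts and pick rank 2 (filtering on cnt = 2 would only find people with exactly two pets):
WITH C AS (
SELECT Name
,COUNT(Pet) AS cnt
,DENSE_RANK() OVER (ORDER BY COUNT(Pet) DESC) AS rnk
FROM PetTable
GROUP BY Name
)
SELECT TOP 1 Name, cnt AS NumPets
FROM C
WHERE rnk = 2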
The ANSI standard method is to add this after the ORDER BY clause of your query:
OFFSET 1 ROW FETCH FIRST 1 ROW ONLY
However, most databases have their own syntax for this, using limit, top or rownum. You don't specify the database, so I'm sticking with the standard.
This is how you could use ROW_NUMBER to get the result.
SELECT *
FROM(
SELECT ROW_NUMBER() OVER (ORDER BY COUNT(name) DESC) as RN, Name, COUNT(NAME) AS COUNT
FROM PetTable
GROUP BY Name
) T
WHERE T.RN = 2
In MSSQL you can do this:
SELECT PetCounts.Name, PetCounts.NumPets FROM (
SELECT
RANK() OVER (ORDER BY COUNT(Pet) DESC) AS rank,
Name, COUNT(Pet)as NumPets
FROM PetTable
GROUP BY Name
) AS PetCounts
WHERE rank = 2
This will return multiple rows if several people share the same rank. If you want to return just one row, you can replace RANK() with ROW_NUMBER().
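The difference between RANK() and ROW_NUMBER() on ties can be seen in a small sketch. This one uses Python's built-in sqlite3 module instead of SQL Server, so treat it as an illustration only: the function name is made up, the counting is moved into a CTE to keep the query portable, the Name tie-breaker for ROW_NUMBER() is my own addition, and SQLite 3.25+ is assumed.

```python
import sqlite3

def second_place(rows, ranking="RANK"):
    """Return the (Name, NumPets) rows that the chosen function numbers as 2."""
    # Only these two functions are allowed; ROW_NUMBER gets a Name tie-breaker
    # so that its result is deterministic.
    order_by = {"RANK": "NumPets DESC", "ROW_NUMBER": "NumPets DESC, Name"}[ranking]
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE PetTable (Name TEXT, Pet TEXT)")
    conn.executemany("INSERT INTO PetTable VALUES (?, ?)", rows)
    query = f"""
        WITH PetCounts AS (
            SELECT Name, COUNT(Pet) AS NumPets
            FROM PetTable
            GROUP BY Name
        ),
        Ranked AS (
            SELECT Name, NumPets, {ranking}() OVER (ORDER BY {order_by}) AS rnk
            FROM PetCounts
        )
        SELECT Name, NumPets FROM Ranked WHERE rnk = 2 ORDER BY Name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

pets = [
    ("Kathy", "dog"), ("Kathy", "cat"), ("Nick", "gerbil"),
    ("Bob", "turtle"), ("Bob", "cat"), ("Bob", "snake"),
]
print(second_place(pets))                       # [('Kathy', 2)]

# Hypothetical extra row so that Kathy and Nick tie on two pets each.
tied = pets + [("Nick", "hamster")]
print(second_place(tied))                       # [('Kathy', 2), ('Nick', 2)]
print(second_place(tied, ranking="ROW_NUMBER")) # [('Kathy', 2)]
```

Note that with RANK() a tie for first place leaves no rank 2 at all (the ranks go 1, 1, 3), so DENSE_RANK() may be the better fit if "second most" should ignore such gaps.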

SELECT only latest record of an ID from given rows

I have the table shown below. How do I select only the latest data for each id, based on changeno?
+----+-------+----------+
| id | data  | changeno |
+----+-------+----------+
| 1  | Yes   | 1        |
| 2  | Yes   | 2        |
| 2  | Maybe | 3        |
| 3  | Yes   | 4        |
| 3  | Yes   | 5        |
| 3  | No    | 6        |
| 4  | No    | 7        |
| 5  | Maybe | 8        |
| 5  | Yes   | 9        |
+----+-------+----------+
I would want this result...
+----+-------+----------+
| id | data  | changeno |
+----+-------+----------+
| 1  | Yes   | 1        |
| 2  | Maybe | 3        |
| 3  | No    | 6        |
| 4  | No    | 7        |
| 5  | Yes   | 9        |
+----+-------+----------+
I currently have this SQL statement...
SELECT id, data, MAX(changeno) as changeno FROM Table1 GROUP BY id;
and clearly it doesn't return what I want. It should return an error because of the aggregate function. If I add fields to the GROUP BY clause it runs, but it doesn't return what I want. This SQL statement is by far the closest I could think of. I'd appreciate it if anybody could help me with this. Thank you in advance :)
This is typically referred to as the "greatest-n-per-group" problem. One way to solve this in SQL Server 2005 and higher is to use a CTE with a calculated ROW_NUMBER() based on the grouping of the id column, and sorting those by largest changeno first:
;WITH cte AS
(
SELECT id, data, changeno,
rn = ROW_NUMBER() OVER (PARTITION BY id ORDER BY changeno DESC)
FROM dbo.Table1
)
SELECT id, data, changeno
FROM cte
WHERE rn = 1
ORDER BY id;
You want to use row_number() for this:
select id, data, changeno
from (SELECT t.*,
row_number() over (partition by id order by changeno desc) as seqnum
FROM Table1 t
) t
where seqnum = 1;
Not a well-formed or performance-optimized query, but for small tasks it works fine.
SELECT * FROM TEST
WHERE changeno IN (SELECT MAX(changeno)
FROM TEST
GROUP BY id)
Another alternative:
DECLARE @Table1 TABLE
(
id INT, data VARCHAR(5), changeno INT
);
INSERT INTO @Table1
SELECT 1,'Yes',1
UNION ALL
SELECT 2,'Yes',2
UNION ALL
SELECT 2,'Maybe',3
UNION ALL
SELECT 3,'Yes',4
UNION ALL
SELECT 3,'Yes',5
UNION ALL
SELECT 3,'No',6
UNION ALL
SELECT 4,'No',7
UNION ALL
SELECT 5,'Maybe',8
UNION ALL
SELECT 5,'Yes',9
SELECT Y.id, Y.data, Y.changeno
FROM @Table1 Y
INNER JOIN (
SELECT id, changeno = MAX(changeno)
FROM @Table1
GROUP BY id
) X ON X.id = Y.id
WHERE X.changeno = Y.changeno
ORDER BY Y.id
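The two subquery-based answers above give the same result on the sample data, but they can diverge when changeno values repeat across ids. The sketch below illustrates this with Python's built-in sqlite3 module. The second data set, where changeno restarts for each id, is a hypothetical case that is not in the question, and the function name is made up.

```python
import sqlite3

def latest_per_id(rows):
    """Return (in_result, join_result) for the two subquery approaches."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Table1 (id INTEGER, data TEXT, changeno INTEGER)")
    conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)
    # Matches on changeno only, ignoring which id the maximum belongs to.
    in_query = """
        SELECT id, data, changeno
        FROM Table1
        WHERE changeno IN (SELECT MAX(changeno) FROM Table1 GROUP BY id)
        ORDER BY id, changeno
    """
    # Matches on both id and changeno, so each maximum stays with its own id.
    join_query = """
        SELECT Y.id, Y.data, Y.changeno
        FROM Table1 AS Y
        INNER JOIN (
            SELECT id, MAX(changeno) AS changeno
            FROM Table1
            GROUP BY id
        ) AS X ON X.id = Y.id AND X.changeno = Y.changeno
        ORDER BY Y.id
    """
    in_result = conn.execute(in_query).fetchall()
    join_result = conn.execute(join_query).fetchall()
    conn.close()
    return in_result, join_result

question_rows = [
    (1, "Yes", 1), (2, "Yes", 2), (2, "Maybe", 3), (3, "Yes", 4), (3, "Yes", 5),
    (3, "No", 6), (4, "No", 7), (5, "Maybe", 8), (5, "Yes", 9),
]
print(latest_per_id(question_rows))

# Hypothetical data where changeno restarts at 1 for every id.
per_id_rows = [(1, "Yes", 1), (1, "No", 2), (2, "Maybe", 1)]
print(latest_per_id(per_id_rows))
```

In the hypothetical case the IN version also returns (1, 'Yes', 1), because 1 happens to be the maximum changeno of id 2. The IN query is therefore only safe while changeno is unique across the whole table.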

Make a field monotonic across all rows

I have a table in my SQL Server database with a column that I want to convert to a PK column.
To do that, I want to change the value in each row of this column to 1, 2, 3, ...
Could you write a T-SQL query for that task?
Thanks for your help.
begin state:
Id | Name |
----------
1 | One |
2 | Two |
2 | Three|
x | xxx |
result:
Id | Name |
----------
1 | One |
2 | Two |
3 | Three|
4 | xxx |
;with cte as
(
SELECT Id, ROW_NUMBER() over (order by Id) as rn
from YourTable
)
UPDATE cte SET Id = rn
You can also order by Name if you don't have the Id:
;with cte as
(
SELECT Id, ROW_NUMBER() over (order by name) as rn
from YourTable
)
UPDATE cte SET Id = rn
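Updating through a CTE like this is a SQL Server feature. For readers who want to experiment elsewhere, here is a rough equivalent using Python's built-in sqlite3 module. It swaps in a different technique: the new numbers are first written to a temporary mapping table keyed by SQLite's implicit rowid, and the UPDATE then reads from that table. The names new_ids and rid are made up, and the value 7 stands in for the unknown x in the question.

```python
import sqlite3

def renumber_ids(rows):
    """Rewrite Id as 1, 2, 3, ... following the current Id order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (Id INTEGER, Name TEXT)")
    conn.executemany("INSERT INTO YourTable VALUES (?, ?)", rows)
    # Step 1: work out the new number for every physical row.
    # rowid is used as a tie-breaker so duplicate Ids keep insertion order.
    conn.execute("""
        CREATE TEMP TABLE new_ids AS
        SELECT rowid AS rid,
               ROW_NUMBER() OVER (ORDER BY Id, rowid) AS rn
        FROM YourTable
    """)
    # Step 2: copy the new numbers back, matching rows by rowid.
    conn.execute("""
        UPDATE YourTable
        SET Id = (SELECT rn FROM new_ids WHERE new_ids.rid = YourTable.rowid)
    """)
    result = conn.execute("SELECT Id, Name FROM YourTable ORDER BY Id").fetchall()
    conn.close()
    return result

start = [(1, "One"), (2, "Two"), (2, "Three"), (7, "xxx")]
print(renumber_ids(start))  # [(1, 'One'), (2, 'Two'), (3, 'Three'), (4, 'xxx')]
```

Computing the mapping before the UPDATE keeps the numbering from being affected by rows that have already been changed.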