SQL: Select only one row of table with same value

I'm a bit new to SQL, and for my project I need to do some database sorting and filtering.
Let's assume my database looks like this:
===============================
| id | email        | name    |
===============================
| 1  | 123@test.com | John    |
| 2  | 234@test.com | Peter   |
| 3  | 234@test.com | Steward |
| 4  | 123@test.com | Ethan   |
| 5  | 542@test.com | Bob     |
| 6  | 123@test.com | Patrick |
===============================
What should I do so that only the last row for each email is returned:
===============================
| id | email        | name    |
===============================
| 3  | 234@test.com | Steward |
| 5  | 542@test.com | Bob     |
| 6  | 123@test.com | Patrick |
===============================
Thanks in advance!

SQL Query:
SELECT * FROM test.test1 WHERE id IN (
SELECT MAX(id) FROM test.test1 GROUP BY email
);
Hope this solves your problem. Thanks.
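To check this against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the unqualified table name test1 (instead of test.test1) are assumptions made only for this illustration.

```python
import sqlite3

# Sample rows from the question.
SAMPLE_ROWS = [
    (1, "123@test.com", "John"),
    (2, "234@test.com", "Peter"),
    (3, "234@test.com", "Steward"),
    (4, "123@test.com", "Ethan"),
    (5, "542@test.com", "Bob"),
    (6, "123@test.com", "Patrick"),
]


def last_row_per_email(rows):
    """Return the row with the highest id for each email, ordered by id."""
    conn = sqlite3.connect(":memory:")
    # Table name "test1" is an assumption for this sketch.
    conn.execute("CREATE TABLE test1 (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
    conn.executemany("INSERT INTO test1 VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT id, email, name
        FROM test1
        WHERE id IN (SELECT MAX(id) FROM test1 GROUP BY email)
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result


print(last_row_per_email(SAMPLE_ROWS))
```

With the sample rows this keeps ids 3, 5 and 6, which matches the desired output in the question.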

A generic way to do this in SQL is to use the ANSI standard row_number() function:
select t.*
from (select t.*, row_number() over (partition by email order by id desc) as seqnum
from t
) t
where seqnum = 1;
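A small sketch of the row_number() approach, run through Python's sqlite3 module on the question's sample rows. It assumes the bundled SQLite is version 3.25 or newer (window functions were added in that release); the table name t follows the answer, and the inner alias ranked is made up for readability.

```python
import sqlite3

# Sample rows from the question.
SAMPLE_ROWS = [
    (1, "123@test.com", "John"),
    (2, "234@test.com", "Peter"),
    (3, "234@test.com", "Steward"),
    (4, "123@test.com", "Ethan"),
    (5, "542@test.com", "Bob"),
    (6, "123@test.com", "Patrick"),
]


def latest_by_row_number(rows):
    """Number the rows of each email from highest id down and keep number 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, email TEXT, name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT id, email, name
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY email ORDER BY id DESC) AS seqnum
            FROM t
        ) AS ranked
        WHERE seqnum = 1
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result


print(latest_by_row_number(SAMPLE_ROWS))
```

Changing the ORDER BY inside the OVER clause (for example to id ASC) is all it takes to pick the first row per email instead of the last.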

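A shorter query like the following is not equivalent: it returns a single row overall (the one whose email sorts last), not one row per email: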
SELECT *
FROM table
ORDER BY email DESC
LIMIT 1;

You can use the following query to get the MAX id value per email:
SELECT email, MAX(id)
FROM mytable
GROUP BY email
Using the above query as a derived table you can obtain the whole record:
SELECT t1.*
FROM mytable AS t1
JOIN (
SELECT email, MAX(id) AS id
FROM mytable
GROUP BY email
) AS t2 ON t1.id = t2.id

Related

SQL - SELECT duplicates between IDs, but not show records if duplicates occur for same ID

I have the following table (simplified from the real table) at the moment:
+----+-------+-------+
| ID | Name | Phone |
+----+-------+-------+
| 1 | Tom | 123 |
| 1 | Tom | 123 |
| 1 | Tom | 123 |
| 2 | Mark | 321 |
| 2 | Mark | 321 |
| 3 | Kate | 321 |
+----+-------+-------+
My desired output in the SELECT statement is:
+----+------+-------+
| ID | Name | Phone |
+----+------+-------+
| 2 | Mark | 321 |
| 3 | Kate | 321 |
+----+------+-------+
I want to select duplicates only when they occur between two different IDs (like Mark and Kate sharing the same phone number), but not to show any records for IDs that share the same phone number with themselves only (like Tom).
Could someone advise how this can be achieved?
You can use an EXISTS condition with a correlated subquery to ensure that another record exists that has the same phone and a different id. We also need DISTINCT to remove the duplicates in the resultset.
SELECT DISTINCT id, name, phone
FROM mytable t
WHERE EXISTS (
SELECT 1
FROM mytable t1
WHERE t1.phone = t.phone AND t1.id <> t.id
)
Demo on DB Fiddle:
| id | name | phone |
| --- | ---- | ----- |
| 2 | Mark | 321 |
| 3 | Kate | 321 |
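For anyone who wants to reproduce that result locally, here is a minimal sketch with Python's built-in sqlite3 module. The in-memory database and storing phone as text are assumptions for the illustration; the query is the EXISTS query from the answer with an ORDER BY added so the output order is predictable.

```python
import sqlite3

# Sample rows from the question (Tom appears three times, Mark twice).
SAMPLE_ROWS = [
    (1, "Tom", "123"),
    (1, "Tom", "123"),
    (1, "Tom", "123"),
    (2, "Mark", "321"),
    (2, "Mark", "321"),
    (3, "Kate", "321"),
]


def shared_phone_across_ids(rows):
    """Return distinct rows whose phone also appears under a different id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, phone TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT DISTINCT id, name, phone
        FROM mytable t
        WHERE EXISTS (
            SELECT 1
            FROM mytable t1
            WHERE t1.phone = t.phone AND t1.id <> t.id
        )
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result


print(shared_phone_across_ids(SAMPLE_ROWS))
```

Tom is dropped because every row with phone 123 has id 1, so the correlated subquery never finds a row with a different id.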
You can use window functions for this:
select t.*
from (select t.*,
row_number() over (partition by phone, name order by id) as seqnum,
min(id) over (partition by phone) as min_id,
max(id) over (partition by phone) as max_id
from t
) t
where seqnum = 1 and min_id <> max_id;
Another method uses aggregation and a window function:
select phone, name, id
from (select phone, name, id,
count(*) over (partition by phone) as num_ids
from t
group by phone, name, id
) pn
where num_ids > 1;
Both of these have the advantage over the exists solution (GMB's) that they refer to the "table" only once. That can be a big advantage if the table is a complex view or query. If performance is an issue, I would encourage you to test several variants to see which works best.
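As a rough check of the first window-function variant, the sketch below runs it on the question's rows via Python's sqlite3 module. It assumes SQLite 3.25 or newer for window function support; the inner alias ranked and the text type for phone are choices made for this illustration.

```python
import sqlite3

# Sample rows from the question.
SAMPLE_ROWS = [
    (1, "Tom", "123"),
    (1, "Tom", "123"),
    (1, "Tom", "123"),
    (2, "Mark", "321"),
    (2, "Mark", "321"),
    (3, "Kate", "321"),
]


def shared_phone_window(rows):
    """Single scan of the table: keep one row per (phone, name) when the
    phone is used by more than one id (min id differs from max id)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT, phone TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT id, name, phone
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY phone, name ORDER BY id) AS seqnum,
                   MIN(id) OVER (PARTITION BY phone) AS min_id,
                   MAX(id) OVER (PARTITION BY phone) AS max_id
            FROM t
        ) AS ranked
        WHERE seqnum = 1 AND min_id <> max_id
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result


print(shared_phone_window(SAMPLE_ROWS))
```

The min_id <> max_id test is what replaces the second reference to the table: it is true only when at least two different ids share the phone.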
You can also use a correlated subquery with GROUP BY and HAVING, as below:
SELECT id, name, MAX(phone) AS phone
FROM mytable t
GROUP BY id, name
HAVING MAX(
CASE
WHEN phone IN (SELECT phone FROM mytable t2 WHERE t2.id <> t.id) THEN 1
ELSE 0
END
) = 1

I am not sure why it's giving this error. Remove duplicate records in Primary Key

Select
*
from [myTable]
WHERE [myTable].ID IN
(
Select
Min( [myTable].ID ),
[myTable].Username
FROM [myTable]
group by [myTable].Username);
Gives me the error:
You have written a subquery that can return more than one field without using EXISTS reserved word in the main query's FROM clause. Revise the Select statement of the subquery to request only one field
I have duplicate records in Username, so I am trying to eliminate them by using MIN of ID number as the first record in Username is correct. Can someone help or tell me where to look?
+------+-------+-------+---------+--------------+
| Data | id | Fname | Lname | Status |
+------+-------+-------+---------+--------------+
| 1 | 12345 | Kunal | Kumar | completed |
| 2 | 12345 | Kunal | Kumar | Not Started |
| 3 | 12346 | Rahul | Malviya | Completed |
| 4 | 12346 | Rahul | Malviya | Not Started |
+------+-------+-------+---------+--------------+
The problem is that you are trying to compare ID with a tuple {ID, Username}: the subquery returns two columns, but IN expects one.
Instead, use the outer row's value as a filter for the inner query:
SELECT *
FROM [myTable] T1
WHERE T1.ID =
(SELECT Min( T2.ID )
FROM [myTable] T2
WHERE T2.Username = T1.Username);
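A quick way to see the correlated subquery at work is the sketch below, using Python's sqlite3 module rather than MS Access. The rows are hypothetical (the question does not show the real ID / Username data), so treat the table contents as an assumption.

```python
import sqlite3

# Hypothetical data: the question does not list the actual ID/Username rows.
SAMPLE_ROWS = [
    (1, "alice"),
    (2, "alice"),
    (3, "bob"),
    (4, "bob"),
    (5, "carol"),
]


def first_row_per_username(rows):
    """Keep, for each Username, only the row with the smallest ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE myTable (ID INTEGER PRIMARY KEY, Username TEXT)")
    conn.executemany("INSERT INTO myTable VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT T1.ID, T1.Username
        FROM myTable T1
        WHERE T1.ID = (
            SELECT MIN(T2.ID)
            FROM myTable T2
            WHERE T2.Username = T1.Username
        )
        ORDER BY T1.ID
        """
    ).fetchall()
    conn.close()
    return result


print(first_row_per_username(SAMPLE_ROWS))
```

The inner query returns exactly one column and one value per outer row, which is why the comparison with T1.ID is valid.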
Try this:
Select
*
from [myTable]
WHERE [myTable].ID IN
(
Select
Min( [myTable].ID )
FROM [myTable]
group by [myTable].Username);
I don't know much about MS Access but does it require a column to be selected before you can use it in GROUP BY?

SQL - How to find which page is the first for users?

I have a table like this:
+----------+-------------------------------------+----------------------------------+
| user_id | time | url |
+----------+-------------------------------------+----------------------------------+
| 1 | 02.04.2017 8:56 | www.landingpage.com/ |
| 1 | 02.04.2017 8:57 | www.landingpage.com/about-us |
| 1 | 02.04.2017 8:58 | www.landingpage.com/faq |
| 2 | 02.04.2017 6:34 | www.landingpage.com/about-us |
| 2 | 02.04.2017 6:35 | www.landingpage.com/how-to-order |
| 3 | 03.04.2017 9:11 | www.landingpage.com/ |
| 3 | 03.04.2017 9:12 | www.landingpage.com/contact |
| 3 | 03.04.2017 9:13 | www.landingpage.com/about-us |
| 3 | 03.04.2017 9:14 | www.landingpage.com/our-legacy |
| 3 | 03.04.2017 9:15 | www.landingpage.com/ |
+----------+-------------------------------------+----------------------------------+
I want to figure out which page is the first for most users (the first page a user sees when they come to the site) and count the number of times it is viewed as the first page.
Is there a way to write a query to do this? I guess I need to use
MIN(time)
in conjunction with grouping but I don't know how.
So regarding the sample I provided it should be like:
url url_count
---------------------------------------------------
www.landingpage.com/ 2
www.landingpage.com/about-us 1
Thanks!
You're correct, you'll need to use the min() aggregate function within a subselect.
select
my_table.url
from
my_table
where
my_table.time = (
select
min(t.time)
from
my_table t
where
t.user_id = my_table.user_id
)
Replace my_table with whatever your table is actually named.
To include how many pages the user has seen, you'll need something like this:
select
my_table.url
, (
select
count(t.url)
from
my_table t
where
t.user_id = my_table.user_id
) as url_count
from
my_table
where
my_table.time = (
select
min(t.time)
from
my_table t
where
t.user_id = my_table.user_id
)
SELECT *
FROM my_table
WHERE time IN
(
SELECT min(time)
FROM my_table
GROUP BY user_id
);
You can query as below:
Select top (1) with ties *
from yourtable
order by row_number() over(partition by user_id order by [time])
You can also use an outer query to get the same result:
Select * from (
Select *, RowN = row_number() over(partition by user_id order by [time]) from yourtable) a
Where a.RowN = 1
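To get the url / url_count summary the question asks for, the "first row per user" filter can be combined with GROUP BY on url. Below is a minimal sketch with Python's sqlite3 module. It assumes the timestamps are stored as ISO-8601 text (so that MIN compares them chronologically) and uses a hypothetical table name visits; neither detail comes from the question.

```python
import sqlite3

# Sample rows from the question, with times rewritten as ISO-8601 text
# (an assumption, so that string comparison matches chronological order).
SAMPLE_ROWS = [
    (1, "2017-04-02 08:56", "www.landingpage.com/"),
    (1, "2017-04-02 08:57", "www.landingpage.com/about-us"),
    (1, "2017-04-02 08:58", "www.landingpage.com/faq"),
    (2, "2017-04-02 06:34", "www.landingpage.com/about-us"),
    (2, "2017-04-02 06:35", "www.landingpage.com/how-to-order"),
    (3, "2017-04-03 09:11", "www.landingpage.com/"),
    (3, "2017-04-03 09:12", "www.landingpage.com/contact"),
    (3, "2017-04-03 09:13", "www.landingpage.com/about-us"),
    (3, "2017-04-03 09:14", "www.landingpage.com/our-legacy"),
    (3, "2017-04-03 09:15", "www.landingpage.com/"),
]


def first_page_counts(rows):
    """Count how often each url is the earliest page seen by a user."""
    conn = sqlite3.connect(":memory:")
    # "visits" is a hypothetical table name for this sketch.
    conn.execute("CREATE TABLE visits (user_id INTEGER, time TEXT, url TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT v.url, COUNT(*) AS url_count
        FROM visits v
        WHERE v.time = (
            SELECT MIN(t.time)
            FROM visits t
            WHERE t.user_id = v.user_id
        )
        GROUP BY v.url
        ORDER BY url_count DESC, v.url
        """
    ).fetchall()
    conn.close()
    return result


print(first_page_counts(SAMPLE_ROWS))
```

On the sample data the home page is the first page for users 1 and 3, and about-us for user 2, which agrees with the expected result in the question.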

Find MAX LEN name against duplicate IDs

Being a beginner at SQL, I'm stuck.
I have a table structure like this:
+----+----------+---------+
| id | name     | content |
+----+----------+---------+
| 1  | Jack     | ...     |
| 2  | Dan      | ...     |
| 1  | Joe      | ...     |
| 1  | Jeoffery | ...     |
+----+----------+---------+
What I want to do is select the distinct IDs along with the longest name for each ID.
For example: for ID 1 it should return Jeoffery, while for ID 2 it should return Dan.
Any help would be much appreciated.
You can use ROW_NUMBER():
;WITH CTE AS
(
SELECT id,
name,
RN = ROW_NUMBER() OVER(PARTITION BY id ORDER BY LEN(name) DESC)
FROM dbo.YourTable -- placeholder name: replace with your actual table
)
SELECT id,
name
FROM CTE
WHERE RN = 1;
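The same idea can be tried outside SQL Server. The sketch below uses Python's sqlite3 module, where the string-length function is LENGTH rather than T-SQL's LEN. It assumes SQLite 3.25 or newer for ROW_NUMBER(), and the table name people is hypothetical.

```python
import sqlite3

# Sample rows from the question (content column omitted).
SAMPLE_ROWS = [
    (1, "Jack"),
    (2, "Dan"),
    (1, "Joe"),
    (1, "Jeoffery"),
]


def longest_name_per_id(rows):
    """Return one row per id, choosing the name with the greatest length."""
    conn = sqlite3.connect(":memory:")
    # "people" is a hypothetical table name for this sketch.
    conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT id, name
        FROM (
            SELECT id,
                   name,
                   ROW_NUMBER() OVER (
                       PARTITION BY id ORDER BY LENGTH(name) DESC
                   ) AS rn
            FROM people
        ) AS ranked
        WHERE rn = 1
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result


print(longest_name_per_id(SAMPLE_ROWS))
```

If two names for the same id have equal length, which one gets rn = 1 is not defined by this ordering; adding a second sort key such as name would make the choice deterministic.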

SELECT only latest record of an ID from given rows

I have this table shown below...How do I select only the latest data of the id based on changeno?
+----+-------+----------+
| id | data  | changeno |
+----+-------+----------+
| 1  | Yes   | 1        |
| 2  | Yes   | 2        |
| 2  | Maybe | 3        |
| 3  | Yes   | 4        |
| 3  | Yes   | 5        |
| 3  | No    | 6        |
| 4  | No    | 7        |
| 5  | Maybe | 8        |
| 5  | Yes   | 9        |
+----+-------+----------+
I would want this result...
+----+-------+----------+
| id | data  | changeno |
+----+-------+----------+
| 1  | Yes   | 1        |
| 2  | Maybe | 3        |
| 3  | No    | 6        |
| 4  | No    | 7        |
| 5  | Yes   | 9        |
+----+-------+----------+
I currently have this SQL statement...
SELECT id, data, MAX(changeno) as changeno FROM Table1 GROUP BY id;
and clearly it doesn't return what I want. This should return an error because of the aggregate function. If I add fields to the GROUP BY clause it works, but it still doesn't return what I want. This SQL statement is the closest I could come up with. I'd appreciate it if anybody could help me on this. Thank you in advance :)
This is typically referred to as the "greatest-n-per-group" problem. One way to solve this in SQL Server 2005 and higher is to use a CTE with a calculated ROW_NUMBER() based on the grouping of the id column, and sorting those by largest changeno first:
;WITH cte AS
(
SELECT id, data, changeno,
rn = ROW_NUMBER() OVER (PARTITION BY id ORDER BY changeno DESC)
FROM dbo.Table1
)
SELECT id, data, changeno
FROM cte
WHERE rn = 1
ORDER BY id;
You want to use row_number() for this:
select id, data, changeno
from (SELECT t.*,
row_number() over (partition by id order by changeno desc) as seqnum
FROM Table1 t
) t
where seqnum = 1;
Not a well-formed or performance-optimized query, but for small tasks it works fine:
SELECT * FROM TEST
WHERE changeno IN (SELECT MAX(changeno)
FROM TEST
GROUP BY id)
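One caveat worth spelling out: the IN version matches on changeno alone, so it only works when changeno values never repeat across different ids (as in the question's data). The sketch below, using Python's sqlite3 module and deliberately hypothetical rows where changeno 2 occurs under two ids, compares it with a join on both id and changeno.

```python
import sqlite3

# Hypothetical rows (not from the question): changeno 2 appears under
# both id 1 and id 2, which is what trips up the IN version.
SAMPLE_ROWS = [
    (1, "a", 1),
    (1, "b", 2),
    (2, "c", 2),
    (2, "d", 3),
]


def compare_latest_queries(rows):
    """Return (rows from the IN query, rows from the id+changeno join)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER, data TEXT, changeno INTEGER)")
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
    in_rows = conn.execute(
        """
        SELECT id, data, changeno
        FROM test
        WHERE changeno IN (SELECT MAX(changeno) FROM test GROUP BY id)
        ORDER BY id, changeno
        """
    ).fetchall()
    join_rows = conn.execute(
        """
        SELECT y.id, y.data, y.changeno
        FROM test y
        JOIN (
            SELECT id, MAX(changeno) AS changeno
            FROM test
            GROUP BY id
        ) x ON x.id = y.id AND x.changeno = y.changeno
        ORDER BY y.id
        """
    ).fetchall()
    conn.close()
    return in_rows, join_rows


in_rows, join_rows = compare_latest_queries(SAMPLE_ROWS)
print(in_rows)
print(join_rows)
```

Here the IN query also returns the row (2, 'c', 2), because 2 is the maximum changeno for id 1; the join returns exactly one row per id.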
Another alternative:
DECLARE @Table1 TABLE
(
id INT, data VARCHAR(5), changeno INT
);
INSERT INTO @Table1
SELECT 1,'Yes',1
UNION ALL
SELECT 2,'Yes',2
UNION ALL
SELECT 2,'Maybe',3
UNION ALL
SELECT 3,'Yes',4
UNION ALL
SELECT 3,'Yes',5
UNION ALL
SELECT 3,'No',6
UNION ALL
SELECT 4,'No',7
UNION ALL
SELECT 5,'Maybe',8
UNION ALL
SELECT 5,'Yes',9;
SELECT Y.id, Y.data, Y.changeno
FROM @Table1 Y
INNER JOIN (
SELECT id, changeno = MAX(changeno)
FROM @Table1
GROUP BY id
) X ON X.id = Y.id
WHERE X.changeno = Y.changeno
ORDER BY Y.id;