SQL - Removing duplicates without hard-coding?

Here's my scenario.
I have a table with 3 columns that I want to return from a stored procedure: email, name and id. The id must be 3 or 4, and each email must appear only once per user, as some users have multiple entries.
I have a SELECT statement as follows:
SELECT
DISTINCT email,
name,
id
from table
where
id = 3
or id = 4
OK, fairly simple, but some users have entries with both 3 and 4, so they appear twice. If they appear twice, I want to keep only the row with an id of 4. I'll give another example below, as it's hard to explain.
Table -
Email             Name   Id
jimmy@domain.com  jimmy  4
brian@domain.com  brian  4
kevin@domain.com  kevin  3
jimmy@domain.com  jimmy  3
So in the above scenario I would want to ignore the jimmy row with the id of 3. Is there any way of doing this without hard-coding?
Thanks

SELECT
email,
name,
max(id)
from table
where
id in( 3, 4 )
group by email, name
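
To try the MAX(id) query above against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table name users is a placeholder I made up, and SQLite stands in for whatever database the stored procedure actually lives in.

```python
import sqlite3

# Hypothetical table name "users"; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, name TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("jimmy@domain.com", "jimmy", 4),
        ("brian@domain.com", "brian", 4),
        ("kevin@domain.com", "kevin", 3),
        ("jimmy@domain.com", "jimmy", 3),
    ],
)

# One row per (email, name); MAX(id) picks 4 over 3 when both exist.
rows = conn.execute(
    """
    SELECT email, name, MAX(id) AS id
    FROM users
    WHERE id IN (3, 4)
    GROUP BY email, name
    ORDER BY email
    """
).fetchall()

for row in rows:
    print(row)
```

jimmy comes back once, with id 4, and nothing about jimmy is hard-coded in the query.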

Is this what you want to achieve? Note that Name has to be in the GROUP BY as well, otherwise most databases reject the query.
SELECT Email, Name, MAX(Id) FROM Table WHERE Id IN (3, 4) GROUP BY Email, Name;

Sometimes HAVING COUNT(*) > 1 is useful for finding duplicated records:
select Email, count(*) from table group by Email having count(*) > 1
or, to keep only the duplicated emails that have an id above 3:
select Email, count(*) from table group by Email having count(*) > 1 and max(id) > 3
The solution given earlier, using MAX(id), is a good fit for this case.
This may be an alternative solution.
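
As a rough illustration of the HAVING COUNT(*) > 1 idea, this sketch lists the emails that occur more than once in the question's sample data. It uses Python's sqlite3 module, and the table name users is my own placeholder.

```python
import sqlite3

# Hypothetical table name "users"; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, name TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("jimmy@domain.com", "jimmy", 4),
        ("brian@domain.com", "brian", 4),
        ("kevin@domain.com", "kevin", 3),
        ("jimmy@domain.com", "jimmy", 3),
    ],
)

# Group by email and keep only the groups with more than one row.
dupes = conn.execute(
    """
    SELECT email, COUNT(*) AS n
    FROM users
    GROUP BY email
    HAVING COUNT(*) > 1
    """
).fetchall()

print(dupes)  # [('jimmy@domain.com', 2)]
```

This only reports which emails are duplicated; it does not pick a row to keep.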

What RDBMS are you using? This will return only one "Jimmy", using RANK():
SELECT A.email, A.name,A.id
FROM SO_Table A
INNER JOIN(
SELECT
email, name,id,RANK() OVER (Partition BY name ORDER BY ID DESC) AS COUNTER
FROM SO_Table B
) X ON X.ID = A.ID AND X.NAME = A.NAME
WHERE X.COUNTER = 1
Returns:
email             name   id
------------------------------
jimmy@domain.com  jimmy  4
brian@domain.com  brian  4
kevin@domain.com  kevin  3

Related

How to select people who has multiple values

I want to select people who have both values (activate and recurring) in the table. For example:
table :: tbl_transactions
id  name   action
1   John   activate
2   John   recurring
3   Salah  activate
4   Bill   activate
5   Bill   recurring
6   Bill   recurring
Expected result:
id  name   action
1   John   activate
2   John   recurring
4   Bill   activate
5   Bill   recurring
6   Bill   recurring
Please help. I have spent an hour trying to fix this.
Thanks a lot.
You can aggregate the distinct action values for each user name and check that the array has length 2 (since you only need 2 actions) and is contained in ['activate', 'recurring'] (since you only need these values). Aggregating with DISTINCT keeps Bill, whose 'recurring' action appears twice.
SELECT t.id, t.name, t.action FROM tbl_transactions t
JOIN LATERAL (
SELECT
name,
ARRAY_AGG(DISTINCT action) AS actions
FROM tbl_transactions
GROUP BY name
) user_actions ON t.name = user_actions.name
AND ARRAY_LENGTH(actions, 1) = 2
AND ARRAY['activate', 'recurring']::VARCHAR[] @> actions
Here is a query for selecting only names appearing twice:
select t.*
from t
join (select name, count(*) as cnt
from t
group by name
having count(*) = 2
) c on t.name = c.name ;
The query is below. You can use count(distinct action) to count the distinct action values per name and check that the count is greater than 1.
select a.* from tbl_transactions a join
(select name, count(*) as cnt from tbl_transactions group by name having count(distinct action) > 1) b
on a.name = b.name
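
To check the count(distinct action) approach against the sample data from the question, here is a small sketch using Python's sqlite3 module. SQLite is only a stand-in for the real database. I also added a WHERE filter on the two wanted actions and compare against exactly 2, which is my own tightening in case other action values exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_transactions (id INTEGER PRIMARY KEY, name TEXT, action TEXT)"
)
conn.executemany(
    "INSERT INTO tbl_transactions VALUES (?, ?, ?)",
    [
        (1, "John", "activate"),
        (2, "John", "recurring"),
        (3, "Salah", "activate"),
        (4, "Bill", "activate"),
        (5, "Bill", "recurring"),
        (6, "Bill", "recurring"),
    ],
)

# The subquery keeps names that have both actions; the join returns all their rows.
rows = conn.execute(
    """
    SELECT a.id, a.name, a.action
    FROM tbl_transactions a
    JOIN (
        SELECT name
        FROM tbl_transactions
        WHERE action IN ('activate', 'recurring')
        GROUP BY name
        HAVING COUNT(DISTINCT action) = 2
    ) b ON a.name = b.name
    ORDER BY a.id
    """
).fetchall()

for row in rows:
    print(row)
```

Salah (id 3) is dropped because he only has activate, while all three of Bill's rows are kept.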
I would use the bool_or() window function for this:
with complete_check as (
select id, name, action,
bool_or(action = 'recurring') over w as has_recurring,
bool_or(action = 'activate') over w as has_activate
from tbl_transactions
window w as (partition by name)
)
select id, name, action
from complete_check
where has_recurring and has_activate;

How can I delete completely duplicate rows from a query, without having a unique value for it?

I'm having an issue getting information from an MS Access database table. I need a count of codes, but duplicate rows must not be taken into account, which means that I need to exclude all duplicated rows first.
Here's an example to illustrate what I need:
Code | Name
12 | George
20 | John
12 | George
33 | John
I first need to remove both rows that share the same code, and then I need a count per name over the rest of the table data. For example, this is the result I'm expecting:
Name | Count
John | 2
I already have a query that does this, but it takes around 1 hour to return around 5000 rows, and I need something more efficient. My query:
select name, count(*) from Table
where name = '" + input_name + "'
and code in (select code from Table group by code
having count(code) = 1)
group by name
order by count(name) desc;
I would appreciate any suggestion.
Rather than using in, I might suggest filtering the original dataset in a subquery, e.g.:
select u.name, count(*)
from (select t.code, t.name from yourtable t group by t.code, t.name having count(*) = 1) u
group by u.name
Here, change yourtable to the name of your table.
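
Here is a quick sketch of the subquery approach run against the sample rows, using Python's sqlite3 module instead of Access, so syntax details may differ slightly there. The table name codes is a placeholder of mine.

```python
import sqlite3

# Hypothetical table name "codes"; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (code INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO codes VALUES (?, ?)",
    [(12, "George"), (20, "John"), (12, "George"), (33, "John")],
)

# Inner query keeps only (code, name) pairs that occur exactly once;
# outer query counts the surviving rows per name.
result = conn.execute(
    """
    SELECT u.name, COUNT(*) AS cnt
    FROM (
        SELECT code, name
        FROM codes
        GROUP BY code, name
        HAVING COUNT(*) = 1
    ) AS u
    GROUP BY u.name
    """
).fetchall()

print(result)  # [('John', 2)]
```

Both George rows drop out because code 12 appears twice, leaving John with a count of 2.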

Complex SQL query or queries

I looked at other examples, but I don't know enough about SQL to adapt it to my needs. I have a table that looks like this:
ID  Month    NAME  COUNT  First  LAST  TOTAL
------------------------------------------------------
1   JAN2013  fred  4
2   MAR2013  fred  5
3   APR2014  fred  1
4   JAN2013  Tom   6
5   MAR2014  Tom   1
6   APR2014  Tom   1
This could be in separate queries, but I need 'First' to equal the first month in which a particular name is used, so every row with fred would have JAN2013 in the First field, for example. I need the 'Last' column to equal the month of the last record for each name, and finally I need the 'Total' column to be the sum of all the counts for each name, so in each row that has fred the total would be 10 in this sample data. This is over my head. Can one of you assist?
This is crude but should do the trick. I renamed your fields a bit (Month to txtMONTH, COUNT to nmCOUNT) because several of them are reserved SQL words, which is bad form.
;WITH cte as
(
Select
[NAME]
,[txtMONTH]
,[nmCOUNT]
-- ordering by ID assumes rows were inserted in chronological order;
-- ordering by the text month would sort alphabetically (APR before JAN)
,ROW_NUMBER() over (partition by [NAME] order by [ID] ASC) as FirstMonth
,ROW_NUMBER() over (partition by [NAME] order by [ID] DESC) as LastMonth
,SUM([nmCOUNT]) over (partition by [NAME]) as TotNameCount
From Table
)
,cteFirst as
(
Select
[NAME]
,[nmCOUNT]
,[TotNameCount]
,[txtMONTH] as ansFirst
From cte
Where FirstMonth = 1
)
,cteLast as
(
Select
[NAME]
,[txtMONTH] as ansLast
From cte
Where LastMonth = 1
)
Select c.NAME, c.nmCount, c.ansFirst, l.ansLast, c.TotNameCount
From cteFirst c
LEFT JOIN cteLast l on c.NAME = l.NAME
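
If you want First, Last and Total on every row rather than one row per name, a plain GROUP BY joined back to the table is another way to get there. The sketch below uses Python's sqlite3 module with made-up names (table monthly, column cnt), and it assumes ID increases in chronological order, since the month is stored as text.

```python
import sqlite3

# Hypothetical table/column names; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly (id INTEGER PRIMARY KEY, month TEXT, name TEXT, cnt INTEGER)"
)
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?, ?, ?)",
    [
        (1, "JAN2013", "fred", 4),
        (2, "MAR2013", "fred", 5),
        (3, "APR2014", "fred", 1),
        (4, "JAN2013", "Tom", 6),
        (5, "MAR2014", "Tom", 1),
        (6, "APR2014", "Tom", 1),
    ],
)

# s: per-name first id, last id and total; f and l look up the month text
# for those ids. Assumes id order matches chronological order.
rows = conn.execute(
    """
    SELECT t.id, t.month, t.name, t.cnt,
           f.month AS first_month, l.month AS last_month, s.total
    FROM monthly t
    JOIN (
        SELECT name, MIN(id) AS first_id, MAX(id) AS last_id, SUM(cnt) AS total
        FROM monthly
        GROUP BY name
    ) s ON s.name = t.name
    JOIN monthly f ON f.id = s.first_id
    JOIN monthly l ON l.id = s.last_id
    ORDER BY t.id
    """
).fetchall()

for row in rows:
    print(row)
```

Every fred row gets JAN2013, APR2014 and a total of 10, which matches what the question asks for.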

write a query to identify discrepancy

I have a table with student IDs and student names. There have been issues with assigning unique student IDs to students, and hence I want to find the duplicates.
Here is the sample Table:
Student ID Student Name
1 Jack
1 John
1 Bill
2 Amanda
2 Molly
3 Ron
4 Matt
5 James
6 Kathy
6 Will
Here I want a third column "Duplicate_Count" to display the count of duplicate records.
For example, "Duplicate_Count" would display "3" for Student ID = 1, and so on. How can I do this?
Thanks in advance
Select StudentId, Count(*) DupCount
From Table
Group By StudentId
Having Count(*) > 1
Order By Count(*) desc
Select
aa.StudentId, aa.StudentName, bb.DupCount
from
Table as aa
join
(
Select StudentId, Count(*) as DupCount from Table group by StudentId
) as bb
on aa.StudentId = bb.StudentId
The derived table (the subquery) gives the count for each StudentId; this is joined back to the original table to add the count to each student record.
If you want to add a column to the table to hold DupCount, this query can be used in an update statement to update that column in the table.
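
To see the join-back in action on the sample table, here is a small sketch using Python's sqlite3 module. The table and column names (students, student_id, student_name) are placeholders I picked for the example.

```python
import sqlite3

# Hypothetical table/column names; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER, student_name TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [
        (1, "Jack"), (1, "John"), (1, "Bill"),
        (2, "Amanda"), (2, "Molly"),
        (3, "Ron"), (4, "Matt"), (5, "James"),
        (6, "Kathy"), (6, "Will"),
    ],
)

# Count rows per student_id, then attach that count to every student row.
rows = conn.execute(
    """
    SELECT aa.student_id, aa.student_name, bb.dup_count
    FROM students AS aa
    JOIN (
        SELECT student_id, COUNT(*) AS dup_count
        FROM students
        GROUP BY student_id
    ) AS bb ON aa.student_id = bb.student_id
    ORDER BY aa.student_id, aa.student_name
    """
).fetchall()

counts = {name: dup_count for _, name, dup_count in rows}
print(counts)
```

Jack, John and Bill each get 3, while Ron, Matt and James get 1 because their IDs are unique.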
This should work:
update mytable
set duplicate_count = (select count(*) from mytable t where t.id = mytable.id)
UPDATE:
As mentioned by @HansUp, adding a new column with the duplicate count probably doesn't make sense, but that really depends on what the OP originally thought of using it for. I'm leaving the answer in case it is of help for someone else.

Using sql to keep only a single record where both name field and address field repeat in 5+ records

I am trying to delete all but one record from a table where the name field repeats the same value 5 or more times and the address field also repeats 5 or more times. So if there are 5 records with a name field and address field that are the same for all 5, then I would like to delete 4 out of 5. An example:
id name address
1 john 6440
2 john 6440
3 john 6440
4 john 6440
5 john 6440
I would only want to return 1 record from the 5 records above.
I'm still having problems with this.
1) I create a table called KeepThese and give it a primary key id.
2) I create a query called delete_1 and copy this into it:
INSERT INTO KeepThese
SELECT ID FROM
(
SELECT Min(ID) AS ID
FROM Print_Ready
GROUP BY names_1, addresses
HAVING COUNT(*) >=5
UNION ALL
SELECT ID FROM Print_Ready as P
INNER JOIN
(SELECT Names_1, addresses
FROM Print_ready
GROUP BY Names_1, addresses
HAVING COUNT(*) < 5) as ThoseLessThan5
ON ThoseLessThan5.Names_1 = P.Names_1
AND ThoseLessThan5.addresses = P.addresses
)
3) I create a query called delete_2 and copy this into it:
DELETE P.* FROM Print_Ready as P
LEFT JOIN KeepThese as K
ON K.ID = P.ID
WHERE K.ID IS NULL
4) Then I run delete_1. I get a message that says "circular reference caused by alias ID" So I change this piece:
FROM (SELECT Min(ID) AS ID
to say this:
FROM (SELECT Min(ID) AS ID2
Then I double click again and a popup displays saying Enter Parameter Value for ID. This indicates that it doesn't know what ID is. But print_ready is only a query, and while it has an id, it is in reality the id of another table that got filtered into this query.
Not sure what to do at this point.
I'm not sure CREATE TABLE isolate_duplicates AS works in Access; besides, you should give count(*) a name (an alias) for the new table.
This may work:
SELECT name, address
INTO isolate_duplicate
FROM print_ready
GROUP BY name, address
HAVING COUNT(*) > 4;
DELETE FROM print_ready
WHERE name + address
IN (SELECT name + address
FROM isolate_duplicate);
INSERT INTO print_ready (name, address)
SELECT name, address
FROM isolate_duplicate;
DROP TABLE isolate_duplicate;
Not tested.
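
For comparison, here is a sketch of the same idea as a single DELETE that keeps the lowest id in each (name, address) group of 5 or more and leaves smaller groups alone. It runs on Python's sqlite3 module, not Access, and it assumes print_ready is a real table with a unique id. The two mary rows are invented to show that groups under 5 are untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE print_ready (id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO print_ready VALUES (?, ?, ?)",
    [
        (1, "john", "6440"), (2, "john", "6440"), (3, "john", "6440"),
        (4, "john", "6440"), (5, "john", "6440"),
        # Invented rows: a pair that repeats fewer than 5 times.
        (6, "mary", "12"), (7, "mary", "12"),
    ],
)

# g: groups with 5+ rows and the id to keep for each.
# Delete every row in such a group except the one being kept.
conn.execute(
    """
    DELETE FROM print_ready
    WHERE id IN (
        SELECT p.id
        FROM print_ready p
        JOIN (
            SELECT name, address, MIN(id) AS keep_id
            FROM print_ready
            GROUP BY name, address
            HAVING COUNT(*) >= 5
        ) g ON g.name = p.name AND g.address = p.address
        WHERE p.id <> g.keep_id
    )
    """
)

remaining = [r[0] for r in conn.execute("SELECT id FROM print_ready ORDER BY id")]
print(remaining)  # [1, 6, 7]
```

Because the surviving row keeps its original id, there is no need for a temporary table or a re-insert.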