Better solution for existing sql query - sql

I have table like this:
type | date | id
-----------------------------
1 | 2012-01-01 | 1
2 | 2012-01-01 | 2
1 | 2012-02-02 | 3
2 | 2012-02-02 | 4
I need to build a query that picks the "up-to-date" row for each distinct value of type (in this example, the records with ids 3 and 4). Right now I have this solution:
select * from test t1 where date =
(select max(date) from test t2 where t2.type = t1.type ) order by type, date desc
I am uneasy about the nested select; is there a more elegant solution?

Since you didn't mention which RDBMS you are using, try this one. It will work on most RDBMSs.
SELECT a.*
FROM tableName a
INNER JOIN
(
SELECT type, MAX(DATE) maxDate
FROM tableName
GROUP BY type
) b ON a.type = b.type AND
a.DATE = b.maxDate
Or, if your RDBMS supports window functions:
SELECT type, date, id
FROM
(
SELECT type, date, id,
ROW_NUMBER() OVER (PARTITION BY type
ORDER BY date DESC) rn
FROM tableName
) s
WHERE rn = 1

In your particular example, this could work also:
select type, max(date), max(id)
from your_table
group by type
But note that it works only if you are absolutely sure that dates and ids always increase together. If this is not the case, max(date) and max(id) could come from two different rows. Use this only if you know what you are doing! If not, there's nothing wrong with nested queries.
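To see why that caveat matters, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table name `test` comes from the question; the row with id 5 is an invented sample (not from the original data) in which the newest id carries the oldest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (type INTEGER, date TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (1, "2012-01-01", 1),
        (2, "2012-01-01", 2),
        (1, "2012-02-02", 3),
        (2, "2012-02-02", 4),
        (1, "2011-12-31", 5),  # hypothetical row: newest id, oldest date
    ],
)

# Correlated subquery from the question: returns whole rows.
latest_rows = conn.execute(
    "SELECT type, date, id FROM test t1 "
    "WHERE date = (SELECT MAX(date) FROM test t2 WHERE t2.type = t1.type) "
    "ORDER BY type"
).fetchall()

# Independent aggregates: MAX(date) and MAX(id) may come from different rows.
mixed_rows = conn.execute(
    "SELECT type, MAX(date), MAX(id) FROM test GROUP BY type ORDER BY type"
).fetchall()

print(latest_rows)
print(mixed_rows)
```

For type 1, the nested query still returns the row with id 3, while the aggregate version pairs the latest date with id 5, a combination that exists in no single row.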

Related

SQL remove duplicate row depend on certain value

I have spent a day trying to figure out how to write this query.
I have the following table:
ID Name Pregnancy Gender
1 Raghad Yes Female
1 Raghad No Female
2 Ohoud no Male
What I need is to remove the duplicate (in this case the two rows with ID 1) and keep the one row that has a pregnancy status of yes.
To clarify, I can't use delete since it's a restricted database. I can only retrieve data.
Using an exists clause:
DELETE
FROM yourTable t1
WHERE
pregnancy = 'no' AND
EXISTS (SELECT 1 FROM yourTable t2 WHERE t2.ID = t1.ID AND t2.pregnancy = 'yes');
There are other ways to go about doing this, e.g. using ROW_NUMBER, but as you did not tag your database, I offer the above solution which should work on basically any database.
If you want to just view your data with the "duplicates" removed, then use:
SELECT *
FROM yourTable t1
WHERE
pregnancy = 'yes' OR
NOT EXISTS (SELECT 1 FROM yourTable t2 WHERE t2.ID = t1.ID AND t2.pregnancy = 'yes');
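As a quick check of the read-only query above, here is a small sketch with Python's built-in sqlite3 module. It assumes the Pregnancy values are stored in consistent lower case, because SQLite compares strings case-sensitively by default; the sample rows mirror the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (ID INTEGER, Name TEXT, Pregnancy TEXT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        (1, "Raghad", "yes", "Female"),
        (1, "Raghad", "no", "Female"),
        (2, "Ohoud", "no", "Male"),
    ],
)

# Keep every 'yes' row, plus rows whose ID has no 'yes' row at all.
rows = conn.execute(
    "SELECT * FROM yourTable t1 "
    "WHERE pregnancy = 'yes' OR NOT EXISTS ("
    "  SELECT 1 FROM yourTable t2 "
    "  WHERE t2.ID = t1.ID AND t2.pregnancy = 'yes'"
    ") ORDER BY ID"
).fetchall()

print(rows)
```

Nothing is deleted here; the table keeps all three rows and only the result set is filtered.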
If the Pregnancy column has just two values, "Yes" and "No", you can also use ROW_NUMBER() to get the result:
;WITH CTE
AS (
SELECT *,ROW_NUMBER() OVER (PARTITION BY id ORDER BY Pregnancy DESC) RN
FROM TABLE_NAME
)
SELECT *
FROM CTE
WHERE RN = 1
When there are more than two values and you want to give the highest priority to "Yes", you can write the query like this:
;WITH CTE
AS (
SELECT *,ROW_NUMBER() OVER
(PARTITION BY id ORDER BY CASE WHEN Pregnancy = 'Yes' then 0 else 1 end) RN
FROM TABLE_NAME
)
SELECT *
FROM CTE
WHERE RN= 1
For this sample data you can group by ID, Name, Gender and return the maximum value of the column Pregnancy for each group since Yes is greater compared to No:
SELECT ID, Name, MAX(Pregnancy) Pregnancy, Gender
FROM tablename
GROUP BY ID, Name, Gender
Results:
> ID | Name | Pregnancy | Gender
> -: | :----- | :-------- | :-----
> 1 | Raghad | Yes | Female
> 2 | Ohoud | No | Male
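The MAX(Pregnancy) approach depends on how the database orders strings. A rough sketch with Python's built-in sqlite3 module is shown here; it assumes consistent capitalization ("Yes"/"No"), since SQLite's default collation is case-sensitive and mixed-case values would sort differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (ID INTEGER, Name TEXT, Pregnancy TEXT, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [
        (1, "Raghad", "Yes", "Female"),
        (1, "Raghad", "No", "Female"),
        (2, "Ohoud", "No", "Male"),
    ],
)

# 'Yes' sorts after 'No', so MAX() picks 'Yes' whenever a group contains it.
rows = conn.execute(
    "SELECT ID, Name, MAX(Pregnancy) AS Pregnancy, Gender "
    "FROM tablename GROUP BY ID, Name, Gender ORDER BY ID"
).fetchall()

print(rows)
```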
Here is how you could do it in MySQL 8.
Similar common table expressions exist in SQL Server and Oracle.
A comma after the closing parenthesis that ends a CTE (WITH) definition
is needed only when another CTE definition follows.
with dups as (
Select id from test
group by id
Having count(1) > 1
)
select * from test
where id in (select id from dups)
and Pregnancy = 'Yes'
union all
select * from test where id not in (select id from dups);
Note this does it without deleting the original.
But it gives you a result set to work with that has what you want.
If you wanted to delete, then you could use this instead, after the dups CTE definition:
delete from test
where id in (select id from dups) and Pregnancy = 'No'
Or distill this into:
delete from test
where id in (Select id from test
group by id
Having count(1) > 1) and Pregnancy = 'No'
1) First of all, update the design of your table. ID should be the primary key; that would automatically prevent duplicate rows with the same ID.
2) You can use GROUP BY with a HAVING clause to find the duplicated IDs and remove their 'no' rows:
delete from table where pregnancy='no' and id in (SELECT
id
FROM table
GROUP BY id
HAVING count(id)>1)

Remove duplicates in Select query based on one column

I want to select without duplicate ids and keep row '5d' and not '5e' in select statement.
table
id | name
1 | a
2 | b
3 | c
5 | d
5 | e
I tried:
SELECT id, name
FROM table t
INNER JOIN (SELECT DISTINCT id FROM table) t2 ON t.id = t2.id
For the given example an aggregation using min() would work.
SELECT id,
min(name) name
FROM table
GROUP BY id;
You can also use ROW_NUMBER():
SELECT id, name
FROM (
SELECT id, name, ROW_NUMBER() OVER(PARTITION BY id ORDER BY name) rn
FROM mytable
) x
WHERE rn = 1
This retains the record with the smallest name (so '5d' sorts before '5e'). With this technique you can also sort on a column other than the one where the duplicates exist, which an aggregate query with MIN() cannot do. Depending on the database and its indexes, queries using window functions may also perform better than the equivalent aggregate query.
If you want to keep the row with the smallest name then you can use not exists:
select t.* from tablename t
where not exists (
select 1 from tablename
where id = t.id and name < t.name
)
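The NOT EXISTS variant can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table name `tablename` follows the answer above and the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (5, "d"), (5, "e")],
)

# A row survives only if no other row with the same id has a smaller name.
rows = conn.execute(
    "SELECT t.* FROM tablename t "
    "WHERE NOT EXISTS ("
    "  SELECT 1 FROM tablename WHERE id = t.id AND name < t.name"
    ") ORDER BY t.id"
).fetchall()

print(rows)
```

Because the whole row is returned, any further columns on the table would come along with the surviving row.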

how to get dupes from table using group by and/or having

If I have this table:
id | aux_id | name
------------------
1 | 22 | foo
2 | 22 | bar
3 | 19 | baz
How can I get this result, showing names that share an aux_id with at least one other record?
name
----
foo
bar
I know I need to use GROUP BY and/or HAVING but this isn't working:
SELECT name FROM my_table
GROUP BY aux_id
HAVING COUNT(aux_id) > 1
It fails with this error:
Column 'name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
How about exists?
select t.name
from my_table t
where exists (select 1
from my_table t2
where t2.aux_id = t.aux_id and t2.name <> t.name
);
I would use exists:
select t.name
from table t
where exists (select 1 from table t1 where t1.aux_id = t.aux_id and t1.id <> t.id);
This has the advantage that you can return all columns if you want, without using a GROUP BY clause.
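The two EXISTS answers differ in one detail: one compares names, the other compares ids. The sketch below, using Python's built-in sqlite3 module, shows where they diverge. The two rows with aux_id 30 are invented for illustration (they are not in the question) and share both aux_id and name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, aux_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, 22, "foo"),
        (2, 22, "bar"),
        (3, 19, "baz"),
        (4, 30, "qux"),  # hypothetical: same aux_id and same name as id 5
        (5, 30, "qux"),
    ],
)

query = (
    "SELECT t.name FROM my_table t WHERE EXISTS ("
    "  SELECT 1 FROM my_table t2 WHERE t2.aux_id = t.aux_id AND {cond}"
    ") ORDER BY t.id"
)

# Comparing names misses duplicates that also share the same name.
by_name = [r[0] for r in conn.execute(query.format(cond="t2.name <> t.name"))]
# Comparing ids finds every row that shares its aux_id with another row.
by_id = [r[0] for r in conn.execute(query.format(cond="t2.id <> t.id"))]

print(by_name)
print(by_id)
```

If names are guaranteed unique within an aux_id, both conditions return the same rows; otherwise comparing on the key column is the safer choice.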
An alternative, just for fun...
WITH
duplication_counts AS
(
SELECT
*,
COUNT(*) OVER (PARTITION BY aux_id) AS aux_id_occurrences
FROM
my_table
)
SELECT
*
FROM
duplication_counts
WHERE
aux_id_occurrences > 1
GROUP BY works too, IMHO (though on large data it would probably not perform as well as EXISTS):
select * from myTable
where aux_id in
(select aux_id
from myTable
group by aux_id
having count(*) > 1)

SQL Server - SQL Query to find duplicate records with different dates

I need to find duplicate records (ID) which have different dates, however the duplicate has to be on the day before/after the original. Essentially I am trying to find if the same ID's were used on a different day.
I am able to find the duplicates but cannot get the date part correct, is there a simpler way to perform the above?
I have been trying the following but feel I am over complicating things:
SELECT *
FROM table
WHERE ID IN (
SELECT ID
FROM Table
Where [DATE] < DATEADD(day, +1, [DATE]) and ID=ID
GROUP BY ID
HAVING COUNT(*) > 1 )
ORDER BY Name,[DATE], ID ASC
My data is similar to:
Name Date ID
A 3/30/2018 6.26
B 3/31/2018 6.26
C 4/1/2018 7.85
D 4/2/2018 11.88
E 4/3/2018 11.88
F 4/4/2018 9.48
The query should only pick up names A, B and D, E.
Any help would be appreciated.
SELECT NAME,
DATE,
ID
FROM TABLE1
WHERE ID IN (SELECT ID
FROM TABLE1
GROUP BY ID
HAVING COUNT(*) > 1)
Output
NAME DATE ID
A 2018-03-30 6.26
B 2018-03-31 6.26
D 2018-04-02 11.88
E 2018-04-03 11.88
Demo
http://sqlfiddle.com/#!18/15edb/1
You can use exists:
select t.*
from t
where exists (select 1
from t t2
where t2.id = t.id and
t2.date = dateadd(day, -1, t.date)
);
This selects the later duplicate. For the earlier one, use 1 instead of -1.
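The adjacent-day check can be sketched with Python's built-in sqlite3 module. Note the substitutions: SQLite has no DATEADD, so this uses SQLite's date() function with a '-1 day' / '+1 day' modifier instead, and the dates are stored as ISO strings. Row G is an invented sample sharing F's ID but lying six days later, so it should not match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, date TEXT, id REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("A", "2018-03-30", 6.26),
        ("B", "2018-03-31", 6.26),
        ("C", "2018-04-01", 7.85),
        ("D", "2018-04-02", 11.88),
        ("E", "2018-04-03", 11.88),
        ("F", "2018-04-04", 9.48),
        ("G", "2018-04-10", 9.48),  # hypothetical: same ID as F, not adjacent
    ],
)

# A row matches if the same ID appears exactly one day before or after it.
names = [
    r[0]
    for r in conn.execute(
        "SELECT t.name FROM t WHERE EXISTS ("
        "  SELECT 1 FROM t t2 WHERE t2.id = t.id AND ("
        "    t2.date = date(t.date, '-1 day') OR "
        "    t2.date = date(t.date, '+1 day'))"
        ") ORDER BY t.name"
    )
]

print(names)
```

Checking both directions in one EXISTS returns the earlier and the later row of each pair in a single pass.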
You may try the following method:
DECLARE @T TABLE
(
Nm VARCHAR(50),
Mydate DATE,
Id FLOAT
)
INSERT INTO @T
VALUES('A','3/30/2018',6.26),
('B','3/31/2018',6.26),
('C','4/1/2018',7.85),
('D','4/2/2018',11.88),
('E','4/3/2018',11.88),
('F','4/4/2018',9.48)
SELECT
*
FROM @T T
WHERE EXISTS
(
SELECT 1 FROM @T WHERE ID = T.Id GROUP BY Id HAVING COUNT(1)>1
)

how to remove a duplicate row in SQL with date condition?

My data is:
ID Name date
1 Ben 2017-01-21
2 Mark 2017-01-20
3 Mark 2017-01-21
4 Ell 2017-01-19
and it should be
ID Name date
1 Ben 2017-01-21
3 Mark 2017-01-21
4 Ell 2017-01-19
Only the older "Mark" row (ID 2) should be removed.
If you just want to return the most recent row for each name, you can use:
select t.*
from t
where t.date = (select max(t2.date) from t t2 where t2.name = t.name);
In most databases, you can use similar logic for a delete:
delete from t
where t.date < (select max(t2.date) from t t2 where t2.name = t.name)
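Both statements can be exercised with Python's built-in sqlite3 module. This is a sketch against SQLite only; other databases may restrict deleting from a table that the subquery also reads. The table name `t` follows the answer above and the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, Name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "Ben", "2017-01-21"),
        (2, "Mark", "2017-01-20"),
        (3, "Mark", "2017-01-21"),
        (4, "Ell", "2017-01-19"),
    ],
)

# Read-only: the most recent row for each name.
selected = conn.execute(
    "SELECT t.* FROM t "
    "WHERE t.date = (SELECT MAX(t2.date) FROM t t2 WHERE t2.name = t.name) "
    "ORDER BY t.ID"
).fetchall()

# Destructive: remove every row older than the newest row for its name.
conn.execute(
    "DELETE FROM t "
    "WHERE t.date < (SELECT MAX(t2.date) FROM t t2 WHERE t2.name = t.name)"
)
remaining = conn.execute("SELECT * FROM t ORDER BY ID").fetchall()

print(selected)
print(remaining)
```

If two rows for the same name share the latest date, both are kept by either statement, since neither date is strictly older.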
You can use the following query. I usually avoid subqueries in the SELECT or WHERE clause to avoid performance issues.
Select x.id, x.name, x.date from mydata x
inner join (SELECT name, MAX(date) AS maxdate from mydata group by name) y on x.name = y.name and x.date = y.maxdate
It looks like the table has not been normalized. Still, as per the question, the following should work on SQL Server 2008 and above:
WITH cte AS (
SELECT Id, Name, [Date],
row_number() OVER(PARTITION BY Name ORDER BY [Date] DESC) AS [RowNum]
FROM YourTable
)
DELETE cte WHERE [RowNum] > 1