How to remove a duplicate row in SQL with a date condition?

My data is:
ID Name date
1 Ben 2017-01-21
2 Mark 2017-01-20
3 Mark 2017-01-21
4 Ell 2017-01-19
and it should be:
ID Name date
1 Ben 2017-01-21
3 Mark 2017-01-21
4 Ell 2017-01-19
Only the older "Mark" row, with ID 2, must be removed.

If you just want to return the most recent row for each name, you can use:
select t.*
from t
where t.date = (select max(t2.date) from t t2 where t2.name = t.name);
In most databases, you can use similar logic for a delete:
delete from t
where t.date < (select max(t2.date) from t t2 where t2.name = t.name)
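As a quick check of the delete above, here is a minimal sketch that runs it against the question's sample rows. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in database and stores the dates as ISO text so that string comparison matches date order; other databases may need small syntax changes.

```python
import sqlite3

# In-memory SQLite database standing in for the asker's table (assumed name: t).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO t (id, name, date) VALUES (?, ?, ?)",
    [
        (1, "Ben", "2017-01-21"),
        (2, "Mark", "2017-01-20"),
        (3, "Mark", "2017-01-21"),
        (4, "Ell", "2017-01-19"),
    ],
)

# Delete every row that is older than the newest row for the same name.
conn.execute(
    """
    DELETE FROM t
    WHERE t.date < (SELECT MAX(t2.date) FROM t AS t2 WHERE t2.name = t.name)
    """
)

remaining = conn.execute("SELECT id, name, date FROM t ORDER BY id").fetchall()
print(remaining)
```

Rows that tie on the newest date for a name are all kept, because the comparison is a strict "less than".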

You can use the following query. I usually avoid subqueries in the select or where clause because of performance concerns.
Select x.id, x.name, x.date from mydata x
inner join (SELECT name, MAX(date) as max_date from mydata group by name) y on x.name = y.name and x.date = y.max_date

It looks like the table has not been normalized. Still, as per the question, the following should work if the database is SQL Server 2008 or above. The rows are numbered newest first within each name, so everything after the first row is an older duplicate:
WITH cte AS (
SELECT Id, Name, [Date],
row_number() OVER(PARTITION BY Name ORDER BY [Date] DESC) AS [RowNum]
FROM YourTable
)
DELETE cte WHERE [RowNum] > 1
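Deleting through a CTE like this is SQL Server syntax. As a rough sketch of the same row-numbering idea elsewhere, the example below assumes SQLite 3.25 or later (for window functions), accessed through Python's sqlite3 module, and uses the first question's sample rows. It numbers the rows newest first within each name and deletes everything numbered above 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Id INTEGER PRIMARY KEY, Name TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO YourTable (Id, Name, Date) VALUES (?, ?, ?)",
    [
        (1, "Ben", "2017-01-21"),
        (2, "Mark", "2017-01-20"),
        (3, "Mark", "2017-01-21"),
        (4, "Ell", "2017-01-19"),
    ],
)

# Rank rows per name, newest first, then delete every row that is not ranked 1.
conn.execute(
    """
    DELETE FROM YourTable
    WHERE Id IN (
        SELECT Id
        FROM (
            SELECT Id,
                   ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC) AS RowNum
            FROM YourTable
        ) AS ranked
        WHERE RowNum > 1
    )
    """
)

kept_ids = [row[0] for row in conn.execute("SELECT Id FROM YourTable ORDER BY Id")]
print(kept_ids)
```

Unlike a plain date comparison, ROW_NUMBER() keeps exactly one row per name even when two rows share the newest date; which of the tied rows survives is then arbitrary unless a tie-breaker is added to the ORDER BY.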

Related

How to select the top 3 values from a group based on date and exclude duplicate value?

I have three columns: one holds an ID, one a value, and one a date. For example, the ID column has ID1, ID2, ID3, and each ID has numeric values, say 1, 2, 3, 4, 5.
How do I get only 3 results for each ID, based on the most recent dates in descending order?
I am using Sybase SQL. Is there any way I can write this?
I tried to use Row_number() and rank() but I don't get to use either of those functions with my SQL tool.
ID value Date
1 3 20190511
1 1 20190503
1 5 20190401
2 2 20190520
2 1 20190514
2 4 20190503
3 1 20190516
3 5 20190415
3 3 20190402
If you don't have row_number, try this:
SELECT *
FROM yourTable t1
WHERE (SELECT COUNT(*)
FROM yourTable t2
WHERE t1.id = t2.id
AND t1.date < t2.date) < 3
So a row won't appear if 3 or more rows with the same id have a later date.
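To see the counting query exclude something, here is a small sketch against SQLite via Python's sqlite3 module (an assumption; the asker is on Sybase). It loads the question's rows plus one extra, hypothetical older row for id 1, which is not in the original data and exists only so that one row gets filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER, value INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO yourTable (id, value, date) VALUES (?, ?, ?)",
    [
        (1, 3, "20190511"), (1, 1, "20190503"), (1, 5, "20190401"),
        (2, 2, "20190520"), (2, 1, "20190514"), (2, 4, "20190503"),
        (3, 1, "20190516"), (3, 5, "20190415"), (3, 3, "20190402"),
        (1, 2, "20190301"),  # hypothetical fourth row for id 1, added for illustration
    ],
)

# Keep a row only if fewer than 3 rows with the same id have a later date.
top3 = conn.execute(
    """
    SELECT t1.id, t1.value, t1.date
    FROM yourTable AS t1
    WHERE (SELECT COUNT(*)
           FROM yourTable AS t2
           WHERE t2.id = t1.id
             AND t2.date > t1.date) < 3
    ORDER BY t1.id, t1.date DESC
    """
).fetchall()

for row in top3:
    print(row)
```

The dates are stored as yyyymmdd text, so plain string comparison orders them correctly.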
With row_number:
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) as rn
FROM YourTable t1
) as t
WHERE t.rn <= 3
This assumes an id can't have multiple rows on the same date. If it can, you may want to use RANK() or DENSE_RANK() and decide how to handle ties.
One method uses a correlated subquery with in:
select t.*
from t
where t.date in (select top (3) t2.date
from t t2
where t2.id = t.id
order by t2.date desc
);
Note that this assumes that the dates are unique.

SQL Server - SQL Query to find duplicate records with different dates

I need to find duplicate records (ID) which have different dates; however, the duplicate has to be on the day before or after the original. Essentially I am trying to find out whether the same IDs were used on a different day.
I am able to find the duplicates but cannot get the date part correct. Is there a simpler way to perform the above?
I have been trying the following but feel I am over complicating things:
SELECT *
FROM table
WHERE ID IN (
SELECT ID
FROM Table
Where [DATE] < DATEADD(day, +1, [DATE]) and ID=ID
GROUP BY ID
HAVING COUNT(*) > 1 )
ORDER BY Name,[DATE], ID ASC
My data is similar to:
Name Date ID
A 3/30/2018 6.26
B 3/31/2018 6.26
C 4/1/2018 7.85
D 4/2/2018 11.88
E 4/3/2018 11.88
F 4/4/2018 9.48
The query should only pick up names AB and DE.
Any help would be appreciated.
SELECT NAME,
DATE,
ID
FROM TABLE1
WHERE ID IN (SELECT ID
FROM TABLE1
GROUP BY ID
HAVING COUNT(*) > 1)
Output
NAME DATE ID
A 2018-03-30 6.26
B 2018-03-31 6.26
D 2018-04-02 11.88
E 2018-04-03 11.88
Demo
http://sqlfiddle.com/#!18/15edb/1
You can use exists:
select t.*
from t
where exists (select 1
from t t2
where t2.id = t.id and
t2.date = dateadd(day, -1, t.date)
);
This selects the later duplicate. For the earlier one, use 1 instead of -1.
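The exists approach can be tried out with the question's data. The sketch below assumes SQLite through Python's sqlite3 module, where `date(x, '-1 day')` plays the role of SQL Server's `dateadd(day, -1, x)`. The table name `records` is made up for the example and the dates are rewritten in ISO format. It runs the "later duplicate" query and a variant that checks both neighbouring days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (name TEXT, date TEXT, id REAL)")
conn.executemany(
    "INSERT INTO records (name, date, id) VALUES (?, ?, ?)",
    [
        ("A", "2018-03-30", 6.26), ("B", "2018-03-31", 6.26),
        ("C", "2018-04-01", 7.85), ("D", "2018-04-02", 11.88),
        ("E", "2018-04-03", 11.88), ("F", "2018-04-04", 9.48),
    ],
)

# Rows whose id was also used exactly one day earlier (the later duplicate).
later_names = [row[0] for row in conn.execute(
    """
    SELECT t.name
    FROM records AS t
    WHERE EXISTS (SELECT 1
                  FROM records AS t2
                  WHERE t2.id = t.id
                    AND t2.date = date(t.date, '-1 day'))
    ORDER BY t.date
    """
)]

# Rows whose id was also used one day earlier or one day later (both duplicates).
pair_names = [row[0] for row in conn.execute(
    """
    SELECT t.name
    FROM records AS t
    WHERE EXISTS (SELECT 1
                  FROM records AS t2
                  WHERE t2.id = t.id
                    AND t2.date IN (date(t.date, '-1 day'), date(t.date, '+1 day')))
    ORDER BY t.date
    """
)]

print(later_names, pair_names)
```

The second query returns A, B, D and E, which matches what the question asks for.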
You may try the following method:
DECLARE @T TABLE
(
Nm VARCHAR(50),
Mydate DATE,
Id FLOAT
)
INSERT INTO @T
VALUES('A','3/30/2018',6.26),
('B','3/31/2018',6.26),
('C','4/1/2018',7.85),
('D','4/2/2018',11.88),
('E','4/3/2018',11.88),
('F','4/4/2018',9.48)
SELECT
*
FROM @T T
WHERE EXISTS
(
SELECT 1 FROM @T WHERE ID = T.Id GROUP BY Id HAVING COUNT(1)>1
)

How can I search by two continuous rows in SQL?

Given the SQL table
id date employee_type employee_level
1 10/01/2015 other 2
1 09/13/2011 full-time 1
1 09/25/2010 intern 1
2 09/25/2013 full-time 3
2 09/25/2011 full-time 2
2 09/25/2008 full-time 1
3 09/23/2015 full-time 5
3 09/23/2013 full-time 4
Is it possible to search for ids that have one row with employee_type "intern", and the row above it in the table (same id with later date) with employee_type "full-time"?
In this case, id 1 meets my requirement.
Thanks a lot!
Assuming that you mean the row with the same id and the next later date, you can use lag(), an ANSI standard function supported by most databases. lag() returns the value from the previous row in date order, so look for a full-time row whose previous row was an intern row:
select t.*
from table t
where t.id in (select id
from (select t.*,
lag(employee_type) over (partition by id order by date) as prev_et
from table t
) tt
where tt.employee_type = 'full-time' and tt.prev_et = 'intern'
);
If your database doesn't support lag(), you can do something similar with correlated subqueries.
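One possible shape for the correlated-subquery alternative is sketched below, using SQLite through Python's sqlite3 module as an assumed stand-in. For every full-time row it looks up the type of the latest earlier row with the same id and keeps the id when that type is intern, which is the reading under which id 1 qualifies. The table name `employees` is made up for the example, the dates are rewritten in ISO format, and `LIMIT 1` would need to become `TOP 1` or similar on some databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER, date TEXT, employee_type TEXT, employee_level INTEGER)"
)
conn.executemany(
    "INSERT INTO employees (id, date, employee_type, employee_level) VALUES (?, ?, ?, ?)",
    [
        (1, "2015-10-01", "other", 2),
        (1, "2011-09-13", "full-time", 1),
        (1, "2010-09-25", "intern", 1),
        (2, "2013-09-25", "full-time", 3),
        (2, "2011-09-25", "full-time", 2),
        (2, "2008-09-25", "full-time", 1),
        (3, "2015-09-23", "full-time", 5),
        (3, "2013-09-23", "full-time", 4),
    ],
)

# For each full-time row, fetch the type of the closest earlier row for the
# same id and keep the id if that earlier row was an intern row.
matching_ids = [row[0] for row in conn.execute(
    """
    SELECT DISTINCT t.id
    FROM employees AS t
    WHERE t.employee_type = 'full-time'
      AND (SELECT p.employee_type
           FROM employees AS p
           WHERE p.id = t.id
             AND p.date < t.date
           ORDER BY p.date DESC
           LIMIT 1) = 'intern'
    ORDER BY t.id
    """
)]

print(matching_ids)
```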
I believe the request isn't as described in the question; instead, what you appear to want is to list all rows for folks who have been interns.
SELECT
t1.*
FROM yourtable AS t1
INNER JOIN (
SELECT DISTINCT
id
FROM yourtable
WHERE employee_type = 'intern'
) AS t2 ON t1.id = t2.id
;
Alternatively you might be wanting only those folks who have been both 'intern' and 'full-time' in which case you could use the query below that uses a HAVING clause:
SELECT
t1.*
FROM yourtable AS t1
INNER JOIN (
SELECT id
FROM yourtable
WHERE employee_type = 'intern'
OR employee_type = 'full-time'
GROUP BY id
HAVING COUNT(DISTINCT employee_type) > 1
) AS t2 ON t1.id = t2.id
;

SQL get column value associated with max()

I have the following table
ID Version
--- ----------
123 1
124 2
125 3
126 4
127 5
128 6
Now I need to get the ID value where version# is at its maximum.
What I can do is:
select ID from tbl where version = (select max(version) from tbl)
I don't want to use this, since I need this part in a join inside another query and I don't want to complicate things further.
FIRST() is not standard SQL, but where your database supports it you can use select FIRST():
SELECT FIRST(id) FROM tbl ORDER BY Version DESC
Or limit the number of results using the LIMIT 1 option:
SELECT id FROM tbl ORDER BY Version DESC LIMIT 1
You mentioned you need this in a join, so something like this should do it:
select *
from table_1 as t1
join (
select id,
row_number() over (order by version desc) as rn
from table_2
) as t2 on t1.id = t2.id and t2.rn = 1
(This is ANSI SQL as you didn't mention a DBMS - but should work on most modern DBMS)

Better solution for existing sql query

I have table like this:
type | date | id
-----------------------------
1 | 2012-01-01 | 1
2 | 2012-01-01 | 2
1 | 2012-02-02 | 3
2 | 2012-02-02 | 4
I need to build a query that will pick all "up-to-date" distinct values of type (in this example, the records with ids 3 and 4). Now I have this solution:
select * from test t1 where date =
(select max(date) from test t2 where t2.type = t1.type ) order by type, date desc
I am embarrassed by the presence of the nested select. Maybe there is a more elegant solution?
Since you didn't mention the RDBMS you are using, try this one. It will work on most RDBMSs.
SELECT a.*
FROM tableName a
INNER JOIN
(
SELECT type, MAX(DATE) maxDate
FROM tableName
GROUP BY type
) b ON a.type = b.type AND
a.DATE = b.maxDate
Or, if your RDBMS supports window functions:
SELECT type, date, id
FROM
(
SELECT type, date, id,
ROW_NUMBER() OVER (PARTITION BY type
ORDER BY date DESC) rn
FROM tableName
) s
WHERE rn = 1
In your particular example, this could work also:
select type, max(date), max(id)
from your_table
group by type
but notice that it will work only if you are absolutely sure that dates and ids are always increasing. If this is not the case, max(date) and max(id) could be on two different rows. Use this only if you know what you are doing! If not, there's nothing wrong with nested queries.
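The failure mode is easy to reproduce. The sketch below assumes SQLite through Python's sqlite3 module and uses made-up rows (not the asker's data) in which the newest row for type 1 does not carry the highest id. The `max(date), max(id)` shortcut then reports a date/id pair that exists in no row, while the join on the per-type maximum date returns the real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (type INTEGER, date TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO your_table (type, date, id) VALUES (?, ?, ?)",
    [
        (1, "2012-01-01", 5),  # hypothetical: older row with the higher id
        (1, "2012-02-02", 3),  # hypothetical: newest row with a lower id
    ],
)

# Shortcut: the two maxima are computed independently of each other.
shortcut = conn.execute(
    "SELECT type, MAX(date), MAX(id) FROM your_table GROUP BY type"
).fetchall()

# Join on the per-type maximum date: returns the row that actually exists.
joined = conn.execute(
    """
    SELECT a.type, a.date, a.id
    FROM your_table AS a
    INNER JOIN (SELECT type, MAX(date) AS max_date
                FROM your_table
                GROUP BY type) AS b
            ON a.type = b.type AND a.date = b.max_date
    """
).fetchall()

print(shortcut)
print(joined)
```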