SQL: select the newest data from two tables with the same primary key

I have two tables with the same structure:
A(id, ..., modified_date)
and B(id, ..., modified_date). For rows that share the same id, I need to select the record with the later modified_date.
How can I write the SQL? Please help.
Example:
Table A
id | user name | email | modified date
------------------------------------------------
1 | Anne | ana#gmail.com | 2016/12/20
And table B
id | user name | email | modified date
------------------------------------------------
1 | Anne Jr, | ana_j#gmail.com | 2017/01/20
When two records have the same id, I need the one with the later modified_date. In the example above, for id = 1, I need to select the record with modified_date = 2017/01/20.

You can do a JOIN and then ORDER BY modified_date column like
select t1.id,t1.modified_date
from table1 t1 join table2 t2 on t1.id = t2.id
order by t1.modified_date desc;

If you need data from the B table, you can use:
SELECT b.*
FROM B b
JOIN A a ON b.id = a.id
WHERE b.modified_date > a.modified_date
Similarly, if you need data from the A table, you can use:
SELECT a.*
FROM A a
JOIN B b ON a.id = b.id
WHERE a.modified_date > b.modified_date
If several records fit the criteria and you need only the one with the greatest modified_date, you can use (TOP is SQL Server syntax):
SELECT TOP 1 a.*
FROM A a
JOIN B b ON a.id = b.id
WHERE a.modified_date > b.modified_date
ORDER BY a.modified_date DESC
or
SELECT TOP 1 b.*
FROM B b
JOIN A a ON b.id = a.id
WHERE b.modified_date > a.modified_date
ORDER BY b.modified_date DESC
Hope this helps!!!

You can try using a CASE expression on SQL Server
SELECT A.id,A.other_columns,
(CASE WHEN a.modified_date > b.modified_date THEN a.modified_date ELSE b.modified_date END) as modified_date
FROM [A] INNER JOIN [B] on A.id=B.id

If you want the higher of those two values, then use greatest() with a join:
select ta.id,
greatest(ta.modified_date, tb.modified_date)
from table_a ta
join table_b tb on ta.id = tb.id;
If you want all columns from the row with the later date, you can use a case statement:
select ta.id,
case
when ta.modified_date > tb.modified_date then ta.email
else tb.email
end as email,
case
when ta.modified_date > tb.modified_date then ta.user_name
else tb.user_name
end as user_name,
greatest(ta.modified_date, tb.modified_date) as modified_date
from table_a ta
join table_b tb on ta.id = tb.id;
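The CASE approach above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module purely as a test harness, which is an assumption and not something the question mentions. SQLite has no greatest(), so the sketch swaps in SQLite's multi-argument max() for it. The second pair of rows (id = 2) and the example.com addresses are made-up sample data, and dates are stored as ISO text so that string comparison matches date order.

```python
import sqlite3

# In-memory database with two identically shaped tables (sample data is hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, user_name TEXT, email TEXT, modified_date TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, user_name TEXT, email TEXT, modified_date TEXT);
    INSERT INTO table_a VALUES (1, 'Anne', 'ana@example.com', '2016-12-20');
    INSERT INTO table_b VALUES (1, 'Anne Jr', 'ana_j@example.com', '2017-01-20');
    INSERT INTO table_a VALUES (2, 'Bob', 'bob@example.com', '2017-03-01');
    INSERT INTO table_b VALUES (2, 'Bob Sr', 'bob_s@example.com', '2017-02-01');
""")

# For each shared id, take every column from whichever row has the later date.
# max(x, y) with two arguments is SQLite's scalar stand-in for greatest(x, y).
NEWEST_ROW_SQL = """
    SELECT ta.id,
           CASE WHEN ta.modified_date > tb.modified_date
                THEN ta.user_name ELSE tb.user_name END AS user_name,
           CASE WHEN ta.modified_date > tb.modified_date
                THEN ta.email ELSE tb.email END AS email,
           max(ta.modified_date, tb.modified_date) AS modified_date
    FROM table_a ta
    JOIN table_b tb ON ta.id = tb.id
    ORDER BY ta.id
"""

rows = conn.execute(NEWEST_ROW_SQL).fetchall()
for row in rows:
    print(row)
```

For id = 1 the row comes from table_b and for id = 2 it comes from table_a. When both dates are equal, the ELSE branch wins, so table_b's values are returned.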

Related

How do I update a table that references duplicate records?

I have two SQL tables. One gets a reference value from another table which stores a list of Modules and their ID. But these descriptions are not unique. I am trying to remove the duplicates of Table A but I'm not sure how to update Table B to only reference the single values.
Example:
Table A:
ID | Description | RefID
-------------------------
1 | Test 1 | 2
2 | Test 2 | 1
Table B:
ID | Name
-------------------------
1 | QuickReports
2 | QuickReports
I want the results to be the following:
Table A:
ID | Description | RefID
-------------------------
1 | Test 1 | 1
2 | Test 2 | 1
Table B:
ID | Name
-------------------------
1 | QuickReports
I managed to delete the duplicates from Table B using the code below, but I haven't been able to update the records in Table A. Each table has over 500 records.
WITH cte AS(
SELECT
Name,
ROW_NUMBER() OVER (
PARTITION BY
Name
ORDER BY
Name
)row_num
FROM ReportmodulesTest
)
DELETE FROM cte
WHERE row_num > 1;
You would need to update table A first, before deleting from table B.
You tagged your question MySQL, but that database does not support the delete statement that you are showing. I suspect that you are running SQL Server, so here is how to do it in that database:
update a
set refid = b.minid
from tablea a
inner join (select name, id, min(id) over(partition by name) minid from tableb) b
on b.id = a.refid and b.minid <> a.refid
In MySQL (8.0 or later, which added window functions), you would phrase the same query as:
update tablea a
inner join (select name, id, min(id) over(partition by name) minid from tableb) b on b.id = a.refid
set a.refid = b.minid
where b.minid <> a.refid
You can update the first table using :
update a join
(select b.*,
min(id) over (partition by name) as min_id
from b
) b
on a.refid = b.id
set a.refid = b.min_id
where a.refid <> b.min_id;
Then, you can delete rows in the second table with a similar logic :
delete b
from b join
(select b.*,
min(id) over (partition by name) as min_id
from b
) bb
on bb.id = b.id
where b.id <> bb.min_id;
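The update-then-delete order can be checked with a small sketch. It runs against SQLite through Python's sqlite3 module, which is an assumption made only so the example is runnable; it also swaps the window function for a correlated MIN() subquery so that it works on older SQLite builds. The 'Billing' module and the 'Test 3' row are hypothetical additions that show unaffected rows staying put.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE a (id INTEGER PRIMARY KEY, description TEXT, refid INTEGER);
    INSERT INTO b VALUES (1, 'QuickReports'), (2, 'QuickReports'), (3, 'Billing');
    INSERT INTO a VALUES (1, 'Test 1', 2), (2, 'Test 2', 1), (3, 'Test 3', 3);
""")

# Step 1: re-point every reference at the smallest id that shares the same name.
conn.execute("""
    UPDATE a
    SET refid = (SELECT MIN(b2.id)
                 FROM b AS b1
                 JOIN b AS b2 ON b2.name = b1.name
                 WHERE b1.id = a.refid)
    WHERE EXISTS (SELECT 1 FROM b WHERE b.id = a.refid)
""")

# Step 2: only now remove the duplicates, keeping the smallest id per name.
conn.execute("""
    DELETE FROM b
    WHERE id NOT IN (SELECT MIN(id) FROM b GROUP BY name)
""")
conn.commit()

a_rows = conn.execute("SELECT id, description, refid FROM a ORDER BY id").fetchall()
b_rows = conn.execute("SELECT id, name FROM b ORDER BY id").fetchall()
print(a_rows)
print(b_rows)
```

Running the two steps in the opposite order would leave rows in a pointing at ids that no longer exist in b, which is why the update comes first.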
I found a solution that has made this process easier. I first use Row_Number to find duplicates in Table A and SELECT INTO a temporary table.
SELECT
a.Id
, a.Name
, ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Id DESC) RN
INTO
#TestTable
FROM
TableA a WITH(NOLOCK)
I then JOIN Table A and Table B to see where the IDs match and identify which ID to keep and which IDs to delete:
SELECT
b.Id
, b.Name
, b.RefId
, ToKeep.Id KeepId
, ToDelete.Id DeleteId
FROM
#TestTable ToDelete
JOIN TableB b WITH(NOLOCK)
ON b.RefId = ToDelete.Id
JOIN #TestTable ToKeep
ON ToDelete.Name = ToKeep.Name
AND ToKeep.RN = 1
WHERE ToDelete.RN > 1
Then using a similar statement, I just update the records:
UPDATE b
SET
b.RefId = ToKeep.Id
FROM #TestTable ToDelete
JOIN TableB b WITH(NOLOCK)
ON b.RefId = ToDelete.Id
JOIN #TestTable ToKeep
ON ToDelete.Name = ToKeep.Name
AND ToKeep.RN = 1
WHERE
ToDelete.RN > 1
Lastly, I can now delete the duplicate records:
DELETE a
FROM #TestTable b
INNER JOIN TableA a
ON b.Id = a.Id
WHERE
b.RN > 1
After this, you can use the same first SELECT statement to ensure that all duplicates are deleted. Just remove the SELECT INTO statement.
Thanks to an anonymous colleague of mine for this solution and hope this helps someone out there.

Query to output not existing data

Table A:
id Name
1 a
2 b
3 c
4 d
5 e
Table B:
id Name
3 c
4 d
5 e
Here, id is the primary key connected to Table B.
I need output like this:-
id
1
2
That means, which ids in Table A are not present in Table B
Use EXCEPT operator:
select id from tableA
except
select id from tableB
You can use a left join, which will preserve all records on the left side and associate them with null if no matching record is available on the right side.
This way you can then filter on the right side columns to be null to get the desired outcome
select t1.id
from tableA t1
left join
tableB t2
on t1.id = t2.id
where t2.id is null
Use NOT EXISTS in WHERE clause
SELECT id FROM TableA A
WHERE NOT EXISTS(SELECT 1 FROM TableB B WHERE A.id = B.Id )
Use a NOT IN subquery:
Select id from TableA
where id not in (Select id from TableB);
In Oracle, you can use MINUS. Select only id, because the operator compares whole rows:
select id from tableA
minus
select id from tableB
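The set-difference variants above can be compared side by side with a short sketch. It uses Python's sqlite3 module as a harness, which is an assumption; SQLite supports EXCEPT but not Oracle's MINUS, so that variant is left out. The table contents are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO tableA VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
    INSERT INTO tableB VALUES (3, 'c'), (4, 'd'), (5, 'e');
""")

# Four ways to ask "which ids are in tableA but not in tableB".
QUERIES = {
    "except": "SELECT id FROM tableA EXCEPT SELECT id FROM tableB",
    "left_join": """SELECT t1.id FROM tableA t1
                    LEFT JOIN tableB t2 ON t1.id = t2.id
                    WHERE t2.id IS NULL""",
    "not_exists": """SELECT id FROM tableA a
                     WHERE NOT EXISTS (SELECT 1 FROM tableB b WHERE a.id = b.id)""",
    "not_in": "SELECT id FROM tableA WHERE id NOT IN (SELECT id FROM tableB)",
}

# Sort in Python so the comparison does not depend on row order.
results = {name: sorted(row[0] for row in conn.execute(sql))
           for name, sql in QUERIES.items()}
for name, ids in results.items():
    print(name, ids)
```

With a non-null primary key on both sides, all four forms return the same ids; which one performs best depends on the database and its indexes.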

Need to retrieve all records in table A and only single one in table B that is the last updated

I have to retrieve certain records in TABLE_A - then need to display the last time the row was updated - which is in TABLE_B (however, there are many records that correlate in TABLE_B). TABLE_A's TABLE_A.PK is ID and links to TABLE_B through TABLE_B.LINK, where the schema would be:
TABLE_A
===================
ID NUMBER
DESC VARCHAR2
TABLE_B
===================
ID NUMBER
LINK NUMBER
LAST_DATE DATE
And the actual table data would be:
TABLE_A
===================
100 DESCRIPTION0
101 DESCRIPTION1
TABLE_B
===================
1 100 12/12/2012
2 100 12/13/2012
3 100 12/14/2013
4 101 12/12/2012
5 101 12/13/2012
6 101 12/14/2013
So, I would need something to read out:
Result
====================
100 DESCRIPTION0 12/14/2013
101 DESCRIPTION1 12/14/2013
I tried to join different ways, but nothing seems to work:
select * from
(SELECT ID, DESC from TABLE_A WHERE ID >= 100) TBL_A
full outer join
(select LAST_DATE from TABLE_B WHERE ROWNUM = 1 order by LAST_DATE DESC) TBL_B
on TBL_A.ID = TBL_B.LINK;
The easiest thing to do would be to join table_a with an aggregate query on table_b:
SELECT table_a.*, table_b.last_date
FROM table_a
LEFT JOIN (SELECT link, MAX(last_date) AS last_date
FROM table_b
GROUP BY link) table_b ON table_a.id = table_b.link
If you just want the most recent date, think aggregation and join. The extra levels of subqueries do not help. Something like:
select a.id, a.desc, max(last_date)
from table_a a join
table_b b
on a.id = b.link
where a.id >= 100
group by a.id, a.desc;
Note: I doubt a full outer join is necessary, although you can keep that if you have join keys that don't match between the tables. Perhaps a left join is appropriate.
I should point out that if you want more fields from b, then your initial inclination to use row_number() is correct. But the query would look like:
select a.id, a.desc, b.last_date
from table_a a left join
(select b.*, row_number() over (partition by link order by last_date desc) as seqnum
from table_b b
) b
on a.id = b.link and b.seqnum = 1
where a.id >= 100;
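The aggregate-then-join idea can be sketched against SQLite via Python's sqlite3 module (an assumption; the question is about Oracle). Two things differ from the question on purpose: the column is named description because DESC is a reserved word, and dates are stored as ISO text so that MAX() orders them correctly. Row 102 is a hypothetical record with no history, added to show what the LEFT JOIN does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, link INTEGER, last_date TEXT);
    INSERT INTO table_a VALUES (100, 'DESCRIPTION0'), (101, 'DESCRIPTION1'), (102, 'DESCRIPTION2');
    INSERT INTO table_b VALUES
        (1, 100, '2012-12-12'), (2, 100, '2012-12-13'), (3, 100, '2013-12-14'),
        (4, 101, '2012-12-12'), (5, 101, '2012-12-13'), (6, 101, '2013-12-14');
""")

# Collapse table_b to one row per link first, then join that to table_a.
LATEST_SQL = """
    SELECT a.id, a.description, b.last_date
    FROM table_a a
    LEFT JOIN (SELECT link, MAX(last_date) AS last_date
               FROM table_b
               GROUP BY link) b ON a.id = b.link
    WHERE a.id >= 100
    ORDER BY a.id
"""

latest = conn.execute(LATEST_SQL).fetchall()
for row in latest:
    print(row)
```

Because of the LEFT JOIN, record 102 is still returned, with a NULL (None in Python) date; an inner join would drop it.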

SQL Select all from table A with counting table B

I need a query that grabs all columns from table A and also counts how many rows in table B belong to each row in table A.
Table A: id | username | email | address
Table B: user_id
SELECT *, total
FROM table_a
WHERE total = (SELECT * FROM table_b WHERE table_a.id==table_b.user_id)
Any ideas?
Edit: For more clarification here is the desired output
1 | steve | steve#steve.steve | 123 Steve | 5 // letters
2 | chris | chris#chris.chris | 123 chris | 2 // letters
SELECT
table_a.id,
table_a.username,
table_a.email,
table_a.address,
count(table_b.user_id) as total
FROM table_a
LEFT OUTER JOIN table_b
ON table_a.id = table_b.user_id
GROUP BY
table_a.id,
table_a.username,
table_a.email,
table_a.address
This is a good example of needing an outer join. If we used an inner join, the query would exclude the rows in table_a that have no table_b entries.
This could be further refined to meet two challenges:
include all of the columns of table_a without explicitly asking for them.
Handle the zero entry scenario without using non-standard SQL (eg ISNULL, WHERE)
This code below should do it.
SELECT
table_a.*,
tempTable.total
FROM (
SELECT
table_a.Id,
COUNT(table_b.user_id) as total
FROM table_a
LEFT OUTER JOIN table_b
ON table_a.id = table_b.user_id
GROUP BY (table_a.id)
) AS tempTable
INNER JOIN table_a
ON tempTable.Id = table_a.Id;
Comparing this with Cybernate's solution, non-standard SQL looks very attractive :-)
Try this:
SELECT a.*, ISNULL(bcnt, 0) bcnt
FROM TableA a LEFT JOIN
(
SELECT user_id, COUNT(1) AS BCNT
FROM TableB
GROUP BY user_id
) b
ON a.id = b.user_id
You can just use a LEFT JOIN and aggregate functions. Start from table_a so that users without any table_b rows are kept, and count b.user_id so that they get 0:
SELECT a.id,
min(a.username) UserName,
min(a.email) Email,
min(a.address) Address,
COUNT(b.user_id) Quantity
FROM table_a a left join
table_b b on a.id=b.user_id
group by a.id
A correlated subquery in the select list also works:
select
*,
(select count(*)
from #TableB as B
where A.id = B.user_id) as total
from #TableA as A
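The difference between the outer and inner join versions shows up as soon as one user has no rows in table_b. The sketch below uses Python's sqlite3 module as a harness (an assumption) and adds a hypothetical third user, pat, with zero letters; the counts for steve and chris follow the desired output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, username TEXT, email TEXT, address TEXT);
    CREATE TABLE table_b (user_id INTEGER);
    INSERT INTO table_a VALUES
        (1, 'steve', 'steve@example.com', '123 Steve'),
        (2, 'chris', 'chris@example.com', '123 chris'),
        (3, 'pat',   'pat@example.com',   '123 pat');
""")
# Five rows for steve, two for chris, none for pat.
conn.executemany("INSERT INTO table_b VALUES (?)", [(1,)] * 5 + [(2,)] * 2)

COUNT_SQL = """
    SELECT a.id, a.username, COUNT(b.user_id) AS total
    FROM table_a a
    {join} JOIN table_b b ON a.id = b.user_id
    GROUP BY a.id, a.username
    ORDER BY a.id
"""

# COUNT(b.user_id) ignores the NULL produced by the outer join, so pat gets 0.
with_left_join = conn.execute(COUNT_SQL.format(join="LEFT")).fetchall()
with_inner_join = conn.execute(COUNT_SQL.format(join="INNER")).fetchall()
print(with_left_join)
print(with_inner_join)
```

Counting a column from table_b rather than using COUNT(*) is what turns the unmatched row into 0 instead of 1.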

SQL: select all unique values in table A which are not in table B

I have table A
Id | Name | Department
-----------------------------
0 | Alice | 1
0 | Alice | 2
1 | Bob | 1
and table B
Id | Name
-------------
0 | Alice
I want to select all unique Ids in table A which do not exist in table B. how can I do this?
select distinct id
from TableA a
where not exists (
select id
from TableB
where id = a.id
)
Just to provide a different solution than NOT IN :
SELECT DISTINCT A.Id
FROM A
LEFT OUTER JOIN B
ON A.Id = B.Id
WHERE B.Id IS NULL
The "good" solution is usually MINUS or EXCEPT, but MySQL did not support either before version 8.0.31, which added EXCEPT.
This question was asked some time ago, and someone posted an article comparing NOT IN, NOT EXISTS and LEFT OUTER JOIN ... IS NULL. It would be interesting if someone could find it again!
A left join is often the most efficient option, as "NOT IN" can sometimes prevent a query from using an index, if present.
The answer in this case would be something like
SELECT DISTINCT
a.Id
FROM
TableA a
LEFT JOIN
TableB b
ON
a.Id = b.Id
WHERE
b.Id IS NULL
Alternatively, this is more readable than a left join, and often more efficient than the NOT IN solutions:
SELECT DISTINCT a.Id FROM TableA a WHERE NOT EXISTS (SELECT * FROM TableB b WHERE b.Id = a.Id)
I'd use a NOT EXISTS Like this:
SELECT A.Id
FROM TableA A
WHERE NOT EXISTS (SELECT B.Id FROM TableB B WHERE A.Id = B.Id)
GROUP BY A.Id
SELECT DISTINCT Id FROM A WHERE Id NOT IN (SELECT ID FROM B);
SELECT DISTINCT Id FROM A WHERE Id NOT IN(SELECT DISTINCT Id FROM B);
The subquery gets all the IDs in B. The GROUP BY then returns each ID in A that is not in B exactly once. No HAVING clause is needed; a filter such as count(*) > 1 would drop IDs that appear only once in A.
select id
from A
where id not in
(
select distinct id
from B
)
group by id;
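One practical difference between NOT IN and NOT EXISTS is how they react to a NULL in table B. The sketch below runs both forms through Python's sqlite3 module (an assumption, chosen only to make the example runnable). The second Bob row and the NULL row in TableB are hypothetical additions: the first shows why DISTINCT is needed, the second triggers the NULL behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Id INTEGER, Name TEXT, Department INTEGER);
    CREATE TABLE TableB (Id INTEGER, Name TEXT);
    INSERT INTO TableA VALUES (0, 'Alice', 1), (0, 'Alice', 2), (1, 'Bob', 1), (1, 'Bob', 2);
    INSERT INTO TableB VALUES (0, 'Alice');
""")

NOT_EXISTS_SQL = """
    SELECT DISTINCT a.Id FROM TableA a
    WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.Id = a.Id)
"""
NOT_IN_SQL = "SELECT DISTINCT Id FROM TableA WHERE Id NOT IN (SELECT Id FROM TableB)"

def ids(sql):
    """Run a query and return its single column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# With no NULLs in TableB.Id, both forms agree.
before = (ids(NOT_EXISTS_SQL), ids(NOT_IN_SQL))

# A single NULL in the subquery makes "Id NOT IN (...)" evaluate to NULL
# instead of true for every non-matching row, so NOT IN returns nothing.
conn.execute("INSERT INTO TableB VALUES (NULL, 'Unknown')")
after = (ids(NOT_EXISTS_SQL), ids(NOT_IN_SQL))

print(before)
print(after)
```

If TableB.Id is declared NOT NULL (for example as a primary key), the two forms stay equivalent; otherwise NOT EXISTS or a LEFT JOIN ... IS NULL is the safer choice.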