Finding the most recent records from duplicates - sql

I'm trying to get the most recent record for each ID from a table that holds several rows per ID.
Every month a new row is added for some IDs, but other IDs might not get a new row each month, so the data looks like this:
ID Date
1 8/30/2022
1 7/30/2022
3 8/30/2022
3 7/30/2022
3 6/30/2022
4 1/11/2021
The query result should be
ID Date
1 8/30/2022
3 8/30/2022
4 1/11/2021
I tried to use a sub-query, but it only returns records that have the most recent date for the whole table, not per ID, so it only returns the IDs that have a record on 8/30/2022.
This is my query:
create table test as (
select * from table1 inner join
(select EmpID, max(Record_Date) as maxdate
from table1 group by EmpID) ms
on table1.EmpID = ms.EmpID and table1.Record_Date = ms.maxdate)
WITH DATA;

You can use the NOT EXISTS operator with a correlated subquery, which keeps a row only when no newer row exists for the same ID:
SELECT T.ID, T.Date
FROM Table1 T
WHERE NOT EXISTS(SELECT * FROM Table1 D WHERE D.ID=T.ID AND D.Date>T.Date)
And of course, if you want to create a new table from this statement the query will be:
CREATE TABLE test AS
(
SELECT T.ID, T.Date
FROM Table1 T
WHERE NOT EXISTS(SELECT * FROM Table1 D WHERE D.ID=T.ID AND D.Date>T.Date)
) WITH DATA;
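As a quick check of the NOT EXISTS approach, here is a minimal sketch that runs the same query against the sample data from the question. It uses Python's built-in sqlite3 module rather than the asker's database, and it assumes the dates are stored in ISO format (YYYY-MM-DD) so that they compare correctly as text; the column name record_date and the helper latest_per_id are made up for the illustration.

```python
import sqlite3

def latest_per_id(rows):
    """Return the newest (id, record_date) row for each id using NOT EXISTS."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER, record_date TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT t.id, t.record_date
        FROM table1 AS t
        WHERE NOT EXISTS (
            -- keep t only if no row with the same id has a later date
            SELECT 1 FROM table1 AS d
            WHERE d.id = t.id AND d.record_date > t.record_date
        )
        ORDER BY t.id
        """
    ).fetchall()
    conn.close()
    return result

# Sample data from the question, with dates rewritten as ISO strings (assumption).
sample = [
    (1, "2022-08-30"), (1, "2022-07-30"),
    (3, "2022-08-30"), (3, "2022-07-30"), (3, "2022-06-30"),
    (4, "2021-01-11"),
]
print(latest_per_id(sample))
# → [(1, '2022-08-30'), (3, '2022-08-30'), (4, '2021-01-11')]
```

If two rows for the same ID share the same latest date, both are returned, since neither has a strictly newer sibling.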

Related

Max and Min records from 2 tables

I have 2 tables.
The 1st table has the columns fileID and createdate.
The 2nd table has userid, fileID, and createdate as common fields, along with other columns.
I am trying to write a query to find the latest fileid (max) and the first loaded fileid (min) based on createdate for a specific userid, by joining both tables, grouping by fileid and createdate, and filtering on the userid in the WHERE clause.
However, the result shows all the rows.
I need a suggestion on how to write a query that returns only 2 records (the max and min fileid records) from these tables, not every record.
I am using SQL Server to write the query.
Thanks for your help.
To select the fileid with the earliest or latest createdate, and to have the two in separate rows, you can try something like this (SQL Server uses TOP rather than LIMIT, and single quotes for string literals):
SELECT TOP 1 fileid, 'earliest' AS type FROM table1 WHERE createdate = (SELECT MIN(createdate) FROM table1)
UNION ALL
SELECT TOP 1 fileid, 'latest' AS type FROM table1 WHERE createdate = (SELECT MAX(createdate) FROM table1)
It is not clear why you want to join it with the second table, but you can do it like this:
SELECT
*
FROM
(
SELECT TOP 1 fileid, 'earliest' AS type FROM table1 WHERE createdate = (SELECT MIN(createdate) FROM table1)
UNION ALL
SELECT TOP 1 fileid, 'latest' AS type FROM table1 WHERE createdate = (SELECT MAX(createdate) FROM table1)
) AS subquery1
LEFT JOIN
table2 ON table2.fileid = subquery1.fileid
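The queries above look at the whole of table1, while the question asks for one specific userid. A minimal sketch of the per-user variant follows, run through Python's sqlite3 module. Note that this is the SQLite dialect (ORDER BY ... LIMIT 1 inside subqueries) rather than SQL Server syntax, and that the sample rows, the user names, and the helper first_and_last_file are invented for the illustration.

```python
import sqlite3

def first_and_last_file(conn, userid):
    """Return {'earliest': fileid, 'latest': fileid} for one user."""
    query = """
        SELECT fileid, kind FROM (
            -- oldest file of this user, by the createdate stored in table1
            SELECT u.fileid AS fileid, 'earliest' AS kind
            FROM table2 AS u JOIN table1 AS f ON f.fileid = u.fileid
            WHERE u.userid = ?
            ORDER BY f.createdate ASC LIMIT 1
        )
        UNION ALL
        SELECT fileid, kind FROM (
            -- newest file of this user
            SELECT u.fileid AS fileid, 'latest' AS kind
            FROM table2 AS u JOIN table1 AS f ON f.fileid = u.fileid
            WHERE u.userid = ?
            ORDER BY f.createdate DESC LIMIT 1
        )
    """
    return {kind: fileid for fileid, kind in conn.execute(query, (userid, userid))}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (fileid INTEGER, createdate TEXT);
    CREATE TABLE table2 (userid TEXT, fileid INTEGER, createdate TEXT);
    -- hypothetical sample rows
    INSERT INTO table1 VALUES (10, '2023-01-05'), (11, '2023-02-10'),
                              (12, '2023-03-15'), (13, '2023-01-01');
    INSERT INTO table2 VALUES ('alice', 10, '2023-01-05'), ('alice', 11, '2023-02-10'),
                              ('alice', 12, '2023-03-15'), ('bob', 13, '2023-01-01');
""")
print(first_and_last_file(conn, "alice"))
```

A user with a single file gets that file back under both keys, and a user with no files gets an empty dict.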

How to write a query to delete everything except maximum value grouped by an ID?

I am trying to write a query to delete duplicate records based on an ID and a value. There are multiple rows with the same ID. The steps to get the result are (with the queries I have written as per my understanding):
1. Look for the maximum value available for each ID in the Value column (SELECT * FROM TABLE WHERE VALUE IN (SELECT MAX(VALUE) FROM TABLE GROUP BY ID))
Example:
Table data:
ID - Value
a - 1
a - 2
a - 3
b - 2
c - 3
Output:
ID - Value
a - 3
b - 2
c - 3
2. Ignore the results from point 1 in the table (SELECT * FROM TABLE WHERE NOT EXISTS (SELECT * FROM TABLE WHERE VALUE IN (SELECT MAX(VALUE) FROM TABLE GROUP BY ID)))
Edit: I wrote a query that finally outputs the required result for point 2
SELECT t1.* FROM TABLE t1
LEFT JOIN
(
SELECT 1 AS aux, * FROM (SELECT * FROM TABLE
WHERE VALUE IN
(SELECT MAX(VALUE) FROM TABLE group by ID))
) t2
ON
t2.ID= t1.ID
and
t2.VALUE= t1.VALUE
WHERE t2.aux IS NULL
Example:
Table data:
ID - Value
a - 1
a - 2
a - 3
b - 2
c - 3
Output:
ID - Value
a - 1
a - 2
3. Use the query of point 2 to delete rows from the table (DELETE FROM TABLE WHERE (ID,VALUE) IN (SELECT * FROM TABLE WHERE NOT EXISTS (SELECT * FROM TABLE WHERE VALUE IN (SELECT MAX(VALUE) FROM TABLE GROUP BY ID))))
Example:
Table data:
ID - Value
a - 1
a - 2
a - 3
b - 2
c - 3
Table data after the delete:
ID - Value
a - 3
b - 2
c - 3
Point 2 does not work; it gives no results. When I compared the row count of the point 2 query output with the total row count of the table, there was a difference.
Since point 2 does not work, point 3 fails as well. What am I doing wrong?
After our discussion, I understand that you want to select the rows that match each id and its max(value). Therefore, I can suggest the following script:
SELECT
DISTINCT a.*
FROM
`test-proj-261014.sample.id_value` a
RIGHT JOIN (
SELECT
id,
MAX(value) AS max_val
FROM
`test-proj-261014.sample.id_value`
GROUP BY
id
ORDER BY
id) b
ON
a.id = b.id
AND a.value = b.max_val
WHERE
a.value IS NOT NULL
ORDER BY
id;
Note that I use SELECT DISTINCT, which removes duplicated rows from the result. In addition, because null values are possible, I added the condition WHERE a.value IS NOT NULL, which filters out rows that do not satisfy it.
The above query should solve the problem. However, if you find any discrepancy with the expected number of rows, I encourage you to explore your data set and find out why there are extra or fewer rows. You can use different types of joins to do so; one example would be the following query:
SELECT
a.*
FROM
`test-proj-261014.sample.id_value` a
LEFT JOIN (
SELECT
id,
MAX(value) AS max_val
FROM
`test-proj-261014.sample.id_value`
GROUP BY
id
ORDER BY
id) b
ON
a.id = b.id
AND a.value = b.max_val
WHERE
b.max_val IS NULL
ORDER BY
id;
This query retrieves all the rows that are not present in the output of the first query. This should help you better understand the data you are dealing with.
I hope it helps.
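For the delete step itself (point 3 in the question), one possible sketch is a DELETE with a correlated subquery that removes every row whose value is below the maximum for its id. The example below runs it in SQLite through Python's sqlite3 module, not in BigQuery, so the syntax may need adjusting for other engines; the table name id_value is borrowed from the queries above and the helper delete_all_but_max is made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id_value (id TEXT, value INTEGER)")
# Sample data from the question.
conn.executemany(
    "INSERT INTO id_value VALUES (?, ?)",
    [("a", 1), ("a", 2), ("a", 3), ("b", 2), ("c", 3)],
)

def delete_all_but_max(conn):
    """Delete rows whose value is below the per-id maximum; return rows deleted."""
    cur = conn.execute(
        """
        DELETE FROM id_value
        WHERE value < (
            -- maximum value among the rows sharing this row's id
            SELECT MAX(m.value) FROM id_value AS m WHERE m.id = id_value.id
        )
        """
    )
    conn.commit()
    return cur.rowcount

deleted = delete_all_but_max(conn)
remaining = conn.execute("SELECT id, value FROM id_value ORDER BY id, value").fetchall()
print(deleted, remaining)
# → 2 [('a', 3), ('b', 2), ('c', 3)]
```

Rows tied at the maximum for an id are all kept, and rows with a NULL value are left untouched because the comparison evaluates to NULL.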

SQLite - Return Rows Even If They Are Duplicates

I have a simple SQLite table which has just one ID column.
I have some variable IDs that may be duplicates of each other like: 1,2,3,4,3,1 (These IDs are just examples, there could be hundreds of them).
And I have a simple query as follows:
SELECT ID FROM TABLE WHERE ID in (1,2,3,4,3,1)
In the usual case the answer contains only 4 rows with ids 1,2,3,4. Is there any way to force SQLite to return rows in the order of the request (1,2,3,4,3,1) even if they are duplicates?
I have n IDs in my query and I want n rows in return even if they are duplicates.
Edit: The Table Definition is:
CREATE TABLE TEST(ID TEXT PRIMARY KEY)
You can use left join:
select t.*
from (select 1 as id, 1 as ord union all
select 2 as id, 2 as ord union all
select 3 as id, 3 as ord union all
select 4 as id, 4 as ord union all
select 3 as id, 5 as ord union all
select 1 as id, 6 as ord
) ids left join
TEST t
on t.id = ids.id
order by ids.ord;
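When the list of IDs comes from application code, the same idea can be generated rather than typed by hand. The sketch below is one way to do it with Python's sqlite3 module: it binds each requested id together with its position as parameters of a VALUES list and left-joins that list to the TEST table. The helper fetch_in_request_order is hypothetical, and the ids are passed as strings because the ID column is declared TEXT.

```python
import sqlite3

def fetch_in_request_order(conn, ids):
    """Return one result per requested id, in request order, duplicates included."""
    if not ids:
        return []
    # One "(?, ?)" pair per requested id: the id itself and its position.
    placeholders = ", ".join("(?, ?)" for _ in ids)
    params = [p for pos, id_ in enumerate(ids) for p in (id_, pos)]
    query = f"""
        WITH ids(id, ord) AS (VALUES {placeholders})
        SELECT t.ID
        FROM ids LEFT JOIN TEST AS t ON t.ID = ids.id
        ORDER BY ids.ord
    """
    return [row[0] for row in conn.execute(query, params)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST(ID TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO TEST VALUES (?)", [("1",), ("2",), ("3",), ("4",)])

print(fetch_in_request_order(conn, ["1", "2", "3", "4", "3", "1"]))
# → ['1', '2', '3', '4', '3', '1']
```

Because of the LEFT JOIN, an id that is not in the table still produces a row (returned here as None). For very long lists, SQLite's limit on bound parameters may apply, in which case loading the ids into a temporary table is probably the safer route.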

How to Compare Date between 2 different Tables

How can I get LastAmount and CurrentAmount from whichever of these two tables has the latest Date? For example, I want to get the value for March 2016, but the result I need is the value with the latest date.
SELECT TOP 1
*
FROM (
(SELECT * FROM TABLE1)
UNION ALL
(SELECT * FROM TABLE2)
) AS T WHERE monthCondition
ORDER BY T.CreatedData DESC

How to add 2 temporary tables together

I am creating temporary tables that have 2 columns, id and score, and I want to add them together.
The way I want to add them is: if both contain the same id, I do not want to duplicate the id but instead add the scores together.
If I have 2 temp tables called t1 and t2
and t1 had:
id 3 score 4
id 6 score 7
and t2 had:
id 3 score 5
id 5 score 2
I would end up with a new temp table containing:
id 3 score 9
id 5 score 2
id 6 score 7
The reason I want to do this is that I am trying to build a product search. I have a few algorithms I want to use, one using fulltext and another not. I want to use both, so I want to create a temporary table based on algorithm1 and a temp table based on algorithm2, then combine them.
How about:
SELECT id, SUM(score) AS score FROM (
SELECT id, score FROM t1
UNION ALL
SELECT id, score FROM t2
) t3
GROUP BY id
This is untested, but you should be able to perform a union on the two tables and then select from the result, grouping by id and summing the scores:
SELECT id,SUM(score) FROM
(
SELECT id,score FROM t1
UNION ALL
SELECT id,score FROM t2
) joined
GROUP BY id
Perform a full outer join on the ID. Select on the ID and the sum of the two "score" columns after coalescing the values to 0.
SELECT id, SUM(score) FROM
(
SELECT id, score FROM #t1
UNION ALL
SELECT id, score FROM #t2
) AS Temp
GROUP BY id
select id, sum(score)
from (
select * from table1
union all
select * from table2
) tables
group by id
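You need to create a union of those two tables, then you can easily group the results. Use UNION ALL rather than UNION, since UNION would drop a row that appears identically in both tables before the scores are summed.
SELECT id, sum(score) FROM
(
SELECT id, score FROM t1
UNION ALL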
SELECT id, score FROM t2
) as tmp
GROUP BY id;
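To see why the choice between UNION and UNION ALL matters for this problem, here is a small sketch using Python's sqlite3 module with temporary tables. The helper combined_scores and its union_op parameter are made up for the comparison; the first call uses the data from the question, the other two use a row that appears identically in both tables.

```python
import sqlite3

def combined_scores(t1_rows, t2_rows, union_op="UNION ALL"):
    """Sum scores per id across two temp tables, combined with the given operator."""
    if union_op not in ("UNION ALL", "UNION"):
        raise ValueError("union_op must be 'UNION ALL' or 'UNION'")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TEMP TABLE t1 (id INTEGER, score INTEGER)")
    conn.execute("CREATE TEMP TABLE t2 (id INTEGER, score INTEGER)")
    conn.executemany("INSERT INTO t1 VALUES (?, ?)", t1_rows)
    conn.executemany("INSERT INTO t2 VALUES (?, ?)", t2_rows)
    rows = conn.execute(
        f"""
        SELECT id, SUM(score) AS score FROM (
            SELECT id, score FROM t1
            {union_op}
            SELECT id, score FROM t2
        ) AS t3
        GROUP BY id
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return rows

# Data from the question.
print(combined_scores([(3, 4), (6, 7)], [(3, 5), (5, 2)]))
# → [(3, 9), (5, 2), (6, 7)]

# Same (id, score) pair in both tables: UNION collapses it before the sum.
print(combined_scores([(3, 4)], [(3, 4)], "UNION ALL"))  # → [(3, 8)]
print(combined_scores([(3, 4)], [(3, 4)], "UNION"))      # → [(3, 4)]
```

With the question's data both operators give the same answer, which is why the difference is easy to miss until two algorithms happen to assign the same score to the same product.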