SQL: find duplicates and update table

I'm trying to update a table using a duplicate-finding query. My table has 60k+ records, and doing this in Excel is rather complicated. My table looks like this:
Serial_NO    Determine Duplicate
1            Good Record
2            Good Record
3            Good Record
1            Duplicate
The idea is to flag only the last (most recent) duplicate entry in the table: check the entire "Serial_NO" column, then write 'Good Record' or 'Duplicate' into the "Determine Duplicate" column.
Thanks in advance for your help!

This assumes there is an "id" field on the table, as indicated in your comments:
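-- SQL Server does not accept an alias directly after UPDATE, so alias the table in FROM.
-- The earliest row of each Serial_NO is the good record:
update x
set x.[Determine Duplicate] = 'Good Record'
from tbl x
where x.id = (select min(y.id) from tbl y where y.[Serial_NO] = x.[Serial_NO]);
-- Every later row with the same Serial_NO is a duplicate:
update x
set x.[Determine Duplicate] = 'Duplicate'
from tbl x
where x.id > (select min(y.id) from tbl y where y.[Serial_NO] = x.[Serial_NO]);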
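The two statements above can also be folded into one pass with a CASE expression. The following is only a sketch, run against an in-memory SQLite database from Python so it can be tried without a server; the table name `tbl`, the column names `serial_no` and `determine_duplicate`, and the auto-assigned `id` are assumptions standing in for the real schema, and SQLite syntax differs slightly from SQL Server.

```python
import sqlite3

# Assumed schema: an auto-incrementing id records insertion order.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, serial_no INTEGER, determine_duplicate TEXT)"
)
# Sample data taken from the question: serial 1 appears twice.
conn.executemany("INSERT INTO tbl (serial_no) VALUES (?)", [(1,), (2,), (3,), (1,)])

# The row with the smallest id per serial_no is the good record; any later row is a duplicate.
conn.execute("""
    UPDATE tbl
    SET determine_duplicate = CASE
        WHEN id = (SELECT MIN(y.id) FROM tbl AS y WHERE y.serial_no = tbl.serial_no)
        THEN 'Good Record'
        ELSE 'Duplicate'
    END
""")

flags = conn.execute(
    "SELECT serial_no, determine_duplicate FROM tbl ORDER BY id"
).fetchall()
print(flags)
```

With the question's four rows, only the second occurrence of serial 1 should end up flagged as a duplicate.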

One method is with ROW_NUMBER:
WITH t AS (
SELECT
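ROW_NUMBER() OVER (PARTITION BY Serial_NO ORDER BY id) AS seq -- order by the assumed "id" column; ordering by Serial_NO inside its own partition picks an arbitrary row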
,DetermineDuplicate
FROM dbo.YourTable
)
UPDATE t
SET DetermineDuplicate = CASE WHEN seq = 1 THEN 'Good Record' ELSE 'Duplicate' END;

After considering the comments, I think the following answer will suit your requirement.
UPDATE tbl
SET tbl.[Determine Duplicate] = tb.[Determine Duplicate]
FROM tb_src tbl
INNER JOIN
(
SELECT t1.[Serial_NO],
COALESCE(t2.isduplicate,'Good Record') AS [Determine Duplicate]
FROM
(
SELECT DISTINCT t.[Serial_NO]
FROM tb_src t
) t1
LEFT OUTER JOIN tb_src t2
ON t1.[Serial_NO] = t2.[Serial_NO] AND t2.isduplicate = 'Duplicate'
) tb
ON tb.[Serial_NO] = tbl.[Serial_NO]
With the above, a single query does the job.
Please note: tb_src is the table you provided.

Related

Check if a combination of fields already exists in the table

My weakest area of SQL is self-joins, and I'm currently struggling with an issue.
I need to find the latest entry in a table; I'm using WHERE DATEFIELD IN (SELECT MAX(DATEFIELD) FROM TABLE) to do this. I then need to establish whether 3 columns from that entry already exist in the same table.
My latest attempt looks like this:
SELECT * FROM PART_TABLE
WHERE NOT EXISTS
(
SELECT
t1.DATEFIELD
t1.CODE1
t1.CODE2
t1.CODE3
FROM PART_TABLE t1
INNER JOIN PART_TABLE t2 ON t1.UNIQUE = t2.UNQIUE
)
WHERE t1.DATEFIELD IN
(
SELECT MAX(DATEFIELD)
FROM PARTTABLE
)
)
I think part of the issue is that I can't exclude the unique row from t1 when checking in t2 using this method.
Using MSSQL 2014.
The following query returns the latest record from your table, plus a bit flag indicating whether a duplicate tuple {Code1, Code2, Code3} exists under a different identifier:
select top (1) p.*,
case when exists (
select 0 from dbo.Part_Table t where t.[Unique] != p.[Unique] -- UNIQUE is a reserved word in T-SQL, so it is bracketed
and t.Code1 = p.Code1 and t.Code2 = p.Code2 and t.Code3 = p.Code3
) then 1
else 0 end as [IsDuplicateExists]
from dbo.Part_Table p
order by p.DateField desc;
You can use this example as a template to address your specific needs, which unfortunately aren't immediately apparent from your explanation.
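To make the template concrete, here is a small sketch of the same EXISTS check, run against an in-memory SQLite database from Python. The column name `unique_id`, the sample rows, and the use of `LIMIT 1` in place of `TOP (1)` are assumptions made for illustration; they are not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for PART_TABLE; unique_id plays the role of the UNIQUE column.
conn.execute("""
    CREATE TABLE part_table (
        unique_id INTEGER PRIMARY KEY,
        datefield TEXT,
        code1 TEXT, code2 TEXT, code3 TEXT
    )
""")
conn.executemany(
    "INSERT INTO part_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2015-01-01", "A", "B", "C"),
        (2, "2015-02-01", "X", "Y", "Z"),
        (3, "2015-03-01", "A", "B", "C"),  # latest row, repeats the codes of row 1
    ],
)

# Take the latest row and flag whether another row carries the same three codes.
latest = conn.execute("""
    SELECT p.unique_id,
           CASE WHEN EXISTS (
               SELECT 1 FROM part_table AS t
               WHERE t.unique_id <> p.unique_id
                 AND t.code1 = p.code1
                 AND t.code2 = p.code2
                 AND t.code3 = p.code3
           ) THEN 1 ELSE 0 END AS is_duplicate
    FROM part_table AS p
    ORDER BY p.datefield DESC
    LIMIT 1
""").fetchone()
print(latest)
```

The `t.unique_id <> p.unique_id` condition is what keeps the latest row from matching itself, which was the difficulty mentioned in the question.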

Set boolean value to FALSE for each row in query with 'HAVING'?

(Sorry about the phrasing of the title.)
I have the following SQL query that retrieves all the rows I want:
SELECT
app_activity.name
FROM
app_chatmessage
JOIN
app_activity ON app_chatmessage.activity_id = app_activity.id
GROUP BY
app_activity.name
HAVING
COUNT(app_chatmessage.owner_id) = 1;
Now, the table app_chatmessage also has a column seen. I'd like to set this column to FALSE for all the rows returned by the aforementioned query. How do I do that?
Try this:
UPDATE app_chatmessage SET seen = FALSE WHERE
activity_id in
(
SELECT
app_activity.id
FROM
app_chatmessage
JOIN
app_activity ON app_chatmessage.activity_id = app_activity.id
GROUP BY
app_activity.id -- group by the selected column; selecting id while grouping by name is rejected by most engines
HAVING
COUNT(app_chatmessage.owner_id) = 1
);
Try the one below:
WITH CTE AS ( SELECT
app_activity.name, SEEN, COUNT(app_chatmessage.owner_id) over(partition by app_activity.name order by app_activity.name) CNT
FROM
app_chatmessage
JOIN
app_activity ON app_chatmessage.activity_id = app_activity.id
)
UPDATE CTE
SET SEEN = FALSE
WHERE CNT =1
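For reference, the subquery-based update can be tried end to end with a small sketch like the one below, which uses an in-memory SQLite database from Python. The sample rows are invented, `seen` is stored as 0/1 because SQLite has no separate boolean type, and the join to app_activity is dropped on the assumption that activity names are unique, so grouping by activity_id is equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_activity (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE app_chatmessage (
        id INTEGER PRIMARY KEY,
        activity_id INTEGER,
        owner_id INTEGER,
        seen INTEGER
    );
    INSERT INTO app_activity VALUES (1, 'hiking'), (2, 'chess');
    -- Activity 1 has a single message, activity 2 has two.
    INSERT INTO app_chatmessage VALUES
        (1, 1, 10, 1),
        (2, 2, 10, 1),
        (3, 2, 11, 1);
""")

# Mark messages as unseen only for activities that have exactly one message.
conn.execute("""
    UPDATE app_chatmessage
    SET seen = 0
    WHERE activity_id IN (
        SELECT activity_id
        FROM app_chatmessage
        GROUP BY activity_id
        HAVING COUNT(owner_id) = 1
    )
""")

seen_by_message = conn.execute(
    "SELECT id, seen FROM app_chatmessage ORDER BY id"
).fetchall()
print(seen_by_message)
```

Only the lone message of the first activity should be reset; both messages of the second activity keep their original value.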

Selective update in SQL Server

I've created a junction table like this one:
http://imageshack.us/scaled/landing/822/kantotype.png
I was trying to figure out a query that could select some rows based on the PokémonID and then update only the first or second row of that filtered set.
For example:
Let's suppose that I would like to change the value of the TypeID from the second row containing PokémonID = 2. I cannot simply use UPDATE KantoType SET TypeID = x WHERE PokémonID = 2, because it will change both rows!
I've already tried to use subqueries containing IN, EXISTS and LIMIT, but with no success.
It's unclear what you are trying to do. However, you can UPDATE with a JOIN like so:
UPDATE k1 -- the target alias was missing after UPDATE
SET k1.TypeID = 'something' -- or some value from k2
FROM KantoType k1
INNER JOIN
(
/* your filtering SELECT goes here */
) k2 ON k1.PokémonID = k2.PokémonID
WHERE k1.PokémonID = 2;
Or, if you want to update only one of the two rows that have PokémonID = 2, you can do this:
WITH CTE
AS
(
SELECT *,
ROW_NUMBER() OVER(ORDER BY TypeID) rownum
FROM KantoType
WHERE PokemonID = 2
)
UPDATE c
SET c.TypeID = 5
FROM CTE c
WHERE c.rownum = 1;
SQL Fiddle Demo
I can suggest something like this if you just need to update a single line in your table:
UPDATE kantotype
SET
type = 2
WHERE pokemon = 2
AND NOT EXISTS (SELECT * FROM kantotype k2
WHERE kantotype.type > k2.type
AND kantotype.pokemon = k2.pokemon)
It would be easier to get the first or last item of the table if you had a unique identifier field in your table.
I'm not even sure whether you are trying to update the row with PokémonID = 2 by filtering on TypeID. So, purely as an assumption (a big one), you can give CASE a try:
UPDATE yourtable a
LEFT JOIN yourtable b on a.pokeid = b.pokeid
SET a.typeid = (CASE
WHEN a.typeid < b.typeid THEN yourupdatevalue
WHEN a.typeid > b.typeid THEN someothervalue
ELSE a.typeid END);
If you know the pokemon ID and the type id then just add both to the where clause of your query.
UPDATE KantoType
SET TypeID = x
WHERE PokémonID = 2
AND TypeID=1
If you don't know the type ID, then you need to provide more information about what you're trying to accomplish. It's not clear why you don't have this information.
Perhaps think about what is the unique identifier in your data set.
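When the table has no identifier column of its own, one way to single out exactly one of the matching rows is to select it by an internal row identifier. The sketch below does this in SQLite from Python using the built-in `rowid`; this is a swapped-in technique, not the ROW_NUMBER CTE shown earlier, and the sample rows and the replacement value 5 are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Junction table without its own key, as in the question (column named PokemonID here).
conn.execute("CREATE TABLE KantoType (PokemonID INTEGER, TypeID INTEGER)")
conn.executemany(
    "INSERT INTO KantoType VALUES (?, ?)",
    [(1, 12), (2, 12), (2, 4), (3, 10)],  # hypothetical sample rows
)

# Pick exactly one row for PokemonID = 2 (the one with the lowest TypeID) and update it.
conn.execute("""
    UPDATE KantoType
    SET TypeID = 5
    WHERE rowid = (
        SELECT rowid
        FROM KantoType
        WHERE PokemonID = 2
        ORDER BY TypeID
        LIMIT 1
    )
""")

rows = conn.execute(
    "SELECT PokemonID, TypeID FROM KantoType ORDER BY PokemonID, TypeID"
).fetchall()
print(rows)
```

Changing the ORDER BY direction, or adding an OFFSET, would select the other row instead.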

select complete rows using subset of columns from a subquery (single table)

I am trying to solve the following problem, illustrated in this table, sql statement and comments
TABLE COLUMNS: id, version, idx_on; PK is 'id' column
So, the subquery gives me a set of tuples {id, version}.
I want to set the IDX_ON value for all rows whose ID and VERSION match one of the subquery's tuples. Alternatively, selecting all rows (ID, VERSION, IDX_ON) by the same criterion would be a good first step.
I tried without success to use something like:
SELECT * FROM docs where ID, VERSION in (subquery)
Thanks for any comment...
You can update all rows for which a later version exists:
update (
select *
from docs d1
where exists
(
select *
from docs d2
where d1.id = d2.id
and d2.version > d1.version
)
)
set idx_on = 0;
Updated SQL Fiddle.
This seemed to work:
update docs d
set d.idx_on = 0
where exists (select * from docs where id = d.id and version > d.version);
I'm not familiar with Oracle's SQL syntax; if it is the same as SQL Server's, try this:
UPDATE d
SET d.idx_on = 0
FROM docs d
INNER JOIN
( SELECT id,
MAX(version) AS "version"
FROM docs
GROUP BY id
)
q
ON q.id = d.id
AND q.version = d.version
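The `where ID, VERSION in (subquery)` form from the question comes close to a row-value comparison, which needs parentheses around the column list: `WHERE (id, version) IN (subquery)`. The sketch below tries this in SQLite (which supports row values from version 3.15) from Python. The sample rows are invented, and the subquery, which selects every (id, version) pair for which a later version exists, is only an assumed stand-in for the one in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, version INTEGER, idx_on INTEGER DEFAULT 1)")
conn.executemany(
    "INSERT INTO docs (id, version) VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (3, 1), (3, 2), (3, 3)],  # hypothetical sample rows
)

# Row-value IN: both columns must match one of the tuples produced by the subquery.
conn.execute("""
    UPDATE docs
    SET idx_on = 0
    WHERE (id, version) IN (
        SELECT d1.id, d1.version
        FROM docs AS d1
        JOIN docs AS d2
          ON d2.id = d1.id AND d2.version > d1.version
    )
""")

docs = conn.execute(
    "SELECT id, version, idx_on FROM docs ORDER BY id, version"
).fetchall()
print(docs)
```

After the update, only the highest version of each id should still have idx_on set.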

Update in child table, only one value got updated

Below I am trying to update a value in a child table by counting matching values in its parent table. The tables in my database are:
issue_dimension, with id column issue_id and a column accno.
star_schema, with id column star_id; this child table has the foreign key issue_id and a column book_frequency.
book_frequency needs to match the count of each accno in the parent table. I tried this:
update [test1] .[dbo] .star_schema
set [book_frequency] = (
select top 1 COUNT([issue_dimension].ACCNO)as book_frequency
from issue_dimension
group by ACCNO having (COUNT(*)>1) and
issue_dimension.ACCNO = star_schema .ACCNO
)
It only updates the count for the first value in issue_dimension. I need to count every accno in issue_dimension and write that count to the matching accno in star_schema.
I have never done an update that joins two or more tables. Can anyone help with doing this using joins?
UPDATE s
SET [book_frequency] = i.CNT
FROM [test1].[dbo].star_schema s
INNER JOIN
(
SELECT ACCNO, COUNT(*) as CNT
FROM issue_dimension
GROUP BY ACCNO
HAVING COUNT(*)>1
) i on (s.ACCNO = i.ACCNO)
I haven't tested it, but it should work.
Try in this way, without grouping, just with the WHERE clause:
UPDATE [test1].[dbo].star_schema SET
[book_frequency] =
(
SELECT COUNT([issue_dimension].ACCNO)
FROM issue_dimension
WHERE issue_dimension .ACCNO = star_schema.ACCNO
HAVING COUNT(*)>1
)
It's not fully clear to me, so this answer involves some guessing:
update s set
book_frequency = t.qty
from star_schema s
join issue_dimension i on s.issue_id = i.issue_id
join (select count(*) as qty, accno
from issue_dimension
group by accno
) t on i.accno = t.accno
Here's the example from BOL (SQL Server Books Online) that does the kind of thing you're looking for, using the AdventureWorks sample database:
USE AdventureWorks2008R2;
GO
UPDATE Sales.SalesPerson
SET SalesYTD = SalesYTD +
(SELECT SUM(so.SubTotal)
FROM Sales.SalesOrderHeader AS so
WHERE so.OrderDate = (SELECT MAX(OrderDate)
FROM Sales.SalesOrderHeader AS so2
WHERE so2.SalesPersonID = so.SalesPersonID)
AND Sales.SalesPerson.BusinessEntityID = so.SalesPersonID
GROUP BY so.SalesPersonID);
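The per-accno count discussed in the answers above can be checked with a small correlated-subquery sketch, run here against an in-memory SQLite database from Python. The sample rows are invented, the schema is reduced to the columns named in the question, and the `HAVING COUNT(*) > 1` filter is left out, so an accno that occurs once gets a frequency of 1 instead of being skipped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issue_dimension (issue_id INTEGER PRIMARY KEY, accno TEXT);
    CREATE TABLE star_schema (
        star_id INTEGER PRIMARY KEY,
        issue_id INTEGER,
        accno TEXT,
        book_frequency INTEGER
    );
    -- Hypothetical data: A100 was issued three times, B200 once.
    INSERT INTO issue_dimension VALUES (1, 'A100'), (2, 'A100'), (3, 'B200'), (4, 'A100');
    INSERT INTO star_schema (star_id, issue_id, accno) VALUES
        (1, 1, 'A100'),
        (2, 3, 'B200'),
        (3, 4, 'A100');
""")

# For every star_schema row, count the issue_dimension rows with the same accno.
conn.execute("""
    UPDATE star_schema
    SET book_frequency = (
        SELECT COUNT(*)
        FROM issue_dimension AS i
        WHERE i.accno = star_schema.accno
    )
""")

freq = conn.execute(
    "SELECT star_id, book_frequency FROM star_schema ORDER BY star_id"
).fetchall()
print(freq)
```

Because the subquery is correlated on accno and has no TOP 1, each row receives the count for its own accno instead of the first group's count.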