Retrieve records from a table which has different partial key - sql

I have a table like as follows:
Table 1 Schema
ID/Name/Description are part of primary key.
Table Structure with data
Now I want to compare the table's records on the basis of ID and find the records that do not match. For example, from the screenshot above I want the last row as my query result.
I will be really thankful for any input. Thanks !

select t1.*
from
table t1
join
(
select name,description,comment
from
table t2
group by
name,description,comment
having count(*)=1) b
on t1.name=b.name
and t1.description=b.description
and t1.comment=b.comment
If you are using SQL Server, this does the trick:
SELECT TOP 1 WITH TIES ID,NAME,DESCRIPTION,COMMENT
FROM
#TEMP
ORDER BY
COUNT(ID) OVER (PARTITION BY NAME,DESCRIPTION,COMMENT )
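To see what the GROUP BY / HAVING variant above returns, here is a minimal sketch using Python's built-in sqlite3 module. The table name items and the sample rows are assumptions made for illustration, since the question's actual data is only shown as a screenshot.

```python
import sqlite3

# In-memory database with a hypothetical table and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER, name TEXT, description TEXT, comment TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (1, "A", "first", "same"),
        (2, "A", "first", "same"),
        (3, "B", "second", "other"),
    ],
)

# Keep only rows whose (name, description, comment) combination occurs once.
unmatched = conn.execute(
    """
    SELECT t1.id, t1.name, t1.description, t1.comment
    FROM items t1
    JOIN (
        SELECT name, description, comment
        FROM items
        GROUP BY name, description, comment
        HAVING COUNT(*) = 1
    ) b
      ON t1.name = b.name
     AND t1.description = b.description
     AND t1.comment = b.comment
    """
).fetchall()
print(unmatched)
```

With this sample data only the row with id 3 comes back, because rows 1 and 2 share the same name, description and comment.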

Related

How to check for only last value related to specific id - sql

I am trying to write a query that inserts data from one table into another, but only if the last email history record holds a different email value than the one in the customers table.
The scenario is as follows:
There is one table called customers and it looks like this:
Customer_Id
Customer_Email
There is second table called customers_email_history and it looks like this:
Customer_Email_History_Id
Customer_Id
Customer_Email
I would like to insert the current customer_email value from the customers table into customers_email_history, but only if it differs from the last (newest) record for that customer in customers_email_history. Here is an example:
SCENARIO 1: DO NOT INSERT DATA
As the customers_email_history table shows, the last row for the customer with id 1 is his current email from the customers table, so we won't insert a new row into customers_email_history.
SCENARIO 2: INSERT DATA
For the customer with id 2 we should insert a new row into the customers_email_history table, since the last (newest) row for that customer is not the same as his current email in the customers table. The customers table holds smith@john.com while the email history table holds smith1@john.com, so we should insert smith@john.com into the customers_email_history table.
I tried to write something like this, but it is not working :(
SELECT T1.id, T1.email
FROM customers AS T1 INNER JOIN customers_email_history AS T2
on T1.id = T2.customer_id
WHERE T1.email != (SELECT T2.email FROM email_history ORDER BY ID DESC LIMIT 1)
-- here I tried to get the last (newest) email for that customer and compare it with the current email, but it doesn't work like this :(
SQL Fiddle
Looks like a perfect case for EXCEPT ALL:
INSERT INTO customers_email_history(customer_id, email)
SELECT id, email
FROM customers
EXCEPT ALL
(
SELECT DISTINCT ON (customer_id)
customer_id, email
FROM customers_email_history
ORDER BY customer_id, id DESC
);
db<>fiddle here
See:
Select rows which are not present in other table
This assumes that customers_email_history.id really is an autoincrement column, such as serial or an IDENTITY column. See:
Auto increment table column
Otherwise, you need to add an ID manually.
Depending on undisclosed Postgres version, cardinalities, table definition and data distribution, there may be (much) faster solutions. See:
Select first row in each GROUP BY group?
Optimize GROUP BY query to retrieve latest row per user
Try this
SELECT T1.id, T1.email
FROM Customers AS T1
WHERE T1.email != (SELECT T2.email FROM Customers_Email_History AS T2 WHERE T1.Id = T2.Customer_ID ORDER BY ID DESC LIMIT 1)
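As a rough check of the correlated-subquery approach above, here is a small sketch run against SQLite through Python's sqlite3 module. The column names follow the queries in this thread (id, email, customer_id); the sample addresses are made up, and SQLite stands in for the unspecified database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE customers_email_history (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id INTEGER,
        email TEXT
    );
    INSERT INTO customers VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO customers_email_history (customer_id, email) VALUES
        (1, 'old@example.com'),
        (1, 'a@example.com'),   -- newest row for customer 1 matches
        (2, 'b@example.com'),
        (2, 'b1@example.com');  -- newest row for customer 2 differs
    """
)

# Customers whose current email differs from their newest history row.
changed = conn.execute(
    """
    SELECT c.id, c.email
    FROM customers AS c
    WHERE c.email <> (
        SELECT h.email
        FROM customers_email_history AS h
        WHERE h.customer_id = c.id
        ORDER BY h.id DESC
        LIMIT 1
    )
    """
).fetchall()
print(changed)
```

Only customer 2 is returned here. Note that a customer with no history rows at all is not returned either, because the subquery yields NULL and a comparison with NULL is never true.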

How can I make selection based on conditions on SQL?

There is a table of IDs and their status keys:
The table
I need to write a query that returns the row with the highest status key for each ID. For example, the query should return only the row with status key 9 for ID 123, but the row with status key 2 for ID 156.
Hope I managed to explain myself clearly. Please help me with this query.
Use the max() aggregate:
select id, max(status_key)
from tablename
group by id
You didn't tag your backend. This would work with many backends, including older versions of them (assuming your table has other columns too; otherwise a plain GROUP BY is enough):
select myTable.*
from myTable
inner join
(select id, max(statusKey) as statusKey
from myTable
group by id) tmp on myTable.id = tmp.id and myTable.statusKey = tmp.statusKey;
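The join against the per-ID maximum can be tried out with a short sketch using Python's sqlite3 module. The extra note column and the sample rows are assumptions, added only to show that whole rows are returned, not just the id and the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "note" is a hypothetical extra column standing in for the table's other data.
conn.execute("CREATE TABLE myTable (id INTEGER, statusKey INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(123, 1, "a"), (123, 9, "b"), (156, 2, "c"), (156, 1, "d")],
)

# Join each row to the maximum statusKey of its id and keep only the matches.
top_rows = conn.execute(
    """
    SELECT m.id, m.statusKey, m.note
    FROM myTable m
    INNER JOIN (
        SELECT id, MAX(statusKey) AS statusKey
        FROM myTable
        GROUP BY id
    ) tmp ON m.id = tmp.id AND m.statusKey = tmp.statusKey
    ORDER BY m.id
    """
).fetchall()
print(top_rows)
```

If two rows of the same id share the maximum status key, both are returned, which may or may not be what is wanted.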

SQL 'NOT IN' Operator doesn't give expected results when comparing columns in two tables

I need to update a table from a temporary table. Therefore I need to compare and find out what lines are not in the main table to be imported from the temp table.
My tables look like follows,
line_id -> nvarchar(20)
order_no -> nvarchar(20)
line_no ->int
Both tables have same fields but the temp table has more up to date records to be brought to the main table. I am using;
INSERT INTO main_table
SELECT * FROM temp_table t
WHERE t.line_id NOT IN (SELECT line_id FROM main_table)
But the condition WHERE t.line_id NOT IN (SELECT line_id FROM main_table) doesn't bring any order lines.
But when order_no is used instead of line_id, the comparison works and a number of order lines show up. However, order_no is not a unique key, so that comparison doesn't return all the lines needed.
It would be great if you could help me.. Thanking in advance!
NOT IN can cause odd trouble: if the subquery returns even one NULL, the condition is never true and no rows come back. Here's a different spin on the same idea.
Insert Into main_table
select t.*
from temp_table t
left outer join main_Table m
on t.line_id=m.line_id
where m.line_id is null
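One possible cause of the empty result (an assumption, since the question does not show the data) is a NULL line_id in main_table. The sketch below, using Python's sqlite3 module and made-up rows, compares NOT IN with the LEFT OUTER JOIN form on such data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE main_table (line_id TEXT, order_no TEXT, line_no INTEGER);
    CREATE TABLE temp_table (line_id TEXT, order_no TEXT, line_no INTEGER);
    -- main_table contains one row with a NULL line_id (hypothetical data)
    INSERT INTO main_table VALUES ('L1', 'O1', 1), (NULL, 'O9', 1);
    INSERT INTO temp_table VALUES ('L1', 'O1', 1), ('L2', 'O1', 2);
    """
)

# NOT IN: the NULL in the subquery makes the test unknown for 'L2'.
not_in_rows = conn.execute(
    """
    SELECT t.line_id
    FROM temp_table t
    WHERE t.line_id NOT IN (SELECT line_id FROM main_table)
    """
).fetchall()

# Anti-join: rows of temp_table without a match in main_table.
anti_join_rows = conn.execute(
    """
    SELECT t.line_id
    FROM temp_table t
    LEFT OUTER JOIN main_table m ON t.line_id = m.line_id
    WHERE m.line_id IS NULL
    """
).fetchall()

print(not_in_rows, anti_join_rows)
```

With this data the NOT IN query returns nothing, while the join returns the missing line L2. Adding WHERE line_id IS NOT NULL inside the NOT IN subquery would be another way to get the same rows.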
Try the following:
INSERT INTO main_table
SELECT * FROM temp_table t
WHERE LTRIM(RTRIM(t.line_id)) NOT IN (SELECT LTRIM(RTRIM(line_id)) FROM main_table)

How do I find duplicated values in related table and update them

This is my situation.
TABLE1:
DOCUMENT_ID,
GUID
TABLE2:
DOCUMENT_ID,
FILE
The tables are joined by DOCUMENT_ID, meaning that TABLE2 can have one or many rows with the same DOCUMENT_ID.
My problem is that TABLE2 values for whole bunch of DOCUMENT_ID have same FILE values.
I need a SQL query that will get me all GUID and count how many rows in TABLE2 for this DOCUMENT_ID have EXACTLY THE SAME FILE value (so that I can copy the GUID to Excel).
Then I need to UPDATE TABLE2's FILE columns for these cases.
For instance if DOCUMENT_ID has three rows in TABLE2 with same FILE value, I need to update two of them by adding a postfix like FILEVALUE-1, FILEVALUE-2 and so on.
Hope I make sense.
To all experts thank you in advance.
To get the duplicates, you might employ an old-fashioned GROUP BY:
select table1.guid, table1.document_id, table2.[file], count(*) cnt
from table1
inner join table2
on table1.document_id = table2.document_id
group by table1.guid, table1.document_id, table2.[file]
having count (*) > 1
To update the duplicates directly, you might use a CTE:
; with t2 as (
select id,
[file],
row_number() over (partition by document_id, [file]
order by id) rn
from table2
)
update t2
set [file] = [file] + '-' + convert(varchar(10), rn - 1)
where t2.rn > 1
Note that I've added ID as a placeholder for primary key. You need a way to identify a record to be updated.
There is a live test at SQL Fiddle.
This will get you all FILE values that have more than one DOCUMENT_ID:
Select FILE, COUNT(DOCUMENT_ID) as DOCUMENT_ID from table2
group by FILE
Having count(DOCUMENT_ID)>1
You can use a CTE to find the duplicate values in TABLE2:
WITH CTE_1 (DOCUMENT_ID,FILE, DuplicateCount)
AS
(
SELECT DOCUMENT_ID,FILE,
ROW_NUMBER() OVER(PARTITION BY DOCUMENT_ID,FILE ORDER BY DOCUMENT_ID) AS DuplicateCount
FROM table2
)
select *
FROM CTE_1
WHERE DuplicateCount >1
I have one approach in mind, though I am not sure whether it is feasible at your end. You can create a table with an identity column and insert all of your data into it. From there, handling duplicate data is straightforward, because every row can be addressed individually.
There are two ways of adding an identity column to a table with existing data:
Create a new table with an identity column, copy the data into it, drop the existing table, then rename the new table.
Create a new column with identity and drop the existing column.
For reference, I have found two articles:
http://blog.sqlauthority.com/2009/05/03/sql-server-add-or-remove-identity-property-on-column/
http://cavemansblog.wordpress.com/2009/04/02/sql-how-to-add-an-identity-column-to-a-table-with-data/

How to keep only one row of a table, removing duplicate rows?

I have a table that has a lot of duplicates in the Name column. I'd
like to only keep one row for each.
The following lists the duplicates, but I don't know how to delete the
duplicates and just keep one:
SELECT name FROM members GROUP BY name HAVING COUNT(*) > 1;
Thank you.
See the following question: Deleting duplicate rows from a table.
The adapted accepted answer from there (which is my answer, so no "theft" here...):
You can do it in a simple way assuming you have a unique ID field: you can delete all records that are the same except for the ID, but don't have "the minimum ID" for their name.
Example query:
DELETE FROM members
WHERE ID NOT IN
(
SELECT MIN(ID)
FROM members
GROUP BY name
)
In case you don't have a unique index, my recommendation is to simply add an auto-incremental unique index. Mainly because it's good design, but also because it will allow you to run the query above.
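Here is a minimal sketch of the MIN(ID) delete, run against SQLite through Python's sqlite3 module. The sample names are made up. Some engines (MySQL, for instance) reject a subquery on the table being deleted from unless it is wrapped in a derived table, so treat this as an illustration of the idea, not a portable statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT)")
# ids are assigned 1..5 in insertion order.
conn.executemany(
    "INSERT INTO members (name) VALUES (?)",
    [("amit",), ("akhil",), ("amit",), ("sara",), ("akhil",)],
)

# Delete every row that is not the lowest-id row for its name.
conn.execute(
    """
    DELETE FROM members
    WHERE id NOT IN (SELECT MIN(id) FROM members GROUP BY name)
    """
)

remaining = conn.execute("SELECT id, name FROM members ORDER BY id").fetchall()
print(remaining)
```

Each name keeps the row with its smallest id, and names that were never duplicated are left untouched.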
It would probably be easier to select the unique ones into a new table, drop the old table, then rename the temp table to replace it.
#create a table with same schema as members
CREATE TABLE tmp (...);
#insert the unique records
INSERT INTO tmp SELECT * FROM members GROUP BY name;
#swap it in
RENAME TABLE members TO members_old, tmp TO members;
#drop the old one
DROP TABLE members_old;
We have a huge database where deleting duplicates is part of the regular maintenance process. We use DISTINCT to select the unique records then write them into a TEMPORARY TABLE. After TRUNCATE we write back the TEMPORARY data into the TABLE.
That is one way of doing it and works as a STORED PROCEDURE.
If you want to see which rows you are about to delete first, select them like this, then delete them:
with MYCTE as (
SELECT DuplicateKey1
,DuplicateKey2 --optional
,count(*) X
FROM MyTable
group by DuplicateKey1, DuplicateKey2
having count(*) > 1
)
SELECT E.*
FROM MyTable E
JOIN MYCTE cte
ON E.DuplicateKey1=cte.DuplicateKey1
AND E.DuplicateKey2=cte.DuplicateKey2
ORDER BY E.DuplicateKey1, E.DuplicateKey2, CreatedAt
Full example at http://developer.azurewebsites.net/2014/09/better-sql-group-by-find-duplicate-data/
You can join the table with itself on the matching field and delete the rows that have a duplicate with a smaller id:
DELETE t1 FROM table_name t1
INNER JOIN table_name t2 ON t1.match_field = t2.match_field
WHERE t1.id > t2.id;
Delete duplicate rows and keep one.
The table may contain names that occur several times and names that occur only once; in both cases exactly one row per name is kept.
Assume the table has two columns, id and name, and we have to remove duplicate names while keeping one row for each.
This query works fine at my end:
DELETE FROM tablename
WHERE id NOT IN(
SELECT id FROM
(
SELECT MIN(id)AS id
FROM tablename
GROUP BY name HAVING
COUNT(*) > 1
)AS a )
AND id NOT IN(
(SELECT ids FROM
(
SELECT MIN(id)AS ids
FROM tablename
GROUP BY name HAVING
COUNT(*) =1
)AS a1
)
)
Before the delete, the table contained duplicate rows for amit and akhil. After the delete, the duplicate amit and akhil rows are gone and one record is kept for each.
If you want to remove duplicate records from a table:
CREATE TABLE tmp SELECT lastname, firstname, sex
FROM user_tbl
GROUP BY lastname, firstname;
DROP TABLE user_tbl;
ALTER TABLE tmp RENAME TO user_tbl;
Show the duplicated records:
SELECT `page_url`,count(*) FROM wl_meta_tags GROUP BY page_url HAVING count(*) > 1
Delete the duplicates, keeping one record per page_url:
DELETE FROM wl_meta_tags
WHERE meta_id NOT IN( SELECT meta_id
FROM ( SELECT MIN(meta_id)AS meta_id FROM wl_meta_tags GROUP BY page_url HAVING COUNT(*) > 1 )AS a )
AND meta_id NOT IN( (SELECT ids FROM (
SELECT MIN(meta_id)AS ids FROM wl_meta_tags GROUP BY page_url HAVING COUNT(*) =1 )AS a1 ) )
WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY [emp_id] ORDER BY [emp_id]) AS Row, * FROM employee_salary
)
DELETE FROM CTE
WHERE ROW <> 1