How can I efficiently check the uniqueness of an item in a database? - sql

I have a spreadsheet that needs to be uploaded, but each row needs to be checked for uniqueness against the database. One option I can think of is to check whether each row already exists in the database, which means running N SQL queries for N rows. Are there any alternatives for efficiently checking for unique data instead of checking row by row?

We can do that with either NOT EXISTS or NOT IN. First, insert the spreadsheet data into a temp table.
Let's call your main table TABLE_1 and your spreadsheet temp table TABLE_2.
Using NOT IN -
INSERT INTO TABLE_1 (id, name)
SELECT t2.id, t2.name FROM TABLE_2 t2
WHERE t2.id NOT IN (SELECT id FROM TABLE_1)
Using NOT EXISTS (the safer choice if TABLE_1.id can contain NULLs, because NOT IN then matches no rows) -
INSERT INTO TABLE_1 (id, name)
SELECT t2.id, t2.name FROM TABLE_2 t2
WHERE NOT EXISTS(SELECT id FROM TABLE_1 t1 WHERE t1.id = t2.id)
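The staging-table approach above can be sketched end to end as follows. This is a minimal sketch, not a definitive implementation: it assumes SQLite through Python's sqlite3 module purely so the example runs on its own, and the helper name insert_new_rows and the sample rows are hypothetical.

```python
import sqlite3

def insert_new_rows(conn, rows):
    """Load rows into a staging table, then insert only ids missing from TABLE_1."""
    cur = conn.cursor()
    # TABLE_2 plays the role of the spreadsheet temp table from the answer.
    cur.execute("CREATE TEMP TABLE IF NOT EXISTS TABLE_2 (id INTEGER, name TEXT)")
    cur.execute("DELETE FROM TABLE_2")
    cur.executemany("INSERT INTO TABLE_2 (id, name) VALUES (?, ?)", rows)
    # One set-based statement replaces N per-row existence checks.
    cur.execute(
        "INSERT INTO TABLE_1 (id, name) "
        "SELECT t2.id, t2.name FROM TABLE_2 t2 "
        "WHERE NOT EXISTS (SELECT 1 FROM TABLE_1 t1 WHERE t1.id = t2.id)"
    )
    inserted = cur.rowcount  # number of rows the INSERT ... SELECT added
    conn.commit()
    return inserted

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE_1 (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO TABLE_1 VALUES (1, 'alice')")

# Hypothetical spreadsheet content: id 1 already exists, ids 2 and 3 are new.
spreadsheet_rows = [(1, "alice"), (2, "bob"), (3, "carol")]
inserted = insert_new_rows(conn, spreadsheet_rows)
final_rows = conn.execute("SELECT id, name FROM TABLE_1 ORDER BY id").fetchall()
```

Note that this check only compares the spreadsheet against TABLE_1. If the spreadsheet itself contains the same id twice, both copies pass the NOT EXISTS test and the insert fails on the primary key, so the staging table may need deduplicating first.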

Related

SQL migration script - insert into select with output ID

I am using PostgreSQL and Flyway to perform a data migration in an application. The idea is to move rows from one table to another and keep the link between the old and new table in the old table. So, let's say we have Table_1 with columns (id, name, user_id) and a new Table_2 with similar columns (id2, name2, user_id2).
Now, the first step will be to add a column to Table_1 that will store the id of its counterpart in new Table_2. So:
alter table Table_1 add column if not exists migrated_table_2_id int;
And now I would like to write an SQL statement that migrates the data from Table_1 to Table_2 and at the same time fills in the id values in the migrated_table_2_id column. So something like:
insert into Table_2 (name2, user_id2) select name, user_id from Table_1;
but also filling in migrated_table_2_id with the id of the newly created row in Table_2.
You can use a CTE, assuming that name2, user_id2 or both in combination are unique:
with i as (
insert into Table_2 (name2, user_id2)
select name, user_id
from Table_1
returning *
)
update Table_1 t1
set migrated_table_2_id = i.id2
from i
where i.name2 = t1.name and i.user_id2 = t1.user_id;

Hive - cannot recognize input 'insert' in select clause

Say I've already created table3, and try to insert data into it using the following code
WITH table1
AS
(SELECT 1 AS key, 'One' AS value),
table2
AS
(SELECT 1 AS key, 'I' AS value)
INSERT TABLE table3
SELECT t1.key, t1.value, t2.value
FROM table1 t1
JOIN table2 t2
ON (t1.key = t2.key)
However, I got the error: cannot recognize input 'insert' in select clause. If I simply delete the insert statement, the query runs just fine.
Is this a syntax problem? Or can I not use a with clause to insert?
Use INTO or OVERWRITE depending on what you need:
INSERT INTO TABLE table3 --this will append data, keeping the existing data intact
or
INSERT OVERWRITE TABLE table3 --will overwrite any existing data
Read manual: Inserting data into Hive Tables from queries

Deleting duplicates from composite Primary key in SQL server

I am using SQL server 2012.
I have two tables with identical structure. (Say there are four columns - ID, C2, C3, C4)
I want to copy data from Table1 to Table2.
The only difference between the two tables is that Table1 contains the local ID in the 'ID' field and table 2 should have the global ID.
There is another third table that contains this mapping between the localID and the global ID.
(multiple local ID's can be mapped to one Global ID)
Example:
ID 'abcd' in russia can be global ID 123
ID 'cdef' in china can be global ID 123
therefore abcd in russia is basically cdef in china (this mapping is stored in table3)
I want to take the GlobalID from table3 corresponding to the local ID that we have in Table1 and insert them into table2.
Table2 has a primary key constraint defined on the column ID+C2.
The problem is that there is a high chance we'll have duplicate data, so we want to insert only distinct combinations of ID+C2. (clustered PK)
Can someone please help me with this? Sorry if this is confusing.
Right now I have this query, but it doesn't eliminate duplicates when copying data from T1 to T2, and thus I get an error on the PK.
INSERT INTO TABLE2
(ID,
C2,
C3,
C4)
SELECT T3.ID,
T1.C2,
T1.C3,
T1.C4
FROM TABLE1 T1
JOIN TABLE3 T3 ON T1.ID = T3.ID
JOIN TABLE2 T2 ON T2.ID = T1.ID
I'm not sure, but I guess you want something like
INSERT INTO Table2 (globalID)
SELECT DISTINCT m.globalID from Mapping m inner join Table1 t
on t.country = m.country and t.localId = m.localID
You can use any select, so alter it for your needs.
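Adapting that idea to the four-column tables from the question might look like the sketch below. It is only a sketch under stated assumptions: SQLite via Python's sqlite3 stands in for SQL Server so the example runs on its own, the Table3 column names localID and globalID are hypothetical, and MIN() is an arbitrary way to pick one C3 and one C4 per key (it picks each column independently).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (ID TEXT, C2 TEXT, C3 TEXT, C4 TEXT);
CREATE TABLE Table3 (localID TEXT, globalID INTEGER);      -- hypothetical mapping columns
CREATE TABLE Table2 (ID INTEGER, C2 TEXT, C3 TEXT, C4 TEXT,
                     PRIMARY KEY (ID, C2));
INSERT INTO Table1 VALUES ('abcd', 'x', 'r1', 'r2'),
                          ('cdef', 'x', 'c1', 'c2'),
                          ('abcd', 'y', 'r3', 'r4');
INSERT INTO Table3 VALUES ('abcd', 123), ('cdef', 123);
""")

# GROUP BY collapses rows that map to the same (global ID, C2) key;
# NOT EXISTS skips keys that Table2 already holds, so reruns are harmless.
COPY_SQL = """
INSERT INTO Table2 (ID, C2, C3, C4)
SELECT m.globalID, t1.C2, MIN(t1.C3), MIN(t1.C4)
FROM Table1 t1
JOIN Table3 m ON m.localID = t1.ID
WHERE NOT EXISTS (SELECT 1 FROM Table2 t2
                  WHERE t2.ID = m.globalID AND t2.C2 = t1.C2)
GROUP BY m.globalID, t1.C2
"""

conn.execute(COPY_SQL)
conn.execute(COPY_SQL)  # second run inserts nothing and raises no PK error
copied = conn.execute("SELECT ID, C2, C3, C4 FROM Table2 ORDER BY ID, C2").fetchall()
```

If C3 and C4 must come from the same source row, a window function such as ROW_NUMBER() partitioned by the key would be a better fit than MIN().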

copy records from one table to another query give error sql

I am using the query below to copy records from one table to another, but I get an error:
insert into table1 (datestamp)
select datestamp
from table2
where table1.datestamp is null
I want to copy records of datestamp from table 2 to table 1 where datestamp in table 1 is null.
Is this what you mean?
insert into table1 (datestamp)
select datestamp
from table2
where table2.datestamp is null
You are referencing table1.datestamp in the where clause, which is not allowed because table1 does not appear in the from clause of the select.
Perhaps you really want an update. If so, you need a way to link the two tables:
update t1
set datestamp = t2.datestamp
from table1 t1 join
table2 t2
on t1.id = t2.id
where t1.datestamp is null
I'm assuming the tables are tied together by some unique id? We'll call that tableID.
UPDATE table1 t1, table2 t2
SET t1.datestamp = t2.datestamp
WHERE t1.datestamp IS NULL
AND t1.tableID = t2.tableID
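The update-instead-of-insert idea from these answers can be tried out with the sketch below. It assumes SQLite via Python's sqlite3 so it runs on its own, and it assumes the two tables share an id column, as the answers do. A correlated subquery is used rather than either answer's join syntax, because update-with-join syntax differs between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, datestamp TEXT);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, datestamp TEXT);
INSERT INTO table1 VALUES (1, NULL), (2, '2020-01-01'), (3, NULL);
INSERT INTO table2 VALUES (1, '2021-05-05'), (2, '2022-02-02');
""")

# Fill only the NULL datestamps in table1, and only where table2 has a
# matching id; existing values and unmatched rows are left untouched.
conn.execute("""
UPDATE table1
SET datestamp = (SELECT t2.datestamp FROM table2 t2 WHERE t2.id = table1.id)
WHERE datestamp IS NULL
  AND EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = table1.id)
""")

updated = conn.execute("SELECT id, datestamp FROM table1 ORDER BY id").fetchall()
```

Row 2 keeps its original value because its datestamp was not NULL, and row 3 stays NULL because table2 has no matching id.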

sql query distinct join

Sorry, I accidentally deleted my earlier question again.
I have a situation where I am trying to select distinct values from table 1 that are new and store them in table 2. The problem is that table 1 has duplicates in the column "name". It does have a key column "id", but different ids can of course map to the same name.
My idea on the query would be
INSERT INTO TABLE2
(NAME, UniqueID)
SELECT DISTINCT TABLE1.NAME, TABLE1.ID
FROM TABLE1
LEFT JOIN TABLE2 ON TABLE1.ID=TABLE2.UniqueID
WHERE TABLE2.NAME IS NULL
Need help on getting the query to return my desired results, right now it still produces duplicates in table2 (on name column), which I don't want. I would want it to only append new records even if I run the query multiple times. For example if two new records were added into table1 but one has the name already in table 2, then the query would only add 1 new record to table2
Just a note: I am using MS Access, so it has strict syntax on single queries.
EDIT:
Following the input I received, I came up with this query:
INSERT INTO TABLE2
(NAME, UniqueID)
SELECT TABLE1.NAME, Min(TABLE1.ID)
FROM TABLE1
LEFT JOIN TABLE2 ON TABLE1.NAME=TABLE2.NAME
WHERE TABLE2.UniqueID IS NULL
Group By TABLE1.NAME;
but this actually had to be split into two separate queries in Access to run without a reserved error flag, and now I have run into an additional problem. When I run the two separate queries, it works fine the first time. But when I run them a second time, to test whether any new records have been added to table 1, it appends one record even though there are no new records in table 1: a blank name value with a duplicate unique id. It repeats this every time I run it.
Since you're pulling both Name and ID, the distinct keyword will only pull distinct combinations of those. Two records with the same Name and different ID's is still valid.
In the case of two Names with different ID's, which would you like to be inserted?...
insert into table2 (Name, UniqueID)
select t1.Name, MIN(t1.ID)
from table1 t1
left join table2 t2 on t1.ID = t2.UniqueID
where t2.Name is null
group by t1.Name
In response to comments: I realize the Name field is what should be joined on, to prevent inserting dupes that already exist.
insert into table2 (Name, UniqueID)
select t1.Name, MIN(t1.ID)
from table1 t1
left join table2 t2 on t1.Name = t2.Name
where t2.UniqueID is null
group by t1.Name
INSERT INTO TABLE2 (UniqueID, NAME)
SELECT min(t1.ID) as UniqueID, t1.NAME
FROM TABLE1 t1
LEFT JOIN TABLE2 t2 ON t1.ID=t2.UniqueID
WHERE t2.NAME IS NULL
group by t1.NAME
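The join-on-name variant can be checked for repeatability with the sketch below. It assumes SQLite via Python's sqlite3 rather than MS Access, so Access-specific behavior is not reproduced. The extra NAME IS NOT NULL filter is an addition based on a guess: rows with a NULL name never match in the join (NULL does not equal NULL), so without the filter they would be appended again on every run, which may be what causes the blank-name row described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE table2 (NAME TEXT, UniqueID INTEGER);
INSERT INTO table1 VALUES (1, 'a'), (2, 'a'), (3, 'b'), (6, NULL);
""")

# Join on NAME so a name already present in table2 is never appended again;
# MIN(ID) picks one id per name; NULL names are skipped (assumption, see above).
APPEND_SQL = """
INSERT INTO table2 (NAME, UniqueID)
SELECT t1.NAME, MIN(t1.ID)
FROM table1 t1
LEFT JOIN table2 t2 ON t1.NAME = t2.NAME
WHERE t2.UniqueID IS NULL AND t1.NAME IS NOT NULL
GROUP BY t1.NAME
"""

def snapshot():
    """Return table2 as a sorted list of (NAME, UniqueID) tuples."""
    return conn.execute("SELECT NAME, UniqueID FROM table2 ORDER BY NAME").fetchall()

conn.execute(APPEND_SQL)
first_run = snapshot()

conn.execute(APPEND_SQL)   # rerun with no new data in table1
second_run = snapshot()

# Two new source rows: 'b' already exists in table2, 'c' is genuinely new.
conn.execute("INSERT INTO table1 VALUES (4, 'b'), (5, 'c')")
conn.execute(APPEND_SQL)
third_run = snapshot()
```

Here the second run leaves table2 unchanged, and the third run appends only the row for 'c', which matches the behavior asked for in the question.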