insert data and avoid duplication by checking a specific column - sql

I have a local db into which I'm trying to insert multiple rows of data, but I do not want duplicates. I do not have a second db that I'm inserting from; I have an SQL file. The structure of the db I'm inserting into is this:
(db)artists
(table)names-> ID | ArtistName | ArtistURL | Modified
I am trying to do this insertion:
INSERT INTO names (ArtistName, Modified)
VALUES (name1, date1),
(name2, date2),
...
(name40, date40)
The question is: using SQL, how can I insert this list of data while avoiding duplicates by checking a specific column?

Duplicate what? Duplicate name? Duplicate row? I'll assume no dup ArtistName.
Have UNIQUE(ArtistName) (or PRIMARY KEY) on the table.
Use INSERT IGNORE instead of plain INSERT.
(No LEFT JOIN, etc., is needed.)
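Roughly, that looks like this (MySQL syntax; the table and columns are the ones from the question, the values and dates are placeholders):
ALTER TABLE names ADD UNIQUE KEY uq_artist_name (ArtistName);

INSERT IGNORE INTO `names` (ArtistName, Modified)
VALUES ('name1', '2014-01-01'),
       ('name2', '2014-01-02');
-- Rows whose ArtistName already exists in `names` are silently skipped.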

I ended up following the advice of @Hart CO a little bit by inserting all my values into a completely new table. Then I used this SQL statement:
SELECT ArtistName
FROM testing_table
WHERE NOT EXISTS
(SELECT ArtistName FROM names WHERE
names.ArtistName = testing_table.ArtistName)
This gave me all my artist names that were in my data and not in the name table.
I then exported to an sql file and adjusted the INSERT a little bit to insert into the names table with the corresponding data.
INSERT IGNORE INTO `names` (ArtistName) VALUES
*all my values from the exported data*
Here (ArtistName) could list any of the returned columns, for example
(ArtistName, ArtistURL, Modified), as long as each row of values from the export supplies the matching three values.
This is probably not the most efficient, but it worked for what I was trying to do.
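For reference, the two steps can likely be collapsed into a single statement (this assumes testing_table also holds the Modified values, which the question doesn't spell out):
INSERT INTO names (ArtistName, Modified)
SELECT t.ArtistName, t.Modified
FROM testing_table t
WHERE NOT EXISTS
    (SELECT 1 FROM names n WHERE n.ArtistName = t.ArtistName);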


How to append unique values from temp_tbl into original_tbl (SQL Server)?

I have a table that I'm trying to append unique values to. Every month I get a list of user logins to import into this table. I would like to keep all the original values and just append the new, unique values onto the existing table. Both the table and the flat file have a single column with unique values, built like this:
_____
login
abcde001
abcde002
...
_____
I'm bulk ingesting the flat file into a temp table, with this:
IF OBJECT_ID('tempdb..#FLAT_FILE_TBL') IS NOT NULL
DROP TABLE #FLAT_FILE_TBL
CREATE TABLE #FLAT_FILE_TBL
(
ntlogin2 nvarchar(15)
)
BULK INSERT #FLAT_FILE_TBL
FROM 'C:\ImportFiles\logins_Dec2021.csv'
WITH (FIELDTERMINATOR = ' ');
Is there a join that would give me the table with existing values + new unique values appended? I'd rather not hard code a loop to evaluate it line by line.
Something like (pseudocode):
append unique {login} from temp_tbl into original_tbl
Hopefully it's an easy answer for someone out there.
Thanks!
Poster on Reddit r/sql provided this answer, which I'm pursuing:
Merge statement?
It looks like using a merge statement will do exactly what I want. Thanks for those who already posted replies.
You can check whether a record exists using an EXISTS clause and insert it only if it doesn't exist in the target table. You can also use a MERGE statement to achieve the same. Depending on what you want to do with the existing records in the target table, you can modify the MERGE statement. Since you only want to insert new records here, you only need to specify what happens when a new record comes in. Here is an example:
MERGE original_tbl T
USING temp_tbl S
ON T.login = S.login
WHEN NOT MATCHED THEN
INSERT (login)
VALUES (S.login);
Another solution would be to left join the target table to the temp table and insert only when the record doesn't exist.
INSERT INTO original_tbl(login)
SELECT S.Login
FROM temp_tbl S
LEFT JOIN original_tbl T
ON S.Login = T.Login
WHERE T.Login IS NULL
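Tied back to the question's objects, the MERGE might look like this (original_tbl and its login column are assumptions taken from the pseudocode; the temp table column is ntlogin2 as in the BULK INSERT above):
MERGE original_tbl AS T
USING (SELECT DISTINCT ntlogin2 FROM #FLAT_FILE_TBL) AS S (login)
ON T.login = S.login
WHEN NOT MATCHED THEN
    INSERT (login)
    VALUES (S.login);
-- DISTINCT guards against repeated logins inside the flat file itself.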

Inserting values based on another column SQL

I've added the column DVDAtTime and I'm trying to insert values using a subquery. It seems rather straightforward, but I keep getting an error that I can't insert NULL into (I believe) an unrelated field in the table. Ultimately, DVDAtTime should be the number shown in MembershipType.
My code is as follows:
Insert Into Membership(DVDAtTime)
Select LEFT(MembershipType,1)
FROM Membership
I suspect you want to update each existing row, not insert new rows. The INSERT above appends brand-new rows and leaves every other column NULL, which is what triggers the not-null error:
update membership
set DVDAtTime = left(MembershipType, 1)
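If DVDAtTime was added as a numeric column (an assumption; the question only says it should hold the number), an explicit conversion makes that intent visible:
update membership
set DVDAtTime = cast(left(MembershipType, 1) as int)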

Does Oracle allow an SQL INSERT INTO using a SELECT statement for VALUES if the destination table has an GENERATE ALWAYS AS IDENTITY COLUMN

I am trying to insert rows into an Oracle 19c table to which we recently added a GENERATED ALWAYS AS IDENTITY column (column name is "ID"). The column should auto-increment and not need to be specified explicitly in an INSERT statement. Typical INSERT statements work, e.g. INSERT INTO table_name (field1, field2) VALUES ('f1', 'f2') (merely an example), and the ID field increments when a typical INSERT is executed. But the query below, which was working before the addition of the IDENTITY column, now fails with the error: ORA-00947: not enough values.
The field counts are identical except that the new ID IDENTITY field, which I expect to auto-increment, is not included. Is the INSERT INTO ... SELECT form not allowed with an IDENTITY column, and is that what is producing the error?
INSERT INTO T.AUDIT
(SELECT r.IDENTIFIER, r.SERIAL, r.NODE, r.NODEALIAS, r.MANAGER, r.AGENT, r.ALERTGROUP,
r.ALERTKEY, r.SEVERITY, r.SUMMARY, r.LASTMODIFIED, r.FIRSTOCCURRENCE, r.LASTOCCURRENCE,
r.POLL, r.TYPE, r.TALLY, r.CLASS, r.LOCATION, r.OWNERUID, r.OWNERGID, r.ACKNOWLEDGED,
r.EVENTID, r.DELETEDAT, r.ORIGINALSEVERITY, r.CATEGORY, r.SITEID, r.SITENAME, r.DURATION,
r.ACTIVECLEARCHANGE, r.NETWORK, r.EXTENDEDATTR, r.SERVERNAME, r.SERVERSERIAL, r.PROBESUBSECONDID
FROM R.STATUS r
JOIN
(SELECT SERVERSERIAL, MAX(LASTOCCURRENCE) as maxlast
FROM T.AUDIT
GROUP BY SERVERSERIAL) gla
ON r.SERVERSERIAL = gla.SERVERSERIAL
WHERE (r.LASTOCCURRENCE > SYSDATE - (1/1440)*5 AND gla.maxlast < r.LASTOCCURRENCE)
)
Thanks for any help.
Yes, it does; your example insert
INSERT INTO table_name (field1,field2) VALUES ('f1', 'f2')
would also work as
INSERT INTO table_name (field1,field2) SELECT 'f1', 'f2' FROM DUAL
db<>fiddle demo
Your problematic real insert statement is not specifying the target column list, so when it used to work it was relying on the columns in the table (and their data types) matching the results of the query. (This is similar to relying on select *, and potentially problematic for some of the same reasons.)
Your query selects 34 values, so your table had 34 columns. You have now added a 35th column to the table, your new ID column. You know that you don't want to insert directly into that column, but Oracle doesn't, at least not at the point where it's comparing the query with the table's columns. The table has 35 columns, and since you haven't said otherwise as part of the statement, it expects 35 values in the select list.
There's no way for Oracle to know which of the 35 columns you're skipping. Arguably it could guess based on the identity column, but that would be more work and inconsistent, and it's not unreasonable for it to insist you do the work to make sure it's right. It's expecting 35 values, it sees 34, so it throws an error saying there are not enough values - which is true.
Your question sort of implies you think Oracle might be doing something special to prevent the insert ... select ... syntax when there is an identity column, but in fact it's the opposite: it isn't doing anything special, and it's reporting the column/value count mismatch as it usually would.
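To make the count mismatch concrete, a small reproduction (hypothetical table and data):
CREATE TABLE demo (
  id   NUMBER GENERATED ALWAYS AS IDENTITY,
  val1 VARCHAR2(10),
  val2 VARCHAR2(10)
);

-- Works: the column list names the two non-identity columns.
INSERT INTO demo (val1, val2) SELECT 'f1', 'f2' FROM dual;

-- ORA-00947: demo has 3 columns, but only 2 values are supplied.
INSERT INTO demo SELECT 'f1', 'f2' FROM dual;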
So, you have to list the columns you are populating - you can't automatically skip one. Your statement needs to be:
INSERT INTO T.AUDIT (IDENTIFIER, SERIAL, NODE, ..., PROBESUBSECONDID)
SELECT r.IDENTIFIER, r.SERIAL, r.NODE, ..., r.PROBESUBSECONDID
FROM ...
using the actual column names of course if they differ from the query column names.
If you can't change that insert statement then you could make the ID column invisible; but then you would have to specify it explicitly in queries, as select * won't see it - but then you shouldn't rely on * anyway.
db<>fiddle
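A sketch of that last option (the table and column names are the ones from the question; INVISIBLE requires Oracle 12c or later):
ALTER TABLE t.audit MODIFY (id INVISIBLE);
-- An invisible column is ignored by SELECT * and by INSERT statements
-- that omit the column list, so the original 34-value query would work again.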

Insert several values into one column of a SQL table

I have a list of about 22,000 ids that I would like to insert into one SQL table. The table contains only one column which will contain all of the 22,000 ids.
How can I populate the column with all of these values in one query? Thanks.
It depends (as usual) where you have the values.
If the values reside in a table it is just insert into <yourTargetTable> select <yourColumns> from <yourSourceTable>.
If you have the values in a file, one way could be to load it with BTEQ's .IMPORT command. See an example here: https://community.teradata.com/t5/Tools-Utilities/BTEQ-examples/td-p/2466
Other options: SQL-Assistant, TD-Studio, TPT, Easy Loader ...
Search for "teradata import data from text file" and you'll get a lot of answers.
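A rough sketch of the BTEQ route (the file path, delimiter, and table/column names are placeholders):
.IMPORT VARTEXT ',' FILE = /path/to/ids.csv
.REPEAT *
USING (id VARCHAR(20))
-- with VARTEXT, every USING field must be declared VARCHAR
INSERT INTO yourTargetTable (id) VALUES (:id);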

Can't INSERT INTO SELECT into a table with identity column

In SQL server, I'm using a table variable and when done manipulating it I want to insert its values into a real table that has an identity column which is also the PK.
The table variable I'm making has two columns; the physical table has four, the first of which is the identity column, an integer PK. The data types of the columns I want to insert match the target columns' data types.
INSERT INTO [dbo].[Message] ([Name], [Type])
SELECT DISTINCT [Code],[MessageType]
FROM #TempTableVariable
This fails with:
Cannot insert duplicate key row in object 'dbo.Message' with unique index
'IX_Message_Id'. The duplicate key value is (ApplicationSelection).
But when inserting with just VALUES (...) it works OK.
How do I get it right?
It appears that the value "ApplicationSelection" is already in the database. You need to write the SELECT to exclude records that are already in the database. You can do that with a WHERE NOT EXISTS clause or a LEFT JOIN. Look up the index to see which field is unique besides the identity; that will tell you which field you need to check to see whether the record already exists.
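A sketch of the NOT EXISTS approach using the objects from the question (this assumes the unique index is actually on [Name], which the error message suggests):
INSERT INTO [dbo].[Message] ([Name], [Type])
SELECT DISTINCT t.[Code], t.[MessageType]
FROM #TempTableVariable t
WHERE NOT EXISTS
    (SELECT 1 FROM [dbo].[Message] m WHERE m.[Name] = t.[Code]);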