Appending Rows into an SQLite Database Where Primary Key May Already Exist

I’m trying to merge a few pairs of SQLite3 databases that have the same tables (and schemas). Some of the tables are pretty simple and just have rows of plain data, but some of the tables have primary keys. Some of the keys are unique values like a URL (e.g. url LONGVARCHAR PRIMARY KEY), and some of them are just simple integer indexes, but NOT set to auto-increment (e.g. id INTEGER PRIMARY KEY).
I’ve found several topics on merging databases (and I had already manually merged one pair of non-primary-key databases without effort), but am concerned about the ones with keys which may already exist in both.
My question is: what happens if a row is inserted into a database where a row with the same key already exists? It should overwrite the row that has that key, right? I was hoping that it would append them to the table and update the key, but that only works if the key has a numeric component that is set to auto-increment, correct?
Can anyone confirm my suppositions—and if possible, offer a suggestion on the easiest way to append such rows?
Thanks a lot.

You should have no problems if you set the primary key in the destination table to auto increment.
Therefore, when you do your bulk insert command or whatever you are using to insert values into your new table, you simply do not supply input for your primary key field and there will NEVER be a duplicate.
Columns:
ID Name
Just don't provide the ID field, i.e.:
INSERT INTO tableName (Name) VALUES ('Synetech');
The insert would just add this with the next available ID index in the table.
Good Luck!
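As a minimal sketch of the idea above, here is how omitting the key looks from Python's built-in sqlite3 module. The table and column names (people, id, name) are made up for illustration. Note that in SQLite a column declared INTEGER PRIMARY KEY is assigned the next free integer when it is omitted, even without the AUTOINCREMENT keyword, so the schema from the question may already behave this way.

```python
import sqlite3

# Hypothetical table for illustration; the names are not from the original post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")

# A row that already occupies key 7.
conn.execute("INSERT INTO people (id, name) VALUES (7, 'Existing')")

# Leave the id column out: SQLite assigns one more than the largest existing key.
cur = conn.execute("INSERT INTO people (name) VALUES ('Synetech')")
new_id = cur.lastrowid

rows = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
print(new_id, rows)
```

This only helps for integer keys. A text key such as a URL has no "next value", so it still has to be supplied by the caller.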

If you try to INSERT a duplicate primary key, it will give you an error and not allow the insert. SQLite also supports the 'REPLACE INTO' syntax, which will update on a duplicate primary key.
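Both behaviours can be seen in a short sketch using Python's sqlite3 module. The pages table and its contents are assumptions made up for the example.

```python
import sqlite3

# Hypothetical table keyed by URL, loosely modelled on the question's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO pages VALUES ('http://example.com/', 'old title')")

# A plain INSERT with an existing key is rejected with an integrity error.
try:
    conn.execute("INSERT INTO pages VALUES ('http://example.com/', 'new title')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# REPLACE INTO removes the conflicting row and inserts the new one in its place.
conn.execute("REPLACE INTO pages VALUES ('http://example.com/', 'new title')")
pages = conn.execute("SELECT url, title FROM pages").fetchall()
print(duplicate_rejected, pages)
```

Because REPLACE deletes the old row before inserting, any column not supplied in the new row falls back to its default rather than keeping the old value.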
If you want to append on duplicates, you will have to check whether a row with that key already exists, and if so then change the key to some new value. The correct way to do this likely depends on your application. For integer keys you could just take the max+1, but for the url keys it's not clear what the correct behavior should be.
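For the integer-key case, one possible way to apply the max+1 idea to a whole table is to shift every incoming key by the destination's current maximum. The sketch below uses Python's sqlite3 module with a second in-memory database attached as src; the items table and its data are hypothetical, and with real files you would attach the source file path instead.

```python
import sqlite3

# Destination database, with a second (hypothetical) database attached as "src".
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS src")
conn.execute("CREATE TABLE main.items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE src.items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO main.items VALUES (?, ?)", [(1, "a"), (2, "b")])
conn.executemany("INSERT INTO src.items VALUES (?, ?)", [(1, "c"), (2, "d")])

# Shift every source key past the largest key already in the destination.
offset = conn.execute("SELECT COALESCE(MAX(id), 0) FROM main.items").fetchone()[0]
conn.execute(
    "INSERT INTO main.items (id, name) SELECT id + ?, name FROM src.items",
    (offset,),
)
conn.commit()

merged = conn.execute("SELECT id, name FROM main.items ORDER BY id").fetchall()
print(merged)
```

If other tables refer to these ids, the same offset would have to be applied to those references as well, otherwise they end up pointing at the wrong rows.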

Related

Is it possible that setting a column as a primary key could turn some values in a column NULL?

I needed to modify a data table by setting the id column as the PRIMARY KEY in order to work with it on a client I am developing. However, I forgot to copy/screen-cap the already existing records and now I get the feeling data is missing. Was it possible that setting a column as the primary key could've affected data in other columns?
FYI, I set the primary key by going into the design, right clicking the column, and clicking on "Set as primary key"

What 'design' tool were you using? The only thing that would make sense would be if there were duplicate ids, but that should give an error instead of picking random rows. The only other thing I can think of would be if you had a foreign key set to cascade deletes and deleted rows in the child table.
You only say you have an 'idea' that rows were deleted. That makes me think that you might be just doing some simple 'select top(100) *' query or something and don't see the same data as you had before, i.e. you used to see ids like 2093939 and now you only see 1, 2, 3, etc.
When creating a primary key or a clustered index, that can alter the order that rows are returned by default. Creating a clustered primary key would most likely then return the rows in ascending order in that case by default. Could this be the case?

When unique value can’t be used practically as a primary key in a database

I am having trouble answering the following question...
Illustrate by an example a scenario where an attribute has unique values in the different rows, yet it can’t be used practically as a primary key in the database relations/tables.

If the suggested column that has unique values is nullable and contains null values too, it cannot be a practical primary key. Because primary keys can't be null.
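A small sketch of that situation, using Python's sqlite3 module. The staff table and email column are invented for the example. SQLite, like most engines, lets a UNIQUE column hold several NULLs, so every non-null value is unique and yet some rows cannot be identified by the column at all.

```python
import sqlite3

# Hypothetical table: email is unique where present, but optional.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO staff (email) VALUES (?)",
    [("a@example.com",), (None,), (None,)],
)

# Two rows share the "missing" email without violating the UNIQUE constraint.
null_count = conn.execute(
    "SELECT COUNT(*) FROM staff WHERE email IS NULL"
).fetchone()[0]

# A key lookup by equality can never find them, because NULL = NULL is not true.
found = conn.execute(
    "SELECT COUNT(*) FROM staff WHERE email = NULL"
).fetchone()[0]
print(null_count, found)
```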

A non-sequential GUID is a bad candidate for a primary key, when the data is stored ordered by that key. An insert of a new row will not be appended to the table but must be inserted in the middle, meaning that data must be moved around to make room.
That is why there may also exist sequential GUIDs.

"random" is unique but practically as a key I would not use it.

SQL Server: How to allow duplicate records on small table

I have a small table "ImgViews" that only contains two columns, an ID column called "imgID" + a count column called "viewed", both set up as int.
The idea is to use this table only as a counter so that I can track how often an image with a certain ID is viewed / clicked.
The table has no primary or foreign keys and no relationships.
However, when I enter some data for testing and try entering the same imgID multiple times it always appears greyed out and with a red error icon.
Usually this makes sense as you don't want duplicate records but as the purpose is different here it does make sense for me.
Can someone tell me how I can achieve this or work around it ? What would be a common way to do this ?
Many thanks in advance, Tim.

To address your requirement to store non-unique values, simply remove primary keys, unique constraints, and unique indexes. I expect you may still want a non-unique clustered index on ImgID to improve performance of aggregate queries that would otherwise require a scan of the entire table and a sort. I suggest you store an insert timestamp, not to provide uniqueness, but to facilitate purging data by date, should the need arise in the future.

You must have some unique index on that table. Make sure there is no unique index and no unique or primary key constraint.
Or, SSMS simply doesn't know how to identify the row that was just inserted because it has no key.
It is generally not best practice to have a table without a (logical) primary key. In your case, I'd make the image id the primary key and increment the counter. The MERGE statement is well-suited for performing an insert or update at the same time. Alternatives exist.
If you don't like that, create a surrogate primary key (an identity column set as the primary key).
At the moment you have no way of addressing a specific row. That makes the table a little unwieldy.
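To illustrate the "image id as primary key, increment the counter" design, here is a rough sketch in Python with the sqlite3 module. It is not SQL Server's MERGE: it is one of the alternatives, an UPDATE followed by an INSERT when no row was touched, run against SQLite only so that the example is self-contained. The record_view helper is a hypothetical name.

```python
import sqlite3

# Same two columns as in the question, but with imgID as the primary key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ImgViews (imgID INTEGER PRIMARY KEY, viewed INTEGER NOT NULL)"
)

def record_view(conn, img_id):
    """Hypothetical helper: bump the counter, creating the row on first view."""
    cur = conn.execute(
        "UPDATE ImgViews SET viewed = viewed + 1 WHERE imgID = ?", (img_id,)
    )
    if cur.rowcount == 0:
        # No existing row was updated, so this is the first view of the image.
        conn.execute(
            "INSERT INTO ImgViews (imgID, viewed) VALUES (?, 1)", (img_id,)
        )
    conn.commit()

for img in (5, 5, 9, 5):
    record_view(conn, img)

counts = conn.execute(
    "SELECT imgID, viewed FROM ImgViews ORDER BY imgID"
).fetchall()
print(counts)
```

With this layout there is exactly one row per image, so the table stays small and every row can be addressed by its key.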

If you allow multiple rows to be absolutely identical, how would you update/delete one of those rows?
How would you expect the database to "know" which row you referred to?
At the very least add a separate identity column (preferably as the clustered index, too).
As a side note: It's weird that you "like to avoid unneeded data" but at the same time insert duplicates over and over again instead of simply adding up the click count per single image...

Use SQL statements, not the GUI, if the table has no primary key or unique constraint.

SQL Trigger: On update of primary key, how to determine which "deleted" record corresponds to which "inserted" record?

Assume that I know that updating a primary key is bad.
There are other questions which imply that the inserted and deleted table records match by position (the first of one matches the first of the other). Is this a fact or coincidence?
Is there anything that could join the two tables together when the primary key changes on an update?

There is no match of inserted+deleted virtual table row positions.
And no, you can't match rows
Some options:
there is another unique unchanging (for that update) key to link rows
limit to single row actions.
use a stored procedure with the OUTPUT clause to capture before and after keys
INSTEAD OF trigger with OUTPUT clause (TBH not sure if you can do this)
disallow primary key updates (added after comment)

Each table is allowed to have one identity column. Identity columns are not updateable; they are assigned a value when the records are inserted (or when the column is added), and they can never change. If the primary key is updateable, it must not be an identity column. So, either the table has another column which is an identity column, or you can add one to it. There is no rule that says the identity column has to be the primary key. Then in the trigger, rows in inserted and deleted that have the same identity value are the same row, and you can support updating the primary key on multiple rows at a time.

Yes -- create an "old_primary_key" field in the table you're updating, and populate it first.
Nothing you can do to match up the inserted and deleted pseudo table record keys -- even if you store their data in a log table somewhere.
I guess alternatively, you could create a separate log table that tracked changes to primary keys (old and new). This might be more useful than adding a field to the table you're updating as I suggested right at first, as it would allow you to track more than one change for a given record. Just depends on your situation, I guess.
But that said -- before you do anything, please go find a chalk board and write this 100 times:
I know that updating a primary key is bad.
I know that updating a primary key is bad.
I know that updating a primary key is bad.
I know that updating a primary key is bad.
I know that updating a primary key is bad.
...
:-) (just kidding)
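For what it's worth, the separate log table of old and new keys can be sketched in SQLite via Python's sqlite3 module. This is an assumption-laden stand-in, not SQL Server code: SQLite triggers fire once per row and expose OLD and NEW directly, so the matching problem from the question does not arise there. The customers and key_changes tables are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables: one with an updateable key, one logging key changes.
CREATE TABLE customers (code TEXT PRIMARY KEY, name TEXT);
CREATE TABLE key_changes (old_code TEXT, new_code TEXT);

-- Row-level trigger: OLD and NEW always describe the same row.
CREATE TRIGGER log_key_change AFTER UPDATE OF code ON customers
WHEN OLD.code <> NEW.code
BEGIN
    INSERT INTO key_changes (old_code, new_code) VALUES (OLD.code, NEW.code);
END;

INSERT INTO customers VALUES ('A1', 'Ann'), ('B2', 'Bob');

-- A multi-row update that changes every primary key at once.
UPDATE customers SET code = code || '-x';
""")

changes = conn.execute(
    "SELECT old_code, new_code FROM key_changes ORDER BY old_code"
).fetchall()
print(changes)
```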

Can you use auto-increment in MySql without it being the primary Key

I am using GUIDs as my primary key for all my other tables, but I have a requirement that needs to have an incrementing number. I tried to create a field in the table with the auto increment but MySql complained that it needed to be the primary key.
My application uses MySql 5, nhibernate as the ORM.
Possible solutions I have thought of are:
change the primary key to the auto-increment field but still have the Id as a GUID so the rest of my app is consistent.
create a composite key with both the GUID and the auto-increment field.
My thoughts at the moment are leaning towards the composite key idea.
EDIT: The Row ID (Primary Key) is the GUID currently. I would like to add an INT field that is auto-incremented so that it is human readable. I just didn't want to move away from the current standard in the app of having GUIDs as primary keys.

A GUID value is intended to be unique across tables and even databases, so make the auto_increment column the primary key and add a UNIQUE index on the GUID.
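A rough sketch of that layout follows, written against SQLite through Python's sqlite3 module rather than MySQL so that it runs without a server. The orders table and its column names are assumptions; in MySQL the equivalent column attributes would be AUTO_INCREMENT and a UNIQUE key.

```python
import sqlite3
import uuid

# Hypothetical table: human-readable sequence as the key, GUID kept unique.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
    " guid TEXT NOT NULL UNIQUE)"
)

# The application keeps generating GUIDs; the database hands out the numbers.
guids = [str(uuid.uuid4()) for _ in range(3)]
conn.executemany("INSERT INTO orders (guid) VALUES (?)", [(g,) for g in guids])
seqs = [row[0] for row in conn.execute("SELECT seq FROM orders ORDER BY seq")]

# The UNIQUE index still prevents the same GUID from appearing twice.
try:
    conn.execute("INSERT INTO orders (guid) VALUES (?)", (guids[0],))
    duplicate_guid_rejected = False
except sqlite3.IntegrityError:
    duplicate_guid_rejected = True
print(seqs, duplicate_guid_rejected)
```

The rest of the application can keep looking rows up by GUID, since the UNIQUE index makes those lookups just as direct as a primary key lookup.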

I would lean the other way.
Why? Because creating a composite key gives the impression to the next guy who comes along that it's OK to have the same GUID in the table twice but with different sequence numbers.

A couple of thoughts:
If your GUID is auto-incremental and unique, why not let it be the actual Primary Key?
On the other hand, you should never make semantic decisions based on programmatic problems: you have a problem with MySQL, not with the design of your DB.
So, a couple of workarounds here:
Creating a trigger that would set the GUID to the proper value once it's inserted. That's a MySQL solution to a MySQL problem, without altering semantics for your schema.
Before inserting, start a transaction (make sure auto commit is set to false), find out the latest GUID, increment and insert with the new value. In other words, auto-increment not automatically :P

GUID's are not intended to be orderable, that's why AUTO_INCREMENT for them does not make sense.
You may, though, use AUTO_INCREMENT for the second column of a composite primary key in MyISAM tables. You can create a composite key over the (GUID, INT) columns and make the second column AUTO_INCREMENT.
To generate a new GUID, just call UUID() in an INSERT statement or in a trigger.

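Yes, as long as the column is indexed: MySQL requires an auto_increment column to be defined as a key, but it does not have to be the primary key (a UNIQUE key works).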

If, for some reason, you can't change the identity column to be a primary key, what about manually generating the auto-increment via some kind of SEQUENCE table plus a trigger to query the SEQUENCE table and save the next value to use. Then assign the value to the destination table in the trigger. Same effect. The only question I would have is whether the auto-incremented value is going to make it back thru NHibernate without a re-select of the table.
