I am using SQL Azure and I have a problem.
I have an id_pk column in my table that is identity(1,1). I started to save data to this table but suddenly I noticed that my PK was not 1, 2, 3... any more; it jumped from 38 to 1038, for example.
After that I kept inserting data and it was normal, but then it suddenly did another jump, from 1043 to 2039.
Why is this happening, and how can I fix it?
It happened not only in one table, but in at least 3 of them (the 3 I have noticed so far).
Short answer - I don't believe anything is wrong here. We can guess at a few explanations (insert and delete was my first thought, but jumps like this are also a known side effect of identity caching: SQL Server and SQL Azure pre-allocate blocks of identity values for performance, and unused values in a block are discarded after a restart or failover), but it won't change anything. Yes, the ID field is incrementing itself pretty rapidly, but the ID field is a behind-the-scenes number that the database cares about and not much else. In that light, I suggest no harm no foul; keep going.
If you do want to address this and 'reset' the ID field... create an identical table with the ID field set to identity(1,1) again. Insert all the rows from the old table into the new table; the IDs should then be sequential again. Be warned that any other table that has a foreign key to that ID field will likely lose its relationship. I'd only attempt this as a last (very last) resort.
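If it helps to see the "copy into a fresh table" reset run, here's a minimal sketch using Python and SQLite in place of SQL Azure (the `items` table and its columns are made-up names; SQL Server's identity and SQLite's AUTOINCREMENT behave similarly enough for this purpose):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")
# Simulate the gappy ids from the question.
con.executemany("INSERT INTO items (id, name) VALUES (?, ?)",
                [(1, "a"), (2, "b"), (38, "c"), (1038, "d")])

# Insert the rows (without their ids) into an identical new table, so the
# identity column numbers them 1, 2, 3, ... again.
con.execute("CREATE TABLE items_new (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")
con.execute("INSERT INTO items_new (name) SELECT name FROM items ORDER BY id")

rows = list(con.execute("SELECT id, name FROM items_new ORDER BY id"))
print(rows)  # -> [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
```

Note that this renumbering is exactly why foreign keys elsewhere break: any table still holding 1038 now points at nothing.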
Related
While creating a tkinter application to store book information, I realized that simply deleting a row of information from the SQL database does not update the indexes. It's kind of hard to explain, but here is a picture of what I mean:
link to picture. (still young on this account, so pictures can't be embedded, sorry for the inconvenience)
As you can see, the first column represents the index, and index 3 is missing because I deleted it. Is there a way such that, upon deleting a row, everything below it shifts up to cover the empty spot?
Your use of the word "index" must be based on the application language, not the database language. In databases, indexes are additional data structures that speed certain operations on tables.
You are referring to an "id" column, presumably one that is defined automatically as identity, auto_increment, serial, or whatever the underlying database uses.
A very important point is that deleting a row from a table does not affect the other rows in the table (unless you have gone through the work of writing triggers to make that happen). It just deletes the row.
The second, more important point is that you do not want to change the "identity" of rows -- and identifying the row is exactly what the column you are calling an "index" does. It not only identifies the row today; it identifies the same row tomorrow and, if it existed, yesterday. That is, you don't want to change the identity.
This is even more important when you have foreign key relationships -- that is, other tables that refer to this row. Those relationships could get all messed up if the ids start changing.
SQL does offer a simple way to get a number with no gaps:
select row_number() over (order by "index") as seqnum
from t;
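To see that query in action, here's a quick sketch using Python's built-in sqlite3 (SQLite has supported `row_number()` since 3.25; the table contents are invented to match the deleted-row-3 situation from the question):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE t ("index" INTEGER PRIMARY KEY, title TEXT)')
# Row with "index" 3 has been deleted, leaving a gap.
con.executemany("INSERT INTO t VALUES (?, ?)",
                [(1, "book1"), (2, "book2"), (4, "book4")])

# The stored ids keep their gap; the gap-free number is computed at query time.
rows = list(con.execute(
    'SELECT row_number() OVER (ORDER BY "index") AS seqnum, title FROM t'))
print(rows)  # -> [(1, 'book1'), (2, 'book2'), (3, 'book4')]
```

The stored identity never changes; only the displayed sequence number does, which is exactly the point of the answer above.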
I was testing a migration that deletes a primary key column id (I wanted to use a foreign key as the primary key). When I ran and then reverted the migration, I saw that the state of my table was the same, except that the id column is now the last one.
Will it change the behaviour of my database in any way, and should I bother to restore the column order in the migration's revert code?
In theory everything should be fine, but there are always scenarios where your code could fail.
For example:
a) blind insert:
INSERT INTO tab_name
VALUES (1, 'b', 'c');
A blind insert is when an INSERT query doesn’t specify which columns receive the inserted data.
Why is this a bad thing?
Because the database schema may change. Columns may be moved, renamed, added, or deleted. And when they are, one of at least three things can happen:
The query fails. This is the best-case scenario. Someone deleted a column from the target table, and now there aren't enough columns for the insert to go into, or someone changed a data type and the inserted type isn't compatible, or so on. But at least your data isn't getting corrupted, and you may even know the problem exists because of an error message.
The query continues to work, and nothing is wrong. This is a middle-worst-case scenario. Your data isn't corrupt, but the monster is still hiding under the bed.
The query continues to work, but now some data is being inserted somewhere it doesn't belong. Your data is getting corrupted.
b) ORDER BY ordinal. ORDER BY 1 sorts by whatever column happens to be first in the table, so if the column order changes, the query silently starts sorting by a different column:
SELECT *
FROM tab
ORDER BY 1;
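The blind-insert failure mode is easy to demonstrate. Here's a sketch with Python and SQLite (the table is the hypothetical `tab_name` from above); the same applies to the migration in the question, where the id column silently moved to the end:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tab_name (id INTEGER, b TEXT, c TEXT)")
con.execute("INSERT INTO tab_name VALUES (1, 'b', 'c')")  # blind insert works today...

# ...until the schema changes.
con.execute("ALTER TABLE tab_name ADD COLUMN d TEXT")
blind_insert_failed = False
try:
    con.execute("INSERT INTO tab_name VALUES (2, 'b', 'c')")
except sqlite3.OperationalError:
    blind_insert_failed = True  # 4 columns now, but only 3 values supplied

# Naming the columns keeps the insert valid across this schema change.
con.execute("INSERT INTO tab_name (id, b, c) VALUES (2, 'b', 'c')")
print(blind_insert_failed)  # -> True
```

This is the "query fails" best case; with a compatible column shuffle instead of an added column, the blind insert would have kept working and quietly put data in the wrong columns.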
Apologies for the clumsy title; feel free to suggest an improvement.
I have a table Records, and there's a UserId column that refers to who made the deposition. There's also a Counter column, which is identity(1,1) (it's important to keep in mind that it's not the same as the Id column, which is the primary key).
The problem became obvious when we started depositing from different accounts: before, a user could ask for records number 123 through 127 and get 5 amounts, but now their picks might be 123, 125, 126, or even worse, nothing at all.
The only option I can imagine to handle it is to create a business logic layer that checks for the user's highest deposition counter and adds the new record with that value increased by one.
But it sure would be nice to have it work automagically. Something like identity(1,1,guid). Is it possible?
The only option to handle it as far I can imagine to create a business logic layer that checks for the highest deposition counter for a user and adds the new record with that increased by one.
Time to learn.
Add the last given number to the account table.
Use a trigger on SQL Server's insert event to hand out the next number.
Finished. This obviously assumes your relational database is used in a relational fashion, so the records are related to a user table, not just holding a user id (which would be a terrible design).
If not, you can also maintain a table of all seen GUID's by means of said trigger.
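A sketch of those two steps, using Python and SQLite rather than SQL Server (trigger syntax differs, but the shape is the same; the `accounts`/`records` names and the seed data are assumptions):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE accounts (user_id INTEGER PRIMARY KEY,
                       last_counter INTEGER NOT NULL DEFAULT 0);
CREATE TABLE records  (id INTEGER PRIMARY KEY AUTOINCREMENT,
                       user_id INTEGER, counter INTEGER);

-- On every insert, bump the user's last number and stamp it on the new row.
CREATE TRIGGER records_counter AFTER INSERT ON records
BEGIN
    UPDATE accounts SET last_counter = last_counter + 1
     WHERE user_id = NEW.user_id;
    UPDATE records
       SET counter = (SELECT last_counter FROM accounts
                       WHERE user_id = NEW.user_id)
     WHERE id = NEW.id;
END;
""")
con.executemany("INSERT INTO accounts (user_id) VALUES (?)", [(1,), (2,)])
for user in (1, 2, 1, 1, 2):
    con.execute("INSERT INTO records (user_id) VALUES (?)", (user,))

rows = list(con.execute("SELECT user_id, counter FROM records ORDER BY id"))
print(rows)  # -> [(1, 1), (2, 1), (1, 2), (1, 3), (2, 2)]
```

Each user gets a dense 1, 2, 3... sequence regardless of how inserts from different users interleave, which is exactly what identity(1,1) alone can't do.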
To maintain such a column, you would need a trigger.
You might consider calculating the value when you query the table:
select r.*, row_number() over (partition by guid order by id) as seqnum
from records r;
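Here's that query-time approach run against sample data (Python with SQLite; ids chosen to mirror the 123-127 example from the question, with `user_id` standing in for the guid):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, user_id INTEGER)")
con.executemany("INSERT INTO records VALUES (?, ?)",
                [(123, 1), (124, 2), (125, 1), (126, 1), (127, 2)])

# Per-user sequence computed on the fly -- no stored counter to maintain.
rows = list(con.execute("""
    SELECT id, user_id,
           row_number() OVER (PARTITION BY user_id ORDER BY id) AS seqnum
    FROM records ORDER BY id"""))
print(rows)  # -> [(123, 1, 1), (124, 2, 1), (125, 1, 2), (126, 1, 3), (127, 2, 2)]
```

The trade-off versus the trigger: nothing to keep in sync on insert, but the numbers can shift if rows are ever deleted.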
WARNING: This tale of woe contains examples of code smells and poor design decisions, and technical debt.
If you are conversant with SOLID principles, practice TDD and unit test your work, DO NOT READ ON. Unless you want a good giggle at someone's misfortune and gloat in your own awesomeness knowing that you would never leave behind such a monumental pile of crap for your successors.
So, if you're sitting comfortably then I'll begin.
In this app that I have inherited and been supporting and bug-fixing for the last 7 months, I have been left with a DOOZY of a balls-up by a developer who left 6 and a half months ago. Yes, 2 weeks after I started.
Anyway. In this app we have clients, employees and visits tables.
There is also a table called AppNewRef (or something similar) that ... wait for it ... contains the next record ID to use for each of the other tables. So, may contain data such as :-
TypeID   Description   NextRef
1        Employees     804
2        Clients       1708
3        Visits        56783
When the application creates new rows for Employees, it looks in the AppNewRef table, gets the value, uses that value for the ID, and then updates the NextRef column. Same thing for Clients, and Visits and all the other tables whose NextID to use is stored in here.
Yes, I know: no auto-numbering IDENTITY columns in this database, all under the excuse of "when it was an Access app". These IDs are held in the (VB6) code as Longs, so up to 2,147,483,647 records are possible. OK, that seems to work fairly well (apart from the fact that the app, not the database, is taking care of locking, updating, etc.).
So, our users are quite happily creating Employees, Clients, Visits, etc. The Visits ID is steadily increasing a few dozen at a time. Then the problems happen. Our clients are causing database corruption while creating batches of visits, because the server is beavering away nicely and the app becomes unresponsive, so they kill the app using Task Manager instead of being patient and waiting. Granted, the app does seem to lock up.
Roll on to earlier this year and developer Tim (real name. No protecting the guilty here) starts to modify the code to do the batch updates in stages, so that the UI remains 'responsive'. Then April comes along, and he's working his notice (you can picture the scene now, can't you ?) and he's beavering away to finish the updates.
End of April, and beginning of May we update some of our clients. Over the next few months we update more and more of them.
Unseen by Tim (real name, remember), me (who started two weeks before Tim left), and the other new developer who started a week after, the IDs in the visits table start to take huge leaps upwards. By huge, I mean 10,000, 20,000, 30,000 at a time. Sometimes a few hundred thousand.
Here's a graph that illustrates the rapid increase in IDs used.
Roll on November. Customer phones Tech Support and reports that he's getting an error. I look at the error message and ask for the database so I can debug the code. I find that the value is too large for a long. I do some queries, pull the information, drop it into Excel and graph it.
I don't think making the code handle anything larger than a Long for the IDs is the right approach, as this app passes those IDs into other DLLs and OCXs, and breaking the interface on those just seems like a whole world of hurt that I don't want to encounter right now.
One potential idea that I'm investigating is to modify the IDs so that I can get them down to a lower range, essentially filling the gaps, using the ROW_NUMBER function.
What I'm thinking of doing is adding a new column to each of the tables that have a Foreign Key reference to these Visit ID's (not a proper foreign key mind, those constraints don't exist in this database). This new column could store the old (current) value of the Visit ID (oh, just to confuse things; on some tables it's called EventID, and on some it's called VisitID).
Then, for each of the other tables that refer to that VisitID, update to the new value.
Ideas ? Suggestions ? Snippets of T-SQL to help all gratefully received.
Option one:
Explicitly constrain all of your foreign key relationships, and set them to be ON UPDATE CASCADE.
This will mean that whenever you change the ID, the foreign keys will automatically be updated.
Then you just run something like this...
WITH resequenced AS
(
    SELECT
        ROW_NUMBER() OVER (ORDER BY id) AS newID,
        *
    FROM
        yourTable
)
UPDATE resequenced
SET id = newID
I haven't done this in ages, so I forget if it causes problems mid-update by having two records with the same id value. If it does, you could do something like this first...
UPDATE yourTable SET id = -id
Option two:
Ensure that none of your foreign key relationships are explicitly defined. If they are, note them down and remove them.
Then do something like...
CREATE TABLE temp
(
    newID INT IDENTITY (1,1),
    oldID INT
)
INSERT INTO temp (oldID) SELECT id FROM yourTable
/* Do this once for the table you are re-identifying */
/* Repeat this for all fact tables holding that ID as a foreign key */
UPDATE factTable
SET foreignID = temp.newID
FROM temp
WHERE factTable.foreignID = temp.oldID
Then re-apply any existing foreign key relationships.
This is a pretty scary option. If you forget to update a table, you just borked your data. But, you can give that temp table a much nicer name and KEEP it.
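Option two end to end, sketched with Python and SQLite (invented `visits`/`notes` tables standing in for the Visits table and one fact table; AUTOINCREMENT plays the role of IDENTITY(1,1)):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE visits (id INTEGER PRIMARY KEY);
CREATE TABLE notes  (visit_id INTEGER);  -- refers to visits.id, no declared FK
INSERT INTO visits VALUES (38), (1038), (56783);
INSERT INTO notes  VALUES (1038), (56783);

-- The "temp" mapping table: old id -> dense new id.
CREATE TABLE id_map (new_id INTEGER PRIMARY KEY AUTOINCREMENT, old_id INTEGER);
INSERT INTO id_map (old_id) SELECT id FROM visits ORDER BY id;

-- Update every fact table through the map, then the parent table itself.
UPDATE notes  SET visit_id = (SELECT new_id FROM id_map WHERE old_id = visit_id);
UPDATE visits SET id       = (SELECT new_id FROM id_map WHERE old_id = id);
""")
vis = [r[0] for r in con.execute("SELECT id FROM visits ORDER BY id")]
note_ids = [r[0] for r in con.execute("SELECT visit_id FROM notes ORDER BY 1")]
print(vis, note_ids)  # -> [1, 2, 3] [2, 3]
```

Keeping `id_map` around afterwards (under a nicer name) is the safety net: it's the only record of which old id became which new one.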
Good luck. And may the lord have mercy on your soul. And Tim's if you ever meet him in a dark alley.
I would create a numbers table that holds a sequence from 1 up to the maximum value of a Long, with an increment of 1, and then change the logic that gets the max id for VisitID (and maybe the others) to do a join between the numbers table and the visits table, looking for the min number that has no matching visit:
select min(number) from numbers left join visits on visits.id = numbers.number where visits.id is null
That way you get all the gaps filled in without having to change any of the other tables.
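A small runnable version of that idea, using Python and SQLite (with a toy 10-row numbers table instead of one spanning the whole Long range):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE visits (id INTEGER PRIMARY KEY)")
con.executemany("INSERT INTO visits VALUES (?)", [(1,), (2,), (4,), (7,)])

con.execute("CREATE TABLE numbers (number INTEGER PRIMARY KEY)")
con.executemany("INSERT INTO numbers VALUES (?)", [(n,) for n in range(1, 11)])

# The lowest number with no matching visit is the first reusable id.
gap = con.execute("""
    SELECT min(number) FROM numbers
    LEFT JOIN visits ON visits.id = numbers.number
    WHERE visits.id IS NULL""").fetchone()[0]
print(gap)  # -> 3
```

Concurrency is the catch here: two app instances can both see 3 as the lowest free id, so the lookup and insert would need to happen under a lock or in one transaction.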
But I would just redo the whole database.
Earlier today I asked this question which arose from A- My poor planning and B- My complete disregard for the practice of normalizing databases. I spent the last 8 hours reading about normalizing databases and the finer points of JOIN and worked my way through the SQLZoo.com tutorials.
I am enlightened. I understand the purpose of database normalization and how it can suit me. Except that I'm not entirely sure how to execute that vision from a procedural standpoint.
Here's my old vision: 1 table called "files" that held, let's say, a file id, a file url, and the appropriate grade levels for that file.
New vision!: 1 table for "files", 1 table for "grades", and a junction table to mediate.
But that's not my problem. This is a really basic Q that I'm sure has an obvious answer- When I create a record in "files", it gets assigned the incremented primary key automatically (file_id). However, from now on I'm going to need to write that file_id to the other tables as well. Because I don't assign that id manually, how do I know what it is?
If I upload text.doc and it gets file_id 123, how do I know it got 123 in order to write it to "grades" and the junction table? I can't do a max(file_id) because if you have concurrent users, you might nab a different id. I just don't know how to get the file_id value without having manually assigned it.
You may want to use LAST_INSERT_ID() as in the following example:
START TRANSACTION;
INSERT INTO files (file_id, url) VALUES (NULL, 'text.doc');
INSERT INTO grades (file_id, grade) VALUES (LAST_INSERT_ID(), 'some-grade');
COMMIT;
The transaction ensures that the operation remains atomic: either both inserts complete successfully or neither does. This is optional, but it is recommended in order to maintain the integrity of the data.
For LAST_INSERT_ID(), the most recently generated ID is maintained in the server on a per-connection basis. It is not changed by another client. It is not even changed if you update another AUTO_INCREMENT column with a nonmagic value (that is, a value that is not NULL and not 0).
Using LAST_INSERT_ID() and AUTO_INCREMENT columns simultaneously from multiple clients is perfectly valid. Each client will receive the last inserted ID for the last statement that client executed.
Source and further reading:
MySQL Reference: How to Get the Unique ID for the Last Inserted Row
MySQL Reference: START TRANSACTION, COMMIT, and ROLLBACK Syntax
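The same per-connection pattern exists in most database APIs. Here's the equivalent of the MySQL example above in Python with SQLite, where `cursor.lastrowid` plays the role of LAST_INSERT_ID() (table names match the question; the grade value is made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE files  (file_id INTEGER PRIMARY KEY AUTOINCREMENT, url TEXT);
CREATE TABLE grades (file_id INTEGER, grade TEXT);
""")

cur = con.cursor()
cur.execute("INSERT INTO files (url) VALUES (?)", ("text.doc",))
new_id = cur.lastrowid  # this connection's most recently generated id
cur.execute("INSERT INTO grades (file_id, grade) VALUES (?, ?)", (new_id, "3"))
con.commit()  # both inserts commit together

rows = list(con.execute("SELECT file_id, grade FROM grades"))
print(rows)  # -> [(1, '3')]
```

Like LAST_INSERT_ID(), `lastrowid` is per-connection, so concurrent inserts from other connections can't hand you someone else's id the way max(file_id) could.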
In PHP, to get the automatically generated ID of a MySQL record, use the insert_id property of your mysqli object.
How are you going to find the entry tomorrow, after your program has forgotten the value of last_insert_id()?
Using a surrogate key is fine, but your table still represents an entity, and you should be able to answer the question: what measurable properties define this particular entity? The set of these properties is the natural key of your table, and even if you use surrogate keys, such a natural key should always exist and you should use it to retrieve information from the table. Use the surrogate key to enforce referential integrity, for indexing purposes, and to make joins easier on the eye. But don't let them escape from the database.