Custom sequence in postgres - sql

I need a custom unique identifier (sequence). My table has a field ready_to_fetch_id that is NULL by default; when a message is ready to be delivered, I update it to a new unique maximum id. This is quite a heavy process as the load increases.
So is it possible to have a sequence in Postgres that allows NULLs and still gives unique ids?

Allowing NULL values has nothing to do with sequences. If your column definition allows NULLs, you can put NULL values in the column. When you update the column, you take the nextval from the sequence.
Note that if you plan to use the ids to keep track of which rows you have already processed, it won't work perfectly. When two transactions update the ready_to_fetch_id column simultaneously, the transaction that started last might commit first, which means the higher id taken by the later transaction becomes visible before the lower id taken by the earlier one.
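To make the visibility problem concrete, here is a minimal sketch in plain Python. It is a toy model, not PostgreSQL: `sequence` stands in for nextval, `visible` stands in for committed rows, and `fetch_new` is a hypothetical poller that remembers the highest id it has seen.

```python
from itertools import count

sequence = count(1)      # stand-in for a database sequence
visible = []             # ids of rows whose transaction has committed

def fetch_new(last_seen):
    """Hypothetical poller: return committed ids greater than last_seen."""
    return sorted(i for i in visible if i > last_seen)

id_a = next(sequence)    # transaction A starts first and takes id 1
id_b = next(sequence)    # transaction B starts later and takes id 2

last_seen = 0
visible.append(id_b)     # B commits first
batch1 = fetch_new(last_seen)               # [2]
last_seen = max(batch1, default=last_seen)  # poller now believes ids <= 2 are done

visible.append(id_a)     # A commits afterwards
batch2 = fetch_new(last_seen)               # [] : id 1 is never picked up
```

The row with id 1 is committed but falls below the poller's high-water mark, so a "give me everything above the last id I saw" loop skips it.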

Related

DB2 auto increment changed by itself after restarted

Each time I restart my DB2 services, the auto-increment field changes by itself.
For example: before I restart, the auto-increment value is at 13 and it is incremented by 1; after I restart, it always becomes 31 and it is always incremented by 20.
Any idea what may cause this?
Each time I restart my DB2 service, I have to execute this command:
ALTER TABLE <table> ALTER COLUMN <column> RESTART WITH 1
DB2 keeps a cache of generated values in order to reduce the overhead of generating them (it reduces I/O). This cache lives in memory, and values are assigned from it as they are requested.
Take a look at the cache option when creating or altering the table. By default the cache value is 20.
It is important to understand how sequences work in DB2. Sequences share many concepts with generated values / identity columns.
Create table http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.sql.ref.doc/doc/r0000927.html
Alter table http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.sql.ref.doc/doc/r0000888.html
Sequences http://publib.boulder.ibm.com/infocenter/db2luw/v10r1/topic/com.ibm.db2.luw.admin.dbobj.doc/doc/c0023175.html
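As a rough illustration of why a restart makes the value jump, here is a toy model in Python. It is only a sketch of the caching idea, not DB2's actual implementation: the class name `CachedIdentity` and its methods are made up, and it assumes the cached values are simply discarded on restart. It does not try to reproduce the exact numbers from the question.

```python
class CachedIdentity:
    """Toy model of an identity generator with an in-memory cache."""

    def __init__(self, start=1, cache=20):
        self.cache = cache
        self.next_uncached = start  # persisted: first value not yet given to a cache
        self.cached = []            # in memory only, lost on restart

    def next_value(self):
        if not self.cached:
            # Reserve a whole block at once and advance the persisted counter.
            self.cached = list(range(self.next_uncached, self.next_uncached + self.cache))
            self.next_uncached += self.cache
        return self.cached.pop(0)

    def restart(self):
        # Assumption: unused cached values are thrown away when the service restarts.
        self.cached = []

ident = CachedIdentity()
before = [ident.next_value() for _ in range(13)]  # 1..13
ident.restart()
after = ident.next_value()                        # 21, values 14..20 are skipped
```

In this model a smaller cache means smaller gaps after a restart, at the cost of touching the persisted counter more often.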
From W3schools:
"Auto-increment allows a unique number to be generated when a new record is inserted into a table."
This is the only thing you may expect: unique (=non-conflicting) numbers. How these are generated is left to the DBMS. You must not expect a number sequence without any gaps.
For instance, a DBMS might choose to "pre-allocate" blocks of ten numbers (23..32, 33..42, ...) for performance reasons, so that the auto-increment field must only be incremented for every (up to) ten records. If you have an INSERT statement that inserts only 5 records into a newly created table, it can "acquire a block of 10 numbers" (0..9), use the first five values (0..4) of it and leave the rest unused. By acquiring this one block of numbers, the counter was incremented from 0 to 10. So the next INSERT statement that fetches a block will get the numbers ranging from 10 to 19.

SQLPlus Sequence - multiple tables

I am trying to use Dennis' solution here as an implementation of auto_increment in Oracle database. Say I create one sequence as follows:
CREATE SEQUENCE auto_increment
START WITH 1
INCREMENT BY 1;
If I want auto_increment behavior in multiple tables, can I just use this sequence for all tables? Or do I need a separate sequence per table? That is, will the sequence increment for one table be affected by another table using the sequence?
Yes, the sequence accesses will be affecting each other if you use the same sequence. However the tone of your question makes me think that you expect the sequence to be continuous.
Don't be fooled, sequences are NOT gap-free. The only thing you can be guaranteed is that the numbers retrieved are unique and in ascending order (in your case).
You can use the same sequence for many tables. It would be unconventional to do so, it would lead to more contention on the sequence, and it would make life a bit more difficult if you needed to reset the sequence value as a result of, say, an export and import between environments but it would work.
Of course, if the sequence gave a value of 1 for table A, it would never give that same value to a trigger defined on B. Since sequences do not generate gap-free sets of values (i.e. you can guarantee that there will be "missing" values in every table no matter how many sequences you create) that shouldn't be a major downside.
Sequences are sequential. However, there are many things that can cause gaps in the values a table ends up with, e.g. rollbacks (the sequence generator issues values irrespective of commits or rollbacks) and using the same sequence for multiple tables.
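A quick way to picture one sequence shared by several tables is the sketch below. It uses a Python counter as a stand-in for the Oracle sequence; `table_a` and `table_b` are hypothetical lists holding the ids each table received.

```python
from itertools import count

# Stand-in for: CREATE SEQUENCE auto_increment START WITH 1 INCREMENT BY 1
auto_increment = count(1)

table_a, table_b = [], []

# Inserts arrive interleaved across the two tables.
for target in (table_a, table_b, table_b, table_a, table_b):
    target.append(next(auto_increment))

# table_a == [1, 4]
# table_b == [2, 3, 5]
```

Every value is unique across both tables and ascending within each table, but neither table sees a gap-free run.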

Some sort of “different auto-increment indexes” per a primary key values

I have a table which has an id (primary key with auto increment), a uid (a key referring to a user's id, for example) and something else which doesn't matter for my question.
I want to have, let's call it, a different auto-increment counter on id for each uid value.
So, I will add an entry with uid 10, and the id field for this entry will be 1 because there were no previous entries with a value of 10 in uid. I will add a new one with uid 4 and its id will be 3 because there were already two entries with uid 4.
...A very obvious explanation, but I am trying to be as explanatory and clear as I can to demonstrate the idea.
What SQL engine can provide such a functionality natively? (non Microsoft/Oracle based)
If there is none, how could I best replicate it? Triggers perhaps?
Does this functionality have a more suitable name?
In case you know about a non-SQL database engine providing such functionality, name it anyway; I am curious.
Thanks.
MySQL's MyISAM engine can do this. See their manual, in section Using AUTO_INCREMENT:
For MyISAM tables you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.
The docs go on after that paragraph, showing an example.
The InnoDB engine in MySQL does not support this feature, which is unfortunate because it's better to use InnoDB in almost all cases.
You can't emulate this behavior using triggers (or any SQL statements limited to transaction scope) without locking tables on INSERT. Consider this sequence of actions:
Mario starts transaction and inserts a new row for user 4.
Bill starts transaction and inserts a new row for user 4.
Mario's session fires a trigger to compute MAX(id)+1 for user 4. It gets 3.
Bill's session fires a trigger to compute MAX(id)+1 for user 4. It also gets 3, because Mario's row is not committed yet.
Bill's session finishes his INSERT and commits.
Mario's session tries to finish his INSERT, but the row with (userid=4, id=3) now exists, so Mario gets a primary key conflict.
In general, you can't control the order of execution of these steps without some kind of synchronization.
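The interleaving above can be reproduced in a few lines. This is a sketch using Python's sqlite3 module with a single in-memory connection; the two "sessions" are simulated by doing both reads before either write, and the table name `entries` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (uid INTEGER, id INTEGER, PRIMARY KEY (uid, id))")
conn.executemany("INSERT INTO entries VALUES (?, ?)", [(4, 1), (4, 2)])

def next_id(uid):
    """Read-then-write numbering: compute MAX(id) + 1 for one uid."""
    row = conn.execute(
        "SELECT COALESCE(MAX(id), 0) + 1 FROM entries WHERE uid = ?", (uid,)
    ).fetchone()
    return row[0]

mario_id = next_id(4)   # 3
bill_id = next_id(4)    # 3 as well, nothing has been inserted yet

conn.execute("INSERT INTO entries VALUES (?, ?)", (4, bill_id))  # Bill finishes first

try:
    conn.execute("INSERT INTO entries VALUES (?, ?)", (4, mario_id))
    conflict = False
except sqlite3.IntegrityError:
    conflict = True     # Mario hits the primary key conflict
```

The gap between reading the maximum and writing the new row is where the race lives; anything that lets two sessions sit in that gap at once produces the duplicate.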
The solutions to this are either:
Get an exclusive table lock. Before trying an INSERT, lock the table. This is necessary to prevent concurrent INSERTs from creating a race condition like the one in the example above. You have to lock the whole table: since you're trying to restrict INSERT, there's no specific row to lock (if you were trying to govern access to a given row with UPDATE, you could lock just that row). But locking the table makes access to the table serial, which limits your throughput.
Do it outside transaction scope. Generate the id number in a way that won't be hidden from two concurrent transactions. By the way, this is what AUTO_INCREMENT does. Two concurrent sessions will each get a unique id value, regardless of their order of execution or order of commit. But tracking the last generated id per userid requires access to the database, or a duplicate data store. For example, a memcached key per userid, which can be incremented atomically.
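As a sketch of the second option, here is an in-process stand-in for an atomic per-key counter. It is not a memcached client: `PerKeyCounter` is a hypothetical class that only mimics the idea of an atomic increment keyed by userid, using a lock instead of a network service.

```python
import threading
from collections import defaultdict

class PerKeyCounter:
    """Hands out 1, 2, 3, ... per key; each increment is atomic."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = defaultdict(int)

    def incr(self, key):
        with self._lock:
            self._last[key] += 1
            return self._last[key]

counter = PerKeyCounter()
results = []
results_lock = threading.Lock()

def worker():
    # Each thread plays a session allocating ids for userid 4.
    for _ in range(100):
        value = counter.incr(4)
        with results_lock:
            results.append(value)

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All 800 allocated values are distinct no matter how the threads interleave, because the counter never looks at what has or hasn't been committed. The same property is why a value is lost for good if the transaction that took it rolls back.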
It's relatively easy to ensure that inserts get unique values. But it's hard to ensure they will get consecutive ordinal values. Also consider:
What happens if you INSERT in a transaction but then roll back? You've allocated id value 3 in that transaction, and then I allocated value 4, so if you roll back and I commit, now there's a gap.
What happens if an INSERT fails because of other constraints on the table (e.g. another column is NOT NULL)? You could get gaps this way too.
If you ever DELETE a row, do you need to renumber all the following rows for the same userid? What does that do to your memcached entries if you use that solution?
SQL Server should allow you to do this. If you can't implement this using a computed column (probably not - there are some restrictions), surely you can implement it in a trigger.
MySQL also would allow you to implement this via triggers.
In a comment you ask the question about efficiency. Unless you are dealing with extreme volumes, storing an 8 byte DATETIME isn't much of an overhead compared to using, for example, a 4 byte INT.
It also massively simplifies your data inserts, as well as being able to cope with records being deleted without creating 'holes' in your sequence.
If you DO need this, be careful with the field names. If you have uid and id in a table, I'd expect id to be unique in that table, and uid to refer to something else. Perhaps, instead, use the field names property_id and amendment_id.
In terms of implementation, there are generally two options.
1). A trigger
Implementations vary, but the logic remains the same. As you don't specify an RDBMS (other than NOT MS/Oracle) the general logic is simple...
Start a transaction (often this is Implicitly already started inside triggers)
Find the MAX(amendment_id) for the property_id being inserted
Update the newly inserted value with MAX(amendment_id) + 1
Commit the transaction
Things to be aware of are...
- multiple records being inserted at the same time
- records being inserted with amendment_id being already populated
- updates altering existing records
2). A Stored Procedure
If you use a stored procedure to control writes to the table, you gain a lot more control.
Implicitly, you know you're only dealing with one record.
You simply don't provide a parameter for DEFAULT fields.
You know what updates / deletes can and can't happen.
You can implement all the business logic you like without hidden triggers
I personally recommend the Stored Procedure route, but triggers do work.
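For the trigger option, the steps listed above might look like the following sketch. It uses SQLite through Python's sqlite3 module purely because it is easy to run; the table `amendments` and the trigger name are made up, and trigger syntax differs between engines. SQLite serializes writers, so this example does not expose the concurrency issues a multi-session server would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE amendments (
    property_id  INTEGER NOT NULL,
    amendment_id INTEGER,          -- filled in by the trigger
    note         TEXT
);

-- After each insert, number the new row within its property_id.
CREATE TRIGGER number_amendment AFTER INSERT ON amendments
WHEN NEW.amendment_id IS NULL
BEGIN
    UPDATE amendments
    SET amendment_id = (
        SELECT COALESCE(MAX(amendment_id), 0) + 1
        FROM amendments
        WHERE property_id = NEW.property_id
    )
    WHERE rowid = NEW.rowid;
END;
""")

conn.executemany(
    "INSERT INTO amendments (property_id, note) VALUES (?, ?)",
    [(10, "a"), (4, "b"), (4, "c"), (4, "d")],
)
rows = conn.execute(
    "SELECT property_id, amendment_id FROM amendments ORDER BY rowid"
).fetchall()
# rows == [(10, 1), (4, 1), (4, 2), (4, 3)]
```

The WHEN clause is one way to handle rows that arrive with amendment_id already populated: the trigger leaves them alone, and it is then up to a unique constraint to reject clashes.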
It is important to get your data types right.
What you are describing is a multi-part key. So use a multi-part key. Don't try to encode everything into a magic integer, you will poison the rest of your code.
If a record is identified by (entity_id,version_number) then embrace that description and use it directly instead of mangling the meaning of your keys. You will have to write queries which constrain the version number but that's OK. Databases are good at this sort of thing.
version_number could be a timestamp, as a_horse_with_no_name suggests. This is quite a good idea. There is no meaningful performance disadvantage to using timestamps instead of plain integers. What you gain is meaning, which is more important.
You could maintain a "latest version" table which contains, for each entity_id, only the record with the most-recent version_number. This will be more work for you, so only do it if you really need the performance.
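Here is a small sketch of the multi-part key idea, again with sqlite3 from Python so it can be run as is. The table name `entity_versions`, the `body` column and the timestamp values are invented for the example; version_number is stored as an ISO-8601 string so that plain string comparison orders the versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE entity_versions (
    entity_id      INTEGER NOT NULL,
    version_number TEXT    NOT NULL,   -- timestamp used as the version
    body           TEXT,
    PRIMARY KEY (entity_id, version_number)
)
""")
conn.executemany(
    "INSERT INTO entity_versions VALUES (?, ?, ?)",
    [
        (4, "2012-01-01T10:00:00", "first"),
        (4, "2012-01-02T09:30:00", "second"),
        (10, "2012-01-01T12:00:00", "only"),
    ],
)

# Latest version per entity, computed on demand instead of kept in a side table.
latest = conn.execute("""
SELECT v.entity_id, v.body
FROM entity_versions AS v
WHERE v.version_number = (
    SELECT MAX(version_number)
    FROM entity_versions
    WHERE entity_id = v.entity_id
)
ORDER BY v.entity_id
""").fetchall()
# latest == [(4, 'second'), (10, 'only')]
```

The composite primary key already gives the database an index to answer the "latest version" query, so the separate "latest version" table is only worth the extra work if this query turns out to be too slow.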

Is it a bad practice to use an identity column to determine the order of row creation? [duplicate]

Possible Duplicate:
Can I use a SQL Server identity column to determine the inserted order of rows?
If an identity column is reseeded, then it cannot be used to determine the order of row insertion, but I have no reason to ever reseed the identity.
Are there any reasons why I should not use the identity column to determine the order of creation?
Because it wouldn't be reliable would be the reason I would not use it. You might have two processes ask simultaneously for identity values and process 1 got the first value and process 2 got the second value but process 2 actually finished the transaction earlier and thus was inserted earlier. A datetime field for date inserted is the only reliable choice if you want to know the order that records were actually inserted.
It is not considered good practice. For example, on some servers two processes doing inserts on a table in simultaneous transactions can each have a chunk of ids assigned to them, so every row inserted by one transaction will have a lower id than every row inserted by the other, regardless of when the rows were actually inserted. This can also cause gaps in the sequence of ids, and there may be other scenarios where something unexpected happens.
In short, auto-incremented ids are not always guaranteed to be a continuous and ascending sequence.

Is there a best-practice method to swap primary key values using SQLite?

As a bit of background, I'm working with a SQLite database that is being consumed by a closed-source UI that doesn't order the results by the handy timestamp column (gee, thanks Nokia!) - it just uses the default ordering, which corresponds to the primary key, which is a vanilla auto-incrementing 'id' column.
I already have a map of the current and desired id values, but applying the mapping is my current problem. It seems I cannot swap the values directly, as an update processes rows one at a time, which would temporarily result in a duplicate value. I've tried using an update statement with case clauses and a temporary out-of-sequence value, but as each row is only processed once this obviously doesn't work. Thus I've reached the point of needing 3 update statements to swap a pair of values, which is far from ideal as I want this to scale well.
Compounding this, there are a number of triggers set up, which makes copying rows into a new table and deleting the old ones a complex problem unless I can disable the triggers for the duration of the copy and delete, which is why I haven't pursued that avenue yet.
I'm thinking my next line of enquiry will be a new column with the new ids then finding a way to move the primary index to it before removing the old column, but I'm throwing this out there in case anyone can offer up a better solution that will save me some time :)
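One possible way around the three-statements-per-pair problem, sketched here as an assumption rather than a tested fix for the real database: park every remapped row on the negative of its target id, then flip the signs back. It assumes all existing ids are positive, that the mapping is a permutation of the ids it touches, and it ignores the triggers mentioned above. The table `items` and the mapping table `id_map` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY AUTOINCREMENT, label TEXT);
INSERT INTO items (label) VALUES ('a'), ('b'), ('c');

-- Hypothetical mapping of current id -> desired id (here: swap 1 and 3).
CREATE TABLE id_map (old_id INTEGER PRIMARY KEY, new_id INTEGER UNIQUE);
INSERT INTO id_map VALUES (1, 3), (3, 1);
""")

with conn:  # both statements run in one transaction
    # Pass 1: move each mapped row to the negative of its target id.
    conn.execute("""
        UPDATE items
        SET id = -(SELECT new_id FROM id_map WHERE old_id = items.id)
        WHERE id IN (SELECT old_id FROM id_map)
    """)
    # Pass 2: flip the parked rows back to positive ids.
    conn.execute("UPDATE items SET id = -id WHERE id < 0")

rows = conn.execute("SELECT id, label FROM items ORDER BY id").fetchall()
# rows == [(1, 'c'), (2, 'b'), (3, 'a')]
```

The number of statements stays at two however many rows are remapped, since the negative range acts as the temporary out-of-sequence space for all rows at once.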