How are unique IDs / sequence numbers generated in SAP B1? - sql

I'm wondering if anyone knows how SAP B1 (SAP Business One) generates the unique primary keys it uses in various tables. Examples of what I am talking about include OCRD.DocEntry and OCPR.CntctCode. These are integer columns that get "automatically" incremented.
Typical approaches for doing this include identity columns (e.g., SQL Server), sequences (e.g., Oracle), or manual sequence tables holding a Nextval which is programmatically incremented. As best I can tell, B1 is not using any of these techniques for these columns. So how is it handling them?
The particular instance I'm looking at is using an SQL Server database.
Yes, I'm well aware of the fact that there is no "need" for me to know about the inner workings, that I shouldn't be mucking around in the DB, etc. It's just bothering me that I don't know how they are doing it! If anyone can explain, I'd be grateful.

SAP B1 generates new unique numbers using the ONNM table. When a document is added, the following takes place:
SQL Transaction begins
The next number is queried from the ONNM table with an update lock
The ONNM table is updated with the new number (+1).
The document is added
The SQL transaction is committed.
Running an SQL SELECT statement with an update lock returns the current row while simultaneously locking that row until the end of the transaction. You are guaranteed that no other user can change that row between when you select it and when the transaction ends.
You can use SQL Profiler to watch the statements executed when you perform actions in SAP B1. Here is the line that gets the next number to use in a Quotation. Quotations are ObjectType 23.
SELECT T0.* FROM [dbo].[ONNM] T0 WITH (UPDLOCK) WHERE T0.[ObjectCode] = '23'
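As a rough illustration of the lock, increment, insert, commit pattern described above, here is a minimal Python sketch that uses SQLite in place of SQL Server. The tables `counters` and `documents` and the function `add_document` are hypothetical stand-ins, not the real ONNM schema, and `BEGIN IMMEDIATE` is only an approximation of `WITH (UPDLOCK)`, because SQLite locks the whole database rather than a single row.

```python
import sqlite3


def add_document(conn, object_code, payload, fail=False):
    """Reserve the next number and insert the document in one transaction."""
    # Take the write lock up front so no other writer can read the same number.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (next_no,) = conn.execute(
            "SELECT next_no FROM counters WHERE object_code = ?", (object_code,)
        ).fetchone()
        conn.execute(
            "UPDATE counters SET next_no = next_no + 1 WHERE object_code = ?",
            (object_code,),
        )
        if fail:
            raise RuntimeError("simulated failure before the insert")
        conn.execute(
            "INSERT INTO documents (doc_no, object_code, payload) VALUES (?, ?, ?)",
            (next_no, object_code, payload),
        )
        conn.execute("COMMIT")
        return next_no
    except Exception:
        # The counter update is rolled back together with the document.
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts the connection in autocommit mode,
# so the explicit BEGIN/COMMIT/ROLLBACK statements above are in control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE counters (object_code TEXT PRIMARY KEY, next_no INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE documents (doc_no INTEGER, object_code TEXT, payload TEXT, "
    "PRIMARY KEY (object_code, doc_no))"
)
conn.execute("INSERT INTO counters VALUES ('23', 1)")

first = add_document(conn, "23", "quotation A")
try:
    add_document(conn, "23", "quotation B", fail=True)
except RuntimeError:
    pass
second = add_document(conn, "23", "quotation C")
```

Because the counter row is updated inside the same transaction as the insert, the failed attempt leaves no gap. The price is that concurrent adds for the same object type are serialized.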

SAP B1 uses ONNM for sequence number generation.
It maintains an auto key for every object registered in it, and the sequence number is generated from that auto key.
For every add event, this auto key is incremented by 1.

Related

SEQUENCE number on every INSERT in MS SQL 2012

I am in a situation where multiple users insert values from an application into the database via a web service, using a stored procedure to validate and insert records.
The requirement is to create a unique number for each entry, strictly in sequence. I added an identity column, but it skipped some numbers in between, e.g. 25,26,27,29,34...
Our requirement is to strictly generate the next number only, as is done for an Invoice Number / Order Number / Receipt Number etc.: 1,2,3,4,5...
I checked the link below about sequence numbers, but I am not sure whether it will resolve my issue. Can someone please assist with this?
Sequence Numbers
If you absolutely, positively cannot have gaps, then you need to use a trigger and your own logic. This puts a lot of overhead into inserts, but it is the only guarantee.
Basically, the parts of a database that protect the data get in the way of doing what you want. If a transaction uses a sequence number (or identity) and it is later rolled back, then what happens to the generated number? Well, that is one way that gaps appear.
Of course, you will have to figure out what to do in that case. I would just go for an identity column and work on educating users that gaps are possible. After all, if you don't want gaps on output, then row_number() is available to re-assign numbers.
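To make the row_number() suggestion concrete, here is a small Python sketch using SQLite (which supports window functions from version 3.25 onward). The `receipts` table and its contents are made up for illustration; the stored ids keep their gaps, and only the displayed number is contiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE receipts (id INTEGER PRIMARY KEY, customer TEXT)")
# Identity-style ids with gaps, as in the question (25, 26, 27, 29, 34).
conn.executemany(
    "INSERT INTO receipts (id, customer) VALUES (?, ?)",
    [(25, "a"), (26, "b"), (27, "c"), (29, "d"), (34, "e")],
)

# Assign gap-free display numbers at query time instead of storing them.
rows = conn.execute(
    "SELECT ROW_NUMBER() OVER (ORDER BY id) AS display_no, id, customer "
    "FROM receipts ORDER BY id"
).fetchall()
```

Note that a display number computed this way shifts if an earlier row is deleted, so it is suitable for output only, not as a stable reference.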

How to find the last number of a sequence

I am using Spring Batch in my application. I have an upload process in which data is loaded into the DB from an Excel file. The Excel file has five spreadsheets, which are loaded into five different tables. If the upload is successful, there is no problem. But when the upload fails, say in the 3rd sheet, I roll back the entire upload. This causes the sequence in the first table to skip numbers. How can I solve this issue? My sequence is incremented by 1 and has NOCACHE.
Thanks in Advance
To solve this issue, you most probably have to change your expectations. Sequences usually don't roll back when they were used in a transaction. Example:
T1 insert into table A using sequence S to generate PK values
T2 insert into table B using sequence S to generate PK values
Now you roll back T1. What should happen to the PK values in T2? Should they be renumbered? What if someone (like a Java program) already read the PK values?
You can see that even for pretty simple cases, it's impossible to roll back sequences.
Some databases have the concept of identity columns where the DB server internally assigns keys but even for those, you will have gaps.
If you really need an uninterrupted flow of PKs, you will have to use your own sequence (or a counter in your Java code) and maintain/reset this one yourself.
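The effect can be simulated in a few lines of Python. This is only a toy model, not a real database: the `Sequence` class and `run_transaction` function are hypothetical, and "rollback" is modeled simply as discarding the pending rows. It shows why values handed out by a sequence stay consumed when the surrounding transaction is undone.

```python
import itertools


class Sequence:
    """A counter that, like a database sequence, never takes a value back."""

    def __init__(self):
        self._counter = itertools.count(1)

    def nextval(self):
        return next(self._counter)


seq = Sequence()
committed = []  # rows that survived their transaction


def run_transaction(rows, fail=False):
    # Each row draws its PK from the sequence as soon as it is "inserted".
    pending = [(seq.nextval(), row) for row in rows]
    if fail:
        # Rollback: the rows vanish, but the sequence values are already spent.
        return
    committed.extend(pending)


run_transaction(["a", "b"])             # uses 1 and 2
run_transaction(["c", "d"], fail=True)  # uses 3 and 4, then rolls back
run_transaction(["e"])                  # uses 5

ids = [pk for pk, _ in committed]
```

The committed keys end up as 1, 2, 5: the rolled-back upload leaves a gap exactly as described in the question.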

Postgresql wrong auto-increment for serial

I have a problem with PostgreSQL: either there is a bug in PostgreSQL, or I have implemented something wrongly.
There is a table including colmn1(primary key), colmn2(unique), colmn3, ...
After inserting a row, if I try another insertion with an existing colmn2 value, I get a duplicate value error, as expected. But after this unsuccessful attempt, colmn1's next value is incremented by 1 although nothing was inserted, so I am getting rows with id sequences like 1,2,4,6,9 (3,5,7,8 were used up by the unsuccessful attempts).
I need help from anyone who can explain this weird behaviour.
This information may be useful: I used "create unique index on tableName (lower(column1)) " query to set unique constraint.
See the PostgreSQL sequence FAQ:
Sequences are intended for generating unique identifiers — not
necessarily identifiers that are strictly sequential. If two
concurrent database clients both attempt to get a value from a
sequence (using nextval()), each client will get a different sequence
value. If one of those clients subsequently aborts their transaction,
the sequence value that was generated for that client will be unused,
creating a gap in the sequence.
This can't easily be fixed without incurring a significant performance
penalty. For more information, see Elein Mustain's "Gapless Sequences for Primary Keys" in the General Bits Newsletter.
From the manual:
Important: Because sequences are non-transactional, changes made by
setval are not undone if the transaction rolls back.
In other words, it's normal to have gaps. If you don't want gaps, don't use a sequence.

Sequence vs identity

SQL Server 2012 introduced Sequence as a new feature, the same as in Oracle and Postgres. Where are sequences preferred over identities? And why do we need sequences?
I think you will find your answer here
Using the identity attribute for a column, you can easily generate
auto-incrementing numbers (which as often used as a primary key). With
Sequence, it will be a different object which you can attach to a
table column while inserting. Unlike identity, the next number for the
column value will be retrieved from memory rather than from the disk –
this makes Sequence significantly faster than Identity. We will see
this in coming examples.
And here:
Sequences: Sequences have been requested by the SQL Server community
for years, and it's included in this release. Sequence is a user
defined object that generates a sequence of a number. Here is an
example using Sequence.
and here as well:
A SQL Server sequence object generates sequence of numbers just like
an identity column in sql tables. But the advantage of sequence
numbers is the sequence number object is not limited with single sql
table.
and on msdn you can also read more about usage and why we need it (here):
A sequence is a user-defined schema-bound object that generates a
sequence of numeric values according to the specification with which
the sequence was created. The sequence of numeric values is generated
in an ascending or descending order at a defined interval and may
cycle (repeat) as requested. Sequences, unlike identity columns, are
not associated with tables. An application refers to a sequence object
to receive its next value. The relationship between sequences and
tables is controlled by the application. User applications can
reference a sequence object and coordinate the values keys across
multiple rows and tables.
A sequence is created independently of the tables by using the CREATE
SEQUENCE statement. Options enable you to control the increment,
maximum and minimum values, starting point, automatic restarting
capability, and caching to improve performance. For information about
the options, see CREATE SEQUENCE.
Unlike identity column values, which are generated when rows are
inserted, an application can obtain the next sequence number before
inserting the row by calling the NEXT VALUE FOR function. The sequence
number is allocated when NEXT VALUE FOR is called even if the number
is never inserted into a table. The NEXT VALUE FOR function can be
used as the default value for a column in a table definition. Use
sp_sequence_get_range to get a range of multiple sequence numbers at
once.
A sequence can be defined as any integer data type. If the data type
is not specified, a sequence defaults to bigint.
Sequence and identity are both used to generate auto numbers, but the major difference is that an identity is tied to a table, whereas a sequence is independent of any table.
If you have a scenario where you need to maintain an auto number globally (across multiple tables), need to restart the numbering after a particular value, and need caching for performance, that is where you need a sequence and not an identity.
Although sequences provide more flexibility than identity columns, I didn't find they had any performance benefits.
I found performance using identity was consistently 3x faster than using sequence for batch inserts.
I inserted approx 1.5M rows and performance was:
14 seconds for identity
45 seconds for sequence
I inserted the rows into a table which used sequence object via a table default:
NEXT VALUE FOR <seq> FOR <col_name>
and also tried specifying sequence value in select statement:
SELECT NEXT VALUE FOR <seq>, <other columns> FROM <table>
Both were the same factor slower than the identity method. I used the default cache option for the sequence.
The article referenced in Arion's first link shows performance for row-by-row insert and difference between identity and sequence was 16.6 seconds to 14.3 seconds for 10,000 inserts.
The Caching option has a big impact on performance, but identity is faster for higher volumes (+1M rows)
See this link for an in-depth analysis, as per utly4life's comment.
I know this is a little old, but wanted to add an observation that bit me.
I switched from identity to sequence to have my indexes in order. I later found out that sequences don't transfer with replication. I started getting key violations after I set up replication between two databases, since the sequences were not in sync. Just something to watch out for before you make a decision.
I find the best use of Sequences is not to replace an identity column but to create an "Order Number" type of field.
In other words, an Order Number is exposed to the end user and may have business rules along with it. You want it to be unique, but just using an Identity Column isn't really correct either.
For example, different order types might require a different sequence, so you might have a sequence for Internet Order, as opposed to In-house orders.
In other words, don't think of a Sequence as simply a replacement for identity; think of it as being useful in cases where an identity does not fit the business requirements.
I was recently bitten by something worth considering when choosing between identity and sequence. Microsoft's documentation now suggests using a sequence with NOCACHE if you want to limit gaps. We had huge gaps in our identity values, and the statement quoted below explains why: SQL Server cached the identity values, and after a reboot we lost the cached numbers.
https://learn.microsoft.com/en-us/sql/t-sql/statements/create-table-transact-sql-identity-property?view=sql-server-2017
Consecutive values after server restart or other failures – SQL Server might cache identity values for performance reasons and some of the assigned values can be lost during a database failure or server restart. This can result in gaps in the identity value upon insert. If gaps are not acceptable then the application should use its own mechanism to generate key values. Using a sequence generator with the NOCACHE option can limit the gaps to transactions that are never committed.

Some sort of “different auto-increment indexes” per a primary key values

I have got a table which has an id (primary key with auto increment), a uid (a key referring to a user's id, for example) and something else which won’t matter for my question.
I want to make, let's call it, different auto-increment keys on id for each uid entry.
So, I will add an entry with uid 10, and the id field for this entry will be 1 because there were no previous entries with a value of 10 in uid. I will add a new one with uid 4 and its id will be 3 because there were already two entries with uid 4.
...A very obvious explanation, but I am trying to be as explanatory and clear as I can to demonstrate the idea... clearly.
What SQL engine can provide such a functionality natively? (non Microsoft/Oracle based)
If there is none, how could I best replicate it? Triggers perhaps?
Does this functionality have a more suitable name?
In case you know about a non-SQL database engine providing such functionality, name it anyway; I am curious.
Thanks.
MySQL's MyISAM engine can do this. See their manual, in section Using AUTO_INCREMENT:
For MyISAM tables you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.
The docs go on after that paragraph, showing an example.
The InnoDB engine in MySQL does not support this feature, which is unfortunate because it's better to use InnoDB in almost all cases.
You can't emulate this behavior using triggers (or any SQL statements limited to transaction scope) without locking tables on INSERT. Consider this sequence of actions:
Mario starts transaction and inserts a new row for user 4.
Bill starts transaction and inserts a new row for user 4.
Mario's session fires a trigger to compute MAX(id)+1 for user 4. It gets 3.
Bill's session fires a trigger to compute MAX(id)+1 for user 4. It also gets 3, because Mario's uncommitted row is not visible to it.
Bill's session finishes his INSERT and commits.
Mario's session tries to finish his INSERT, but the row with (userid=4, id=3) now exists, so Mario gets a primary key conflict.
In general, you can't control the order of execution of these steps without some kind of synchronization.
The solutions to this are either:
Get an exclusive table lock. Before trying an INSERT, lock the table. This is necessary to prevent concurrent INSERTs from creating a race condition like the one in the example above. It's necessary to lock the whole table because you're trying to restrict INSERT, so there's no specific row to lock (if you were trying to govern access to a given row with UPDATE, you could lock just that row). But locking the table makes access to the table serial, which limits your throughput.
Do it outside transaction scope. Generate the id number in a way that won't be hidden from two concurrent transactions. By the way, this is what AUTO_INCREMENT does. Two concurrent sessions will each get a unique id value, regardless of their order of execution or order of commit. But tracking the last generated id per userid requires access to the database, or a duplicate data store. For example, a memcached key per userid, which can be incremented atomically.
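The second option can be sketched in Python with an in-process counter standing in for the external store. This is only an illustration under that assumption: `PerKeyCounter` is a hypothetical class, and a real deployment would use something shared between application servers, such as the atomic increment of memcached mentioned above.

```python
import threading
from collections import defaultdict


class PerKeyCounter:
    """Hands out the next id per key atomically, outside any DB transaction."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = defaultdict(int)

    def next_id(self, key):
        # The lock makes read-increment-return a single atomic step.
        with self._lock:
            self._last[key] += 1
            return self._last[key]


counter = PerKeyCounter()
results = []
results_lock = threading.Lock()


def worker(uid, n):
    for _ in range(n):
        value = counter.next_id(uid)
        with results_lock:
            results.append((uid, value))


# Eight concurrent sessions all inserting rows for user 4.
threads = [threading.Thread(target=worker, args=(4, 100)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

ids_for_user_4 = sorted(value for uid, value in results if uid == 4)
```

Every session receives a distinct value regardless of scheduling. As with AUTO_INCREMENT, a value handed out to a transaction that later rolls back is not reused, so gaps remain possible.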
It's relatively easy to ensure that inserts get unique values. But it's hard to ensure they will get consecutive ordinal values. Also consider:
What happens if you INSERT in a transaction but then roll back? You've allocated id value 3 in that transaction, and then I allocated value 4, so if you roll back and I commit, now there's a gap.
What happens if an INSERT fails because of other constraints on the table (e.g. another column is NOT NULL)? You could get gaps this way too.
If you ever DELETE a row, do you need to renumber all the following rows for the same userid? What does that do to your memcached entries if you use that solution?
SQL Server should allow you to do this. If you can't implement this using a computed column (probably not - there are some restrictions), surely you can implement it in a trigger.
MySQL also would allow you to implement this via triggers.
In a comment you ask the question about efficiency. Unless you are dealing with extreme volumes, storing an 8 byte DATETIME isn't much of an overhead compared to using, for example, a 4 byte INT.
It also massively simplifies your data inserts, as well as being able to cope with records being deleted without creating 'holes' in your sequence.
If you DO need this, be careful with the field names. If you have uid and id in a table, I'd expect id to be unique in that table, and uid to refer to something else. Perhaps, instead, use the field names property_id and amendment_id.
In terms of implementation, there are generally two options.
1). A trigger
Implementations vary, but the logic remains the same. As you don't specify an RDBMS (other than NOT MS/Oracle) the general logic is simple...
Start a transaction (often this is implicitly already started inside triggers)
Find the MAX(amendment_id) for the property_id being inserted
Update the newly inserted value with MAX(amendment_id) + 1
Commit the transaction
Things to be aware of are...
- multiple records being inserted at the same time
- records being inserted with amendment_id being already populated
- updates altering existing records
2). A Stored Procedure
If you use a stored procedure to control writes to the table, you gain a lot more control.
Implicitly, you know you're only dealing with one record.
You simply don't provide a parameter for DEFAULT fields.
You know what updates / deletes can and can't happen.
You can implement all the business logic you like without hidden triggers
I personally recommend the Stored Procedure route, but triggers do work.
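A minimal sketch of the stored-procedure approach, written as a Python function against SQLite since no RDBMS was specified. The `amendments` table and the `add_amendment` function are hypothetical names; `BEGIN IMMEDIATE` serializes writers so that two callers cannot compute the same MAX(amendment_id) + 1.

```python
import sqlite3


def add_amendment(conn, property_id, note):
    """Insert one row, numbering it per property_id, in a single transaction."""
    conn.execute("BEGIN IMMEDIATE")  # block other writers until we commit
    try:
        (next_id,) = conn.execute(
            "SELECT COALESCE(MAX(amendment_id), 0) + 1 "
            "FROM amendments WHERE property_id = ?",
            (property_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO amendments (property_id, amendment_id, note) "
            "VALUES (?, ?, ?)",
            (property_id, next_id, note),
        )
        conn.execute("COMMIT")
        return next_id
    except Exception:
        conn.execute("ROLLBACK")
        raise


# Autocommit mode, so the explicit transaction statements above take effect.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE amendments ("
    "property_id INTEGER NOT NULL, amendment_id INTEGER NOT NULL, note TEXT, "
    "PRIMARY KEY (property_id, amendment_id))"
)

assigned = [add_amendment(conn, pid, "note") for pid in (4, 4, 10, 4)]
```

The composite primary key acts as a safety net: if the locking were ever bypassed, a duplicate number would fail with a constraint error instead of being stored silently.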
It is important to get your data types right.
What you are describing is a multi-part key. So use a multi-part key. Don't try to encode everything into a magic integer, you will poison the rest of your code.
If a record is identified by (entity_id,version_number) then embrace that description and use it directly instead of mangling the meaning of your keys. You will have to write queries which constrain the version number but that's OK. Databases are good at this sort of thing.
version_number could be a timestamp, as a_horse_with_no_name suggests. This is quite a good idea. There is no meaningful performance disadvantage to using timestamps instead of plain integers. What you gain is meaning, which is more important.
You could maintain a "latest version" table which contains, for each entity_id, only the record with the most-recent version_number. This will be more work for you, so only do it if you really need the performance.
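For illustration, here is a small Python and SQLite sketch of the multi-part key and of a query constrained to the most recent version. The `entity_versions` table and its rows are invented for the example, and integers are used for version_number only to keep the output easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entity_versions ("
    "entity_id INTEGER NOT NULL, version_number INTEGER NOT NULL, body TEXT, "
    "PRIMARY KEY (entity_id, version_number))"
)
conn.executemany(
    "INSERT INTO entity_versions VALUES (?, ?, ?)",
    [(1, 1, "draft"), (1, 2, "final"), (2, 1, "only")],
)

# For each entity, keep only the row with the highest version_number.
latest = conn.execute(
    "SELECT v.entity_id, v.version_number, v.body "
    "FROM entity_versions v "
    "WHERE v.version_number = ("
    "  SELECT MAX(version_number) FROM entity_versions "
    "  WHERE entity_id = v.entity_id) "
    "ORDER BY v.entity_id"
).fetchall()
```

The primary key index on (entity_id, version_number) lets the database answer the MAX lookup cheaply, which is why a separate "latest version" table is often unnecessary.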