What caused SQL Server's Sequence number to skip?

I am using a SQL Server sequence to generate running document numbers for my "doc_no" column. A quick check revealed that the gaps in my table's IDENTITY column are caused by server restarts, and that one way to prevent them is to disable the identity cache (on my local machine). What about sequences? Do they behave the same way? And is there any way to see the current value of the sequence itself?
CREATE SEQUENCE [dbo].[PB_SEQ_DOCNO]
AS [int]
START WITH 1
INCREMENT BY 1
MINVALUE -2147483648
MAXVALUE 2147483647
CACHE
GO

The CACHE keyword without an explicit cache size lets the Database Engine pick the size. That default is not documented and may change, so don't rely on a particular number (for comparison, the IDENTITY cache is commonly reported as 1000 values for int columns). SQL Server caches that block of upcoming sequence numbers in the background. If your server crashes or restarts unexpectedly, the cached numbers that were not yet handed out are "lost" (as in your case here, obviously).
You can use a smaller cache size (e.g. CACHE 50) or turn caching off altogether (NO CACHE). The benefit is fewer or no gaps from restarts; the drawback is slower performance when handing out sequence numbers.
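On the question of seeing the current value: here is a minimal T-SQL sketch, assuming the sequence from the question and SQL Server 2012 or later. The sys.sequences catalog view exposes the current value and the cache settings, and ALTER SEQUENCE changes the cache on an existing sequence.

```sql
-- Inspect the sequence. current_value is the last value handed out
-- (or the START WITH value if the sequence has never been used).
SELECT name, current_value, is_cached, cache_size
FROM sys.sequences
WHERE name = N'PB_SEQ_DOCNO'
  AND schema_id = SCHEMA_ID(N'dbo');

-- Trade some speed for smaller gaps after a restart ...
ALTER SEQUENCE dbo.PB_SEQ_DOCNO CACHE 50;

-- ... or disable caching entirely.
ALTER SEQUENCE dbo.PB_SEQ_DOCNO NO CACHE;
```

Even with NO CACHE, a value consumed by a transaction that later rolls back is not handed out again, so gaps remain possible.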

Spring Batch Meta Data Schema Sequences

This is more of a request for a quick explanation of the sequences used to generate IDs for the Spring Batch tables that store Job and Step information.
I ran the statements below in DB2 for a Spring Boot + Batch application:
CREATE SEQUENCE AR_REPORT_ETL_STEP_EXECUTION_SEQ AS BIGINT MAXVALUE 9223372036854775807 NO CYCLE;
CREATE SEQUENCE AR_REPORT_ETL_JOB_EXECUTION_SEQ AS BIGINT MAXVALUE 9223372036854775807 NO CYCLE;
CREATE SEQUENCE AR_REPORT_ETL_JOB_SEQ AS BIGINT MAXVALUE 9223372036854775807 NO CYCLE;
When the Spring Batch job is running, each ID field is being incremented by 20 on each new record. Though this isn't a major issue, it's still slightly confusing as to why.
I then removed the sequences and added them again with INCREMENT BY 1. Now every second record is incremented by 1 and the others by 20.
Any tips or explanation would be a great learning opportunity.
For performance reasons, Db2 for Linux, UNIX and Windows preallocates 20 values of a sequence by default and keeps them in memory for faster access.
If you don't want that caching behaviour and can tolerate the overhead of allocating without caching, you can use the NO CACHE option when defining the sequence. But be aware that without caching, Db2 must perform a synchronous transaction log write for each number allocated from the sequence, which is usually undesirable in high-frequency insert situations.
Remember to explicitly activate the database (i.e. do not depend on auto-activation), because unused preallocated sequence numbers are discarded when the database deactivates.
Example no cache syntax:
CREATE SEQUENCE AR_REPORT_ETL_STEP_EXECUTION_SEQ
AS BIGINT MAXVALUE 9223372036854775807 NO CACHE NO CYCLE;
You can read more details in the documentation.
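To check what the sequences currently use, and to change an existing sequence without dropping it, something like the following should work on Db2 for Linux, UNIX and Windows. This is a sketch that assumes the sequence names from the question; the SYSCAT.SEQUENCES catalog view reports the cache setting.

```sql
-- CACHE is the number of values Db2 preallocates in memory;
-- 0 means the sequence does not cache.
SELECT SEQNAME, INCREMENT, CACHE
FROM SYSCAT.SEQUENCES
WHERE SEQNAME IN ('AR_REPORT_ETL_STEP_EXECUTION_SEQ',
                  'AR_REPORT_ETL_JOB_EXECUTION_SEQ',
                  'AR_REPORT_ETL_JOB_SEQ');

-- Switch an existing sequence to no caching in place.
ALTER SEQUENCE AR_REPORT_ETL_JOB_SEQ NO CACHE;
```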

How to increment sequence only by 1?

I run all of the following from Java through the JDBC driver.
Here I define a table:
create table HistoryCCP(
ID NUMBER(6) NOT NULL,
SCRIPT VARCHAR2(1000) NOT NULL
)
Here I define a sequence:
CREATE SEQUENCE SYSTEM.HistoryId
MINVALUE 1
MAXVALUE 1000000
INCREMENT BY 1
START WITH 1
NOORDER
NOCYCLE
Now I insert to table by using this here:
insert into HistoryCCP
values (SYSTEM.HistoryId.nextval ,'HELLOOOO ')
Whenever I close the program, run it again and try to insert, the ID jumps by ten!
And when I defined sequence like this:
CREATE SEQUENCE SYSTEM.HistoryId
MINVALUE 1
MAXVALUE 1000000
INCREMENT BY 1
START WITH 1
CACHE 100 -- added cache parameter
NOORDER
NOCYCLE
It jumps by 100!
Do you know why it behaves like this, and how I can make it increment by 1?
Never rely on sequences for gap free numbering.
The cache value is the number of sequence values that the database server holds in memory to avoid having to keep updating its internal SEQ$ table with the most recently used value. If you reduce the cache value, you increase the rate at which the SEQ$ table has to be modified, which slows the system.
Cached values can be aged out, and are lost on system restart, and values are not reused if a transaction gets rolled back.
The presence of gaps should not be a problem for you -- if it is then you'll need to use something other than a sequence to generate the numbers, and doing so will serialise inserts to that table.
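If gap-free numbers really are a requirement, the "something other than a sequence" could be a counter table. The sketch below is one possible approach, not a recommendation; the table doc_counter and its columns are hypothetical names introduced for illustration.

```sql
-- Hypothetical counter table: one row per counter.
CREATE TABLE doc_counter (
  counter_name VARCHAR2(30) PRIMARY KEY,
  last_value   NUMBER(6)    NOT NULL
);

INSERT INTO doc_counter (counter_name, last_value) VALUES ('HISTORYCCP', 0);
COMMIT;

-- For every new row, run both statements in ONE transaction.
-- The UPDATE locks the counter row, so concurrent inserts wait here.
UPDATE doc_counter
SET    last_value = last_value + 1
WHERE  counter_name = 'HISTORYCCP';

INSERT INTO HistoryCCP (ID, SCRIPT)
SELECT last_value, 'HELLOOOO '
FROM   doc_counter
WHERE  counter_name = 'HISTORYCCP';

-- A ROLLBACK here would also undo the counter increment, leaving no gap.
COMMIT;
```

The row lock taken by the UPDATE is what serialises inserts: only one transaction at a time can hold the next number until it commits or rolls back.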
Try the NOCACHE option for the sequence.
http://docs.oracle.com/cd/B28359_01/server.111/b28310/views002.htm
NOCACHE would work, but it would also be a bad idea for many reasons, and complete nonsense if you plan to run your application on Oracle RAC.
Oracle sequences are meant for (internal) unique IDs, not for strictly consecutive numbers imposed by requirements. For example, using a sequence to generate the classic "protocol number" is a common flaw in financial accounting software: it looks easy at the beginning, but it kills you as the project grows.

Oracle Sequences, altering and viewing

I want to update the cache size of an existing sequence, and I want to describe a sequence in Oracle the way I can describe a table. How do I do that?
Also, what are the drawbacks of increasing the cache value of a sequence?
Alter sequence seq_name cache 20;
See the docs.
To get the DDL you can use the dbms_metadata package, which works for any object:
select dbms_metadata.get_ddl('SEQUENCE','SEQ_NAME') from dual;
Increasing the cache size is useful when you fetch large numbers of values from the sequence. It has no drawback as long as the cached values actually get used.
But if you generate 1 million values at a time and use only 10, it may not be a good idea, because the other 999,990 values are lost. The next session will generate another 1,000,000 values.
I think the engine has to do work to generate them and allocate values for your session.
In my opinion, a cache about ten times smaller than what you normally use in a session is fine.
UPDATE: Adding David Aldridge's comment:
The usefulness of a large cache is really related to the rate at
which it is used in general, so not just for large selects but for
systems with many sessions all using one value at a time. As
background, the performance problem with a small cache is caused by
the need for the SEQ$ system table to be modified when the cache is
exhausted. It's a small operation but obviously you don't want to be
doing it 100 times a second.
So, by increasing the cache, you'll have fewer concurrent sessions contending for the same resource.
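For the "describe it like a table" part of the question, the data dictionary view USER_SEQUENCES is probably the closest equivalent. A small sketch, assuming the sequence is owned by the current user and is named SEQ_NAME as in the answer above:

```sql
-- One row per sequence owned by the current user.
-- Use ALL_SEQUENCES (with sequence_owner) for other schemas.
SELECT sequence_name, min_value, max_value, increment_by,
       cycle_flag, order_flag, cache_size, last_number
FROM   user_sequences
WHERE  sequence_name = 'SEQ_NAME';
```

Note that for a cached sequence, last_number is the last value written to disk, so it is usually higher than the last value actually handed out.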

Sequence vs identity

SQL Server 2012 introduced Sequence as a new feature, as in Oracle and Postgres. When are sequences preferred over identity columns? And why do we need sequences?
I think you will find your answer here
Using the identity attribute for a column, you can easily generate
auto-incrementing numbers (which are often used as a primary key). With
Sequence, it will be a different object which you can attach to a
table column while inserting. Unlike identity, the next number for the
column value will be retrieved from memory rather than from the disk –
this makes Sequence significantly faster than Identity. We will see
this in coming examples.
And here:
Sequences: Sequences have been requested by the SQL Server community
for years, and it's included in this release. Sequence is a user
defined object that generates a sequence of a number. Here is an
example using Sequence.
and here as well:
A SQL Server sequence object generates sequence of numbers just like
an identity column in sql tables. But the advantage of sequence
numbers is the sequence number object is not limited with single sql
table.
and on msdn you can also read more about usage and why we need it (here):
A sequence is a user-defined schema-bound object that generates a
sequence of numeric values according to the specification with which
the sequence was created. The sequence of numeric values is generated
in an ascending or descending order at a defined interval and may
cycle (repeat) as requested. Sequences, unlike identity columns, are
not associated with tables. An application refers to a sequence object
to receive its next value. The relationship between sequences and
tables is controlled by the application. User applications can
reference a sequence object and coordinate the values across
multiple rows and tables.
A sequence is created independently of the tables by using the CREATE
SEQUENCE statement. Options enable you to control the increment,
maximum and minimum values, starting point, automatic restarting
capability, and caching to improve performance. For information about
the options, see CREATE SEQUENCE.
Unlike identity column values, which are generated when rows are
inserted, an application can obtain the next sequence number before
inserting the row by calling the NEXT VALUE FOR function. The sequence
number is allocated when NEXT VALUE FOR is called even if the number
is never inserted into a table. The NEXT VALUE FOR function can be
used as the default value for a column in a table definition. Use
sp_sequence_get_range to get a range of multiple sequence numbers at
once.
A sequence can be defined as any integer data type. If the data type
is not specified, a sequence defaults to bigint.
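The two access paths mentioned in the passage above can be sketched in T-SQL as follows. The sequence name dbo.DemoSeq is a hypothetical example; the procedure and its parameter names are those of sys.sp_sequence_get_range.

```sql
-- Hypothetical sequence used only for this illustration.
CREATE SEQUENCE dbo.DemoSeq AS bigint START WITH 1 INCREMENT BY 1;
GO

-- Obtain a single value before any row is inserted.
DECLARE @id bigint = NEXT VALUE FOR dbo.DemoSeq;
SELECT @id AS next_id;

-- Reserve a block of 100 values in one call.
-- The output parameters are declared as sql_variant.
DECLARE @first sql_variant, @last sql_variant;
EXEC sys.sp_sequence_get_range
     @sequence_name     = N'dbo.DemoSeq',
     @range_size        = 100,
     @range_first_value = @first OUTPUT,
     @range_last_value  = @last  OUTPUT;

SELECT CONVERT(bigint, @first) AS first_value,
       CONVERT(bigint, @last)  AS last_value;
```

Values reserved this way count as allocated whether or not the application ever uses them.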
Sequence and identity are both used to generate automatic numbers, but the major difference is that an identity is tied to a table, while a sequence is independent of any table.
If you need to maintain an automatic number globally (across multiple tables), restart the numbering after a particular value, and cache values for performance, that is where you need a sequence rather than an identity.
Although sequences provide more flexibility than identity columns, I didn't find they had any performance benefits.
I found performance using identity was consistently 3x faster than using sequence for batch inserts.
I inserted approx 1.5M rows and performance was:
14 seconds for identity
45 seconds for sequence
I inserted the rows into a table that took its values from the sequence object via a column default:
DEFAULT (NEXT VALUE FOR <seq>) FOR <col_name>
and I also tried specifying the sequence value in the SELECT statement:
SELECT NEXT VALUE FOR <seq>, <other columns> FROM <table>
Both were the same factor slower than the identity method. I used the default cache option for the sequence.
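For readers who want to repeat the comparison, the two variants described above might look like this in concrete form. All object names are hypothetical, sys.objects is used only as a convenient row source, and the explicit CACHE 1000 is an illustrative setting rather than the default used in the test above.

```sql
-- Hypothetical sequence and target table.
CREATE SEQUENCE dbo.DemoRowSeq AS bigint
    START WITH 1 INCREMENT BY 1 CACHE 1000;
GO

CREATE TABLE dbo.DemoTarget (
    Id      bigint  NOT NULL
            CONSTRAINT DF_DemoTarget_Id DEFAULT (NEXT VALUE FOR dbo.DemoRowSeq),
    Payload sysname NOT NULL
);
GO

-- Variant 1: the column default supplies the sequence value.
INSERT INTO dbo.DemoTarget (Payload)
SELECT name FROM sys.objects;

-- Variant 2: the sequence value is requested explicitly in the SELECT.
INSERT INTO dbo.DemoTarget (Id, Payload)
SELECT NEXT VALUE FOR dbo.DemoRowSeq, name
FROM sys.objects;
```

Timing both variants against an IDENTITY column with different CACHE sizes would show how much of the difference comes from the cache setting.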
The article referenced in Arion's first link shows performance for row-by-row insert and difference between identity and sequence was 16.6 seconds to 14.3 seconds for 10,000 inserts.
The caching option has a big impact on performance, but identity is faster for higher volumes (1M+ rows).
See this link for an in-depth analysis as per utly4life's comment.
I know this is a little old, but wanted to add an observation that bit me.
I switched from identity to sequence to have my indexes in order. I later found out that sequences don't transfer with replication. I started getting key violations after I set up replication between two databases, because the sequences were not in sync. Just something to watch out for before you make a decision.
I find the best use of sequences is not to replace an identity column but to create an "Order Number" type of field.
In other words, an Order Number is exposed to the end user and may have business rules along with it. You want it to be unique, but just using an Identity Column isn't really correct either.
For example, different order types might require a different sequence, so you might have a sequence for Internet Order, as opposed to In-house orders.
In other words, don't think of a sequence as simply a replacement for identity; think of it as being useful in cases where an identity does not fit the business requirements.
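A rough T-SQL sketch of that idea, with one sequence per order type feeding a user-facing order number while an identity column remains the internal key. The object names and the 'WEB-' prefix are made up for illustration.

```sql
-- Hypothetical: one sequence per order type.
CREATE SEQUENCE dbo.InternetOrderSeq AS int START WITH 1 INCREMENT BY 1;
CREATE SEQUENCE dbo.InHouseOrderSeq  AS int START WITH 1 INCREMENT BY 1;
GO

CREATE TABLE dbo.Orders (
    OrderId     int IDENTITY(1,1) PRIMARY KEY,   -- internal key
    OrderNumber varchar(20) NOT NULL UNIQUE      -- shown to the end user
);
GO

-- Take the next Internet order value, then format it per business rule.
DECLARE @n int = NEXT VALUE FOR dbo.InternetOrderSeq;

INSERT INTO dbo.Orders (OrderNumber)
VALUES ('WEB-' + RIGHT('000000' + CAST(@n AS varchar(10)), 6));
```

An in-house order would draw from dbo.InHouseOrderSeq in the same way, with its own prefix.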
I was recently bitten by something worth considering for identity vs. sequence. Microsoft now seems to suggest a sequence if you want to avoid gaps. We had huge gaps in an identity column, and the statement quoted below explains why: SQL Server cached the identity values, and after a reboot we lost those numbers.
https://learn.microsoft.com/en-us/sql/t-sql/statements/create-table-transact-sql-identity-property?view=sql-server-2017
Consecutive values after server restart or other failures – SQL Server might cache identity values for performance reasons and some of the assigned values can be lost during a database failure or server restart. This can result in gaps in the identity value upon insert. If gaps are not acceptable then the application should use its own mechanism to generate key values. Using a sequence generator with the NOCACHE option can limit the gaps to transactions that are never committed.

SQL Auto-Increment in Oracle APEX occasionally skips a chunk of numbers when incrementing?

I have created a table in APEX that has a PK that is incremented by a SQL sequence:
CREATE SEQUENCE seq_increment
MINVALUE 1
START WITH 880
INCREMENT BY 1
CACHE 10
This seems to work perfectly. The issue is that sometimes, usually when I get on in the morning and run a process to enter a new row, it skips a bunch of numbers. I only care because these numbers are being used as the ID# of documents in my company and losing/skipping blocks of numbers is not going to be acceptable when this tool goes live.
It does seem to jump to the next multiple of 10, i.e. yesterday my last test assigned 883 and this morning it assigned 890 as the next number. Looking at my code for creating the sequence, I notice that I set it up to cache 10 values so that it will process quicker. Is it possible that this cache is getting dumped overnight, and that it is pulling 890 because it had 880-889 in cache when it was dumped?
Are there other potential causes and solutions?
Sequences do not and cannot generate gap-free values, so you should expect numbers to be skipped occasionally. That's perfectly normal when you're using sequences.
As you've surmised, the most likely scenario is that the sequence cache is aging out of the shared pool overnight when the APEX application isn't being used. You can reduce the frequency of gaps by declaring your sequence NOCACHE, but that will decrease performance, and it will not eliminate gaps; it will just make them less frequent.
Oracle sequences are never guaranteed to be contiguous. If you need an absolutely contiguous set of values, you'll need to implement a custom solution.
Odds are that CACHE 10 is why you're losing numbers in this case. The cache value is how many sequence values are stored in memory for future use. Rebooting will clear the cache and cause 10 new values to be retrieved. Similarly, if the sequence is not used for long enough, the current set of values may be flushed out of the shared pool, also causing a new set of values to be retrieved.
This is clearly not the case in your instance, but sequence numbers can also be lost due to rollbacks. A rolled back transaction involving one or more sequences discards the sequence value(s).
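The rollback effect is easy to reproduce. The sketch below assumes the seq_increment sequence from the question and a hypothetical scratch table demo_docs.

```sql
-- Hypothetical scratch table (DDL commits implicitly in Oracle).
CREATE TABLE demo_docs (doc_id NUMBER PRIMARY KEY);

-- This insert consumes a sequence value, then the row is rolled back.
INSERT INTO demo_docs (doc_id) VALUES (seq_increment.NEXTVAL);
ROLLBACK;

-- The rolled-back value is not handed out again: this row gets the next one.
INSERT INTO demo_docs (doc_id) VALUES (seq_increment.NEXTVAL);
COMMIT;

SELECT doc_id FROM demo_docs;
```

The table ends up with a single row whose ID is one higher than the value consumed by the rolled-back insert.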
Some sequence numbers have been aged out of one of the in-memory structures (shared pool I think?). This is expected behaviour for sequences. The only guarantee that you have is that they are unique. If you need to present gap-free sequences you'll have to do this at reporting time using e.g. rownum pseudo-column. It is made this way deliberately otherwise you would have to serialise all inserts i.e. lock table. And even that wouldn't work properly if an insert was rolled back!
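Numbering at reporting time, as suggested above, could look like the following. The table documents and its column doc_id are hypothetical stand-ins for the APEX table; ROW_NUMBER() is used here as an analytic alternative to the rownum pseudo-column.

```sql
-- Gap-free display numbers computed at query time;
-- the stored doc_id keeps whatever value the sequence produced.
SELECT ROW_NUMBER() OVER (ORDER BY doc_id) AS display_no,
       doc_id
FROM   documents
ORDER  BY doc_id;
```

Display numbers computed this way shift if an earlier row is deleted, so they suit reports rather than permanent document IDs.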