How would I define a column in PostgreSQL such that each value must be in a sequence, not the sequence you get when using type serial, but one such that a value 2 cannot be inserted unless a value 1 already exists in the column?
I wrote a detailed example of a gapless sequence implementation using PL/PgSQL here.
The general idea is that you want a table to store the sequence values, and you use SELECT ... FOR UPDATE followed by UPDATE - or the shorthand UPDATE ... RETURNING - to get values from it while locking the row until your transaction commits or rolls back.
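A minimal sketch of that pattern, using SQLite for illustration (the table and column names, invoice_counter and next_value, are made up; in PostgreSQL the SELECT would be SELECT ... FOR UPDATE, and in real use the counter update and the insert that consumes the value would share one transaction):

```python
import sqlite3

# Counter-table pattern: claim a value and advance the counter while
# holding a write lock, so concurrent callers serialize on the row.
conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
conn.execute("CREATE TABLE invoice_counter (next_value INTEGER NOT NULL)")
conn.execute("INSERT INTO invoice_counter (next_value) VALUES (1)")

def next_gapless_value(conn):
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front, like FOR UPDATE
    value = conn.execute("SELECT next_value FROM invoice_counter").fetchone()[0]
    conn.execute("UPDATE invoice_counter SET next_value = next_value + 1")
    conn.execute("COMMIT")  # a ROLLBACK here would return the value, leaving no gap
    return value

first = next_gapless_value(conn)   # 1
second = next_gapless_value(conn)  # 2
```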
Theoretically, you could use a constraint that worked like this. (But it won't work in practice.)
Count the rows.
Evaluate max(column) - min(column) + 1.
Compare the results.
You'd probably have to insert one row before creating the CHECK constraint. If you didn't, max(column) would return NULL. With one row,
Count the rows (1).
Evaluate max(column) - min(column) + 1. (1 - 1 + 1 = 1)
Compare the results. (1 = 1)
With 10 rows:
Count the rows (10).
Evaluate max(column) - min(column) + 1. (10 - 1 + 1 = 10)
Compare the results. (10 = 10)
It doesn't matter whether the sequence starts at 1; this way of checking will always show a gap if one exists. If you needed to guarantee that the gapless sequence started at 1, you could add that to the CHECK constraint.
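The check described above can be sketched as a single query (shown here against SQLite; the table and column names are made up for the example):

```python
import sqlite3

# Gap check: a set of integers is gapless exactly when
# COUNT(*) equals MAX(n) - MIN(n) + 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER NOT NULL)")
conn.executemany("INSERT INTO t (n) VALUES (?)", [(1,), (2,), (3,), (5,)])

def is_gapless(conn):
    row = conn.execute("SELECT COUNT(*) = MAX(n) - MIN(n) + 1 FROM t").fetchone()
    return bool(row[0])

gappy = is_gapless(conn)       # 4 rows, but max - min + 1 = 5 -> False
conn.execute("INSERT INTO t (n) VALUES (4)")
gapless = is_gapless(conn)     # 5 rows, max - min + 1 = 5 -> True
```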
As far as I know, there isn't any way to do this declaratively in any current DBMS; you'd need support for CREATE ASSERTION. (But I could be wrong.) In PostgreSQL, I think your only shot at this involves procedural code in multiple AFTER triggers.
I only have one table that needs to be gapless. It's a calendar table. We run a query once a night that does these calculations, and it lets me know whether I have a gap.
You could write an ON INSERT trigger or a CHECK constraint. However, this would still allow "1" to be deleted afterwards while "2" stays in the table, so you'll probably have to address that too.
Related
I was wondering if there is any way to check whether a column contains all values in a range. Example: I have an INTEGER column with the values
0
1
2
3
5
6
I want to check whether all values between 0 and 6 are present (false in this example).
I think a solution might be: MAX(Column) - MIN(Column) + 1 has to be equal to COUNT(Column), but I'm not sure how to write it as a CONSTRAINT.
To enforce this you would need a TRIGGER. A trigger can run before or after you insert or delete rows and can run for every row or for each statement. If you are only inserting a single row at a time then both are equivalent. If you might insert multiple rows you probably want a trigger that runs AFTER the STATEMENT is executed.
The trigger should run the check you described and should raise an exception if the condition is not met. There is an example of constructing a trigger in the manual (links below).
You may find it helpful to write two functions - one which checks the table returns true/false depending on if it is acceptable, and then a trigger function that calls that. This makes it easier to test your condition logic.
https://www.postgresql.org/docs/current/plpgsql-trigger.html
https://www.postgresql.org/docs/current/sql-createtrigger.html
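As a rough illustration of the trigger approach (written in SQLite syntax because its trigger body is compact; a PostgreSQL version would be a PL/pgSQL trigger function as described in the links above, and the table and trigger names here are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER NOT NULL)")
# After each insert, abort if the table no longer satisfies
# COUNT(*) = MAX(n) - MIN(n) + 1. Aborting rolls the insert back.
conn.execute("""
    CREATE TRIGGER no_gaps AFTER INSERT ON t
    BEGIN
        SELECT RAISE(ABORT, 'inserting this value would leave a gap')
        WHERE (SELECT COUNT(*) <> MAX(n) - MIN(n) + 1 FROM t);
    END
""")

conn.execute("INSERT INTO t (n) VALUES (0)")
conn.execute("INSERT INTO t (n) VALUES (1)")
try:
    conn.execute("INSERT INTO t (n) VALUES (3)")  # 2 is missing
    rejected = False
except sqlite3.DatabaseError:
    rejected = True
```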
You can use:
select * from table
where (column BETWEEN 0 AND 10)
0 and 10 can be replaced with any numerical values you want.
I have the following table:
CREATE TABLE dbo.TestSort
(
Id int NOT NULL IDENTITY (1, 1),
Value int NOT NULL
)
The Value column could (and is expected to) contain duplicates.
Let's also assume there are already 1000 rows in the table.
I am trying to prove a point about unstable sorting.
Given this query that returns a 'page' of 10 results from the first 1000 inserted results:
SELECT TOP 10 * FROM TestSort WHERE Id <= 1000 ORDER BY Value
My intuition tells me that two runs of this query could return different rows if the Value column contains repeated values.
I'm basing this on the facts that:
the sort is not stable
if new rows are inserted in the table between the two runs of the query, it could possibly create a re-balancing of B-trees (the Value column may be indexed or not)
EDIT: For completeness: I assume rows never change once inserted, and are never deleted.
In contrast, a query with stable sort (ordering also by Id) should always return the same results, since IDs are unique:
SELECT TOP 10 * FROM TestSort WHERE Id <= 1000 ORDER BY Value, Id
The question is: Is my intuition correct? If yes, can you provide an actual example of operations that would produce different results (at least "on your machine")? You could modify the query, add indexes on the Value column, etc.
I don't care about the exact query, but about the principle.
I am using MS SQL Server (2014), but am equally satisfied with answers for any SQL database.
If not, then why?
Your intuition is correct. In SQL, the sort for order by is not stable. So, if you have ties, they can be returned in any order. And, the order can change from one run to another.
The documentation sort of explains this:
Using OFFSET and FETCH as a paging solution requires running the query one time for each "page" of data returned to the client application. For example, to return the results of a query in 10-row increments, you must execute the query one time to return rows 1 to 10 and then run the query again to return rows 11 to 20 and so on. Each query is independent and not related to each other in any way. This means that, unlike using a cursor in which the query is executed once and state is maintained on the server, the client application is responsible for tracking state. To achieve stable results between query requests using OFFSET and FETCH, the following conditions must be met:
The underlying data that is used by the query must not change. That is, either the rows touched by the query are not updated or all requests for pages from the query are executed in a single transaction using either snapshot or serializable transaction isolation. For more information about these transaction isolation levels, see SET TRANSACTION ISOLATION LEVEL (Transact-SQL).
The ORDER BY clause contains a column or combination of columns that are guaranteed to be unique.
Although this specifically refers to offset/fetch, it clearly applies to running the query multiple times without those clauses.
If you have ties in the ordering, the ORDER BY is not stable.
LiveDemo
CREATE TABLE #TestSort
(
Id INT NOT NULL IDENTITY (1, 1) PRIMARY KEY,
Value INT NOT NULL
) ;
DECLARE @c INT = 0;
WHILE @c < 100000
BEGIN
INSERT INTO #TestSort(Value)
VALUES (2);
SET @c += 1;
END
Example:
SELECT TOP 10 *
FROM #TestSort
ORDER BY Value
OPTION (MAXDOP 4);
DBCC DROPCLEANBUFFERS; -- run to clear cache
SELECT TOP 10 *
FROM #TestSort
ORDER BY Value
OPTION (MAXDOP 4);
The point is that I force the query optimizer to use a parallel plan, so there is no guarantee that it will read the data sequentially the way a clustered index scan probably would when no parallelism is involved.
You cannot be sure how the query optimizer will read the data unless you explicitly force a specific sort order using ORDER BY Value, Id.
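To see the deterministic case concretely, here is a small sketch using SQLite (the unstable case can't be reliably reproduced in a portable demo, but the stable ordering can; table and data are made up):

```python
import sqlite3

# With duplicate values, "ORDER BY value" alone leaves the order of
# tied rows up to the engine; adding the unique id pins it down.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_sort (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO test_sort (id, value) VALUES (?, ?)",
    [(1, 2), (2, 2), (3, 1), (4, 2)],
)

# Fully determined: every tie on value is broken by the unique id.
rows = conn.execute(
    "SELECT id FROM test_sort ORDER BY value, id LIMIT 3"
).fetchall()
page = [r[0] for r in rows]  # value 1 first (id 3), then ids 1 and 2
```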
For more info read No Seatbelt - Expecting Order without ORDER BY.
I think this post will answer your question:
Is SQL order by clause guaranteed to be stable ( by Standards)
The result is the same every time in a single-threaded environment. Once multi-threading is used, you can't guarantee it.
My requirement is as follows.
(a) I already have a sequence created, and one table (let's assume employee, having id, name, etc.).
(b) Somehow my sequence got corrupted, and the current value of the sequence is no longer in sync with the max value of the id column of the employee table.
Now I want to reset my sequence to the max value of the id column of the employee table. I know this is easy with PL/SQL or a stored procedure, but I want to write a plain query that does the following:
1 - Fetch the max value of id and the current value of my sequence, take the difference, and add that difference to the sequence using increment by. (Here the current value of the sequence is less than the max value of the id column.)
You change the values of a sequence with the 'ALTER SEQUENCE' command.
To restart the sequence with a new base value, you need to drop and recreate it.
I do not think you can do this with a straightforward SELECT query.
Here is the Oracle 10g documentation for ALTER SEQUENCE.
You can't change the increment from plain SQL as alter sequence is DDL, so you need to increment it multiple times, one by one. This would increment the sequence as many times as the highest ID you currently have:
select your_sequence.nextval
from (
select max(id) as max_id
from your_table
) t
connect by level < t.max_id;
SQL Fiddle demo (fudged a bit as the sequence isn't reset if the schema is cached).
If you have a high max value, though, that might be inefficient, but as a one-off adjustment it probably doesn't matter. You can't refer to the current sequence value in a subquery or CTE, but you could look at the USER_SEQUENCES view to get a rough guide of how far out you are to begin with, and reduce the number of calls to within double the cache size (depending on how many waiting values the cache holds):
select your_sequence.nextval
from (
select max(id) as max_id
from your_table
) t
connect by level <= (
select t.max_id + us.cache_size + 1 - us.last_number
from user_sequences us
where sequence_name = 'YOUR_SEQUENCE'
);
SQL Fiddle.
With low existing ID values the second one might do more work, but with higher values you can see the second comes into its own a bit.
I have some data in an Oracle table, about 10,000 rows. I want to generate a column which returns 1 for the 1st row, 2 for the 2nd, 1 for the 3rd, 2 for the 4th, 1 for the 5th, 2 for the 6th, and so on. Is there any way to do this using a SQL query, or any script which can update my column like this? I have thought about it a lot but couldn't work out how to do it in SQL. Please help if there is any way of doing this with my table data.
You can use the combination of the ROWNUM and MOD functions.
Your query would look something like this:
SELECT ROWNUM, 2 - MOD(ROWNUM, 2) FROM ...
The MOD function will return 0 for even rows and 1 for odd rows.
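The arithmetic can be sanity-checked outside the database; for ROWNUM starting at 1, 2 - MOD(ROWNUM, 2) alternates 1, 2, 1, 2, ...:

```python
# Same expression as the query above, with % standing in for MOD.
labels = [2 - (rownum % 2) for rownum in range(1, 7)]
# rownum 1 -> 2 - 1 = 1, rownum 2 -> 2 - 0 = 2, and so on.
```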
select mod(rownum,5)+1,fld1, fld2, fld3 from mytable;
Edit:
I did not misunderstand requirements, I worked around them. Adding a column and then updating a table that way is a bad design idea. Tables are seldom completely static, even rule and validation tables. The only time this might make any sense is if the table is locked against delete, insert, and update. Any change to any existing row can alter the logical order. Which was never specified. Delete means the entire sequence has to be rewritten. Update and insert can have the same effect.
And if you wanted to do this, you could use a cycling sequence to insert a bogus counter, assuming you know the order and can control inserts and updates in terms of that order.
I'm introducing a primary key column to a table that doesn't have one yet. After I have added a normal field Id (int) with a default value of 0 I tried using the following update statement to create unique values for each record:
update t1
set t1.id = (select count(*) from mytable t2 where t2.id <> t1.id)
from mytable t1
I would expect the subquery to be executed for each row because I'm referencing t1. Each time the subquery would be executed the count should be one less but it doesn't work.
The result is that Id is still 0 in every record. I have used this before on other DBMS with success. I'm using SQL Server 2008 here.
How do I generate unique values for each record and update the Id field?
Trying to explain why it doesn't work as you expect:
I would expect the subquery to be executed for each row because I'm referencing t1.
It is executed and it can affect all rows. But an UPDATE statement is one statement, and it is executed as one statement that affects a whole table (or part of it if you have a WHERE clause).
Each time the subquery would be executed the count should be one less but it doesn't work.
You are expecting the UPDATE to be executed with one evaluation of the subquery per row. But it is one statement that is first evaluated, for all affected rows, and then the rows are changed (updated). (A DBMS may do it otherwise, but the result should nonetheless be as if it were done this way.)
The result is that Id is still 0 in every record.
That's the correct and expected behaviour of this statement when all rows have the same 0 value before execution. The COUNT(*) is 0.
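The effect can be reproduced in miniature (SQLite shown; note SQLite applies the update row by row rather than all at once, but with every id equal to 0 the outcome is the same):

```python
import sqlite3

# Every id is 0, so for each row the correlated COUNT(*) of rows with a
# *different* id is 0, and the UPDATE sets every id to 0 again.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER NOT NULL DEFAULT 0)")
conn.executemany("INSERT INTO mytable (id) VALUES (?)", [(0,)] * 5)

conn.execute(
    "UPDATE mytable SET id = "
    "(SELECT COUNT(*) FROM mytable t2 WHERE t2.id <> mytable.id)"
)
ids = [r[0] for r in conn.execute("SELECT id FROM mytable").fetchall()]
```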
I have used this before on other DBMS with success.
My "wild" guess is that you have used it in MySQL. (Correction/Update: my guess was wrong; this syntax for UPDATE is not valid in MySQL. Apparently the query was working "correctly" in Firebird.) The UPDATE does not work in the standard way in that DBMS. It works, as you have learned, row by row, not with the full table.
I'm using SQL Server 2008 here.
This DBMS works correctly with UPDATE. You can write a different Update statement that would have the wanted results or, even better, use an autogenerated IDENTITY column, as others have advised.
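One hypothetical shape for such a different Update statement, sketched in SQLite using its built-in rowid (in SQL Server you could use ROW_NUMBER() over a stable ordering instead, or better, an IDENTITY column as advised):

```python
import sqlite3

# Backfill unique ids by counting rows at or before each row's rowid.
# rowid is never modified by the UPDATE, so evaluation order is harmless.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER NOT NULL DEFAULT 0, v TEXT)")
conn.executemany("INSERT INTO mytable (id, v) VALUES (0, ?)",
                 [("a",), ("b",), ("c",)])

conn.execute(
    "UPDATE mytable SET id = "
    "(SELECT COUNT(*) FROM mytable t2 WHERE t2.rowid <= mytable.rowid)"
)
ids = sorted(r[0] for r in conn.execute("SELECT id FROM mytable"))
```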
The SQL is updating every row with the number of records whose ID doesn't equal that row's ID. As all the rows' IDs equal 0, there are no rows that are not equal to 0, and hence every row is set to 0 again.
Try looking at this answer here:
Adding an identity to an existing column