Prohibit breaking the sequence for rows field (no gaps) - sql

Is there a way to create a check, an index, or anything else to prohibit breaking the sequence for a row field?
Let's assume I have a chapters table with an order column.
Chapter table:
uuid | order
dad | 1
1dd | 2
xxss | 3
sdsd | 4
5aa | 5
Chapters order start from 1 and should not contain sequence gaps like 1,2,4,5 (3 is missing). Any chapter can be deleted, or inserted in any order (with reordering).
If there is no way to forbid skips, then how can I reorder chapters after an insert or delete to erase skips (renumber from 1 to max)?

I am unsure that there is an easy way to prevent gaps. I would start with a unique constraint, that avoids duplicates.
Then, you can use a view that assigns an autoincrementing id based on the existing column:
create view myview as
select uuid, row_number() over(order by ord) as new_ord
from mytable
Whenever you want to display the sequential chapter numbers, you can query the view instead of the table.
Note: order is a language keyword; I used ord instead in the query.
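For the second part of the question (renumbering after an insert or delete), one possible sketch is to read the row_number() values and write them back. It uses Python's sqlite3 module as a stand-in for the real database and needs a SQLite build with window functions (3.25 or later). The table name chapters, the column ord, and the helper renumber are assumptions made for illustration.

```python
import sqlite3

def renumber(conn):
    """Rewrite ord as 1..n, keeping the current relative order (hypothetical helper)."""
    rows = conn.execute(
        "SELECT uuid, row_number() OVER (ORDER BY ord) FROM chapters"
    ).fetchall()
    with conn:  # one transaction: either every row is renumbered or none
        conn.executemany(
            "UPDATE chapters SET ord = ? WHERE uuid = ?",
            [(new_ord, uuid) for uuid, new_ord in rows],
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chapters (uuid TEXT PRIMARY KEY, ord INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO chapters VALUES (?, ?)",
    [("dad", 1), ("1dd", 2), ("xxss", 3), ("sdsd", 4), ("5aa", 5)],
)
conn.commit()

# Deleting chapter 3 leaves the gap 1, 2, 4, 5 ...
conn.execute("DELETE FROM chapters WHERE uuid = 'xxss'")
# ... which the renumbering closes again.
renumber(conn)
result = conn.execute("SELECT uuid, ord FROM chapters ORDER BY ord").fetchall()
```

The sketch leaves out the unique constraint on ord. With that constraint in place, a row-by-row renumbering may hit transient duplicates in some databases, so a deferrable constraint or a two-step update might be needed.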

Related

Proper way to generate counter ID for unique values in a SQL column?

pk_id unq_id content
1 1 foo
2 2 bar
3 1 foo
4 1 foo
5 3 baz
6 2 bar
7 4 qux
I am populating a table with known content that can repeat a random number of times.
I want to auto-generate the unq_id column which counts the nth appearance of the unique value in the content column.
I am thinking about some foreign key constraint, but I am not particularly sure how to construct this kind of constraint. I have searched the web for a long time without result, so I can only ask here.
Could someone shed some light? Any help would be appreciated.
This is more simply done when you query the table, using row_number():
select pk_id, content,
row_number() over (partition by content order by pk_id) as unq_id
from t;
You can put this logic in a view.
Actually storing the value in the table requires a bit of work. If you don't pre-calculate the value, you'll need to use a trigger.
Alternatively, if the data does not change, then you can load into a staging table and use the above query to create the final table.

Updating / deleting arbitrary rows in column without primary key

I am building a tool that will display all the tables in a given PostgreSQL database (client's legacy app), then the user would dig in and can see all the data in given table. It is essentially a database viewer.
Next step will be to allow user to update each row, in a similar manner to how one updates data in Airtable.
While for most tables I will have primary keys that I can use to build appropriate Update ... where ID=? statements, I realized that may not always be the case. For some join tables, for example, I do not have the ID or any other primary key.
I would still like to have the functionality where the user looks at the grid of data displayed from such tables, selects a row with a click of the mouse, and provides new values.
PostgreSQL used to use OIDs to uniquely identify rows for such cases, but this is no longer the case even for the legacy database I am dealing with.
The only solution I can think of is using the offset/sort order to figure out which row is to be updated, but this leads to race conditions if sort changes in the meantime or the user deletes/adds some rows.
Any ideas how I can update such "anonymous" rows?
Each table in Postgres has a system column ctid which unambiguously identifies a row. Example:
drop table if exists my_table;
create table my_table(id int, str text);
insert into my_table values
(1, 'one'),
(1, 'two'),
(2, 'one');
select ctid, *
from my_table;
ctid | id | str
-------+----+-----
(0,1) | 1 | one
(0,2) | 1 | two
(0,3) | 2 | one
(3 rows)
You can use the column in delete or update:
delete from my_table
where ctid = '(0,2)'
returning *
id | str
----+-----
1 | two
(1 row)
DELETE 1
Note, however, that there is no guarantee that a row always has the same ctid, per the documentation:
ctid
The physical location of the row version within its table. Note that although the ctid can be used to locate the row version very quickly, a row's ctid will change if it is updated or moved by VACUUM FULL. Therefore ctid is useless as a long-term row identifier. The OID, or even better a user-defined serial number, should be used to identify logical rows.

Column with alternate serials

I would like to create a table of user_widgets which is primary keyed by a user_id and user_widget_id, where user_widget_id works like a serial, except for that it starts at 1 per each user.
Is there a common or practical solution for this? I am using PostgreSQL, but an agnostic solution would be appreciated as well.
Example table: user_widgets
| user_id | user_widget_id | user_widget_name |
+-----------+------------------+----------------------+
| 1 | 1 | Andy's first widget |
+-----------+------------------+----------------------+
| 1 | 2 | Andy's second widget |
+-----------+------------------+----------------------+
| 1 | 3 | Andy's third widget |
+-----------+------------------+----------------------+
| 2 | 1 | Jake's first widget |
+-----------+------------------+----------------------+
| 2 | 2 | Jake's second widget |
+-----------+------------------+----------------------+
| 2 | 3 | Jake's third widget |
+-----------+------------------+----------------------+
| 3 | 1 | Fred's first widget |
+-----------+------------------+----------------------+
Edit:
I just wanted to include some reasons for this design.
1. Less information disclosure, not just "Security through obscurity"
In a system where users should not be aware of one another, they also should not be aware of each other's widget_ids. If this were a table of inventory, weird trade secrets, invoices, or something more sensitive, they would be able to have their own uninfluenced set of IDs for those widgets. In addition to the obvious routine security checks, this adds an implicit security layer where the table has to be filtered by both the widget id and the user id.
2. Data Imports
Users should be permitted to import their data from some other system without having to trash all of their legacy IDs (if they have integer IDs).
3. Cleanliness
Not terribly dissimilar from my first point, but I think that users who create less content than others may be baffled or annoyed by significant jumps in their widget IDs. This of course is more superficial than functional, but could still be valuable.
A possible solution
One of the answers suggests the application layer handles this. I could store a next_id column on that user's table that gets incremented. Or perhaps even just count the rows per user, and not allow deletion of records (using a deleted/deactivated flag instead). Could this be done with a trigger function, or even a stored procedure rather than in the application layer?
If you have a table:
CREATE TABLE user_widgets (
user_id int
,user_widget_name text --should probably be a foreign key to a look-up table
,PRIMARY KEY (user_id, user_widget_name)
)
You could assign user_widget_id dynamically and query:
WITH x AS (
SELECT *, row_number() OVER (PARTITION BY user_id
ORDER BY user_widget_name) AS user_widget_id
FROM user_widgets
)
SELECT *
FROM x
WHERE user_widget_id = 2;
user_widget_id is applied alphabetically per user in this scenario and has no gaps. Adding, changing or deleting entries can result in changes, obviously.
More about window functions in the manual.
Somewhat more (but not completely) stable:
CREATE TABLE user_widgets (
user_id int
,user_widget_id serial
,user_widget_name text
,PRIMARY KEY (user_id, user_widget_id)
)
And:
WITH x AS (
SELECT *, row_number() OVER (PARTITION BY user_id
ORDER BY user_widget_id) AS user_widget_nr
FROM user_widgets
)
SELECT *
FROM x
WHERE user_widget_nr = 2;
Addressing question update
You can implement a regime to count existing widgets per user. But you will have a hard time making it bulletproof for concurrent writes. You would have to lock the whole table or use SERIALIZABLE transaction mode - both of which are real downers for performance and need additional code.
But if you guarantee that no rows are deleted you could go with my second approach - one sequence for user_widget_id across the table, giving you a "raw" ID. A sequence is a proven solution for concurrent load, preserves the relative order in user_widget_id and is fast. You could provide access to the table using a view that dynamically replaces the "raw" user_widget_id with the corresponding user_widget_nr like my query above.
You could (in addition) "materialize" a gapless user_widget_id by replacing it with user_widget_nr at off hours or triggered by events of your choosing.
To improve performance I would have the sequence for user_widget_id start with a very high number. Seems like there can only be a handful of widgets per user.
SELECT setval('user_widgets_user_widget_id_seq', 100000);
If no number is high enough to be safe, add a flag instead. Use the condition WHERE user_widget_id > 100000 to quickly identify "raw" IDs. If your table is huge, you may want to add a partial index using that condition (the index will be small). The condition is useful in a CASE statement in the view mentioned above, and in this statement to "materialize" IDs:
UPDATE user_widgets w
SET user_widget_id = u.user_widget_nr
FROM (
SELECT user_id, user_widget_id
,row_number() OVER (PARTITION BY user_id
ORDER BY user_widget_id) AS user_widget_nr
FROM user_widgets
WHERE user_widget_id > 100000
) u
WHERE w.user_id = u.user_id
AND w.user_widget_id = u.user_widget_id;
Possibly follow up with a REINDEX or even VACUUM FULL ANALYZE user_widgets at off hours. Consider a FILLFACTOR below 100, as columns will be updated at least once.
I would certainly not leave this to the application. That introduces multiple additional points of failure.
I am going to join in, in questioning the specific requirements. In general, if you are trying to order things of this sort, that might be better left to the application. If you knew me you'd realize this was really saying something. My concern is that every case I can think of may require re-ordering on the part of the application because otherwise the numbers would be irrelevant.
So I would just:
CREATE TABLE user_widgets (
user_id int references users(id),
widget_id int,
widget_name text not null,
primary key(user_id, widget_id)
);
And I'd leave it at that.
Now based on your justification, this addresses all of your concerns (imports). However I have once in a long while had to do something similar. The use case I had was a case where a local tax jurisdiction required that packing slips(!) be sequentially numbered without gaps, separate from invoices. Counting records, btw won't meet your import requirements.
What we did was create a table with one row per sequence and use that and then tie that in with a trigger.
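A minimal sketch of the "one row per sequence" idea, assuming one counter row per user. The answer above tied the counter table in with a trigger; this sketch runs the same steps in a helper function inside one transaction instead, and uses Python's sqlite3 module as a stand-in for PostgreSQL. The names user_counters, last_id, make_db and add_widget are hypothetical.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- one row per user holds the last number handed out (assumed layout)
        CREATE TABLE user_counters (
            user_id INTEGER PRIMARY KEY,
            last_id INTEGER NOT NULL
        );
        CREATE TABLE user_widgets (
            user_id          INTEGER NOT NULL,
            user_widget_id   INTEGER NOT NULL,
            user_widget_name TEXT NOT NULL,
            PRIMARY KEY (user_id, user_widget_id)
        );
    """)
    return conn

def add_widget(conn, user_id, name):
    """Take the user's next number and insert the widget in one transaction."""
    with conn:  # commits on success, rolls back (and reuses the number) on error
        conn.execute(
            "INSERT OR IGNORE INTO user_counters (user_id, last_id) VALUES (?, 0)",
            (user_id,),
        )
        conn.execute(
            "UPDATE user_counters SET last_id = last_id + 1 WHERE user_id = ?",
            (user_id,),
        )
        (widget_id,) = conn.execute(
            "SELECT last_id FROM user_counters WHERE user_id = ?", (user_id,)
        ).fetchone()
        conn.execute(
            "INSERT INTO user_widgets VALUES (?, ?, ?)", (user_id, widget_id, name)
        )
    return widget_id

conn = make_db()
ids = [
    add_widget(conn, 1, "Andy's first widget"),
    add_widget(conn, 1, "Andy's second widget"),
    add_widget(conn, 2, "Jake's first widget"),
]
```

Because the counter update and the insert share a transaction, a failed insert does not burn a number. The price is that concurrent writers for the same user have to wait on the counter row, which is the performance trade-off the earlier answer warns about.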

Logs with arbitrary numbers of entries in PostgreSQL

I'm designing a db in PostgreSQL that primarily stores info about different people. I'd like to associate a log with each person, consisting of the date and a text entry. Logs can have arbitrary numbers of entries. Here's the ideas I've toyed with:
What I think I want is a log_table like this:
person_id | row_num | row_date | row_text
-----------------------------------------
1 | 1 | 01/01/12 | Blah...
2 | 1 | 01/02/12 | Foo...
1 | 2 | 01/04/12 | Bar...
But I don't know how to get row_num to increment properly; it should default to one more than the largest current row_num for that person_id. In other words, the row_nums for a given person_id should be sequential.
Or I can just have row_num increment regardless of person_id so that every log entry has a distinct row number. But it doesn't seem very satisfying to have person_id 1's log jump from row 1 to row 3, and this could also make errors hard to spot.
My last idea is to include the log directly in the person table, by making a composite type log_entry = (date, text). Then a column log in the person table can store an array:
person_id | name | log
----------------------
1 | Bob | {(01/01/12, Blah...), (01/04/12, Bar...)}
But this seems cumbersome.
So my questions are, a) which solution if any is good design; b) any way to solve the auto-incrementing problem for solution 1? If it matters, this is a small db for personal use; I want good structure but it's highly likely I'll be the only user. Thanks so much for any help!
Why don't you use a timestamp to store the time when the row has been inserted?
That way you don't need the extra row_num column in the table, and you can always "calculate" it on the fly:
SELECT person_id,
row_number() over (partition by person_id order by row_timestamp) as row_num,
row_timestamp,
row_text
FROM log_table
Of course, if there is a chance that a user generates more than one entry per microsecond, you might wind up with log entries that have exactly the same timestamp.
But even in a busy system this is quite unlikely (but not impossible).
If you can't (or don't want to) use a timestamp, you can always use a sequence that increments for all users and then use the row_number() function to generate a gapless row number during retrieval (as shown above, just use an order by on the column populated by the sequence).
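The sequence variant could look roughly like this. It is a sketch in Python's sqlite3 (window functions need SQLite 3.25 or later), where an AUTOINCREMENT column stands in for a PostgreSQL sequence. The column name entry_id is an assumption, and the sample dates are written in ISO format on the guess that 01/04/12 means January 4, 2012.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE log_table (
        entry_id  INTEGER PRIMARY KEY AUTOINCREMENT,  -- one counter for all persons
        person_id INTEGER NOT NULL,
        row_date  TEXT NOT NULL,
        row_text  TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO log_table (person_id, row_date, row_text) VALUES (?, ?, ?)",
    [(1, "2012-01-01", "Blah..."), (2, "2012-01-02", "Foo..."), (1, "2012-01-04", "Bar...")],
)

# row_num is computed at query time, so it stays gapless per person
# even though entry_id jumps from 1 to 3 for person 1.
rows = conn.execute("""
    SELECT person_id,
           row_number() OVER (PARTITION BY person_id ORDER BY entry_id) AS row_num,
           row_date,
           row_text
    FROM log_table
    ORDER BY person_id, row_num
""").fetchall()
```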

How can I reorder rows in sql database

Is it possible to reorder rows in SQL database?
For example; how can I swap the order of 2nd row and 3rd row's values?
The order of the rows is important to me since I need to display the values according to that order.
Thanks for all the answers. But 'Order by' won't work for me.
For example, I put a list of bookmarks in the database.
I want to display them based on the result I get from the query (not in alphabetical order), just in the order they were inserted.
But the user may rearrange the position of a bookmark (in any way he/she wants). So I can't use 'order by'.
An example is how bookmarks are displayed in Firefox, where the user can switch positions easily. How can I represent that in the DB?
Thank you.
It sounds like you need another column like "ListOrder". So your table might look like:
BookMark ListOrder
======== =========
d 1
g 2
b 3
f 4
a 5
Then you can "order by" ListOrder.
Select * from MyTable Order By ListOrder
If the user can only move a bookmark one place at a time, you can use integers as the ListOrder, and swap them. For example, if the user wants to move "f" up one row:
Update MyTable
Set ListOrder=ListOrder+1
Where ListOrder=(Select ListOrder-1 From MyTable where BookMark='f')
Update MyTable
Set ListOrder=ListOrder-1
Where BookMark='f'
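If the user can move a bookmark up or down many rows at once, then you need to reorder a segment. For example, if the user wants to move "f" to the top of the list (new position 1), you need to run one of the two branches below, where increment means that the bookmark's ListOrder grows (it moves down the list), followed by the final update: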
if (increment) {
update MyTable
Set ListOrder=ListOrder-1
where ListOrder<=1 -- The New position
and ListOrder >(Select ListOrder from MyTable where BookMark='f')
} else {
update MyTable
Set ListOrder=ListOrder+1
where ListOrder>=1 -- The New position
and ListOrder <(Select ListOrder from MyTable where BookMark='f')
}
update MyTable
Set ListOrder=1 -- The New Position
Where Bookmark='f'
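The segment reordering above can be sketched with the new position passed in as a parameter instead of the hard-coded 1. This uses Python's sqlite3 as a stand-in database; the helpers move_bookmark and current_order are hypothetical names, and the table has no unique constraint on ListOrder, which would otherwise need extra care during the shift.

```python
import sqlite3

def move_bookmark(conn, bookmark, new_pos):
    """Move one bookmark to new_pos and shift the rows in between by one."""
    with conn:  # run the shift and the final update as one transaction
        (old_pos,) = conn.execute(
            "SELECT ListOrder FROM MyTable WHERE BookMark = ?", (bookmark,)
        ).fetchone()
        if new_pos < old_pos:
            # moving up: rows from new_pos to just above the old slot slide down
            conn.execute(
                "UPDATE MyTable SET ListOrder = ListOrder + 1 "
                "WHERE ListOrder >= ? AND ListOrder < ?",
                (new_pos, old_pos),
            )
        elif new_pos > old_pos:
            # moving down: rows just below the old slot up to new_pos slide up
            conn.execute(
                "UPDATE MyTable SET ListOrder = ListOrder - 1 "
                "WHERE ListOrder > ? AND ListOrder <= ?",
                (old_pos, new_pos),
            )
        conn.execute(
            "UPDATE MyTable SET ListOrder = ? WHERE BookMark = ?", (new_pos, bookmark)
        )

def current_order(conn):
    return [r[0] for r in conn.execute("SELECT BookMark FROM MyTable ORDER BY ListOrder")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (BookMark TEXT PRIMARY KEY, ListOrder INTEGER NOT NULL)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", list(zip("dgbfa", range(1, 6))))

move_bookmark(conn, "f", 1)   # "f" goes to the top
after_first_move = current_order(conn)
move_bookmark(conn, "d", 5)   # "d" goes to the bottom
after_second_move = current_order(conn)
```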
As others have mentioned, it's not a good idea to depend on the physical order of the database table. Relational tables are conceptually more like unordered sets than ordered lists. Assuming a certain physical order may lead to unpredictable results.
Sounds like what you need is a separate column that stores the user's preferred sort order. But you'll still need to do something in your query to display the results in that order.
It is possible to specify the physical order of records in a database by creating a clustered index, but that is not something you'd want to do on an arbitrary user-specified basis. And it may still lead to unexpected results.
Use ORDER BY in your SELECT query. For example, to order by a user's last name, use:
SELECT * FROM User ORDER BY LastName
The order of the rows on the actual database should not matter.
You should use the ORDER BY clause in your queries to order them as you need.
Databases can store the data in any way they want. Using the "order by" clause is the only way to guarantee an ordering of the data. In your bookmark example, you could have an integer field that indicates the ordering, and then update that field as a user moves things around. Then ORDER BY that column to get things in the right order.
A little late to the party, but for anyone still looking for an answer to this problem: you need to use the Stern-Brocot technique.
Here's an article explaining the theory behind it
For each item you need to store a numerator and denominator. Then you can also add a computed column which is the division of both. Each time you move an item in between two others, the item's numerator becomes the sum of both neighboring numerators, and the item's denominator becomes the sum of both neighboring denominators.
These numbers won't skyrocket as fast as with the "averaging" method, where you lose all accuracy after 17 swaps.
I also created a demo where the method is implemented.
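The numerator/denominator rule can be tried out without a database. The sketch below is plain Python; the boundary values 0/1 and 1/0 for "before the first item" and "after the last item" are the usual Stern-Brocot sentinels and are an assumption here, as are the helper names mediant and sort_key.

```python
from fractions import Fraction

LOW = (0, 1)   # assumed sentinel: sorts before every item
HIGH = (1, 0)  # assumed sentinel: "infinity", only ever used as a neighbor

def mediant(left, right):
    """New position between two neighbors: add numerators and denominators."""
    return (left[0] + right[0], left[1] + right[1])

def sort_key(pos):
    """The computed column: numerator divided by denominator."""
    return Fraction(pos[0], pos[1])

a = mediant(LOW, HIGH)    # first item in an empty list
c = mediant(a, HIGH)      # appended after a
b = mediant(a, c)         # inserted between a and c
front = mediant(LOW, a)   # inserted before a

positions = {"a": a, "b": b, "c": c, "front": front}
names = [name for name, pos in sorted(positions.items(), key=lambda kv: sort_key(kv[1]))]
```

In a table, the two integers would be stored per row and the ORDER BY would use the computed quotient.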
I have a solution for this that I have used a few times. I keep an extra field "sort_order" in the table, and update this when reordering. I've used this in cases when I have some sort of containers with items, and the order of the items should be editable inside the container. When reordering, I only update the sort_order for the items in the current container, which means not too many (usually in practice only a few) rows have to be updated.
In short, I do the following:
add a sort_order field to the items table
when inserting a new row, I set sort_order=id
when reordering (needs id of item to move, and id of item to insert after):
select id, sort_order from items where container = ID order by sort_order
split the id and sort_order from rows in two arrays
remove the id of the item to move from the id-list
insert the id of the item to move after the id of the item to insert after
merge the list of ids and the list of sort_order into a two dimensional array, as [[id, sort_order], [id2, sort_order], ...]
run update item set sort_order=SORT_ORDER where id=ID (executemany) with merged list
(If moving item to another container, after updating "container foreign key" move first or last depending on app.)
(If the update involves a large number of items, I do not think this solution is a good approach.)
I have made an example using python and mysql on http://wannapy.blogspot.com/2010/11/reorder-rows-in-sql-database.html (copy and try it) along with some extra explanations.
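The reordering steps listed above might look like this in compact form. It is a sketch using Python's sqlite3 rather than MySQL, and it is not the code from the linked post; the table layout and the helper name move_after are assumptions.

```python
import sqlite3

def move_after(conn, container, move_id, after_id):
    """Place move_id directly after after_id, reusing the existing sort_order values."""
    rows = conn.execute(
        "SELECT id, sort_order FROM items WHERE container = ? ORDER BY sort_order",
        (container,),
    ).fetchall()
    ids = [r[0] for r in rows]      # ids in their current order
    orders = [r[1] for r in rows]   # the sort_order values, already ascending
    ids.remove(move_id)
    ids.insert(ids.index(after_id) + 1, move_id)
    with conn:  # hand the same sort_order values out to the reordered ids
        conn.executemany(
            "UPDATE items SET sort_order = ? WHERE id = ?", list(zip(orders, ids))
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, container INTEGER NOT NULL, "
    "sort_order INTEGER NOT NULL)"
)
# sort_order starts out equal to id, as described above
conn.executemany(
    "INSERT INTO items (id, container, sort_order) VALUES (?, 10, ?)",
    [(i, i) for i in range(1, 5)],
)
conn.execute("INSERT INTO items VALUES (5, 20, 5)")  # a row in another container

move_after(conn, 10, 1, 3)
new_order = [
    r[0] for r in conn.execute("SELECT id FROM items WHERE container = 10 ORDER BY sort_order")
]
```

Only the rows of the affected container are read and written, which matches the remark that this works best when each container holds few items.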
I guess a simple order by would be what you're looking for?
select my_column from my_table order by my_order_column;
As others have stated use an order by.
Never depend on the order in which data exists in a physical table; always base it on the data you are working with, be it one or more key fields.
First, let me agree with everyone here that the order in the table shouldn't matter. Use a separate [SortOrder] column that you update and include an Order By clause.
That said, SQL Server databases do allow for a single "clustered index" on a table that will actually force the position in the underlying table storage. Primarily useful if you have a big dataset and always query by something specific.
Add a position column to your table and store as a simple integer.
If you need to support multiple users or lists, your best bet is to create a bookmarks table, a users table and a table to link them.
bookmarks: id,url
users: id,name
users_bookmarks: user_id, bookmark_id, position, date_created
Assuming date_created is populated when inserting rows you can get secondary list ordering based on date.
select bookmark_id from users_bookmarks where user_id = 1 order by position, date_created;
At times like this, I am reminded of a quote from the Matrix: "Do not try and order the database. That's impossible. Instead, only realize the truth... there is no order. Then you will see that it is not the table that orders itself, it is you who orders the table."
When working with MySQL through a GUI, there is always a decision to make. If you run something like SELECT * FROM users, MySQL will always make a decision to order this by some field. Normally, this will be the primary key.
+----+--------+
| id | name   |
+----+--------+
| 1  | Brian  |
| 2  | Carl   |
| 3  | Albert |
+----+--------+
When you add an ORDER BY command to the query, it will make the decision to order by some other field.
For Example Select * From users ORDER BY name would yield:
+----+--------+
| id | name   |
+----+--------+
| 3  | Albert |
| 1  | Brian  |
| 2  | Carl   |
+----+--------+
So to your question, you appear to want to change the default order by which your table displays this information. In order to do that, check to see what your Primary Key field is. For most practical purposes, having a unique identifying natural number tends to do the trick. MySQL has an AUTO_INCREMENT function for this. When creating the table, it would look something like field_name int NOT NULL AUTO_INCREMENT.
All of this is to say: if you would like to change "the row order", you would need to update this value. However, since the identifier is something that other tables would use to reference your field, this seems a little bit reckless.
If you, for example, ran UPDATE users SET id = 1 WHERE id = 2; this would fail, since two rows would end up with the same id and violate the primary key check (which insists on both uniqueness and having a value set). You could juggle this by running three update statements in a row:
UPDATE users Set id = 100000000 where id = 1;
UPDATE users Set id = 1 where id = 2;
UPDATE users Set id = 2 where id = 100000000;
This would result in the rows for this table looking like:
+----+--------+
| id | name   |
+----+--------+
| 1  | Carl   |
| 2  | Brian  |
| 3  | Albert |
+----+--------+
Which technically would work to reorder this table, but this is in a bubble. MySQL being a relational database means that any table which was depending on that data to be consistent will now be pointed to the wrong data. For example, I have a table which stores birthdays, referencing the initial user table. Its structure might look like this:
+----+---------+------------+
| id | user_id | birthdate  |
+----+---------+------------+
| 1  | 1       | 1993-01-01 |
| 1  | 2       | 1980-02-03 |
| 1  | 3       | 1955-01-01 |
+----+---------+------------+
By switching the ID's on the user table, you MUST update the user_id value on the birthdays table. Of course MySQL comes prepared for this: enter "Foreign Key Constraints". As long as you configured all of your foreign key constraints to Cascade Updates, you wouldn't need to manually change the reference to every value you changed.
These queries would all be a lot of manual work and potentially weaken your data's integrity. If you have fields you would like to rank and reorder regularly, the answer posed by Mike Lewis on this question with the "table order" would be a more sensible answer (and if that is the case, then his is the best solution and just disregard this answer).
In response to your post here, the answer you may be looking for is:
To order chronologically, add a DateAdded or similar column with a datetime or smalldatetime datatype.
On all methods that insert into the database, make sure you insert CURRENT_TIMESTAMP in the DateAdded column.
On methods that query the database, add ORDER BY DateAdded at the end of the query string.
NEVER rely on the physical position in the database system. It may work MOST of the time but definitely not ALL of the time.
The question lacks any detail that would let anyone give you correct answer. Clearly you could read the records into memory and then update them. But this is bad on so many different levels.
The issue is like this. Depending on the schema that is actually implemented there is logic to the way that the records are physically written to disk. Sometimes they are written in order of insert and other times they are inserted with space between blocks (see extents).
So changing the physical order is not likely without swapping column data; and this has a deep effect on the various indices. You are left having to change the logical order.
As I read your update... I'm left to understand that you may have multiple users and each user is to have bookmarks that they want ordered. Looks like you need a second table that acts as an intersection between the user and the bookmark. Then all you need is an inner join and an order by.
But there is not enough information to offer a complete solution.
Here is a stored procedure script to increment or decrement (one at a time) in MySQL.
Note, MySQL doesn't allow you to select in the same query you're updating so the above answers don't work.
I have also set it to return an error if there is no item above / below if you're incrementing / decrementing, respectively.
DELIMITER $$
CREATE PROCEDURE `spReorderSequenceItems` (
IN _SequenceItemId INT,
IN _SequenceId INT,
IN IncrementUp TINYINT,
OUT Error VARCHAR(255)
)
BEGIN
DECLARE CurrentPosition INT;
SELECT Position INTO CurrentPosition
FROM tblSequenceItems
WHERE SequenceItemId = _SequenceItemId;
IF IncrementUp = 1 THEN
IF (
SELECT Position
FROM tblSequenceItems
WHERE Position = CurrentPosition + 1 AND SequenceId = _SequenceId
) THEN
UPDATE tblSequenceItems
SET Position = Position - 1
WHERE Position = CurrentPosition + 1 AND SequenceId = _SequenceId;
UPDATE tblSequenceItems
SET Position = Position + 1
WHERE SequenceItemId = _SequenceItemId;
ELSE
SELECT 'No Item Above' AS _Error INTO Error;
END IF;
ELSE
IF (
SELECT Position
FROM tblSequenceItems
WHERE Position = CurrentPosition - 1 AND SequenceId = _SequenceId
) THEN
UPDATE tblSequenceItems
SET Position = Position + 1
WHERE Position = CurrentPosition - 1 AND SequenceId = _SequenceId;
UPDATE tblSequenceItems
SET Position = Position - 1
WHERE SequenceItemId = _SequenceItemId;
ELSE
SELECT 'No Item Below' AS _Error INTO Error;
END IF;
END IF;
END
$$
DELIMITER ;
Call it with
CALL spReorderSequenceItems(1, 1, 1, @Error);
SELECT @Error;