postgresql sequence number depending on rows?

I am taking a course in databases and I am unsure how to create this view.
I have this table (PostgreSQL):
CREATE TABLE InQueue (
id INT REFERENCES Student(id),
course VARCHAR(10) REFERENCES RestrictedCourse(course_code),
since TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id,course),
UNIQUE (course,since)
);
I am supposed to create a view that lists course, id, number, where number is calculated from since. Basically, the lowest since gets queue number 1, the second lowest gets 2, and so forth. (course, number) is unique, but number by itself is not, since there are many different courses.
What I think needs to be done is to first order the table by (course, since) and then add sequence numbers, but whenever the course changes the sequence numbers need to start over, beginning with 1 again.
Could someone point me in the right direction? :)

Use:
SELECT row_number() OVER (PARTITION BY course ORDER BY since ASC) AS yournumber,
id, course, since
FROM InQueue;
You can read about window functions (also called analytic functions) here: http://www.postgresql.org/docs/9.4/static/tutorial-window.html
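To see what the window function computes, here is a minimal Python sketch that emulates the per-course numbering outside the database. It is only an illustration: the function name queue_numbers and the sample course codes are assumptions, not part of the original question or answer.

```python
from itertools import groupby
from operator import itemgetter


def queue_numbers(rows):
    """Emulate row_number() over (partition by course order by since).

    rows: iterable of (id, course, since) tuples.
    Returns a list of (course, id, number) tuples.
    """
    # Sort by course first, then by since, so each course forms one run.
    ordered = sorted(rows, key=itemgetter(1, 2))
    result = []
    for course, group in groupby(ordered, key=itemgetter(1)):
        # Numbering restarts at 1 for every new course (the "partition").
        for number, (student_id, _, _) in enumerate(group, start=1):
            result.append((course, student_id, number))
    return result


# Hypothetical sample data: (id, course, since)
sample = [
    (1, "TDA357", "2015-01-02 10:00"),
    (2, "TDA357", "2015-01-01 09:00"),
    (3, "DAT017", "2015-01-03 12:00"),
]
```

Calling queue_numbers(sample) numbers student 2 before student 1 within the same course, because student 2 has the earlier since value.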

Related

Create a row which does not violate a unique index

I have a table with Id, AccountId, PeriodId, and Comment. I have a unique index on (AccountId, PeriodId). I want to create a row that does not violate the unique index. AccountId and PeriodId cannot be null; they are foreign keys.
My only idea is to cross join the Account and Period tables to get all valid combinations, discard the combinations that already exist, and choose one from the rest.
Is this a good approach, or is there a better way?
Update 1: Some of you wanted to know the reason. I need this functionality in a program which adds a new row to the current table and only allows the user to modify the row after it is already in the db. This program now uses a default constructor to create a new row with AccountId1 and PeriodId1 and an empty comment. If this combination is available, then after insertion the user can change it and provide a meaningful comment (there can be at most one comment for each (AccountId, PeriodId)). But if the combination is not available, the original insertion will fail. So my task is to give the program a combination which is not used, so it can be safely inserted.

As it turns out my original idea is quick enough. This query returns an unused (AccountId, PeriodId).
select top 1 *
from
(
select Account.Id as AccountId, [Period].Id as PeriodId
from Account cross join [Period]
except
select AccountId, PeriodId
from AccountPeriodFilename
) as T
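The same cross join and EXCEPT idea can be tried out with a minimal Python sketch against an in-memory SQLite database. This is an illustration under assumptions: SQLite uses LIMIT 1 where SQL Server uses TOP 1, and the helper names make_demo_db and find_unused_pair and the sample rows are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory database mirroring the tables in the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Account (Id INTEGER PRIMARY KEY);
        CREATE TABLE Period (Id INTEGER PRIMARY KEY);
        CREATE TABLE AccountPeriodFilename (
            AccountId INTEGER NOT NULL REFERENCES Account(Id),
            PeriodId INTEGER NOT NULL REFERENCES Period(Id),
            Comment TEXT,
            UNIQUE (AccountId, PeriodId)
        );
        INSERT INTO Account (Id) VALUES (1), (2);
        INSERT INTO Period (Id) VALUES (1), (2);
        INSERT INTO AccountPeriodFilename (AccountId, PeriodId)
        VALUES (1, 1), (1, 2), (2, 1);
        """
    )
    return conn


def find_unused_pair(conn):
    """Return one (AccountId, PeriodId) pair that is not used yet, or None."""
    return conn.execute(
        """
        SELECT a.Id, p.Id FROM Account AS a CROSS JOIN Period AS p
        EXCEPT
        SELECT AccountId, PeriodId FROM AccountPeriodFilename
        LIMIT 1
        """
    ).fetchone()
```

With the sample rows above, the only free combination is (2, 2). Note that another session could insert the same pair between the lookup and the insert, so the insert should still be prepared to handle a unique-index violation.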

How to produce a reproducible column of random integers in SQL

I have a patient table with a unique patientID column. This patientID cannot be shared with study teams, so I need a randomised set of unique patient identifiers that I can share. The difficulty is that there will be several study teams, so every time a randomised identifier is produced, it needs to be different from the identifier produced for other studies. To make it even more complicated, we need to be able to reproduce the same set of random identifiers for a study at any point (if the study needs to re-run the data, for example).
I have looked into the RAND() and NEWID() functions but not managed to figure out a solution. I think this may be possible using RAND() with a seed, and a while loop, but I haven't used these before.
Can anyone provide a solution that allows me to share several randomised sets of unique identifiers, that never have the same identifier for the same patient, and which can be re-run to produce the same list?
Thanks in advance to anyone that helps with this!

NEWID() should work as long as you use the correct data type.
Values of type UNIQUEIDENTIFIER should be unique across the entire database/server. See the full details at the link below:
sqlshack.com/understanding-the-guid-data-type-in-sql-server
DECLARE @UNI UNIQUEIDENTIFIER
SET @UNI = NEWID()
SELECT @UNI
Quoted from the link:
As mentioned earlier, GUID values are unique across tables, databases, and servers. GUIDs can be considered as global primary keys. Local primary keys are used to uniquely identify records within a table. On the other hand, GUIDs can be used to uniquely identify records across tables, databases, and servers.

One method is to use the patientid as a seed to rand():
select rand(checksum(patientid))
This returns a value between 0 and 1. You can multiply by a large number.
That said, I think you should keep a list of patients in each study -- so you don't have to reproduce the results. Reproducing results seems dangerous, especially for something like a "study" that could have an impact on health.
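As an alternative illustration of a reproducible, per-study identifier, here is a minimal Python sketch that derives a pseudonym from the patient id with a keyed hash (HMAC-SHA256). This is a swapped-in technique, not the rand(checksum(...)) approach above. The function name study_pseudonym, the digits parameter, and the per-study keys are all assumptions. Because the hash is truncated, uniqueness is very likely but not guaranteed, so the output should be checked for duplicates.

```python
import hashlib
import hmac


def study_pseudonym(patient_id, study_key, digits=12):
    """Derive a repeatable pseudonym for patient_id under a per-study secret.

    study_key: bytes, a secret that is different for every study (assumption).
    The same (patient_id, study_key) pair always yields the same number.
    """
    message = str(patient_id).encode("utf-8")
    digest = hmac.new(study_key, message, hashlib.sha256).digest()
    # Truncate the digest to an integer with at most `digits` decimal digits.
    return int.from_bytes(digest[:8], "big") % 10**digits


def pseudonyms_for_study(patient_ids, study_key):
    """Map each patient id to its pseudonym and fail loudly on a collision."""
    mapping = {pid: study_pseudonym(pid, study_key) for pid in patient_ids}
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("pseudonym collision; use more digits or another key")
    return mapping
```

Re-running pseudonyms_for_study with the same key reproduces the same list, while a different key gives a different list for the same patients. The keys must be kept secret, since anyone who holds a key can recompute the mapping.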

This is too much for a comment. It's not black and white from your description and comments what you are asking for, but it appears you want to associate a new random ID value for each existing patients' ID, presumably being able to tie it back to the source ID, and produce the same random ID at a later date repeatedly.
It sounds like you'll need an intermediary table to store the randomly produced IDs (otherwise, being random how do you guarantee to get the same value for the same PatientID?)
Could you therefore have a table something like
create table Synonyms (
Id int not null identity(1,1),
PatientId int not null,
RandomId uniqueidentifier not null default newid(),
Createdate datetime not null default getdate()
)
PatientId is the foreign key to the actual Id of the Patient.
Each time you need a new random PatientId, insert the PatientIDs into this table and then join to it when querying out the patient data, supplying the RandomId instead. That way, you can reproduce the same random Id each time it's needed.
You could have a view that always provides the most recent RandomId value for each PatientId, or by some mechanism to track which "version" a report gets.
If you need a new Id for the patient, insert its Id again and you are guaranteed to get the same Id via whatever logic you need - ie you could have a ReportNo column as a sequence partitioned by PatientId or any number of other ways.
If you prefer to avoid a GUID you could make it an int and use a function to generate it by checking it's not already used, possibly a computed column with an inline function that selects top 1 from a numbers table that doesn't already exist as a RandomId... or something like that!
I may have completely misunderstood, hopefully it might give you some ideas though.

PostgreSQL Sequence Ascending Out of Order

I'm having an issue with Sequences when inserting data into a Postgres table through SQL Alchemy.
All of the data is inserted fine, the id BIGSERIAL PRIMARY KEY column has all unique values which is great.
However, when I query the first 10/20 rows etc. of the table, the id values are not in ascending numeric order. There are gaps in the sequence, which is fine and to be expected; what I mean is that the rows come back with values in a seemingly random, non-ascending order, like:
id
15
22
16
833
30
etc...
I've gone through plenty of SO and Postgres forum posts about this and have only found people talking about huge gaps in their serial sequences, not about values appearing out of ascending order when created.
The table itself was created through a standard DDL statement like so:
CREATE TABLE IF NOT EXISTS schema.table_name (
id BIGSERIAL NOT NULL,
col1 text NOT NULL,
col2 JSONB[] NOT NULL,
etc....
PRIMARY KEY (id)
);

"However when I query the first 10/20 rows etc. of the table"
Your query has no order by clause, so you are not selecting the first rows of the table, just an undefined set of rows.
Use order by and you will find that sequence numbers are indeed assigned in ascending order (potentially with gaps):
select id from ht_data order by id limit 30
To actually check the ordering of the sequence, you would need another column that stores the timestamp when each row was created. You could then do:
select id from ht_data order by ts limit 30

In general, there is no defined "order" within a SQL table. If you want to view your data in a certain order, you need an ORDER BY clause:
SELECT *
FROM table_name
ORDER BY id;
As for gaps in the sequence, the contract of an auto increment column generally only guarantees that each newly generated id value will be unique and, most of the time (but not necessarily always), will be increasing.

How could you possibly know if the values are "out of order"? SQL tables represent unordered sets. The only indication of ordering in your table is the serial value.
The query that you are running has no ORDER BY. The results are not guaranteed to be in any particular order. Period. That is a very simple fact about SQL. That you want the results of a SELECT to be ordered by the primary key or by insertion order is nice, but it is not how databases work.
The only way you could determine whether something was out of order would be to have a column that separately specifies the insert order; you could have a creation timestamp, for instance.
All you have discovered is that SQL lives up to its promise of not guaranteeing ordering unless the query specifically asks for it.

Latest entry for append-only log in postgresql

I want to store users' settings in a postgresql database.
I would like to keep full history of their settings, and also be able to query the latest settings for a given user.
I have tried storing settings in a table like this:
CREATE TABLE customer (
customer_id INTEGER PRIMARY KEY,
name VARCHAR NOT NULL
);
CREATE TABLE customer_settings (
customer_id INTEGER REFERENCES customer NOT NULL,
sequence INTEGER NOT NULL, -- start at 1 and increase, set by the application
settings JSONB NOT NULL,
PRIMARY KEY(customer_id, sequence)
);
So customer_settings is an append-only log, per customer.
Then to query latest settings I use a long query that will do a subquery to SELECT the max sequence for the given customer_id, then will select the settings for that id.
This is awkward! I wonder if there is a better way? May I use a view or a trigger to make a second table latest_customer_settings?

You can make a view. To get the settings for multiple customers in Postgres, I would recommend:
select distinct on (customer_id) cs.*
from customer_settings cs
order by customer_id, sequence desc;
And for this query, I would recommend an index on customer_settings(customer_id, sequence desc).
In addition, you can have Postgres generate the sequence -- if you can deal with one overall sequence number for all customers.
CREATE TABLE customer_settings (
customer_settings_id bigserial primary key,
customer_id INTEGER REFERENCES customer NOT NULL,
settings JSONB NOT NULL
);
Then, the application does not need to set a sequence number. You can just insert customer_id and settings into the table.
Having the application maintain this information has some short-comings. First, the application has to read from the database before it can insert anything into the table. Second, you can have race conditions if multiple threads are updating the table at the same time (in this case for the same customer).

You can use the row_number() window function to get each customer's latest settings:
with cte as (select cs.*,
row_number() over(partition by c.customer_id order by cs.sequence desc) rn
from customer c join customer_settings cs on c.customer_id=cs.customer_id
) select * from cte where rn=1

Assuming you just want the single latest log for a given user, and also assuming that the sequence is always increasing and unique, then actually you only need a simple query:
SELECT *
FROM customer_settings
WHERE customer_id = 123
ORDER BY sequence DESC
LIMIT 1;
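To try this lookup without a PostgreSQL server, here is a minimal Python sketch using an in-memory SQLite database. It is only an illustration: settings is stored as JSON text because SQLite has no JSONB type, and the helper names make_settings_db and latest_settings and the sample rows are assumptions.

```python
import json
import sqlite3


def make_settings_db():
    """Create the append-only log table with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer_settings (
            customer_id INTEGER NOT NULL,
            sequence INTEGER NOT NULL,
            settings TEXT NOT NULL,  -- JSON stored as text in this sketch
            PRIMARY KEY (customer_id, sequence)
        );
        """
    )
    rows = [
        (123, 1, json.dumps({"theme": "light"})),
        (123, 2, json.dumps({"theme": "dark"})),
        (456, 1, json.dumps({"theme": "blue"})),
    ]
    conn.executemany("INSERT INTO customer_settings VALUES (?, ?, ?)", rows)
    return conn


def latest_settings(conn, customer_id):
    """Return the newest settings dict for one customer, or None if absent."""
    row = conn.execute(
        "SELECT settings FROM customer_settings "
        "WHERE customer_id = ? ORDER BY sequence DESC LIMIT 1",
        (customer_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

The primary key on (customer_id, sequence) already gives the database an index it can use to find the highest sequence for one customer.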
If you want to invest some time into creating a better logging framework, then try looking into things like MDC (Mapped Diagnostic Context, see here). With MDC, each log statement is written out with a completely unique identifier, which also gets sent in the response header or body. Then, it becomes easy and foolproof to correlate an exception between backend and frontend or consumer.

Linked List in SQL

What's the best way to store a linked list in a MySQL database so that inserts are simple (i.e. you don't have to re-index a bunch of stuff every time) and such that the list can easily be pulled out in order?

Use Adrian's solution, but instead of incrementing by 1, increment by 10 or even 100. Then the position of an inserted row can be calculated as halfway between the positions of its two neighbours, without having to update everything below the insertion. Pick a number large enough to handle your average number of insertions. If it's too small, you'll have to fall back to updating all rows with a higher position during an insertion.
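The midpoint calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming integer positions and a hypothetical spacing of 100; the names GAP, position_between, and renumber are not from the original answer.

```python
GAP = 100  # hypothetical spacing between freshly numbered positions


def position_between(before, after):
    """Return an integer position strictly between two neighbours.

    Returns None when the neighbours are adjacent, which signals that the
    list has to be renumbered before the insert can happen.
    """
    if after - before < 2:
        return None
    return before + (after - before) // 2


def renumber(count, gap=GAP):
    """Return fresh, evenly spaced positions for `count` items."""
    return [gap * (i + 1) for i in range(count)]
```

For example, inserting between positions 100 and 200 yields 150, and a later insert between 100 and 150 yields 125. Once position_between returns None, the rows can be given the positions from renumber and inserting can continue.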

Create a table with two self-referencing columns, PreviousID and NextID. If the item is the first thing in the list, PreviousID will be null; if it is the last, NextID will be null. The SQL will look something like this:
create table tblDummy
(
PKColumn int not null,
PreviousID int null,
DataColumn1 varchar(50) not null,
DataColumn2 varchar(50) not null,
DataColumn3 varchar(50) not null,
DataColumn4 varchar(50) not null,
DataColumn5 varchar(50) not null,
DataColumn6 varchar(50) not null,
DataColumn7 varchar(50) not null,
NextID int null
)

Store an integer column in your table called 'position'. Record a 0 for the first item in your list, a 1 for the second item, etc. Index that column in your database, and when you want to pull your values out, sort by that column.
alter table linked_list add column position integer not null default 0;
alter table linked_list add index position_index (position);
select * from linked_list order by position;
To insert a value at index 3, modify the positions of rows 3 and above, and then insert:
update linked_list set position = position + 1 where position >= 3;
insert into linked_list (my_value, position) values ('new value', 3);

A linked list can be stored using recursive pointers in the table. This is much the same way hierarchies are stored in SQL, using the recursive association pattern.
You can learn more about it here (Wayback Machine link).
I hope this helps.

The simplest option would be creating a table with a row per list item, a column for the item position, and columns for other data in the item. Then you can use ORDER BY on the position column to retrieve in the desired order.
create table linked_list
( list_id integer not null
, position integer not null
, data varchar(100) not null
);
alter table linked_list add primary key ( list_id, position );
To manipulate the list just update the position and then insert/delete records as needed. So to insert an item into list 1 at index 3:
begin transaction;
update linked_list set position = position + 1 where position >= 3 and list_id = 1;
insert into linked_list (list_id, position, data)
values (1, 3, 'some data');
commit;
Since operations on the list can require multiple commands (eg an insert will require an INSERT and an UPDATE), ensure you always perform the commands within a transaction.
A variation of this simple option is to have position incrementing by some factor for each item, say 100, so that when you perform an INSERT you don't always need to renumber the position of the following elements. However, this requires a little more effort to work out when to increment the following elements, so you lose simplicity but gain performance if you will have many inserts.
Depending on your requirements other options might appeal, such as:
If you want to perform lots of manipulations on the list and not many retrievals, you may prefer to have an ID column pointing to the next item in the list instead of using a position column. Then you need iterative logic when retrieving the list in order to get the items in order. This can be implemented relatively easily in a stored proc.
If you have many lists, a quick way to serialise and deserialise your list to text/binary, and you only ever want to store and retrieve the entire list, then store the entire list as a single value in a single column. Probably not what you're asking for here though.
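The iterative logic needed for the next-pointer option can be sketched in Python, here on the application side rather than in a stored procedure. It is a minimal illustration under assumptions: the rows are already loaded into a dict keyed by id, and the name walk_list and the row layout are hypothetical.

```python
def walk_list(rows, head_id):
    """Return the data values in list order by following next pointers.

    rows: dict mapping id -> (data, next_id), with next_id None at the tail.
    head_id: id of the first item in the list.
    """
    ordered = []
    seen = set()
    current = head_id
    while current is not None:
        if current in seen:
            # A corrupted list could loop forever; stop instead.
            raise ValueError("cycle detected in next pointers")
        seen.add(current)
        data, next_id = rows[current]
        ordered.append(data)
        current = next_id
    return ordered
```

The walk needs one lookup per item, which is why this layout suits lists that are modified often but read in full only occasionally.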

This is something I've been trying to figure out for a while myself. The best way I've found so far is to create a single table for the linked list using the following format (this is pseudo code):
LinkedList(
key1,
information,
key2
)
key1 is the starting point. key2 is a foreign key linking back to the same table, pointing at the key1 of the next row. So your rows will link together something like this:
row1
key1 = 0,
information = 'hello'
key2 = 1
key1 is the primary key of row1. key2 is a foreign key leading to the key1 of row2.
row2
key1 = 1,
information = 'wassup'
key2 = null
key2 of row2 is set to null because it doesn't point to anything.
When you first enter a row into the table, you'll need to make sure key2 is set to null or you'll get an error. After you enter the second row, you can go back and set key2 of the first row to the primary key of the second row.
This makes it best to enter many entries at a time, then go back and set the foreign keys accordingly (or build a GUI that does that for you).
Here's some actual code I've prepared (all actual code worked on MSSQL. You may want to do some research for the version of SQL you are using!):
createtable.sql
create table linkedlist00 (
key1 int primary key not null identity(1,1),
info varchar(10),
key2 int
)
register_foreign_key.sql
alter table dbo.linkedlist00
add foreign key (key2) references dbo.linkedlist00(key1)
*I put them into two separate files, because it has to be done in two steps. MSSQL won't let you do it in one step, because the table doesn't exist yet for the foreign key to reference.
A linked list is especially powerful in one-to-many relationships. Have you ever wanted to make an array of foreign keys? This is one way to do it! You can make a primary table that points to the first row in the linked-list table, and then instead of the "information" field, you can use a foreign key to the desired information table.
Example:
Let's say you have a Bureaucracy that keeps forms.
Let's say they have a table called file cabinet
FileCabinet(
Cabinet ID (pk)
Files ID (fk)
)
Each row contains a primary key for the cabinet and a foreign key for the files. These files could be tax forms, health insurance papers, field trip permission slips, etc.
Files(
Files ID (pk)
File ID (fk)
Next File ID (fk)
)
this serves as a container for the Files
File(
File ID (pk)
Information on the file
)
this is the specific file
There may be better ways to do this and there are, depending on your specific needs. The example just illustrates possible usage.

There are a few approaches I can think of right off, each with differing levels of complexity and flexibility. I'm assuming your goal is to preserve an order in retrieval, rather than requiring storage as an actual linked list.
The simplest method would be to assign an ordinal value to each record in the table (e.g. 1, 2, 3, ...). Then, when you retrieve the records, specify an order-by on the ordinal column to get them back in order.
This approach also allows you to retrieve the records without regard to membership in a list, but allows for membership in only one list, and may require an additional "list id" column to indicate to which list the record belongs.
A slightly more elaborate, but also more flexible, approach would be to store information about membership in a list or lists in a separate table. The table would need 3 columns: the list id, the ordinal value, and a foreign key pointer to the data record. Under this approach, the underlying data knows nothing about its membership in lists, and can easily be included in multiple lists.

This post is old, but I'm still going to give my $.02. Updating every record in a table or record set sounds crazy as a way to solve ordering. The amount of indexing is also crazy, but it sounds like most have accepted it.
The crazy solution I came up with to reduce updates and indexing is to create two tables (and in most use cases you don't sort all records in just one table anyway): Table A to hold the records of the list being sorted, and Table B to group them and hold a record of the order as a string. The order string represents an array that can be used to order the selected records either on the web server or in the browser layer of a web application.
Create Table A (
Id int primary key identity(1,1),
Data varchar(10) not null,
B_Id int
)
Create Table B (
Id int primary key identity(1,1),
GroupName varchar(10) not null,
[Order] varchar(max) null
)
The format of the order string should be id, position, and some separator to split() your string by. In the case of jQuery UI, the .sortable('serialize') function outputs an order string for you that is POST friendly and includes the id and position of each record in the list.
The real magic is the way you choose to reorder the selected list using the saved ordering string. This will depend on the application you are building. Here is an example, again from jQuery, to reorder the list of items: http://ovisdevelopment.com/oramincite/?p=155

https://dba.stackexchange.com/questions/46238/linked-list-in-sql-and-trees suggests a trick of using floating-point position column for fast inserts and ordering.
It also mentions specialized SQL Server 2014 hierarchyid feature.

I think it's much simpler to add a created column of Datetime type and a position column of int, so that you can have duplicate positions. In the select statement, use order by position, created desc and your list will be fetched in order.

Increment the SERIAL 'index' by 100, but manually add intermediate values with an 'index' equal to (Prev + Next) / 2. If you ever saturate the 100 rows, reorder the index back to 100s.
This should maintain sequence with primary index.

A list can be stored by having a column contain the offset (list index position) -- an insert in the middle is then incrementing all above the new parent and then doing an insert.

You could implement it like a double-ended queue (deque) to support fast push/pop/delete (if the ordinal is known) and retrieval. You would have two data structures: one with the actual data and another with the number of elements added over the history of the key. Tradeoff: this method would be slower, O(n), for any insert into the middle of the linked list.
create table queue (
primary_key,
queue_key,
ordinal,
data
)
You would have an index on queue_key+ordinal
You would also have another table which stores the number of rows EVER added to the queue...
create table queue_addcount (
primary_key,
add_count
)
When pushing a new item to either end of the queue (left or right) you would always increment the add_count.
If you push to the back you could set the ordinal...
ordinal = add_count + 1
If you push to the front you could set the ordinal...
ordinal = -(add_count + 1)
update
add_count = add_count + 1
This way you can delete anywhere in the queue/list and it would still return in order and you could also continue to push new items maintaining the order.
You could optionally rewrite the ordinal to avoid overflow if a lot of deletes have occurred.
You could also have an index on the ordinal to support fast ordered retrieval of the list.
If you want to support inserts into the middle, you would need to find the ordinal at which the item needs to be inserted, then insert with that ordinal. Then increment every ordinal following that insertion point by one. Also, increment the add_count as usual. If the ordinal is negative, you could instead decrement all of the earlier ordinals to do fewer updates. This would be O(n).
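The push and delete rules described above can be sketched as a small in-memory Python class. This is a minimal illustration, not a database implementation: the class name OrdinalQueue and its methods are assumptions, and a dict stands in for the queue table of a single queue_key.

```python
class OrdinalQueue:
    """In-memory model of the ordinal scheme for one queue_key."""

    def __init__(self):
        self.add_count = 0  # number of items EVER added, never decremented
        self.items = {}     # ordinal -> data

    def push_back(self, data):
        # Back items get the next positive ordinal.
        self.add_count += 1
        self.items[self.add_count] = data

    def push_front(self, data):
        # Front items get the next negative ordinal, so they sort first.
        self.add_count += 1
        self.items[-self.add_count] = data

    def delete(self, ordinal):
        # Deleting leaves a gap, which does not disturb the ordering.
        del self.items[ordinal]

    def in_order(self):
        # Equivalent to SELECT ... ORDER BY ordinal.
        return [self.items[ordinal] for ordinal in sorted(self.items)]
```

Because add_count only ever grows, an ordinal is never reused, which is what keeps later pushes in order even after deletes.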
