Latest entry for append-only log in postgresql - sql

I want to store users' settings in a postgresql database.
I would like to keep full history of their settings, and also be able to query the latest settings for a given user.
I have tried storing settings in a table like this:
CREATE TABLE customer (
customer_id INTEGER PRIMARY KEY,
name VARCHAR NOT NULL
);
CREATE TABLE customer_settings (
customer_id INTEGER REFERENCES customer NOT NULL,
sequence INTEGER NOT NULL, -- start at 1 and increase, set by the application
settings JSONB NOT NULL,
PRIMARY KEY(customer_id, sequence)
);
So customer_settings is an append-only log, per customer.
Then to query latest settings I use a long query that will do a subquery to SELECT the max sequence for the given customer_id, then will select the settings for that id.
This is awkward! I wonder if there is a better way? May I use a view or a trigger to make a second table latest_customer_settings??

You can make a view. To get the settings for multiple customers in Postgres, I would recommend:
select distinct on (customer_id) cs.*
from customer_settings cs
order by customer_id, sequence desc;
And for this query, I would recommend an index on customer_settings(customer_id, sequence desc).
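Since the question asks about a latest_customer_settings view, here is a minimal sketch of how the DISTINCT ON query could be wrapped in one. The view name comes from the question; the index name and the customer id 123 are placeholders chosen for illustration.

```sql
-- Index recommended above; the index name is a placeholder.
CREATE INDEX customer_settings_latest_idx
    ON customer_settings (customer_id, sequence DESC);

-- One row per customer: the row with the highest sequence.
CREATE VIEW latest_customer_settings AS
SELECT DISTINCT ON (customer_id)
       customer_id, sequence, settings
FROM customer_settings
ORDER BY customer_id, sequence DESC;

-- Latest settings for a single customer (123 is a placeholder id).
SELECT settings
FROM latest_customer_settings
WHERE customer_id = 123;
```

The view stores nothing itself, so the append-only table remains the only place settings are written.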
In addition, you can have Postgres generate the sequence -- if you can deal with one overall sequence number for all customers.
CREATE TABLE customer_settings (
customer_settings_id bigserial primary key,
customer_id INTEGER REFERENCES customer NOT NULL,
settings JSONB NOT NULL
);
Then, the application does not need to set a sequence number. You can just insert customer_id and settings into the table.
Having the application maintain this information has some shortcomings. First, the application has to read from the database before it can insert anything into the table. Second, you can have race conditions if multiple threads are updating the table at the same time (in this case for the same customer).
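As a rough sketch of how the bigserial variant would be used: the insert supplies only customer_id and settings, and the "latest" query orders by the generated id instead of an application-maintained sequence. The customer id and the JSON value are made-up sample data, and the customer row is assumed to exist already.

```sql
-- The id is generated by Postgres; no read is needed before the insert.
INSERT INTO customer_settings (customer_id, settings)
VALUES (123, '{"theme": "dark"}');

-- Latest settings per customer, using the generated id for ordering.
SELECT DISTINCT ON (customer_id)
       customer_id, settings
FROM customer_settings
ORDER BY customer_id, customer_settings_id DESC;
```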

You can use the row_number() window function to get each customer's latest settings:
with cte as (select cs.*,
row_number() over(partition by c.customer_id order by cs.sequence desc) rn
from customer c join customer_settings cs on c.customer_id=cs.customer_id
) select * from cte where rn=1

Assuming you just want the single latest log for a given user, and also assuming that the sequence is always increasing and unique, then actually you only need a simple query:
SELECT *
FROM customer_settings
WHERE customer_id = 123
ORDER BY sequence DESC
LIMIT 1;
If you want to invest some time into creating a better logging framework, then try looking into things like MDC (Mapped Diagnostic Context). With MDC, each log statement is written out with a completely unique identifier, which also gets sent in the response header or body. Then, it becomes easy and foolproof to correlate an exception between backend and frontend or consumer.

Related

Create a row which does not violate a unique index

I have a table with Id, AccountId, PeriodId, and Comment. I have a unique index for (AccountId, PeriodId). I want to create such a row which does not violate the unique index. AccountId and PeriodId cannot be null, they are foreign keys.
My only idea is to cross join the Account and Period tables to get all valid combinations, discard the already existing combinations, and choose one from the rest.
Is it a good approach or is there a better way?
Update 1: Some of you wanted to know the reason. I need this functionality in a program which adds a new row to the current table and only allows the user to modify the row after it is already in the db. This program now uses a default constructor to create a new row with AccountId1 and PeriodId1, empty comment. If this combination is available then after insertion the user can change it and provide a meaningful comment (there can be at most one comment for each (AccountId, PeriodId)). But if the combination is not available then the original insertion will fail. So my task is to give the program a combination which is not used, so it can be safely inserted.
As it turns out my original idea is quick enough. This query returns an unused (AccountId, PeriodId).
select top 1 *
from
(
select Account.Id as AccountId, [Period].Id as PeriodId
from Account cross join [Period]
except
select AccountId, PeriodId
from AccountPeriodFilename
) as T

SQL Server : how to delete a row number

I have a SQL Server table and I am displaying it, ordered, in a table on my website by using this command
SELECT *
FROM [user]
ORDER BY idNum DESC;
This table (running on my website) has all the information my database holds (at least for the [user] table)
I have buttons to delete a row of the information (each button gets the row number that I want to delete).
What I want to ask: is there a way to delete a row using a row number? (I get a row number from the button click, and I just want to delete that row.)
You could use a CTE here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY idNum DESC) rn
FROM [user]
)
Then, in the same statement as the CTE, delete using the row number coming from the UI:
DELETE
FROM cte
WHERE rn = <some value from UI>;
But many (most?) UI frameworks, e.g. Angular, would have the ability to send the entire metadata for a user to the UI. So, you would typically be able to delete just using the idNum value coming directly from the button in the UI. As @marc_s just commented, deleting using the primary key is a safe way to do deletions.
You can use ROW_NUMBER():
DELETE FROM [user]
WHERE idNum IN (SELECT idNum FROM (SELECT idNum, ROW_NUMBER() OVER (ORDER BY
idNum DESC) AS rw FROM [user]) res WHERE res.rw = @rw)
where @rw is your row number.
Your method of using row_number() is simply wrong. It is not thread-safe -- another user could be added into the database, throwing off all the "row numbers" that you have shown to users. Gosh, a user could load a page and keep the app open for a week, and a bunch of new users could have been added or removed before the user gets around to deleting something.
Fundamentally, you have a malformed table. It doesn't have a primary key or even any unique constraints.
I would strongly advise:
create table Users (
userId int identity(1, 1) primary key,
. . .
);
This is the primary key that you should be using for deletion. You don't need to show the primary key to the user, as long as you can associate it with each row.
Primary keys are important for other reasons. In general, one uses databases because one wants to store more than one table. The primary key is how you connect the different tables to each other (using foreign key relationships).
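With such a key in place, the delete button only needs to carry the userId of its row. A minimal sketch, assuming the Users table above and a parameter named @userId supplied by the application (42 is a placeholder value):

```sql
-- @userId would normally be bound by the web application, not declared here.
DECLARE @userId int = 42;

-- Deletes at most one row, no matter how the page was sorted
-- or how many users were added since it was loaded.
DELETE FROM Users
WHERE userId = @userId;
```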

postgresql sequence number depending on rows?

I am taking a course in databases and I am unsure of how to create this view.
I have this table (PostgreSQL):
CREATE TABLE InQueue (
id INT REFERENCES Student(id),
course VARCHAR(10) REFERENCES RestrictedCourse(course_code),
since TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id,course),
UNIQUE (course,since)
);
I am supposed to create a view that lists course, id, number, where number is calculated with since. Basically the lowest since gives the queuenumber 1, the 2nd lowest gives 2 and so forth. (course,number) is unique, but not number by itself, since there are many different courses.
What I think needs to be done is to first order the table by (course,since) and then just add sequence numbers, but eventually the course will change and then the sequence numbers need to start over, beginning with 1 again.
Could someone point me in the right direction? :)
Use:
Select row_number() over (partition by course order by since asc) as yournumber, id,
course, since from InQueue
You can read about analytic functions here: http://www.postgresql.org/docs/9.4/static/tutorial-window.html
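Wrapped in a view with the columns the question asks for (course, id, number), that query might look like the sketch below. The view name queue_positions is hypothetical.

```sql
-- Numbering restarts at 1 for each course; the earliest "since" gets 1.
CREATE VIEW queue_positions AS
SELECT course,
       id,
       row_number() OVER (PARTITION BY course ORDER BY since ASC) AS number
FROM InQueue;

-- Example: the queue for each course, in order.
SELECT course, id, number
FROM queue_positions
ORDER BY course, number;
```

Because UNIQUE (course, since) rules out ties within a course, the numbering is deterministic.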

Creating a column of an SQL table to number records based on their order

I have a web program (PHP and JavaScript) that needs to display entries in the table based on how recently they were added. To accomplish this I want to have one column of the table represent the entry number while another would represent what it should be displayed at.
For example, if I had 10 records, ID=10 would correspond to Display=1 . I was wondering if there would be a simple way to update this, ordering by the entry ID and generating the display IDs accordingly.
Your question is a little vague, but here goes....
Normally IDs ascend, with the highest ID being the most recently added, so you can ORDER by ID desc in your query to determine which should be displayed. The results you get from the query will be the display order.
Why aren't you making use of SQL Server's default column values? Have a look here to see an example: Add default value of datetime field in SQL Server to a timestamp
For example, you have a table like this:
create table test (
entry_id int,
message varchar(100),
created_time datetime default GETDATE()
);
Then you can insert like
insert into test (entry_id, message) values (1, 'test1');
insert into test (entry_id, message) values (2, 'test2');
And select like
select entry_id, message from test order by created_time desc
There are lots of ways to do this. As the others have noted, it wouldn't be common practice to store the reverse or inverted id like this. You can get the display_id several ways. These come to mind quickly:
CREATE TABLE test (entry_id INT)
GO
INSERT INTO test VALUES (1),(2),(3)
GO
--if you trust your entry_id is truly sequential 1 to n you can reverse it for the display_id using a subquery
SELECT entry_id,
(SELECT MAX(entry_id) + 1 FROM test) - entry_id display_id
FROM test
--or a cte
;WITH cte AS (SELECT MAX(entry_id) + 1 max_id FROM test)
SELECT entry_id,
max_id - entry_id display_id
FROM test
CROSS APPLY
cte
--more generally you can generate a row number easily since sql server 2005
SELECT entry_id
,ROW_NUMBER() OVER (ORDER BY entry_id DESC) display_id
FROM test
You could use any of those methods in a view to add a display id. Normally I'd recommend you just let your presentation layer handle the numbering for display, but if you intend to query back against it you might want to persist it. I can only see storing it if the writes are infrequent relative to reads. You could update a "real" column using a trigger. You could also create a persisted computed column.
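For the view option, a minimal sketch using the ROW_NUMBER() variant and the test table from above could look like this. The view name test_with_display_id is made up for illustration.

```sql
-- CREATE VIEW must be the first statement in its batch, hence the GO.
CREATE VIEW test_with_display_id AS
SELECT entry_id,
       ROW_NUMBER() OVER (ORDER BY entry_id DESC) AS display_id
FROM test;
GO

-- The newest entry (highest entry_id) gets display_id = 1.
SELECT entry_id, display_id
FROM test_with_display_id
ORDER BY display_id;
```

Unlike the MAX(entry_id) arithmetic, this does not depend on entry_id being gap-free.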

SQL Server view design

I have a database that I'm running large queries against that I want to simplify with a view. Though there's more of them, the tables basically look like this (pseudo code):
TABLE document (
Id int PRIMARY KEY,
/*more metadata columns*/
)
TABLE name (
Id int PRIMARY KEY,
documentId int FOREIGN KEY REFERENCES document.Id,
date DATETIME,
text varchar(MAX)
)
TABLE description (
Id int PRIMARY KEY,
documentId int FOREIGN KEY REFERENCES document.Id,
date DATETIME,
text varchar(MAX)
)
So the idea is that the 'document' table contains the basic information about a document and the Id that ties the rest of the tables to the document. All other tables are for individual attributes of the document that are updateable. Each update gets its own row with a timestamp. What I want the view to pull is one row per document with the most up to date versions of each attribute contained in the other tables (if this needs further elaboration or an example, please let me know and I will provide). What is the least convoluted way I can pull this off? Using SQL Server 2008.
A view won't increase efficiency. A view is just a macro that expands.
There is no magic in a view, but performance can suffer if you join onto the view because the expanded queries can get massive.
You can index a view, but indexed views work best with Enterprise Edition unless you want to use the NOEXPAND hint all over.
That said, the query is quite easy, unless you want to index the view, in which case you have limitations.
One approach is the CTE as per Stuart Ainsworth's answer. Another is the "Max one per group" approach I described on dba.se. Neither of these is safe for indexed views.
You could use a CTE for each attribute inside the view to return the latest attribute values for the documentid, like so:
; WITH cName AS
(SELECT *
FROM (SELECT ID, documentID,
date, text,
ranking = ROW_NUMBER () OVER (PARTITION BY documentID ORDER BY date DESC)
FROM name) x
WHERE ranking = 1),
.... [more CTE's here]
SELECT columnlist
FROM document d JOIN cName cn ON d.id=cn.documentid
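Spelled out for just the two attribute tables shown in the question, a view along these lines might look like the sketch below. The view name document_latest and the output column aliases are assumptions. It uses LEFT JOIN rather than the inner JOIN above so that documents missing an attribute still appear; switch to JOIN if that is not wanted.

```sql
CREATE VIEW document_latest AS
WITH cName AS (
    -- Rank each document's name rows, newest first.
    SELECT documentId, [text],
           ROW_NUMBER() OVER (PARTITION BY documentId ORDER BY [date] DESC) AS ranking
    FROM name
),
cDescription AS (
    -- Same ranking for the description rows.
    SELECT documentId, [text],
           ROW_NUMBER() OVER (PARTITION BY documentId ORDER BY [date] DESC) AS ranking
    FROM description
)
SELECT d.Id,
       cn.[text] AS name_text,
       cd.[text] AS description_text
FROM document d
LEFT JOIN cName cn
       ON cn.documentId = d.Id AND cn.ranking = 1
LEFT JOIN cDescription cd
       ON cd.documentId = d.Id AND cd.ranking = 1;
```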
SQL Server 2008 supports computed columns in an index. So you could set a column, "is_latest", to 1 for the row with the latest time for that document_id. Then, while querying, you could use the is_latest column and it would be much faster. Refer to http://msdn.microsoft.com/en-us/library/ms189292.aspx