HSQLDB 2.5.0 SYSTEM VERSIONING

I would like to use the system versioning that comes with version 2.5.0, but when I run a query like:
SELECT firstname, lastname, email FROM customer FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP - 1 YEAR
it does not find any records, even though some have just been inserted ...

Your query means "return the rows that existed a year ago". Because you inserted the rows recently, or added system versioning recently, the query returns no rows.
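To make the distinction concrete, here is a sketch of the three common forms of the query (table and column names taken from the question):

```sql
-- Current rows only (the default; no FOR SYSTEM_TIME clause needed):
SELECT firstname, lastname, email FROM customer;

-- Rows exactly as they existed one year ago; empty if the data (or the
-- table's system versioning) is younger than that:
SELECT firstname, lastname, email FROM customer
  FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP - 1 YEAR;

-- Every row version that was valid at any point during the last year,
-- including rows inserted recently:
SELECT firstname, lastname, email FROM customer
  FOR SYSTEM_TIME FROM CURRENT_TIMESTAMP - 1 YEAR TO CURRENT_TIMESTAMP;
```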

Since I have worked out some of the things I wanted to do with the system versioning that comes with HSQLDB 2.5.x, I am posting this answer.
To retrieve the previous version of a row (that is, the last version before the current one), or more generally to find all the UPDATEs made over the last year, simply execute the query:
SELECT previous.* FROM customer AS current, customer FOR SYSTEM_TIME
FROM CURRENT_TIMESTAMP - 1 YEAR TO CURRENT_TIMESTAMP AS previous
WHERE current.customerid = ? AND previous.customerid = ? AND
current.start = previous.stop;
Assuming that:
start is the column declared as TIMESTAMP GENERATED ALWAYS AS ROW START.
stop is the column declared as TIMESTAMP GENERATED ALWAYS AS ROW END.
customerid is the primary key of the customer table, since I need to find the UPDATEs for each customer.
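For reference, a table matching these assumptions might be declared roughly as follows. This is a sketch of the HSQLDB 2.5 syntax; the column names start and stop come from the question and may need quoting if they clash with keywords:

```sql
CREATE TABLE customer (
    customerid INTEGER PRIMARY KEY,
    firstname  VARCHAR(100),
    lastname   VARCHAR(100),
    email      VARCHAR(200),
    -- the period columns are maintained by the database, never by the application
    start TIMESTAMP GENERATED ALWAYS AS ROW START,
    stop  TIMESTAMP GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME(start, stop)
) WITH SYSTEM VERSIONING;
```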
I still have to work out the INSERT and DELETE cases; I will come back for that.
PS: I didn't invent anything; this comes from the DB2 documentation, so it may not be optimal, but it works with HSQLDB system versioning.
Edit: This answer is not satisfactory, see here for something much better

Related

How should I reliably mark the most recent row in a SQL Server table?

The existing design for this program is that all changes are written to a changelog table with a timestamp. In order to obtain the current state of an item's attribute we JOIN onto the changelog table and take the row having the most recent timestamp.
This is a messy way to keep track of current values, but we cannot readily change this changelog setup at this time.
I intend to slightly modify the behavior by adding an "IsMostRecent" bit to the changelog table. This would allow me to simply pull the row having that bit set, as opposed to the MAX() aggregation or recursive seek.
What strategy would you employ to make sure that bit is always appropriately set? Or is there some alternative you suggest which doesn't affect the current use of the logging table?
Currently I am considering a trigger approach which, on INSERT, turns the bit off for all other rows and then turns it on for the newly inserted row.
I've done this before by having a "MostRecentRecorded" table which simply holds the most recently inserted record (log id and entity id), maintained by a trigger.
Having an extra column for this isn't right - and can get you into problems with transactions and reading existing entries.
In the first version of this it was a simple case of
BEGIN TRANSACTION
INSERT INTO simlog (entityid, logmessage)
VALUES (11, 'test');
UPDATE simlogmostrecent
SET lastid = @@IDENTITY
WHERE simlogentityid = 11
COMMIT
Ensuring that the MostRecent table had an entry for each record in SimLog can be done in the query but ISTR we did it during the creation of the entity that the SimLog referred to (the above is my recollection of the first version - I don't have the code to hand).
However, the simple version caused problems with multiple writers, as it could cause a deadlock or transaction failure; so it was moved into a trigger.
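A trigger version might look roughly like this T-SQL sketch (table and column names follow the snippet above; id is assumed to be simlog's identity column). The grouping over the inserted pseudo-table handles multi-row inserts, which @@IDENTITY alone does not:

```sql
CREATE TRIGGER trg_simlog_mostrecent ON simlog
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- For each entity touched by this insert, point the "most recent"
    -- table at the newest log row.
    UPDATE m
    SET m.lastid = i.maxid
    FROM simlogmostrecent AS m
    JOIN (SELECT entityid, MAX(id) AS maxid
          FROM inserted
          GROUP BY entityid) AS i
        ON m.simlogentityid = i.entityid;
END
```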
Edit: Started this answer before Richard Harrison answered, promise :)
I would suggest another table with the structure similar to below:
VersionID  TableName  UniqueVal  LatestPrimaryKey
1          Orders     209        12548
2          Orders     210        12549
3          Orders     211        12605
4          Orders     212        10694
VersionID -- being the tables key
TableName -- just in case you want to roll out to multiple tables
UniqueVal -- is whatever groups multiple rows into a single item with history (eg Order Number or some other value)
LatestPrimaryKey -- is the identity key of the latest row you want to use.
Then you can simply JOIN to this table to return only the latest rows.
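As a sketch, the lookup becomes a plain join (the name of Orders' identity column is an assumption here):

```sql
SELECT o.*
FROM LatestRowTable AS l
JOIN Orders AS o
    ON o.OrderID = l.LatestPrimaryKey   -- only the latest row per UniqueVal matches
WHERE l.TableName = 'Orders';
```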
If you already have a trigger inserting rows into the changelog table this could be adapted:
INSERT INTO [MyChangelogTable]
(Primary, RowUpdateTime)
VALUES (@PrimaryKey, GETDATE())
-- Add onto it:
UPDATE [LatestRowTable]
SET [LatestPrimaryKey] = @PrimaryKey
WHERE [TableName] = 'Orders'
AND [UniqueVal] = @OrderNo
Alternatively it could be done as a merge to capture inserts as well.
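Such a MERGE might look like this sketch (variable names mirror the trigger snippet; the row for a unique value is inserted when missing and updated otherwise):

```sql
MERGE [LatestRowTable] AS t
USING (SELECT 'Orders'    AS TableName,
              @OrderNo    AS UniqueVal,
              @PrimaryKey AS LatestPrimaryKey) AS s
    ON t.TableName = s.TableName
   AND t.UniqueVal = s.UniqueVal
WHEN MATCHED THEN
    UPDATE SET LatestPrimaryKey = s.LatestPrimaryKey
WHEN NOT MATCHED THEN
    INSERT (TableName, UniqueVal, LatestPrimaryKey)
    VALUES (s.TableName, s.UniqueVal, s.LatestPrimaryKey);
```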
One thing that comes to mind is to create a view that does all the messy MAX() queries, etc. behind the scenes. Then you should be able to query against the view. This way you would not have to change your current setup, just move all the messiness to one place.
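As a sketch (the changelog's item and timestamp column names are assumptions, not from the question), such a view could look like:

```sql
CREATE VIEW CurrentState AS
SELECT c.*
FROM changelog AS c
JOIN (SELECT itemid, MAX(changed_at) AS max_changed_at
      FROM changelog
      GROUP BY itemid) AS latest          -- newest timestamp per item
    ON latest.itemid = c.itemid
   AND latest.max_changed_at = c.changed_at;
```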

SQL Latest Date by circuit ID

I would like help with this SQL. I need to find the latest date for each circuit ID (field name: strip_ec_circuit_id) based on a created date (field name: create_bill_date). I need to keep only the latest date; the other rows can be deleted. Can you help me do this?
This is a basic group by query that should work in any database:
select strip_ec_circuit_id, max(create_bill_date) as lastDate
from t
group by strip_ec_circuit_id
I'm not sure what you mean by delete all the others. Do you actually want to delete the rows from the table that are not the max?
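If you do want to delete everything but the latest row per circuit, a correlated delete is the usual sketch (back up first; note that MySQL refuses to reference the target table in the subquery unless you wrap it in a derived table):

```sql
-- Delete every row that is older than the newest row for the same circuit.
DELETE FROM t
WHERE create_bill_date < (SELECT MAX(t2.create_bill_date)
                          FROM t AS t2
                          WHERE t2.strip_ec_circuit_id = t.strip_ec_circuit_id);
```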

How to query the number of changes that have been made to a particular column in SQL

I have a database with a column, and I want to query the number of times it has changed over a period of time. For example, I have the username, the user's level, and the date. How do I query this database to see the number of times the user's level has changed over x number of years?
(I've looked in other posts on stackoverflow, and they're telling me to use triggers. But in my situation, I want to query the database for the number of changes that has been made. If my question can't be answered, please tell me what other columns might I need to look into to figure this out. Am I supposed to use Lag for this? )
A database will not inherently capture this information for you. Two suggestions: either store your data as a time series, so that instead of updating the value in place you add a new row as the new current value and expire the old one; or add a new column that tracks the number of updates to the column you care about. Either could be done in application code or in a trigger.
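The time-series suggestion can be sketched like this (table and column names are illustrative, not from the question):

```sql
-- Instead of UPDATEing the level in place, close the current row...
UPDATE user_level
   SET valid_to = CURRENT_TIMESTAMP
 WHERE username = 'alice'
   AND valid_to IS NULL;          -- the open-ended row is the current one

-- ...and insert a new current row carrying the new level.
INSERT INTO user_level (username, level, valid_from, valid_to)
VALUES ('alice', 7, CURRENT_TIMESTAMP, NULL);
```

Counting level changes over a period is then just a filtered COUNT over the rows' valid_from timestamps.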
Have you ever heard of the term LOG?
You have to create a new table in which you will store the changes you want to track.
I can imagine this structure for the table:
id - int, primary key, auto increment
table - the name of the table where the info has been changed
table_id - the unique id of the row, in that table, where changes have been made
year - integer
month - integer
day - integer
Knowing this, you can count everything.
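With a log table along those lines, the count is a simple filtered query. (A sketch: I've called the table change_log and renamed the table column to table_name, since table is a reserved word in most databases; the ids are made up.)

```sql
SELECT COUNT(*) AS level_changes
FROM change_log
WHERE table_name = 'users'
  AND table_id = 42        -- the user's row id in the users table
  AND year >= 2010;        -- however many years back you care about
```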
In case you are already keeping track of the level history by adding a new row with a different level and date every time a user changes level:
SELECT username, COUNT(date) - 1 AS changes
FROM table_name
WHERE date >= '2011-01-01'
GROUP BY username
That will give you the number of changes since Jan 1, 2011. Note that I'm subtracting 1 from the COUNT: a user with a single row in your table has never changed levels; that row represents the user's initial level.

Efficient sliding window sum over a database table

A database has a transactions table with columns: account_id, date, transaction_value (signed integer). Another table (account_value) stores the current total value of each account, which is the sum of all transaction_values per account. It is updated with a trigger on the transactions table (i.e., INSERTs, UPDATEs and DELETEs to transactions fire the trigger to change the account_value.)
A new requirement is to calculate the account's total transaction value only over the last 365 days. Only the current running total is required, not previous totals. This value will be requested often, almost as often as the account_value.
How would you implement this "sliding window sum" efficiently? A new table is ok. Is there a way to avoid summing over a year's range every time?
This can be done with standard windowing functions:
SELECT account_id,
sum(transaction_value) over (partition by account_id order by date)
FROM transactions
The ORDER BY inside the OVER() clause makes the sum a running ("sliding") sum.
For the "only the last 365 days" part you'd need to limit the rows in the WHERE clause.
The above works in PostgreSQL, Oracle, DB2 and (I think) Teradata. SQL Server does not support the ORDER BY in the window definition (the upcoming Denali version will, AFAIK).
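Where the database supports RANGE frames with an interval offset (PostgreSQL 11+ and Oracle do), the 365-day limit can go straight into the frame clause instead of the WHERE clause; a sketch:

```sql
SELECT account_id,
       date,
       -- sum only the rows whose date falls within the 365 days
       -- ending at the current row's date
       SUM(transaction_value) OVER (
           PARTITION BY account_id
           ORDER BY date
           RANGE BETWEEN INTERVAL '365' DAY PRECEDING AND CURRENT ROW
       ) AS rolling_365_day_total
FROM transactions;
```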
As simple as this?
SELECT
SUM(transaction_value), account_id
FROM
transactions t
WHERE
-- SQL Server, Sybase t.DATE >= DATEADD(year, -1, GETDATE())
-- MySQL t.DATE >= DATE_SUB(NOW(), INTERVAL 12 MONTH)
GROUP BY
account_id;
You may want to remove the time component from the date expressions using DATE (MySQL) or this way in SQL Server
If queries of the transactions table are more frequent than inserts to the transactions table, then perhaps a view is the way to go?
You are going to need a one-off script to populate the existing table with values for the preceding year for each existing record - that will need to run for the whole of the previous year for each record generated.
Once the rolling year column is populated, one alternative to summing the previous year would be to derive each new record's value as the previous record's rolling year value, plus the transaction value(s) since the last update, minus the transaction values between one year prior to the last update and one year ago from now.
I suggest trying both approaches against realistic test data to see which will perform better - I would expect summing the whole year to perform at least as well where data is relatively sparse, while the difference method may work better if data is to be frequently updated on each account.
I'll avoid any actual SQL here as it varies a lot depending on the variety of SQL that you are using.
You say that you have a trigger to maintain the existing running total.
I presume that it also (or perhaps a nightly process) creates new daily records in the account_value table. Then INSERTs, UPDATEs and DELETEs fire the trigger to add or subtract from the existing running total?
The only changes you need to make are:
- add a new field, "yearly_value" or something
- have the existing trigger update that in the same way as the existing field
- use gbn's type of answer to create today's records (or however far you backdate)
- but initialise each new daily record in a slightly different way...
When you insert a new row for a new day, its value should be initialised to yesterday's rolling value minus the transaction values from the day that has just dropped out of the window (365 days ago). After that, the behaviour should be identical to what you're already used to.
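Sketched in SQL (all table and column names here are assumptions about your schema, and date arithmetic syntax varies by database), the daily initialisation might be:

```sql
-- Seed today's row from yesterday's rolling total, dropping the
-- transactions that have just fallen out of the 365-day window.
INSERT INTO account_value (account_id, value_date, yearly_value)
SELECT av.account_id,
       CURRENT_DATE,
       av.yearly_value
       - COALESCE((SELECT SUM(t.transaction_value)
                   FROM transactions AS t
                   WHERE t.account_id = av.account_id
                     AND t.date = CURRENT_DATE - 365), 0)
FROM account_value AS av
WHERE av.value_date = CURRENT_DATE - 1;
```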

How to get last inserted records(row) in all tables in Firebird database?

I have a problem. I need to get the last inserted row in every table in a Firebird db. Furthermore, these rows must contain a specified column name. I read some articles about the rdb$ system tables but have little experience with them.
There is no reliable way to get the "last row inserted" unless the table has a timestamp field which stores that information (an insertion timestamp).
If the table uses an integer PK generated by a sequence (a generator, in Firebird lingo), then you could query for the highest PK value, but this isn't reliable either.
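For the record, reading a generator's current value without incrementing it looks like this (the generator name is an assumption; note that the value may belong to a transaction that was later rolled back, which is part of why this is unreliable):

```sql
SELECT GEN_ID(GEN_CUSTOMER_ID, 0) AS last_generated_id
FROM RDB$DATABASE;
```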
There is no concept of 'last row inserted'. Visibility and availability to other transactions depends on the time of commit, transaction isolation specified etc. Even use of a generator or timestamp as suggested by ain does not really help because of this visibility issue.
Maybe you are better off specifying the actual problem you are trying to solve.
ID = GEN_ID(ID_HEADER, 0) + 1;  /* PSQL assignment; fixes the ID_HEDER typo and
                                   avoids the needless SELECT ... FROM ANY_TABLE */
INSERT INTO INVOICE_HEADER (No, Date_of, Etc) VALUES ('122', '2013-10-20', 'Any text');
/* The ID column of INVOICE_HEADER takes its value from the generator above,
   so now we check whether ID = GEN_ID(ID_HEADER, 0). Beware: with concurrent
   writers this check is racy and can fail spuriously. */
IF (ID = GEN_ID(ID_HEADER, 0)) THEN
BEGIN
    INSERT INTO INVOICE_FOOTER (RELACION_ID, TEXT, Etc) VALUES (:ID, 'Text', Etc);
END
ELSE
    EXCEPTION E_CONCURRENT_INSERT;  /* "REVERT TRANSACTION" is not valid PSQL;
                                       raising a declared exception undoes the work */
That is all