I'm planning to move from SAP SQL Anywhere + NHibernate + Envers to RavenDB 4.
I'm doing some migration and performance tests. At the moment, I'm looking for a way to migrate the Envers audit data to RavenDB revisions.
What I'm trying:
Read the source audit record.
Insert it into RavenDB.
Patch it once for every record in the audit history.
Everything is working fine, but I need to change the Raven revision timestamp from the moment of migration to the original Envers timestamp.
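Roughly what the loop looks like; this is only a simplified sketch with the RavenDB Java client (the Customer/AuditRow classes, IDs and names below are made up for illustration, and I use plain load/store/saveChanges here instead of the patching API). With revisions enabled on the collection, every saveChanges() creates a revision stamped with the time of the write, and that is the timestamp I'd like to replace with the stored Envers one:

```java
import java.util.Date;
import java.util.List;

import net.ravendb.client.documents.DocumentStore;
import net.ravendb.client.documents.session.IDocumentSession;

public class EnversToRavenMigration {

    // Hypothetical document class mirroring the audited entity.
    public static class Customer {
        public String id;
        public String name;
    }

    // Hypothetical shape of one Envers audit row read from the source database.
    public static class AuditRow {
        public String documentId;
        public String name;
        public Date revisionTimestamp; // the Envers timestamp the revision should carry
    }

    public static void migrate(List<AuditRow> auditRows) {
        DocumentStore store = new DocumentStore("http://localhost:8080", "Migrated");
        store.initialize();
        try {
            // Replay the audit rows oldest-first; with revisions enabled on the
            // collection, every saveChanges() produces a new revision.
            for (AuditRow row : auditRows) {
                try (IDocumentSession session = store.openSession()) {
                    Customer doc = session.load(Customer.class, row.documentId);
                    if (doc == null) {
                        doc = new Customer();
                        session.store(doc, row.documentId); // first write creates the document
                    }
                    doc.name = row.name;                    // apply the audited state
                    session.saveChanges();                  // the revision is stamped "now",
                                                            // not with row.revisionTimestamp
                }
            }
        } finally {
            store.close();
        }
    }
}
```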
Any clue how to do that, or perhaps a better way to do this migration?
I want to avoid creating another collection for legacy data or adding the original timestamp as metadata.
thank you
There is no way to modify revision data; that is part of the frozen, immutable nature of revisions.
Related
What is the best approach to load only the Delta into the analytics DB from a highly transactional DB?
Note:
We have a highly transactional system and we are building an analytics database out of it. At present, we are wiping all the fact and dimension tables from the analytics DB and loading the entire "processed" data set at midnight. The problem with this approach is that we are loading the same data again and again every time, along with the small amount of new data that was added/updated on that particular day. We need to load the "Delta" alone (rows which are newly inserted and old rows which got updated). Any efficient way to do this?
It is difficult to say much without knowing the details, e.g. the database schema or the database engine... However, the most natural approach for me is to use timestamps. This solution assumes that the entities (a single record in a table, or a group of related records) that are loaded/migrated from the transactional DB into the analytics one carry a timestamp.
This timestamp says when a given entity was created or last updated. While loading/migrating data, you take into account only those entities whose timestamp is greater than the date of the last migration. The advantage of this approach is that it is quite simple and does not require any specific tool. The question is whether you already have timestamps in your DB.
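A minimal sketch of that idea in plain JDBC (the orders table, its columns, and the last_migration_run bookkeeping are assumptions, not part of your schema): select only the rows whose updated_at is newer than the previous run, then record the new high-water mark.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Timestamp;

public class DeltaLoader {

    public static void loadDelta(String jdbcUrl, Timestamp lastRun) throws Exception {
        try (Connection cn = DriverManager.getConnection(jdbcUrl)) {
            // Capture the high-water mark *before* reading, so rows updated while
            // we are loading are picked up again on the next run instead of being lost.
            Timestamp thisRun = new Timestamp(System.currentTimeMillis());

            String sql = "SELECT id, customer_id, amount FROM orders WHERE updated_at > ?";
            try (PreparedStatement ps = cn.prepareStatement(sql)) {
                ps.setTimestamp(1, lastRun);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // Upsert into the analytics DB: insert if the id is new,
                        // update the existing row otherwise.
                        upsertIntoAnalytics(rs.getLong("id"),
                                            rs.getLong("customer_id"),
                                            rs.getBigDecimal("amount"));
                    }
                }
            }
            saveLastRun(thisRun); // persist the new high-water mark for the next run
        }
    }

    private static void upsertIntoAnalytics(long id, long customerId, java.math.BigDecimal amount) {
        // placeholder for the analytics-side write
    }

    private static void saveLastRun(Timestamp t) {
        // placeholder: store in a small bookkeeping table, e.g. last_migration_run
    }
}
```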
Another approach might be to utilize some kind of change tracking mechanism. For example, MS SQL Server has something like that (see this article). However, I have to admit that I've never used it, so I'm not sure if it is suitable in this case. If your database doesn't support change tracking, you can try to build it on your own based on triggers, but in general it is not an easy thing to do.
We need to load the "Delta" alone (rows which are inserted newly & the old rows which got updated). Any efficient way to do this?
You forgot the rows that got deleted, and that is the crux of the problem. Having an updated_at field on every table and polling for rows with updated_at > #last_poll_time works, more or less, but polling like this does not give you a transactionally consistent image because each table is polled at a different moment. Tracking deleted rows induces complications at the app/data-model layer, as rows have to be either logically deleted (is_deleted) or moved to an archive table (for each table!).
Another solution is to write triggers in the database: attach a trigger to each table and have the trigger write the changes that occurred into a table_history table. Again, for each table. These solutions are notoriously difficult to maintain long term in the presence of schema changes (columns added or modified, tables dropped, etc.).
But there are database-specific solutions that can help. For instance, SQL Server has Change Tracking and Change Data Capture. These can be leveraged to build an ETL pipeline that maintains an analytical data warehouse. Database schema changes are still a pain, though.
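To make the Change Tracking option more concrete, here is a hedged sketch of consuming it from application code over JDBC (the dbo.Orders table and the stored lastSyncVersion are assumptions). CHANGETABLE(CHANGES ...) returns the inserts, updates and, crucially, the deletes since a given version:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ChangeTrackingReader {

    public static void readChanges(String jdbcUrl, long lastSyncVersion) throws Exception {
        try (Connection cn = DriverManager.getConnection(jdbcUrl)) {

            // Capture the current tracking version first; it becomes the baseline
            // (lastSyncVersion) for the next run.
            long currentVersion;
            try (Statement st = cn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT CHANGE_TRACKING_CURRENT_VERSION()")) {
                rs.next();
                currentVersion = rs.getLong(1);
            }

            // Every row of dbo.Orders changed since lastSyncVersion, deletes included.
            String sql = "SELECT ct.Id, ct.SYS_CHANGE_OPERATION, o.Amount "
                       + "FROM CHANGETABLE(CHANGES dbo.Orders, " + lastSyncVersion + ") AS ct "
                       + "LEFT JOIN dbo.Orders o ON o.Id = ct.Id";
            try (Statement st = cn.createStatement();
                 ResultSet rs = st.executeQuery(sql)) {
                while (rs.next()) {
                    String op = rs.getString("SYS_CHANGE_OPERATION"); // I, U or D
                    // Apply to the warehouse: upsert on I/U, delete (or flag) on D.
                }
            }

            // Persist currentVersion so the next run only sees newer changes (not shown).
        }
    }
}
```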
There is no silver bullet, no pixie dust.
I was wondering if anyone had succeeded in auditing a native query (SQL) with Hibernate Envers? I know this is probably just wrong, but it would spare me a lot of refactoring time.
Cheers
Nick
I just want to leave my thoughts here so others might benefit when they choose Envers. We tried Hibernate Envers in one of our recent projects and it did not work out. Below are the reasons:
Hibernate Envers captures audit information only when the updates happen through the persistence context.
We did not like having one audit table for each entity. It was too much schema pollution.
We have a lot of batch jobs and data synchronization scripts that update data directly using SQL queries. Any update that happens outside the persistence context will not be captured in the Envers-created audit tables.
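To illustrate the first and third points with a minimal sketch (the Customer entity and table names are invented): a change made through the EntityManager is recorded by Envers on commit, while an equivalent native SQL update bypasses the persistence context and never reaches the _AUD table, which is also why auditing native queries, as asked above, does not really work.

```java
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;
import org.hibernate.envers.Audited;

@Entity
@Audited // Envers creates a customer_AUD table for this entity
class Customer {
    @Id
    Long id;
    String email;
}

class AuditExample {

    void updateThroughPersistenceContext(EntityManager em, Long id) {
        // Goes through the persistence context -> Envers records a revision on commit.
        Customer c = em.find(Customer.class, id);
        c.email = "new@example.com";
    }

    void updateWithNativeSql(EntityManager em, Long id) {
        // Bypasses the persistence context -> nothing is written to customer_AUD.
        em.createNativeQuery("UPDATE customer SET email = :email WHERE id = :id")
          .setParameter("email", "new@example.com")
          .setParameter("id", id)
          .executeUpdate();
    }
}
```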
So we went with a database-trigger approach with just one AUDIT table, which captures the table_name, column_name, primary_key, old_value and new_value. It worked for us.
I am working on a JHipster Spring application using AngularJS, with Liquibase managing the database. Why do we need to delete the whole database when we have made a change in our db-changelog.xml? If I add one field to an existing table in the database, I get an exception that the t_user table already exists, which means we have to remove the t_user table or lose our data. Please help and suggest any other way to change our database without deleting the whole database.
Thanks in advance
Yesterday we released version 0.11, which supports generating changelogs that contain only your changes. The changes are applied automatically to the database. There is no need to drop your database now.
Try it. http://jhipster.github.io/2014/02/19/jhipster-release-0.11.0.html
I have not used JHipster at all, but normal Liquibase usage is not to keep dropping the database but rather to append new changeSets to your db-changelog.xml file. For example, if you originally had a changeSet that created a table and later need to add a column, you append a new changeSet for it. That way you don't lose data, and Liquibase keeps track of which changes have been run against your databases.
I'm writing an application that is using a database (currently MySQL 4) to store data.
It is likely that I will make changes to it later, in the form of updates that add additional data. Updating the application is simple; it essentially comes down to overwriting the program files with the new ones. However, how do I go about updating the database schema?
The database is remote and so my application might exist in several places, so simply dumping the ALTER and CREATE statements in an installer would result in the changes being made multiple times, and I have been asked explicitly for an automatic solution that allows for the application copies to be updated over a transition period, and for schema updates to be automatic.
I considered examining the schema at start-up to look for missing tables and columns and adding them as needed; however, this does not seem like a clean solution. I also considered putting some kind of “schema version” number on the database, but I can’t see any way to do this short of a single-row table with an int “Version” column, which doesn’t seem like a good way either.
I can highly recommend Liquibase. It really does work - I've used it and was very impressed.
Essentially, it keeps its own log of the statements run on a database and runs them only if they have not already been run. It is XML-driven and allows you to use optional pre- and post-execution statements and conditions. You check your XML files into source control and invoke it from your build tool. It's even suitable for driving production releases.
It's magic.
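Worth adding, since the question asks for schema updates that happen automatically from several application copies: because Liquibase records applied changeSets in its own DATABASECHANGELOG table (and takes a lock via DATABASECHANGELOGLOCK so two instances starting at once don't apply the same change twice), you can also invoke it from the application at start-up. A minimal sketch using the Liquibase Java API (the changelog path and connection details are assumptions):

```java
import java.sql.Connection;
import java.sql.DriverManager;

import liquibase.Contexts;
import liquibase.Liquibase;
import liquibase.database.Database;
import liquibase.database.DatabaseFactory;
import liquibase.database.jvm.JdbcConnection;
import liquibase.resource.ClassLoaderResourceAccessor;

public class SchemaUpdater {

    public static void updateSchema(String jdbcUrl, String user, String password) throws Exception {
        try (Connection connection = DriverManager.getConnection(jdbcUrl, user, password)) {
            Database database = DatabaseFactory.getInstance()
                    .findCorrectDatabaseImplementation(new JdbcConnection(connection));

            // Liquibase compares the changelog against its DATABASECHANGELOG table
            // and applies only the changeSets that have not been run yet, so several
            // application instances can call this safely against the same database.
            Liquibase liquibase = new Liquibase("db-changelog.xml",
                    new ClassLoaderResourceAccessor(), database);
            liquibase.update(new Contexts());
        }
    }
}
```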
Rather than rolling your own system for versioning your database, it's probably worth looking into an existing framework that will manage it for you.
I use Liquibase and have integrated it into my build using the Maven plugin. Worth checking out!
Just as you proposed, add a table where you store the current version of the database schema. Then you only have to apply the changes between your last schema update and the new release, and set the new version number accordingly. I've done this to update our production database about 300 times; it just works.
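A minimal sketch of that mechanism in JDBC (the single-row schema_version table and the example scripts are assumptions): on start-up, read the current version and apply, in order, only the scripts with a higher number, bumping the stored version after each one.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class VersionedMigrator {

    public static void migrate(Connection cn) throws Exception {
        // Ordered map of schema version -> DDL that brings the schema to that version.
        SortedMap<Integer, String> scripts = new TreeMap<>();
        scripts.put(1, "CREATE TABLE customer (id INT PRIMARY KEY, name VARCHAR(100))");
        scripts.put(2, "ALTER TABLE customer ADD COLUMN email VARCHAR(255)");

        int current = currentVersion(cn);

        // Apply only the scripts that are newer than the database's current version.
        for (Map.Entry<Integer, String> entry : scripts.tailMap(current + 1).entrySet()) {
            try (Statement st = cn.createStatement()) {
                st.executeUpdate(entry.getValue());
                st.executeUpdate("UPDATE schema_version SET version = " + entry.getKey());
            }
        }
    }

    // Assumes a schema_version table with a single row holding the current version.
    private static int currentVersion(Connection cn) throws Exception {
        try (Statement st = cn.createStatement();
             ResultSet rs = st.executeQuery("SELECT version FROM schema_version")) {
            return rs.next() ? rs.getInt(1) : 0;
        }
    }
}
```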
I'd like to know if there is a scheme for versioning a database with SVN that will ensure no conflicts when several developers try to commit changes simultaneously.
Me and my team have been using changescripts with increasing schema version number (similar to this solution: http://odetocode.com/blogs/scott/archive/2008/02/02/versioning-databases-change-scripts.aspx ).
It's a pretty good solution, but its main flaw is that conflicts can occur when multiple developers try to commit a change script with the same schema number. It's not only a simple SVN conflict: it also requires the users with that conflict to manually change the database table holding the schema versions, revert their DB changes, and renumber their script files so that all the DB updates apply. Is it possible to avoid these obstacles? I don't mean technical solutions only; maybe there is a better way to organize this task? Any ideas?
Some of these techniques + links could help you.
From SO:
Versioning SQL Server database
Mechanisms for tracking DB schema changes
Techniques:
http://www.jilles.net/perma/2003/10/17/database-versioning-techniques/
http://martinfowler.com/articles/evodb.html
Rails solved this exact problem by using a timestamp (the migration's creation time, formatted as YYYYMMDDHHMMSS) instead of an incrementing version number. The odds of two developers creating new schema versions in the same second are pretty low.