I'm looking over the DynamoDB documentation and it looks like they have optimistic locking. I'm wondering whether this is used by default or not.
From the documentation, it looks like you need to code up the Java application to use the @DynamoDBVersionAttribute annotation and get and set the versions. Without doing this, it looks like you can write to DynamoDB without any sort of locking.
Is that correct?
On a side note, I'm not too familiar with DBs that don't have some sort of locking, so what would happen if two people wrote to the same item at the same time in DynamoDB without any locking? Say the item we're writing to has 4 fields: would one write completely fail, or is it possible that DynamoDB updates 2 of the 4 fields with one write and the other 2 fields with the other write?
You are correct. DynamoDB does NOT have optimistic locking by default. There are various SDKs for DynamoDB and as far as I am aware the only one which provides optimistic locking functionality is the Java SDK.
Here's what the Java SDK optimistic locking actually supports:
Creates an attribute in your table called version
You must load an item from the database before updating it
When you try to save an item, the SDK checks that the client item's version number matches the one in the table; if it does, the save completes and the version number is incremented
This is pretty simple to implement yourself if you are using a different SDK. You would create the version attribute yourself, create a wrapper for the putItem method (and any other save/update operations you need), and use a Condition Expression to test that the version number in the database is one less than the version you are saving.
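Since the answer above is SDK-agnostic, here is a minimal sketch of such a wrapper using the AWS SDK for .NET's low-level API. The table name, key and version attribute names are placeholder assumptions, not anything the answer prescribes:

```csharp
// A minimal sketch of a version-checked write (optimistic locking by hand).
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

public class VersionedWriter
{
    private readonly IAmazonDynamoDB _client = new AmazonDynamoDBClient();

    // Saves the item only if the stored version still equals the version we loaded.
    public async Task SaveAsync(string id, Dictionary<string, AttributeValue> attributes, long loadedVersion)
    {
        var item = new Dictionary<string, AttributeValue>(attributes)
        {
            ["Id"] = new AttributeValue { S = id },
            // Write the incremented version alongside the data.
            ["Version"] = new AttributeValue { N = (loadedVersion + 1).ToString() }
        };

        var request = new PutItemRequest
        {
            TableName = "MyTable",
            Item = item,
            // Reject the write if someone else has bumped the version since we read it.
            ConditionExpression = "#v = :expected",
            ExpressionAttributeNames = new Dictionary<string, string> { ["#v"] = "Version" },
            ExpressionAttributeValues = new Dictionary<string, AttributeValue>
            {
                [":expected"] = new AttributeValue { N = loadedVersion.ToString() }
            }
        };

        try
        {
            await _client.PutItemAsync(request);
        }
        catch (ConditionalCheckFailedException)
        {
            // The item changed since it was read: reload and retry, or surface the conflict.
            throw;
        }
    }
}
```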
To answer the second part of your question, both writes would succeed (assuming you put no conditions on them). The first would make whatever changes it specified, and the second would come along and overwrite them.
DynamoDB doesn't support optimistic locking by default. As you mentioned, you need to use the @DynamoDBVersionAttribute annotation on the Java model class in order to use optimistic locking.
If two threads write to the same item, the DynamoDB item will end up with the data from the last write (i.e. the last thread to write wins).
In an earlier blog post written by members of the CockroachDB team: https://www.cockroachlabs.com/blog/sql-in-cockroachdb-mapping-table-data-to-key-value-storage/, the author states that CockroachDB's key-value API supports a ConditionalPut(key, value, expected-value). Given that CockroachDB was built on RocksDB, how were they able to support a conditional put?
CockroachDB implements ConditionalPut using the same mechanism it uses for ACID read-write transactions. Key-values are stored along with a multi-version concurrency control timestamp. To do a ConditionalPut, the storage client reads the existing value "as of" the same timestamp it's going to write the new value at. Since the write being discussed here is the write to a secondary index, there's already an implicit or explicit transaction happening, so there's no extra overhead beyond the read to check the precondition.
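To make the mechanism concrete, here is a purely conceptual sketch of that read-then-compare-then-write logic over an MVCC store. IMvccStore, ConditionFailedException and the method names are invented for illustration; they are not CockroachDB's actual API:

```csharp
using System;
using System.Linq;

public interface IMvccStore
{
    byte[] Get(byte[] key, long timestamp);              // read the value visible "as of" timestamp
    void Put(byte[] key, byte[] value, long timestamp);  // write a new version at timestamp
}

public class ConditionFailedException : Exception
{
    public byte[] ActualValue { get; }
    public ConditionFailedException(byte[] actual) => ActualValue = actual;
}

public static class ConditionalWriter
{
    public static void ConditionalPut(IMvccStore store, byte[] key, byte[] value, byte[] expected, long timestamp)
    {
        // Read the current value at the same timestamp the new value will be written at.
        var current = store.Get(key, timestamp);

        if (!(current ?? Array.Empty<byte>()).SequenceEqual(expected ?? Array.Empty<byte>()))
            throw new ConditionFailedException(current);  // precondition failed; report the actual value

        store.Put(key, value, timestamp);                  // precondition held, write the new version
    }
}
```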
Dacpac is a nice solution for versioning the schema, and we have to use pre/post-deployment scripts to amend the reference data.
Any better solution to do that?
The best way I have seen is to use MERGE statements, one table per file, and import them into your post-deploy script using :r imports.
You get version history and easily comparable data, and using sp_generate_merge makes it really simple.
Ed
If you're looking for a solution in SSDT to handle reference data that does not involve the use of pre/post-deployment scripts, unfortunately there currently isn't one.
But it is currently one of the most requested features in SSDT so perhaps there's a chance it will get implemented some time in the future.
I am the maintainer of the sp_generate_merge OSS utility Ed mentioned, and at Redgate we recommend this offline approach to handling reference data to our customers in the following circumstances:
If the data within the table is changing very frequently, as the one-file-per-table approach allows for branching/merging of concurrent changes from multiple developers
If the data in the table contains environment-specific values, like application settings, as this method allows you to use SQLCMD variables and feed the values in from a deployment tool
Where the offline approach can be problematic:
Non-determinism of the MERGE statement: before actually running the deployment against your target environment, it can be difficult to know what changes will be applied (if any). In the worst case, you could hit one of the documented issues with MERGE.
The workflow isn't necessarily the most natural way to edit data, as it requires running the utility proc and copying+pasting the output back into the original file. Editing the file directly is an alternative, but isn't the most user-friendly experience especially with large amounts of reference data
Co-ordinating changes to both the schema and the data within the reference table can be a challenge, given that SSDT is still responsible for applying the schema changes. For example, if you want to add a new NOT NULL column without a default.
Another solution involves following an online approach, which is supported by our SSDT-alternative, ReadyRoll database projects. It allows the data to be edited directly in the database and subsequently imported into the project, with the sync script (i.e. containing INSERT, UPDATE, DELETE statements instead of MERGE) generated by its data comparison tool, alongside any schema changes.
You can read more about how the offline and online approaches differ in the ReadyRoll documentation.
I want to make a sequence of in-memory operations atomic. I presume there is no framework-supplied functionality for this and that I would have to implement my own rollback functionality using memento (or something)?
If it needs to be really atomic, there is no such thing AFAIK in the Framework itself - here is an interesting link discussing this issue.
What you are asking for is called STM (Software Transactional Memory), and it is an inherent part of, for example, Haskell.
Basically any implementation uses some sort of copy mechanism - either keeping the old data until the transaction is committed, OR making a copy first, doing all the "changes" on the copy, and switching references on commit... anyway, there is always some log and/or copying mechanism involved...
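As a rough illustration of the second variant (work on a copy, switch the reference on commit), here is a minimal C# sketch. It is not a full STM, just atomic visibility of a batch of changes to a state object that you treat as immutable:

```csharp
using System;
using System.Threading;

public sealed class AtomicBox<T> where T : class
{
    private T _current;

    public AtomicBox(T initial) => _current = initial;

    public T Value => Volatile.Read(ref _current);

    // Applies 'update' to a snapshot copy and retries if another thread
    // committed a change in the meantime (optimistic, lock-free).
    public T Commit(Func<T, T> update)
    {
        while (true)
        {
            T snapshot = Volatile.Read(ref _current);
            T changed = update(snapshot);                  // all "changes" happen on a copy
            if (Interlocked.CompareExchange(ref _current, changed, snapshot) == snapshot)
                return changed;                            // reference switched atomically
        }
    }
}
```

Usage would look something like box.Commit(s => s.WithUpdatedFields(...)), which only works cleanly if the state object is immutable; the "keep the old data until commit" variant instead logs the original values and writes them back on rollback.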
For C# check these links out:
http://research.microsoft.com/en-us/downloads/6cfc842d-1c16-4739-afaf-edb35f544384/default.aspx
http://download.microsoft.com/download/9/5/6/9560741A-EEFC-4C02-822C-BB0AFE860E31/STM_User_Guide.pdf
http://blogs.msdn.com/b/stmteam/
If F# is an option then check these links out:
http://cs.hubfs.net/blogs/hell_is_other_languages/archive/2008/01/16/4565.aspx
http://geekswithblogs.net/Podwysocki/archive/2008/02/07/119387.aspx
Another option could be to use an "in-memory database" - there are several out there with transaction support, thus providing atomic operations via the DB... as long as the DB is in-memory it should perform well.
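For illustration, here is a small sketch of that idea using an in-memory SQLite database via the Microsoft.Data.Sqlite package (one option among several; the table and values are placeholders):

```csharp
using Microsoft.Data.Sqlite;

using var connection = new SqliteConnection("Data Source=:memory:");
connection.Open();

using (var setup = connection.CreateCommand())
{
    setup.CommandText = "CREATE TABLE Account (Id INTEGER PRIMARY KEY, Balance INTEGER)";
    setup.ExecuteNonQuery();
}

// Either both updates become visible, or neither does.
using var transaction = connection.BeginTransaction();
try
{
    using var debit = connection.CreateCommand();
    debit.Transaction = transaction;
    debit.CommandText = "UPDATE Account SET Balance = Balance - 10 WHERE Id = 1";
    debit.ExecuteNonQuery();

    using var credit = connection.CreateCommand();
    credit.Transaction = transaction;
    credit.CommandText = "UPDATE Account SET Balance = Balance + 10 WHERE Id = 2";
    credit.ExecuteNonQuery();

    transaction.Commit();   // atomically publish the whole batch
}
catch
{
    transaction.Rollback(); // undo everything done inside the transaction
    throw;
}
```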
I have an entity with multiple fields. There are two types of actions that may be performed on it: a long one, usually initiated by the user, and a short one, which is periodically run by the system. Both of these update the entity, but they touch different fields.
There can't be two concurrent long operations or two concurrent short operations. But the system may schedule a short operation while a long operation is in progress, and the two should execute concurrently. Since they touch different fields, I believe this should be possible.
I think NHibernate's change tracking should do the trick here - i.e., if one session loads an entity and updates some fields, and another session loads the same entity and updates different fields, then the two will not collide. However, I feel I shouldn't be relying on this because it sounds like an "optimization" or an "implementation detail". I tend to think of change tracking as an optimization to reduce database traffic; I don't want the functionality of the system to depend on it. Also, if I ever decide to implement optimistic concurrency for this entity, then I risk getting a StaleObjectStateException, even though I can guarantee that there is no actual collision.
What would be the best way to achieve this? Should I split the entity into two? Can't this affect database consistency (e.g. what if only one "half" of the entity is in the DB)? Can I use NHibernate to explicitly set only a single field of an entity? Am I wrong in not wanting to rely on change tracking to achieve functionality?
If it matters, I'm using Fluent NHibernate.
You could map the entity using dynamic update.
dynamic-update (optional, defaults to false): Specifies that UPDATE SQL should be generated at runtime and contain only those columns whose values have changed.
If you enable dynamic-update, you will have a choice of optimistic locking strategies (see the sketch after the list):
version: check the version/timestamp columns
all: check all columns
dirty: check the changed columns
none: do not use optimistic locking
More information here.
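Since you mention Fluent NHibernate, a minimal sketch of such a mapping might look like this (the Job entity and its properties are made-up placeholders for your "long" and "short" operation fields):

```csharp
using System;
using FluentNHibernate.Mapping;

public class Job
{
    public virtual int Id { get; set; }
    public virtual string LongOperationResult { get; set; }  // written by the user-initiated operation
    public virtual DateTime LastHeartbeat { get; set; }      // written by the periodic system operation
}

public class JobMap : ClassMap<Job>
{
    public JobMap()
    {
        Id(x => x.Id);
        Map(x => x.LongOperationResult);
        Map(x => x.LastHeartbeat);

        DynamicUpdate();          // UPDATE statements contain only the changed columns
        OptimisticLock.Dirty();   // only those changed columns take part in the optimistic lock check
    }
}
```

With a mapping along those lines, two sessions that update disjoint sets of columns should each generate an UPDATE that only touches (and checks) its own columns, so they should not collide.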
In other words, what are the main reasons to use it?
Thanks
Versioning is commonly used to implement a form of optimistic concurrency control. In tables that can be accessed from different sources at the same time, a column named version is added. NHibernate notes down the version of an object when it reads it, and when it tries to update it, it first checks that the version hasn't changed. On updating a row, the version column is incremented.
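For example, a minimal Fluent NHibernate mapping of a versioned entity might look like this (the Account entity and its columns are illustrative only):

```csharp
using FluentNHibernate.Mapping;

public class Account
{
    public virtual int Id { get; set; }
    public virtual int Version { get; set; }
    public virtual decimal Balance { get; set; }
}

public class AccountMap : ClassMap<Account>
{
    public AccountMap()
    {
        Id(x => x.Id);
        Version(x => x.Version);   // NHibernate checks and increments this column on every UPDATE
        Map(x => x.Balance);
    }
}
```

Updates then take roughly the form UPDATE Account SET Balance = ?, Version = ? WHERE Id = ? AND Version = ?; if no row matches (because someone else updated it first), NHibernate throws a StaleObjectStateException instead of silently overwriting the other change.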