I'm trying to implement a unique index with a Redis DB (ServiceStack client).
Normally:
1. Check for unique index duplication
2. If the unique index exists, RETURN WITH WARNING
3. WATCH the unique index key (for the race condition)
4. Open a transaction
5. Insert the new record and the new record's unique index entry
6. Close the transaction
How can I get rid of the 1st step?
I want to WATCH for existence. I don't care about changes to the key; I care about its creation or existence (which of course happens outside my transaction).
If you are trying to use Redis just for duplicate checking, then use a hash:
http://redis.io/commands#hash
How do you use the ServiceStack client: the native client or the typed client? (Then I can show you how to do that.)
And use this command: http://redis.io/commands/hsetnx
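For illustration, a minimal sketch with the ServiceStack high-level client, assuming (my own naming, not from the question) a hash "uniq:email" acting as the unique index and a record id of "42"; if memory serves, SetEntryInHashIfNotExists wraps HSETNX. Because HSETNX sets the field only if it doesn't exist yet and reports whether it did, the existence check and the index insert collapse into one atomic command, so no WATCH/MULTI is needed for the uniqueness check itself:

// Sketch only - key names and values are made up for the example.
using System;
using ServiceStack.Redis;

using var redis = new RedisClient("localhost");

// HSETNX: set the hash field only if it is not there yet.
// Returns true if we "claimed" the unique value, false if it already existed.
bool claimed = redis.SetEntryInHashIfNotExists(
    "uniq:email",        // hash acting as the unique index
    "foo@bar.com",       // the value that must stay unique
    "42");               // the id of the record that owns it

if (!claimed)
{
    Console.WriteLine("Duplicate - another record already owns this value.");
}
else
{
    // Safe to store the actual record now (e.g. in its own hash).
}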
Related
I'm creating an application with Java Spring and Oracle DB.
In the app, I want to generate a primary key value that is unique as well as ordered and without gaps: 1,2,3,4,5 instead of 1,2,5,7,8,9.
At one point I used max(id) + 1 to get the maximum id value and derive the id for the next/current transaction. However, I know this isn't safe under concurrency with multiple users or multiple sessions.
I've tried using sequences, but even with the ORDER option a sequence can still create gaps if a transaction fails.
CREATE SEQUENCE num_seq
START WITH 1
INCREMENT BY 1
ORDER NOCACHE NOCYCLE;
I need the values to be gapless as a requirement, but I'm unsure how that's possible with multiple users/multiple sessions.
Don't do it.
The goal of primary keys is not to be displayed on the UI or to be exposed to the external world, but only to provide a unique identifier of the row.
In simple words, a primary key doesn't need to be sexy or good looking. It's an internal identifier.
If you are considering a serial identifier, that probably means you want to display it somewhere or expose it to the external world. If that's the case, then create a secondary column (also unique) that serves this "public relations" goal. It can be generated automatically, or updated at leisure, without affecting the integrity of the database.
It can also be generated by a secondary process that runs in a deferred way (e.g. every 10 minutes), finds all the "unassigned" new rows, and gives them their number. This has the advantage of not being vulnerable to concurrency.
I am moving some "live" data structures from MySQL to Redis. Using the StackExchange C# Redis client, I'm writing (due to some very project-specific restrictions) my own micro-ORM code to store and retrieve object class entities from a Redis database.
I am storing C# objects as hash keys in Redis.
My general question is about indexing on fields other than the "primary key".
Ok, I've read all the theory of sets and sorted sets, and how to add and remove members from sets, and so on.
I've added some code to correctly create set keys which contain entities hash keys, so that I can lookup those objects by simple indexes or sorted indexes.
However I cannot find or figure out a good strategy for solving the following problems:
1. Index maintenance on expiration
I'd like to add expiration to some object (hash) keys, so that old entities get purged automatically by Redis. However, I cannot find a reliable way to update/purge the relevant indexes besides periodically running a background task that scans index set keys for expired members and removes them (notifications are not a good option for me).
2. Index updating when some object fields change
In some cases I need to update only a small fraction of hash key values, not the whole entity. If the fields being updated are part of one or more index set keys, I cannot figure out the best way to properly update the set keys.
For example, let's say I need to store a "Session" entity whose primary key is its ID (a simple integer), and I need to add an index on the "Node" string field (Node being a reference to the server currently serving the session):
class Session {
    [RedisKey]
    public int ID { get; set; }

    public string RemoteIP { get; set; }

    [RedisSimpleIndex]
    public string Node { get; set; }
}
RedisKey and RedisSimpleIndex are attributes I use to extract via reflection which fields are used as primary key and which are used for indexing.
Let's suppose I have an instance of Session like this:
{ ID = 2, RemoteIP = "1.2.3.4", Node = "Server10" }
My routines are creating the following keys in Redis:
Hash key: "obj:Session:2"
Hash values: "ID" = "2", "RemoteIP" = "1.2.3.4", "Node" = "Server10"
Set key "idx:Session:Node:Server10"
Set members: "obj:Session:2"
which is fine for looking up all sessions on Server10.
However, if the very same session needs to be moved to a different server (e.g. Server8) and I want to update only the Node field in the hash, how can I update the indexes too?
The only way I found so far is to SCAN all index keys with pattern idx:Session:Node:* and remove from them any member obj:Session:2, then create/update the index key for the new node (idx:Session:Node:Server8).
Moreover, the SCAN command is not available in the IDatabase or ITransaction interfaces, and in an HA clustered environment things get worse, since I need to determine which Redis server holds the relevant keys to make this procedure work.
Is there a better way to build/represent simple indexes in Redis? Is my approach wrong?
I'd like to add expiration to some object (hash) keys, so that old entities get purged automatically by Redis. However, I cannot find a reliable way to update/purge the relevant indexes besides periodically running a background task that scans index set keys for expired members and removes them (notifications are not a good option for me).
You cannot expire individual KV pairs within a hash. This was discussed in #167. There don't appear to be any plans to change this.
I think you should be able to use keyspace notifications to subscribe to expired events. You would need a worker that subscribes to them and updates all relevant indices accordingly. However, you might get some inconsistent data: for example, your worker might crash and leave stale indices behind. Also, the indices wouldn't be updated instantaneously, so you'd end up with a bit of stale data regardless.
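To make that concrete, a rough sketch with StackExchange.Redis (the client mentioned in the question). It assumes the server has keyspace notifications enabled (CONFIG SET notify-keyspace-events Ex) and, because the hash is already gone by the time the expired event fires, that the writer also maintained a reverse-mapping set per object (here called "ridx:<objectKey>", purely my own convention) listing the index sets that reference it:

// Sketch only: the "ridx:" reverse-mapping sets are an assumed convention, not a Redis feature.
using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost");
var db = redis.GetDatabase();
var sub = redis.GetSubscriber();

// Fires for every key that expires in database 0.
sub.Subscribe(
    new RedisChannel("__keyevent@0__:expired", RedisChannel.PatternMode.Literal),
    (channel, expiredKey) =>
    {
        string objectKey = expiredKey;              // e.g. "obj:Session:2"
        string reverseKey = "ridx:" + objectKey;    // index sets that referenced it

        foreach (var indexKey in db.SetMembers(reverseKey))
        {
            db.SetRemove((string)indexKey, objectKey);   // purge the stale member
        }
        db.KeyDelete(reverseKey);
    });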
Probably not the best idea, but you could also hack some custom indexing logic into expire.c. The code seems fairly straightforward. The C module API, by contrast, doesn't appear to provide any way to hook into the eviction logic.
Another option is to not rely on Redis for the expiration logic at all. So... you would still have a background job, but it would actually issue the corresponding DEL commands for expired KV pairs. This would also allow you to keep the indexes 100% up to date via transactions.
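A sketch of that approach with StackExchange.Redis, under a couple of assumptions of mine: each object key was also added at write time to a sorted set "expire:Session" scored by its expiry time as a Unix timestamp, and the index keys follow the idx:Session:Node:* layout from the question. A periodic job then deletes the object, its index membership and its schedule entry in one MULTI/EXEC transaction, so the index never points at a missing hash:

// Sketch only - the "expire:Session" sorted set is an assumed bookkeeping structure.
using System;
using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost");
var db = redis.GetDatabase();

double now = DateTimeOffset.UtcNow.ToUnixTimeSeconds();

// Everything scheduled to expire at or before "now".
foreach (var member in db.SortedSetRangeByScore("expire:Session", 0, now))
{
    string objectKey = member;                      // e.g. "obj:Session:2"
    string node = db.HashGet(objectKey, "Node");    // still readable, unlike with real Redis expiry

    var tran = db.CreateTransaction();
    tran.KeyDeleteAsync(objectKey);                               // DEL obj:Session:2
    tran.SetRemoveAsync("idx:Session:Node:" + node, objectKey);   // drop the index member
    tran.SortedSetRemoveAsync("expire:Session", member);          // drop the schedule entry
    tran.Execute();                                               // MULTI ... EXEC
}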
In some cases I need to update only a small fraction of hash key values, not the whole entity. If the fields being updated are part of one or more index set keys, I cannot figure out the best way to properly update the set keys.
I'm not sure which Redis client you're using, but I found the following pattern to be quite useful in the past:
You have some form of "Updater" class for each hash. It has setters for all relevant fields that could be updated (setFirstName, setLastName etc.).
When you set a field, you mark that particular field as "dirty" (e.g. via a separate boolean).
When you call "save", you update indices for fields that were marked as dirty.
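A bare-bones C# illustration of that pattern (all names here are mine; only the Node field from the earlier Session example is indexed). The point is that Save() knows the old and new value of every dirty field, so it touches exactly the index sets that are affected and nothing else:

// Sketch only - a hand-rolled "updater" with dirty tracking, not a library API.
using StackExchange.Redis;

class SessionUpdater
{
    private readonly IDatabase _db;
    private readonly string _objectKey;      // e.g. "obj:Session:2"
    private string _oldNode, _newNode;
    private bool _nodeDirty;

    public SessionUpdater(IDatabase db, string objectKey, string currentNode)
    {
        _db = db;
        _objectKey = objectKey;
        _oldNode = currentNode;
    }

    public void SetNode(string node)
    {
        _newNode = node;
        _nodeDirty = true;                   // remember that this field changed
    }

    public void Save()
    {
        if (!_nodeDirty) return;             // nothing to do, no index churn

        var tran = _db.CreateTransaction();
        tran.HashSetAsync(_objectKey, "Node", _newNode);                 // update the entity
        tran.SetRemoveAsync("idx:Session:Node:" + _oldNode, _objectKey); // leave old index
        tran.SetAddAsync("idx:Session:Node:" + _newNode, _objectKey);    // join new index
        tran.Execute();

        _oldNode = _newNode;
        _nodeDirty = false;
    }
}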
The only way I found so far is to SCAN all index keys with pattern idx:Session:Node:* and remove from them any member obj:Session:2, then create/update the index key for the new node (idx:Session:Node:Server8).
This is cumbersome, but it seems like the way to go. Sadly, I don't think there is a better solution for this. You might want to consider maintaining a separate set holding the keys of the index KV pairs that would have to be updated, though; that way you'd avoid going over a bunch of keys that aren't relevant.
You might also want to check out an article about how to maintain those indices. As you already alluded to, there are basically two options: real-time using MULTI transactions or using batch jobs. Once you get into the territory of using key expiration, you are more or less forced to use the batch approach.
We're using Oracle 11.2.0.3.0, configured as a 3 node RAC.
In our application we have Hibernate over UCP and OJDBC, with versions compatible with RAC. Hibernate uses a sequence to get the ID for every record in the database. In the database we have a table with a UNIQUE constraint on (some_value). It is used to synchronize many instances of the application; every transaction in the application requires a unique row in this table. So application A tries to insert into this table (some_value="A"); if another instance has already inserted a row with (some_value="A"), the first instance gets ORA-00001 unique constraint violated and retries with another value (some_value="B").
The unique constraint violation fires very often, roughly once every 8 transactions.
We run two tests:
service pinned to one node: response time avg 6ms
service on all 3 nodes: response time avg 800-1000ms
The high-level question is: why? What is happening in a 3 node RAC when the unique constraint violation occurs, and why does it slow the application down so much? How can I diagnose this case?
Michal
Use service-level scaling on RAC. Create a "LOADER" service on the RAC side, make this service active on one node only, and let Hibernate use connections from this "LOADER" service for the loads.
The explanation, very roughly, is that each cluster node masters some subset of the database's address space. When using a unique constraint, each node must request the data blocks of the unique index from their mastering node. When a duplicate key is found and both duplicate keys were inserted by transactions that have not committed yet, Oracle has to enqueue one session and let it wait until the other session (belonging to another node) commits or rolls back.
If you need to generate a unique value, you should let the database do it for you. You can create an object called a SEQUENCE. You then get the next value of a sequence simply by
my_seq.nextval
And the current value of the sequence is simply
my_seq.currval
So if you are inserting a record...
insert into my_table values ( my_seq.nextval, 'xxx', yyy, 123, ... )
With SQL Server 2008, from C#, I'm trying to do batch inserts/updates of records into parent/child tables as an optimization.
The inserts/updates will generate a key automatically, which I'd like to extract via an OUTPUT clause etc. and then reassign back to the domain model. For batch inserts I need to keep track of which newly generated ID belongs to which domain object in the batch list.
This example comes close to what I need, but was wondering if there's a way to not have an extra column added to the table (SequenceNumber) and still achieve the same results: http://illdata.com/blog/2010/01/13/sql-server-batch-inserts-of-parentchild-data-with-ibatis/
i.e. could we rely on the order of the inserts generated by the OUTPUT into the temp table, or pass a reference GUID, set on the domain model and passed temporarily to the SQL just for mapping purposes?
In SQL Server 2008 it is possible to use merge and output to get a mapping between the generated key and the key used in the staging table.
Have a look at this question: Using merge..output to get mapping between source.id and target.id
Unless I've misunderstood...
A surrogate key (IDENTITY or NEWID etc) isn't your actual object identifier. It's an implementation detail and has no intrinsic meaning.
You must have another identifier (name, ISBN, serial number, transaction code/date, etc) that is the real (natural) key.
Your OUTPUT clause can return both the surrogate key and the natural key. You then use this to map back.
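For what it's worth, a rough C#/T-SQL sketch of that mapping step (the table and column names are invented, and a real batch would feed a staging table or table-valued parameter rather than a single parameter). MERGE is used rather than a plain INSERT because its OUTPUT clause may reference the source rows, so each generated identity comes back paired with the natural key that produced it:

// Sketch only - dbo.Orders(OrderId identity, OrderCode) is a made-up table.
using System;
using System.Data.SqlClient;

const string sql = @"
    MERGE INTO dbo.Orders AS t
    USING (SELECT @orderCode AS OrderCode) AS s
        ON 1 = 0                         -- never matches, so every source row is inserted
    WHEN NOT MATCHED THEN
        INSERT (OrderCode) VALUES (s.OrderCode)
    OUTPUT s.OrderCode, inserted.OrderId;";   // natural key paired with the new identity

using var conn = new SqlConnection("Server=.;Database=Demo;Integrated Security=true");
conn.Open();

using var cmd = new SqlCommand(sql, conn);
cmd.Parameters.AddWithValue("@orderCode", "ORD-0001");

using var reader = cmd.ExecuteReader();
while (reader.Read())
{
    // Use this pair to push the generated ID back onto the matching domain object.
    Console.WriteLine($"{reader.GetString(0)} -> {reader.GetInt32(1)}");
}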
Probably a trivial question, but I want to get the best possible solution.
Problem:
I have two or more workers that insert keys into one or more tables. The problem arises when two or more workers try to insert the same key into one of those key tables at the same time.
Typical problem.
Worker A reads the table if a key exists (SELECT). There is no key.
Worker B reads the table if a key exists (SELECT). There is no key.
Worker A inserts the key.
Worker B inserts the key.
Worker A commits.
Worker B commits. An exception is thrown as the unique constraint is violated.
The key tables are simple pairs: the first column is an autoincrement integer and the second is a varchar key.
What is the best solution to such a concurrency problem? I believe it is a common problem. One way, for sure, is to handle the thrown exceptions, but somehow I don't believe this is the best way to tackle it.
The database I use is Firebird 2.5
EDIT:
Some additional info to make things clear.
Client-side synchronization is not a good approach, because the inserts come from different processes (workers). And I could have workers across different machines someday, so even mutexes are a no-go.
The primary key, the first column of such a table, is an autoincrement field. No problem there. The varchar field is the problem, as it is something that the client inserts.
A typical such table is a table of users. For instance:
1 2056
2 1044
3 1896
4 5966
...
Each worker checks if user "xxxx" exists and, if not, inserts it.
EDIT 2:
Just for reference, if somebody goes down the same route: IB/FB return a pair of error codes (I am using InterBase Express components). Checking for a duplicate value violation looks like this:
except
  on E: EIBInterBaseError do
  begin
    if (E.SQLCode = -803) and (E.IBErrorCode = 335544349) then
    begin
      FKeysConnection.IBT.Rollback;
      EnteredKeys := False;
    end;
  end;
end;
With Firebird you can use the following statement:
UPDATE OR INSERT INTO MY_TABLE (MY_KEY) VALUES (:MY_KEY) MATCHING (MY_KEY) RETURNING MY_ID
assuming there is a BEFORE INSERT trigger which will generate the MY_ID if a NULL value is being inserted.
Here is the documentation.
Update: The above statement will avoid exceptions and cause every statement to succeed. However, in the case of many duplicate key values it will also cause many unnecessary updates.
This can be avoided by another approach: just handle the unique constraint exception on the client and ignore it. The details depend on which Delphi library you're using to work with Firebird but it should be possible to examine the SQLCode returned by the server and ignore only the specific case of unique constraint violation.
I do not know if something like this is available in Firebird, but in SQL Server you can check for existence when inserting the key:
insert into Table1 (KeyValue)
select 'NewKey'
where not exists (select *
                  from Table1
                  where KeyValue = 'NewKey')
First option - don't do it.
Don't do it. Unless the workers are doing extraordinary amounts of work (we're talking about computers, so requiring 1 second per record qualifies as an "extraordinary amount of work"), just use a single thread. Even better, do all the work in a stored procedure; you'd be amazed by the speedup gained by not transporting data over whatever protocol into your app.
Second option - Use a Queue
Make sure your worker threads don't all work on the same ID. Set up a queue, push all the IDs that need processing into that queue, and have each worker thread dequeue an ID from that queue. This way you're guaranteed that no two workers work on the same record at the same time. This might be difficult to implement if your workers are not all part of the same process.
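A tiny in-process illustration of the queue idea in C# (the IDs and the worker count are made up); with workers in separate processes you would need an external queue instead:

// Sketch only: three workers, each ID is dequeued - and therefore processed - exactly once.
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

var work = new ConcurrentQueue<int>(new[] { 101, 102, 103, 104, 105 });

await Task.WhenAll(Enumerable.Range(0, 3).Select(worker => Task.Run(() =>
{
    while (work.TryDequeue(out int id))
    {
        Console.WriteLine($"worker {worker} processing id {id}");
    }
})));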
Last resort
Set up a DB-based "reservation" system so a worker thread can mark a key as "work in progress", so that no two workers work on the same key. I'd set up a table like this:
CREATE TABLE KEY_RESERVATIONS (
KEY INTEGER NOT NULL, /* This is the KEY you'd be reserving */
RESERVED_UNTIL TIMESTAMP NOT NULL /* We don't want to keep reservations for ever in case of failure */
);
Each of your workers would use short transactions to work on that table: select a candidate key, one that's not in the KEY_RESERVATIONS table, and try to INSERT it. Failed? Try another key. Periodically delete all reserved keys with old RESERVED_UNTIL timestamps. Make sure the transactions working with KEY_RESERVATIONS are as short as possible, so that two threads trying to reserve the same key at the same time will fail quickly.
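If it helps, a rough C# sketch of that reservation loop against the table above, using the FirebirdSql.Data.FirebirdClient ADO.NET provider (the connection string, the candidate list and the 5-minute reservation window are my own inventions, and KEY_RESERVATIONS is assumed to have a primary key or unique constraint on its key column so that a second INSERT of the same key fails):

// Sketch only - a failed INSERT simply means another worker won the reservation.
using System;
using System.Collections.Generic;
using FirebirdSql.Data.FirebirdClient;

using var conn = new FbConnection("database=localhost:work.fdb;user=SYSDBA;password=masterkey");
conn.Open();

var candidates = new List<int> { 1, 2, 3, 4 };   // keys that still need processing
foreach (int key in candidates)
{
    if (!TryReserve(conn, key)) continue;        // lost the race, move on to the next key
    // ... do the actual work for this key, then delete or let the reservation expire ...
}

static bool TryReserve(FbConnection conn, int key)
{
    try
    {
        // Column list omitted on purpose; values follow (KEY, RESERVED_UNTIL) as defined above.
        using var cmd = new FbCommand("INSERT INTO KEY_RESERVATIONS VALUES (@k, @until)", conn);
        cmd.Parameters.AddWithValue("@k", key);
        cmd.Parameters.AddWithValue("@until", DateTime.UtcNow.AddMinutes(5));
        cmd.ExecuteNonQuery();
        return true;                              // reservation succeeded: we own this key
    }
    catch (FbException)
    {
        return false;                             // duplicate key: someone else reserved it
    }
}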
This is what you have to deal with in an optimistic (or no-) locking scheme.
One way to avoid it is to put a pessimistic lock on the table around the whole select, insert, commit sequence.
However, that means you will have to deal with not being able to access the table (handle table-locked exceptions).
If by workers you mean threads in the same application instance rather than different users (application instances), you will need thread synchronization, as kubal5003 says, around the select-insert-commit sequence.
A combination of the two is needed if you have multiple users/application instances each with multiple threads.
Synchronize your threads to make it impossible to insert the same value, or use a DB-side key generation method (I don't know Firebird, so I don't even know if it has one; e.g. in MS SQL Server there is the identity column, and GUIDs also solve the problem because it's very unlikely to generate two identical ones).
You should not rely on the client to generate the unique key if there's a possibility of duplicates.
Use triggers and generators (maybe with the help of a stored procedure) to always create unique keys.
More information about proper autoinc implementation in Firebird here: http://www.firebirdfaq.org/faq29/