Redis: How to distinguish between client tracking invalidation of keys across multiple databases

Is there any way to distinguish which database an invalidation applies to?
Example:
Tracking socket:
CLIENT ID // 77
PSUBSCRIBE __redis__:*
Main socket:
CLIENT TRACKING on REDIRECT 77 OPTIN
SELECT 1
SET MYKEY VALUE1
CLIENT CACHING YES
GET MYKEY //VALUE1
SELECT 2
SET MYKEY VALUE2
GET MYKEY //VALUE2
SELECT 1
GET MYKEY //VALUE1
The issue I have is that the tracking socket receives an invalidation message (__redis__:invalidate 1) "MYKEY") when MYKEY is set in database 2, even though the key I wanted to track is the one in database 1.
Short of redesigning the application to avoid key collisions across databases, or creating a tracking socket per database, how can I use tracking in a meaningful way?
Edit: Redis 6.0.8, standalone install

Found the answer in Redis documentation:
"There is a single keys namespace, not divided by database numbers. So if a client is caching the key foo in database 2, and some other client changes the value of the key foo in database 3, an invalidation message will still be sent. This way we can ignore database numbers reducing both the memory usage and the implementation complexity."

Related

redis how to get first key-value pair of hash map

I would like to have something like the following table in redis.
host name       back queue
stanford.edu    23
microsoft.com   17
As far as I know, the best way to implement this is to use redis hashes (with host name as key and back queue as value). However, in my use case, I also want to get the first key-value pair present in the hash map.
How can this be implemented? Are there any redis datatypes specifically for this?
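For reference, a minimal redis-py sketch of the hash layout described in the question (the key name host:back_queue is an assumption):

import redis

r = redis.Redis()
r.hset('host:back_queue', mapping={'stanford.edu': 23, 'microsoft.com': 17})
print(r.hget('host:back_queue', 'stanford.edu'))   # b'23'

Note that a Redis hash is unordered, so it has no built-in notion of a "first" field; insertion order would have to be tracked in a separate structure alongside the hash.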

Copy one key from one redis instance to another

I have a Redis implementation with 6 nodes (3 masters, 3 slaves, cluster enabled). I have loaded a number of keys into every master.
So, my question is:
Is it possible to actually copy one key from 127.0.0.1:30001 to 127.0.0.1:30002?
For example, let's say my key is named "testkey". If I copy this key from 30001 to 30002, then when I GET the key from either 30001 or 30002, the response should return the value of "testkey" in both calls.
No, that's not how it works.
Keys in the cluster are assigned to hash slots and slots are assigned to master nodes. The keys' assignment is done by hashing their names (or the hash tag in them) so it is consistent, meaning that a given key name always hashes to the same slot.
A key can exist only once in the keyspace, but the slot it belongs to can be moved between masters. To scale reads from that key you can use the slave of the applicable master.
A good place to start understanding how the cluster works is the [tutorial](https://redis.io/topics/cluster-tutorial).
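To illustrate the slot assignment, a rough redis-py sketch (node addresses taken from the question):

import redis

node_a = redis.Redis(host='127.0.0.1', port=30001)
node_b = redis.Redis(host='127.0.0.1', port=30002)

# Both nodes report the same slot number for the same key name.
print(node_a.execute_command('CLUSTER', 'KEYSLOT', 'testkey'))
print(node_b.execute_command('CLUSTER', 'KEYSLOT', 'testkey'))

Only the master that owns that slot serves "testkey"; asking any other master returns a MOVED redirection rather than a second copy of the key.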

Redis: how to use it similar to multi-tables

It seems that Redis has no entity corresponding to a "table" in a relational database.
For instance, I have to store:
(token, user_id)
(cart_id, token, [{product_id, count}])
If those two are not stored separately, a get would search across both, which would cause chaos.
By the way, (cart_id, token, [{product_id, count}]) is a shopping cart; how should such a data structure be designed in Redis?
It seems that Redis has no entity corresponding to a "table" in a relational database.
Right, because it is not a relational database. It is a data structure server which is very different and requires a different approach to be used well.
Ultimately to use Redis in the way it is intended you need to not think in relational terms, but think of the data structures you use in the code. More specifically, how do you need the data when you want to consume it? That will be the most likely way to store it in Redis.
In this case there are a few options, but the hash method works incredibly well for this one so I'll detail it here.
First, create a hash; call it users:to:tokens. Use the user id as the field in the hash and the token as the value. Next, create the inverse, a hash called tokens:to:users. You will probably want both of these - the ability to look one up from the other - and this foundation will provide that.
Next, for your carts. This, too, will be a hash: carts:cart_id. In this hash you have the product_id and the count.
Finally, your third hash, token:to:cart, builds an index from tokens to cart ids. I'd go a step further and also add user:to:cart to be able to pull carts by user as well.
Now as to whether to store the key name in the map or not, I tend to go with "no". By just storing the ID you can easily build the Redis cart key, and you avoid storing the key's full path in the data store as well, which also saves memory.
Indeed, if you can do so use integers for all of your IDs. By using integers you can take advantage of Redis' integer storage optimizations to keep memory usage down. Hashes storing integers are quite efficient and very fast.
If needed you can use Redis to build your IDs. Use the INCR command to keep a counter for each data type, such as userid:counter, cartid:counter, and tokenid:counter. Since INCR returns the new value, a single call both increments the counter and gives you the new ID, and GET cartid:counter will always return the largest ID if you want to quickly see how many carts have been created. Kinda neat, IMO.
Now, where it gets tricky is if you want to use expiration to automatically expire carts, as opposed to leaving them to "lie around" until you want to clean things up. By setting an expiration on the cart hash (which has the product,count mapping) your carts will automatically expire. However, their references will still be hanging out in the token:to:cart hash. Removing those is a simple periodic task which iterates over the members of token:to:cart and does an EXISTS check on each cart's key; if the key doesn't exist, delete the entry from the hash.
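A condensed redis-py sketch of the layout described above (key names follow the answer; the TTL value is just an example), including the periodic cleanup of the token:to:cart index:

import redis

r = redis.Redis()

# INCR-based id generation, one counter per data type.
user_id = r.incr('userid:counter')
token_id = r.incr('tokenid:counter')
cart_id = r.incr('cartid:counter')

# The two lookup hashes, one the inverse of the other.
r.hset('users:to:tokens', user_id, token_id)
r.hset('tokens:to:users', token_id, user_id)

# The cart hash: product_id -> count, with an optional expiration.
cart_key = f'carts:{cart_id}'
r.hset(cart_key, mapping={43: 2, 55: 1})
r.expire(cart_key, 3600)

# The index hashes.
r.hset('token:to:cart', token_id, cart_id)
r.hset('user:to:cart', user_id, cart_id)

# Periodic cleanup: drop index entries whose cart hash has already expired.
for token, cart in r.hgetall('token:to:cart').items():
    if not r.exists(b'carts:' + cart):
        r.hdel('token:to:cart', token)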
Redis is a key-value store. From redis.io:
Redis is an open source (BSD licensed), in-memory data structure
store, used as database, cache and message broker. It supports data
structures such as strings, hashes, lists, sets, sorted sets with
range queries, bitmaps, hyperloglogs and geospatial indexes with
radius queries.
So if you want to store two different types (tokens and carts) you will need to store two keys for the different data types. For example:
127.0.0.1:6379> hset tokens.token_id#123 user user123
(integer) 1
127.0.0.1:6379> hget tokens.token_id#123 user
"user123"
Where tokens is a namespace for tokens only. It is stored as Redis-Hash:
Redis Hashes are maps between string fields and string values, so they
are the perfect data type to represent objects
To store lists I would do the following:
127.0.0.1:6379> hmset carts.cart_1 token token_id#123 cart_contents cart_contents_key1
OK
127.0.0.1:6379> hmget carts.cart_1 token cart_contents
1) "token_id#123"
2) "cart_contents_key1" # cart_contents is a list of receipts.
cart_contents are represented as a Redis-List:
127.0.0.1:6379> rpush cart_contents.cart_contents_key1 receipt_key1
(integer) 1
127.0.0.1:6379> lrange cart_contents.cart_contents_key1 0 -1
1) "receipt_key1"
Receipt is Redis-Hash for a tuple (product_id, count):
127.0.0.1:6379> hmset receipts.receipt_key1 product_id 43 count 2
OK
127.0.0.1:6379> hmget receipts.receipt_key1 product_id count
1) "43" # Your final product id.
2) "2"
But do you really need Redis in this case?

Redis, session expiration, and reverse lookup

I'm currently building a web app and would like to use Redis to store sessions. At login, the session is inserted into Redis with a corresponding user id, and expiration set at 15 minutes. I would now like to implement reverse look-up for sessions (get sessions with a certain user id). The problem here is, since I can't search the Redis keyspace, how to implement this. One way would be to have a Redis set for each userId, containing all session ids. But since Redis doesn't allow expiration of an item in a set, and sessions are set to expire, there would be a ton of nonexistent session ids in the sets.
What would be the best way to remove ids from sets on key expiration? Or, is there a better way of accomplishing what I want (reverse look-up)?
On the current release branch of Redis (2.6), you cannot have notifications when items are expired. It will probably change with the next versions.
In the meantime, to support your requirement, you need to manually implement expiration notification support. So you have:
session:<sessionid> -> a hash storing your session data - one of the field is <userid>
user:<userid> -> a set of <sessionid>
You need to remove sessionid from the user set when the session expires. So you can maintain an additional sorted set whose score is a timestamp.
When you create session 10 for user 100:
MULTI
HMSET session:10 userid 100 ... other session data ...
SADD user:100 10
ZADD to_be_expired <current timestamp + session timeout> 10
EXEC
Then, you need to build a daemon which will poll the zset to identify the session to expire (ZRANGEBYSCORE). For each expired session, it has to maintain the data structure:
pop the session out of the zset (ZREMRANGEBYRANK)
retrieve session userid (HMGET)
delete session (DEL)
remove session from userid set (SREM)
The main difficulty is to ensure there are no race conditions when the daemon polls and processes the items. See my answer to this question to see how it can be implemented:
how to handle session expire basing redis?
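A minimal redis-py sketch of such a daemon, with key names following the answer; it uses ZREM rather than ZREMRANGEBYRANK for clarity and leaves out the race-condition handling discussed above:

import time
import redis

r = redis.Redis()

def expire_sessions():
    now = time.time()
    for raw_id in r.zrangebyscore('to_be_expired', '-inf', now):
        session_id = raw_id.decode()
        user_id = r.hget(f'session:{session_id}', 'userid')
        pipe = r.pipeline()
        pipe.zrem('to_be_expired', session_id)       # pop the session out of the zset
        pipe.delete(f'session:{session_id}')         # delete the session hash
        if user_id is not None:
            pipe.srem(f'user:{user_id.decode()}', session_id)  # remove it from the user's set
        pipe.execute()

while True:
    expire_sessions()
    time.sleep(1)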
In more recent versions of Redis (2.8.0 and up) Keyspace Notifications for expired events are supported. I.e. when a key with a TTL expires this event is triggered.
This is what to subscribe to:
'__keyevent@0__:expired'
So subscribing to this event allows you to have a single index for all sessions and you can remove the key from the index when the key expires.
Example:
Use a sorted set as a secondary index with the uid as the weight:
ZADD "idx-session-uid" <uid> <sessionkey>
Search for sessionkeys for a specific user with:
ZRANGEBYSCORE "idx-session-uid" <uid> <uid>
When a session is deleted or expired we do:
ZREM "idx-session-uid" <sessionkey>

Best way to maintain data integrity between local and remote sql databases

So I have what seems like a common question that I can't find an answer to. I'm trying to find the "best practice" for how to architect a database that maintains data locally, then syncs that data to a remote database that is shared between many clients. To be clear, this remote database would be used by many clients.
For example, say I had a desktop application that stored to-do lists (in SQL) with individual items. I want to be able to send that data to a web-service that holds a "master" copy of all the different clients' information. I'm not worried about syncing problems as much as I am trying to think through the actual architecture of the client's tables and the web-service's tables.
Here's an example of how I was thinking about it:
Client Database
list
--list_client_id (primary key, auto-increment)
--list_name
list_item
--list_item_client_id (primary key, auto-increment)
--list_id
--list_item_text
Web Based Master Database (Shared between many clients)
list
--list_master_id
--list_client_id (primary key, auto-increment)
--list_name
--user_id
list_item
--list_item_master_id (primary key, auto-increment)
--list_item_remote_id
--list_id
--list_item_text
--user_id
The idea would be that the client can create to-do lists with items and sync this with the web service at any given time (i.e. if they lose data connectivity and aren't able to send the information until later, nothing will get out of order). The web service would record the records with the clients' ids as just extra fields.
That way, the client can say "update list number 4 with a new name" and the server takes this to mean "update user 12's list number 4 with a new name".
I think the general concept you're working with is in the right direction, but you may need to pay careful attention to the use of auto-increment columns. For example, auto-increment on the server is useless if the client is the owner of this ID. Instead, you probably want list.list_master_id to be an auto-increment. Everything else you've mentioned is entirely plausible, though the complexity may increase if there can be multiple clients per user. Then the use of an auto-increment alone probably isn't sufficient; instead, you may need a GUID or a datatype that also includes a client identifier to prevent id collisions.
Without having more details it would be difficult to speculate on what other situations you may need to consider.
SERVER:
list
--id
--name
--user_id
--updated_at
--created_from_device_id
The following 2 tables link all records; they might also be grouped into one table.
list_ids
--list_id
--device_id
--device_record_id
user_ids
--user_id
--device_id
--device_record_id
CLIENT (device_id=5)
list
--id
--name
--user_id
--updated_at
That will allow you to save records as (only showing relevant fields):
server
list: id=1, name=shopping, user_id=1234
user: id=27, name=John Doe
list_ids: list_id=1, device_id=5, device_record_id=999
user_ids: user_id=27, device_id=5, device_record_id=567
client
id=999, name=shopping, user_id=567
This way the clients are totally unaware of the server's IDs, translations can be done quite fast, and you can supply the clients only with information and IDs they know of.
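A small sketch of the id translation this design implies, using Python with sqlite3 as a stand-in for the server database (schema and values follow the example records above):

import sqlite3

server = sqlite3.connect(':memory:')
server.execute("CREATE TABLE list_ids (list_id INTEGER, device_id INTEGER, device_record_id INTEGER)")
server.execute("INSERT INTO list_ids VALUES (1, 5, 999)")   # the example record above

def server_list_id(device_id, device_record_id):
    # Translate a device's local record id into the server-side list id.
    row = server.execute(
        "SELECT list_id FROM list_ids WHERE device_id = ? AND device_record_id = ?",
        (device_id, device_record_id)).fetchone()
    return row[0] if row else None

# Client (device_id=5) asks to update "its list 999"; the server resolves that
# to its own id before touching the list table.
print(server_list_id(5, 999))   # -> 1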
I had the same issue with a project I am working on; the solution in my case was to create an extra nullable field named remote_id in the local tables. When synchronizing records from the local to the remote database, if remote_id is null it means that this row has never been synchronized, and the sync needs to return a unique id matching the remote row id.
Local Table            Remote Table
_id (used locally)
remote_id ------------- id
name      ------------- name
In the client application I link tables by the _id field; remotely I use the remote_id field to fetch data, do joins, etc.
example locally:
Local Client Table          Local ClientType Table          Local ClientType
                            _id
                            remote_id
_id ----------------------- client_id
remote_id                   client_type_id ---------------- _id
                                                            remote_id
name                        name                            name
example remotely:
Remote Client Table         Remote ClientType Table         Remote ClientType
id ------------------------ client_id
                            client_type_id ---------------- id
name                        name                            name
This scenario, without any logic in the code, would cause data integrity failures, as the client_type table may not match the real id in either the local or the remote tables. Therefore, whenever a remote_id is generated, the server returns a signal to the client application asking it to update the local _id field; this fires a previously created trigger in SQLite that updates the affected tables.
http://www.sqlite.org/lang_createtrigger.html
1- remote_id is generated on the server
2- the server returns a signal to the client
3- the client updates its _id field and fires a trigger that updates the local tables that join on the local _id
Of course I also use a last_updated field to help with synchronization and to avoid duplicate syncs.
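A rough sketch of that flow in Python with sqlite3 (the table is simplified, and push_to_server() is a hypothetical stand-in for the web-service call that returns the id assigned by the remote database):

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE client (
        _id          INTEGER PRIMARY KEY AUTOINCREMENT,
        remote_id    INTEGER,            -- NULL until the row has been synced
        name         TEXT,
        last_updated TEXT
    )
""")
conn.execute("INSERT INTO client (name) VALUES ('John Doe')")

def push_to_server(record):
    # Hypothetical web-service call; a real implementation would POST the
    # record and return the id generated by the remote database.
    return 1001

def sync_clients(conn):
    # Rows with a NULL remote_id have never been synchronized.
    for _id, name in conn.execute(
            "SELECT _id, name FROM client WHERE remote_id IS NULL").fetchall():
        remote_id = push_to_server({'name': name})
        # Writing the server id back (step 3 above) is what lets the local
        # trigger propagate it to the related tables.
        conn.execute("UPDATE client SET remote_id = ? WHERE _id = ?", (remote_id, _id))
    conn.commit()

sync_clients(conn)
print(conn.execute("SELECT _id, remote_id, name FROM client").fetchall())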