Jedis Benchmarking - How fast is Jedis - redis

I am using Jedis to connect to Redis and push data into a list. I am using rpush for the JSON data.
These are the steps I do:
1. Fetch data from RabbitMQ.
2. Collect info from the JSON data and prepare a key-value pair.
3. Push the data into Redis using the key and the value.
I don't see my code scaling beyond 3,000 requests per second.
Note:
I am not using pipelining; every message results in getting a Jedis resource, pushing to Redis, and closing the resource.

Options for persisting faster in Redis:
1. Pipelining
2. Jedis connection pooling
To avoid:
3. Frequent opening/closing of resources; open a resource once and reuse it.
Good link:
https://tech.trivago.com/2017/01/25/learn-redis-the-hard-way-in-production/
How I solved my problem:
My design was perfectly fine, but I was pushing data into the same key for all my tests. When I started pushing data into different keys, performance increased hugely.
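The fix (spreading writes across multiple keys) combined with the pipelining advice above can be sketched as follows. This is a minimal, self-contained Java sketch: `shardKey` is a hypothetical helper, and the Jedis calls it would feed are shown in comments because they need a live server.

```java
import java.util.List;

public class ShardedPush {
    // Spread rpush traffic across several list keys instead of one hot key.
    // The slot is derived from the message id, so the same id always maps
    // to the same list.
    public static String shardKey(String base, String messageId, int shards) {
        int slot = Math.floorMod(messageId.hashCode(), shards);
        return base + ":" + slot;
    }

    public static void main(String[] args) {
        for (String id : List.of("msg-1", "msg-2", "msg-3")) {
            String key = shardKey("events", id, 4);
            System.out.println(id + " -> " + key);
            // With a pooled Jedis connection and pipelining (needs a running Redis):
            // try (Jedis jedis = pool.getResource()) {
            //     Pipeline p = jedis.pipelined();
            //     p.rpush(key, jsonPayload);
            //     p.sync();
            // }
        }
    }
}
```

Sharding like this avoids serializing all writes behind a single key, which matches the observation above that pushing into different keys improved throughput.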

Related

Redis java map pagination

I am thinking of using Redis in my Java application, either through the Jedis or Redisson client. I have a use case where I need to show all the map entries in Redis on a UI. I can't fetch them all at once because the data is around 100 MB, which is a lot over the network. How can I fetch the entries in the map by providing a range for pagination?
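Redis offers cursor-based iteration over a hash via the HSCAN command, which returns a batch of entries plus a cursor for the next call. Since HSCAN needs a live server, here is a dependency-free Java sketch of the same cursor-style pagination shape over an in-memory map; the `page` helper is illustrative, not the Jedis API (the real Jedis call is shown in a comment).

```java
import java.util.*;

public class HashPagination {
    // Cursor-style pagination over a sorted snapshot of a map, mimicking
    // the shape of Redis HSCAN (cursor in, batch of entries + next cursor out).
    public static List<Map.Entry<String, String>> page(
            NavigableMap<String, String> map, String afterKey, int pageSize) {
        Map<String, String> tail = (afterKey == null)
                ? map
                : map.tailMap(afterKey, false); // everything strictly after the cursor
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (Map.Entry<String, String> e : tail.entrySet()) {
            if (out.size() == pageSize) break;
            out.add(e);
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> data = new TreeMap<>();
        for (int i = 0; i < 10; i++) data.put("k" + i, "v" + i);

        String cursor = null;
        int pages = 0;
        while (true) {
            List<Map.Entry<String, String>> batch = page(data, cursor, 4);
            if (batch.isEmpty()) break;
            pages++;
            cursor = batch.get(batch.size() - 1).getKey();
        }
        System.out.println("pages=" + pages); // 10 entries in pages of 4 -> 3 pages
        // Against a live server the loop would instead use Jedis HSCAN:
        //   ScanResult<Map.Entry<String, String>> r =
        //       jedis.hscan("myhash", cursor, new ScanParams().count(100));
        //   cursor = r.getCursor();  // stop when it returns "0"
    }
}
```

Note that HSCAN's COUNT is only a hint and entries may be returned in no particular order, so for a strict "range" on the UI you may need to sort client-side per page or keep a secondary sorted-set index.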

Redis: Using lua and concurrent transactions

Two issues
Do Lua scripts really solve all cases for Redis transactions?
What are best practices for asynchronous transactions from one client?
Let me explain, starting with the first issue.
Redis transactions are limited, with an inability to unwatch specific keys, and all keys being unwatched upon exec; we are limited to a single ongoing transaction on a given client.
I've seen threads where many Redis users claim that Lua scripts are all they need. Even the official Redis docs state they may remove transactions in favour of Lua scripts. However, there are cases where this is insufficient, such as the most standard case: using Redis as a cache.
Let's say we want to cache some data from a persistent data store, in redis. Here's a quick process:
Check cache -> miss
Load data from database
Store in redis
However, what if, between step 2 (loading data), and step 3 (storing in redis) the data is updated by another client?
The data stored in Redis would be stale. So... we use a Redis transaction, right? We WATCH the key before loading from the db, and if the key is updated somewhere else before storage, the storage fails. Great! However, within an atomic Lua script we cannot load data from an external database, so Lua cannot be used here. Hopefully I'm simply missing something, or there is something wrong with our process.
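The WATCH/MULTI/EXEC pattern described here works because any write to a watched key between WATCH and EXEC makes EXEC abort (return null), so the stale value is never stored and the caller can retry. Since demonstrating that needs a live server, here is a dependency-free Java sketch of the same optimistic check using version numbers; the `OptimisticCache` class is illustrative, not a Redis API.

```java
import java.util.HashMap;
import java.util.Map;

public class OptimisticCache {
    // Minimal stand-in for WATCH/MULTI/EXEC: every write bumps a version,
    // and a "transaction" only commits if the watched version is unchanged.
    public static final Map<String, String> values = new HashMap<>();
    public static final Map<String, Integer> versions = new HashMap<>();

    public static int watch(String key) {                 // like WATCH key
        return versions.getOrDefault(key, 0);
    }

    public static void write(String key, String value) {  // any plain write
        values.put(key, value);
        versions.merge(key, 1, Integer::sum);
    }

    public static boolean execSet(String key, int watchedVersion, String value) {
        if (versions.getOrDefault(key, 0) != watchedVersion) {
            return false;                                 // like EXEC returning null
        }
        write(key, value);
        return true;
    }

    public static void main(String[] args) {
        int v = watch("user:1");                   // step 0: WATCH
        String fromDb = "loaded-from-db";          // step 2: load from database
        write("user:1", "updated-elsewhere");      // another client updates meanwhile
        boolean committed = execSet("user:1", v, fromDb); // step 3: MULTI/EXEC SET
        System.out.println("committed=" + committed); // false: stale write refused, retry
    }
}
```

On a failed commit the caller simply re-reads the cache (which now holds the newer value) or repeats the whole watch/load/exec cycle.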
Moving on to the second issue (asynchronous transactions).
Let's say we have a socket.io cluster which processes various messages, and requests for a game, for high speed communication between server and client. This cluster is written in node.js with appropriate use of promises and asynchronous concepts.
Say two requests hit a server in our cluster, which require data to be loaded and cached in redis. Using our transaction from above, multiple keys could be watched, and multiple multi->exec transactions would run in overlapping order on one redis connection. Once the first exec is run, all watched keys will be unwatched, even if the other transaction is still running. This may allow the second transaction to succeed when it should have failed.
These overlaps could happen in totally separate requests happening on the same server, or even sometimes in the same request if multiple data types need to load at the same time.
What is best practice here? Do we need to create a separate Redis connection for every individual transaction? It seems like we would lose a lot of speed, and we would see many connections created from just one server if that is the case.
As an alternative we could use Redlock / mutex locking instead of Redis transactions, but this is slow by comparison.
Any help appreciated!
I received the following after my query was escalated to Redis engineers:
Hi Jeremy,
Your method using multiple backend connections would be the expected way to handle the problem. We do not see anything wrong with multiple backend connections, each using an optimistic Redis transaction (WATCH/MULTI/EXEC) - there is no chance that the “second transaction will succeed where it should have failed”.
Using LUA is not a good fit for this problem.
Best Regards,
The Redis Labs Team

How can I write a tuple into Redis as well as Cassandra using a Trident topology

I am writing a Trident topology to process a stream of data from Kafka and feed it into Redis and Cassandra. I am able to write the data into Cassandra. Now I would like to write the same data into Redis.
Is there a way to duplicate the tuples and branch them into two flows, where one goes into Redis and the other goes into Cassandra?
For Trident you can go with something like this:
TridentTopology topology = new TridentTopology();
Stream stream = topology.newStream("MySpout", spout);
stream.partitionPersist(...); // to Redis
stream.partitionPersist(...); // to Cassandra
This will save the data from your stream to both databases in parallel.
However, I would also consider whether such parallel writes should be done inside a single topology, or whether having two different topologies reading from the same topic is a better idea. Imagine the Cassandra cluster goes down. With two topologies you'll still be able to continue saving the data to Redis. But with a single topology, every tuple that fails to reach Cassandra will most likely result in a FailedException to trigger replaying, and every subsequent replay of that tuple will involve saving it to Redis once again unnecessarily.

Redis for session storage

I am building a security service as part of a suite of services that make up an application. I am considering using Redis to store sessions. A session is a data structure that looks like this:
{
string : sessionToken
DateTime : expiryUtc
string[] : permissionKeys
}
All I need to do is create, read and remove sessions. If I can have Redis remove expired sessions, then great, but that's not essential. As a noob to Redis I have some reading to do, but can someone with Redis experience give me guidance on the correct way to achieve this, assuming Redis is a good choice? BTW, I'm on the Mono platform and have so far selected the StackExchange.Redis client, as at some stage I will want to cluster Redis. I am open to changing this selection.
You can go with Redis hashes; they will match your structure pretty well: http://redis.io/topics/data-types-intro#redis-hashes. The session token can be the key of the whole hash. The StackExchange.Redis client has a KeyExpire method which can take a DateTime parameter, so you can have Redis expire your keys. Inside Redis hashes you can't have nested structures, so your permissionKeys and any other values that go inside must be stored as simple values; you can serialize them as JSON.
One more thing about hashes is that they allow for some memory optimization: http://redis.io/topics/memory-optimization#use-hashes-when-possible, which can be pretty useful if you will have many sessions to create (because Redis will store all of these in RAM).
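The hash mapping described above can be sketched in dependency-free Java (the question's actual stack is StackExchange.Redis on Mono, so this is only a shape illustration; the field names and the comma-joined encoding of permissionKeys are assumptions, and the equivalent Redis commands are shown in comments):

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

public class SessionHash {
    // Flatten a session into the field/value pairs a Redis hash would hold.
    // Nested values (permissionKeys) are encoded as one flat string, since
    // Redis hashes cannot contain nested structures.
    public static Map<String, String> toHash(String token, Instant expiryUtc,
                                             String[] permissionKeys) {
        Map<String, String> fields = new HashMap<>();
        fields.put("sessionToken", token);
        fields.put("expiryUtc", expiryUtc.toString());
        fields.put("permissionKeys", String.join(",", permissionKeys));
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> h = toHash("abc123",
                Instant.parse("2030-01-01T00:00:00Z"),
                new String[] { "read", "write" });
        System.out.println(h);
        // Against a real server the equivalent commands would be roughly:
        //   HSET session:abc123 sessionToken abc123 expiryUtc 2030-01-01T00:00:00Z permissionKeys read,write
        //   EXPIREAT session:abc123 <unix seconds>   (lets Redis evict expired sessions)
    }
}
```

Setting an expiry on the whole hash key is what makes Redis remove expired sessions for you, covering the "remove expired sessions" wish without any application-side cleanup job.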

What happens when redis gets overloaded?

If Redis gets overloaded, can I configure it to drop SET requests? I have an application where data is updated in real time (10-15 times a second per item) for a large number of items. The values become outdated quickly and I don't need any kind of consistency.
I would also like to compute a parallel sum of the values that are written in real time. What's the best option here? Lua executed in Redis? A small app located on the same box as Redis, using Unix sockets?
When Redis gets overloaded it will just slow down its clients. For most commands, the protocol itself is synchronous.
Redis supports pipelining, though there is no way a client can cancel traffic that is still in the pipeline but not yet acknowledged by the server. Redis itself does not really queue the incoming traffic; the TCP stack does.
So it is not possible to configure the Redis server to drop SET requests. However, it is possible to implement a last value queue on the client side:
the queue is actually represented by two maps indexed by your items (only one value stored per item). The primary map will be used by the application. The secondary map will be used by a specific thread. The contents of the two maps can be swapped in an atomic way.
a specific thread blocks while the primary map is empty. When it is not, it swaps the contents of the two maps and sends the content of the secondary map asynchronously to Redis, using aggressive pipelining and variadic-parameter commands. It also receives acks from Redis.
while the thread is working with the secondary map, the application can still fill the primary map. If Redis is too slow, the application will only accumulate last values in the primary map.
This strategy could be implemented in C with hiredis and the event loop of your choice.
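The two-map last-value-queue strategy can also be sketched in Java instead of C. In this minimal sketch (class and method names are made up), the flush callback stands in for the pipelined Redis writes:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

public class LastValueQueue {
    // Primary map: filled by the application, at most one (latest) value per item.
    private Map<String, Long> primary = new HashMap<>();

    // Application side: overwrite the previous value, never queue duplicates.
    // If Redis is slow, values simply accumulate (last one wins) in this map.
    public synchronized void put(String item, long value) {
        primary.put(item, value);
    }

    // Writer side: atomically swap in a fresh map, then hand the drained
    // batch to the sender (which would pipeline it to Redis) outside the lock.
    public void drainTo(Consumer<Map<String, Long>> sender) {
        Map<String, Long> batch;
        synchronized (this) {
            if (primary.isEmpty()) return;
            batch = primary;
            primary = new HashMap<>();
        }
        sender.accept(batch); // e.g. one pipelined run of variadic commands
    }

    public static void main(String[] args) {
        LastValueQueue q = new LastValueQueue();
        q.put("item:1", 10);
        q.put("item:1", 42); // overwrites: only the last value survives
        q.put("item:2", 7);
        q.drainTo(batch -> System.out.println("flushing " + batch));
    }
}
```

A dedicated writer thread would call drainTo in a loop; because put overwrites in place, the application never blocks and Redis only ever sees the most recent value per item.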
However, it is not trivial to implement, so I would first check whether the performance of Redis against all the traffic is enough for my purpose. It is not uncommon to benchmark Redis at more than 500K op/s these days (using a single core). Nothing prevents you from sharding your data across multiple Redis instances if needed.
You will likely saturate the network links before the CPU of the Redis server. That's why it is better to implement the last value queue (if needed) on client side rather than server side.
Regarding the sum computation, I would try to calculate and maintain it in real time. For instance, the GETSET command can be used to set a new value while returning the previous one.
Instead of just setting your values, you could do:
[old value] = GETSET item <new value>
INCRBY mysum [new value] - [old value]
The mysum key will contain the sum of your values for all the items at any time. With Redis 2.6, you can use Lua to encapsulate this calculation and save round trips.
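The GETSET + INCRBY bookkeeping above can be checked with a quick dependency-free simulation, with plain Java fields standing in for the item keys and the mysum key:

```java
import java.util.HashMap;
import java.util.Map;

public class RunningSum {
    public static final Map<String, Long> items = new HashMap<>(); // one key per item
    public static long mysum = 0;                                  // the "mysum" key

    // Equivalent of: [old value] = GETSET item <new value>
    //                INCRBY mysum <new value> - [old value]
    public static void set(String item, long value) {
        long old = items.getOrDefault(item, 0L); // GETSET returns the previous value
        items.put(item, value);
        mysum += value - old;                    // INCRBY with the delta
    }

    public static void main(String[] args) {
        set("item:1", 10);
        set("item:2", 5);
        set("item:1", 3);  // overwrite: the sum drops by 7
        System.out.println("mysum=" + mysum); // 8 == 3 + 5
    }
}
```

On a real server the GETSET and INCRBY would be two round trips per update unless wrapped in a Lua script, which is exactly the round-trip saving the answer mentions.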
Running a big batch to calculate statistics on existing data (this is how I understand your "parallel" sum) is not really suitable for Redis. It is not designed for map/reduce like computation.