Get complete Map from Redis if a key contains a String

I am able to add and get a particular user object from Redis. I am adding the object like this:
private static final String USER_PREFIX = ":USER:";

public void addUserToRedis(String serverName, User user) {
    redisTemplate.opsForHash().put(serverName + USER_PREFIX + user.getId(),
            Integer.toString(user.getId()), user);
}
If a userId is 100, I am able to get it by the key SERVER1:USER:100.
Now I want to retrieve all users as Map<String, List<User>>.
For example, can I get all users by the key SERVER1:USER:? Is that possible, or do I need to modify my addUserToRedis method? Please advise.

I would recommend not using the KEYS command in production, as it can severely impact Redis latencies (and can even bring down the cluster if you have a large number of keys stored).
Instead, you want to use a different command than plain GET/SET.
It would be better if you used Sets or Hashes:
127.0.0.1:6379> sadd server1 user1 user2
(integer) 2
127.0.0.1:6379> smembers server1
1) "user2"
2) "user1"
127.0.0.1:6379>
Using sets, you can simply add your users to server keys and get the entire list of users on a server.
If you really need a map of <server, list<users>>, you can use hashes with stringified user data and then convert it to an actual User POJO at the application layer:
127.0.0.1:6379> hset server2 user11 name
(integer) 1
127.0.0.1:6379> hset server2 user13 name
(integer) 1
127.0.0.1:6379> hgetall server2
1) "user11"
2) "name"
3) "user13"
4) "name"
127.0.0.1:6379>
Also note that keeping this much data under a single key is not ideal.
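In the question's Spring Data Redis terms, a minimal sketch of the hash-per-server approach might look like this (assuming the same redisTemplate and User type; the ":USERS" suffix is a hypothetical naming choice):

public void addUserToRedis(String serverName, User user) {
    // All users of one server live in a single hash; field = userId, value = User.
    redisTemplate.opsForHash().put(serverName + ":USERS",
            Integer.toString(user.getId()), user);
}

public Map<String, User> getAllUsersFromRedis(String serverName) {
    // HGETALL returns every field/value pair of the hash in one round trip.
    Map<Object, Object> raw = redisTemplate.opsForHash().entries(serverName + ":USERS");
    Map<String, User> users = new HashMap<>();
    raw.forEach((id, u) -> users.put((String) id, (User) u));
    return users;
}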

I don't use Java, but here's how to use SCAN with ioredis in Node.js:
const Redis = require('ioredis')
const redis = new Redis()

async function main() {
    const stream = redis.scanStream({
        match: "*:user:*",
        count: 100,
    })
    stream.on("data", (resultKeys) => {
        for (let i = 0; i < resultKeys.length; i++) {
            // console.log(resultKeys[i])
            // do your things here
        }
    });
    stream.on("end", () => {
        console.log("all keys have been visited");
    });
}

main()

Finally, I came up with this solution using a wildcard search while avoiding KEYS. Here is my complete method:
public Map<String, User> getUserMapFromRedis(String serverName) {
    Map<String, User> users = new HashMap<>();
    RedisConnection redisConnection = null;
    try {
        redisConnection = redisTemplate.getConnectionFactory().getConnection();
        ScanOptions options = ScanOptions.scanOptions()
                .match(serverName + USER_PREFIX + "*")
                .build();
        Cursor<byte[]> scan = redisConnection.scan(options);
        while (scan.hasNext()) {
            byte[] next = scan.next();
            String key = new String(next, StandardCharsets.UTF_8);
            String[] keyArray = key.split(":");
            String userId = keyArray[2];
            User user = // get User by userId from Redis
            users.put(userId, user);
        }
        try {
            scan.close();
        } catch (IOException e) {
            // nothing useful to do if closing the cursor fails
        }
    } finally {
        if (redisConnection != null) {
            redisConnection.close(); // ensure the connection is returned
        }
    }
    return users;
}

Related

For loop with If condition in Terraform

I'm writing a module to create multiple S3 buckets with all the related resources. Currently, I'm a bit stuck on the server-side encryption, as I need to parametrize the KMS key id with a key that has not been created yet.
The variables passed to the module are:
A list of S3 buckets
A map with the KMS keys created
The structure of the S3 buckets variable is:
type = list(object({
  bucket          = string
  acl             = string
  versioning      = string
  kms_description = string
  logging         = bool
  loggingBucket   = optional(string)
  logPath         = optional(string)
}))
The structure of the KMS map is similar to:
kms_resources = {
  0 = {
    kms_arn         = (known after apply)
    kms_description = "my-kms"
    kms_id          = (known after apply)
  }
}
This variable is an output from a previous module that creates all the KMS keys. The output is created this way:
output "kms_resources" {
value = {
for kms, details in aws_kms_key.my-kms :
kms => ({
"kms_description" = details.description
"kms_id" = details.key_id
"kms_arn" = details.arn
})
}
}
As you can see, the idea is that, in the S3 variable, the user can select his own KMS key, but I'm struggling to retrieve the value. At the moment the resource looks like this:
resource "aws_s3_bucket_server_side_encryption_configuration" "my-s3-buckets" {
count = length(var.s3_buckets)
bucket = var.s3_buckets[count.index].bucket
rule {
apply_server_side_encryption_by_default {
kms_master_key_id = [
for k in var.kms_keys : k.kms_id
if k.kms_description == var.s3_buckets[count.index].kms_description
]
sse_algorithm = "aws:kms"
}
}
}
I thought it was going to work, but Terraform gives me Inappropriate value for attribute "kms_master_key_id": string required. I would also like that, if no matching key exists, kms_master_key_id defaults to aws/s3.
The problem seems to be that you pass a list as kms_master_key_id instead of just a string. The fact that you alternatively want to use "aws/s3" actually makes the fix quite easy:
kms_master_key_id = concat([
  for k in var.kms_keys : k.kms_id
  if k.kms_description == var.s3_buckets[count.index].kms_description
], ["aws/s3"])[0]
That way you first get a list containing the key(s) matching the description, and then append the default S3 key. Afterwards you simply pick the first element of the list: either the first actual match, or, if none matched, the default key.

Efficient Redis SCAN of multiple key patterns

I'm trying to power some multi-selection query & filter operations with SCAN operations on my data and I'm not sure if I'm heading in the right direction.
I am using AWS ElastiCache (Redis 5.0.6).
Key design: <recipe id>:<recipe name>:<recipe type>:<country of origin>
Example:
13434:Guacamole:Dip:Mexico
34244:Gazpacho:Soup:Spain
42344:Paella:Dish:Spain
23444:HotDog:StreetFood:USA
78687:CustardPie:Dessert:Portugal
75453:Churritos:Dessert:Spain
If I want to power queries with complex multi-selection filters (for example, returning all keys matching five recipe types from two different countries), which the SCAN glob-style match pattern can't handle, what is the common way to go about it in a production scenario?
Assuming I will calculate all possible patterns by taking the cartesian product of all field-alternation patterns and multi-field filters:
[[Guacamole, Gazpacho], [Soup, Dish, Dessert], [Portugal]]
*:Guacamole:Soup:Portugal
*:Guacamole:Dish:Portugal
*:Guacamole:Dessert:Portugal
*:Gazpacho:Soup:Portugal
*:Gazpacho:Dish:Portugal
*:Gazpacho:Dessert:Portugal
What mechanism should I use to implement this sort of pattern matching in Redis?
1) Do multiple SCANs, one per scannable pattern, sequentially, and merge the results?
2) A Lua script that applies improved pattern matching to each key while scanning, so that all matching keys are collected in a single SCAN pass?
3) An index built on top of sorted sets, supporting fast lookups of keys matching single fields, solving alternation within a field with ZUNIONSTORE and intersection across fields with ZINTERSTORE?
<recipe name>:: => key1, key2, keyN
:<recipe type>: => key1, key2, keyN
::<country of origin> => key1, key2, keyN
4) An index built on top of sorted sets, supporting fast lookups of keys matching all dimensional combinations, thereby avoiding unions and intersections but using more storage and extending my index keyspace footprint?
<recipe name>:: => key1, key2, keyN
<recipe name>:<recipe type>: => key1, key2, keyN
<recipe name>::<country of origin> => key1, key2, keyN
:<recipe type>: => key1, key2, keyN
:<recipe type>:<country of origin> => key1, key2, keyN
::<country of origin> => key1, key2, keyN
5) Leverage RediSearch? (Not possible for my use case, but see Tug Grall's answer, which appears to be a very nice solution.)
6) Other?
I've implemented 1) and the performance is awful.
private static HashSet<String> redisScan(Jedis jedis, String pattern, int scanLimitSize) {
    ScanParams params = new ScanParams().count(scanLimitSize).match(pattern);
    ScanResult<String> scanResult;
    List<String> keys;
    String nextCursor = "0";
    HashSet<String> allMatchedKeys = new HashSet<>();
    do {
        scanResult = jedis.scan(nextCursor, params);
        keys = scanResult.getResult();
        allMatchedKeys.addAll(keys);
        nextCursor = scanResult.getCursor();
    } while (!nextCursor.equals("0"));
    return allMatchedKeys;
}
private static HashSet<String> redisMultiScan(Jedis jedis, ArrayList<String> patternList, int scanLimitSize) {
    HashSet<String> mergedHashSet = new HashSet<>();
    for (String pattern : patternList)
        mergedHashSet.addAll(redisScan(jedis, pattern, scanLimitSize));
    return mergedHashSet;
}
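Called, for instance, with part of the cartesian-product pattern list from above (a hedged usage sketch; it assumes an existing Jedis connection, and the COUNT value of 1000 is an arbitrary choice):

ArrayList<String> patterns = new ArrayList<>(Arrays.asList(
        "*:Guacamole:Soup:Portugal",
        "*:Guacamole:Dish:Portugal",
        "*:Gazpacho:Dessert:Portugal"));
// Runs one full SCAN per pattern and merges the matched keys.
HashSet<String> matched = redisMultiScan(jedis, patterns, 1000);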
For 2) I've created a Lua script to help with the server-side SCAN. The performance is not brilliant, but it is much faster than 1), even taking into consideration that Lua doesn't support alternation in matching patterns, so I have to loop each key through a pattern list for validation:
local function MatchAny( str, pats )
    for pat in string.gmatch(pats, '([^|]+)') do
        local w = string.match( str, pat )
        if w then return w end
    end
end

-- ARGV[1]: Scan Count
-- ARGV[2]: Scan Match Glob-Pattern
-- ARGV[3]: Patterns

local cur = 0
local rep = {}
local tmp
repeat
    tmp = redis.call("SCAN", cur, "MATCH", ARGV[2], "count", ARGV[1])
    cur = tonumber(tmp[1])
    if tmp[2] then
        for k, v in pairs(tmp[2]) do
            local fi = MatchAny(v, ARGV[3])
            if (fi) then
                rep[#rep+1] = v
            end
        end
    end
until cur == 0
return rep
Called in such a fashion:
private static ArrayList<String> redisLuaMultiScan(Jedis jedis, String luaSha, List<String> KEYS, List<String> ARGV) {
    Object response = jedis.evalsha(luaSha, KEYS, ARGV);
    if (response instanceof List<?>)
        return (ArrayList<String>) response;
    else
        return new ArrayList<>();
}
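The script has to be loaded once so that its SHA-1 digest can be passed to EVALSHA; a hedged sketch (it assumes the Lua source above is available as the String luaScript, and the argument values are purely illustrative):

// Load the script once and reuse its SHA-1 for subsequent EVALSHA calls.
String luaSha = jedis.scriptLoad(luaScript);
List<String> matched = redisLuaMultiScan(jedis, luaSha,
        Collections.emptyList(),       // KEYS: the script uses none
        Arrays.asList(
                "1000",                // ARGV[1]: SCAN COUNT hint
                "*:*:Dessert:*",       // ARGV[2]: server-side glob pattern
                "Spain$|Portugal$"));  // ARGV[3]: '|'-separated Lua patterns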
For 3) I've implemented and maintained a secondary index, updated for each of the 3 fields using sorted sets, and implemented querying with alternating matching patterns on single fields and multi-field matching patterns like this:
private static Set<String> redisIndexedMultiPatternQuery(Jedis jedis, ArrayList<ArrayList<String>> patternList) {
    ArrayList<String> unionedSets = new ArrayList<>();
    String keyName;
    Pipeline pipeline = jedis.pipelined();
    for (ArrayList<String> subPatternList : patternList) {
        if (subPatternList.isEmpty()) continue;
        keyName = "un:" + RandomStringUtils.random(KEY_CHAR_COUNT, true, true);
        pipeline.zunionstore(keyName, subPatternList.toArray(new String[0]));
        unionedSets.add(keyName);
    }
    String[] unionArray = unionedSets.toArray(new String[0]);
    keyName = "in:" + RandomStringUtils.random(KEY_CHAR_COUNT, true, true);
    pipeline.zinterstore(keyName, unionArray);
    Response<Set<String>> response = pipeline.zrange(keyName, 0, -1);
    pipeline.del(unionArray);
    pipeline.del(keyName);
    pipeline.sync();
    return response.get();
}
The results of my stress tests clearly favor 3) in terms of request latency.
I would vote for option 3, but I will probably start to use RediSearch.
Also, have you looked at RediSearch? This module allows you to create secondary indexes and do complex queries and full-text search.
This may simplify your development.
I invite you to look at the project and its Getting Started guide.
Once installed you will be able to achieve it with the following commands:
HSET recipe:13434 name "Guacamole" type "Dip" country "Mexico"
HSET recipe:34244 name "Gazpacho" type "Soup" country "Spain"
HSET recipe:42344 name "Paella" type "Dish" country "Spain"
HSET recipe:23444 name "Hot Dog" type "StreetFood" country "USA"
HSET recipe:78687 name "Custard Pie" type "Dessert" country "Portugal"
HSET recipe:75453 name "Churritos" type "Dessert" country "Spain"
FT.CREATE idx:recipe ON HASH PREFIX 1 recipe: SCHEMA name TEXT SORTABLE type TAG SORTABLE country TAG SORTABLE
FT.SEARCH idx:recipe "#type:{Dessert}"
FT.SEARCH idx:recipe "#type:{Dessert} #country:{Spain}" RETURN 1 name
FT.AGGREGATE idx:recipe "*" GROUPBY 1 #type REDUCE COUNT 0 as nb_of_recipe
I am not explaining all the commands in detail here, since you can find the explanations in the tutorial, but here are the basics:
use a hash to store the recipes
create a RediSearch index and index the fields you want to query
Run queries, for example:
To get all Spanish desserts: FT.SEARCH idx:recipe "#type:{Dessert} #country:{Spain}" RETURN 1 name
To count the number of recipes by type: FT.AGGREGATE idx:recipe "*" GROUPBY 1 #type REDUCE COUNT 0 as nb_of_recipe
I ended up using a simple strategy to update each secondary index for each field when the key is created:
protected static void setKeyAndUpdateIndexes(Jedis jedis, String key, String value, int idxDimSize) {
    String[] key_arr = key.split(":");
    Pipeline pipeline = jedis.pipelined();
    pipeline.set(key, value);
    for (int y = 0; y < key_arr.length; y++)
        pipeline.zadd(
                "idx:" +
                StringUtils.repeat(":", y) +
                key_arr[y] +
                StringUtils.repeat(":", idxDimSize - y),
                java.time.Instant.now().getEpochSecond(),
                key);
    pipeline.sync();
}
The search strategy to find multiple keys matching a pattern, including alternating patterns and multi-field filters, was achieved like this:
private static Set<String> redisIndexedMultiPatternQuery(Jedis jedis, ArrayList<ArrayList<String>> patternList) {
    ArrayList<String> unionedSets = new ArrayList<>();
    String keyName;
    Pipeline pipeline = jedis.pipelined();
    for (ArrayList<String> subPatternList : patternList) {
        if (subPatternList.isEmpty()) continue;
        keyName = "un:" + RandomStringUtils.random(KEY_CHAR_COUNT, true, true);
        pipeline.zunionstore(keyName, subPatternList.toArray(new String[0]));
        unionedSets.add(keyName);
    }
    String[] unionArray = unionedSets.toArray(new String[0]);
    keyName = "in:" + RandomStringUtils.random(KEY_CHAR_COUNT, true, true);
    pipeline.zinterstore(keyName, unionArray);
    Response<Set<String>> response = pipeline.zrange(keyName, 0, -1);
    pipeline.del(unionArray);
    pipeline.del(keyName);
    pipeline.sync();
    return response.get();
}
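For the example filter [[Guacamole, Gazpacho], [Soup, Dish, Dessert], [Portugal]], the patternList then holds the names of the per-field index sets (a hedged sketch, consistent with the idx: naming and the idxDimSize = 3 assumed above):

ArrayList<ArrayList<String>> patternList = new ArrayList<>();
// Alternatives within a field are unioned (ZUNIONSTORE)...
patternList.add(new ArrayList<>(Arrays.asList("idx::Guacamole::", "idx::Gazpacho::")));
patternList.add(new ArrayList<>(Arrays.asList("idx:::Soup:", "idx:::Dish:", "idx:::Dessert:")));
patternList.add(new ArrayList<>(Arrays.asList("idx::::Portugal")));
// ...and the fields are then intersected (ZINTERSTORE).
Set<String> matchedKeys = redisIndexedMultiPatternQuery(jedis, patternList);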

Why am I getting a missing hash-tags error when I try to run JedisCluster.scan() using a match pattern?

I'm trying to run SCAN on my Redis cluster using Jedis. I tried using the .scan(...) method as follows with a match pattern, but I get the following error:
"JedisCluster only supports SCAN commands with MATCH patterns containing hash-tags"
my code is as follows (excerpted):
private final JedisCluster redis;
...
String keyPrefix = "helloWorld:*";
ScanParams params = new ScanParams()
        .match(keyPrefix)
        .count(100);
String cur = SCAN_POINTER_START;
boolean done = false;
while (!done) {
    ScanResult<String> resp = redis.scan(cur, params);
    ...
    cur = resp.getStringCursor();
    if (resp.getStringCursor().equals(SCAN_POINTER_START)) {
        done = true;
    }
}
When I run my code, it gives a weird error about hash-tags:
"JedisCluster only supports SCAN commands with MATCH patterns containing hash-tags"
In the redis-cli I can just use match patterns like the one I wrote for the keyPrefix variable. Why am I getting an error?
How do I get Jedis to show me all the keys that match a given substring?
The problem was that the redis variable is a JedisCluster object and not a Jedis object.
A cluster object holds a collection of nodes, and scanning the cluster is different from scanning an individual node.
To solve the issue, you can scan each individual node, like so:
String keyPrefix = "helloWorld:*";
ScanParams params = new ScanParams()
        .match(keyPrefix)
        .count(100);
redis.getClusterNodes().values().stream().forEach(pool -> {
    boolean done = false;
    String cur = SCAN_POINTER_START;
    try (Jedis jedisNode = pool.getResource()) {
        while (!done) {
            ScanResult<String> resp = jedisNode.scan(cur, params);
            ...
            cur = resp.getStringCursor();
            if (cur.equals(SCAN_POINTER_START)) {
                done = true;
            }
        }
    }
});
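As the error message itself suggests, JedisCluster.scan(...) can also be used directly when the MATCH pattern contains a hash-tag, because keys sharing a hash-tag all map to the same slot, so the scan can be routed to a single node. A hedged sketch (it assumes the keys were written with a hash-tag, e.g. "{helloWorld}:1", which is a different key naming than in the question):

// Only valid if all keys were created with the "{helloWorld}" hash-tag.
ScanParams taggedParams = new ScanParams()
        .match("{helloWorld}:*")
        .count(100);
ScanResult<String> resp = redis.scan(SCAN_POINTER_START, taggedParams);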

Kafka-Streams: How to query RocksDB persistent store when the key is avro serialized?

I am building a Kafka Streams topology in which I need to materialize the contents of a GlobalKTable in a persistent RocksDB store. This store will be further used on every instance of my application for querying.
The store consists of key-value pairs, where the key is serialized using Avro and the payload is just a byte array.
My problem occurs when I try to query the store by a specific key, as shown in the code below:
fun topology() {
    val streamsBuilder = StreamsBuilder()
    streamsBuilder.globalTable<Key, ByteArray>(
        SOURCE_TOPIC_FOR_GLOBAL_KTABLE,
        Consumed.with(null /* Uses default serde provided in config, which is SpecificAvroSerde */,
            Serdes.ByteArray()),
        Materialized.`as`(STORE_NAME))
    val kafkaStreams = KafkaStreams(streamsBuilder.build(), getProperties(), DefaultKafkaClientSupplier())
    kafkaStreams.start()
}

private fun getProperties(): Properties {
    val properties = Properties()
    properties[StreamsConfig.BOOTSTRAP_SERVERS_CONFIG] = "kafka-brokers-ip"
    properties[AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG] = "schema-registry-ip"
    properties[StreamsConfig.APPLICATION_ID_CONFIG] = "app-id"
    properties[ConsumerConfig.CLIENT_ID_CONFIG] = "client-id"
    properties[ConsumerConfig.AUTO_OFFSET_RESET_CONFIG] = "earliest"
    properties[StreamsConfig.STATE_DIR_CONFIG] = "C:/dev/kafka-streams"
    properties[StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG] = SpecificAvroSerde::class.java
    properties[StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG] = SpecificAvroSerde::class.java
    return properties
}
And this is the function that I use to query the store:
val store = kafkaStreams
    .store(STORE_NAME, QueryableStoreTypes.keyValueStore<Key, ByteArray>())
store.all().forEach { keyValue ->
    if (keyValue.key == Key("25-OCT-19 USD")) {
        // we get here, so the key is in the store
        println("Keys are equal.")
        println(keyValue) // this gets printed: KeyValue({"value": "25-OCT-19 USD"}, [B#3ca343b5)
        println(store.get(Key("25-OCT-19 USD"))) // this is null
        println(store.get(keyValue.key)) // this is also null
    }
}
As you can see in the code comments, there are values in the store, but when I try to query by a specific key, I get nothing back.

SCAN command performance with phpredis

I'm replacing KEYS with SCAN using phpredis.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);
$redis->setOption(Redis::OPT_SCAN, Redis::SCAN_RETRY);

$it = NULL;
while ($arr_keys = $redis->scan($it, "mykey:*", 10000)) {
    foreach ($arr_keys as $str_key) {
        echo "Here is a key: $str_key\n";
    }
}
According to the Redis documentation, I use SCAN to paginate searches and so avoid the disadvantages of KEYS.
But in practice, the code above takes roughly 3 times as long as a single $redis->keys() call.
So I'm wondering whether I've done something wrong, or whether I simply have to pay in speed to avoid the dangers of KEYS.
Note that I have 400K+ keys in my DB in total, and only 4 keys matching mykey:*.
A word of caution about using the example:
$it = NULL;
while ($arr_keys = $redis->scan($it, "mykey:*", 10000)) {
    foreach ($arr_keys as $str_key) {
        echo "Here is a key: $str_key\n";
    }
}
That can return an empty array if none of the 10000 keys scanned in an iteration matches, at which point the while loop gives up before the whole keyspace has been visited, so you won't get all the keys you wanted! I would recommend doing something more like this:
$it = null;
do
{
    $arr_keys = $redis->scan($it, $key, 10000);
    if (is_array($arr_keys) && !empty($arr_keys))
    {
        foreach ($arr_keys as $str_key)
        {
            echo "Here is a key: $str_key\n";
        }
    }
} while ($arr_keys !== false);
As for why it takes so long: with 400k+ keys and a COUNT of 10000, that is 40 SCAN requests to Redis; if the server is not on the local machine, each of those 40 round trips adds network latency.
Since using KEYS in production environments is effectively forbidden (it blocks the entire server while iterating the global keyspace), there's no real discussion about whether or not to use KEYS.
On the other hand, if you want to speed things up, you should go further with Redis: you should index your data.
I doubt that these 400K keys couldn't be categorized in sets or sorted sets, or even hashes, so that when you need a particular subset of your 400K-key database, you could run a scan-equivalent command against a set of 1K items instead of 400K.
Redis is about indexing data. Otherwise, you're using it as just a simple key-value store.
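For illustration (a hedged sketch, not from the original answer): maintain a set holding the mykey:* key names at write time, then read that small set directly instead of scanning the whole keyspace. The idx:mykey set name is a hypothetical naming choice:

127.0.0.1:6379> sadd idx:mykey mykey:1 mykey:2 mykey:3 mykey:4
(integer) 4
127.0.0.1:6379> smembers idx:mykey
1) "mykey:1"
2) "mykey:2"
3) "mykey:3"
4) "mykey:4"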