Spring Data Redis ZADD command missing NX|XX|INCR options

Lettuce has supported ZADD's NX|XX|CH|INCR options since 2015 (link).
But I can't find anything that supports this in Spring Data Redis (version 2.1.5), which wraps Lettuce.
It seems that the only two add methods DefaultZSetOperations provides don't let me use the NX|XX|CH|INCR options:
Boolean add(K key, V value, double score);
Long add(K key, Set<TypedTuple<V>> tuples);
So how can I use the NX|XX|CH|INCR options in Spring Data Redis with Lettuce?
Sorry for my poor English. Thanks.

I'm not completely sure whether this works exactly the same way for Lettuce. For Jedis, I found that we have to use redisTemplate.execute(RedisCallback). Here is a small example using the ch argument, which indicates whether any records were changed (as opposed to just being added to the sorted set):
redisTemplate.execute(
    (RedisCallback<Long>) connection -> connection.zSetCommands().zAdd(
        leaderboardKey.getBytes(StandardCharsets.UTF_8),
        membersToTuples(members),
        RedisZSetCommands.ZAddArgs.empty().ch()
    )
);
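The same RedisCallback approach should work for Lettuce, since ZAddArgs is resolved at the connection level rather than by the driver. Also, newer Spring Data Redis versions expose some of these options directly on ZSetOperations. A minimal sketch, assuming Spring Data Redis 2.5+ and an injected StringRedisTemplate (key and member names are illustrative):

import java.nio.charset.StandardCharsets;
import org.springframework.data.redis.connection.RedisZSetCommands;
import org.springframework.data.redis.core.RedisCallback;
import org.springframework.data.redis.core.StringRedisTemplate;

public class ZAddOptionsExample {

    private final StringRedisTemplate redisTemplate;

    public ZAddOptionsExample(StringRedisTemplate redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    public void examples() {
        // ZADD NX through the high-level API (Spring Data Redis 2.5+):
        // adds the member only if it is not already in the sorted set.
        Boolean added = redisTemplate.opsForZSet()
                .addIfAbsent("leaderboard", "player1", 42.0);

        // ZADD XX CH through the low-level connection, mirroring the
        // Jedis example above: XX updates only existing members, CH
        // makes the reply reflect changed (not just added) entries.
        Boolean changed = redisTemplate.execute(
                (RedisCallback<Boolean>) connection -> connection.zSetCommands().zAdd(
                        "leaderboard".getBytes(StandardCharsets.UTF_8),
                        43.0,
                        "player1".getBytes(StandardCharsets.UTF_8),
                        RedisZSetCommands.ZAddArgs.empty().xx().ch()));

        // ZADD INCR without NX/XX behaves like ZINCRBY, which has long
        // been available as incrementScore.
        Double newScore = redisTemplate.opsForZSet()
                .incrementScore("leaderboard", "player1", 5.0);
    }
}

On 2.1.x, where ZAddArgs does not exist yet, the RedisCallback route still works, but you would have to issue the command at the RedisConnection level yourself.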

Related

How to check the value in Redis stored by Spring Boot @Cacheable

Sorry, I'm new to the combination of Redis and Spring Boot @Cacheable.
I just stored some data with Spring Boot @Cacheable in Redis and then tried to check the value from redis-cli.
However, I couldn't get the value with the key; it's always null, even though the API can get the value from Redis.
Here is my code:
// code in the Spring Boot controller
@GetMapping("/")
@ResponseBody
@Cacheable(value = "dataList")
public SomeObject getDataList(SomeParameters someParameters) { ... }
# commands I used to check the data in redis-cli
127.0.0.1:6379> keys *
1) "dataList::SimpleKey []"
127.0.0.1:6379> llen dataList
(integer) 0
127.0.0.1:6379> type dataList
none
127.0.0.1:6379> type dataList::SimpleKey
none
I'm wondering how to get the value with that key above...
Thank you in advance.
Oh, I just figured it out on my own. It turns out that the entire value inside the double quotes is the key!
So the correct way to get the value is:
127.0.0.1:6379> keys *
1) "dataList::SimpleKey []"
127.0.0.1:6379> get "dataList::SimpleKey []"
Hope this helps if you had the same confusion!
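As a side note, the "SimpleKey []" part comes from Spring's default SimpleKeyGenerator. If you want keys that are easier to type in redis-cli, you can build the key yourself with SpEL. A minimal sketch, where the id property on SomeParameters is a hypothetical example:

// Hypothetical sketch: "#someParameters.id" assumes SomeParameters has
// an id property; entries are then stored under keys like "dataList::42".
@GetMapping("/")
@ResponseBody
@Cacheable(value = "dataList", key = "#someParameters.id")
public SomeObject getDataList(SomeParameters someParameters) { ... }

A key like "dataList::42" can then be fetched with get "dataList::42", with no spaces to quote.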

Can django-redis use dbsize?

django-redis source: https://github.com/jazzband/django-redis/tree/master/django_redis
My problem is that I cannot find a method to get the number of keys in the Redis database (the Redis command for this is DBSIZE). The available methods are: set, get, add, delete, delete_pattern, delete_many, clear, get_many, set_many, incr, decr, has_keys, keys, iter_keys, ttl, pttl, persist, expire, expire_at, pexpire, pexpire_at, lock, close, touch.
How can I use the DBSIZE Redis command with the django-redis library?
Environment:
Django version: 3.2.10
django-redis: 5.2.0
I found the solution to the question:
from django_redis import get_redis_connection

REDIS = get_redis_connection("default")  # "default" is the alias of the Redis cache
REDIS.dbsize()  # number of keys in the currently selected database
This solution drops down to the native Redis client, so you can run raw Redis commands, but the django-redis plugin's own methods are not available on it.
WARNING: Not all pluggable clients support this feature.

Can a Redis Lua script contain keys determined at runtime?

Look at this Lua script:
local clientIds = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1], 'LIMIT', '0', ARGV[2]);
local prefix = 'lock:';
local lockedClientIds = {};
for _, value in ipairs(clientIds) do
    local key = prefix .. tostring(value)
    if redis.call('EXISTS', key) == 0 then
        redis.call('SET', key, value, 'PX', ARGV[3]);
        table.insert(lockedClientIds, value)
    end
end
redis.pcall('ZREM', KEYS[1], unpack(lockedClientIds));
return lockedClientIds;
It takes some values from the sorted set and uses them to create keys (after some simple concatenation). I'm not sure whether this is OK, because according to the Redis Lua documentation, all keys should be passed in the KEYS array, so they should be known up front, not computed at runtime.
All Redis commands must be analyzed before execution to determine
which keys the command will operate on. In order for this to be true
for EVAL, keys must be passed explicitly. This is useful in many ways,
but especially to make sure Redis Cluster can forward your request to
the appropriate cluster node. Note this rule is not enforced in order
to provide the user with opportunities to abuse the Redis single
instance configuration, at the cost of writing scripts not compatible
with Redis Cluster.
So does this mean there is a risk that this will only work with a single node, and that it won't work when Redis is distributed across many nodes?
YES, it is (highly) possible that the script will not work in cluster mode.
It will keep working in cluster mode only if all the keys it touches map to the same hash slot; hash tags can be used to ensure this (see the sketch below).
Note: I'm assuming that by "Redis is distributed across many nodes" you mean Redis Cluster mode.
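For example, if every key the script touches shares one hash tag, Redis Cluster maps them all to the same slot. A minimal sketch using Spring Data Redis, where the "{locks}" tag, key names, and arguments are illustrative, and the Lua prefix would change to '{locks}:lock:' accordingly:

import java.util.Collections;
import java.util.List;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;

public class HashTagExample {

    // Pass a KEYS[1] that carries the same "{locks}" hash tag the script
    // uses when it builds lock keys at runtime ('{locks}:lock:' .. id),
    // so every key involved hashes to one cluster slot.
    @SuppressWarnings({"rawtypes", "unchecked"})
    public List<String> lockPending(StringRedisTemplate template, String luaSource) {
        DefaultRedisScript<List> script = new DefaultRedisScript<>(luaSource, List.class);
        return template.execute(
                script,
                Collections.singletonList("{locks}:clients"), // KEYS[1]
                "1700000000", "10", "30000");                 // ARGV[1..3]
    }
}

With this naming, the ZRANGEBYSCORE, EXISTS, SET, and ZREM calls all stay on a single node, at the cost of concentrating all of these keys in one slot.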

How to avoid duplicates in BigQuery when streaming with Apache Beam IO?

We are using a pretty simple flow where messages are retrieved from PubSub, their JSON content is flattened into two types (for BigQuery and Postgres) and then inserted into both sinks.
But we are seeing duplicates in both sinks (Postgres was mostly fixed with a unique constraint and an "ON CONFLICT ... DO NOTHING").
At first we trusted the "insertId" UUID that Apache Beam/BigQuery supposedly creates.
Then we added a "unique_label" attribute to each message before publishing it to PubSub, using data from the JSON itself to give each message uniqueness (a device_id plus a reading's timestamp), and subscribed to the topic using that attribute with the "withIdAttribute" method.
Finally we paid for GCP Support, and their "solutions" do not work. They even told us to use the Reshuffle transform, which is deprecated by the way, and some windowing (which we do not want, since we want near-real-time data).
This is the main flow, pretty basic:
[UPDATED WITH LATEST CODE]
Pipeline
val options = PipelineOptionsFactory.fromArgs(*args).withValidation().`as`(OptionArgs::class.java)
val pipeline = Pipeline.create(options)

var mappings = ""

// Value only available at runtime
if (options.schemaFile.isAccessible) {
    mappings = readCloudFile(options.schemaFile.get())
}

val tableRowMapper = ReadingToTableRowMapper(mappings)
val postgresMapper = ReadingToPostgresMapper(mappings)

val pubsubMessages =
    pipeline
        .apply("ReadPubSubMessages",
            PubsubIO
                .readMessagesWithAttributes()
                .withIdAttribute("id_label")
                .fromTopic(options.pubSubInput))

pubsubMessages
    .apply("AckPubSubMessages", ParDo.of(object : DoFn<PubsubMessage, String>() {
        @ProcessElement
        fun processElement(context: ProcessContext) {
            LOG.info("Processing readings: " + context.element().attributeMap["id_label"])
            context.output("")
        }
    }))

val disarmedMessages =
    pubsubMessages
        .apply("DisarmedPubSubMessages",
            DisarmPubsubMessage(tableRowMapper, postgresMapper))

disarmedMessages
    .get(TupleTags.readingErrorTag)
    .apply("LogDisarmedErrors", ParDo.of(object : DoFn<String, String>() {
        @ProcessElement
        fun processElement(context: ProcessContext) {
            LOG.info(context.element())
            context.output("")
        }
    }))

disarmedMessages
    .get(TupleTags.tableRowTag)
    .apply("WriteToBigQuery",
        BigQueryIO
            .writeTableRows()
            .withoutValidation()
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
            .withFailedInsertRetryPolicy(InsertRetryPolicy.neverRetry())
            .to(options.bigQueryOutput))

pipeline.run()
DisarmPubsubMessage is a PTransform that uses the FlatMapElements transform to produce TableRow and ReadingsInputFlatten (our own class for Postgres) outputs.
We expect zero duplicates, or at least "best effort" (plus a cleaning cron job we added); we paid for these products to run statistics and big data analysis...
[UPDATE 1]
I even appended a new simple transform that logs our unique attribute through a ParDo, which supposedly should ack the PubsubMessage, but this is not the case:
[screenshot: new flow with AckPubSubMessages step]
Thanks!!
Looks like you are using the global window. One technique would be to window this into an N-minute window, then process the keys in the window and drop any items with duplicate keys, as in the sketch below.
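A minimal sketch of that idea in Java (the window size is illustrative, the attribute name is taken from the question, and Distinct only deduplicates within each window):

import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
import org.apache.beam.sdk.transforms.Distinct;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptors;
import org.joda.time.Duration;

public class DedupInWindow {

    // Fixed N-minute windows, then Distinct keyed by the id_label
    // attribute: duplicates landing in the same window are dropped.
    static PCollection<PubsubMessage> dedup(PCollection<PubsubMessage> messages) {
        return messages
                .apply("Window", Window.into(FixedWindows.of(Duration.standardMinutes(5))))
                .apply("DedupById", Distinct
                        .<PubsubMessage, String>withRepresentativeValueFn(m -> m.getAttribute("id_label"))
                        .withRepresentativeType(TypeDescriptors.strings()));
    }
}

Duplicates that arrive in different windows still get through; that is the trade-off against latency when choosing the window size.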
The supported programming languages are Python and Java; your code seems to be Kotlin, and as far as I know that is not supported. I strongly recommend using Java to avoid any unsupported features of the language you use.
In addition, I would recommend the following approaches to deal with duplicates; option 2 could meet your near-real-time needs:
1. message_id. Probably you already read the FAQ - duplicates, which points to a deprecated doc. However, if you check the PubsubMessage object, you will notice that messageId is still available and will be populated if not set by the publisher (see the sketch after this list):
"ID of this message, assigned by the server when the message is published ... It must not be populated by the publisher in a topics.publish call"
2. BigQuery streaming. To catch duplicates while loading the data, you can create a UUID right before inserting into BQ. Please refer to the section Example sink: Google BigQuery.
3. Try the Dataflow template PubSubToBigQuery and validate that there are no duplicates in BQ.
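A minimal sketch for option 1 (the topic name is illustrative, and it assumes a Beam version recent enough to offer readMessagesWithAttributesAndMessageId):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
import org.apache.beam.sdk.values.PCollection;

public class ReadWithMessageId {

    // Reads messages including the server-assigned messageId, which can
    // then serve as a dedup key downstream via PubsubMessage.getMessageId().
    static PCollection<PubsubMessage> read(Pipeline pipeline, String topic) {
        return pipeline.apply(
                "ReadWithMessageId",
                PubsubIO.readMessagesWithAttributesAndMessageId().fromTopic(topic));
    }
}

For option 2 the same idea applies at the sink: stamp each TableRow with a UUID column before BigQueryIO and deduplicate at query time.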

ServiceStack Redis: when using SetEntry, it automatically generates a set with the key "ids:" + objectName in the Redis db; how can I disable it?

When using SetEntry, it will automatically generate a set with the key "ids:" + objectName in the Redis db.
For example:
typedClient.SetEntry("family:username:john", new Family { FatherName = "John", ... });
a set with the key "ids:Family" and a member like "2343443" will be automatically created in the Redis db,
and each time I update or modify the same key with SetEntry, the "ids:Family" set grows by one new auto-generated member. This set will grow extremely large if I update the key frequently.
How can I disable the auto-generated set? It seems useless in the current circumstances.
Thanks.
I ran into this same problem: I discovered that our database contained a couple dozen of these "ids:XXX" sets, each containing tens of millions of items, which were consuming significant amounts of memory.
The solution is to switch to untyped clients. You can still use typed methods on the client, so you're really not giving up any type safety or automatic serialization. There are a couple of ways to create clients; we tend to use the get-in-get-out Exec shortcuts on RedisClientsManager. You should be able to adapt this to the way you do it.
Typed client - creates "ids" sets:
// set:
redis.ExecAs<T>(c => c.SetEntry(key, value));
// get:
T value = redis.ExecAs<T>(c => c.GetValue(key));
Untyped client - no "ids" sets created:
// set:
redis.Exec(c => c.Set(key, value));
// get:
using (var cli = _redis.GetClient())
{
    T value = cli.Get<T>(key);
}
The inferred auto-generated ids appear when you use the high-level Redis typed client. Use IRedisClient.SetEntry on the string-based RedisClient API instead.