Redis HyperLogLog - Too many errors

The scenario is really simple. I'm adding 50 elements (different each time) to an HLL. Usually, after the third batch, I get a wrong PFCOUNT (151 instead of 150). I know that HLL has a low error rate, but is it really this easy to get an inaccurate count? Can this error be handled?
Thanks in advance.
Here are the logs.
127.0.0.1:6379> PFADD test DaG4yPCb vrTDeJde SCcK4rvG K0UJPxeT s1RtvWyf EpkUaxhY y4ot0BQW vt13T2eS 5rFe0TKj yXm25gXb 4nnw8YYy Fnqdb4C6 rwuPLUyC W9uS0az7 koOtrENo hIjAa00k eT3VvI7Q zQVhYnYY 1Cshhbbk 8q3B82gH NWlnW5QH fbNYBXoy 4ti95TeI TiUyXs0W TAepHjdd CK26UGuC ESt9opXO ihYIo1L9 0XqFKx8x coh31ZxE 01G7eCjb wJZYByUo ZHfJIKoQ tFGPsdgZ 19DUQvNX 20QtyIVq Xjx4wT9z nJazaXtH cHEqmQjZ hz8j0uhT hpeygfWk hWBf44rU iUJbsPSY nIYDiV80 FgaEU3pI 7EEkDGY6 tPF0KHFM twVbY3wR xFpEg4jP 4JEW0pue
127.0.0.1:6379> PFCOUNT test
(integer) 50
127.0.0.1:6379> PFADD test elapxije pbjtcvbg pjoiaarc pogpnjqd ujzfiuyu kykxhqpl hnkwmwpq gljpsnwu rlnflrdb wexqthqe hwbcgbvt yjdddtpo lnkqcoaz tcjgnxme aiflckyh rfsmwzgw eooownar pkvhdwae tywuoxgv mojqkmqd gepsxhqj cbgrmzih jkormrfk irasppno mmealsye fdumtspr anisssut tuqlufyr coqebpyn zijsoauj akvcvkda jruskmma kalinqpr lsazgswh ozyajcpm edvodqnt befvtsbx bcaurnjh psgdgval pyktekgo kucfjnov xruaulrl rrwqzjac ppbbhdhz iohaeoiq fbztqesn zsfnxzsa masqfqjo fsybqced xzfdhtzv
(integer) 1
127.0.0.1:6379> PFCOUNT test
(integer) 100
127.0.0.1:6379> PFADD test hukqyega olgswnll ufzjkscd oygfsgdu bttlwivr xrvtjsfc criuaabz idxilrvd kitvpuzb ehwrvcip ljthitya clgciaex bagxomaq ziszyehx uuhytedx xycrfcgf nmbnxkav ylxxyyrp rfwniodp vezvqefz gomrekbf tirdnpbp fpbokjjz dwppiomo zgypqxyh kavukjeb wsomngmh oawosnvf tinruzjc bbfqchbn airifskr dqcaznzt vnpfejep jmdlwbek eubhstbo iamgnktp gfojfegy hvmbszlu poauswtc tdgozdfy cxdsprqo pjsuxult nctztxwb fbayirlw dcitezyn zufryoro tisxdwtn mmgztjie vykdkvwm dqogmhnm
(integer) 1
127.0.0.1:6379> PFCOUNT test
(integer) 151

From https://redis.io/commands/PFCOUNT
The returned cardinality of the observed set is not exact, but approximated with a standard error of 0.81%.
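In your case the error is 1/150, or about 0.67%, which is within the documented standard error. That figure is a statistical standard error rather than a hard bound, so a deviation of this size is expected behaviour, not a bug.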
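The comparison above can be checked with a few lines of arithmetic. This is only a sketch: the 16384-register figure is how the Redis documentation describes its HyperLogLog, 1.04/sqrt(m) is the usual HyperLogLog standard error estimate, and the function names are made up for illustration.

```python
import math

# Assumption: number of registers Redis uses for a HyperLogLog (2**14).
REGISTERS = 16384


def hll_standard_error(registers: int = REGISTERS) -> float:
    """Usual HyperLogLog standard error estimate: 1.04 / sqrt(m)."""
    return 1.04 / math.sqrt(registers)


def relative_error(estimate: int, actual: int) -> float:
    """Relative deviation of an estimated count from the true count."""
    return abs(estimate - actual) / actual


std_err = hll_standard_error()
observed = relative_error(151, 150)  # the PFCOUNT from the question
print(f"standard error: {std_err:.4%}")
print(f"observed error: {observed:.4%}")
print("within one standard error:", observed <= std_err)
```

If an exact count is required, a plain set with SCARD is the usual trade: exact results at the cost of memory that grows with the number of elements.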

Related

Redis - Check if a given set of ids is part of a Redis list/hash

I have a large set of ids (around 100000) which I want to store in Redis.
I am looking for the most efficient way to check, given a list of ids, which of them are part of my set.
If I use a Redis set, I can use SISMEMBER to check whether a single id is part of my set, but in this case I want to pass a whole list of ids and find out which ones are members.
Example:
redis> SADD myset "1"
(integer) 1
redis> SADD myset "2"
(integer) 1
redis> MYCOMMAND myset "[1,2,4,5]"
(list) 1, 2
Does anything like this exist already?
Thanks!
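One possible approach, assuming Redis 6.2 or later, is the SMISMEMBER command, which checks several members against a set in a single call. The sketch below uses redis-py's `smismember`; the helper name `members_present` is hypothetical, and `client` is assumed to be a connected `redis.Redis` instance.

```python
from typing import Iterable, List


def members_present(client, key: str, ids: Iterable[str]) -> List[str]:
    """Return the ids that are members of the set stored at `key`.

    Hypothetical helper: `client` is assumed to behave like redis.Redis.
    """
    ids = list(ids)
    if not ids:
        return []  # SMISMEMBER needs at least one member argument
    # One flag per id, in the same order: truthy if the id is in the set.
    flags = client.smismember(key, ids)
    return [member for member, flag in zip(ids, flags) if flag]


# Usage against a real server (not executed here):
#   import redis
#   r = redis.Redis(decode_responses=True)
#   members_present(r, "myset", ["1", "2", "4", "5"])
```

On servers older than 6.2, pipelining one SISMEMBER per id gives the same answer in one round trip.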

How to find in sorted set an element which is just below a certain value

First, I am new to Redis. Well let's say I have done:
127.0.0.1:6379> zadd subs:x 0 0
127.0.0.1:6379> zadd subs:x 500 500
127.0.0.1:6379> zadd subs:x 1000 1000
127.0.0.1:6379> zadd subs:x 5000 5000
127.0.0.1:6379> zadd subs:x 10000 10000
I want to find the element just above the value 2000 and the one just below it.
Above is simple and easy:
127.0.0.1:6379> ZRANGEBYSCORE subs:x 2000 +inf LIMIT 0 1
1) "5000"
But how do I find the element below in a simple way?
1) I know I can do:
127.0.0.1:6379> ZRANGEBYSCORE subs:x -inf 2000 LIMIT 2 1
1) "1000"
But I have to know before running this command that the offset is 2, so in general I have to find the offset first.
2) Or I can find the ZRANK of the element above and then move one step backward:
127.0.0.1:6379> ZRANK subs:x 5000
(integer) 3
127.0.0.1:6379> ZRANGE subs:x 2 2
1) "1000"
So my question is: is there a simple way to get the element just below a certain value?
Like above, but for below, use ZREVRANGEBYSCORE, you should.
Translation from Yoda-speak:
Redis actually features a command that does just what you're looking for - ZREVRANGEBYSCORE. ZREVRANGEBYSCORE does the same thing as ZRANGEBYSCORE but uses reverse ordering (as the "REV" in its name suggests).
That would allow you to get the "below 2000" member easily with just one call, as you've shown in your comment. May the force be with you.
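As a rough sketch of that single call from a client, the helper below asks for one member on each side of a value using redis-py's `zrevrangebyscore` and `zrangebyscore`. The name `neighbours` is hypothetical, `client` is assumed to be a connected `redis.Redis` instance, and both bounds are inclusive, so a member whose score equals the value is returned on both sides.

```python
from typing import Optional, Tuple


def neighbours(client, key: str, value: float) -> Tuple[Optional[str], Optional[str]]:
    """Return (member at or just below value, member at or just above value).

    Hypothetical helper: `client` is assumed to behave like redis.Redis.
    """
    # Reverse order from `value` down to -inf, keep only the first hit.
    below = client.zrevrangebyscore(key, value, "-inf", start=0, num=1)
    # Forward order from `value` up to +inf, keep only the first hit.
    above = client.zrangebyscore(key, value, "+inf", start=0, num=1)
    return (below[0] if below else None, above[0] if above else None)


# Usage against a real server (not executed here):
#   import redis
#   r = redis.Redis(decode_responses=True)
#   neighbours(r, "subs:x", 2000)
```

Note the argument order: the reverse command takes the maximum first and the minimum second, which is the opposite of ZRANGEBYSCORE.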

redis shows empty set for lrange list

I'm seeing a strange behavior in redis when using the lrange command.
I have a list called "test" with 10000000 values. When I ask for 100 rows starting at 99999 it returns an empty set?!
Any ideas why?
127.0.0.1:6379> keys *
1) "test"
127.0.0.1:6379> type test
list
127.0.0.1:6379> llen test
(integer) 10000000
127.0.0.1:6379> lrange test 99999 100
(empty list or set)
I misunderstood the arguments - it's not like similar commands in other languages.
The stop value is not "how many rows from 'start' should I pull", but rather "which row should I stop at", and that row is included in the result.
So lrange test 99999 100 meant "start at 99999, end at 100", which makes no sense.
To get 100 rows starting at 99999, I would have to do lrange test 99999 100098.
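The conversion from "start plus row count" to LRANGE's inclusive stop index is easy to get off by one, so a small sketch may help. The helper names are hypothetical, `start` is assumed to be a non-negative index, and `client` is assumed to be a connected `redis.Redis` instance.

```python
def lrange_stop(start: int, count: int) -> int:
    """Inclusive stop index for reading `count` elements beginning at `start`."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return start + count - 1


def fetch_rows(client, key: str, start: int, count: int):
    """Read `count` list elements starting at index `start`.

    Hypothetical helper: `client` is assumed to behave like redis.Redis.
    """
    return client.lrange(key, start, lrange_stop(start, count))


print(lrange_stop(99999, 100))
```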

Redis: How to intersect a "normal" set with a sorted set?

Assume I have a set (or sorted set or list if that would be better) A of 100 to 1000 strings.
Then I have a sorted set B of many more strings, say one million.
Now C should be the intersection of A and B (of the strings of course).
I want to have every tuple (X, SCORE_OF_X_IN_B) where X is in C.
Any Idea?
I have two ideas:
1. Interstore
   - store A as a sorted set with every score being 0
   - interstore to D
   - get every item of D
   - delete D
2. Simple loop in client
   - loop over A in my client program
   - get the zscore for every string
While 1. has way too much overhead on the Redis side (it has to write, for example, and the Redis page also states quite a high time complexity: http://redis.io/commands/zinterstore), 2. would need |A| separate requests to the database and won't be a good choice.
Maybe I could write a Redis/Lua script which would work like zscore but with an arbitrary number of strings, but I'm not sure if my hosting provider allows scripts...
So I just wanted to ask SO if there is an elegant and fast solution available without scripting!
There is a simple solution to your problem: ZINTERSTORE will work with a SET and a ZSET. Try:
redis> sadd foo a
(integer) 1
redis> zadd bar 1 a
(integer) 1
redis> zadd bar 2 b
(integer) 1
redis> zinterstore baz 2 foo bar AGGREGATE MAX
(integer) 1
redis> zrange baz 0 -1 withscores
1) "a"
2) "1"
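Edit: I added AGGREGATE MAX above, since Redis gives each member of the (non-sorted) set foo a default score of 1 and, by default, SUMs that with whatever score the member has in the (sorted) set bar. Note that MAX only returns the score from bar when that score is at least 1; passing WEIGHTS 0 1 with the default SUM keeps bar's score for any value.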
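Because members of a plain set count as score 1 in ZINTERSTORE, AGGREGATE MAX reports 1 for any member whose sorted-set score is below 1. A hedged alternative is to weight the plain set by 0 and the sorted set by 1, so the default SUM leaves only the sorted-set score. The sketch below uses redis-py, where passing a dict to `zinterstore` maps each key to its weight; the helper name and the temporary key are hypothetical, and `client` is assumed to be a connected `redis.Redis` instance.

```python
def scores_of_common_members(client, plain_set: str, sorted_set: str, tmp_key: str):
    """Return (member, score) pairs for members present in both keys.

    Hypothetical helper: `client` is assumed to behave like redis.Redis,
    and `tmp_key` is a scratch key that is safe to overwrite and delete.
    """
    # Weight 0 cancels the implicit score of 1 given to plain-set members,
    # weight 1 keeps the sorted-set score unchanged under the default SUM.
    client.zinterstore(tmp_key, {plain_set: 0, sorted_set: 1})
    try:
        return client.zrange(tmp_key, 0, -1, withscores=True)
    finally:
        client.delete(tmp_key)  # clean up the scratch key either way


# Usage against a real server (not executed here):
#   import redis
#   r = redis.Redis(decode_responses=True)
#   scores_of_common_members(r, "foo", "bar", "tmp:foo-bar")
```

This still writes a temporary key on the server, so it does not remove the write overhead the question mentions; it only makes the returned scores match the sorted set.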

Is there MGET analog for Redis hashes?

I'm planning to start using hashes instead of regular keys, but I can't find any information about a multi-get for hash keys in the Redis wiki. Is this kind of command supported by Redis?
Thank you.
You can query hashes, or any keys, in a pipeline, i.e. in one request to your Redis instance. The actual implementation depends on your client, but with redis-py it would look like this:
import redis
conn = redis.Redis()  # assumes a local server on the default port
pipe = conn.pipeline()  # queue commands instead of sending them one by one
pipe.hgetall('foo')
pipe.hgetall('bar')
pipe.hgetall('zar')
hash1, hash2, hash3 = pipe.execute()  # results come back in queue order
The client will issue one request with 3 commands. This is the same technique that is used to add multiple values to a set at once.
Read more at http://redis.io/topics/pipelining
No MHGETALL but you can Lua it:
-- collect the HGETALL reply for every key passed to the script, in order
local r = {}
for _, key in ipairs(KEYS) do
  r[#r+1] = redis.call('HGETALL', key)
end
return r
If SORT let you use multiple GETs with the -> syntax, and all your hashes had the same fields, you could get them in a bulk reply by putting their names into a set and sorting that.
SORT names_of_hashes GET *->field1 *->field2 *->field3 *->etc
But it doesn't look like you can do that with the hash access. Plus you'd have to turn the return list back into hashes yourself.
UPDATE: Redis seems to let you fetch multiple fields if you name your hashes nicely:
redis> hset hash:1 name fish
(integer) 1
redis> hset hash:2 name donkey
(integer) 1
redis> hset hash:3 name horse
(integer) 1
redis> hset hash:1 type fish
(integer) 1
redis> hset hash:2 type mammal
(integer) 1
redis> hset hash:3 type mammal
(integer) 1
redis> sadd animals 1
(integer) 1
redis> sadd animals 2
(integer) 1
redis> sadd animals 3
(integer) 1
redis> sort animals get # get hash:*->name get hash:*->type
1. "1"
2. "fish"
3. "fish"
4. "2"
5. "donkey"
6. "mammal"
7. "3"
8. "horse"
9. "mammal"
There is no command to do it in one shot, but there is a way to do it "nicely": use a list (or sorted set) to store your hash keys, and then retrieve the hashes in bulk using MULTI.
In PHP:
$redis->zAdd("myHashzSet", 1, "myHashKey:1");
$redis->zAdd("myHashzSet", 2, "myHashKey:2");
$redis->zAdd("myHashzSet", 3, "myHashKey:3");
$members = $redis->zRange("myHashzSet", 0, -1);
$redis->multi();
foreach ($members as $hashKey) {
    $redis->hGetAll($hashKey);
}
$results = $redis->exec();
I recommend using a sorted set, where you use the score as an ID for your hash; this lets you take advantage of all the score-based commands.
Redis has an HMGET command, which returns the values of several fields of a single hash with one command.
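HMGET works on one hash at a time, so reading the same fields from several hashes still needs one call per hash; queuing those calls in a pipeline keeps it to one round trip. The sketch below uses redis-py's `pipeline` and `hmget`; the helper name `fetch_fields` is hypothetical, and `client` is assumed to be a connected `redis.Redis` instance.

```python
from typing import Dict, List, Optional


def fetch_fields(client, hash_keys: List[str], fields: List[str]) -> Dict[str, Dict[str, Optional[str]]]:
    """Read the same fields from several hashes in one round trip.

    Hypothetical helper: `client` is assumed to behave like redis.Redis.
    """
    pipe = client.pipeline()
    for key in hash_keys:
        pipe.hmget(key, fields)  # queued, not sent yet
    rows = pipe.execute()  # one reply per queued HMGET, in queue order
    # A field that does not exist in a hash comes back as None.
    return {key: dict(zip(fields, row)) for key, row in zip(hash_keys, rows)}


# Usage against a real server (not executed here):
#   import redis
#   r = redis.Redis(decode_responses=True)
#   fetch_fields(r, ["hash:1", "hash:2", "hash:3"], ["name", "type"])
```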