Redis search with index giving inconsistent results

We are not able to get consistent search results from Redis.
Redis server v=7.0.7 sha=00000000:0 malloc=jemalloc-5.2.1 bits=64 build=2260280010e18db8
root@redis01:~# redis-cli
127.0.0.1:6379> info modules
module:name=ReJSON,ver=20008,api=1,filters=0,usedby=[search],using=[],options=[handle-io-errors]
module:name=search,ver=20405,api=1,filters=0,usedby=[],using=[ReJSON],options=[]
We have exactly 5022786 key/value pairs, and the same number of entries are in our 'idx36' index. There is no incoming traffic, so this dataset of 5022786 entries remains constant at all times.
127.0.0.1:6379> info keyspace
db0:keys=5022786,expires=5022786,avg_ttl=257039122
127.0.0.1:6379> ft.info idx36
 1) index_name
 2) idx36
 3) index_options
 4) 1) "NOFREQS"
 5) index_definition
 6) 1) key_type
    2) HASH
    3) prefixes
    4) 1) 36|
    5) default_score
    6) "1"
 7) attributes
 8) 1) 1) identifier
       2) CheckIn
       3) attribute
       4) CheckIn
       5) type
       6) NUMERIC
    2) 1) identifier
       2) HotelCode
       3) attribute
       4) HotelCode
       5) type
       6) TAG
       7) SEPARATOR
       8)
 9) num_docs
10) "5022786"
11) max_doc_id
12) "12729866"
13) num_terms
14) "0"
15) num_records
16) "1.8446744073526942e+19"
17) inverted_sz_mb
18) "162.96730041503906"
19) vector_index_sz_mb
20) "0"
21) total_inverted_index_blocks
22) "17965200"
23) offset_vectors_sz_mb
24) "0"
25) doc_table_size_mb
26) "1843.12548828125"
27) sortable_values_size_mb
28) "0"
29) key_table_size_mb
30) "194.14376831054688"
31) records_per_doc_avg
32) "3672612012032"
33) bytes_per_record_avg
34) "9.2636185181071973e-12"
35) offsets_per_term_avg
36) "0"
37) offset_bits_per_record_avg
38) "-nan"
39) hash_indexing_failures
40) "333006"
41) indexing
42) "0"
43) percent_indexed
44) "1"
45) gc_stats
46) 1) bytes_collected
    2) "264321009"
    3) total_ms_run
    4) "1576438"
    5) total_cycles
    6) "5"
    7) average_cycle_time_ms
    8) "315287.59999999998"
    9) last_run_time_ms
   10) "706929"
   11) gc_numeric_trees_missed
   12) "0"
   13) gc_blocks_denied
   14) "23251"
47) cursor_stats
48) 1) global_idle
    2) (integer) 0
    3) global_total
    4) (integer) 0
    5) index_capacity
    6) (integer) 128
    7) index_total
    8) (integer) 0
This index has a TAG field called 'HotelCode' and a NUMERIC field called 'CheckIn'.
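For reference, a document covered by this index would be a hash stored under the 36| prefix, created with something like the following (the key suffix and field values here are made up for illustration):

HSET 36|000001 HotelCode "AO-B0" CheckIn 1695168000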
Now we try to search for all entries whose 'HotelCode' contains the string 'AO-B0' (which should be all of them):
127.0.0.1:6379> ft.search idx36 '(@HotelCode:{AO\-B0})' limit 0 0
1) (integer) 2708499
And now we try to search for all entries whose 'HotelCode' does not contain the string 'AO-B0' (which should be 0):
127.0.0.1:6379> ft.search idx36 '-(@HotelCode:{AO\-B0})' limit 0 0
1) (integer) 0
But these two counts don't add up to the total number of entries (2708499 + 0 falls far short of 5022786). Even if I'm wrong and not all entries contain the 'AO-B0' string, if I repeat the first search the result changes every time:
127.0.0.1:6379> ft.search idx36 '(@HotelCode:{AO\-B0})' limit 0 0
1) (integer) 2615799
(0.50s)
127.0.0.1:6379> ft.search idx36 '(@HotelCode:{AO\-B0})' limit 0 0
1) (integer) 2442799
(0.50s)
127.0.0.1:6379> ft.search idx36 '(@HotelCode:{AO\-B0})' limit 0 0
1) (integer) 2626299
(0.50s)
127.0.0.1:6379> ft.search idx36 '(@HotelCode:{AO\-B0})' limit 0 0
1) (integer) 2694899
(0.50s)
127.0.0.1:6379> ft.search idx36 '(@HotelCode:{AO\-B0})' limit 0 0
1) (integer) 2516699
(0.50s)
If I now try this less restrictive search, I should get more entries ... but no:
127.0.0.1:6379> ft.search idx36 '@HotelCode:{AO\-B*}' limit 0 0
1) (integer) 1806899
Maybe I'm doing something wrong ... if someone could point me in the right direction ...

Related

Reduce Redis Memory Usage

I am moving an existing scheduling data set to Redis. This data has schedules and users; it is a many-to-many relationship.
I store the full list of schedules in a scored zset where the score is the timestamp of the schedule date. I store it like this so I can easily find all schedules that have elapsed and act on those schedules.
I also need the ability to find all schedules that belong to a user, so each user has their own zset containing duplicate information.
So the data may look like this:
s_1000: [ (100, "{..}"), (101, "{..}") ] # the schedules key
us_abc: [ (100, "{..}"), ] # a user's schedules key
us_efg: [ (100, "{..}"), ] # another user's schedules key
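A minimal sketch of how this layout is populated, reusing the key names, scores, and placeholder payloads from the example above:

ZADD s_1000 100 "{..}" 101 "{..}"
ZADD us_abc 100 "{..}"
ZADD us_efg 100 "{..}"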
An actual record looks like this:
"{\"di\":10000,\"ci\":10000,\"si\":10000,\"p\":\"M14IB5A2830TE4KSSEGY0ZDX37V93FYX\",\"sse\":false}"
I've shortened the keys, and could even remove them altogether along with the JSON formatting for a really minimal payload, but all the data needs to be there.
This string alone is only 85 chars. Because there is a copy of each record, that makes a total of 170 chars for this record. The key for this would be us_M14IB5A2830TE4KSSEGY0ZDX37V93FYX_YYMMDD, for another 42 chars. In total, I'm seeing only about 255 bytes necessary to store this data.
I've inserted 100k records just like this one in the way I've described. By my count, that should only require about 25 MB (255 bytes x 100k), but I'm seeing it take up well over 200 MB.
The memory usage for that payload is 344 bytes (x 100k = 33 MB).
The memory usage for the schedules key is 18,108,652 bytes (18 MB).
The schedules key's memory usage looks correct.
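For context, those per-key figures presumably come from MEMORY USAGE calls along these lines (key names reuse the example above):

MEMORY USAGE us_abc
(integer) 344
MEMORY USAGE s_1000
(integer) 18108652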
Here are the memory stats:
memory stats
 1) "peak.allocated"
 2) (integer) 3343080744
 3) "total.allocated"
 4) (integer) 201656296
 5) "startup.allocated"
 6) (integer) 3668896
 7) "replication.backlog"
 8) (integer) 0
 9) "clients.slaves"
10) (integer) 0
11) "clients.normal"
12) (integer) 1189794
13) "aof.buffer"
14) (integer) 0
15) "lua.caches"
16) (integer) 0
17) "db.0"
18) 1) "overhead.hashtable.main"
    2) (integer) 5850304
    3) "overhead.hashtable.expires"
    4) (integer) 4249632
19) "overhead.total"
20) (integer) 14958626
21) "keys.count"
22) (integer) 100036
23) "keys.bytes-per-key"
24) (integer) 1979
25) "dataset.bytes"
26) (integer) 186697670
27) "dataset.percentage"
28) "94.297752380371094"
29) "peak.percentage"
30) "6.0320491790771484"
31) "allocator.allocated"
32) (integer) 202111512
33) "allocator.active"
34) (integer) 204464128
35) "allocator.resident"
36) (integer) 289804288
37) "allocator-fragmentation.ratio"
38) "1.011640191078186"
39) "allocator-fragmentation.bytes"
40) (integer) 2352616
41) "allocator-rss.ratio"
42) "1.4173845052719116"
43) "allocator-rss.bytes"
44) (integer) 85340160
45) "rss-overhead.ratio"
46) "0.98278516530990601"
47) "rss-overhead.bytes"
48) (integer) -4988928
49) "fragmentation"
50) "1.4126673936843872"
51) "fragmentation.bytes"
52) (integer) 83200072
It looks like the bytes per key is a whopping 1979 bytes.
Why does each key use 344 bytes? Is it possible to tell Redis to use only 1 byte per char?
Why does Redis use so many bytes per key?
Is there a way I can structure my data better so I don't blow up Redis's memory on such a small amount of data? (I need hundreds of millions of records.)

Redis CLUSTER NODES showing in slowlog

I am using a Redis cluster with 3 masters and 3 slaves as a MySQL cache, and the client is Redisson with the @Cacheable annotation. But I found some slow log entries for the CLUSTER NODES command, like:
3) 1) (integer) 4
   2) (integer) 1573033128
   3) (integer) 10955
   4) 1) "CLUSTER"
      2) "NODES"
   5) "192.168.110.102:57172"
   6) ""
4) 1) (integer) 3
   2) (integer) 1573032928
   3) (integer) 10120
   4) 1) "CLUSTER"
      2) "NODES"
   5) "192.168.110.90:59456"
   6) ""
So, I want to know: what is the problem?

redis LRANGE and SRANDMEMBER Together

I have some records in Redis:
1) "one"
2) "two"
3) "three"
4) "four"
5) "five"
6) "six"
7) "seven"
8) "eight"
9) "nine"
10) "ten"
11) "eleven"
12) "twelve"
13) "thirteen"
14) "fourteen"
15) "fifteen"
16) "sixteen"
17) "seventeen"
18) "eighteen"
19) "nineteen"
I have to get the first 10 values from the list:
LRANGE keyname 0 9
and I have to get the last 10 values from the list:
LRANGE keyname -10 -1
or, say, get the middle 10 values from the list:
LRANGE keyname (n/2) (n/2)+9
and I have to get 10 random values from this list:
SRANDMEMBER keyname 10
So, in order to perform all of these operations, which data type should I use in Redis to achieve this?
I am currently doing this:
LRANGE keyname randomNumber randomNumber+9
but it is not completely random.
EDIT:
I want to perform both operations on my data in Redis:
get a range of data (like LRANGE) and get random data (like SRANDMEMBER).
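One possible way to support both access patterns, sketched below under the assumption of Redis 6.2 or newer (which introduced ZRANDMEMBER), is to keep the values in a sorted set scored by position, so rank ranges and random picks both work against the same key; the key name here is made up:

ZADD nums 1 "one" 2 "two" 3 "three" 4 "four" 5 "five"
ZRANGE nums 0 9
ZRANGE nums -10 -1
ZRANDMEMBER nums 10

The first ZRANGE returns the first 10 members by rank, the second the last 10, and ZRANDMEMBER returns 10 random members.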

REDIS. Get where clause

I have this list of keys:
redis 127.0.0.1:6379> keys *
1) "r:fd:g1:1377550557255"
2) "r:fd:g1:1377550561240"
3) "r:fd:g1:1377550561561"
4) "r:fd:g1:1377550562300"
5) "r:fd:g1:1377550558977"
6) "r:fd:g1:1377550561344"
7) "r:fd:g1:1377550561832"
8) "r:fd:g1:1377550560344"
9) "r:fd:g1:1377550559978"
10) "r:fd:g1:1377550557777"
11) "r:fd:g1:1377550554258"
12) "r:fd:g1:1377550556772"
13) "r:fd:g1:1377550559649"
14) "r:fd:g1:1377550555460"
15) "r:fd:g1:1377550560895"
16) "r:fd:g1:1377550559139"
17) "r:fd:g1:1377550556595"
18) "r:fd:g1:1377550557634"
How can I get only the keys where the timestamp is greater than 1377550561300?
You can't do this.
But you can use a sorted set and write the timestamps as scores; then you'll be able to use
ZRANGEBYSCORE:
zrangebyscore key (1377550561300 +inf
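A minimal sketch of that approach, reusing the timestamps from the question (the zset key name here is made up; the leading ( makes the lower bound exclusive):

zadd r:fd:g1:index 1377550557255 "r:fd:g1:1377550557255"
zadd r:fd:g1:index 1377550561561 "r:fd:g1:1377550561561"
zrangebyscore r:fd:g1:index (1377550561300 +inf
1) "r:fd:g1:1377550561561"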

Why are these statements slow in Redis?

I have got the following slow query log entries in Redis. I have disabled writing to disk, so the database is an in-memory database. I am not able to understand why these two queries are slow.
FYI: I have 462698 hashes in total, with the key pattern key:<numeric_number>.
1) 1) (integer) 34
   2) (integer) 1364981115
   3) (integer) 10112
   4) 1) "HMGET"
      2) "key:123"
      3) "is_working"
6) 1) (integer) 29
   2) (integer) 1364923711
   3) (integer) 87705
   4) 1) "HMSET"
      2) "key:538771"
      3) "status_app"
      4) ".. (122246 more bytes)"