What is the equivalent to SQL's SELECT count(*) for Redis? - redis

I have a Redis Hash set up something like
"Profile:123":
"updatedAt": "2021/04/19"
and I need to be able to retrieve the count of all the Profiles that were updated on a given date. The SQL query would look something like SELECT count(*) FROM profiles WHERE updatedAt = "2021/04/19"; but I've been given the user story to implement that on a Redis cache and I'm not sure if it's possible.
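One pattern that would make this possible, assuming the application can also touch an index key whenever it writes a profile (the key names below are illustrative, not from the question), is to keep a Set of profile ids per date and read its cardinality:
SADD profiles:updatedAt:2021/04/19 Profile:123
SCARD profiles:updatedAt:2021/04/19 -> the count of profiles updated on that date
If a profile's updatedAt changes, it would also need an SREM from the old date's set to keep the count accurate.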

Related

Is hash always preferable if I'm always getting multiple fields in redis?

Let's say I am checking information about some of my users every second. I need to take an action on some of those users that may take more than a second. Something like this:
# pseudocode
users = DB.query("SELECT * FROM users WHERE state=5");
users.forEach(user => {
    if (user.needToDoThing()) {
        user.doThatThing();
    }
});
I want to make sure I won't accidentally run doThatThing on a user who has it already running. I am thinking of solving it by setting cache keys based on the user ID as things are processed:
# pseudocode
runningUsers = redis.getMeThoseUsers();
users = DB.query("SELECT * FROM users WHERE state=5 AND id NOT IN (runningUsers)");
redis.setThoseUsers(users);
users.forEach(user => {
    if (user.needToDoThing()) {
        user.doThatThing();
    }
    redis.unsetThatUser(user);
});
I am unsure if I should...
Use one hash with a field per user
Use multiple keys with mset and hget
Is there a performance or business reason I'd want one over the other? I am assuming I should use a hash so I can use hgetall to know who is running on that hash vs doing a scan on something like runningusers:*. Does that seem right?
Generally speaking, option 1 (one hash with a field per user) is probably the best method in most cases, because you want to access all of the users' fields at once. That can be done with HGETALL.
With the 2nd option (multiple keys with MSET and MGET), you have to query Redis separately to get each user's details. MGET lets you fetch all the values in one call, but you need to know the key name for every user. It is better suited to accessing just a few fields of an object. Disadvantage: it is possibly slower when you need to access all or most of the fields for the users.
NOTE: With the 1st option you can't set a TTL for a single user, because Redis has no TTL support for individual fields inside a hash; you can only set one on the entire hash. With the 2nd option, you can set a TTL for every single user.
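As a rough sketch of the two layouts (the key names, field names and the 60-second TTL below are made up, not from the question):
Option 1 - one hash, one field per running user:
HSET running:users 42 1 99 1
HGETALL running:users -> every running user in one call
HDEL running:users 42 -> unset a user when done
Option 2 - one key per running user, which allows a per-user TTL:
SET running:user:42 1 NX EX 60 -> NX sets it only if the user isn't already running; EX auto-expires it
DEL running:user:42 -> unset a user when done
The TTL in option 2 acts as a safety net so a crashed worker doesn't leave a user marked as running forever.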

Laravel Predis update/delete 1 key in array

For example I have an array/json with 100000 entries cached with Redis / Predis. Is it possible to update or delete 1 or more entries, or do I have to regenerate the whole array/json of 100000 entries? And how can I achieve that?
It depends on how you store it. If you are storing it as a string, then no:
set key value
get key -> will return you the value
Here value is your whole json/array with 100000 entries.
Instead, if you are storing it in a hash (http://redis.io/commands#hash):
hmset key member1 value1 member2 value2 ...
then you can update/delete member1 separately.
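For example (same hypothetical key and member names as above):
hset key member1 newvalue1 -> update just that member
hdel key member2 -> delete just that member
Both commands touch only the named field and leave the rest of the hash untouched.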
If you are using sets/lists you can achieve the same with similar commands like lpush/lpop, srem, etc.
Do read the commands section to learn more about Redis data structures, which will give you more flexibility in selecting your structure.
Hope this helps.
If you are using a cache service, you have to:
get data from cache
update some entries
save data back in cache
You could use advanced Redis data structures like hashes, but that is not supported by the Cache service; you would need to write your own functions.
Thanks Karthikeyan Gopall, I made an example:
Here I changed field1's value and it works :)
$client = Redis::connection();
// create the hash with two fields
$client->hmset('my:hash', ['field1' => 'value1', 'field2' => 'value2']);
// update a single field without touching the rest of the hash
$changevalue = $client->hset('my:hash', 'field1', 'newvaluesssssssssss');
$values1 = $client->hmget('my:hash', 'field1');
$values2 = $client->hmget('my:hash', 'field2');
print_r($values1);
print_r($values2);

Best way to store data and have ordered list at the same time

I have data that changes too often to live in my Postgres tables.
I would like to build top rankings out of that data.
I'm trying to figure out a way to do this considering :
Easiness of use
Performance
1. Using Hash + CRON to build ordered sets frequently
In this case, I have a lot of user data stored in hashes like this:
u:25463:d = { "xp": 45124, "lvl": 12, "like": 15, "liked": 2 }
u:2143:d = { "xp": 4523, "lvl": 10, "like": 12, "liked": 5 }
If I want to get the top 15 highest-lvl people, I don't think I can do that with a single command. I think I'd need to SCAN all the u:x:d keys and build sorted sets out of them. Am I mistaken?
What about performance in this case?
2. Multiple ordered sets
In this case, I duplicate data.
I still have the hashes from the first case, but I also update the data in the different sorted sets, and I don't need a CRON job to build them.
I feel like the best approach is the first one, but what if I have 1000000 users?
Or is there another way ?
One possibility would be to use a single sorted set + hashes.
The sorted set would just be used as a lookup, it would store the key of a user's hash as the value and their level as the score.
Any time you add a new player / update their level, you would both set the hash and insert the item into the sorted set. You could do this in a transaction-based pipeline, or a Lua script, to be sure they both run atomically, keeping your data consistent.
Getting the top players would mean grabbing the top entries in the sorted set, and then using the keys from that set, to go lookup the full data on those players with the hashes.
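A minimal sketch of that, reusing the u:<id>:d hashes from the question and a made-up sorted set name (lb:lvl):
MULTI
HSET u:25463:d xp 45124 lvl 12 like 15 liked 2
ZADD lb:lvl 12 u:25463:d
EXEC
ZREVRANGE lb:lvl 0 14 -> the 15 hash keys with the highest lvl
HGETALL u:25463:d -> the full data for one of those keys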
Hope that helps.

Azure Stream Analytics - long living calculation

I'm using Azure Stream Analytics for real-time analytics and I have a basic problem. I have a field by which I would like to count the number of messages.
The JSON is in the following format:
{ "categoryId": 100, "name": "hello" }
I would like to see the message count by category, so I assume that the query in Azure Stream Analytics should be:
SELECT
    categoryId,
    count(*) as categoryCount
INTO
    categoriesCount
FROM
    categoriesInput
GROUP BY
    categoryId
The problem is that I have to add a TumblingWindow or SlidingWindow to the GROUP BY clause. Is there a way to avoid that and have the calculation running indefinitely? Also I need to make sure the output is written to the SQL server.
How about a sliding window with a length of 1? This way it would act like a pointer, and every time it changes you can do the calculation.
Hope this helps!
Mert
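A sketch of that suggestion applied to the query above (assuming a one-second sliding window is acceptable; the count then covers the events seen in the last second, emitted whenever the window's contents change):
SELECT
    categoryId,
    count(*) as categoryCount
INTO
    categoriesCount
FROM
    categoriesInput
GROUP BY
    categoryId,
    SlidingWindow(second, 1)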

Searching in values of a redis db

I am a novice in using Redis DB. After reading some of the documentation, looking at some examples on the Internet and scanning stackoverflow.com, I can see that Redis is very fast and scales well, but this comes at the price that we have to think through how our data will be accessed at design time and what operations it will have to undergo. This I can understand, but I am a little confused about searching in the data, which was so easy, however slow, with plain old SQL. I could do it in one way with the KEYS command, but it is an O(N) operation and not O(log(N)), so I would lose one of the advantages of Redis.
What do more experienced colleagues say here?
Let's take an example use case: we need to store personal data for approx. 100,000 people, and that data needs to be searchable by name and phone number.
For this I would use the following structures:
1. SET for storing all persons' ids {id1, id2, ...}
2. HASH for each person to store personal data, named like map:<id>,
e.g. map:id1 {name:<name>, phone:<number>, etc...}
Solution 1:
1. HASH for storing all persons' ids, but the key would be the phone number
2. Then with the command KEYS 123* all ids could be retrieved whose phone number
starts with 123. On the basis of those ids the other personal data could also be retrieved.
3. And so forth: for each attribute to be searched, a separate HASH would have to be created.
But a major drawback of this solution is that the attribute values must also be unique, so that the assignment of phone numbers to ids in the HASH is unambiguous. On the other hand, the O(N) runtime is not ideal.
Moreover, this uses more space than necessary, and the KEYS command degrades access performance. (http://redis.io/commands/keys)
How should it be done in the right way? I could also imagine that the ids would go in a ZSET and the data to be searched could be the scores, but that only makes it possible to work with ranges, not with searches.
Thank you also in advance, regards, Tamas
Answer summary:
Actually, both responses state that Redis was not designed to search in the values of keys. If this use case is necessary, then workarounds need to be implemented, either as shown in my original solution or as in the solution below.
The solution below by Eli performs much better than my original one, because access to the keys can be considered constant; only the list of ids needs to be iterated through, so a lookup is effectively constant-time. This data model also allows one person to have the same phone number as someone else, and likewise for names etc., so a 1-n relationship is also possible (to use old ERD terminology).
The drawback of this solution is that it consumes much more space than mine, and phone numbers of which only the starting digits are known cannot be searched.
Thanks for both responses.
Redis is for use cases where you need to access and update data at very high frequency and where you benefit from use of data structures (hashes, sets, lists, strings, or sorted sets). It's made to fill very specific use cases. If you have a general use case like very flexible searching, you'd be much better served by something built for this purpose like elastic search or SOLR.
That said, if you must do this in Redis, here's how I'd do it (assuming users can share names and phone numbers):
name:some_name -> set([id1, id2, etc...])
name:some_other_name -> set([id3, id4, etc...])
phone:some_phone -> set([id1, id3, etc...])
phone:some_other_phone -> set([id2, id4, etc...])
id1 -> {'name' : 'bob', 'phone' : '123-456-7891', etc...}
id2 -> {'name' : 'alice', 'phone' : '987-456-7891', etc...}
In this case, we're making a new key for every name (prefixed with "name:") and every phone number (prefixed "phone:"). Each key points to a set of ids that have all the info you want for a user. When you search, for a phone, for example, you'll do:
SMEMBERS 'phone:123-456-7891'
and then loop through the results and return whatever info on each (name in our example) in your language of choice (you can do this whole thing in server-side Lua on the Redis box to go even faster and avoid network back-and-forth, if you want):
for id in results:
    HGET id 'name'
Your cost here will be O(m), where m is the number of users with the given phone number, and this will be a very fast operation on Redis because of how optimized it is for speed. It'll be overkill in your case because you probably don't need things to go so fast, and you'd prefer having flexible search, but this is how you would do it.
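For completeness, a sketch of the write path that would populate this layout (same example ids and key prefixes as above):
HSET id1 name bob phone 123-456-7891
SADD name:bob id1
SADD phone:123-456-7891 id1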
Redis is awesome, but it's not built for searching on anything other than keys. You simply can't query on values without building extra data sets to store items to facilitate such querying, and even then you don't get true search, just more maintenance, inefficient use of memory, yada, yada...
This question has already been addressed, you've got some reading to do :-D
To search strings, build auto-complete in redis and other cool things...
How do I search strings in redis?
Why using MongoDB over redis is smart when searching inside documents...
What's the most efficient document-oriented database engine to store thousands of medium sized documents?
Original Secondary Indices in Redis
The accepted answer here is correct in that the traditional way of handling searching in Redis has been through secondary indices built around Sets and Sorted Sets.
e.g.
HSET Person:1 firstName Bob lastName Marley age 32 phoneNum 8675309
You would maintain secondary indices, so you would have to call
SADD Person:firstName:Bob Person:1
SADD Person:lastName:Marley Person:1
SADD Person:phoneNum:8675309 Person:1
ZADD Person:age 32 Person:1
This allows you to now perform search-like operations
e.g.
SELECT p.age
FROM People AS p
WHERE p.firstName = 'Bob' and p.lastName = 'Marley' and p.phoneNum = '8675309'
Becomes:
ids = SINTER Person:firstName:Bob Person:lastName:Marley Person:phoneNum:8675309
foreach id in ids:
    age = HGET id age
    print(age)
The key challenge with this methodology is that, in addition to being relatively complicated to set up (it really forces you to think about your model), it becomes extremely difficult to maintain atomically, particularly in sharded environments (where cross-shard key constraints can become problematic). Consequently, the keys and the index can drift apart, forcing you to periodically loop through and rebuild the index.
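On a single node, wrapping the hash write and its index updates in a transaction at least keeps them consistent with each other (it does not help with the cross-shard case):
MULTI
HSET Person:1 firstName Bob lastName Marley age 32 phoneNum 8675309
SADD Person:firstName:Bob Person:1
SADD Person:lastName:Marley Person:1
SADD Person:phoneNum:8675309 Person:1
ZADD Person:age 32 Person:1
EXEC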
Newer Secondary Indices with RediSearch
Caveat: This uses RediSearch, a Redis module available under the Redis Source Available License.
There's a newer module that plugs into Redis called RediSearch that can do all this for you. It lets you declare secondary indices, and then it takes care of indexing everything as you insert it. For the above example, you would just need to run:
FT.CREATE person-idx ON HASH PREFIX 1 Person: SCHEMA firstName TAG lastName TAG phoneNumber TEXT age NUMERIC SORTABLE
That would declare the index, and after that all you need to do is insert stuff into Redis, e.g.
HSET Person:1 firstName Bob lastName Marley phoneNumber 8675309 age 32
Then you could run:
FT.SEARCH person-idx "@firstName:{Bob} @lastName:{Marley} @phoneNumber:8675309 @age:[-inf 33]"
to return all the items matching the pattern; see the query syntax for more details.
zeeSQL is a novel Redis module with SQL and secondary index capabilities, allowing search by the values of Redis keys.
You can set it up in such a way that it tracks the values of all the hashes and puts them into a standard SQL table.
For your example of searching people by phone number and name, you could do something like:
> ZEESQL.CREATE_DB DB
"OK"
> ZEESQL.INDEX DB NEW PREFIX customer:* TABLE customer SCHEMA id INT name STRING phone STRING
At this point zeeSQL will track all the hashes that start with customer and will put them into a SQL table. It will store the field id as an integer, and name and phone as strings.
You can populate the table simply by adding hashes to Redis, and zeeSQL will keep everything in sync.
> HMSET customer:1 id 1 name joseph phone 123-345-2345
> HMSET customer:2 id 2 name lukas phone 234-987-4453
> HMSET customer:3 id 3 name mary phone 678-443-2341
At this point you can look into the customer table and you will find the result you are looking for.
> ZEESQL.EXEC DB COMMAND "select * from customer"
1) 1) RESULT
2) 1) id
2) 2) name
2) 3) phone
3) 1) INT
3) 2) STRING
3) 3) STRING
4) 1) 1
4) 2) joseph
4) 3) 123-345-2345
5) 1) 2
5) 2) lukas
5) 3) 234-987-4453
6) 1) 3
6) 2) mary
6) 3) 678-443-2341
The results give first the names of the columns, then the types of the columns, and finally the actual result set.
zeeSQL is based on SQLite and it supports all the SQLite syntax for filtering and aggregation.
For instance, you could search for people knowing only the prefix of their phone number.
> ZEESQL.EXEC DB COMMAND "select name from customer where phone like '678%'"
1) 1) RESULT
2) 1) name
3) 1) STRING
4) 1) mary
You can find more examples in the tutorial: https://doc.zeesql.com/tutorial#using-secondary-indexes-or-search-by-values-in-redis