Termination condition of Redis sscan (or scan) operation - redis

I'm using Redis with set structure via SADD, SREM to insert and delete respectively.
At some point in time, I need to iterate all elements in the set and process them and lastly remove each element from the set. To iterate all elements, I used SSCAN. In the documentation, they said as follows,
The commands are also allowed to return zero elements, and the client should not consider the iteration complete as long as the returned cursor is not zero.
To iterate all elements in the set, I started the cursor from zero. Then I processed each element returned from SSCAN and removed it from the set with SREM. I used the cursor returned by the previous SSCAN call for the next iteration. This process continues until both the cursor returned from SSCAN is zero and there are no elements in the returned array. During the iteration no new element is added to the set.
However, after finishing the process, there are still elements left in the set.
In my expectation, the set should be empty. What's the problem?
Should I check whether there are elements in the set? For example, by using SCARD?
Here is the operation I did,
SADD key element
SADD key element
..
..
result = SSCAN key 0 COUNT 100
while elements and next_cursor != 0:
    for e in elements:
        process e
        SREM key e
    SSCAN key next_cursor COUNT 100
// process the last result returned with zero value cursor
for e in elements:
    process e
    SREM key e

"This process continues until both the cursor returned from SSCAN is zero and there are no elements in the returned array"
This is incorrect. A call may return an empty array while the cursor is still non-zero, so you should only check whether the returned cursor is 0.
cursor = 0
while true:
    cursor, elements = SSCAN key cursor COUNT 100
    process elements
    if cursor == 0:
        break
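The loop above could be sketched in Python roughly as follows. This is only a sketch: `drain_set` and `process` are hypothetical names, and `client` is assumed to expose redis-py style `sscan` and `srem` methods. Since SSCAN may return the same member more than once, `process` should tolerate repeats.

```python
def drain_set(client, key, process, count=100):
    """Process and remove every member of the set stored at `key`.

    Assumes `client` offers redis-py style sscan(name, cursor=, count=)
    returning (next_cursor, members), and srem(name, *values).
    Returns how many members were handed to `process`.
    """
    cursor = 0
    handled = 0
    while True:
        cursor, members = client.sscan(key, cursor=cursor, count=count)
        for member in members:
            process(member)
            handled += 1
        if members:
            # Remove the whole batch in one round trip.
            client.srem(key, *members)
        # Termination depends on the cursor only; an empty batch with a
        # non-zero cursor just means "keep going".
        if cursor == 0:
            break
    return handled
```

Note that the last batch (the one returned together with cursor 0) is processed inside the loop, so no separate pass is needed after it.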

Remove Redis key deletion behavior on expiration

I'm using Redis Key Space Notification to get my application notified when a specified key gets expired.
But when the key expires, Redis deletes it. I need to disable this behavior because my application may use the expired information at a later time.
Is there a way to remove this behavior?
As @sazzad and @nitrin0 said, there's no way to change that.
As another option to get a similar result, I'd suggest you use a sorted set to track these "pseudo-expirations", and when they "expire", a background process does whatever else you need the key for: move it, transform it, reset the expiration, etc.
Use the command zadd to both create a new sorted set and to add members to it. The key for the set can be anything, but I'd use the members as the keys from the data that expires so you can easily work with both the real data, and the member in the sorted set.
ZADD name-of-sorted-set NX timestamp-when-data-expires key-of-real-data
Let's break this down:
name-of-sorted-set is what you'd use in the other Z* commands to work with this specific sorted set.
NX means "Only add new elements. Don't update already existing elements.". The other option is XX which is "Only update elements that already exist. Don't add new elements." For this, the only options are NX or nothing.
timestamp-when-data-expires is the score for this member, and we'll use it as the exact timestamp when the data would otherwise "expire", so you'll have to do some calculations in your application to provide the timestamp instead of just the seconds until it expires.
key-of-real-data is the exact key used for the real data this represents. Using the exact key here will help easily combine the two when you're working with this sorted set to find which members have "expired", since the members are the keys you'd use to move, delete, transform, the data.
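As a rough illustration of the timestamp calculation described above, here is a minimal Python sketch. `track_expiration` and its parameters are hypothetical names, and `client` is assumed to be a redis-py (3.0 or later) style client whose `zadd` takes a member-to-score mapping and an `nx` flag.

```python
import time


def track_expiration(client, zset_key, data_key, ttl_seconds, now=None):
    """Record when `data_key` should be treated as expired.

    The score is an absolute Unix timestamp, not a number of seconds
    remaining, so it is computed here from the current time.
    """
    now = time.time() if now is None else now
    expires_at = int(now + ttl_seconds)
    # nx=True corresponds to the NX option: only add new members.
    client.zadd(zset_key, {data_key: expires_at}, nx=True)
    return expires_at
```

The `now` parameter exists only to make the calculation easy to test; in normal use it would be left at its default.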
Next I'd have a background process run zrangebyscore to see if there are any members whose scores (timestamps) are within some range:
ZRANGEBYSCORE name-of-sorted-set min-timestamp max-timestamp WITHSCORES LIMIT 0 10
Let's break this down too:
name-of-sorted-set is the key for the set we chose in ZADD above
min-timestamp is the lower end of the range to find members that have "expired"
max-timestamp is the higher end of the range
WITHSCORES tells Redis to return the name of the members AND their scores
LIMIT allows us to set an offset (the 0) and a count of items to return (the 10). This is just an example, but for very large data sets you'll likely have to make use of both the offset and count limits.
ZRANGEBYSCORE will return something like this if using redis-cli:
1) "first-member"
2) "1631648102"
3) "second-member"
4) "1631649154"
5) "third-member"
6) "1631650374"
7) "fourth-member"
8) "1631659171"
9) "fifth-member"
10) "1631659244"
Some Redis clients will change that, so you'll have to test it in your application. In redis-cli the member-score pair is returned over two lines.
Now that you have the members (keys of the actual data) that have "expired", you can do whatever you need to do with them, then either remove them from the set entirely or remove them and add them again with a new score. Since this example added members with the NX option, existing members can't be updated, only new ones inserted.
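The background process could look something like the sketch below. `pop_expired` and `handle` are hypothetical names, and `client` is assumed to provide redis-py style `zrangebyscore` and `zrem` methods; the lower bound `-inf` is one possible choice for min-timestamp.

```python
def pop_expired(client, zset_key, now, handle, batch_size=10):
    """Handle up to `batch_size` members whose score (timestamp) is <= now.

    Each member is the key of the real data, so `handle` can use it
    directly to move, transform, or delete that data.
    """
    expired = client.zrangebyscore(
        zset_key, "-inf", now, start=0, num=batch_size, withscores=True
    )
    for member, score in expired:
        handle(member, score)
        # Remove the member so it is not picked up again on the next run.
        client.zrem(zset_key, member)
    return [member for member, _ in expired]
```

Calling this repeatedly until it returns an empty list works through a large backlog in small batches.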

db2 rowset cursor parametrization

Would there be a way to program a parameter that is not hard coded to this?
In place of :SomeValue host variable in this question/snippet:
EXEC SQL
FETCH NEXT ROWSET FROM C_NJSRD2_cursor_declared_and_opened
FOR :SomeValue ROWS
INTO
:NJCT0022.SL_ISO2 :NJMT0022.iSL_ISO2
etc....
Here is some clarification:
Parametrization of the request like posted in opening question actually works in case I set the host variable :SomeValue to 1 and define host variable arrays for filling from database to size 1 like
struct
??<
char SL_ISO2 ??(1??) ??(3??); // sorry for Z/os trigraphs
etc..
And it also works if I set the host variable arrays to a larger defined integer value (i.e. 20) and hard code the value (:SomeValue) to that value in cursor rowset fetch.
EXEC SQL
FETCH NEXT ROWSET FROM C_NJSRD2
FOR 20 ROWS
INTO
:NJCT0022.SL_ISO2 :NJMT0022.iSL_ISO2
,:NJCT0022.BZ_COUNTRY :NJMT0022.iBZ_COUNTRY
,:NJCT0022.KZ_RISK :NJMT0022.iKZ_RISK
I wish to receive the number of rows from the calling program (COBOL) and ideally set the size of the host variable arrays accordingly. To avoid the variable array sizing problem, oversizing the host variable arrays to a larger value would also be acceptable.
Those combinations return compile errors:
HOST VARIABLE ARRAY "NJCT0022" IS EITHER NOT DEFINED OR IS NOT USABLE
And in good tradition here is an answer to my own question.
The good people of SO will upvote me for sure now.
Rowsets are very fast. Besides turning the host variables into arrays (or, for char fields, arrays of arrays), these cursors only require adjusting the program functions that save values and set null values so that they work in loops. A rowset fetch looks like this:
FETCH NEXT ROWSET FROM C_NJSRD2
FOR 19 ROWS
Rowset cursors cannot change the host array (that is, array of arrays) size dynamically.
Unlike scroll cursors, they cannot jump to a position or go backwards. They can, however, move forward by a single row instead of by the whole preset rowset size:
FETCH NEXT ROWSET FROM C_NJSRD2 FOR 1 ROWS
INTO
So, to answer my question: to make the algorithm accept any requested row count, segment the request into full rowset fetches followed by single-row fetches until the requested number is met. To calculate the loop counters for rowset fetches and single-row fetches:
if ((iRowCount > iRowsetPreset) &&
    ((iRowCount % iRowsetPreset) != 0))
??<
    iOneLinersCount = iRowCount % iRowsetPreset;
    iRowsetsCount = (iRowCount - iOneLinersCount)
                    / iRowsetPreset;
??>
if ((iRowCount == iRowsetPreset) _OR_
    ((iRowCount % iRowsetPreset) == 0))
??<
    iOneLinersCount = 0;
    iRowsetsCount = iRowCount / iRowsetPreset;
??>
if (iRowCount < iRowsetPreset)
??<
    iOneLinersCount = iRowCount;
    iRowsetsCount = 0;
??>
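For non-negative row counts, the three cases above reduce to one integer division with remainder. The Python sketch below shows that arithmetic only; `split_fetches` is a hypothetical helper name, not part of the original program.

```python
def split_fetches(row_count, rowset_preset):
    """Return (rowsets_count, one_liners_count) for a requested row count.

    rowsets_count is the number of full rowset fetches; one_liners_count
    is the number of single-row fetches needed for the remainder.
    """
    if row_count < 0 or rowset_preset <= 0:
        raise ValueError("row_count must be >= 0 and rowset_preset > 0")
    rowsets_count, one_liners_count = divmod(row_count, rowset_preset)
    return rowsets_count, one_liners_count
```

For example, with a preset of 20 rows, a request for 45 rows becomes two rowset fetches plus five single-row fetches.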

Redis scan count: How to force SCAN to return all keys matching a pattern?

I am trying to find the values stored under a list of keys that match a pattern in Redis. I tried using SCAN so that later on I can use MGET to get all the values. The problem is:
SCAN 0 MATCH "foo:bar:*" COUNT 1000
does not return any value whereas
SCAN 0 MATCH "foo:bar:*" COUNT 10000
returns the desired keys.
How do I force SCAN to look through all the existing keys? Do I have to look into Lua for this?
With the command below you scan roughly the first 1000 keys, starting from cursor 0:
SCAN 0 MATCH "foo:bar:*" COUNT 1000
The result includes a new cursor, which you pass to the next call:
SCAN YOUR_NEW_CURSOR MATCH "foo:bar:*" COUNT 1000
This scans the next 1000 or so keys. When you increase COUNT from 1000 to 10000, each call examines more keys, so in your case it matches more keys.
To scan the entire keyspace, you need to keep calling SCAN until the cursor returned in the response is zero (i.e. the scan is complete).
Alternatively, use the INFO command to get the number of keys, which appears in a line like
db0:keys=YOUR_AMOUNT_OF_KEYS,expires=0,avg_ttl=0
Then call
SCAN 0 MATCH "foo:bar:*" COUNT YOUR_AMOUNT_OF_KEYS
Just going to put this here for anyone interested in how to do it using the python redis library:
import redis

# settings.redis_ip comes from the application's own configuration
redis_server = redis.StrictRedis(host=settings.redis_ip, port=6379, db=0)
mid_results = []
cur, results = redis_server.scan(0, 'foo:bar:*', 1000)
mid_results += results
while cur != 0:
    cur, results = redis_server.scan(cur, 'foo:bar:*', 1000)
    mid_results += results
# SCAN may return the same key more than once, so deduplicate
final_uniq_results = set(mid_results)
It took me a few days to figure this out, but basically each scan will return a tuple.
Examples:
(cursor, results_list)
(5433L, [... keys here ...])
(3244L, [... keys here, maybe ...])
(6543L, [... keys here, duplicates maybe too ...])
(0L, [... last items here ...])
Keep scanning cursor until it returns to 0.
There is a guarantee it will return to 0.
Even if the scan returns an empty results_list between scans.
However, as noted by @Josh in the comments, SCAN is not guaranteed to terminate under a race condition where inserts are happening at the same time.
I had a hard time figuring out what the cursor number was and why I would randomly get an empty list or repeated items, even though I knew I had just put items in.
After reading:
https://github.com/antirez/redis/blob/unstable/src/dict.c#L772-L855
It made more sense, but still there is some deep programming magic and compromises happening to iterate the sets.
If your use case involves Python, or if you just want to get the values once and have Python installed on your machine, this is a trivial task with the scan_iter method of the redis Python library:
from redis import StrictRedis
redis = StrictRedis.from_url(REDIS_URI)
keys = []
for key in redis.scan_iter('foo:bar:*', 1000):
    keys.append(key)
In the end, keys will contain all the keys you would get by applying @khanou's method.
This is also more efficient than doing shell scripts, since those spawn a new client on each iteration of the loop.

How do I find the middle element of an ArrayList?

How do I find the middle element of an ArrayList? What if the size is even or odd?
It turns out that a proper ArrayList object (in Java) maintains its size as a property of the object, so a call to arrayList.size() just accesses an internal integer. Easy.
/**
 * Returns the number of elements in this list.
 *
 * @return the number of elements in this list
 */
public int size() {
    return size;
}
It is both the shortest (in terms of characters) and fastest (in terms of execution speed) method available.
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/ArrayList.java#ArrayList.0size
So, presuming you want the "middle" element (i.e. item 3 in a list of 5 items -- 2 items on either side), and keeping in mind that get() takes a zero-based index, it'd be this:
Object item = arrayList.get(arrayList.size() / 2);
Now, it gets a little trickier if you are thinking about an even-sized array, because an exact middle doesn't exist. In an array of 4 elements, you have one item on one side, and two on the other.
If you accept that the "middle" will be biased toward the end of the array, the above logic also works. Otherwise, you'll have to detect when the number of elements is even and behave accordingly. Wind up your propeller beanie friends...
Object item = arrayList.get((arrayList.size() - 1) / 2);
If the size of the ArrayList is odd: list.get(list.size() / 2);
If the size of the ArrayList is even: list.get((list.size() / 2) - 1);
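The index arithmetic from the answers above is language-independent, so it can be checked with a small Python sketch. `middle_index` and `bias_toward_end` are hypothetical names used only for illustration, and indices are zero-based as with get().

```python
def middle_index(size, bias_toward_end=False):
    """Return the zero-based index of the middle element.

    For an odd size there is one true middle. For an even size the
    result is the lower of the two central indices, unless
    bias_toward_end is set, in which case it is the upper one.
    """
    if size <= 0:
        raise ValueError("the list must not be empty")
    return size // 2 if bias_toward_end else (size - 1) // 2
```

With 5 elements both variants give index 2; with 4 elements they give index 1 or index 2 depending on the bias.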
If you are not allowed to use the arraylist.size() method, you can use two iterators. One iterates from the beginning of the list toward the end, the other from the end toward the beginning. When they reach the same index, you have found the middle element.
Some additional checks might be necessary to make sure each iterator waits for the other before the next iteration, so that you do not miss the meeting point.
While iterating, keep the total number of elements each iterator has read. Both iterators should advance by one element per cycle, where a cycle consists of these operations:
iteratorA reads one element from the beginning
iteratorB reads one element from the end
The iterators might need to read more than one index to read an element. In other words, you should skip one element per cycle, not one index.

Analysis of insert operation of an array-based list (cursor implementation)

So I'm studying for my algorithm analysis exam tomorrow and I'm reading over the instructors notes and examples. There's just one thing that I don't understand and it's this question:
Question: Inserting an element after a given element in an array-based list (cursor implementation) requires worst case time:
Answer: O(1)
Personally, I see the worst case being where the cursor is at the beginning of the list, therefore N-1 items in the array must be copied over to the next position before the new element is inserted and therefore it is an O(N) operation in the worst case.
However, when asked if this was a typo, the instructor stated that it wasn't.
What's the reasoning behind this? To all future answerers, thank you for your time.
Let's say we have to insert element 'a'. Well it says given an element, let's call it 'b'. What that means is you know what the next element is, let's call it 'c'. So all you have to do is to set the 'next' element of 'a' equal to 'c'. Then set the next element of 'b' equal to 'a'. This procedure is valid for any element. So the operation is constant time.
You can implement what is essentially a linked list using an array where each element in the array contains a pointer to the index of the next element.
struct Element
{
    string item;
    int next;
};
Given element A, you can insert a new element B after A in constant time.
int indexOfA = ..
int indexOfB = (next free index)
B.next = A.next;
A.next = indexOfB;
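To make the constant-time claim concrete, here is a minimal Python sketch of a cursor-style list. `CursorList` and its methods are hypothetical names, and the "next free index" is simply the end of the backing lists here, which is an assumption (a fixed-size array would usually keep a free list instead).

```python
class CursorList:
    """Singly linked list stored in arrays; links are indices, not pointers."""

    END = -1  # marks "no next element"

    def __init__(self):
        self.items = []  # items[i] is the value stored in slot i
        self.next = []   # next[i] is the index of the element after slot i
        self.head = self.END

    def push_front(self, item):
        self.items.append(item)
        self.next.append(self.head)
        self.head = len(self.items) - 1
        return self.head

    def insert_after(self, index, item):
        """Insert `item` after the element in slot `index`; nothing is shifted."""
        self.items.append(item)
        self.next.append(self.next[index])  # B.next = A.next
        new_index = len(self.items) - 1
        self.next[index] = new_index        # A.next = indexOfB
        return new_index

    def to_list(self):
        out, i = [], self.head
        while i != self.END:
            out.append(self.items[i])
            i = self.next[i]
        return out
```

Only two links change per insertion, regardless of how many elements follow the given one, which is why no O(N) copying is involved.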