Which datastructure should I use in Redis for a notification system? - redis

I am trying to make a notification system with Redis rather than using MySQL which is what I use for the rest of the system. The reason for this is that I don't really need to save that much data so it can be saved in memory and I want it to be lightweight and fast.
The notifications will be kept temporarily. What I mean by that is that I do not want to save all notifications, but more like 50 latest unseen notifications for each user. So first thing I thought about was to use a linked list with a capped length of 50.
I would need to save this information for the notification:
postId
commentId
type
time
userId
username
image
So perhaps a JSON serialized string like this:
{"postId":1,"commentId":10,"type":1,"time":1462960058,"userId":2,"username":"Alexander","image":"ntfpRrgx.png"}
The notifications would be output like this on the client side:
Alexander commented on your post.
Alexander replied to your comment.
Where the type determines what kind of notification it is. I can handle "type" checks client side and output the notification format accordingly. But here is the part I am having difficulty with.
1) I need to be able to save the notifications in an ordered way so that I know which notification is newest.
2) I need to be able to know when a notification has been seen, so that it no longer counts as unseen.
3) I need to have a count of unseen notifications that I can show to the user. If the user clicks on a notification, I need to mark it as seen and decrement the count of unseen notifications.
4) I need to be able to mark all notifications as seen if the user wishes to do that.
5) I need to be able to get a subset of the notifications, whether seen or unseen, like an offset and limit on MySQL. For example, the user sees the newest 5 notifications, but he could click a next button and see the next 5, and the next 5 and so on.
I have no idea how to do all of this on Redis.
The key for the list or set could be user:1:notification. I know a list is ordered, and we can add and remove from the head and tail. But how do I achieve all these points?

1: You can use Redis sorted set (zset) operations, with the timestamp as the score and the event id (or the entire event JSON) as the member.
ZADD my-set-key timestamp event-id
Then, to get a page of the newest items, use the ZREVRANGE command. If you choose to use the event id as the member, you need an additional structure to store the event fields. I would recommend a hash per event: HSET event-id field value.
2: You can remove an item by member (event-id)
ZREM my-set-key event-id
3: Assuming your zset only keeps unseen, then you can use ZCARD to get size of the set
ZCARD my-set-key
4: You can remove an entire set in one shot using DEL (the Redis command is DEL, not DELETE)
DEL my-set-key
5: You can paginate using zrange/zrevrange:
ZREVRANGE my-set-key start-position to-position
If you need to keep both seen and unseen items, then you need an extra zset to which you only add, and from which you don't remove an item once it is seen.
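The five points above can be sketched as small helpers that build the Redis commands as argument arrays. This is only an illustration under assumptions: the key layout follows the user:1:notification key from the question, the whole JSON event is used as the member, and the helper names are hypothetical. Sending the arrays to Redis is left to whatever client you use.

```javascript
// Pure helpers that build Redis commands as argument arrays.
// Assumed key layout: "user:<id>:notification" holds the unseen notifications.
const keyFor = (userId) => `user:${userId}:notification`;

// 1) ZADD with the notification time as score and the serialized event as member.
function addCommand(userId, notification) {
  return ["ZADD", keyFor(userId), String(notification.time), JSON.stringify(notification)];
}

// 5) Translate an SQL-style offset/limit into the inclusive start/stop ranks
//    that ZREVRANGE expects (newest first).
function pageCommand(userId, offset, limit) {
  return ["ZREVRANGE", keyFor(userId), String(offset), String(offset + limit - 1)];
}

// 2) Marking one notification as seen removes its member from the unseen set.
function markSeenCommand(userId, member) {
  return ["ZREM", keyFor(userId), member];
}

// 3) The unseen count is the cardinality of the set.
function unseenCountCommand(userId) {
  return ["ZCARD", keyFor(userId)];
}

// 4) Marking everything as seen deletes the whole key.
function markAllSeenCommand(userId) {
  return ["DEL", keyFor(userId)];
}
```

For example, pageCommand(1, 5, 5) asks for ranks 5 through 9, which is the second page of five. Note that with the JSON string as member, ZREM only matches if the exact same string is passed back, which is one reason to prefer an event id as the member.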

Related

Best practice for pagination based on item updated time

Suppose I have 30 items in my DB, and clientA makes an API call to get the first 10 records ordered by item updated time. Now consider a use case where clientB updates the 11th record (item) by making some changes to it. When clientA then makes an API call for page 2 (items 11 to 20), the pagination breaks: because clientB has updated the 11th item, the ordering has shifted (based on updated time, the 11th item becomes 1st, the 1st becomes 2nd, the 2nd becomes 3rd ... the 10th becomes 11th). So there is a chance that clientA will receive duplicate data.
Is there any better approach for this kind of problem?
Any help would be appreciated.
I think you could retrieve all elements each time, using no pagination at all, to prevent this kind of "false information" in your table.
If showing the current values of each record is mandatory, you could always add a new function to your API that works as a trigger: each time a user modifies a record, it sends a message to all active sessions to notify users that some data has changed. As an example, think of something like Twitter's live feed: when a new batch of tweets is created, Twitter notifies users to reload the page if they want to see real-time information.
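To make the problem from the question concrete, here is a small in-memory simulation of the 30-item scenario. It is only a sketch: the items, their updatedAt values and the getPage helper are made up for illustration and stand in for the real API and database.

```javascript
// Return one page of item ids, most recently updated first.
function getPage(items, page, pageSize) {
  const sorted = [...items].sort((a, b) => b.updatedAt - a.updatedAt);
  return sorted.slice((page - 1) * pageSize, page * pageSize).map((item) => item.id);
}

// 30 items; item 1 has the most recent update time, item 30 the oldest.
const items = Array.from({ length: 30 }, (_, i) => ({ id: i + 1, updatedAt: 1000 - i }));

// clientA reads page 1 and gets ids 1..10.
const page1 = getPage(items, 1, 10);

// clientB updates item 11, which moves it to the front of the ordering.
items.find((item) => item.id === 11).updatedAt = 2000;

// clientA reads page 2 afterwards.
const page2 = getPage(items, 2, 10);

// Ids that clientA has now received twice.
const duplicates = page2.filter((id) => page1.includes(id));
console.log(duplicates);
console.log(page2.includes(11)); // item 11 itself is skipped entirely
```

In this run, page 2 starts with item 10 again, and item 11 never reaches clientA on either page, which is the duplicate (and the matching gap) described in the question.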

Randomly select DynamoDB entry

I have a DynamoDB table called URLArray that contains a list of URLs (myURL) and a unique video name (myKey).
I need to do two things:
When a user clicks the next video button, a random entry needs to be selected from this URLArray. There could be potentially tens of thousands of rows.
The user is logged into the app. Every time they finish watching a video, the video's unique video name is recorded. So when the user has seen a video, it's added to a list in a table called Users under the user's info row.
So the random entry that gets selected when the user clicks the next video button in point 1 has to be compared to the list of videos they've already seen, to make sure that it doesn't randomly appear again for that particular user.
I do something woefully inefficient so far, that works, but it's not great:
By the way, I'm using AppSync + GraphQL to interact with the DynamoDB table. I first get a local copy of the URLArray:
//Gets a list of the Key/URL pairs in the UrlArrays table in GraphQL ****IN CONSTRUCTOR, so we have this URLArray data when componentDidMount()****
listUrlArrays = async () => {
  try {
    URLData = await API.graphql(graphqlOperation(ListUrlArrays)); //GraphQL query
    //URLData[] is available in the entire class
    this.setState({urlArrayLength: URLData.data.listURLArrays.items.length}); //gets the length of URLArray (i.e. how many videos are in the database)
  } catch (err) {
    console.error('ListUrlArrays query failed', err); //a try block needs a catch (or finally) to be valid
  }
}
As an overview, when user clicks for the next video:
//When clicking next video
async nextVideo(){
  await this.logVideosSeen(); //add myKey to the list of videos in *Users* table the logged in user has now seen
  await this.getURL(); //get the NEXT upcoming video's details, for Video Player to play and make sure it's not been seen before
}
//This will update the 'listOfVideosSeen[]' in Users table with videos unique myKey, the logged in user has seen
logVideosSeen = async () => {
  .......
}
async getURL() {
  var dbIndex = this.getUniqueRandomNumber(this.state.urlArrayLength); //Choose a number between 0 and N number of videos in URLArray
  //the hasVideoBeenSeen() basically gets the list of videos a user has already seen from `Users` table with the GraphQL getusers command, and creates a local copy of this list (can get big). I use javascripts indexOf() to check whether myKey already exists in the list
  while(await this.hasVideoBeenSeen(this.state.URLData[dbIndex].myKey)) //while true i.e. user has seen that video before
  {
    dbIndex = this.getUniqueRandomNumber(this.state.urlArrayLength); //get another random number to fetch a new myKey
  }
  //If false, we'll exit the loop and know we've got a not seen before myKey, proceed to set to play...
  if(dbIndex != null){
    this.setState({ playURL: this.state.URLData[dbIndex].vidURL }); //Retrieve the URL from the local URLArray that we're going to play (i.e. the next video to come)
  }
}
I can share a little more code if needed, but essentially I wanted to know how to:
Let a Lambda function select a random number based on the current URLArray size (I may need to keep a local copy of URLArray anyway). But I think point 2 here is where it's really inefficient.
Let a Lambda function check (the while loop) against the Users table whether myKey has already been seen. Mainly to shift this computational burden to the cloud instead of the local device the app runs on.
AFTER A THINK................
Thanks for the suggestion Seth. I have been thinking about it for some time, and while the randomness requirement still holds true, I think there is some truth in what you’ve suggested. The reason I need randomness, is so that 2 users sat side by side for example, can’t predict which video is coming next. It shouldn’t be a predictable sequence of videos. I'm not sure I can use Scan function with AWS Amplify/GraphQL. So remember there’s 2 things going on here: (1) a video upload, recording it in the URLArray sensibly for future reference. (2) users viewing a previously unseen random video and then moving onto another unseen random video
*(1)
I like your idea of using a number to index the URLArray, and it’s helped to make life a bit easier. So the first URL being at index 0, the next at 1 etc…
My thinking here (to avoid me doing a ListUrlArrays() and bringing the WHOLE array locally to the phone), is to create a GSI called VideoNumber for the URLArray table. This will be the unique VideoNumber column with a number 0-N. So imagine the diagram above having another column called VideoNumber. Row 1 having VideoNumber set to 0, Row 2 having VideoNumber set to 1 etc… THEN all I would need to do, is locally on the device, generate a random number between 0-N, call a getURLArrayIdbyVideoNumber() query specific for that GSI, with the number that we just generated, and it’ll unlock the information I need from the row. Voila! I think that shifts most of that heavy burden away now.
Question: Before each video is uploaded, how do I easily get the current total number of rows N in the table (or row count)? I would then increment it by one.
The other thing I can do is save this current count number in another DynamoDB table that I use for persisted data, read the number from there before upload, and write an N+1 after upload to increment it (2 DynamoDB operations per upload). It’s not ideal.
*(2)
When a user has finished watching a video, I can log in a list (under the user's information in DynamoDB) which videos they've already seen. So for example this could now be a seen list: [3,12,73,108,57] for the 5 videos they've seen so far. When the user clicks nextVideo() we'll generate a random newNumber, and straight away compare that with the numbers in the seen list. I use seenlist.indexOf(newNumber), and it will either go again, or stop if the newNumber doesn't exist in the list. THEN I can go through the GSI query, and retrieve the relevant information to display the video from URLArray.
I think that this indexOf() is the biggest computational burden on the device, and obviously gets a little slower as the seenList grows. But it should be quicker with pure integer numbers than with an alphanumeric myKey as I was using before. Any other suggestion would be welcome :)
I’ve yet to try it, but it was just an idea, as I need to keep the random element. But first, do you know how I can easily find the number of rows or table count of URLArray?
I think you'll have an easier time coming up with a solution to this problem if you drop the randomness requirement. It sounds like the more important requirement is presenting the user with a video they haven't seen before.
If that's correct, it sounds like your access pattern could be stated as
Fetch previously unseen video for user
which is an easier problem to solve.
Unlike SQL databases, there are often many ways to implement a given access pattern in DynamoDB. My answer here is just one way.
Imagine your URLArray table as a giant array. The first URL is at index 0, the next URL is at index 1, the third URL at index 2, and so on. Each user of your application would start by watching the video at URL index 0, then URL index 1, etc. This would ensure the user never sees the same video twice. You would not need to store a list of all the videos they've seen. Instead, you could store the index of the last video they saw.
Your application could grab the first n videos from the table to present to your users. Once that list was exhausted, it could go grab the next n videos. And so on...
What I've described here is essentially how pagination is implemented in DynamoDB. To bring this abstraction back to the world of DynamoDB, your algorithm could look something like this:
1) Scan the URLArray table for the first "page" of URLs (a scan operation with no filter criteria).
2) Along with the results, DynamoDB will respond with a LastEvaluatedKey, which will allow you to retrieve the next page of results starting from this position.
3) Present your user with each video you pulled back from the scan operation, making sure to record the id (the Primary Key) of the last video they saw.
4) When you exhaust the URLs from step 1, execute another scan operation with the ExclusiveStartKey set to the LastEvaluatedKey returned from step 2.
5) When users return to your application, query for the next page from the URLArray table with ExclusiveStartKey set to the id of the last video they viewed.
This effectively uses the scan operation to search through your URLArray table one page at a time. Your application would effectively be searching the table from top to bottom, keeping track of where each user is at any given time. When a user revisits your application, just start where they left off.
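The paging loop described above can be sketched as follows. TableName, Limit, ExclusiveStartKey, Items and LastEvaluatedKey are the real Scan parameter and response names; everything else is an assumption for illustration: fetchNextPage is a hypothetical helper, the scan function is injected so it can wrap whichever SDK call you use, myKey is assumed to be the primary key, and makeFakeScan is only an in-memory stand-in so the sketch runs without AWS.

```javascript
// Fetch one page and return the videos plus the position to resume from.
// `scan` is an injected function that performs the DynamoDB Scan and
// resolves to { Items, LastEvaluatedKey }.
async function fetchNextPage(scan, resumeKey, pageSize) {
  const params = { TableName: "URLArray", Limit: pageSize };
  if (resumeKey) params.ExclusiveStartKey = resumeKey; // continue after this key
  const result = await scan(params);
  // No LastEvaluatedKey means the end of the table was reached.
  return { videos: result.Items, resumeKey: result.LastEvaluatedKey || null };
}

// In-memory stand-in for the table, for demonstration only.
function makeFakeScan(rows) {
  return async (params) => {
    const start = params.ExclusiveStartKey
      ? rows.findIndex((row) => row.myKey === params.ExclusiveStartKey.myKey) + 1
      : 0;
    const Items = rows.slice(start, start + params.Limit);
    const end = start + Items.length;
    const LastEvaluatedKey = end < rows.length ? { myKey: rows[end - 1].myKey } : undefined;
    return { Items, LastEvaluatedKey };
  };
}
```

Storing the returned resumeKey with the user's record is what lets a returning user continue from where they left off.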
In response to your edit:
If your use case requires the next video to be unpredictable (e.g. no 2 users can predict what video is next), you have a few problems to solve at the same time:
Selecting an item in an unpredictable/random manner
Tracking what a user has already seen
Putting those two requirements together makes for a tricky access pattern. Let's say you have N videos in your table, and the user has viewed N-1 of these videos leaving only one video unseen. If you are fetching your next video randomly and need to ensure it has not yet been seen, how will you find the last unseen video? How many times would you need to guess before you came across the only unseen video? What query/scan operation could you perform that does this in a single request to DDB? I'm not saying it's impossible, it's just...complicated.
I think it's better to come up with a strategy that is unpredictable to the user, but predictable to you when it comes to selecting the next unseen video.
For example, you could pre-calculate a random order of indexes from 1..N ahead of time, which would represent the order you present the videos for a given user. You could go through that list sequentially, keeping track of the last seen index. That way, you'd always know which video was next and that the video hadn't previously been seen by this user. Fetching that video would be a simple query operation to DDB.
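A minimal sketch of that idea, assuming the videos are numbered 0 to N-1 (as with the VideoNumber column proposed in the question) and that each user's shuffled order and last position are stored with their record. The function names are hypothetical, and the shuffle is a standard Fisher-Yates shuffle.

```javascript
// Fisher-Yates shuffle of the indexes 0..n-1; `random` defaults to Math.random.
function shuffledIndexes(n, random = Math.random) {
  const order = Array.from({ length: n }, (_, i) => i);
  for (let i = n - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1)); // pick from the not-yet-fixed part
    [order[i], order[j]] = [order[j], order[i]];
  }
  return order;
}

// Given a user's stored order and the position of the last video they watched
// (-1 if none yet), return the next position and video number, or null once
// every video has been seen.
function nextVideo(order, lastPosition) {
  const position = lastPosition + 1;
  if (position >= order.length) return null;
  return { position, videoNumber: order[position] };
}
```

Two users get different orders, so neither can predict the other's next video, while the application only has to remember one integer per user to know what comes next.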
You also asked how to find the number of items in DynamoDB. Unfortunately, there is no cheap DynamoDB equivalent of the SQL count operation: a Scan with Select set to COUNT still has to read the whole table, and the ItemCount reported by DescribeTable is only updated periodically. The answer to this question is not straightforward. For the benefit of the community (and to get a diverse set of answers), I'd suggest you ask a separate question on Stack Overflow regarding the number of items in a DDB table.

Stripe: webhook events order

How should you handle the fact that events sent via webhooks can be received in random order?
For instance, given the following ordered events:
A: invoiceitem.created (with quantity of 1)
B: invoiceitem.updated (with quantity going from 1 to 3)
C: invoiceitem.updated (with quantity going from 3 to 2)
How do you make sure receiving C-A-B does not result in corrupted data (i.e. with a quantity of 3 instead of 2)?
You could reject the webhook if the previous_attributes in Event#data do not correspond to the current state, but then you are stuck if your local model was updated already, as you will never find yourself in the state expected by the webhook.
Or you can just treat any webhook as a hint to retrieve and update an object. You just disregard the data sent by the webhook and always retrieve it.
Even if you receive events ordered as update/delete/create it should work, as update would in fact create the object, delete would delete it, and create would fail to retrieve the object and do nothing.
But it feels like a waste of resources to retrieve data each time when the webhook offers it as event data.
This question was asked before but the answers don't cover the above solutions.
Thanks
If your application is sensitive to changes like this that can occur close together in time, you really should just use the event as a signal to retrieve the object, as @koopajah noted in their comment. That's the only way to ensure you have the latest state.
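A rough sketch of the "event as a signal" approach, under assumptions: handleWebhook is a hypothetical handler, retrieve is an injected function that fetches the current object from the API and resolves to null when it no longer exists, and store is any Map-like local model. Only the object id is read from the event payload (event.data.object.id).

```javascript
// Treat every event as a hint to refresh the object, whatever its type or order.
async function handleWebhook(event, retrieve, store) {
  const id = event.data.object.id; // the rest of the payload is ignored
  const current = await retrieve(id);
  if (current === null) {
    store.delete(id); // object is gone upstream, so drop the local copy
  } else {
    store.set(id, current); // create or overwrite with the latest state
  }
}
```

With this handler, receiving C-A-B from the question still ends with a quantity of 2, because each delivery re-reads the current state instead of applying the stale data carried by the event. The cost is one retrieval per event, which is the trade-off the question mentions.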

How to update Redis list item?

I'm currently researching highly scalable web site architectures, and nearly all of the articles I've read say that Redis is a very good choice for a timeline (Facebook- or Twitter-like) architecture. So let's suppose that I'm building a new social network and I want to save the last 500 feeds of each user with Redis. I'm curious about what happens when a user deletes a feed that is among those last 500 feeds. I couldn't find any information about updating a Redis list item. If there is no such thing in Redis, how can it be a very good choice?
Using the LSET command will allow you to update a list item: https://redis.io/commands/lset
Instead of a list, a better Redis structure to use would be the sorted set, assuming you want to maintain the ordering of the last 500 feeds.
To add a new entry, you can use the command ZADD user_id:feeds time_in_epoch feed_id. The time_in_epoch is the score used to sort the set, and it maintains the ordering of the feeds.
To delete a feed for a user, ZREM user_id:feeds feed_id.
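The answer does not say how to keep only the last 500 feeds. One possible way, offered here as an assumption rather than as part of the answer, is to trim the sorted set by rank with ZREMRANGEBYRANK after each insert. The helpers below are hypothetical and only build the commands as argument arrays, using the user_id:feeds key pattern from above.

```javascript
const MAX_FEEDS = 500; // cap taken from the question

// Commands to add one feed entry and then drop everything older than the newest 500.
function addFeedCommands(userId, feedId, timeInEpoch) {
  const key = `${userId}:feeds`;
  return [
    ["ZADD", key, String(timeInEpoch), String(feedId)],
    // Ranks ascend by score, so ranks 0 through -(MAX_FEEDS + 1) are the oldest
    // entries beyond the cap; with 500 or fewer members the range is empty.
    ["ZREMRANGEBYRANK", key, "0", String(-(MAX_FEEDS + 1))],
  ];
}

// Deleting a feed removes its member, wherever it sits in the ordering.
function deleteFeedCommand(userId, feedId) {
  return ["ZREM", `${userId}:feeds`, String(feedId)];
}
```

Because members are addressed by value rather than by position, deleting a feed in the middle of the 500 needs no index bookkeeping, which is the main advantage over a list here.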

Can the target of a conversation receive messages from different initiators using the same conversation?

I like this article: http://technet.microsoft.com/en-us/library/dd576261(v=sql.100).aspx because of the receive top (10000) into a table variable. Processing a table variable with 10000 messages would give me a giant boost in performance.
receive top (10000) message_type_name, message_body, conversation_handle
from MySSBLabTestQueue
into @receive
From reading, the receive provides messages given a single conversation_handle. I have 200+ stores all sending messages with the same message type and contract to the same server. Can I implement the server to get all the messages from these stores on a single call to receive?
Thanks
A target can consolidate multiple conversations into a few conversation groups using MOVE CONVERSATION. RECEIVE restricts the result set to a single conversation group, so moving many individual conversations into a single group can result in bigger result sets, as you desire.
For the record, initiators can also consolidate conversations using MOVE CONVERSATION; there is nothing role-specific here. But initiators can also use the RELATED_CONVERSATION_GROUP clause of BEGIN DIALOG to start the conversation directly in the desired group, achieving consolidation and thus bigger result sets without having to use MOVE. This is useful because you can simply reverse the roles in the app, i.e. instead of stores starting the dialogs with the central server, have the central server start the dialogs with each store (thus reversing the roles). The central server can then start the dialogs in as few conversation groups as it likes, even 1. This removes the need to issue MOVE CONVERSATION.