Building a notification system [closed]

I am at the start of building a Facebook-style notification system for our site (social gaming type), and I'm now researching what would be the best way to design such a system. I'm not interested in how to push notifications to the user or anything like that (for now, at least). I am researching how to build the system on the server: how to store notifications, where to store them, how to fetch them, and so on.
So ... some requirements that we have:
at peak times we have about 1k concurrent logged-in users (and many more guests, but they don't matter here as they will not have notifications) that will generate many events
there will be different types of notifications (user A has added you as a friend, user B has commented on your profile, user C has liked your image, user D has beaten you on game X, ...)
most events will generate 1 notification for 1 user (user X has liked your image), but there will be cases where one event will generate many notifications (it's user Y's birthday for instance)
notifications should be grouped together; if for instance four different users like some image, the owner of that image should get one notification stating that four users have liked the image and not four separate notifications (just like FB does)
OK, so what I was thinking is that I should create some sort of queue where I would store events as they happen. Then I would have a background job (gearman?) that would look at that queue and generate notifications based on those events. This job would then store notifications in the database for each user (so if an event affects 10 users, there would be 10 separate notifications). Then when a user opened the page with the list of notifications, I would read all those notifications for them (we are thinking of limiting this to the 100 latest notifications), group them together, and finally display them.
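A minimal sketch of that queue plus background worker idea, assuming a PDO connection in $db, hypothetical events and notifications MySQL tables, and a made-up affectedUserIds() helper that decides which users an event concerns; a real setup would run the worker from gearman or cron and handle failures:
<?php
// Sketch only: enqueue raw events, let a worker turn them into per-user notifications.
function enqueueEvent(PDO $db, int $actorId, string $type, array $data): void
{
    // Store the raw event; the worker turns it into notifications later.
    $stmt = $db->prepare(
        'INSERT INTO events (actor_id, type, data, processed) VALUES (?, ?, ?, 0)'
    );
    $stmt->execute([$actorId, $type, json_encode($data)]);
}

function processEvents(PDO $db): void
{
    $events = $db->query('SELECT * FROM events WHERE processed = 0')
                 ->fetchAll(PDO::FETCH_ASSOC);
    foreach ($events as $event) {
        // One event may concern a single user (a "like") or many (a birthday).
        foreach (affectedUserIds($db, $event) as $userId) {
            $db->prepare(
                'INSERT INTO notifications (user_id, actor_id, type, data, is_read, created_at)
                 VALUES (?, ?, ?, ?, 0, NOW())'
            )->execute([$userId, $event['actor_id'], $event['type'], $event['data']]);
        }
        $db->prepare('UPDATE events SET processed = 1 WHERE id = ?')
           ->execute([$event['id']]);
    }
}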
Things I'm concerned about with this approach:
complex as hell :)
is a database the best storage here (we are using MySQL), or should I use something else (Redis seems like a good fit too)?
what should I store as a notification? The user ID, the ID of the user who initiated the event, and the type of event (so that I can group them and display appropriate text), but then I don't quite know how to store the actual data of the notification (for instance the URL & title of the image that was liked). Should I just "bake" that info in when I generate the notification, or should I store the ID of the record (image, profile, ...) being affected and pull the info out of the DB when displaying the notification? (One possible table layout is sketched after this list.)
performance should be OK here, even if I have to process 100 notifications on-the-fly when displaying the notifications page
possible performance problem on every request, because I would have to display the number of unread notifications to the user (which could be a problem in its own right, since I would group notifications together). This could be avoided, though, if I generated the view of notifications (where they are grouped) in the background rather than on-the-fly
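To make the storage question concrete, here is one way the notifications table could look, sketched as MySQL DDL run from PHP. The table and column names, and the combination of a baked JSON payload plus an object ID, are assumptions for illustration rather than a recommendation; the composite index is there so the unread-count check mentioned in the last concern stays cheap.
<?php
// Possible notifications table (names and layout are assumptions). It keeps
// both a reference to the affected object and a small "baked" payload, so most
// notifications render without extra lookups while fresh data can still be
// pulled by object_id when needed.
$db->exec(<<<SQL
CREATE TABLE notifications (
    id         BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    user_id    INT UNSIGNED NOT NULL,   -- who receives the notification
    actor_id   INT UNSIGNED NOT NULL,   -- who triggered the event
    type       VARCHAR(32)  NOT NULL,   -- e.g. 'image_like', 'friend_add'
    object_id  INT UNSIGNED NULL,       -- the image/profile/game affected
    data       TEXT NULL,               -- baked info (URL, title, ...)
    is_read    TINYINT(1)   NOT NULL DEFAULT 0,
    created_at DATETIME     NOT NULL,
    KEY idx_user_unread (user_id, is_read, created_at)
) ENGINE=InnoDB
SQL
);

// With that index, the unread badge becomes a cheap indexed count per request:
$stmt = $db->prepare('SELECT COUNT(*) FROM notifications WHERE user_id = ? AND is_read = 0');
$stmt->execute([$currentUserId]);
$unread = (int) $stmt->fetchColumn();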
So what do you think about my proposed solution and my concerns? Please comment if you think I should mention anything else that would be relevant here.
Oh, we are using PHP for our page, but that shouldn't be a big factor here I think.

A notification is about something (object = event, friendship, ...) being changed (verb = added, requested, ...) by someone (actor) and reported to the user (subject). Here is a normalized data structure (though I've used MongoDB). You need to notify certain users about changes, so these are per-user notifications, meaning that if there were 100 users involved, you generate 100 notifications.
╔═════════════╗      ╔═══════════════════╗      ╔════════════════════╗
║notification ║      ║notification_object║      ║notification_change ║
╟─────────────╢      ╟───────────────────╢      ╟────────────────────╢
║ID           ║—1:n—→║ID                 ║—1:n—→║ID                  ║
║userID       ║      ║notificationID     ║      ║notificationObjectID║
╚═════════════╝      ║object             ║      ║verb                ║
                     ╚═══════════════════╝      ║actor               ║
                                                ╚════════════════════╝
(Add time fields where you see fit)
This is basically for grouping changes per object, so that you can say "You have 3 friend requests". Grouping per actor is also useful, so that you can say "User James Bond made changes in your bed". This also gives you the ability to translate and count notifications as you like.
But since the object is just an ID, you would need to get any extra info about the object with separate calls, unless the object actually changes and you want to show that history (for example, "user changed the title of the event to ...").
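As a rough illustration of how this structure can be read (my own sketch, using MySQL table and column names taken from the diagram and a PDO connection in $db), grouping changes per object is what produces lines like "You have 3 friend requests":
<?php
// Sketch only: count changes per notification object for the current user.
$stmt = $db->prepare(
    'SELECT o.ID AS objectID, o.object, COUNT(c.ID) AS changes
     FROM notification n
     JOIN notification_object o ON o.notificationID = n.ID
     JOIN notification_change c ON c.notificationObjectID = o.ID
     WHERE n.userID = :user
     GROUP BY o.ID, o.object'
);
$stmt->execute(['user' => $currentUserId]);
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // e.g. object = "friend_request", changes = 3  =>  "You have 3 friend requests"
    printf("%s: %d change(s)\n", $row['object'], $row['changes']);
}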
Since notifications are close to real-time for users on the site, I would tie them in with a Node.js + WebSockets client, with PHP pushing updates to Node.js for all listeners as changes get added.

This is really an abstract question, so I guess we are just going to have to discuss it instead of pointing out what you should or shouldn't do.
Here's what I think about your concerns:
Yes, a notification system is complex, but not hellishly so. You can take many different approaches to modeling and implementing such systems, and they can range from medium to high complexity;
Personally, I always try to make things database-driven. Why? Because I can guarantee having full control of everything that's going on. That's just me; you can have control without a database-driven approach, but trust me, you are going to want control in this case;
Let me give you a real case as a starting point. In the past year I modeled and implemented a notification system for a kind of social network (not like Facebook, of course). The way I stored notifications there: I had a notifications table, where I kept the generator_user_id (the ID of the user generating the notification), the target_user_id (kind of obvious, isn't it?), the notification_type_id (which referenced a separate table of notification types), and all the necessary stuff we need to fill our tables with (timestamps, flags, etc). My notification_types table had a relation to a notification_templates table, which stored specific templates for each type of notification. For instance, I had a POST_REPLY type with a template like {USER} HAS REPLIED ONE OF YOUR #POSTS. From there, I just treated the {} as a variable and the # as a reference link (a sketch of this template approach follows this list);
Yes, performance should and must be OK. When you think of notifications, you think of the server pushing data all over the place. Whether you do it with AJAX requests or something else, you are going to have to worry about performance. But I think that's a secondary concern;
The model I've described is, of course, not the only one you can follow, nor necessarily the best. I hope my answer at least points you in the right direction.
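To make the template idea above concrete, here is a small sketch of how such a template could be rendered in PHP; the function and substitution rules are my own illustration, not the answerer's actual code:
<?php
// Hypothetical renderer for templates like "{USER} HAS REPLIED ONE OF YOUR #POSTS":
// {NAME} tokens become escaped variables, #NAME tokens become links.
function renderNotification(string $template, array $vars, array $links): string
{
    foreach ($vars as $name => $value) {
        $template = str_replace('{' . $name . '}', htmlspecialchars($value), $template);
    }
    foreach ($links as $name => $url) {
        $template = str_replace(
            '#' . $name,
            '<a href="' . htmlspecialchars($url) . '">' . strtolower($name) . '</a>',
            $template
        );
    }
    return $template;
}

echo renderNotification(
    '{USER} HAS REPLIED ONE OF YOUR #POSTS',
    ['USER' => 'Alice'],
    ['POSTS' => '/posts/123']
);
// Output: Alice HAS REPLIED ONE OF YOUR <a href="/posts/123">posts</a>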

╔════════════════════╗
║notification        ║
╟────────────────────╢
║Username            ║
║Object              ║
║verb                ║
║actor               ║
║isRead              ║
╚════════════════════╝
This looks like a good answer, rather than having 2 collections. You can query by username, object, and isRead to get new events (like 3 pending friend requests, 4 questions asked, etc.).
Let me know if there is a problem with this schema.
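For what it's worth, here is a sketch of how that single table could be queried for grouped unread counts. I've written it against MySQL/PDO to match the rest of this thread, even though the answer speaks of collections, so adapt it to your store of choice.
<?php
// Assumes a `notification` table with the columns shown above and a PDO $db.
$stmt = $db->prepare(
    'SELECT object, verb, COUNT(*) AS total
     FROM notification
     WHERE username = ? AND isRead = 0
     GROUP BY object, verb'
);
$stmt->execute([$currentUsername]);
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // e.g. object = "friend_request", verb = "requested", total = 3
    printf("%d new %s\n", $row['total'], $row['object']);
}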

I personally don't understand the diagram in the accepted answer very well, so I'm attaching a database diagram based on what I could learn from the accepted answer and other pages.
Improvements are welcome.

Related

DDD Request & Activity Tracking

I have a question about tracking activity and where it belongs.
With a lot of my domain commands, you also might want to track the activity and modifications made by users to a particular context or object.
For example:
let's say we have an items domain/context where we can create and edit items. Users are going to make requests to the API to do this. We might want to track who created an item and any modifications made to it.
In a typical CRUD model, you'd probably find the created-by field in the domain object/table.
Something doesn't feel right, when using DDD, about having the activity in the domain object. The activity log feels like a general service that would cross many boundaries. Is it right to have the activity log of who changed what in the domain object? It would feel quite clean and focused without it. The activity logging seems specific to the application's use case, not the domain.
So:
Should the activity tracking be in the domain object?
If it shouldn't, how do you go about handling this in one command/request? I keep hearing people say that you should only touch one boundary in a command/request.
I would think of this activity log as any other piece of data. You would put it together with the business logic around it. Why do you need this information in the first place? Is your items context going to implement business logic that needs the activity log? If not, then I'd say it doesn't belong in that context.
If what you are trying to achieve with this log is some data analysis that needs the activity from several contexts, then I would say publish events from your business operations (every time a user does something with one of the contexts) and have your activity tracking context listen to them and store the activity in a way that serves this purpose.
If, instead, your items context needs to apply some sort of logic, based on the past activity, then keep it in that context in a format that allows you to implement this business logic.
It's also possible that you actually need both. Some context might just publish the events and not store the activity, while others will publish the events and also track the activity for their own internal needs.
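As a sketch of the "publish events and let a separate activity-tracking context listen" option (the class names, table name, and PDO wiring are purely illustrative, not from any particular framework):
<?php
// Illustrative only: a domain event published by the items context and a
// listener in a separate activity-tracking context that stores it.
class DomainEvent
{
    public $name;       // e.g. "ItemCreated"
    public $userId;     // who performed the action
    public $payload;    // arbitrary event data
    public $occurredAt;

    public function __construct(string $name, string $userId, array $payload)
    {
        $this->name       = $name;
        $this->userId     = $userId;
        $this->payload    = $payload;
        $this->occurredAt = new DateTimeImmutable();
    }
}

class ActivityLogListener
{
    private $db;

    public function __construct(PDO $db)
    {
        $this->db = $db;
    }

    public function handle(DomainEvent $event): void
    {
        // The activity context owns this log; the items context only publishes
        // the event and stays free of logging concerns.
        $stmt = $this->db->prepare(
            'INSERT INTO activity_log (event_name, user_id, payload, occurred_at)
             VALUES (?, ?, ?, ?)'
        );
        $stmt->execute([
            $event->name,
            $event->userId,
            json_encode($event->payload),
            $event->occurredAt->format('Y-m-d H:i:s'),
        ]);
    }
}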

Storing online users in Elixir

I am working on a chatroom (all-to-all) application in Elixir using an OTP GenServer; as a first phase, I am getting messages from a JS client as users register with their names. Now I'm just not sure what the best approach would be to store these names on my Elixir server somehow (or in database storage) and send regular updates to the client with the list of users online. Please suggest the best approach.
I agree with bitwalker that ETS is a good fit.
Here's a short summary of what I did in production. It wasn't a chat server, but a server push with a couple of thousand users connecting via long polling. Pushed data was divided into some 50 categories, and users were able to choose which ones they wanted. At peak times the server pushed new messages every 2 seconds and processed > 2000 reqs/sec.
Essentially, I kept a gen_server for each user, where I held pending messages and the user's configuration (basically a list of selected channels). This was beneficial with long polling, since the user's data is decoupled from the user's request, so the data remains while requests are transient. However, I think this approach is also good for permanent connections such as websockets, since there might still be occasional disconnections, and keeping the user's data somewhere more stable gives you a chance of resuming after a reconnect.
Obviously, when a request arrives, you need to find the user-specific process, and for this ETS is a good fit, since you don't have a single-process bottleneck. Instead of manually working with ETS, I'd recommend using gproc in conjunction with via tuples. Basically, when starting a user's gen_server, you can provide name: {:via, :gproc, {:n, :l, key}} where key is some custom key (an arbitrary term) you make based on your internal user id (:n and :l indicate a unique name on the local node). You can then use that same via tuple when issuing calls/casts, and gen_server will use gproc to find the corresponding process.
Finally, you need some timeout/disconnect logic to clean up user processes. In my case, I simply terminated a user's process if there was no activity from the web layer (no end user came for data in some time). Gproc will automatically remove entries for terminated processes from its internal ETS table. It's probably best to supervise user processes with a temporary restart strategy.
I realize all of this is still a bit vague, but I hope it makes some sense. Keep in mind that this is not the ultimate pattern (there's no such thing of course), but I think it's a reasonable first attempt.
You may also want to take a look at the Phoenix web framework, which has an interesting pub-sub facility in the form of Topics. I didn't try this out myself yet, but it seems interesting, and it may even simplify some of the stuff I discussed above, or at least help with pushing notifications from the chatroom to all users.
Sounds like a good use case for ETS.
A simpler approach might be to use an Agent to store the online users information, but it depends quite a lot on what you need from the storage mechanism you choose.

Custom iOS address book. Need advice about data structures and performance

I am currently developing a voIP app and I am really stuck with the address book.
Because of the custom design, the native address book does not fit in my app. Besides, I want to add some extra data not present in the native address book. But this is leading to some problems, which I've separated into two sections:
1. Data structures:
In a section of my app I need to show to the user all his address book contacts with additional information (if the user has the same app and it's online, for example).
Right now I'm getting all the info from the Address Book API and loading it into an array directly (which is accessed by tableView:cellForRowAtIndexPath:), but not displaying the custom information I was talking about. I don't know if it's worthwhile to store all the address book info in a SQLite database (where I'd be able to add the extra information easily) or if I should store only that extra information in a file or something.
The biggest problem with storing it in a database is that the contacts' pictures are heavy enough to produce a memory-wasting database. I thought about storing only a reference (the ABRecordID) and then gathering the related info from the address book instead of the database, but the Apple documentation for the Address Book API says the ABRecordID is not guaranteed to remain the same, so my data could end up appearing next to the wrong contact data.
Any idea?
2. Performance:
The second big problem with this custom address book is that... the iOS table views are too 'manual' compared to the Android ones, for example. You need to have the data stored somewhere so that when the tableView:cellForRowAtIndexPath: method gets called you return that data. You can also load that data inside this method, but this makes it very slow.
The problem here is that preloading all the data in memory is dangerous, because a person may have 40 contacts or 2000 (and maybe he/she has taken a picture for each of them, which will be much more memory-consuming). If the iOS device runs out of memory the system will kill the app. The data base approach has no memory problems, but making queries for each cell to appear is so slow that it becomes unacceptable.
Again, I need ideas for this. I can't find a good tradeoff between performance and memory consumption.
Please don't ask for code, because I'm not allowed to post it. I'd really appreciate your advice. Thank you in advance!
Data structures:
Along with the recordref, you should store the name, phone number, and email address, and nothing else in your data store. If one of the three values changes and the other two remain the same, update the changed value. The recordref can change during a restore of a device for many contacts at once, but the name, email, and phone won't. If the user changes a name, email, or phone, they won't do it across many contacts at once. Once in a while you'll end up with a recordref that does not match up with the email and phone (say, the contact may have changed employers); then show a list of close matches and ask the user to select one.
As far as someone having 1000s of contacts, I would use paging. Load 100 or 200 at a time into an array, with the current row displayed in the table view as the middle of your array index. Once the user scrolls 20-30 records, update the records in your array from the address book. You're going to spend a lot of time re-saving data just going through the collection, comparing and trying to keep it up to date. You should be able to store quite a few records as long as you're not keeping the contact images in memory; for those, let the table view handle it. Get the image and assign it to the cell when you get the notification that the cell is about to become visible. Even then, I would put a short wait before loading the image, because if the user is scrolling fast the cell will just fly by, you'll get notified that the cell scrolled out, and you can release the image data. If the user is scrolling slowly, the short wait/sleep will pass and the image will show up for each cell.
I don't know how much metadata you're planning on storing in your app wrapping the contacts, but you should create two tables for the contact object: one with 3-4 indexed columns that allow for faster querying, and a second to hold the rest, loaded only when the user is viewing the contact in a detail view. You can't fit too much into a table view cell anyway, unless you're on the iPad.
Hope that helps.

In a CQRS system, how should I show the user that their request has been received?

I'm trying to decouple some of the bits of our big-ball-of-mud architecture, and identified several boundaries that are obvious candidates for using CQRS to provide a more resilient and scalable solution.
Typical example: when a customer places an order, at the moment we block their thread whilst the order is submitted for payment, approved by the sales system, etc, etc.
This can all be handled asynchronously - allowing us to accept and queue orders whilst the payment processing system is unavailable, etc. - but I'm not sure how I should manage the UI data for the customer.
In other words - they place an order. Their order goes in a queue. If they log back into their account five seconds later and click "review orders" - what happens?
If I draw it from the central repo (or from a cache that's updated based on that repo), then the user won't see their order and will probably try and place it again - or phone us and panic.
If I draw it from a local database, then I have the overhead of maintaining another database of orders - which will need to be synchronised in a load-balanced environment, and seems to undermine a lot of the advantages of CQRS.
I want to do this in lots of places - and not all of them are actions as significant as confirming an order; in some cases it's as simple as a customer changing a phone number or something - so they're not all cases where I can just say "thanks a lot, we'll send you a confirmation e-mail" - because sending confirmation e-mails for every modification to a record strikes me as a little excessive.
Any patterns or solutions I should look at to help with this?
Something worth considering is a 'user inbox': a place in your app where the user can consult 'in-progress' commands. You could also 'push' notifications back to the user's UI when they have already moved on to another screen but still reside in your app. This might also be an option for when the user logs back on.
Another option could be faking the synchronous experience, i.e. waiting around and polling while everything happens asynchronously in the background. Granted, this might involve timeouts as well, but I'd argue those are embraced in today's synchronous processing too.
On top of all this, you may want to both inform and solicit feedback from your end users about how they experience your app and its behavior.
Regardless what anybody tells you, if you want to handle this elegantly, it will take some effort on your part.
The best thing to do is lie!
The user should have no idea that their transaction is in fact a little like Schrödinger's cat, either dead or alive. From their perspective the transaction was a success, because you just indicate to them that it was successful and queue the job away for offline processing.
Because the vast majority of transactions are successful, you can then handle those that are not with an appropriate compensatory mechanism.
Insignificant cases, like modification of some record:
Send the user to a confirmation page telling them something along the lines of "Thanks, your input is being processed. What do you want to do next?" with a couple of links.
If you absolutely have to send the user back to the edited record or a list thereof, in non-distributed systems we're probably talking about milliseconds until the read store has been updated. As long as it takes longer to redirect the user to the new page, from the user's POV everything's fine.
If in some cases the user actually doesn't see his update "immediately", he might call user support. They tell him to hit F5. What? It's there now? Great! Guess what he does next time before reaching for the phone.
Significant cases like offline order processing:
There might be an implicit concept of a Received Order or Pending Order in your domain. If you make this concept explicit, you can present the user with accurate information.
"Thank you very much! Your order has been received an we'll keep you updated once it has been shipped. [Click here] to see a list of your pending orders..."
I think the simplest thing, doing nothing, can often be good enough. If a user changes their phone number and the system processes this command within 1-2 seconds, there is a good chance the user has not had the opportunity to see the old data in between.
If that is not satisfactory, and your user absolutely must know that their request was fulfilled, your UI can subscribe to domain events. Once the command is executed successfully, your UI gets a notification and can inform the user. There are various ways you could do this in the UI. You could simply block until the success notification arrives, or you could say "we received your request" and, once you get confirmation, show a notification window ("your request was fulfilled") somewhere in the corner.
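As a hedged sketch of the polling variant: the UI submits the command, receives a command ID, and polls an endpoint like the one below until the asynchronous handler marks it done. The command_status table, the endpoint itself, and the $db connection are assumptions for illustration only.
<?php
// status.php: assumed polling endpoint; $db is an existing PDO connection.
header('Content-Type: application/json');

$commandId = $_GET['command_id'] ?? '';
$stmt = $db->prepare('SELECT status FROM command_status WHERE command_id = ?');
$stmt->execute([$commandId]);
$status = $stmt->fetchColumn();

// 'pending' until the background handler updates the row to 'completed' or
// 'failed'; at that point the UI shows its confirmation (or error) notice.
echo json_encode([
    'command_id' => $commandId,
    'status'     => $status !== false ? $status : 'pending',
]);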

Trac plugin to send an email with the number of new and closed tickets and their details on a defined schedule

I am looking for a way, or a plugin, to have Trac send me an email with the number of new or closed tickets (and some information about those tickets as well) for a specific duration, let's say the last three days.
Basically, I need to know how many tickets were created in the last week and how many of them were closed by the end of the week.
Of course, the email should only be sent to the admin and not to all the users.
For additional Trac functionality we have Trac plugins, yes. And the first place to look for them is trac-hacks.org.
The excellent TagsPlugin in use over there already delivers some hints on resources tagged with notification or notifications. The most comprehensive and mature solution is certainly TracAnnouncer, with a just-reworked configuration interface providing a highly sophisticated opt-in and opt-out subscription system. Unfortunately, digest notifications are not integrated today.
Still, there are other plugins that fill in the gap; for instance, check the XMailPlugin. It claims to do configurable instant, daily, and weekly notifications, so this may be for you. Since this is a relatively new plugin, you should expect some pending issues, but the author might be very open to your suggestions. If you become a heavy user giving valuable test feedback, and are a bit lucky too, asking kindly could be enough to make things happen.
There's a slightly different way to solve this problem that doesn't require any plugins. First, create a custom "timeline" view that displays the information that you want. In your example, this would be all "opened and closed tickets" starting from "today" and going back three days. When viewing this custom view, you should see a link at the bottom of the page that says "RSS Feed" (on my system, the resulting URL looks something like this: http://myserver/timeline?ticket=on&max=50&authors=&daysback=3&format=rss). Click on this link to subscribe to the feed using your web browser, email client, or other program capable of reading feeds. Now, you can view the results live at any time. What you can do at this point is only limited by the capabilities of your feed reader app, but most can at least be configured to notify you when the feed is updated.