I am currently researching the possibility of using NServiceBus for one of our applications. The current application takes large text files and parses the details into a database. The users perform various operations on the file content, approve the changes, and finally release the updated file. When the file is released, various other services need to do something with that file data (drop file in ftp folder, email customer, bill customer).
From what I have read, services should be autonomous and not share data except via messages. I like that concept; however, in my case I am wondering whether it is practical. Some of these files can contain up to a million records.
So my question is: should each service (operations, billing, emailer) have its own database and table for storing this file data and move the data via the DataBus? Or should I be more pragmatic and send only the fileID in the message, referencing a central file table?
Thanks for any guidance you can offer.
There are a couple of things that one should not do with a service bus:
move masses of data
perform queries
large ETL operations
You are certainly able to do all these things but you will probably be left disappointed. The messaging to enable some of these operations is fine, though. Your idea of sending the FileID is definitely the way to go.
As an example: I have previously implemented an e-mail sending service. This service can send attachments, but they can be large. So instead of including the attachments in the messages, I stored them in a shared folder and sent a SendEMailCommand message that also included the unique IDs of the attachments to be sent with the e-mail. The e-mail service would then pick up the attachments from the shared folder. After the service successfully sent the mail, it would publish an EMailSentEvent message.
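NServiceBus is a .NET framework, so the snippet below is only a language-neutral sketch (written in Java) of the message shapes described above; the field names and the shared-folder location are illustrative assumptions. The same shape answers the original question: a "file released" message would carry just the FileID rather than a million parsed records.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Command sent to the e-mail service: it carries only attachment IDs, not the bytes.
class SendEMailCommand {
    String to;
    String subject;
    String body;
    List<String> attachmentIds;   // keys used to locate the files in the shared folder
}

// Event published once the mail has gone out successfully.
class EMailSentEvent {
    String to;
    List<String> attachmentIds;
}

class EMailServiceHandler {
    // Illustrative shared-folder location; in the answer above this is a network share.
    private static final Path SHARED_FOLDER = Paths.get("/shared/attachments");

    void handle(SendEMailCommand cmd) {
        // Resolve each attachment ID to a file in the shared folder.
        List<Path> attachments = cmd.attachmentIds.stream()
                .map(SHARED_FOLDER::resolve)
                .collect(Collectors.toList());
        // ...send the mail with the resolved attachments, then publish EMailSentEvent...
    }
}
```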
I have two simple questions whose answers I did not find in the official documentation for the Google Nearby Messages API:
https://developers.google.com/nearby/messages/android/pub-sub
If you publish multiple messages with the publish method (from the same instance of an app), are the messages saved as several different messages, or is the earlier one updated and overwritten (in the Cloud Console)?
Is it possible to update a message with the publish method?
I'm building an application where each user sees what others are posting, but I only need the most up-to-date data from each user; I don't need all the messages.
Thank you.
With Pub/Sub, you publish messages to a queue. Once they are published, you can't update or delete them.
On the consumer side, the messages are usually delivered in order, but without any guarantee. Each message carries a publish timestamp.
In your use case, it could be worthwhile to keep the userID and the latest processed timestamp in memory. If your application is distributed, the best option is to store these data in Memorystore.
That way, when a message comes in (see the sketch below):
Either it is newer than the value in Memorystore and you process it,
Or it is older and you discard it.
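A minimal sketch of that check in plain Java, with an in-process map standing in for Memorystore (the class and method names are made up for illustration):

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Keeps only the newest message per user, discarding anything older. */
public class LatestMessageFilter {
    // In a distributed deployment this map would live in Memorystore (Redis) instead.
    private final Map<String, Instant> latestByUser = new ConcurrentHashMap<>();

    /** Returns true if the message is at least as new as anything seen for this user. */
    public boolean shouldProcess(String userId, Instant publishTime) {
        // merge() keeps the stored timestamp unless the incoming one is newer.
        Instant stored = latestByUser.merge(userId, publishTime,
                (current, incoming) -> incoming.isAfter(current) ? incoming : current);
        // Note: exact-duplicate timestamps are still treated as "new" in this sketch.
        return stored.equals(publishTime);
    }
}
```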
I want to export channel messages to an FTP server or an external drive. I think we can export messages via the REST API. Could anyone help with this?
If you want to send messages to a REST API, you can use the HTTP Sender destination connector type.
If your REST API Endpoint requires any special headers or authentication, you will need to configure this appropriately (such as by setting variables in the Destination Transformer). Don't forget to put something in the "Content" box at the bottom of the screen - this usually has a value such as ${message.transformedData} or ${message.rawData}.
If you want to send messages to an FTP server, you can use the File Writer destination connector type. Again, make sure you put something such as ${message.transformedData} in the "Template" field.
The POST /channels/{channelId}/messages/_export endpoint is to export messages to files on the server filesystem. When the client does an export to the local file system, it basically writes the results of GET /channels/{channelId}/messages with one file per message and attachments included. See Source.
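If you want to pull messages over the REST API yourself, a rough sketch in Java could look like the following. The /api base path, basic authentication, and the absence of query parameters are assumptions here, so check the REST API documentation for your Mirth Connect version before relying on it.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ExportChannelMessages {
    public static void main(String[] args) throws Exception {
        // Assumed values: server URL, channel id, and credentials.
        String server = "https://mirth.example.com:8443";
        String channelId = "your-channel-id";
        String auth = Base64.getEncoder()
                .encodeToString("admin:admin".getBytes(StandardCharsets.UTF_8));

        URL url = new URL(server + "/api/channels/" + channelId + "/messages");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Authorization", "Basic " + auth);
        conn.setRequestProperty("Accept", "application/json");

        try (InputStream in = conn.getInputStream()) {
            // Write the response somewhere useful: a local file, an FTP client, etc.
            System.out.write(in.readAllBytes());
            System.out.flush();
        }
    }
}
```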
Possibly the most effective way to get all your processed messages offsite is to just take a database backup.
The data pruner also has an option to archive messages to disk as they are pruned, and those files could be picked up and sent offsite if desired.
All:
I am pretty new to deepstream. On its website, the core concepts section describes:
data-sync: Interactive JSON documents that can be edited and observed. Changes are persisted and synced across clients.
and
publish-subscribe: Many clients can subscribe to topics and receive data whenever other clients publish it to the same topic.
I wonder what the difference is between its data-sync and pub-sub in terms of purpose; put another way, what can one do that the other cannot?
Thanks
PubSub is a way for clients and servers to send messages to each other. These messages can contain all sorts of data, but once a message is delivered it's gone - there's no storage or statefulness. If you're familiar with EventEmitters in e.g. JavaScript, you're already familiar with the pattern.
Data-Sync on the other hand is stateful, persistent data. Clients can request JSON documents called records, update them and subscribe to changes made by other clients. Records can be arranged in lists and lists can be referenced by records, allowing data-sync to become the realtime backbone for all the data that drives your app.
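A toy, in-memory Java illustration of that distinction (this is deliberately not deepstream's actual API; all names here are made up):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class EventsVsRecords {
    // Pub/sub: deliver to whoever is subscribed right now, store nothing.
    static final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    static void subscribe(String topic, Consumer<String> listener) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(listener);
    }

    static void publish(String topic, String data) {
        subscribers.getOrDefault(topic, List.of()).forEach(l -> l.accept(data));
        // Nothing is kept: a client that subscribes after this call never sees the message.
    }

    // Data-sync: a named, stored document that late joiners can still read and observe.
    static final Map<String, Map<String, Object>> records = new HashMap<>();

    static Map<String, Object> getRecord(String name) {
        return records.computeIfAbsent(name, n -> new HashMap<>());
    }

    public static void main(String[] args) {
        publish("greetings", "hello");                 // lost: no subscriber yet
        subscribe("greetings", System.out::println);   // will only see future messages
        getRecord("user/123").put("status", "online"); // persists
        System.out.println(getRecord("user/123"));     // {status=online}, even for late joiners
    }
}
```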
I'm going to use Amazon SES for sending emails from the website I'm currently building. Following the sample Java code provided in their API documentation, I developed the functionality and was able to send the emails. But when it comes to handling a huge number of emails in a very short period of time, what is the best mechanism to use? Do they provide any queuing mechanism for emails? I couldn't find one in their API documentation, and their technical support is available only to users who have purchased the account.
Has anyone come across a solution to this problem?
Generally I use a custom SQS solution for a batch mailing process like this.
Sending more than a few emails from a web server isn't ideal, so I usually have the website submit the request for the emails to a back-end process in a single call. I then create an SQS message for each recipient and (in my case) use a Windows service that pulls messages from SQS and sends the emails at the pace I want them to go out. If errors are encountered, the message stays in the queue and gets retried automatically.
With an architecture like this, depending on your volumes you can spin up new instances automatically if the SQS queue size gets too large for a single instance to process in a timely manner.
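A rough sketch of that worker loop using the AWS SDK for Java (v1). The queue URL, sender address, subject/body, and the convention of one recipient address per queue message are all assumptions:

```java
import com.amazonaws.services.simpleemail.AmazonSimpleEmailService;
import com.amazonaws.services.simpleemail.AmazonSimpleEmailServiceClientBuilder;
import com.amazonaws.services.simpleemail.model.Body;
import com.amazonaws.services.simpleemail.model.Content;
import com.amazonaws.services.simpleemail.model.Destination;
import com.amazonaws.services.simpleemail.model.SendEmailRequest;
import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;

public class MailWorker {
    // Hypothetical queue URL and sender address -- substitute your own.
    private static final String QUEUE_URL =
            "https://sqs.us-east-1.amazonaws.com/123456789012/outbound-mail";
    private static final String FROM = "no-reply@example.com";

    public static void main(String[] args) {
        AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
        AmazonSimpleEmailService ses = AmazonSimpleEmailServiceClientBuilder.defaultClient();

        while (true) {
            // Long-poll for up to 10 queued recipients at a time.
            ReceiveMessageRequest receive = new ReceiveMessageRequest(QUEUE_URL)
                    .withMaxNumberOfMessages(10)
                    .withWaitTimeSeconds(20);
            for (Message m : sqs.receiveMessage(receive).getMessages()) {
                String recipient = m.getBody(); // one recipient per queue message (assumed)
                ses.sendEmail(new SendEmailRequest()
                        .withSource(FROM)
                        .withDestination(new Destination().withToAddresses(recipient))
                        .withMessage(new com.amazonaws.services.simpleemail.model.Message(
                                new Content("Subject line"),
                                new Body(new Content("Body text")))));
                // Delete only after a successful send; otherwise the message becomes
                // visible again after the visibility timeout and is retried automatically.
                sqs.deleteMessage(QUEUE_URL, m.getReceiptHandle());
            }
        }
    }
}
```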
I'm playing around with Windows Azure, and I would like to build a cloud-hosted server application that receives messages from many different clients, such as mobile and desktop.
I would like to build the clients so that they work while in "offline mode", i.e. I would like each client to build up a local queue of messages that are sent to the Azure server as soon as it gets back online.
Can I accomplish this using WCF and/or Azure's queuing mechanism, so that I don't have to worry about whether the client is online or offline when I write the code?
You won't need queuing in the cloud to accomplish this. For the client app to be "offline enabled" you need to do the queuing on the client. For this there are many options: a local database, XML files, etc. Whenever the app senses network availability, you can upload your queue to Azure. And yes, you can use WCF for that.
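A minimal sketch of such a client-side outbox, here in Java with a plain HTTP POST standing in for the service call (the file format, endpoint, and class names are assumptions):

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;

/** Client-side outbox: messages queue locally and are flushed when the app is online. */
public class OfflineOutbox {
    private final Deque<String> pending = new ArrayDeque<>();
    private final Path backingFile;   // survives app restarts
    private final URL serviceUrl;     // the server endpoint (assumed)

    public OfflineOutbox(Path backingFile, URL serviceUrl) throws IOException {
        this.backingFile = backingFile;
        this.serviceUrl = serviceUrl;
        if (Files.exists(backingFile)) {
            Files.readAllLines(backingFile).forEach(pending::add);
        }
    }

    public synchronized void enqueue(String message) throws IOException {
        pending.add(message);
        persist();
    }

    /** Call whenever the app detects network availability. */
    public synchronized void flush() throws IOException {
        while (!pending.isEmpty()) {
            if (!trySend(pending.peek())) {
                break;                // still offline; keep the rest queued
            }
            pending.poll();
        }
        persist();
    }

    private boolean trySend(String message) {
        try {
            HttpURLConnection conn = (HttpURLConnection) serviceUrl.openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.getOutputStream().write(message.getBytes(StandardCharsets.UTF_8));
            return conn.getResponseCode() < 300;
        } catch (IOException offline) {
            return false;
        }
    }

    private void persist() throws IOException {
        Files.write(backingFile, pending, StandardCharsets.UTF_8);
    }
}
```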
For the client queue/sync stuff you could take a look at the Sync Framework.
I haven't found a great need for the queue so far. Maybe it's just that I don't see a use for it from my app's point of view. It could also be that the data you can store in a queue message is minimal. You basically store short text strings (like record IDs), and then you have to do something with the ID when you pull it from the queue: look it up, delete it, whatever.
In my app, I didn't use the queue at all, just as Peter suggests. I wrote directly to table storage (accessed via its REST interface using StorageClient) from the client. If you want to look at a concrete example, take a look at http://www.netalerts.mobi/traffic. Like you, I wanted to learn Azure, so I built a small web site.
There's a worker_role that wakes up every 60 seconds. Using one thread, it retrieves any new data from its source (screen scraping a web page). New entries are stored directly in table storage (no need for a queue). Another thread deletes entries in table storage that are older than a specified threshold (there's no issue with running multiple threads against table storage). I'm also working on a third thread, which is designed to send notifications to handheld devices.
The app itself is a web_role, obviously.