Best practice for sending large messages on ServiceBus - size

We need to send large messages on Service Bus topics; the current size is around 10 MB. Our initial take is to save a temporary file in Blob Storage and then send a message with a reference to the blob. The file is compressed to save upload time. It works fine.
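For illustration only, a minimal sketch of that blob-reference approach might look something like this (assuming Microsoft.Azure.ServiceBus and the WindowsAzure.Storage client; the container, topic, and variable names are made up):
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;
using Microsoft.Azure.ServiceBus.Core;
using Microsoft.WindowsAzure.Storage;

public static class LargeMessageSender
{
    // Uploads the compressed payload to Blob storage and sends only a blob
    // reference on the topic, so the Service Bus message itself stays small.
    public static async Task SendAsync(byte[] payload, string storageConnectionString,
        string serviceBusConnectionString, string topicName)
    {
        // Compress the payload to save upload time
        var compressed = new MemoryStream();
        using (var gzip = new GZipStream(compressed, CompressionMode.Compress, leaveOpen: true))
        {
            gzip.Write(payload, 0, payload.Length);
        }
        compressed.Position = 0;

        // Store the compressed payload in a blob with a unique name
        var container = CloudStorageAccount.Parse(storageConnectionString)
            .CreateCloudBlobClient()
            .GetContainerReference("large-messages");
        await container.CreateIfNotExistsAsync();
        var blobName = Guid.NewGuid().ToString();
        await container.GetBlockBlobReference(blobName).UploadFromStreamAsync(compressed);

        // Send a small message that only carries the blob reference
        var sender = new MessageSender(serviceBusConnectionString, topicName);
        var message = new Message(Encoding.UTF8.GetBytes(blobName));
        message.UserProperties["payload-blob"] = blobName;
        await sender.SendAsync(message);
        await sender.CloseAsync();
    }
}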
Today I read this article: http://geekswithblogs.net/asmith/archive/2012/04/10/149275.aspx
The suggestion there is to split the message into smaller chunks and aggregate them again on the receiving side.
I admit that is a "cleaner approach", avoiding the round trip to Blob Storage. On the other hand, I prefer to keep things simple, and the splitting mechanism introduces extra complexity. There must have been a reason why it wasn't included in Service Bus from the beginning ...
Has anyone tried the splitting approach in a real-life situation?
Are there better patterns?

I wrote that blog article a while ago; the intention was to implement the Splitter and Aggregator patterns using Service Bus. I found this question by chance when searching for a better alternative.
I agree that the simplest approach may be to use Blob storage to store the message body, and send a reference to that in the message. This is the scenario we are considering for a customer project right now.
I remember a couple of years ago, there was some sample code published that would abstract Service Bus and Storage Queues from the client application, and handle the use of Blob storage for large message bodies when required. (I think it was the CAT team at Microsoft, but I'm not sure).
I can't find the sample with a quick Google search, but as it's probably a couple of years old it will be out of date, since the Service Bus client library has improved a lot since then.
I have used message splitting when the message size was too large, but as this was batched telemetry data there was no need to aggregate the messages; I could simply process a number of smaller batches on the receiving end instead of one large message.
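A rough sketch of that kind of splitting, assuming Microsoft.Azure.ServiceBus and an arbitrary batch-size threshold (all names are illustrative):
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.Azure.ServiceBus;

public static class TelemetryBatchSplitter
{
    // Splits a large batch of telemetry readings into several smaller messages.
    // Each message is independent, so no aggregation is needed on the receiver.
    public static IEnumerable<Message> Split(IReadOnlyList<string> readings, int maxPerMessage = 100)
    {
        for (var i = 0; i < readings.Count; i += maxPerMessage)
        {
            var chunk = readings.Skip(i).Take(maxPerMessage);
            yield return new Message(Encoding.UTF8.GetBytes(string.Join("\n", chunk)));
        }
    }
}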
Another disadvantage of the splitter-aggregator approach is that it requires sessions, and therefore a session-enabled queue or subscription. This means that all messages will require sessions, even small ones, and the SessionId cannot be used for any other purpose in the implementation.
If I were you I would not trust the code on the blog post, it was written a long time ago, and I have learned a lot since then :-).
The Blob Storage approach is probably the way to go.
Regards,
Alan

In case someone stumbles upon the same scenario, the Claim Check pattern can help.
Details:
Implement Claim Check Pattern
Use ServiceBus.AttachmentPlugin (assuming you use C#; optionally, you can create your own)
Use external storage, e.g. an Azure Storage account (optionally, you can use other storage)
C# Code Snippet
using System;
using System.Text;
using Microsoft.Azure.ServiceBus;
using Microsoft.Azure.ServiceBus.Core;
using Newtonsoft.Json;
using ServiceBus.AttachmentPlugin;
...
// Getting connection information
var serviceBusConnectionString = Environment.GetEnvironmentVariable("SERVICE_BUS_CONNECTION_STRING");
var queueName = Environment.GetEnvironmentVariable("QUEUE_NAME");
var storageConnectionString = Environment.GetEnvironmentVariable("STORAGE_CONNECTION_STRING");
// Creating config for sending message
var config = new AzureStorageAttachmentConfiguration(storageConnectionString);
// Creating and registering the sender using Service Bus Connection String and Queue Name
var sender = new MessageSender(serviceBusConnectionString, queueName);
sender.RegisterAzureStorageAttachmentPlugin(config);
// Create payload
var payload = new { data = "random data string for testing" };
var serialized = JsonConvert.SerializeObject(payload);
var payloadAsBytes = Encoding.UTF8.GetBytes(serialized);
var message = new Message(payloadAsBytes);
// Send the message
await sender.SendAsync(message);
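For completeness, a hedged sketch of the receiving side, which registers the same plugin so that bodies offloaded to Blob storage are downloaded transparently again (this reuses the config and connection variables above, and assumes the plugin's extension method is also available on receivers):
// Receiving side: register the same attachment plugin so that message bodies
// stored in Blob storage are downloaded transparently before they are read.
var receiver = new MessageReceiver(serviceBusConnectionString, queueName);
receiver.RegisterAzureStorageAttachmentPlugin(config);
var received = await receiver.ReceiveAsync();
var receivedJson = Encoding.UTF8.GetString(received.Body);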
References:
https://learn.microsoft.com/en-us/azure/architecture/patterns/claim-check
https://learn.microsoft.com/en-us/samples/azure/azure-sdk-for-net/azuremessagingservicebus-samples/
https://www.enterpriseintegrationpatterns.com/patterns/messaging/StoreInLibrary.html
https://github.com/SeanFeldman/ServiceBus.AttachmentPlugin
https://github.com/mspnp/cloud-design-patterns/tree/master/claim-check/code-samples/sample-3

Related

How do I make free read-only calls to a smart contract on Hedera blockchain network without incurring charges?

The problem is that I am trying to make free read-only calls to a smart contract on the Hedera network, but am encountering unexpected results. I have tried various methods, but am unable to successfully make the calls without incurring charges. I am looking for a solution or guidance on how to properly make these free read-only calls to the smart contract on Hedera.
//Create the transaction
const transaction = new ContractExecuteTransaction()
.setContractId(newContractId)
.setFunction("get_message")
I expected this get_message call not to charge me HBAR, since the function just returns a hardcoded string, but I can't execute it for free like I want to. How do I do this?
If you're using the SDK, ContractCallQuery() is better suited for read-only queries. See the sample below:
// Query the contract to check changes in state variable
const contractQueryTx1 = new ContractCallQuery()
.setContractId(contractId)
.setGas(100000)
.setFunction("get_message";
const contractQuerySubmit1 = await contractQueryTx1.execute(client);
Note that the SDK still requires a small amount of gas.
There are a couple of other ways to do cost-free queries.
Use mirror nodes. These two tutorials can give you additional information on working with mirror nodes: https://hedera.com/blog/how-to-inspect-smart-contract-transactions-on-hedera-using-mirror-nodes and https://hedera.com/blog/how-to-look-up-transaction-history-on-hedera-using-mirror-nodes-back-to-the-basics
If you use Hashio (https://swirldslabs.com/hashio/) as a JSON-RPC relay, you can use EVM tooling to deploy and interact with contracts on Hedera, and call contracts the way you would on a chain like Ethereum. Here are some examples: https://github.com/hashgraph/hedera-json-rpc-relay/tree/main/tools

Correct way to instrument a Jersey client

My goal is to instrument a Jersey client to collect data on HTTP response/execution time, and I had thought I had the right approach by implementing a JAX-RS ClientRequestFilter and a ClientResponseFilter with code in each to record the start and end of the request. However when used with code like the following:
InputStream is = client.target("https://mytarget")
.request(MediaType.APPLICATION_OCTET_STREAM).get(InputStream.class);
// Actually consume the stream, which seems to block waiting for data
final byte[] bytes = IOUtils.toByteArray(is);
... I experience a significant problem: I only end up measuring what appears to be the time to download the headers, and do not measure the time it takes to download the entire entity, which seems to happen as I convert the input stream to a byte array (just an example; in my case I am reading the entity contents as they become available).
My use case is to instrument it for use with a distributed tracer (AWS X-Ray). An alternative I had considered was to use a library for Apache HttpClient for this purpose, but that would require changing the default transport layer for Jersey which seemed like an extreme modification to fix a simple problem. (This is slightly more straightforward with RESTEasy since it does appear to use Apache HttpClient by default so with RESTEasy I might have gone that route.)
Is it possible with Jersey to set up a filter that executes when the last byte is written to the entity? Or is there a better way to go about instrumenting a Jersey client?

WCF integration task: need to transfer a large amount of data via a SOAP service

I have to design an integration solution that transfers a large amount of data and runs once a day. Company X, which we work with, will invoke the service or services and pass the data as parameters.
Do you have any suggestions for this solution?
For example, do you think I should tell company X that they have to send compressed (gzip?) data?
Or should I implement a usage scenario like this:
while(!allDataSent)
{
SendData(List<object> objects);
}
TransferCompleted();
How do you develop this kind of task?
A good starting point is to have separate endpoints for the service and the client that are going to do the data transfer, because you need to change the timeouts and the maximum limits for how much data can be sent and received on the connection. If it's not time critical, you can use IEnumerable<YourType> as the return type of the operation; on the client end it can be consumed as a stream and saved in batches while the data is still arriving, so the whole result set never has to be held in memory.
On the client side it could look something like this:
foreach (var bytes in serviceClient.GetLargeAmountOfData())
SaveByteToDisc(bytes);
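On the service side, the contract described above might look roughly like this; it is only a sketch, the type and member names are invented, and the binding would still need its limits and timeouts raised as mentioned:
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class DataChunk
{
    [DataMember]
    public byte[] Bytes { get; set; }
}

[ServiceContract]
public interface ILargeDataService
{
    [OperationContract]
    IEnumerable<DataChunk> GetLargeAmountOfData();
}

public class LargeDataService : ILargeDataService
{
    public IEnumerable<DataChunk> GetLargeAmountOfData()
    {
        // Yield the data piece by piece instead of building one huge list up front.
        foreach (var chunk in ReadChunksFromBackingStore())
            yield return chunk;
    }

    private static IEnumerable<DataChunk> ReadChunksFromBackingStore()
    {
        // Placeholder for whatever actually produces the data
        yield return new DataChunk { Bytes = new byte[0] };
    }
}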
Information about the binding properties can be found on MSDN.

Memory exception using WCF wsHttpBinding

I have an application that uploads files to a server. I am using netTcpBinding and wsHttpBinding. When a file is larger than 200 MB, I get a memory exception. Working around this, I have seen people recommend streaming, and of course it works with netTcpBinding for large files (>1 GB), but what would the approach be when using wsHttpBinding? Should I change to basicHttpBinding? Thanks.
I suggest you expose another endpoint just for uploading such large data. It can have a binding that supports streaming. In our previous project we needed to upload files to the server as part of a business process. We ended up creating two endpoints: one dedicated to file upload, and another for all other business functionality.
The streaming data service can be a generic service that streams any data to the server and perhaps returns a token identifying the data on the server. For subsequent requests this token can be passed along to manipulate the data on the server.
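A rough sketch of what such a dedicated upload contract might look like (names are invented; streaming also needs a binding with TransferMode.Streamed, such as basicHttpBinding or netTcpBinding, and a raised maxReceivedMessageSize):
using System.IO;
using System.ServiceModel;

[ServiceContract]
public interface IDataUploadService
{
    // A single Stream parameter lets WCF stream the request body instead of
    // buffering it; the returned token identifies the stored data on the server
    // so later calls can refer to it.
    [OperationContract]
    string UploadData(Stream data);
}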
If you don't want to (or cannot, for legitimate reasons) change the binding or use streaming, what you can do is have a method with a signature along the lines of the following:
void UploadFile(string fileName, long offset, byte[] data)
Instead of sending the whole file, you send small chunks and tell the service where the data should be placed. You can of course add more information, such as the total file size, or a CRC of the file to check whether the transfer was successful.
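On the calling side that could be a simple loop like the sketch below; the chunk size and the client proxy interface are assumptions for illustration:
using System;
using System.IO;

public static class ChunkedUploader
{
    private const int ChunkSize = 64 * 1024;

    // Reads the file in small chunks and sends each one with its offset,
    // so no single message ever exceeds the configured size limits.
    public static void Upload(IFileUploadClient client, string path)
    {
        using (var file = File.OpenRead(path))
        {
            var buffer = new byte[ChunkSize];
            long offset = 0;
            int read;
            while ((read = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                var chunk = new byte[read];
                Array.Copy(buffer, chunk, read);
                client.UploadFile(Path.GetFileName(path), offset, chunk);
                offset += read;
            }
        }
    }
}

// Hypothetical client-side view of the service contract from the signature above.
public interface IFileUploadClient
{
    void UploadFile(string fileName, long offset, byte[] data);
}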

WCF Paged Results & Data Export

I've walked into a project that is using a WCF service for the data tier. Currently, when data is needed for a grid, all rows are returned, the results are bound to the grid, and the dataset is stuffed into a session variable for paging/sorting/rebinding. We've already hit a max message size problem, so I'm thinking it's time to convert from fetch-and-cache to fetching only the current page.
At face value this seems easy enough, but there's a small catch. The user is allowed to export the entire result set at any point. This means that fetching the current page is fine for grid viewing purposes, but when they want to do an export, I still need to make a call for all the data.
This puts me back into the max message size issue. What is the recommended approach for this type of setup?
We are currently using the wsHttpBinding...
Thanks for any assistance.
I think the recommended approach for large files is to use WCF streaming. I'm not sure of the exact details of your scenario, but you could take a look at this as a starting point:
http://msdn.microsoft.com/en-us/library/ms789010.aspx
I would probably do something like this in your case:
Create a service with a "paged" GetData() method, where you specify the page index and the page size as additional parameters. This should give you a nice clean interface for "regular" use, and it should not hit the maxMessageSize limits (see the sketch after this list).
Create a second service (or method) that would send all the data; ideally, you could bundle it up into a ZIP file or similar on the server before sending it. If that ZIP file is still too large, you might want to check out WCF streaming for handling large files, as Andy already pointed out.
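A sketch of what those two pieces might look like on a single contract (all names are invented; the export operation would still want compression or a streaming-enabled endpoint as described above):
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class GridRow
{
    [DataMember]
    public string[] Columns { get; set; }
}

[ServiceContract]
public interface IGridDataService
{
    // Returns a single page of rows, keeping every response comfortably
    // under the configured maxMessageSize.
    [OperationContract]
    List<GridRow> GetData(int pageIndex, int pageSize);

    // Full export for the "download everything" case, ideally a compressed
    // (zipped) payload served from a streaming-enabled endpoint.
    [OperationContract]
    Stream ExportAllData();
}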
The maxMessageSize limit is in place for a good reason: to avoid denial-of-service attacks where a WCF service would just get flooded with large messages and brought to its knees. If you can, always keep that in mind and don't just jack up the maxMessageSize to 2 GB - it might come back to bite you :-)