How to view Azure storage account queue - azure-storage

I have a queue in my Azure storage account. I want a tool to see what is inside that queue.
I have tried 'CloudBerry Explorer for Azure Blob Storage', but that does not let me see the contents of the queue.
I also tried 'Azure Storage Explorer', but I can only see the top 32 messages of the queue.
Can I see all the messages in the queue?
And is there a tool that allows me to change the order of messages in the queue?

Azure Queue storage does not guarantee message ordering and therefore also does not provide a way to reorder messages.
There are two ways to see the contents of existing messages: the Get Messages API and the Peek Messages API. Both retrieve at most 32 messages per call, so there is no way to view more than 32 messages without first dequeuing (and thereby making invisible) the first 32 with the Get Messages API.
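For example, here is a minimal sketch of peeking versus dequeuing with the Python azure-storage-queue SDK (the connection string and queue name are placeholders):

```python
# Minimal sketch; connection string and queue name are placeholders.
from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string(
    conn_str="<your-connection-string>", queue_name="my-queue")

# Peek returns messages without changing their visibility,
# but is capped at 32 messages per call.
for msg in queue.peek_messages(max_messages=32):
    print(msg.id, msg.content)

# Receiving (dequeuing) makes messages invisible for the visibility
# timeout, which is the only way to page past the first 32.
for msg in queue.receive_messages(messages_per_page=32, visibility_timeout=30):
    print(msg.id, msg.content)
```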

Related

Google Cloud Storage Notification with Pub/Sub and docs

In the docs about GCS and Pub/Sub notifications I found this sentence, which is not really clear:
Cloud Pub/Sub also offers at-least-once delivery to the recipient [that's pretty clear],
which means that you could receive multiple messages, with multiple
IDs, that represent the same Cloud Storage event [why?]
Can anyone give a better explanation of this behavior?
Thanks!
Google Cloud Storage uses at-least-once delivery to deliver your notifications to Cloud Pub/Sub. In other words, GCS will publish at least one message into Cloud Pub/Sub for each event that occurs.
Next, a Cloud Pub/Sub subscription will deliver the message to you, the end user, at least once.
So, say that in some rare case, GCS publishes two messages about the same event to Cloud Pub/Sub. Now that one GCS event has two Pub/Sub message IDs. Next, to make it even more unlikely, Pub/Sub delivers each of those messages twice. Now you have received 4 messages, with 2 message IDs, about the same single GCS event.
The important takeaway of the warning is that you should not attempt to dedupe GCS events by Pub/Sub message ID.
At-least-once delivery means that the sender must receive confirmation from the recipient to ensure that the message was received. This requires some sort of timeout period after which the message is re-sent. It is possible, due to network latency or packet loss, for the recipient to send a confirmation that the sender does not receive before the timeout period expires, in which case the sender will send the message again.
This is a common problem in network communications and distributed systems, and there are different types of messaging to address this issue.
To answer the question of 'why'
'At least once' delivery just means messages will be retried via some retry mechanism until successfully delivered (i.e. acknowledged). So if there's a failure or timeout, there's a retry.
By its very nature (a retrying mechanism), this means you might occasionally have duplicates / more-than-once delivery. It's the same whether it's PubSub or GCS notifications delivering the message.
In the scenario you quote, you have:
The Publisher (GCS notification) -- may send duplicates of GCS events to pubsub topic
The PubSub topic messages --- may contain duplicates from publisher
no deduplication as messages come in
all messages assigned unique PubSub message_id even if they are duplicates of the same GCS event notification
PubSub topic Subscription(s) --- may also send duplicates of messages to subscribers
With PubSub
Once a message is sent to a subscriber, the subscriber must either acknowledge or drop the message. A message is considered outstanding once it has been sent out for delivery and before a subscriber acknowledges it.
A subscriber has a configurable, limited amount of time, or ackDeadline, to acknowledge the message. Once the deadline has passed, an outstanding message becomes unacknowledged.
Cloud Pub/Sub will repeatedly attempt to deliver any message that has not been acknowledged or that is not outstanding.
Source: https://cloud.google.com/pubsub/docs/subscriber#at-least-once-delivery
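As a hedged illustration of the ack deadline, here is a minimal subscriber sketch with the google-cloud-pubsub Python library (project and subscription names are placeholders); if ack() is never reached before the deadline, the message is redelivered, which is one source of duplicates:

```python
# Sketch only; "my-project" and "my-subscription" are placeholders.
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "my-subscription")

def callback(message):
    print("message_id:", message.message_id, "data:", message.data)
    message.ack()  # if this never runs (crash, slow handler), Pub/Sub redelivers

streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
try:
    streaming_pull.result(timeout=60)
except Exception:
    streaming_pull.cancel()
```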
With Google Cloud Storage
They need to do something similar internally to 'publish' the notification event from GCS to Pub/Sub, so the reason is essentially the same.
Why this matters
You need to expect occasional duplicates originating from the GCS notifications as well as from the PubSub subscriptions
The PubSub message id can be used to detect duplicates from the pubsub topic -> subscriber
You have to figure out your own idempotent id/token to handle duplicates from the 'publisher' (the GCS notification event)
generation, metageneration, etc. from the resource representation might help
If you need to de-duplicate or achieve exactly once processing, you can then build your own solution utilising the idempotent ids/tokens or see if Cloud Dataflow can accommodate your needs.
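For illustration only, here is a sketch of deduplicating on the GCS notification attributes (bucketId, objectId, objectGeneration, eventType) rather than on the Pub/Sub message_id; the in-memory set and the process_event function are stand-ins for your own durable store and business logic:

```python
# Illustrative only; attribute names assume the documented
# GCS -> Pub/Sub notification format.
seen_events = set()

def handle_gcs_notification(message):
    attrs = message.attributes
    event_key = (
        attrs.get("bucketId"),
        attrs.get("objectId"),
        attrs.get("objectGeneration"),
        attrs.get("eventType"),
    )
    if event_key in seen_events:
        message.ack()   # duplicate of an event we already processed
        return
    seen_events.add(event_key)      # stand-in for a durable store
    process_event(attrs, message.data)  # hypothetical business logic
    message.ack()
```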
You can achieve exactly once processing of Cloud Pub/Sub message streams using Cloud Dataflow PubsubIO. PubsubIO de-duplicates messages on custom message identifiers or those assigned by Cloud Pub/Sub.
Source: https://cloud.google.com/pubsub/docs/faq#duplicates
If you are interested in a more fundamental exploration of why, see:
There is No Now - Problems with simultaneity in distributed systems

How can I get data from RabbitMQ? I don't want consume it from queue

Is there a tool that can view data in a queue? I just want to know what data is in the queue, but I don't want to consume it. The Web UI and REST API just show counts; I want the details.
How can I use Mnesia to query the queue's data, like a MySQL client?
There are a few options
Firehose
You may consider the firehose feature:
https://www.rabbitmq.com/firehose.html
RabbitMQ has a "firehose" feature, where the administrator can enable
(on a per-node, per-vhost basis) an exchange to which publish- and
delivery-notifications should be CCed.
rabbitmq_tracing plugin
https://www.rabbitmq.com/plugins.html
Second queue
Just set up your exchange so it delivers messages to two queues. One queue is for the actual business processing; the second queue is for debugging purposes only. Reading messages from the second queue will consume them. For that debug queue you should enable a reasonable TTL and/or queue length limit; otherwise, unconsumed messages will eventually eat all your disk space.
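A sketch of that setup with the Python pika client, assuming a fanout exchange and queue names invented for this example:

```python
# Sketch only; exchange and queue names are placeholders.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="orders", exchange_type="fanout")

# The real queue that your business consumers read from.
channel.queue_declare(queue="orders.work", durable=True)
channel.queue_bind(queue="orders.work", exchange="orders")

# The debug copy, bounded so unread messages cannot fill the disk.
channel.queue_declare(queue="orders.debug", durable=True, arguments={
    "x-message-ttl": 3600000,   # drop messages after one hour
    "x-max-length": 10000,      # keep at most 10k messages
})
channel.queue_bind(queue="orders.debug", exchange="orders")
```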
Consume and re-send
You may consume a message (to see it) and immediately re-send the same message to the same queue. The RabbitMQ management GUI has this option. Note that this will change the order of the messages.
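A rough sketch of the same consume-and-re-send idea with pika (the queue name is a placeholder):

```python
# Sketch only; "orders.work" is a placeholder queue name.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Fetch one message without auto-acking it.
method, properties, body = channel.basic_get(queue="orders.work", auto_ack=False)
if method is not None:
    print(body)
    # Re-publish the same body via the default exchange (routing key = queue
    # name), then ack the original. The message ends up at the back of the queue.
    channel.basic_publish(exchange="", routing_key="orders.work",
                          body=body, properties=properties)
    channel.basic_ack(delivery_tag=method.delivery_tag)
```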

Is there a way to do hourly batched writes from Google Cloud Pub/Sub into Google Cloud Storage?

I want to store IoT event data in Google Cloud Storage, which will be used as my data lake. But doing a PUT call for every event is too costly, so I want to append events into a file and then do one PUT call per hour. What is a way of doing this without losing data in case a node in my message processing service goes down?
Because if my processing service ACKs the message, the message will no longer be in Google Pub/Sub, but it is also not in Google Cloud Storage yet, and if that processing node goes down at that moment, I would have lost the data.
My desired usage is similar to this post that talks about using AWS Kinesis Firehose to batch messages before PUTing into S3, but even Kinesis Firehose's max batch interval is only 900 seconds (or 128MB):
https://aws.amazon.com/blogs/big-data/persist-streaming-data-to-amazon-s3-using-amazon-kinesis-firehose-and-aws-lambda/
If you want to continuously receive messages from your subscription, then you would need to hold off acking the messages until you have successfully written them to Google Cloud Storage. The latest client libraries in Google Cloud Pub/Sub will automatically extend the ack deadline of messages for you in the background if you haven't acked them.
Alternatively, what if you just start your subscriber every hour for some portion of time? Every hour, you could start up your subscriber, receive messages, batch them together, do a single write to Cloud Storage, and ack all of the messages. To determine when to stop your subscriber for the current batch, you could either keep it up for a certain length of time or you could monitor the num_undelivered_messages attribute via Stackdriver to determine when you have consumed most of the outstanding messages.
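As a hedged sketch of that hourly approach, using synchronous pull with the google-cloud-pubsub and google-cloud-storage Python libraries (project, subscription, bucket, and object names are placeholders):

```python
# Sketch only; all resource names are placeholders, one object per hourly run.
import datetime
from google.cloud import pubsub_v1, storage

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "iot-events-sub")
bucket = storage.Client().bucket("my-data-lake")

batch, ack_ids = [], []
while True:
    # Synchronous pull may return fewer messages than requested.
    response = subscriber.pull(
        request={"subscription": subscription_path, "max_messages": 1000})
    if not response.received_messages:
        break  # treat an empty pull as "backlog drained" for this run
    for received in response.received_messages:
        batch.append(received.message.data.decode("utf-8"))
        ack_ids.append(received.ack_id)

if batch:
    # Write first, ack second: if the write fails, nothing is acked and
    # Pub/Sub will redeliver, so no data is lost.
    object_name = datetime.datetime.utcnow().strftime("events/%Y-%m-%d-%H.jsonl")
    bucket.blob(object_name).upload_from_string("\n".join(batch))
    subscriber.acknowledge(
        request={"subscription": subscription_path, "ack_ids": ack_ids})
```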

Scaling NServicebus hosted in azure worker role and message order

I have an NServiceBus endpoint hosted in an Azure worker role using AzureMessageQueue as the transport. I am pretty sure that I will be running the worker role with more than one instance configured in Azure. I also have a few messages where order is important.
Here is my question. Is there a way to control the order with this type of setup (Azure worker role scaled out)?
Should I be looking at a saga? Techniques like the one described below (using the bus.send(object[] messages) overload) will work in this model, I am guessing, but this is only ideal if there are a few messages, due to the size limit on an Azure queue.
http://mikaelkoskinen.net/post/NServiceBus-In-order-message-processing.aspx
Bus.Send in batch will help, and so will configuring the read batch size from the message queue. If messages are small enough, you can definitely control the order of messages this way. Batch send will put multiple message instances in the same physical queue message, while batch read will pull multiple physical messages from the queue in one go and process them in order on that node.
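This is not the NServiceBus API, but purely as an illustration of the "many logical messages inside one physical queue message" idea, here is a sketch with the Python azure-storage-queue SDK and made-up message contents:

```python
# Illustration only, not NServiceBus; names and payloads are placeholders.
import json
from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string(
    conn_str="<your-connection-string>", queue_name="orders")

# Sender: one physical queue message carries an ordered list of commands
# (subject to the queue's maximum message size).
ordered_batch = [{"step": 1, "action": "reserve"},
                 {"step": 2, "action": "charge"},
                 {"step": 3, "action": "ship"}]
queue.send_message(json.dumps(ordered_batch))

# Receiver: the whole batch arrives together, so the steps are processed
# in order on a single node regardless of how many workers are running.
for msg in queue.receive_messages():
    for command in json.loads(msg.content):
        print("processing", command)
    queue.delete_message(msg)
```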
Another option is to synchronize the reads in the message handlers, using leases on blobs. See the code of the NSB TimeoutManager for how to use these for synchronization, or have a look at Steve Marx's blog post on the topic: http://blog.smarx.com/posts/managing-concurrency-in-windows-azure-with-leases
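As a rough sketch of the blob-lease idea (the linked material uses .NET; this uses the Python azure-storage-blob SDK, with placeholder names and assuming the lock blob already exists):

```python
# Sketch only; container/blob names are placeholders and the lock blob
# is assumed to exist already.
from azure.core.exceptions import HttpResponseError
from azure.storage.blob import BlobClient

blob = BlobClient.from_connection_string(
    conn_str="<your-connection-string>",
    container_name="locks", blob_name="ordered-handler-lock")

try:
    # Only one worker instance can hold the lease at a time (15-60s, or -1 for infinite).
    lease = blob.acquire_lease(lease_duration=15)
except HttpResponseError:
    lease = None  # another instance holds the lock right now

if lease is not None:
    try:
        handle_messages_in_order()  # hypothetical critical section
    finally:
        lease.release()
```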
But please do note that Azure storage queues DO NOT GUARANTEE ORDER themselves; ordering is best effort. If you want to guarantee order at the transport level, you need to use Azure Service Bus queues with message ordering enabled.
Kind regards,
Yves

NServiceBus queue concept

Just started learning NServiceBus and trying to understand the concept.
When it talks about queues, are we talking about MSMQs on both publisher and subscriber?
So, if I have an application that generates a list of something (say, names of animals), it dumps the list into the publisher's queue. The publisher polls the queue every minute and, if there is something in the queue, publishes it to the subscriber's queue for further processing. Does this make sense?
Thanks.
The sequence of events for a publish is as follows:
The Publisher will start up (Windows Service)
A Subscriber will start up and place a message into the Publisher's input queue (MSMQ)
The Publisher will take that message, read the address of the Subscriber, and place that into storage (subscription storage: memory, MSMQ, or RDBMS)
When it is time to publish an event, the Publisher will inspect the type of message and then read subscription storage to find Subscribers interested in that message
The Publisher will then send a message to each of the Subscribers found in subscription storage
The Subscriber receives the message in its input queue (MSMQ) and processes it
You can leverage other messaging platforms instead of MSMQ, but MSMQ is the default. There really is no polling done; all the endpoints are signaled when a message hits their queues.
MSMQ is a transport layer. It passes the messages around.
The application will publish something using an NServiceBus queue. If you configured it to use MSMQ, that's what it will use for its transport layer, and this is what the subscribers will be looking at.
NServiceBus follows the publisher/subscriber model, as you have correctly stated. However, your confusion stems from the use of two queues; that part is incorrect. The server (publisher) maintains the queue, which is interfaced via the MSMQ protocol, so your application communicates directly with it, either locally or remotely.
You would typically use a WCF service that raises an event when a new message is pushed onto the queue. Your application can then make use of the new message as desired. See the NServiceBus documentation for examples: http://www.nservicebus.com/ArchitecturalPrinciples.aspx