What metrics can be used as a heartbeat for a Gobblin application? - hadoop-yarn

Below are the running services for which I am trying to find heartbeat metrics:
JobMgmtServer
AdminWebServer
GobblinApplicationMaster
GobblinWorkUnitRunner/GobblinYarnTaskRunner

Related

HiveMQ/RabbitMQ as load balancing MQTT node(s) before Thingsboard IoT system

Our endpoint devices are pushing data over MQTT to an IoT system based on the Thingsboard IoT platform. There is only one MQTT topic called /telemetry where all devices connect. The server knows which device the data belongs to based on the device's token used as the MQTT username.
Peaks in data load are frequent, and they cause outages.
My question is:
Is it possible to use HiveMQ (or RabbitMQ, or a similar product) between the devices and our IoT system to avoid data loss and smooth out the peaks, and if so, how?
This post explains how to use Quality of Service levels, offline buffering, throttling, automatic reconnect and more to avoid data loss and maintain uptime.
The TL;DR is that MQTT and HiveMQ have built-in features that help avoid data loss, guarantee delivery, absorb traffic spikes, and handle back-pressure.
It may be worth considering what you can do with your existing tools before expanding your deployment footprint, which adds complexity that may not be warranted.
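To make the buffering and throttling idea concrete, here is a minimal sketch in plain Python. It is not HiveMQ or MQTT client code; `SmoothingBuffer`, `simulate`, and the capacity and rate numbers are made up for illustration. It only shows how a bounded buffer in front of a slow consumer turns a burst into a steady stream, and where loss starts once the buffer is full.

```python
from collections import deque


class SmoothingBuffer:
    """Bounded FIFO that absorbs bursts and releases at most `rate` messages per tick."""

    def __init__(self, capacity, rate):
        self.capacity = capacity  # maximum number of buffered messages
        self.rate = rate          # maximum messages forwarded per tick
        self.queue = deque()
        self.dropped = 0

    def offer(self, message):
        """Accept a message, or count it as dropped when the buffer is full."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1
            return False
        self.queue.append(message)
        return True

    def drain(self):
        """Release up to `rate` buffered messages, oldest first."""
        batch = []
        while self.queue and len(batch) < self.rate:
            batch.append(self.queue.popleft())
        return batch


def simulate(arrivals_per_tick, capacity, rate):
    """Feed a bursty arrival pattern through the buffer.

    Returns the number of messages forwarded on each tick and the drop count.
    """
    buf = SmoothingBuffer(capacity, rate)
    forwarded = []
    seq = 0
    for arrivals in arrivals_per_tick:
        for _ in range(arrivals):
            buf.offer(seq)
            seq += 1
        forwarded.append(len(buf.drain()))
    # Keep draining after the input stops until the backlog is gone.
    while buf.queue:
        forwarded.append(len(buf.drain()))
    return forwarded, buf.dropped


if __name__ == "__main__":
    forwarded, dropped = simulate([1, 10, 0, 0, 1], capacity=20, rate=3)
    print("forwarded per tick:", forwarded, "dropped:", dropped)
```

The downstream system never sees more than `rate` messages per tick, at the cost of added latency during a burst. If the buffer is too small for the burst, messages are dropped, which is the case a persistent broker session or a disk-backed queue is meant to cover.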
I would recommend using Apache Kafka or Confluent in between the MQTT Broker & ThingsBoard. Kafka stores all data on disk (instead of RAM in the case of RabbitMQ) and is scalable among multiple cluster nodes. You could also reload data into ThingsBoard by resetting offsets. This could be useful if there was an error in the configuration of a rule chain and you want ThingsBoard to reprocess the data.
To connect with Kafka/Confluent you can use the ThingsBoard Integration.
Find more details here:
https://medium.com/python-point/mqtt-and-kafka-8e470eff606b
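The replay-by-resetting-offsets idea can be illustrated with a toy model. This is an in-memory stand-in, not the Kafka client API: `TopicLog` and `Consumer` are hypothetical classes, and the point is only that records stay in the log after being read, so moving the consumer's offset back makes them available again.

```python
class TopicLog:
    """Append-only list standing in for a single partition of a topic."""

    def __init__(self):
        self.records = []

    def append(self, record):
        """Store a record and return its offset."""
        self.records.append(record)
        return len(self.records) - 1


class Consumer:
    """Reads from a TopicLog and tracks its own position (offset)."""

    def __init__(self, log, offset=0):
        self.log = log
        self.offset = offset

    def poll(self, max_records=100):
        """Return records from the current offset and advance past them."""
        batch = self.log.records[self.offset:self.offset + max_records]
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        """Move the read position; earlier records are read again on the next poll."""
        self.offset = offset


if __name__ == "__main__":
    log = TopicLog()
    for reading in ({"temp": 21}, {"temp": 22}, {"temp": 23}):
        log.append(reading)

    consumer = Consumer(log)
    first_pass = consumer.poll()   # processed with the faulty configuration
    consumer.seek(0)               # reset after fixing the configuration
    second_pass = consumer.poll()  # the same records, processed again
    print(first_pass == second_pass)
```

With a real cluster the equivalent step would be a seek on the consumer or an offset reset on the consumer group, and how far back you can go depends on the topic's retention settings.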

How to check logs between OPC Publisher and IoT Hub to confirm the data transfer

I have set up IoT Edge on one of our machines, installed OPC Publisher, and connected it to one of our OPC UA servers, which sends data to OPC Publisher and from there to IoT Hub. We did not receive any data in our IoT Hub for the last 10 days, and then suddenly today data arrived. How can we troubleshoot why the data is missing for the last 10 days?
You can generate a support bundle on your edge device that will collect the logs of all deployed modules as well as the edge runtime logs.
sudo iotedge support-bundle --since 11d
More details on troubleshooting IoT Edge here
You can first look into the logs of the publisher and validate if the connection to the OPC UA Server was/is active. If this is fine then have a look into the edgeHub and validate if the upstream connectivity to IoT Hub was affected.
One of the most powerful tools to monitor your edge deployments is the integration with Azure Monitor. It will collect metrics from the edgeHub and edgeAgent, which combined will give you an overview of where your messages are going. It can show you how many messages are sent to your upstream endpoint and when.
For a full overview of the capabilities, you can check out this blog post. Installation steps are here
Edit:
OPC Publisher also supports diagnostic logging, which will give you more information about the connections to OPC servers. To do this, you need to set the diagnostic interval. You can do this by specifying the --di command argument in your createOptions:
"OPCPublisher": {
    "settings": {
        "image": "<image>",
        "createOptions": {
            "Cmd": ["--di=60"]
        }
    },
    "type": "docker",
    "version": "1.0",
    "status": "running",
    "restartPolicy": "always"
}
The example above will log diagnostic metrics every 60 seconds. You can then upload the logs using the support bundle command from Cristian's answer, or use the UploadSupportBundle direct method to do the same without needing access to the device.

How do I configure the broker heartbeat in airflow so I can see the heartbeat in Connections tab in RMQ management UI?

I am currently using Airflow 1.10.15 with CeleryExecutor and RabbitMQ as the broker. I need to configure the heartbeat so that I can see the heartbeat value in the Connections tab of the RMQ Management UI. I see heartbeat settings in airflow.cfg, but I think those are application-level heartbeats. I need the setting for the RMQ connection. I think Airflow sets this to some default value, but I am not seeing the value in the Connections tab. How can I set the broker heartbeat in Airflow?
Update 9/14: It looks to me like the Airflow workers do have a heartbeat (with a default of 60 seconds), but I also see connections that don't have one. Is this because of zombie tasks (tasks/pods that Airflow can no longer track)? Also, if I delete a connection, another one is created with zero heartbeat, which I suspect is created by an Airflow worker.
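One thing that may be worth trying, as a sketch only: Celery has a `broker_heartbeat` setting for AMQP connections, and Airflow 1.10 lets you point `celery_config_options` in the `[celery]` section of airflow.cfg at your own config dictionary. The module name `my_celery_config` and the value 60 below are assumptions for illustration, and whether every connection Airflow opens (for example from the scheduler) will then show a heartbeat is not something this confirms.

```python
# my_celery_config.py  (hypothetical module name; it must be importable by Airflow)
try:
    # Airflow's built-in Celery settings, used here as the base configuration.
    from airflow.config_templates.default_celery import DEFAULT_CELERY_CONFIG
except ImportError:
    # Fallback only so this sketch can be run outside an Airflow environment.
    # In a real deployment the import above should succeed.
    DEFAULT_CELERY_CONFIG = {}

CELERY_CONFIG = {
    **DEFAULT_CELERY_CONFIG,
    # Celery setting: heartbeat interval in seconds requested for AMQP connections.
    "broker_heartbeat": 60,
}
```

You would then set `celery_config_options = my_celery_config.CELERY_CONFIG` under `[celery]` in airflow.cfg and restart the scheduler and workers before checking the Connections tab again.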

Azure IoT on Edge - IoTSDK - Read batch of messages from ModuleClient

I'm trying to develop a high-frequency message dispatching application, and I'm looking at how the SDK behaves when reading messages from a ModuleClient connected to the edgeHub with the "MQTT on TCP Only" transport settings.
It seems there is no way to read multiple messages at a time (a batch) from the edgeHub. I think this is related to the underlying protocol.
As a result, one must read a message, process it, and acknowledge it to the hub, one message after another.
Is there a way to process multiple messages at a time without waiting for the previous one to finish processing?
Is this "limitation" tied to the underlying protocol?
I'm using Microsoft.Azure.Devices.Client 1.37.2 on a .NET Core 3.1 application deployed on Azure Kubernetes (AKS) by using Azure IoT Edge on Kubernetes workload.
You are correct, you cannot use batching with the MQTT protocol. This is a limitation of IoT Hub when using MQTT.
IoT Hub only supports batch send over AMQP and HTTPS at the moment.
The MQTT implementation loops over the batch and sends each message individually.
Ref: https://github.com/Azure/azure-iot-sdk-csharp
If you need IoT Hub to support batching when connecting over MQTT, I suggest filing a feature request: https://feedback.azure.com/forums/321918-azure-iot-hub-dps-sdks
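Batching aside, one common way to avoid waiting on each message is to make the receive callback do nothing but hand the message to a queue, and let a pool of workers process the queue concurrently. The sketch below shows the pattern with plain `asyncio` in Python rather than the .NET SDK; `on_message`, `process`, and the `"completed"` return value are hypothetical stand-ins, not SDK calls.

```python
import asyncio


async def process(message):
    """Stand-in for the real per-message work."""
    await asyncio.sleep(0.01)
    return message.upper()


async def worker(queue, results):
    """Pull messages off the queue and process them until cancelled."""
    while True:
        message = await queue.get()
        try:
            results.append(await process(message))
        finally:
            queue.task_done()


async def on_message(queue, message):
    """Hypothetical receive callback: enqueue and return right away."""
    await queue.put(message)  # blocks only when the queue is full (back-pressure)
    return "completed"


async def main(messages, concurrency=4):
    queue = asyncio.Queue(maxsize=100)
    results = []
    workers = [
        asyncio.create_task(worker(queue, results)) for _ in range(concurrency)
    ]
    for message in messages:
        await on_message(queue, message)
    await queue.join()  # wait until every queued message has been processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    print(sorted(asyncio.run(main(["a", "b", "c", "d", "e"]))))
```

The trade-off is that a message is acknowledged before it has been processed, so anything still in the queue is lost if the module crashes. If that matters, the queue needs to be durable or the acknowledgement has to wait for processing.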

Accessing app specific logging/metrics data in Kubernetes cluster

I have a Python app running on a Kubernetes Cluster. I want to get app specific monitoring information e.g. logging info that I have in my app (using python logging module) and also metrics info that I am collecting using collectd.
I understand Operations Management Suite can be used to monitor the cluster itself but can it also provide access to app specific logs and metrics?
Appreciate any pointers on how to do this.
Thanks
Rajeev
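For the logging half of the question, a common approach is to have the app write structured log lines to stdout and let whatever log collector runs on the cluster pick them up from the container output. Below is a minimal sketch using only the standard `logging` module; the JSON field names are an arbitrary choice, and whether your monitoring agent parses them into separate fields depends on its configuration. It does not cover the collectd metrics.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level=logging.INFO, stream=None):
    """Send all log records to `stream` (stdout by default) as JSON lines."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers[:] = [handler]  # replace any handlers configured earlier
    root.setLevel(level)
    return handler


if __name__ == "__main__":
    configure_logging()
    logging.getLogger("myapp").info("worker started, queue=%s", "ingest")
```

Once this is deployed, `kubectl logs <pod>` shows the same JSON lines, which is a quick way to check the output before looking for it in the monitoring tool.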