understanding azure heartbeat queries - azure-log-analytics

I have a query (Distribution by Agent Category):
Heartbeat | summarize AggregatedValue = count() by Category
which returns a response like: Direct Agent 4,222
I have four VMs, and the number returned does not match the number of VMs. My understanding is that we install one Log Analytics agent per Linux VM; if that's true, how do I get something like 4,222? Any help is highly appreciated.

If you are looking for the OS, use this query:
Heartbeat
| where TimeGenerated > ago(12h)
| summarize dcount(Computer) by OSType, OSName, OSMajorVersion, OSMinorVersion

The Category property indicates whether the logs are obtained directly from the Log Analytics agent, from a SCOM agent, or from a SCOM management server. To understand each of the properties in the Heartbeat table, kindly refer to this document.
Heartbeat records are logged once per minute for each virtual machine. You can verify this by counting heartbeats per virtual machine within a given bin size. Here is a sample Kusto query for reference:
Heartbeat
| where TimeGenerated >= ago(2h) and ResourceType == "virtualMachines"
| summarize count() by Computer, Category, bin(TimeGenerated, 1h)
Kindly note that if there is a discrepancy in the count of heartbeats sent in an hour, it may be due to intermittent issues, or the virtual machine may not have been reachable at specific times in the interval.
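As a sanity check on the original number: since each VM emits one Heartbeat record per minute, 4,222 is a count of heartbeat records, not of VMs. A minimal sketch of the arithmetic (the ~17.6-hour window is an inference from the numbers, not something stated in the question):

```python
VMS = 4
HEARTBEATS_PER_MINUTE = 1  # one Heartbeat record per VM per minute

def expected_heartbeats(vms, minutes):
    """Expected heartbeat-record count for healthy VMs over a window."""
    return vms * HEARTBEATS_PER_MINUTE * minutes

# Over a 12-hour query window, four healthy VMs produce:
print(expected_heartbeats(VMS, 12 * 60))  # 2880

# So a count of 4,222 implies roughly this many hours of data:
print(round(4222 / (VMS * 60), 1))  # 17.6
```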

Related

What are 10 database transaction units in the Azure free trial?

I am looking for a cloud service provider to host a SQL DB in and access it through API calls. After looking through multiple providers, I have seen that Azure has a 12-month free trial, but it only includes a 250 GB S0 instance with 10 database transaction units.
Could anyone explain to me what they mean by 10 DB transaction units? Any help is greatly appreciated.
For reference, our database would not be large in scale, just holding candidate and judge applications, and we get a maximum of 600 candidates per year.
I tried looking up transaction units online and saw that one might be a single REST API call, which seems absurd to me.
Please examine the output of the following query:
SELECT * FROM sys.dm_user_db_resource_governance
That will tell you the following information about the current service tier:
min_cores (cores available on the service)
max_dop (the MAX_DOP value for the user workload)
max_sessions (the maximum number of sessions allowed)
max_db_max_size_in_mb (the maximum max_size value for a data file, in MB)
log_size_in_mb
instance_max_worker_threads (worker thread limit for the SQL Server instance)
A DTU (Database Transaction Unit) is a blended measure of CPU, memory, and I/O, not a count of transactions or API calls. The information above shows what 10 DTUs mean in terms of the resources available. You can run this query each time you change the database's service tier.

Query Redis data after elapsed time

I am currently in the midst of a POC where I plan to store some IoT data in Redis.
Here's my question:
I would like to monitor the data sent by multiple IoT devices and raise alarms if a device fails to report telemetry within a certain time threshold.
For Example:
Device 1: booting at 09:00 am, expected turnaround time 2 min.
After 2 min, 1 sec:
Device 1 has failed to report back in the given time.
Is there a way to query Redis so that it returns the devices whose data has passed a certain time threshold?
Any references will be appreciated, thanks!
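Redis has no built-in alerting for this, but one common pattern is to keep a sorted set of devices scored by last-report timestamp and periodically query for stale entries with ZRANGEBYSCORE. A minimal sketch of that logic in plain Python, where a dict stands in for the Redis sorted set and the device names, timestamps, and threshold are all hypothetical:

```python
import time

# Last-report timestamps, keyed by device id.
# In Redis this would be: ZADD last_seen <timestamp> <device>
last_seen = {}

def record_ping(device, ts=None):
    """Called whenever a device reports telemetry."""
    last_seen[device] = ts if ts is not None else time.time()

def stale_devices(threshold_seconds, now=None):
    """Devices whose last report is older than the threshold.
    In Redis this would be: ZRANGEBYSCORE last_seen -inf <now - threshold>"""
    now = now if now is not None else time.time()
    cutoff = now - threshold_seconds
    return sorted(d for d, ts in last_seen.items() if ts < cutoff)

record_ping("device-1", ts=1000.0)
record_ping("device-2", ts=1110.0)
# At t=1130 with a 120-second threshold, device-1 (last seen at 1000) is stale:
print(stale_devices(120, now=1130.0))  # ['device-1']
```

The ZRANGEBYSCORE query is O(log n + m), so checking for overdue devices stays cheap even with many devices; the alarm itself would still need to be raised by a periodic poller in your service.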

Counting Ignite Physical Servers

Newbie question, but I'm not seeing a clear answer in the docs: I want to run a query on Ignite (2.13) that returns the number of physical Ignite servers, despite Ignite running within containers. I suspect this will require some inference, as Ignite reports an IP address per server (container or physical).
Something like Select * from sys.Nodes; but somehow collapsing containers on the same server together.
Any thoughts? Thx!
There is no such built-in machinery; why would you need it?
I suppose you might do the following instead:
mark all containers running on the same machine with a single user attribute, like MY_SERVER_NUMBER_1, MY_SERVER_NUMBER_2
query the nodes and filter by the unique attribute, something like:
select count(distinct Name) from sys.NODE_ATTRIBUTES where Name like 'MY_SERVER_NUMBER_%'
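The idea of collapsing container nodes by a shared server marker can be sketched in plain Python; the node names and attribute values below are hypothetical, and the set comprehension mirrors the distinct-count query above:

```python
# Each Ignite node (container) exposes its attributes; containers running on
# the same physical host are tagged with the same MY_SERVER_NUMBER_* marker.
node_attributes = [
    ("node-a", "MY_SERVER_NUMBER_1"),
    ("node-b", "MY_SERVER_NUMBER_1"),  # same host as node-a
    ("node-c", "MY_SERVER_NUMBER_2"),
]

# Equivalent of:
#   select count(distinct Name) from sys.NODE_ATTRIBUTES
#   where Name like 'MY_SERVER_NUMBER_%'
physical_servers = len({
    attr for _, attr in node_attributes
    if attr.startswith("MY_SERVER_NUMBER_")
})
print(physical_servers)  # 2
```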

Show message chain in search

I have a message thread; these messages are coming into Splunk.
The chain consists of ten different messages: five messages from one system and five messages from another (backup) system.
Messages from the primary system share the same SrcMsgId value, and messages from the backup system are combined with a common SrcMsgId.
Messages from the standby system also have a Mainsys_srcMsgId value; this value is identical to the main system's SrcMsgId value.
The message chain from the backup system enters Splunk immediately after the messages from the main system.
How can I display the chain of all ten messages? Perhaps first the messages from the first (main) system, then from the second (backup), with the arrival time at the server displayed.
For the time, I understand I will include _time in the query. I have become a little familiar with the query syntax, but I still have a lot of difficulty creating queries.
Please help me with an example of the correct request.
Thank you in advance!
You're starting with quite a challenging query! :-)
To combine the two chains, they'll need a common field. The SrcMsgId field won't do since it can represent different message chains. What you can do is create a new common field using Mainsys_srcMsgId, if present, and SrcMsgId. Then link the messages via that field using streamstats. Finally sort by the common field to put them together. Here's an untested sample query:
index=foo
```Get Mainsys_srcMsgId, if it exists; otherwise, get SrcMsgId```
| eval joiner = coalesce(Mainsys_srcMsgId, SrcMsgId)
| streamstats count by joiner
```Find the earliest event for each chain so can sort by it later```
| eventstats min(_time) as starttime by joiner
```Order the results by time, msgId, sequence```
| sort starttime joiner count
```Discard our scratch fields```
| fields - starttime joiner count
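The coalesce-then-sort idea in the query above can be sketched in plain Python; the sample events are hypothetical, and each step mirrors the corresponding SPL command in a comment:

```python
# Sample events: backup-system messages carry Mainsys_srcMsgId pointing
# at the main system's SrcMsgId.
events = [
    {"_time": 100, "SrcMsgId": "A"},                             # main
    {"_time": 105, "SrcMsgId": "B-1", "Mainsys_srcMsgId": "A"},  # backup
    {"_time": 101, "SrcMsgId": "A"},                             # main
]

for e in events:
    # eval joiner = coalesce(Mainsys_srcMsgId, SrcMsgId)
    e["joiner"] = e.get("Mainsys_srcMsgId", e["SrcMsgId"])

# eventstats min(_time) as starttime by joiner
start = {}
for e in events:
    start[e["joiner"]] = min(start.get(e["joiner"], e["_time"]), e["_time"])

# sort starttime joiner _time  (keeps each chain together, in time order)
ordered = sorted(events, key=lambda e: (start[e["joiner"]], e["joiner"], e["_time"]))
print([e["_time"] for e in ordered])  # [100, 101, 105]
```

Because every message in a chain resolves to the same joiner value, the whole chain sorts as one group even though the backup messages arrive later.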

Storing time intervals efficiently in redis

I am trying to track server uptimes using redis.
So the approach I have chosen is as follows:
server xyz will keep sending my service pings indicating that it was alive and working in the last 30 seconds.
My service will store a list of all time intervals during which the server was active. This will be done by storing a list of {startTime, endTime} pairs in Redis, with the key being the name of the server (xyz).
Depending on a user query, I will use this list to generate server uptime metrics, like the % downtime between times (T1, T2).
Example:
assume that the time is T currently.
at T+30, server sends a ping.
xyz:["{start:T end:T+30}"]
at T+60, server sends another ping
xyz:["{start:T end:T+30}", "{start:T+30 end:T+60}"]
and so on for all pings.
This works fine, but an issue is that over a large time period this list will accumulate a lot of elements. To avoid this, currently, on a ping I pop the last element of the list and check whether it can be merged with the latest time interval. If it can be merged, I coalesce them and push a single time interval onto the list; if not, two time intervals are pushed.
So with this, my list becomes the following after step 2: xyz:["{start:T end:T+60}"]
Some problems I see with this approach are:
the merging is being done in my service, not in Redis.
in case my service is distributed, the list ordering might get corrupted due to multiple readers and writers.
Is there a more efficient/elegant way to handle this, like maybe handling the merging of time intervals in Redis itself?
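The merge-on-ping step described above, popping the last interval and coalescing when contiguous, can be sketched in plain Python; the interval values are hypothetical, and the Redis list operations each step would map to are noted in comments:

```python
def record_ping(intervals, start, end):
    """Append [start, end], merging with the last interval if contiguous."""
    if intervals and intervals[-1][1] >= start:
        last = intervals.pop()                          # RPOP xyz
        intervals.append((last[0], max(last[1], end)))  # RPUSH merged interval
    else:
        intervals.append((start, end))                  # RPUSH new interval
    return intervals

uptime = []
record_ping(uptime, 0, 30)     # T .. T+30
record_ping(uptime, 30, 60)    # contiguous with (0, 30) -> merged
record_ping(uptime, 120, 150)  # gap -> new interval
print(uptime)  # [(0, 60), (120, 150)]
```

Running the pop-merge-push sequence inside a single Lua script (EVAL) would execute it atomically on the Redis side, which addresses the concern about multiple distributed writers corrupting the list.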