Getting CPU statistics from libvirt - virtual-machine

Is it possible using the python bindings of libvirt to get the running, waiting and ready time of a VM from the host?

I'm not sure what you mean by "waiting" and "ready" time, but whatever they are, I believe this is possible using the getCPUStats() function called on a domain object. The docstring of that function follows:
getCPUStats(total, flags=0)
Extracts CPU statistics for a running domain. On success it will return a list of data of dictionary type. If boolean total is False or 0, the first element of the list refers to CPU0 on the host, second element is CPU1, and so on. The format of data struct is as follows:
[{cpu_time:xxx}, {cpu_time:xxx}, ...]
If it is True or 1, it returns total domain CPU statistics in the format of
[{cpu_time:xxx, user_time:xxx, system_time:xxx}]
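As a minimal sketch of how the call above might be used: the helper names cpu_stats_in_seconds and fetch_domain_cpu_stats, the connection URI and the domain name are assumptions for illustration, and the conversion relies on libvirt reporting these counters in nanoseconds. Note that the result only contains the fields listed in the docstring, so it does not by itself answer the "waiting" or "ready" time part of the question.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def cpu_stats_in_seconds(stats):
    """Convert each counter in a getCPUStats() result from nanoseconds to seconds."""
    return [
        {key: value / NANOSECONDS_PER_SECOND for key, value in entry.items()}
        for entry in stats
    ]


def fetch_domain_cpu_stats(domain_name, uri="qemu:///system"):
    """Read total and per-CPU statistics for one running domain.

    The URI and domain name are placeholders; adjust them to your host.
    """
    import libvirt  # imported lazily so the helper above works without libvirt

    conn = libvirt.open(uri)
    try:
        dom = conn.lookupByName(domain_name)
        total = dom.getCPUStats(True)     # one dict with the domain totals
        per_cpu = dom.getCPUStats(False)  # one dict per host CPU
    finally:
        conn.close()
    return cpu_stats_in_seconds(total), cpu_stats_in_seconds(per_cpu)
```

On a host with libvirt installed, fetch_domain_cpu_stats("my-vm") would return the two lists with values in seconds, where "my-vm" stands for the name of a running domain.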

Spark - Failed to load collect frame - "RetryingBlockFetcher - Exception while beginning fetch"

We have a Scala Spark application, that reads something like 70K records from the DB to a data frame, each record has 2 fields.
After reading the data from the DB, we make minor mapping and load this as a broadcast for later usage.
In the local environment, the RetryingBlockFetcher times out with an exception while running the following code:
dataframe.select("id", "mapping_id")
  .rdd.map(row => row.getString(0) -> row.getLong(1))
  .collectAsMap().toMap
The exception is:
2022-06-06 10:08:13.077 task-result-getter-2 ERROR org.apache.spark.network.shuffle.RetryingBlockFetcher Exception while beginning fetch of 1 outstanding blocks
java.io.IOException: Failed to connect to /1.1.1.1:62788
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:253)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:195)
    at org.apache.spark.network.netty.NettyBlockTransferService$$anon$2.createAndStart(NettyBlockTransferService.scala:122)
In the local environment, I simply create the Spark session with a local "spark.master".
When I limit the max of records to 20K, it works well.
Can you please help? Maybe I need to configure something in my local environment so that the original code works properly?
Update:
I tried changing a lot of Spark-related configurations in my local environment (memory, number of executors, timeout-related settings, and more), but nothing helped! I just got the timeout after more time...
I realized that the data frame I'm reading from the DB has 1 partition of 62K records. When I repartitioned it into 2 or more partitions, the process worked correctly and I managed to map and collect as needed.
Any idea why this solves the issue? Is there a Spark configuration that can solve this instead of repartitioning?
Thanks!

ScrapeFrequencyInSecs is not working in metrics-collector module

I have deployed the metrics-collector module with a ScrapeFrequencyInSecs value of 60, so it should scrape data every minute, but when I check the data in InsightsMetrics, I am still getting data every 5 minutes or so.
I am using
mcr.microsoft.com/azureiotedge-metrics-collector:1.0
Version 1.
The mcr.microsoft.com/azureiotedge-metrics-collector:1.0 module has a default ScrapeFrequencyInSecs value of 300, which matches the 5-minute interval you are seeing.
Note: the setting is passed as an environment variable, so the module needs a restart after you update it.
You can refer to MS Q&A: can we set the interval for sending metrics using Metrics collector module to log analysis workspace

What is the best way to communicate among multiple processes in ubuntu

I have three different machine learning models in Python. To improve performance, I run them in parallel in different terminals. They communicate and share data with one another through files: each model creates batches of files to make available to the others. All the processes run in parallel, but each depends on data prepared by another process. Once process A has prepared a batch of data, it creates a file to signal to the other process that the data is ready; process B then starts processing it while simultaneously looking for the next batch. How can this large amount of data be shared with the next process without creating files? Is there a better way to communicate among these processes in Python without creating and deleting temporary files?
Thanks
You could consider running up a small Redis instance... a very fast, in-memory data structure server.
It allows you to share strings, lists, queues, hashes, atomic integers, sets, ordered sets between processes very simply.
As it is networked, you can share all these data structures not only within a single machine, but across multiple machines.
As it has bindings for C/C++, Python, bash, Ruby, Perl and so on, it also means you can use the shell, for example, to quickly inject commands/data into your app to change its behaviour, or get debugging insight by looking at how variables are set.
Here's an example of how to do multiprocessing in Python3. Instead of storing results in a file the results are stored in a dictionary (see output)
from multiprocessing import Pool, cpu_count

def multi_processor(function):
    # Test: put 6 strings in the list so the function runs six times,
    # in parallel if your CPU has enough cores
    file_list = ["test1", "test2", "test3", "test4", "test5", "test6"]
    # Use max number of system processors - 1 (but at least one)
    pool = Pool(processes=max(1, cpu_count() - 1))
    results = {}
    # For every item in file_list, submit a task to the pool
    for aud_file in file_list:
        results[aud_file] = pool.apply_async(function, args=("arg1", "arg2"))
    # Wait for all tasks to finish before proceeding
    pool.close()
    pool.join()
    # Results and any errors are returned, keyed by item name
    return {name: result.get() for name, result in results.items()}

def your_function(arg1, arg2):
    try:
        print("put your stuff in this function")
        your_results = ""
        return your_results
    except Exception as e:
        return str(e)

if __name__ == "__main__":
    some_results = multi_processor(your_function)
    print(some_results)
The output is
put your stuff in this function
put your stuff in this function
put your stuff in this function
put your stuff in this function
put your stuff in this function
put your stuff in this function
{'test1': '', 'test2': '', 'test3': '', 'test4': '', 'test5': '', 'test6': ''}
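For the hand-off described in the question (process A prepares a batch, process B picks it up as soon as it is ready), a multiprocessing.Queue can replace the marker files, provided all stages are started from one parent script rather than from separate terminals. This is a minimal sketch under that assumption; producer, consumer, run_pipeline and the toy batches are hypothetical stand-ins for the real models.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # marks the end of the stream


def producer(out_queue, n_batches):
    """Process A: prepare batches and hand each one over as soon as it is ready."""
    for i in range(n_batches):
        batch = [i * 10 + j for j in range(3)]  # stand-in for real model output
        out_queue.put(batch)
    out_queue.put(SENTINEL)


def consumer(in_queue, out_queue):
    """Process B: block until a batch arrives, process it, forward the result."""
    while True:
        batch = in_queue.get()  # blocks, so no polling for marker files
        if batch is SENTINEL:
            break
        out_queue.put(sum(batch))  # stand-in for the real processing step
    out_queue.put(SENTINEL)


def run_pipeline(n_batches=3):
    """Start both stages as separate processes and collect the final results."""
    a_to_b, b_to_main = Queue(), Queue()
    workers = [
        Process(target=producer, args=(a_to_b, n_batches)),
        Process(target=consumer, args=(a_to_b, b_to_main)),
    ]
    for worker in workers:
        worker.start()
    results = []
    while True:  # drain the result queue before joining the workers
        item = b_to_main.get()
        if item is SENTINEL:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline())  # prints [3, 33, 63]
```

Everything put on the queue is pickled and copied between processes, so for very large batches the copying cost may matter; the sketch only shows the signalling pattern.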
Try using a SQLite database to share data.
I made this for this exact purpose:
https://pypi.org/project/keyvalue-sqlite/
You can use it like this:
from keyvalue_sqlite import KeyValueSqlite
DB_PATH = '/path/to/db.sqlite'
db = KeyValueSqlite(DB_PATH, 'table-name')
# Now use standard dictionary operators
db.set_default('0', '1')
actual_value = db.get('0')
assert '1' == actual_value
db.set_default('0', '2')
assert '1' == db.get('0')

Ignite Data streamer optimization

I am using below settings:
allowOverwrite: false
nodeParallelOperations: 1
autoFlushFrequency: 10
perNodeBufferSize: 5000000
My record size is around 2000 bytes, and the "grid-data-loader-flusher" thread stats are as below:
Thread                          Count   Average         Longest      Duration
grid-data-loader-flusher-#100   38      4,737,793.579   30,427,862   180,036,156
What would be the best configurations for Data streamer?
Thanks
It's good to use a parallel streaming mode with the data streamer. You can achieve this by collecting your key-value records in a Java Map and calling streamer.addData() in parallel over that map. Here is the snippet:
maptoStream.entrySet().parallelStream().forEach(streamer::addData);
Also, if you set allowOverwrite to false, you can't use a custom stream receiver to process your collection of records. In this case the streamer skips a record if it is already in the cache.
Regarding the buffer size: without a flush frequency, you have to wait until the buffer is full each time before it is flushed automatically to the cache. The flush frequency comes to your rescue here by flushing periodically, so whichever condition is satisfied first (the buffer gets full or the flush interval elapses) triggers a flush. I preferred calling flush manually after the above method call.
I observed that the streamer works well with a much bigger collection on which you call streamer.addData() in parallel.
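The "whichever comes first" flush rule can be pictured with a toy model. This is not Ignite code and not how the data streamer is implemented (the real streamer flushes from a background thread, whereas this sketch only checks the timer when an item is added); BufferedFlusher and its parameters are hypothetical names used purely for illustration.

```python
import time


class BufferedFlusher:
    """Toy buffer that flushes when it is full or when the flush interval elapsed."""

    def __init__(self, buffer_size, flush_interval_s, sink, clock=time.monotonic):
        self.buffer_size = buffer_size
        self.flush_interval_s = flush_interval_s
        self._sink = sink      # callable that receives each flushed batch
        self._clock = clock    # injectable so the timing can be tested
        self._buffer = []
        self._last_flush = clock()

    def add(self, item):
        self._buffer.append(item)
        buffer_full = len(self._buffer) >= self.buffer_size
        interval_elapsed = self._clock() - self._last_flush >= self.flush_interval_s
        if buffer_full or interval_elapsed:  # whichever condition holds first
            self.flush()

    def flush(self):
        """Manual flush, comparable to calling flush after the last addData()."""
        if self._buffer:
            self._sink(list(self._buffer))
            self._buffer.clear()
        self._last_flush = self._clock()


if __name__ == "__main__":
    batches = []
    flusher = BufferedFlusher(buffer_size=3, flush_interval_s=10.0, sink=batches.append)
    for record in ["a", "b", "c", "d"]:
        flusher.add(record)
    flusher.flush()  # push out the partially filled buffer
    print(batches)   # prints [['a', 'b', 'c'], ['d']]
```

With a very large buffer and no timer, the last partial batch would sit in the buffer until something flushes it, which is why the manual flush at the end matters.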

AUTOSAR configuration - DCM module

I am stuck configuring the DCM module. The parameter I am currently trying to configure is DcmTimStrP2AdjustServer.
The requirement is P2CAN_SERVER_MAX = 25ms; P2STARCAN_SERVER_MAX = 5000ms;
Is DcmDspSessionP2ServerMax the same as P2CAN_SERVER_MAX? If it is the same,
what is the need for DcmTimStrP2AdjustServer, and how do I find the best value for it? (The values should all be multiples of DcmTaskTime, which seems logical to me.)
DcmTaskTime = 5ms;
I am following AUTOSAR 4.0.3 and using the ETAS tool to configure the parameters.
To fulfill your requirement, you need to configure DcmDspSessionP2ServerMax and DcmDspSessionP2StarServerMax respectively for each session control in the DcmDspSessionRows at Dcm/DcmConfigSet/DcmDsp/DcmDspSession/,
i.e.
DcmDspSessionP2ServerMax 25
DcmDspSessionP2StarServerMax 5000
There is no DcmTimStrP2AdjustServer, but I guess you're referring to DcmTimStrP2ServerAdjust instead. DcmTimStrP2ServerAdjust and DcmTimStrP2StarServerAdjust should be configured to a multiple of your DcmTaskTime (5ms in your case, so e.g. 5ms, 10ms, 15ms, ... are applicable) and are used to safeguard that the response is available on the bus before the P2 or P2* timeouts are triggered. In your case you may want to set these values to the same values as in the DcmDspSessionRows if no other specification is given, because the timeout values chosen there are already multiples of your DcmTaskTime:
DcmTimStrP2ServerAdjust 25
DcmTimStrP2StarServerAdjust 5000
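The multiple-of-DcmTaskTime constraint is simple arithmetic and can be checked before entering values in the tool. The sketch below is only an illustration of that check; the helper names are hypothetical and nothing here is part of AUTOSAR or of any configuration tool.

```python
def is_multiple_of_task_time(value_ms, task_time_ms):
    """Return True if value_ms is a whole number of DCM task cycles."""
    return value_ms % task_time_ms == 0


def round_down_to_task_time(value_ms, task_time_ms):
    """Largest multiple of the task time that does not exceed value_ms."""
    return (value_ms // task_time_ms) * task_time_ms


if __name__ == "__main__":
    dcm_task_time_ms = 5
    for name, value_ms in [("P2", 25), ("P2*", 5000), ("example", 12)]:
        ok = is_multiple_of_task_time(value_ms, dcm_task_time_ms)
        nearest = round_down_to_task_time(value_ms, dcm_task_time_ms)
        print(name, value_ms, ok, nearest)
```

With the values from the question (25 and 5000 against a 5 ms task time) both checks pass; a value such as 12 would be rounded down to 10.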
The adjust value is an internal value that accounts for the delay between the Dcm transmit request and the message actually being on the bus.
The definition of P2ServerMax and P2*ServerMax and their corresponding Adjust values is the same:
This parameter is used to guarantee that the diagnostic response is available on the bus before reaching P2 by adjusting the current DcmDspSessionP2ServerMax. This parameter mainly represents the software architecture dependent communication delay between the time the transmission is initiated by DCM and the time when the message is actually transmitted to the bus