Is there any limitations to number of running processes on host? - process

In platform file I have only one host:
<host id="Worker1" speed="100Mf" core="101"/>
Then in worker.c I create 101 (or > 100) processes expecting that each on each core one process will be launched. But I noticed that only 100 first processes able to execute task or write with XBT_INFO:
int worker(int argc, char *argv[])
{
for (int i = 0; i < 101; ++i) {
MSG_process_create("x", slave, NULL, MSG_host_self());
}
return 0;
}
int slave(){
MSG_task_execute(MSG_task_create("kotok", 1e6, 0, NULL));
MSG_process_kill(MSG_process_self());
return 0;
}
Other processes above 100 first ones are unable to manage and kill:
[ 1.000000] (0:maestro#) Oops ! Deadlock or code not perfectly clean.
[ 1.000000] (0:maestro#) 1 processes are still running, waiting for something.
[ 1.000000] (0:maestro#) Legend of the following listing: "Process <pid> (<name>#<host>): <status>"
[ 1.000000] (0:maestro#) Process 102 (x#Worker1): waiting for execution synchro 0x26484d0 (kotok) in state 2 to finish
UPDATE
Here some code functions are:
main
int main(int argc, char *argv[])
{
MSG_init(&argc, argv);
MSG_create_environment(argv[1]); /** - Load the platform description */
MSG_function_register("worker", worker);
MSG_launch_application(argv[2]); /** - Deploy the application */
msg_error_t res = MSG_main(); /** - Run the simulation */
XBT_INFO("Simulation time %g", MSG_get_clock());
return res != MSG_OK;
}
deployment.xml
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid/simgrid.dtd">
<platform version="4">
<process host="Worker1" function="worker">
<argument value="0"/>
</process>
</platform>

There is actually an internal limit for the size of a maxmin system (the core of SimGrid), which is 100, and may be hit in this case. I just added a flag to make this limit configurable. Could you pull the last version, and try setting maxmin/concurrency_limit to 1000 and see if it fixes your issue?

the number of process that can be started on a host has nothing to do with the number of cores. As on a real machine, you can have several processes running "simultaneously" thanks to time sharing mechanisms. It's the same here. when the number of running processes is greater than the number of cores (be it 1 or more), they have to share the resources.
The cause of your issue is elsewhere, but you do not provide a full minimal working example (main? deployment file?) and it's hard to help.

Related

How to put BG96 on power save mode between sending messages to Azure IoT Hub over HTTP

I'm using a Nucleo L496ZG, X-NUCLEO-IKS01A2 and the Quectel BG96 module to send sensor data (temperature, humidity etc..) to Azure IoT Central over HTTP.
I've been using the example implementation provided by Avnet here, which works fine but it's not power optimized and with a 6700mAh battery pack it only lasts around 30 hours sending telemetry ever ~10 seconds. Goal is for it to last around a week. I'm open to increasing the time between messages but I also want to save power in between sending.
I've gone over the Quectel BG96 manuals and I've tried two things:
1) powering off the device by driving the PWRKEY and turning it back on when I need to send a message
I've gotten this to work, kinda… until I get a hardfault exception which happens seemingly randomly anywhere from within ~5 minutes of running to 2 hours (messages successfully sending prior to the exception). Output of crash log parser is the same every time:
Crash location = strncmp [0x08038DF8] (based on PC value)
Caller location = _findenv_r [0x0804119D] (based on LR value)
Stack Pointer at the time of crash = [20008128]
Target and Fault Info:
Processor Arch: ARM-V7M or above
Processor Variant: C24
Forced exception, a fault with configurable priority has been escalated to HardFault
A precise data access error has occurred. Faulting address: 03060B30
The caller location traces back to my .map file and I don't know what to make of it.
My code:
// Copyright (c) Microsoft. All rights reserved.
// Licensed under the MIT license. See LICENSE file in the project root for full license information.
//#define USE_MQTT
#include <stdlib.h>
#include "mbed.h"
#include "iothubtransporthttp.h"
#include "iothub_client_core_common.h"
#include "iothub_client_ll.h"
#include "azure_c_shared_utility/platform.h"
#include "azure_c_shared_utility/agenttime.h"
#include "jsondecoder.h"
#include "bg96gps.hpp"
#include "azure_message_helper.h"
#define IOT_AGENT_OK CODEFIRST_OK
#include "azure_certs.h"
/* initialize the expansion board && sensors */
#include "XNucleoIKS01A2.h"
static HTS221Sensor *hum_temp;
static LSM6DSLSensor *acc_gyro;
static LPS22HBSensor *pressure;
static const char* connectionString = "xxx";
// to report F uncomment this #define CTOF(x) (((double)(x)*9/5)+32)
#define CTOF(x) (x)
Thread azure_client_thread(osPriorityNormal, 10*1024, NULL, "azure_client_thread");
static void azure_task(void);
EventFlags deleteOK;
size_t g_message_count_send_confirmations;
/* create the GPS elements for example program */
BG96Interface* bg96Interface;
//static int tilt_event;
// void mems_int1(void)
// {
// tilt_event++;
// }
void mems_init(void)
{
//acc_gyro->attach_int1_irq(&mems_int1); // Attach callback to LSM6DSL INT1
hum_temp->enable(); // Enable HTS221 enviromental sensor
pressure->enable(); // Enable barametric pressure sensor
acc_gyro->enable_x(); // Enable LSM6DSL accelerometer
//acc_gyro->enable_tilt_detection(); // Enable Tilt Detection
}
void powerUp(void) {
if (platform_init() != 0) {
printf("Error initializing the platform\r\n");
return;
}
bg96Interface = (BG96Interface*) easy_get_netif(true);
}
void BG96_Modem_PowerOFF(void)
{
DigitalOut BG96_RESET(D7);
DigitalOut BG96_PWRKEY(D10);
DigitalOut BG97_WAKE(D11);
BG96_RESET = 0;
BG96_PWRKEY = 0;
BG97_WAKE = 0;
wait_ms(300);
}
void powerDown(){
platform_deinit();
BG96_Modem_PowerOFF();
}
//
// The main routine simply prints a banner, initializes the system
// starts the worker threads and waits for a termination (join)
int main(void)
{
//printStartMessage();
XNucleoIKS01A2 *mems_expansion_board = XNucleoIKS01A2::instance(I2C_SDA, I2C_SCL, D4, D5);
hum_temp = mems_expansion_board->ht_sensor;
acc_gyro = mems_expansion_board->acc_gyro;
pressure = mems_expansion_board->pt_sensor;
azure_client_thread.start(azure_task);
azure_client_thread.join();
platform_deinit();
printf(" - - - - - - - ALL DONE - - - - - - - \n");
return 0;
}
static void send_confirm_callback(IOTHUB_CLIENT_CONFIRMATION_RESULT result, void* userContextCallback)
{
//userContextCallback;
// When a message is sent this callback will get envoked
g_message_count_send_confirmations++;
deleteOK.set(0x1);
}
void sendMessage(IOTHUB_CLIENT_LL_HANDLE iotHubClientHandle, char* buffer, size_t size)
{
IOTHUB_MESSAGE_HANDLE messageHandle = IoTHubMessage_CreateFromByteArray((const unsigned char*)buffer, size);
if (messageHandle == NULL) {
printf("unable to create a new IoTHubMessage\r\n");
return;
}
if (IoTHubClient_LL_SendEventAsync(iotHubClientHandle, messageHandle, send_confirm_callback, NULL) != IOTHUB_CLIENT_OK)
printf("FAILED to send! [RSSI=%d]\n", platform_RSSI());
else
printf("OK. [RSSI=%d]\n",platform_RSSI());
IoTHubMessage_Destroy(messageHandle);
}
void azure_task(void)
{
//bool tilt_detection_enabled=true;
float gtemp, ghumid, gpress;
int k;
int msg_sent=1;
while (true) {
powerUp();
mems_init();
/* Setup IoTHub client configuration */
IOTHUB_CLIENT_LL_HANDLE iotHubClientHandle = IoTHubClient_LL_CreateFromConnectionString(connectionString, HTTP_Protocol);
if (iotHubClientHandle == NULL) {
printf("Failed on IoTHubClient_Create\r\n");
return;
}
// add the certificate information
if (IoTHubClient_LL_SetOption(iotHubClientHandle, "TrustedCerts", certificates) != IOTHUB_CLIENT_OK)
printf("failure to set option \"TrustedCerts\"\r\n");
#if MBED_CONF_APP_TELUSKIT == 1
if (IoTHubClient_LL_SetOption(iotHubClientHandle, "product_info", "TELUSIOTKIT") != IOTHUB_CLIENT_OK)
printf("failure to set option \"product_info\"\r\n");
#endif
// polls will happen effectively at ~10 seconds. The default value of minimumPollingTime is 25 minutes.
// For more information, see:
// https://azure.microsoft.com/documentation/articles/iot-hub-devguide/#messaging
unsigned int minimumPollingTime = 9;
if (IoTHubClient_LL_SetOption(iotHubClientHandle, "MinimumPollingTime", &minimumPollingTime) != IOTHUB_CLIENT_OK)
printf("failure to set option \"MinimumPollingTime\"\r\n");
IoTDevice* iotDev = (IoTDevice*)malloc(sizeof(IoTDevice));
if (iotDev == NULL) {
return;
}
setUpIotStruct(iotDev);
char* msg;
size_t msgSize;
hum_temp->get_temperature(&gtemp); // get Temp
hum_temp->get_humidity(&ghumid); // get Humidity
pressure->get_pressure(&gpress); // get pressure
iotDev->Temperature = CTOF(gtemp);
iotDev->Humidity = (int)ghumid;
iotDev->Pressure = (int)gpress;
printf("(%04d)",msg_sent++);
msg = makeMessage(iotDev);
msgSize = strlen(msg);
sendMessage(iotHubClientHandle, msg, msgSize);
free(msg);
iotDev->Tilt &= 0x2;
/* schedule IoTHubClient to send events/receive commands */
IOTHUB_CLIENT_STATUS status;
while ((IoTHubClient_LL_GetSendStatus(iotHubClientHandle, &status) == IOTHUB_CLIENT_OK) && (status == IOTHUB_CLIENT_SEND_STATUS_BUSY))
{
IoTHubClient_LL_DoWork(iotHubClientHandle);
ThisThread::sleep_for(100);
}
deleteOK.wait_all(0x1);
free(iotDev);
IoTHubClient_LL_Destroy(iotHubClientHandle);
powerDown();
ThisThread::sleep_for(300000);
}
return;
}
I know PSM is probably the way to go since powering on/off the device draws a lot of power but it would be useful if someone had an idea of what is happening here.
2) putting the device to PSM between sending messages
The BG96 library in the example code I'm using doesn't have a method to turn on PSM so I tried to implement my own. When I tried to run it, it basically runs into an exception right away so I know it's wrong (I'm very new to embedded development and have no prior experience with AT commands).
/** ----------------------------------------------------------
* this is a method provided by current library
* #brief Tx a string to the BG96 and wait for an OK response
* #param none
* #retval true if OK received, false otherwise
*/
bool BG96::tx2bg96(char* cmd) {
bool ok=false;
_bg96_mutex.lock();
ok=_parser.send(cmd) && _parser.recv("OK");
_bg96_mutex.unlock();
return ok;
}
/**
* method I created in an attempt to use PSM
*/
bool BG96::psm(void) {
return tx2bg96((char*)"AT+CPSMS=1,,,”00000100”,”00000001”");
}
Can someone tell me what I'm doing wrong and provide any guidance on how I can achieve my goal of having my device run on battery for longer?
Thank you!!
I got Power Saving Mode working by using Mbed's ATCmdParser and the AT+QPSMS commands as per Quectel's docs. The modem doesn't always go into power saving mode right away so that should be noted. I also found that I have to restart the modem afterwards or else I get weird behaviour. My code looks something like this:
bool BG96::psm(char* T3412, char* T3324) {
_bg96_mutex.lock();
if(_parser.send("AT+QPSMS=1,,,\"%s\",\"%s\"", T3412, T3324) && _parser.recv("OK")) {
_bg96_mutex.unlock();
}else {
_bg96_mutex.unlock();
return false;
}
return BG96Ready(); }//restarts modem
To send a message to Azure, the modem will need to be manually woken up by driving the PWRKEY to start bi-directional communication, and a new client handle needs to be created and torn down every time since Azure connection uses keepAlive and the modem will be unreachable when it's in PSM.

My .c program create threads. But one of them is killed by system after 10 min running time. How to protect this thread?

My .c program create threads. But one of them is killed by system after 10 min running time. How to protect this thread?
I creates some threads do period work. But one of them is killed by system. I notice this because the this thread does not print the period information after about 10 min running time.
How to protect this thread from being killed? Anybody met this before?
int main(void){
/* threads */
pthread_t thrid_up;
pthread_t thrid_down;
pthread_create( &thrid_up, NULL, (void * (*)(void *))thread_up, NULL);
pthread_create( &thrid_down, NULL, (void * (*)(void *))thread_down, NULL);
while (!exit_sig && !quit_sig) {
//statistics
...
}
}
void thread_up(void) {
while (!exit_sig && !quit_sig) {
//deal interrupt and send the messages to the server
...
}
}
void thread_down(void) {
while (!exit_sig && !quit_sig) {
//deal the messages sent by the server
...
}
}
The thread_up thread is killed by the system everytime when this program runs about 10 min long. But thread_down works well all the time.

Linux-Xenomai Serial Communication using xeno_16550A module

I'm starter of RTOS and I'm using Xenomai v2.6.3.
I'm trying to get some data using Serial communication.
I did my best on the task following the xenomai's guide and open sources, but it doesn't work well.
the link of the guide --> (https://xenomai.org//serial-16550a-driver/)
I just followed the sequence to use the module xeno_16550A. (with port io = 0x2f8 and irq=3)
I followed open source http://www.acadis.org/pages/captain.at/serial-port-example
It works well in write task, but read task doesn't work well.
It gave me the error sentence with error while RTSER_RTIOC_WAIT_EVENT, code -110 (it means connection timed out)
Moreover I checked the irq number3 by typing command 'cat /proc/xenomai/irq', but the interrupt number doesn't increase.
In my case, I don't need to write data, so I erase the write task code.
The read task proc is follow
void read_task_proc(void *arg) {
int ret;
ssize_t red = 0;
struct rtser_event rx_event;
while (1) {
/* waiting for event */
ret = rt_dev_ioctl(my_fd, RTSER_RTIOC_WAIT_EVENT, &rx_event );
if (ret) {
printf(RTASK_PREFIX "error while RTSER_RTIOC_WAIT_EVENT, code %d\n",ret);
if (ret == -ETIMEDOUT)
continue;
break;
}
unsigned char buf[1];
red = rt_dev_read(my_fd, &buf, 1);
if (red < 0 ) {
printf(RTASK_PREFIX "error while rt_dev_read, code %d\n",red);
} else {
printf(RTASK_PREFIX "only %d byte received , char : %c\n",red,buf[0]);
}
}
exit_read_task:
if (my_state & STATE_FILE_OPENED) {
if (!close_file( my_fd, READ_FILE " (rtser)")) {
my_state &= ~STATE_FILE_OPENED;
}
}
printf(RTASK_PREFIX "exit\n");
}
I could guess the causes of the problem.
buffer size or buffer is already full when new data is received.
rx_interrupt doesn't work....
I want to check whether the two things are wrong or not, but How can I check?
Furthermore, does anybody know the cause of the problem? Please give me comments.

How properly use MSG_host_set_property_value() in SimGrid?

There is one host:
<host id="Worker1" speed="1Mf" core="101"/>
Only one process deploys on this host:
<process host="Worker1" function="worker"/>
The worker function is below:
int worker(int argc, char *argv[])
{
MSG_host_set_property_value(MSG_host_self(), "activeCore", "0", xbt_free_f);
MSG_task_execute(MSG_task_create("kot", 10e6, 0, NULL));
XBT_INFO("I've executed some stuff and changed my value");
MSG_host_set_property_value(MSG_host_self(), "activeCore", "1", xbt_free_f);
return 0;
}
When I use MSG_host_set_property_value for the second time the Segmentation fault raises.
How to avoid it?
I know that the reason in xbt_free_f. If I change it to NULL simulation will work fine. But I fear that will impact on productivity.
looking at examples/msg/platform-properties/platform-properties.c you should write
MSG_host_set_property_value(MSG_host_self(), "activeCore", xbt_strdup("0"), xbt_free_f);
and
MSG_host_set_property_value(MSG_host_self(), "activeCore", xbt_strdup("1"), xbt_free_f);

Why printing time does not decrease using processes in this program?

I'm making a program in C on CentOS using processes, and am trying to print the numbers from 0 to 1000000 in two different terminals in the shortest time possible. But time increased twice compared to printing on a single terminal. In a single terminal takes 6 or 7 seconds to print, with the two terminals takes 12-13 seconds. Is there any reason why this happens?
main(int argc,char *argv[])
{
pid_t pid;
time_t t1,t2;
time(&t1);
pid=fork();
if(pid==0)
{
int x;
FILE *da1=fopen("/dev/pts/2","w+");
for(x=0;x<=1000000;x++)
fprintf(da1,"%d ",x);
exit(0);
}
else
{
int y;
FILE *da2=fopen("/dev/pts/3","w+");
for(y=0;y<=1000000;y++)
fprintf(da2,"%d ",y);
}
wait(0);
time(&t2);
printf("\nTime %i\n",t2-t1);
}