mpich cluster test error, unable to change wdir - ssh

I have set up an MPICH2 cluster, and the machinefile is:
pc3@ub3:4 # this will spawn 4 processes on ub3
pc1@ub1 # this will spawn 1 process on ub1
When I run the test program, it should print:
Hello from processor 0 of 8
Hello from processor 1 of 8
Hello from processor 2 of 8
Hello from processor 3 of 8
Hello from processor 4 of 8
Hello from processor 5 of 8
Hello from processor 6 of 8
Hello from processor 7 of 8
But it returned:
pc1@ub1:~$ mpiexec -n 8 -f machinefile ./mpi_hello
[proxy:0:0@ub3] launch_procs (./pm/pmiserv/pmip_cb.c:648): unable to change wdir to /home/pc1 (No such file or directory)
[proxy:0:0@ub3] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:893): launch_procs returned error
[proxy:0:0@ub3] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0@ub3] main (./pm/pmiserv/pmip.c:206): demux engine error waiting for event
[mpiexec@ub1] control_cb (./pm/pmiserv/pmiserv_cb.c:202): assert (!closed) failed
[mpiexec@ub1] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@ub1] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:197): error waiting for event
[mpiexec@ub1] main (./ui/mpich/mpiexec.c:331): process manager error waiting for completion
I have successfully enabled passwordless SSH, so pc1 can connect to pc3 without a password. Even so, I still think something is wrong with SSH or with access permissions. My OS is Ubuntu 14.04 LTS 32-bit.
Thanks for your help.

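Make sure the user name is the same on all nodes: mpiexec tries to change to the same working directory (/home/pc1 here) on every host, and that directory does not exist on ub3. Then change the machinefile to:
ub3:4 # this will spawn 4 processes on ub3
ub1 # this will spawn 1 process on ub1
And copy the compiled binaries to the corresponding directory on every node.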
Make sure every node's hostname is listed in the /etc/hosts file on all the nodes.
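
As a rough way to check the working-directory requirement above, the following sketch reads the host names out of a machinefile and asks each host over SSH whether a given directory exists. The function names machinefile_hosts and check_wdir are hypothetical, and the sketch assumes the simple "host[:count] # comment" machinefile format shown above and working passwordless SSH:

```shell
# Hypothetical helper: print the host names from a machinefile,
# dropping "# ..." comments, ":N" process counts and blank lines.
machinefile_hosts() {
    sed -e 's/#.*//' -e 's/:.*//' -e 's/[[:space:]]//g' -e '/^$/d' "$1"
}

# Hypothetical helper: report whether directory $2 exists on every
# host listed in machinefile $1 (assumes passwordless SSH already works).
check_wdir() {
    for host in $(machinefile_hosts "$1"); do
        if ssh "$host" test -d "$2"; then
            echo "$host: $2 exists"
        else
            echo "$host: $2 is missing"
        fi
    done
}
```

For the cluster in the question this could be called as: check_wdir machinefile /home/pc1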

Related

mycode.exe: *** fatal error - console device allocation failure - too many consoles in use, max consoles is 32

Code File: mycode.c
Compiler: C:\cygwin64\bin\g++.exe
Compile Command: g++ -o mycode mycode.c
Output File: mycode.exe
mycode.exe Execution Time: approx. 1 hour
OS: Windows 10
What I need: open 100 separate cmd.exe terminal windows in parallel and run mycode.exe in each one
Problem Statement: With 32 windows running, the 33rd window fails with the error "mycode.exe: *** fatal error - console device allocation failure - too many consoles in use, max consoles is 32"
How can I recover from this error and run mycode.exe 100 times in parallel?
Solution: Switch the compiler to TDM-GCC (https://sourceforge.net/projects/tdm-gcc/), which builds a native Windows executable that does not go through the Cygwin runtime where the 32-console limit comes from
Compile Command: C:\TDM-GCC-64\bin\c++ -o mycode mycode.c
Observation: I am now able to run 100 instances of mycode.exe in parallel

Making Dockerized Flask server concurrent

I have a Flask server that I'm running on AWS Fargate. My task has 2 vCPUs and 8 GB of memory. My server is only able to respond to one request at a time. If I send 2 API requests at the same time, each taking 7 seconds, the first request returns after 7 seconds and the second after 14 seconds.
This is my Dockerfile (based on the tiangolo/uwsgi-nginx-flask image):
FROM tiangolo/uwsgi-nginx-flask:python3.7
COPY ./requirements.txt requirements.txt
RUN pip3 install --no-cache-dir -r requirements.txt
RUN python3 -m spacy download en
RUN apt-get update
RUN apt-get install wkhtmltopdf -y
RUN apt-get install poppler-utils -y
RUN apt-get install xvfb -y
COPY ./ /app
I have the following config file:
[uwsgi]
module = main
callable = app
enable-threads = true
These are my logs when I start the server:
Checking for script in /app/prestart.sh
Running script /app/prestart.sh
Running inside /app/prestart.sh, you could add migrations to this file, e.g.:
#! /usr/bin/env bash
# Let the DB start
sleep 10;
# Run migrations
alembic upgrade head
/usr/lib/python2.7/dist-packages/supervisor/options.py:298: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
2019-10-05 06:29:53,438 CRIT Supervisor running as root (no user in config file)
2019-10-05 06:29:53,438 INFO Included extra file "/etc/supervisor/conf.d/supervisord.conf" during parsing
2019-10-05 06:29:53,446 INFO RPC interface 'supervisor' initialized
2019-10-05 06:29:53,446 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2019-10-05 06:29:53,446 INFO supervisord started with pid 1
2019-10-05 06:29:54,448 INFO spawned: 'nginx' with pid 9
2019-10-05 06:29:54,450 INFO spawned: 'uwsgi' with pid 10
[uWSGI] getting INI configuration from /app/uwsgi.ini
[uWSGI] getting INI configuration from /etc/uwsgi/uwsgi.ini
;uWSGI instance configuration
[uwsgi]
cheaper = 2
processes = 16
ini = /app/uwsgi.ini
module = main
callable = app
enable-threads = true
ini = /etc/uwsgi/uwsgi.ini
socket = /tmp/uwsgi.sock
chown-socket = nginx:nginx
chmod-socket = 664
hook-master-start = unix_signal:15 gracefully_kill_them_all
need-app = true
die-on-term = true
show-config = true
;end of configuration
*** Starting uWSGI 2.0.18 (64bit) on [Sat Oct 5 06:29:54 2019] ***
compiled with version: 6.3.0 20170516 on 09 August 2019 03:11:53
os: Linux-4.14.138-114.102.amzn2.x86_64 #1 SMP Thu Aug 15 15:29:58 UTC 2019
nodename: ip-10-0-1-217.ec2.internal
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 2
current working directory: /app
detected binary path: /usr/local/bin/uwsgi
your memory page size is 4096 bytes
detected max file descriptor number: 1024
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to UNIX address /tmp/uwsgi.sock fd 3
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
Python version: 3.7.4 (default, Jul 13 2019, 14:20:24) [GCC 6.3.0 20170516]
Python main interpreter initialized at 0x55e1e2b181a0
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) ***
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 1239640 bytes (1210 KB) for 16 cores
*** Operational MODE: preforking ***
2019-10-05 06:29:55,483 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-10-05 06:29:55,484 INFO success: uwsgi entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

running gem5 with SPEC2006

I am trying to run bzip2 from SPEC2006 on gem5 (X86, SE mode). At first it failed because gem5 said it cannot run a dynamic executable, so I recompiled bzip2 with the -static flag.
Now I get this error:
gem5 Simulator System. http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 compiled Oct 27 2018 00:36:02
gem5 started Dec 22 2018 18:16:40
gem5 executing on Dan
command line: ./build/X86/gem5.opt configs/example/se.py -c /home/dan/SPEC2006/benchspec/CPU2006/401.bzip2/exe/bzip2_base.ia64-gcc42 -i /home/dan/SPEC2006/benchspec/CPU2006/401.bzip2/data/test/input/dryer.jpg
Could not import 03_BASE_FLAT
Could not import 03_BASE_NARROW
Global frequency set at 1000000000000 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (4096 Mbytes)
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
**** REAL SIMULATION ****
info: Entering event queue # 0. Starting simulation...
panic: Tried to write unmapped address 0xffffedd8. Inst is at 0x400da4
# tick 5500
[invoke:build/X86/arch/x86/faults.cc, line 160]
Memory Usage: 4316736 KBytes
Program aborted at tick 5500
Aborted (core dumped)
I am running gem5 on Ubuntu 17.10.
I searched Google for solutions but did not find anyone referring to this problem. Does anyone know how to fix it?

Please check your host machine configuration. bzip2 does not work on a 32-bit machine. My desktop is a dual-core machine with a 32-bit x86 architecture, and when I tried to run bzip2 on it I got the same error.

Minishift: Problems creating virtual machine

My question is about installing an OpenShift environment using Minishift on VirtualBox.
minishift v1.4.1+0f658ea
VirtualBox-5.1.26-117224-Win.exe
The installation fails with the following error:
C:\Users\xyzdgs\Desktop\Openshift_n_Docker\OpenShift Developer>minishift.exe start --vm-driver=C:\Program Files\Oracle\VirtualBox\VBoxSVC.exe
-- Starting local OpenShift cluster using 'C:\Program' hypervisor ...
-- Minishift VM will be configured with ...
Memory: 2 GB
vCPUs : 2
Disk size: 20 GB
Downloading ISO 'https://github.com/minishift/minishift-b2d-iso/releases/download/v1.1.0/minishift-b2d.iso'
40.00 MiB / 40.00 MiB [===========================================] 100.00% 0s
-- Starting Minishift VM ... | Unsupported driver: C:\Program
To work around this, I passed the installation directory where all the drivers are located instead, and ran it again:
C:\Users\xyzdgs\Desktop\Openshift_n_Docker\OpenShift Developer>minishift.exe start --vm-driver=C:\Program Files\Oracle\VirtualBox\
-- Starting local OpenShift cluster using 'C:\Program' hypervisor ...
-- Starting Minishift VM ... / FAIL E0825 11:20:43.830638 1260 start.go:342]
Error starting the VM: Error getting the state for host: machine does not exist.
Retrying.
| FAIL E0825 11:20:44.297638 1260 start.go:342] Error starting the VM: Error getting the state for host: machine does not exist. Retrying.
/ FAIL E0825 11:20:44.612638 1260 start.go:342] Error starting the VM: Error getting the state for host: . Retrying.
Error starting the VM: Error getting the state for host: machine does not exist
Error getting the state for host: machine does not exist
Error getting the state for host: machine does not exist
It says "machine does not exist". Shouldn't the machine be created by Minishift itself? (See the procedure here: blog.novatec-gmbh.de/getting-started-minishift-openshift-origin-one-vm/)
I am not sure what is causing this. Any guidance would be appreciated.

The main issue with the command, and what it is really complaining about, is that you are passing in an unquoted path, so everything after the space in "Program Files" is treated as a separate argument:
minishift.exe start --vm-driver=C:\Program Files\Oracle\VirtualBox\VBoxSVC.exe
should have been
minishift.exe start --vm-driver="C:\Program Files\Oracle\VirtualBox\VBoxSVC.exe"
But according to the Minishift documentation, you should update to VirtualBox 5.1.12+ (which you have) and use the following syntax:
minishift.exe start --vm-driver=virtualbox
Seven months after this question was asked, using VirtualBox v4.3.30, I can get Minishift v1.15.1 running with the last command, but I can't get it to accept your previous syntax or even reproduce the same error from it.
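
A minimal sketch of the quoting problem described in this answer, written for a POSIX shell rather than cmd.exe, so it is only an analogy for how 'C:\Program' ended up being taken as the driver name. The helper show_args and the forward-slash path are hypothetical, chosen purely for illustration:

```shell
# Hypothetical helper: print each argument it receives on its own line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Illustrative path containing a space (not a real Minishift driver value).
path='C:/Program Files/Oracle/VirtualBox'

# Unquoted: the shell splits at the space, so the command sees two arguments,
# [--vm-driver=C:/Program] and [Files/Oracle/VirtualBox].
show_args --vm-driver=$path

# Quoted: the whole path stays in a single argument.
show_args --vm-driver="$path"
```

The first call mirrors the log line "Starting local OpenShift cluster using 'C:\Program' hypervisor": only the text before the space reached the option.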

Why does my script not work on FreeBSD? (awk: syntax error)

Why does this script not work on FreeBSD? I ran the script on CentOS and Debian and everything was fine. On FreeBSD (10.2) I encounter the following error:
awk: syntax error at source line 1
context is
match($0, "^listen >>> queue:[[:space:]]+(.*)", <<<
awk: bailing out at source line 1
-0.9902
As an example, here is some output of the php-fpm status page:
pool: www
process manager: ondemand
start time: 29/Feb/2016:15:18:54 +0200
start since: 2083770
accepted conn: 1467128
listen queue: 0
max listen queue: 129
listen queue len: 128
idle processes: 1
active processes: 2
total processes: 3
max active processes: 64
max children reached: 1
slow requests: 0
On CentOS and Debian, when I run:
/path/to/script/php-fpm-check.sh "idle processes" http://127.0.0.1/status
I get 1, but on FreeBSD I get the error mentioned above.

The 3-argument form of match() is a GNU awk extension (see the gawk documentation). You will have to find another way to capture the match (perhaps using the RSTART and RLENGTH variables that match() sets as a side effect), or install gawk on your FreeBSD system.
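
A minimal sketch of the RSTART/RLENGTH approach, assuming the script only needs the text after "<field name>:" on the matching status line. The original php-fpm-check.sh is not shown, so the function name status_value and the way the status page is fetched are hypothetical:

```shell
# Hypothetical helper: read php-fpm status output on stdin and print the
# value of the field named in $1 (for example "idle processes").
status_value() {
    awk -v key="$1" '
        BEGIN { prefix = "^" key ":[ \t]+" }
        # The two-argument match() is standard awk; on success it sets
        # RSTART (match position) and RLENGTH (match length).
        match($0, prefix) {
            # Everything after the matched "key: " prefix is the value.
            print substr($0, RSTART + RLENGTH)
            exit
        }
    '
}

# Possible usage, assuming curl is available (an assumption, not from the question):
#   curl -s http://127.0.0.1/status | status_value "idle processes"
```

Because the pattern is anchored at the start of the line and includes the colon, "listen queue" does not also match "max listen queue" or "listen queue len".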