PID recv: short read in CRIU - migration

I get a "PID recv: short read" error while using lazy-pages migration with CRIU.
At the source, I run the following command:
memhog -r1000 64m
cd /tmp/dump
sudo -H -E criu dump -t $(pidof memhog) -D /tmp/dump --lazy-pages --address 10.237.23.102 --port 1234 --shell-job --display-stats -vvvv -o d.log
Then, in a separate terminal on the source machine itself:
scp -r /tmp/dump/ dst:/tmp/
Now, on the destination machine I start the daemon:
cd /tmp/dump
criu lazy-pages --page-server --address $(gethostip -d src) --port 1234 --display-stats -vvvvv
And finally, the restore command:
cd /tmp/dump
criu restore -D /tmp/dump/ --shell-job --lazy-pages -vvvv --display-stats -o restore.log -vvvv
The error is thrown by the lazy server daemon on the destination machine.
It works fine with the memhog binary installed from the numactl package, but not when I build memhog from source.
Any suggestions for solving this will be appreciated.
Update: Solved. See the answer below.

Found the issue:
I was building memhog separately on the two machines, so the "build-id" of the two binaries did not match. Solution: build it on one machine and then scp the binary over to the other machine.
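To check for this kind of mismatch before migrating, the build IDs of the two binaries can be compared. This is a minimal sketch, assuming the binaries are ELF files carrying a GNU build-id note and that readelf (binutils) is installed on both machines. The helper names and the paths in the usage comment are illustrative, not part of CRIU.

```shell
#!/bin/sh
# Pull the hex build ID out of `readelf -n` output read from stdin.
extract_build_id() {
    awk '/Build ID:/ { print $3; exit }'
}

# Print the build ID of an ELF binary (requires readelf from binutils).
build_id_of() {
    readelf -n "$1" 2>/dev/null | extract_build_id
}

# Succeed only if both IDs are non-empty and identical.
same_build_id() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage (hypothetical paths; "dst" is the destination host alias):
#   src_id=$(build_id_of ./memhog)
#   dst_id=$(ssh dst readelf -n /path/to/memhog | extract_build_id)
#   same_build_id "$src_id" "$dst_id" || echo "build IDs differ" >&2
```

If the IDs differ, copying the binary from one machine to the other, as described above, makes them identical by construction.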

rsync not finding local directory when sending through SSH on pipeline

I'm using Bitbucket Pipelines to push the build output produced by the pipeline to our remote server.
This is a snippet of the bitbucket-pipelines.yml file:
- pipe: atlassian/ssh-run:0.2.2
  variables:
    SSH_USER: $PRODUCTION_USER
    SERVER: $PRODUCTION_SERVER
    COMMAND: '''rsync -zrSlh -e "ssh -p 22007" --stats --max-delete=0 $BITBUCKET_CLONE_DIR/ $PRODUCTION_USER@$PRODUCTION_SERVER:home/$PRODUCTION_USER'''
    PORT: '22007'
The connection itself works, and the command is executed once the pipe has connected to the server:
INFO: Executing the pipe...
INFO: Using default ssh key
INFO: Executing command on {HOST}
ssh -A -tt -i /root/.ssh/pipelines_id -o StrictHostKeyChecking=no -p 22007 {USER}@{HOST} 'rsync -zrSlh -e "ssh -p 22007" --stats --max-delete=0 /opt/atlassian/pipelines/agent/build/ {USER}@{HOST}:home/{USER}'
bash: rsync -zrSlh -e "ssh -p 22007" --stats --max-delete=0 /opt/atlassian/pipelines/agent/build/ {USER}@{HOST}:home/{USER}: No such file or directory
Connection to {HOST} closed.
I've tried to run the same command locally from the directory on my machine:
ssh -A -tt -i /root/.ssh/pipelines_id -o StrictHostKeyChecking=no -p 22007 {USER}@{HOST} 'rsync -zrSlh -e "ssh -p 22007" --stats --max-delete=0 "$PWD" {USER}@{HOST}:/home/{USER}'
but it just duplicates the home directory on the remote.
It looks to me like it's looking for the source directory on the server and not looking at the docker container from bitbucket (or the files on my local machine with pwd).
If I run the command without the surrounding '' quotes, it fails because it uses port 22 by default. I've also tried moving the command into a bash script and using MODE: 'Script', which is an accepted pattern for the pipe, but then I can't use my environment variables in the .sh file.
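One possible reading of the log above: in YAML, a single-quoted scalar written as '''...''' keeps one literal quote at each end, so the remote shell may receive the whole rsync command line as a single quoted word and try to execute it as a program name. The sketch below reproduces that failure mode locally with a harmless command. It illustrates shell quoting only and is not a confirmed diagnosis of the pipe; run_as_remote is a hypothetical helper standing in for what ssh does with its command argument.

```shell
#!/bin/sh
# Stand-in for "ssh host <command>": the remote side hands the string to a shell.
run_as_remote() {
    sh -c "$1" 2>&1
}

# Command string with literal quotes around it, like the value of '''...''' in YAML.
quoted="'echo /tmp/build/ done'"
# The same command string without the extra quotes.
plain="echo /tmp/build/ done"

# The shell looks for a program literally named "echo /tmp/build/ done" and fails.
run_as_remote "$quoted" || echo "exit status: $?"

# Without the extra quotes the words are split normally and echo runs.
run_as_remote "$plain"
```

The first call fails with a "not found" style error whose exact wording depends on the shell, while the second prints the arguments as expected.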
If all you want to do is copy the files from the pipeline to the production server, you should use the rsync-deploy pipe instead of ssh-run. Your pipe configuration will look much like the following:
script:
  - pipe: atlassian/rsync-deploy:0.3.2
    variables:
      USER: $PRODUCTION_USER
      SERVER: $PRODUCTION_SERVER
      REMOTE_PATH: 'home/$PRODUCTION_USER'
      LOCAL_PATH: 'build'
      SSH_PORT: '22007'
Make sure to configure your SSH keys in pipelines properly (here is a link to our docs for configuring SSH keys https://confluence.atlassian.com/bitbucket/use-ssh-keys-in-bitbucket-pipelines-847452940.html)
I've found another way around this that doesn't need a pipe: I'm running rsync as a script step.
image: atlassian/default-image:latest
- rsync -rltDvzCh --max-delete=0 --stats --exclude-from=excludes -e 'ssh -e none -p 22007' $BITBUCKET_CLONE_DIR/ $PRODUCTION_USER@$PRODUCTION_SERVER:/home/$PRODUCTION_USER
The -e none option seems to be an important addition, as is using the Atlassian image; otherwise the rsync command is not found. I found this info in a post on Atlassian Community.
This seems to work pretty well for me:
image: node:10.15.3
pipelines:
  default:
    - step:
        name: <project-path>
        script:
          - apt-get update && apt-get install -y rsync
          - ssh-keyscan -H $SSH_HOST >> ~/.ssh/known_hosts
          - cd $BITBUCKET_CLONE_DIR
          - rsync -r -v -e ssh . $SSH_USER@$SSH_HOST:/<project-path>
          - ssh $SSH_USER@$SSH_HOST 'cd <project-path> && npm install'
          - ssh $SSH_USER@$SSH_HOST 'pm2 restart 0'
Note: Avoid using sudo in pipeline scripts.
I had the same issue with atlassian/default-image:3:
rsync -azv ./project_path/*
bash: rsync: command not found
Solution:
apt-get update && apt-get install -y rsync

How do I start plack application on boot

Does anyone know how to start a Plack application on boot?
The OS is Raspbian (Raspberry Pi).
I think I have to run it as a normal user (pi); that's how I start it manually.
I have tried adding something like this to rc.local, but without success:
su pi -c 'cd /path/to/app && plackup -d -p 5000 -r -R ./lib,./t -a ./bin/app.psgi &'
The app will in turn be used by Apache, and it is written in Dancer2, if that makes any difference.
On a raspberry pi I use systemd to create and start a service, in the file:
/etc/systemd/system/dancer.service
[Unit]
Description=NCI Starman Dancer App
After=syslog.target
[Service]
Type=forking
ExecStart=/usr/local/bin/starman --daemonize -l 127.0.0.1:3004 \
--user myuser --group myuser --workers 8 -D -E production \
--pid /var/run/dancer.pid -I/home/myuser/webservers/Dancer/lib \
--error-log=/home/myuser/logs/dancer_error.log \
/home/myuser/webservers/Dancer/bin/app.psgi
Restart=always
[Install]
WantedBy=multi-user.target
Then I enable it with systemctl enable dancer.service,
or start it manually with systemctl start dancer.service.
Instead of starman, you can of course use plackup.
The issue was that the Perl 5 environment variables (which are set in .bashrc) were not initialised.
The solution was to run the plackup command inside bash -i so that it reads .bashrc, or to set PERL5LIB before invoking plackup.
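The second option, setting PERL5LIB explicitly before invoking plackup, can be sketched as a small wrapper. This is only an illustration: run_with_perl5lib is a made-up helper name, and the library and app paths in the usage comment are hypothetical and need to be replaced with the ones from your own .bashrc.

```shell
#!/bin/sh
# Run a command with an extra directory prepended to PERL5LIB,
# without relying on .bashrc being read at boot.
run_with_perl5lib() {
    libdir=$1
    shift
    if [ -n "${PERL5LIB:-}" ]; then
        env PERL5LIB="$libdir:$PERL5LIB" "$@"
    else
        env PERL5LIB="$libdir" "$@"
    fi
}

# Usage (hypothetical paths):
#   run_with_perl5lib /home/pi/perl5/lib/perl5 \
#       plackup -p 5000 -a /path/to/app/bin/app.psgi
```

Because the variable is passed through env, it applies only to the launched command and does not depend on which shell startup files rc.local or the service manager happens to read.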
You may also want to use monit or supervisord to make sure your app is always running and is restarted if it gets killed for any reason, for example by the OOM killer.

Docker HTTPS access - ONLYOFFICE

I'm following the ONLYOFFICE Docker documentation
(GITHUB ONLYOFFICE docker HTTPS access) to get ONLYOFFICE
documentserver and communityserver running with HTTPS.
What I've tried:
1. I've created the cert files (.crt, .key, .pem) as described in the documentation. After that I created a file named env.list in my home dir /home/jw/data/ with the following content:
SSL_CERTIFICATE_PATH=/opt/onlyoffice/Data/certs/onlyoffice.crt
SSL_KEY_PATH=/opt/onlyoffice/Data/certs/onlyoffice.key
SSL_DHPARAM_PATH=/opt/onlyoffice/Data/certs/dhparam.pem
SSL_VERIFY_CLIENT=true
2. After that I added the directory /home/jw/data/ to my $PATH environment variable:
PATH=$PATH:/home/jw/data/; export PATH
3. In the same shell I started the docker container like this:
sudo docker run -i -t -d --name onlyoffice-document-server -p 443:443 -v /opt/onlyoffice/Data:/var/www/onlyoffice/Data --env-file /home/jw/data/env.list onlyoffice/documentserver
4. The documentserver is running fine. After that I started the communityserver with:
sudo docker run -i -t -d --link onlyoffice-document-server:document_server --env-file /home/jw/data/env.list onlyoffice/communityserver
5. With the command docker ps -a I see both docker containers running fine:
CONTAINER ID   IMAGE                        COMMAND                  CREATED          STATUS          PORTS                          NAMES
4f573111f2e5   onlyoffice/communityserver   "/bin/sh -c 'bash -C "   29 seconds ago   Up 28 seconds   80/tcp, 443/tcp, 5222/tcp      lonely_mcnulty
23543300fa51   onlyoffice/documentserver    "/bin/sh -c 'bash -C "   42 seconds ago   Up 41 seconds   80/tcp, 0.0.0.0:443->443/tcp   onlyoffice-document-server
But when I'm trying to access https://localhost there is an error "Secure
Connection Failed" in Firefox.
Did I miss something?
Okay got it:
I've changed the environment variables in env.list to:
SSL_CERTIFICATE_PATH=/var/www/onlyoffice/Data/certs/onlyoffice.crt
SSL_KEY_PATH=/var/www/onlyoffice/Data/certs/onlyoffice.key
SSL_DHPARAM_PATH=/var/www/onlyoffice/Data/certs/dhparam.pem
After that I used the following command to run ONLY the documentserver:
sudo docker run -i -t -d --name onlyoffice-document-server -p 443:443 -v /opt/onlyoffice/Data:/var/www/onlyoffice/Data --env-file /home/jw/data/env.list onlyoffice/documentserver
The ONLYOFFICE OnlineEditor API is now available over HTTPS:
https://localhost/OfficeWeb/apps/api/documents/api.js
If you want to use CommunityServer with HTTPS just change the run command above to:
sudo docker run -i -t -d --name onlyoffice-community-server -p 443:443 -v /opt/onlyoffice/Data:/var/www/onlyoffice/Data --env-file /home/<username>/env.list onlyoffice/communityserver
Thank you anyway!
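The change to env.list above is consistent with the SSL_* paths being resolved inside the container rather than on the host: with -v /opt/onlyoffice/Data:/var/www/onlyoffice/Data, a file under the host directory appears under the container directory. A small sketch of that mapping follows; container_path is a hypothetical helper written for illustration, not a Docker or ONLYOFFICE command.

```shell
#!/bin/sh
# Translate a host path under a bind-mounted directory into the path
# the container sees, given the two sides of a "-v host:container" option.
container_path() {
    host_mount=$1
    container_mount=$2
    host_file=$3
    case $host_file in
        "$host_mount"/*)
            rest=${host_file#"$host_mount"}
            printf '%s\n' "$container_mount$rest"
            ;;
        *)
            echo "not under $host_mount: $host_file" >&2
            return 1
            ;;
    esac
}

container_path /opt/onlyoffice/Data /var/www/onlyoffice/Data \
    /opt/onlyoffice/Data/certs/onlyoffice.crt
# prints: /var/www/onlyoffice/Data/certs/onlyoffice.crt
```

The printed path matches the SSL_CERTIFICATE_PATH value used in the working env.list.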

ssh connection to Vagrant virtual machine using Ansible fails

I'm new to Ansible. I set up an Ubuntu virtual machine using Vagrant. I'm able to ssh into the machine using ssh vagrant@172.16.23.228. I have created an ssh key with the same password as the VM, added it to the agent and specified the path in my hosts file.
I'm not sure what I'm missing from my set-up. What else do I need to check?
After following the instructions here, I started to receive the following errors when running this command (ansible all --inventory-file=hosts.ini --module-name ping -u vagrant -vvvv):
<172.16.23.228> ESTABLISH CONNECTION FOR USER: vagrant
<172.16.23.228> REMOTE_MODULE ping
<172.16.23.228> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/Users/user/.ansible/cp/ansible-ssh-%h-%p-%r" -o Port=22 -o IdentityFile="~Users/user/.ssh/onemachine_rsa" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=vagrant -o ConnectTimeout=10 172.16.23.228 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1451080871.59-247915080664557 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1451080871.59-247915080664557 && echo $HOME/.ansible/tmp/ansible-tmp-1451080871.59-247915080664557'
172.16.23.228 | FAILED => SSH Error: tilde_expand_filename: No such user Users
while connecting to 172.16.23.228:22
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.
My hosts file looks like:
[testserver]
172.16.23.228 ansible_ssh_port=22 ansible_ssh_user=vagrant ansible_ssh_private_key_file=~Users/user/.ssh/onemachine_rsa
What you're doing can work, but I highly recommend using the built-in Ansible provisioner in Vagrant. It will make your life easier and improve your Vagrant skills at the same time. And if you need to execute any shell scripts, use the shell provisioner.
Providing this answer for the benefit of those, like me, who arrive later at the party. Latest Vagrant installations install a private key in a local directory instead of using the admittedly insecure private key for every VM. You'll have to create an ansible_hosts file like this one:
[vagrantboxes]
jessie ansible_ssh_port=2222 ansible_ssh_host=127.0.0.1
[vagrantboxes:vars]
ansible_ssh_user=vagrant
ansible_ssh_private_key_file=.vagrant/machines/default/virtualbox/private_key
The important part is the last line, which provides the path to the actual private key used by the virtual machine that was started from this particular directory.
The path to your ansible_ssh_private_key_file is incorrect. ~Users/user/... is read as "the home directory of a user named Users", which is why ssh reports "No such user Users". Try ansible_ssh_private_key_file=~/.ssh/onemachine_rsa instead. The tilde in this case expands to the home directory of your user on the local machine you're running ansible from.
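The difference between the two spellings can be checked mechanically before the inventory is used. The sketch below classifies a path by how its leading tilde would be interpreted; tilde_kind is an illustrative helper, not an Ansible or OpenSSH feature.

```shell
#!/bin/sh
# Classify how a leading tilde in a key path will be interpreted.
tilde_kind() {
    case $1 in
        "~" | "~/"*) echo "current-user" ;;  # ~/x     -> $HOME/x
        "~"*)        echo "named-user" ;;    # ~name/x -> home directory of user "name"
        *)           echo "no-tilde" ;;
    esac
}

tilde_kind '~Users/user/.ssh/onemachine_rsa'   # prints: named-user
tilde_kind '~/.ssh/onemachine_rsa'             # prints: current-user
```

A "named-user" result for a path that was meant to point into your own home directory usually means a slash is missing after the tilde.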

Ansible script ssh error

I am creating a VM in OpenStack (a Linux VM) and launching an Ansible script from there. I am getting the following ssh error.
---
- hosts: licproxy
  user: my-user
  sudo: yes
  tasks:
    - name: Install tinyproxy
      command: sudo apt-get install tinyproxy
    - name: Update tinyproxy
      command: sudo apt-get update
    - name: Install bind9
      shell: yes '' | sudo apt-get install bind9
However, I am able to ssh directly to the machine 10.32.1.40 from the Linux box in OpenStack admin-keydev29.
PLAY [licproxy] ***********************************************************
GATHERING FACTS ***************************************************************
<10.32.1.40> ESTABLISH CONNECTION FOR USER: my-user
<10.32.1.40> REMOTE_MODULE setup
<10.32.1.40> EXEC ssh -C -tt -vvv -o StrictHostKeyChecking=no -o IdentityFile="/opt/apps/installer/tenant-dev29/ssh/admin-key-dev29" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=my-user -o ConnectTimeout=10 10.32.1.40 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1450797442.33-90087292637238 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1450797442.33-90087292637238 && echo $HOME/.ansible/tmp/ansible-tmp-1450797442.33-90087292637238'
EXEC previous known host file not found for 10.32.1.40
fatal: [10.32.1.40] => SSH Error: ssh: connect to host 10.32.1.40 port 22: Connection refused
while connecting to 10.32.1.40:22
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.
TASK: [Install tinyproxy] *****************************************************
FATAL: no hosts matched or all hosts have already failed -- aborting
I removed the entry from known_hosts and ran the script again, but it still shows the same message.
UPDATE
I observed that manual ssh works fine, but the Ansible script gives the ssh error.
I logged in to the newly created VM using the ssh key and checked the /var/log/auth.log file:
Dec 30 13:00:33 licproxy-vm sshd[1184]: Server listening on :: port 22.
Dec 30 13:01:10 licproxy-vm sshd[1448]: error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
Dec 30 13:01:10 licproxy-vm sshd[1448]: Connection closed by 192.168.0.106 [preauth]
Dec 30 13:01:32 licproxy-vm sshd[1450]: error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
The VM has sshd version OpenSSH_6.6.1.
I checked the /etc/ssh folder and found that ssh_host_ed25519_key and ssh_host_ed25519_key.pub were missing.
I created those files using the command ssh-keygen -A.
Now I want to know why these files are missing from the ssh folder. Is this a bug?
The problem was ssh port 22: the port was not up yet when Ansible tried to connect.
I added the following code, which basically waits for the ssh port to come up.
while ! nc -z $PROXY_SERVER_IP 22; do
    sleep 10s
done
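The loop above waits forever if the port never opens. A variant with an upper bound might look like the sketch below; it assumes an nc that supports -z, as in the original loop, and the function name, default timeout and polling interval are illustrative choices.

```shell
#!/bin/sh
# Poll host:port with nc until it accepts connections or the timeout expires.
# Arguments: host, port, timeout in seconds (default 300), interval in seconds (default 10).
wait_for_port() {
    host=$1
    port=$2
    timeout=${3:-300}
    interval=${4:-10}
    waited=0
    until nc -z "$host" "$port" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $host:$port" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}

# Usage, with PROXY_SERVER_IP set as in the loop above:
#   wait_for_port "$PROXY_SERVER_IP" 22 300 10 || exit 1
```

Returning a non-zero status on timeout lets the calling script stop before Ansible is launched against a host that never came up.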