I've read this article:
How is Docker different from a normal virtual machine?
I have every intention of converting all my virtual machine images into Docker instances.
I can't see an angle where VMs still make sense...
So what's the point of VMs now? OK... maybe desktop virtualization, to have PulseAudio working?
Once Docker solves this, what else?
UPDATE
Okay... So I can't run Docker on "non-Linux" flavour hosts...
For one thing, you can't run an operating system in your container that needs a different kernel from the one on the host.
On Windows and Mac OS X, boot2docker is used to run Docker: it is a VirtualBox VM running a reduced Linux OS, which in turn runs Docker.
The benefits of containers are clear and well known, but the disadvantages have been glossed over somewhat.
Specifically, you don't just need the same OS type (e.g. Linux); you get the same version of the kernel, including any modifications you want. Since containers are an OS construct, there are resource islands per OS kernel version (and different implementations for Windows, BSD or any other non-Linux system, if they exist).
VMs are secured with CPU-level isolation; containers are secured with OS-level isolation (with arguably a bigger attack surface).
There are many claims out there that containers are as slow and as big as VMs once you load up your container with everything you need for production and add lots of overlays, but these are all anecdotal, and no large-scale survey or trustworthy data is available yet.
Related
I'm trying to understand the basic concepts of Docker, and lots of docs say that "Docker is not a virtual machine, but a process". To me, this sentence looks quite awkward, since as far as I know, a virtual machine itself also runs on the host OS, which makes it a 'process' too.
Is there any big difference between the way a virtual machine works and the way other normal applications/processes do?
Docker is a brand name of a container management software system.
TL;DR:
Containers are a packaging concept.
VMs are a compatibility concept.
VMs are a security concept.
A container is not a process; it is an isolation of a collection of processes within a single-system-image. What is isolated? First and foremost, the path name space. Processes within a given container share a path name space, so that they agree that /usr/bin/env is the same thing. Two processes in different containers, or perhaps in the non-containered environment, would not necessarily see the same file for /usr/bin/env. This functionality has been a feature of UNIX-derived systems for at least 40 years, via the chroot() system call.
More recently, containers have taken to isolating things that are not in the path name space, like processes, user ids and network interfaces. In older chroot-based systems, running ps in a container would show processes that were not in that container, although special handling was hacked in to prevent a chrooted root user from gaining root access on the underlying system.
In these modern systems, not only is the pid space partitioned, but also user ids, so that root in a container does not correspond to root on the overall system.
All this is accomplished by controlling many features of the kernel in a single-system-image. The software that controls these features: Docker, amongst others.
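The isolation described above can be observed from inside a process. The following is a minimal sketch, assuming a Linux host where the kernel exposes namespace identifiers as symlinks under /proc/<pid>/ns; the function name namespace_ids and the choice of four namespace kinds are illustrative, not anything Docker itself provides. Two processes in the same container should report the same identifiers, while a process in another container (or on the host) should report different ones.

```python
import os

# Namespace kinds the Linux kernel exposes under /proc/<pid>/ns.
# This subset is chosen for illustration: mount (path name space),
# pid, user and network.
NAMESPACE_KINDS = ("mnt", "pid", "user", "net")


def namespace_ids(pid="self"):
    """Return {kind: identifier} for the namespaces a process lives in.

    Returns an empty dict where /proc/<pid>/ns is unavailable
    (non-Linux hosts, or a process we are not allowed to inspect).
    """
    ids = {}
    for kind in NAMESPACE_KINDS:
        try:
            # Each entry is a symlink; its target names the namespace.
            ids[kind] = os.readlink(f"/proc/{pid}/ns/{kind}")
        except OSError:
            continue
    return ids


if __name__ == "__main__":
    for kind, ident in namespace_ids().items():
        print(kind, ident)
```

Running this once on the host and once inside a container, then comparing the two outputs, shows which of the spaces have been partitioned.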
A virtual machine is not part of a single-system-image. Each VM is its own logical computer, running its own kernel, shell, etc. With some careful configuration, you can make various files appear within many of the VMs, but that is no different from mounting file systems exported by a network file system.
Why choose one over the other? Containers share my OS, and are handy for escaping the .so versionitis hell caused by conflicting software systems; I can package my software in a container, and it is isolated from whatever the running system is. I cannot, however, package the kernel I need; so if my software requires the kernel from Ubuntu 14.04 and I am running 18.04, containers will not save me. Containers are a packaging concept.
VMs are handy for supporting multiple versions or types of operating systems on a single computer. Since each VM runs its own system software, I can run my 14.04 app on my 18.04 system and no one is the wiser. VMs are a compatibility concept.
VMs are also handy as a security layer. Imagine that a web page has a JS bomb that can corrupt my kernel (I know, quite a stretch). If I run my browser in a container, I have corrupted my kernel. If I run it in a VM, I have corrupted that VM's kernel; I merely have to delete it, or rewind it, and the corruption is gone. VMs are a security concept.
I'm working on an application that needs to be tested on an HPC cluster.
I'm thinking about using xCAT as a resource manager.
I don't have many hardware resources: I have one HP desktop and a MacBook laptop.
The question: is it possible to set up a virtual cluster (using VirtualBox or KVM) on a single machine?
Thanks,
The short answer here is yes, depending on how much memory and disk you have available on your one machine. I've done this numerous times on a MacBook Pro with 8 GB of RAM.
The long answer is that there is absolutely nothing magical about an HPC cluster. All you need to test basic parallel applications in a simulated cluster environment are two or more VMs which meet these criteria:
1. Same OS, as identical as possible.
2. Passwordless authentication (ssh key based auth).
3. Same software stack in the same location on all nodes (see #4 or use rsync).
4. At least one shared filesystem, e.g. NFS-mounted $HOME.
5. Shared network with name resolution configured (correct /etc/hosts on all nodes).
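The name-resolution requirement in the list above can be sketched as a small helper that renders one identical block of /etc/hosts entries for every VM. This is only an illustration: the node names and the 192.168.56.x addresses are hypothetical, and render_hosts is not part of any cluster tool.

```python
import ipaddress


def render_hosts(nodes):
    """Build /etc/hosts lines from a {hostname: ip} mapping.

    Every node gets the same block, which is what gives the
    cluster consistent name resolution.
    """
    lines = []
    for name, ip in sorted(nodes.items()):
        # ip_address() raises ValueError on a malformed address.
        ipaddress.ip_address(ip)
        lines.append(f"{ip}\t{name}")
    return "\n".join(lines) + "\n"


# Hypothetical three-node layout on a private host-only network.
NODES = {
    "head": "192.168.56.10",
    "node01": "192.168.56.11",
    "node02": "192.168.56.12",
}

if __name__ == "__main__":
    print(render_hosts(NODES), end="")
```

The output would be appended to /etc/hosts on each VM, so that every node resolves every other node by name.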
None of this requires job schedulers, provisioning tools or any complex networking. You can find many NFS setup howtos to help get one node set up to share $HOME with the others; this might be the most complicated part. VirtualBox does a good job of setting up local networking.
On top of this you can layer a job scheduler like SLURM (highly recommended), provisioning tools like Warewulf or xCAT, parallel filesystems across the VMs (BeeGFS is easy to set up and a great introduction), etc. I have had a full-featured stateless cluster simulated on my MacBook Pro a number of times using tools from this list and VirtualBox VMs. It's a great way to learn about setting up an HPC cluster.
I like the Docker Hub with dockerfiles idea very much.
Is there a similar way to get a small working Linux VirtualBox instance in a few commands, that could also be controlled from the command line?
Vagrant is a great tool that does just what you want and much more! It's a Ruby application written for fast and simple setup of minimal development environments.
By default it creates VirtualBox images, but it supports VMware and many others too. The whole setup of a box is managed by a single Vagrantfile! Your VM options, network settings and provisioning are all defined there.
Setting up a VirtualBox box is as easy as executing just two shell commands. Check out the Getting Started Guide for an example using Ubuntu.
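For scripting those two commands, here is a rough sketch that drives the vagrant CLI from Python. It assumes vagrant is installed and on PATH; the helper names vagrant_commands and bring_up are made up for this example, and the box name is left to the caller.

```python
import subprocess


def vagrant_commands(box):
    """Return the two commands: write a Vagrantfile, then boot the VM."""
    return [
        # Creates a Vagrantfile for the given box in the working directory.
        ["vagrant", "init", box],
        # Downloads the box if it is not cached yet and starts the VM.
        ["vagrant", "up"],
    ]


def bring_up(box, workdir="."):
    """Run both commands in workdir; raises CalledProcessError on failure."""
    for cmd in vagrant_commands(box):
        subprocess.run(cmd, cwd=workdir, check=True)
```

Calling bring_up("hashicorp/bionic64") in an empty directory would be one way to try it (the box name here is an assumption; substitute whichever Ubuntu box you prefer). After that, vagrant ssh gives you a shell inside the VM.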
You can use a vast range of prepared images from HashiCorp Atlas or build your own.
Also, Vagrant doesn't limit you to one virtual machine per development setup; it lets you model cluster setups on a single machine using multiple VMs. I myself use Docker for that part, though.
I received a rather puzzling question from my lecturer about Docker after doing a presentation on the differences between docker.io and virtual machines. I told him that the main purpose of docker.io is to deploy software applications without the need for a virtual machine's hypervisor.
The question is: Is it possible for Docker to deploy images with CentOS as base to several servers with no OS installed?
Docker uses an existing OS kernel that it makes available to the containers, so: no, it cannot run on "bare metal"; you need an underlying OS to provide the kernel.
But the host does not have to be CentOS to run CentOS-based containers (as long as it uses a CentOS-compatible kernel).
In addition to that, the Docker software itself needs some userland utilities to run, too.
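The kernel sharing described above is easy to see for yourself. The sketch below (helper names are made up for illustration) reports the kernel the current process runs on and the distribution named in /etc/os-release. Run inside a CentOS-based container on a non-CentOS Linux host, the distribution should read CentOS while the kernel release should still be the host's.

```python
import platform


def kernel_info():
    """Return (system, release) of the kernel this process runs on."""
    uname = platform.uname()
    return uname.system, uname.release


def read_os_release(path="/etc/os-release"):
    """Return the distribution's PRETTY_NAME, or None if unavailable."""
    try:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                if line.startswith("PRETTY_NAME="):
                    # Value is usually quoted: PRETTY_NAME="Some Distro 1.0"
                    return line.split("=", 1)[1].strip().strip('"')
    except OSError:
        return None
    return None


if __name__ == "__main__":
    system, release = kernel_info()
    print("kernel:", system, release)
    print("userland:", read_os_release())
```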
Here's the problem. I use around three different machines for development. My partner is using two. We have to go through the same freaking setup procedure on all five machines to get to work.
Working with a PHP project here, so:
Install and configure PDT, a PHP debugger, and some version of XAMPP.
Then possibly install an SVN client, and any other tools.
Again, to each of the five machines.
What if, instead, we did all of this once, in a virtual machine that is set up with the same stack, same versions, as the production server? Then each of us could grab a copy of the VM image, run that image on each of the five machines and do all of our development in that VM. Put Eclipse, Apache, MySQL, the works, all in that VM.
The only negative of this approach, and please correct me on the "only" part, is performance. Is it really that big of an issue though? The slowest machine out of the five is a Samsung NC10 powered by a 1.6 GHz Intel Atom processor.
Do you think this is possible and practically usable? Or am I crazy?
I use a VM for development (running on my laptop) and have never had performance problems. Another approach that you could take would be to image the drive in the state that you want. Use Acronis or Ghost to re-image each machine when you need to. Only takes about 5-10 minutes to restore an image on any modern PC.
I use a VM for all my "work" as it keeps it away from my "play". This setup allows me to use the office VPN without exposing my whole machine to the office environment (which I trust about as much as the internets. ;-) Also, I don't have to worry about messing up my development environment by trying games or other software. My work VM is currently running inside VirtualBox, but I have used VMware in the past. I have only noticed performance issues when using graphics-intensive programs like Webex or the Terminal Server Client.
It can certainly be done. What turns me off is the size of the VM image, which would normally be several GB. Having it on a network share means it can take longer to transfer than your current setup process takes. I guess an external hard drive would be the easiest way to move it around.
Performance wouldn't be an issue with any web development.
I have to ask why your current machines need to be "re-imaged" each time you sit down for work?
If you're using Windows you'll probably want to use SYSPREP on the master image so that the 'mini-setup' runs when you boot up the virtual machines for the first time.
Otherwise, from Windows' point of view, the machines have the exact same SID, hostname and other things. Running multiple machines with the same SID on the same network can cause tons of headaches, even more so if you want them to communicate with each other.
I've run WebSphere for zSeries on a VMware virtual machine with no problem, and WebSphere is more resource-intensive than any PHP stack. I find that having a multi-core machine, or at least hyper-threading, makes it run a lot faster.
With VMware, disk operations are slower. For PHP development I doubt it would be a problem, but you'd definitely notice it if you are compiling a large C++ project. There is also Sun's VirtualBox, which is free, and the latest version is rather nice (but I haven't looked at how slow disk operations are yet).
I am using that idea in practice. Virtual machines are generally great for development:
They let you run multiple operating systems and multiple separate development environments.
They preserve older development environments for later support.
They can be easily backed up, so when a hard drive crashes there is no need to start from the beginning.
They can be copied from one developer to another, so not everyone has to do tedious installations and configurations.
Downsides are:
Virtual machines are slower, so you need more powerful computers than you would otherwise. I would recommend at least 4 GB of RAM, but preferably more like 16, plus fast multi-core processors and fast hard drives.
When copying Windows virtual machines, each copy in use should have its own product key. When you make a copy, it needs to be registered with a new product key.
Have you thought about a software configuration manager like Ansible, Chef or Puppet? With such software, automating these tasks is very easy! It can even create a fresh VM and then configure it.