To virtualize or not to virtualize a bare metal server for a kubernetes deployment - virtual-machine

I'd like to deploy kubernetes on a large physical server (24 cores) and I'm uncertain as to a number of things.
What are the pros and cons of creating virtual machines for the k8s cluster other than running on bare-metal.
I have the following considerations:
Creating vms will allow for work load isolation. New vms for experiments can be created and assigned to devs.
On the other hand, with k8s running on bare metal a new NAMESPACE can be created for each developer for experimentation and they can run their code in it. After all their code should be running in docker containers.
Security:
Having vms would limit the amount of access given to future maintainers, limiting the amount of damage that could be done. While on the other hand the primary task for any future maintainers would be adding/deleting nodes and they would require bare metal access to do that.
Authentication:
At the moment devs would only touch the server when their code runs through the CI pipeline and their running deployments are deployed. But what about viewing logs? Could we setup tiered kubectl authentication to allow devs to only access whatever namespaces have been assigned to them (I believe this should be possible with the k8s namespace authorization plugin).
A number of vms already exist on the server. Would this be an issue?

128 cores and doubts.... That is a lot of cores for a single server.
For kubernetes however this is not relevant:
Kubernetes can use different sized servers and utilize them to the maximum. However if you combine the master server processes and the node/worker processes on a single server, you might create unwanted resource issues. You can manage those with namespaces, as you already mention.
What we do is use continuous integration with namespaces in a single dev/qa kubernetes environment in which changes have their own namespace (So we run many many namespaces) and run full environment deployments in those namespaces. A bunch of shell scripts are used to manage this. This works both with a large server as what you have, as well as it does with smaller (or virtual) boxes. The benefit of virtualization for you could mainly be in splitting the large box in smaller ones so that you can also use it for other purposes then just kubernetes (yes, kubernetes runs except MS Windows, no desktops, no kernel modules for VPN purposes, etc).

I would separate dev and prod in the form of different vms. I once had a webapp inside docker which used too many threads so the docker daemon on the host crashed. It was limited to one host luckily. You can protect this by setting limits, but it's a risk: one mistake in dev could bring down prod as well.

I think the answer is "it depends!" which is not really an answer. Personally, I would split up the machine using VM's and deploy that way. You've got better flexibility as to how much of the server's resources you carve out and you can easily create new environments, then destroy easily.
Even if these vms are really big, I think it's still easier to manage also given that you have existing vm's on the machine.
That said, there's not a technical reason that you can't run a single node server, but you may run into problems with downtime with upgrades (if that's an issue), as well as if that server needs patched or rebooted, then your entire cluster is down.
I would look at your environment needs for HA and uptime, as well as how you are going to deploy VM's (if you go that route), and decide what works the best for you.

Related

Liferay Cloud IDE, Multiple developpers working on same liferay server

We want to start working with liferay. But the server is too heavy and the developpers computer don't have enought RAM. We want to centralize the server instance.
In other words, we want to build a development server where all developpers can connect and directly develop in their web browser, compile, view the result and push the code to git repository.
I found some good cloud IDE like eclipse CHE and a good maven archetype for liferay projet. So i can build the projet with maven. But now i want to know if it is possible to configure Liferay like every developpers can work without troubling another. And if possible, How ?
The developpers can share the same database and can use different port. Maybe, the server can generate tempory URL like some online cloud editor.
I found this post Liferay With Multiple Server Instances, but i don't think is the best way because he create one server per project. I think is too heavy.
If necessary, We have kubernetes in our IS.
Liferay's tomcat bundle, by default, is configured to take a maximum of 2.5G for the process, but it can run with far less - the default only recently was bumped up, because many people never change the default and then wonder why production systems run out of memory. For 1 concurrent user (the sole developer) on a machine, I guess that the previous default of 1G heap space is enough. Are you saying that that's too much for your developers' machines?
Having many developers on a shared server poses one problem: Yes, you may deploy different code from different machines, but: How about setting a breakpoint? Can you connect with multiple debuggers? If something fails, how do you know whos recent deployment caused the failure?
Sharing a server is an integration technique, not a development technique. If your developers don't have enough memory available for running their own Liferay server next to their IDE, it's a lot cheaper to upgrade their machines than to slow them down when everybody is accessing the same server and they can't properly debug. You pay the memory once, but your waiting developers by the hour.
Is it possible to share one server? Sure it is.
Is it possible to share one server without troubling each other? I doubt.
When you say: You think it's too heavy: What are you basing that assumption on? What does the actual developer machine look like and what keeps you from investing in the extra memory?
It's trivial to share some infrastructure - i.e. have all of them connect to the same database server (and give everyone their own schema). But just the extra effort and setup might require you to pay the developers by the hour as much as you'd otherwise pay for a couple of memory chips.
And yet another option is: Run Liferay on a remote server, but keep 1 instance per developer. This way you don't need the local memory, but can have the memory in the cloud. Calculate if you pay more for remote cloud machines than for local memory - that decision is up to you.

Do containers always require less resources or do they create additional overhead

I have been learning about the Docker and its advantages. These advantages include, but are not limited to:
Rapid Deployment
Portability
Security
Isolation
Version control
Lightweight footprint with minimal overhead
My questions:
Will it always be less stressful on a host machine to run multiple applications such as Zeppelin, Hadoop, Flume, ect. in individual docker containers or would the application virtual machine be added on top of the container (docker creates overhead)?
At some point does the number of containers running create an overhead that will cost more resources than it would to just run all of the tools directly off the host machine?
Would it be better to run all of the apps in one container?
Video about docker: https://www.youtube.com/watch?v=YFl2mCHdv24
I found this forum reply by someone at Docker.
Will it always be less stressful on a host machine to run multiple applications such as Zeppelin, Hadoop, Flume, ect. in individual docker containers or would the application virtual machine be added on top of the container (docker creates overhead)?
The post seems to indicate Docker does come with overhead that scales linearly with the number of containers being ran on a given host system. This has to do with Docker being written in Golang and how Golang operates.
At some point does the number of containers running create an overhead that will cost more resources than it would to just run all of the tools directly off the host machine?
Judging from the aforementioned Docker staff response, it seems like this would be the case. I'd not look at "overhead" in isolation though; running 40 different Java containers with disparate versions on one host is unrealistic for technical reasons. Docker allows you to do this easily because it isolates each process.
So if I may do a slanted comparison, the human management overhead created by managing 40 Java applications on one host would certainly be greater cost than the additional system resources you might save by foregoing containerizing them.
Would it be better to run all of the apps in one container?
Assuming the context here is system overhead, and taking into consideration the linear scaling of Docker process overhead with running container count, fewer containers = less Docker process overhead. Similar to the previous question though, you might be introducing complexity by lumping everything into one container that would cost some poor soul a lot of hours troubleshooting or fixing.
If you have several pieces (web server, app server, database, memcache, activemq), putting them in one container will eventually become very inefficient because you can't scale them individually. If you need to scale your app server, you can't just scale your app server, you have to also unnecessarily scale all the other services in the container.

Docker- what real value does it bring for our team?

I am very very new to Docker. Our team has had a very nice deployment line up where We have different CI engines for different projects including Jenkis and TeamCity.
Developers usually check-in and CI takes over, deploys and its perfectly ready for test team to test. I always thought this to be a perfect model. Of course, some parts and our implementation have their flaws but it worked very well for what we wanted.
Now, our Dev-Ops is introducing Docker where test teams get a Docker Image from Docker Registry Everytime we run a build from teamcity. While it sounds really really fancy I am still failing to understand the benefit of it.
After my research, my conclusion was that Dockers can be a good light weight replacements for VM. BUT that is ONLY IF you are using any VMs? We are not using any VMS? I just do not understand what is the real value here? Also, while searching I found a relatively good link on Docker:
https://www.ctl.io/developers/blog/post/what-is-docker-and-when-to-use-it/
Where they discuss when you should use Docker and one of the point says that:
Use Docker whenever your app needs to go through multiple phases of development (dev/test/qa/prod, try Drone or Shippable, both do Docker CI/CD)
Ok. Howeve rthey do not further elaborate on why is docker useful when my app has to go through multiple phases?
And how it is exptremely helpful over regular Dev/Test set up when the existing set up is already working smooth?
First, you are right about comparing it to VMs in that it is similar to a VM. However, docker is incredibly lightweight. This property is the one that surprised me most in the beginning. As opposed to virtual machines, containers share resources much more efficiently. Virtual machines are isolated. Containers can run simultaneously on a host machine with very little overhead. You can configure containers to be able to talk to each other (via volume or port bindings).
Furthermore, in my team, docker brings the following benefits:
our application consists of one big application and several other few microservices. But we want to release all as one package with inter-dependencies among the applications, which eliminates problems with figuring out which version of application and microservices should be deployed together (compatiblity) etc. That is, the image contains all you need and you can bring all applications or one-by-one up/down using docker-compose. You do not need to deploy, you simply pull the image and fire a container/s. If you wish to stop one of the microservices, it can be done without affecting the others.
developers in the team, can run the very same image on local machine, for example to troubleshoot a problem occurred in the production; which means troubleshooting can be done in the same environment as in the production. This brings environment standardization and no more "but it works on machine" talk.
another benefit it brings to us is the following: we build a docker image, run our tests against it, and push it to the registry once all these phases succeed, which translates into a great portability.
Ability to version control the containers. You can easily inspect containers between the current version and the previous versions. If you wish to rollback - that is done smoothly.
Isolating and securing applications. All containers are isolated and you can easily control what goes in and out.
It took me a year before I got used to the idea, but now it seems simple enough.
I think part of that comes from the fact that people keep calling Docker a "virtual machine", which is not accurate. That's really just a nickname for what's happening behind the scenes. In a lot of ways, Docker will NOT replace a complete virtualization solution, such as VMWare. It does, however, bring forth a new way of thinking about infrastructure. One that many people have a difficult time wrapping their heads around.
You can start asking yourself: What makes a Linux distribution unique?
Aside from the kernel, everything else is just a "standard way" of organizing binaries, libraries, runtime and configuration files. You need your binaries in /bin, your libs in /lib, your configuration in /etc. User installations get placed under /usr...
Most distributions will keep the main structure from the Unix legacy and add its own quirks. Each one will have its own way to manage and distribute packages. Each will maintain their own versions of libraries, drivers, etc.
The key ingrident is the kernel. That's something they all have in common. Nowadays, recent builds of the Linux kernel are compatible with pretty much all major distributions available. So, aside from /boot, most of everything else is just a matter of having the right files in the right place with the right permissions.
Now, imagine you take all that distribution bundle (except the kernel) and place it all in another directory of your running OS. Taking advantage of the same kernel you are already running, you isolate a new process so that it "thinks" that / is now that directory. Bingo! This process now "thinks" it's running all by itself on another operating system.
Docker builds on top of Linux Containers, which allows us to do excatly that, but in a more friendly and easier way. Don't think of it as a virtual machine. Think of it as process isolation. The running kernel will share the machine's resources with this process, while keeping it isolated from the rest of the system. It's like jails on steroids.
That was a broad simplification. But, given the concept, think about the implications of this idea.
You can have on the same host, multiple processes with completely different environments that might otherwise conflict with each other. One may be a legacy binary that needs old libraries in place (legacy systems that never die). Another may be the most recent build of a bleeding edge technology. Sharing the same kernel is a efficient, and valuable resource management.
The most value I found comes from managing the infrastructure. Once you install Docker on the hosts, configure a swarm, and define a way of deploying containers, you mostly forget about the hosts. Adding users, installing packages, customizing, editing configuration files... All that becomes a development task on your desktop. There's an incentive to script more, to automate more. To keep your hands away from the physical or virtual machines, unless absolutely necessary.
Gone are the days when someone changed some obscure setting on the server to work around some weird application behavior, forgot to tell anyone about it and took a vacation. Changes to the environment can be commited to version control, tracked and improved by everyone on the team. If your datacenter goes through a disaster, recreating the whole environment is a matter of rebuilding images and redeploying containers. Your infrastructure becomes consistent and reproducible, while keeping the doors open to a wide variety of operating systems and customized configurations for each application.
Developers can take advantage of Docker with the ability of recreating dev/staging/production environments on their desktops. No need to polute a dev machine with application servers and database installations, or even the toll of Virtual Box to emulate all that.
Testing can be automated with a higher level of isolation. The Selenium team already has official Docker images. Creating an entire test hub should be a walk in the park with those puppies.
Building custom software, such as compiling Nginx with third party modules, can also be done inside containers from specialized images. No need to keep an entire server dedicated to it, or even polute your desktop with all the dependencies and build packages.
Overall, we've been having a great experience with Docker. We've migrated our staging environment to this new platform, and plan to migrate other parts of the infrastructure as well, eventually into production. So far, so good.
I hope you can convince enough people to take a better look at it. I'll admit, it took me sometime to get used to the idea. But once you get it, it's actually worth it.

best practices for setting development environment

I use Linux as primary OS. I need some suggestions regarding how should I set up my desktop and development. I do work on mostly .Net and Drupal, but some time on other lamp products and C/C++, Qt. I'm also interested in mobile (android..) and embedded development.
Currently I install everything on my main OS, even I use it a little. I use VMs a little (for lamp server).
Should I use separate VM for each kind of development (like one for .Net/Mono, another C++, one for mobile and one for db only, one for xyz things etc)
Keep primary development environment on main os and move others in VM.
main os should be messed up
keep things easy to organize (must)
performance should be optimal (optimal settings for best performance of components)
I'm interested to know how others' are doing.
There are both pros and cons with VM's.
Pros:
portability: you can move image to
different server
easy backup (but lengthy)
replication (new member joins team)
Cons
performance
hardware requirements
size of backups (20-40 GB per VM ...)
management of backed up images (what is the difference is not obvious)
keeping all images up to date
(patching / Windows updates)
For your scenario, I would create base VM with core OS and shared components (Web server, database), replicated it and installed specific tools into separate VM. If you combine tools within VM, you may end up with same mess as in case of using base OS - the advantage is that it is much easier to get rid of it ;-)
Optimal performance != using VMs
if you need to use VMs anyway, then yes: it could be better to use a separate VM for each thing that need one, unless you need more than one at once
Now that OCI containers are stable and well supported, using those through docker, podman or other similar tool is an increasingly popular option.
They are isolated, but under the same kernel, so:
they are almost as portable as virtual machines,
like virtual machines they can have their own virtual IP addresses, so they can run services not visible from the outside and without occupying port on the host, but
they don't reserve any extra space on disk or in memory like virtual machines and
they are not slowed by any virtualization layers and
mounting directories from the host is easy and does not require any special support.
The usual approach is to have the checkout in the developer's normal home directory and mount it into containers for building, testing and running.
Also building in containers is now supported by Remote Development extension for Visual Studio Code

Stress testing a server and VPS's vs. Dedicated servers

We used to have a dedicated server (1&1) and very infrequently ran into problems with the server having issues.
Recently, we migrated to a VPS (Wiredtree.com) with similar specs to our old dedicated server, but notice frequent problems running out of memory, mysql having to restart, etc... both when knowingly running intensive scrips and also just randomly during normal use.
Because of this, we're considering migrating to another at VPS - this time at Slicehost to see if it performs better.
My question is two fold...
Are their straightforward ways we could stress test a VPS at Slicehost to see if the same issues occur without having to actually migrate everything over?
Also, is it possible that the issues we're facing aren't just because of the provider (Wiredtree) but just the difference between a dedicated box and VPS (despite having similar specs)?
The best way to stress test an environment is to put it under load. If this VPS is hosting a web application, use one of the many available web server benchmark tools: ab, httperf, Siege or http_load. You don't necessarily care that much about the statistics from the tool itself, but more that it puts a predictable load on the server so that you can tune Apache to handle it, or at least not crash and burn.
The one problem you have with testing against Slicehost is that you are at the mercy of the Internet and your bandwidth to Slicehost. You may not be able to put enough load on the server to reach a meaningful conclusion.
Instead, you might find it just as valuable to run one of the many virtualization products on the market and set up a VM with comparable specs to the VPS plan you're considering. Local testing over your LAN will allow you to put a higher and more predictable load on the server.
In either case, you don't need to migrate everything, but you will need to set up an environment for your application to run in, with representative data in your database.
A VPS with similar specs to a dedicated server should perform approximately the same, but in order to get good performance, you still need to tune Apache, MySQL and any other long-lived server processes. In my experience, the out-of-the-box configuration of Apache in many Linux distributions is not ideal and will allow far too many child processes, overcommitting memory and sending the server into a swap-death spiral.