Virtual Environment CPU Allocation - virtual-machine

I am currently attempting to spec out a virtual environment and I am having a hard time understanding how many cores or "CPUs" I can assign to virtual machines.
Can someone let me know how many usable cores I have in the attached image spec?
In other words, how many cores can I assign to VMs before I hit my limit, or run into performance issues?
Server spec
(2) Xeon Silver 4214, 2.2 GHz, 12 cores each, per server
4 servers total. Based on this, I should have 192 virtual cores that I can allocate, right? Or am I wrong?

You have 48 logical processors per server with the listed CPUs (2 sockets x 12 cores x 2 threads with Hyper-Threading). Now think of it this way: you will have other VMs consuming some amount of resources like CPU and RAM. If you assign, let's say, 16 vCPUs to a VM, will the other hosts in your cluster (I assume you clustered all 4 hosts) be able to handle the load of the other VMs plus this one with 16 vCPUs?
You should check VM usage at idle and under some load; then you can work out how many vCPUs each VM should have before you experience major performance issues.
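
To put numbers on that, here is a rough Python sketch of the math (the oversubscription ratios below are common rules of thumb, not limits from your spec):

    # Back-of-envelope vCPU math for the cluster described above.
    sockets_per_host = 2        # two Xeon Silver 4214 per server
    cores_per_socket = 12       # 12 cores each
    threads_per_core = 2        # Hyper-Threading enabled
    hosts = 4

    logical_per_host = sockets_per_host * cores_per_socket * threads_per_core
    total_logical = logical_per_host * hosts
    print(f"Logical processors per host: {logical_per_host}")   # 48
    print(f"Logical processors in cluster: {total_logical}")    # 192

    # vCPUs are time-sliced onto physical threads, so you can assign more
    # vCPUs than you have logical processors; 4:1 is a common starting
    # point for mixed workloads, with heavy workloads closer to 1:1.
    for ratio in (1, 2, 4):
        print(f"{ratio}:1 oversubscription -> {total_logical * ratio} assignable vCPUs")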

Related

H/W requirements for Confluent single node

What are the hardware requirements for a single-node Confluent installation?
I checked their official site, but it has specifications for multi-node: https://docs.confluent.io/platform/current/installation/system-requirements.html
Unclear what all you're trying to run. Zookeeper and Kafka alone have been made to run with limited resources on a Raspberry Pi and definitely can run on any modern laptop or computer.
If you're running a single node, it's not considered "production grade", so with at least 5 services (ZK, broker, Schema Registry, REST proxy, ksqlDB) at 2 GB max heap each, that'd require 10 GB RAM plus overhead for the OS, so call it 16 GB of memory to be conservative.
If you also want to (reasonably) run Control Center, it's suggested to have 6 GB for that, bringing your memory requirement up to at least 24 GB on a single node once you account for your own Kafka client applications.
Of course, you can opt out of certain services and tune each JVM to how you want...
As far as disk space goes, it really depends on how much data you plan on having. 500 GB would be a good starting point, but a single disk wouldn't be fault tolerant.
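
For what it's worth, here is the memory arithmetic from this answer in Python form (the OS overhead and client-app figures are my own assumptions, not Confluent's numbers):

    # Rough single-node memory budget from the per-service heaps above.
    heap_gb = {
        "zookeeper": 2,
        "kafka_broker": 2,
        "schema_registry": 2,
        "rest_proxy": 2,
        "ksqldb": 2,
    }
    os_overhead_gb = 6       # OS, page cache, off-heap usage (assumption)
    control_center_gb = 6    # optional Control Center
    client_apps_gb = 2       # your own Kafka clients (assumption)

    base = sum(heap_gb.values()) + os_overhead_gb
    print(f"Without Control Center: ~{base} GB")   # ~16 GB
    print(f"With Control Center and client apps: "
          f"~{base + control_center_gb + client_apps_gb} GB")   # ~24 GB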

Resource usage of a static web server

I came across this question in a blog post. It was asked by Mozilla in their internship interview. (Blog Post)
You are running an HTTP server (nginx, Apache, etc.) that is configured to serve static files off the local filesystem of your modern, multi-core server connected to a gigabit network. A handful of clients start requesting the same 4 KB static file as fast as they can. What system resource do you think will be exhausted first?
a. CPU
b. Disk / I/O
c. Memory
d. Network
e. Other
As I see it, none of these would be exhausted on a modern machine with Nginx/Apache. Won't the web server cache such a small file and just keep serving it? Also, for repeated requests it can easily send a Not-Modified header.
In the case of Apache, I'd guess CPU will be exhausted first, since it handles multiple clients by spawning threads, but for a "handful" of clients that won't matter.
I wanted to know what others have to say about this question.
It really depends. 4 KB is that magical size that fits into just about every cache and buffer at its default settings, so it is easy (and fast) to pass around. Memory is not a limiting factor here, as web servers operate on file handles, not entire files. In this case I would assume they keep the file right in memory, but that would be one copy per worker instance, which usually comes down to 4 KB x (num_cores + 1) at most - not really an issue.
One could argue that either memory speed or disk speed would be an issue. But the former is negligible when methods like sendfile are properly configured, enabling a zero-copy approach, and the latter amortizes over time once a copy of the file has been loaded into memory.
Lastly, there's the interface and the CPU(s). Overall, CPU time tends to be a lot cheaper than network time, so I would expect the NIC to be the bottleneck long before the CPU - if at all.
The question is a bit unspecific on the location of the clients. If they are connected to the same GbE network, they could indeed have the power to saturate your NIC with their requests. If not, some intermediary could become the limiting factor.
Now let us assume those clients are on our network and we have a single-homed 10 GbE NIC, connected via 8 lanes (which is fairly standard, IMHO): PCIe 3.0 x8 is specified at 7,877 MB/s. A Core i7-3770 has a bus speed of 5 GT/s, which translates to roughly 8 GB/s across 8 lanes. Assuming no other I/O-intensive workload, this CPU could easily saturate the NIC.
So in summary: Network/NIC saturation before CPU saturation before anything else.
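
To put a rough number on the NIC claim, here is a quick payload-only calculation in Python; it ignores HTTP headers and TCP/IP framing, so the real saturation point is somewhat lower:

    # Requests/second needed to fill a gigabit link with a 4 KB file.
    link_bits_per_s = 1_000_000_000      # 1 GbE
    file_bytes = 4 * 1024                # the 4 KB static file
    response_bits = file_bytes * 8

    rps_to_saturate = link_bits_per_s / response_bits
    print(f"~{rps_to_saturate:,.0f} responses/s fill the pipe")   # ~30,518

    # A modern CPU serving from the page cache can push far more than
    # ~30k req/s of 4 KB responses, which is why the NIC goes first.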

What would be a good hardware configuration for a Redis dedicated server?

I am planning to configure Redis in a master/slave configuration.
I have three machines (8 GB RAM, 8 cores) and plan to use one master and two slaves.
What would be the recommended hardware configuration for these machines?
Redis is not CPU intensive, so at least 2 cores per server should do (one for Redis, one for backups, maybe one more for basic work on the server); more is not really relevant, since Redis is single-threaded.
Get as much RAM as you can, as it defines the size of your store. Also, making a dump consumes RAM, so your true usable space is smaller than you might think. Monitor your RAM usage to avoid surprises.
As for the RAM type: if it fails, Redis fails, sometimes silently (broken consistency). If you need to be careful with your data, always use ECC RAM; it is expensive, but perhaps less expensive than corrupted data in RAM being served through Redis and causing unknown effects. Redis has no checks against hardware errors from RAM, and even though such errors are quite rare (less likely than a broken hard drive), they do happen.
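
As a rough sketch of that headroom (the 1 GB OS reservation and the worst-case copy-on-write factor are assumptions, not Redis documentation):

    # How large a dataset fits safely on an 8 GB box when BGSAVE is used.
    # BGSAVE forks, and in the worst case copy-on-write can duplicate
    # nearly the whole dataset, so plan for ~2x the data size.
    total_ram_gb = 8
    os_reserved_gb = 1        # OS and other processes (assumption)
    cow_factor = 2.0          # worst-case fork copy-on-write

    safe_dataset_gb = (total_ram_gb - os_reserved_gb) / cow_factor
    print(f"Keep the dataset under ~{safe_dataset_gb:.1f} GB")   # ~3.5 GB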

Virtualization of Hyper-threaded cores

I'm looking for some guidance before I spend tons of time reorganizing a legacy program. I have cores that are part of a virtual cluster, and a computation that is broken into many parts and distributed to each member of the cluster. If each core is hyper-threaded, which of the following is most efficient:
2 virtual machines, one for each logical core. Half the computation is sent to each
1 virtual machine, where the OS handles the use of the logical cores.
1 virtual machine, where OpenMP is used to create 2 threads to split the computation.
My gut feeling is option 2, because a hyper-threaded core isn't a true core, and option 3 requires the additional overhead of starting threads and communicating data while one thread sits idle. Any insight is greatly appreciated. Thanks.
You can get some idea from this post Intel Core i5 And Core i7: Intel’s Mainstream Magnum Opus
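
To make option 3 concrete, here is a minimal Python stand-in for the OpenMP version (the work function is a placeholder; because the two logical cores of a hyper-threaded pair share execution units, expect well under 2x speedup):

    # Split one computation across the two hardware threads of a core.
    from concurrent.futures import ProcessPoolExecutor

    def crunch(chunk):
        # placeholder for one part of the real computation
        return sum(i * i for i in chunk)

    if __name__ == "__main__":
        data = range(10_000_000)
        n_workers = 2                    # one per logical core of the HT pair
        size = len(data) // n_workers
        chunks = [data[i * size:(i + 1) * size] for i in range(n_workers)]

        with ProcessPoolExecutor(max_workers=n_workers) as pool:
            total = sum(pool.map(crunch, chunks))
        print(total)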

Hardware requirements for a Virtual Server

We have decided to go with a virtualization solution for a few of our development servers. I have an idea of what the hardware specs would be like if we bought separate physical servers, but I have no idea how to consolidate that information into the specification for a generalized virtual server.
I know intuitively that the specs are not additive - I shouldn't just add up all the RAM requirements from each machine to get the RAM required for the virtual server. I can't really treat them as parallel systems either because no matter how good the virtualization software is, it can't abstract away two servers trying to peg the CPU at the same time.
So my question is: is there a standard method for estimating the hardware requirements of a virtualized system, given hardware requirement estimates for the underlying virtual machines? Is there a +C constant for VMware/MS Virtual Server overhead (and if so, what is C)?
P.S. I promise to move this over to serverfault once it goes into beta (Promise kept)
Yes: add 25% additional resources to manage the VMs. So if I need 4 servers that are each equal to a single-core 2 GHz machine with 2 GB of RAM, I will need 10 GHz of processing power plus 10 GB of RAM. That allows all systems to redline and still be OK.
In the real world this will never happen, though; your servers will not all be running flat out all the time. You can get a feel for usage by profiling your current servers to determine their exact requirements, then add 25% in resources on top.
Check out this software for profiling utilization http://confluence.atlassian.com/display/JIRA/Profiling+Memory+and+CPU+usage+with+YourKit
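
Written out as arithmetic, using the four hypothetical single-core servers from this answer:

    # The "+25%" sizing rule applied to 4 x (1 core @ 2 GHz, 2 GB RAM).
    servers = 4
    ghz_each, ram_gb_each = 2.0, 2.0
    overhead = 0.25       # virtualization overhead per this answer

    cpu_needed = servers * ghz_each * (1 + overhead)
    ram_needed = servers * ram_gb_each * (1 + overhead)
    print(f"{cpu_needed:.0f} GHz aggregate CPU, {ram_needed:.0f} GB RAM")   # 10 GHz, 10 GB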
The requirements are in fact additive. You should add up the memory requirements for each VM, and the disk requirements, and have at least one processor core per VM. Then add on whatever you need for the host system.
VMs can share a CPU, to some extent, if you have really low performance requirements, but they cannot share disk space or memory.
The answers above are far too high; the second one (1 core per VM) is closer. You can either 1) plan ahead and probably over-purchase, or 2) add just-in-time. Do you have some reason you must know well ahead of time (yearly budget? your chosen host platform doesn't cluster hosts, so you can't add later?)
Unless you have an incredibly simple usage profile, it will be hard to predict up front and you'll over-purchase. The answer above (+25%) would be several times more than you need with modern server virtualization software (VMware, Xen, etc.) that manages resources smartly; it's accurate only for desktop products like VPC. I chose to rough it out on a napkin, then profile my first environment (set of machines) on the host. I'm happy.
Examples of things that will confound your estimation:
Disk space: some systems (Lab Manager) use only the difference in space from the base template - 10 deployed machines with 10 GB drives use about 10 GB (template) + 200 MB.
Disk space: you'll then find you don't like the deltas in specific scenarios.
CPU / Memory: this is a dev shop, so you'll have erratic load. Smart hosts don't reserve memory and CPU.
CPU / Memory: but then you'll want to do perf testing, and want to reserve CPU cycles (not all hosts can do that).
We all virtualize for different reasons. Many of the guests in our environment don't have much work to do; we want them there to see how something behaves with a cluster of 3 servers of type X. Or we have a bundle of weird client desktops waiting around, being used one at a time by a tester. They rarely consume many host resources.
So, if you are using something that doesn't do delta disks, disk space might be somewhat calculable. With Lab Manager (delta disks), disk space is really hard to predict.
Memory and processor usage: you'll have to profile or over-purchase heavily. I have many more guest CPUs than host CPUs and don't have perf problems - but that's because of the choppy usage in our QA environments.
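
If you do profile, the oversubscription check itself is trivial; here's a sketch with made-up numbers:

    # vCPU:pCPU ratio for one host; all figures are illustrative.
    host_logical_cpus = 16
    guest_vcpus = [2, 2, 4, 1, 1, 2, 4, 2]   # vCPUs assigned per guest

    ratio = sum(guest_vcpus) / host_logical_cpus
    print(f"vCPU:pCPU ratio = {ratio:.2f}:1")   # 1.12:1

    # With choppy QA-style usage, ratios well above 1:1 are usually fine;
    # sustained CPU-bound guests push you back toward 1:1.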