We have a heavy-duty I/O node which transfers hundreds of gigabytes from disk to the network in each data transmission. We observe that our transmission application almost halts when the CPU utilization of "flush" and "kswapd" approaches 100%.
Can we increase the number of these two system daemon processes?
How do we change their behavior? For example, how do we change the threshold system parameters that trigger them?
Where are the executables and configuration files for them?
Many thanks!
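On the threshold question: the dirty-page writeback that drives "flush" is governed by the vm.dirty_* sysctls, and kswapd's wakeup watermarks are derived from vm.min_free_kbytes. A hedged sketch of inspecting and adjusting them follows; the numbers are purely illustrative, not recommendations.

    # Show the current writeback thresholds
    sysctl vm.dirty_background_ratio vm.dirty_ratio \
           vm.dirty_expire_centisecs vm.dirty_writeback_centisecs

    # Illustrative example: start background writeback earlier (at 5% dirty memory)
    # and let writers dirty up to 20% before they are throttled
    sysctl -w vm.dirty_background_ratio=5
    sysctl -w vm.dirty_ratio=20

    # kswapd's low/high watermarks scale with this value
    sysctl vm.min_free_kbytes

    # Persist chosen values in /etc/sysctl.conf (or /etc/sysctl.d/) to survive reboots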
Having some problems with httpd (Apache/2.2) memory usage.
Over time, memory usage in the httpd processes creeps up until it eventually reaches 100%, and then it restarts automatically.
The problem seems to be specific to one machine: a different machine with a similar configuration (Apache 2.2, same code, same OS version) does not exhibit this behavior.
You can set MaxRequestsPerChild to recycle the processes periodically.
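For example, with the prefork MPM on Apache 2.2 a hedged httpd.conf sketch could look like this (the limit of 1000 is illustrative; pick a value based on how quickly memory grows):

    <IfModule prefork.c>
        # Recycle each child after it has served this many requests,
        # so leaked memory is returned to the OS periodically.
        MaxRequestsPerChild 1000
    </IfModule>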
It is easy to allocate a process to a particular core, but how do I make sure that only that process runs on that particular core (or cores)? The rest of the processes can run on the other cores. Please help me with this.
I got this answer from a similar question:
1. Add the parameter isolcpus=[cpu_number] to the Linux kernel command line from the boot loader during boot. This instructs the Linux scheduler not to run any regular tasks on that CPU unless specifically requested via CPU affinity.
2. Use IRQ affinity to make the other CPUs handle all interrupts, so that your isolated CPU will not receive any interrupts.
3. Use CPU affinity to pin your specific task to the isolated CPU (a sketch of all three steps follows this list).
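A hedged sketch of how those three steps could look in practice; CPU 3, IRQ 24, the PID, and the program name are all placeholders:

    # 1. Boot-loader step: append to the kernel command line in GRUB, e.g.
    #      isolcpus=3
    #    so the scheduler keeps regular tasks off CPU 3.

    # 2. IRQ affinity: restrict an interrupt to CPUs 0-2 (hex bitmask 7).
    #    Repeat for each IRQ listed in /proc/interrupts.
    echo 7 > /proc/irq/24/smp_affinity

    # 3. CPU affinity: pin the workload to the isolated CPU.
    taskset -c 3 ./my_app           # start a new process on CPU 3
    taskset -cp 3 12345             # or move an already-running PID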
So I have a cloud virtual machine on Google Compute Engine; does this mean it is highly available by nature? If the VM is running on a single piece of hardware on GCE and that piece of hardware breaks, then the VM could go down. Is the VM running on something like RAID, but for servers, so that if one of the machines goes down another machine will pick up and continue running the VM? Thanks.
The machine itself is not highly available. However, Google takes several steps to increase reliability:
Storage is replicated and independent of the physical machine the VM is running on (obviously not for local SSD). This means that even if the physical machine catches on fire, only the "runtime" state is lost but the attached disks are fine.
VMs can live-migrate. This is a setting you can control. If enabled, the VM will be migrated to a different physical machine on maintenance events. Live-migration can lead to brief performance degradation while memory etc. is synced to the other host but the machine is not shut down / restarted.
Even when the physical host suddenly dies, you can set your instance to restart automatically on a new machine. If you plan to use this mode, make sure your instance is able to cleanly boot to serving state without manual intervention.
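For example, both of those scheduling behaviors can be set per instance with gcloud; the instance name and zone below are placeholders, and the exact flags may vary with your gcloud version:

    gcloud compute instances set-scheduling my-instance \
        --zone us-central1-a \
        --maintenance-policy MIGRATE \
        --restart-on-failure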
If you need high availability, the best approach is to spread your instances among zones of the same region and use a network or HTTP(S) load balancer. These will automatically stop sending traffic to a machine if it becomes unhealthy. Also see this short YouTube video on Google's network architecture for more info.
For high availability of your application data, there are highly available options like Datastore for database-like usage and Cloud Storage for file-oriented data. Keep in mind that Cloud SQL also runs on a single instance/physical machine, which means that you have to set up slaves/replicas to get high availability. However, you can also do that with your favorite DB system on plain Compute Engine instances if you are willing to maintain them yourself.
Our Carbon daemon (in Graphite) takes up no more than 9% CPU on a 2-core machine. However, our Graphite webapp has recently pushed the HTTPD CPU usage up to about 95%. Within that, we have noted that the "wsgi:graphite" process is taking up as much as 93% CPU.
Has anyone come across this problem? What is the solution? We have a lot of monitoring scripts querying Graphite via the Graphite URL/Render API. This will of course increase Graphite's HTTPD activity, but we haven't made any drastic changes.
I would appreciate your comments.
I have an ESXi 5.5 server running one virtual machine which is given one socket with two cores. What I want to do is to limit these cores to 1500 MHz each to simulate software behaviour on slow machines. How can I do this?
To improve CPU power efficiency, ESX/ESXi can take advantage of performance states (also known as P-states) to dynamically adjust CPU frequency to match the demand of running virtual machines. When a CPU runs at lower frequency, it can also run at lower voltage, which saves power. This type of power management is typically called Dynamic Voltage and Frequency Scaling (DVFS). ESX/ESXi attempts to adjust CPU frequencies so that virtual machine performance is not affected.

When a CPU is idle, ESX/ESXi can take advantage of power states (also known as C-states) and put the CPU in a deep sleep state. As a result, the CPU consumes as little power as possible and can quickly resume from sleep when necessary.
There are several power management policies that control how the CPU is utilized. You select a policy for a host using the vSphere Client. If you do not select a policy, ESX/ESXi uses High Performance by default.
Prerequisites
ESX/ESXi supports the Enhanced Intel SpeedStep and Enhanced AMD PowerNow! CPU power management technologies. For the VMkernel to take advantage of the power management capabilities provided by these technologies, you must enable power management, sometimes called Demand-Based Switching (DBS), in the BIOS.
Procedure
1. In the vSphere Client inventory panel, select a host and click the Configuration tab.
2. Under Hardware, select Power Management and select Properties.
3. Select a power management policy for the host and click OK.
The policy selection is saved in the host configuration and can be used again at boot time. You can change it at any time, and it does not require a server reboot.
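If you prefer the command line to the vSphere Client, the policy can usually also be read and changed through the host's advanced settings. The sketch below assumes the Power.CpuPolicy advanced option present on ESXi 5.x; verify the option name and the accepted string values on your build before relying on it:

    # Show the current CPU power policy setting
    esxcli system settings advanced list -o /Power/CpuPolicy

    # Change it; "<policy>" is a placeholder for a value accepted by your build
    esxcli system settings advanced set -o /Power/CpuPolicy -s "<policy>"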