Expression Engine Apache and SSD - apache

Recently I've been working on an expression engine project that has a performance problem. On a test with 50 concurrent connections I see:
Extremely high (100%) CPU usage
Low RAM usage (2 gigs out of 8)
Low CPU/RAM usage on the database
And the web server has 4 CPUs. Now, if I turn on the cache, the utilization is lower, but the content is such that dynamic caching had to be taken off. Now the expression engine is made up of templates that have to be read into memory and parsed. For those not familiar with expression engine, it is built using CodeIgniter.
My thinking is that if Apache and the expression engine files were moved off the HDD and onto an SSD, I/O for the templates would be a lot faster and Apache's CPU utilization would drop. Would this kind of performance improvement actually happen, or would an SSD make no difference?

An SSD will always be faster than spinny turny disks where disk I/O is concerned, but it doesn't sound like that's where your bottleneck is.
You're not using much RAM and, as you correctly stated, the templates have to be parsed. You have 4 CPUs, but they may be from 1998 (we don't know). If they are more recent, that should be more than enough for 50 concurrent connections, but you may be rendering the contents of the Library of Congress (again, we don't know).
You might get some benefit with tag caching or some of the other techniques mentioned in The Guide.
Also found this: http://eeinsider.com/articles/using-cache-wisely-with-expressionengine/
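One rough way to check where the time goes is to time the file read separately from the parsing step. The sketch below is only an illustration: the `{name}` tag syntax and the `render` and `time_read_and_parse` helpers are hypothetical stand-ins, not expression engine's actual template parser. After the first read, the operating system usually serves the file from its page cache, so the read time tends to stay small whatever the disk type.

```python
import re
import time
from pathlib import Path

# Hypothetical tag syntax for illustration only (not the real template language).
TAG = re.compile(r"\{(\w+)\}")


def render(text, variables):
    """Replace each {name} tag with its value (CPU-bound work)."""
    return TAG.sub(lambda m: str(variables.get(m.group(1), "")), text)


def time_read_and_parse(path, variables, repeats=100):
    """Return (seconds spent reading, seconds spent parsing) over `repeats` runs."""
    read_s = 0.0
    parse_s = 0.0
    for _ in range(repeats):
        t0 = time.perf_counter()
        text = Path(path).read_text(encoding="utf-8")  # disk or page-cache I/O
        t1 = time.perf_counter()
        render(text, variables)                        # pure CPU work
        t2 = time.perf_counter()
        read_s += t1 - t0
        parse_s += t2 - t1
    return read_s, parse_s
```

If the parse time dominates the read time for your real templates, faster storage is unlikely to lower the CPU load much.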

Related

Is it possible to make CPU work on large persistent storage medium directly without using RAM?

Is it possible to make CPU work on large persistent storage medium directly without using RAM ? I am fine if performance is low here.
You will need to specify the CPU architecture you are interested in. Most standard architectures (x86, Power, ARM) assume the existence of RAM on their data bus; I am afraid only a custom board for these processors would allow you to use something like an SSD instead of RAM.
Some numbers comparing RAM vs SSD latencies: https://gist.github.com/jboner/2841832
Also, RAM is there for a reason: to "smooth" the CPU's access to bigger storage. Have a look at this image (from https://www.reddit.com/r/hardware/comments/2bdnny/dram_vs_pcie_ssd_how_long_before_ram_is_obsolete/)
As a side note, it is possible to access persistent storage without involving the CPU (although RAM is still needed); see https://www.techopedia.com/definition/2767/direct-memory-access-dma or https://en.wikipedia.org/wiki/Direct_memory_access

How much is 1/8th of a core?

I'm new to cloud computing and, for the life of me, I can't figure out how "much" 1/8th of a core is in practical terms.
I know what kind of CPUs Amazon EC2 are using for m1.small, but let's say (for education purposes) that it is a single-core 1GHz CPU.
How is 1/8th of a core calculated? Does it mean my application will run at 128MB RAM and 1/1GHz of CPU? Or will my application be able to run only a certain number of operations/CPU cycles before I'll be charged for an additional app-cell?
What I need is a practical explanation of the phrase. Perhaps on a simple vert.x HTTP server, where each successful connection calculates 2 + 3? Vert.x uses less than 128MB of RAM.
Afaik, you don't have a limit on the number of cycles: if your application requires many CPU cycles it will probably run slower, since it would only use 1/8 of a core.
Regarding the memory, if you are just using 1 app cell but your app requires more than 128MB, then it will probably result in an OUT OF MEMORY exception.
Slicing the server into eighths isn't as mathematical as you expect. Sharing server resources among multiple tenants allows better overall use of the CPU compared to a classic server, so even if you pay for only 1/8 of the server you actually get more resources, but only when your application actually uses them.
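As a back-of-the-envelope illustration of the "it will run slower" point, the sketch below assumes the worst case where the 1/8 share is a hard cap on the hypothetical 1 GHz core from the question. The function name is made up for this example, and as noted above a real shared host may give you more when other tenants are idle.

```python
def seconds_for_cycles(cycles, clock_hz, core_share):
    """Wall-clock time for `cycles` of work when only `core_share` of a core is available."""
    return cycles / (clock_hz * core_share)


# One billion cycles of work on the hypothetical 1 GHz core from the question.
full_core = seconds_for_cycles(1e9, 1e9, 1.0)      # whole core
eighth_core = seconds_for_cycles(1e9, 1e9, 1 / 8)  # hard-capped at 1/8 of the core
print(full_core, eighth_core)
```

Under that assumption the same work takes eight times as long; there is no cycle budget after which the application stops.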

mongodb high cpu usage

I have installed MongoDB 2.4.4 on Amazon EC2 with ubuntu 64 bit OS and 1.6 GB RAM.
Only MongoDB is running on this server, nothing else.
But sometimes CPU usage reaches 99% and the load average is: 500.01, 400.73, 620.77
I have also installed MMS on the server to monitor what's going on.
Here is MMS detail
According to the MMS details, indexes are being used properly for every query.
The suspicious details are:
1) HIGH non-mapped virtual memory
2) HIGH page faults
Can anyone help me understand what exactly is causing the high CPU usage?
EDIT:
After the comments from @Dylan Tong, I have reduced the active connections, but there is still high non-mapped virtual memory.
Here's a summary of a few things to look into:
1. Observed a large number of connections and cursors (13k):
- fix: make sure your connection pool is appropriate. For reporting, and at your current request rate, you only need a few connections at most. Also, I'm guessing you have an m1.small instance, which means you only have 1 core.
2. Review queries and indexes:
- run your queries with explain(), to observe how the queries are executed. The right model normally results in queries only pulling very few documents and utilization of an index.
3. Memory (compact and readahead setting):
- make the best use of memory. 1.6GB is low. Check how much free memory you have, and compare it to what is reported as resident. A common cause of low resident memory is fragmentation. If there are a lot of documents moving, changing size and such, you should run the compact command to defragment your data files. Also, a bad readahead setting can lead to poor use of memory. Check your readahead setting (http://manpages.ubuntu.com/manpages/lucid/man2/readahead.2.html). Try a few values, starting low (http://docs.mongodb.org/manual/administration/production-notes/). The production notes recommend 32 (for standard 512-byte blocks). Sometimes higher values are optimal if your documents are larger. The hope is that resident memory ends up close to your available memory and your page faults start to drop.
If you're using resources to the fullest after this, and you're still capped out on CPU then it means you need to up your resources.
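To put the readahead numbers in point 3 into perspective, here is a small arithmetic sketch. It assumes the setting is expressed in 512-byte sectors, as stated above; the helper name and the larger comparison value of 256 are only illustrative.

```python
def readahead_bytes(sectors, sector_size=512):
    """Bytes pulled into memory per readahead, for a setting given in sectors."""
    return sectors * sector_size


for setting in (32, 256):  # 32 is the recommended value above; 256 is just a comparison
    kib = readahead_bytes(setting) / 1024
    print(f"readahead {setting} sectors = {kib:.0f} KiB per read")
```

With small documents and random access, a large readahead fills the limited 1.6GB of memory with neighbouring data that may never be used, which is one reason to try low values first.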

If not CPU, disk or network, what is the bottleneck during query execution?

I work with SQL Server 2005 and wonder: if not CPU, disk or network, what are users waiting for when SQL Server is working? The strange thing is that the system monitor shows the 4 processors at an average of 5%, and the disk (which has demonstrated 50MB/s writes) works at about 5-8 MB/s, yet executions (inserts and selects) take up to 10 minutes. I'd be happy to install additional hardware, but I don't see which device is the bottleneck or how to measure its capacity and current workload.
Any advice would be appreciated.
Thanks
Additional info: RAM is constantly at about 70% capacity and I am running Windows XP.
Check your disk read and write 'wait' time. A heavily loaded database may make a lot of read and write requests for very small pieces of data, which saturates the I/O.
As others mention, disks are rarely the bottleneck when it comes to bandwidth, but rather in the number of IO operations they can perform per second - commonly called IOPS.
The IOPS capabilities of your disks will vary according to disk type, cache and the RAID setup you have.
Another thing you may run into is locking. If you have a lot of concurrent access to the same data, especially inside of large transactions, you may see other transactions being blocked - causing no network, CPU nor disk usage while being blocked, just wasted time.
Probably the disks. If you are seeking all over the place, the throughput (MB/s) will be low even though the disks are running as fast as they can.
Generic advice: try increasing SQL Server's cache, tune your queries, and check that the appropriate indexes exist and are used (and that you don't have too many).
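The relationship between throughput and IOPS mentioned in the answers can be made concrete with a little arithmetic. The sketch below assumes every request is 8 KB, which is the size of a SQL Server data page; the real request size on the asker's system is unknown, and the helper name is invented for this example.

```python
def iops_from_throughput(mb_per_s, io_size_kb):
    """Number of I/O operations per second needed to move `mb_per_s` in `io_size_kb` chunks."""
    return mb_per_s * 1024 / io_size_kb


# The 5-8 MB/s observed in the question, assuming 8 KB requests.
low = iops_from_throughput(5, 8)
high = iops_from_throughput(8, 8)
print(f"{low:.0f} to {high:.0f} IOPS")
```

A disk can therefore look nearly idle in MB/s while already handling hundreds of small operations per second, so compare the operation count against what your disks and RAID setup can sustain.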

How To Simulate Lower CPU Processor Machines For Browser Testing

We have some users who are on machines with low-powered CPUs, and they're encountering slow response times in our web application. Is there any way for me to test by simulating a slower CPU?
For example, I have 2.3 GHz of computing power; can I lower it to 1.6 GHz or less so that I can test?
BTW, our customers are using Windows. I have to simulate low computing power with Internet Explorer as the browser.
The multiplier of most new CPUs can easily be lowered (Intel: SpeedStep, AMD: PowerNow!). This is used to save power. With RMClock you can manually adjust your multiplier and thus lower your frequency and make your PC slower. I use this tool myself, so I can tell you that it works.
http://cpu.rightmark.org/products/rmclock.shtml
The virtual machine Bochs (pronounced boxes) allows you to set an instructions-per-second directive. It's probably the slowest emulator out there as it is, though...
Create some virtual machines.
You can use VirtualPC or VirtualBox; both are free.
I would recommend starting something in the background which eats up all your processor cycles:
a program which finds prime numbers or something similar.
Another slight option in addition to those above is to boot Windows in a lower-resource config. Go to the Start menu, select Run and type MSCONFIG. You can go to the Boot tab, click on Advanced options and limit the memory and the number of processors. It's not as robust as the above, but it does give you another option.
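A minimal sketch of such a background load is shown below. It is only an illustration of the idea, with made-up function names; a single Python process keeps roughly one core busy, so you would start one copy per core you want to occupy.

```python
import time


def is_prime(n):
    """Trial-division primality test (deliberately simple and CPU-hungry)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def burn_cpu(seconds):
    """Keep one core busy for about `seconds`; return how many primes were found."""
    deadline = time.perf_counter() + seconds
    found = 0
    candidate = 2
    while time.perf_counter() < deadline:
        if is_prime(candidate):
            found += 1
        candidate += 1
    return found
```

Calling `burn_cpu(60)` in one process per core while you test in the browser leaves the browser competing for CPU time. The slowdown this produces depends on the scheduler, so treat it as a rough stress condition rather than a precise clock-speed simulation.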
Lowering the CPU clock doesn't always give expected results.
Newer CPUs feature architecture improvements which make them more efficient on an equivalent-clock basis than older chips. Incidentally, because of this, virtual machines are a bad way of testing performance for "older" tech as well.
Your best bet is to simply buy a couple of older machines with similar RAM (types and amounts), processor, motherboard chipsets, hard drives, and video cards, all of which feed into the total performance of the machine itself.
I bring the other components up because changing just one of them can have an impact on even browser performance. A prime example is memory. If your clients are constrained to something like 512MB of RAM, the machines could be performing a lot of hard drive access for VM swaps, even for just running the browser. In this situation downgrading the clock speed on your processor while still retaining your 2GB (assuming) of RAM would still not perform anywhere near the same even if everything else was equal.
Isak Savo's answer works, but can be a bit finicky, as the modern TPL is going to try to limit CPU load as much as possible. When I tested it out, it was hard (though possible with some testing) to consistently get the kinds of CPU usage I wanted.
Then I remembered, http://www.cpukiller.com/, which does this already. Highly recommended. As an aside, I found this util from playing old 90s games on modern machines, back when frame rate was pegged to cpu clock time, making playing them on modern computers way too fast. Great utility.
Another big difference between high-performance and low-performance CPUs is the number of cores available. This can realistically differ by a factor of 4, way more than the difference in clock frequency you're likely to encounter.
You can solve this by setting the thread affinity. Even IE6 will use 13 threads just to show google.com. That means it will benefit from a multi-core CPU. But if you set the thread affinity to one core only, all 13 IE threads will have to share that one core.
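Tools that set affinity on Windows generally take a bitmask in which bit N stands for core N. The helper below is a small sketch for working out that mask; the function name is made up, and the `start /affinity` command in the comment is given as an example of where such a hexadecimal mask might be used.

```python
def affinity_mask(cores):
    """Build an affinity bitmask where bit N is set for every core number N in `cores`."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core numbers start at 0")
        mask |= 1 << core
    return mask


# Restrict a process to core 0 only, e.g. in cmd: start /affinity 1 iexplore.exe
print(format(affinity_mask([0]), "X"))
# Cores 0 and 2 together.
print(format(affinity_mask([0, 2]), "X"))
```

With the mask for a single core, all of the browser's threads share that one core, which is the situation described above.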
I understand that this question is pretty old, but here are some recipes I personally use (not only for Web development):
BES. I'm getting some weird results while using it.
Go to Control Panel\All Control Panel Items\Power Options\Edit Plan Settings\Change Advanced Power Settings, then go to the "Processor" section and set its maximum state to 5% (or something else). It works only if your processor supports dynamic multiplier changes and the ACPI driver is installed correctly.
Run Task Manager and set processor affinity to a single core (or whatever number of cores you want) for your browser's (or any other's) process. Not a best practice for browsers, because JavaScript implementations are usually single-threaded, but, as far as I see, modern browsers actually DO use multiple cores.
There are a few different methods to accomplish this.
If you're using VirtualBox, go into the Settings for the VM you want to slow the CPU speed for. Go to System > Processor, then set the Execution Cap. The percentage controls how slow it will go: lower values are slower relative to the regular speed. In practice, I've noticed the results to be choppy, although it does technically work.
It is also possible to set the CPU speed for the whole system. In the Windows 10 Settings app, go to System > Power & Sleep. Then click Additional Power Settings on the right hand side. Go to Change Plan Settings for the currently selected plan, then click Change Advanced Power Plan Settings. Scroll down to Processor Power Management and set the Maximum Processor State. Again, this is a percentage. Although this does work, I find that in practice, it doesn't have a big impact even when the percentage is set very low.
If you're dealing with a videogame that uses DirectX or OpenGL and doesn't have a framerate cap, another common method is to force Vsync on in your graphics driver settings. This will usually slow the rendering to about 60 FPS which may be enough to play at a reasonable rate. However, it will only work for applications using 3D hardware rendering specifically.
Finally: if you'd rather not use a VM, and don't want to change a system global setting, but would rather simulate an old CPU for one specific process only, then I have my own program to do that called Old CPU Simulator.
The main brain of the operation is a command line tool written in C++, but there is also a GUI wrapper written in C#. The GUI requires .NET Framework 4.0. The default settings should be fine in most cases - just select the CPU you'd like to simulate under Target Rate, then hit New and browse for the program you'd like to run.
https://github.com/tomysshadow/OldCPUSimulator (click the Releases tab on the right for binaries.)
The concept is to suspend and resume the process at a precise rate, and because it happens so quickly the process will appear to just be running slowly. For example, by suspending a process for 3 milliseconds, then resuming it for 1 millisecond, it will appear to be running at 25% speed. By controlling the ratio of time suspended vs. time resumed, it is possible to simulate different speeds. This is completely API agnostic (it doesn't hook DirectX, OpenGL, etc.; it'll work with a command line program if you want).
Old CPU Simulator does not ask for a percentage, but rather, the clock speed to simulate (which it calls the Target Rate.) It then automatically determines, based on your CPU's real clock speed, the percentage to use. Although clock speed is not the only factor that has improved computer performance over time (there are also SSDs, faster GPUs, more RAM, multithreaded performance, etc.) it's a good enough approximation to get fairly consistent results across machines given the same Target Rate. It also supports other options that may help with consistency, such as setting the process affinity to one.
It implements three different methods of suspending and resuming a process and will use the best available: NtSuspendProcess, NtQuerySystemInformation, or Toolhelp Snapshots. It also uses timeBeginPeriod and timeEndPeriod to achieve high precision timing without busy looping. Note that this is not an emulator; the binary still runs natively. If you like, you can view the source to see how it's implemented - it's not a large project. On my machine, Old CPU Simulator uses less than 1% CPU and less than 1 MB of memory, so the program itself is quite efficient (unlike running intensive programs to intentionally slow the CPU.)
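The suspend/resume ratio described above comes down to simple arithmetic, sketched here in Python. This is not the tool's actual code: the function names and the 4 ms period are assumptions for illustration, and the real program may compute its timings differently.

```python
def speed_fraction(target_mhz, real_mhz):
    """Fraction of time the process should run to approximate `target_mhz` on a `real_mhz` CPU."""
    if target_mhz <= 0 or real_mhz <= 0:
        raise ValueError("clock speeds must be positive")
    return min(1.0, target_mhz / real_mhz)


def suspend_resume_ms(fraction, period_ms):
    """Split one period into (suspended ms, resumed ms) for the given running fraction."""
    resumed = fraction * period_ms
    return period_ms - resumed, resumed


# Simulating 575 MHz on a 2300 MHz CPU with an assumed 4 ms period.
fraction = speed_fraction(575, 2300)
print(fraction, suspend_resume_ms(fraction, 4))
```

This reproduces the example in the text: a running fraction of 25% over a 4 ms period means 3 ms suspended followed by 1 ms resumed.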