Symfony2 performance tweaking - apache

Symfony2 looked promising, powerful and flexible, so we planned to use Symfony2 + MongoDB for one of our projects. But it turned out to be too slow (Apache/2.2.25 + PHP/5.4.20). The app is currently pretty simple, but I noticed that httpd.exe loads the CPU up to 28% when a simple page is requested. The page is quite light: just user profile info and a list of the user's posts. I can't imagine how hundreds of users could be served (not to mention numbers like 100k users) unless performance gets much better.
For comparison, the CPU load is 2% when opening the heavy 'products' page of an ActivationCloud account (PHP + Smarty + SQL), which fetches a good amount of data.
Looking at the Xdebug output, I found that a great deal of time (20%) is spent in ClassLoader->loadClass(...), which is called 265 times.
I then performed the following steps:
*generated class map
php composer.phar dump-autoload --optimize
*installed and enabled APC
[APC]
extension=php_apc.dll
apc.enabled=1
apc.shm_segments=1
;32M per WordPress install
apc.shm_size=128M
;Relative to the number of cached files (you may need to
;watch your stats for a day or two to find out a good number)
apc.num_files_hint=7000
;Relative to the size of WordPress
apc.user_entries_hint=4096
;The number of seconds a cache entry is allowed to idle
;in a slot before APC dumps the cache
apc.ttl=7200
apc.user_ttl=7200
apc.gc_ttl=3600
;Setting this to 0 will give you the best performance, as APC will
;not have to check the IO for changes. However, you must clear
;the APC cache to recompile already cached files. If you are still
;developing, updating your site daily in WP-ADMIN, and running W3TC
;set this to 1
apc.stat=1
;This MUST be 0, WP can have errors otherwise!
apc.include_once_override=0
;Only set to 1 while debugging
apc.enable_cli=0
;Allow 2 seconds after a file is created before
;it is cached to prevent users from seeing half-written/weird pages
apc.file_update_protection=2
;Leave at 2M or lower. WordPress doesn't have any file sizes close to 2M
apc.max_file_size=2M
;Ignore files
apc.filters = "/var/www/apc.php"
apc.cache_by_default=1
apc.use_request_time=1
apc.slam_defense=0
apc.mmap_file_mask=/var/www/temp/apc.XXXXXX
apc.stat_ctime=0
apc.canonicalize=1
apc.write_lock=1
apc.report_autofilter=0
apc.rfc1867=0
apc.rfc1867_prefix =upload_
apc.rfc1867_name=APC_UPLOAD_PROGRESS
apc.rfc1867_freq=0
apc.rfc1867_ttl=3600
apc.lazy_classes=0
apc.lazy_functions=0
I expected a miracle after that, but it did not happen.
*enabled APC class loader - in Symfony\web\app.php uncommented
/*
$loader = new ApcClassLoader('sf2', $loader);
$loader->register(true);
*/
The ClassLoader->loadClass(...) result got better: 'Self' is now 11 instead of 21.
Frankly speaking, I was shocked by what I saw in Xdebug :( There are a lot of repetitive calls, such as Container->get(...) with 317 calls and DocumentManager->getClassMetadata(...) with 301 calls. In total there are more than 2k function calls. Hard to believe.
These bundles are installed:
class AppKernel extends Kernel
{
    public function registerBundles()
    {
        $bundles = array(
            new Symfony\Bundle\FrameworkBundle\FrameworkBundle(),
            new Symfony\Bundle\SecurityBundle\SecurityBundle(),
            new Symfony\Bundle\TwigBundle\TwigBundle(),
            new Symfony\Bundle\MonologBundle\MonologBundle(),
            new Symfony\Bundle\SwiftmailerBundle\SwiftmailerBundle(),
            new Symfony\Bundle\AsseticBundle\AsseticBundle(),
            new Doctrine\Bundle\DoctrineBundle\DoctrineBundle(),
            new Doctrine\Bundle\MongoDBBundle\DoctrineMongoDBBundle(),
            new Sensio\Bundle\FrameworkExtraBundle\SensioFrameworkExtraBundle(),
            new HWI\Bundle\OAuthBundle\HWIOAuthBundle(),
            new Knp\Bundle\MenuBundle\KnpMenuBundle(),
            // ... our bundles ...
        );
        // Development-only bundles are registered for the dev and test environments
        if (in_array($this->getEnvironment(), array('dev', 'test'))) {
            $bundles[] = new Symfony\Bundle\WebProfilerBundle\WebProfilerBundle();
            $bundles[] = new Sensio\Bundle\DistributionBundle\SensioDistributionBundle();
            $bundles[] = new Sensio\Bundle\GeneratorBundle\SensioGeneratorBundle();
        }
        return $bundles;
    }
}
It was sad to find that Symfony2 got one of the worst benchmark results among PHP frameworks: http://www.techempower.com/benchmarks/#section=data-r8&hw=i7&test=json&l=sg
At the same time, Francois Zaninotto said on his blog (http://symfony.com/blog/who-really-uses-symfony) that Yahoo uses Symfony2 for its bookmarks service. I tried some apps from the list at http://trac.symfony-project.org/wiki/ApplicationsDevelopedWithSymfony and they do not look slow. On Quora (http://www.quora.com/Who-is-using-Symfony2-in-production) it is also said that Dailymotion uses it.
How can I make the performance acceptable?

I got Symfony working 10x faster after adding the following line to php.ini:
realpath_cache_size = 4096k

First, you should use Linux (you mentioned httpd.exe, so I assume you are on Windows). Then use nginx instead of Apache, and PHP 5.5 with FPM instead of mod_php. Use OPcache instead of APC (by the way, apc.stat should be turned off). Turn on the Doctrine caches, and use HTTP caching wherever you can. (You can look at Packagist's code for some hints.)

rstan() should not run in #' @examples?

In package development, each example must run in under 5 s. However, the combination of stan_model() and rstan::sampling() takes much longer than 5 s, as follows:
Examples with CPU or elapsed time > 5s
     user system elapsed
fit  1.25   0.11   32.47
So I put \donttest{} around each rstan::sampling() call in the roxygen #' @examples comments.
Should we avoid running sampling() in #' @examples, or is there some other way to handle this?
I tried to create my package with rstan_package_skeleton(path = 'BayesianAAA'), as you taught me earlier (thank you!), but there are many things about it that I do not understand.
Previously, rstan_package_skeleton(path = 'BayesianAAA') raised errors on my computer (the errors no longer occur).
So I built my package, say BayesianAAA, without rstan_package_skeleton(). In my own setup, I put Model_A.stan, Model_B.stan, Model_C.stan, ... in inst/extdata and refer to the Stan files as follows:
scr <- system.file("extdata", "Model_A.stan", package="BayesianAAA")
scr <- rstan::stan_model(scr)
I have several questions about rstan_package_skeleton(path = 'BayesianAAA').
1) How do I include my existing .stan files, and how do I refer to them in rstan::stan_model()?
According to the vignette at https://cran.r-project.org/web/packages/rstantools/vignettes/minimal-rstan-package.html:
If we had existing .stan files to include with the package we could use the optional stan_files argument to rstan_package_skeleton to include them.
So I think I should run something like the following, although I am not sure:
`rstan_package_skeleton(path = 'BayesianAAA', stan_files = "Model_A.stan" )`.
But I do not know how to write this for several Stan files, say Model_A.stan, Model_B.stan and Model_C.stan, from my existing package made without rstan_package_skeleton(). Is the following code correct? I also do not know where the files named in stan_files end up in the new project created by rstan_package_skeleton().
`rstan_package_skeleton(path = 'BayesianAAA', stan_files = c("Model_A.stan", "Model_B.stan", "Model_C.stan"))`.
This raises another question:
2) Where should I run rstan_package_skeleton(path = 'BayesianAAA', stan_files = "Model_A.stan")? I ran it from the RStudio console inside my existing package project. Is that correct? A new project is then created inside the old existing project. What should I do?
3) I do not know the rstanarm package well, but I tried to imitate it for my package and could not find any .stan file in it. Am I wrong?
I am sorry for my poor English and my lack of study about these things.
I would be grateful for any advice.
You generally should not be writing a package that calls stan_model at runtime, unless, like brms or tmbstan, you are generating a Stan program at runtime as opposed to writing it statically. There are dozens of packages on CRAN that provide compiled Stan programs basically by following the build process developed for rstanarm, which is facilitated by the rstantools::rstan_package_skeleton function, the step-by-step guide, and the developer guidelines, which directly address your question:
CRAN policy permits long installation times but imposes restrictions on the time consumed by examples and unit tests that are much shorter than the time that it takes to compile even a simple Stan program. Thus, it is only possible to adequately test your package if it has pre-compiled Stan programs.
Even then, it can be difficult to sample from a posterior distribution (adequately) in five seconds, so you often have to use small datasets, one chain, a small number of iterations, etc.
It is best to pass the names of your Stan programs (which should end in a .stan extension, not use a period otherwise, and only have ASCII letters, numbers, and the underscore in their names) to rstantools::rstan_package_skeleton(). If doing so from RStudio, I would call it while not in an existing project. Then
During installation, all Stan programs will be compiled and saved in the list stanmodels that can then be used by R function in the package. The rule is that the Stan program compiled from the model code in src/stan_files/foo.stan is stored as list element stanmodels$foo.
There are dozens of R packages that have Stan programs in their src/stan_files directory (although the locations of the Stan programs are going to move to inst/stan for the next rstantools release) that for the most part just followed the vignettes and did not have to do any additional steps except write more R functions.

MediaWiki Database: Why the incredibly long response time?

I have been consolidating 3 databases into one by using table prefixes in my MediaWiki installation. I now have three wikis sharing the same database, like so:
en_interwiki
de_interwiki
es_interwiki
Everything works fine from a visitor's perspective... but whenever a USER wants to post a new article or commit edits, the database takes up to 35 seconds to respond. This is unacceptable.
I activated debugging like so:
# Debugging:
$wgDBerrorLog = '/var/log/mediawiki/WikiDBerror.log';
$wgShowSQLErrors = true;
$wgDebugDumpSql = true;
$wgDebugLogFile = '/var/log/mediawiki/WikiDebug.log';
$wgShowDBErrorBacktrace = true;
I am getting debug info, and it seems that pagelinks is the culprit, but I am not one hundred percent sure.
Did anyone ever have this issue before?
Please help me!
Best regards,
Max
I was able to fix it. In my case, memcache was configured with the wrong port. Everything is back to normal.
In case anyone uses memcache with their MediaWiki installation: be sure to use the right port on your server, or you will end up like me, with 30-second wait times.
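A quick way to rule out this kind of misconfiguration is to check whether anything is listening on the configured host and port. The following is a minimal sketch in Ruby using only the standard library; the helper name `port_open?` is hypothetical, and 11211 is assumed here only because it is memcached's default port.

```ruby
require 'socket'

# Returns true if a TCP connection to host:port succeeds within the timeout.
# A refused connection, an unreachable host or a timeout all count as "closed".
def port_open?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end

# Example: probe the default memcached port on the local machine.
puts "memcached reachable: #{port_open?('127.0.0.1', 11211)}"
```

If the check fails for the port named in the wiki configuration but succeeds for another one, the cache settings are the first place to look.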

Cannot find stored elements in Apache JCS cache

I'm using a JCS cache to store a large number of objects that my application uses (more than 10.000.000).
I wrote a quick test to check the configuration, and although the elements seem to be stored in the cache, most of them aren't there when I try to retrieve them.
I use a region cache and an auxiliary disk cache, as you can see from my configuration file:
jcs.region.testCache1=DC
jcs.region.testCache1.cacheattributes=org.apache.jcs.engine.CompositeCacheAttributes
jcs.region.testCache1.cacheattributes.MaxObjects=1000
jcs.region.testCache1.cacheattributes.MemoryCacheName=org.apache.jcs.engine.memory.lru.LRUMemoryCache
jcs.region.testCache1.cacheattributes.UseMemoryShrinker=true
jcs.region.testCache1.cacheattributes.MaxMemoryIdleTimeSeconds=3600
jcs.region.testCache1.cacheattributes.ShrinkerIntervalSeconds=60
jcs.region.testCache1.cacheattributes.MaxSpoolPerRun=500
jcs.region.testCache1.elementattributes=org.apache.jcs.engine.ElementAttributes
jcs.region.testCache1.elementattributes.IsEternal=true
jcs.auxiliary.DC=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheFactory
jcs.auxiliary.DC.attributes=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheAttributes
jcs.auxiliary.DC.attributes.DiskPath=${user.dir}/jcs_swap
jcs.auxiliary.DC.attributes.MaxPurgatorySize=10000
jcs.auxiliary.DC.attributes.MaxKeySize=-1
I set the IsEternal attribute to 'true' so that elements never expire and are never removed, enabled a memory shrinker that periodically moves elements to the disk cache, and configured a disk cache whose MaxKeySize is -1, meaning it can hold any number of elements. Do you see any misconfiguration?
When I use this configuration with a medium number of elements (~10.000), everything works fine. When I use it with more than 1.000.000, I cannot retrieve most of the elements.
After some testing, I found a solution on my own. I was inserting elements into the cache by executing the following snippet:
for (Integer i = 0; i < 2000000; i++) {
    TestElement element = new TestElement();
    element.setId(i);
    element.setValue("element" + i);
    cache.add(i, element);
}
This caused trouble because the cache did not have time to spool the elements to the disk cache. However, if I sleep for a couple of milliseconds before adding new elements (which is closer to a real-time environment), everything works as expected.
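The effect can be illustrated with a toy model, sketched here in Ruby rather than Java and not based on JCS internals. It assumes a bounded write-behind buffer (playing a role comparable to MaxPurgatorySize) that discards its oldest entry when full, and a writer that drains a fixed number of entries per tick. The `simulate` function and all of its parameters are hypothetical.

```ruby
# Deterministic toy model of a bounded write-behind buffer.
# Each tick the producer adds up to produced_per_tick entries and the
# writer persists up to drained_per_tick entries. When the buffer is
# full, the oldest pending entry is discarded (assumed behaviour).
def simulate(total:, produced_per_tick:, drained_per_tick:, capacity:)
  buffer = []
  written = 0
  dropped = 0
  remaining = total

  until remaining.zero? && buffer.empty?
    batch = [produced_per_tick, remaining].min
    batch.times do
      if buffer.size >= capacity
        buffer.shift        # oldest pending entry is lost
        dropped += 1
      end
      buffer.push(:element)
    end
    remaining -= batch

    drain = [drained_per_tick, buffer.size].min
    buffer.shift(drain)     # writer persists these entries
    written += drain
  end

  { written: written, dropped: dropped }
end

# Tight loop: the producer outruns the writer and entries are lost.
p simulate(total: 1000, produced_per_tick: 100, drained_per_tick: 10, capacity: 50)
# Throttled producer (the "sleep" workaround): nothing is lost.
p simulate(total: 1000, produced_per_tick: 10, drained_per_tick: 10, capacity: 50)
```

In this model, slowing the producer down to the writer's pace is what prevents loss, which matches the observation that a short sleep between inserts made every element retrievable.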

How to use multiple caches in rails? (for real)

I'd like to use 2 caches -- the in memory default one and a memcache one, though abstractly it shouldn't matter (I think) which two.
The in memory default one is where I want to load small and rarely changing data. I've been using the memory one to date. I keep a bunch of 'domain data' type stuff from the database in there, I also have some small data from external sources that I refresh every 15 min - 1 hour.
I recently added memcache because I'm now serving up some larger assets. Sort of complex how I got into this, but these are larger ~kilobytes, relatively small in quantity (hundreds), and highly cacheable -- they change, but a refresh once per hour is probably too much. This set might grow, but it's shared across all hosts. Refreshes are expensive.
The first set of data has been using the default memory cache for a while now, and has been well-behaved. Memcache is perfect for the second set of data.
I've tuned memcache, and it's working great for the second set of data. The problem is that because of my existing code that was done 'thinking' it was in local memory, I'm doing several trips to memcache per request, which is increasing my latency.
So, I want to use 2 caches. Thoughts?
(note: memcache is running on different machine(s) than my server. Even if I ran it locally, I have a fleet of hosts so it wouldn't be local to all. Also, I want to avoid needing to just get bigger machines. Even though I probably could solve this problem by making the memory bigger and just using the in memory (the data really isn't that big), this doesn't solve the problem as I scale, so it will just be kicking the can.)
ActiveSupport::Cache::MemoryStore is what you want to use. Rails.cache uses either MemoryStore, FileStore or in my case DalliStore :-)
You can keep a global instance of ActiveSupport::Cache::MemoryStore and use it directly, or (cleaner) create a singleton class that holds this object. Set Rails.cache to the other cache store and use the singleton for the MemoryStore.
Below is this class:
require 'singleton'

module Caching
  class MemoryCache
    include Singleton

    # create a private instance of MemoryStore
    def initialize
      @memory_store = ActiveSupport::Cache::MemoryStore.new
    end

    # this will allow our MemoryCache to be called just like Rails.cache
    # every method passed to it will be passed to our MemoryStore
    def method_missing(m, *args, &block)
      @memory_store.send(m, *args, &block)
    end
  end
end
This is how to use it:
Caching::MemoryCache.instance.write("foo", "bar")
=> true
Caching::MemoryCache.instance.read("foo")
=> "bar"
Caching::MemoryCache.instance.clear
=> 0
Caching::MemoryCache.instance.read("foo")
=> nil
Caching::MemoryCache.instance.write("foo1", "bar1")
=> true
Caching::MemoryCache.instance.write("foo2", "bar2")
=> true
Caching::MemoryCache.instance.read_multi("foo1", "foo2")
=> {"foo1"=>"bar1", "foo2"=>"bar2"}
In an initializer you can just put:
MyMemoryCache = ActiveSupport::Cache::MemoryStore.new
Then you can use it like this:
MyMemoryCache.fetch('my-key') { 'my-value' }
and so on.
Note that if it's just for performance optimization (and depends on time expiration), it may not be a bad idea to disable it in your test environment, as follows:
if Rails.env.test?
  MyMemoryCache = ActiveSupport::Cache::NullStore.new
else
  MyMemoryCache = ActiveSupport::Cache::MemoryStore.new
end
Rails already provides this by allowing you to set different values for config.cache_store in your environment initializers.
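For the latency problem described in the question (several memcache round trips per request), one option is to put a short-lived in-process copy in front of the remote store. The following is a minimal sketch in plain Ruby, not a Rails API: `TwoTierCache` and `CountingRemote` are hypothetical names, the remote is assumed to be any object that responds to `read(key)`, and the class is not thread-safe.

```ruby
# Read-through cache: process memory in front of a slower remote store.
class TwoTierCache
  def initialize(remote, local_ttl:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @remote = remote        # assumed to respond to read(key)
    @local_ttl = local_ttl  # seconds a local copy stays fresh
    @clock = clock
    @local = {}
  end

  # Serve from process memory while the copy is fresh; otherwise do
  # one remote read and keep the result locally for local_ttl seconds.
  def read(key)
    value, expires_at = @local[key]
    return value if expires_at && expires_at > @clock.call

    value = @remote.read(key)
    @local[key] = [value, @clock.call + @local_ttl]
    value
  end
end

# Stand-in for a remote store that counts round trips (illustration only).
class CountingRemote
  attr_reader :reads

  def initialize(data)
    @data = data
    @reads = 0
  end

  def read(key)
    @reads += 1
    @data[key]
  end
end

remote = CountingRemote.new('asset' => 'payload')
cache = TwoTierCache.new(remote, local_ttl: 30)
3.times { cache.read('asset') }
puts "remote reads: #{remote.reads}"
```

The trade-off is staleness: each host may serve a value up to `local_ttl` seconds old, so the TTL should be chosen to match how often the underlying data changes.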

Rails 3 - cache web service call

In my application, in the homepage action, I call a specific web service that returns JSON.
parsed = JSON.parse(open("http://myservice").read)
@history = parsed['DATA']
This data will not change more than once per 60 seconds and does not change on a per-visitor basis, so I would ideally like to cache the @history variable itself (since parsing again will not produce a new result) and automatically invalidate it once it is more than a minute old.
I'm unsure of the best way to do this. The default Rails caching methods all seem to be more oriented towards content that needs to be manually expired. I'm sure there is a quick and easy method to do this, I just don't know what it is!
You can use the built in Rails cache for this:
@history = Rails.cache.fetch('parsed_myservice_data', :expires_in => 1.minute) do
  JSON.parse(connector.get_response("http://myservice"))
end
One problem with this approach arises when rebuilding the data to be cached takes quite a long time. If you get many client requests during this time, each of them will get a cache miss and call your block, resulting in lots of duplicated effort, not to mention slow response times.
EDIT: In Rails 3.x you can pass the :race_condition_ttl option to the fetch method to avoid this problem; see the ActiveSupport::Cache::Store#fetch documentation for details.
A good solution to this in previous versions of Rails is to setup a background/cron job to be run at regular intervals that will fetch and parse the data and update the cache.
In your controller or model:
@history = Rails.cache.fetch('parsed_myservice_data') do
  JSON.parse(connector.get_response("http://myservice"))
end
In your background/cron job:
Rails.cache.write('parsed_myservice_data',
                  JSON.parse(connector.get_response("http://myservice")))
This way, your client requests will always get fresh cached data (except for the first request if the background/cron job hasn't been run yet).
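To make the expiry-based fetch in this answer concrete, here is a minimal sketch of the idea in plain Ruby. It is an illustration, not how Rails implements its cache: `TinyTtlCache` and its `clock` parameter are hypothetical. The block runs only on a miss or after the entry has expired, and the lock is held while the block runs, so concurrent callers in the same process wait for one rebuild instead of each repeating it.

```ruby
# Minimal in-process cache with time-based expiry (illustrative only).
class TinyTtlCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @entries = {}
    @lock = Mutex.new
  end

  # Returns the cached value for key if it is still fresh. Otherwise runs
  # the block, stores its result for expires_in seconds and returns it.
  def fetch(key, expires_in:)
    @lock.synchronize do
      entry = @entries[key]
      return entry.value if entry && entry.expires_at > @clock.call

      value = yield
      @entries[key] = Entry.new(value, @clock.call + expires_in)
      value
    end
  end
end

cache = TinyTtlCache.new
history = cache.fetch('parsed_myservice_data', expires_in: 60) do
  # Placeholder for the expensive web service call and JSON parsing.
  { 'DATA' => [] }
end
```

Unlike a shared store such as memcache, this keeps one copy per process, so each Rails worker would refresh its own entry once per minute.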
I don't know of an easy railsy way of doing this. You might want to look into using redis. Redis lets you set expiration times on the data you store in it. Depending on which redis gem you use it'd look something like this:
cached = $redis.get('history')
if cached
  @history = JSON.parse(cached)
else
  @history = JSON.parse(open("http://myservice").read)['DATA']
  # Redis stores strings, so serialize before caching; expire after 60 seconds
  $redis.setex('history', 60, @history.to_json)
end
Because there's only one redis service this will work for all your rails processes.
We had a similar requirement and we ended up using Squid as a forward proxy for all the webservice calls from the rails server. Squid was configured to have a cache-expiry time of 60 seconds.
http_connection_factory.rb:
class HttpConnectionFactory
  def self.connection
    AppConfig.use_forward_proxy ? Net::HTTP::Proxy(AppConfig.forward_proxy_host, AppConfig.forward_proxy_port) : Net::HTTP
  end
end
In your application's home page action, you can use the proxy instead of making the call directly.
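connector = HttpConnectionFactory.connection
# get_response expects a URI and returns a response object; parse its body
response = connector.get_response(URI("http://myservice"))
parsed = JSON.parse(response.body)
@history = parsed['DATA']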
We had second thoughts about using Redis or Memcache. But, we had several service calls and wanted to avoid all the hassles of generating keys and sweeping them at appropriate times.
So, in our case, the forward proxy took care of all those nitty gritties. Please refer to Squid Wiki for the configuration parameters necessary.