I'm trying to understand how dynamic caching is done in Apache.
I read the Apache Caching Guide and an article about dynamic caching, but I still don't understand exactly how dynamic caching works internally.
Say for example I have a PHP page that serves content through reading from a database according to parameters in the user's URL query-string (or parameters specified in POST).
e.g. www.mySite.com?articleID=31
How is that cached then?
Does mod_cache keep the content retrieved from the database for this specific article?
Any sources or suggestions are welcome.
It caches the output of your script, i.e. the HTML. What I'm not sure about is whether CacheStorePrivate On will cache PHP responses that send no cache headers. I'm using Apache 2.4.17 and it looks like it doesn't.
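To make that concrete, here is a minimal sketch of my own (not from the answer above) of a PHP page whose output mod_cache could store. mod_cache keys cache entries on the URL including the query string, so ?articleID=31 and ?articleID=32 end up as separate entries; the lifetimes below are only examples, and this assumes mod_cache plus a storage module such as mod_cache_disk are enabled in the server configuration.
<?php
// Hypothetical article page, e.g. /index.php?articleID=31.
// Explicit freshness headers are what make this dynamically generated
// response storable by mod_cache.
$articleId = (int) ($_GET['articleID'] ?? 0);

// Example values only: let shared caches keep the page for 10 minutes.
header('Cache-Control: public, max-age=600');
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 600) . ' GMT');

// In a real script this HTML would be built from the database query;
// it is hard-coded here to keep the sketch self-contained.
echo "<html><body><h1>Article $articleId</h1><p>Article body goes here.</p></body></html>";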
Related
I am currently hosting the contents of a site with ProviderA. I have a domain registered with ProviderB. I want users to access the contents (www.providerA.com/sub/content) by visiting www.providerB.com. A domain forward is easy enough and works as intended, however, unless I embed the site in a frame (which is a big no-no), the actual URL reads www.providerA.com/sub/content despite the user inputting www.providerB.com.
I really need a solution for this: domain masking without the use of a frame. I'm sure this has been done before. An .htaccess domain rewrite?
Your help would be hugely appreciated! I'm going nuts trying to find a solution.
For Apache
Usual way: set up mod_proxy. The Apache on providerB becomes a client to providerA's Apache: it gets the content and sends it back to the client.
But it looks like you only have .htaccess, and a proxy needs full configuration access.
So you cannot do it that way; see: How to set up proxy in .htaccess
If you have PHP on providerB
Set up a proxy written in PHP. All requests to providerB are intercepted by that PHP proxy, which gets the content from providerA and sends it back, so it does the same thing as the Apache module. However, depending on the quality of the implementation, it might fail on some requests, content types, sizes, timeouts, ...
Search for "php proxy" on the web; you will see a couple available on GitHub and elsewhere. YMMV as to how difficult they are to set up and how reliable they are. A rough sketch of the idea follows.
No PHP but some other server side language
Obviously this could be done in another server-side language; I covered PHP because that is what I use the most.
The best solution would be to transfer the content to providerB :-)
I am doing some reverse engineering on a website.
We are using a LAMP stack under CentOS 5, without any commercial/open-source framework (Symfony, Laravel, etc.), just plain PHP with an in-house framework.
I wonder if there is any way to know which files on the server have been used to serve a request.
For example, let's say I am requesting http://myserver.com/index.php.
Let's assume that 'index.php' calls other PHP scripts (e.g. to connect to the database and retrieve some info), and that it also includes a couple of other HTML files, etc.
How can I get the list of those accessed files?
I already tried enabling server-status in Apache, and although it is working I can't get what I want (I also passed the 'refresh' parameter).
I also used lsof -c httpd, as suggested in other forums, but it produces a very large output and I can't find what I'm looking for.
I also read the Apache logs, but I only see the requests that the server handled.
Some other users suggested adding PHP directives like 'self', but that means I would need to know beforehand which files to modify to include that directive (which I don't), and that is precisely what I am trying to find out.
Is it actually possible to trace the internal activity of the server and get those file names and locations?
Regards.
Not that I have tried it yet, but it looks like mod_log_config is the answer to my own question.
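On the PHP side, a different technique I can sketch (my own suggestion, not part of the answer above) is to log get_included_files() at the end of every request via auto_prepend_file, which on mod_php setups can usually be set from .htaccess, so no existing script has to be edited. The file name and log path below are just examples.
<?php
// trace_includes.php - hypothetical prepend script.
// Enable it for every request without touching existing code, e.g. in .htaccess
// (mod_php only): php_value auto_prepend_file /path/to/trace_includes.php
register_shutdown_function(function () {
    // get_included_files() lists every file include'd or require'd so far,
    // which at shutdown is the full set used to build this response.
    $entry = date('c') . ' ' . ($_SERVER['REQUEST_URI'] ?? 'cli') . "\n    "
        . implode("\n    ", get_included_files()) . "\n";
    // Append to a log you can inspect afterwards; the path is just an example.
    file_put_contents('/tmp/php-included-files.log', $entry, FILE_APPEND);
});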
I am using wp-minify and a CSS and scripts file-aggregation plugin for website optimization.
Can anybody answer the following?
Which plugin is considered good for database optimization?
How do I add expiry headers to JavaScript, images and stylesheets?
Which plugin is good for enabling caching to improve the website's performance?
There are several plugins that help to create a cache so that, instead of the content being fetched from the database, it is loaded from the cache, i.e. a static HTML version. Conceptually they work along the lines of the sketch below.
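This is a deliberately simplified sketch of the idea, not code from any specific plugin; the cache directory and lifetime are only examples.
<?php
// Simplified page-cache sketch: serve a saved HTML copy when it is fresh,
// otherwise generate the page, store it, and send it.
$cacheDir  = __DIR__ . '/cache';                                   // example location
$cacheFile = $cacheDir . '/' . md5($_SERVER['REQUEST_URI'] ?? '/') . '.html';
$maxAge    = 600;                                                  // example: 10 minutes

// 1. A fresh cached copy skips all PHP and database work.
if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
    readfile($cacheFile);
    exit;
}

// 2. Otherwise buffer the page as it is generated...
ob_start();
echo "<html><body>Expensive, database-driven page goes here.</body></html>";
$html = ob_get_clean();

// 3. ...store the HTML for the next visitor, then send it.
if (!is_dir($cacheDir)) {
    mkdir($cacheDir, 0755, true);
}
file_put_contents($cacheFile, $html);
echo $html;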
Moreover, you can follow the SEO guidelines I have linked below to optimize your WordPress website for search engines like Google.
http://wordpress-tuts.blogspot.com/2017/05/best-ways-to-optimize-and-boost-seo-of.html
For expiry headers, you can modify the .htaccess file in your website's root folder and set the expiry time for all the files served from it.
Here is a simple guide:
http://softstribe.com/wordpress/how-to-add-expires-headers-in-wordpress/
The web server hosting my website is not returning Last-Modified or Expires headers. I would like to rectify this to ensure my web content is cacheable.
I don't have access to the Apache config files because the site is hosted in a shared environment that I have no control over. I can, however, make configuration changes via an .htaccess file. The server - Apache 1.3 - is not configured with mod_expires or mod_headers, and the company will not install these for me.
With these limitations in mind, what are my options?
Sorry for the post here. I recognise this question is not strictly a programming question, and more a sys admin question. When serverfault is public I'll make sure I direct questions of this nature there.
What sort of content? If it's static (HTML, images, CSS), then really the only way to attach headers is via the front-end web server. I'm surprised the hosting company doesn't have mod_headers enabled, although they might not enable it for .htaccess. It's costing them more bandwidth and CPU (i.e., money) not to cache.
If it's dynamic content, then you'll have control when generating the page. This will depend on your language; here's an example for PHP (it's from the PHP manual, and is a bad example, as it should also set the response code):
if (!headers_sent()) {
    // Headers can still be sent, so emit one (here, a redirect).
    header('Location: http://www.example.com/');
    exit;
}
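Since the goal here is caching rather than redirecting, here is a hedged sketch of what a PHP-generated page could send instead; the six-hour lifetime is only an example (in line with the advice below), and this only helps for content that actually passes through PHP:
<?php
// Example only: mark a PHP-generated page as cacheable for six hours.
$maxAge = 6 * 3600;
$lastModified = filemtime(__FILE__);   // stand-in for the content's real modification time

header('Cache-Control: public, max-age=' . $maxAge);
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + $maxAge) . ' GMT');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastModified) . ' GMT');

echo "<html><body>Cacheable dynamic content.</body></html>";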
Oh, and one thing about setting caching headers: don't set them for too long a duration, particularly for CSS and scripts. You may not think you want to change these, but you don't want a broken site while people still have the old content in their browsers. I would recommend maximum cache settings in the 4-8 hour range: good for a single user's session, or a work day, but not much more.
Does anyone know of any reverse proxy solutions that allow the content/data of an HTTP response to be directly modified before being relayed to the requesting client?
As an example:
The proxy relays a client request for a PDF document to another server, the response is received by the proxy, a watermark is added to the pages of the PDF, and the watermarked PDF is returned to the client.
Regards,
Mike
Apache has mod_proxy and mod_proxy_html, which is used to rewrite links, headers, etc. I've only ever seen HTML or XML filters, but you should be able to write your own binary one for your PDF needs. The possible difficulty I could see is that Apache treats webpages as a stream, rather than a file. I'm not sure how to watermark a PDF doc, but if you need access to the entire file to do it, it might get complicated quickly.
Note that it would seem far easier to me to do the watermarking on the server, where you have access to the file, rather than a proxy. If server load is a concern, either a batch process, or a separate server could be an alternative solution.
I found a description of Deliverance over in the Python tag, and it may be useful for what you're looking for. I have no experience with it myself, so grain of salt and all that.
http://www.openplans.org/projects/deliverance/introduction
I've had success with Pound.
I think I might go down the Squid/ICAP route.
This is for an enterprise level system, does anyone have any experience with either of these in this context?
http://wiki.squid-cache.org/Features/ICAP