How to know which files get stored in the browser cache - browser-cache

How does the browser determine which files to store in the cache? I would like to implement a feature on my website that checks the internet speed by measuring the time it takes to download some small files. The problem is that if these files get stored in the cache, my algorithm will not work. Also, if I generate a large file with ASPX in which the only thing that varies is the time, maybe I should split it into two files so that the part that does not change can be stored in the cache.

You should look at the HTTP headers. In C# for ASP.NET:
Response.CacheControl = "no-cache";
Response.AddHeader("Pragma", "no-cache");
But you may not be able to change them (depending on your web hosting service). In that case, for an HTML document:
<meta http-equiv="Expires" content="Fri, 01 Jan 2010 00:00:00 GMT">
<meta http-equiv="Pragma" content="no-cache">
Finally, you can simply change the name. Adding a random parameter does the trick most of the time:
..."http://example.org/myresource.extension?time=1213232322"...
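To illustrate the random-parameter approach, here is a minimal Python sketch. The helper names cache_busted and throughput_kbps and the parameter name nocache are made up for this example. On a real page the same idea would be written in browser-side script. The second helper only shows the arithmetic for turning a download size and an elapsed time into a speed figure.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url: str, param: str = "nocache") -> str:
    """Return url with an extra, unique query parameter appended.

    Because the resulting URL has never been requested before, a cache
    keyed on the full URL has no stored copy to serve.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, uuid.uuid4().hex))  # random value, new on every call
    return urlunsplit(parts._replace(query=urlencode(query)))


def throughput_kbps(size_bytes: int, elapsed_seconds: float) -> float:
    """Convert a download size and duration into kilobits per second."""
    return (size_bytes * 8 / 1000) / elapsed_seconds
```

Existing parameters such as time=1213232322 are kept, and the random one is added after them.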

Related

Force a page to be re-downloaded, rather than fetched from browser cache - Apache Server

I've made a minor text change to our website. The change is minor in that it's a couple of words, but the meaning is quite significant.
I want all users (both new and returning) to see the new text rather than any cached version. Is there a way I can force a user (new or returning) to re-download the page, rather than fetch it from their browser cache?
The site is a static html site hosted on a LAMP server.
This depends totally on how your web server has caching set up but, in short, if the page is already cached then you cannot force a download again until the cache expires. So you'll need to look at your cache headers in your browser's developer tools to see how long it's been set for.
Caching gives huge performance benefits and, in my opinion, really should be used. However that does mean you've a difficulty in forcing a refresh as you've discovered.
In case you're interested in how to handle this in the future, there are various cache-busting methods, all of which basically involve changing the URL to fool the browser into thinking it's a different resource and forcing the download.
For example, you can add a version number to a resource, so that instead of requesting index.html the browser asks for index2.html, but that could mean renaming the file and all references to it each time.
You can also set up rewrites in Apache using regular expressions so that index[0-9]*.html actually loads index.html so you don't need multiple copies of the file but can refer to it as index2.html or index3.html or even index5274.html and Apache will always serve the contents of index.html.
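As a rough illustration of what such a rewrite does (this is plain Python, not Apache configuration, and the function name resolve is made up), the same regular expression can map any versioned name back to the one file on disk:

```python
import re

# Same idea as the pattern in the answer: "index", optional digits, ".html".
VERSIONED = re.compile(r"^(?P<stem>index)[0-9]*\.html$")


def resolve(requested: str) -> str:
    """Return the file that would actually be served for a requested name."""
    match = VERSIONED.match(requested)
    if match:
        # Drop the version digits and serve the single real file.
        return f"{match.group('stem')}.html"
    return requested  # anything else is served unchanged


print(resolve("index2.html"))     # index.html
print(resolve("index5274.html"))  # index.html
print(resolve("about.html"))      # about.html
```

In Apache itself the equivalent would be a mod_rewrite rule built around the same pattern.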
These methods, though a little complicated to maintain unless you have an automated build process, work very well for resources that users don't see, for example CSS style sheets or JavaScript.
Cache busting techniques work less well for HTML pages themselves for a number of reasons: 1) they create unfriendly urls, 2) they cannot be used for default files where the file name itself is not specified (e.g. the home page) and 3) without changing the source page, your browser can't pick up the new URLs. For this reason some sites turn off caching for the HTML pages, so they are always reloaded.
Personally I think not caching HTML pages is a lost opportunity. For example visitors often visit a site's home page, and then try a few pages, going back to the home page in between. If you have no caching then the pages will be reloaded each time despite the fact it's likely not to have changed in between. So I prefer to have a short expiry and just live with the fact I can't force a refresh during that time.
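For what a short expiry could look like in terms of response headers, here is a small Python sketch. The function name and the five-minute figure are only assumptions for illustration. The answer does not say which lifetime it uses.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def short_expiry_headers(max_age_seconds: int, now: datetime) -> dict:
    """Build caching headers that let a page be reused for a short time.

    `now` must be a timezone-aware datetime in UTC.
    """
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        # Understood by HTTP/1.1 caches: lifetime in seconds.
        "Cache-Control": f"max-age={max_age_seconds}",
        # Older equivalent: an absolute expiry time as an HTTP date.
        "Expires": format_datetime(expires, usegmt=True),
    }


headers = short_expiry_headers(300, datetime.now(timezone.utc))
```

With a lifetime like this, a corrected page reaches returning visitors at most max_age_seconds after it is published.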

Asset cache time on Shopify servers

When editing a custom .css.liquid file that is not automatically set up by Shopify and cannot be placed in a page (since it does not have access to Shopify's Liquid templating system), I find that it can take hours for the CDNs to start serving up the new version of said .css.liquid file.
In the future, how can I cut down on this waiting time? Currently, here's what I think is going on:
Most asset URLs have some number appended to them, like so: path/to/filename?270. It could be that this number is meant to represent the last time the file was served, a version number, or some other flag telling the server to serve up the file. If so, then I can just create a template to grab this info myself (though I prefer not having to take an additional step).
The CDN servers' cache times are high, and will not reissue a new representation of the file until the data in the cache has expired. If so, there's not much I can do about this.
Please let me know if it's one of the above situations, or if it's something else.
I've had success with re-saving the layout file that calls the .css.liquid file.
For example: edit something then save it up to the server. And then edit it back again and save that back up to the server.
This seems to increment the query string on the path to the css file.

french characters not displaying properly on page in rails 3.0.3

I have created a Rails application in which I display names of any kind: people, cars, vegetables, any material.
So I thought of including some ingredient names like Crème Fraîche. Whenever I copy such a name from another web page and store it in my database, it is stored properly.
While displaying this name on web page, I get some strange characters appearing on the page like Cr�me Fra�che.
I have set the charset to UTF-8, but the name is still displayed like this.
I checked that the name is stored properly in my database, but on the page and in irb it is displayed like this.
I have wasted nearly 5 days on searching for the above problem but didn't succeed.
I hope this time I will get some help
Thanks in advance
Pankaj
Include the following meta tag in the <head> section of your page:
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />
This tells browsers to use ISO encoding to interpret the text on the page.
Even with the correct encoding in the webpage, characters will still display corrupted on a machine where the required fonts are not installed, and obviously you cannot be sure what fonts are installed on machines looking at your page.
Browsers may also need to be configured. For example, Firefox needs to be told in the Options/Tools/Content section to use the above character set by default.
To be absolutely certain your characters will display correctly use the unambiguous HTML code for each non-standard character e.g. instead of Crème use Cr & egrave ; me (NB there should not be spaces between the "&" and the ";" - I had to type it in this way to stop this page from interpreting the code and displaying è).
This can have implications for web searches and ordering functions in your database. You could store the text as Crème in your database and pass everything through a conversion function before delivering to the HTML page (this will obviously introduce a slight delay in fetching the page). You might also consider having two versions of your data, a raw data version and a display version. Then you could pass new data through the conversion function, store the converted version and have the converted data written to the HTML page.
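A conversion function of the kind described could be sketched like this in Python (the question is about Rails, so treat this purely as an illustration, and the function name is made up). It leaves ASCII untouched and replaces every other character with a named HTML entity where one exists, or a numeric reference otherwise.

```python
from html.entities import codepoint2name


def to_html_entities(text: str) -> str:
    """Replace each non-ASCII character with an HTML entity."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        elif code in codepoint2name:
            out.append(f"&{codepoint2name[code]};")  # named, e.g. &egrave;
        else:
            out.append(f"&#{code};")  # numeric fallback
    return "".join(out)


print(to_html_entities("Cr\u00e8me Fra\u00eeche"))  # Cr&egrave;me Fra&icirc;che
```

This sketch does not escape &, < or >, so it is meant for text that has already been made safe for HTML.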

temporary blocking google crawler, will it prevent future indexing?

Sometimes I need to add several updates to my site. To keep it clean I display a maintenance page. The first time I did this it became the main indexed page on Google. Therefore I added the meta tag <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> to prevent this. My question is: if Google comes across this temporary maintenance page it will not index it, but does that mean it will never index that page again? Or will it index the page again once there is new content?
I would really appreciate if someone could clear this up for me
Google has a guideline for dealing with planned maintenance/downtime: "How to deal with planned site downtime". In short, you should return a 503 HTTP status code on the pages that are under maintenance or down. Here is example PHP code to put at the top of those pages:
header('HTTP/1.1 503 Service Temporarily Unavailable');
If you know the exact or approximate time when the maintenance will be complete, you can add an optional Retry-After header alongside the 503 status code:
//when the exact completion time is known.
header('Retry-After: Sat, 08 Oct 2011 18:27:00 GMT');
or
//when the length of the downtime in seconds is known.
header('Retry-After: 86400');
For more information, read the Google article.
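The two forms of Retry-After can also be produced from one helper. The following Python sketch assumes a made-up function name and a simple (status, headers) return shape. It is not tied to any particular web framework.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Union


def maintenance_response(retry_after: Union[int, datetime]):
    """Return (status_code, headers) for a temporary maintenance page.

    retry_after is either a number of seconds or a timezone-aware
    datetime giving the expected end of the downtime.
    """
    if isinstance(retry_after, datetime):
        # Absolute form: an HTTP date, always expressed in GMT.
        value = format_datetime(retry_after.astimezone(timezone.utc), usegmt=True)
    else:
        # Relative form: whole seconds from now.
        value = str(int(retry_after))
    return 503, {"Retry-After": value}


print(maintenance_response(86400))
# (503, {'Retry-After': '86400'})
print(maintenance_response(datetime(2011, 10, 8, 18, 27, tzinfo=timezone.utc)))
# (503, {'Retry-After': 'Sat, 08 Oct 2011 18:27:00 GMT'})
```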
Google's crawler will reindex your site once you remove that meta tag (it is constantly updating its indices).
If you're really paranoid, I'd suggest checking out Google's Webmaster Tools so that you can directly control the indexing behavior of your site: http://www.google.com/webmasters/

When is a website considered "static" or "dynamic"

I have created a site that parses XML files and displays their content on the appropriate page. Is my site a dynamic web page or a static web page?
How do dynamic and static web pages differ?
I feel it's dynamic, because I parse the content from XML files; initially I don't have any content in my main page.
What do you think about this? Please explain.
I would describe your pages as dynamic. "Static" usually means that the file sitting on the web server is delivered as-is to the user; since you're assembling the pages from data files, I'd call them dynamic even if you're not building in any dynamically-changing data.
I don't think this is a hard and fast definition though. If someone feels the page is static because it's assembled from static pages, that's another way to look at it.
This is actually an interesting question..
I would have said it's a dynamic website, as the content is generated programmatically, but if the XML files do not change, it's no less "static" than straight HTML files served through Apache.
Say you have a site that is regular HTML files - it would be considered a static web-page; but if you take those HTML files, store them in a database, and have a simple page that allows /view.php?page=index - does that make it a dynamic site?
I would say no, it's just a static site served through a database, or XML files (instead of a file-system).
Basically: if the content changes without you manually editing those XML files, I would say it's a dynamic site. If it does not change unless you edit them, then I would say it's a static site.
Static web pages would be plain HTML content that are delivered. If you are processing any type of XML files at the server side and generating content accordingly, this is a dynamic page. Static pages change content when the page is actually edited & modified.
First result on Google if you had searched for it explains it. http://websiteowner.info/articles/pages/pagetypes.asp
Also, stating that static websites are not updated regularly is not correct. The web and HTML was around even before we started writing stuff in Perl & PHP. There are/were sites that had heavy traffic and were being modified manually.
A simple way to distinguish between static and dynamic:
Static: straight HTML files
Dynamic: HTML is generated through server-side code and a data store (XML, database, etc.)
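To make that distinction concrete, here is a small Python sketch. The XML layout (an items root with item children) and both function names are invented for the example, since the question does not show its actual files.

```python
import xml.etree.ElementTree as ET
from html import escape

# Static: the stored markup is returned exactly as it is.
STATIC_PAGE = "<html><body><h1>Hello</h1></body></html>"


def serve_static() -> str:
    return STATIC_PAGE


def serve_dynamic(xml_source: str) -> str:
    """Dynamic: the HTML is assembled from a data store on each request."""
    root = ET.fromstring(xml_source)
    items = "".join(
        f"<li>{escape(item.text or '')}</li>" for item in root.iter("item")
    )
    return f"<html><body><ul>{items}</ul></body></html>"


print(serve_dynamic("<items><item>a</item><item>b</item></items>"))
# <html><body><ul><li>a</li><li>b</li></ul></body></html>
```

Editing the XML changes what serve_dynamic returns without touching any HTML, which is the property most of the answers here use to call a page dynamic.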
KISS - Dynamic pages change without changing the page itself.
Your pages are dynamic, because once deployed the content can be changed without changing the page's HTML.
Any content that is fixed and always renders the same is considered Static.