Is dns-prefetch needed for resources that have a cache policy? - browser-cache

Is it unnecessary to add dns-prefetch for servers that return cached resources?
Does the web browser even need to resolve the domain for assets it knows it has cached?

dns-prefetch is recommended for all types of resources, as it's not a heavy operation.
However, for cached resources a DNS lookup isn't required, and in most cases the DNS records for the domains serving cached resources are already available in a resolver cache anyway.
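As a rough illustration of resolver caching, you can time two consecutive lookups of the same host with a quick Python sketch; whether the second lookup is actually faster depends on whether your OS or a local resolver caches results (the hostname is just a placeholder):

```python
import socket
import time

def timed_lookup(host: str) -> float:
    """Resolve host via the system resolver and return elapsed seconds."""
    start = time.perf_counter()
    socket.getaddrinfo(host, 443)
    return time.perf_counter() - start

host = "example.com"            # placeholder domain
first = timed_lookup(host)      # may hit the network
second = timed_lookup(host)     # often served from an OS/resolver cache
print(f"first: {first * 1000:.1f} ms, second: {second * 1000:.1f} ms")
```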

Related

Is Cache Invalidation on S3 a One Time Event or a Type of Policy?

I have a set of files on Amazon S3, served through CloudFront, that I do not want cached. I was able to create an invalidation on a single file that seemed to work. However, going forward the file seems to be cached again even though the invalidation entry is still present. Is the invalidation a "one-time event"? Does anyone know the exact details of how this works?
I would like a set of files to basically never be cached going forward.
Thanks for any suggestions and best practices advice.
Invalidation removes a cached entry from CloudFront's edge locations, but has no impact on whether or not the invalidated object(s) are cached again in the future. All else held equal: after you issue an invalidation, objects that were previously cached will be cached again on subsequent requests.
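In other words, an invalidation is a one-off API call, not a standing rule. A minimal boto3 sketch, where the distribution ID and path are placeholders:

```python
import time
import boto3

cloudfront = boto3.client("cloudfront")

# A single, one-off invalidation: it evicts currently cached copies,
# but does nothing to stop the object from being cached again later.
cloudfront.create_invalidation(
    DistributionId="E1234567890ABC",  # placeholder distribution ID
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/files/do-not-cache.json"]},
        "CallerReference": str(time.time()),  # must be unique per request
    },
)
```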
Before we explore the options, two definitions that are important to understand:
Cache behaviors are effectively routes with dedicated configurations that apply only to requests matching the route (known as a path pattern).
Cache policies are instructions for how CloudFront will cache your responses. Cache policies are attached to one or more cache behaviors. The min and max TTL set a floor and ceiling on the value returned in your Cache-Control/Expires headers. The default TTL determines the length of time to cache a response when you don't provide a Cache-Control/Expires header.
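To make the floor/ceiling behaviour concrete, here is a rough sketch of how the effective TTL comes out, assuming the origin's max-age is simply clamped between the min and max TTL (an approximation, not CloudFront's exact algorithm):

```python
from typing import Optional

def effective_ttl(origin_max_age: Optional[int],
                  min_ttl: int, default_ttl: int, max_ttl: int) -> int:
    """Approximate the TTL CloudFront uses for a cached response."""
    if origin_max_age is None:
        # No Cache-Control/Expires from the origin: the default TTL applies.
        return default_ttl
    # Otherwise the origin's value is clamped between min and max TTL.
    return max(min_ttl, min(origin_max_age, max_ttl))

# Example: origin says max-age=60; policy has min=0, default=86400, max=31536000
print(effective_ttl(60, 0, 86400, 31536000))  # -> 60
```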
Do you want to prevent caching for all files in your S3 bucket?
Attach the CachingDisabled cache policy (provided by CloudFront) to your default cache behavior.
Do you want to prevent caching for only certain files in your S3 bucket?
If the files you do not want to cache live in the same directory, create a cache behavior to match that path and use the CachingDisabled cache policy (provided by CloudFront) to prevent files in that directory from being cached. This instructs CloudFront to use a cache policy that does not cache responses when processing requests that match a specific path/route.
Set a Cache-Control header as metadata on the objects in S3 to instruct CloudFront not to cache, while caching the other objects.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html (Scroll down to Adding headers to your objects using the Amazon S3 console)
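If you'd rather script that second approach than use the console, a boto3 sketch along these lines should work (bucket and key are placeholders):

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-bucket", "files/do-not-cache.json"  # placeholders

# Copy the object onto itself, replacing its metadata so the stored
# Cache-Control header tells CloudFront (and browsers) not to cache it.
s3.copy_object(
    Bucket=bucket,
    Key=key,
    CopySource={"Bucket": bucket, "Key": key},
    MetadataDirective="REPLACE",
    CacheControl="no-cache, no-store, must-revalidate",
    ContentType="application/json",  # re-set, since REPLACE discards old metadata
)
```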

Redis Cache Share Across Regions

I've got an application using Redis for caching, and it works well so far. However, we need to spread our application across different regions (through a dynamic DNS dispatcher based on user location, so each user hits the nearest server).
Considering the network limitations and bandwidth, building a centralised Redis isn't realistic, so we have to assign a separate Redis instance to each region. The problem is how to handle the roaming case: a user opens the app in location 1, then keeps using it in location 2, without missing the cache built up in location 1.
You will have to use a tiered architecture. This is how most CDNs, like Akamai or Amazon CloudFront, work.
Simply put, this is how it works:
When an object is requested, see if it exists in the Redis cache server S1 assigned to location L1.
If it does not exist in S1, check whether it exists in the caching servers of the other locations, i.e. S2, S3, ..., SN.
If it is found in S2...SN, store the object in S1 as well, and serve the object.
If it is not found in S2...SN either, fetch the object fresh from the backend, and store it in S1.
If you are using memcached for caching, Facebook's open-source mcrouter project will help, as it handles routing (and replicating) cache traffic across multiple memcached pools.
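For the Redis case, a minimal sketch of the tiered lookup described above, using the redis-py client; the hostnames, key layout, and fetch_from_backend helper are all placeholders:

```python
import redis

# One client per region; LOCAL is the cache for this region (S1).
LOCAL = redis.Redis(host="redis.region1.internal", port=6379)   # placeholder hosts
REMOTES = [
    redis.Redis(host="redis.region2.internal", port=6379),
    redis.Redis(host="redis.region3.internal", port=6379),
]

def fetch_from_backend(key: str) -> bytes:
    """Placeholder for the real origin/database fetch."""
    raise NotImplementedError

def tiered_get(key: str, ttl: int = 3600) -> bytes:
    # 1) Try the local cache first.
    value = LOCAL.get(key)
    if value is not None:
        return value
    # 2) Fall back to the caches of the other regions.
    for remote in REMOTES:
        value = remote.get(key)
        if value is not None:
            break
    # 3) Still nothing: go to the backend.
    if value is None:
        value = fetch_from_backend(key)
    # 4) Warm the local cache so the next request in this region hits S1.
    LOCAL.set(key, value, ex=ttl)
    return value
```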

PyroCMS Minified Asset files across multiple machines

If I'm deploying a site across multiple servers (to be load balanced), it seems that on one of the servers the minified assets aren't in assets/cache. They're there on the other servers, but not this one. Is that a database thing that PyroCMS uses to check which asset file to use?
PyroCMS will not check the database for cached assets; that would be weird. It just caches them whenever they are requested. Maybe traffic isn't reaching that machine?

Strange domains in mod_pagespeed cache folder

About a year ago I installed mod_pagespeed on my VPS server, set it up and left it running. Recently I was exploring the files on my server, went to the pagespeed cache folder and discovered some strange folders.
The folders are usually named like ,2Fwww.mydomain.com, or ,2F111.111.111.111 for IP addresses. I was surprised to see some domains that do not belong to me, like:
24x7-allrequestsallowed.com
allrequestsallowed.com
m.odnoklassniki.ru
www.fbi.gov
www.securitylab.ru
It looks like something dodgy is going on. Was my server compromised, or is there a reasonable explanation?
That does look peculiar. Everything in the cache folder should be files that mod_pagespeed tried to rewrite. There are two ways that I know of that this can happen:
1) You reference some third-party resource (say an image from another domain, or a Google Analytics script) and you have explicitly enabled rewriting of that domain with ModPagespeedDomain www.example.com or ModPagespeedDomain *.
2) If your server accepts HTTP requests with invalid Host headers. Try (for example) wget --header="Host: www.fbi.gov" www.yourdomain.com/foo/bar.html. If your server accepts requests like that it may be providing mod_pagespeed with an incorrect base domain, and then subresources would be fetched from the same domain (so if www.yourdomain.com/foo/bar.html references some.jpeg, and your server accepts invalid host headers, we could fetch www.fbi.gov/foo/some.jpeg as the resource). There was a recent security release that makes sure all of these subrequests are done against localhost (not arbitrary third-party websites). Please see: https://developers.google.com/speed/docs/mod_pagespeed/CVE-2012-4001
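If wget isn't handy, the same check can be sketched with Python's requests library (the hostnames are placeholders); if your site serves a normal page instead of an error, the server is accepting arbitrary Host headers:

```python
import requests

# Send a request to your own server, but with someone else's Host header.
resp = requests.get(
    "http://www.yourdomain.com/foo/bar.html",   # placeholder URL
    headers={"Host": "www.fbi.gov"},            # forged Host header
    allow_redirects=False,
    timeout=10,
)
print(resp.status_code)
print(resp.text[:500])  # inspect whether your site served the page anyway
```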
You might want to look through these folders and see what specific resources are in there. I think that the biggest concern you should have is that someone might be trying to perform an XSS attack on your users or maybe a DDoS attack against another website (like www.fbi.gov), using your server as one vector. I do not think that these folders are indicative that your server itself is compromised.
If you would like to discuss this more, https://groups.google.com/forum/?fromgroups#!forum/mod-pagespeed-discuss is a good list to join and email.

Can a website be cached anywhere other than a browser's cache?

My client is seeing a different version of the website on his computers than what I am seeing on mine. He claims to be deleting the cache. I'm using Safari with the cache disabled via the Develop menu, and I see the correct version of the site.
Is it possible that the website is somehow cached by my client's ISP or something along those lines?
Update:
I think I need to describe the problem better:
My client has a web hosting package where he has his domains and email accounts. somedomain.com has its A record changed to point to Behance's ProSite hosted service.
The problem is that when he goes to somedomain.com he gets the index.html that's sitting in his web server's public_html directory, and not his ProSite. Using the same domain, I see the ProSite. He has cleared his cache and tried on a computer at home with the same result. This is what led me to believe that there is some sort of caching issue somewhere along the line with his ISP(s).
Is there anything I can do about this?
Proxy servers at the ISP, or even at the client's site, might do this, as can network compressors in some misconfigurations.
Depending on the site, you might also actually be seeing a different site entirely; e.g. Google directs users to different servers using DNS load balancing.
Yes, you're right. To improve performance and page-load speed on repeat requests, modern browsers are aggressive about caching; I've run into the same problem myself. To work around it, you should tag each version of your project (for example, by versioning your asset URLs) whenever you deploy to production.
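One common way to tag a version is to fingerprint asset URLs so that every deploy produces new URLs the browser has never cached; a small sketch (the file path is a placeholder):

```python
import hashlib
from pathlib import Path

def versioned_url(path: str) -> str:
    """Append a short content hash so a changed file gets a new URL."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()[:8]
    return f"/{path}?v={digest}"

# e.g. "/static/app.css?v=3fa1b2c4" -- browsers treat it as a brand-new resource
print(versioned_url("static/app.css"))  # placeholder file
```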
Based on the update, the problem was with the DNS cache.
DNS can be cached at the following levels:
browser
operating system
router
DNS provider
Each of them has its own way to flush its DNS cache, except the DNS provider, where the only thing you can do is wait for the cached record to expire. Alternatively, you can point at a different DNS resolver that doesn't have your domain in its cache yet; the odds of finding one are good if your domain isn't popular.
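To check whether a stale DNS cache, rather than an HTTP cache, is the culprit, you can compare what your configured resolver returns against a public resolver. A sketch using the third-party dnspython package (the domain is a placeholder):

```python
import dns.resolver  # pip install dnspython (2.x)

domain = "somedomain.com"  # placeholder

# What the system-configured resolver (possibly with a stale cache) returns.
local = dns.resolver.Resolver()
local_ips = sorted(r.address for r in local.resolve(domain, "A"))

# What a public resolver returns, bypassing the local/ISP resolver's cache.
public = dns.resolver.Resolver(configure=False)
public.nameservers = ["8.8.8.8"]
public_ips = sorted(r.address for r in public.resolve(domain, "A"))

print("local :", local_ips)
print("public:", public_ips)
# If these differ, some resolver along the way is still serving the old A record.
```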