I'm using an NSURLRequest to check for available data updates. Today I noticed that NSURLRequest caches responses by default. Even after several hours, no new request was sent to the server. Now I'm wondering what the default behavior of this cache is. When will the cached data become stale and a new request be sent to the server?
The data is a static file and the server does not send explicit cache control headers:
HTTP/1.1 200 OK
Date: Fri, 13 Apr 2012 08:51:13 GMT
Server: Apache
Last-Modified: Thu, 12 Apr 2012 14:02:17 GMT
ETag: "2852a-64-4bd7bcdba2c40"
Accept-Ranges: bytes
Content-Length: 100
Vary: Accept-Encoding,User-Agent
Content-Type: text/plain
P.S.: The new version of my app sets an explicit caching policy, so that this isn't a problem anymore, but I'm curious what the default behavior is.
Note: the documentation here specifies how this should work in detail:
1. If there is no cache, then fetch the data.
2. If there is a cache, then check the loading scheme:
   a. if re-validation is specified, check the source for changes
   b. if re-validation is not specified, then fetch from the local cache as per 3)
3. If re-validation is not specified, the local cache is checked to see if it is recent enough:
   a. if the cache is not stale, the data is pulled from the cache
   b. if the data is stale, it is re-validated against the source
From here:
The default cache policy for an NSURLRequest instance is NSURLRequestUseProtocolCachePolicy. The NSURLRequestUseProtocolCachePolicy behavior is protocol specific and is defined as being the best conforming policy for the protocol
From here:
If an NSCachedURLResponse does not exist for the request, then the data is fetched from the originating source. If there is a cached response for the request, the URL loading system checks the response to determine if it specifies that the contents must be revalidated. If the contents must be revalidated, a connection is made to the originating source to see if it has changed. If it has not changed, then the response is returned from the local cache. If it has changed, the data is fetched from the originating source.
If the cached response doesn’t specify that the contents must be revalidated, the maximum age or expiration specified in the response is examined. If the cached response is recent enough, then the response is returned from the local cache. If the response is determined to be stale, the originating source is checked for newer data. If newer data is available, the data is fetched from the originating source, otherwise it is returned from the cache.
Other options are listed here.
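If the default protocol-based behaviour is not what you want, you can pass one of those other policies explicitly when creating the request. A minimal sketch, using modern Swift/URLSession rather than the NSURLConnection API from the question, with a placeholder URL that is not from the original post:
import Foundation

// Placeholder URL standing in for the update file being polled.
let url = URL(string: "https://example.com/updates.txt")!

// .reloadIgnoringLocalCacheData always goes to the server and never
// serves the response from the local cache.
let request = URLRequest(url: url,
                         cachePolicy: .reloadIgnoringLocalCacheData,
                         timeoutInterval: 30)

URLSession.shared.dataTask(with: request) { data, response, error in
    // Handle the (always fresh) response here.
}.resume()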
In my experience it issues a fresh request after I restart my app. By restart, I mean pressing the home button twice to show the list of running apps, killing my app, and starting it again by tapping the app icon.
Also, it is possible to disable caching if you want to. Implement this NSURLConnection delegate method:
// Disable caching since our files are stored on multiple servers and
// false response caching can cause issues.
- (NSCachedURLResponse *)connection:(NSURLConnection *)connection
                  willCacheResponse:(NSCachedURLResponse *)cachedResponse
{
    NSCachedURLResponse *newCachedResponse = cachedResponse;
    // To disable caching only for HTTPS responses, restore this condition:
    // if ([[[[cachedResponse response] URL] scheme] isEqual:@"https"])
    {
        newCachedResponse = nil;
    }
    return newCachedResponse;
}
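If you are on NSURLSession rather than NSURLConnection, the equivalent hook is the URLSessionDataDelegate method willCacheResponse. A rough Swift sketch (an assumption for newer code, not part of the original answer):
import Foundation

final class NoCacheDelegate: NSObject, URLSessionDataDelegate {
    // Passing nil to the completion handler keeps the response
    // out of the shared URL cache.
    func urlSession(_ session: URLSession,
                    dataTask: URLSessionDataTask,
                    willCacheResponse proposedResponse: CachedURLResponse,
                    completionHandler: @escaping (CachedURLResponse?) -> Void) {
        completionHandler(nil)
    }
}

// Usage: attach the delegate when creating the session.
let session = URLSession(configuration: .default,
                         delegate: NoCacheDelegate(),
                         delegateQueue: nil)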
According to this article: http://blackpixel.com/blog/2012/05/caching-and-nsurlconnection.html , if you are using NSURLRequestUseProtocolCachePolicy and the server returns neither an expiration nor a max-age, the default cache time interval is 6-24 hours. So be careful about this case; it is better practice to have the server send max-age or an expiration header when relying on NSURLRequestUseProtocolCachePolicy.
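If you cannot change the server's headers, one client-side option is to ask for revalidation on every request instead of relying on that heuristic. A sketch, assuming Swift/URLSession on a recent OS where the .reloadRevalidatingCacheData policy is implemented, with a placeholder URL:
import Foundation

// .reloadRevalidatingCacheData sends a conditional request and only
// re-downloads the body when the server says the resource changed.
let updateURL = URL(string: "https://example.com/updates.txt")!  // placeholder
let request = URLRequest(url: updateURL,
                         cachePolicy: .reloadRevalidatingCacheData)

URLSession.shared.dataTask(with: request) { data, response, error in
    // data is either the revalidated cached body or a freshly downloaded one.
}.resume()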
I have an ASP.NET API application running on Windows Server 2019 Datacenter, and some of the requests for one API get truncated. As far as I know this is happening to only one of the APIs, because it's the only one that uploads big chunks of data. All others are tiny requests.
This API takes JSON in the body which contains an issue ID and a serialized image. The largest request size is 20 MB, so it's not too big, and most of the time they go through fine, but sometimes the request gets chopped off at the end, sometimes a little, sometimes a lot. There is also a pattern: when a request is truncated, the API returns a 500 code, the client device tries to call the API again, and often the retry will continue to be truncated, always in different places.
I have good visibility into this because I use a logger module that writes every request to a text file, allowing me to see exactly what hits the server.
I know the device is sending a well-formatted request, because it would get an exception if it created badly formatted JSON.
Lastly, this is using SSL.
This is what a typical request looks like:
HEADERS
Content-Length : 7154364
Content-Type : text/plain; charset=utf-8
Accept-Encoding : gzip
Host : cmtafr-dev.nwis.net
User-Agent : Dart/2.17 (dart:io)
deviceid : xxxxxxx
appversion : 2.21.10
cap2.0_tokenkey : xxxxx
osversion : 15.6.1
devicebrand : IOS
devicemodel : iPhone13,4
PATH
/api/IssueController/ExecuteIssue_UploadImageV2/7acd5643-c112-4f74-9dfa-1d558ee3ae69
BODY
{"Isu_Id":"9EC539F2-ABDD-49C5-934A-FAD6371B3E9C","ImageData":"/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEAA... bla bla bla"}
Additional info:
The screenshot below shows two requests attempting to upload the same image. Both failed because they were truncated. Notice that the received sizes are different, meaning the requests left the device in good shape.
However, the header on both requests shows the same Content-Length:
16449758
HEADERS
Connection : keep-alive
Content-Length : **16449758**
Content-Type : application/json
Accept : */*
How can I possibly troubleshoot this? I have spent hours searching and have not found anything similar. Most posts regarding truncated requests/responses are application-specific, with a vendor replying.
Thank you for any help you can offer.
I have a controller which returns SVG images. As I wanted good performance, I decided to use caching.
From what I read on the web, once you set the last-modified date with HttpContext.Response.Cache.SetLastModified(date), you can read it back from the browser using HttpContext.Request.Headers.Get("If-Modified-Since") and compare the two dates. If they are equal, it means that the image has not been modified, therefore you can return HttpStatusCodeResult(304, "Not Modified").
But something weird is happening, here is my code:
[OutputCache(Duration = 60, Location = OutputCacheLocation.Any, VaryByParam = "id")]
public ActionResult GetSVGResources(string id)
{
    DateTime lastModifiedDate = Assembly.GetAssembly(typeof(Resources)).GetLinkerTime();

    string rawIfModifiedSince = HttpContext.Request.Headers.Get("If-Modified-Since");
    if (string.IsNullOrEmpty(rawIfModifiedSince))
    {
        // Set Last Modified time
        HttpContext.Response.Cache.SetLastModified(lastModifiedDate);
    }
    else
    {
        DateTime ifModifiedSince = DateTime.Parse(rawIfModifiedSince);
        if (DateTime.Compare(lastModifiedDate, ifModifiedSince) == 0)
        {
            // The requested file has not changed
            return new HttpStatusCodeResult(304, "Not Modified");
        }
    }

    if (!id.Equals("null"))
        return new FileContentResult(Resources.getsvg(id), "image/svg+xml");
    else
        return null;
}
What is happening is that HttpContext.Response.Cache.SetLastModified(lastModifiedDate); does not set the "If-Modified-Since" value returned by the browser. In fact, HttpContext.Request.Headers.Get("If-Modified-Since") returns exactly the time when the image was returned by the previous call to return new FileContentResult(Resources.getsvg(id), "image/svg+xml");.
So my questions are:
1 - What exactly does HttpContext.Response.Cache.SetLastModified(lastModifiedDate) set?
2 - How can I (the server) set the "If-Modified-Since" value returned by the browser?
It seems like you're muddling a bunch of related but nonetheless completely different concepts here.
OutputCache is a memory-based cache on the server. Caching something there means that while it still exists in memory and is not yet stale, the server can forgo processing the action and just return the already rendered response from earlier. The client is not involved at all.
HTTP cache is an interaction between the server and the client. The server sends a Last-Modified response header, indicating to the client when the resource was last updated. The client sends an If-Modified-Since request header, to indicate to the server that it's not necessary to send the resource as part of the response if it hasn't been modified. All this does is save a bit of bandwidth. The request is still made and a response is still received, but the actual data of the resource (like your SVG) doesn't have to be transmitted down the pipe.
Then, there's basic browser-based caching that works in concert with HTTP cache, but can function without it just as well. The browser simply saves a copy of every resource it downloads. If it still has that copy, it doesn't bother making a request to the server to fetch it again. In this scenario, a request may not even be made. However, the browser may also send a request with that If-Modified-Since header to see if the file it has is still "fresh". Then, if it doesn't get the file again from the server, it just uses its saved copy.
Either way, it's all on the client. A client could be configured to never cache, in which case it will always request resources whether or not they've been modified, or it may be configured to always use its cache and never even bother to check whether the resource has been updated or not.
There are also things like proxies that complicate matters further still, as the proxy acts as the client and may choose to cache or not cache at will, before the web browser or other client of the end user even gets a say in the matter.
What all that boils down to is that you can't set If-Modified-Since on the server and you can't control whether or not the client sends it in the request. When it comes to forms of caching that involve a client, you're at the whims of the client.
I have a general question related to caching of API calls, in this instance calls to the Github API.
Let's say I have a page in my app that shows the filenames of a repo, and the content of the README. This means that I will have to do a few API calls in order to retrieve that.
Now, let's say I want to add something like memcached in between, so I'm not doing these calls over and over, if I don't need to.
How would you normally go about this? If I don't enable a webhook on GitHub, I have no way of knowing whether the cache should expire. I could always make a single call to get the current sha of HEAD, and if it hasn't changed, use the cache instead. But that works at the repo level, not the file level.
I can imagine I could do something like that with the object SHAs, but if I need to call the API anyway to get those, it defeats the purpose of caching.
How would you go about it? I know a service like prose.io has no caching right now, but if it should, what would the approach be?
Thanks
Would just using HTTP caching be good enough for your use case? The purpose of HTTP caching is not just to avoid making requests when you already have a fresh response; it also enables you to quickly check whether the response you already have in the cache is still valid (without the server sending the complete response again if it is).
Looking at GitHub API responses, I can see that GitHub is correctly setting the relevant HTTP headers (ETag, Last-Modified, Cache-Control).
So, you just do a GET, e.g. for:
GET https://api.github.com/users/izuzak/repos
and this returns:
200 OK
...
ETag:"df739f00c5053d12ef3c625ad6b0fd08"
Last-Modified:Thu, 14 Feb 2013 22:31:14 GMT
...
Next time - you do a GET for the same resource, but also supply the relevant HTTP caching headers so that it is actually a conditional GET:
GET https://api.github.com/users/izuzak/repos
...
If-Modified-Since:Thu, 14 Feb 2013 22:31:14 GMT
If-None-Match:"df739f00c5053d12ef3c625ad6b0fd08"
...
And lo and behold, the server returns a 304 Not Modified response and your HTTP client will pull the response from its cache:
304 Not Modified
So, the GitHub API does HTTP caching right and you should use it. Granted, you have to use an HTTP client that also supports HTTP caching. The best thing is that if you get a 304 Not Modified response, GitHub does not decrease your remaining API call quota. See: https://docs.github.com/en/rest/overview/resources-in-the-rest-api#conditional-requests
GitHub API also sets the Cache-Control: private, max-age=60 header, so you have 60 seconds of freshness -- which means that requests for the same resource made less than 60 seconds apart will not even be made to the server.
Your reasoning about using a single conditional GET request to a resource that surely changes if anything in the repo changed (a resource showing the sha of HEAD, for example) sounds reasonable -- since if that resource hasn't changed, then you don't have to check the individual files since they haven't surely changed.
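For completeness, here is a rough client-side sketch of such a conditional GET. It assumes Swift/URLSession (the original question does not name a language); the URL and ETag are the ones from the example above:
import Foundation

let reposURL = URL(string: "https://api.github.com/users/izuzak/repos")!
var conditionalRequest = URLRequest(url: reposURL)
// Send the ETag remembered from the previous response.
conditionalRequest.setValue("\"df739f00c5053d12ef3c625ad6b0fd08\"",
                            forHTTPHeaderField: "If-None-Match")

URLSession.shared.dataTask(with: conditionalRequest) { data, response, error in
    guard let http = response as? HTTPURLResponse else { return }
    if http.statusCode == 304 {
        // Not modified: reuse the locally stored copy; the rate limit is not charged.
    } else if let data = data {
        // Fresh data: store it and remember the new ETag for the next request.
        let newETag = http.allHeaderFields["ETag"] as? String
        _ = (data, newETag)
    }
}.resume()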
If I set this for cache control on my site:
Header unset Pragma
FileETag None
Header unset ETag
# 1 YEAR
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|swf|mp3|mp4)$">
Header set Cache-Control "public"
Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
Header unset Last-Modified
</FilesMatch>
# 2 HOURS
<FilesMatch "\.(html|htm|xml|txt|xsl)$">
Header set Cache-Control "max-age=7200, must-revalidate"
</FilesMatch>
# CACHED FOREVER
# MOD_REWRITE TO RENAME EVERY CHANGE
<FilesMatch "\.(js|css)$">
Header set Cache-Control "public"
Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
Header unset Last-Modified
</FilesMatch>
...then what if I update any CSS or image or other files? Will the user's browser still use the cached version until it expires (a year later)?
Thanks
Your css, js and image files will never be cached, as you are setting a date in the past.
I assume this is a mistake and you intended to set it a year in the future; this is one reason to favour max-age over Expires.
If that were the case, then your images would be cached for up to a year. It's allowable to drop something out of the cache at any time, for example to clean out less frequently used entries to reduce the size on disk that the cache is taking up.
There are two possible approaches to reduce the risk of staleness. One is to set a much lower expiry time and use ETags and modification dates, so that after that expiry time has passed you can send a 304 if there is no change, and the server need send only a few bytes rather than the entire entity.
The other is to keep the expiry at a year, but to change the URI whenever the resource changes. This can be useful in the case of e.g. a large file that is used on almost every page of your site. It requires that you change all references to that resource when it does change (because you are essentially switching to a new resource), which can be fiddly and is therefore only advised as an optimisation in a few hotspot cases. If a file ignores query attributes (e.g. it's just served straight from disk), the browser won't know that, so you could use something like /scripts/bigScript.js?version=1.2.3 and then change to /scripts/bigScript.js?version=1.2.4 when you change bigScript.js. This will have no effect on bigScript.js itself, but will cause the browser to fetch a new copy, as for all it knows it's a completely different resource.
Yes, a response with an expiration date in the future will be considered as fresh until the expiration date:
The Expires entity-header field gives the date/time after which the response is considered stale. […]
The presence of an Expires header field with a date value of some time in the future on a response that otherwise would by default be non-cacheable indicates that the response is cacheable, unless indicated otherwise by a Cache-Control header field (section 14.9).
Note that an expiration date more than one year in the future may be interpreted as never expires:
To mark a response as "never expires," an origin server sends an Expires date approximately one year from the time the response is sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one year in the future.
So if a cache has the response stored, it will probably take the response from the cache even without revalidating the cached response before sending it.
Now if you change a resource that is already stored in caches and still fresh, there is no way to invalidate them:
[…] although they might continue to be "fresh," they do not accurately reflect what the origin server would return for a new request on that resource.
There is no way for the HTTP protocol to guarantee that all such cache entries are marked invalid. For example, the request that caused the change at the origin server might not have gone through the proxy where a cache entry is stored.
This is the reason why such never-expiring resources use a unique version number in the URL (e.g. style-v123.css) that is changed with each update. This is also what I recommend in this case.
By the way, declaring the response with Cache-Control as public doesn’t do anything in this case. This is only used when a response that required authorization should be cacheable:
public – Indicates that the response MAY be cached by any cache, even if it would normally be non-cacheable or cacheable only within a non- shared cache. (See also Authorization, section 14.8, for additional details.)
For further information on HTTP caching:
HTTP 1.1 specification – Caching in HTTP
Mark Nottingham’s Caching Tutorial
I'm implementing a custom web server of sorts, and am looking into adding Expires header support. However, I'm a little unsure of how exactly to implement it.
If multiple cold-cache requests are being made to the same unchanged resource on the server and the server returns a different Expires header each time (say it uses a relative time to calculate the exact value of the Expires date, e.g. +6 hours from the request time), does that invalidate the cache on all the proxy servers in between as well? Or is that impossible (per the spec)?
Does the Expires HTTP header need to be consistent across multiple cold-cache requests?
OK, never mind, I found the relevant information under the Cache Revalidation and Reload Controls section of the HTTP spec.
Basically, you can serve as many different validators as you want, but you must be aware that in such a case proxies may hold a set of different validators, from their own cache and from the various user agents communicating with the proxy. They may choose to send you one of those, and it might not be the correct or most optimal one for the end users. However, a "best approach" has been suggested in the spec.
I suppose this covers Expires headers as well as ETags, Cache-Control and whatnot.
Here's the relevant excerpt, in case anyone's interested:
When an intermediate cache is forced, by means of a max-age=0 directive, to revalidate its own cache entry, and the client has supplied its own validator in the request, the supplied validator might differ from the validator currently stored with the cache entry. In this case, the cache MAY use either validator in making its own request without affecting semantic transparency. However, the choice of validator might affect performance. The best approach is for the intermediate cache to use its own validator when making its request. If the server replies with 304 (Not Modified), then the cache can return its now validated copy to the client with a 200 (OK) response. If the server replies with a new entity and cache validator, however, the intermediate cache can compare the returned validator with the one provided in the client's request, using the strong comparison function. If the client's validator is equal to the origin server's, then the intermediate cache simply returns 304 (Not Modified). Otherwise, it returns the new entity with a 200 (OK) response. If a request includes the no-cache directive, it SHOULD NOT include min-fresh, max-stale, or max-age.