Is it possible to have GitHub Readme images follow redirects? - testing

I'm trying to add a test coverage badge to the Readme of a private repository on GitHub. Our continuous integration process saves out the image to a secured Google Cloud Storage bucket that's not accessible to the public, and should remain that way.
Google's authorization layer is smart enough that if I go to the URL for the image, I'm automatically redirected to the resource with a valid auto-generated signed URL.
E.g., if I go to http://storage.cloud.google.com/secret-files/mysecretfile.png, then if I'm logged in and allowed to view it, I'm automatically redirected to something like https://blahblah-apidata.googleusercontent.com/download/storage/v1/b/secret-files/o/mysecretfile.png?key=verylongkey, where I can load the image.
This seemed perfect. Reference the canonical path in the GitHub Readme, authenticated users see the image, unauthenticated users are still blocked, we don't have to make the file public, and we don't have to do anything complicated.
Except that GitHub is proxying the image request, meaning that it will always be unauthenticated. My browser is loading something like https://camo.githubusercontent.com/mysecretimage.png.
Is there a clever way to work around this? Or do I need to go back to the drawing board?

All images on github.com are proxied using the Camo image proxy. There are a couple reasons for this:
It preserves the privacy of users. It isn't possible for a document to track users by directing them to a different site or using cookies to track them.
It means images can be cached and served at an appropriate size.
GitHub can have a very strict content security policy that does not allow loading from untrusted sites, which means that any sort of accidental security problem (like an XSS) is a lot less likely to work.
Note the last part. Even if you found some sneaky way to get another image URL to render properly in the website, your browser wouldn't load it because it violates the Content-Security-Policy header the site sent, and moreover, your browser would tattle about that to the reporting URL that GitHub provided.
So any image URL you provide will need to be readable by GitHub's image proxy and it won't be possible to serve different content to different users.

Related

AWS Signed Cookies in an IFrame?

Is it possible to support cookies in an IFrame that won't be broken by some of the recent cookie security improvements? The IFrame is embedded on arbitrary domains that we don't control. Other than the initial request URI being passed, we don't care about any special message passing or cross domain access.
Context: an app I've inherited serves authenticated S3 content in an IFrame to users. The content is proxied by CloudFront, leaning on their Signed Cookies feature to authenticate the initial HTML page, as well as every other asset (CSS, JS, images, etc.) that might be on the page. The cookie is generated/set after a successful auth handshake.
Recently, the move towards blocking third party cookies has broken this model. Users need to downgrade their security settings, and this will flat out stop working soon.
Short of a larger architectural change, is there a way to configure the cookies or CF to work within an IFrame, embedded on domains we don't control? My assumption is that this model is fundamentally broken now, but I wanted to triple check before reaching for a larger architectural change.
Thanks

React Router + AWS Backend, how to SEO

I am using React and React Router in my single page web application. Since I'm doing client side rendering, I'd like to serve all of my static files (HTML, CSS, JS) with a CDN. I'm using Amazon S3 to host the files and Amazon CloudFront as the CDN.
When the user requests /css/styles.css, the file exists so S3 serves it.
When the user requests /foo/bar, this is a dynamic URL so S3 adds a hashbang: /#!/foo/bar. This will serve index.html. On my client side I remove the hashbang so my URLs are pretty.
This all works great for 100% of my users.
All static files are served through a CDN
A dynamic URL will be routed to /#!/{...} which serves index.html (my single page application)
My client side removes the hashbang so the URLs are pretty again
The problem
The problem is that Google won't crawl my website. Here's why:
Google requests /
They see a bunch of links, e.g. to /foo/bar
Google requests /foo/bar
They get redirected to /#!/foo/bar (302 Found)
They remove the hashbang and request /
Why is the hashbang being removed? My app works great for 100% of my users so why do I need to redesign it in such a way just to get Google to crawl it properly? It's 2016, just follow the hashbang...
</rant>
Am I doing something wrong? Is there a better way to get S3 to serve index.html when it doesn't recognize the path?
Setting up a node server to handle these paths isn't the correct solution because that defeats the entire purpose of having a CDN.
In this thread Michael Jackson, top contributor to React Router, says "Thankfully hashbang is no longer in widespread use." How would you change my set up to not use the hashbang?
You can also check out this trick. You need to setup cloudfront distribution and then alter 404 behaviour in "Error Pages" section of your distribution. That way you can again domain.com/foo/bar links :)
I know this has been a few months old, but for anyone that came across the same problem, you can simply specify "index.html" as the error document in S3. Error document property can be found under bucket Properties => static Website Hosting => Enable website hosting.
Please keep in mind that, taking this approach means you will be responsible for handling Http errors like 404 in your own application along with other http errors.
The Hash bang is not recommended when you want to make SEO friendly website, even if its indexed in Google, the page will display only a little and thin content.
The best way to do your website is by using the latest trend and techniques which is "Progressive web enhancement" search for it on Google and you will find many articles about it.
Mainly you should do a separate link for each page, and when the user clicks on any page he will be redirected to this page using any effect you want or even if it single page website.
In this case, Google will have a unique link for each page and the user will have the fancy effect and the great UX.
EX:
Contact Us

removing cookies on another domain using mod-rewrite and apache

I have built a cookie consent module that is used on many sites, all using the same server architecture, on the same cluster. For the visitors of these sites it is possible to administer their cookie settings (eg. no advertising cookies, but allow analytics cookes) on a central domain that keeps track of the user preferences (and sites that are visited).
When they change their settings, all sites that the visitor has been to that are using my module (kept in cookie) are contacted by loading it with a parameter in hidden iframes. I tried the same with images.
On these sites a rewrite rule is in place that detects that parameter and then retracts the cookie (set the date in the past) and redirects to a page on the module site (or an image on the module site).
This scheme is working in all browsers, except IE, as it needs a P3P (Probably the reason why it is not working for images is similar).
I also tried loading a non-existent image on the source domain (that is, the domain that is using the module) through an image tag, obviously resulting in a 404. This works on all browsers, except Safari, which doesn't set cookies on 404's (at least, that is my conclusion).
My question is, how would it be possible to retract the cookie consent cookie on the connected domains, given that all I can change are the rewrite rules?
I hope that I have explained the problem well enough for you guys to give an answer, and that a solution is possible...
I am still not able to resolve this question, but when looked at it the other way around there is a solution. Using JSONP (for an example, see: Basic example of using .ajax() with JSONP?), the client domain can load information from the master server and compare that to local information.
Based on that, the client site can retract the cookie (or even replace it) and force a reload which will trigger the rewrite rules...
A drawback of this solution is that it will hit the server for every pageview, and in my case, that's a real problem. Only testing that every x minutes or so (by setting a temporary cookie) would provide a solution.
Another, even more simple solution would be to expire all the cookies on the client site every x hour. This will force a revisit of the main domain as well.

Is it possible to use both S3 Query String Authentication and HTTP caching?

I have the following requirements for a (Rails) web application that uses S3/Cloudfront for image storage:
A user may only see an image if they are logged in. If the user sends an image URL to a friend, it will not work.
If a user has seen an image, it should be cached by their browser, so they don't have to download it again.
…
Requirement 1 can solved with S3's Query String Authentication (QSA)
(e.g. with 30 second expiry).
Requirement 2 can be solved using HTTP
caching.
Is it possible to use them both together?
The challenge I'm facing is that QSA effectively changes the URL of the image after expiry, even though a perfectly good copy may reside in the browser cache.

Images on SSL enabled site with Internet explorer

I have a problem with my site after implementation of SSL that images do not appear. The scenario is that images come from images.domain.com (hosted on Amazon S3) and my certificate is for www.domain.com.
This problem only seems to happen in IE and not in any other browsers.
The issue is related to "mixed content" - HTTPS pages which have HTTP resources (images, scripts, etc) embedded.
The point of using HTTPS is to ensure that only the originating server and the client have access to the secured page. However, in theory it might be possible for this security to be compromised if HTTP resources are embedded - a server might intercept an unsecured javascript file and inject some code to alter the secured page onload.
Most browsers will indicate that a secure page has mixed content by altering the "secure lock" icon, either by showing the lock as open or broken, or by making the icon red (Chrome displayed a skull and crossbones for a short time, but they realised that this was a bit serious for the potential threat level).
Internet Explorer (depending on the version) will display a message either asking whether the insecure content should be shown (IE<=7), or whether only the secure content should be shown (IE>=8). It sounds like you have somehow disabled this message to always hide the insecure content, however that's not the default behaviour.
I think the best solution for you is to replace your S3 links with HTTPS versions.
I am not a web developer, but someone who often deals with the crap experience that is IE. I am not sure what version you are using, but you do not have a wildcard SSL cert (i.e. *.domain.com), so does it have something to do with an old-school limitation in 3rd party images?
See here for what I allude to above and a very good explanation of how IE caches cross-domain HTTPS content, specifically images. I am not sure what the solution is, but I was curious so I researched a little myself and this might help.