How can I identify web requests created when someone links to my site from Google+? - http-headers

To comply with new EU legislation regarding cookies, we've had to implement cookie 'warning' banners in several places on our site. We're now seeing problems when users try and link/embed content from our site on Facebook, Google+, bit.ly and so on - the embedded page thumbnail is showing the cookie notification banner instead of a preview of the actual page.
Most of these sites use a distinctive user agent string, so we can easily identify their incoming requests and disable the cookie banner - but Google+ seems to identify itself as
Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0
which makes it very difficult to distinguish between Google+ automated requests and browsing traffic from 'real' users. Is there any way at all I can identify these requests? Custom HTTP headers I can look for or anything?

Google+ does have its own user agent string. It contains the text: Google (+https://developers.google.com/+/web/snippet/).
Ref: https://developers.google.com/+/web/snippet/#faq-snippet-useragent
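The question doesn't say what the site's backend is, but if it happens to run on Node with Express, a minimal sketch of suppressing the banner for that fetcher could look like this (the showCookieBanner flag and the route are assumptions about your own app, not anything Google-specific):

import express from "express";

const app = express();

// The Google+ snippet fetcher identifies itself with this documented substring.
const GPLUS_SNIPPET = "Google (+https://developers.google.com/+/web/snippet/)";

app.use((req, res, next) => {
  const ua = req.headers["user-agent"] || "";
  // Expose a flag that page templates can consult before rendering the cookie banner.
  res.locals.showCookieBanner = !ua.includes(GPLUS_SNIPPET);
  next();
});

app.get("/", (req, res) => {
  res.send(res.locals.showCookieBanner
    ? "<!-- page with cookie banner -->"
    : "<!-- page without cookie banner -->");
});

app.listen(3000);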

There are not any HTTP headers that you can depend on for detecting the +1 button page fetcher, but there is a feature request for it to send a distinctive user agent. In the meantime, I would not depend on the user agent.
You can, however, use other means to configure the snippet of content that appears on Google+. If you add structured markup such as schema.org or OpenGraph to your pages, the +1 button will pull the snippet from those tags. Google+ provides a configuration tool and docs to help you design your markup.
If you add schema.org markup it might look something like this:
<body itemscope itemtype="http://schema.org/Product">
<h1 itemprop="name">Shiny Trinket</h1>
<img itemprop="image" src="http://example.com/trinket.jpg" />
<p itemprop="description">Shiny trinkets are shiny.</p>
</body>
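If you prefer Open Graph, the equivalent meta tags (using the same placeholder values as above) would look something like this:

<head>
  <meta property="og:title" content="Shiny Trinket" />
  <meta property="og:image" content="http://example.com/trinket.jpg" />
  <meta property="og:description" content="Shiny trinkets are shiny." />
</head>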

Related

How to add dynamic meta tags to website with no middleware or SSR

I have a relatively large app with a lot of user profile pages. I want to make it so that if you share one of the user's profile pages, it will preview their name and picture on social media like FB and Twitter (think sharing a Twitch streamer's page on Twitter). I used create-react-app to start the project, so I don't have server-side rendering or any middleware for pre-rendering tools. Is there another way I can accomplish this?
There are two ways you can get this to work:
First, serve your files via an Express server and check who made the request by inspecting the user-agent header; if it's a bot, then instead of sending the usual response, fetch the required user profile data, use it to populate the Open Graph meta tags, and return HTML containing those tags.
Second, use a network interceptor at the CDN you're using to identify who is requesting the page (a bot or a person); if it's a bot, make a request to your backend to fetch the related data and return HTML with the populated meta tags.
Explained approach
Every time a request comes into our server, it carries a user-agent header that tells the server who is requesting the resource (a human, or a bot from Facebook trying to build a link preview). We can compare it against a list of known bot user agents (this won't catch every crawler, but it covers all the well-known platforms and roughly 90% of the rest).
Let's say we have something.com where we want the link preview, and a request comes in for something.com/john. We check the user-agent header of that request: if it's a human, they get our normal site, but if it's a bot (which just wants HTML for the link preview) then, since it's our server, we can grab John's data, set the proper meta tags inside our HTML, and send that back as the response.
So whenever a human goes to something.com/john, they get our landing page, since what they care about is what they see in the browser; but when a bot comes in, we send an HTML response with the proper meta tags, since the link preview is all the bot cares about.
This can be done on our Express server with something like the sketch below, but it can also be done at the infrastructure level.
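A minimal sketch of that Express approach (the bot list, the /:username route, and fetchProfile are all hypothetical stand-ins for your own app):

import path from "node:path";
import express from "express";

const app = express();

// Substrings that identify common link-preview crawlers; extend as needed.
const BOT_AGENTS = ["facebookexternalhit", "twitterbot", "linkedinbot", "slackbot", "discordbot"];

const isBot = (userAgent = "") =>
  BOT_AGENTS.some((bot) => userAgent.toLowerCase().includes(bot));

// Stand-in for however you actually load profile data.
async function fetchProfile(username: string) {
  return { username, name: "Example User", avatarUrl: "https://something.com/avatars/example.png" };
}

app.get("/:username", async (req, res, next) => {
  if (!isBot(req.headers["user-agent"])) {
    return next(); // real visitor: fall through to the normal static/SPA handlers below
  }
  const profile = await fetchProfile(req.params.username);
  // Values should be escaped properly in real code.
  res.send(`<!DOCTYPE html>
<html>
  <head>
    <meta property="og:title" content="${profile.name}" />
    <meta property="og:image" content="${profile.avatarUrl}" />
    <meta property="og:url" content="https://something.com/${profile.username}" />
  </head>
  <body></body>
</html>`);
});

app.use(express.static("build")); // serve the create-react-app build for everyone else
// Fallback so human visitors still get the single-page app for client-side routes.
app.get("*", (_req, res) => res.sendFile(path.resolve("build", "index.html")));

app.listen(3000);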

Firefox: "Some parts of this page are not secure, such as images." What counts as insecure?

I use the browser Firefox, and sometimes, on certain webpages, the SSL icon says "Some parts of this page are not secure, such as images." What, exactly, counts as an insecure element?
Thanks!
Anything that is delivered over an insecure channel.
What this generally means is that the developer of the web page is combining HTTP-based URLs with HTTPS-based URLs in the same page. The URLs could be for images as well as JavaScript, CSS, or anything else that can be referenced from a web page. As a user, there's not much you can do about this -- it's a warning that there is a possibility that your data could be delivered to other servers in an open, unencrypted manner over the Internet. This is a Bad Thing, but you can't do much except avoid that site, or contact the support or webmaster for the site.
If you're the developer, most of the time you can use scheme-relative URLs when referencing images, JavaScript, etc.
i.e. Instead of this:
<img src="http://example.com/dot.png">
use this:
<img src="//example.com/dot.png">
YMMV.
See also: https://url.spec.whatwg.org/
In Firefox you can check this under Inspect Element => Network tab => Domain column.
Also check the Console tab.
I hope this solves your problem.
"Insecure" simply means "not loaded via HTTPS".
This is insecure:
<img class="media-object" src="http://placehold.it/50x50">
This is secure:
<img class="media-object" src="https://placehold.it/50x50">
"Some parts of this page are not secure, such as images."
means not all content are are loading with secure https you can use this online tool to determine which resource is loading with http whynopadlock
This is my solution:
To resolve this issue, make sure that the page code does not pull data directly from a non-secure URL.
View the page's HTML source code to check for non-secure items. This can be done in a web browser by right-clicking and selecting 'View source'.
To identify non-secure elements, view the source code of the page and search for the text src="http://
This will highlight the elements on your page being loaded from a non-secure URL.
The source code (HTML) needs to be checked for non-secure tags (e.g. http://www.symantec.com/images/seals/Secure...). Ensure such references are changed to HTTPS or a virtual directory.
Note: The webmaster should always be consulted prior to any adjustments made to a web site.
Thank you, and I hope this helps.
For WordPress users who can't find any 'http:' in the page source: check whether you have a favicon set. WordPress will default to its W icon (w-logo-blue.png), and I've had a couple of sites continue to serve it over http even after fully converting to SSL.
Dashboard -> Appearance -> Customize -> Site Identity -> add a site icon
FOR IMAGES
First of all, check whether your image URLs use HTTP instead of HTTPS; if so, change them to HTTPS, or save those images and serve them from your own server.
For instance, change
<img src="http://example.com/images/image.jpg"> (http image source)
to
<img src="https://example.com/images/image.jpg"> (https image source) or to
<img src="//example.com/images/image.jpg"> (protocol-relative image source)
Firefox throws an error when you have mixed active content: a combination of HTTP and HTTPS requests on one page. It's a security issue because it leaves room for a man-in-the-middle attack, where HTTP content requests are intercepted and replaced with malicious or unwanted responses.
Tip: check all the following URLs in your active content:
<script>
<link>
<iframe>
XMLHttpRequest
fetch()
URLs in CSS (@font-face, cursor, background-image)
<object>
Navigator.sendBeacon (look for the url)
Another tip: make sure to check all files (CSS gets overlooked).

W3C validator and HTTPS

I'm currently having trouble with the W3C markup validation service https://validator.w3.org and the use of HTTPS. When I type in there the website address with https I get the following response:
Sorry! This document cannot be checked.
Together with an error 500 saying that it can't connect to the site. Also, on the website I have a link that takes the visitor to the validator and shows that the site has been validated. When clicking the link without HTTPS everything works, but with HTTPS I get the message
Sorry! This document cannot be checked. No Referer header found!
which I believe is because the secure connection doesn't send the Referer header, right?
Now, how can I use HTTPS and avoid these problems with the validation?
Please always directly use https://validator.w3.org/nu/ (the current W3C HTML Checker) instead of https://validator.w3.org/ (the legacy W3C Markup Validator).
The HTML Checker is able to check documents at https URLs just fine. So if you find an https site that it doesn't work with as expected, that's likely a bug I need to fix. (I maintain the checker, and recently updated it to get HTTPS support using HTTP Components HttpClient 4.4, the latest Apache HTTP client library, including full support for HTTPS sites that use SNI.)
A note about which W3C tool to use for checking HTML documents
On the W3C backend, when you use the https://validator.w3.org/ legacy Markup Validator to check documents with <!DOCTYPE html> doctypes, it just hands off the request to the same backend that directly drives the https://validator.w3.org/nu/ HTML Checker. But the HTML Checker has a UI with more features, and using it from https://validator.w3.org/nu/ is faster.
We (the W3C) plan to swap those two around eventually, that is, move the current HTML Checker to https://validator.w3.org/ and move the legacy Markup Validator to https://validator.w3.org/legacy/ or some such, but it will be a while yet before that happens. So in the meantime, as I said, I suggest just doing all your HTML checking from the https://validator.w3.org/nu/ site.
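If you want to script your checking, the Nu HTML Checker also exposes a web service that works over HTTPS; a small sketch (assuming its documented doc and out=json query parameters) might be:

// Ask the Nu HTML Checker for a JSON report on a page and print each message.
async function checkPage(target: string) {
  const checkerUrl = "https://validator.w3.org/nu/?out=json&doc=" + encodeURIComponent(target);
  const response = await fetch(checkerUrl);
  const report = await response.json();
  for (const message of report.messages) {
    console.log(`${message.type}: ${message.message}`);
  }
}

checkPage("https://example.com/");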
There seems to be a bug in the W3C Nu validator, where the "referer" value is not processed fully. :-/
I.e. the code for their badge <a target="_blank" href="http://validator.w3.org/check/referer"><img src="http://www.w3.org/Icons/valid-xhtml10" alt="Valid XHTML 1.0 Transitional" title="Valid XHTML 1.0 Transitional" style="height: 31px; width: 88px;" /></a>
does not validate the nested sub-page when I click the badge in the footer of that deep sub-page; it just validates the root page of the whole web site instead. Sad. :-/
The same happens with the alternative parameterized .../check?uri=referer URL: still the same issue. :-/

Who knows which files should be included in a website?

When the browser requests a website, any website, from an HTTP server, which of the two parses the site's content in order to know which other files need to be included in the webpage?
What I mean is this:
the browser asks for the HTML file, then observes that it needs to import some external CSS files, and it is the one that requests them.
OR
the HTTP server, when faced with a request for a website, parses (or already knows) which files need to be linked to a certain webpage and sends them along with the HTML page?
I'm guessing the first case is the correct one, but if someone can confirm and maybe clarify it, I'd appreciate it.
It's all done by the client (which is usually a browser). When it sees <script>, <iframe>, <img>, <link>, etc. tags that reference other documents, it downloads them if necessary.
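For example, with a page like the one below, the browser makes one request for the HTML and then a separate request for each referenced resource (file names here are just placeholders):

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="/css/site.css">  <!-- second request -->
    <script src="/js/app.js"></script>            <!-- third request -->
  </head>
  <body>
    <img src="/images/logo.png">                  <!-- fourth request -->
  </body>
</html>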
According to Wikipedia:
The primary function of a web server is to cater web page to the request of clients using the Hypertext Transfer Protocol (HTTP). This means delivery of HTML documents and any additional content that may be included by a document, such as images, style sheets and scripts.
and
The primary purpose of a web browser is to bring information resources to the user ("retrieval" or "fetching"), allowing them to view the information ("display", "rendering"), and then access other information ("navigation", "following links").
It is the browser that parses the HTML and requests the associated content.

Why do I have both HTTPS and HTTP links on site, need them all secure!

I am getting the security alert: "You are about to be directed to a connection that is not secure. The information you are sending to the current site might be transmitted to a non-secure site. Do you wish to continue?" when I try to log in as a customer on my client's osCommerce website. I noticed the link in the status bar goes from an https prefix to a non-secure http prefix. The site has an SSL certificate, so how do I ensure the entire store portion of the site is directed to the secured site?
It is likely that some parts of the page, most often images or scripts, are loaded non-secure. You'll need to go through them in the browser's "view page source" view one by one and eliminate the reason (most often, a configuration setting pointing to http://).
Some external tools like Google Analytics that you may be embedding on your site can be included through https://, some don't. In that case, you may have to remove those tools from your secure site.
If you can't switch all the settings, try using relative paths
<img src="/images/shop/xyz.gif">
but the first thing is to identify the non-secure elements using the source code view of your browser.
An immediate redirection from an https:// page to an http:// one would not result in a warning like the one you describe. Can you specify what's going on there?
Use Fiddler and browse your site, in the listing it should become evident what is using HTTP and HTTPS.
Ensure that the following are included over https:
css files
js files
embedded media (images, videos)
If you're confident none of your own stuff is included over http, check things like tracking pixels and other third-party gadgets.
Edit: Now that you've linked your page, I see that your <base> tag is the problem:
<base href="http://balancedecosolutions.com/products//catalog/">
Change to:
<base href="https://balancedecosolutions.com/products//catalog/">
If the suggestion from Pekka doesn't suit your needs, you can try using relative links based on the scheme (http or https):
e.g.,
<a href="//example.com/page.html">I am a 100% valid link!</a>
The only problem with this technique is that it doesn't work with CSS files in all browsers, though it does work within JavaScript and inline CSS. (I could be wrong here; anyone want to check?)
e.g., the following:
<link rel="stylesheet" href="/css/mycss.css" />
<!-- mycss.css contents: -->
...
body{
background-image:url(//static.example.com/background.png);
}
...
...might fail.
A simple find/replace on your source code could make this easy.
It sounds to me like the HTML form you are submitting is hardcoded to post to a non-secure page.