hashHistory, _escaped_fragment_, and Google

hashHistory, _escaped_fragment_, and Google - seo

I've been working with React Router for some time now and I've been using hashHistory to handle routing. At some point I am going to transition the app to browserHistory, but I'm curious as to why Google's "Fetch as Google" feature does not appear to work for anything other than the root route (/). It's clear that it's rendering something, just not the routes not handled by the server. (Image below)
I see that Google has deprecated their AJAX crawling scheme, which leads me to believe that I no longer need to deal with ?_escaped_fragment_=, but even so, I cannot get Google to render any other routes.
For example, the site is www.learnphoenix.io and the lessons are listed under www.learnphoenix.io/#/phoenix-chat/lessons. Yet, Google's Fetch as Google feature in webmaster redirects to the homepage and only renders the homepage. Using _escaped_fragment_ leads to the same result.
Is there a way to allow Google to index my site using hashHistory, or do I just have to accept that only my homepage will be indexed until I switch to browserHistory?

By default, it seems that google ignores url fragments (#). According to this article, which may be dated, using #! will tell google that the fragments can be used to define different canonical pages.
https://www.oho.com/blog/explained-60-seconds-hash-symbols-urls-and-seo
It's worth a shot, though Hashbang isn't supported by ReactRouter, again, because its supposed to be deprecated.
A better option might be to just bite the bullet and use browserHistory (pushState) in your react-router. The problem with that is if you're running a static server-less app, paths like /phoenix-chat/lessons will return a 404. In AWS, there is a hack around that too. Setting your 404 page to be your app index page.
http://blog.boushley.net/2015/10/29/html5-deep-link-on-amazon-s3/
Feels dirty, but again, worth a shot. Hopefully there's something of value in this answer for you!

Related

Using Shopify 404 page for content

Shopify is quite restrictive about the ways that you can structure directories. For example all pages must have a url which looks like "my-store.com/pages/my-page".
Whilst there is no way around this in Shopify, I considered a workaround which would work like this.
Use javascript to check the URL queried when displaying the 404 page.
If URL queried = "my-url" connect to Wordpress Rest or graph QL API, query and then render desired content on the page.
For example, my-site.com/blog would return a 404 error, however javascript would run a function to get content when the URL ends in "/blog".
Although this would work from a technical point of view, I understand the server would still be giving a 404 error and this probably has wider implications? To what extent is this the case and is this an unviable solution?

A really interesting idea.
The biggest issue I see will be SEO, since the URLS will still points to the 404 page and you won't be able to show the proper content with liquid, all of the pages will pull the 404 content and show as 404 pages in the google search.
That said I don't see any other major issues that will prevent you to use this with JS. It depends really how many type of pages will require this logic and how the JS logic is written, but as an idea I really like the possibility of it.
I will probably not recommend it to a client that wants a SEO optimized site, but for a personal one it seems like an interesting idea.

What should be located at the homepage of a REST API?

I'm currently in the process of writing a REST API and this question always seems to popup.
I've always just added a description, quick links to docs, server time etc, but see now (after looking around a bit) that a simple redirect to the API docs would be even better.
My question is what would be the accepted norm to have as the root - '/' - "homepage" of your API?
I've been looking at a few implementations:
Facebook: Just gives a error of "Unsupported get request.";
Twitter: Shows an actual 404 page;
StackOverflow: Redirect to quick "usage" page.
After looking at those it's clear everyone is doing it differently.
In the bigger picture this is of little significance but would be interesting to see what the "RESTfull" way of doing it (if there is one) might be.

Others have had the same question and as you discovered yourself everyone is doing it their own way. There is a move in this direction to somehow standardize it, so see if you find this draft useful:
Home Documents for HTTP APIs aka JSON Home.

I've give this much thought and right now I either return a 404 page, a health status page, a dummy page or redirect to another page, mostly likely on within the organization.
An API homepage isn't something everyone should be looking at and believe me, it can be found. There are more people like me that love to inspect the browser and see how a website is performing.

Facebook App in Page Tab receiving signed_request but missing page data

I have a page tab app that I am hosting. I have both http and https supported. While I receive a signed_request package as expected, after I decode it does not contain page information. That data is simply missing.
I verified that like schemes are being used (https) among facebook, my hosted site and even the 'go between'-- facebook's static page handler.
Also created a new application with page tab support but got the same results-- simply no page information in the signed_request.
Any other causes people can think of?
I add the app to the page tab using this link:
https://www.facebook.com/dialog/pagetab?app_id=176236832519816&next=https://www.intelligantt.com/Facebook/application.html
Here is the page tab I am using (Note: requires permissions):
https://www.facebook.com/pages/School-Auction-Test-2/154869721351873?id=154869721351873&sk=app_176236832519816
Here is the decoded signed_request I am receiving:
{"algorithm":"HMAC-SHA256","code":!REMOVED!,"issued_at":1369384264,"user_id":"1218470256"}
5/25 Update - I thought maybe the canvas app urls didn't match the page tab urls so I spent several hours going through scenarios where they both had a trailing slash or not. Where they both had a trailing ? or not, with query parameters or not.
I also tried changing the 'next' value when creating the page tab to the canvas app url and the page tab url.
No success on either count.
I did read where because I'm seeing the 'code' value in the signed_request it means Facebook either couldn't match my urls or that I'm capturing the second request. However, I given all the URL permutations I went through I believe the urls match. I also subscribed to the 'auth.authResponseChange' which should give me the very first authResponse that should contain the signed_request with page.id in it (but doesn't).
If I had any reputation, I'd add a bounty to this.
Thanks.

I've just spent ~5 hours on this exact same problem and posted a prior answer that was incorrect. Here's the deal:
As you pointed out, signed_request appears to be missing the page data if your tab is implemented in pure javascript as a static html page (with *.htm extension).
I repeated the exact same test, on the exact same page, but wrapped my html page (including js) within a Perl script (with *.cgi extension)... and voila, signed_request has the page info.
Although confusing (and should be better documented as a design choice by Facebook), this may make some sense because it would be impossible to validate the signed_request wholly within Javascript without placing your secretkey within the scope (and therefore revealing it to a potential hacker).

It would be much easier with the PHP SDK, but if you just want to use JavaScript, maybe this will help:
Facebook Registration - Reading the data/signed request with Javascript
Also, you may want to check out this: https://github.com/diulama/js-facebook-signed-request

simply you can't get the full params with the javascript signed_request, use the php sdk to get the full signed_request . and record the values you need into javascript variabls ...
with the php sdk after instanciation ... use the facebook object as following.
$signed_request = $facebook->getSignedRequest();
var_dump($signed_request) ;
this is just to debug but u'll see that the printed array will contain many values that u won't get with js sdk for security reasons.
hope that helped better anyone who would need it, cz it seems this issue takes at the min 3 hours for everyone who runs into.

How to use regular urls without the hash symbol in spine.js?

I'm trying to achieve urls in the form of http://localhost:9294/users instead of http://localhost:9294/#/users
This seems possible according to the documentation but I haven't been able to get this working for "bookmarkable" urls.
To clarify, browsing directly to http://localhost:9294/users gives a 404 "Not found: /users"

You can turn on HTML5 History support in Spine like this:
Spine.Route.setup(history: true)
By passing the history: true argument to Spine.Route.setup() that will enable the fancy URLs without hash.
The documentation for this is actually buried a bit, but it's here (second to last section): http://spinejs.com/docs/routing
EDIT:
In order to have urls that can be navigated to directly, you will have to do this "server" side. For example, with Rails, you would have to build a way to take the parameter of the url (in this case "/users"), and pass it to Spine accordingly. Here is an excerpt from the Spine docs:
However, there are some things you need to be aware of when using the
History API. Firstly, every URL you send to navigate() needs to have a
real HTML representation. Although the browser won't request the new
URL at that point, it will be requested if the page is subsequently
reloaded. In other words you can't make up arbitrary URLs, like you
can with hash fragments; every URL passed to the API needs to exist.
One way of implementing this is with server side support.
When browsers request a URL (expecting a HTML response) you first make
sure on server-side that the endpoint exists and is valid. Then you
can just serve up the main application, which will read the URL,
invoking the appropriate routes. For example, let's say your user
navigates to http://example.com/users/1. On the server-side, you check
that the URL /users/1 is valid, and that the User record with an ID of
1 exists. Then you can go ahead and just serve up the JavaScript
application.
The caveat to this approach is that it doesn't give search engine
crawlers any real content. If you want your application to be
crawl-able, you'll have to detect crawler bot requests, and serve them
a 'parallel universe of content'. That is beyond the scope of this
documentation though.
It's definitely a good bit of effort to get this working properly, but it CAN be done. It's not possible to give you a specific answer without knowing the stack you're working with.

I used the following rewrites as explained in this article.
http://www.josscrowcroft.com/2012/code/htaccess-for-html5-history-pushstate-url-routing/

How to force google to show my first page from a page set with pagination?

I have a website and in my website I have, for example, a list of Audi models. I saw, using google webmaster tools, that my website appears in the google search by the word audi, but the target page was the 22nd page from my result set, not the first. I need my first page to appead, not my last (or middle), but I cannot tell google that this is a parameter, because my URLs are rewritten using mod rewrite. Any ideas?
BTW, I have read in a SEO forum, that it's a bad idea to use a cannonical tag. So is it really a bad idea in my case?

You can't force Google to do anything, however, they have made it easier to deal with pagination issues with a recent post on rel="next" and rel="prev".
But the primary problem you face is signalling to Google that your first (main) page is the starting point - this is achieved using internal link and back-link "juice" focussed on that page. You need to ensure that the first page of results is linked to properly from higher-value pages (like the home-page).

Google recently announced that you can use View All which will allow them to find and index entire articles that are normally broken up using pagination and display them all as one result.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas