Using Shopify 404 page for content

Shopify is quite restrictive about the ways you can structure directories. For example, all pages must have a URL that looks like "my-store.com/pages/my-page".
Whilst there is no way around this in Shopify, I considered a workaround that would work like this:
Use javascript to check the URL queried when displaying the 404 page.
If the queried URL matches "my-url", connect to the WordPress REST or GraphQL API, fetch the relevant content, and render it on the page.
For example, my-site.com/blog would return a 404 error; however, JavaScript would run a function to fetch content whenever the URL ends in "/blog".
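Roughly, the script on the 404 template might look something like this (the WordPress domain and the "content" container are placeholders I would swap for the real ones):

document.addEventListener('DOMContentLoaded', function () {
  // Only run the workaround for the paths we care about.
  if (window.location.pathname === '/blog') {
    // Placeholder WordPress REST endpoint; the real domain would differ.
    fetch('https://blog.example.com/wp-json/wp/v2/posts?per_page=5')
      .then(function (response) { return response.json(); })
      .then(function (posts) {
        // Placeholder element added to the 404.liquid template.
        var container = document.getElementById('content');
        container.innerHTML = posts.map(function (post) {
          return '<article><h2>' + post.title.rendered + '</h2>' + post.excerpt.rendered + '</article>';
        }).join('');
      })
      .catch(function (error) { console.error('Could not load blog content', error); });
  }
});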
Although this would work from a technical point of view, I understand the server would still be returning a 404 status, and this probably has wider implications. To what extent is this the case, and does it make this an unviable solution?

A really interesting idea.
The biggest issue I see is SEO: the URLs will still point to the 404 page, and since you won't be able to render the proper content with Liquid, all of these pages will serve the 404 content and show up as 404 pages in Google search.
That said, I don't see any other major issues that would prevent you from doing this with JS. It really depends on how many types of pages will require this logic and how the JS is written, but as an idea I really like it.
I would probably not recommend it to a client who wants an SEO-optimized site, but for a personal site it seems like an interesting idea.

Related

hashHistory, _escaped_fragment_, and Google

I've been working with React Router for some time now and I've been using hashHistory to handle routing. At some point I am going to transition the app to browserHistory, but I'm curious as to why Google's "Fetch as Google" feature does not appear to work for anything other than the root route (/). It's clear that it's rendering something, just not the routes not handled by the server.
I see that Google has deprecated their AJAX crawling scheme, which leads me to believe that I no longer need to deal with ?_escaped_fragment_=, but even so, I cannot get Google to render any other routes.
For example, the site is www.learnphoenix.io and the lessons are listed under www.learnphoenix.io/#/phoenix-chat/lessons. Yet Google's Fetch as Google feature in Webmaster Tools redirects to the homepage and only renders the homepage. Using _escaped_fragment_ leads to the same result.
Is there a way to allow Google to index my site using hashHistory, or do I just have to accept that only my homepage will be indexed until I switch to browserHistory?
By default, it seems that Google ignores URL fragments (#). According to this article, which may be dated, using #! will tell Google that the fragments can be used to define different canonical pages.
https://www.oho.com/blog/explained-60-seconds-hash-symbols-urls-and-seo
It's worth a shot, though the hashbang isn't supported by React Router, again because it's supposed to be deprecated.
A better option might be to just bite the bullet and use browserHistory (pushState) in your react-router setup. The problem with that is that if you're running a static, serverless app, paths like /phoenix-chat/lessons will return a 404. On AWS there is a hack around that too: setting your 404/error page to be your app's index page.
http://blog.boushley.net/2015/10/29/html5-deep-link-on-amazon-s3/
Feels dirty, but again, worth a shot. Hopefully there's something of value in this answer for you!
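For what it's worth, the browserHistory setup with the react-router 2/3 API looks roughly like this (the component names are placeholders):

import React from 'react';
import { render } from 'react-dom';
import { Router, Route, browserHistory } from 'react-router';
import App from './App';        // placeholder components
import Lessons from './Lessons';

render(
  <Router history={browserHistory}>
    <Route path="/" component={App} />
    <Route path="/phoenix-chat/lessons" component={Lessons} />
  </Router>,
  document.getElementById('root')
);

On S3, the hack from the article above amounts to pointing the static-website-hosting error document at index.html, so deep links fall through to the app and the router takes over.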

What should be located at the homepage of a REST API?

I'm currently in the process of writing a REST API and this question always seems to popup.
I've always just added a description, quick links to the docs, the server time, etc., but I see now (after looking around a bit) that a simple redirect to the API docs would be even better.
My question is what would be the accepted norm to have as the root - '/' - "homepage" of your API?
I've been looking at a few implementations:
Facebook: Just gives an error of "Unsupported get request.";
Twitter: Shows an actual 404 page;
StackOverflow: Redirect to quick "usage" page.
After looking at those it's clear everyone is doing it differently.
In the bigger picture this is of little significance, but it would be interesting to see what the "RESTful" way of doing it (if there is one) might be.
Others have had the same question, and as you discovered yourself, everyone is doing it their own way. There is a move in this direction to somehow standardize it, so see if you find this draft useful:
Home Documents for HTTP APIs aka JSON Home.
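To give a rough idea of the shape (not taken verbatim from the draft, whose field names have changed between revisions), the root can return a small machine-readable document that points clients at the docs; in Node/Express a sketch might be:

// Illustrative only: a root endpoint loosely modelled on the "home document" idea.
const express = require('express');
const app = express();

app.get('/', (req, res) => {
  res.json({
    api: {
      title: 'Example API',
      links: {
        documentation: 'https://api.example.com/docs',   // placeholder URLs
        status: 'https://api.example.com/health'
      }
    }
  });
});

app.listen(3000);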
I've given this much thought, and right now I either return a 404 page, a health-status page, or a dummy page, or redirect to another page, most likely one within the organization.
An API homepage isn't something everyone should be looking at, and believe me, it can be found. There are plenty of people like me who love to open the browser's inspector and see how a website is performing.

Block google from indexing some pages from site

I have a problem with lots of 404 errors on one site. I figured out that these errors are happening because Google is trying to find pages that no longer exist.
Now I need to tell Google not to index those pages again.
I found some solutions on the internet about using the robots.txt file. But this is not a site that I built; I just need to fix those errors. The thing is, those pages are generated. They do not physically exist in that form, so I cannot add anything in the PHP code.
And I am not quite sure how to add those URLs to robots.txt.
When I just write:
User-agent: *
noindex: /objekten/anzeigen/haus_antea/5-0000001575*
and hit the test button in Webmaster Tools,
I get this from Googlebot:
Allowed
Detected as a directory; specific files may have different restrictions
And I do not know what that means.
I am new to this kind of stuff, so please keep your answer as simple as possible.
Sorry for my bad English.
I think Google will automatically remove pages that return a 404 error from its index. Google will not display these pages in the results, so you don't need to worry about that.
Just make sure that these pages are not linked from other pages. If they are, Google may try to index them from time to time. In that case you should return a 301 (moved permanently) and redirect to the correct URL. Google will follow the 301 redirects and use the redirected URL instead.
A robots.txt entry is only necessary if you want to remove pages that are already in the search results. But I think pages with a 404 status code will not be displayed there anyway.
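For completeness: if you do ever want to block crawling of those generated URLs, the standard robots.txt directive is Disallow (the noindex line in your example is not part of the robots.txt standard). For instance, adjusting the path so it matches only the generated pages:

User-agent: *
Disallow: /objekten/anzeigen/haus_antea/

Keep in mind that Disallow only stops crawling; on its own it does not remove URLs that are already in the index.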

SEO and 404 error redirection

Recently I removed some products and categories from my Magento store, and that generated many 404 errors, as the pages were in the search index and now no longer exist.
I was thinking about developing/using a module that intercepts requests that would otherwise return a 404 and uses keywords from the request URL to build a search query on the website, so that customers don't get stopped by a dead link.
But the question is:
Will that kill my SEO?
How does Google, for instance, cope with this kind of 404 suppression?
Has anyone else encountered this problem and tried something like this?
Since the operation will take quite some time, I would like some feedback before going into this road.
As of now, I only know that redirecting 404 errors to the homepage or another page is bad, as it keeps dead links alive; but will redirecting based on some criteria have the same "zombifying" effect?
In their SEO guide, Google recommends building a nice custom 404 page for your case (do not forget to return a 404 status code).
To quote loosely: "A good custom 404 page will help people find the information they're looking for, as well as providing other helpful content and encouraging them to explore your site further."
Google's recommendations about 404 pages are available here.
Also do not forget to check their starter SEO guide in the "Best Practices" chapter.
I would just try to follow the recommendations as close as possible.
Good luck!
Since a solution has already been mentioned here, I would just like to add a few words:
you don't need to worry about the number of 404 pages shown in Webmaster Tools, because it is natural to create and delete web pages. All you need to do is set up a proper custom 404 page, so that when users land on a page that is no longer available they can easily navigate to the similar content they were looking for on your website.
This makes for a good user experience, and Google loves sites that cater to their visitors in the best manner.

How Can I Deal With Those Dead Links After Revamping My Web Site?

A couple of months ago, we revamped our website. We adopted a totally new site structure and, specifically, merged several pages into one. Everything looks charming.
However, there are lots of dead links which produce a large number of 404 errors.
So what can I do about it? If I leave it alone, could it bite back someday, say by eating up my PageRank?
One basic option is using 301 redirects; however, that is almost impossible considering the number of dead links.
So is there any workaround? Thanks for your consideration!
301 is an excellent idea.
Consider that you can take advantage of global configuration to map a whole group of pages; you don't necessarily need to write one redirect for every 404.
For example, if you removed the http://example.org/foo folder, using Apache you can write the following configuration
RedirectMatch 301 ^/foo/(.*)$ http://example.org/
to catch all 404s generated by the removed folder.
Also, consider redirecting selectively. You can use Google Webmaster Tools to check which 404 URIs are receiving the highest number of inbound links and create redirect rules only for those.
Chances are the number of redirection rules you need to create will decrease drastically.
301 is definitely the correct route to go down to preserve your page rank.
Alternatively, you could catch 404 errors and redirect either to a "This content has moved" type page, or your home page. If you do this I would still recommend cherry picking busy pages and important content and setting up 301s for these - then you can preserve PR on your most important content, and deal gracefully with the rest of the dead links...
I agree with the other posts - using mod_rewrite you can remap URLs and return 301s. Note - it's possible to call an external program or database with mod_rewrite - so there's a lot you can do there.
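For example (the file paths here are made up), a txt-type RewriteMap defined in the server or vhost config can drive the 301s from a simple lookup file; the prg: and dbd: map types let you plug in an external program or a database instead:

# httpd.conf / vhost context (RewriteMap is not allowed in .htaccess)
RewriteEngine On
RewriteMap legacymap txt:/etc/apache2/legacy-urls.map
RewriteCond ${legacymap:%{REQUEST_URI}} !=""
RewriteRule ^ ${legacymap:%{REQUEST_URI}} [R=301,L]

# /etc/apache2/legacy-urls.map -- "old-path new-path", one pair per line
/old/about-team.html /about
/old/all-products.html /products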
If your new and old site don't follow any remappable pattern, then I suggest you make your 404 page as useful as possible. Google has a widget which will suggest the page the user is probably looking for. This works well once Google has spidered your new site.
Along with the other 301 suggestions, you could also split the requested URL string into a search string and route it to your default search page (if you have one), passing those keywords automatically to the search.
For example, if someone tries to visit http://example.com/2009/01/new-years-was-a-blast, this would route to your search page and automatically search for "new years was a blast" returning the best result for those key words and hopefully your most relevant article.
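A minimal client-side sketch of that idea, to drop on the 404 template (assuming the site exposes a search page at /search?q=..., which is a placeholder; a server-side route could do the same without the extra hop):

(function () {
  // Turn the last path segment of the dead URL into a search query,
  // e.g. "/2009/01/new-years-was-a-blast" -> "new years was a blast".
  var slug = window.location.pathname.split('/').filter(Boolean).pop() || '';
  var query = slug.replace(/[-_]+/g, ' ').trim();
  if (query) {
    window.location.replace('/search?q=' + encodeURIComponent(query));
  }
})();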