Google's AJAX-crawling instructions say the #! is actually transformed into ?_escaped_fragment_ by the Google crawler.
I'd like to prepare my Vaadin 7 application to be SEO-ready for the Google search engine, so could you please tell me if there is any out-of-the-box functionality that would simplify that process by handling requests containing ?_escaped_fragment_ ?
If there is no out-of-the-box solution, what is the right way to implement this?
Or, another idea: is it possible to use Prerender.io together with Vaadin?
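To make the question more concrete, this is roughly what I imagine such handling would look like: a filter mapped in front of the Vaadin servlet that intercepts crawler requests and returns a pre-rendered snapshot. This is only a rough sketch; the snapshot-service URL is a placeholder, not a real endpoint.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLEncoder;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

/**
 * Rough sketch of a filter mapped in front of the Vaadin servlet.
 * Requests carrying the _escaped_fragment_ parameter come from crawlers
 * following the AJAX-crawling scheme; instead of serving the normal Vaadin
 * bootstrap page, fetch a pre-rendered HTML snapshot and return that.
 * SNAPSHOT_SERVICE is a placeholder, not a real endpoint.
 */
public class EscapedFragmentFilter implements Filter {

    private static final String SNAPSHOT_SERVICE = "https://my-prerender-service/render?url=";

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest req = (HttpServletRequest) request;
        String fragment = req.getParameter("_escaped_fragment_");
        if (fragment == null) {
            chain.doFilter(request, response); // normal Vaadin request
            return;
        }

        // Rebuild the #! state the crawler is asking for and fetch a snapshot of it.
        String stateUrl = req.getRequestURL().append("#!").append(fragment).toString();
        response.setContentType("text/html;charset=UTF-8");
        try (InputStream snapshot = new URL(
                SNAPSHOT_SERVICE + URLEncoder.encode(stateUrl, "UTF-8")).openStream()) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = snapshot.read(buffer)) != -1) {
                response.getOutputStream().write(buffer, 0, read);
            }
        }
    }

    @Override
    public void init(FilterConfig filterConfig) {
    }

    @Override
    public void destroy() {
    }
}
```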
UPDATED
Looks like nowadays Google is able to crawl, render, and index the #! URLs.
Q: My site currently follows your recommendation and supports _escaped_fragment_. Would my site stop getting indexed now that you've deprecated your recommendation?
A: No, the site would still be indexed. In general, however, we recommend you implement industry best practices when you're making the next update for your site. Instead of the _escaped_fragment_ URLs, we'll generally crawl, render, and index the #! URLs.
https://webmasters.googleblog.com/2015/10/deprecating-our-ajax-crawling-scheme.html
Can someone please confirm that a Vaadin application can be successfully crawled by Googlebot?
Check out the Volga add-on and the related blog post.
Volga is a Vaadin add-on that helps you to add meta data to your Vaadin applications, which will help social media services and search engines better interpret your application.
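If you prefer to stay with plain Vaadin 7 APIs, a BootstrapListener can inject similar meta tags into the bootstrap page. A minimal sketch, assuming a standard servlet deployment; the description text is a placeholder and the usual UI configuration is omitted:

```java
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;

import com.vaadin.server.BootstrapFragmentResponse;
import com.vaadin.server.BootstrapListener;
import com.vaadin.server.BootstrapPageResponse;
import com.vaadin.server.VaadinServlet;

/**
 * Sketch: inject a meta description into the Vaadin 7 bootstrap page so
 * crawlers and social previews have something to read. The description text
 * is a placeholder, and the UI configuration (@VaadinServletConfiguration or
 * init parameters) is omitted for brevity.
 */
@WebServlet(urlPatterns = "/*", asyncSupported = true)
public class SeoServlet extends VaadinServlet {

    @Override
    protected void servletInitialized() throws ServletException {
        super.servletInitialized();
        getService().addSessionInitListener(event ->
                event.getSession().addBootstrapListener(new BootstrapListener() {
                    @Override
                    public void modifyBootstrapFragment(BootstrapFragmentResponse response) {
                        // Nothing to change when the UI is embedded as a fragment.
                    }

                    @Override
                    public void modifyBootstrapPage(BootstrapPageResponse response) {
                        // The bootstrap page is exposed as a Jsoup document; add to <head>.
                        response.getDocument().head()
                                .appendElement("meta")
                                .attr("name", "description")
                                .attr("content", "Placeholder description of my Vaadin application");
                    }
                }));
    }
}
```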
Related
I have a website that is available on the internet. The front end is written in Vue.js and the CMS is built with Laravel (PHP). These are separate applications.
Unfortunately, Google doesn't see the page because of the lack of SEO (yeah, Vue). The website is big, has a lot of routes, and fetches data from an API, e.g. for a blog.
Here is my question: which is the better solution? Moving the website to Nuxt.js, or using a plugin like https://github.com/chrisvfritz/prerender-spa-plugin and fetching data from the API?
I'll be grateful for any tips!
Use @prerenderer/renderer-puppeteer if you're prerendering up to a couple hundred pages and want accurate results.
Use @prerenderer/renderer-jsdom if you need to prerender thousands upon thousands of pages, but quality isn't all that important and you're willing to work around issues for more advanced cases (programmatic SVG support, etc.).
Will a Web API-based website suffer SEO problems?
Given that all the content of a page is being pulled in by JavaScript, will search engine crawlers be able to get the page content?
I have heard that crawlers do not always support JavaScript, or do not execute JavaScript when crawling a page.
It's not Web API that is bad for SEO; it's choosing an architecture where you use a web browser to navigate to empty HTML pages and then use JS to pull in the data. ASP.NET Web API does not have to be used that way.
You can't blame a hammer for building a bad house.
Depends.
Will ALL search engine crawlers be able to get the page content? I do not know.
Do the search engine crawlers that matter get the page content? Yes.
Google and Bing combined own the search market, and both can index content pulled in by JavaScript (and probably other crawlers can as well).
Robert Scavilla on how content is indexed.
Search Engine Land on Google executing JavaScript for indexing.
We implemented Google Site Search on our company website, and we need to automate Google indexing for our site.
For example, when our customers update the forum, we need to show the up-to-date forum information in our forum search.
Is there any option in a Google API, or any other API? Please help.
You can use an XML sitemap. This will tell the search engines where your content is so they can find it and crawl it. Keep in mind there is no way to make the search engines crawl your site when you want them to; they will crawl on a schedule they determine to be right for your site. (You can set a crawl rate in Google Webmaster Tools, but that rate is relative to the crawl rate Google has already set for you; setting it to the fastest will not speed up their crawl rate.)
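For illustration, a sitemap is just an XML file listing the URLs you want crawled. Here is a small sketch that writes one out; the URLs, file name, and change frequency are placeholders:

```java
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/**
 * Writes a minimal sitemap.xml. The URLs are placeholders; in a real forum
 * you would emit one <url> entry per thread, pulled from the database.
 */
public class SitemapWriter {

    public static void main(String[] args) throws Exception {
        List<String> urls = Arrays.asList(
                "https://www.example.com/forum/",
                "https://www.example.com/forum/thread/123");

        try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(
                Paths.get("sitemap.xml"), StandardCharsets.UTF_8))) {
            out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
            out.println("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">");
            for (String url : urls) {
                out.println("  <url>");
                out.println("    <loc>" + url + "</loc>");
                out.println("    <changefreq>daily</changefreq>");
                out.println("  </url>");
            }
            out.println("</urlset>");
        }
    }
}
```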
Unfortunately, Google will only crawl your site when it feels like it. How often that happens is determined by many variables (i.e. site ranking, standards compliance, and so on). The sitemap XML is a helpful way to help Google determine which parts of your site to index; however, if you don't have one, Google will still find your pages by crawling links on other parts of your site and updating its index when a page changes.
The more visitors you get and the more often your site's links appear on other sites, the more frequently Google will index you.
To start, I'd suggest http://validator.w3.org/ to validate your site and get it as close as possible to no errors. This makes it easier for Google to index your site because it can find the information it expects without having to crawl over invalid markup. Also, chances are that a site that validates with very few errors is more credible than one containing many errors. It tells the search engine that you keep your site updated so that most browsers can use it and that it is accessible.
Also validating your site gives you some bragging rights over those who don't meet W3 standards :)
Hope this helps!
I have a site that has been developed completely in Flash. The site owners do not want to shift to a more text/HTML-based site, so I am planning to create an alternative HTML/text-based site that Googlebot will get redirected to (by checking the user agent). My question is: is this officially allowed by Google?
If not, then how come there are many subscription-based sites that display a different set of data to Google than to their users? Is that allowed?
Thank you very much.
I've dealt with this exact scenario for a large ecommerce site and Google essentially ignored the site. Google considers it cloaking and addresses it directly here and says:
Cloaking refers to the practice of presenting different content or URLs to users and search engines. Serving up different results based on user agent may cause your site to be perceived as deceptive and removed from the Google index.
Instead, create an ADA-compliant version of the website so that users with screen readers and vision aids can use your site. As long as there is a link from your home page to your ADA-compliant pages, Google will index them.
The official advice seems to be: offer a visible link to a non-Flash version of the site. Fooling Googlebot is a surefire way to get in trouble. And remember, Google results will link to the matching page, so do not create useless results.
Google already indexes Flash content, so my suggestion would be to check how your site is being indexed. Maybe you don't have to do anything.
I don't think showing an alternate version of the site is good from a Google perspective.
If you serve up your page at the exact same address, then you're probably fine. For example, if you show 'http://www.somesite.com/' but direct Googlebot to 'http://www.somesite.com/alt.htm', then Google might direct search users to alt.htm. You don't want that, right?
This is called cloaking. I'm not sure what the effects of it are, but it is certainly not white hat. I am pretty sure Google is working on a way to crawl Flash now, so it might not even be a concern.
I'm assuming you're not really doing a redirect but instead a PHP import or something similar so it shows up as the same page. If you're actually redirecting then it's just going to index the other page like normal.
Some sites offer a different level of content: they LIMIT the content rather than offering alternative or additional content. This is generally done so that unrelated things don't get indexed.
Technorati's got their Cosmos API, which works fairly well but limits you to noncommercial use and no more than 500 queries a day.
Yahoo's got a Site Explorer InLink Data API, but it defines the task very literally, returning links from sidebar widgets in blogs rather than just links from inside blog content.
Is there any other alternative for tracking who's linking to a given URL (think of the discussion links that run below stories on Techmeme.com)? Or will I have to roll my own?
Well, it's not an API, but if you google (for example): "link:nytimes.com", the search results that come back show inbound links to that site.
I haven't tried to implement what you want yet, but the Google search API almost certainly has that functionality built in.
Is this for links to URLs under your control?
If so, you could whip up something quick that logs entries in the Referrer HTTP header.
If you wanted to do this for an entire website without altering application code, you could implement it as an ISAPI filter or equivalent for your web server of choice, as sketched below.
Information available publicly from web crawlers is always going to be incomplete and unreliable (not that my solution isn't...).
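For instance, the "equivalent" in a Java servlet container would be a simple servlet filter that logs the Referer header. A minimal sketch; where the log lines end up is up to you:

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

/**
 * Logs the Referer header of every incoming request without touching
 * application code. The servlet context log is used here for brevity;
 * a real setup would write to a file, database, or analytics pipeline.
 */
public class ReferrerLoggingFilter implements Filter {

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest req = (HttpServletRequest) request;
        String referrer = req.getHeader("Referer"); // the header name is historically misspelled
        if (referrer != null && !referrer.isEmpty()) {
            req.getServletContext().log("Inbound link: " + referrer + " -> " + req.getRequestURI());
        }
        chain.doFilter(request, response);
    }

    @Override
    public void init(FilterConfig filterConfig) {
    }

    @Override
    public void destroy() {
    }
}
```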