How to remove URLs with arguments from Google results - seo

I have a website which I recently started, and I have submitted my sitemap in Google Webmaster Tools. My site got indexed within a short time, but whenever I search for my website on Google, I see two or three versions of the same pages, each with different URL arguments.
For example, suppose my site is example.com; when I search for example.com on Google I get results like the following:
www.example.com/?page=2
www.example.com/something/?page=3
www.example.com
As far as I know, result 1 and result 3 are the same, so why are they being shown separately? I don't have any such URLs in my sitemap, nor in any of my HTML pages, so why is this happening? I am a little confused and want to get rid of it.
Also, result 2 should be displayed simply as www.example.com/something
and not as www.example.com/something/?page=3

There is actually a setting in Google Webmaster Tools which helps with removing URLs with parameters. To access and configure it, navigate to Webmaster Tools --> Crawl --> URL Parameters and set the parameters according to your needs.
I also found the following article useful for understanding the concept behind those parameters and how to stop pages with unnecessary parameters from being crawled:
http://www.shoutmeloud.com/google-webmaster-tool-added-url-parameter-option-seo.html

Related

How to force Google to show my first page from a page set with pagination?

I have a website with, for example, a list of Audi models. Using Google Webmaster Tools, I saw that my website appears in Google search for the word audi, but the target page was the 22nd page of my result set, not the first. I need my first page to appear, not my last (or a middle one), but I cannot tell Google that this is a parameter, because my URLs are rewritten using mod_rewrite. Any ideas?
BTW, I have read in an SEO forum that it's a bad idea to use a canonical tag. So is it really a bad idea in my case?
You can't force Google to do anything, however, they have made it easier to deal with pagination issues with a recent post on rel="next" and rel="prev".
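As an illustration, those hints go in the <head> of each page in the series; the URLs below are hypothetical placeholders for a rewritten page 2:
<head>
<!-- on page 2 of the series: point to the neighbouring pages -->
<link rel="prev" href="http://www.example.com/audi-models/page/1" />
<link rel="next" href="http://www.example.com/audi-models/page/3" />
</head>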
But the primary problem you face is signalling to Google that your first (main) page is the starting point - this is achieved using internal link and back-link "juice" focussed on that page. You need to ensure that the first page of results is linked to properly from higher-value pages (like the home-page).
Google recently announced that you can use View All which will allow them to find and index entire articles that are normally broken up using pagination and display them all as one result.

Will limiting dynamic URLs with robots.txt improve my SEO ranking?

My website has about 200 useful articles. Because the website has an internal search function with lots of parameters, the search engines end up spidering URLs with all possible permutations of additional parameters such as tags, search phrases, versions, dates, etc. Most of these pages are simply a list of search results with some snippets of the original articles.
According to Google's Webmaster Tools, Google spidered only about 150 of the 200 entries in the XML sitemap. It looks as if Google still has not seen all of the content, years after it went online.
I plan to add a few Disallow: lines to robots.txt so that the search engines no longer spider those dynamic URLs. In addition, I plan to disable some URL parameters in the Webmaster Tools "website configuration" --> "URL parameters" section.
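For illustration, such lines might look like the sketch below (the /search path and parameter names are invented for this example, and the wildcard patterns are a Google/Bing extension rather than part of the core robots.txt standard):
User-agent: *
Disallow: /search
Disallow: /*?tag=
Disallow: /*?version=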
Will that improve or hurt my current SEO ranking? It will look as if my website is losing thousands of content pages.
This is exactly what canonical URLs are for. If one page (e.g. an article) can be reached by more than one URL, then you need to specify the primary URL using a canonical URL. This prevents duplicate content issues and tells Google which URL to display in its search results.
So do not block any of your articles and you don't need to enter any parameters, either. Just use canonical URLs and you'll be fine.
As nn4l pointed out, canonical is not a good solution for search pages.
The first thing you should do is have search results pages include a robots meta tag saying noindex. This will help get them removed from the index and let Google focus on your real content. Google should slowly remove them as they get re-crawled.
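A minimal sketch of that tag, placed in the <head> of every search results page:
<head>
<!-- keep search result pages out of the index, but let crawlers follow their links -->
<meta name="robots" content="noindex, follow" />
</head>
The follow value keeps link value flowing to the articles while the result pages themselves drop out of the index.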
Other measures:
In GWMT, tell Google to ignore all those search parameters. This is just a band-aid, but it may help speed up the recovery.
Don't block the search pages in robots.txt yet, as that would stop robots from crawling and cleanly removing the pages that are already indexed. Wait until your index is clear before adding a full block like that.
Your search system is presumably based on links (<a> tags) or GET-based forms rather than POST-based forms, which is why those pages got indexed. Switching them to POST-based forms should stop robots from trying to index those pages in the first place. JavaScript or AJAX is another way to do it.
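As a rough sketch, a POST-based search form gives crawlers no crawlable URL to index (the action path and field name here are hypothetical):
<!-- crawlers generally do not submit POST forms, so no ?q=... result URLs get generated -->
<form method="post" action="/search">
<input type="text" name="q" />
<input type="submit" value="Search" />
</form>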

Important elements of a web page are "invisible"

I ran my website through a web tool that evaluates the SEO weight of page elements, and its report says that certain parts, like the description and other meta tags, are missing... Also, as a thumbnail of my site it shows a default server page. At the same time it shows the list of other pages that are linked from the index page.
I checked and this AGENT is not blocked in robots.txt
Now, how can that be?
I think that the description issue is caused by the fact that you are using "META" instead of "meta" in your meta tags.
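If that is what the tool trips over, the lowercase form would look like this (the description text is just a placeholder):
<meta name="description" content="A short summary of what this page is about." />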
There are many sites out there that can run similar tests on your site, such as the one you used. It may just be that site showing old data; you may want to submit your sitemap.xml to Bing and Google Webmaster Tools. If your site doesn't have a sitemap.xml file, you may want to consider creating one.

What should I add to my site to make Google index the subpages as well?

I am a beginner web developer and I have a site, JammuLinks.com, built on PHP. It is a city local listing search engine. Basically, I've written search pages which take in a parameter, fetch the records from the database, and display them, so the content is generated dynamically. However, if you look at the bottom of the site, I have added many static links where I have hard-coded the parameters in the link, like searchresult.php?tablename='schools'. So my questions are:
Since Google crawls the page and also the links listed on the page, will it crawl the results page data as well? How can I tell if it has? So far I have tried site:www.jammulinks.com, but it returns only the homepage and the blog.
What more can I add to get the static links indexed as well?
The best way to do this is to create a sitemap document (you can even get the template from the webmaster portion of Google's site, www.google.com/webmasters/, I believe).
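A minimal sitemap.xml is just a list of URLs. A sketch for the site above, assuming the hard-coded link from the question is a URL you want indexed:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.jammulinks.com/</loc>
</url>
<url>
<loc>http://www.jammulinks.com/searchresult.php?tablename=schools</loc>
</url>
</urlset>
Once it lists the parameterized pages you care about, submit it in Google Webmaster Tools so the crawler finds them directly.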

Google Webmaster Tools - Remove query parameters from URL

I am using JBoss Seam on a Jetty web server and am having some issues with query parameters breaking links when they appear in Google searches.
The first parameter is one JBoss Seam uses to track conversations, cid or conversationId. This is a minor issue, as Google is complaining that I am submitting different URLs with the same information.
Secondly, would it make sense to publish/remove URLs via the Google Webmaster API instead of publishing/removing via the sitemap?
Walter
Hey Walter, I would recommend that you use the rel=canonical tag to tell the search engines to ignore certain parameters in your URL strings. The canonical tag is a common standard that Google, Yahoo and Microsoft have committed to supporting.
For example, if JBoss is creating URLs that look like this: mysite.com?cid=FOO&conversationId=BAR, then you can create a canonical tag in the <head> section of your website like this:
<html>
<head>
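<!-- all parameter variants (e.g. ?cid=FOO or ?conversationId=BAR) consolidate to this URL -->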
<link rel="canonical" href="http://mysite.com" />
</head>
</html>
The search engines will use this information to normalize the URLs on your website to the canonical (or shortest and most authoritative) version. Specifically, they will treat this as a 301 redirect from the URL of the HTTP request to the URL specified in the canonical tag (as long as you haven't done anything silly, like making it an infinite loop or pointing to a URL that doesn't exist).
While the canonical tag is pretty fricken cool, it is only a 90% solution, in that you can still run into issues with metrics tracking from all the extra parameters on your website. The best solution would be to update your infrastructure to trap these tracking parameters, set a cookie, and then use a 301 redirect to send visitors to the canonical version of the URL. However, this can be a prohibitive amount of work for that extra 10% gain, so many people prefer to start with the canonical tag.
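As a rough sketch of that infrastructure change, here is what it could look like with Apache mod_rewrite (the parameter name, cookie attributes and domain are hypothetical; on Jetty you would do the equivalent in a servlet filter):
RewriteEngine On
# capture the tracking parameter into a cookie, then 301 to the clean URL
RewriteCond %{QUERY_STRING} (?:^|&)cid=([^&]+)
RewriteRule ^(.*)$ /$1? [CO=cid:%1:example.com,R=301,L]
The trailing ? in the substitution strips the query string, so the redirect target is the clean, canonical URL.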
As for your second question, generally you don't want to remove these URLs from Google if people are linking to them. By using the canonical tag you achieve the same goal, but you don't lose any of the value of the inbound links to your website.
For more information about the canonical tag and the specific issues and solutions, check out this article I wrote on it: http://janeandrobot.com/library/url-referrer-tracking.
Google Webmaster Tools will tell you about duplicate titles and other issues that Google sees, which are often caused by "duplicates" that are really the same page served up under two different URL versions. I suggest making sure the number of errors listed under duplicate titles in your Webmaster Tools account is as close to zero as possible.