Google Search Console fails to fetch sitemaps | "Sitemap could not be read"

I generated a sitemap with an online generator. It seems to be fine, and I even tested it with the old Google Search Console sitemap tester, where it passes. But when I submit it in either version of Search Console, it just displays an error message.

This is a known bug. See this Google support answer.

In my case, it was the sitemap itself that had a syntax error.
Open the sitemap in Firefox; the browser will tell you if the XML has a syntax error.
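For reference, a minimal well-formed sitemap has this shape (the URL and date are placeholders). Anything that deviates from the https://www.sitemaps.org/ schema, such as an unclosed tag or stray characters before the XML declaration, is exactly the kind of thing Firefox will flag:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://example.com/</loc>
        <lastmod>2024-01-01</lastmod>
      </url>
    </urlset>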

Your sitemap's domain address might have changed. If the site runs WordPress, use the Yoast SEO plugin; Search Console will then automatically pick up sitemap.xml.

I had the same problem, and the solution was very simple: submit the full path to your sitemap.
Where the console asks to 'Add a new sitemap', instead of entering /sitemap.xml, enter the full URL, such as https://example.com/sitemap.xml.
That should fix the problem.

I was using the Yoast SEO plugin, which built out 10 sitemaps. On the first submission, the index got read but only one of the sub-sitemaps did. I manually visited the other sitemaps (I suspected they had taken too long to respond), then deleted the sitemap in Google Search Console and resubmitted it. All of them were read that time.

I had this issue, and it was because I hadn't set the Content-Type header to application/xml.
This sitemap validator notified me of the issue: https://www.xml-sitemaps.com/validate-xml-sitemap.html
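If your server is sending the wrong MIME type for .xml files and it runs Apache, one possible fix is a directive like this in .htaccess (a sketch assuming statically served .xml files; other servers have their own equivalents):

    # Serve .xml files with the application/xml MIME type
    AddType application/xml .xml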

Enter the full URL of your sitemap, e.g., https://example.com/sitemap.xml. Also, make sure the sitemap's filename does not contain numbers or symbols.

Related

Main site URL removed from Google despite re-submitting it

I have a site, www.megalim.co.il.
Recently, due to a version upgrade, I discovered that I had a robots.txt file that disallowed all search engines. My Google ranking dropped, and I couldn't find the site's main page anymore.
I changed the robots.txt file to one that allows everything, and now Webmaster Tools no longer tells me that the site is blocked from Google.
I did this about 5 days ago. I've also used Fetch as Google and submitted www.megalim.co.il to the index with all linked pages.
But still, when I search for "site:www.megalim.co.il", I get a bunch of results from my site, but not the main page!
What else should I look for?
Thanks!
Igal
You don't see your main page because of your old robots.txt. Five days is nothing; Google's bots need longer than that to re-index your whole website.
Just wait a little and you will see your website fully indexed in Google's results.
Issue sorted out...
Embarrassing: apparently we (inexplicably) had a noindex, nofollow meta tag.
A day after removing it, we started reappearing in Google.
Thanks :)
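For anyone checking their own pages for the same problem, the tag in question sits in the page's <head> and looks like this:

    <!-- Tells search engines not to index this page or follow its links -->
    <meta name="robots" content="noindex, nofollow">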

Sitemap on a website only shows one link

I have a little issue concerning the sitemap of a website.
The website is http://parmemarin.com.
If I go to http://www.xml-sitemaps.com/ and try to generate a sitemap for my website, I end up with only one link in my sitemap, which is _inc/conditions.php.
There is no index.php, nor any of my other links (index.php?page=...).
Can someone help me with this one?
Thanks
Well, that could be a problem with xml-sitemaps.com, it could be a problem with your site, or it could be a combination of both. I don't know that service, so I can't tell how it really works and what it does.
Skimming through your markup, I noticed that you link to some URLs without encoding the ampersand. & (when used to separate parameters) should be written as &amp; if you link it in HTML.
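For example (the parameter names here are made up for illustration):

    <!-- Unencoded ampersand: invalid HTML -->
    <a href="index.php?page=portfolio&id=3">Portfolio</a>

    <!-- Correctly encoded -->
    <a href="index.php?page=portfolio&amp;id=3">Portfolio</a>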
I wonder: why don't you build your sitemap yourself? You are the only one who knows for sure which pages exist on your site. Collect all your URLs and put them in a file, as http://www.sitemaps.org/ describes.
Thanks for your answer.
I had totally forgotten to delete the noindex, nofollow meta tag... that clearly didn't help :)

Block Google from indexing some pages of a site

I have a problem with lots of 404 errors on one site. I figured out that these errors happen because Google is trying to find pages that no longer exist.
Now I need to tell Google not to index those pages again.
I found some solutions on the internet about using a robots.txt file. But this is not a site that I built; I just need to fix those errors. The thing is, those pages are generated; they do not physically exist in that form, so I cannot add anything to the PHP code.
And I am not quite sure how to add those pages to robots.txt.
When I just write:
    User-agent: *
    noindex: /objekten/anzeigen/haus_antea/5-0000001575*
and hit the test button in Webmaster Tools, I get this from Googlebot:
Allowed
Detected as a directory; specific files may have different restrictions
And I do not know what that means.
I am new to this kind of stuff, so please keep your answer as simple as possible.
Sorry for my bad English.
I think Google will automatically remove pages that return a 404 error from its index. Google will not display these pages in the results, so you don't need to worry about them.
Just make sure that these pages are not linked from other pages. If they are, Google may try to index them from time to time. In that case you should return a 301 status (moved permanently) and redirect to the correct URL. Google will follow the 301 redirects and use the redirected URL instead.
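If the site runs on Apache, a minimal sketch of such a redirect in .htaccess could look like this (the old path is taken from the question; the target URL is a placeholder):

    # Permanently redirect a removed listing to a page that still exists
    Redirect 301 /objekten/anzeigen/haus_antea/5-0000001575 https://example.com/objekten/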
A robots.txt file is only necessary if you want to remove pages that are already in the search results. But pages returning a 404 status code should not be displayed there anyway.
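As an aside on the robots.txt attempt in the question: the standard directive for blocking crawlers is Disallow, not noindex (a noindex rule in robots.txt was never officially supported by Google). Since Disallow rules match by prefix, blocking that URL would look like this:

    User-agent: *
    Disallow: /objekten/anzeigen/haus_antea/5-0000001575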

Google, do not index YET

While building a live site on its actual live hosting platform, is there a way to tell Google not to index the website yet? I found the following:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93710
But would that tell them never to come back? Or would they simply see the noindex tag and not list the results, so that when they come back to crawl again later and my site is good to go, I could remove the noindex tag and the site would start getting indexed?
Sounds like you want to use a robots.txt file instead:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449&topic=2370588&ctx=topic
Update your robots.txt file when you want your content to be indexed.
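A robots.txt that keeps all crawlers out while the site is under construction is just two lines:

    # Block every crawler from every path; remove this when the site is ready
    User-agent: *
    Disallow: /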
You can use the robots.txt method.
You can specify which subpages may be spidered, and Google checks the file again before indexing. So you can delete the file later in order to get fully indexed.
More Information
About /robots.txt
Robots.txt File Generator
You can always change it. Google and other robots find your page when it is linked from another page; as long as nothing links to it, it won't be found. Also, once your site is up, chances are it will start far back in the list of results.

Remove deleted page from Google search results

So I have a website that I recently made changes to, and one of the changes was removing a page from the site. I deleted the page; it doesn't exist anymore.
However, when you search for my site, one of the results is the page that I deleted. People are clicking on the page and getting an error.
How do I remove that page from the search results?
Here is the solution.
First, add your site to Google Webmaster Tools. Then go to Site configuration -> Crawler access -> Remove URL. Click 'New removal request' and add the page you want to remove, and make sure you have also added that page to your site's robots.txt. Google will deindex the page within 24 hours.
Alternatively, you can simply wait for Google's robots to find out that the page doesn't exist anymore.
A trick that used to work is to submit a sitemap to Google in which you add the URL of the deleted page, give it top priority, and mark it as changing every day. That way the Google robots will prioritize that page and find out more quickly that it's gone.
There might be other ways, but none that are known to me.
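For the sitemap trick described above, the entry for the deleted page would look something like this (the URL is a placeholder):

    <url>
      <loc>https://example.com/deleted-page.html</loc>
      <changefreq>daily</changefreq>
      <priority>1.0</priority>
    </url>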
You can remove specific pages using Webmaster Tools, I believe.
Yahoo's webmaster tools offer a similar service, as I understand it.
This information was correct the last time I tried to do this, a little while ago.
Go to https://www.google.com/webmasters/tools/removals and request removal of the pages you want gone.