Google isn't indexing all the pages in my site - indexing

On my website I use jQuery to dynamically change the primary contents of a div (my primary content), so that pages are not reloaded when someone clicks a link; instead, the content is loaded into the div.
Google crawls my site for links, finds only #smth fragments, and does not index the pages. What should I do so that Google will index my other pages?
Any thoughts?

You can add a sitemap.xml file using the Sitemaps protocol to the root of your website (or another location specified in robots.txt). The Sitemaps protocol allows you to inform search engines about URLs on a website that are available for crawling (wiki).
An example sitemap (from the wiki referenced above) looks like this:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
  <url>
    <loc>http://example.com/</loc>
    <lastmod>2006-11-18</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
The crawler of a search engine will visit your sitemap.xml and index the locations specified.
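Before submitting, it can be worth sanity-checking the sitemap locally. A minimal sketch in Python that parses the example above and lists the URLs a crawler would discover (note the namespace must be given explicitly):

```python
import xml.etree.ElementTree as ET

# The sample sitemap from above, inlined for a self-contained check.
SITEMAP = """<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/</loc>
    <lastmod>2006-11-18</lastmod>
  </url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(SITEMAP.encode("utf-8"))
# Collect every <loc> value inside a <url> element.
urls = [loc.text for loc in root.findall("sm:url/sm:loc", NS)]
print(urls)  # ['http://example.com/']
```

If the file fails to parse, or the list comes back empty because the namespace is wrong, a search engine's crawler is likely to reject it too.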

I found out that the answer is to add a ! after the # in your hash URLs (making them #! URLs) and configure the server to send a snapshot of the page to Google; more info here.
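The AJAX crawling scheme this answer refers to (Google has since deprecated it in favor of rendering JavaScript directly) worked by rewriting the #! fragment into an `_escaped_fragment_` query parameter that the crawler requested from the server. A sketch of that mapping:

```python
from urllib.parse import quote

def escaped_fragment_url(pretty_url: str) -> str:
    """Map a #! URL to the URL the crawler would request under the
    (now-deprecated) AJAX crawling scheme."""
    base, _, fragment = pretty_url.partition("#!")
    # Append as a query parameter, percent-encoding the fragment.
    sep = "&" if "?" in base else "?"
    return f"{base}{sep}_escaped_fragment_={quote(fragment, safe='')}"

print(escaped_fragment_url("http://example.com/page#!about"))
# http://example.com/page?_escaped_fragment_=about
```

The server was then expected to return an HTML snapshot of the dynamically generated page at that `_escaped_fragment_` URL.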

Related

How to detect amazon sitemap

I am trying to scrape some products from amazon.com, but I can't find a sitemap in its robots.txt.
I tried
amazon.com/sitemap.xml
amazon.com/sitemap.xml.gz
amazon.com/sitemap1.xml.gz
amazon.com/sitemap1.xml
all turned up nothing.
I also tried sitemap detectors such as
https://seositecheckup.com/tools/sitemap-test
The result shows Amazon doesn't have a sitemap.
Is that true, or am I not taking the right approach?
Look at robots.txt; you will see a sitemap link at the bottom that returns access denied.
This resource may be accessible only to robots (specific user-agent, IP, ...).
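If you do manage to fetch the robots.txt, a small helper can pull out its Sitemap: directives. A sketch in Python (the URL in the example is illustrative, not Amazon's actual sitemap location):

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Collect the URLs named in Sitemap: directives of a robots.txt body.

    The directive name is matched case-insensitively, per common practice.
    """
    urls = []
    for line in robots_txt.splitlines():
        if line.strip().lower().startswith("sitemap:"):
            urls.append(line.split(":", 1)[1].strip())
    return urls

robots = "User-agent: *\nDisallow: /gp/\nSitemap: https://www.example.com/sitemap.xml"
print(sitemap_urls(robots))  # ['https://www.example.com/sitemap.xml']
```

Whether the listed sitemap is actually retrievable is a separate question; as noted above, the server may gate it by user-agent or IP.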

Sitemap files on different domains

I am creating multiple sitemap files for my website. The issue is that my sitemap files are located on a different file server from my website.
For example, I have a website at the domain www.example.com, but my sitemap index file and the other sitemap files reside on www.filestack.com.
My sitemap index file will look like:
<sitemapindex xmlns="http://www.google.com/schemas/sitemap/0.84">
  <sitemap>
    <loc>
      https://www.filestack.com/sitemap1.xml
    </loc>
  </sitemap>
</sitemapindex>
Though my sitemap1.xml will be:
<url>
<loc>
https://www.example.com/test
</loc>
<lastmod>2017-09-04</lastmod>
<changefreq>weekly</changefreq>
</url>
Is it possible to do such a thing, and how?
See Sitemaps & Cross Submits.
You have to provide a robots.txt at https://www.example.com/robots.txt which
links to the external sitemap:
Sitemap: https://www.filestack.com/sitemap1.xml
(This sitemap may only contain URLs from https://www.example.com/.)
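Putting that together, the robots.txt served at https://www.example.com/robots.txt might look like this (a minimal sketch):

```
User-agent: *
Allow: /

Sitemap: https://www.filestack.com/sitemap1.xml
```

The Sitemap: line is what authorizes the externally hosted file to list www.example.com URLs, per the cross-submission rules of the Sitemaps protocol.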
You can use either an XML sitemap or an HTML sitemap, as Matt Cutts says; it's not mandatory to use both. Though you can't submit an HTML sitemap to search engines, spiders can crawl your HTML sitemap and reach pages deeper in your site. But you cannot use an XML sitemap that is hosted on a different server.

sitemaps on cross submits

I went through the following link, http://www.sitemaps.org/protocol.html, for sitemaps & cross submits, but did not get clarification on whether the following approach is correct:
I have the following websites, one for desktop users and the other for mobile users, and both serve different content:
https://www.mainsite.com and https://mobile.site.com. Both domains point to the same physical website root directory, and the domain URL is changed based on the user's device.
I've placed a robots.txt file in this root directory which has an entry for the sitemap_index.xml file:
Sitemap: https://www.mainsite.com/sitemap_index.xml
In the sitemap_index file:
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>
https://www.mainsite.com/sitemap_desktop_www.xml
</loc>
<lastmod>2015-09-04</lastmod>
</sitemap>
<sitemap>
<loc>
https://mobile.site.com/sitemap_mobile_www.xml
</loc>
<lastmod>2015-05-22</lastmod>
</sitemap>
</sitemapindex>
Is this approach correct?
If a bot reads the robots.txt and sitemap_index.xml files for the domain www.mainsite.com, will it ignore the mobile.site.com entries in sitemap_index.xml?
You have to add your two websites in Google Webmaster Tools as two properties, and in each one you can add its XML sitemap without needing the robots.txt entry.
Google will start crawling and indexing your websites and read the XML sitemap for each domain.

How do I avoid duplicated sitemap content?

I have a dynamic PHP wallpaper site, http://www.fondolandia.com; it's been online for a year. Recently I submitted the page to a sitemap builder site, and within the sitemap file I found that some links point to the same image in different resolutions, for example:
<url>
<loc>http://www.fondolandia.com/portal/display/51_palacio-europa1356656199/1600x1200</loc>
<changefreq>daily</changefreq>
<priority>0.64</priority>
</url>
<url>
<loc>http://www.fondolandia.com/portal/display/51_palacio-europa1356656199/1280x960</loc>
<changefreq>daily</changefreq>
<priority>0.64</priority>
</url>
<url>
<loc>http://www.fondolandia.com/portal/display/51_palacio-europa1356656199/1024x768</loc>
<changefreq>daily</changefreq>
<priority>0.64</priority>
</url>
<url>
<loc>http://www.fondolandia.com/portal/display/51_palacio-europa1356656199/800x600</loc>
<changefreq>daily</changefreq>
<priority>0.64</priority>
</url>
That is actually four links to the same image in different resolutions. Should I delete those links from the sitemap, given that another link points to an overview of the same image?
<url>
<loc>http://www.fondolandia.com/portal/fondo/51_palacio-europa1356656199</loc>
<changefreq>daily</changefreq>
<priority>0.80</priority>
</url>
While doing a site: search I see many links to old images that have been deleted from the site but still appear in the search results. Should Google fix this once it crawls my site, or should I do something?
Thanks in advance.
To answer your first question, the duplicate links in the sitemap are OK, but each page should have a canonical link to tell Google that they are really the same thing.
As for the deleted pages in search results, Google should resolve those when it re-crawls the pages, as long as your site returns an appropriate status code (e.g. a 404, or a 301 redirect to some other page).
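For example, each resolution page could declare the overview page as its canonical URL in its `<head>` (a sketch using the URLs from the question; whether the overview or one resolution should be canonical is the site owner's call):

```html
<!-- In the <head> of each resolution variant, e.g. .../1600x1200 -->
<link rel="canonical"
      href="http://www.fondolandia.com/portal/fondo/51_palacio-europa1356656199">
```

With this in place, Google consolidates the resolution variants under the overview URL, so the duplicate sitemap entries stop mattering.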

What's wrong with my sitemap? [closed]

Closed 10 years ago.
We have a sitemap for our site http://www.appsamuck.com/
The sitemap is here http://www.appsamuck.com/sitemap.xml
But Google seems to hate it. My question is: why? I'm just staring at it now, saying to myself it looks right. Am I missing something?
3 Paths don't match
We've detected that you submitted your Sitemap using a URL path that doesn't include the www prefix (for instance, http://example.com/sitemap.xml). However, the URLs listed inside your Sitemap do use the www prefix (for instance, http://www.example.com/myfile.htm).
URL:
Problem detected on: http://www.appsamuck.com/
Oct 15, 2008
I just typed a huge response and FF crashed and I lost it. I hate it when that happens!!
Basically, it's possible to have two sites with different content, one running under www and one without the www, a bit like a subdomain. Because of this, when you submitted your sitemap, Google saw it on the www site (http://www.appsamuck.com/sitemap.xml), but the URLs in your sitemap do not contain the www, so Google wonders whether the sitemap is actually for another site, the non-www one. Usually these two deliver the same content, but not always, so Google is saying: hang on, you put the sitemap at www, but all your pages are on the non-www domain; what's that about?!
The best thing to do is stick to one or the other. Are you advertising the www or the non-www version? Whichever you are using (and I suggest the www version), submit your sitemap with www and make sure all the URLs in your sitemap have www in them. That way Google won't throw a fit. Sticking to one may also be slightly better for SEO.
As Nick suggested above, it's also a good idea to let Google know which one you prefer through the preferred domain option. I would set this option to:
Display URLs as www.appsamuck.com (for both www.appsamuck.com and appsamuck.com)
At least Google will know that you're talking about the same site then.
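One common way to enforce a single host is a server-side 301 redirect from the bare domain to the www version. A sketch for Apache in .htaccess, assuming mod_rewrite is enabled (not tested against this particular site's setup):

```apache
RewriteEngine On
# 301-redirect the bare domain to the www host, preserving the path.
RewriteCond %{HTTP_HOST} ^appsamuck\.com$ [NC]
RewriteRule ^(.*)$ http://www.appsamuck.com/$1 [R=301,L]
```

With the redirect in place, crawlers and visitors both converge on the www URLs, which matches what the sitemap should list.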
As for the sitemap, there are some issues with that too.
Firstly, as I pointed out above, it's missing the www from each URL.
Secondly, you are missing the XML declaration at the top of the file. You need something like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset
    xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                        http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
As Diodeus suggested above, you really should add the other fields, such as priority.
Here is a quick go I have done for you (note it follows on from the above, as I opened the urlset tag there and it closes at the bottom of this set of code):
<url>
<loc>http://www.appsamuck.com/</loc>
<priority>1.00</priority>
<lastmod>2008-10-17T03:01:05+00:00</lastmod>
<changefreq>monthly</changefreq>
</url>
<url>
<loc>http://www.appsamuck.com/index.html</loc>
<priority>0.80</priority>
<lastmod>2008-10-17T03:01:05+00:00</lastmod>
<changefreq>monthly</changefreq>
</url>
<url>
<loc>http://www.appsamuck.com/blog/</loc>
<priority>0.80</priority>
<changefreq>monthly</changefreq>
</url>
<url>
<loc>http://www.appsamuck.com/about.html</loc>
<priority>0.80</priority>
<lastmod>2008-10-16T00:00:32+00:00</lastmod>
<changefreq>monthly</changefreq>
</url>
<url>
<loc>http://www.appsamuck.com/contact.html</loc>
<priority>0.80</priority>
<lastmod>2008-10-16T00:00:33+00:00</lastmod>
<changefreq>monthly</changefreq>
</url>
<url>
<loc>http://www.appsamuck.com/iphonesdkdev.html</loc>
<priority>0.80</priority>
<lastmod>2008-10-14T05:41:03+00:00</lastmod>
<changefreq>monthly</changefreq>
</url>
<url>
<loc>http://www.appsamuck.com/day16.html</loc>
<priority>0.80</priority>
<lastmod>2008-10-17T03:13:21+00:00</lastmod>
<changefreq>monthly</changefreq>
</url>
<url>
<loc>http://www.appsamuck.com/day15.html</loc>
<priority>0.80</priority>
<lastmod>2008-10-16T15:58:57+00:00</lastmod>
<changefreq>monthly</changefreq>
</url>
<url>
<loc>http://www.appsamuck.com/day14.html</loc>
<priority>0.80</priority>
<lastmod>2008-10-15T16:58:06+00:00</lastmod>
<changefreq>monthly</changefreq>
</url>
<url>
<loc>http://www.appsamuck.com/day13.html</loc>
<priority>0.80</priority>
<lastmod>2008-10-13T17:52:08+00:00</lastmod>
<changefreq>monthly</changefreq>
</url>
</urlset>
It's not a full list; I'm not going to do all the work for you :)
There are also some good online tools that will create sitemaps for you; they crawl the site and build it. Just google xml-sitemaps and you should find some good free ones. Also, if their spider cannot find your content, that's a flag that Google probably cannot either, so it has a dual purpose.
Hope that helps :)
Paul
This could have to do with your preferred domain setting. If your sitemap has www in it, but you submitted the site without the www, that could cause the confusion. What I did for my sites was to use the www in the sitemap and make sure I submitted to Google in Webmaster Tools the same way.
Then you can go in and set the "Preferred Domain" in the Tools area for your site. From there, you can have Google only link to the non-www version if you want.
I've encountered similar problems. Just resubmit the same map. Often the warnings go away.
Try adding the other fields: <lastmod></lastmod>, <changefreq></changefreq>, <priority></priority>. Your sitemap looks correct.
Also, make sure the status of your resubmitted map is not "pending". Google sometimes takes hours to getting around to processing your files.
I found a similar problem today. What I did was recreate the site without the www. Google usually suggests you create your site as http://www.yoursitename.com, but you can also enter http://yoursitename.com and then verify that you are the administrator. It worked well for me. Hope this helps.