Multiple sitemaps within a single sitemap file

I have a site with a sitemap containing around 150 entries for URLs that are static on the site. The lastmod element is set to 2012, and the sitemap was last updated approximately a year ago.
The last couple of lines of this sitemap file are:
<url><loc>http://example.com/sitemap2.xml</loc><changefreq>daily</changefreq></url>
<url><loc>http://example.com/siteMap3.xml</loc><changefreq>daily</changefreq></url>
</urlset>
Sitemap 2 has the same structure but with links targeting specific products, and sitemap 3 does the same for categories. These two are regenerated daily.
The main sitemap.xml is registered. An external SEO advisor ran a test and reported that the sitemap is not updated and does not list the links for products and categories.
How can I check whether what he said is correct? And if he is correct, what could I have done wrong here?

Having multiple sitemaps is fine, but you should link them from a sitemap index file.
For your case, you could use something like this:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://example.com/sitemap1.xml</loc>
<lastmod>2012</lastmod>
</sitemap>
<sitemap>
<loc>http://example.com/sitemap2.xml</loc>
<lastmod>2016-10-11</lastmod>
</sitemap>
<sitemap>
<loc>http://example.com/sitemap3.xml</loc>
<lastmod>2016-10-11</lastmod>
</sitemap>
</sitemapindex>
Instead of changefreq (which the sitemap index format doesn't support), you have to use lastmod, which takes the date of the last modification of the sitemap file itself (not of the sitemap entries). Note also that a sitemap index references its sitemaps with <sitemap> elements inside <sitemapindex>, not with <url> elements inside <urlset> as your current file does; that mix-up is the likely reason the product and category links aren't being picked up.
This sitemap index file can then be linked in your robots.txt (and/or be submitted to search engines).
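For example, assuming the index file is published at http://example.com/sitemap_index.xml (a path chosen here purely for illustration), the robots.txt reference is a single line:
Sitemap: http://example.com/sitemap_index.xml
The Sitemap: directive comes from the sitemaps.org protocol and can appear anywhere in robots.txt, independent of any User-agent group.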


How can I use PageMap with Google Custom Search refinements?

I'm trying to add refinements to my Google Custom Search.
I have meta tags on just about every page of the site, such as
<meta name="type-id" content="241" />
Where there are many different types, and I want to have one refinement for each type.
In the docs, it says
You can also use these more:pagemap: operators with refinement labels
But I have been unable to do that.
Note that I have had success using more:pagemap:metatags-type-id:241 in the search input, or as a webSearchQueryAddition, but despite Google's docs I haven't been able to get it to work with a refinement.
Here's a sample from my cse.xml (removing some attributes from the CustomSearchEngine tag):
<?xml version="1.0" encoding="UTF-8"?>
<CustomSearchEngine>
<Title>Test</Title>
<Context>
<Facet>
<FacetItem>
<Label name="videos" mode="FILTER">
<Rewrite>more:p:metatags-article-keyword:121</Rewrite>
</Label>
<Title>Videos</Title>
</FacetItem>
</Facet>
</Context>
</CustomSearchEngine>
Is this supposed to work? Am I using the wrong syntax in the rewrite rule? Has anyone else done something like this?
The label in your facet should use mode="BOOST" if you want to restrict to a structured data field within the scope of your engine.
<Facet>
<FacetItem>
<Label name="videos" mode="BOOST">
<Rewrite>more:p:metatags-article-keyword:121</Rewrite>
</Label>
<Title>Videos</Title>
</FacetItem>
</Facet>

Why does this DASH manifest keep the player stuck until the streams are downloaded?

I have the manifest file below. The issue is that the player waits for the streams to download completely before it starts playing, which is bad for the user experience. Any idea how to fix it? I expected the player to issue range requests and feed the media source with partial responses instead of waiting for the streams to download completely.
<MPD xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xlink="http://www.w3.org/1999/xlink" xsi:schemaLocation="urn:mpeg:DASH:schema:MPD:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd" profiles="urn:mpeg:dash:profile:isoff-live:2011" type="static" mediaPresentationDuration="PT30M67.6S" minBufferTime="PT2S">
<ProgramInformation></ProgramInformation>
<Period id="0" start="PT0.0S">
<AdaptationSet id="0" contentType="video" segmentAlignment="true" bitstreamSwitching="true" lang="und">
<Representation id="0" mimeType="video/webm" codecs="vp9" bandwidth="770153" width="854" height="480" frameRate="23421/1000">
<BaseURL>https://liveradio.s3.eu-central-1.amazonaws.com/video.webm</BaseURL>
<SegmentList duration="1840613" startNumber="1">
<Initialization range="0-219"/>
<SegmentURL indexRange="220-6592"/>
</SegmentList>
</Representation>
</AdaptationSet>
<AdaptationSet id="1" contentType="audio" segmentAlignment="true" bitstreamSwitching="true" lang="und">
<Representation id="1" mimeType="audio/webm" codecs="opus" bandwidth="115412" audioSamplingRate="48000">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2"/>
<BaseURL>https://liveradio.s3.eu-central-1.amazonaws.com/audio.webm</BaseURL>
<SegmentList duration="1840641" startNumber="1">
<Initialization range="0-258"/>
<SegmentURL indexRange="259-3444"/>
</SegmentList>
</Representation>
</AdaptationSet>
</Period>
</MPD>
You seem to be using a mix of the DASH 'live' profile approach and the 'on-demand' profile one; you can see the declared profile in profiles="urn:mpeg:dash:profile:isoff-live:2011" at the top of your manifest.
At a very high level the difference is:
'live' profile manifests contain a list of URLs, one for each segment to be downloaded.
'on-demand' profile manifests contain a URL to a single file plus an index of where the segments can be found within that file, so the client can download chunks as it wants.
DASH is a complex specification, and some players will accept certain mixes of profiles while others will not; nor do all players support every feature. For example, the Shaka player claimed not to support 'indexRange' (or did in 2017: https://github.com/google/shaka-player/issues/765).
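As a rough sketch (untested, and the right profile string is an assumption here; WebM content is often served with urn:mpeg:dash:profile:webm-on-demand:2012), an 'on-demand' style representation drops the SegmentList in favour of a SegmentBase whose indexRange points at the index, so the client can issue byte-range requests on its own:
<Representation id="0" mimeType="video/webm" codecs="vp9" bandwidth="770153" width="854" height="480">
<BaseURL>https://liveradio.s3.eu-central-1.amazonaws.com/video.webm</BaseURL>
<!-- bytes 0-219 hold the initialization data; bytes 220-6592 hold the index (the Cues element in WebM) -->
<SegmentBase indexRange="220-6592">
<Initialization range="0-219"/>
</SegmentBase>
</Representation>
The audio representation would be reshaped the same way with its own ranges (0-258 and 259-3444).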

Multi-tiered sitemap?

From http://www.sitemaps.org/protocol.html:
If you want to list more than 50,000 URLs, you must create multiple Sitemap files <...> If you do provide multiple Sitemaps, you should then list each Sitemap file in a Sitemap index file. Sitemap index files may not list more than 50,000 Sitemaps and must be no larger than 10MB (10,485,760 bytes) and can be compressed. You can have more than one Sitemap index file.
Is it possible then to create a 3- or more tiered chain? For example:
//mysite/sitemap.xml is:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://mysite/sitemaps/index.xml</loc>
<lastmod>2004-10-01T18:23:17+00:00</lastmod>
</sitemap>
</sitemapindex>
//mysite/sitemaps/index.xml is:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://mysite/sitemaps/sitemap-lm.xml.gz</loc>
</sitemap>
<sitemap>
<loc>http://mysite/sitemaps/sitemap-1.xml.gz</loc>
</sitemap>
....
</sitemapindex>
and //mysite/sitemaps/sitemap-lm.xml.gz is a normal gzipped XML file, passing validation and so on.
That is:
/robots.txt -> /sitemap.xml -> /sitemaps/sitemapslist.xml -> /sitemaps/sitemap-1.xml.gz
The specification doesn't give a clear answer.
Google and personal input have both yielded inconclusive and contradictory answers, ranging from "sure, why not" to "no, because nobody does it that way".
Any ideas would be welcome!
No, I don't think you can chain sitemap index files that way, but there is nothing preventing you from declaring multiple <sitemapindex> sitemaps in your robots.txt, with one Sitemap: <path-to-a-sitemap-index> line per index file.
Your robots.txt would be the first level, the listed <sitemapindex> files the second level, and the real sitemaps the third level, as sketched below.
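For illustration (the index file names below are invented), the three levels could look like this in robots.txt:
User-agent: *
Disallow:
Sitemap: http://mysite/sitemaps/index-1.xml
Sitemap: http://mysite/sitemaps/index-2.xml
Each index-N.xml is then an ordinary <sitemapindex> document whose <loc> entries point at the real sitemaps, such as http://mysite/sitemaps/sitemap-1.xml.gz.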

<meta name="msapplication-config" content="none"> for browserconfig.xml not working

We are developing a site for a client. We were getting a number of 404 requests for /browserconfig.xml. Then I read here: http://msdn.microsoft.com/en-us/library/ie/dn320426(v=vs.85).aspx that if you do not want to support browserconfig requests, you can add <meta name="msapplication-config" content="none" /> in the head section.
However, even after adding the above meta tag, I am still getting 404s for /browserconfig.xml.
Any pointers on this?
Adding the meta tag might or might not work: we also added it, but we still received 404 errors for browserconfig.xml requests all the time. In the end we decided to serve a simple browserconfig.xml, and after that we had no problems.
Our browserconfig.xml looks like this; basically it just says where four tile images are located.
<?xml version="1.0" encoding="utf-8"?>
<browserconfig>
<msapplication>
<tile>
<square70x70logo src="/mstile-70x70.png"/>
<square150x150logo src="/mstile-150x150.png"/>
<wide310x150logo src="/mstile-310x150.png"/>
<square310x310logo src="/mstile-310x310.png"/>
<TileColor>#8bc53f</TileColor>
<TileImage src="/mstile-150x150.png" />
</tile>
</msapplication>
</browserconfig>

list=alllinks confusion

I'm doing a research project over the summer, and I need to get some data from Wikipedia, store it, and then do some analysis on it. I'm using the Wikipedia API to gather the data, and I've got that down pretty well.
My question is about the list=alllinks option in the API documentation here.
After reading the description, both there and in the API itself (it's a bit further down and I can't link directly to the section), I think I understand what it's supposed to return. However, when I ran a query, it gave me back something I didn't expect.
Here's the query I ran:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=google&rvprop=ids|timestamp|user|comment|content&rvlimit=1&list=alllinks&alunique&allimit=40&format=xml
Which in essence says: Get the last revision of the Google page, include the id, timestamp, user, comment and content of each revision, and return it in XML format.
The alllinks option (I thought) should give me back a list of Wikipedia pages which point to the Google page (in this case the first 40 unique ones).
I'm not sure what the policy on swearing is, but this is exactly the result I got back:
<?xml version="1.0"?>
<api>
<query><normalized>
<n from="google" to="Google" />
</normalized>
<pages>
<page pageid="1092923" ns="0" title="Google">
<revisions>
<rev revid="366826294" parentid="366673948" user="Citation bot" timestamp="2010-06-08T17:18:31Z" comment="Citations: [161]Tweaked: url. [[User:Mono|Mono]]" xml:space="preserve">
<!-- The page content, I've replaced this cos its not of interest -->
</rev>
</revisions>
</page>
</pages>
<alllinks>
<!-- offensive content removed -->
</alllinks>
</query>
<query-continue>
<revisions rvstartid="366673948" />
<alllinks alfrom="!2009" />
</query-continue>
</api>
The <alllinks> part is just a load of random gobbledygook and offensive comments, not at all what I thought I'd get. I've done a fair bit of searching, but I can't seem to find a direct answer to my question.
What should the list=alllinks option return?
Why am I getting this crap in there?
You don't want a list; a list module iterates over all pages of the wiki, and list=alllinks in particular just "enumerates all links that point to a given namespace", independent of the page you queried.
You want a property associated with the Google page, so you need prop=links instead of list=alllinks.
So your query becomes:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions|links&titles=google&rvprop=ids|timestamp|user|comment|content&rvlimit=1&format=xml
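With that change, the links come back as a property of the Google page itself; the response has roughly this shape (the titles below are invented for illustration):
<page pageid="1092923" ns="0" title="Google">
<revisions>...</revisions>
<links>
<pl ns="0" title="AdWords" />
<pl ns="0" title="Android (operating system)" />
</links>
</page>
prop=links is paged too, so when a page has more links than pllimit allows per request, you'll get a <query-continue> element with a plcontinue value to pass into the next request.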