Multi-tiered sitemap? - seo

From: http://www.sitemaps.org/protocol.html :
If you want to list more than 50,000 URLs, you must create multiple
Sitemap files <...> If you do provide multiple Sitemaps, you should
then list each Sitemap file in a Sitemap index file. Sitemap index
files may not list more than 50,000 Sitemaps and must be no larger
than 10MB (10,485,760 bytes) and can be compressed. You can have more
than one Sitemap index file.
Is it possible then to create a 3- or more tiered chain? For example:
//mysite/sitemap.xml is:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://mysite/sitemaps/index.xml</loc>
<lastmod>2004-10-01T18:23:17+00:00</lastmod>
</sitemap>
</sitemapindex>
//mysite/sitemaps/index.xml is:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://mysite/sitemaps/sitemap-lm.xml.gz</loc>
</sitemap>
<sitemap>
<loc>http://mysite/sitemaps/sitemap-1.xml.gz</loc>
</sitemap>
....
</sitemapindex>
and //mysite/sitemaps/sitemap-lm.xml.gz is a normal gzipped XML-file, passing validation and so on.
That is:
/robots.txt -> /sitemap.xml -> /sitemaps/index.xml -> /sitemaps/sitemap-1.xml.gz
The specification doesn't give a clear answer.
Googling and asking around have both yielded inconclusive and contradictory answers, ranging from "sure, why not" to "no, because nobody does it that way".
Any ideas would be welcome!

No, I don't think you can nest sitemap index files that way, but nothing prevents you from declaring several <sitemapindex> files in your robots.txt, with one Sitemap: <full-URL-of-a-sitemap-index> line per index file.
Your robots.txt would be the first level, the listed <sitemapindex> files the second level, and the actual sitemaps the third level.
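To make the three levels concrete, here is a rough Python sketch of the arithmetic, assuming the 50,000-entry limit quoted in the question applies to both sitemaps and index files. The function names and the /sitemaps/index-N.xml naming are made up for illustration.

```python
MAX_ENTRIES = 50_000  # per-file limit quoted from sitemaps.org in the question


def plan_sitemaps(url_count: int) -> tuple[int, int]:
    """Return (number of sitemap files, number of sitemap index files)."""
    # Integer ceiling division: -(-a // b) rounds up without using floats.
    sitemap_files = -(-url_count // MAX_ENTRIES)
    index_files = -(-sitemap_files // MAX_ENTRIES)
    return sitemap_files, index_files


def robots_lines(base_url: str, index_files: int) -> list[str]:
    """Build one 'Sitemap:' line per index file (hypothetical file names)."""
    return [
        f"Sitemap: {base_url}/sitemaps/index-{n}.xml"
        for n in range(1, index_files + 1)
    ]


if __name__ == "__main__":
    sitemaps, indexes = plan_sitemaps(3_000_000_000)
    print(sitemaps, indexes)
    print("\n".join(robots_lines("http://mysite", indexes)))
```

A single index file already covers 50,000 x 50,000 URLs, so a second index file (and therefore a second Sitemap: line) only becomes necessary beyond 2.5 billion URLs.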

Related

Why does this DASH manifest keep the player stuck until the streams are downloaded?

I have the manifest file below. The issue is that the player waits for the streams to download completely before it starts playing, which is bad for the user experience. Any idea how to fix it? I expected the player to issue range requests and feed the media source with partial data instead of waiting for the streams to download completely.
<MPD xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:mpeg:dash:schema:mpd:2011" xmlns:xlink="http://www.w3.org/1999/xlink" xsi:schemaLocation="urn:mpeg:DASH:schema:MPD:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd" profiles="urn:mpeg:dash:profile:isoff-live:2011" type="static" mediaPresentationDuration="PT30M67.6S" minBufferTime="PT2S">
<ProgramInformation></ProgramInformation>
<Period id="0" start="PT0.0S">
<AdaptationSet id="0" contentType="video" segmentAlignment="true" bitstreamSwitching="true" lang="und">
<Representation id="0" mimeType="video/webm" codecs="vp9" bandwidth="770153" width="854" height="480" frameRate="23421/1000">
<BaseURL>https://liveradio.s3.eu-central-1.amazonaws.com/video.webm</BaseURL>
<SegmentList duration="1840613" startNumber="1">
<Initialization range="0-219"/>
<SegmentURL indexRange="220-6592"/>
</SegmentList>
</Representation>
</AdaptationSet>
<AdaptationSet id="1" contentType="audio" segmentAlignment="true" bitstreamSwitching="true" lang="und">
<Representation id="1" mimeType="audio/webm" codecs="opus" bandwidth="115412" audioSamplingRate="48000">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2"/>
<BaseURL>https://liveradio.s3.eu-central-1.amazonaws.com/audio.webm</BaseURL>
<SegmentList duration="1840641" startNumber="1">
<Initialization range="0-258"/>
<SegmentURL indexRange="259-3444"/>
</SegmentList>
</Representation>
</AdaptationSet>
</Period>
</MPD>
You seem to be using a mix of the DASH 'live' profile approach and the 'on-demand' profile one - you can see the profile in the profiles="urn:mpeg:dash:profile:isoff-live:2011" at the top of your manifest.
At a very high level the difference is:
'live' profile manifests contain a list of urls for each segment to be downloaded.
'on-demand' profile manifests contain a URL to a file and an index to where the segments can be found in the file, so the client can download chunks as it wants.
DASH is a complex specification. Some players may accept a mix of profiles while others do not, and not all players support all features. For example, Shaka Player states that it does not support 'indexRange' (or did in 2017: https://github.com/google/shaka-player/issues/765).
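If you want to check a manifest for this kind of mix before blaming the player, a small script can report which profile it declares and how its segments are addressed. This is only a sketch: describe_addressing is a made-up helper, and it merely reports what it finds rather than deciding whether a given player will cope.

```python
import xml.etree.ElementTree as ET

# Namespace used by the manifest in the question.
NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}


def describe_addressing(mpd_xml: str) -> dict:
    """Summarise the declared profile and the SegmentURL attributes of an MPD."""
    root = ET.fromstring(mpd_xml)
    profiles = root.get("profiles", "")
    segment_urls = root.findall(".//mpd:SegmentURL", NS)
    return {
        # True when the manifest advertises the 'live' profile.
        "declares_live_profile": "isoff-live" in profiles,
        # SegmentURL entries that name neither a media URL nor a byte range.
        "segment_urls_without_media": sum(
            1
            for s in segment_urls
            if s.get("media") is None and s.get("mediaRange") is None
        ),
        # True when any SegmentURL relies on indexRange.
        "uses_index_range": any(s.get("indexRange") for s in segment_urls),
    }
```

For the manifest above, such a report would show the 'live' profile declared alongside SegmentURL entries that carry only an indexRange, which is the mix described in this answer.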

How can I have named entities in asciidoctor?

I'm using asciidoctor with the docbook backend for books. In the past I wrote DocBook, which allows me to declare named entities that I use throughout the book:
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE book [
<!ENTITY class "Galactic TOP SECRET">
<!ENTITY project "World Domination">
<!ENTITY product "Illuminati Mind Control Chemtrail Spray System CSS-2020">
]>
<book ...>
...
What about our &class; &project;?
Is our &product; working?
...
</book>
:-)
I haven't found a way to tell asciidoctor to insert the DOCTYPE declaration between the XML processing instruction and the <book> element. So I resorted to --no-header-footer and prepending the header and footer lines. Is there a better way to do this? Something like a named entity definition directive? An include mechanism?
Do you have to use DocBook entity declarations? Asciidoctor has "attributes" that can serve the same purpose: https://asciidoctor.org/docs/user-manual/#attributes
For example, you can define an attribute within your document:
:class: Galactic TOP SECRET
Then later in your document, you can use the attribute:
"Billy, come up to the front and address the {class}." said the teacher.
When you transform your document to DocBook, you would see:
<simpara>"Billy, come up to the front and address the Galactic TOP SECRET."
said the teacher.</simpara>
If you do have to use DocBook entity declarations, you might use some XSL to transform the XML you get into the XML you want.
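If XSL feels heavy for this, a plain post-processing step may be enough. The sketch below is not XSL and not an Asciidoctor feature: it is a hypothetical Python helper (add_doctype) that takes the generated DocBook as a string and slots a DOCTYPE with an internal subset in right after the XML declaration. It assumes the entity values contain no double quotes, ampersands or percent signs, since it does no escaping.

```python
def add_doctype(xml_text: str, root_name: str, entities: dict[str, str]) -> str:
    """Insert a DOCTYPE with entity declarations after the XML declaration."""
    # One <!ENTITY ...> line per entry; values are used verbatim (no escaping).
    declarations = "\n".join(
        f'<!ENTITY {name} "{value}">' for name, value in entities.items()
    )
    doctype = f"<!DOCTYPE {root_name} [\n{declarations}\n]>\n"
    if xml_text.startswith("<?xml"):
        # Split just after the closing '?>' of the XML declaration.
        end = xml_text.index("?>") + 2
        head, rest = xml_text[:end], xml_text[end:].lstrip("\n")
        return f"{head}\n{doctype}{rest}"
    # No XML declaration: the DOCTYPE simply goes first.
    return doctype + xml_text
```

You would call it on the converter's output, for example add_doctype(docbook_text, "book", {"class": "Galactic TOP SECRET"}), and write the result back to disk.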

Multiple sitemaps within single sitemap file

I have a site with a sitemap containing around 150 entries for URLs that are static on the site. The lastmod element is set to 2012. This sitemap was last updated approximately a year ago.
The last couple of lines of this sitemap file are:
<url><loc>http://example.com/sitemap2.xml</loc><changefreq>daily</changefreq></url>
<url><loc>http://example.com/siteMap3.xml</loc><changefreq>daily</changefreq></url>
</urlset>
Sitemap 2 follows the same structure but links to specific products, and sitemap 3 does the same for categories. These two are generated daily.
The main sitemap.xml is registered. An external SEO advisor ran a test and reported that the sitemap is not up to date and does not list the links for products and categories.
How can I check whether what he said is correct? If he is correct, what could I have done wrong here?
Having multiple sitemaps is fine, but you should link them from a sitemap index file.
For your case, you could use something like this:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://example.com/sitemap1.xml</loc>
<lastmod>2012</lastmod>
</sitemap>
<sitemap>
<loc>http://example.com/sitemap2.xml</loc>
<lastmod>2016-10-11</lastmod>
</sitemap>
<sitemap>
<loc>http://example.com/sitemap3.xml</loc>
<lastmod>2016-10-11</lastmod>
</sitemap>
</sitemapindex>
In a sitemap index there is no changefreq element. Use lastmod instead, which takes the date of the last modification of the sitemap file itself (not of the entries in it).
This sitemap index file can then be linked in your robots.txt (and/or be submitted to search engines).
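Since sitemaps 2 and 3 are regenerated daily, it may be easier to generate the index file as well rather than maintain it by hand. A minimal sketch with Python's standard library follows; build_sitemap_index is a made-up name, and where the lastmod values come from (file timestamps, your build job) is left to you.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries: list[tuple[str, str]]) -> str:
    """Build a sitemap index from (loc, lastmod) pairs and return it as text."""
    # Serialise the sitemap namespace as the default xmlns, without a prefix.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap_index([
        ("http://example.com/sitemap1.xml", "2012"),
        ("http://example.com/sitemap2.xml", "2016-10-11"),
        ("http://example.com/sitemap3.xml", "2016-10-11"),
    ]))
```

The returned string has no XML declaration, so prepend one yourself when writing the file if you want it to match the example above exactly.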

Davical Sync-Token web request

I am trying not to re-invent the wheel here...
I have found some nice documentation on CalDav sync implementation there
According to its website, DaviCal is rfc6578-compliant since v. 0.9.8 (see here).
I therefore first send my request to get the sync token as follows:
PROPFIND http://my_cal_srv/user/calendar_path HTTP/1.1
Content-Type: application/xml; charset="utf-8"
<?xml version="1.0" encoding="utf-8" ?>
<d:propfind xmlns:d='DAV:'>
<d:prop>
<d:displayname />
<d:sync-token />
</d:prop>
</d:propfind>
This returns data as expected:
<?xml version="1.0" encoding="utf-8" ?>
<multistatus xmlns="DAV:">
<response>
<href>/caldav.php/user/calendar_path/</href>
<propstat>
<prop>
<displayname>My Calendar</displayname>
<sync-token>data:,9</sync-token>
</prop>
<status>HTTP/1.1 200 OK</status>
</propstat>
</response>
</multistatus>
So far so good, I have a token, it's "data:,9". So, let's just try to get changes since 8, the token I had when I queried the server prior to adding some event.
REPORT http://my_cal_srv/user/calendar_path HTTP/1.1
Content-Type: application/xml; charset="utf-8"
<?xml version="1.0" encoding="utf-8" ?>
<d:sync-collection xmlns:d="DAV:">
<d:sync-token>8</d:sync-token>
<d:sync-level>1</d:sync-level>
<d:prop>
<d:getetag/>
</d:prop>
</d:sync-collection>
The answer is:
<?xml version="1.0" encoding="utf-8" ?>
<multistatus xmlns="DAV:">
<response>
<href>/caldav.php/user/path/86166f9c-3e2e-4242-9a28-0f3bfb1dd67a-caldavsyncadapter.ics</href>
<propstat>
<prop>
<getetag>"5ed2101b0c867e490dbd71d40c7071bb"</getetag>
</prop>
<status>HTTP/1.1 200 OK</status>
</propstat>
</response>
<response>
<href>/caldav.php/user/path/cb354fab-b41d-49ad-8a4f-8d68c9090ea0.ics</href>
<propstat>
<prop>
<getetag>"334892703f4151024e9232eab9b515a7"</getetag>
</prop>
<status>HTTP/1.1 200 OK</status>
</propstat>
</response>
<sync-token>data:,9</sync-token>
</multistatus>
After deleting an entry (so I get sync-token 10, and still compare using token 8), I get the following result:
<?xml version="1.0" encoding="utf-8" ?>
<multistatus xmlns="DAV:">
<response>
<href>/caldav.php/user/cal_path/86166f9c-3e2e-4242-9a28-0f3bfb1dd67a-caldavsyncadapter.ics</href>
<status>HTTP/1.1 404 Not Found</status>
</response>
<response>
<href>/caldav.php/user/cal_path/cb354fab-b41d-49ad-8a4f-8d68c9090ea0.ics</href>
<propstat>
<prop>
<getetag>"334892703f4151024e9232eab9b515a7"</getetag>
</prop>
<status>HTTP/1.1 200 OK</status>
</propstat>
</response>
<sync-token>data:,10</sync-token>
</multistatus>
So I am a little confused here as I don't really know how to interpret these results...
Could anybody please explain to me how to extract the sync info from here? It is a little hard to figure out the changes types because the ICS namings are unclear...
Thanks in advance for helping out... And merry X-Mas !
Regards,
N.
That you get a "data:,9" doesn't imply you can actually query "data:,8" or ,7 etc. Sync tokens are opaque and do NOT give you a versioning system (you need something like the DAV Versioning Extensions for that).
DAV sync-tokens are a simple optimization technique - nothing more. They are completely opaque to the client, and the server can expire sync tokens at any time (and is not required to save tombstones and such). E.g. a server which can't store tombstones can simply expire tokens on DELETE requests.
The way you use sync-tokens is:
1) to figure out which child collections of a parent collection need to be re-synced
2) to optimize syncing of the resources within a child collection
1) Which child collections need to be synced
Assume you have a collection of calendars (e.g. my_cal_srv/user/) and you do a PROPFIND Depth:1 on this collection, asking for the sync-tokens of the child collections. If those no longer match the ones in your client's cache, you know you need to perform a sync of just those child collections.
Note: do NOT use the token you got back from this request to sync the child collection (which is what you do above). It might have expired already. Within sync-reports only use tokens you got from sync-reports!
2) Optimizing syncing of the collection contents
Again: sync-tokens are an optimization, nothing more. You always need to be prepared to get an (in)valid-sync-token precondition error (which means the server expired the token) and do a full refetch of the collection contents! Then compare the fetched (URL, ETag) pairs to your cached version to figure out what the changes are (essentially all the steps you need to do when you have a server which doesn't support sync-reports).
If you get a sync-token in the sync-report results, you can then use it in the next sync-request. If the server still has the state, it'll just give you the changes. If it expired the token, it'll give you the sync-token error.
Note: In case it isn't obvious - in the very first sync-request you don't (can't) provide a token. You run the query without a token and get back all the contents. You do the same again if the server sends you an (in)valid-sync-token error.
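The fallback comparison after a full refetch can be sketched like this. It is plain Python with no CalDAV library involved; diff_collection is a made-up name, and both arguments are assumed to map resource URL to ETag.

```python
def diff_collection(
    cached: dict[str, str], fetched: dict[str, str]
) -> tuple[list[str], list[str], list[str]]:
    """Compare cached and freshly fetched URL -> ETag maps.

    Returns (created, changed, deleted) lists of URLs, each sorted.
    """
    # On the server but not in the cache: new resources.
    created = sorted(url for url in fetched if url not in cached)
    # In both, but the ETag differs: modified resources.
    changed = sorted(
        url for url in fetched if url in cached and cached[url] != fetched[url]
    )
    # In the cache but gone from the server: deleted resources.
    deleted = sorted(url for url in cached if url not in fetched)
    return created, changed, deleted
```

You would run this whenever the server rejects your token, and afterwards replace the cache with the fetched map and store the token from that response.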
Your request is not correct. In your request you have:
<d:sync-token>8</d:sync-token>
But this should be:
<d:sync-token>data:,8</d:sync-token>
Aside from that, the first response you are getting tells you that:
These resources have been changed or newly created:
/caldav.php/user/path/86166f9c-3e2e-4242-9a28-0f3bfb1dd67a-caldavsyncadapter.ics
/caldav.php/user/path/cb354fab-b41d-49ad-8a4f-8d68c9090ea0.ics
The second response tells you:
This resource has been changed or newly created:
/caldav.php/user/cal_path/cb354fab-b41d-49ad-8a4f-8d68c9090ea0.ics
This resource has been deleted:
/caldav.php/user/cal_path/86166f9c-3e2e-4242-9a28-0f3bfb1dd67a-caldavsyncadapter.ics

<meta name="msapplication-config" content="none"> for browserconfig.xml not working

We are developing a site for a client. We were getting a number of 404 errors for requests to /browserconfig.xml. Then I read here: http://msdn.microsoft.com/en-us/library/ie/dn320426(v=vs.85).aspx that if you do not want to support a browserconfig request, you can add meta name="msapplication-config" content="none" in the head section.
However, even after adding the above meta tag, I am still getting 404s for /browserconfig.xml.
Any pointers on this?
Adding a meta tag might or might not work. We also added this tag, but we still received 404 errors for browserconfig.xml requests all the time. In the end we decided to serve a simple browserconfig.xml, and after that we had no problems.
Our browserconfig.xml looks like this; basically it just says where four images are located.
<?xml version="1.0" encoding="utf-8"?>
<browserconfig>
<msapplication>
<tile>
<square70x70logo src="/mstile-70x70.png"/>
<square150x150logo src="/mstile-150x150.png"/>
<wide310x150logo src="/mstile-310x150.png"/>
<square310x310logo src="/mstile-310x310.png"/>
<TileColor>#8bc53f</TileColor>
<TileImage src="/mstile-150x150.png" />
</tile>
</msapplication>
</browserconfig>