Sitemap index
A Sitemap index is an XML file that lists multiple XML or RSS sitemap files. Its XML format is very similar to that of a Sitemap file.[1] For each listed sitemap, webmasters can include additional information, such as when it was last updated. Once a Sitemap index file has been created, webmasters only need to notify search engines about the index file; the sitemaps listed in it are then picked up automatically, without being submitted individually.[2]
XML Sitemap index Format
The Sitemap Protocol format consists of XML tags. The file itself must be UTF-8 encoded.
Sample
The following example shows a Sitemap index that lists two Sitemaps:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>http://www.example.com/sitemap1.xml.gz</loc>
<lastmod>2004-10-01T18:23:17+00:00</lastmod>
</sitemap>
<sitemap>
<loc>http://www.example.com/sitemap2.xml.gz</loc>
<lastmod>2005-01-01</lastmod>
</sitemap>
</sitemapindex>
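A file like the sample above can also be generated programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the function name `build_sitemap_index` and the `(loc, lastmod)` input shape are assumptions made for illustration, not part of the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries):
    """Return a Sitemap index document as UTF-8 encoded bytes.

    entries: iterable of (loc, lastmod) pairs; lastmod may be None
    because the <lastmod> tag is optional.
    """
    # Declare the protocol namespace on the root element.
    root = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        if lastmod is not None:
            ET.SubElement(sitemap, "lastmod").text = lastmod
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    document = build_sitemap_index([
        ("http://www.example.com/sitemap1.xml.gz", "2004-10-01T18:23:17+00:00"),
        ("http://www.example.com/sitemap2.xml.gz", "2005-01-01"),
    ])
    print(document.decode("utf-8"))
```

The output is a single line of XML rather than the indented layout shown in the sample; both forms are equivalent to an XML parser.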
Submitting Sitemaps
When a Sitemap is submitted directly to a search engine, the search engine returns status information and any processing errors. For details, refer to Google Webmaster Tools or Yahoo SiteExplorer.
| Search engine | Submission URL | Help page |
|---|---|---|
| Google | http://www.google.com/webmasters/sitemaps/ping?sitemap= | How do I resubmit my Sitemap once it has changed? |
| Bing (Formerly Live Search) | http://www.bing.com/webmaster/ping.aspx?siteMap= | Webmaster Tools - Bing |
| Ask.com | http://submissions.ask.com/ping?sitemap= | Q: Does Ask.com support sitemaps? |
| Yahoo! (disabled as of September 2011 because Yahoo now uses Bing's search database) | http://search.yahooapis.com/SiteExplorerService/V1/updateNotification?appid=SitemapWriter&url= and http://search.yahooapis.com/SiteExplorerService/V1/ping?sitemap= | Does Yahoo! support Sitemaps?; Site Explorer API: Notifying Yahoo of Changes |
| Yandex | — | Sitemaps files (in Russian) |
| Didikle (Turkish Mobile Search Engine) | http://www.didikle.com/ping?sitemap= | Sitemap and robots.txt info (in Turkish) |
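Each submission URL in the table ends with an empty query parameter, to which the address of the Sitemap index is appended. Because that address is itself a URL, it should be percent-encoded first. The sketch below only builds the request URL and sends nothing; the `PING_ENDPOINTS` dictionary and the `ping_url` helper are illustrative names, and the endpoints are copied from the table above and may no longer be active.

```python
from urllib.parse import quote

# Submission URL prefixes copied from the table above (possibly outdated).
PING_ENDPOINTS = {
    "google": "http://www.google.com/webmasters/sitemaps/ping?sitemap=",
    "bing": "http://www.bing.com/webmaster/ping.aspx?siteMap=",
    "ask": "http://submissions.ask.com/ping?sitemap=",
}


def ping_url(engine, sitemap_index_url):
    """Return the full submission URL for one search engine."""
    # safe="" also encodes ":" and "/" so the value is a single parameter.
    return PING_ENDPOINTS[engine] + quote(sitemap_index_url, safe="")


if __name__ == "__main__":
    print(ping_url("google", "http://www.example.com/sitemap_index.xml"))
```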
The location of the Sitemap index can also be specified in a robots.txt file, which helps search engines find it. To do this, add the following line to robots.txt:
Sitemap: <sitemap_index_location>
The <sitemap_index_location> should be the complete URL to the Sitemap index, such as: http://www.example.org/sitemap_index.xml
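To illustrate how a crawler might read this directive, here is a small sketch that extracts every `Sitemap:` line from the text of a robots.txt file. The helper name `sitemap_urls_from_robots` is hypothetical, and the parsing is deliberately simplified (for instance, it treats any `#` as the start of a comment).

```python
def sitemap_urls_from_robots(robots_txt):
    """Return the URLs given by "Sitemap:" lines in robots.txt content."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Split at the first colon only, so the URL's own colons survive.
        field, separator, value = line.partition(":")
        if separator and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


if __name__ == "__main__":
    example = (
        "User-agent: *\n"
        "Disallow: /tmp/\n"
        "Sitemap: http://www.example.org/sitemap_index.xml\n"
    )
    print(sitemap_urls_from_robots(example))
```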
Sitemap Limits
Sitemap index files may not list more than 50,000 Sitemaps and must be no larger than 10 MB (10,485,760 bytes).[1] You can have more than one Sitemap index file.[1] However, Sitemap index files may not list other Sitemap index files. Because each Sitemap can itself list at most 50,000 URLs, a single Sitemap index can cover at most 50,000 × 50,000 = 2,500,000,000 pages.
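The arithmetic behind these limits, and one way to spread a long list of Sitemaps over several index files, can be sketched as follows. The constant and function names are illustrative; the figure of 50,000 URLs per Sitemap comes from the Sitemap protocol rather than from this section.

```python
MAX_SITEMAPS_PER_INDEX = 50_000  # limit stated in this section
MAX_URLS_PER_SITEMAP = 50_000    # per-Sitemap limit from the Sitemap protocol


def max_pages_per_index():
    """Upper bound on pages reachable through one Sitemap index."""
    return MAX_SITEMAPS_PER_INDEX * MAX_URLS_PER_SITEMAP


def split_into_indexes(sitemap_urls, limit=MAX_SITEMAPS_PER_INDEX):
    """Group Sitemap URLs into lists that each fit in one index file."""
    return [
        sitemap_urls[start:start + limit]
        for start in range(0, len(sitemap_urls), limit)
    ]


if __name__ == "__main__":
    print(max_pages_per_index())
    urls = [f"http://www.example.com/sitemap{n}.xml.gz" for n in range(120_001)]
    print([len(group) for group in split_into_indexes(urls)])
```

This sketch only enforces the count limit; the file size limit would have to be checked separately against the serialized output.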
The Sitemap index file must:
- Begin with an opening <sitemapindex> tag and end with a closing </sitemapindex> tag.
- Include a <sitemap> entry for each Sitemap as a parent XML tag.
- Include a <loc> child entry for each <sitemap> parent tag.
The optional <lastmod> tag is also available for Sitemap index files.
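The three structural requirements above can be checked mechanically. The following sketch parses a document with Python's standard `xml.etree.ElementTree` module and reports any rule that is violated; the function name `check_sitemap_index` and the wording of the messages are assumptions made for illustration, and the check is far less thorough than validation against the XML schema.

```python
import xml.etree.ElementTree as ET

# ElementTree writes namespaced tags as "{namespace}tag".
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def check_sitemap_index(xml_bytes):
    """Return a list of problems; an empty list means the rules hold."""
    problems = []
    root = ET.fromstring(xml_bytes)
    if root.tag != NS + "sitemapindex":
        problems.append("root element is not <sitemapindex>")
        return problems
    sitemaps = root.findall(NS + "sitemap")
    if not sitemaps:
        problems.append("no <sitemap> entries")
    for number, sitemap in enumerate(sitemaps, start=1):
        loc = sitemap.find(NS + "loc")
        # A <loc> that is missing or empty breaks the third rule.
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<sitemap> #{number} has no <loc>")
    return problems
```

Malformed XML, such as a missing closing `</sitemapindex>` tag, is not reported in the list; `ET.fromstring` raises `ParseError` in that case.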
Time format for <lastmod> tag
The value for the lastmod tag should be in W3C Datetime format. For example, 2007-08-25T00:00:00+00:00. This encoding allows the omission of the time portion of the ISO 8601 format; for example, 2007-08-25 is also valid.
Available time formats:
| Format | Example |
|---|---|
| YYYY-MM-DDThh:mm:ssTZD | 2007-08-25T00:00:00+00:00 |
| YYYY-MM-DDThh:mmTZD | 2007-08-25T00:00+00:00 |
| YYYY-MM-DD | 2007-08-25 |
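As a rough illustration, the formats in the table can be produced with Python's `datetime` module and recognized with a regular expression. The names `LASTMOD_PATTERN` and `looks_like_lastmod` are hypothetical; the pattern covers only the three formats listed above (with `TZD` taken to be `Z`, `+hh:mm` or `-hh:mm`) and checks the shape of the value, not whether the date itself exists.

```python
import re
from datetime import date, datetime, timezone

# Date, optionally followed by a time (seconds optional) and a time zone.
LASTMOD_PATTERN = re.compile(
    r"\d{4}-\d{2}-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2})?(Z|[+-]\d{2}:\d{2}))?"
)


def looks_like_lastmod(value):
    """Return True if value has the shape of one of the listed formats."""
    return LASTMOD_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    # A timezone-aware datetime yields the full form with an offset.
    print(datetime(2007, 8, 25, tzinfo=timezone.utc).isoformat())
    # A plain date yields the date-only form.
    print(date(2007, 8, 25).isoformat())
    print(looks_like_lastmod("2007-08-25T00:00+00:00"))
```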
Validating Sitemap index
Google uses an XML schema to define the elements and attributes that can appear in a Sitemap index file.
To validate a Sitemap or Sitemap index file against a schema, the XML file needs additional headers.
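For a Sitemap index, the additional headers are typically the `xmlns:xsi` declaration and an `xsi:schemaLocation` attribute on the root element. The sketch below adds them with Python's standard `xml.etree.ElementTree` module; the schema location used here (`siteindex.xsd` under the sitemaps.org namespace) is the one published with the Sitemap protocol and is an assumption as far as this section is concerned, as is the helper name `index_root_with_schema_headers`.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
# Namespace followed by the schema URL (assumed from the Sitemap protocol).
SCHEMA_LOCATION = SITEMAP_NS + " " + SITEMAP_NS + "/siteindex.xsd"

# Serialize the Sitemap namespace as the default one, and XSI as "xsi".
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xsi", XSI_NS)


def index_root_with_schema_headers():
    """Return an empty <sitemapindex> root carrying the schema headers."""
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    root.set(f"{{{XSI_NS}}}schemaLocation", SCHEMA_LOCATION)
    return root


if __name__ == "__main__":
    print(ET.tostring(index_root_with_schema_headers(), encoding="unicode"))
```

Child `<sitemap>` elements added to this root need the same namespace-qualified tag form, for example `f"{{{SITEMAP_NS}}}sitemap"`.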