The Sitemap protocol format consists of XML tags. All data values in a Sitemap must be entity-escaped. The file itself must be UTF-8 encoded.
The Sitemap must have the following:
- The head of the document
This is an XML document, so you need to start it with an XML declaration:
<?xml version="1.0" encoding="UTF-8"?>
- Begin with an opening <urlset> tag and end with a closing </urlset> tag.
Specify the namespace (protocol standard) within the <urlset> tag.
- Include a <url> entry for each URL, as a parent XML tag.
- Include a <loc> child entry for each <url> parent tag.
All other tags are optional. Support for these optional tags may vary among search engines.
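The required structure above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `build_sitemap` and the example.com URLs are hypothetical, and the namespace URI is the one published for version 0.9 of the protocol at sitemaps.org rather than something stated in the text above.

```python
import xml.etree.ElementTree as ET

# Namespace URI for Sitemap protocol 0.9 (taken from sitemaps.org, not from this page).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal Sitemap document as UTF-8 encoded bytes.

    Only the required tags are written: <urlset>, <url>, and <loc>.
    """
    # <urlset> encapsulates the file and declares the namespace.
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for address in urls:
        # One <url> parent per URL, each with a <loc> child.
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = address
    # xml_declaration=True puts the XML declaration at the head of the document.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical URLs, used only for illustration.
    document = build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/catalog?item=12&desc=vacation",
    ])
    print(document.decode("utf-8"))
```

ElementTree entity-escapes text values when it serializes them, so the `&` in the second URL should not need to be escaped by hand in this sketch.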
XML tag definitions:
| Tag | Status | Description |
|---|---|---|
| <urlset></urlset> | Required | Encapsulates the file and references the current protocol standard. |
| <url></url> | Required | Parent tag for each URL entry. The remaining tags are children of this tag. |
| <loc></loc> | Required | URL of the page. This URL must begin with the protocol (such as http://) and must be shorter than 2,048 characters. |
| <lastmod></lastmod> | Optional | Date on which the file was last modified, in YYYY-MM-DD format. |
| <changefreq></changefreq> | Optional | How frequently the page is likely to change. This field is a suggestion rather than a command to search engines, which may crawl the page more or less often than you indicate. Do not rely on the value "never" to stop a search engine from crawling a page; use your robots.txt file for that. Valid values for this field are: always, hourly, daily, weekly, monthly, yearly, never. |
| <priority></priority> | Optional | The priority of this URL relative to other URLs on your site. Valid values range from 0.0 to 1.0, and the default priority of a page is 0.5. This value does not affect how your pages are compared to pages on other sites; it only tells search engines which pages you deem most important for the crawlers. The priority you assign to a page is not likely to influence the position of your URLs in a search engine's result pages. Search engines may use this information when selecting between URLs on the same site, so you can use this tag to increase the likelihood that your most important pages are present in a search index. Assigning a high priority to all of the URLs on your site is not likely to help you: since the priority is relative, it is only used to select between URLs on your site. |
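The constraints in the tag definitions can be checked before a Sitemap is published. The following Python sketch is one possible way to do that; the function name `check_url_entry` and its parameters are hypothetical. It assumes `priority` is passed as a number, treats both http:// and https:// as acceptable protocols, and uses the full set of change frequencies defined by the sitemaps.org protocol, which includes `hourly`.

```python
import re
from datetime import date

# Change frequencies defined by the Sitemap protocol.
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def check_url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Return a list of problems found in one <url> entry (empty if none)."""
    problems = []

    # <loc> is required, must begin with the protocol, and has a length limit.
    if not loc.startswith(("http://", "https://")):
        problems.append("loc must begin with a protocol such as http://")
    if len(loc) >= 2048:
        problems.append("loc must be shorter than 2,048 characters")

    # <lastmod> is optional; this sketch only accepts the YYYY-MM-DD form.
    if lastmod is not None:
        match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", lastmod)
        try:
            if match is None:
                raise ValueError("wrong shape")
            date(*(int(part) for part in match.groups()))  # rejects e.g. month 13
        except ValueError:
            problems.append("lastmod should be a real date in YYYY-MM-DD format")

    # <changefreq> is optional and limited to a fixed vocabulary.
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        problems.append("changefreq is not one of the valid values")

    # <priority> is optional and ranges from 0.0 to 1.0.
    if priority is not None and not 0.0 <= priority <= 1.0:
        problems.append("priority must be between 0.0 and 1.0")

    return problems
```

An empty list means the entry passed every check in this sketch; it does not guarantee that a particular search engine will accept or honor the optional tags.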
Entity escaping
Your Sitemap file must be UTF-8 encoded. As with all XML files, any data values (including URLs) must use entity escape codes for the characters listed below.
- Ampersand - & - &amp;amp;
- Single Quote - ' - &amp;apos;
- Double Quote - " - &amp;quot;
- Greater Than - > - &amp;gt;
- Less Than - < - &amp;lt;
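When a Sitemap is assembled by hand from strings, the five characters above have to be escaped explicitly. A minimal sketch using Python's standard library follows; the helper name `escape_value` and the sample URL are hypothetical. `xml.sax.saxutils.escape` handles `&`, `<`, and `>` on its own, and the extra mapping adds the two quote characters.

```python
from xml.sax.saxutils import escape

# Extra replacements for the quote characters, which escape() leaves alone by default.
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_value(value):
    """Entity-escape a data value (such as a URL) for use in a Sitemap."""
    # escape() replaces "&" first, so the entities it produces are not escaped twice.
    return escape(value, QUOTE_ENTITIES)


if __name__ == "__main__":
    # Hypothetical URL with a query string containing an ampersand.
    print(escape_value("http://www.example.com/view?id=1&cat=2"))
    # prints http://www.example.com/view?id=1&amp;cat=2
```

Most XML libraries apply this escaping automatically when they serialize text, so a helper like this is mainly useful for template-based or string-based generation. Applying it to a value that is already escaped would escape the ampersands a second time.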