ROBOTS: Difference between revisions

== SITEMAP ==


<u>[[SITEMAP#ROBOTS|SITEMAP]]</u> is an extension of the Robots Exclusion Protocol that allows a [https://www.sitemaps.org/ sitemap] to be listed in the "<code>/robots.txt</code>" file. <ref><u><code>[[SITEMAP#ROBOTS]]</code></u></ref> <ref><code>https://www.sitemaps.org/</code></ref> <ref><code>https://www.sitemaps.org/protocol.html</code></ref> <ref><code>https://developers.google.com/search/docs/crawling-indexing/robots/create-robots-txt/</code></ref>&ensp; A precompiled list of the website's links makes crawling and indexing the site easier and more efficient for the bot.&ensp; Because a well-behaved bot checks for a "<code>/robots.txt</code>" file before anything else when accessing a website, it is best to list the link to the sitemap directly in that file, so the bot doesn't have to guess whether the website has a sitemap (which could be at either "<code>/sitemap.txt</code>" or "<code>/sitemap.xml</code>").&ensp; The following "<code>/robots.txt</code>" provides an example of a public website with a sitemap.&ensp; Note that, unlike the other ROBOTS instructions, the sitemap should be given as a full URL (uniform resource locator) and not as a relative link.
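
Ahead of the example file, here is a rough sketch of the bot's side of this exchange: how a crawler might read the sitemap entry out of a "<code>/robots.txt</code>" file. It is only an illustration, not how any particular bot works. It uses Python's standard <code>urllib.robotparser</code> module (its <code>site_maps()</code> method requires Python 3.8 or later). The site <code>https://www.example.com</code>, the file contents in <code>ROBOTS_TXT</code>, and the helper name <code>find_sitemaps</code> are all assumptions made up for this sketch.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for an assumed site (www.example.com).
# Note that the Sitemap line carries a full URL, not a relative path.
ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: https://www.example.com/sitemap.xml
"""


def find_sitemaps(robots_txt: str) -> List[str]:
    """Return every sitemap URL listed in the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(find_sitemaps(ROBOTS_TXT))
```

A real crawler would first download "<code>/robots.txt</code>" from the site (for example with <code>RobotFileParser.set_url()</code> followed by <code>read()</code>) instead of parsing a hard-coded string. If the returned list is empty, it would have to fall back to guessing the sitemap's location.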


<code><highlight lang="robots">
<code><highlight lang="robots">