SITEMAP
Adding a [https://www.sitemaps.org/ sitemap] to your website allows searchbots to find pages much faster and more efficiently, allowing them to be quickly indexed for search engines.  Sitemaps can be saved as either "<code>/sitemap.txt</code>" or "<code>/sitemap.xml</code>" and should be in the root webdirectory ("<code>/</code>"). <ref><code>https://www.sitemaps.org/</code></ref> <ref><code>https://www.sitemaps.org/protocol.html</code></ref>  Using plaintext (TXT) is much faster and easier than writing extensible markup language (XML).  I recommend keeping the sitemap as plaintext, allowing the SITEMAP protocol to join the ranks of the other plaintext website protocols for <u>[[ROBOTS]]</u>, [https://www.securitytxt.org/ SECURITY], and [https://humanstxt.org/ HUMANS].
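To see why plaintext is the easier format, here is the same single-URL sitemap in both formats (using a hypothetical homepage URL for illustration).  The plaintext version is one line:

```text
https://www.example.net/
```

The equivalent XML version under the sitemaps.org protocol needs a declaration, a namespace, and nested elements to carry the same information:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.net/</loc>
  </url>
</urlset>
```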
As with all webtext files, you should use an advanced text editor such as [https://www.notepad-plus-plus.org/ Notepad-Plus-Plus] that supports Unix line endings.  Do not use Microsoft Windows Notepad.
== canonical links ==
To create a sitemap, you simply make a plaintext list of each URL (uniform resource locator) for the website with one URL per line and no other content (no comments).  Only URLs for a single domain should be included — do not add URLs for subdomains or alias domains.  You should also only list canonical URLs.  This means that if a particular webpage can be accessed from multiple URLs, only one URL should be listed for that webpage in the sitemap.
For example, there are many different ways to access <u>[[Nicole Sharp's Homepage]]</u>, but only the canonical URL should be listed:
: <u><code>[[about Nicole Sharp's Homepage|https://www.nicolesharp.net/wiki/NikkiWiki]]</code></u>
since all of the other URLs redirect to that URL.
In [[mw:Main Page|Wikimedia MediaWiki]], canonical URLs are provided by adding
: <code>[[mw:$wgEnableCanonicalServerLink|$wgEnableCanonicalServerLink]] = true;</code>
to "<code>LocalSettings.php</code>".
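The canonical-URL rule can also be applied mechanically when assembling the sitemap.  The following Python sketch is my own illustration (the redirect map is hypothetical; a real site would derive it from its server configuration): it collapses alias URLs to their redirect targets so that each webpage is listed exactly once.

```python
def canonicalize(urls, redirects):
    """Collapse each URL to its final redirect target and deduplicate,
    so every webpage appears exactly once in the sitemap.

    `redirects` maps an alias URL to the URL it redirects to.
    Assumes the redirect map contains no cycles."""
    canonical = []
    for url in urls:
        # Follow the redirect chain to its end (the canonical URL).
        while url in redirects:
            url = redirects[url]
        if url not in canonical:
            canonical.append(url)
    return canonical

# Hypothetical redirect map: aliases on the left, targets on the right.
redirects = {
    "http://www.nicolesharp.net/wiki/NikkiWiki":
        "https://www.nicolesharp.net/wiki/NikkiWiki",
    "https://nicolesharp.net/wiki/NikkiWiki":
        "https://www.nicolesharp.net/wiki/NikkiWiki",
}
urls = [
    "http://www.nicolesharp.net/wiki/NikkiWiki",
    "https://nicolesharp.net/wiki/NikkiWiki",
    "https://www.nicolesharp.net/wiki/NikkiWiki",
]
print(canonicalize(urls, redirects))
# → ['https://www.nicolesharp.net/wiki/NikkiWiki']
```

All three input URLs reach the same page, so the sitemap list keeps only the one canonical URL.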
== no subdomains ==
Here are even more ways to access Nicole Sharp's Homepage:
With the exception of "<code><nowiki>https://www.nicolesharp.net/</nowiki></code>", none of these other URLs should be included in "<u><code>https://www.nicolesharp.net/sitemap.txt</code></u>".  All of the URLs should have the same protocol (either all HTTPS [Hypertext Transfer Protocol Secure] or all HTTP [Hypertext Transfer Protocol]) and all of the URLs should be on the same subdomain (for example, either all with "<code>www</code>" or all without "<code>www</code>").
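These two rules (one protocol, one subdomain) are mechanical enough to check automatically.  Here is a minimal Python sketch using the standard library's <code>urllib.parse</code>; the function name is my own invention, not part of any sitemap tooling.

```python
from urllib.parse import urlparse

def check_sitemap_consistency(urls):
    """Check that every URL in a plaintext sitemap shares one protocol
    (scheme) and one subdomain (netloc).  Returns a list of
    human-readable problems; an empty list means the sitemap passes."""
    parsed = [urlparse(u) for u in urls]
    problems = []
    schemes = {p.scheme for p in parsed}
    hosts = {p.netloc for p in parsed}
    if len(schemes) > 1:
        problems.append("mixed protocols: " + ", ".join(sorted(schemes)))
    if len(hosts) > 1:
        problems.append("mixed subdomains: " + ", ".join(sorted(hosts)))
    return problems

# Consistent: all HTTPS, all on "www.nicolesharp.net".
good = [
    "https://www.nicolesharp.net/",
    "https://www.nicolesharp.net/wiki/NikkiWiki",
]
# Inconsistent: HTTP versus HTTPS, and a bare domain versus "www".
bad = [
    "http://nicolesharp.net/",
    "https://www.nicolesharp.net/",
]
print(check_sitemap_consistency(good))  # → []
print(check_sitemap_consistency(bad))   # two problems reported
```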
== example ==
The following "<code>/sitemap.txt</code>" example gives a compliant sitemap for "<u><code>[[Nicole Sharp's Website|https://www.nicolesharp.net/]]</code></u>":
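A minimal sketch of such a file, restricted to URLs already mentioned in this article (the real sitemap would list every canonical page on the site):

```text
https://www.nicolesharp.net/
https://www.nicolesharp.net/wiki/NikkiWiki
```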
Only canonical URLs are included, all of the URLs have the same protocol ("<code>https://</code>"), and all of the URLs are on the same subdomain ("<code>www.nicolesharp.net</code>").  Each new subdomain will need its own sitemap.
== ROBOTS ==
Once your sitemap is completed, you can add it to the Robots Exclusion Protocol to be indexed by searchbots.  An example "<code>/robots.txt</code>" with a sitemap is given below.
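A minimal sketch of a "<code>/robots.txt</code>" that allows all crawlers and points them at the sitemap (the <code>Sitemap</code> directive is the sitemaps.org extension to the Robots Exclusion Protocol; an empty <code>Disallow</code> permits everything):

```text
User-agent: *
Disallow:

Sitemap: https://www.nicolesharp.net/sitemap.txt
```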
* <code>https://www.securitytxt.org/</code>
* <code>https://humanstxt.org/</code>
== references ==
<references />
== keywords ==
<code>bots, CANONICAL, development, hyperlinks, indexing, links, ROBOTS, robots.txt, searchbots, SITEMAP, sitemap.txt, TXT, URLs, web, webcrawlers, webcrawling, webdevelopment, weblinks, WWW</code>
{{#seo:|keywords=bots, CANONICAL, development, hyperlinks, indexing, links, ROBOTS, robots.txt, searchbots, SITEMAP, sitemap.txt, TXT, URLs, web, webcrawlers, webcrawling, webdevelopment, weblinks, WWW}}
[[category:webdevelopment]]