Nicole Sharp (talk | contribs) No edit summary |
[[image:Exciting Comics 3.jpg|thumb|[Image.]  The Robots Exclusion Protocol will not prevent bad bots from accessing your website. <ref><code>[[commons:category:robots in art]]</code></ref>]]
One of the first files you should add to your website is "<code>/robots.txt</code>". <ref><code>https://www.rfc-editor.org/rfc/rfc9309</code></ref> <ref><code>https://www.robotstxt.org/</code></ref>  This is a plaintext file for the Robots Exclusion Protocol (ROBOTS language, Internet Engineering Task Force [IETF] Request for Comments [RFC] 9309). <ref><code>https://www.robotstxt.org/robotstxt.html</code></ref>  The "<code>/robots.txt</code>" file tells web bots which web directories they may access and which they should avoid.
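
As a rough illustration of how a compliant bot might consult these rules, the sketch below uses Python's standard <code>urllib.robotparser</code> module.  The rule set, the "<code>/drafts/</code>" path, the bot name "<code>ExampleBot</code>", and the helper <code>is_allowed</code> are all hypothetical and chosen only for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; the path is an assumption.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines,
    # so no network request is needed for this sketch.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.org/index.html",
        "https://example.org/drafts/notes.html",
    ):
        print(url, is_allowed(EXAMPLE_ROBOTS_TXT, "ExampleBot", url))
```

A well-behaved crawler would run a check like this before each request.  Nothing in the check is enforced by the server; it is only as reliable as the bot that chooses to perform it.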
An important thing to remember is that no bot is <em>required</em> to follow the Robots Exclusion Protocol. <ref><code>https://www.robotstxt.org/faq/prevent.html</code></ref> <ref><code>https://www.robotstxt.org/faq/blockjustbad.html</code></ref> <ref><code>https://www.robotstxt.org/faq/legal.html</code></ref>  The protocol only affects the behavior of compliant, well-behaved bots, and anyone can program a bot to ignore it.  As such, you should <em>not</em> use the Robots Exclusion Protocol to try to hide sensitive directories, especially since publicly listing those directories in "<code>/robots.txt</code>" simply gives malicious bots an easy way to find the very directories you don't want them to visit. <ref><code>https://www.robotstxt.org/faq/nosecurity.html</code></ref>  On Apache HTTP (Hypertext Transfer Protocol) Server, you should instead use "<code>/.htaccess</code>" (hypertext access) to restrict public access to directories.
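
To make the disclosure risk concrete, the sketch below shows how little effort it takes to read the listed paths back out of a "<code>/robots.txt</code>" file.  The helper <code>listed_disallow_paths</code> and the sample directory names are hypothetical, and the parsing is deliberately simplified rather than a full implementation of RFC 9309.

```python
# Hypothetical file content; the directory names are assumptions.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private-admin/   # meant to be "hidden"
Disallow: /backups/
Allow: /
"""


def listed_disallow_paths(robots_txt: str) -> list[str]:
    """Collect every path named in a Disallow line (simplified parsing)."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are matched case-insensitively; empty values are skipped.
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    # Anyone who can download the file can recover the "hidden" paths.
    print(listed_disallow_paths(SAMPLE_ROBOTS_TXT))
```

Because the file is public by design, every path it names should be treated as known to anyone.  Directories that must stay private need server-side access control rather than a <code>Disallow</code> entry.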