ROBOTS: Difference between revisions

No edit summary
Line 1: Line 1:
[[image:Exciting Comics 3.jpg|thumb|[Image.]  The Robots Exclusion Protocol will not prevent bad bots from accessing your website.]]
[[image:Exciting Comics 3.jpg|thumb|[Image.]&ensp; The Robots Exclusion Protocol will not prevent bad bots from accessing your website. <ref><code>[[commons:category:robots in art]]</code></ref>]]


One of the first files you should add to your website is "<code>/robots.txt</code>".&ensp; This is a plaintext file for the [https://www.robotstxt.org/ Robots Exclusion Protocol] (ROBOTS language).&ensp; What the <code>robots.txt</code> file does is instruct which webdirectories should be accessed or avoided by web bots.
One of the first files you should add to your website is "<code>/robots.txt</code>". <ref><code>https://www.robotstxt.org/</code></ref>&ensp; This is a plaintext file for the [https://www.robotstxt.org/ Robots Exclusion Protocol] (ROBOTS language).&ensp; What the <code>robots.txt</code> file does is instruct which webdirectories should be accessed or avoided by web bots.


An important thing to remember is that no bot is <em>required</em> to follow the Robots Exclusion Protocol.&ensp; The protocol only affects the behavior of compliant, well-behaved bots, and anyone can program a bot to ignore "<code>robots.txt</code>".&ensp; As such, you should <em>not</em> use the Robots Exclusion Protocol to try to hide sensitive directories: listing them publicly in "<code>robots.txt</code>" simply gives malicious bots an easy way to find the very directories you don't want them to visit.&ensp; To restrict access to directories (on Apache <abbr title="Hypertext Transfer Protocol">HTTP</abbr> Server), use "<code>/.htaccess</code>" (hypertext access) instead.
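
To illustrate why compliance is voluntary, here is a minimal sketch of how a well-behaved bot might consult "<code>robots.txt</code>" before fetching a page, using Python's standard <code>urllib.robotparser</code> module.&ensp; The sample rules, the <code>is_allowed</code> helper, the <code>ExampleBot</code> user agent, and the <code>example.org</code> URLs are all assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not taken from any real site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/notes.html"):
        url = "https://example.org" + path
        # A compliant bot skips the URL when this is False.
        # Nothing stops a non-compliant bot from never asking.
        print(path, is_allowed(SAMPLE_ROBOTS_TXT, "ExampleBot", url))
```

The check happens entirely on the bot's side: the server only publishes the rules, and the bot decides whether to call something like <code>is_allowed</code> at all.&ensp; That is why the file should be treated as a courtesy to crawlers and not as access control.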
Line 72: Line 72:
* <code>https://www.securitytxt.org/</code>
* <code>https://humanstxt.org/</code>
== references ==
<references />


== keywords ==