ROBOTS: Difference between revisions

One of the first files you should add to your website is "<code>/robots.txt</code>".<ref><code>https://www.robotstxt.org/</code></ref>&ensp; This is a plaintext file written for the [https://www.robotstxt.org/ Robots Exclusion Protocol] (ROBOTS language).&ensp; The <code>robots.txt</code> file tells web bots which directories of the website they may access and which they should avoid.
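As a rough illustration of how a compliant bot might consult these rules, the sketch below feeds a small rule set to Python's standard <code>urllib.robotparser</code> module.&ensp; The file contents, the <code>/private/</code> path, the bot name "ExampleBot", and the <code>example.com</code> URLs are all made up for the example, and <code>is_allowed</code> is a hypothetical helper rather than part of any library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the paths are invented for illustration.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/page.html"):
        url = "https://example.com" + path
        print(path, is_allowed(EXAMPLE_ROBOTS_TXT, "ExampleBot", url))
```

A real crawler would normally fetch the live file first (for instance with the parser's <code>set_url()</code> and <code>read()</code> methods) instead of using a hard-coded string.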


Keep in mind that no bot is <em>required</em> to follow the Robots Exclusion Protocol.&ensp; The protocol only affects the behavior of compliant, well-behaved bots, and anyone can program a bot to ignore it.&ensp; As such, you should <em>not</em> use the Robots Exclusion Protocol to try to hide sensitive directories: publicly listing them in "<code>/robots.txt</code>" simply gives malicious bots an easy way to find the very directories you don't want them to visit.&ensp; To restrict public access to directories (on Apache <abbr title="Hypertext Transfer Protocol">HTTP</abbr> Server), use "<code>/.htaccess</code>" (hypertext access) instead.
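To make the risk concrete, the sketch below shows how little effort it takes for any program to read the "hidden" paths straight out of a <code>robots.txt</code> file.&ensp; The function name <code>listed_paths</code> and the sample paths <code>/admin/</code> and <code>/backups/</code> are assumptions made for this example only.

```python
def listed_paths(robots_txt: str) -> list[str]:
    """Collect every path named in a Disallow rule of the given text."""
    paths = []
    for line in robots_txt.splitlines():
        # Ignore anything after a '#', then trim surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    # Hypothetical file contents; a real bot would simply download /robots.txt.
    sample = "User-agent: *\nDisallow: /admin/\nDisallow: /backups/\n"
    print(listed_paths(sample))
```

Nothing in this sketch requires the caller to obey the rules it reads, which is why access control belongs on the server rather than in <code>robots.txt</code>.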


Comments are added to "<code>/robots.txt</code>" by placing a hash ("<code>#</code>") at the beginning of a new line.
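The sketch below shows what such comment lines look like and how a reader of the file might skip them.&ensp; The sample contents and the helper name <code>strip_comment_lines</code> are hypothetical, and the function only handles whole-line comments as described above.

```python
# Hypothetical robots.txt contents with two comment lines.
COMMENTED_ROBOTS_TXT = """\
# Rules for every bot
User-agent: *
# Keep bots out of temporary files
Disallow: /tmp/
"""


def strip_comment_lines(robots_txt: str) -> list[str]:
    """Return the non-blank lines that do not start with a hash."""
    kept = []
    for line in robots_txt.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            kept.append(stripped)
    return kept


if __name__ == "__main__":
    for rule in strip_comment_lines(COMMENTED_ROBOTS_TXT):
        print(rule)
```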