
12.19 robotparser -- Parser for robots.txt

This module provides a single class, RobotFileParser, which answers questions about whether or not a particular user agent can fetch a URL on the Web site that published the robots.txt file. For more details on the structure of robots.txt files, see the robots exclusion protocol specification.

class RobotFileParser()

This class provides a set of methods to read, parse and answer questions about a single robots.txt file.

set_url(url)
Sets the URL referring to a robots.txt file.
read()
Reads the robots.txt URL and feeds it to the parser.
parse(lines)
Parses the lines argument.
can_fetch(useragent, url)
Returns True if the useragent is allowed to fetch the url according to the rules contained in the parsed robots.txt file.
mtime()
Returns the time the robots.txt file was last fetched. This is useful for long-running web spiders that need to check for new robots.txt files periodically.
modified()
Sets the time the robots.txt file was last fetched to the current time.
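The parse() method makes it possible to exercise the rules without any network access by feeding robots.txt lines directly. A minimal sketch (the user-agent rule, paths, and host are illustrative assumptions, not from a real site):

```python
try:
    import robotparser  # Python 2 module name
except ImportError:
    from urllib import robotparser  # in Python 3 the module moved here

rp = robotparser.RobotFileParser()
# Feed rules directly with parse() instead of fetching them with read().
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# Paths under /private/ are disallowed for every agent; others are allowed.
print(rp.can_fetch("*", "http://example.com/private/page.html"))  # False
print(rp.can_fetch("*", "http://example.com/public/page.html"))   # True
```

Because parse() takes a sequence of lines, the same approach is handy in test suites where fetching a live robots.txt would be slow or flaky.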


The following example demonstrates basic use of the RobotFileParser class (the URLs shown are placeholders; substitute a real site):

>>> import robotparser
>>> rp = robotparser.RobotFileParser()
>>> rp.set_url("http://example.com/robots.txt")
>>> rp.read()
>>> rp.can_fetch("*", "http://example.com/cgi-bin/search")
>>> rp.can_fetch("*", "http://example.com/")
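The mtime() and modified() methods described above support the periodic re-fetch pattern for long-running spiders. A sketch of that pattern, with no network access (the refresh interval and URL are assumptions; a real spider would call read() where modified() appears below):

```python
import time

try:
    import robotparser  # Python 2 module name
except ImportError:
    from urllib import robotparser  # in Python 3 the module moved here

REFRESH_SECONDS = 3600  # hypothetical refresh interval

rp = robotparser.RobotFileParser()
rp.set_url("http://example.com/robots.txt")  # placeholder URL

# mtime() is 0 until a fetch time has been recorded, so the first pass
# through this check always triggers a (re-)fetch.
if time.time() - rp.mtime() > REFRESH_SECONDS:
    # A real spider would call rp.read() here, which also records the
    # fetch time; modified() records the time without touching the network.
    rp.modified()

print(rp.mtime() > 0)  # True once a fetch time has been recorded
```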


