
10 PEP 293: Codec Error Handling Callbacks

When a Unicode string is encoded into a byte string, it may contain characters that the target encoding cannot represent. Until now, Python has offered three ways to handle such errors: "strict" (raise UnicodeError), "ignore" (skip the character), or "replace" (put a question mark in the output string), with "strict" as the default. Other strategies are often desirable, such as inserting an XML character reference or an HTML entity reference into the converted string.
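The three built-in strategies can be compared side by side. The following is a minimal sketch written in present-day Python 3 syntax (an assumption; this section itself describes a much earlier release), and the sample string is made up for illustration.

```python
# Sample text (hypothetical): "é" and "€" have no ASCII representation.
text = "caf\u00e9 \u20ac5"

# "strict" is the default: the first unencodable character raises an error.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeError:
    strict_failed = True

# "ignore" silently drops the unencodable characters.
ignored = text.encode("ascii", "ignore")

# "replace" substitutes a question mark for each unencodable character.
replaced = text.encode("ascii", "replace")

print(strict_failed, ignored, replaced)
```

Both "ignore" and "replace" lose information, which is the motivation for the pluggable handlers described in this section.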

Python now has a flexible framework for adding different processing strategies. New error handlers can be registered with codecs.register_error, and codecs can then look up a handler by name with codecs.lookup_error. An equivalent C API has been added for codecs written in C. The error handler is called with an exception object that carries the necessary state: the string being converted, the position in the string where the error was detected, and the target encoding. The handler can then either raise an exception or return a replacement string, together with the position at which the codec should resume.
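As an illustration of the framework, here is a hedged sketch of a custom handler that emits HTML entity references. It assumes Python 3 (where the entity table lives in html.entities); the handler name "htmlentityreplace" and the function html_entity_replace are hypothetical and are not part of the standard library.

```python
import codecs
from html.entities import codepoint2name


def html_entity_replace(exc):
    """Hypothetical handler: replace unencodable characters with HTML entities."""
    # This sketch only deals with encoding errors; anything else is re-raised.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    pieces = []
    # exc.object is the string being converted; exc.start and exc.end
    # delimit the run of characters that could not be encoded.
    for ch in exc.object[exc.start:exc.end]:
        name = codepoint2name.get(ord(ch))
        if name is not None:
            pieces.append("&%s;" % name)      # named entity, e.g. &eacute;
        else:
            pieces.append("&#%d;" % ord(ch))  # fall back to a numeric reference
    # Return the replacement text and the position where encoding resumes.
    return "".join(pieces), exc.end


# Make the handler available under a name usable as the "errors" argument.
codecs.register_error("htmlentityreplace", html_entity_replace)

encoded = "caf\u00e9 \u20ac5 \u4e2d".encode("ascii", "htmlentityreplace")
print(encoded)
```

Once registered, the name works anywhere an error-handling scheme is accepted, and codecs.lookup_error("htmlentityreplace") returns the same function object.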

Two additional error handlers have been implemented using this framework: "backslashreplace" uses Python backslash quoting to represent unencodable characters, and "xmlcharrefreplace" emits XML character references.
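The two new handlers can be tried directly by passing their names as the error-handling argument. This short sketch again assumes Python 3 syntax and uses a made-up sample string.

```python
# Sample text (hypothetical): "é" is U+00E9 and "€" is U+20AC.
text = "caf\u00e9 \u20ac5"

# Unencodable characters become Python-style backslash escapes.
escaped = text.encode("ascii", "backslashreplace")

# Unencodable characters become numeric XML character references.
xml_refs = text.encode("ascii", "xmlcharrefreplace")

print(escaped)
print(xml_refs)
```

Unlike "ignore" and "replace", both handlers keep enough information to recover the original characters.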



