4.9.1.1 Codec Objects
The Codec class defines the following methods, which also define the function
interfaces of the stateless encoder and decoder:
- encode(input[, errors])
- Encodes the object input and returns a tuple (output object, length
consumed). Codecs are not restricted to use with Unicode, but in a Unicode context
encoding converts a Unicode object to a plain string using a particular character set
encoding (e.g., cp1252 or iso-8859-1).
errors defines the error handling to apply. It defaults to 'strict'
handling.
The method must not store state in the Codec instance. Use StreamWriter for codecs which
have to keep state in order to make encoding efficient.
The encoder must be able to handle zero length input and return an empty object of the
output object type in this situation.
- decode(input[, errors])
- Decodes the object input and returns a tuple (output object, length
consumed). In a Unicode context, decoding converts a plain string encoded using a
particular character set encoding to a Unicode object.
input must be an object which provides the bf_getreadbuf buffer
slot. Python strings, buffer objects and memory mapped files are examples of objects
providing this slot.
errors defines the error handling to apply. It defaults to 'strict'
handling.
The method must not store state in the Codec instance. Use StreamReader for codecs which
have to keep state in order to make decoding efficient.
The decoder must be able to handle zero length input and return an empty object of the
output object type in this situation.
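The two method contracts above can be illustrated with a minimal sketch. It assumes Python 3, where the "Unicode object" is str and the "plain string" is bytes. The class name Latin1Codec is hypothetical; it simply delegates to the functions returned by codecs.getencoder() and codecs.getdecoder().

```python
import codecs

# Stateless encoder/decoder functions for iso-8859-1 from the codec registry.
_encode = codecs.getencoder('iso-8859-1')
_decode = codecs.getdecoder('iso-8859-1')


class Latin1Codec(codecs.Codec):
    """Hypothetical stateless codec; nothing is stored on the instance."""

    def encode(self, input, errors='strict'):
        # Returns a tuple (output object, length consumed).
        return _encode(input, errors)

    def decode(self, input, errors='strict'):
        # Accepts any object exposing a read buffer (bytes, bytearray, ...).
        return _decode(input, errors)


codec = Latin1Codec()

encoded, consumed = codec.encode('caf\xe9')   # (b'caf\xe9', 4)
decoded, read = codec.decode(encoded)         # ('caf\xe9', 4)

# Zero-length input yields an empty object of the output type.
empty_encoded = codec.encode('')              # (b'', 0)
empty_decoded = codec.decode(b'')             # ('', 0)

# With errors='replace', an unencodable character becomes '?'
# instead of raising UnicodeEncodeError as 'strict' would.
replaced, _ = codec.encode('\u20ac', 'replace')
```

Because neither method touches self, the same instance can be shared freely between callers; that is the property the "must not store state" rule protects.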
The StreamWriter and StreamReader classes
provide generic working interfaces which can be used to implement new encoding submodules
very easily. See encodings.utf_8 for an example of how this is done.
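As a rough sketch of that pattern, the stateless Codec can be mixed into the generic stream classes so that they pick up its encode and decode methods. This is loosely modeled on how an encodings submodule may be laid out, not a copy of encodings.utf_8; it assumes Python 3 and uses io.BytesIO as a stand-in for a real file.

```python
import codecs
import io


class Codec(codecs.Codec):
    def encode(self, input, errors='strict'):
        return codecs.utf_8_encode(input, errors)

    def decode(self, input, errors='strict'):
        # Bytes of an incomplete trailing sequence are reported as not
        # consumed, so StreamReader can retry them with more data.
        return codecs.utf_8_decode(input, errors)


# The stream classes inherit encode()/decode() from Codec and add
# the buffering and file-like methods from the codecs base classes.
class StreamWriter(Codec, codecs.StreamWriter):
    pass


class StreamReader(Codec, codecs.StreamReader):
    pass


buffer = io.BytesIO()
writer = StreamWriter(buffer)
writer.write('na\xefve ')
writer.write('caf\xe9')
raw = buffer.getvalue()

reader = StreamReader(io.BytesIO(raw))
text = reader.read()
```

Any state needed between calls lives in the StreamWriter or StreamReader instance, which keeps the Codec methods themselves stateless.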