14 PEP 307: Pickle Enhancements
The pickle and cPickle modules received
some attention during the 2.3 development cycle. In 2.2, new-style classes could be
pickled without difficulty, but they weren't pickled very compactly; PEP 307 quotes a
trivial example where a new-style class results in a pickled string three times longer
than that for a classic class.
The solution was to invent a new pickle protocol. The pickle.dumps()
function has supported a text-or-binary flag for a long time. In 2.3, this flag is
redefined from a Boolean to an integer: 0 is the old text-mode pickle format, 1 is the old
binary format, and now 2 is a new 2.3-specific format. A new constant,
pickle.HIGHEST_PROTOCOL, can be used to select the fanciest protocol
available.
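As a minimal sketch of protocol selection, the following passes each protocol number to pickle.dumps() and checks that every result unpickles to the same value. The sample dictionary and the variable names are illustrative only, not taken from PEP 307.

```python
import pickle

# Illustrative sample data; any picklable object would do.
data = {"name": "example", "values": [1, 2, 3]}

text_pickle = pickle.dumps(data, 0)    # old text-mode format
binary_pickle = pickle.dumps(data, 1)  # old binary format
proto2_pickle = pickle.dumps(data, 2)  # protocol added in 2.3
newest_pickle = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)

# Every protocol round-trips to an equal object; loads() detects
# the protocol from the data itself.
for blob in (text_pickle, binary_pickle, proto2_pickle, newest_pickle):
    assert pickle.loads(blob) == data
```

The value of pickle.HIGHEST_PROTOCOL depends on the interpreter, so a pickle written with it may not be readable by older Python versions.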
Unpickling is no longer considered a safe operation. Python 2.2's pickle
provided hooks for trying to prevent unsafe classes from being unpickled
(specifically, a __safe_for_unpickling__ attribute), but none of this code was ever
audited, so it has all been removed in 2.3. You should not unpickle untrusted
data in any version of Python.
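To illustrate why unpickling cannot be treated as safe, the sketch below shows that a pickle can name a callable that is invoked during loading. The function record() and the class Payload are hypothetical names invented for this example, and the callable here is deliberately harmless; a hostile pickle could name any importable callable instead.

```python
import pickle

calls = []

def record(message):
    # Harmless stand-in for whatever a hostile pickle might call.
    calls.append(message)
    return message

class Payload(object):
    def __reduce__(self):
        # Tells pickle: "to rebuild me, call record(...)".
        return (record, ("ran during loads",))

blob = pickle.dumps(Payload(), 2)

# Loading the data calls record(); the result is whatever it returns.
result = pickle.loads(blob)
assert calls == ["ran during loads"]
```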
To reduce the pickling overhead for new-style classes, a new interface for customizing
pickling was added using three special methods: __getstate__, __setstate__, and
__getnewargs__. Consult PEP 307 for the full semantics of these methods.
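The following is a rough sketch of how the three methods can cooperate under protocol 2, not a statement of the full semantics. The Temperature class and its attributes are hypothetical. It assumes the behaviour described in PEP 307: on unpickling, the object is created by calling __new__ with the arguments from __getnewargs__, __init__ is not called, and the saved state is then passed to __setstate__.

```python
import pickle

class Temperature(object):
    def __new__(cls, scale):
        self = object.__new__(cls)
        self.scale = scale        # set at creation time
        return self

    def __init__(self, scale):
        self.readings = []
        self._cache = None        # derived data, not worth pickling

    def __getnewargs__(self):
        # Arguments handed to __new__ when unpickling (protocol 2).
        return (self.scale,)

    def __getstate__(self):
        # Only the readings are stored; the cache is left out.
        return {"readings": self.readings}

    def __setstate__(self, state):
        self.readings = state["readings"]
        self._cache = None        # rebuilt lazily after loading

original = Temperature("C")
original.readings.append(21.5)
original._cache = "stale"

restored = pickle.loads(pickle.dumps(original, 2))
assert restored.scale == "C"
assert restored.readings == [21.5]
assert restored._cache is None
```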
As a way to compress pickles yet further, it's now possible to use integer codes
instead of long strings to identify pickled classes. The Python Software Foundation will
maintain a list of standardized codes; there's also a range of codes for private use.
Currently no codes have been specified.
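As a hedged sketch of the extension-code mechanism, the example below registers a private code for a class and compares the resulting pickle sizes. It relies on the add_extension() and remove_extension() functions of the copy_reg module (named copyreg in Python 3). The Marker class is hypothetical, and the code 240 is an assumption based on the private-use range listed in PEP 307.

```python
import pickle

try:
    import copyreg                 # Python 3 name
except ImportError:
    import copy_reg as copyreg     # Python 2 name

class Marker(object):
    pass

# Assumed to lie in PEP 307's private-use range; not an assigned code.
PRIVATE_CODE = 240

copyreg.add_extension(Marker.__module__, "Marker", PRIVATE_CODE)
try:
    # With a registered code, protocol 2 stores the integer code
    # instead of the module and class names.
    coded = pickle.dumps(Marker, 2)
    # The reader needs the same registration to resolve the code.
    assert pickle.loads(coded) is Marker
finally:
    copyreg.remove_extension(Marker.__module__, "Marker", PRIVATE_CODE)

# Without the registration, the names are written out in full.
plain = pickle.dumps(Marker, 2)
assert len(coded) < len(plain)
```

Because both the writer and the reader must agree on the registration, private codes are only suitable when the same application controls both ends.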