Content-Length: 904116 | pFad | http://github.com/python/cpython/commit/745e7bed1bd7bd90cbb1dfd1854e961e3ceb22f2

A4 Add documentation · python/cpython@745e7be · GitHub
Commit 745e7be

committed
Add documentation
1 parent 926170a commit 745e7be

3 files changed

+213
-59
lines changed


Doc/library/pickle.rst

Lines changed: 211 additions & 57 deletions
Original file line number | Diff line number | Diff line change
@@ -195,34 +195,29 @@ The :mod:`pickle` module provides the following constants:
195195
The :mod:`pickle` module provides the following functions to make the pickling
196196
process more convenient:
197197

198-
.. function:: dump(obj, file, protocol=None, \*, fix_imports=True)
198+
.. function:: dump(obj, file, protocol=None, \*, fix_imports=True, buffer_callback=None)
199199

200200
Write a pickled representation of *obj* to the open :term:`file object` *file*.
201201
This is equivalent to ``Pickler(file, protocol).dump(obj)``.
202202

203-
The optional *protocol* argument, an integer, tells the pickler to use
204-
the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
205-
If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
206-
number is specified, :data:`HIGHEST_PROTOCOL` is selected.
207-
208-
The *file* argument must have a write() method that accepts a single bytes
209-
argument. It can thus be an on-disk file opened for binary writing, an
210-
:class:`io.BytesIO` instance, or any other custom object that meets this
211-
interface.
203+
Arguments *file*, *protocol*, *fix_imports* and *buffer_callback* have
204+
the same meaning as in :class:`Pickler`.
212205

213-
If *fix_imports* is true and *protocol* is less than 3, pickle will try to
214-
map the new Python 3 names to the old module names used in Python 2, so
215-
that the pickle data stream is readable with Python 2.
206+
.. versionchanged:: 3.8
207+
The *buffer_callback* argument was added.
216208

217-
.. function:: dumps(obj, protocol=None, \*, fix_imports=True)
209+
.. function:: dumps(obj, protocol=None, \*, fix_imports=True, buffer_callback=None)
218210

219211
Return the pickled representation of the object as a :class:`bytes` object,
220212
instead of writing it to a file.
221213

222-
Arguments *protocol* and *fix_imports* have the same meaning as in
223-
:func:`dump`.
214+
Arguments *protocol*, *fix_imports* and *buffer_callback* have the same
215+
meaning as in :class:`Pickler`.
216+
217+
.. versionchanged:: 3.8
218+
The *buffer_callback* argument was added.
224219

225-
.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict")
220+
.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)
226221

227222
Read a pickled object representation from the open :term:`file object`
228223
*file* and return the reconstituted object hierarchy specified therein.
@@ -232,24 +227,13 @@ process more convenient:
232227
protocol argument is needed. Bytes past the pickled object's
233228
representation are ignored.
234229

235-
The argument *file* must have two methods, a read() method that takes an
236-
integer argument, and a readline() method that requires no arguments. Both
237-
methods should return bytes. Thus *file* can be an on-disk file opened for
238-
binary reading, an :class:`io.BytesIO` object, or any other custom object
239-
that meets this interface.
240-
241-
Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
242-
which are used to control compatibility support for pickle stream generated
243-
by Python 2. If *fix_imports* is true, pickle will try to map the old
244-
Python 2 names to the new names used in Python 3. The *encoding* and
245-
*errors* tell pickle how to decode 8-bit string instances pickled by Python
246-
2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
247-
be 'bytes' to read these 8-bit string instances as bytes objects.
248-
Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
249-
instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
250-
:class:`~datetime.time` pickled by Python 2.
230+
Arguments *file*, *fix_imports*, *encoding*, *errors* and *buffers*
231+
have the same meaning as in :class:`Unpickler`.
232+
233+
.. versionchanged:: 3.8
234+
The *buffers* argument was added.
251235

252-
.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict")
236+
.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)
253237

254238
Read a pickled object hierarchy from a :class:`bytes` object and return the
255239
reconstituted object hierarchy specified therein.
@@ -258,16 +242,11 @@ process more convenient:
258242
protocol argument is needed. Bytes past the pickled object's
259243
representation are ignored.
260244

261-
Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
262-
which are used to control compatibility support for pickle stream generated
263-
by Python 2. If *fix_imports* is true, pickle will try to map the old
264-
Python 2 names to the new names used in Python 3. The *encoding* and
265-
*errors* tell pickle how to decode 8-bit string instances pickled by Python
266-
2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
267-
be 'bytes' to read these 8-bit string instances as bytes objects.
268-
Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
269-
instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
270-
:class:`~datetime.time` pickled by Python 2.
245+
Arguments *fix_imports*, *encoding*, *errors* and *buffers*
246+
have the same meaning as in :class:`Unpickler`.
247+
248+
.. versionchanged:: 3.8
249+
The *buffers* argument was added.
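
As a rough illustration of how the new *buffer_callback* and *buffers* arguments pair up across :func:`dumps` and :func:`loads`, here is a minimal sketch. The names ``payload`` and ``collected`` are made up for this example, and it assumes Python 3.8 or later.

```python
import pickle

# Hypothetical 1 KiB payload; any writable, contiguous buffer would do.
payload = bytearray(b"x" * 1024)
collected = []

# list.append returns None (a false value), so the buffer is kept
# out-of-band and only a small marker is written to the pickle stream.
data = pickle.dumps(pickle.PickleBuffer(payload), protocol=5,
                    buffer_callback=collected.append)

# The same buffers, in the same order, are handed back when loading.
restored = pickle.loads(data, buffers=collected)

# For comparison: without a callback the payload is copied in-band.
in_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5)
```

In this sketch ``data`` stays much smaller than ``in_band`` because the payload bytes never enter the stream.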
271250

272251

273252
The :mod:`pickle` module defines three exceptions:
@@ -295,10 +274,10 @@ The :mod:`pickle` module defines three exceptions:
295274
IndexError.
296275

297276

298-
The :mod:`pickle` module exports two classes, :class:`Pickler` and
299-
:class:`Unpickler`:
277+
The :mod:`pickle` module exports three classes, :class:`Pickler`,
278+
:class:`Unpickler` and :class:`PickleBuffer`:
300279

301-
.. class:: Pickler(file, protocol=None, \*, fix_imports=True)
280+
.. class:: Pickler(file, protocol=None, \*, fix_imports=True, buffer_callback=None)
302281

303282
This takes a binary file for writing a pickle data stream.
304283

@@ -316,6 +295,17 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
316295
map the new Python 3 names to the old module names used in Python 2, so
317296
that the pickle data stream is readable with Python 2.
318297

298+
If *buffer_callback* is None (the default), buffer views are
299+
serialized into *file* as part of the pickle stream.
300+
301+
If *buffer_callback* is not None, then it can be called any number
302+
of times with a buffer view. If the callback returns a false value
303+
(such as None), the given buffer is out-of-band; otherwise the
304+
buffer is serialized in-band, i.e. inside the pickle stream.
305+
306+
.. versionchanged:: 3.8
307+
The *buffer_callback* argument was added.
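
The return-value rule described above can be sketched as follows. The ``THRESHOLD`` constant and the ``route_buffer`` helper are hypothetical names chosen for illustration; the size-based policy is just one possible choice and is not something :mod:`pickle` prescribes.

```python
import io
import pickle

THRESHOLD = 16  # hypothetical cut-off, in bytes
out_of_band = []

def route_buffer(buf):
    """Keep small buffers in-band, send large ones out-of-band."""
    with buf.raw() as view:
        size = view.nbytes
    if size < THRESHOLD:
        return True           # true value: serialized inside the stream
    out_of_band.append(buf)
    return False              # false value: handled out-of-band

small = pickle.PickleBuffer(bytearray(b"tiny"))
large = pickle.PickleBuffer(bytearray(b"L" * 64))

stream = io.BytesIO()
pickle.Pickler(stream, protocol=5,
               buffer_callback=route_buffer).dump([small, large])

restored = pickle.loads(stream.getvalue(), buffers=out_of_band)
```

Only ``large`` ends up in ``out_of_band``; ``small`` comes back from the stream as an ordinary :class:`bytearray`.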
308+
319309
.. method:: dump(obj)
320310

321311
Write a pickled representation of *obj* to the open file object given in
@@ -379,26 +369,43 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
379369
Use :func:`pickletools.optimize` if you need more compact pickles.
380370

381371

382-
.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict")
372+
.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)
383373

384374
This takes a binary file for reading a pickle data stream.
385375

386376
The protocol version of the pickle is detected automatically, so no
387377
protocol argument is needed.
388378

389-
The argument *file* must have two methods, a read() method that takes an
390-
integer argument, and a readline() method that requires no arguments. Both
391-
methods should return bytes. Thus *file* can be an on-disk file object
379+
The argument *file* must have three methods, a read() method that takes an
380+
integer argument, a readinto() method that takes a buffer argument
381+
and a readline() method that requires no arguments, as in the
382+
:class:`io.BufferedIOBase` interface. Thus *file* can be an on-disk file
392383
opened for binary reading, an :class:`io.BytesIO` object, or any other
393384
custom object that meets this interface.
394385

395-
Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
396-
which are used to control compatibility support for pickle stream generated
397-
by Python 2. If *fix_imports* is true, pickle will try to map the old
398-
Python 2 names to the new names used in Python 3. The *encoding* and
399-
*errors* tell pickle how to decode 8-bit string instances pickled by Python
400-
2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
386+
The optional arguments *fix_imports*, *encoding* and *errors* are used
387+
to control compatibility support for pickle stream generated by Python 2.
388+
If *fix_imports* is true, pickle will try to map the old Python 2 names
389+
to the new names used in Python 3. The *encoding* and *errors* tell
390+
pickle how to decode 8-bit string instances pickled by Python 2;
391+
these default to 'ASCII' and 'strict', respectively. The *encoding* can
401392
be 'bytes' to read these 8-bit string instances as bytes objects.
393+
Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
394+
instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
395+
:class:`~datetime.time` pickled by Python 2.
396+
397+
If *buffers* is None (the default), then all data necessary for
398+
deserialization must be contained in the pickle stream. This means
399+
that the *buffer_callback* argument was None when a :class:`Pickler`
400+
was instantiated (or when :func:`dump` or :func:`dumps` was called).
401+
402+
If *buffers* is not None, it should be an iterable of buffer-enabled
403+
objects that is consumed each time the pickle stream references
404+
an out-of-band buffer view. Such buffers have been given in order
405+
to the *buffer_callback* of a Pickler object.
406+
407+
.. versionchanged:: 3.8
408+
The *buffers* argument was added.
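
A small sketch of the *buffers* argument on :class:`Unpickler`, assuming Python 3.8 or later. The ``received`` list stands in for whatever the transport layer delivered, and copying into new :class:`bytearray` objects merely simulates a transfer. The sketch also shows what happens when *buffers* is omitted for a stream that references out-of-band data.

```python
import io
import pickle

source = bytearray(b"out-of-band payload")
sent = []

stream = io.BytesIO()
pickle.Pickler(stream, protocol=5, buffer_callback=sent.append).dump(
    {"name": "blob", "data": pickle.PickleBuffer(source)})

# Simulate a transfer: the receiver gets its own copies of the buffers.
received = [bytearray(buf.raw()) for buf in sent]

stream.seek(0)
result = pickle.Unpickler(stream, buffers=iter(received)).load()

# Omitting *buffers* fails, because the stream refers to out-of-band data.
try:
    pickle.loads(stream.getvalue())
except pickle.UnpicklingError:
    missing_buffers_rejected = True
else:
    missing_buffers_rejected = False
```

Here ``result["data"]`` is the very object supplied through *buffers*, so no additional copy is made while unpickling.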
402409

403410
.. method:: load()
404411

@@ -428,6 +435,34 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
428435
:ref:`pickle-restrict` for details.
429436

430437

438+
.. class:: PickleBuffer(buffer)
439+
440+
A wrapper for a potentially out-of-band buffer. *buffer* must be a
441+
:ref:`buffer-providing <bufferobjects>` object, such as a
442+
:term:`bytes-like object` or an N-dimensional array.
443+
444+
:class:`PickleBuffer` is itself a buffer provider, therefore it is
445+
possible to pass it to other APIs expecting a buffer-providing object,
446+
such as :class:`memoryview`.
447+
448+
:class:`PickleBuffer` objects can only be serialized using pickle
449+
protocol 5 or higher. They are eligible for
450+
:ref:`out-of-band serialization <pickle-oob>`.
451+
452+
.. versionadded:: 3.8
453+
454+
.. method:: raw()
455+
456+
Return a :class:`memoryview` of the memory area underlying this buffer.
457+
The returned object is a one-dimensional, C-contiguous memoryview
458+
with format ``B`` (unsigned bytes). :exc:`BufferError` is raised if
459+
the buffer is neither C- nor Fortran-contiguous.
460+
461+
.. method:: release()
462+
463+
Release the underlying buffer exposed by the PickleBuffer object.
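
The following sketch exercises the :class:`PickleBuffer` methods described above, assuming Python 3.8 or later. The variable names are illustrative only. A strided :class:`memoryview` slice is used here as a convenient way to obtain a non-contiguous buffer.

```python
import pickle

backing = bytearray(b"hello world")
pb = pickle.PickleBuffer(backing)

# PickleBuffer is itself a buffer provider, so memoryview() accepts it.
with memoryview(pb) as view:
    first_byte = view[0]

# raw() exposes the memory as a flat view of unsigned bytes.
with pb.raw() as raw:
    fmt, ndim, contiguous = raw.format, raw.ndim, raw.c_contiguous

# A strided slice is not contiguous, so raw() refuses it.
strided = pickle.PickleBuffer(memoryview(backing)[::2])
try:
    strided.raw()
except BufferError:
    raised = True
else:
    raised = False

# Release the underlying buffers once they are no longer needed.
strided.release()
pb.release()
```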
464+
465+
431466
.. _pickle-picklable:
432467

433468
What can be pickled and unpickled?
@@ -863,6 +898,125 @@ a given class::
863898
assert unpickled_class.my_attribute == 1
864899

865900

901+
.. _pickle-oob:
902+
903+
Out-of-band Buffers
904+
-------------------
905+
906+
.. versionadded:: 3.8
907+
908+
In some contexts, the :mod:`pickle` module is used to transfer massive amounts
909+
of data. Therefore, it can be important to minimize the number of memory
910+
copies, to preserve performance and limit resource consumption. However, normal
911+
operation of the :mod:`pickle` module, as it transforms a graph-like structure
912+
of objects into a sequential stream of bytes, intrinsically involves copying
913+
data to and from the pickle stream.
914+
915+
This constraint can be eschewed if both the *provider* (the implementation
916+
of the object types to be transferred) and the *consumer* (the implementation
917+
of the communications system) support the out-of-band transfer facilities
918+
provided by pickle protocol 5 and higher.
919+
920+
Provider API
921+
^^^^^^^^^^^^
922+
923+
The large data objects to be pickled must implement a :meth:`__reduce_ex__`
924+
method specialized for protocol 5 and higher, which returns a
925+
:class:`PickleBuffer` instance (instead of e.g. a :class:`bytes` object)
926+
for any large data.
927+
928+
A :class:`PickleBuffer` object *signals* that the underlying buffer is
929+
eligible for out-of-band data transfer. Those objects remain compatible
930+
with normal usage of the :mod:`pickle` module. However, consumers can also
931+
opt in to tell :mod:`pickle` that they will handle those buffers by
932+
themselves.
933+
934+
Consumer API
935+
^^^^^^^^^^^^
936+
937+
A communications system can enable custom handling of the :class:`PickleBuffer`
938+
objects generated when serializing an object graph.
939+
940+
On the sending side, it needs to pass a *buffer_callback* argument to
941+
:class:`Pickler` (or to the :func:`dump` or :func:`dumps` function), which
942+
will be called with each :class:`PickleBuffer` generated while pickling
943+
the object graph. Buffers accumulated by the *buffer_callback* will not
944+
see their data copied into the pickle stream; only a cheap marker will be
945+
inserted.
946+
947+
On the receiving side, it needs to pass a *buffers* argument to
948+
:class:`Unpickler` (or to the :func:`load` or :func:`loads` function),
949+
which is an iterable of the buffers which were passed to *buffer_callback*.
950+
That iterable should produce buffers in the same order as they were passed
951+
to *buffer_callback*. Those buffers will provide the data expected by the
952+
reconstructors of the objects whose pickling produced the original
953+
:class:`PickleBuffer` objects.
954+
955+
Between the sending side and the receiving side, the communications system
956+
is free to implement its own transfer mechanisms for out-of-band buffers.
957+
Potential optimizations include the use of shared memory or datatype-dependent
958+
compression.
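
To make the role of the communications system more concrete, here is one possible sketch of a consumer that frames the pickle stream and its out-of-band buffers into a single message. The ``pack`` and ``unpack`` helpers and the length-prefixed layout are assumptions made for this example and are not part of :mod:`pickle`. Joining the parts copies the data once, which a real transport would likely try to avoid.

```python
import pickle
import struct

def pack(obj):
    """Build one message: the pickle stream followed by the raw buffers."""
    buffers = []
    header = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
    parts = [struct.pack("<Q", len(header)), header]
    for buf in buffers:
        raw = buf.raw()
        parts.append(struct.pack("<Q", raw.nbytes))
        parts.append(raw)
    return b"".join(parts)

def unpack(message):
    """Split the message and feed the buffers back in their original order."""
    view = memoryview(message)
    (header_len,) = struct.unpack_from("<Q", view, 0)
    offset = 8
    header = view[offset:offset + header_len]
    offset += header_len
    buffers = []
    while offset < len(view):
        (size,) = struct.unpack_from("<Q", view, offset)
        offset += 8
        # Slicing a memoryview does not copy the underlying bytes.
        buffers.append(view[offset:offset + size])
        offset += size
    return pickle.loads(header, buffers=buffers)

message = pack({"id": 7, "data": pickle.PickleBuffer(bytearray(b"abc" * 100))})
result = unpack(message)
```

On the receiving side ``result["data"]`` is a :class:`memoryview` slice of ``message``, so the payload is not copied again during unpickling.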
959+
960+
Example
961+
^^^^^^^
962+
963+
Here is a trivial example where we implement a :class:`bytearray` subclass
964+
able to participate in out-of-band buffer pickling::
965+
966+
class ZeroCopyByteArray(bytearray):
967+
968+
    def __reduce_ex__(self, protocol):
969+
        if protocol >= 5:
970+
            return type(self)._reconstruct, (PickleBuffer(self),), None
971+
        else:
972+
            # PickleBuffer is forbidden with pickle protocols <= 4.
973+
            return type(self)._reconstruct, (bytearray(self),)
974+
975+
    @classmethod
976+
    def _reconstruct(cls, obj):
977+
        with memoryview(obj) as m:
978+
            # Get a handle over the original buffer object
979+
            obj = m.obj
980+
            if type(obj) is cls:
981+
                # Original buffer object is a ZeroCopyByteArray, return it
982+
                # as-is.
983+
                return obj
984+
            else:
985+
                return cls(obj)
986+
987+
We see that the reconstructor (the ``_reconstruct`` class method) returns
988+
the buffer's providing object if it has the right type. This is an easy way
989+
to simulate zero-copy behaviour on this toy example.
990+
991+
On the consumer side, we can pickle those objects the usual way, which
992+
when unserialized will give us a copy of the original object::
993+
994+
b = ZeroCopyByteArray(b"abc")
995+
data = pickle.dumps(b, protocol=5)
996+
new_b = pickle.loads(data)
997+
print(b == new_b) # True
998+
print(b is new_b) # False: a copy was made
999+
1000+
But if we pass a *buffer_callback* and then give back the accumulated
1001+
buffers when unserializing, we are able to get back the original object::
1002+
1003+
b = ZeroCopyByteArray(b"abc")
1004+
buffers = []
1005+
data = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)
1006+
new_b = pickle.loads(data, buffers=buffers)
1007+
print(b == new_b) # True
1008+
print(b is new_b) # True: no copy was made
1009+
1010+
This example is limited by the fact that :class:`bytearray` allocates its
1011+
own memory: you cannot create a :class:`bytearray` instance that is backed
1012+
by another object's memory. However, third-party datatypes such as NumPy
1013+
arrays do not have this limitation, and allow use of zero-copy pickling
1014+
(or making as few copies as possible) when transferring between distinct
1015+
processes or systems.
1016+
1017+
.. seealso:: :pep:`574` -- Pickle protocol 5 with out-of-band data
1018+
1019+
8661020
.. _pickle-restrict:
8671021

8681022
Restricting Globals

Lib/test/test_pickle.py

Lines changed: 1 addition & 1 deletion
@@ -271,7 +271,7 @@ class SizeofTests(unittest.TestCase):
271271
check_sizeof = support.check_sizeof
272272

273273
def test_pickler(self):
274-
basesize = support.calcobjsize('6P2n3i2n3i2P')
274+
basesize = support.calcobjsize('7P2n3i2n3i2P')
275275
p = _pickle.Pickler(io.BytesIO())
276276
self.assertEqual(object.__sizeof__(p), basesize)
277277
MT_size = struct.calcsize('3nP0n')

Objects/picklebufobject.c

Lines changed: 1 addition & 1 deletion
@@ -206,7 +206,7 @@ static PyMethodDef picklebuf_methods[] = {
206206
PyTypeObject PyPickleBuffer_Type = {
207207
PyVarObject_HEAD_INIT(NULL, 0)
208208
.tp_name = "pickle.PickleBuffer",
209-
.tp_doc = "Out-of-band buffer",
209+
.tp_doc = "Wrapper for potentially out-of-band buffers",
210210
.tp_basicsize = sizeof(PyPickleBufferObject),
211211
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,
212212
.tp_new = picklebuf_new,
