@@ -195,34 +195,29 @@ The :mod:`pickle` module provides the following constants:
 The :mod:`pickle` module provides the following functions to make the pickling
 process more convenient:

-.. function:: dump(obj, file, protocol=None, \*, fix_imports=True)
+.. function:: dump(obj, file, protocol=None, \*, fix_imports=True, buffer_callback=None)

    Write a pickled representation of *obj* to the open :term:`file object` *file*.
    This is equivalent to ``Pickler(file, protocol).dump(obj)``.

-   The optional *protocol* argument, an integer, tells the pickler to use
-   the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
-   If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
-   number is specified, :data:`HIGHEST_PROTOCOL` is selected.
-
-   The *file* argument must have a write() method that accepts a single bytes
-   argument. It can thus be an on-disk file opened for binary writing, an
-   :class:`io.BytesIO` instance, or any other custom object that meets this
-   interface.
+   Arguments *file*, *protocol*, *fix_imports* and *buffer_callback* have
+   the same meaning as in :class:`Pickler`.

-   If *fix_imports* is true and *protocol* is less than 3, pickle will try to
-   map the new Python 3 names to the old module names used in Python 2, so
-   that the pickle data stream is readable with Python 2.
+   .. versionchanged:: 3.8
+      The *buffer_callback* argument was added.

-.. function:: dumps(obj, protocol=None, \*, fix_imports=True)
+.. function:: dumps(obj, protocol=None, \*, fix_imports=True, buffer_callback=None)

    Return the pickled representation of the object as a :class:`bytes` object,
    instead of writing it to a file.

-   Arguments *protocol* and *fix_imports* have the same meaning as in
-   :func:`dump`.
+   Arguments *protocol*, *fix_imports* and *buffer_callback* have the same
+   meaning as in :class:`Pickler`.
+
+   .. versionchanged:: 3.8
+      The *buffer_callback* argument was added.

-.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict")
+.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

    Read a pickled object representation from the open :term:`file object`
    *file* and return the reconstituted object hierarchy specified therein.
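
Editorial sketch (not part of the patch): the equivalence stated for ``dump()`` in this hunk can be checked by pickling the same object through both spellings and comparing the bytes. The sample object and variable names below are illustrative assumptions.

```python
import io
import pickle

obj = {"name": "example", "values": [1, 2, 3]}  # illustrative sample data

# Module-level convenience function.
via_function = io.BytesIO()
pickle.dump(obj, via_function, protocol=pickle.HIGHEST_PROTOCOL)

# Explicit Pickler instance, the spelling the documentation calls equivalent.
via_class = io.BytesIO()
pickle.Pickler(via_class, pickle.HIGHEST_PROTOCOL).dump(obj)

# Both spellings should produce the same pickle stream.
same_stream = via_function.getvalue() == via_class.getvalue()

# Either stream round-trips back to an equal object.
restored = pickle.loads(via_function.getvalue())
```

Both paths go through the same pickler machinery, so the comparison is a cheap sanity check rather than a guarantee made by the documentation itself.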
@@ -232,24 +227,13 @@ process more convenient:
    protocol argument is needed. Bytes past the pickled object's
    representation are ignored.

-   The argument *file* must have two methods, a read() method that takes an
-   integer argument, and a readline() method that requires no arguments. Both
-   methods should return bytes. Thus *file* can be an on-disk file opened for
-   binary reading, an :class:`io.BytesIO` object, or any other custom object
-   that meets this interface.
-
-   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
-   which are used to control compatibility support for pickle stream generated
-   by Python 2. If *fix_imports* is true, pickle will try to map the old
-   Python 2 names to the new names used in Python 3. The *encoding* and
-   *errors* tell pickle how to decode 8-bit string instances pickled by Python
-   2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
-   be 'bytes' to read these 8-bit string instances as bytes objects.
-   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
-   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
-   :class:`~datetime.time` pickled by Python 2.
+   Arguments *file*, *fix_imports*, *encoding*, *errors* and *buffers*
+   have the same meaning as in :class:`Unpickler`.
+
+   .. versionchanged:: 3.8
+      The *buffers* argument was added.

-.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict")
+.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

    Read a pickled object hierarchy from a :class:`bytes` object and return the
    reconstituted object hierarchy specified therein.
@@ -258,16 +242,11 @@ process more convenient:
    protocol argument is needed. Bytes past the pickled object's
    representation are ignored.

-   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
-   which are used to control compatibility support for pickle stream generated
-   by Python 2. If *fix_imports* is true, pickle will try to map the old
-   Python 2 names to the new names used in Python 3. The *encoding* and
-   *errors* tell pickle how to decode 8-bit string instances pickled by Python
-   2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
-   be 'bytes' to read these 8-bit string instances as bytes objects.
-   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
-   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
-   :class:`~datetime.time` pickled by Python 2.
+   Arguments *fix_imports*, *encoding*, *errors* and *buffers*
+   have the same meaning as in :class:`Unpickler`.
+
+   .. versionchanged:: 3.8
+      The *buffers* argument was added.


 The :mod:`pickle` module defines three exceptions:
@@ -295,10 +274,10 @@ The :mod:`pickle` module defines three exceptions:
    IndexError.


-The :mod:`pickle` module exports two classes, :class:`Pickler` and
-:class:`Unpickler`:
+The :mod:`pickle` module exports three classes, :class:`Pickler`,
+:class:`Unpickler` and :class:`PickleBuffer`:

-.. class:: Pickler(file, protocol=None, \*, fix_imports=True)
+.. class:: Pickler(file, protocol=None, \*, fix_imports=True, buffer_callback=None)

    This takes a binary file for writing a pickle data stream.

@@ -316,6 +295,17 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
    map the new Python 3 names to the old module names used in Python 2, so
    that the pickle data stream is readable with Python 2.

+   If *buffer_callback* is None (the default), buffer views are
+   serialized into *file* as part of the pickle stream.
+
+   If *buffer_callback* is not None, then it can be called any number
+   of times with a buffer view. If the callback returns a false value
+   (such as None), the given buffer is out-of-band; otherwise the
+   buffer is serialized in-band, i.e. inside the pickle stream.
+
+   .. versionchanged:: 3.8
+      The *buffer_callback* argument was added.
+
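
Editorial sketch (not part of the patch): the return-value rule for *buffer_callback* described above can be illustrated with a callback that keeps small buffers in-band and collects large ones out-of-band. The ``THRESHOLD`` value and all names are hypothetical, chosen only for this illustration.

```python
import pickle

THRESHOLD = 64  # hypothetical cut-off in bytes, for illustration only
out_of_band = []


def buffer_callback(buf):
    """Keep small buffers in the stream; collect large ones separately."""
    if memoryview(buf).nbytes < THRESHOLD:
        return True            # true value: serialize in-band
    out_of_band.append(buf)
    return None                # false value: handle out-of-band


small = pickle.PickleBuffer(b"tiny")
large = pickle.PickleBuffer(bytearray(1024))

# Protocol 5 is required for PickleBuffer objects.
data = pickle.dumps([small, large], protocol=5,
                    buffer_callback=buffer_callback)

# The out-of-band buffers must be handed back when unpickling.
restored = pickle.loads(data, buffers=out_of_band)
```

Only the large buffer bypasses the stream here, so ``data`` stays much smaller than the 1024-byte payload it refers to.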
    .. method:: dump(obj)

       Write a pickled representation of *obj* to the open file object given in
@@ -379,26 +369,43 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
       Use :func:`pickletools.optimize` if you need more compact pickles.


-.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict")
+.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

    This takes a binary file for reading a pickle data stream.

    The protocol version of the pickle is detected automatically, so no
    protocol argument is needed.

-   The argument *file* must have two methods, a read() method that takes an
-   integer argument, and a readline() method that requires no arguments. Both
-   methods should return bytes. Thus *file* can be an on-disk file object
+   The argument *file* must have three methods, a read() method that takes an
+   integer argument, a readinto() method that takes a buffer argument
+   and a readline() method that requires no arguments, as in the
+   :class:`io.BufferedIOBase` interface. Thus *file* can be an on-disk file
    opened for binary reading, an :class:`io.BytesIO` object, or any other
    custom object that meets this interface.

-   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
-   which are used to control compatibility support for pickle stream generated
-   by Python 2. If *fix_imports* is true, pickle will try to map the old
-   Python 2 names to the new names used in Python 3. The *encoding* and
-   *errors* tell pickle how to decode 8-bit string instances pickled by Python
-   2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
+   The optional arguments *fix_imports*, *encoding* and *errors* are used
+   to control compatibility support for pickle streams generated by Python 2.
+   If *fix_imports* is true, pickle will try to map the old Python 2 names
+   to the new names used in Python 3. The *encoding* and *errors* tell
+   pickle how to decode 8-bit string instances pickled by Python 2;
+   these default to 'ASCII' and 'strict', respectively. The *encoding* can
    be 'bytes' to read these 8-bit string instances as bytes objects.
+   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
+   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
+   :class:`~datetime.time` pickled by Python 2.
+
+   If *buffers* is None (the default), then all data necessary for
+   deserialization must be contained in the pickle stream. This means
+   that the *buffer_callback* argument was None when a :class:`Pickler`
+   was instantiated (or when :func:`dump` or :func:`dumps` was called).
+
+   If *buffers* is not None, it should be an iterable of buffer-enabled
+   objects that is consumed each time the pickle stream references
+   an out-of-band buffer view. Such buffers have been given in order
+   to the *buffer_callback* of a Pickler object.
+
+   .. versionchanged:: 3.8
+      The *buffers* argument was added.
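
Editorial sketch (not part of the patch): the two *buffers* cases above can be exercised by pickling one buffer out-of-band and then loading the stream with and without it. That the missing-buffers case surfaces as ``UnpicklingError`` is observed CPython behaviour and an assumption here, not something the text above states. All variable names are illustrative.

```python
import pickle

payload = bytearray(b"out-of-band payload")
buffers = []

# list.append returns None (a false value), so the buffer goes out-of-band.
data = pickle.dumps(pickle.PickleBuffer(payload), protocol=5,
                    buffer_callback=buffers.append)

# Without *buffers*, the stream refers to data it does not contain.
try:
    pickle.loads(data)
except pickle.UnpicklingError as exc:
    missing_error = exc
else:
    missing_error = None

# Supplying the buffers, in callback order, makes the stream loadable.
restored = pickle.loads(data, buffers=buffers)
```

The stream itself carries only a marker for the buffer, which is why it cannot be loaded on its own.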

    .. method:: load()

@@ -428,6 +435,34 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
       :ref:`pickle-restrict` for details.


+.. class:: PickleBuffer(buffer)
+
+   A wrapper for a potentially out-of-band buffer. *buffer* must be a
+   :ref:`buffer-providing <bufferobjects>` object, such as a
+   :term:`bytes-like object` or an N-dimensional array.
+
+   :class:`PickleBuffer` is itself a buffer provider, therefore it is
+   possible to pass it to other APIs expecting a buffer-providing object,
+   such as :class:`memoryview`.
+
+   :class:`PickleBuffer` objects can only be serialized using pickle
+   protocol 5 or higher. They are eligible for
+   :ref:`out-of-band serialization <pickle-oob>`.
+
+   .. versionadded:: 3.8
+
+   .. method:: raw()
+
+      Return a :class:`memoryview` of the memory area underlying this buffer.
+      The returned object is a one-dimensional, C-contiguous memoryview
+      with format ``B`` (unsigned bytes). :exc:`BufferError` is raised if
+      the buffer is neither C- nor Fortran-contiguous.
+
+   .. method:: release()
+
+      Release the underlying buffer exposed by the PickleBuffer object.
+
+
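
Editorial sketch (not part of the patch): a small illustration of ``raw()`` and ``release()`` as described above, wrapping an :class:`array.array` of integers. The choice of ``array.array`` and the variable names are assumptions made for the example.

```python
import array
import pickle

numbers = array.array("i", [1, 2, 3])   # a buffer-providing object
wrapper = pickle.PickleBuffer(numbers)

# raw() exposes the same memory as one dimension of unsigned bytes.
with wrapper.raw() as view:
    view_format = view.format
    view_ndim = view.ndim
    view_nbytes = view.nbytes

# PickleBuffer is itself a buffer provider, so memoryview() accepts it.
with memoryview(wrapper) as typed_view:
    same_size = typed_view.nbytes == view_nbytes

# Give up the wrapper's hold on the underlying buffer once done.
wrapper.release()
```

Releasing the views before calling ``release()`` keeps the example free of outstanding exports on the wrapped array.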
 .. _pickle-picklable:

 What can be pickled and unpickled?
@@ -863,6 +898,125 @@ a given class::
    assert unpickled_class.my_attribute == 1


+.. _pickle-oob:
+
+Out-of-band Buffers
+-------------------
+
+.. versionadded:: 3.8
+
+In some contexts, the :mod:`pickle` module is used to transfer massive amounts
+of data. Therefore, it can be important to minimize the number of memory
+copies, to preserve performance and limit resource consumption. However, normal
+operation of the :mod:`pickle` module, as it transforms a graph-like structure
+of objects into a sequential stream of bytes, intrinsically involves copying
+data to and from the pickle stream.
+
+This constraint can be avoided if both the *provider* (the implementation
+of the object types to be transferred) and the *consumer* (the implementation
+of the communications system) support the out-of-band transfer facilities
+provided by pickle protocol 5 and higher.
+
+Provider API
+^^^^^^^^^^^^
+
+The large data objects to be pickled must implement a :meth:`__reduce_ex__`
+method specialized for protocol 5 and higher, which returns a
+:class:`PickleBuffer` instance (instead of e.g. a :class:`bytes` object)
+for any large data.
+
+A :class:`PickleBuffer` object *signals* that the underlying buffer is
+eligible for out-of-band data transfer. Those objects remain compatible
+with normal usage of the :mod:`pickle` module. However, consumers can also
+opt in to tell :mod:`pickle` that they will handle those buffers by
+themselves.
+
+Consumer API
+^^^^^^^^^^^^
+
+A communications system can enable custom handling of the :class:`PickleBuffer`
+objects generated when serializing an object graph.
+
+On the sending side, it needs to pass a *buffer_callback* argument to
+:class:`Pickler` (or to the :func:`dump` or :func:`dumps` function), which
+will be called with each :class:`PickleBuffer` generated while pickling
+the object graph. Buffers accumulated by the *buffer_callback* will not
+see their data copied into the pickle stream; only a cheap marker will be
+inserted.
+
+On the receiving side, it needs to pass a *buffers* argument to
+:class:`Unpickler` (or to the :func:`load` or :func:`loads` function),
+which is an iterable of the buffers which were passed to *buffer_callback*.
+That iterable should produce buffers in the same order as they were passed
+to *buffer_callback*. Those buffers will provide the data expected by the
+reconstructors of the objects whose pickling produced the original
+:class:`PickleBuffer` objects.
+
+Between the sending side and the receiving side, the communications system
+is free to implement its own transfer mechanisms for out-of-band buffers.
+Potential optimizations include the use of shared memory or datatype-dependent
+compression.
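
Editorial sketch (not part of the patch): the sending and receiving sides described above can be mocked with an in-memory stand-in for a transport. The ``Channel`` class and the ``send``/``receive`` helpers are hypothetical names invented for this illustration; a real system would replace the single copy per buffer with shared memory, compression or a network frame.

```python
import pickle


class Channel:
    """Hypothetical in-memory stand-in for a real transport."""

    def __init__(self):
        self.stream = None   # the pickle stream (markers only)
        self.frames = []     # the out-of-band buffers, in order


def send(channel, obj):
    buffers = []
    # list.append returns None, so every buffer is handled out-of-band.
    channel.stream = pickle.dumps(obj, protocol=5,
                                  buffer_callback=buffers.append)
    # Simulate a separate transfer path: one copy per buffer.
    channel.frames = [bytes(buf.raw()) for buf in buffers]


def receive(channel):
    # Buffers must be supplied in the order the callback saw them.
    return pickle.loads(channel.stream, buffers=channel.frames)


message = {"id": 7, "blob": pickle.PickleBuffer(bytearray(b"x" * 4096))}
channel = Channel()
send(channel, message)
received = receive(channel)
```

The pickle stream stays small because the 4096-byte payload travels only through ``channel.frames``.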
+
+Example
+^^^^^^^
+
+Here is a trivial example where we implement a :class:`bytearray` subclass
+able to participate in out-of-band buffer pickling::
+
+   class ZeroCopyByteArray(bytearray):
+
+       def __reduce_ex__(self, protocol):
+           if protocol >= 5:
+               return type(self)._reconstruct, (PickleBuffer(self),), None
+           else:
+               # PickleBuffer is forbidden with pickle protocols <= 4.
+               return type(self)._reconstruct, (bytearray(self),)
+
+       @classmethod
+       def _reconstruct(cls, obj):
+           with memoryview(obj) as m:
+               # Get a handle over the original buffer object
+               obj = m.obj
+               if type(obj) is cls:
+                   # Original buffer object is a ZeroCopyByteArray, return it
+                   # as-is.
+                   return obj
+               else:
+                   return cls(obj)
+
+We see that the reconstructor (the ``_reconstruct`` class method) returns
+the buffer's providing object if it has the right type. This is an easy way
+to simulate zero-copy behaviour on this toy example.
+
+On the consumer side, we can pickle those objects the usual way, which
+when unserialized will give us a copy of the original object::
+
+   b = ZeroCopyByteArray(b"abc")
+   data = pickle.dumps(b, protocol=5)
+   new_b = pickle.loads(data)
+   print(b == new_b)  # True
+   print(b is new_b)  # False: a copy was made
+
+But if we pass a *buffer_callback* and then give back the accumulated
+buffers when unserializing, we are able to get back the original object::
+
+   b = ZeroCopyByteArray(b"abc")
+   buffers = []
+   data = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)
+   new_b = pickle.loads(data, buffers=buffers)
+   print(b == new_b)  # True
+   print(b is new_b)  # True: no copy was made
+
+This example is limited by the fact that :class:`bytearray` allocates its
+own memory: you cannot create a :class:`bytearray` instance that is backed
+by another object's memory. However, third-party datatypes such as NumPy
+arrays do not have this limitation, and allow use of zero-copy pickling
+(or making as few copies as possible) when transferring between distinct
+processes or systems.
+
+.. seealso:: :pep:`574` -- Pickle protocol 5 with out-of-band data
+
+
 .. _pickle-restrict:

 Restricting Globals