email.message_from_bytes heavy memory use #115512

Open
@cnicodeme

Description

Bug report

Bug description:

Hi!

While investigating some memory issues on my Lambda, I discovered unexpectedly high memory usage coming from email.message_from_bytes.

When parsing an .eml file that contains almost no text but a 30 MB attachment, the memory usage jumps by about 238 MB!
That is roughly 9 times the size of the file!

Here is my test:

from email import message_from_bytes
import resource

# ru_maxrss is the peak resident set size of the process (kilobytes on Linux).
print('Init ram: {}kb'.format(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss))

# Read the whole .eml file into memory.
with open('file.eml', 'rb') as f:
    data = f.read()

print('File loaded: {}kb'.format(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss))
print('    (file size: {}kb)'.format(len(data) // 1024))

# Parse the raw bytes into a message object; this is the step under test.
mail = message_from_bytes(data)

print('After message_from_bytes: {}kb'.format(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss))

And the output:

Init ram: 7168kb
File loaded: 37120kb
    (file size: 29900kb)
After message_from_bytes: 279296kb
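
As a quick sanity check on these figures, the arithmetic can be sketched as follows. The numbers are copied from the output above; the variable names are only illustrative. Note that ru_maxrss is a peak value, so the difference is an upper bound on what parsing needed at its worst moment, not necessarily what the parsed message still holds.

```python
# Figures copied from the output above (ru_maxrss reports kilobytes on Linux).
init_kb = 7168
loaded_kb = 37120
file_kb = 29900
parsed_kb = 279296

# Growth in peak RSS between "file loaded" and "after message_from_bytes".
growth_kb = parsed_kb - loaded_kb
# The same growth expressed relative to the size of the file.
growth_ratio = growth_kb / file_kb
# Total peak RSS relative to the size of the file.
total_ratio = parsed_kb / file_kb

print(f"growth: {growth_kb} kB ({growth_kb / 1024:.1f} MiB)")
print(f"growth / file size: {growth_ratio:.1f}x")
print(f"peak RSS / file size: {total_ratio:.1f}x")
```

So the "9 times" figure matches total peak RSS relative to the file size, while the growth caused by parsing alone is closer to 8 times.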

The EML in question contains an attachment (a CSV file) encoded in Base64. I suspect that BytesParser is converting that content to binary data, but I find it surprising that doing so takes 9 times the file size.
Wouldn't it be faster and more efficient to convert the content only when it is accessed, and to offer a way to skip the conversion entirely (getting the raw Base64 text)?

(Maybe such an option already exists and I missed it?)
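
On the question of getting the content raw: a minimal sketch of how the payload can be read with and without decoding, using `Message.get_payload()`. The message here is a small synthetic stand-in (the addresses, file name, and attachment bytes are made up for illustration), not the original file. It only illustrates the accessor API and does not by itself show where the memory goes.

```python
import base64
from email import message_from_bytes
from email.message import EmailMessage

# Build a small stand-in message with a Base64 attachment (hypothetical data).
attachment_bytes = b"id,value\n" + b"1,hello\n" * 1000
msg = EmailMessage()
msg["Subject"] = "demo"
msg["From"] = "a@example.com"
msg["To"] = "b@example.com"
msg.set_content("see attachment")
msg.add_attachment(
    attachment_bytes,
    maintype="application",
    subtype="octet-stream",
    filename="data.csv",
)

# Round-trip through bytes, as in the report.
parsed = message_from_bytes(msg.as_bytes())

# Pick the attachment part out of the multipart message.
part = [p for p in parsed.walk() if p.get_content_disposition() == "attachment"][0]

encoded = part.get_payload()             # str: the Base64 text as stored
decoded = part.get_payload(decode=True)  # bytes: decoded only on this call

print(part["Content-Transfer-Encoding"])
print(type(encoded).__name__, len(encoded))
print(type(decoded).__name__, len(decoded))
```

In this sketch the Base64 text stays available through `get_payload()` and the binary form is only produced when `decode=True` is passed.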

I tested this in:

  • Python 3.10.13
  • Python 3.12.1

And got the same results.
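
For anyone who wants to reproduce this without my file, here is a hedged sketch that generates a synthetic message and measures the parse with tracemalloc instead of ru_maxrss. The helper names `build_message` and `measure_parse` and the 1 MiB attachment size are assumptions chosen for illustration. tracemalloc only counts allocations made through Python's allocator, so its numbers will not match RSS exactly.

```python
import os
import tracemalloc
from email import message_from_bytes
from email.message import EmailMessage


def build_message(attachment_size: int) -> bytes:
    """Return a serialized message with a random Base64-encoded attachment."""
    msg = EmailMessage()
    msg["Subject"] = "memory test"
    msg["From"] = "a@example.com"
    msg["To"] = "b@example.com"
    msg.set_content("tiny body")
    msg.add_attachment(
        os.urandom(attachment_size),
        maintype="application",
        subtype="octet-stream",
        filename="blob.bin",
    )
    return msg.as_bytes()


def measure_parse(data: bytes) -> tuple[int, int]:
    """Return (current, peak) traced bytes around message_from_bytes."""
    tracemalloc.start()
    try:
        mail = message_from_bytes(data)  # kept alive so "current" includes it
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current, peak


data = build_message(1024 * 1024)  # 1 MiB attachment keeps the run short
current, peak = measure_parse(data)

print(f"message size: {len(data) / 1024:.0f} KiB")
print(f"held after parse: {current / 1024:.0f} KiB ({current / len(data):.1f}x)")
print(f"peak during parse: {peak / 1024:.0f} KiB ({peak / len(data):.1f}x)")
```

Raising the size passed to `build_message` should show whether the overhead scales with the attachment size.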

CPython versions tested on:

3.10

Operating systems tested on:

Linux

Labels

  • stdlib (Python modules in the Lib dir)
  • topic-email
  • type-bug (An unexpected behavior, bug, or error)

