Solve HMAC one-shot digest limitations #136912

@picnixz

Bug report

Bug description:

We have three different implementations of HMAC: one based on OpenSSL, one based on HACL*, and a pure Python one that directly uses hash functions (HMAC is easy to write in Python when the hash function is given as input). Apart from that, we also have a dispatching mechanism for "one-shot" HMAC that is a bit inconsistent:

  • If OpenSSL is present, we use OpenSSL's one-shot HMAC. The key and message sizes are limited to INT_MAX; beyond that, an OverflowError is raised. If the algorithm doesn't exist, we fall back to HACL* if possible.
  • The HACL* implementation limits the key and message sizes to $2^{32}-1$, which may differ from INT_MAX. An OverflowError is also raised in this case. If the algorithm again isn't recognized, we fall back to the slow implementation.
  • There is no size restriction for the slow implementation.
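For reference, the "slow" construction mentioned above can be sketched in a few lines. This is a minimal illustration of HMAC as defined in RFC 2104, not the code CPython actually ships; the function name `hmac_from_hash` is made up for this example.

```python
import hashlib
import hmac


def hmac_from_hash(key: bytes, msg: bytes, digest_name: str) -> bytes:
    """Compute HMAC using only a hashlib constructor (illustrative sketch)."""
    block_size = hashlib.new(digest_name).block_size
    # Keys longer than one block are hashed first, as RFC 2104 specifies.
    if len(key) > block_size:
        key = hashlib.new(digest_name, key).digest()
    # Shorter keys are zero-padded up to the block size.
    key = key.ljust(block_size, b"\x00")
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.new(digest_name, inner_pad)
    inner.update(msg)
    outer = hashlib.new(digest_name, outer_pad)
    outer.update(inner.digest())
    return outer.digest()
```

Because it only calls `update()` on a hash object, a construction like this has no size limit of its own beyond whatever the underlying hash imposes.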

Currently, the first OverflowError is not caught, so if we have OpenSSL + HACL*, we cannot fall back to the slow path. I don't know whether it's better to reject large messages outright, nor how we should fall back. I can suggest two plans, each with advantages and disadvantages:

Suggestion 1 (nice for the user)

  • If OpenSSL is present and the key/message are too long, we fall back to HACL*.
  • If it's still too large, we fall back to the slow one.
  • And if it's still too large at that point, then that's a user issue.
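The fallback chain of Suggestion 1 could look roughly like the sketch below. The backends here are stand-ins: `make_limited_backend` and `digest_with_fallback` are hypothetical names, and the limits are passed in as parameters so the behaviour can be tried with small inputs rather than multi-gigabyte buffers.

```python
import hmac


def make_limited_backend(limit):
    """Return a stand-in backend that rejects inputs longer than `limit`."""
    def backend(key, msg, digest):
        if len(key) > limit or len(msg) > limit:
            raise OverflowError("input too large for this backend")
        return hmac.digest(key, msg, digest)
    return backend


def slow_backend(key, msg, digest):
    """Stand-in for the unrestricted pure Python path."""
    return hmac.new(key, msg, digest).digest()


def digest_with_fallback(key, msg, digest, backends):
    """Try each backend in order; only the last one may raise."""
    *limited, last = backends
    for backend in limited:
        try:
            return backend(key, msg, digest)
        except OverflowError:
            continue  # too large for this backend, try the next one
    return last(key, msg, digest)
```

With `backends = [make_limited_backend(4), make_limited_backend(8), slow_backend]`, a 16-byte message skips both limited backends and is handled by the last one.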

My rationale here is that I want to at least give a chance to anyone using one-shot HMAC on a very large buffer, even though they should do it in chunks. I also plan to document this.

Suggestion 2 (nice for the CPU)

  • If a message is too large to be handled by OpenSSL's HMAC, we fall back to HACL*.
  • If it's too large for HACL*, then we do not fall back to the slow one (currently we do!).

@gpshead suggested the following ones:

Suggestion 3

  • Do the first bullet from Suggestion 2 and fall back to "the slow one" if we must.
  • Document that HMAC should always be done in chunks of less than 2 GiB for the best performance.
  • Goal: never let OverflowError be raised.
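On the user side, the documentation advice in Suggestion 3 amounts to using the incremental API instead of the one-shot function. A minimal sketch, assuming the data comes from a binary stream; the helper name `hmac_stream` and the 1 MiB chunk size are arbitrary choices for illustration:

```python
import hmac

CHUNK_SIZE = 1 << 20  # 1 MiB; an arbitrary value well below 2 GiB


def hmac_stream(key, stream, digest, chunk_size=CHUNK_SIZE):
    """Feed a binary stream to HMAC one chunk at a time."""
    mac = hmac.new(key, digestmod=digest)
    # read() returns b"" at end of stream, which ends the loop.
    while chunk := stream.read(chunk_size):
        mac.update(chunk)
    return mac.digest()
```

For a file this would be called as `hmac_stream(key, open(path, "rb"), "sha256")`, so no single `update()` call ever sees more than `chunk_size` bytes.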

Suggestion 4

  • Always do the chunking loop ourselves if len(msg) > LIMIT
  • Use memoryview in that loop to avoid huge copies.
  • Goal: never let OverflowError be raised & never let performance degrade significantly.
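Suggestion 4 can be illustrated at the Python level as follows. This is only a sketch of the idea, since the real change would live in the C dispatching code; `chunked_hmac_digest` is a hypothetical name and `LIMIT` is an assumed value standing in for the backend's actual maximum. Only the message is chunked here; an oversized key is not handled by this sketch.

```python
import hmac

LIMIT = 2**31 - 1  # assumed per-call maximum of the fast backend


def chunked_hmac_digest(key, msg, digest, limit=LIMIT):
    """One-shot when the message fits, chunked updates otherwise."""
    # A byte-oriented view makes len() and slicing count bytes
    # and lets each chunk be passed on without copying.
    view = memoryview(msg).cast("B")
    if len(view) <= limit:
        return hmac.digest(key, msg, digest)
    mac = hmac.new(key, digestmod=digest)
    for start in range(0, len(view), limit):
        mac.update(view[start:start + limit])
    return mac.digest()
```

Passing a small `limit` (for example `limit=4`) exercises the chunked path on short inputs and should give the same digest as the one-shot call.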

Current behavior

  • If OpenSSL is present, long keys/messages are directly rejected with no way to fall back to HACL* or slow HMAC.
  • If OpenSSL is not present, long keys/messages may be rejected by one-shot HACL* HMAC but we will fall back to the slow HMAC.

This difference in behavior between having OpenSSL and not having it is somewhat disturbing.


In the end, I think Suggestion 4 will be the nicest while limiting the performance impact.

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Linked PRs

Note

We still need a 3.14.1 backport but we won't do it until 3.14 is stable.

Metadata

Assignees

Labels

  • 3.14: bugs and security fixes
  • 3.15: new features, bugs and security fixes
  • deferred-blocker
  • stdlib: Python modules in the Lib dir
  • type-bug: An unexpected behavior, bug, or error

Projects

Status

Todo
