
GH-135904: Improve the JIT's performance on macOS #136528


Open
wants to merge 8 commits into
base: main

Conversation

brandtbucher
Member

@brandtbucher brandtbucher commented Jul 11, 2025

This PR makes a couple of minor tweaks to the JIT that result in 1.7% faster performance on macOS overall:

  • Our AArch64 code doesn't need to be 8-byte aligned, just the data. Currently, we guarantee this by aligning all code anyway, since the data follows immediately after it. This is wasteful, since it means about half of all stencils end in a nop. Instead, don't pad any stencils, and just align the data when it's compiled. 🤦🏼
  • The textual assembly "optimizer" pass has a bug where it interprets lines that are commented with ; as instructions. By recognizing these commented lines, we can remove more zero-length jumps at the end of stencils. 🤦🏼
  • During this same pass, we can represent the address of the next instruction (the end of the template, or the _JIT_CONTINUE label) as a "local" label, which allows the assembler to resolve it at compile time and encode it more efficiently. There's a special (platform-dependent) prefix to signal this.
  • Finally, instead of declaring jump targets (_JIT_CONTINUE, _JIT_ERROR_TARGET, and _JIT_JUMP_TARGET) as extern symbols, just declare them as local functions. This results in more efficient jumps (and also allows us to remove a somewhat hacky pre-processing step for the textual assembly on Windows to force these efficient jumps).
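The first two bullets can be illustrated with a minimal Python sketch. This is not the code from this PR: the function names, the regular expressions, and the assumption that `b` is the unconditional branch mnemonic are all hypothetical, chosen only to show the idea of rounding the data offset up to 8 bytes (instead of padding code) and of skipping `;` comment lines when looking for a trailing zero-length jump.

```python
import re

DATA_ALIGNMENT = 8  # assumption: only data needs 8-byte alignment, code does not


def align_data_offset(code_size: int, alignment: int = DATA_ALIGNMENT) -> int:
    """Return the offset where data can start after `code_size` bytes of code."""
    # Round up to the next multiple of `alignment` (which must be a power of two).
    return (code_size + alignment - 1) & -alignment


# Assumptions for illustration: ';' starts a comment line, and "b" is an
# unconditional branch (AArch64-style). Real assembly dialects vary.
_COMMENT = re.compile(r"\s*;")
_BRANCH = re.compile(r"\s*b\s+(?P<target>\S+)\s*$")


def drop_trailing_jump(lines: list[str], continue_label: str) -> list[str]:
    """Remove a final branch to `continue_label`, ignoring comments and blanks."""
    # Walk backwards past comment and blank lines to the last real instruction.
    for index in range(len(lines) - 1, -1, -1):
        line = lines[index]
        if not line.strip() or _COMMENT.match(line):
            continue
        match = _BRANCH.match(line)
        if match and match["target"] == continue_label:
            # The jump lands on the very next instruction, so it is zero-length.
            return lines[:index] + lines[index + 1:]
        break  # The last real instruction is something else: keep everything.
    return lines
```

With these helpers, a 12-byte stencil keeps its 12 bytes of code and the data simply starts at offset 16, and a stencil whose last real instruction is `b _JIT_CONTINUE` followed by a `;` comment line loses the jump while the comment is left in place.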

@brandtbucher brandtbucher requested a review from diegorusso July 11, 2025 02:01
@brandtbucher brandtbucher self-assigned this Jul 11, 2025
@brandtbucher brandtbucher added the performance (Performance or resource usage), skip news, interpreter-core (Objects, Python, Grammar, and Parser dirs), and topic-JIT labels Jul 11, 2025
@Fidget-Spinner
Member

This is amazing! I can't review it, but thanks for all your assembler wizardry on this.

Labels
awaiting core review, interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), skip news, topic-JIT

2 participants