-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
ls: Print optimizations #7801
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ls: Print optimizations #7801
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces several performance improvements to the ls printing routines. Key changes include:
- Replacing eager computations with LazyCell to delay expensive calculations, e.g. for current_column.
- Adding a new ExtendPad trait for Vec to implement extend_pad_left/right functions, reducing formatting overhead.
- Preallocating output buffers and replacing write! calls with extend methods for faster output display.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
File | Description |
---|---|
src/uu/ls/src/ls.rs | Optimized printing functions by introducing lazy evaluations and new pad methods. |
src/uu/ls/BENCHMARKING.md | Updated benchmarking documentation to reflect performance improvements. |
GNU testsuite comparison:
|
Seems that it regressed this :) |
@@ -3206,13 +3211,13 @@ fn display_item_name( | |||
more_info: String, | |||
out: &mut BufWriter<Stdout>, | |||
style_manager: &mut Option<StyleManager>, | |||
current_column: usize, | |||
current_column: LazyCell<usize, Box<dyn FnOnce() -> usize + '_>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can use the byte offset instead, because we add this byte if it could possibly wrap, so we only need a lower bound on adding it. That should be cheaper to compute than the ansi_width
computation.
I can even show that GNU does that. I did this in a terminal that fit the filename easily, but because emojis consist of many bytes, GNU will add the byte:
touch 🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀.foo
env TERM=xterm LS_COLORS="*.foo=0;31;42" TIME_STYLE=+T ls --color=always 🦀
🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀.foo
I checked this by piping into bat -A
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about colors? (i.e. when called with --color=auto
). That's the other thing that ansi_width
removes...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to just add the original byte size to some counter from before the color was added, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh right, the color specifiers are just added in a few places. And then we could get rid of ansi_width
.
I don't want to do this as part of this PR though, I think I'd need to investigate more deeply and this PR is meant to be a no-op in terms of functionality. Filed #7804.
Do you want me to drop the commit that adds this LazyCell
thing from this PR? Either way is fine by me.
Preallocate output_display to a larger size, and use `extend` instead of format. Saves about ~5% performance vs baseline implementation.
Saves another ~7% performance.
In many cases, current_column value is not actually needed, but computing its value is quite expensive (`ansi_width` isn't very fast). Move the computation to a LazyCell, so that we only execute it when required. Saves 5% on a basic `ls -lR .git`.
Whoopsie ,-) I was perhaps relying too much on Rust unit tests ,-) Fixed. |
GNU testsuite comparison:
|
That... doesn't fail locally. And has nothing to do with |
Maybe add a test to cover it too? :) |
See #7563, I'll have more fixes, but these are low-hanging fruits in printing.
We get closer to GNU coreutils (~10%), this improves performance by about ~15% for the use case below.
ls/BENCHMARKING.md: Add some tricks
ls: display_item_name: Make current_column a closure
In many cases, current_column value is not actually needed, but
computing its value is quite expensive (
ansi_width
isn't veryfast).
Move the computation to a LazyCell, so that we only execute it
when required.
Saves 5% on a basic
ls -lR .git
.ls: Add extend_pad_left/right functions that operate on Vec
Saves another ~7% performance.
ls: Optimize display_item_long
Preallocate output_display to a larger size, and use
extend
instead of format.
Saves about ~5% performance vs baseline implementation.