ls: Print optimizations #7801

Merged
merged 4 commits into from
Apr 21, 2025

Conversation

drinkcat
Contributor

See #7563. I'll have more fixes, but these are low-hanging fruit in the printing code.

This brings us closer to GNU coreutils (now within ~10%) and improves performance by ~15% for the use case below.

cargo build -r -p uu_ls && taskset -c 0 hyperfine --warmup 100 -L ls ls,target/release/ls,./ls-main "{ls} -lR /var/lib .git || true"
Benchmark 1: ls -lR /var/lib .git || true
  Time (mean ± σ):      26.7 ms ±   1.2 ms    [User: 11.3 ms, System: 14.7 ms]
  Range (min … max):    25.8 ms …  38.5 ms    109 runs
 
Benchmark 2: target/release/ls -lR /var/lib .git || true
  Time (mean ± σ):      29.5 ms ±   1.3 ms    [User: 13.2 ms, System: 15.5 ms]
  Range (min … max):    28.6 ms …  41.3 ms    100 runs
 
Benchmark 3: ./ls-main -lR /var/lib .git || true
  Time (mean ± σ):      33.8 ms ±   1.5 ms    [User: 18.3 ms, System: 14.7 ms]
  Range (min … max):    32.6 ms …  45.6 ms    86 runs
 
Summary
  ls -lR /var/lib .git || true ran
    1.10 ± 0.07 times faster than target/release/ls -lR /var/lib .git || true
    1.26 ± 0.08 times faster than ./ls-main -lR /var/lib .git || true

ls/BENCHMARKING.md: Add some tricks

ls: display_item_name: Make current_column a closure

In many cases, current_column value is not actually needed, but
computing its value is quite expensive (ansi_width isn't very
fast).

Move the computation to a LazyCell, so that we only execute it
when required.

Saves 5% on a basic ls -lR .git.
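
A minimal sketch of the lazy-evaluation idea described in this commit message, using `std::cell::LazyCell` with the same boxed-closure type that appears in the review diff. The names `expensive_width`, `render_name`, and `demo` are hypothetical stand-ins and are not taken from `ls.rs`; the call counter exists only to make the laziness observable.

```rust
use std::cell::{Cell, LazyCell};

// Hypothetical stand-in for an expensive width computation such as `ansi_width`.
// It bumps a counter so callers can see whether it actually ran.
fn expensive_width(s: &str, calls: &Cell<u32>) -> usize {
    calls.set(calls.get() + 1);
    s.chars().count()
}

// Hypothetical consumer: it only reads `current_column` when `needs_column` is true.
fn render_name(
    name: &str,
    needs_column: bool,
    current_column: LazyCell<usize, Box<dyn FnOnce() -> usize + '_>>,
) -> String {
    if needs_column {
        // Dereferencing forces the closure to run, at most once.
        format!("{name} (starts at column {})", *current_column)
    } else {
        // The closure is never called on this path.
        name.to_string()
    }
}

// Wires the pieces together and reports how often the expensive function ran.
fn demo(prefix: &str, name: &str, needs_column: bool) -> (String, u32) {
    let calls = Cell::new(0);
    let lazy: LazyCell<usize, Box<dyn FnOnce() -> usize + '_>> =
        LazyCell::new(Box::new(|| expensive_width(prefix, &calls)));
    let out = render_name(name, needs_column, lazy);
    (out, calls.get())
}
```

Boxing the closure keeps the parameter type nameable in a function signature, at the cost of one small allocation per call; whether that trade-off pays off depends on how often the value is actually needed.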

ls: Add extend_pad_left/right functions that operate on Vec

Saves another ~7% performance.
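
A rough sketch of what padding helpers on `Vec<u8>` could look like. The trait name `ExtendPad` and the method names come from this PR's description, but the signatures and bodies below are assumptions for illustration; the actual implementation in `ls.rs` may differ. Padding is computed from the character count, mirroring what `format!("{:>width$}")` does.

```rust
// Hypothetical signatures; only the trait and method names come from the PR text.
trait ExtendPad {
    // Appends `s` right-aligned in a field of `width` characters.
    fn extend_pad_left(&mut self, s: &str, width: usize);
    // Appends `s` left-aligned in a field of `width` characters.
    fn extend_pad_right(&mut self, s: &str, width: usize);
}

impl ExtendPad for Vec<u8> {
    fn extend_pad_left(&mut self, s: &str, width: usize) {
        // No padding if `s` is already at least `width` characters long.
        let pad = width.saturating_sub(s.chars().count());
        // Grow the buffer with spaces, then append the text.
        self.resize(self.len() + pad, b' ');
        self.extend_from_slice(s.as_bytes());
    }

    fn extend_pad_right(&mut self, s: &str, width: usize) {
        let pad = width.saturating_sub(s.chars().count());
        // Append the text first, then fill the rest of the field with spaces.
        self.extend_from_slice(s.as_bytes());
        self.resize(self.len() + pad, b' ');
    }
}
```

Writing straight into the byte buffer avoids going through the formatting machinery for every padded field, which is presumably where the saving comes from.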

ls: Optimize display_item_long

Preallocate output_display to a larger size, and use extend
instead of format.

Saves about ~5% performance vs baseline implementation.
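
To illustrate the kind of change this commit message describes, here is a small sketch that builds the same line twice: once with `format!`, once by extending a preallocated `Vec<u8>`. The function names, the fields, and the capacity of 128 are all assumptions chosen for the example, not values from `display_item_long`.

```rust
// Baseline: let `format!` build the line, then convert it to bytes.
fn build_line_format(perms: &str, owner: &str, name: &str) -> Vec<u8> {
    format!("{perms} {owner} {name}\n").into_bytes()
}

// Alternative: reserve space up front and append raw bytes.
fn build_line_extend(perms: &str, owner: &str, name: &str) -> Vec<u8> {
    // 128 is an arbitrary guess at a typical line length (assumption).
    let mut out: Vec<u8> = Vec::with_capacity(128);
    out.extend(perms.as_bytes());
    out.push(b' ');
    out.extend(owner.as_bytes());
    out.push(b' ');
    out.extend(name.as_bytes());
    out.push(b'\n');
    out
}
```

Both functions produce identical bytes; the second avoids format-string processing and, as long as the line fits the reserved capacity, any reallocation while the line is assembled.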

@sylvestre sylvestre requested a review from Copilot April 20, 2025 17:11

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces several performance improvements to the ls printing routines. Key changes include:

  • Replacing eager computations with LazyCell to delay expensive calculations, e.g. for current_column.
  • Adding a new ExtendPad trait for Vec to implement extend_pad_left/right functions, reducing formatting overhead.
  • Preallocating output buffers and replacing write! calls with extend methods for faster output display.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/uu/ls/src/ls.rs | Optimized printing functions by introducing lazy evaluations and new pad methods. |
| src/uu/ls/BENCHMARKING.md | Updated benchmarking documentation to reflect performance improvements. |

GNU testsuite comparison:

GNU test failed: tests/ls/stat-failed. tests/ls/stat-failed is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

@sylvestre
Contributor

> GNU test failed: tests/ls/stat-failed. tests/ls/stat-failed is passing on 'main'. Maybe you have to rebase?

Seems that it regressed this :)

@@ -3206,13 +3211,13 @@ fn display_item_name(
     more_info: String,
     out: &mut BufWriter<Stdout>,
     style_manager: &mut Option<StyleManager>,
-    current_column: usize,
+    current_column: LazyCell<usize, Box<dyn FnOnce() -> usize + '_>>,
Member

I think we can use the byte offset instead, because we add this byte if it could possibly wrap, so we only need a lower bound on adding it. That should be cheaper to compute than the ansi_width computation.

I can even show that GNU does that. I did this in a terminal that fit the filename easily, but because emojis consist of many bytes, GNU will add the byte:

touch 🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀.foo
env TERM=xterm  LS_COLORS="*.foo=0;31;42" TIME_STYLE=+T ls  --color=always 🦀
🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀.foo

I checked this by piping into bat -A.

Contributor Author

@drinkcat drinkcat Apr 20, 2025

What about colors? (i.e. when called with --color=auto). That's the other thing that ansi_width removes...

Member

You should be able to just add the original byte size to some counter from before the color was added, right?
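
One way to picture this suggestion, as a sketch only: keep a running column counter that is advanced by the byte length of the uncolored name, and write the color escape sequences without counting them. The helper `push_colored` and its parameters are hypothetical; nothing here is taken from `ls.rs` or from the follow-up issue.

```rust
// Hypothetical helper: writes `name` wrapped in ANSI color codes and advances
// `column` by the byte length of the uncolored name only.
fn push_colored(out: &mut Vec<u8>, column: &mut usize, name: &str, sgr: &str) {
    // Opening escape sequence, e.g. ESC [ 0;31;42 m.
    out.extend_from_slice(b"\x1b[");
    out.extend_from_slice(sgr.as_bytes());
    out.push(b'm');
    // The visible text.
    out.extend_from_slice(name.as_bytes());
    // Reset sequence.
    out.extend_from_slice(b"\x1b[0m");
    // Escape sequences are not counted; only the name's bytes are.
    *column += name.len();
}
```

With this approach a name such as `🦀.foo` advances the counter by 8, since the emoji is 4 bytes in UTF-8, so the counter tracks bytes rather than displayed columns. That matches the byte-offset idea discussed above, and no width computation over the colored output is needed.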

Contributor Author

Huh right, the color specifiers are just added in a few places. And then we could get rid of ansi_width.

I don't want to do this as part of this PR though, I think I'd need to investigate more deeply and this PR is meant to be a no-op in terms of functionality. Filed #7804.

Do you want me to drop the commit that adds this LazyCell thing from this PR? Either way is fine by me.

@drinkcat
Contributor Author

> > GNU test failed: tests/ls/stat-failed. tests/ls/stat-failed is passing on 'main'. Maybe you have to rebase?
>
> Seems that it regressed this :)

Whoopsie ,-) I was perhaps relying too much on Rust unit tests ,-) Fixed.

GNU testsuite comparison:

GNU test failed: tests/misc/tee. tests/misc/tee is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)

@drinkcat
Contributor Author

> GNU testsuite comparison:
>
> GNU test failed: tests/misc/tee. tests/misc/tee is passing on 'main'. Maybe you have to rebase?

That... doesn't fail locally. And has nothing to do with ls AFAICT...

@sylvestre
Contributor

> > > GNU test failed: tests/ls/stat-failed. tests/ls/stat-failed is passing on 'main'. Maybe you have to rebase?
> >
> > Seems that it regressed this :)
>
> Whoopsie ,-) I was perhaps relying too much on Rust unit tests ,-) Fixed.

Maybe add a test to cover it too? :)

@sylvestre sylvestre merged commit b5e0e23 into uutils:main Apr 21, 2025
69 of 70 checks passed
@drinkcat drinkcat deleted the ls-opt-1 branch April 21, 2025 08:26