GH-46507: [C++] Make the aws sdk S3 lowSpeedLimit configurable from arrow S3Options #46506

Open
wants to merge 1 commit into
base: main

Conversation

mderoy
@mderoy mderoy commented May 19, 2025

Rationale for this change

In extreme cases, a slow network response from S3, Minio, etc. causes arrow to return curlCode: 28 timeout errors before the request_timeout period has elapsed. This happens because the AWS SDK has an additional setting, lowSpeedLimit (passed through to the curl library), that aborts the transfer if the transfer rate drops below a threshold (by default 1 byte / second). In such cases the user may want to disable lowSpeedLimit so that reads succeed despite slow network throughput.
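To illustrate the behavior described above, here is a minimal, simplified model of a low-speed check. It is not the curl or AWS SDK implementation: the function name `WouldAbort`, the per-second sampling, and the `low_speed_time` window parameter are assumptions made for this sketch only.

```cpp
#include <cstddef>
#include <vector>

// Simplified, hypothetical model of a low-speed abort check.
// bytes_per_second[i] is the number of bytes received during second i.
// Returns true if the rate stays below low_speed_limit for low_speed_time
// consecutive seconds. A limit of 0 disables the check entirely.
bool WouldAbort(const std::vector<unsigned long>& bytes_per_second,
                unsigned long low_speed_limit, std::size_t low_speed_time) {
  if (low_speed_limit == 0 || low_speed_time == 0) {
    return false;  // check disabled
  }
  std::size_t slow_seconds = 0;
  for (unsigned long bytes : bytes_per_second) {
    // Count consecutive slow seconds; any fast second resets the counter.
    slow_seconds = (bytes < low_speed_limit) ? slow_seconds + 1 : 0;
    if (slow_seconds >= low_speed_time) {
      return true;
    }
  }
  return false;
}
```

In this model a stalled transfer (0 bytes received for several seconds) aborts with a limit of 1 but keeps going with a limit of 0, which is the case this PR wants to make reachable.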

What changes are included in this PR?

Add the low_speed_limit option to S3Options and apply it to client_config_ when it is >= 0:
< 0 uses the default setting from the S3 SDK (like other options)
0 disables the lowSpeedLimit
> 0 sets the lowSpeedLimit to N bytes / s
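
The mapping above could be sketched roughly as follows. This is not the actual diff: `S3OptionsSketch`, `ClientConfigSketch`, and `ApplyLowSpeedLimit` are hypothetical stand-ins so the snippet compiles without Arrow or the AWS SDK, and the field types are assumptions.

```cpp
#include <cstdint>

// Hypothetical stand-in for the SDK client configuration. The default of 1
// mirrors the "1 byte / second" default mentioned in the rationale.
struct ClientConfigSketch {
  unsigned long lowSpeedLimit = 1;
};

// Hypothetical stand-in for S3Options; a negative value means "not set".
struct S3OptionsSketch {
  std::int64_t low_speed_limit = -1;
};

// Copy the option into the client config only when it is >= 0, so a negative
// value leaves the SDK default untouched and 0 disables the limit.
void ApplyLowSpeedLimit(const S3OptionsSketch& options,
                        ClientConfigSketch* config) {
  if (options.low_speed_limit >= 0) {
    config->lowSpeedLimit =
        static_cast<unsigned long>(options.low_speed_limit);
  }
}
```

Using a negative sentinel keeps the option backward compatible: callers that never set low_speed_limit get exactly the SDK behavior they had before.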

Are these changes tested?

They are tested in our fork, which is based on arrow 13.0.0.0, but I have not tested them against the latest code because I'm unable to build main in my environment (it looks like a new build dependency on ninja was added, and I can't install it on my system at the moment). I opened this PR via a clean cherry-pick of my changes.

Are there any user-facing changes?

No, these settings are for advanced users of the C++ sdk with S3FS

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

@mderoy changed the title from "Make the aws sdk S3 lowSpeedLimit configurable from arrow S3Options" to "MINOR: [C++] Make the aws sdk S3 lowSpeedLimit configurable from arrow S3Options" on May 19, 2025
@mderoy changed the title from "MINOR: [C++] Make the aws sdk S3 lowSpeedLimit configurable from arrow S3Options" to "GH-46507: [C++] Make the aws sdk S3 lowSpeedLimit configurable from arrow S3Options" on May 19, 2025

⚠️ GitHub issue #46507 has been automatically assigned in GitHub to PR creator.
