[Filesystem] Add readFileInChunks method to read files in fixed-size chunks #60916
base: 7.4
33f31eb to 5516edd
    $chunks .= $chunk;
}

$this->assertSame($expectedContent, $chunks);
Should there also be assertions/test cases about the chunk size?
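One way such chunk-size assertions could look is sketched below in plain PHP, without PHPUnit. Both `readInChunks()` and `chunkLengths()` are hypothetical helpers written only for this illustration; `readInChunks()` is a stand-in for the method under review, not its actual implementation.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the method under review (assumption, not the PR's code).
 *
 * @return iterable<string>
 */
function readInChunks(string $filename, int $size = 8192): iterable
{
    $handle = fopen($filename, 'rb');
    if (false === $handle) {
        throw new \RuntimeException(sprintf('Failed to open "%s".', $filename));
    }

    try {
        while (!feof($handle)) {
            $chunk = fread($handle, $size);
            // Skip the empty read that can occur right before EOF is flagged.
            if (false !== $chunk && '' !== $chunk) {
                yield $chunk;
            }
        }
    } finally {
        fclose($handle);
    }
}

/**
 * Writes $content to a temporary file and returns the byte length of every chunk read back.
 *
 * @return list<int>
 */
function chunkLengths(string $content, int $size): array
{
    $path = tempnam(sys_get_temp_dir(), 'chunks');
    file_put_contents($path, $content);

    try {
        $lengths = [];
        foreach (readInChunks($path, $size) as $chunk) {
            $lengths[] = strlen($chunk);
        }

        return $lengths;
    } finally {
        unlink($path);
    }
}

// 10 bytes read 4 at a time: two full chunks, then a shorter final chunk.
$lengths = chunkLengths('0123456789', 4);
```

Asserting on the list of lengths, rather than only on the concatenated content, would catch an implementation that ignores the `$size` argument while still returning the right bytes.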
 *
 * @param string $filename The full path to the file
 *
 * @return iterable<string> Yields file content as strings in chunks
Using "yields" isn't accurate, given that the return type is an iterable (which could be a plain array, for example). You may either change the return type to \Generator or avoid the word "yield" here (I personally prefer iterable because it's more flexible).
5516edd to eb16ee7
Hi everyone, thanks for your suggestions!
 *
 * @return \Generator<string> Yields file content as strings in chunks
 *
 * @throws IOException If the file cannot be opened or read, or if it's a directory
This is wrong in the current implementation. The exception is not thrown when calling this method, but when starting the iteration (your test is hiding that by using iterator_to_array).
To properly perform the validation synchronously (which is easier for error handling and documentation), you would need to move the iteration to a private method (defining the generator) while the validation runs in the public method before calling that private method. We use that approach in symfony/cache, for instance (where we are required to validate keys synchronously to respect PSR-6).
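The split described above might look roughly like the following sketch. The `ChunkReader` class and the private `doReadChunks()` method are hypothetical names, and `\RuntimeException` stands in for Symfony's `IOException` so the snippet stays self-contained; this is an illustration of the pattern, not the PR's final code.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in class (assumption); not the actual Filesystem component.
final class ChunkReader
{
    /**
     * Validates eagerly, then returns a lazy iterator over the file content.
     *
     * @return iterable<string>
     *
     * @throws \RuntimeException If the path is a directory or is not readable
     */
    public function readFileInChunks(string $filename, int $size = 8192): iterable
    {
        // This method contains no "yield", so these checks run at call time.
        if ($size < 1) {
            throw new \InvalidArgumentException('The chunk size must be at least 1.');
        }
        if (is_dir($filename)) {
            throw new \RuntimeException(sprintf('"%s" is a directory.', $filename));
        }
        if (!is_readable($filename)) {
            throw new \RuntimeException(sprintf('"%s" is not readable.', $filename));
        }

        return $this->doReadChunks($filename, $size);
    }

    /**
     * The generator body only starts executing on the first iteration.
     *
     * @return \Generator<int, string>
     */
    private function doReadChunks(string $filename, int $size): \Generator
    {
        $handle = @fopen($filename, 'rb');
        if (false === $handle) {
            throw new \RuntimeException(sprintf('Failed to open "%s".', $filename));
        }

        try {
            while (!feof($handle)) {
                $chunk = fread($handle, $size);
                if (false === $chunk) {
                    throw new \RuntimeException(sprintf('Failed to read "%s".', $filename));
                }
                if ('' !== $chunk) {
                    yield $chunk;
                }
            }
        } finally {
            fclose($handle);
        }
    }
}
```

With this shape, calling `readFileInChunks()` on a directory throws immediately, even if the caller never iterates the result. Failures that can only be detected while reading (for example, the file disappearing after validation) would still surface during iteration.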
 *
 * @throws IOException If the file cannot be opened or read, or if it's a directory
 */
public function readFileInChunks(string $filename, int $size = 8192): \Generator
I think the return type should be either iterable or \Traversable (if we want to guarantee that it is not returning an array) rather than \Generator, as it gives us more flexibility to refactor the implementation in the future (widening a return type is a BC break).
eb16ee7 to b06ce7a
Hi @stof, thanks for your suggestion!
b06ce7a to ea7570e
Description
This PR introduces a new readFileInChunks() method to the Filesystem component, which provides a memory-efficient way to read large files by yielding fixed-size chunks of data.
Motivation
Reading large files all at once can consume excessive memory and potentially kill the process, especially in constrained environments. This method avoids that by yielding smaller chunks, making it safer and more efficient for large file handling.
Example