btrfs: zoned: implement ZONE_RESET space_info reclaiming
From: Naohiro Aota <naohiro.aota-AT-wdc.com>
To: linux-btrfs-AT-vger.kernel.org
Subject: [PATCH v2 0/3] btrfs: zoned: implement ZONE_RESET space_info reclaiming
Date: Thu, 14 Nov 2024 17:04:26 +0900
Message-ID: <cover.1731571240.git.naohiro.aota@wdc.com>
Cc: Naohiro Aota <naohiro.aota-AT-wdc.com>

There is a longstanding early ENOSPC issue in zoned mode. Under heavy write load on a nearly full file system, freeing space and resetting zones often cannot keep up with the write speed. That results in an early ENOSPC.

For example, running the following fio script, which repeatedly overwrites a 15 GB file on a 20 GB file system, results in the ENOSPC shown below.

Fio script:

  [test]
  filename=/mnt/scratch/test
  readwrite=write
  ioengine=libaio
  direct=1
  loops=10
  filesize=15G
  bs=128k

Result:

  BTRFS info (device nvme0n1): cannot satisfy tickets, dumping space info
  BTRFS info (device nvme0n1): space_info DATA has 0 free, is full
  BTRFS info (device nvme0n1): space_info total=20535312384, used=16106127360, pinned=0, reserved=0, may_use=0, readonly=0 zone_unusable=4429185024
  BTRFS info (device nvme0n1): failing ticket with 131072 bytes
  BTRFS info (device nvme0n1): space_info DATA has 0 free, is full
  BTRFS info (device nvme0n1): space_info total=20535312384, used=16106127360, pinned=0, reserved=0, may_use=0, readonly=0 zone_unusable=4429185024
  BTRFS info (device nvme0n1): global_block_rsv: size 25870336 reserved 25853952
  BTRFS info (device nvme0n1): trans_block_rsv: size 0 reserved 0
  BTRFS info (device nvme0n1): chunk_block_rsv: size 0 reserved 0
  BTRFS info (device nvme0n1): delayed_block_rsv: size 0 reserved 0
  BTRFS info (device nvme0n1): delayed_refs_rsv: size 0 reserved 0
  fio: io_u error on file /mnt/scratch/test: No space left on device: write offset=13287555072, buflen=131072
  fio: pid=869, err=28/file:io_u.c:1962, func=io_u error, error=No space left on device
  ...
  Run status group 0 (all jobs):
    WRITE: bw=113MiB/s (118MB/s), 113MiB/s-113MiB/s (118MB/s-118MB/s), io=27.4GiB (29.4GB), run=248965-248965msec

As the result shows, fio fails after writing only 27 GB. Instead, it should be able to write 150 GB by freeing the overwritten regions. The space_info status shows that there is 4.1 GB of zone_unusable space in the DATA space.
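
As a quick check of the dump, the sketch below recomputes the "0 free" figure from the printed counters. It is a user-space illustration, not kernel code: the struct and function names are hypothetical, and it assumes that free space is simply the total minus every accounted category, an assumption that is consistent with the numbers in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space mirror of the counters printed in the dump. */
struct space_info_dump {
	uint64_t total;
	uint64_t used;
	uint64_t pinned;
	uint64_t reserved;
	uint64_t may_use;
	uint64_t readonly;
	uint64_t zone_unusable;
};

/* Values copied from the "space_info total=..." line of the fio run. */
static const struct space_info_dump fio_dump = {
	.total = 20535312384ULL,
	.used = 16106127360ULL,
	.zone_unusable = 4429185024ULL,
	/* pinned, reserved, may_use and readonly were all 0 in the log */
};

/* Assumed accounting: bytes still allocatable = total - all categories. */
static inline uint64_t space_info_free(const struct space_info_dump *s)
{
	uint64_t accounted = s->used + s->pinned + s->reserved +
			     s->may_use + s->readonly + s->zone_unusable;

	assert(accounted <= s->total);
	return s->total - accounted;
}

/* Bytes that would be allocatable if all zone_unusable space were reset. */
static inline uint64_t space_info_free_after_reset(const struct space_info_dump *s)
{
	return space_info_free(s) + s->zone_unusable;
}
```

With the values from the log, space_info_free(&fio_dump) evaluates to 0, while space_info_free_after_reset(&fio_dump) evaluates to 4429185024 bytes, the roughly 4.1 GB that is waiting for a zone reset.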
While this space will eventually be freed after a transaction commit and zone reset, the space_info dump shows that btrfs is too slow to reuse the zone_unusable space.

There are several reasons for hitting ENOSPC early, and this series addresses only one of them: unusable block groups are not reclaimed fast enough.

This series introduces a new space_info reclaim method, ZONE_RESET. It picks a block group from the unused list and sends a ZONE_RESET command to free up and reuse the zone_unusable space.

In this first implementation, ZONE_RESET is applied only to block groups whose region is fully zone_unusable. Reclaiming partially zone_unusable block groups could be implemented later.

Patches 1 and 2 are preparation for patch 3 and make no functional change. Patch 3 introduces the new ZONE_RESET space_info reclaim method described above.

Follow-up series will fully fix the ENOSPC issue with the above fio script. One will separate the space_info of regular data and relocation data. Another will rework zone resetting of deleted block groups so that the empty zone bit is set early.

Changes:
- v2:
  - Use the ordinary locking style.
  - Rewrite btrfs_return_free_space() to reduce indent level.
  - Add some extra comments.
- v1: https://lore.kernel.org/linux-btrfs/gjr4vwt5qm7j36xnjijp5...

Naohiro Aota (3):
  btrfs: introduce btrfs_return_free_space()
  btrfs: drop fs_info argument from btrfs_update_space_info_*
  btrfs: zoned: reclaim unused zone by zone resetting

 fs/btrfs/block-group.c       |  16 ++---
 fs/btrfs/block-rsv.c         |  10 +--
 fs/btrfs/delalloc-space.c    |   2 +-
 fs/btrfs/delayed-ref.c       |   5 +-
 fs/btrfs/extent-tree.c       |  35 ++--------
 fs/btrfs/inode.c             |   2 +-
 fs/btrfs/space-info.c        |  69 ++++++++++++++++---
 fs/btrfs/space-info.h        |  15 +++--
 fs/btrfs/transaction.c       |   3 +-
 fs/btrfs/zoned.c             | 124 +++++++++++++++++++++++++++++++++++
 fs/btrfs/zoned.h             |   7 ++
 include/trace/events/btrfs.h |   3 +-
 12 files changed, 223 insertions(+), 68 deletions(-)

--
2.47.0
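
The ZONE_RESET reclaim step described in the cover letter can be modelled roughly as follows. This is only a user-space sketch under stated assumptions: struct bg_model, struct si_model and reclaim_one_by_zone_reset() are hypothetical names invented for illustration, they are not the structures or functions added by patch 3, and the actual zone reset command to the device is reduced to a counter update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a block group (not the kernel struct). */
struct bg_model {
	uint64_t length;        /* size of the block group's region */
	uint64_t used;          /* bytes holding live data */
	uint64_t zone_unusable; /* bytes written and freed, not yet reset */
};

/* Hypothetical stand-in for the space_info counter touched by a reset. */
struct si_model {
	uint64_t zone_unusable;
};

/* First-implementation rule from the text: the whole region is unusable. */
static inline bool bg_fully_unusable(const struct bg_model *bg)
{
	return bg->used == 0 && bg->zone_unusable == bg->length;
}

/*
 * Walk a model "unused list" and reset the first eligible block group.
 * Returns the number of bytes made allocatable again, or 0 if no block
 * group qualified. A real implementation would send a zone reset to the
 * device at the marked point.
 */
static inline uint64_t reclaim_one_by_zone_reset(struct si_model *si,
						 struct bg_model *unused,
						 size_t count)
{
	for (size_t i = 0; i < count; i++) {
		struct bg_model *bg = &unused[i];
		uint64_t reclaimed;

		if (!bg_fully_unusable(bg))
			continue; /* partially unusable: left for later work */

		reclaimed = bg->zone_unusable;
		assert(reclaimed <= si->zone_unusable);

		/* (zone reset would be issued here) */
		bg->zone_unusable = 0;          /* zone is empty and writable again */
		si->zone_unusable -= reclaimed; /* bytes leave the unusable counter */
		return reclaimed;
	}
	return 0;
}
```

In this model a block group with only part of its region zone_unusable is skipped, which mirrors the restriction of the first implementation. Only a fully zone_unusable block group is reset and its bytes returned to the allocatable pool.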