
btrfs: zoned: implement ZONE_RESET space_info reclaiming [LWN.net]

From:  Naohiro Aota <naohiro.aota-AT-wdc.com>
To:  linux-btrfs-AT-vger.kernel.org
Subject:  [PATCH v2 0/3] btrfs: zoned: implement ZONE_RESET space_info reclaiming
Date:  Thu, 14 Nov 2024 17:04:26 +0900
Message-ID:  <cover.1731571240.git.naohiro.aota@wdc.com>
Cc:  Naohiro Aota <naohiro.aota-AT-wdc.com>

There is a longstanding early ENOSPC issue in zoned mode. Under heavy
write load on a nearly full file system, freeing space and resetting
zones often cannot keep up with the write speed, which results in an
early ENOSPC. For example, running the following fio script, which
repeatedly overwrites a 15 GB file on a 20 GB file system, results in
the ENOSPC shown below.

Fio script:

  [test]
  filename=/mnt/scratch/test
  readwrite=write
  ioengine=libaio
  direct=1
  loops=10
  filesize=15G
  bs=128k

Result:

  BTRFS info (device nvme0n1): cannot satisfy tickets, dumping space info
  BTRFS info (device nvme0n1): space_info DATA has 0 free, is full
  BTRFS info (device nvme0n1): space_info total=20535312384, used=16106127360, pinned=0, reserved=0, may_use=0,
  readonly=0 zone_unusable=4429185024
  BTRFS info (device nvme0n1): failing ticket with 131072 bytes
  BTRFS info (device nvme0n1): space_info DATA has 0 free, is full
  BTRFS info (device nvme0n1): space_info total=20535312384, used=16106127360, pinned=0, reserved=0, may_use=0,
  readonly=0 zone_unusable=4429185024
  BTRFS info (device nvme0n1): global_block_rsv: size 25870336 reserved 25853952
  BTRFS info (device nvme0n1): trans_block_rsv: size 0 reserved 0
  BTRFS info (device nvme0n1): chunk_block_rsv: size 0 reserved 0
  BTRFS info (device nvme0n1): delayed_block_rsv: size 0 reserved 0
  BTRFS info (device nvme0n1): delayed_refs_rsv: size 0 reserved 0
  fio: io_u error on file /mnt/scratch/test: No space left on device: write offset=13287555072, buflen=131072
  fio: pid=869, err=28/file:io_u.c:1962, func=io_u error, error=No space left on device
  ...
  Run status group 0 (all jobs):
    WRITE: bw=113MiB/s (118MB/s), 113MiB/s-113MiB/s (118MB/s-118MB/s), io=27.4GiB (29.4GB), run=248965-248965msec

As the result shows, fio fails after writing only 27 GB. Instead, it
should be able to write 150 GB by freeing the overwritten regions. The
space_info dump shows 4.1 GB of zone_unusable space in the DATA
space_info. While this space will eventually be freed after a
transaction commit and zone reset, the dump shows that btrfs is too
slow to reuse the zone_unusable space.
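
The numbers in the dump above can be checked by hand. The Python sketch
below assumes that the free space of a space_info is its total minus the
sum of the other counters printed in the dump. That formula is inferred
from the dump itself and is an assumption here, not a quote of the
kernel code. All names are illustrative.

```python
# Values copied from the space_info dump above (bytes).
space_info = {
    "total": 20535312384,
    "used": 16106127360,
    "pinned": 0,
    "reserved": 0,
    "may_use": 0,
    "readonly": 0,
    "zone_unusable": 4429185024,
}

GIB = 1 << 30


def free_bytes(info):
    """Assumed accounting: total minus every other counter in the dump."""
    consumed = sum(v for k, v in info.items() if k != "total")
    return info["total"] - consumed


free = free_bytes(space_info)
unusable_gib = space_info["zone_unusable"] / GIB
unusable_share = space_info["zone_unusable"] / space_info["total"]

# Amount fio would write if every loop completed: loops=10, filesize=15G.
expected_total_g = 10 * 15

print(f"free={free} bytes")
print(f"zone_unusable={unusable_gib:.3f} GiB ({unusable_share:.1%} of total)")
print(f"expected total writes: {expected_total_g}G")
```

Under that assumption, used plus zone_unusable accounts for the whole
space_info, which is why a 128 KiB ticket fails even though roughly a
fifth of the space only awaits a zone reset.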

There are several reasons for hitting ENOSPC early, and this series
addresses only one of them: unusable block groups are not reclaimed
fast enough. This series introduces a new space_info reclaim method,
ZONE_RESET. That method picks a block group from the unused list and
sends a ZONE_RESET command to free up and reuse the zone_unusable space.

In this first implementation, ZONE_RESET is applied only to a block
group whose region is fully zone_unusable. Reclaiming partially
zone_unusable block groups could be implemented later.
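
The reclaim step described above can be modelled in a few lines. The
following Python sketch is a toy model, not the kernel code: the
BlockGroup and SpaceInfo classes, their fields, and
reclaim_by_zone_reset() are hypothetical names invented for
illustration. It assumes a block group qualifies only when it holds no
live data and its whole length is zone_unusable, and that resetting its
zone turns that space back into free space.

```python
from dataclasses import dataclass, field


@dataclass
class BlockGroup:
    length: int          # size of the block group in bytes
    used: int            # bytes still holding live data
    zone_unusable: int   # bytes written and freed, awaiting a zone reset


@dataclass
class SpaceInfo:
    total: int
    used: int = 0
    zone_unusable: int = 0
    unused_bgs: list = field(default_factory=list)  # candidate block groups

    def free(self):
        return self.total - self.used - self.zone_unusable


def reclaim_by_zone_reset(info, needed):
    """Reset fully zone_unusable block groups until `needed` bytes are free."""
    reclaimed = 0
    for bg in info.unused_bgs:
        if info.free() >= needed:
            break
        # First implementation: skip partially zone_unusable block groups.
        if bg.used != 0 or bg.zone_unusable != bg.length:
            continue
        # Stand-in for sending the zone reset command to the device.
        info.zone_unusable -= bg.zone_unusable
        reclaimed += bg.zone_unusable
        bg.zone_unusable = 0
    return reclaimed


MIB = 1 << 20
partial = BlockGroup(length=256 * MIB, used=0, zone_unusable=128 * MIB)
full = BlockGroup(length=256 * MIB, used=0, zone_unusable=256 * MIB)
info = SpaceInfo(total=512 * MIB, zone_unusable=384 * MIB,
                 unused_bgs=[partial, full])

# 128 MiB is free, so a 192 MiB request needs a reset to succeed.
reclaimed = reclaim_by_zone_reset(info, needed=192 * MIB)
print(reclaimed // MIB, info.free() // MIB)
```

In this model the partially zone_unusable block group is left alone,
matching the restriction of the first implementation, and only the fully
zone_unusable one is reset.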

Patches 1 and 2 prepare for patch 3 and make no functional change.
Patch 3 introduces the new space_info reclaim method ZONE_RESET
described above.

Follow-up series will fully fix the ENOSPC issue with the above fio
script. One will separate the space_info of regular data from that of
relocation data. Another will rework zone resetting of deleted block
groups so that the empty zone bit is set early.

Changes:
- v2:
  - Use the ordinary locking style.
  - Rewrite btrfs_return_free_space() to reduce indent level.
  - Add some extra comments.
- v1: https://lore.kernel.org/linux-btrfs/gjr4vwt5qm7j36xnjijp5...

Naohiro Aota (3):
  btrfs: introduce btrfs_return_free_space()
  btrfs: drop fs_info argument from btrfs_update_space_info_*
  btrfs: zoned: reclaim unused zone by zone resetting

 fs/btrfs/block-group.c       |  16 ++---
 fs/btrfs/block-rsv.c         |  10 +--
 fs/btrfs/delalloc-space.c    |   2 +-
 fs/btrfs/delayed-ref.c       |   5 +-
 fs/btrfs/extent-tree.c       |  35 ++--------
 fs/btrfs/inode.c             |   2 +-
 fs/btrfs/space-info.c        |  69 ++++++++++++++++---
 fs/btrfs/space-info.h        |  15 +++--
 fs/btrfs/transaction.c       |   3 +-
 fs/btrfs/zoned.c             | 124 +++++++++++++++++++++++++++++++++++
 fs/btrfs/zoned.h             |   7 ++
 include/trace/events/btrfs.h |   3 +-
 12 files changed, 223 insertions(+), 68 deletions(-)

-- 
2.47.0




Copyright © 2024, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds








