Commit 015a544

Christoph Hellwig authored and brauner committed
xfs: set s_min_writeback_pages for zoned file systems
Set s_min_writeback_pages to the zone size, so that writeback always writes up to a full zone. This ensures that writeback does not add spurious file fragmentation when writing back a large number of files that are larger than the zone size.

Fixes: 4e4d520 ("xfs: add the zoned space allocator")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20251017034611.651385-4-hch@lst.de
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
1 parent 90db4d4 commit 015a544

1 file changed

Lines changed: 26 additions & 2 deletions

fs/xfs/xfs_zone_alloc.c

@@ -1215,6 +1215,7 @@ xfs_mount_zones(
 		.mp = mp,
 	};
 	struct xfs_buftarg *bt = mp->m_rtdev_targp;
+	xfs_extlen_t zone_blocks = mp->m_groups[XG_TYPE_RTG].blocks;
 	int error;

 	if (!bt) {
@@ -1245,10 +1246,33 @@ xfs_mount_zones(
 		return -ENOMEM;

 	xfs_info(mp, "%u zones of %u blocks (%u max open zones)",
-		mp->m_sb.sb_rgcount, mp->m_groups[XG_TYPE_RTG].blocks,
-		mp->m_max_open_zones);
+		mp->m_sb.sb_rgcount, zone_blocks, mp->m_max_open_zones);
 	trace_xfs_zones_mount(mp);

+	/*
+	 * The writeback code switches between inodes regularly to provide
+	 * fairness. The default lower bound is 4MiB, but for zoned file
+	 * systems we want to increase that both to reduce seeks, but also more
+	 * importantly so that workloads that writes files in a multiple of the
+	 * zone size do not get fragmented and require garbage collection when
+	 * they shouldn't. Increase is to the zone size capped by the max
+	 * extent len.
+	 *
+	 * Note that because s_min_writeback_pages is a superblock field, this
+	 * value also get applied to non-zoned files on the data device if
+	 * there are any. On typical zoned setup all data is on the RT device
+	 * because using the more efficient sequential write required zones
+	 * is the reason for using the zone allocator, and either the RT device
+	 * and the (meta)data device are on the same block device, or the
+	 * (meta)data device is on a fast SSD while the data on the RT device
+	 * is on a SMR HDD. In any combination of the above cases enforcing
+	 * the higher min_writeback_pages for non-RT inodes is either a noop
+	 * or beneficial.
+	 */
+	mp->m_super->s_min_writeback_pages =
+		XFS_FSB_TO_B(mp, min(zone_blocks, XFS_MAX_BMBT_EXTLEN)) >>
+			PAGE_SHIFT;
+
 	if (bdev_is_zoned(bt->bt_bdev)) {
 		error = blkdev_report_zones(bt->bt_bdev,
 				XFS_FSB_TO_BB(mp, mp->m_sb.sb_rtstart),
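
To make the arithmetic in the new assignment concrete, here is a minimal userspace sketch of the same computation. It is not kernel code. The helper name sketch_min_writeback_pages and the macro SKETCH_MAX_EXTLEN are hypothetical. The sketch assumes that XFS_MAX_BMBT_EXTLEN is a 21-bit block count and that XFS_FSB_TO_B shifts a block count left by the file system's block-size log2. Both assumptions should be checked against the kernel headers.

```c
#include <stdint.h>

/*
 * Assumed stand-in for XFS_MAX_BMBT_EXTLEN: the largest extent length,
 * taken here to be a 21-bit block count. Verify against the kernel tree.
 */
#define SKETCH_MAX_EXTLEN ((uint32_t)((1u << 21) - 1))

/*
 * Hypothetical helper mirroring the expression in the patch:
 *   pages = (min(zone_blocks, max extent len) << blocklog) >> page_shift
 *
 * zone_blocks: zone size in file system blocks
 * blocklog:    log2 of the file system block size (12 for 4 KiB blocks)
 * page_shift:  log2 of the page size (12 for 4 KiB pages)
 */
static uint64_t sketch_min_writeback_pages(uint32_t zone_blocks,
		unsigned int blocklog, unsigned int page_shift)
{
	/* Cap the zone size at the maximum extent length. */
	uint32_t blocks = zone_blocks < SKETCH_MAX_EXTLEN ?
			zone_blocks : SKETCH_MAX_EXTLEN;

	/* Widen before shifting so the byte count cannot overflow 32 bits. */
	return ((uint64_t)blocks << blocklog) >> page_shift;
}
```

Under these assumptions, a 256 MiB zone with 4 KiB blocks (65536 blocks) and 4 KiB pages gives 65536 pages, compared with 1024 pages for the 4 MiB default lower bound mentioned in the comment. A hypothetical zone of 4000000 blocks would be capped at 2097151 blocks.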
