OK, Mastodon people, answer me this: I have a ZFS filesystem on machine A, and I zfs-send it to machine B (zfs send -R on A, zfs receive -Fv on B). Machine A has 4T of space *total*; machine B has 8T. When it's done, machine B has only 250G of free space left. How is the filesystem almost _twice_ as large?
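
For reference, the transfer being described would look roughly like this; the pool and dataset names here are hypothetical:

    # on machine A: take a recursive snapshot, then replicate the whole hierarchy to machine B
    zfs snapshot -r tankA/data@migrate
    zfs send -R tankA/data@migrate | ssh machineB zfs receive -Fv tankB/data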

@david my first guess is that you have a lot of small files and something is causing zfs to insert a lot of padding.

Is ashift the same on both pools (zpool get ashift, I think)? My guess is the source may be 9 (512 byte minimum block size) and the destination is 12 (4k min block).
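
A quick way to check, assuming the pools are named tankA and tankB:

    zpool get ashift tankA        # on machine A; 0 means the pool property is left at "auto"
    zpool get ashift tankB        # on machine B
    zdb -C tankA | grep ashift    # shows the actual per-vdev ashift when the property reads 0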

Is the source not raidz and destination is raidz?

How are you looking at total space? The zpool and zfs commands look at different things.
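
For example, with tankB standing in for the destination pool, the pool-level and dataset-level views can be compared side by side:

    zpool list -o name,size,allocated,free tankB                          # raw pool allocation
    zfs list -r -o name,used,available,referenced,compressratio tankB     # per-dataset view, after compression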

@mgerdts Not small files, the average file size is close to 1 gig (this is a postgres database data filesystem). There are two recordsizes on it: a 'precopy' snapshot with 128K records, and then I set it to 8k to get better performance and copied everything over; I also changed the compression from lz4 to zstd for the new copies. No raid on either side, straight concat/stripe (the underlying hardware does all of the redundancy).

@mgerdts I am looking at it via 'zfs list' and 'df'; both show consistent information. The main difference seems to be in the REFER column (I am redoing the receive right now, so I am going from memory); it appears that the received filesystem contains multiple full copies.
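
One way to see where the extra REFER is going, assuming the received dataset is tankB/data:

    zfs list -r -o space tankB/data      # breaks USED into snapshot, dataset, children and refreservation parts
    zfs list -t snapshot -r tankB/data   # lists the snapshots the -R stream recreated on the destination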

And the zfs receive output seems to corroborate that by reporting multiple 'full' streams... maybe? In ~7 more hours the receive will be finished.

@david maybe the copies or compressratio properties on each dataset will offer clues.
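
Something along these lines, run on each machine and compared (the dataset names are assumptions):

    zfs get -r copies,compression,compressratio,recordsize tankA/data    # on machine A
    zfs get -r copies,compression,compressratio,recordsize tankB/data    # on machine B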

@mgerdts OK, so the compression ratios are different, by about a factor of 2x, which explains it. But why? I looked at ashift (the zpool property) on both and it is zero on both. Both are zstd (which I additionally forced with -o on the zfs receive, since ONE of the original datasets was lz4; but even if that were a degenerate case of converting lz4 to zstd, it doesn't explain nearly enough of the difference).
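
A way to narrow that down further, assuming matching dataset names on both sides: logicalused ignores compression while used is measured after it, so comparing the two on each machine shows which side is compressing worse.

    zfs get used,logicalused,compressratio,recordsize tankA/data    # on machine A
    zfs get used,logicalused,compressratio,recordsize tankB/data    # on machine B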

@mgerdts I did see that checksums are "on" on the source and "skein" on the destination, but on 4T of 8k pages that's just 16G or 32G of additional space, total (depending on 256 or 512 bits of hash size, and that's the worst case since it doesn't account for fletcher4 already being 256 bits).
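
Back-of-the-envelope check of that estimate: 4 TiB in 8 KiB records is roughly 537 million blocks; at 32 bytes (256 bits) per checksum that is about 16 GiB, or about 32 GiB at 64 bytes, so checksums alone cannot account for a multi-terabyte gap.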
