
@kairyssdal Sneakers is the rare movie that seems to only get more accurate and true as time goes on.

@uep @mgerdts I think the key here is that this is a postgres database store, so the recordsize is 8k to align with the postgres page size. With an ashift of 12 (4k sectors), each 8k record lands on either one sector or two, so the BEST case possible compression is 2x, and anything that compresses less than 2x is stored at 1x. That means realized compression has to be in the 1.0 to 2.0 range, whereas on the original I was in the 3.x to 4.x range. Math checks out.
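The rounding argument above can be sketched in a few lines of Python. This is only a rough model, assuming a compressed record is rounded up to whole sectors of 2**ashift bytes and ignoring metadata and any other ZFS allocation details; the function names and the 2300-byte example page are made up for illustration.

```python
def allocated_bytes(compressed_bytes: int, ashift: int) -> int:
    """Round a compressed record up to whole sectors of 2**ashift bytes."""
    sector = 1 << ashift
    sectors = -(-compressed_bytes // sector)  # ceiling division
    return sectors * sector


def effective_ratio(compressed_bytes: int, ashift: int, recordsize: int = 8192) -> float:
    """Logical record size divided by the space actually allocated on disk."""
    return recordsize / allocated_bytes(compressed_bytes, ashift)


if __name__ == "__main__":
    # Hypothetical 8k page that compresses down to 2300 bytes.
    for ashift in (9, 12):
        print(ashift, effective_ratio(2300, ashift))
```

Under this model the same page gets a ratio above 3 with 512-byte sectors and exactly 2 with 4k sectors, and a page that compresses to even one byte more than 4096 falls back to 1x on the 4k pool.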

@elliot@microscopic.network The goal here is to actually have all of the snapshots mirrored, and I zpool destroy and zpool create between each attempt, so any mystery snapshots would have to be coming from the original machine that only has 4T.

That said, the latest experiment was a success, and ashift was the culprit.

@mgerdts Now all I have to do is kill the pool... and restore.. again.. the ... 5th? time is the charm?

@mgerdts AH-HAH... googling indicates that I need to use zdb vs zpool to get ashift values... and.. there we are. ashift of 12 on the new devices and 9 on the old. I think we have the smoking gun... once I was actually looking in the right place. Thanks!

@mgerdts no raidz at all, simple stripe/concat. 4x1T on machine A, 2x4T on machine B.

@mgerdts I did see that checksums are "on" on the source and "skein" on the destination, but on 4T of 8k pages that's just 16g or 32g of additional space, total (depending on 256 bits or 512 bits of hash size, and that's worst case since it doesn't account for fletcher4 already being 256 bits)
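For anyone who wants to check the 16g/32g figure, here is a small sketch of the arithmetic. It assumes the worst case described above, where every 8k block costs one full checksum of extra space; the function name and constants are illustrative only.

```python
TIB = 2**40
GIB = 2**30


def checksum_overhead_bytes(data_bytes: int, recordsize: int, checksum_bits: int) -> int:
    """Upper bound on checksum space: one checksum per record-sized block."""
    blocks = data_bytes // recordsize
    return blocks * (checksum_bits // 8)


if __name__ == "__main__":
    for bits in (256, 512):
        overhead = checksum_overhead_bytes(4 * TIB, 8192, bits)
        print(bits, overhead / GIB, "GiB")
```

4T of 8k pages is 2**29 blocks, so the bound is tiny compared with the multi-terabyte difference being chased.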

@mgerdts ok, so compression ratios are different, by about a factor of 2.. which explains it. But why? I looked at ashift (zpool property) on both and it is zero on both. Both are zstd (which I additionally forced with a -o on zfs-receive, since ONE of the original ones was lz4.. but even if that was a degenerate compression case in converting lz4 to zstd, it doesn't explain nearly enough of the difference)

@javierk4jh My understanding from reading zfs-send and zfs-receive and online searches is that you actually cannot change recordsize that way, since the stream itself is made of deltas.

That is, if the incremental says to "set block 15 to 0xfeedface", then the 0xfeedface payload doesn't come with the context of the rest of the block to fill in.

Granted, this is a solvable problem (just read the original and write out the whole thing), but they opted to not have that complexity.

I did check anyway, and recordsizes look good.

@mgerdts I am looking at it via 'zfs list' and 'df', and both show compatible information. The main difference seems to be in the refer (I am redoing the receive right now, so I am going from memory): it appears that the receive has multiple full copies.

And the zfs-receive seems to corroborate that by saying it has multiple 'full' streams ... maybe? In ~7 more hours the receive will be finished.

@mgerdts Not small files, average filesize is close to 1gig (this is a postgres database data filesystem). There are 2 recordsizes on it: a 'precopy' snapshot with 128K records, and then I set it to 8k to get better performance and copied everything over. I also changed lz4 to zstd on the new copies. No raid on either, straight concat/stripe (underlying hardware does all of the redundancy)

ok and mastodon people. Answer me this: I have a ZFS filesystem on machine A, and I zfs-send it to machine B (zfs send -R) (zfs receive -Fv). machine A has 4T of space *total*. machine B has 8T of space. When done, machine B has only 250g of free space available; the filesystem is almost _twice_ as large?

Roses are red.
Roses are blue.
Depending on their velocity
relative to you.

@toran @lattera @kev The BSDs are a friendly and supportive community, you'll enjoy the fun then stay for the stability ☺️
