proposal: archive/tar: support zero-copy reading/writing #70807
Comments
A similar optimization exists for the reading side, of course. |
Just to spell it out, I believe that the API change here is to define a new method on *Writer:

```go
// ReadFrom implements [io.ReaderFrom].
func (tw *Writer) ReadFrom(r io.Reader) (int64, error)
```

Note that I think you could get a similar effect without the API change by writing:

```go
if tw, ok := fw.w.(*Writer); ok {
	return tw.readFrom(r)
}
```

CC @dsnet |
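For context on why a `ReadFrom` method is the natural hook: `io.Copy` already prefers the destination's `io.ReaderFrom` before falling back to its buffered loop, so existing `io.Copy` call sites would pick up the new method with no changes. A condensed sketch of that dispatch (simplified from the `io` package; the real `io.Copy` also checks the source for `io.WriterTo` first):

```go
package sketch // hypothetical package, for illustration only

import "io"

// copySketch mirrors io.Copy's dispatch: when dst implements
// io.ReaderFrom, the whole transfer is delegated to it and the
// intermediate buffer (and its userspace copy) is skipped.
func copySketch(dst io.Writer, src io.Reader) (int64, error) {
	if rf, ok := dst.(io.ReaderFrom); ok {
		return rf.ReadFrom(src) // e.g. the proposed (*tar.Writer).ReadFrom
	}
	buf := make([]byte, 32*1024) // fallback: plain buffered copy
	var written int64
	for {
		n, rerr := src.Read(buf)
		if n > 0 {
			m, werr := dst.Write(buf[:n])
			written += int64(m)
			if werr != nil {
				return written, werr
			}
		}
		if rerr != nil {
			if rerr == io.EOF {
				rerr = nil
			}
			return written, rerr
		}
	}
}
```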
your suggestion certainly improves |
for the reader, this works. |
It's probably me that was missing something. |
cc @dsnet given the TODO above |
Do we want to include logic to pad out the tar to align content files to the destination's blocksize, if it's an integer multiple of 512? Tar natively pads to 512 :'( (blockSize, line 143 in e39e965) |
@Jorropo - Fascinating insight, thanks! I can confirm that on btrfs, if I set the blockSize to 4096, I can write a 2G tar file in 0.08s, which is amazing. Unfortunately, the block size is not variable in the tar format, so this needs to be done a different way. Fortunately, one could simply add as many empty files as needed to pad the tar file out to 4096-byte boundaries, or whatever the destination block size is. This can be done without changing the tar package at all. |
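For what it's worth, a rough sketch of that padding idea as caller-side code (the function name, offset bookkeeping, and the pinned ModTime/Format are mine; the dummy-entry shape follows the char-device trick from the commit messages further down). Each zero-size entry costs exactly one 512-byte header, so emitting enough of them lands the next file's data on a block boundary:

```go
package sketch // hypothetical package, for illustration only

import (
	"archive/tar"
	"time"
)

// padToBlock writes zero-size dummy entries (one 512-byte header each,
// no body) until the data of the *next* real entry will begin on a
// blockSize boundary. offset is the number of bytes emitted into the
// archive so far; between entries it is always a multiple of 512,
// because the writer pads each entry's data out to 512.
func padToBlock(tw *tar.Writer, offset, blockSize int64) (int64, error) {
	const hdrSize = 512 // fixed tar header size
	for (offset+hdrSize)%blockSize != 0 {
		hdr := &tar.Header{
			Name:     "",              // dummy entry: empty name,
			Typeflag: tar.TypeChar,    // char device 0,0
			ModTime:  time.Unix(0, 0), // keep every field octal-friendly
			Format:   tar.FormatUSTAR, // so the header stays exactly 512 bytes
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return offset, err
		}
		offset += hdrSize
	}
	return offset, nil
}
```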
I am not suggesting we change that field; 512 is a hardcoded part of the tar format. |
having the bytes generated depend on a non-obvious property of the destination sounds confusing to me; I think we wouldn't want to do this. Regardless, it should go into its own proposal. Let's not side-track this discussion. |
The Go standard library supports copy_file_range for I/O between files. For most file systems, this is a nice speed-up, because it avoids copying data. For certain filesystems, like BTRFS and XFS, CoW causes such writes to be metadata-only, speeding them up by 10x or so. In principle, uncompressed tar files can be read and written using copy_file_range, as the file format does not checksum the data (see golang/go#70807). Currently, podman makes this impossible, as the archive package liberally inserts buffers and pipes. This commit marks a couple of places that should be revisited.
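For reference, this is the shape the standard library already accelerates for plain files, and the chain the proposal would extend through *tar.Writer (paths are illustrative):

```go
package main

import (
	"io"
	"log"
	"os"
)

func main() {
	src, err := os.Open("big.bin") // illustrative path
	if err != nil {
		log.Fatal(err)
	}
	defer src.Close()

	dst, err := os.Create("copy.bin") // illustrative path
	if err != nil {
		log.Fatal(err)
	}
	defer dst.Close()

	// io.Copy finds (*os.File).ReadFrom on dst; on Linux that attempts
	// copy_file_range(2), so the data never enters userspace. On
	// XFS/BTRFS with reflinks the kernel can share extents instead of
	// copying, making the operation effectively metadata-only.
	if _, err := io.Copy(dst, src); err != nil {
		log.Fatal(err)
	}
}
```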
A change with some numbers is at https://go-review.googlesource.com/c/go/+/642736. For tmpfs and ext4, it yields a 10-20% speed improvement. I have trouble measuring consistent results with BTRFS on Fedora 41. A standalone program can seemingly copy large files in no time at all (speeds of ~1000 Gb/s), but these aren't reflected in the benchmarks. Despite this, |
it looks like btrfs needs to have fsync called on the file for it to be eligible for metadata-only copies. Adding
|
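(The comment is cut off above; presumably it continued with a sync call.) If the observation holds, the workaround would look something like this sketch, assuming `io` and `os` imports, with the helper name mine:

```go
// flushThenCopy fsyncs the source before using it as a copy_file_range
// source; per the observation above, btrfs appears to require the flush
// before it will satisfy the copy as a metadata-only reflink. This is
// kernel-behavior dependent, so treat it as a workaround sketch.
func flushThenCopy(dst, src *os.File) (int64, error) {
	if err := src.Sync(); err != nil { // fsync(2) on the source
		return 0, err
	}
	return io.Copy(dst, src) // (*os.File).ReadFrom → copy_file_range on Linux
}
```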
This inserts dummy headers (empty name, char device 0,0) so files start at a given blocksize boundary. This will allow the tar package to execute file copies in the kernel. For filesystems like XFS and BTRFS, this allows for reflink copying, which effectively makes copying data free. See golang/go#70807 for background.
This inserts dummy headers (empty name, char device 0,0) so files within the tar archive start at Stat_t.Blksize boundaries. This is a precondition for using reflink to share data blocks in BTRFS and XFS. See golang/go#70807 for background.
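On the consuming side, a reader that understands this convention only needs to recognize and skip those entries; a minimal sketch matching the header shape described above (the predicate name is mine):

```go
// isAlignmentPadding reports whether hdr looks like one of the dummy
// alignment entries described above: empty name, character device with
// major/minor 0,0, and no body.
func isAlignmentPadding(hdr *tar.Header) bool {
	return hdr.Typeflag == tar.TypeChar &&
		hdr.Name == "" &&
		hdr.Devmajor == 0 &&
		hdr.Devminor == 0 &&
		hdr.Size == 0
}
```

A loop over tr.Next() would simply `continue` whenever this returns true.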
Proposal Details
the container ecosystem (podman, docker) spends its days creating and consuming huge .tar files. There is potential for a significant speed-up here by having the tar package use zero-copy file transport.
The change is straightforward but involves an API addition, so I'm opening a proposal.
with the following change, tarring up a 2G file from tmpfs to tmpfs goes from 2.0s to 1.3s.
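For a sense of what callers look like: a typical archiving loop already funnels file contents through io.Copy, so nothing on the caller side would change; io.Copy would transparently pick up the proposed (*tar.Writer).ReadFrom and forward to the underlying file's zero-copy path. File names below are illustrative:

```go
package main

import (
	"archive/tar"
	"io"
	"log"
	"os"
)

// addFile appends one regular file to the archive. With the proposal,
// the io.Copy below would dispatch to (*tar.Writer).ReadFrom and, when
// the destination is an *os.File, down to copy_file_range.
func addFile(tw *tar.Writer, path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		return err
	}
	hdr, err := tar.FileInfoHeader(fi, "")
	if err != nil {
		return err
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	_, err = io.Copy(tw, f) // unchanged caller code
	return err
}

func main() {
	out, err := os.Create("out.tar") // illustrative
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	tw := tar.NewWriter(out)
	if err := addFile(tw, "input.bin"); err != nil { // illustrative
		log.Fatal(err)
	}
	if err := tw.Close(); err != nil {
		log.Fatal(err)
	}
}
```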