copyonwrite's Introduction

The CopyOnWrite library provides a .NET layer over Windows OS-specific logic for copy-on-write linking of files (a.k.a. CoW, file cloning, or reflinking). CoW linking copies a file without actually copying the original file's bytes from one disk location to another. The filesystem ensures that if the original file is modified or deleted, the CoW-linked files remain unmodified, by lazily copying the original file's bytes into each link. Unlike symlinks or hardlinks, writes to CoW links do not write through to the original file, because the filesystem breaks the link and copies lazily. This enables scenarios like file caches where a single copy of a file held in a content-addressable or other store is safely linked to many locations in a filesystem with low I/O overhead.

NOTE: Only Windows functionality is implemented. On Linux and Mac, using File.Copy is sufficient, as it automatically uses CoW on Linux (starting in .NET 7, as long as a CoW-compatible filesystem like Btrfs is in use) and on Mac (.NET 8). A similar PR for Windows did not make it into .NET; however, work is underway to integrate CoW into the Windows API in a possible future release.

This library allows a .NET developer to:

  • Discover whether CoW links are allowed between two filesystem paths,
  • Discover whether CoW links are allowed for a directory tree based at a specific root directory,
  • Create CoW links,
  • Find filesystem CoW link limits.

Discovery is important, as different operating systems and different filesystems available for those operating systems provide varying levels of CoW link support:

  • Windows: The default NTFS filesystem does NOT support CoW, but the ReFS filesystem and 2023's new Dev Drive do.
  • Linux: Btrfs, XFS, and ZFS support CoW, while ext4 does not.
  • Mac: APFS (Apple File System) supports CoW by default.

When using this library you may need to create a wrapper that copies the file if CoW is not available.

Example

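using System;
using Microsoft.CopyOnWrite;

// Example paths; replace with your own source file and link location.
string existingFile = "existing.bin";
string cowLinkFilePath = "existing-clone.bin";

ICopyOnWriteFilesystem cow = CopyOnWriteFilesystemFactory.GetInstance();
bool canCloneInCurrentDirectory = cow.CopyOnWriteLinkSupportedInDirectoryTree(Environment.CurrentDirectory);
if (canCloneInCurrentDirectory)
{
    cow.CloneFile(existingFile, cowLinkFilePath);
}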
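
The wrapper mentioned above, which falls back to a plain copy when CoW is not available, might look like the following minimal sketch. It uses only the library calls shown in the example (GetInstance, CopyOnWriteLinkSupportedInDirectoryTree, CloneFile) and requires the CopyOnWrite NuGet package. The class and method names are hypothetical, and the sketch assumes the source file lives on the same volume as the destination directory; it is not the library's own fallback logic.

```csharp
using System;
using System.IO;
using Microsoft.CopyOnWrite;

// Hypothetical helper, not part of the library.
public static class CowFileCopier
{
    private static readonly ICopyOnWriteFilesystem Cow = CopyOnWriteFilesystemFactory.GetInstance();

    // Clones the file when the destination directory supports CoW links,
    // otherwise performs an ordinary byte copy.
    public static void CloneOrCopy(string source, string destination)
    {
        string fullDestination = Path.GetFullPath(destination);
        string destinationDir = Path.GetDirectoryName(fullDestination)
            ?? throw new ArgumentException("Destination must be a file path.", nameof(destination));

        // Assumption: the source is on the same volume as the destination directory.
        if (Cow.CopyOnWriteLinkSupportedInDirectoryTree(destinationDir))
        {
            Cow.CloneFile(source, fullDestination);
        }
        else
        {
            File.Copy(source, fullDestination);
        }
    }
}
```

A caller would then use CowFileCopier.CloneOrCopy(existingFile, cowLinkFilePath) in place of the direct CloneFile call.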

OS-specific caveats

Windows

File clones on Windows do not actually allocate space on-drive for the clone. This has a good and a possibly bad implication:

  • Good: You save space on-disk, as the clones only take up space for region clone metadata (small).
  • Possibly bad: If cloned files are opened for append or random-access write, the lazy materialization of the original content into the opened file may result in disk out-of-space errors.

Release History

  • 0.3.7 September 2023: Fix #30 - ignore ACCESS_DENIED on volume enumeration to avoid need to escalate privilege on Windows.
  • 0.3.6 July 2023: Set AssemblyVersion to 0.9.9999.0 to allow mixing different minor-version binaries from different packages in the same appdomain/process.
  • 0.3.5 July 2023: Set AssemblyVersion to 0.0.0.1 to allow mixing different minor-version binaries from different packages in the same appdomain/process.
  • 0.3.4 July 2023: Handle locked BitLocker volume during volume scan.
  • 0.3.3 July 2023: For Linux and Mac unimplemented filesystems, return false from CopyOnWriteLinkSupported... methods to avoid the need for checking for Windows OS before calling.
  • 0.3.2 February 2023: Fix issue with ERROR_UNRECOGNIZED_VOLUME returned from some volumes causing an error on initialization on some machines.
  • 0.3.1 February 2023: Fix issue with Windows drive information scanning hanging reading removable SD Card drives. Updated README with Windows clone behavior.
  • 0.3.0 January 2023: Remove Windows serialization by path along with CloneFlags.NoSerializedCloning and the useCrossProcessLocksWhereApplicable flag to CopyOnWriteFilesystemFactory. The related concurrency bug in Windows was fixed in recent patches and retested on Windows 11.
  • 0.2.2 January 2023: Fix mismatched sparseness when CloneFlags.DestinationMustMatchSourceSparseness was used (#17)
  • 0.2.1 September 2022: Add detection for DOS SUBST drives as additional source of mappings.
  • 0.2.0 September 2022: Improve documentation for ReFS parallel cloning bug workarounds. Improve Windows cloning performance by 7.2% by using sparse destination files. Default behavior change to leave destination file sparse and replaced CloneFlags.NoSparseFileCheck with DestinationMustMatchSourceSparseness, hence minor version increase.
  • 0.1.13 September 2022: Fix CloneFlags to use individual bits.
  • 0.1.12 September 2022: Add new factory flag that sets a mode to require cross-process Windows mutexes for safe source file locking to avoid a ReFS concurrency bug. Add optimization to allow bypassing redundant Path.GetFullPath() when caller has done it already.
  • 0.1.11 September 2022: Serialize Windows cloning on source path to work around ReFS limitation in multithreaded cloning.
  • 0.1.10 September 2022: Fix missing destination file failure detection.
  • 0.1.9 September 2022: Add explicit cache invalidation call to interface. Update Windows implementation to detect ReFS mount points that are not drive roots, e.g. mounting D:\ (ReFS volume) under C:\ReFS.
  • 0.1.8 April 2022: Add overload for CoW clone to allow bypassing some Windows filesystem feature checks
  • 0.1.7 April 2022: Perf improvement for Windows CoW link creation by reducing kernel round-trips
  • 0.1.6 April 2022: Perf improvement for all Windows APIs
  • 0.1.5 October 2021: Separate exception type for when link limit is exceeded. Mac and Linux throw NotSupportedException.
  • 0.1.4 October 2021: Fix doc XML naming. Mac and Linux throw NotSupportedException.
  • 0.1.3 October 2021: Bug fixes for Windows. Mac and Linux throw NotSupportedException.
  • 0.1.2 October 2021: Performance fixes for Windows. Mac and Linux throw NotSupportedException.
  • 0.1.1 October 2021: Bug fixes for Windows. Mac and Linux throw NotSupportedException.
  • 0.1.0 July 2021: Windows ReFS support. Mac and Linux throw NotSupportedException.

Related Works

Contributing

This project welcomes contributions and suggestions. See CONTRIBUTING.md.

Running Unit Tests on Windows

If you have a local ReFS drive volume on which to run ReFS related tests, set the following user or system level environment variable:

CoW_Test_ReFS_Drive=D:\

(You may need to exit and restart VS, VSCode, or consoles after setting this.) When this env var is not available, unit tests create and mount a local ReFS VHD for testing and must be run elevated (as admin), e.g. by opening Visual Studio as an admin before opening the solution.

Performance Comparisons

See benchmark data.

copyonwrite's People

Contributors

erikmav, johnterickson, microsoft-github-operations[bot], microsoftopensource

copyonwrite's Issues

Failures on Windows when running in parallel

When running Parallel.For(0, 10000, i => CloneFile("sourceFile", $"file{i}")), typically 3-5 of the 10K calls fail with winerror 134. This might be a corruption issue. Observed in an x64 test.

Memory corruption on CloneFile in Windows x86

Need more testing and regression testing for 32-bit x86. When trying to integrate library into MSBuild net472 x86 the process core-dumps with memory corruption errors when running CloneFile().

Ensure destination deleted if exists before overwriting

MSBuild Copy task has recently added this important logic:
https://github.com/dotnet/msbuild/blob/c36a54ed3308d1516ffe1a86b9086c42e4ca996f/src/Tasks/Copy.cs#L288-L294

If the Copy task sees that the destination exists, it deletes it before overwriting. This is important, because if the destination is hardlinked to some other file, previously the Copy task used to effectively overwrite all other hardlinked copies of the content, and this caused corruption that's hard to investigate.

Let's ensure the Copy logic in this repo does the same thing (delete the file before overwriting to "unlink" it in case it's a hardlink)
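
As an illustration of the delete-before-overwrite idea described above, here is a minimal sketch using only System.IO. The class and method names are hypothetical, and this is neither the MSBuild Copy task's code nor this repo's implementation.

```csharp
using System.IO;

// Hypothetical helper illustrating delete-before-overwrite.
public static class SafeOverwrite
{
    public static void CopyReplacingDestination(string source, string destination)
    {
        if (File.Exists(destination))
        {
            // Deleting removes only this directory entry. If the destination was a
            // hardlink, other links to the same content keep their data untouched.
            File.Delete(destination);
        }

        // The destination no longer exists, so this creates a fresh file.
        File.Copy(source, destination);
    }
}
```

The same pre-delete step could precede a CloneFile call instead of File.Copy.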

UnauthorizedAccessException reading container volume when low-priv

Reported by a Windows customer - reading a container volume's metadata requires escalation that should not be necessary, resulting in an UnauthorizedAccessException.

F:\w\b\some.proj(13,5): error MSB4061: The "Copy" task could not be instantiated from "C:\Users\XXXX\.nuget\packages\microsoft.build.copyonwrite\1.0.265\Sdk\..\build\netstandard2.0\Microsoft.Build.CopyOnWrite.dll".
F:\w\b\some.proj(13,5): error MSB4061: System.TypeInitializationException: The type initializer for 'Microsoft.Build.Tasks.Copy' threw an exception. ---> System.UnauthorizedAccessException: Failed retrieving volume information for \\?\Volume{629458e4-0000-0000-0000-010000000000}\ with winerror 5
F:\w\b\some.proj(13,5): error MSB4061: at Microsoft.CopyOnWrite.Windows.VolumeInfoCache.GetVolumeInfo(VolumePaths volumePaths) in D:\CoW\lib\Windows\VolumeInfoCache.cs:line 153
F:\w\b\some.proj(13,5): error MSB4061: at Microsoft.CopyOnWrite.Windows.VolumeInfoCache.BuildFromCurrentFilesystem() in D:\CoW\lib\Windows\VolumeInfoCache.cs:line 36
F:\w\b\some.proj(13,5): error MSB4061: at Microsoft.CopyOnWrite.CopyOnWriteFilesystemFactory.Create() in D:\CoW\lib\CopyOnWriteFilesystemFactory.cs:line 48

Test break on lack of expected failure for exceeding Windows clone max

From unit tests - sporadic failures locally and in CI pipeline:

Assert.ThrowsException failed. No exception thrown. MaxCloneFileLinksExceededException exception was expected.

Possible explanations:

  • We have some sort of leak in the cloning logic that does not complete all clones before attempting to create the (limit + 1)th clone.
  • Windows itself has lazy accounting for the clone limit, i.e. it's a "loose" limit.

BSOD caused when copying to an alternative data stream

When attempting to clone a file onto an alternate data stream (ADS), a BSOD occurs.

Simple reproduction (F:\ is a Dev Drive):

File.WriteAllText($@"F:\New Text Document.txt", "a");
File.WriteAllText($@"F:\New Text Document 2.txt", "b");
File.WriteAllText($@"F:\New Text Document 2.txt:x", "c");
Microsoft.CopyOnWrite.CopyOnWriteFilesystemFactory.GetInstance().CloneFile($@"F:\New Text Document.txt", $@"F:\New Text Document 2.txt:x");

I discovered this while trying to implement dotnet/runtime#86681. Can you please raise this internally, after verifying it on your end, so we can try to get it fixed quickly in Windows? For now, I will add a workaround to skip ADS and revert all of my testing changes on my local machine, so I can hopefully get a PR into a workable state soon.

Note that my local version doesn't have the issue where it may get confused about which volume it's on by a path like this; it should definitely know it's on F:\, as it uses Windows APIs to correctly determine the volume. I'm happy to triple-check this if you'd like, but a BSOD should obviously not be caused regardless.

Windows version:
Edition: Windows 11 Pro Insider Preview
Version: 22H2
OS build: 23481.1000
Experience: Windows Feature Experience Pack 1000.23481.1000.0

Image of BSOD if you want it for some reason (which I have because it's in a VM): [image not preserved]

ReFS parallel file cloning silent failure bug

Windows-only.

Symptoms: When cloning a single source file in parallel (multiple threads or processes) to multiple destinations, sometimes success is returned but the file region assignment is never completed. This results in the destination file having all zeroes for its content.

Cause: When a file is not completely flushed to disk when the region clone operation starts, ReFS tries to flush the file to disk first. There is a race on multiple threads where failure to flush is ignored and the region clone proceeds anyway.

Workarounds:

  • Serialize cloning system-wide per source path. This library takes this approach by default for single-process cloning by using an in-memory dictionary. You can opt into system-wide serialization using kernel mutexes by specifying useCrossProcessLocksWhereApplicable = true when calling CopyOnWriteFilesystemFactory.GetInstance().
  • Ensure the source file is completely flushed to disk before cloning. This can be accomplished through one of the approaches below. Note that if you use these approaches, you can increase cloning performance by using CloneFlags.NoSerializedCloning on your CloneFile calls.
    • Use the FlushFileBuffers API to force the file to be flushed from memory. This should be called at the end of writing the source file to disk while the file write handle is still open. Alternatively, it can be called on a new handle to the file opened with GENERIC_WRITE.
    • When writing the source file, open the file handle with FILE_FLAG_NO_BUFFERING. However, note that this requires the code writing to the file to write chunks that are aligned with the sector size of the underlying volume and are a multiple of that sector size.
    • When writing the source file, open the file handle with FILE_FLAG_WRITE_THROUGH. This forces a flush on every write, which can decrease performance significantly.
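
From managed code, the flush-based approaches above can be approximated with standard FileStream features, as in the following sketch. It assumes the source file is written by .NET code you control. The class and method names are hypothetical. On Windows, FileStream.Flush(true) results in a FlushFileBuffers call, and FileOptions.WriteThrough corresponds to FILE_FLAG_WRITE_THROUGH.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper showing two ways to get the source file onto disk before cloning.
public static class FlushedWriter
{
    // Writes the content, then forces OS buffers to disk while the write handle is still open.
    public static void WriteAndFlush(string path, string content)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(content);
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            stream.Write(bytes, 0, bytes.Length);
            // flushToDisk: true flushes intermediate OS file buffers, not just the stream's own buffer.
            stream.Flush(true);
        }
    }

    // Opens the file in write-through mode so each write goes to disk; simpler but typically slower.
    public static void WriteThrough(string path, string content)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(content);
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None,
            4096, FileOptions.WriteThrough))
        {
            stream.Write(bytes, 0, bytes.Length);
        }
    }
}
```

Either method would be called when producing the source file, before any parallel CloneFile calls against it.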

Resolution: We currently have only workarounds (see above) and are awaiting resolution with the Windows team.

This issue tracks resolution in the Windows codebase and other approaches to work around the problem. Related to #1. PRs that have added workarounds:

  • #6 - added single-process (in-memory dictionary) serialization of cloning of individual files.
  • #9 - added optional cross-process serialization using kernel named mutexes, one per file path.
  • #13 - documentation
  • (See any linked future PRs)

DestinationMustMatchSourceSparseness flag not working, block-cloned file is always sparsed

If the source file is NOT sparse and CloneFlags.DestinationMustMatchSourceSparseness is set, the block-cloned destination file should NOT be sparse; however, the destination file is sparse.

This Windows command can be used to check a file's sparseness:
fsutil sparse queryflag C:\Temp\myfile.vhdx

To mark the destination file as not sparse, a FILE_SET_SPARSE_BUFFER structure should be passed; currently a bool[] value is used.

This sample code successfully sets a file as not sparse. Tested on Windows 2016 1607 and Windows 10 21H2. The only difference from the code in the CloneFileAsync method is the use of a FILE_SET_SPARSE_BUFFER struct.

var sparseBuffer = new FILE_SET_SPARSE_BUFFER
{
    SetSparse = false
};

if (!NativeMethods.DeviceIoControl(
    destFileHandle,
    NativeMethods.FSCTL_SET_SPARSE,
    sparseBuffer,
    Marshal.SizeOf(sparseBuffer),
    null,
    0,
    ref numBytesReturned,
    IntPtr.Zero))
{
    int lastErr = Marshal.GetLastWin32Error();
    NativeMethods.ThrowSpecificIoException(lastErr,
        $"Failed to turn off file sparseness with winerror {lastErr} for destination file '{destination}'");
}
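
The sample above does not show how the struct is declared. One plausible managed declaration is sketched below, based on the Win32 definition of FILE_SET_SPARSE_BUFFER, which contains a single BOOLEAN field named SetSparse. The marshaling attributes and the SparseBufferInfo helper are assumptions made for illustration, not the issue author's or the library's actual code.

```csharp
using System.Runtime.InteropServices;

// Managed mirror of the Win32 FILE_SET_SPARSE_BUFFER structure (assumed declaration).
[StructLayout(LayoutKind.Sequential)]
public struct FILE_SET_SPARSE_BUFFER
{
    // Win32 BOOLEAN is a single byte; without this attribute .NET marshals
    // bool as the four-byte BOOL type.
    [MarshalAs(UnmanagedType.U1)]
    public bool SetSparse;
}

// Hypothetical helper exposing the unmanaged size that would be passed
// as the input buffer length to DeviceIoControl.
public static class SparseBufferInfo
{
    public static int InputBufferSize => Marshal.SizeOf<FILE_SET_SPARSE_BUFFER>();
}
```

With this declaration the marshaled input buffer is one byte long, matching the native structure.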

Stress test break reading content back from ReFS CoW link

Sporadic failure. Occurs in both the 32-bit and 64-bit test suites. One interesting data point is that the test failure seems to occur on a test iteration that is a multiple of 510 (test510, test1020 seen in practice).

Test method Microsoft.CopyOnWrite.Tests.CopyOnWriteTests_Windows.ReFSPositiveDetectionAndCloneFileCorrectBehavior threw exception:
System.AggregateException: One or more errors occurred. (Assert.AreEqual failed. Expected:<1234abcd>. Actual:<\0\0\0\0\0\0\0\0>. B:\Stress1\test510) ---> Microsoft.VisualStudio.TestTools.UnitTesting.AssertFailedException: Assert.AreEqual failed. Expected:<1234abcd>. Actual:<\0\0\0\0\0\0\0\0>. B:\Stress1\test510

Stack Trace: 
<>c__DisplayClass10_1.b__0(Int32 i) line 241
<>c__DisplayClass19_01.<ForWorker>b__1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion) --- End of stack trace from previous location --- <>c__DisplayClass19_01.b__1(RangeWorker& currentWorker, Int32 timeout, Boolean& replicationDelegateYieldedBeforeCompletion)
Replica.Execute()
--- End of inner exception stack trace ---
TaskReplicator.Run[TState](ReplicatableUserAction1 action, ParallelOptions options, Boolean stopOnFirstFailure) Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action1 body, Action2 bodyWithState, Func4 bodyWithLocal, Func1 localInit, Action1 localFinally)
--- End of stack trace from previous location ---
Parallel.ThrowSingleCancellationExceptionOrOtherException(ICollection exceptions, CancellationToken cancelToken, Exception otherException)
Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action1 body, Action2 bodyWithState, Func4 bodyWithLocal, Func1 localInit, Action1 localFinally) Parallel.For(Int32 fromInclusive, Int32 toExclusive, Action1 body)
CopyOnWriteTests_Windows.StressTestCloning(String refsRoot) line 239
CopyOnWriteTests_Windows.ReFSPositiveDetectionAndCloneFileCorrectBehavior() line 157
CopyOnWriteTests_Windows.ReFSPositiveDetectionAndCloneFileCorrectBehavior() line 157
ThreadOperations.ExecuteWithAbortSafety(Action action)

Standard Output: 
Creating file with size 2147484672
Running parallel stress test for 8175 CoW links
Running parallel stress test for 8175 CoW links

Enable SourceLink

I'm debugging Microsoft.Build.CopyOnWrite.dll and I'm not seeing SourceLink.
