Hardlinks and Docker

Gregory Szorc gregory.szorc at gmail.com
Wed Mar 29 21:15:43 EDT 2017



> On Mar 25, 2017, at 13:47, Jun Wu <quark at fb.com> wrote:
> 
> Excerpts from Jun Wu's message of 2017-03-25 12:19:46 -0700:
>> Excerpts from Gregory Szorc's message of 2017-03-25 10:54:10 -0700:
>>> Jun, et al:
>>> 
>>> The recent patches around hardlink detection are great! However, there are
>>> still some failures executing tests in certain Docker configurations, such
>>> as when using the overlay2 storage driver (overlayfs).
>>> 
>>> --- /mnt/hg/tests/test-hardlinks.t
>>> +++ /mnt/hg/tests/test-hardlinks.t.err
>>> @@ -58,14 +58,14 @@
>>> Create hardlinked clone r2:
>>> 
>>>   $ hg clone -U --debug r1 r2 --config progress.debug=true
>>> -  linking: 1
>>> -  linking: 2
>>> -  linking: 3
>>> -  linking: 4
>>> -  linking: 5
>>> -  linking: 6
>>> -  linking: 7
>>> -  linked 7 files
>>> +  copying: 1
>>> +  copying: 2
>>> +  copying: 3
>>> +  copying: 4
>>> +  copying: 5
>>> +  copying: 6
>>> +  copying: 7
>>> +  copied 7 files
>> 
>> This is unrelated to the statfs change. The "linking" / "copying" is
>> outputed by "util.copyfiles", which remains unchanged before and after the
>> statfs change.
>> 
>>    use hardlink?  | before statfs  | after statfs  
>>   --------------------------------------------------------------
>>    util.copyfile  | never          | sometimes (*, changed)
>>    util.copyfiles | sometimes (**) | sometimes (**, no change)
>> 
>>    (*) when dest is on a whitelisted filesystem - new code
>>    (**) when src and dst has a same st_dev - old code
>> 
>> util.copyfiles was trying to be smart, to use hardlink if src and dst are in
>> a same device:
>> 
>>    # util.copyfiles
>>    if hardlink is None:
>>        hardlink = (os.stat(src).st_dev ==
>>                    os.stat(os.path.dirname(dst)).st_dev)
>> 
>> Quick introduction to Linux's overlayfs:
>> 
>>    lowerdir: usually read-only
>>    upperdir: read-write
>>    overlay:  check upperdir first, if not found, check lowerdir.
>>              for read-only operation, it's like opening the file in
>>              lowerdir directly.
>>              for write operation, make sure materialize (copy) the file to
>>              upperdir, if it's not already there, then open it in upperdir
>>              directly.
>> 
>> So the "st_dev" check may failin the overlay case. src and dst could have
>> different st_dev - one lower, and one upper. I believe that's the root cause
>> of the test failure.
> 
> It seems the explanation is not what actually happened. stat could return
> differently for directories in overlays.
> 
> So the direct fix is to stat the directory instead of the file. I will send
> it soon.

Thank you for looking into this complicated issue! I'm glad I don't have to review your code because this is all over my head at this point ;)

> 
>> 
>> Interestingly, even if both paths are not on a same device, the "link"
>> syscall will success. It does that by materializing the file you're linking
>> first. So on "overlayfs", hardlink works, but it *copies* the file if not
>> materialized.
>> 
>>> Create non-hardlinked clone r3:
>>> 
>>> 
>>> "#require hardlink" is returning true. But hardlinks don't really work on
>>> overlayfs it appears. I see there is now a test-hardlinks-whitelisted.t,
>>> 
>>> which is a copy of test-hardlinks.t but uses the new filesystem detection
>>> code for whitelisting hardlinks. test-hardlinks-whitelisted.t passes under
>>> Docker on overlayfs, which is terrific! But it begs the question: what's
>> 
>> This seems a different topic.
>> 
>> I think test-hardlinks-whitelisted.t should behavior correct (skipped) on
>> overlayfs. (Note: if the test was running under /tmp which may be "tmpfs",
>> things could be different because "tmpfs" is whitelisted). "getfstype" will
>> correctly detect overlayfs regardless whether the test directory is
>> materialized or not. And since overlayfs is not whitelisted, the test will
>> be skipped.
>> 
>> It shows "(unknown)" for now though, I will submit a patch to let it show
>> "overlay". But that's just a display problem.
>> 
>>> our strategy for making tests pass when run under "faulty" filesystems? I
>>> ask because `make docker-*` targets run the test harness and test failures
>>> abort that process. So it would be nice to not have test failures under
>>> Docker.
>>> 
>>> (FWIW there are some permissions related failures running tests under
>>> Docker as well. I encourage others to run `make docker-ubuntu-xenial` and
>>> other docker-* targets to poke at things.)
>> 
>> As I have explained above, the "statfs" code is solid and works with
>> overlay. While the old "st_dev" check is the cause - but I'd say it still
>> does the right thing, see below.
>> 
>> A question is, do we want to change the "st_dev" check? No. Because it is
>> reasonable for overlay, too. Even if we know the "link" syscall can
>> success, but calling it may trigger a file copy, why use it? The current
>> "st_dev" check makes sure the hardlink operation *does not copy* the file,
>> and do work as expected if the overlay has materialized "src" and
>> "dirname(dst)".
>> 
>> Then how to fix it? I think it's "test-hardlinks.t" who needs to be updated
>> to become aware of overlayfs. We can move the overlay sensitive part to a
>> different test file, guarded with a new hghave "overlay" so it gets skipped
>> correctly.


More information about the Mercurial-devel mailing list