Hardlinks and Docker

Jun Wu quark at fb.com
Sat Mar 25 15:19:46 EDT 2017


Excerpts from Gregory Szorc's message of 2017-03-25 10:54:10 -0700:
> Jun, et al:
> 
> The recent patches around hardlink detection are great! However, there are
> still some failures executing tests in certain Docker configurations, such
> as when using the overlay2 storage driver (overlayfs).
> 
> --- /mnt/hg/tests/test-hardlinks.t
> +++ /mnt/hg/tests/test-hardlinks.t.err
> @@ -58,14 +58,14 @@
>  Create hardlinked clone r2:
> 
>    $ hg clone -U --debug r1 r2 --config progress.debug=true
> -  linking: 1
> -  linking: 2
> -  linking: 3
> -  linking: 4
> -  linking: 5
> -  linking: 6
> -  linking: 7
> -  linked 7 files
> +  copying: 1
> +  copying: 2
> +  copying: 3
> +  copying: 4
> +  copying: 5
> +  copying: 6
> +  copying: 7
> +  copied 7 files

This is unrelated to the statfs change. The "linking" / "copying" is
outputed by "util.copyfiles", which remains unchanged before and after the
statfs change.

    use hardlink?  | before statfs  | after statfs  
   --------------------------------------------------------------
    util.copyfile  | never          | sometimes (*, changed)
    util.copyfiles | sometimes (**) | sometimes (**, no change)

    (*) when dest is on a whitelisted filesystem - new code
    (**) when src and dst has a same st_dev - old code

util.copyfiles was trying to be smart, to use hardlink if src and dst are in
a same device:
  
    # util.copyfiles
    if hardlink is None:
        hardlink = (os.stat(src).st_dev ==
                    os.stat(os.path.dirname(dst)).st_dev)

Quick introduction to Linux's overlayfs:

    lowerdir: usually read-only
    upperdir: read-write
    overlay:  check upperdir first, if not found, check lowerdir.
              for read-only operation, it's like opening the file in
              lowerdir directly.
              for write operation, make sure materialize (copy) the file to
              upperdir, if it's not already there, then open it in upperdir
              directly.

So the "st_dev" check may failin the overlay case. src and dst could have
different st_dev - one lower, and one upper. I believe that's the root cause
of the test failure.

Interestingly, even if both paths are not on a same device, the "link"
syscall will success. It does that by materializing the file you're linking
first. So on "overlayfs", hardlink works, but it *copies* the file if not
materialized.
 
>  Create non-hardlinked clone r3:
> 
> 
> "#require hardlink" is returning true. But hardlinks don't really work on
> overlayfs it appears. I see there is now a test-hardlinks-whitelisted.t,
>
> which is a copy of test-hardlinks.t but uses the new filesystem detection
> code for whitelisting hardlinks. test-hardlinks-whitelisted.t passes under
> Docker on overlayfs, which is terrific! But it begs the question: what's

This seems a different topic.

I think test-hardlinks-whitelisted.t should behavior correct (skipped) on
overlayfs. (Note: if the test was running under /tmp which may be "tmpfs",
things could be different because "tmpfs" is whitelisted). "getfstype" will
correctly detect overlayfs regardless whether the test directory is
materialized or not. And since overlayfs is not whitelisted, the test will
be skipped.

It shows "(unknown)" for now though, I will submit a patch to let it show
"overlay". But that's just a display problem.

> our strategy for making tests pass when run under "faulty" filesystems? I
> ask because `make docker-*` targets run the test harness and test failures
> abort that process. So it would be nice to not have test failures under
> Docker.
> 
> (FWIW there are some permissions related failures running tests under
> Docker as well. I encourage others to run `make docker-ubuntu-xenial` and
> other docker-* targets to poke at things.)

As I have explained above, the "statfs" code is solid and works with
overlay. While the old "st_dev" check is the cause - but I'd say it still
does the right thing, see below.

A question is, do we want to change the "st_dev" check? No. Because it is
reasonable for overlay, too. Even if we know the "link" syscall can
success, but calling it may trigger a file copy, why use it? The current
"st_dev" check makes sure the hardlink operation *does not copy* the file,
and do work as expected if the overlay has materialized "src" and
"dirname(dst)".

Then how to fix it? I think it's "test-hardlinks.t" who needs to be updated
to become aware of overlayfs. We can move the overlay sensitive part to a
different test file, guarded with a new hghave "overlay" so it gets skipped
correctly.


More information about the Mercurial-devel mailing list