[PATCH 2 of 4 V2] lfs: infer the blob store URL from paths.default

Matt Harbison mharbison72 at gmail.com
Tue Apr 10 16:09:24 EDT 2018


# HG changeset patch
# User Matt Harbison <matt_harbison at yahoo.com>
# Date 1523154140 14400
#      Sat Apr 07 22:22:20 2018 -0400
# Node ID b784de3b414876f25964778b02e684f9b0a3787f
# Parent  473a8954c957945d1642d08eeefc8fe706597b59
lfs: infer the blob store URL from paths.default

If `lfs.url` is specified, it takes precedence.  However, now that we support
serving blobs via hgweb, we shouldn't *require* this setting.  Less
configuration is better (things will work out of the box once this is sorted
out), and git has similar functionality.

This is not a complete solution: it can't infer the blob store from an
explicitly supplied path, and it should consider `paths.default-push` for push.
The pull solution for that is a bit hacky, and this alone is an improvement for
the vast majority of cases.
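
To illustrate the rule outside of Mercurial, here is a minimal sketch.  The
`infer_lfs_url` helper is hypothetical, and it uses the standard library's
`urllib.parse` instead of `util.url`.  It assumes the `.git/info/lfs` suffix is
simply appended to whatever path `paths.default` carries, and that anything
other than http(s) is left for the user to configure.

```python
from urllib.parse import urlsplit, urlunsplit

# Suffix taken from the git-lfs server discovery convention.
LFS_SUFFIX = ".git/info/lfs"


def infer_lfs_url(lfs_url, default_path):
    """Return the blob store URL, or None if it cannot be inferred.

    Hypothetical helper: `lfs_url` stands in for the `lfs.url` config value
    and `default_path` for `paths.default`.
    """
    # An explicit lfs.url always takes precedence.
    if lfs_url:
        return lfs_url

    parts = urlsplit(default_path or "")

    # Local paths and ssh are not handled; only http(s) is inferred.
    if parts.scheme not in ("http", "https"):
        return None

    # Append the suffix to the existing path (which may be empty).
    return urlunsplit(parts._replace(path=parts.path + LFS_SUFFIX))
```

With these assumptions, a `paths.default` of `https://example.com/repo` maps to
`https://example.com/repo.git/info/lfs`, while a filesystem path yields `None`.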

Even though there are only a handful of references to the saved remote store,
their locations make things complicated:

  1) downloading files on demand in the revlog flag processor
  2) copying to readonlyvfs with bundlerepo
  3) downloading in the file prefetch hook
  4) the canupload()/skipdownload() checks
  5) uploading blobs

Since revlog doesn't have a repo or ui reference, we can't avoid creating a
remote store when the extension is loaded.  While the long term goal is to make
sure the prefetch hook is invoked early for every command for efficiency, this
handling in the flag processor is needed as a last ditch fetch.

In order to support the clone command, the remote store needs to be created
later than when the extension loads, since `paths.default` isn't set until just
before the files are checked out.  Therefore, this patch changes the prefetch
hook to ignore the saved reference, and build a new one.
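
The timing problem can be sketched with stand-ins.  `FakeUI` and `make_store`
are hypothetical and only mimic the shape of the real objects: a store built
when the extension loads cannot see a `paths.default` that clone sets later,
while one rebuilt inside the hook can.

```python
class FakeUI:
    """Hypothetical stand-in for a ui object: (section, name) -> value."""

    def __init__(self):
        self._config = {}

    def setconfig(self, section, name, value):
        self._config[(section, name)] = value

    def config(self, section, name):
        return self._config.get((section, name))


def make_store(ui):
    """Hypothetical factory: describe the store the current config selects."""
    url = ui.config("lfs", "url") or ui.config("paths", "default")
    # With nothing configured, the user would have to be prompted.
    return ("remote", url) if url else ("prompt", None)


ui = FakeUI()

# Built once, when the extension loads: no paths.default exists yet.
saved = make_store(ui)

# The clone command records the source just before files are checked out.
ui.setconfig("paths", "default", "http://localhost:8000/")

# Rebuilt inside the prefetch hook: the new setting is now visible.
fresh = make_store(ui)
```

Here `saved` still reflects the empty configuration, and only `fresh` points at
the clone source.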

The canupload()/skipdownload() checks simply check if the stored instance is a
`_nullremote`.  Since this can only be set via `lfs.url` (which is reflected in
the saved reference), checking only the instance created when the extension
loaded is fine.
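
As a rough sketch of that check (the class names below are hypothetical
stand-ins, not the real blobstore classes), the decision only depends on the
type of the instance saved at load time:

```python
class NullRemote:
    """Hypothetical stand-in for `_nullremote`: every transfer is a no-op."""

    def writebatch(self, pointers, fromstore):
        pass

    def readbatch(self, pointers, tostore):
        pass


class HttpRemote:
    """Hypothetical stand-in for a store that talks to an http(s) endpoint."""

    def __init__(self, url):
        self.url = url


def can_skip_upload(saved_store):
    # Uploading to a null store does nothing, so it can be skipped outright.
    return isinstance(saved_store, NullRemote)
```

Because a null store can only come from `lfs.url`, an inferred store never
changes the answer.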

The blob uploading function is called from several places:

  1) a prepush hook
  2) when writing a new bundle
  3) from infinitepush

The prepush hook gets an exchange.pushop, so it has a path to where the push is
going.  The bundle writer and infinitepush don't.  Further, bundle creation for
things like strip and amend is causing blobs to be uploaded.  This seems wrong,
but I don't want to sidetrack this patch by sorting that out, so punt on trying
to handle explicit push paths or `paths.default-push`.

I also think that sending blobs to a remote store when pushing to a local repo
is wrong.  This functionality predates the usercache, so perhaps that's the
reason for it.  I've got some patches floating around to stop sending blobs
remotely in this case, and instead write directly to the other repo's blob
store.  But the tests for corruption handling weren't happy with this change,
and I don't have time to rewrite them.  So exclude filesystem based paths from
this for now.

I don't think there's much of a chance to implement `paths.remote:lfsurl` style
configs, given how early these are resolved vs how late the remote store is
created.  But git has it, so I threw a TODO in there, in case anyone has ideas.

I have no idea why this is now doing http auth twice when it wasn't before.  I
don't think the original blobstore's url is ever being used in these cases.

diff --git a/hgext/lfs/__init__.py b/hgext/lfs/__init__.py
--- a/hgext/lfs/__init__.py
+++ b/hgext/lfs/__init__.py
@@ -87,7 +87,9 @@ Configs::
     #   git-lfs endpoint
     # - file:///tmp/path
     #   local filesystem, usually for testing
-    # if unset, lfs will prompt setting this when it must use this value.
+    # if unset, lfs will assume the repository at ``paths.default`` also handles
+    # blob storage for http(s) URLs.  Otherwise, lfs will prompt to set this
+    # when it must use this value.
     # (default: unset)
     url = https://example.com/repo.git/info/lfs
 
diff --git a/hgext/lfs/blobstore.py b/hgext/lfs/blobstore.py
--- a/hgext/lfs/blobstore.py
+++ b/hgext/lfs/blobstore.py
@@ -532,8 +532,29 @@ def _verify(oid, content):
                           hint=_('run hg verify'))
 
 def remote(repo):
-    """remotestore factory. return a store in _storemap depending on config"""
+    """remotestore factory. return a store in _storemap depending on config
+
+    If ``lfs.url`` is specified, use that remote endpoint.  Otherwise, try to
+    infer the endpoint, based on the remote repository using the same path
+    adjustments as git.  As an extension, 'http' is supported as well so that
+    ``hg serve`` works out of the box.
+
+    https://github.com/git-lfs/git-lfs/blob/master/docs/api/server-discovery.md
+    """
     url = util.url(repo.ui.config('lfs', 'url') or '')
+    if url.scheme is None:
+        # TODO: investigate 'paths.remote:lfsurl' style path customization,
+        # and fall back to inferring from 'paths.remote' if unspecified.
+        defaulturl = util.url(repo.ui.config('paths', 'default') or b'')
+
+        # TODO: support local paths as well.
+        # TODO: consider the ssh -> https transformation that git applies
+        if defaulturl.scheme in (b'http', b'https'):
+            defaulturl.path = (defaulturl.path or b'') + b'.git/info/lfs'
+
+            url = util.url(bytes(defaulturl))
+            repo.ui.note(_('lfs: assuming remote store: %s\n') % url)
+
     scheme = url.scheme
     if scheme not in _storemap:
         raise error.Abort(_('lfs: unknown url scheme: %s') % scheme)
diff --git a/hgext/lfs/wrapper.py b/hgext/lfs/wrapper.py
--- a/hgext/lfs/wrapper.py
+++ b/hgext/lfs/wrapper.py
@@ -257,7 +257,9 @@ def _prefetchfiles(repo, ctx, files):
             pointers.append(p)
 
     if pointers:
-        repo.svfs.lfsremoteblobstore.readbatch(pointers, localstore)
+        # Recalculating the repo store here allows 'paths.default' that is set
+        # on the repo by a clone command to be used for the update.
+        blobstore.remote(repo).readbatch(pointers, localstore)
 
 def _canskipupload(repo):
     # if remotestore is a null store, upload is a no-op and can be skipped
diff --git a/tests/test-lfs-serve.t b/tests/test-lfs-serve.t
--- a/tests/test-lfs-serve.t
+++ b/tests/test-lfs-serve.t
@@ -34,7 +34,6 @@ for flag '0x2000'!" if the extension is 
 masked by the Internal Server Error message).
   $ cat >> $HGRCPATH <<EOF
   > [lfs]
-  > url=file:$TESTTMP/dummy-remote/
   > usercache = null://
   > threshold=10
   > [web]
diff --git a/tests/test-lfs-test-server.t b/tests/test-lfs-test-server.t
--- a/tests/test-lfs-test-server.t
+++ b/tests/test-lfs-test-server.t
@@ -157,6 +157,7 @@ Clear the cache to force a download
   resolving manifests
    branchmerge: False, force: False, partial: False
    ancestor: 000000000000, local: 000000000000+, remote: 99a7098854a3
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 311 (git-server !)
   Content-Length: 352 (hg-server !)
@@ -328,6 +329,7 @@ Clear the cache to force a download
   resolving manifests
    branchmerge: False, force: False, partial: False
    ancestor: 99a7098854a3, local: 99a7098854a3+, remote: dfca2c9e2ef2
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 608 (git-server !)
   Content-Length: 670 (hg-server !)
@@ -417,6 +419,7 @@ TODO: give the proper error indication f
   resolving manifests
    branchmerge: False, force: True, partial: False
    ancestor: dfca2c9e2ef2+, local: dfca2c9e2ef2+, remote: dfca2c9e2ef2
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 311 (git-server !)
   Content-Length: 183 (hg-server !)
@@ -516,6 +519,7 @@ Archive will prefetch blobs in a group
   $ rm -rf .hg/store/lfs `hg config lfs.usercache`
   $ hg archive --debug -r 1 ../archive
   http auth: user foo, password ***
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 905 (git-server !)
   Content-Length: 988 (hg-server !)
@@ -611,6 +615,7 @@ Cat will prefetch blobs in a group
   $ rm -rf .hg/store/lfs `hg config lfs.usercache`
   $ hg cat --debug -r 1 a b c
   http auth: user foo, password ***
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 608 (git-server !)
   Content-Length: 670 (hg-server !)
@@ -685,6 +690,7 @@ Revert will prefetch blobs in a group
   reverting b
   reverting c
   reverting d
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 905 (git-server !)
   Content-Length: 988 (hg-server !)
@@ -781,6 +787,7 @@ Check error message when the remote miss
   resolving manifests
    branchmerge: False, force: True, partial: False
    ancestor: 62fdbaf221c6+, local: 62fdbaf221c6+, remote: ef0564edf47e
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 308 (git-server !)
   Content-Length: 186 (hg-server !)
@@ -892,6 +899,7 @@ Check error message when object does not
   resolving manifests
    branchmerge: False, force: False, partial: False
    ancestor: 000000000000, local: 000000000000+, remote: d2a338f184a8
+  http auth: user foo, password ***
   Status: 200
   Content-Length: 308 (git-server !)
   Content-Length: 186 (hg-server !)

