[PATCH 1 of 2 RFC] lfs: add support for serving blob files

Matt Harbison mharbison72 at gmail.com
Thu Feb 22 01:34:08 EST 2018


On Wed, 21 Feb 2018 01:10:50 -0500, Gregory Szorc  
<gregory.szorc at gmail.com> wrote:

> On Sat, Feb 17, 2018 at 11:15 PM, Matt Harbison <mharbison72 at gmail.com>
> wrote:
>
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison at yahoo.com>
>> # Date 1518937155 18000
>> #      Sun Feb 18 01:59:15 2018 -0500
>> # Node ID ba2e8627d87cfaca00931fe2dcee738c3c9a4f9d
>> # Parent  868bb2821e292cdda6050f229ade1af42d52c9e6
>> lfs: add support for serving blob files
>>
>> There's a ton more to do, especially on the LFS side of things.  But for
>> now,
>> I'm looking for a sanity check that this is the correct approach.  The
>> protocol
>> is tied to http, and Gregory's recent refactoring at least gave me a  
>> clue
>> where
>> to start.  But it doesn't quite fit, because the POST requests end up  
>> with
>> a
>> 'query' string, so HTTP_NOT_FOUND is returned.  I thought maybe I could
>> just
>> wrap hgweb._runwsgi() to slurp the requests and bypass the core
>> completely.  But
>> that's an instance method, and I didn't see a way to ensure every  
>> instance
>> could
>> be subclassed.  That function also does a bit of work to populate the
>> request
>> object.  So I went back to the new protocolhandler, and hacked the core  
>> to
>> not
>> fail on an LFS handler.  (Assuming that this is generally OK, maybe core
>> could
>> check an attribute on it to see if it's a native protocol before doing  
>> the
>> query
>> and permission checks?)
>>
>> The core hasn't been handling PUT requests, which are needed to upload
>> files.  I
>> tried, but failed subclass the handler from the LFS extension, so I just
>> added
>> it in core for now.  (I think I know what I did wrong, but it's trivial  
>> to
>> add
>> to core, so IDK how much effort it's worth trying to wrap more stuff to
>> keep it
>> out of there.)
>>
>> The code is pretty well marked with TODOs.  I know very little about the
>> underlying python framework, or how this code fits into `hg serve` and a
>> normal
>> webserver, so this is the result of a fair amount of trial and error.   
>> On
>> the
>> plus side, test-lfs-test-server.t can have test-lfs-serve swapped for
>> `hg serve`, and the test runs, modulo one point where corruption was
>> introduced.
>> The server should be kicking back an error indicating the corruption,  
>> but
>> it
>> aborts instead (this is already flagged as a TODO).
>>
>
> I really wish we had a proper HTTP server / dispatching framework in play
> to make stuff like this easier to implement.
>
> Anyway, I think teaching core about the LFS URL space is fine. We can
> provide a dummy function that 404s on access to those URLs by default. If
> the LFS extension is enabled, it can wrap the default handlers to  
> implement
> needed functionality.
>
> It would also be rad if we could fix the request object so it is sane.
> Having to read from CGI/WSGI environment variables is a pain and prone to
> errors. Also, the main dispatch function conflates the query string and  
> the
> POSTed URL encoded request body IIRC. It's a lot of horrible code that
> needs some refactoring love. It might be easier to vendor something like
> WebOb...

That looks interesting.  It looks like there's a file descriptor for the  
body, so maybe we won't need to read in the whole file before sending it.   
But all of this is well outside my area of knowledge, so I'll leave that  
for someone else to figure out.

> FWIW, I suspect I may be doing some refactoring here as part of
> implementing version 2 of the HTTP wire protocol. I'm slowly making
> progress...

Cool.  I think there's still plenty of time this cycle to get all of this  
landed.  Hopefully with the hook points I just submitted, there's no other  
stuff I need to change in the core.  Figuring out the lfs behavior is the  
tricky part.  I'm not too worried about changes to the http code.

> If you want to start sending non-RFC patches, I'd start with landing the
> URL routing pieces in core. Make it so requests to the LFS URL space are
> recognized and dispatched accordingly. The rest will sort itself out over
> time.


More information about the Mercurial-devel mailing list