[PATCH] largefiles: makes verify to work on local content by default (issue4242) (BC)
Mads Kiilerich
mads at kiilerich.com
Tue Apr 12 17:07:49 EDT 2016
On 03/02/2016 02:05 PM, liscju wrote:
> # HG changeset patch
> # User liscju <piotr.listkiewicz at gmail.com>
> # Date 1456917749 -3600
> # Wed Mar 02 12:22:29 2016 +0100
> # Node ID 184b0386fad4aff1ec64f0076c74c13e2cf5d036
> # Parent c7f89ad87baef87f00c507545dfd4cc824bc3131
> largefiles: makes verify to work on local content by default (issue4242) (BC)
Sorry for the late response. Here are some thoughts.
Conceptually, there can be many different variations of verification of
largefile repos:
Largefiles to check:
* referenced in specific revisions (currently only available as the
default of checking current revision)
* referenced in repo (--lfa)
* all largefiles in store (cd .hg/largefiles && sha1sum -c <(ls
?????????*| while read f; do echo $f $f; done))
Locations to check:
* check repo store
* check "user cache"
* check default remote server (currently apparently inevitable)
* first of these (should probably be default)
Kind of check:
* check existence (default)
* check actual hash (--lfc, but not possible remotely without downloading)
I don't know which combinations we want to support. The current options
might be a bit arbitrary. If we change/add new options, we should make
sure they are less arbitrary or represent common use cases.
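
To make the "locations" and "kind of check" dimensions above concrete, here is a minimal standalone sketch. It is not the largefiles extension code. The names check_largefile and sha1_of_file are hypothetical, and it assumes each location is a flat directory where a largefile is stored under its SHA-1 hex digest as the file name.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of the file at path, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_largefile(expected_hash, locations, check_contents=False):
    """Check one largefile against an ordered list of local directories.

    Hypothetical helper: the first location that has the file decides
    the result. Returns "ok", "corrupt" or "missing".
    """
    for directory in locations:
        path = os.path.join(directory, expected_hash)
        if not os.path.isfile(path):
            continue  # not here, try the next location
        if check_contents and sha1_of_file(path) != expected_hash:
            return "corrupt"  # file exists but its hash does not match
        return "ok"
    return "missing"
```

With check_contents=False this only checks existence (the cheap default); with check_contents=True it rehashes the file, which is the part that should stay local.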
I don't think we should support remote (wireproto) hash checking. That
would make it too easy to mount a CPU-load DoS. It is more manageable if
"attackers" have to use bandwidth to hash remote largefiles. It is ok
that everybody is responsible for their own repositories. Remote
repositories should be checked by running verify locally on the remote
server.
For the patch, I would prefer to have a first "add test coverage" /
"demonstrate problem" patch - that would make it easier to spot what the
problem is and discuss the problem and fixes.
It seems to me like there currently are two bugs (in design or
implementation):
1. 'hg verify --large' currently invokes remote statlfile for all files,
even if they already are available locally.
2. 'hg verify --large' currently sends a separate remote statlfile command
for each file. It should use batching.
If these two were fixed, I think it would be less of a problem (or
perhaps no problem) that it falls back to using remote. It would exactly
mirror what happens on actual use.
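
A rough sketch of the behaviour I have in mind with both fixes applied: look locally first, then ask the remote once for whatever is left. This is illustrative only and not the actual largefiles or wireproto API. verify_largefiles and remote_stat_batch are hypothetical names, and remote_stat_batch is assumed to take a list of hashes and return a dict mapping each hash to True/False.

```python
import os


def verify_largefiles(hashes, local_dirs, remote_stat_batch):
    """Return {hash: available} using local lookups before the remote.

    remote_stat_batch is a hypothetical callable standing in for a
    batched remote existence check. It is called at most once, and only
    with the hashes that were not found in any local directory.
    """
    results = {}
    unresolved = []
    for h in hashes:
        if any(os.path.isfile(os.path.join(d, h)) for d in local_dirs):
            results[h] = True  # found locally, no remote call needed
        else:
            unresolved.append(h)

    if unresolved:
        try:
            remote = remote_stat_batch(unresolved)  # one batched request
        except OSError:
            remote = {}  # remote unavailable: report these as missing
        for h in unresolved:
            results[h] = bool(remote.get(h, False))
    return results
```

If everything is available locally, the remote is never contacted at all.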
It seems like it currently handles lack of remote server availability by
simply reporting the largefiles it needs as missing. I think that is fine.
/Mads