D7708: lfs: add a switch to `hg verify` to ignore the content of blobs

mharbison72 (Matt Harbison) phabricator at mercurial-scm.org
Mon Dec 23 13:30:26 EST 2019


mharbison72 added a comment.


  In D7708#113600 <https://phab.mercurial-scm.org/D7708#113600>, @indygreg wrote:
  
  > Per Pulkit's review, please change the argument to `--lfs`, as `--no-lfs` should automatically work.
  
  Does my response to @pulkit change your mind any?  (Not sure how familiar you are with the lfs internals- I apparently forgot a lot myself).
  
  > Also, I didn't realize `verify.skipflags` existed and exposes low-level bit flags of revlog storage. That's a massive abstraction leak and I'd be in favor of reinventing that feature so it worked based on string feature names.
  
  Any thumbnail sketch of what this should look like?  Having to know that things are externally stored or that rename checking has special requirements for example, also seems leaky to a lesser degree.  I was surprised that revlog bits were being passed around like this too.  My plan is/was to add an lfs configbool that's documented, which could toggle this on or off, so users don't need to know bit patterns.  (I was also surprised that hex numbers in the config don't work, and it pukes when you fetch the value.)  Obviously I wouldn't want the lack of `--lfs` or `--no-lfs` to override this.  IDK if it's better to have an LFS config for it's bit, and then another for that other bit... or some setting that is a list of skip names.  I assume whatever pattern is designed here would influence the internal replacement.

INLINE COMMENTS

> pulkit wrote in __init__.py:408
> IIUC, we can have a `--lfs` flag and `--no-lfs` will work.
> 
> Also by default, we may not want to verify large files content?

The way it is now, both `--lfs` and `--no-lfs` are accepted.  I actually started with `--lfs`, defaulting to on, but then changed it and forget why.  I'll try to figure that out today, but it may have been the code was simpler when defaulting to verifying the blobs and honoring the config setting.  It may have also had to do with shifting to verifying the locally accessible blobs later in the series.  I didn't want to over complicate things with `--none`, `--local-only`, and `--all`.  The options with largefiles are kind of annoying.  I guess the question is, "What would an average user who doesn't know hg internals want/understand?"

As far as defaulting to *not* verifying blobs, I'm uncertain.  I'd think that verifying local blobs wouldn't add much overhead.  But the deciding factor for me was that there's no way to verify the content in the revlog without verifying the blob itself- the stored hash in the revlog is for the blob, not the stored JSON data.  That's why I asked in IRC the other day about how strong the guarantees are about repo integrity after running a verify.  IIRC, largefiles doesn't verify content by default, but it will at least make sure the revlog for the standin file is OK.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7708/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7708

To: mharbison72, #hg-reviewers, indygreg
Cc: indygreg, pulkit, mercurial-devel


More information about the Mercurial-devel mailing list