Should bfiles auto-status affect all uses of localrepo.status?

Greg Ward greg-hg at gerg.ca
Thu Feb 18 09:20:45 CST 2010


On Wed, Feb 17, 2010 at 5:44 PM, Anton Markov <anton.markov at gmail.com> wrote:
> I would like some input on a question that would affect other parts of
> mercurial and other extensions if 'hg status' incorporates output of 'hg
> bfstatus'.
>
> Currently, 'hg status' simply pretty-prints the output of the
> localrepo.status function which is also used by other parts of mercurial. I
> am trying to decide if I want to (A) alter the output of 'hg status' to
> include 'hg bfstatus' but leave the output of localrepo.status unchanged, or
> (B) to change the output of localrepo.status[1].

Hmmm, good point.  I had not considered how this would affect other
parts of Mercurial.

> Pros (A): doesn't affect other code, easier to implement
> Cons (A): the user may get confused if 'hg status' shows a big file, but
> another command using localrepo.status ignores that file.

Another problem with only wrapping 'hg status' is that it means
TortoiseHg (and other front ends that use the Python API directly)
will not be consistent with the command line.  In fact, I thought that
was the clincher argument in favour of wrapping
localrepository.status().  (Or maaaaaybe dirstate.status(), but that's
probably lower level than we need to go.)

But you are quite right: wrapping localrepository.status() could have
unintended consequences elsewhere in Mercurial.  Well, OK, when in
doubt, audit the code.  I'm just scanning all the Mercurial source for
lines matching '\.status(' but not 'ui\.status'.  Here's what I'm
finding...

1) convert uses it in hg_source: this means that converting hg to svn
would convert hg big files to svn regular files: good
2) extdiff uses it to know what filenames to pass to the external diff
tool: sounds good in principle, assuming the user's diff tool knows
how to compare large binary files. But then it uses filectx to fetch
the file contents, which won't work... except extdiff is coded
defensively and should not blow up.  Phew.  Should be OK, but we might
want to add a test.
3) fetch uses it to check for outstanding uncommitted changes: good,
we should abort if the user tries to fetch (or merge!) with
uncommitted changes to big files
4) gpg uses it to check if you have modified .hgsigs... well, if
.hgsigs is a big file, you are doing weird stuff and deserve whatever
happens to you ;-)
5) hgcia uses it to construct list of files added/modified/removed:
good, this should include big files
6) hgk uses it similarly: good
7) keyword uses it (on selected filenames) to detect uncommitted
changes: good, as with fetch and merge, we want this to include big
files
8) mq uses it to detect uncommitted changes in various contexts: good
9) mq (qnew) uses it when you pass filenames (or use -I or -X) to
figure out which files to include in the patch (I think): hmmm, that
could be problematic
10) mq (qpop) uses it for reasons I don't understand: hmmmm
11) mq (qrefresh) uses it, again for reasons I don't understand
12) purge uses it to look for unknown and ignored files: should be
fine as long as we make sure that neither standins nor big files are
included in unknown or ignored
13) record uses it to know what files it has to ask the user about:
hmmm, could be problematic
14) transplant uses it to check for uncommitted changes: good
15) cmdutil uses it in the general-purpose utility function
bail_if_changed(), which should probably be used by fetch, keyword,
mq, and transplant ;-)
16) cmdutil uses it in changeset_printer._show(), in --debug mode, to
determine what files were changed in the changeset: probably fine
17) commands (forget): looks problematic; I think it will crash with
"file is already untracked" if you try to forget a big file (hmmm)
18) commands (remove): appears similar to forget (hmmm)
19) commands (revert): not clear, could be problematic
20) commands (status): well, of *course* we want this be be affected!
21) commands (summary): ditto
22) context (changectx._status()): sounds correct, but could be
troublesome if someone takes a filename returned by changectx.status()
and tries to get a filectx for it
23) localrepo (tags): checks for local mods to .hgtags: if this is a
big file, you have big problems
24) localrepo (commit): uses it to know what files to commit: trouble! hmmmm
25) patch.diff(): uses it to exit early if no uncommitted changes: probably fine
26) templatekw: something to do with expanding {files} in templates, I
imagine... probably OK

I invite you to repeat this audit, as I might have missed something.
I went in 'hg manifest' order, so maybe you should reverse that order.
 That way your brain is fresh when looking at the stuff that I may
have skimmed over from audit fatigue.  ;-)

On balance, I would say:

  * go ahead and wrap localrepository.status(), since lots of bits of
Mercurial will *benefit*
    from modified big files being considered modified files
  * but carefully test the uses that I have flagged as troublesome.
#1 is localrepository.commit();
    that probably needs a test case.  We might have to document "do not use
    bfiles with MQ or record" if we have problems there and can't
think of a fix.

Greg


More information about the Mercurial-devel mailing list