Phase based repoview?

Jun Wu quark at fb.com
Wed Feb 22 08:44:10 UTC 2017


This is a proposal that adds a new "archived" phase so we have "public >
draft > secret > archived", and archived changesets are hidden. It was an
idea I mentioned casually on IRC, and it interested some people.

I later realized that I had mentioned the same idea 5 months ago, when
marmoute pointed me to do some archeology.

The wiki page https://www.mercurial-scm.org/wiki/PhasesDevel#More_phase has
a section explaining why a "trash" phase is not a good idea. It's mainly
about the exchange part - pulling a new commit on top of a "trashed" commit
may make the new commit "trashed" incorrectly, if we do not change the phase
boundary accordingly.

I don't think that's convincing, because:

  1. The same thing applies to other phases. It's possible to pull a public
     commit on top of a draft commit.
  2. The current behavior ("untrash/unarchive" commits) is reasonable.
     Think of Gmail: if a thread is "archived", a new mail will "revive" the
     thread, and the reader needs to "archive" it again. An alternative
     behavior, just ignoring the head during pull, is also reasonable to me.
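
A minimal sketch of the two pull behaviors above, on a toy repo made of
plain dicts. Nothing here is Mercurial API: pull_head, the policy names and
the choice to pull the new commit as draft are all assumptions made for
illustration.

```python
# Phase ordering from the proposal: public < draft < secret < archived.
PUBLIC, DRAFT, SECRET, ARCHIVED = range(4)

def pull_head(phases, parents, new_rev, new_parents, policy="revive"):
    """Add new_rev to a toy repo; return True if it was added.

    phases:  dict rev -> phase
    parents: dict rev -> tuple of parent revs
    policy:  "revive" unarchives the archived ancestors (Gmail-like),
             "ignore" skips a head that sits on an archived commit.
    """
    on_archived = any(phases[p] == ARCHIVED for p in new_parents)
    if on_archived and policy == "ignore":
        return False
    parents[new_rev] = tuple(new_parents)
    phases[new_rev] = DRAFT  # assumption: pulled commits arrive as draft
    # Move the phase boundary: every archived ancestor becomes draft again.
    stack = list(new_parents)
    while stack:
        rev = stack.pop()
        if phases[rev] == ARCHIVED:
            phases[rev] = DRAFT
            stack.extend(parents[rev])
    return True
```

With "revive", the user has to archive the stack again if they still do not
want it, just like the Gmail thread.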

The old discussion thread is at
https://www.mercurial-scm.org/pipermail/mercurial-devel/2012-January/036736.html

It has some views not found in the wiki page. I'm replying to some of them here:

  > Implementing a trash phase to mark deleted changeset: phase are designed to
  > move a given direction (old phase > new phase) and moving changeset from
  > draft or secret phase to a trash one will break this rule.

  I see "phase" as "heads" with extra information. The rule is nice to have
  but I don't see reasons why it couldn't be broken.

  > I'm pretty sure we need to communicate trashed/obsoleted state as well
  > - otherwise how can I publish RFC draft changesets for people to
  > examine, and then have them get "deleted" later?

  That's obsstore's job, a separate task. 

  If we do not exchange secret/archived changesets, we do not have the
  exchange problem. It's as if the repo does not have archived changesets
  when exchanging.

  (mpm also seemed to believe it's better than other approaches for tracking
   "deleted/hidden" changes, which is, I think, unsurprising)

Phase-based hidden will open up new possibilities:

  - Step 1: Selective "hidden" source of truth

    Who decides "hidden"? It could be the obsstore (currently), or just the
    "archived" phase (which is likely to be faster).

  - Step 2a: More flexible repoview

    With the phase being the main source of the "hidden/filteredrevs"
    information, we could have a new flag like
    "--visibility=public|draft|secret|archived" to let the user choose what
    they'd like to see. "--visibility=public" will hide all non public
    changesets. And "--visibility=archived" will show all changesets, like
    what "--hidden" does today.  "--visibility=secret" will be the default.

    In addition, if we source control the phase changes (should be cheap
    to do so), we could provide a repoview of anything in the past, like
    "--visibility=public@a-previous-commit".

    Another flexibility is to filter the repo with selected draft heads,
    plus all public heads. This means that if the user has too many draft
    heads, they could get a view of the repo with just a single draft branch
    by something like "hg smartlog --visibility=public+head-rev".

  - Performance

    Practically, I think the "heads" with extra information (phase) is a
    *very* efficient way to do repo filtering. If we know some heads are
    "archived", we could even just skip them - just scan visible heads and
    mark them as visible in a len(repo) bitmap, instead of scanning all
    "heads" and filtering them later.

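The head-based filtering described above can be sketched roughly as
follows. This is a toy model, not Mercurial's repoview code: visible_revs,
parents and head_phases are names made up for illustration, the DAG is a
plain list of parent tuples, and each (mid-)head is assumed to carry the
phase of everything that is reachable only through it.

```python
# Phase ordering from the proposal: public < draft < secret < archived.
PUBLIC, DRAFT, SECRET, ARCHIVED = range(4)

def visible_revs(parents, head_phases, visibility=SECRET):
    """Return a len(repo) bitmap; bitmap[rev] == 1 means rev is visible.

    parents:     list indexed by rev, each entry a tuple of parent revs
    head_phases: dict mapping (mid-)head rev -> phase
    visibility:  highest phase that should still be shown
    """
    bitmap = bytearray(len(parents))
    # Heads above the visibility level are skipped up front, so their
    # exclusive ancestors are never walked at all.
    stack = [h for h, phase in head_phases.items() if phase <= visibility]
    while stack:
        rev = stack.pop()
        if bitmap[rev]:
            continue
        bitmap[rev] = 1
        stack.extend(p for p in parents[rev] if not bitmap[p])
    return bitmap

# Toy repo: 0 - 1 - 2 - 5 (draft head), and 1 - 3 - 4 (archived head).
parents = [(), (0,), (1,), (1,), (3,), (2,)]
head_phases = {1: PUBLIC, 5: DRAFT, 4: ARCHIVED}
default_view = visible_revs(parents, head_phases)            # hides 3 and 4
public_view = visible_revs(parents, head_phases, PUBLIC)     # only 0 and 1
hidden_view = visible_revs(parents, head_phases, ARCHIVED)   # everything
```

The cost is proportional to the number of visible revisions, which is the
point of skipping archived heads instead of filtering them afterwards.
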
Although the obsstore has more information to implement the above
possibilities, it's too expensive to be practical. Speeding it up would
likely amount to reinventing phases in some way.

One sad issue is the word "archived" - it somewhat conflicts with the
"hg archive" command. But it is in line with what people learned from Gmail
and some other software. And it's less scary than things like
"deleted/dead/killed" because the commit is still there. That said, I'd like
to hear ideas from naming experts.

If we find compelling reasons that phases should definitely not be mixed
up with "hidden", then since this kind of information is just extra data
attached to "heads" (and "mid-heads"), I think it's still possible to attach
a "hidden" boolean flag to "(mid-)?heads", so a (mid-)head could have 2
fields attached:

  (phase (int), hidden (bool))

That's possible in theory and is still more efficient than obsstore-based
hidden for sure. But I feel that it could be overcomplicated, raising
questions like "shouldn't public commits always be visible?". I'd prefer
just using phases for simplicity.
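
To make the extra complexity concrete, here is a small sketch of the
two-field variant. HeadInfo and is_consistent are hypothetical names, and
the rule that a public head can never be hidden is only one possible answer
to the question above, assumed here for illustration.

```python
from typing import NamedTuple

# Without an "archived" phase, only the three existing phases remain.
PUBLIC, DRAFT, SECRET = range(3)

class HeadInfo(NamedTuple):
    """Data attached to a (mid-)head in the two-field variant."""
    phase: int
    hidden: bool

def is_consistent(info):
    # Assumed rule: public commits are always visible, so the combination
    # (PUBLIC, hidden=True) has to be rejected somewhere.
    return not (info.phase == PUBLIC and info.hidden)
```

With a single "archived" phase there is no such combination to rule out.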

What do you think?

This discussion is a topic that I think is interesting, both feature-wise
and performance-wise. It's more about the future. A decision will help make
some other decisions along the road (like, spend less time on optimizing
obsstore-based hidden), but it does not mean that I'll work on this any time
soon.

