D1606: phases: drop the list with phase of each rev, always comput phase sets
quark (Jun Wu)
phabricator at mercurial-scm.org
Wed Dec 6 21:02:42 EST 2017
quark requested changes to this revision.
quark added a comment.
This revision now requires changes to proceed.
I have been also wanting to remove this O(changelog) space usage for long. I noticed phasecache was the biggest memory consumer among all `repo.` components (dirstate, changelog, etc) when prototyping stateful chg early this year.
Requesting changes mainly because the `if public in phases` branch in `getrevset` has issues. Otherwise it looks good to me.
INLINE COMMENTS
> revlog.c:627
>
> static PyObject *compute_phases_map_sets(indexObject *self, PyObject *args)
> {
I guess the `map` here means a `rev` -> `phase` map, which was the list. Since we now return len(size). Maybe rename the function to `compute_phases_len_sets`.
> revlog.c:1965
> "get an index entry"},
> {"computephasesmapsets", (PyCFunction)compute_phases_map_sets,
> METH_VARARGS, "compute phases"},
Also here, `computephaseslensets`
> phases.py:226
> else:
> - # slow path - enumerate all revisions
> - phase = self.phase
> - revs = (r for r in repo if phase(repo, r) in phases)
> - return smartset.generatorset(revs, iterasc=True)
> + assert not repo.changelog.filteredrevs
> + phases = set(allphases).difference(phases)
The `assert` is incorrect - repo could be "filtered". The old code support filtered repo implicitly by using `for r in repo`, the iterator of `repo` will remove filtered revisions.
> phases.py:226-237
> + assert not repo.changelog.filteredrevs
> + phases = set(allphases).difference(phases)
> + if not phases:
> + return smartset.fullreposet()
> + if len(phases) == 1:
> + [p] = phases
> + revs = self._phasesets[p]
Sorry - I made a mistake using `repo.revs('public()')` to test the logic here. Actually that exercises a different code path <https://www.mercurial-scm.org/repo/hg/file/aa905f9cdcda/mercurial/revset.py#l1616>.
There are practically no users exercising this `public in phases` code path. So issues like `fullreposet` requires a `repo` argument is not exposed.
I think we can either:
- Pass an optional `subset` here then replace the revset implementation to use it:
def getrevset(self, repo, phases, subset=None):
if subset is None:
subset = smartset.fullreposet(repo)
if ...
# fast path
...
return subset & smartset.baseset(revs)
else:
...
- Remove support for calculating `public()` here by just `raise error.ProgrammingError('public() is unsupported')`
I guess the former is preferred eventually. But if you want to land this faster, the latter is easier. I can help writing a follow-up.
REPOSITORY
rHG Mercurial
REVISION DETAIL
https://phab.mercurial-scm.org/D1606
To: joerg.sonnenberger, #hg-reviewers, quark
Cc: quark, mercurial-devel
More information about the Mercurial-devel
mailing list