children, n^2 and solutions?

Nicolas Dumazet nicdumz at gmail.com
Mon Jul 26 20:28:16 CDT 2010


2010/7/27 Jason Harris <jason at jasonfharris.com>:
>
> On Jul 26, 2010, at 8:34 PM, Matt Mackall wrote:
>
>> On Mon, 2010-07-26 at 18:58 +0200, Jason Harris wrote:
>>> Fortunately / unfortunately MacHg works incrementally. That means if I
>>> do hg log -r 200100:200200 it will give me the results about an order
>>> of magnitude faster than hg debugdag. So if I were to use it I would
>>> have to do the log command (to get the information to display to the
>>> users) and then the debugdag command as well to get the structure.
>>> Additionally the Tree can change at many hg instructions. Eg pull,
>>> merge, strip, edit, collapse, etc can cause the tree to change, so
>>> really a full update needs to be done on many commands...
>>
>> Call debugdag -once- the first time you visit a repo, cache the graph on
>> -your- side, and update it incrementally when the repo is changed.
>
> Well thats just it you see. For example say the user does an histedit, or indeed opens up the terminal and calls the command line outside of MacHg (which I detect things have changed through FSEvents), or does many other commands. Basically you have to start again. You can't do it incrementally since you have no idea which revisions changed, which were added, and which were removed.... Thus you can't incrementally do it easily.

I think that you can pretty much guess what's happening for a lot of cases.

- First of all, remember that for small repos, you don't need all this
complexity. You can just rebuilt the dag, because it's cheap.

- Then, for all operations taking place inside MacHg, you know what is
happening and what is changing. You can update your graph cache
accordingly. You don't even have to track precise modifications: for
example, for all MQ operations, just save before the qparent:qtip
range, and wipe it from cache after the operation to refresh it.
Looking up 10 revisions instead of 1 is just fine, it's cheap.

- For operations outside of MacHg? Just consider the operations ADDING
changesets for now.  You can use the set of heads as a cache
key/identifier for your cache, save it on disk with the cache, and you
win. Why? Let's consider that the user browses the changelog in MacHg,
and exits. You save on exit the cache and set of current heads. Then,
outside of MacHg, user pulls changes from somewhere. You fire up
MacHg, and find that changes have happened since last run. Don't
panic: finding out which changesets have been added is very, very
easy: get the new set of heads, changelog.findmissing(common=oldheads,
heads=newheads)

- And then, you're going to ask me "yes, but what about
history-destructive changes outside of MacHg"? First, how to detect
them? I am not sure, but history destructive changes outside of hg
should always change a few cset hashes, and in turn, change the heads
node values. Which means that, when MacHg starts, you fetch the
oldheads set of nodes you had saved, and I think that if one of those
old heads is not in the new changelog, it should be enough to detect
that a destructive operation has happened.

What can you do about those changes? Not much. You can't cover
everything. Sure, you could write complex wrappers of
mq/rebase/transplant to force hg to report somewhere what has changed,
but it's just not worth it.
If MacHg has a decent UI for mq/rebase/transplant, then user should
not be calling mq/rebase/transplant outside of MacHg. Be realistic, I
think that you have to draw the line, and can't support just about any
operation and be cheap everytime. Document that behaviour clearly, and
maybe display an UI message when refreshing the DAG due to this kind
of destructive changes "Dear user, you have modified history outside
of MacHg, but my developers could not implement a crystal ball. Yet.
I'm now refreshing the cache, please refrain from doing this to me in
the future."


Regards,
-Nicolas.

>
> At one stage I did have a command at once stage which gave the changeset hash key of a series of revisions going back wards. Ie it gave relative to the tip the changeset corresponding to the revision of the tip minus 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987 so this way with only 16 changesets requested I got a reasonable idea of where the graph would diverge from the old graph... This seemed to execute reasonably fast and from there I could figure out the approximate place the tree diverged and then start zeroing in on where it changed and then try and incrementally update from there, but this is all fairly non-trivial. In the end I abandoned this approach since it was problematic.
>
> In addition now if I determine that the graph has changed how (other than a debugdag command) do I know what has changed?  Thus I would have to execute a debugdag command and then we have lost incrementally. Of course there are certain measures I could take and already this is more complicated than just making sure I look more than 100 children into the future and then back patch the extended links...
>
> Anyways as I said my humble request would be to be able to use children in log in O(1) or O(n) which would be immensely useful to me, is a clean interface to Mercurial and is very "simple". If somehow someone finds it in their harts to put this on their schedules I would be extremely grateful! Thank you!
>
> Cheers,
>  Jas
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
>



-- 
Nicolas Dumazet — NicDumZ


More information about the Mercurial-devel mailing list