children, n^2 and solutions?

Jason Harris jason at jasonfharris.com
Tue Jul 27 08:50:33 CDT 2010


On Jul 27, 2010, at 3:28 AM, Nicolas Dumazet wrote:

> 2010/7/27 Jason Harris <jason at jasonfharris.com>:
>> 
>> On Jul 26, 2010, at 8:34 PM, Matt Mackall wrote:
>> 
>>> On Mon, 2010-07-26 at 18:58 +0200, Jason Harris wrote:
>>>> Fortunately / unfortunately MacHg works incrementally. That means if I
>>>> do hg log -r 200100:200200 it will give me the results about an order
>>>> of magnitude faster than hg debugdag. So if I were to use it I would
>>>> have to do the log command (to get the information to display to the
>>>> users) and then the debugdag command as well to get the structure.
>>>> Additionally the Tree can change at many hg instructions. Eg pull,
>>>> merge, strip, edit, collapse, etc can cause the tree to change, so
>>>> really a full update needs to be done on many commands...
>>> 
>>> Call debugdag -once- the first time you visit a repo, cache the graph on
>>> -your- side, and update it incrementally when the repo is changed.
>> 
>> Well thats just it you see. For example say the user does an histedit, or indeed opens up the terminal and calls the command line outside of MacHg (which I detect things have changed through FSEvents), or does many other commands. Basically you have to start again. You can't do it incrementally since you have no idea which revisions changed, which were added, and which were removed.... Thus you can't incrementally do it easily.
> 
> I think that you can pretty much guess what's happening for a lot of cases.
> 
> - First of all, remember that for small repos, you don't need all this
> complexity. You can just rebuilt the dag, because it's cheap.
> 
> - Then, for all operations taking place inside MacHg, you know what is
> happening and what is changing. You can update your graph cache
> accordingly. You don't even have to track precise modifications: for
> example, for all MQ operations, just save before the qparent:qtip
> range, and wipe it from cache after the operation to refresh it.
> Looking up 10 revisions instead of 1 is just fine, it's cheap.
> 
> - For operations outside of MacHg? Just consider the operations ADDING
> changesets for now.  You can use the set of heads as a cache
> key/identifier for your cache, save it on disk with the cache, and you
> win. Why? Let's consider that the user browses the changelog in MacHg,
> and exits. You save on exit the cache and set of current heads. Then,
> outside of MacHg, user pulls changes from somewhere. You fire up
> MacHg, and find that changes have happened since last run. Don't
> panic: finding out which changesets have been added is very, very
> easy: get the new set of heads, changelog.findmissing(common=oldheads,
> heads=newheads)
> 
> - And then, you're going to ask me "yes, but what about
> history-destructive changes outside of MacHg"? First, how to detect
> them? I am not sure, but history destructive changes outside of hg
> should always change a few cset hashes, and in turn, change the heads
> node values. Which means that, when MacHg starts, you fetch the
> oldheads set of nodes you had saved, and I think that if one of those
> old heads is not in the new changelog, it should be enough to detect
> that a destructive operation has happened.
> 
> What can you do about those changes? Not much. You can't cover
> everything. Sure, you could write complex wrappers of
> mq/rebase/transplant to force hg to report somewhere what has changed,
> but it's just not worth it.
> If MacHg has a decent UI for mq/rebase/transplant, then user should
> not be calling mq/rebase/transplant outside of MacHg. Be realistic, I
> think that you have to draw the line, and can't support just about any
> operation and be cheap everytime. Document that behaviour clearly, and
> maybe display an UI message when refreshing the DAG due to this kind
> of destructive changes "Dear user, you have modified history outside
> of MacHg, but my developers could not implement a crystal ball. Yet.
> I'm now refreshing the cache, please refrain from doing this to me in
> the future."

Thanks Nicolas. Yes, it *is* doable. But as you listed above yourself, there are a lot of edge cases and potential things which will just crop up. It would of course make my life a lot simpler to have O(1) or O(n) children working in log :) And hopefully it would be useful for others as well. Thus my plea...

But to get around this in the meantime my caching for MacHg will be somewhat different than what you have listed above. It will still be incremental and dynamically generated (ie it doesn't cache anything on disk, etc) but it does this by only looking some 100 changesets into the future. Any other children are listed when found and don't have a full link ie they have a link stub.

But thanks for the detailed comments! (I didn't know about about changelog.findmissing(common=oldheads, heads=newheads) but unfortunately it can't help me in any case since MacHg calls Mercurial through the public command line api... (I could of course write an extension to expose this, etc...))

Cheers,
  Jas

> 
> Regards,
> -Nicolas.
> 
>> 
>> At one stage I did have a command at once stage which gave the changeset hash key of a series of revisions going back wards. Ie it gave relative to the tip the changeset corresponding to the revision of the tip minus 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987 so this way with only 16 changesets requested I got a reasonable idea of where the graph would diverge from the old graph... This seemed to execute reasonably fast and from there I could figure out the approximate place the tree diverged and then start zeroing in on where it changed and then try and incrementally update from there, but this is all fairly non-trivial. In the end I abandoned this approach since it was problematic.
>> 
>> In addition now if I determine that the graph has changed how (other than a debugdag command) do I know what has changed?  Thus I would have to execute a debugdag command and then we have lost incrementally. Of course there are certain measures I could take and already this is more complicated than just making sure I look more than 100 children into the future and then back patch the extended links...
>> 
>> Anyways as I said my humble request would be to be able to use children in log in O(1) or O(n) which would be immensely useful to me, is a clean interface to Mercurial and is very "simple". If somehow someone finds it in their harts to put this on their schedules I would be extremely grateful! Thank you!
>> 
>> Cheers,
>>  Jas
>> _______________________________________________
>> Mercurial-devel mailing list
>> Mercurial-devel at selenic.com
>> http://selenic.com/mailman/listinfo/mercurial-devel
>> 
> 
> 
> 
> -- 
> Nicolas Dumazet — NicDumZ



More information about the Mercurial-devel mailing list