Forest? Branches? And other beginner questions...

Stuart W. Marks smarks at smarks.org
Mon Mar 26 22:49:25 CDT 2007


I certainly understand the rationale for having separate repositories for 
separate software components, both in Teamware and in Mercurial.

The thing I don't understand about forests is whether there are any semantics 
to them. For example, is "hg fpull" any different from

     for d in $SUBDIRS ; do
         hg pull -R "$d"
     done

? If not, is it worth having an extension for this? I suspect there is 
something more going on, though, since the ForestExtension page mentions 
snapshot files but doesn't explain what they do.
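
For concreteness, here is a slightly fuller sketch of the plain loop I have in mind. It assumes only that a repository can be recognized by its .hg directory. The name for_each_repo is made up for illustration, and I am not claiming this is what the forest extension actually does.

```sh
# Hypothetical helper (not part of Mercurial or the forest extension):
# run one hg subcommand in every repository found under a top directory.
# Assumption: a repository is any directory that contains a .hg directory.
for_each_repo() {
    top=$1
    shift
    # -prune keeps find from descending into the .hg directories themselves.
    find "$top" -type d -name .hg -prune | sort | while read -r hgdir; do
        repo=$(dirname "$hgdir")
        echo "== $repo"
        hg "$@" -R "$repo"
    done
}
```

With this, "for_each_repo . pull" behaves like the loop above, and "for_each_repo . outgoing" would list, per repository, the changesets that have not yet been pushed.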

I'm also totally unclear on the semantics of nested repositories, how they 
work, and even whether they work. So, it appears that I have my own set of 
beginner questions. :-)

s'marks

Kelly O'Hair wrote:
> I'm working on the OpenJDK mercurial planning, and we will be using
> nested repositories to contain what we are calling 'closed' sources,
> or sources we cannot distribute as 'open source'. Forests will be
> very handy for this, but mostly for the few teams that need to manage
> these sources; most teams won't, and many of the closed files are only
> temporary. Once a file is in a repository, it is never really
> 'gone', so sources that need to be hidden should never enter an
> open repository.
> 
> In addition, we have always used multiple Teamware workspaces in the
> JDK project, and we will probably be continuing that model with Mercurial
> repositories. The logic for why and when you want to isolate different
> sources into multiple/separate repositories is covered pretty well in
> your comments.
> Teamware got sluggish on us, for various reasons, when the file count
> exceeded 20,000 files, so that did influence us in the past. Mercurial will
> probably remove that technical consideration, but very large repositories
> (40,000+ files) still may be a development issue for many people
> working on older machines, and sometimes it just makes sense to find
> some logical fault lines in your sources.
> Some teams that rarely need to look at the other 18,000 files want
> their 2,000 files managed separately; sometimes this is good, sometimes
> not. Anyone who needs to make changes to many files spanning
> multiple repositories will not like having to create N changesets
> for one global change.
> 
> There are issues with forests, nested repositories, and separate
> repositories. The repositories are really completely independent:
> different roots, different changesets, different configurations, the
> need to sync up interface changes between repositories, etc. So I am
> concerned about long-term management of a complicated 'forest', but I'm
> extremely pleased to have the forest extension; it does give me something
> to work with.
> 
> -kto
> 
> Stuart W. Marks wrote:
>> Marcin Kasperski wrote:
>>> (This is my first post to this list, so, before actual question, I 
>>> would like to thank the creators of mercurial. Interesting, well 
>>> implemented tool. Thank you.)
>>>
>>> (background)
>>> I am new to mercurial (= I've read a significant part of the docs and 
>>> made some experiments), and to distributed source management (= have 
>>> been using RCS, DEC CMS, CVS and Subversion for years). Recently I 
>>> decided to adopt a distributed version control tool, read a couple 
>>> of pages, narrowed the list to git and mercurial, and finally picked 
>>> the latter as I need Windows support.
>>
>> I'm fairly new to mercurial as well, but I've confronted some of the 
>> same issues. I'll try to answer your questions.
>>
>>> (question 1)
>>>  From the docs I've read I got the feeling that I should use 'many 
>>> repositories' (= create a separate mercurial repository for every 
>>> program, library, documentation, whatever I track) - so the tagged, 
>>> pulled, pushed item is not made of a few independent entities. OK, let 
>>> it be, but if I then imagine having those 10 or maybe 20 separate 
>>> projects, tasks like verifying whether all are pushed to the main 
>>> location, or getting all the things onto a new machine, start being 
>>> tedious. I noticed the Forest plugin, which seems to be intended to 
>>> help in such cases. Unfortunately, it is almost undocumented (I can 
>>> mostly guess what the commands do, but some description of the 
>>> suggested process, layout, etc. would be really useful).
>>
>> I don't think the advice is necessarily to have separate repositories 
>> for each program, library, document, etc. If there is a set of related 
>> stuff, say some libraries and a set of programs that use those 
>> libraries, along with documentation, then it would make sense for them 
>> all to reside in the same repository. A key question is whether the 
>> artifacts are required to evolve at the same time. For example, if a 
>> change is made to a library, and a change is made to a program that 
>> depends on the new version of the library, it might make sense to have 
>> them all in the same repository. But you also might put a bunch of 
>> stuff that is not so closely related in the same repository, merely 
>> because they're part of the same "family" of stuff, or for your 
>> convenience.
>>
>> The advice about separate repositories is more about the workflow of 
>> developing and propagating changes. For example, many SCMs use a 
>> copy-modify-merge approach. In hg, the copy typically involves cloning 
>> some main repository into a local repository, and the merge would 
>> involve pushing changes from the local repository into the main 
>> repository. A more complicated setup might involve a team pushing 
>> changes from individual repositories into a workgroup repository, and 
>> then propagating these changes into a main repository at particular 
>> intervals. This is in contrast to centralized systems such as svn, 
>> where there is exactly ONE repository.
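
As a rough sketch of that copy-modify-merge cycle: the function names and arguments below are placeholders I made up, and the sketch assumes the clone's default push target is the repository it was cloned from.

```sh
# Placeholder helpers for illustration only; not Mercurial commands.

# Copy: clone the main repository ($1) into a private working repository ($2).
start_work() {
    hg clone "$1" "$2"
}

# After editing files in the working repository ($1): commit the edits
# locally with message $2, then push them back to the repository the
# clone came from. The push is skipped if the commit fails.
publish_work() {
    hg commit -R "$1" -m "$2" && hg push -R "$1"
}
```

A workgroup setup would look the same, except that the individual clones come from the workgroup repository and someone later pushes from there to the main repository.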
>>
>> I don't know anything about the Forest extension. Looking at the 
>> extension page
>>
>> http://www.selenic.com/mercurial/wiki/index.cgi/ForestExtension
>>
>> it sort of makes sense, but I'm having trouble imagining the 
>> circumstances under which it would be useful.
>>
>>> (question2)
>>> In git it is fairly clear that one developer on one machine uses 
>>> one repo with many branches. In mercurial, it seemed each separate 
>>> line should be a separate clone. But recently a branches command 
>>> showed up, almost undocumented. What is the current 'way of doing 
>>> things'?
>>
>> I think the preference for separate development lines is still 
>> separate repositories. However, there are a few cases where 
>> "in-repository" branches are called for. These branches probably don't 
>> work very well for individual developers. I think they're useful for 
>> development of parallel releases (e.g. a branch for patches to a 1.0 
>> release, with 2.0 development on the main line) since having these in 
>> a single repository shows their relationship more clearly (to me at 
>> least). There are some things that still need to change in hg before 
>> this will work really well, though.
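
To make the parallel-release case concrete, here is a minimal sketch. It assumes a tag (say 1.0) already marks the release and that a name like release-1.0 is acceptable for the branch. Both names, and the function itself, are hypothetical.

```sh
# Hypothetical helper: start an in-repository branch for patches to a release.
# $1 = repository, $2 = existing tag or revision of the release,
# $3 = name for the new branch.
start_release_branch() {
    hg update -R "$1" "$2" &&     # check out the released revision
    hg branch -R "$1" "$3" &&     # name the branch for the next commit
    hg commit -R "$1" -m "Start branch $3"
}
```

For example, "start_release_branch . 1.0 release-1.0" would open the patch branch, and "hg update default" would return the working directory to the main line for 2.0 development.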
>>
>> s'marks
>> _______________________________________________
>> Mercurial mailing list
>> Mercurial at selenic.com
>> http://selenic.com/mailman/listinfo/mercurial