Tags & production questions

Fri May 4 20:51:05 CDT 2007

On 2007.05.04 13:43, Guido Ostkamp wrote:
> Hello Mark,
> 
> > IMO, you are vastly overrating the importance of all clones being
> > able to see everything at once.
> >
> > How big is your project?  How large are your disk drives?
> 
> to give you some numbers:
> 
> It is a large multisite project with dozens of people having access
> to the sources, even from multiple countries. We have several VOBs (a
> kind of ClearCase repository). A few days ago I did an experiment and
> transferred all ClearCase versions from just the main branch of the
> main source VOB by finding all versions of all files, sorting them by
> checkin time and then replaying all checkins into a fresh Mercurial
> repository. This conversion took a whole night.
> I ended up with a Mercurial repository of ~950 MB (including a
> ~500 MB working copy), which contained ~9200 files in ~1300
> directories and had ~38000 changesets. A small number of files are
> binaries.

Do you use straight ClearCase or do you use UCM?

In the "real" world (wherever that may be), you'd have several files
in a changeset, since you would want them all committed to the
repository as a unit.  (In ClearCase/UCM terminology, that would be an
"activity": an activity can contain multiple versions of multiple
files.)
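
To make the idea concrete, here is a minimal sketch (not how ClearCase
or Mercurial actually does it) of collapsing per-file checkins into
changeset-like units.  The `Checkin` record, the `group_into_changesets`
helper and the 5-minute window are all assumptions for illustration:
checkins by the same user with the same comment, each made within the
window of the previous one, are treated as one unit.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Checkin:
    """One file version as a per-file tool might record it (hypothetical shape)."""
    path: str
    user: str
    comment: str
    timestamp: int  # seconds since the epoch


def group_into_changesets(checkins: List[Checkin], window: int = 300) -> List[List[Checkin]]:
    """Group checkins sharing user and comment into changeset-like lists.

    A checkin joins the latest group for its (user, comment) key if it was
    made at most `window` seconds after that group's last checkin;
    otherwise it starts a new group.
    """
    groups: List[List[Checkin]] = []
    latest: Dict[Tuple[str, str], List[Checkin]] = {}
    for c in sorted(checkins, key=lambda c: c.timestamp):
        key = (c.user, c.comment)
        current = latest.get(key)
        if current is not None and c.timestamp - current[-1].timestamp <= window:
            current.append(c)
        else:
            current = [c]
            groups.append(current)
            latest[key] = current
    return groups


history = [
    Checkin("a.c", "guido", "fix bug 1", 100),
    Checkin("b.c", "guido", "fix bug 1", 130),
    Checkin("c.c", "mark", "cleanup", 140),
    Checkin("a.c", "guido", "fix bug 1", 1000),
]
changesets = group_into_changesets(history)
```

With that sample history, the two "fix bug 1" checkins made 30 seconds
apart end up in one unit, while the much later one starts a unit of its
own.  A real conversion would need a better heuristic than this.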

I haven't tried converting our ClearCase main repository to Mercurial.   
In our case, it's around 14,000 java files with about 4,000 more  
data/jar/c++ files in support.  About 1.2G of working directory.  We've  
got ~6 releases out there and we normally have to propagate our fixes  
back 2 to 3 releases.

> 
> As I said, this was only the main branch. We also have ~15 more
> branches with main development lines, most of which are still
> maintained, where each branch contains numerous maintenance releases
> made over the years which are 'tagged' with labels in ClearCase.
> 
> Development mainly takes place on Sun servers running Solaris OS. In
> a professional environment, server disk space, which also has to be
> backed up at night, is very expensive. The systems are also used for
> a long time, so their disks are not the sizes you are used to on a
> modern PC.
> 
> Typically, each developer has a quota of just a few gigabytes, let's
> say 5 GB, which he cannot exceed.

Wow, they are pretty generous.  Our limit is 500MB on the Unix servers.   
Designer desktops are 80GB (assuming that you don't just go out and buy  
another disk drive yourself).

> 
> In case you don't know ClearCase yourself, you must understand that it
> has its own filesystem, which lets you define 'views' of the
> repositories by applying rulesets defined in a so-called
> 'configspec'. You just get the stuff mapped in at certain
> directories, but there is no physical copy in your directory that
> uses up any disk space. Only compilation results like object files,
> libraries and binaries really use up storage space (at least in our
> setup, where we do not use wink-in objects).

I'm quite familiar with ClearCase, as it so happens.  You *can* have
static views, you know; in fact, they are a lot faster for compiling in
our development environment under UCM.  (We don't use clearmake for our
Java compilation.)  But we're able to compile on our desktops, which I
guess you cannot.

> When we get a bug report for some version out in the field, we have to
> fix it for that version and port the fix to at least all newer
> branches, including the mainline. This means we have to check what's
> in those versions by analyzing logs, possibly comparing versions from
> different branches, etc. Typically the fix is developed in the
> mainline first (if it does not already exist there) and then
> ported back to the maintenance branches.

That's interesting.  We haven't used raw ClearCase for a couple of years
now, so I've kind of blocked that madness out of my head.

What we do now is (essentially) develop the fix as an activity in the
release where the problem was found, and then "deliver" that activity
to the other "streams".  (That's the terminology, anyways.)

The delivery mechanism in ClearCase/UCM is very very close to  
Mercurial's bundle/unbundle mechanism.  (We have an internal tool for  
propagating changes from one release to the next which is closer to  
Mercurial's export/import mechanism, because you often don't want *all*  
the changes made to a file in the past release to be merged back in  
time.)

> 
> Thus it is absolutely crucial to have everything available in
> one repository. It would be an absolute nightmare to have each of
> the maintenance releases of each branch in its own repository.

Well, no, it wouldn't be.  You can merge across repositories with ease  
with Mercurial, which you cannot do with ClearCase.  The main point,  
though, is that you've *got* to start thinking in terms of changesets  
rather than in terms of individual file versions when you work with  
Mercurial.  One of your changesets would *be* the bug fix and *that* is   
what you would share across the repositories.
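
As a rough sketch of what sharing one changeset between repositories
could look like, the snippet below builds the `hg export` / `hg import`
command lines for carrying a single fix from one clone to another.  The
repository paths and the revision id are made up, and
`transplant_commands` and `transplant` are hypothetical helpers, not
part of Mercurial.  Nothing touches a repository unless you call
`transplant` yourself.

```python
import subprocess
from typing import List, Tuple


def transplant_commands(src_repo: str, dst_repo: str, rev: str) -> Tuple[List[str], List[str]]:
    """Build (export, import) argument lists for moving one changeset.

    `hg export REV` writes the changeset as a patch to stdout;
    `hg import -` reads a patch from stdin and commits it.
    """
    export_cmd = ["hg", "-R", src_repo, "export", rev]
    import_cmd = ["hg", "-R", dst_repo, "import", "-"]
    return export_cmd, import_cmd


def transplant(src_repo: str, dst_repo: str, rev: str) -> None:
    """Run the two commands, piping the exported patch into the import."""
    export_cmd, import_cmd = transplant_commands(src_repo, dst_repo, rev)
    patch = subprocess.run(export_cmd, check=True, capture_output=True).stdout
    subprocess.run(import_cmd, input=patch, check=True)


# Hypothetical paths and revision id, for illustration only.
export_cmd, import_cmd = transplant_commands("rel-1.0", "main", "abc123")
```

If the fix should arrive together with all of its ancestors, a plain
pull (or bundle/unbundle) followed by a merge is the other route;
export/import is the one to reach for when only that single change is
wanted.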

Right now, you're probably running findmerge or something similar.

> 
> Having explained that, I should make clear that I'm just one out of
> many developers and certainly not in a position to decide which
> source code management system my employer uses, so they will stay
> with ClearCase.

I think that ClearCase/UCM would be an improvement over your current
setup.

Here is a link to a discussion about activity-based change control.
It's designed to sell you something, of course, but the concepts within
are pretty accurate.

http://www-128.ibm.com/developerworks/rational/library/04/r-3284/

And no, I don't work for IBM.

> 
> However, for my work I would like to maintain a shadow Mercurial
> repository which can help me with interim development before I check
> things back into ClearCase. This is why I am trying to find out
> whether Mercurial could do the job.

Well, there's going to be a methodology mismatch.  That might get in the  
way of what you're trying to do.