Strategies for push/merge problem?

Tue Jul 15 14:59:54 CDT 2008

2008/7/15 Adrian Buehlmann <adrian at cadifra.com>:
> Oh well, *I* don't care what you do in your private repos.
>
> But I wonder if a project e.g. like Mercurial would accept a changeset
> like this:
>
> "This is what I achieved until 5pm. Currently, the testsuite is
> broken, but if I manually set variable x to an illegal value z, I
> can almost do feature X. I will continue tomorrow morning, but I had
> to commit this state in order to be able to take it home on my
> USB stick RIGHT NOW (need to run!)".
>
> Who wants to see such a changeset in the eternal history of any non-toy,
> non-hobby, non-single-developer, non-throw-away project?

I agree entirely. And it seems to me that this is an issue still to be
addressed by all DVCS. Tools like mq address the problem, but not
really in a way that suits everyone (IMHO). I used mq while developing
my case folding patches, and while it did the job, I found that it
felt very little different from old-style (SVN) development where I
have a local working directory which I keep up to date with the
central one, and my work sits in the working directory without the
benefits of version control. OK, mq let me manage my changes as 3
patches rather than 1, but that's a minor thing. I tried versioning
the patch queue, but it didn't really help me a lot.

Maybe being able to collapse the local history into one "final"
changeset to be pushed would work. I'm not sure.

But it's not a solved problem - mq is one approach, but it's not perfect.

> And yes, *I* for one don't, also not on a "personal" branch
> that gets pulled into the history of a project.

So, if that is how I actually work, then what would you want to see in
the project repository? A single changeset representing my final push
as one chunk? (Bear in mind that I am not talking here about non-core
developers like myself, who have to work by sending patches. I'm
thinking about people who have the right to push to the central
repo.)

> You really have to separate your personal need to go back and forth
> in micro steps and recording relevant project history which is meant
> to be read, pulled, tested, analyzed, reviewed and understood
> *and* bisected by others.

I'm not sure why you are making such an issue of bisect here. I
haven't used it yet, but my understanding was not that you ran the
project test suite at each bisect step, but that you ran a custom test
for a specific issue you were trying to locate. So, for example, if
you were looking for a changeset that introduced a typo into the help,
you'd use something like

    hg help | grep qremane

as your "test". This works whether or not the full test suite passes.

Did I misunderstand?
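
To make the idea concrete, here is a minimal sketch of such a custom
test wrapped as a shell function. The function name check_typo is my
own invention, not part of Mercurial. The one thing to watch is that
grep succeeds when it *finds* the typo, so the result has to be
inverted to mean "this revision is good". I am also assuming that the
hg being run is the one built from the checked-out revision, as
otherwise the help text would not change between bisect steps.

```shell
# check_typo: hypothetical helper, not part of Mercurial.
# Reads help text on stdin and reports, via its exit status, whether
# the revision under test is good or bad for this particular issue:
#   0 = good (the misspelling "qremane" is absent)
#   1 = bad  (the misspelling is present)
check_typo() {
    if grep -q 'qremane'; then
        return 1    # typo found: this revision is bad
    fi
    return 0        # typo not found: this revision is good
}
```

At each bisect step you would pipe "hg help" into check_typo and then
mark the revision with "hg bisect --good" or "hg bisect --bad"
according to the result. The full test suite never enters into it.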

> The former might be done by using Mercurial to track .hg/patches
> produced by mq.
>
> The latter is meant for *interesting* steps, not for arbitrary
> "I'm saving my current state right now, because I have to go home now".
> For example like: "Refactoring: moving function X to Y",
> "Fixing bug X", "Adding new function Z", etc., which can all
> be done *without* knowingly breaking a testsuite.

I'm not 100% sure what you mean in this context by "former" and
"latter", but as I said, I tried versioning .hg/patches and found it
somewhat clumsy. Maybe it'll get better with practice, but I'm not
sure.

> Assume you have a repo, which contains 80% of such "I-needed-to-go
> home-csets that unfortunately currently break the testsuite", so let's say you
> typically have 4 such vanilla csets and one good "pushed" one.
>
> Now let's assume one morning a developer discovers: "ouch, the
> testsuite doesn't pass anymore". What do you do now?
>
> Assuming you would like to know which change introduced the
> problem. So you try a few last csets, but with no luck.
> All don't pass the testsuite (*hit happens).
>
> Now what? Ok, you go back some 100 csets, to a cset which you sure
> know was good and start bisecting. Now bisect will present you
> some cset X in between and you have to say whether that one is ok
> or not.
>
> So, was that cset X a vanilla one, which you didn't care about whether
> the tests pass at all or was it a "deemed good", "pushed" one? If it's the
> former, you might say: I simply assume, this one was ok and say so to bisect.
> Do you even know that?
>
> Needless horror, I'd say.

I see your point, and as I say I believe this to be a not-yet-solved
problem (unless you view mq as the final solution to such issues).

The issue is, given distributed development and push/pull at the
changeset level, how can you enforce a policy on one repository
(whether it is "testsuite passes at all revisions", or "code compiles
at all revisions", or "QA have signed off all revisions") without
insisting on that policy for all clones, whether they are centrally
managed or entirely personal?

I'm saying (and I think others are too) that maybe "all revisions"
isn't a sensible constraint. Maybe what is needed is a concept of
"important" revisions. But that's not easy to define.

Maybe the group extension would help. Or some sort of "label"
extension, where a label is like a tag, but can apply to multiple
changesets, and bisect and log and the like can be restricted to
specific labels. I don't know. That's the point here - there are still
things that aren't set in stone.

> But please don't assume I would want to enforce anything on projects
> I don't care about.

I'm not. But the point is that enforcement is at the project level,
not at the individual repo level (IF changesets are passed around via
push/pull rather than as patches). So if the project has a policy, it
applies to any repo that isn't separated from the main project repo by
a diff/patch boundary.

Paul.

PS Both Hans Meine and Benjamin Smedberg have just posted the example
of refactoring. This is a much better use case than my "commit before
I transfer to another PC" example - even though my example is how I
use Mercurial in real life.