Bug 3616 - failed import leads to repo corruption
Summary: failed import leads to repo corruption
Status: RESOLVED FIXED
Alias: None
Product: Mercurial
Classification: Unclassified
Component: Mercurial (show other bugs)
Version: earlier
Hardware: All All
: critical bug
Assignee: Kevin Bullock
URL:
Keywords: regression
: 3787 (view as bug list)
Depends on:
Blocks:
 
Reported: 2012-09-07 16:06 UTC by Bryan O'Sullivan
Modified: 2017-11-01 18:05 UTC (History)
9 users (show)

See Also:
Python Version: ---


Attachments
Trying to apply this patch triggers the bug (11.92 KB, text/plain)
2012-09-07 16:07 UTC, Bryan O'Sullivan
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Bryan O'Sullivan 2012-09-07 16:06 UTC
Just ran into this very nasty one. I tried to apply a patch from marmoute, the patch failed to apply, and my repo was corrupted in the process.

Bisecting led to this commit:

changeset:   15198:62dc0e7ab092
user:        Greg Ward <greg@gerg.ca>
date:        Sun Oct 02 14:34:28 2011 -0400
summary:     import: wrap a transaction around the whole command
Comment 1 Bryan O'Sullivan 2012-09-07 16:07 UTC
Created attachment 1692 [details]
Trying to apply this patch triggers the bug
Comment 2 Bryan O'Sullivan 2012-09-07 16:09 UTC
Here's the repro script I used for bisecting.

I have a regular hg repo in $HOME/hg/hg with tip of 1526ac765e29.

I have a crew repo in $HOME/hg/hg-crew with tip of bacde764fba0.

#!/bin/sh

export HGRCPATH=/dev/null

cd $HOME/hg/hg
make local || exit 125
cd ..
rm -rf hg-tmp
hg clone hg hg-tmp
cd hg-tmp
hg pull ../hg-crew
$HOME/hg/hg/hg import --exact $HOME/Downloads/1.patch || hg verify
Comment 3 Matt Mackall 2012-09-07 16:34 UTC
Works for me:

$ hg import --exact http://bug3616.bz.selenic.com/attachment.cgi?id=1692
applying http://bug3616.bz.selenic.com/attachment.cgi?id=1692
61 files updated, 0 files merged, 1 files removed, 0 files unresolved
no rollback information available
transaction abort!
rollback completed
abort: patch is damaged or loses information

$ hg verify
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
2163 files, 17424 changesets, 34164 total revisions
Comment 4 Bryan O'Sullivan 2012-09-07 16:47 UTC
Did you follow exactly the steps in that repro script?

For me, the corruption only arises if I clone, then pull, then import.

I've seen it on two different machines now, one Linux and one OS X, so I don't think it's a figment of my imagination.
Comment 5 Bryan O'Sullivan 2012-09-07 18:14 UTC
I believe I see what's going on.

With --exact, hg import passes None as a matcher to repo.commit:

http://selenic.com/repo/hg/file/3ee5d3c372fa/mercurial/commands.py#l3761

This is a patch where none of the hunks apply, so repo.commit returns a node ID that does not match the expected node ID when checkexact looks.

As a result, checkexact calls repo.rollback. Instead of rolling back the bad commit, it rolls back whatever the *previous* transaction was, as stored in .hg/store/undo.

I can even see this in the crash output, though I didn't notice it earlier:

  $ hg import --exact /tmp/trigger.patch --traceback
  applying /tmp/trigger.patch
  105 files updated, 0 files merged, 0 files removed, 0 files unresolved
  repository tip rolled back to revision 17443 (undo pull)

See how the description of what's being rolled back doesn't match?

Meanwhile, the data from our bogus empty transaction is present in the affected revlog files, and I presume that this has something to do with the corruption, though I don't understand exactly what yet.
Comment 6 Bryan O'Sullivan 2012-09-07 19:43 UTC
Simple repro:

#!/bin/sh

rm -rf 3616
mkdir 3616
cd 3616
hg init a
cd a

echo a>a
hg ci -Ama

echo a>>a
hg ci -mb

echo a>>a
hg ci -mc

echo a>a
echo b>>a
echo a>>a
hg ci -md

hg export 2 | head -7 > ../a.patch
hg export tip | tail +8 >> ../a.patch


cd ..
hg clone -r0 a b
cd b
hg pull -r1 ../a

hg import --exact ../a.patch
hg verify
Comment 7 Pierre-Yves David 2012-09-07 20:41 UTC
A trivial fix would be to wrap the whole import of a patch process in a single transaction and cancel this very transaction in case of trouble.
However this would create an empty transaction in case of non-op import.
Comment 8 Bryan O'Sullivan 2013-01-28 14:07 UTC
*** Bug 3787 has been marked as a duplicate of this bug. ***
Comment 9 Augie Fackler 2013-01-30 10:30 UTC
I just this using 'hg import --exact' trying to apply a probably-mangled hgsubversion patch:

https://groups.google.com/forum/message/raw?msg=hgsubversion/HokSDdXeCdE/R-gdaZJQCPgJ

As in bos's case, I can't reproduce this on a completely fresh clone - there's something about my local hgsubversion working copy that tickles this.
Comment 10 Kevin Bullock 2013-01-30 11:14 UTC
I have a repro on fresh clone in bug 3787. It involves a patch that probably has 'extra' data not exported, and thus makes import --exact fail with 'abort: patch is damaged or loses information'.
Comment 11 Matt Mackall 2013-01-30 11:18 UTC
The missing piece to raise this to "critical" is example verify output so we can see what is meant by "corrupted".
Comment 12 Augie Fackler 2013-01-30 11:21 UTC
I was about to do just that, but it seems I've done something to my hgsubversion clone that fixed the problem. ugh.
Comment 13 Kevin Bullock 2013-01-30 11:27 UTC
Okay, here it is with verify output. Note that this is a little different than what I posted in bug 3787 -- the first two patches import fine, so this narrows it to just the problem case (and thus shortens the verify output to something copy-pastable).

❧  hg clone -q -U http://hg.intevation.org/mercurial/crew+main hgasplode❧  cd hgasplode
❧  hg import --exact http://patchwork.serpentine.com/patch/745/mbox/ http://patchwork.serpentine.com/patch/746/mbox/ 
applying http://patchwork.serpentine.com/patch/745/mbox/
1008 files updated, 0 files merged, 0 files removed, 0 files unresolved
applying http://patchwork.serpentine.com/patch/746/mbox/
❧  hg import --exact http://patchwork.serpentine.com/patch/747/mbox/ applying http://patchwork.serpentine.com/patch/747/mbox/
repository tip rolled back to revision 18502 (undo import)
working directory now based on revision -1
transaction abort!
rollback completed
abort: patch is damaged or loses information
❧  hg verify
checking changesets
 changelog@?: data length off by 3693777 bytes
 changelog@?: rev 18503 points to nonexistent changeset -1
 (expected 18503)
 changelog@?: rev 18504 points to nonexistent changeset -1
 (expected 18504)
 changelog@?: duplicate revision 18504 (18503)
checking manifests
 manifest@?: data length off by 19114301 bytes
 manifest@?: rev 18490 points to nonexistent changeset -1                       
 manifest@?: 000000000000 not in changesets                                     
 manifest@?: rev 18491 points to nonexistent changeset -1                       
 manifest@?: duplicate revision 18491 (18490)                                   
 manifest@?: 000000000000 not in changesets                                     
crosschecking files in changesets and manifests                                 
checking files
2183 files, 18505 changesets, 36101 total revisions                             
2 warnings encountered!
10 integrity errors encountered!
Comment 14 Bryan O'Sullivan 2013-01-30 12:36 UTC
(In reply to comment #11)

There's been a simple repro hanging out in comment #5 for months, complete with a description of exactly what is going on.
Comment 15 HG Bot 2013-02-16 19:45 UTC
Fixed by http://selenic.com/repo/hg/rev/8eb3408bf005
Kevin Bullock <kbullock@ringworld.org>
import: don't rollback on failed import --exact (issue3616)

The checkexact() helper function was calling repo.rollback() from inside
an open transaction. In addition to being insane, this is unnecessary
because import will release the transaction on an exception.

It turns out that this has been broken since the feature was first
introduced, first released in v1.0:

changeset:   4263:47ba52121433
user:        Brendan Cully <brendan@kublai.com>
date:        Thu Mar 22 10:44:59 2007 -0700
files:       mercurial/commands.py mercurial/patch.py
description:
Add import --exact.
When this option is set, import will apply the patch (which must
be generated by export) to the parents specified in the patch,
and check that the node produced by the patch matches the node
ID in the patch.

(please test the fix)