Bug 3926 - Buildbot failure with dirty subrepo detection
Summary: Buildbot failure with dirty subrepo detection
Status: RESOLVED FIXED
Alias: None
Product: Mercurial
Classification: Unclassified
Component: Mercurial (show other bugs)
Version: 2.6-rc
Hardware: PC Windows
: urgent bug
Assignee: Bugzilla
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-06 16:25 UTC by Matt Mackall
Modified: 2013-07-23 18:34 UTC (History)
4 users (show)

See Also:
Python Version: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Comment 1 FUJIWARA Katsunori 2013-05-06 17:16 UTC
I can confirm outgoing revision in subrepo "s/ss" by adding "hg
outgoing" as below, even though "hg push" in parent repo root
still push nothing:

  $ hg -R s/ss outgoing ../tcc/s/ss # <<<< added
  $ hg push ../tcc
  pushing to ../tcc
  .....

So, "subrepo.storeclean()" seems to work incorrectly on Windows.


BTW, I also found that "lock.release()" in "storeclean()"
and "_cachestorehash()" of hgsubrepo class is not executed
in "finally" caluse.

"lock.release()" should be executed in "finally" clause to ensure
releasing, even if processing is aborted by unexpected exception,
shouldn't it ? or is this special case ?

I don't know whether this is related to this issue or not.
This is just a memo.
Comment 2 Matt Mackall 2013-05-06 17:51 UTC
Marking confirmed. Also going to go ahead and promote this to urgent as it looks like this will create problems for people pulling from a partial push.

I suspect the finally: issue is unrelated but please go ahead and send a patch for it.
Comment 3 Angel Ezquerra 2013-05-07 01:34 UTC
So this is Windows only then? I just ran the whole test suite on Linux on the tip of default and all tests passed.

I will look into this as soon as I can.
Comment 4 Matt Mackall 2013-05-07 13:21 UTC
Yep, tests pass just fine on Linux and Mac.

You can drill down into buildbot results from here:

http://hgbuildbot.kublai.com/waterfall

The specific problem in the current results is:

--- c:\Users\buildbot\w2k8\Windows_2008_R2_hg_tests\build\tests\test-subrepo.t 
+++ c:\Users\buildbot\w2k8\Windows_2008_R2_hg_tests\build\tests\test-subrepo.t.err 
@@ -440,12 +440,7 @@
   [1]
   $ hg push ../tcc
   pushing to ../tcc
-  pushing subrepo s/ss to ../tcc/s/ss
-  searching for changes
-  adding changesets
-  adding manifests
-  adding file changes
-  added 1 changesets with 1 changes to 1 files
+  no changes made to subrepo s\ss since last push to ../tcc/s/ss
   no changes made to subrepo s since last push to ../tcc/s
   no changes made to subrepo t since last push to ../tcc/t
   searching for changes
Comment 5 Angel Ezquerra 2013-05-08 16:32 UTC
This is weird. I manually converted test-subrepo.t into a windows batch file. I then ran the batch file up until the offending line. When I ran hg outgoing and hg push I got:

C:\ ... \Repositories\tc>hg -R s\ss outgoing ..\tcc\s\ss
comparing with ..\tcc\s\ss
searching for changes
changeset:   1:ddb53fced63f
tag:         tip
user:        Angel Ezquerra <angel.ezquerra@gmail.com>
date:        Wed May 08 21:56:15 2013 +0200
summary:     test dirty store detection


C:\ ... \Repositories\tc>hg push ..\tcc
pushing to ..\tcc
pushing subrepo s\ss to ../tcc/s/ss
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
no changes made to subrepo s since last push to ../tcc/s
no changes made to subrepo t since last push to ../tcc/t
searching for changes
no changes found


Which is what was expected I think. So I cannot reproduced this on my end when running this on Windows _manually_. I cannot run the test suite (I have not setup my new machine to run the test suite on windows).

Also I debugged the storeclean function step by step (using PyCharm's debugger) and everything seemed to work fine.
Comment 6 Matt Mackall 2013-05-08 19:02 UTC
Marking urgent as promised. We're probably going to cut an early 2.6.1 for bug 3921, it would be nice to get this one wrapped up soon.

Angel, it's possible that this is somehow timing or speed-dependent and might not fail every time.
Comment 7 FUJIWARA Katsunori 2013-05-09 07:04 UTC
I can find root cause of this problem.

"subrepo._calcfilehash()" opens specified file
by "open(filename)", and this causes incomplete data read in when
file contains '\x00' on Windows environment, because file is
opened by text mode: text mode cut of after '\x00' on Windows.

In this issue case, "\x00" is recorded into "00changelog.i"
BEFORE last commit in subrepo "s/ss"(= hash value is not changed
by commit), so change of changelog is not detected at next push.

I can confirm that "open(filename)" with "rb" fixes this issue,
and I'll post patch soon.

I think that username and/or commit date should prevent
"00changelog.i" from containing "\x00" at reproduction by Angel.
Comment 8 HG Bot 2013-05-09 12:00 UTC
Fixed by http://selenic.com/repo/hg/rev/ed1a212193dc
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
subrepo: open files in 'rb' mode to read exact data in (issue3926)

Before this patch, "subrepo._calcfilehash()" opens files by "open()"
without any mode specification. This implies "text mode" on Windows.

When target file contains '\x00' byte, "read()" in "text mode" reads
file contents in without data after '\x00'.

This causes invalid SHA1 hash calculation in "subrepo._calcfilehash()".

This patch opens files in 'rb' mode to read exact data in.

(please test the fix)