Bug 3263 - hg convert and hg-git unable to convert a big git repository
Summary: hg convert and hg-git unable to convert a big git repository
Status: RESOLVED FIXED
Alias: None
Product: Mercurial
Classification: Unclassified
Component: convert
Version: unspecified
Hardware: All All
Importance: normal bug
Assignee: Bugzilla
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-02-11 11:44 UTC by prathmesh
Modified: 2018-11-03 00:00 UTC
4 users

See Also:
Python Version: ---


Description prathmesh 2012-02-11 11:44 UTC
Friends, I have a project hosted in a git repository. The project had around
41 commits and a size of around 375MB. I wanted to convert this repository to
an hg repository.

I tried the "hg convert" utility to convert the git repo to hg. The process
started converting the git repo but hung midway. I let the
process run for more than half an hour, but it neither reported an error nor
exited. After some time, I had to forcefully kill the process.

I am currently using 
hg 2.0 version and 
git version 1.7.5.4

After trying and failing with the above process 2-3 times, I tried the "hg
clone git repo" method. This process also hung indefinitely.
Comment 1 Patrick Mézard 2012-02-11 12:15 UTC
Can you rerun the conversion with --debug?

Converting git repos usually hangs when convert tries to grab remote branches
to turn them into bookmarks and waits on an authentication prompt, which may
or may not be displayed.

You can also try to comment out the call to this function:

 
http://hg.intevation.org/mercurial/crew/file/11aad09a6370/hgext/convert/git.py#l180
Comment 2 prathmesh 2012-02-11 12:29 UTC
@pmezard thanks for the quick reply. The debug option revealed the main
bug: the conversion stops because it is trying to get a file,
"qqq_trace.out", from the git repo. I remember creating this file and having
issues uploading it to the git repo. But if I forcefully deleted the file
from the repo, or the file got corrupted, does this mean my repo
cannot be converted?

Can I do something like change that particular version in the git repo and
try the same thing again? Or is this a different issue altogether?
Comment 3 Patrick Mézard 2012-02-11 12:49 UTC
I suppose you can see which git command run by convert is taking forever to
complete? It would be interesting to know what is really wrong here: whether
the command is normally slow, the repository is corrupted, or something else.
Unfortunately I do not know git well enough to help you there. Can you tell us
which command is taking a long time to complete?
Comment 4 prathmesh 2012-02-11 13:00 UTC
@patrick I also thought at first that the command was just slow, but I was
converting a local repo, which should rule out a slow network.

Next I checked whether the conversion was still making progress by tracking
the size of the converted repo. The size of the repo got stuck at 50MB and did
not increase after that. I used "du -ksh" to find the repo size. I also used
lsof to check the files that hg was currently writing.

Anyway, I think the file got into the git repository accidentally. I
will remove it from all the versions and check again. Will update here soon ...
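
The size check described above can be automated with a small script. This is only a sketch: the helper names tree_size and is_stalled are made up for illustration, and it sums apparent file sizes rather than the on-disk block usage that du reports, so the absolute numbers may differ from du.

```python
import os
import time


def tree_size(root):
    """Return the total size in bytes of all files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                # The file disappeared between listing and stat; skip it.
                pass
    return total


def is_stalled(root, interval=30.0, samples=4):
    """Return True if the size of root stays constant over all samples."""
    last = tree_size(root)
    for _ in range(samples):
        time.sleep(interval)
        current = tree_size(root)
        if current != last:
            return False  # the destination is still growing
        last = current
    return True
```

A destination repository that stays at the same size for several intervals suggests the conversion is blocked rather than merely slow, though it does not say which command is blocking.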
Comment 5 dbrakhane 2012-02-14 03:27 UTC
You can also try to use a filemap with convert that excludes the bad file.
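
A minimal sketch of that approach, assuming the bad file is the "qqq_trace.out" mentioned in comment 2 and sits at the top level of the repo. The helper names and the repository paths "git-repo" and "hg-repo" are placeholders; the filemap "exclude" directive and the --filemap and --debug options are the ones documented for hg convert.

```python
from pathlib import Path


def write_filemap(path, excluded):
    """Write a convert filemap with one 'exclude' directive per path."""
    lines = ["exclude %s" % name for name in excluded]
    Path(path).write_text("\n".join(lines) + "\n")
    return path


def convert_command(source, dest, filemap):
    """Build the hg convert command line that applies the filemap."""
    return ["hg", "convert", "--debug", "--filemap", str(filemap), source, dest]


# Example (placeholder paths):
#   write_filemap("filemap.txt", ["qqq_trace.out"])
#   subprocess.check_call(convert_command("git-repo", "hg-repo", "filemap.txt"))
```

File names containing spaces would need quoting in the filemap; this sketch does not handle that case.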
Comment 6 Bugzilla 2012-05-12 09:28 UTC

--- Bug imported by bugzilla@serpentine.com 2012-05-12 09:28 EDT  ---

This bug was previously known as _bug_ 3262 at http://mercurial.selenic.com/bts/issue3262
Comment 7 Matt Mackall 2014-07-25 17:22 UTC
Bulk close: no activity for >2 years -> WONTFIX
Comment 8 Matt Mackall 2014-07-31 13:22 UTC
Bulk change recent WONTFIX -> new, more descriptive ARCHIVED state (sorry for the spam)
Comment 9 HG Bot 2018-10-26 17:50 UTC
Fixed by https://mercurial-scm.org/repo/hg/rev/7caf632e30c3
Yuya Nishihara <yuya@tcha.org>
filecache: unimplement __set__() and __delete__() (API)

Implementing __set__() implies that the descriptor can't be overridden by
obj.__dict__, which means any property access involves slow function call.

  "Data descriptors with __set__() and __get__() defined always override
  a redefinition in an instance dictionary. In contrast, non-data descriptors
  can be overridden by instances."

  https://docs.python.org/2.7/reference/datamodel.html#invoking-descriptors
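
The quoted rule can be illustrated with a small standalone example. This is not Mercurial's filecache code; the class names are hypothetical and only show the difference between a data descriptor and a non-data descriptor.

```python
class DataDescriptor(object):
    """Defines __get__ and __set__: always wins over the instance dict."""

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return "from descriptor"

    def __set__(self, obj, value):
        # The value is stored, but attribute reads still go through __get__.
        obj.__dict__["attr"] = value


class NonDataDescriptor(object):
    """Defines only __get__: an instance dict entry can shadow it."""

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return "from descriptor"


class WithData(object):
    attr = DataDescriptor()


class WithNonData(object):
    attr = NonDataDescriptor()


a = WithData()
a.attr = "cached"       # routed through DataDescriptor.__set__
print(a.attr)           # prints "from descriptor": __get__ is still called

b = WithNonData()
b.attr = "cached"       # no __set__, so this lands in b.__dict__
print(b.attr)           # prints "cached": plain dict lookup, no function call
```

In the second case, once the value is in the instance dictionary, later reads skip the descriptor entirely, which is the fast path the patch description refers to.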

This patch basically backs out 236bb604dc39, "scmutil: update cached copy
when filecached attribute is assigned (issue3263)." The problem described
in issue3263 (which is #3264 in Bugzilla) should no longer happen since
repo._bookmarkcurrent has been moved to repo._bookmarks.active. We still
have a risk of introducing similar bugs, but I think that's the cost we
have to pay.

  $ hg perfrevset 'branch(tip)' -R mercurial
  (orig) wall 0.139511 comb 0.140000 user 0.140000 sys 0.000000 (best of 66)
  (prev) wall 0.114195 comb 0.110000 user 0.110000 sys 0.000000 (best of 81)
  (this) wall 0.099038 comb 0.110000 user 0.100000 sys 0.010000 (best of 93)

(please test the fix)
Comment 10 Bugzilla 2018-11-03 00:00 UTC
Bug was set to TESTING for 7 days, resolving