disappearing repo history on samba server

Joel B. Mohler jmohler at eaglebusinesssoftware.com
Wed Dec 8 04:54:32 CST 2010


On 12/7/2010 6:09 PM, Adrian Buehlmann wrote:
> On 2010-12-07 23:29, Joel B. Mohler wrote:
>> On 12/7/2010 5:20 PM, Adrian Buehlmann wrote:
>>> On 2010-12-07 21:42, Adrian Buehlmann wrote:
>>>> On 2010-12-07 20:38, Joel B. Mohler wrote:
>>>>> Hi,  My windows machines have been having a rocky relationship with
>>>>> their samba server.  The latest issue is that a push to a certain
>>>>> repository will wipe out revision history.  I've narrowed this down to a
>>>>> very small slice of mercurial code which has nothing at all to do with
>>>>> the changeset index, but the symptom within mercurial is that
>>>>> .hg\store\00changelog.i is renamed to .hg\store\00changelog.i.hgtmp and
>>>>> then my repository history is effectively wiped out (although the exact
>>>>> details of what goes on with 00changelog.i.hgtmp appear to vary somewhat
>>>>> by windows client machine).
>>>>>
>>>>> The fatal bug is reproduced by the python script at
>>>>>        http://kiwistrawberry.us/opener.py
>>>>> This script assumes that you have mercurial and python win32 extensions
>>>>> installed on a windows machine.  Without python win32 extensions, the
>>>>> windows link code goes into graceful (?) degradation mode and so the
>>>>> faulty code is not run.  The bug reproduces with two different windows
>>>>> machines (vista and server 2008) and two different linux servers (ubuntu
>>>>> and gentoo) so I'm quite confident it's not just a server configuration
>>>>> fluke.
>>>>>
>>>>> However, there's a weird gotcha which I don't understand yet.  It is
>>>>> that I can only reproduce this on a large repository of about 32000
>>>>> revisions which I can't make public.  Attempts on a smaller repository
>>>>> have all worked (i.e. reproducing the bug failed).  From my opener.py
>>>>> script, I come to the conclusion that for some reason 00changelog.i is
>>>>> held open longer for a larger repository, but I was unable to determine
>>>>> why that might be.
>>>>>
>>>>> All relevant mercurial installs are at least 1.7.1 and I don't see any
>>>>> evidence that variations of revision beyond the arrival of the
>>>>> checknlink function make any difference.
>>>>>
>>>>> So, two questions:
>>>>> 1)  Am I correct in believing that opener.py illustrates potential for a
>>>>> data-losing bug?
>>>>> 2)  Is more information needed about my repository or can a fix for
>>>>> opener.py be found without that?
>>>>>
>>>> Thanks for your http://kiwistrawberry.us/opener.py script, I'll take a
>>>> closer look.
>>> Or maybe not. I might not have time or motivation to look at this. If
>>> anyone else has any ambitions or ideas here, feel free to jump in here.
>> I think I will pursue this some more myself.  I reproduced what appears
>> to me to be at least one part of the puzzle without mercurial code at
>> all.  The script in this link is merely python and pywin32:
>> http://kiwistrawberry.us/pywin32_link_issues.py
> Very good. After all you are driven by your own immediate need, which is
> by far the best motivation :). (I was just mainly concerned about
> mercurial getting a bad reputation, since I would never ever put my own
> repos on a Windows share :p).
>
> I guess the few remaining Mercurial developers who care about Windows
> don't care that much about repos on Windows shares (although the unlink
> "ghost file" problem doesn't even require a Windows share to happen,
> yikes...).
I'm not sure precisely what you meant by "windows share".  The bug
I found with the pywin32-based script occurs when I access a linux
server via samba from a windows client.  I regard this as a highly
sub-optimal use case, and I understand the official recommendation
to be ssh or http access to the repositories.  The reason my
organization persists in this sub-optimal use case is back-porting
fixes to our older revision clones.  We don't use in-repository
branching and individual developers don't have clones of the old
branches.  As I write this I realize anew how many ways this is a
dubious workflow -- no, we don't even test-build when we back-port, it's
all left for the build master to sort out the breakage.
> If I might make a suggestion: Matt usually prefers inline info and I
> think pasting your short scripts here would indeed be helpful for others
> to look at. And describe in prose what you found at each step. You can
> also use the wiki.
>
> Here is Joel's script (looks very interesting, combination of hardlink +
> windows share + os.unlink !):
>
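>    import os
>    import win32file  # from the pywin32 extensions
>
>    # UNC path to a directory on the samba share
>    dir,file=r"\\tcpkal\downloads","test_links.txt"
>    f=os.path.join(dir,file)
>    open(f,"w").close()  # create an empty test file
>
>    # hardlink f+".link" to f, then hold f itself open
>    win32file.CreateHardLink(f+".link",f)
>    fd=open(f)
>
>    raw_input("Try to delete the .link file now...")
>
>    # unlink the hardlink while the original is still open
>    os.unlink(f+".link")
>    fd.close()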
>
My current worry is that this could be a samba issue as much as a
pywin32 issue.  If that's the case, I doubt it will be a wise investment
for me to follow through on it.  I'll just lock down my linux server to
only allow http access to the repositories and fix my workflow (issues
noted above) to match that restriction.  ***However, it is my current
opinion that the introduction of the new checknlink code may actually be
a net regression for pushing to samba shares.  I have a guaranteed data
loss situation at this point, whereas 1.6.x and prior just failed
erratically (some trade-off there :)).***
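
For anyone who wants to poke at a share without pywin32, the
hardlink/open/unlink sequence from the script above can be approximated
with the standard library alone.  This is only a rough sketch, not
Mercurial's checknlink and not a confirmed reproduction of the bug: the
function name probe_hardlink_unlink and the file name probe_links.txt
are made up for illustration, and it assumes a Python 3 where os.link is
available on the client (on Windows it stands in for
win32file.CreateHardLink).

```python
import os
import tempfile


def probe_hardlink_unlink(directory):
    """Hardlink a file, unlink the link while the original is open,
    and report what is left in the directory afterwards."""
    original = os.path.join(directory, "probe_links.txt")  # hypothetical name
    link = original + ".link"

    with open(original, "w") as f:
        f.write("payload\n")

    os.link(original, link)  # two names for the same file
    nlink_before = os.stat(original).st_nlink

    unlink_error = None
    handle = open(original)  # hold the original open, as in the script above
    try:
        os.unlink(link)  # remove only the extra name
    except OSError as exc:
        unlink_error = exc  # record the failure instead of aborting the probe
    finally:
        handle.close()

    original_exists = os.path.exists(original)
    result = {
        "nlink_before": nlink_before,
        "unlink_error": unlink_error,
        "original_exists": original_exists,
        "link_exists": os.path.exists(link),
        "nlink_after": os.stat(original).st_nlink if original_exists else None,
    }

    # Clean up whatever is left so the probe can be rerun.
    for path in (original, link):
        if os.path.exists(path):
            os.unlink(path)
    return result


if __name__ == "__main__":
    # Replace the temporary directory with a directory on the share to probe it.
    with tempfile.TemporaryDirectory() as target:
        for key, value in sorted(probe_hardlink_unlink(target).items()):
            print("%s: %r" % (key, value))
```

On a local POSIX filesystem one would expect nlink_before to be 2, the
original file to survive, the link to be gone, and nlink_after to be 1.
A different result from a directory on the samba share (for example the
original vanishing together with the link) would point at the
client/server combination rather than at mercurial code.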




More information about the Mercurial-devel mailing list