largefiles: server storage?

Justin Holewinski justin.holewinski at gmail.com
Fri Oct 14 11:15:17 CDT 2011


On Fri, Oct 14, 2011 at 11:42 AM, Na'Tosha Bard <natosha at unity3d.com> wrote:

> 2011/10/14 Justin Holewinski <justin.holewinski at gmail.com>
>
>> On Fri, Oct 14, 2011 at 11:06 AM, Na'Tosha Bard <natosha at unity3d.com> wrote:
>>
>>> 2011/10/14 Justin Holewinski <justin.holewinski at gmail.com>
>>>
>>>>
>>>>
>>>> On Wed, Oct 12, 2011 at 1:05 PM, Justin Holewinski <
>>>> justin.holewinski at gmail.com> wrote:
>>>>
>>>>> I'm excited for the new largefiles extension, and I have been trying it
>>>>> out on some local test repositories.  I realize the extension (in its
>>>>> "official" form) is very new and subject to change, but I have a question on
>>>>> how the large files are (or are going to be) stored on the server.
>>>>>
>>>>> Let's say I have two local repositories S and C.  S represents the
>>>>> "server" repository, and C represents a client clone.  If I add large files
>>>>> to C and push to S, the large files appear to be stored in
>>>>> $HOME/.largefiles, not in the .hg directory for S.
>>>>>
>>>>
>>> I believe this makes sense, based on the current implementation.
>>>
>>>
>>>>> It looks like S just contains hashes for the files, which makes sense.
>>>>>
>>>>
>>> This is correct -- the hashes are sitting in S/.hglf -- correct?
>>>
>>
>> They are stored in .hg/store/data/~2ehglf.  There appear to be .i files
>> that store all of the past revision hashes, laid out in a structure that
>> mirrors the repository.  The client repo has the .hglf directory.  If I run
>> "hg update" on the server repo, then I get the .hglf directory there too,
>> and "hg update null" removes it.
>>
>
> The client repo seems correct.  The stuff happening on the server sounds
> like garbage we need to fix.
>
>
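For reference, my mental model of a standin -- a sketch of the scheme as I
understand it, not the extension's actual code -- is a tiny text file under
.hglf/, at the same relative path as the real file, holding the SHA-1 of the
big file's contents:

    import hashlib

    # Sketch: compute the hash a standin would hold.  Details such as the
    # chunk size or a trailing newline in the standin file are guesses on
    # my part.
    def standin_hash(path):
        h = hashlib.sha1()
        with open(path, 'rb') as f:
            for chunk in iter(lambda: f.read(1024 * 1024), b''):
                h.update(chunk)
        return h.hexdigest()

    # e.g. the standin for assets/bigfile.bin would live at
    # .hglf/assets/bigfile.bin and contain just the 40-character hex digest.

So the repository history only ever records these small digests, while the
multi-megabyte blobs live outside the normal store.
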
>>  Also, this may be a bug: in my test repository, I have all of the
>> largefiles in an assets/ directory.  If I run "hg update" on the server,
>> this directory is created.  But if I run "hg update null", the contents of
>> assets/ are deleted while the directory itself remains, unlike other
>> directories that contain only normally-versioned files.
>>
>
> That directory probably won't show up on the server unless you run hg
> update, because by default the server version has no working copy, right?
>

Right, I probably wasn't clear enough.  If I run "hg update" *on the
server*, I get a working copy that includes assets/.  If I then run "hg
update null" *on the server*, the contents of assets/ are wiped, but the
directory itself remains.


>
>>>>> Incidentally, will there be a config option for this, for users that
>>>>> wish to sandbox all hg-related files in a separate directory?
>>>>>
>>>>
>>> Every largefiles-enabled repo will have its own set of standins
>>> maintained in repo/.hglf -- I don't see any reason why these should be
>>> movable out of the repository, because they are repo-specific.  Also, the
>>> standins are very small text files, so why would they need to live elsewhere?
>>>
>>
>> I was actually referring to the opposite: will I be able to configure the
>> server to store all largefiles blobs in the .hg directory, or some other
>> user-configurable directory?
>>
>
> I don't believe it is supported yet, but I believe we should add it.
>
>
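Something along these lines is what I have in mind.  Note that the
configured_path knob below is purely hypothetical -- as you say, no such
option exists yet -- and the fallback mirrors today's per-user cache under
$HOME:

    import os

    # Hypothetical sketch: resolve where the server keeps largefile blobs.
    # 'configured_path' stands in for a future hgrc setting; the fallback
    # mirrors the current default of $HOME/.largefiles.
    def largefiles_store(configured_path=None):
        if configured_path:
            return os.path.expanduser(configured_path)
        return os.path.join(os.path.expanduser('~'), '.largefiles')

    # e.g. largefiles_store('/srv/hg/S/.hg/largefiles') would keep the blobs
    # inside the server repository rather than the pushing user's home
    # directory.
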
>>>>> Now, let's say I create a new repository accessible over SSH, called
>>>>> S2.  If I push C to S2, the largefiles seem to be stored in *both*
>>>>> $HOME/.largefiles (in the SSH account) and the .hg directory for S2.  Things
>>>>> are getting a bit inconsistent.
>>>>>
>>>>
>>> That does sound inconsistent -- to me, anyway.  There shouldn't be any
>>> largefiles in S2/.hg; there should only be the text files with the SHA1
>>> sums in S2/.hglf.  Is the S2/.hglf directory there?
>>>
>>
>> There is no .hglf directory; the largefiles appear to be stored in
>> .hg/largefiles (in the server repo):
>>
>> $ ls -l .hg/largefiles
>> total 71584
>> -rw------- 2 hg hg    71576 2011-10-12 12:49 161215d4d5bcd9e1e92bb9865a1c5dfd5363229a
>> -rw------- 2 hg hg 24376104 2011-10-12 12:49 21dd947c014a25729adca78a1b7eee30ded378e3
>> -rw------- 2 hg hg     7821 2011-10-12 12:49 2204a70d38f16a931bfbfc1e81f3b18572e1bd69
>> -rw------- 2 hg hg 24376128 2011-10-12 12:49 311990273ded89150c5eb3a84399323189c74425
>> -rw------- 2 hg hg     7813 2011-10-12 12:49 3f727d8d67f5a7bb7adea5beae0a02cc3ef4dd15
>> -rw------- 2 hg hg 24376116 2011-10-12 12:49 9605bf1cdd8cbba8879ebca4e7ea68d9a949b569
>> -rw------- 2 hg hg    71567 2011-10-12 12:49 a4572db237727e4d26c9c3230774f1705872221b
>>
>>
>>
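Those filenames look like plain SHA-1s of the blob contents.  Assuming that
is the scheme (an assumption on my part, not something I have verified in the
code), each entry is self-describing and easy to sanity-check:

    import hashlib
    import os

    # Check that every file in a largefiles blob store is named by the SHA-1
    # of its own contents (assumes a flat, content-addressed layout).
    def verify_blob_cache(cache_dir):
        for name in sorted(os.listdir(cache_dir)):
            h = hashlib.sha1()
            with open(os.path.join(cache_dir, name), 'rb') as f:
                for chunk in iter(lambda: f.read(1024 * 1024), b''):
                    h.update(chunk)
            print('%s %s' % ('ok' if h.hexdigest() == name else 'MISMATCH', name))

    # e.g. verify_blob_cache('.hg/largefiles')
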
>>>
>>>
>>>>> I have not tested HTTP/HTTPS, but what is the expected behavior there?
>>>>> There may not be a writable home directory in that case.
>>>>>
>>>>>
>>>>> More specifically, what are the planned methods for storing large files
>>>>> on Mercurial servers?
>>>>>
>>>>
>>>> Ping?  Any comment from the largefiles devs on the planned server
>>>> storage model?
>>>>
>>>
>>> I'm not really sure we have a concrete plan yet.  This extension (at
>>> least in this form) is very new.
>>>
>>
>> Is it still going to be released with Mercurial 2.0?
>>
>
> Yes.
>
>
>>> Some of us are expecting to use largefiles with Kiln, which already
>>> implements the server-side storage.  Some people will be migrating from the
>>> old bfiles extension, which means they already have a central share set up
>>> somewhere (though I assume some conversion will be necessary).  Greg is
>>> preparing a way for users to migrate from bfiles to largefiles, so he might
>>> have some ideas on this.
>>>
>>> The built-in serving ability of largefiles was developed by the team at
>>> FogCreek, so hopefully one of them can reply with what their vision was.
>>>
>>> My initial thought is that:
>>>
>>> - The $HOME/.largefiles cache should be configurable server-side, if it is
>>>   not already.
>>> - Each repo should only contain the hashes in repo/.hglf -- when largefiles
>>>   are uploaded, they should probably all go directly to the cache.
>>>
>>
>> That makes sense to me, as long as the cache path is configurable. :)
>>
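To spell out the model I am picturing for that proposal (my reading of it,
not the extension's actual code): the server keeps a single content-addressed
store, push uploads any blobs the store does not already have, and the
repository itself never records more than the hash:

    import hashlib
    import os
    import shutil

    # Sketch of the upload step: copy a largefile into a content-addressed
    # store named by the SHA-1 of its contents, skipping blobs already there.
    def put_blob(store_dir, src_path):
        h = hashlib.sha1()
        with open(src_path, 'rb') as f:
            for chunk in iter(lambda: f.read(1024 * 1024), b''):
                h.update(chunk)
        digest = h.hexdigest()
        dst = os.path.join(store_dir, digest)
        if not os.path.exists(dst):
            shutil.copyfile(src_path, dst)
        return digest  # the repo itself only needs to keep this hash

The nice property is that the store location becomes a pure deployment
detail -- $HOME/.largefiles, .hg/largefiles, or anything else -- as long as
it is configurable.
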
>
> Let's wait for the other Largefiles devs to weigh in on the issue before we
> make a plan.
>
> Cheers,
> Na'Tosha
>
>
> --
> *Na'Tosha Bard*
> Build & Infrastructure Developer | Unity Technologies
>
> *E-Mail:* natosha at unity3d.com
> *Skype:* natosha.bard
>
>


-- 

Thanks,

Justin Holewinski