largefiles: server storage?

Na'Tosha Bard natosha at unity3d.com
Fri Oct 14 10:42:30 CDT 2011


2011/10/14 Justin Holewinski <justin.holewinski at gmail.com>

> On Fri, Oct 14, 2011 at 11:06 AM, Na'Tosha Bard <natosha at unity3d.com> wrote:
>
>> 2011/10/14 Justin Holewinski <justin.holewinski at gmail.com>
>>
>>>
>>>
>>> On Wed, Oct 12, 2011 at 1:05 PM, Justin Holewinski <
>>> justin.holewinski at gmail.com> wrote:
>>>
>>>> I'm excited for the new largefiles extension, and I have been trying it
>>>> out on some local test repositories.  I realize the extension (in its
>>>> "official" form) is very new and subject to change, but I have a question on
>>>> how the large files are (or are going to be) stored on the server.
>>>>
>>>> Let's say I have two local repositories S and C.  S represents the
>>>> "server" repository, and C represents a client clone.  If I add large files
>>>> to C and push to S, the large files appear to be stored in
>>>> $HOME/.largefiles, not in the .hg directory for S.
>>>>
>>>
>> I believe this makes sense, based on the current implementation.
>>
>>
>>>> It looks like S just contains hashes for the files, which makes sense.
>>>>
>>>
>> That's right -- the hashes are sitting in S/.hglf, correct?
>>
>
> They are stored in .hg/store/data/~2ehglf.  There appear to be .i files
> that store all past revision hashes, and the .i files are stored in a
> structure that mirrors the repository structure.  The client repo has the
> .hglf directory.  If I run "hg update" on the server repo, then I get the
> .hglf directory, and "hg update null" removes it.
>

The client repo seems correct.  The stuff happening on the server sounds
like garbage we need to fix.
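
As an aside on the ~2ehglf name mentioned above: it looks consistent with
Mercurial's store escaping a path component that starts with a dot.  Here is
a minimal Python sketch of just that one rule, assuming a leading '.' or ' '
is written as '~' followed by its two-digit hex code.  The function names are
made up for illustration; this is not the actual store code, which applies
further escaping rules on top of this one.

```python
def encode_leading_char(component: str) -> str:
    """Escape a leading '.' or ' ' as '~' plus its two-digit hex code.

    Simplified illustration only; hypothetical helper, not Mercurial's API.
    """
    if component and component[0] in ". ":
        return "~%02x" % ord(component[0]) + component[1:]
    return component


def encode_path(path: str) -> str:
    """Apply the leading-character rule to each '/'-separated component."""
    return "/".join(encode_leading_char(part) for part in path.split("/"))


# '.' is 0x2e, so the standin directory '.hglf' becomes '~2ehglf'.
print(encode_path(".hglf/assets/big.bin"))
```

Under that assumption, the standins tracked under .hglf/ would show up in the
store beneath a ~2ehglf directory, mirroring the layout of the working copy.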


> Also, this may be a bug: in my test repository, I have all of the
> largefiles in an assets/ directory.  If I run "hg update" on the server,
> this directory is created.  But if I run "hg update null", then the contents
> of assets/ are deleted, but the directory still remains, unlike other
> directories that contain only normally-versioned files.
>

That directory probably won't show up on the server unless you run hg
update, because by default the server version has no working copy, right?

>>>> Incidentally, will there be a config option for this, for users that wish
>>>> to sandbox all hg-related files in a separate directory?
>>>>
>>>
>> Every largefiles-enabled repo will have its own set of standins
>> maintained in repo/.hglf -- I don't see any reason why this should be able
>> to be moved out of the repository because it is repo-specific.  Also the
>> standins are very small text files, so why do they need to be elsewhere?
>>
>
> I was actually referring to the opposite: will I be able to configure the
> server to store all largefiles blobs in the .hg directory, or some other
> user-configurable directory?
>

I don't believe it is supported yet, but I believe we should add it.


>>>> Now, let's say I create a new repository accessible over SSH, called S2.
>>>>  If I push C to S2, the largefiles seem to be stored in *both*
>>>> $HOME/.largefiles (in the SSH account) and the .hg directory for S2.  Things
>>>> are getting a bit inconsistent.
>>>>
>>>
>> That does sound inconsistent -- to me, anyway.  There shouldn't be any
>> largefiles in S2/.hg -- there should only be the textfiles with the SHA1
>> sums in S2/.hglf -- is the S2/.hglf directory there?
>>
>
> There is no .hglf directory, the largefiles appear to be stored in
> .hg/largefiles (in the server repo):
>
> $ ls -l .hg/largefiles
> total 71584
> -rw------- 2 hg hg    71576 2011-10-12 12:49 161215d4d5bcd9e1e92bb9865a1c5dfd5363229a
> -rw------- 2 hg hg 24376104 2011-10-12 12:49 21dd947c014a25729adca78a1b7eee30ded378e3
> -rw------- 2 hg hg     7821 2011-10-12 12:49 2204a70d38f16a931bfbfc1e81f3b18572e1bd69
> -rw------- 2 hg hg 24376128 2011-10-12 12:49 311990273ded89150c5eb3a84399323189c74425
> -rw------- 2 hg hg     7813 2011-10-12 12:49 3f727d8d67f5a7bb7adea5beae0a02cc3ef4dd15
> -rw------- 2 hg hg 24376116 2011-10-12 12:49 9605bf1cdd8cbba8879ebca4e7ea68d9a949b569
> -rw------- 2 hg hg    71567 2011-10-12 12:49 a4572db237727e4d26c9c3230774f1705872221b
>
>
>
>>
>>
>>>> I have not tested HTTP/HTTPS, but what is the expected behavior in this
>>>> case?  There may not be a writable home directory in this case.
>>>>
>>>>
>>>> More specifically, what are the planned methods for storing large files
>>>> on mercurial servers?
>>>>
>>>
>>> Ping?  Any comment from the largefiles devs on the planned server storage
>>> model?
>>>
>>
>> I'm not really sure we have a concrete plan yet.  This extension (at least
>> in this form) is very new.
>>
>
> Is it still going to be released with Mercurial 2.0?
>

Yes.


>> Some of us are expecting to use largefiles with Kiln, which already
>> implements the server-side stuff.  Some people will be migrating
>> from the old bfiles extension, which means they already have a central share
>> set up somewhere (but I assume some conversion will be necessary).  Greg is
>> preparing a way for users to migrate from bfiles to largefiles, so he might
>> have some idea on this.
>>
>> The built-in serving ability of largefiles was developed by the team at
>> FogCreek, so hopefully one of them can reply with what their vision was.
>>
>> My initial thought is that:
>>
>> 1. The $HOME/.largefiles cache should be configurable server-side, if it
>>    is not already.
>> 2. Each repo should only contain the hashes in repo/.hglf -- when
>>    largefiles are uploaded, they should probably all go directly to the
>>    cache.
>>
>
> That makes sense to me, as long as the cache path is configurable. :)
>

Let's wait for the other Largefiles devs to weigh in on the issue before we
make a plan.
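
To make the idea discussed above a bit more concrete, here is a rough Python
sketch of the layout: the blob goes into a cache directory named by its SHA-1,
and the repo only gets a small standin under .hglf/ holding that hash.  This
is only an illustration under stated assumptions, not the extension's code.
The function names, the cache_dir parameter, and the "hash plus newline"
standin format are assumptions made for the example.

```python
import hashlib
import os
import shutil


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def store_largefile(repo_root, rel_path, cache_dir):
    """Copy a large file into cache_dir and write its standin.

    Hypothetical helper: cache_dir stands in for a configurable
    server-side cache path (e.g. instead of $HOME/.largefiles).
    """
    src = os.path.join(repo_root, rel_path)
    digest = sha1_of_file(src)

    # The blob is stored once, under a name equal to its content hash.
    os.makedirs(cache_dir, exist_ok=True)
    blob = os.path.join(cache_dir, digest)
    if not os.path.exists(blob):
        shutil.copyfile(src, blob)

    # The repo itself only holds a small text file with the hash.
    standin = os.path.join(repo_root, ".hglf", rel_path)
    os.makedirs(os.path.dirname(standin), exist_ok=True)
    with open(standin, "w") as f:
        f.write(digest + "\n")
    return digest
```

Because the cache location is just a parameter here, pointing it at a
per-repo directory or at a shared one is a caller decision, which is the kind
of configurability being asked for in this thread.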

Cheers,
Na'Tosha


-- 
*Na'Tosha Bard*
Build & Infrastructure Developer | Unity Technologies

*E-Mail:* natosha at unity3d.com
*Skype:* natosha.bard

