Solving long paths by hashing
Adrian Buehlmann
adrian at cadifra.com
Sun Jun 29 06:57:04 CDT 2008
On 29.06.2008 12:28, Dirkjan Ochtman wrote:
> Adrian Buehlmann wrote:
>> Questions left:
>> Does streamclone really need to walk the store like that?
>> Would it be possible to eliminate this use of util.decodefilename?
>
> Well, I think what it does is walking all the files and passing their
> name, size and contents to the client so that the client can just save
> the revlog contents under the appropriate file name, using the encoding
> that the client hg prefers, so there's no way around that, really.
>
> The alternative is discovering all filenames in some other way than
> walking the store, but it seems that would involve either walking
> manifests for all changesets in the changelog or reading each filelog,
> checking out the manifest in which it last appeared, then read that
> manifest to find any files that are still missing, or something. Both of
> these aren't going to be very efficient, it seems.
Thanks Dirkjan.
Just got another idea:
Instead of writing a reverse mapping of encoded -> unencoded filenames
into a single file as Jesse's patch does (the "longnames" file), we could:
Prepend a new prefix to the content of every name-hashed *.i file in the
store, consisting of
a) a new revlog-header
b) followed by the unencoded filename
c) followed by some delimiter
and then followed by whatever *.i files currently contain (somewhat similar
to adding another layer to a protocol).
We could then read the decoded filename from the beginning of the *.i file
and skip the new prefix.
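A rough sketch of what I mean, in Python. This is only an illustration, not
existing Mercurial code: the marker bytes, the NUL delimiter and the function
names are all made up for the example, and the real header would have to be
something the revlog code can distinguish from an old-style *.i file.

```python
import io

# Hypothetical values, chosen only for this sketch.
MAGIC = b"HGLN"   # stands in for the "new revlog-header"
DELIM = b"\0"     # NUL cannot occur in a filename, so it is a safe delimiter


def write_prefixed(fp, unencoded_name, index_data):
    """Write the prefix (header, name, delimiter), then the usual *.i content."""
    fp.write(MAGIC)
    fp.write(unencoded_name)
    fp.write(DELIM)
    fp.write(index_data)


def read_prefix(fp):
    """Return the unencoded filename and leave fp at the start of the
    original *.i content."""
    if fp.read(len(MAGIC)) != MAGIC:
        raise ValueError("not a name-prefixed index file")
    name = bytearray()
    while True:
        ch = fp.read(1)
        if not ch:
            raise ValueError("truncated prefix")
        if ch == DELIM:
            return bytes(name)
        name += ch


# Round trip through an in-memory file.
buf = io.BytesIO()
write_prefixed(buf, b"some/very/long/path.txt", b"<index bytes>")
buf.seek(0)
name = read_prefix(buf)
rest = buf.read()   # whatever the *.i file contains today
```

After read_prefix returns, the file position sits right behind the delimiter,
so existing code could continue reading the index data from there.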
After all, we create a new repo layout anyway, so we can change
the way we store *.i files.
For example, streamclone.stream_out reads the *.i files anyway, so it
would be efficient for stream_out to extract the unencoded filename
from the *.i file it is about to send.
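To illustrate that last point, here is a minimal sketch of a store walk that
recovers the unencoded name from each name-hashed *.i file while it has the
file open. It is not the actual streamclone code: walk_hashed_store, read_name,
the marker bytes and the NUL delimiter are all assumptions made for the
example.

```python
import os

# Hypothetical values, chosen only for this sketch.
MAGIC = b"HGLN"   # stands in for the new header in front of the name
DELIM = b"\0"     # terminates the unencoded filename


def read_name(fp):
    """Read the name prefix; fp ends up at the original *.i content."""
    if fp.read(len(MAGIC)) != MAGIC:
        raise ValueError("missing name prefix")
    name = b""
    while True:
        ch = fp.read(1)
        if not ch:
            raise ValueError("truncated name prefix")
        if ch == DELIM:
            return name
        name += ch


def walk_hashed_store(store_dir):
    """Yield (unencoded name, payload size) for every *.i file found.

    No reverse-mapping file is consulted: the name comes from the
    *.i file itself.
    """
    for root, dirs, files in os.walk(store_dir):
        dirs.sort()   # visit subdirectories in a stable order
        for fname in sorted(files):
            if not fname.endswith(".i"):
                continue
            path = os.path.join(root, fname)
            with open(path, "rb") as fp:
                name = read_name(fp)
                # Size of the content that follows the prefix.
                payload_size = os.path.getsize(path) - fp.tell()
            yield name, payload_size
```

The size reported here excludes the prefix, on the assumption that the
receiving side would re-encode the name itself and store only the payload.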