[PATCH] V13 of experiment for a simpler path encoding for hashed paths (for "fncache2")

Isaac Jurado diptongo at gmail.com
Sat Sep 29 19:52:31 CDT 2012


On Sat, Sep 29, 2012 at 10:44 PM, Adrian Buehlmann <adrian at cadifra.com> wrote:
>
>> The calls to memcpy are only inlined when both the src and length are
>> constant at compile time, which is not the case in that code.  At
>> least, that's how it works with GCC.  I believe there cannot be much
>> more magic about that, so probably other compilers behave the same.
>
> Interesting.
>
> As a try I've replaced the memcopy call with
>
>                                 //memcopy(dest, &destlen, destsize, &seg, seglen);
>                                 if (dest) {
>                                         for (t = 0; t < seglen; ++t)
>                                                 dest[destlen + t] = seg[t];
>                                 }
>                                 destlen += seglen;
>
> which is translated by MS C as
>
> <paste>
> $LN9@cutdirs:
>
> ; 530  :                                }
> ; 531  :                                /* memcopy(dest, &destlen, destsize, &seg, seglen); */
> ; 532  :                                if (dest) {
>
>         test    r12, r12
>         je      SHORT $LN6@cutdirs
>
> ; 533  :                                        for (t = 0; t < seglen; ++t)
>
>         test    r8, r8
>         jle     SHORT $LN6@cutdirs
>         lea     rcx, QWORD PTR [r12+rdi]
>         lea     rdx, QWORD PTR seg$[rsp]
>         call    memcpy
>         lea     rdx, OFFSET FLAT:encchar
> $LN6@cutdirs:
>         mov     rcx, QWORD PTR len$[rsp]
>
> ; 534  :                                                dest[destlen + t] = seg[t];
> ; 535  :                                }
> ; 536  :                                destlen += seglen;
>
>         movsxd  rax, esi
>
> ; 537  :
> ; 538  :                                seglen = 0;
>
>         xor     esi, esi
>         add     rdi, rax
>         xor     r8d, r8d
> $LN15@cutdirs:
>
> ; 539  :                        }
> </paste>
>
> So the compiler even creates a memcpy call if I write a for loop that copies
> memory around.

On many systems, libc's memcpy and its relatives are implemented in
hand-coded assembly that applies tricks and shortcuts, such as copying
a cache line per loop iteration or copying word by word.

So they are more efficient than we distrustful programmers tend to
think.
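
To make the comparison concrete, here is a minimal sketch of the two
ways of writing the append step.  The names append_loop and
append_memcpy are hypothetical, chosen only for illustration; this is
not the memcopy helper from the patch.  Both mirror the quoted
snippet: when dest is NULL nothing is copied and only the new length
is computed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append n bytes of src at dest[destlen], one
 * byte at a time.  With dest == NULL only the new length is computed. */
size_t append_loop(char *dest, size_t destlen, size_t destsize,
                   const char *src, size_t n)
{
	size_t t;

	/* The caller must provide enough room when dest is given. */
	assert(dest == NULL || destlen + n <= destsize);
	if (dest) {
		for (t = 0; t < n; ++t)
			dest[destlen + t] = src[t];
	}
	return destlen + n;
}

/* Same contract as append_loop, but delegating the copy to memcpy. */
size_t append_memcpy(char *dest, size_t destlen, size_t destsize,
                     const char *src, size_t n)
{
	assert(dest == NULL || destlen + n <= destsize);
	if (dest && n > 0)
		memcpy(dest + destlen, src, n);
	return destlen + n;
}
```

Both functions should leave the destination buffer in the same state.
Whether the loop version is compiled into a memcpy call, as in the
MS C listing above, depends on the compiler and the optimization
flags, so I would not rely on either outcome.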

-- 
Isaac Jurado

"The noblest pleasure is the joy of understanding"
Leonardo da Vinci

