why does the cmdserver use a 4-byte length field?

Idan Kamara idankk86 at gmail.com
Mon Jul 4 08:34:59 CDT 2011


On Mon, Jul 4, 2011 at 4:22 PM, Laurens Holst <laurens.nospam at grauw.nl> wrote:
>
> Op 01-07-11 15:32, Idan Kamara schreef:
>
> On Fri, Jul 1, 2011 at 1:13 AM, Jesper Schmidt <schmiidt at gmail.com> wrote:
> >
> > I mean a 2-byte length field seems to be enough for the input channels.
>
> I think in today’s day and age you shouldn’t worry about two bytes. Any performance difference it would make will be basically immeasurable.
>
> What you *should* worry about is that the protocol does not impose arbitrary limits that may become a serious problem in the future. Y’know, like FAT32’s 4GB file size limit (Mercurial’s not that different from a file system... :)).

Mercurial already has internal limits on file sizes, see:
http://mercurial.selenic.com/wiki/HandlingLargeFiles

>
> > It could also be that 4-byte integers today are more common than 2-byte
> > integers and therefore easier to work with in some newer languages. Is
> > that the motivation for choosing 4 bytes over 2 bytes or is 2 bytes
> > simply not enough? If 2 bytes are not enough then why are 4 bytes then
> > enough?
> Theoretically, if Mercurial ever needs to write 4GB or more in a single call, then 4 bytes won't be enough.
>
> You could use a text-based protocol where the length is a newline-terminated string of numeric ASCII characters. Then it would be somewhat more future-proof, support for lengths > 4GB could be added without requiring a protocol change.

Yeah, but then the header length is variable, as opposed to being just
5 bytes. That introduces issues of its own.
Besides, we could always send outgoing data on the server side in chunks < 4GB.
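
To make the chunking idea concrete, here is a rough sketch in Python. It assumes the 5-byte header is a 1-byte channel identifier followed by a 4-byte big-endian unsigned length; that layout, and the names write_chunked and read_frame, are illustrative assumptions rather than the actual server code. Note that an empty payload produces no frame at all in this sketch.

```python
import struct

# Assumed header layout: 1-byte channel + 4-byte big-endian unsigned length.
HEADER = struct.Struct(">cI")   # HEADER.size == 5
MAX_CHUNK = 2**32 - 1           # largest length a 4-byte unsigned field can hold


def write_chunked(stream, channel, data, max_chunk=MAX_CHUNK):
    """Write data as one or more frames, each no longer than max_chunk."""
    for start in range(0, len(data), max_chunk):
        chunk = data[start:start + max_chunk]
        stream.write(HEADER.pack(channel, len(chunk)))
        stream.write(chunk)


def read_frame(stream):
    """Read one frame; return (channel, payload), or None at end of stream."""
    header = stream.read(HEADER.size)
    if len(header) < HEADER.size:
        return None
    channel, length = HEADER.unpack(header)
    return channel, stream.read(length)
```

Because the header size is fixed, the reader always knows to read exactly 5 bytes before the payload, and a payload larger than the length field allows simply arrives as several consecutive frames on the same channel.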

>
> I saw someone even suggesting using a MIME-based protocol? Not such a bad idea if you ask me :).
>
> ~Laurens

