[PATCH 04 of 14 RFC c-hglib:level0] hg_open: additional functions (read_hello and read_header)

Giovanni Gherdovich g.gherdovich at gmail.com
Wed Sep 11 23:49:53 CDT 2013


Ok that's another book chapter.

TLDR goes like this:
I don't like hg_rawread() to read the header of the next packet in advance.
I suggest how you can avoid doing so.

:::: # HG changeset patch
:::: # User Iulian Stana <julian.stana at gmail.com>
:::: # Date 1378234490 -10800
:::: #      Tue Sep 03 21:54:50 2013 +0300
:::: # Node ID 3b4648c6815ba9e1c46a7e2d1184a72952c6e537
:::: # Parent  806f59433d9ca5e4db5abd79fea365dc89c4c9f5
:::: hg_open: additional functions (read_hello and read_header)
::::
:::: read_hello: reading the hello msg from cmd server
:::: read_header: reading the header from cmd server
::::
:::: It's also an inside function that comes with hg_open function.
:::: Users are not allowed to call this function. I think that it's a
protection for
:::: them. This function is called from hg_open function, hg_rawread
and hg_rawwrite
:::: functions
::::
:::: diff --git a/client.c b/client.c
:::: --- a/client.c
:::: +++ b/client.c
:::: @@ -10,6 +10,86 @@
::::  #include <signal.h>
::::
::::
:::: +/**
:::: + * \brief Reading the header from cmdsrv.
:::: + *
:::: + * The function will read the header from the command server and
will save it to
:::: + * the header parameter of the handle structure.
:::: + * \param handle The handle of the connection, wherewith I want
to communicate
:::: + * \retval 0 if succesfull
:::: + * \retval -1 to indicate an error, with errno set appropriately
:::: + *
:::: + * errno can be:
:::: + *       - EINVAL  - Invalid argument (handle it's set to a null pointer)
:::: + *       - read(2) command errors
:::: + * */
:::: +int read_header(hg_handle *handle)

This function has to change.

In our conversation on IRC you explained me that
user code should not be given the possibility to read the
header (channel, length) of a cmdsrv packet at random times,
since the header is available only in a very specific moment:
at the beginning of a packet.

But hg_rawread() does not respect packet boundaries;
it reads data in chunks of fixed size.

Your solution to this problem is the following:

1) hide read_header() from the user (it's not in the header)
2) call read_header() from the most unexpected places in the code,
   that I literally had to grep for it otherwise I'd have never found it:
   (*) in hg_rawcommand(), just after you say "do this-and-that" to cmdsrv
   (*) every now and then in hg_rawread(), i.e. when some tricky
variable gets to zero.

I don't like this solution because it makes the code difficult to understand:
- Why do you *read* stuff in a function that is supposed to send commands?
- Why do you read the header of packet N+1 inside a function that is
supposed to read packet N?

I'd like you to change read_header() so that it behaves like this

1) You will have a field in the hg_handle struct called bytes_on_pipe
   that counts how many bytes are left on the pipe to complete the
current packet.
   You will initialize it with the length announced in the header,
   and decrease as you read data.
2) read_header will check bytes_on_pipe: if zero, it will read the header from
   the pipe and store it in the handle; if not, it will just return the header
   written in the handle.
3) prefix with hg_, put it in the public header, make it available to the user.
   S/he can call it whenever s/he likes. No restriction of any sort.

In this way you still have the pattern

while( read_header()->channel != 'r' ) {
  /* do stuff */
}

to be used in level 1 commands.

And the code will be easier to read, as the header is read in a more
uniform way.

:::: +{
:::: + uint32_t length;
:::: + char ch_char;
:::: +
:::: + if(!handle) {
:::: + errno = EINVAL;
:::: + return -1;
:::: + }
:::: + handle->current_header = handle->next_header;

I told you that I never really got what "current_header" and "next_header" were.
The names misled me to think that you were somehow reading headers
ahead of time...
Of course it's nothing like that (but the names...).

After a careful inspection, I understood that:

(*) handle->next_header. Its "channel" field is the channel you're dealing with.
    Its length field decrease as you consume data, pretty much as the
bytes_on_pipe
    variable I mention before.
    You need that since otherwise you cannot know where are packet boundaries,
    and when you have to read the new header.

(*) handle->current_header. Unused in level 0, and buggy (I'll come on that).
    In my opinion, you can live without it.

What you should do:

(1) use only one hg_header field in hg_handle objects,
    call it header,
    and don't change it's value until you get to the next packet.
(2) use a bytes_on_pipe on pipe variable to know when the packet is over.

Now: in the assignment

handle->current_header = handle->next_header

the "length" field of the right hand side will always be zero,
since you enter read_header only when a packet is over.
And before that assignment, I have the feeling that
handle->current_header->length
is uninitialized (during the first iteration).

Whatever. The names are so misleading
(neither current_header nor next_header are headers)
that you should definitely find a way to get rid of those.
I think one header is enough.
Whatever use you have for handle->current_header->channel,
you have to make it more explicit.


As a last note, I recall we already had this discussion last Aug 6th
in this thread
http://thread.gmane.org/gmane.comp.version-control.mercurial.devel/62348/focus=62489

we werec commenting the prototype

int hg_rawread(hg_handle *handle, char *buffer, size_t sizebuff);

I didn't really understand Kevin argument on race conditions,
but I was not convinced by your answer neither.
It went like this:

Kevin wrote:
:::: How do I know what channel I'm reading from?
:::: Or how many bytes I should try to read?

Iulian wrote:
:::: In my examples I used the hg_channel command to track the channel,
:::: but you also have the possibility to call hg_head that will return
:::: an hg_header structure.
:::: I would like to think that my rawread command is more
:::: like read system call. Internal the command is checking the
:::: length of data that is coming.

Kevin wrote:
:::: If I have to call hg_header() or hg_channel() to find out
:::: (from what's set in struct hg_handle), then I'm trivially
:::: subject to race conditions.
::::
:::: Seems like it'd be better to make the caller provide an
:::: hg_header * to pass back the response header
:::: (as a so-called out parameter).

Iulian wrote:
:::: I think that you must call hg_header() or hg_channel(),
:::: to know your next step. I don't know exactly how else
:::: you can do this (to know if you need to read or to write)

:::: + if(read(handle->p_read, &ch_char, 1) < 0){
:::: + return -1;
:::: + }
:::: + if(read(handle->p_read, &length, sizeof(uint32_t)) < 0){
:::: + return -1;
:::: + }
:::: + handle->next_header.channel = ch_char;
:::: + handle->next_header.length = swap_uint32(length);
:::: +
:::: + return 0;
:::: +}
:::: +
:::: +/**
:::: + * \brief Reading the hello msg from cmd server.
:::: + *
:::: + * After the connection, the command server sends a hello message.
:::: + * The message contains the command server capabilities and the messages
:::: + * encoding.
:::: + * \param handle The handle of the connection, wherewith I want
to communicate
:::: + * \retval  0 if succesfull
:::: + * \retval -1 to indicate an error, with errno set appropriately
:::: + *
:::: + * errno can be:
:::: + *       - EINVAL  - Invalid argument (handle it's set to a null pointer)
:::: + *       - read(2) command errors
:::: + * */
:::: +int read_hello(hg_handle *handle)
:::: +{
:::: + char *buffer;
:::: + hg_header ch;
:::: +
:::: + if(!handle) {
:::: + errno = EINVAL;
:::: + return -1;
:::: + }
:::: +
:::: + read_header(handle);
:::: + ch = handle->next_header;
:::: +
:::: + if(ch.length == 0){
:::: + return -1;
:::: + }
:::: +
:::: + buffer = malloc(ch.length + 1);
:::: +
:::: + if(read(handle->p_read, buffer, ch.length) < 0){
:::: + free(buffer);
:::: + return -1;
:::: + }
:::: + buffer[ch.length] = '\0';
:::: +
:::: + free(buffer);
:::: +
:::: + return 0;
:::: +}
:::: +
:::: +
::::  /*
::::   * Open the connection with the mercurial command server.
::::   * */


More information about the Mercurial-devel mailing list