[PATCH 1 of 3 RFC -V3] Header file for level 0 cmd server API

Kevin Bullock kbullock+mercurial at ringworld.org
Mon Aug 5 22:51:12 CDT 2013


On 25 Jul 2013, at 12:27 PM, Iulian Stana wrote:

> # HG changeset patch
> # User Iulian Stana <julian.stana at gmail.com>
> # Date 1374750370 -10800
> #      Thu Jul 25 14:06:10 2013 +0300
> # Node ID 2cc0845928f01bf161cb1d4983bd029c7c3e70aa
> # Parent  0000000000000000000000000000000000000000
> Header file for level 0 cmd server API
> 
> This patch includes just the header file of c-hglib API. (client.h)
> 
> This header is basically unchanged from the last patch,
> (http://mercurial.markmail.org/thread/qsfwzmxzzs4ekuuv), but I am resubmitting
> it as part of a wider work where I propose examples for some mercurial commands,
> implemented in terms of these low level API (the "stories" that Matt asked for).
> 
> I made a page where I collected all the address questions.
> (http://mercurial.selenic.com/wiki/C-Hglib/Questions)
> The most important being: 'how do you get exit codes' and 'how do you treat
> filenames w/ spaces'.
> 
> The "level 0", called also the raw level stands for the lowest level of c-glib.
> The main idea is to "pass a raw command string, get unparsed results". Also to
> create all the functions that will be needed on higher levels.
> Now, we don't really get "unparse results", we _do_ parse the stdout of cmdsrv,
> since we identify the channel type.

'Unparsed results' simply means 'the results as if you had simply called system("hg COMMAND ...")'.

> diff --git a/client.h b/client.h
> new file mode 100644
> --- /dev/null
> +++ b/client.h
> @@ -0,0 +1,149 @@
> +#ifndef _CLIENT_H_
> +#define _CLIENT_H_
> +
> +#include <errno.h>
> +#include <stdint.h>
> +#include <sys/types.h>
> +
> +
> +/*
> + * This structure contains the variables for the header.
> + * The channel will stock a letter that will indicate the type of channel.
> + * The length will stock the size of data that will be send or the maxim data
> + * that can be received by the server.
> + * */
> +typedef struct hg_header{
> +    char channel;
> +    uint32_t length;
> +} hg_header;

I'd rather see some constants or an enum for the channel. Not sure what the current C best practice is, but I don't think it's "dump the raw char into a struct field".

> +/*
> + * This structure will be use the handle the connection with the server.
> + * He contains 2 pipes for the biderectional connection.
> + * The childpid, the pid of the process that will open the connection.
> + * The header that will contain the current header-action(channel, length). 
> + */
> +typedef struct hg_handle{
> +    pid_t childpid;
> +    hg_header header;
> +    int p_read;
> +    int p_write;
> +
> +} hg_handle;
> +
> +
> +/*
> + * Open the connection with the mecurial command server.
> + * The handle structure will be allocated.
> + * \param path The path to the repository wherewith I want to create a connection
> + *        NULL argument means the repository in which I am.
> + * \param encoding Will set HGENCODING to the given encoding
> + *        NULL argument means the default encoding. (UTF-8)
> + * \return A handle for this connection. 
> + *        NULL to indicate an error, with errno set appropriately.
> + * errno can be :
> + *      ECONNREFUSED
> + *      EFAULT - for a bad path address
> + *      EINVAL - for a bad encoding
> + *      All the errors that can be provide from exec
> + * */
> +hg_handle *hg_open(const char *path, char *encoding);

EFAULT means a bad memory address, pretty sure that's inappropriate here. Since what you're doing here will be effectively the same as `cd /path/to/repo && hg serve --cmdserver pipe`, probably any of the errors that chdir(2) returns are possible.

What if I want to do non-blocking I/O on the pipe?

> +
> +/*
> + * Close the connection for the given handle.
> + * Erase the handle and free the memory
> + * \param handle The handle of the connection that I want to close
> + * \return   0 if successful
> + *          -1 to indicate an error, with errno set appropriately.
> + * errno can be:
> + *      EINVAL
> + * */
> +int hg_close(hg_handle **handle);

One '*' too many, unless you're planning on closing multiple handles at once...

> +/*
> + * Sending a command to the mercurial command server, through the given handle.
> + * \param handle The handle of the connection, wherewith I want to communicate
> + * \param command An array of pointers to null-terminated strings that represent
> + *                the argument list available to the new command.
> + * \param cmd_size The length of command array.
> + * \return   0 if successful
> + *          -1 to indicate an error, with errno set appropriately.
> + * errno can be:
> + *      EINVAL
> + *      EIO
> + *      ECOMM
> + * */
> +int hg_rawcommand(hg_handle *handle, char *const command[], size_t cmd_size);

cmd_size should come first and be a 4-byte unsigned integer, since that's what the protocol specifies. A varargs (execl()-like) variant would be nice to have too (and this is why cmd_size should come first).

Would it be reasonable to leave out cmd_size entirely and make the function compute it itself?

> +/*
> + * Reading some unparse data from the server. Will read just a 'line', the 
> + * header that is received from server and the data that comes after the header
> + * \param handle The handle of the connection, wherewith I want to communicate
> + * \param buffer A character array where the read content will be stored.
> + * \param sizebuff The number of bytes to read.
> + * \return   The number of bytes that were read.
> + *          -1 to indicate an error, with errno set appropriately.
> + * errno can be:
> + *      EINVAL
> + *      EIO
> + *      ECOMM
> + * */
> +int hg_rawread(hg_handle *handle, char *buffer, size_t sizebuff);

How do I know what channel I'm reading from? Or how many bytes I should try to read? If I have to call hg_header() or hg_channel() to find out (from what's set in struct hg_handle), then I'm trivially subject to race conditions.

Seems like it'd be better to make the caller provide an hg_header * to pass back the response header (as a so-called out parameter).

> +/*
> + * Will write the buffer to server for the connection establish by the handle.
> + * This function will be used when one of the input channels will be received 
> + * from the command server. ('I' or 'L' channels)
> + * \param handle The handle of the connection, wherewith I want to communicate
> + * \param buffer A null terminated character string of the content to write.
> + * \param sizebuff The number of bytes to write 
> + * \return   The number of bytes that were written
> + *          -1 to indicate an error, with errno set appropriately.
> + * errno can be:
> + *      EINVAL
> + *      EIO
> + *      ECOMM
> + * */
> +int hg_rawwrite(hg_handle *handle, const char *buffer, size_t sizebuff);

How do you know what channel you're writing to?

> +/*
> + * Before you read or write data, you will want to know for what kind of data is 
> + * server waiting, an input data or an output data.

Perhaps you meant 'I'nput or 'L'ine-input? The server will never be waiting on output. (It may block, but that's at a lower level than this protocol.)

> + * This function will return the current channel for the connection established
> + * by the handle.
> + * \param handle The handle of the connection, wherewith I want to communicate
> + * \return   0 if successful
> + *          -1 to indicate an error, with errno set appropriately.
> + * errno can be:
> + *      EINVAL
> + *      EIO
> + *      ECOMM
> + * */
> +char hg_channel(hg_handle *handle);

I don't like the existence of this function.

pacem in terris / мир / शान्ति / ‎‫سَلاَم‬ / 平和
Kevin R. Bullock



More information about the Mercurial-devel mailing list