[PATCH 2 of 2 RFC] c-hglib: hg_log() level 1 function

Iulian Stana julian.stana at gmail.com
Mon Aug 19 11:40:09 CDT 2013


2013/8/19 Idan Kamara <idankk86 at gmail.com>

> On Mon, Aug 19, 2013 at 3:10 PM, Giovanni Gherdovich <
> g.gherdovich at gmail.com> wrote:
>
>>
>>
>> 2013/8/18 Iulian Stana <julian.stana at gmail.com>
>>
>>> # HG changeset patch
>>> # User Iulian Stana <julian.stana at gmail.com>
>>> # Date 1376828454 -10800
>>> #      Sun Aug 18 15:20:54 2013 +0300
>>> # Node ID 04bf311c99b8f4cf4c3e2096b493ee126f861b99
>>> # Parent  21be7b973803e02b0bb310465691f88705a2dd53
>>> c-hglib: hg_log() level 1 function
>>>
>>> The revision history could have a huge mass of data. To deal with this
>>> issue, I
>>> had created a iterator-like mechanism. In this way I will get the
>>> changesets in
>>> chunks or more-like one at the time.
>>>
>>> The hg_log function will prepare the command and then will call
>>> cmd-server for
>>> changesets. This function will return to the user a iterator structure,
>>> to be
>>> used on hg_fetch_log_entry funciton. (The log command will not passing
>>> any
>>> changeset to the user)
>>>
>>> The hg_fetch_log_entry function will read changesets from command server
>>> and
>>> will pass into the hg_log_entry_t structure in a parse way the changeset.
>>> The hg_log_entry_t structure will be one of the function parameter.
>>>
>>> In the main.c file it can be found an example on how this function can
>>> be used.
>>>
>>> diff --git a/client.c b/client.c
>>> new file mode 100644
>>> --- /dev/null
>>> +++ b/client.c
>>> @@ -0,0 +1,134 @@
>>> +#define _GNU_SOURCE
>>> +#include <errno.h>
>>> +#include <fcntl.h>
>>> +#include <stdlib.h>
>>> +#include <stdio.h>
>>> +#include <string.h>
>>> +#include <unistd.h>
>>> +#include <sys/types.h>
>>> +#include <sys/wait.h>
>>> +#include <signal.h>
>>> +
>>> +
>>> +#include "client.h"
>>> +#include "utils.h"
>>> +
>>> +#define HGPATH "hg"
>>> +#define CHANGESET
>>> "{rev}\\0{node}\\0{tags}\\0{branch}\\0{author}\\0{desc}\
>>> +                                               \\0{date|isodate}\\0"
>>> +
>>> +/**
>>> + * \brief A helper for building the command arguments.
>>> + * \param command The name for mercurial command.
>>> + * \param options An array pointer to command option list.
>>> + * \param template Template to be used for this command.
>>> + * \retval new_string An array of pointers, where the command it's
>>> almost ready
>>> + *                    to be send to mercurial cmdserver.
>>> + * */
>>> +char **cmdbuilder(char *command, char *options[], char *template)
>>> +{
>>> +       int cmd_size, size;
>>> +       char **new_cmd;
>>> +       char *temp = "--template";
>>> +
>>> +       for(size = 0; options[size] != NULL; size++);
>>> +       cmd_size = size + 1;
>>> +       new_cmd = malloc(sizeof(char *) * (cmd_size + 3));
>>> +
>>> +       new_cmd[0] = command;
>>> +       memcpy(new_cmd + 1, options, (cmd_size - 1)  * sizeof(char *));
>>> +       if(template){
>>> +               new_cmd[cmd_size] = temp;
>>> +               new_cmd[cmd_size + 1] = template;
>>> +               new_cmd[cmd_size + 2] = NULL;
>>> +       }else{
>>> +               new_cmd[cmd_size] = NULL;
>>> +       }
>>> +
>>> +       return new_cmd;
>>> +}
>>> +
>>> +/**
>>> + * \brief 'Parse a changeset'. It's more like pointing to the correct
>>> position.
>>> + *
>>> + * The changeset could be found on buff pointer. To not duplicate the
>>> data I
>>> + * choose to point every log_entry field to the right position.
>>> + * \param buff The pointer to the changeset.
>>> + * \param le   The log_entry_t structure where the changeset will be
>>> parse.
>>> + * \retval 0 if successful.
>>> + * */
>>> +int parse_changeset(char *buff, hg_log_entry_t *le)
>>> +{
>>> +       /* set pointer for revision position */
>>> +       le->rev = buff;
>>> +
>>> +       /* set pointer for node position */
>>> +       buff = buff + strlen(buff) + 1;
>>> +       le->node = buff;
>>> +
>>> +       /* set pointer for tag position */
>>> +       buff = buff + strlen(buff) + 1;
>>> +       le->tags = buff;
>>> +
>>> +       /* set pointer for branch position */
>>> +       buff = buff + strlen(buff) + 1;
>>> +       le->branch = buff;
>>> +
>>> +       /* set pointer for author position */
>>> +       buff = buff + strlen(buff) + 1;
>>> +       le->author = buff;
>>> +
>>> +       /* set pointer for description position */
>>> +       buff = buff + strlen(buff) + 1;
>>> +       le->desc = buff;
>>> +
>>> +       /* set pointer for data position */
>>> +       buff = buff + strlen(buff) + 1;
>>> +       le->date = buff;
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +/* The high level log command for hglib API. */
>>> +hg_log_iterator *hg_log(hg_handle *handle, char *option[])
>>> +{
>>> +       hg_log_iterator *log_iterator = malloc(sizeof(hg_log_iterator));
>>> +       log_iterator->handle = handle;
>>> +
>>> +       log_iterator->command = cmdbuilder("log", option, CHANGESET);
>>> +
>>> +       if(hg_rawcommand(handle, log_iterator->command) < 0){
>>> +               return NULL;
>>> +       }
>>> +
>>> +       log_iterator->changeset = malloc(0);
>>> +       return log_iterator;
>>>
>>
>>
>> What happens here is that hg_log() does not return a status code
>> (with the meaning "command successfully sent"), but the hg_log_iterator
>> struct instead.
>>
>> A status code would be desirable and idiomatic I think;
>> the struct to be passed as an argument.
>>
>>
>>
>>> +}
>>> +
>>> +/* The iterator next step. Getting the next changeset. */
>>> +int hg_fetch_log_entry(hg_log_iterator *log_iterator, hg_log_entry_t
>>> *le)
>>> +{
>>> +       hg_header head = hg_head(log_iterator->handle);
>>> +
>>> +       if(head.channel == 'r'){
>>> +               free(log_iterator->command);
>>> +               free(log_iterator->changeset);
>>> +               int exitcode = hg_exitcode(log_iterator->handle);
>>> +               free(log_iterator);
>>> +               return exitcode;
>>> +       }
>>> +
>>> +       log_iterator->changeset = realloc(log_iterator->changeset,
>>> head.length);
>>> +
>>> +       if(read(log_iterator->handle->p_read, log_iterator->changeset,
>>> +                                       head.length) < 0){
>>>
>>
>>
>> Here you are not using hg_rawread(). Please explain why.
>> Level 1 functions are meant to build on top of level 0 functions.
>>
>> Also: you are not reading in 4096 B chunks.
>>
>> Also: you are assuming that all information for a given log entry
>> (a revision) are retrieved in a "single shot" of the  'o' channel.
>>
>> You are assuming that a log revision cannot be split across two
>> iteration of output on the 'o' channel.
>>
>> Is this assumption safe?
>>
>
> No.
>

Your answer is not clear... You want to say that the server will
give a revision in more "shots"? From my test I saw that the cmdserver
will send a log entry in a single call ( when I am using a template)
even if that revision have 20 kB.
The header from cmdserver is something like channel = 'o' length = 20000
data = [...]

Or the operation of getting the entire data in one call, is the bad thing?

Also: If there will be more iterations for a revision, How could I know how
much iterations there are?



-- 
Iulian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20130819/ca216f89/attachment.html>


More information about the Mercurial-devel mailing list