[PATCH 2 of 2 RFC] c-hglib: hg_log() level 1 function

Iulian Stana julian.stana at gmail.com
Mon Aug 19 14:53:43 CDT 2013


2013/8/19 Idan Kamara <idankk86 at gmail.com>

> On Mon, Aug 19, 2013 at 7:40 PM, Iulian Stana <julian.stana at gmail.com>wrote:
>
>> 2013/8/19 Idan Kamara <idankk86 at gmail.com>
>>
>>> On Mon, Aug 19, 2013 at 3:10 PM, Giovanni Gherdovich <
>>> g.gherdovich at gmail.com> wrote:
>>>
>>>>
>>>>
>>>> 2013/8/18 Iulian Stana <julian.stana at gmail.com>
>>>>
>>>>> # HG changeset patch
>>>>> # User Iulian Stana <julian.stana at gmail.com>
>>>>> # Date 1376828454 -10800
>>>>> #      Sun Aug 18 15:20:54 2013 +0300
>>>>> # Node ID 04bf311c99b8f4cf4c3e2096b493ee126f861b99
>>>>> # Parent  21be7b973803e02b0bb310465691f88705a2dd53
>>>>> c-hglib: hg_log() level 1 function
>>>>>
>>>>> The revision history could have a huge mass of data. To deal with this
>>>>> issue, I
>>>>> had created a iterator-like mechanism. In this way I will get the
>>>>> changesets in
>>>>> chunks or more-like one at the time.
>>>>>
>>>>> The hg_log function will prepare the command and then will call
>>>>> cmd-server for
>>>>> changesets. This function will return to the user a iterator
>>>>> structure, to be
>>>>> used on hg_fetch_log_entry funciton. (The log command will not passing
>>>>> any
>>>>> changeset to the user)
>>>>>
>>>>> The hg_fetch_log_entry function will read changesets from command
>>>>> server and
>>>>> will pass into the hg_log_entry_t structure in a parse way the
>>>>> changeset.
>>>>> The hg_log_entry_t structure will be one of the function parameter.
>>>>>
>>>>> In the main.c file it can be found an example on how this function can
>>>>> be used.
>>>>>
>>>>> diff --git a/client.c b/client.c
>>>>> new file mode 100644
>>>>> --- /dev/null
>>>>> +++ b/client.c
>>>>> @@ -0,0 +1,134 @@
>>>>> +#define _GNU_SOURCE
>>>>> +#include <errno.h>
>>>>> +#include <fcntl.h>
>>>>> +#include <stdlib.h>
>>>>> +#include <stdio.h>
>>>>> +#include <string.h>
>>>>> +#include <unistd.h>
>>>>> +#include <sys/types.h>
>>>>> +#include <sys/wait.h>
>>>>> +#include <signal.h>
>>>>> +
>>>>> +
>>>>> +#include "client.h"
>>>>> +#include "utils.h"
>>>>> +
>>>>> +#define HGPATH "hg"
>>>>> +#define CHANGESET
>>>>> "{rev}\\0{node}\\0{tags}\\0{branch}\\0{author}\\0{desc}\
>>>>> +                                               \\0{date|isodate}\\0"
>>>>> +
>>>>> +/**
>>>>> + * \brief A helper for building the command arguments.
>>>>> + * \param command The name for mercurial command.
>>>>> + * \param options An array pointer to command option list.
>>>>> + * \param template Template to be used for this command.
>>>>> + * \retval new_string An array of pointers, where the command it's
>>>>> almost ready
>>>>> + *                    to be send to mercurial cmdserver.
>>>>> + * */
>>>>> +char **cmdbuilder(char *command, char *options[], char *template)
>>>>> +{
>>>>> +       int cmd_size, size;
>>>>> +       char **new_cmd;
>>>>> +       char *temp = "--template";
>>>>> +
>>>>> +       for(size = 0; options[size] != NULL; size++);
>>>>> +       cmd_size = size + 1;
>>>>> +       new_cmd = malloc(sizeof(char *) * (cmd_size + 3));
>>>>> +
>>>>> +       new_cmd[0] = command;
>>>>> +       memcpy(new_cmd + 1, options, (cmd_size - 1)  * sizeof(char *));
>>>>> +       if(template){
>>>>> +               new_cmd[cmd_size] = temp;
>>>>> +               new_cmd[cmd_size + 1] = template;
>>>>> +               new_cmd[cmd_size + 2] = NULL;
>>>>> +       }else{
>>>>> +               new_cmd[cmd_size] = NULL;
>>>>> +       }
>>>>> +
>>>>> +       return new_cmd;
>>>>> +}
>>>>> +
>>>>> +/**
>>>>> + * \brief 'Parse a changeset'. It's more like pointing to the correct
>>>>> position.
>>>>> + *
>>>>> + * The changeset could be found on buff pointer. To not duplicate the
>>>>> data I
>>>>> + * choose to point every log_entry field to the right position.
>>>>> + * \param buff The pointer to the changeset.
>>>>> + * \param le   The log_entry_t structure where the changeset will be
>>>>> parse.
>>>>> + * \retval 0 if successful.
>>>>> + * */
>>>>> +int parse_changeset(char *buff, hg_log_entry_t *le)
>>>>> +{
>>>>> +       /* set pointer for revision position */
>>>>> +       le->rev = buff;
>>>>> +
>>>>> +       /* set pointer for node position */
>>>>> +       buff = buff + strlen(buff) + 1;
>>>>> +       le->node = buff;
>>>>> +
>>>>> +       /* set pointer for tag position */
>>>>> +       buff = buff + strlen(buff) + 1;
>>>>> +       le->tags = buff;
>>>>> +
>>>>> +       /* set pointer for branch position */
>>>>> +       buff = buff + strlen(buff) + 1;
>>>>> +       le->branch = buff;
>>>>> +
>>>>> +       /* set pointer for author position */
>>>>> +       buff = buff + strlen(buff) + 1;
>>>>> +       le->author = buff;
>>>>> +
>>>>> +       /* set pointer for description position */
>>>>> +       buff = buff + strlen(buff) + 1;
>>>>> +       le->desc = buff;
>>>>> +
>>>>> +       /* set pointer for data position */
>>>>> +       buff = buff + strlen(buff) + 1;
>>>>> +       le->date = buff;
>>>>> +
>>>>> +       return 0;
>>>>> +}
>>>>> +
>>>>> +/* The high level log command for hglib API. */
>>>>> +hg_log_iterator *hg_log(hg_handle *handle, char *option[])
>>>>> +{
>>>>> +       hg_log_iterator *log_iterator =
>>>>> malloc(sizeof(hg_log_iterator));
>>>>> +       log_iterator->handle = handle;
>>>>> +
>>>>> +       log_iterator->command = cmdbuilder("log", option, CHANGESET);
>>>>> +
>>>>> +       if(hg_rawcommand(handle, log_iterator->command) < 0){
>>>>> +               return NULL;
>>>>> +       }
>>>>> +
>>>>> +       log_iterator->changeset = malloc(0);
>>>>> +       return log_iterator;
>>>>>
>>>>
>>>>
>>>> What happens here is that hg_log() does not return a status code
>>>> (with the meaning "command successfully sent"), but the hg_log_iterator
>>>> struct instead.
>>>>
>>>> A status code would be desirable and idiomatic I think;
>>>> the struct to be passed as an argument.
>>>>
>>>>
>>>>
>>>>> +}
>>>>> +
>>>>> +/* The iterator next step. Getting the next changeset. */
>>>>> +int hg_fetch_log_entry(hg_log_iterator *log_iterator, hg_log_entry_t
>>>>> *le)
>>>>> +{
>>>>> +       hg_header head = hg_head(log_iterator->handle);
>>>>> +
>>>>> +       if(head.channel == 'r'){
>>>>> +               free(log_iterator->command);
>>>>> +               free(log_iterator->changeset);
>>>>> +               int exitcode = hg_exitcode(log_iterator->handle);
>>>>> +               free(log_iterator);
>>>>> +               return exitcode;
>>>>> +       }
>>>>> +
>>>>> +       log_iterator->changeset = realloc(log_iterator->changeset,
>>>>> head.length);
>>>>> +
>>>>> +       if(read(log_iterator->handle->p_read, log_iterator->changeset,
>>>>> +                                       head.length) < 0){
>>>>>
>>>>
>>>>
>>>> Here you are not using hg_rawread(). Please explain why.
>>>> Level 1 functions are meant to build on top of level 0 functions.
>>>>
>>>> Also: you are not reading in 4096 B chunks.
>>>>
>>>> Also: you are assuming that all information for a given log entry
>>>> (a revision) are retrieved in a "single shot" of the  'o' channel.
>>>>
>>>> You are assuming that a log revision cannot be split across two
>>>> iteration of output on the 'o' channel.
>>>>
>>>> Is this assumption safe?
>>>>
>>>
>>> No.
>>>
>>
>> Your answer is not clear... You want to say that the server will
>> give a revision in more "shots"? From my test I saw that the cmdserver
>> will send a log entry in a single call ( when I am using a template)
>> even if that revision have 20 kB.
>> The header from cmdserver is something like channel = 'o' length = 20000
>> data = [...]
>>
>
> The command server is oblivious to the data it writes. It simply forwards
> whatever hg normally writes to stdout, but in the command protocol.
> So nothing guarantees what you're seeing, even though that's what happens
> in practice.
>
>
>> Or the operation of getting the entire data in one call, is the bad thing?
>>
>
> What data is that? All the output generated by 'hg log ..'? Or the data
> for one
> log entry?
>

By data I mean a log entry in a call... What I understand from your
python-hglib
implementation, on the hg log command you are doing something similar. In
 a
read call from command server you will receive a changeset(a log entry).
Something like this I am trying to make, getting a changeset in a single
call.


>
>
>>
>> Also: If there will be more iterations for a revision, How could I know
>> how
>> much iterations there are?
>>
>
> Your template defines how to parse the output data. The result channel
> tells you the current command ended.
>

What I understand from this is that, if you are using such kind of template
the command server will pass a changeset in a single call.

What I am trying to ask is what happens if the command server will
send a changeset in 2 calls, or more like if the hg could perform this
kind of operation?


-- 
Iulian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20130819/d6280051/attachment.html>


More information about the Mercurial-devel mailing list