Line ending translation extension

"Martin v. Löwis" martin at v.loewis.de
Mon Sep 7 05:22:02 CDT 2009


>  > > And the design very likely has to change, to deal with the
>  > > decentralized, content-oriented behavior of DVCSes.  It seems to me
>  > > that your question about hg diff acknowledges that the proposed design
>  > > is incomplete in this sense.
>  > 
>  > I don't see how this has anything to do with the decentralized behavior.
>  > Can you please be more specific?
> 
> In a centralized VCS doing a diff means checking out one or more
> versions of each file.  The main cost is network transport.  So
> there's no reason to look for a fast path locally AFAICS, no reason to
> have different ways to checkout.  So both versions will have been
> treated by the same checkout filters and the diff is valid and
> efficient.
> 
> In a distributed VCS, however, there may be a fast path.

Sorry, but this is nonsense. In the centralized VCS, there may be
a fast path, too. E.g. in subversion, a diff of the working copy
against the base revision does not involve any network communication.

> For example,
> in git you can first check for equality with the SHA1.  This is not
> useful if you are comparing a checked-out file that has been filtered
> with a committed version, and thus users who configure a non-identity
> filter will get inefficient performance compared to those with
> identity filters.

That's an implementation detail. git *could* generate a checksum after
output filters have stored the file on disk, to also know what the hash
is of the checked-out version with no modifications. Or, you could
filter it (which you may do, anyway, when doing the diff), and compute
the SHA1 - this might still be faster than creating the diff.
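
To illustrate the second option, here is a minimal sketch in Python. It
is not how git is implemented. It only assumes that a blob id is the
SHA-1 of a "blob <size>" header, a NUL byte, and the content, and that
the check-in filter is a plain CRLF-to-LF conversion. The function
names (git_blob_sha1, to_repo_form, unchanged) are made up for the
example.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    # git hashes a blob as SHA-1 over b"blob <size>" + NUL + content.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def to_repo_form(working: bytes) -> bytes:
    # Hypothetical check-in filter: convert CRLF line endings to LF.
    return working.replace(b"\r\n", b"\n")


def unchanged(working: bytes, committed_sha1: str) -> bool:
    # Filter the working-copy content first, then compare hashes.
    # Only when this returns False does a real diff need to be computed.
    return git_blob_sha1(to_repo_form(working)) == committed_sha1


committed_id = git_blob_sha1(b"hello\nworld\n")
print(unchanged(b"hello\r\nworld\r\n", committed_id))  # same content, CRLF on disk
print(unchanged(b"hello\r\nthere\r\n", committed_id))  # really modified
```

Filtering plus hashing is a single pass over the file, so the equality
check stays cheap even with a non-identity filter.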

> Again, if you are comparing two checked-in
> versions, it may make sense to never actually write the file, and
> simply compare repo contents of one blob with another.

How is that different in a centralized VCS? In subversion, you send
the diff request over the network (IIUC), and it sends you back the
diff, (obviously) using the in-repo representation.

>  > No, he can't - because then the extension will become unmaintained.
>  > So the situation would be just like win32ext: it is there, it is
>  > unmaintained, and it doesn't quite work.
> 
> This is no different from the current situation, where refusal to work
> leaves us with win32ext, except that the not quite working
> unmaintained extension might be a marked improvement over win32ext.

So clearly, we need a maintained version first.

>  > This method (treating project files as binary) can't work since it
>  > allows people to introduce mixed eol styles into the file, which
>  > would break the tools that want to process the files.
> 
> See?  I told you you knew more about the problem than me.

I always said that I'm willing to test, and report problems, and so on.
I'm not willing to maintain some part of Mercurial just because I'm
using it - just as I'm not willing to maintain a part of
Subversion just because I'm using it.

I *am* willing to maintain the subversion hooks that I have developed as
part of the subversion migration (that's because I'm in charge of the
subversion installation). So I feel that whoever is in charge of the
Mercurial installation (and somebody needs to be in charge) also must
make sure that all code that we use is maintained - either by the
Mercurial team, or by our Mercurial admin, or by some other volunteer.

> I'm trying to investigate!  That's why I'm asking for URLs!  I don't
> even know where to find the Print Options menu on Vista, and my normal
> tools on Linux and Mac OS X handle CRLF and LF transparently.  *There
> are no problems visible where I live, and there never will be.*

Ok, then install the proposed extension, add an .hgeol file, and set
your preferred eol-style to CRLF. Then see whether it works.

> If you want me to move over to Windows

I don't think that's necessary. The proposed extension is designed
so that you can evaluate it on Unix as well.

>  > FWIW, git supports the crlf attribute, which is very similar to
>  > the proposed feature.
> 
> No, it's very similar to win32ext, with a little bit of additional
> safety.  Quoting from git-config(1) (emphasis added):
> 
>     core.autocrlf
>         If true, makes git convert CRLF at the end of lines in text
>         files to LF when reading from the filesystem, and convert in
>         reverse when writing to the filesystem. The variable can be
>         set to input, in which case the conversion happens only while
>         reading from the filesystem but files are written out with LF
>         at the end of lines. CURRENTLY, WHICH PATHS TO CONSIDER "TEXT"
>         (I.E. BE SUBJECTED TO THE AUTOCRLF MECHANISM) IS DECIDED
>         PURELY BASED ON THE CONTENTS.

I don't think this is accurate, given this quote from gitattributes(5):

       crlf
           This attribute controls the line-ending convention.

           Set
               Setting the crlf attribute on a path is meant to mark
               the path as a "text" file.  core.autocrlf conversion
               takes place WITHOUT GUESSING the content type by
               inspection.

           Unset
               Unsetting the crlf attribute on a path tells git
               not to attempt any end-of-line conversion upon
               checkin or checkout.

           Unspecified
               Unspecified crlf attribute tells git to apply the
               core.autocrlf conversion when the file content
               looks like text.

My guess is that it started out the way you quote, was found
insufficient, and then support for explicit specification was added.
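
The three attribute states quoted above amount to the following
decision, sketched here in Python for the case where core.autocrlf is
enabled. This is only my reading of the man page, not git's code. The
names should_convert and looks_like_text are hypothetical, and the "no
NUL byte" test is a stand-in for whatever content inspection git
really does.

```python
from typing import Optional


def looks_like_text(data: bytes) -> bool:
    # Assumption: a crude stand-in heuristic, not git's actual check.
    return b"\0" not in data


def should_convert(crlf_attr: Optional[bool], data: bytes) -> bool:
    # crlf_attr is True for "Set", False for "Unset", None for "Unspecified".
    if crlf_attr is True:
        return True   # marked as text: convert without guessing
    if crlf_attr is False:
        return False  # never attempt end-of-line conversion
    return looks_like_text(data)  # unspecified: fall back to guessing


print(should_convert(True, b"\0binary-looking"))
print(should_convert(False, b"plain text\r\n"))
print(should_convert(None, b"plain text\r\n"))
```

Only the "Unspecified" case depends on the contents, which is the
point of the explicit attribute.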

> I didn't make that argument.  My argument is that the feature needs
> different implementation, and probably broader coverage of commands,
> for DVCS rather than CVCS.

Ah, ok. That may well be. I don't know Mercurial well enough to tell
what commands should be modified in what way, and whether it is at
all possible to provide the feature by means of an extension.

Regards,
Martin


More information about the Mercurial-devel mailing list