Line ending translation extension

Stephen J. Turnbull stephen at xemacs.org
Sun Sep 6 08:41:05 CDT 2009


Mark Hammond writes:

 > See my other reply - I was simply suggesting non-Windows users could try 
 > using a repository which tries to keep text files with \r\n EOLs to try 
 > and get a feeling for how Windows users operate.

Aha.  But that won't work, except to the extent that the conversion
extension is buggy.  My tools of choice either ignore the CRs, or hide
(but preserve) them.  This is the rule on Unix, for various reasons.

 > >   >  blaming Windows or Windows users is somewhat disingenuous
 > >
 > > Would you please cut that out?
 > 
 > I'm not sure what you want me to cut out exactly;

The repeated "don't blame Windows."  It *is* a problem for Windows but
*not* for Unix.  It would be a PITA to *create* CRLF files on Unix,
but once a file exists, it doesn't matter what the EOL is; we only
care that it stays the same to keep diff happy.
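
To illustrate why diff only cares that the EOL convention stays put,
here is a rough sketch using Python's standard difflib.  The helper
name changed_lines and the sample text are made up for illustration;
nothing here is taken from the extension itself.

```python
import difflib


def changed_lines(old: str, new: str) -> int:
    """Count the lines of `old` that a line-based diff reports as removed."""
    diff = difflib.ndiff(old.splitlines(keepends=True),
                         new.splitlines(keepends=True))
    return sum(1 for line in diff if line.startswith("- "))


original = "alpha\r\nbeta\r\ngamma\r\n"

# A one-line edit that keeps the CRLF convention touches one line.
edited_same_eol = "alpha\r\nBETA\r\ngamma\r\n"

# The same content with the EOLs silently converted to LF touches every line.
converted_eol = original.replace("\r\n", "\n")

print(changed_lines(original, edited_same_eol))
print(changed_lines(original, converted_eol))
```

The second count covers the whole file even though no visible text
changed, which is the noise we want to avoid.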

 > I agree it is tedious to hear me continue to point out that this
 > isn't an imagined problem, but I'm simply responding to what is
 > continually asserted.

What I see people asserting is quite different: (a) that they don't
experience it themselves, and (b) that they don't understand how the
problem can arise frequently enough to worry about.  Some of those
people develop primarily for and with Windows!

This is important to the design.  AIUI, Martin v. Löwis is saying
"right, it's pretty rare, but *extremely* annoying when it does
happen, so let's do something 100% simple and 99% usable to cut that
down to 'almost never'."  And that's probably good enough IMO.  OTOH,
you are evidently more concerned about it, and give the impression
that you think it's a harder problem, as well.  If "harder problem"
turns out to be the case, a deeper/more complex solution may be
warranted.

AFAICS there are three simple solutions.

In all three, files are split into "text" and "binary".  Binary files
are always checked out and committed verbatim.  The methods differ in
how they treat text files.

Method A: Text files are labeled according to target platform
convention: LF, CRLF, or native.  LF and CRLF are checked out as
binary, checked for the proper conventions on check in, and converted
as necessary.  This would give some additional flexibility, and better
checking, but it's more complex.  Visual Studio project files would be
labeled as *CRLF*.  This seems the leading candidate.
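
A minimal sketch of Method A, to make the labeling concrete.  The
label constants and the function names checkin/checkout are
hypothetical, and I am assuming that "native" files are stored with
LF in the repository; none of this is the extension's actual API.

```python
import os

# Hypothetical per-file labels; BINARY files are never touched.
LF, CRLF, NATIVE, BINARY = "LF", "CRLF", "native", "binary"


def to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")


def to_crlf(data: bytes) -> bytes:
    """Convert every line ending to CRLF without doubling existing CRs."""
    return to_lf(data).replace(b"\n", b"\r\n")


def checkin(data: bytes, label: str) -> bytes:
    """Return the bytes to store for a file carrying this label."""
    if label == BINARY:
        return data
    if label == CRLF:
        return to_crlf(data)
    # Assumption: LF and native files are both stored with LF.
    return to_lf(data)


def checkout(data: bytes, label: str, linesep: str = os.linesep) -> bytes:
    """Return the working-copy bytes; only native files are converted."""
    if label == NATIVE and linesep == "\r\n":
        return to_crlf(data)
    # LF, CRLF and binary files are written out verbatim.
    return data
```

With this, a Visual Studio project file labeled CRLF comes out with
CRLF on every platform, and a stray LF in it is fixed at check-in.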

Method B: Text files are always decoded and reencoded according to
user preference, defaulting to platform convention.  Visual Studio
project files would be classed as *binary*.  This is simpler and
flexible but allows less fine-grained checks.

Method C: Text files are always decoded and reencoded according to
platform conventions.  Visual Studio project files would be classed as
*text*.  Perhaps the simplest approach.  However, this depends on
people working on platforms that care about the EOL convention to use
that convention.  People on Windows who prefer LF would have to do
something special for those files when they build, or live with CRLF
files throughout the tree.
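
Methods B and C differ only in where the working-copy convention
comes from, which a short sketch may make clearer.  The function
names are hypothetical, and I am assuming the repository stores text
files with LF.

```python
import os
from typing import Optional


def working_copy_eol(user_preference: Optional[str] = None,
                     platform_eol: str = os.linesep) -> bytes:
    """Method B passes the user's preference; Method C always passes None."""
    return (user_preference or platform_eol).encode("ascii")


def decode(data: bytes) -> bytes:
    """Working copy -> repository: normalize to LF (assumed storage form)."""
    return data.replace(b"\r\n", b"\n")


def encode(data: bytes, eol: bytes) -> bytes:
    """Repository -> working copy: rewrite every line ending as `eol`."""
    return decode(data).replace(b"\n", eol)
```

Under Method C a Windows user who prefers LF has no knob to turn,
which is the limitation described above.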

For all of these, a belt-and-suspenders push-time check for text files
is that the EOL convention not change across a commit.  This could be
disabled by a --force flag.
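
The check itself could look roughly like the following.  This is
only a sketch: eol_style, eol_changed and check_commit are made-up
names, the force argument stands in for the proposed --force flag,
and how mixed-EOL files should be treated is my assumption.

```python
def eol_style(data: bytes) -> str:
    """Classify a file's line endings as 'LF', 'CRLF', 'mixed' or 'none'."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # bare LFs not preceded by CR
    if crlf and lf:
        return "mixed"
    if crlf:
        return "CRLF"
    if lf:
        return "LF"
    return "none"


def eol_changed(old: bytes, new: bytes) -> bool:
    """True if the EOL convention differs between two revisions of a file."""
    before, after = eol_style(old), eol_style(new)
    # A file with no line endings at all cannot violate the convention.
    if "none" in (before, after):
        return False
    return before != after


def check_commit(old: bytes, new: bytes, force: bool = False) -> None:
    """Reject a commit that changes the EOL convention unless forced."""
    if eol_changed(old, new) and not force:
        raise ValueError(
            "EOL convention changed from %s to %s"
            % (eol_style(old), eol_style(new))
        )
```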




More information about the Mercurial-devel mailing list