Line ending translation extension
"Martin v. Löwis"
martin at v.loewis.de
Sun Sep 6 06:52:03 UTC 2009
> The win32text extension already has the necessary filters, but I think
> it has an unfortunate name: it seems very Windows-centric whereas this
> problem is not particularly tied to Windows. The names of the filters
> are also strange to me ('cleverencode:' -- why the colon?).
>
> Another problem with win32text is that one must configure it again and
> again -- the settings are not stored in the repository.
>
> *Python people:* are there other problems with win32text? I don't use
> it, so I wonder why you guys are not trying to improve that extension?
I think win32text is completely different from what we want. We don't
want something clever (which finds out all by itself whether to convert
the file or not). We want an explicit definition of eol-style, and we
want the extension to comply with this specification.
For those who haven't been following: each text file should be specified
as either native, LF, or CRLF (although in practice, only native and
CRLF will be used).
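
To make that concrete, here is a minimal sketch of what I mean by an
explicit specification. The name eol_for and the spelling of the style
names as Python strings are my own illustration, not part of win32text
or of any proposed extension. The point is only that the declared style
alone decides the line ending, with no guessing from the file content.

```python
import os

# Hypothetical helper for illustration; not the API of any existing extension.
def eol_for(style, native=os.linesep):
    """Return the working-copy line ending for a declared eol-style."""
    if style == 'native':
        # 'native' follows the platform the working copy lives on.
        return native
    if style == 'LF':
        return '\n'
    if style == 'CRLF':
        return '\r\n'
    # Anything else is a specification error, not something to guess at.
    raise ValueError('unknown eol-style: %r' % (style,))
```

An unknown style is rejected rather than silently treated as one of the
three, which is the opposite of the "clever" approach.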
> It seems to work fine for local clones. For clones of remote
> repositories the .hgeol file is not read -- the extension attempts to
> read it before the changesets have been transferred.
That sounds like a problem.
> I think we need someone who uses Windows and who therefore cares about
> this issue -- as a Linux guy it's still difficult for me to see why this
> is such a problem, except for new files (but they are easy to correct as
> they pop up).
How would you correct them?
You mean, with a second commit? This is tedious.
Also, there is the problem with mixed line endings (i.e. CR characters
being introduced into an LF file).
> def tolf(s, *args):
>     if not util.binary(s):
>         s = s.replace('\r\n', '\n')
>     return s
This doesn't look quite right. Why does it test for binary? (And what
does that test actually do?) If the file is specified as LF, it
shouldn't be binary.
> def tocrlf(s, *args):
>     if not util.binary(s):
>         s = s.replace('\n', '\r\n')
>     return s
What if the file already has CRLF in it? This would then duplicate all
CRs.
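
To show the doubling, and one possible way around it, here is a small
sketch. The names naive_tocrlf and tocrlf_once are made up for this
mail, and I have left out the binary test from the quoted code so that
only the replacement logic is visible. It is one way to make the
conversion idempotent, not necessarily what the extension should do.

```python
def naive_tocrlf(s):
    # Same replacement as the quoted tocrlf: every LF gains a CR,
    # including LFs that are already preceded by one.
    return s.replace('\n', '\r\n')

def tocrlf_once(s):
    # Collapse existing CRLF to LF first, then expand every LF to CRLF,
    # so applying the function twice gives the same result as once.
    return s.replace('\r\n', '\n').replace('\n', '\r\n')

mixed = 'a\r\nb\n'
doubled = naive_tocrlf(mixed)   # 'a\r\r\nb\r\n'
clean = tocrlf_once(mixed)      # 'a\r\nb\r\n'
```

Note that this silently accepts a file with mixed line endings, which
is a separate question from the doubling.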
I'm also missing error checking. In subversion, on Unix, if I put a CR
character into a file with eol-style native, I get
  svn: While preparing '/tmp/x/a.java' for commit
  svn: Inconsistent line ending style
The entire file gets converted only if all lines consistently have
CRLF.
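
A rough sketch of the kind of check I have in mind follows. It mimics
the behaviour I described for svn above, but it is not how subversion
implements it. detect_eol, normalize_to_lf and InconsistentEOL are
hypothetical names, and treating a lone CR as an error is my own
assumption.

```python
class InconsistentEOL(ValueError):
    """Raised when a text file mixes line ending styles."""

def detect_eol(s):
    """Return the single line ending used in s, or None if s has none.

    Raises InconsistentEOL if CRLF and bare LF are mixed, or if a CR
    appears that is not part of a CRLF pair.
    """
    crlf = s.count('\r\n')
    bare_lf = s.count('\n') - crlf
    bare_cr = s.count('\r') - crlf
    if bare_cr or (crlf and bare_lf):
        raise InconsistentEOL('inconsistent line ending style')
    if crlf:
        return '\r\n'
    if bare_lf:
        return '\n'
    return None

def normalize_to_lf(s):
    """Convert s to LF, but only if its line endings are consistent."""
    if detect_eol(s) == '\r\n':
        return s.replace('\r\n', '\n')
    return s
```

With this, a stray CR in an otherwise LF file is refused at commit time
instead of being stored.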
Wrt. error checking, it also seems there is no support for handling
CRLF or LF files.
Regards,
Martin