UTF-8 Byte order marks inserted by hg merge
Adrian Buehlmann
adrian at cadifra.com
Mon Jun 30 04:21:40 CDT 2008
On 30.06.2008 10:26, Brian Wallis wrote:
> First I should state that I am still investigating this problem and am
> still a little unsure as to what happened.
>
> We have a user on Linux (Suse 10.3) running Mercurial 1.0.1 and
> another on Windows Vista running TortoiseHg 0.4, each of whom was
> working on some changes on a branch. When it came time to merge, the
> user on Windows pulled the changes from the other repository and
> merged the two heads. The merged result seemed to be slightly
> corrupted in that there were three extra characters added to the front
> of a few files. These were (in hex) EF BB BF, which is the byte order
> mark for UTF-8.
>
> Two of the files in question were merged without conflict and the
> third required a conflict resolution for which Beyond Compare 3 was
> configured and used. This leads me to think that it was Mercurial
> that inserted the marks, not some other Windows tool.
>
> Looking at the two parent revisions of the files shows no Byte Order
> Marks in the files. I am going to attempt to reproduce the problem
> tomorrow in another similarly configured Windows Vista machine.
>
> Any suggestions would be appreciated.
This shouldn't have anything to do with Mercurial. I bet this
was Notepad.
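
One way to find out which files are affected (and after which editing step) is to look at the first three bytes of each file. A minimal Python sketch, assuming the files are readable locally; the helper name has_utf8_bom is made up for illustration and is not part of Mercurial:

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the bytes EF BB BF."""
    with open(path, "rb") as f:
        # codecs.BOM_UTF8 is the three-byte sequence b'\xef\xbb\xbf'
        return f.read(3) == codecs.BOM_UTF8
```

Running this on the working copy before and after each tool touches a file should show which step adds the mark.
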

I have a file here on Windows that we want stored as UTF-8
*without* a BOM (byte order mark).

I've been bitten numerous times in the past (long before we
switched to Mercurial) by Notepad inserting such a BOM into that file.

My solution is to not use stupid Notepad on that file.
WordPad seems to be better; I currently mostly use the free Notepad++.

So don't use Notepad to edit "UTF-8 files without BOM", or it will
insert a BOM behind your back.
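
If a BOM has already crept into a file, it can be removed by dropping the leading three bytes. A rough Python sketch, assuming the file is small enough to read into memory; both function names are hypothetical helpers, not anything shipped with Mercurial:

```python
import codecs


def strip_utf8_bom(data):
    """Return `data` (bytes) without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_utf8_bom_in_file(path):
    """Rewrite the file only if it starts with a BOM; return True if it changed."""
    with open(path, "rb") as f:
        data = f.read()
    cleaned = strip_utf8_bom(data)
    if cleaned == data:
        return False
    with open(path, "wb") as f:
        f.write(cleaned)
    return True
```

Files without a BOM are left untouched, so the function can be run over a whole directory without creating spurious changes.
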