Some text files that I open are not readable and show weird signs.
The BOM for these files is ff fe (UTF-16 BE according to Wikipedia).
Vim has fencs set to ucs-bom.
It is Vim version 7.2 on Windows XP.
I can load the files fine if, before I load them, I do
:set encoding=utf-8
but I don't want to change the encoding of the file when saving.
Is there a way to do that?
> Guy Kroizman
Normally, Vim should detect the encoding and remember how to translate
to disk what it has in memory.
'encoding' is the representation of the characters in Vim's internal
memory. UTF-16 cannot be used there because it has too many null bytes,
which would terminate the C strings used by Vim. (BTW, FF FE is the BOM
of UTF-16le, not -be.) UTF-8, on the other hand, is capable of
representing the characters of all charsets used on any computer, so if
you set that, you're safe.
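To make that setting permanent, something like the following could go in your vimrc. This is only a sketch: the 'termencoding' handling is my assumption about what a console Vim needs so that the keyboard and display keep using the old charset, and the 'fileencodings' value is the typical one, not the only possible one.

```vim
" Sketch for a vimrc; place it near the top, before any file is loaded.
if has("multi_byte")
  " Keep talking to the keyboard/terminal in the charset used so far.
  if &termencoding == ""
    let &termencoding = &encoding
  endif
  " Internal representation of text in Vim's memory.
  set encoding=utf-8
  " Detection heuristic for files that already exist on disk.
  set fileencodings=ucs-bom,utf-8,latin1
endif
```

It is best not to change 'encoding' while files are already loaded, since the text in memory is not converted.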
When loading an already existing file, Vim uses a heuristic defined by
the option 'fileencodings' (with s at the end): this is a
comma-separated list of possible charsets, as follows:
- ucs-bom, if used (and it is recommended that it _be_ used) should come
first
- There should be no more than one 8-bit charset, and it should come last
- Charsets are tried from left to right, and the first one which doesn't
give an error signal is used to read the file. (That's why any 8-bit
charset used should be last: such charsets cannot give an error signal).
A typical value is:
:set fencs=ucs-bom,utf-8,latin1
This will correctly detect any Unicode file which has a BOM, or failing
that Vim will try UTF-8, and if the file is not valid UTF-8 the file
will then be shown to you under the assumption that it is Latin1. Vim
stores the disk charset of the file in the local string option
'fileencoding' for that file, and the presence or absence of a BOM in
the local Boolean option 'bomb'. IOW, if the file you mentioned has been
detected correctly, then
:setlocal fenc? bomb?
should answer fileencoding=utf-16le and bomb.
That means everything is OK, and that you don't need to do anything to
record the file in the correct encoding -- :w or :wq will know how to do
the required translation from the UTF-8 representation in memory.
If you see something else, you can save the file in UTF-16le with BOM
(which, then, will *not* be the original charset of the file) by doing
:setlocal fenc=utf-16le bomb
before you save the file. (Use :setlocal, not just :set, because the
latter alters what will happen to _other_ files, especially the new ones
you create thereafter.)
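If the file really is UTF-16le on disk but Vim guessed wrong when reading it (so the text itself looks like garbage), changing 'fenc' alone will not repair what is already in memory. In that case you could try re-reading the file with the charset forced. A sketch, assuming the file is the current buffer and has no unsaved changes:

```vim
" Re-read the current file, forcing the charset instead of relying on
" 'fileencodings'. ++enc is a standard argument of :edit (see :help ++enc).
:edit ++enc=utf-16le
" Check what Vim now believes about the file on disk.
:setlocal fileencoding? bomb?
```

If the text now looks right, a plain :w should write it back in that same charset.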