Encoding issues with Windows gvim

Ritmo2k
I have a UTF-16LE file with a BOM, which I can view on a Linux host with vi without issue:

# file utf16.txt
utf16.txt: Little-endian UTF-16 Unicode text, with CRLF line terminators

# od -t x1 -N 2 utf16.txt
0000000 ff fe
0000002

It opens and appears as expected with vi at the console (see vim_linux.png).

However, on Windows, when I reload it with :e ++ff=dos ++enc=utf-16le it displays as in the attached vim_windows.png. No incantation of settings I can come up with allows me to open and edit this on Windows.

Anyone have an idea how to accomplish this?

Thanks.

--
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

---
You received this message because you are subscribed to the Google Groups "vim_use" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Attachments: vim_linux.png (21K), vim_windows.png (9K)

Re: Encoding issues with Windows gvim

Tony Mechelynck
On Sat, Sep 9, 2017 at 8:08 PM, Joseph L. Casale <[hidden email]> wrote:

> I have a UTF-16LE file with a BOM, which I can view on a Linux host with vi without issue:
>
> # file utf16.txt
> utf16.txt: Little-endian UTF-16 Unicode text, with CRLF line terminators
>
> # od -t x1 -N 2 utf16.txt
> 0000000 ff fe
> 0000002
>
> It opens and appears as expected with vi at the console (see vim_linux.png).
>
> However, on Windows, when I reload it with :e ++ff=dos ++enc=utf-16le it displays as in the attached vim_windows.png. No incantation of settings I can come up with allows me to open and edit this on Windows.
>
> Anyone have an idea how to accomplish this?
>
> Thanks.

In your Windows gvim, at the point where you would be reading your
problematic file, do instead

        :verbose set enc?

If the answer is anything other than utf-8, then you cannot display
the file in gvim because the UTF-16le of the file cannot be translated
into whatever it is that gvim is using to represent characters in
memory.

See http://vim.wikia.com/wiki/Working_with_Unicode

Best regards,
Tony.

Re: Encoding issues with Windows gvim

Ritmo2k
On Saturday, September 9, 2017 at 12:16:27 PM UTC-6, Tony Mechelynck wrote:

> In your Windows gvim, at the point where you would be reading your
> problematic file, do instead
>
>         :verbose set enc?
>
> If the answer is anything other than utf-8, then you cannot display
> the file in gvim because the UTF-16le of the file cannot be translated
> into whatever it is that gvim is using to represent characters in
> memory.
>
> See http://vim.wikia.com/wiki/Working_with_Unicode

Hi Tony,
Executing ":verbose set enc?" showed latin1. To be honest, after reading the doc my minimal understanding of the topic became even murkier. The sections of the manual around "*45.4*  Editing files with a different encoding" helped, but I am still unclear.

After setting an appropriate Unicode font in my vimrc (set guifont=courier_new:h11) and opening the file with ":e ++enc=utf-16le utf16.txt", the file loaded with conversion errors (all upside-down question marks). Executing ":set encoding=utf-16le" and reloading once more with ":e ++enc=utf-16le utf16.txt" worked; I can now view the file.

Why didn't opening the file with "++enc=utf-16le" accomplish all that ":set encoding=utf-16le" did?

Thanks a lot for the help.

Re: Encoding issues with Windows gvim

Tony Mechelynck
On Sat, Sep 9, 2017 at 9:26 PM, Joseph L. Casale <[hidden email]> wrote:

> On Saturday, September 9, 2017 at 12:16:27 PM UTC-6, Tony Mechelynck wrote:
>> In your Windows gvim, at the point where you would be reading your
>> problematic file, do instead
>>
>>         :verbose set enc?
>>
>> If the answer is anything other than utf-8, then you cannot display
>> the file in gvim because the UTF-16le of the file cannot be translated
>> into whatever it is that gvim is using to represent characters in
>> memory.
>>
>> See http://vim.wikia.com/wiki/Working_with_Unicode
>
> Hi Tony,
> Executing ":verbose set enc?" showed latin1. To be honest, after reading the doc my minimal understanding of the topic became even murkier. The sections of the manual around "*45.4*  Editing files with a different encoding" helped, but I am still unclear.
>
> After setting an appropriate Unicode font in my vimrc (set guifont=courier_new:h11) and opening the file with ":e ++enc=utf-16le utf16.txt", the file loaded with conversion errors (all upside-down question marks). Executing ":set encoding=utf-16le" and reloading once more with ":e ++enc=utf-16le utf16.txt" worked; I can now view the file.
>
> Why didn't opening the file with "++enc=utf-16le" accomplish all that ":set encoding=utf-16le" did?
>
> Thanks a lot for the help.

Despite its name, ++enc sets 'fileencoding' (telling Vim which charset
is used _on disk_ for that file), not 'encoding' (the charset used for
the data _in Vim memory_); the latter, if you don't change it, is
still set to latin1, which has no representation for Greek letters.
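
To see the two options side by side, something like the following can be typed in the Windows gvim. This is only a sketch: it assumes the file is named utf16.txt, as in the original post, and sits in the current directory.

```vim
" Reload the file, telling Vim how it is encoded on disk.
:edit ++enc=utf-16le ++ff=dos utf16.txt

" Buffer-local settings: on-disk charset, BOM flag, line-ending style.
:setlocal fileencoding? bomb? fileformat?

" Global setting: the charset Vim uses for text held in memory.
" The ++enc argument above does not touch this one.
:set encoding?
```

If the last command still reports latin1, the conversion from the file's charset to the in-memory charset is where the Greek letters get lost.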

When you did ":set enc=utf-16le", Vim actually used utf-8, because
UTF-16le uses a lot of null bytes (one each for every codepoint not
greater than U+00FF, for instance spaces, tabs, commas, etc.) and Vim
uses C strings, which those null bytes would terminate. UTF-8, like
UTF-16, can represent data in any encoding including the Greek text of
your problematic file. Your Linux Vim probably runs in a UTF-8 locale
(something many Linux systems use) which would explain why your Greek
text was immediately readable on Linux.

But if you change 'encoding' while some file (even maybe just a help
file) is already loaded in memory, all the data in memory becomes
invalid. The only safe place to change 'encoding' is near the top of
your vimrc, before any editfile has been read, and there are other
changes that go with it.
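
As a rough illustration of what "near the top of your vimrc" might look like, here is a minimal sketch along the lines of what the wiki article describes. The particular 'fileencodings' list is an assumption and should be adapted to the files you actually edit; it is not the only workable setup.

```vim
" Place this near the top of the vimrc, before anything loads a file.
if has('multi_byte')
  " Remember the old value for keyboard/terminal input before changing
  " 'encoding' (mainly matters for console Vim; harmless in gvim).
  if &termencoding ==# ''
    let &termencoding = &encoding
  endif

  " Charset used for all text held in Vim's memory.
  set encoding=utf-8

  " Charsets tried in order when reading an existing file.
  " ucs-bom comes first so files that start with a BOM (such as the
  " UTF-16LE file in this thread) are recognised from the BOM.
  set fileencodings=ucs-bom,utf-8,latin1
endif
```

With something like this in place, a plain ":e utf16.txt" should detect the BOM without any ++enc argument, and ":verbose set enc?" should report utf-8 as last set from the vimrc.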

Please read the Vim wiki article linked in my previous post; it will
tell you how to do it safely, and explain in detail the differences
between the various encoding-related options that Vim possesses.

Best regards,
Tony.

Re: Encoding issues with Windows gvim

Ritmo2k
On Sunday, September 10, 2017 at 12:28:12 AM UTC-6, Tony Mechelynck wrote:

> Despite its name, ++enc sets 'fileencoding' (telling Vim which charset
> is used _on disk_ for that file), not 'encoding' (the charset used for
> the data _in Vim memory_); the latter, if you don't change it, is
> still set to latin1, which has no representation for Greek letters.
>
> When you did ":set enc=utf-16le", Vim actually used utf-8, because
> UTF-16le uses a lot of null bytes (one each for every codepoint not
> greater than U+00FF, for instance spaces, tabs, commas, etc.) and Vim
> uses C strings, which those null bytes would terminate. UTF-8, like
> UTF-16, can represent data in any encoding including the Greek text of
> your problematic file. Your Linux Vim probably runs in a UTF-8 locale
> (something many Linux systems use) which would explain why your Greek
> text was immediately readable on Linux.
>
> But if you change 'encoding' while some file (even maybe just a help
> file) is already loaded in memory, all the data in memory becomes
> invalid. The only safe place to change 'encoding' is near the top of
> your vimrc, before any editfile has been read, and there are other
> changes that go with it.
>
> Please read the Vim wiki article linked in my previous post; it will
> tell you how to do it safely, and explain in detail the differences
> between the various encoding-related options that Vim possesses.

It seems I missed some vital points that I now understand.
Thanks for the patience Tony.
