multi-encoding problem, an unknown char inserted at the front of a new file

Hua Yanghao-2

Hi,

(Only after sending this email to [hidden email] did I realize that this
question is more appropriate for this group. I'm sorry if you have
received both messages.)
If I use the following multi-encoding script in my vimrc, then when I
edit a new file, an unknown character seems to be inserted at the start
of the file.

The multi-encoding script is:
" multi-encoding setting
if has("multi_byte")
set bomb
set fileencodings=ucs-bom,utf-8,cp936,big5,euc-jp,euc-kr,latin1

" CJK environment detection and corresponding setting
if v:lang =~ "^zh_CN"

" Use cp936 to support GBK, euc-cn == gb2312
set encoding=cp936
set termencoding=cp936
set fileencoding=cp936
endif

" Detect UTF-8 locale, and replace CJK setting if needed
if v:lang =~ "utf8$" || v:lang =~ "UTF-8$"
set encoding=utf-8
set termencoding=utf-8
set fileencoding=utf-8
endif
else
echoerr "Sorry, this version of (g)vim was not compiled with multi_byte"
endif
" end of script

$ vim a_new_file
# save and exit without typing anything into the file
$ vim a_new_file
# do a ":%!xxd" in vim, and the result is:
0000000: efbb bf                                  ...
# if the script was not used, an empty file would have nothing inside.
Without the multi-encoding script, the problem does not occur. The extra
bytes do no harm when I write documentation in plain text, but when I
write programs in C or LaTeX, the file fails to compile because of that
very first character.

my vim version:
VIM - Vi IMproved 7.1 (2007 May 12, compiled Sep 20 2007 21:19:17)
Included patches: 1-42
Modified by Gentoo-7.1.042
Compiled by root@grass
Huge version without GUI.  Features included (+) or not (-):
+arabic +autocmd -balloon_eval -browse ++builtin_terms +byte_offset +cindent
+clientserver +clipboard +cmdline_compl +cmdline_hist +cmdline_info +comments
+cryptv +cscope +cursorshape +dialog_con +diff +digraphs -dnd -ebcdic
+emacs_tags +eval +ex_extra +extra_search +farsi +file_in_path +find_in_path
+folding -footer +fork() +gettext -hangul_input +iconv +insert_expand +jumplist
+keymap +langmap +libcall +linebreak +lispindent +listcmds +localmap +menu
+mksession +modify_fname +mouse -mouseshape +mouse_dec +mouse_gpm
-mouse_jsbterm +mouse_netterm +mouse_xterm +multi_byte +multi_lang -mzscheme
-netbeans_intg -osfiletype +path_extra +perl +postscript +printer +profile
+python +quickfix +reltime +rightleft -ruby +scrollbind +signs +smartindent
-sniff +statusline -sun_workshop +syntax +tag_binary +tag_old_static
-tag_any_white -tcl +terminfo +termresponse +textobjects +title -toolbar
+user_commands +vertsplit +virtualedit +visual +visualextra +viminfo +vreplace
+wildignore +wildmenu +windows +writebackup +X11 +xfontset -xim +xsmp_interact
+xterm_clipboard -xterm_save
   system vimrc file: "/etc/vim/vimrc"
     user vimrc file: "$HOME/.vimrc"
      user exrc file: "$HOME/.exrc"
  fall-back for $VIM: "/usr/share/vim"
Compilation: i686-pc-linux-gnu-gcc -c -I. -Iproto -DHAVE_CONFIG_H -O2 -march=i686 -pipe    -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm  -I/usr/lib/perl5/5.8.8/i686-linux/CORE  -I/usr/include/python2.4 -pthread
Linking: i686-pc-linux-gnu-gcc   -rdynamic   -L/usr/local/lib -o vim    -lXt -lcurses -lacl -lgpm   -rdynamic  -L/usr/local/lib /usr/lib/perl5/5.8.8/i686-linux/auto/DynaLoader/DynaLoader.a -L/usr/lib/perl5/5.8.8/i686-linux/CORE -lperl -lutil -lc -L/usr/lib/python2.4/config -lpython2.4 -lpthread -lutil -lm -Xlinker -export-dynamic

Does anyone have any idea about this issue?

Thanks.
Hua Yanghao
--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_multibyte" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---

Re: multi-encoding problem, an unknown char inserted at the front of a new file

Camillo Särs-2

Hua Yanghao wrote:
> If I use the following multi-encoding script in vimrc, when I edit a
> new file, an unknow character seems have been inserted into the start
> of the file.
>
...
> set bomb

You are seeing the BOM, the byte order mark, of utf-8 at the beginning
of the file.  This can cause problems for any program that does not know
how to handle utf-8 encoded files.
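
For a file that already starts with these three bytes (EF BB BF, as in
the xxd output quoted in the question), the mark can be inspected and
dropped from inside Vim. This is a minimal sketch, assuming the buffer
was read as UTF-8 and that only the current buffer should change:

```vim
" Show whether Vim will write a BOM for this buffer, and in which encoding.
:set bomb? fileencoding?
" Turn the BOM off for this buffer only, then write the file without it.
:setlocal nobomb
:write
```

Re-running ":%!xxd" on the saved file is one way to confirm that the
leading efbb bf is gone.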

Unless you want to play with the bomb setting per file type, you just
need to make sure you create any files for special applications without
the bomb.  To my knowledge, vim won't insert it if it is not already
present in an existing file.  The setting thus applies to only new files.
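
One way to handle the per-file-type case is a FileType autocommand in
the vimrc. The following is only a sketch: the augroup name and the list
of file types are illustrative assumptions, not something taken from the
script in the question, and it relies on file type detection being
enabled.

```vim
" Sketch only: keep the BOM off for file types whose tools reject it.
" The group name and the file type list are hypothetical examples.
if has("autocmd") && has("multi_byte")
  augroup NoBomForSource
    autocmd!
    " setlocal affects just the buffer whose file type was detected.
    autocmd FileType c,cpp,tex,plaintex setlocal nobomb
  augroup END
endif
```

Note that this also switches 'bomb' off for existing files of those
types, so a BOM that was already present is dropped the next time such a
file is written. The simpler alternative is to remove "set bomb" from
the vimrc altogether: with "ucs-bom" first in 'fileencodings', Vim still
detects a BOM when reading a file and sets 'bomb' for that buffer.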

Regards,
Camillo
--
Camillo Särs <[hidden email]>             Aim for the impossible and you
http://www.ged.fi                       will achieve the improbable

Re: multi-encoding problem, an unknown char inserted at the front of a new file

Hua Yanghao-2

Hi,

Thanks very much, that's exactly what I missed here.

Best Regards,
Hua Yanghao