[bug] illegal memory access with ":e ++enc=utf-8 ++bad=keep" and BufReadPre autocmd

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

[bug] illegal memory access with ":e ++enc=utf-8 ++bad=keep" and BufReadPre autocmd

Dominique Pellé
Hi

I see an error with Valgrind with Vim-7.2.416 on Linux:

Steps to reproduce:

1) Create a file with illegal utf-8 sequence:

   $ perl -e 'print chr(0)' > sample.txt

2) Start Vim with valgrind:

   $ valgrind --num-callers=20 --log-file=vg.log vim --noplugin -u NONE

3) Type Ex commands:

   :syntax on
   :au BufReadPre *.dat let &bin=1
   :e ++enc=utf-8 ++bad=keep sample.txt

4) Observe the following error in vg.log:

==6333== Invalid read of size 1
==6333==    at 0x4026088: strlen (mc_replace_strmem.c:282)
==6333==    by 0x8088E64: set_cmdarg (eval.c:18309)
==6333==    by 0x80C86A5: apply_autocmds_group (fileio.c:9067)
==6333==    by 0x80C818B: apply_autocmds_exarg (fileio.c:8706)
==6333==    by 0x80BD19E: readfile (fileio.c:649)
==6333==    by 0x8052E96: open_buffer (buffer.c:131)
==6333==    by 0x80952F6: do_ecmd (ex_cmds.c:3657)
==6333==    by 0x80AA7AD: do_exedit (ex_docmd.c:7585)
==6333==    by 0x80AA46A: ex_edit (ex_docmd.c:7481)
==6333==    by 0x80A312D: do_one_cmd (ex_docmd.c:2629)
==6333==    by 0x80A0A77: do_cmdline (ex_docmd.c:1098)
==6333==    by 0x811C867: nv_colon (normal.c:5224)
==6333==    by 0x8116188: normal_cmd (normal.c:1188)
==6333==    by 0x80DE26F: main_loop (main.c:1211)
==6333==    by 0x80DDD66: main (main.c:955)
==6333==  Address 0x56543bf is 1 bytes before a block of size 100 alloc'd
==6333==    at 0x4024F70: malloc (vg_replace_malloc.c:236)
==6333==    by 0x8107152: lalloc (misc2.c:868)
==6333==    by 0x810706F: alloc (misc2.c:767)
==6333==    by 0x80B66AC: alloc_cmdbuff (ex_getln.c:2509)
==6333==    by 0x80B2E6B: getcmdline (ex_getln.c:229)
==6333==    by 0x80B5E47: getexline (ex_getln.c:2124)
==6333==    by 0x80A080F: do_cmdline (ex_docmd.c:994)
==6333==    by 0x811C867: nv_colon (normal.c:5224)
==6333==    by 0x8116188: normal_cmd (normal.c:1188)
==6333==    by 0x80DE26F: main_loop (main.c:1211)
==6333==    by 0x80DDD66: main (main.c:955)

The same kind of error happens at least with:

     :e ++enc=utf-8 ++bad=keep sample.txt
or   :e ++enc=utf-8 ++bad=drop sample.txt
or   :e ++enc=utf-8 ++bad=x    sample.txt

Error happens in eval.c:18309:

18308     if (eap->bad_char != 0)
18309         len += (unsigned)STRLEN(eap->cmd + eap->bad_char) + 7;

Adding a printf, I see:
- eap->cmd is "e ++enc=utf-8"
- eap->bad_char is -1  !!!

eap->bad_char was previously set to -1 (BAD_KEEP) at ex_docmd.c:4776:

4773         p = eap->cmd + eap->bad_char;
4774
4775         if (STRICMP(p, "keep") == 0)
4776             eap->bad_char = BAD_KEEP;
4777         else if (STRICMP(p, "drop") == 0)
4778             eap->bad_char = BAD_DROP;
4779         else if (MB_BYTE2LEN(*p) == 1 && p[1] == NUL)
4780             eap->bad_char = *p;

This code is odd since field 'eap->bad_char' is used for two different things:

- before ex_docmd.c:4773, 'eap->bad_char' contains an index in eap->cmd
- after line ex_docmd.c:4776, 4777 or 4780, the meaning of eap->bad_char
  changes, it then contains: BAD_KEEP, BAD_DROP or the character to
  replace bad char with).

Using 'eap->bad_char' for 2 different things is dangerous.  And it bites us
at eval.c:18309 where code expects 'eap->bad_char' to still contain an
index in 'eap->cmd', but it's already been replaced by BAD_KEEP (bug).

I'm not sure how it should be fixed.  Maybe field 'eap->bad_char' should
be replaced by 2 different fields to avoid mixing 2 different things:
- bad_char_idx ... index in eap->cmd
- bad_char ....... char to replace bad characters with

Does it look like the right way to fix it?

-- Dominique

--
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
Reply | Threaded
Open this post in threaded view
|

Re: [bug] illegal memory access with ":e ++enc=utf-8 ++bad=keep" and BufReadPre autocmd

Dominique Pellé
Dominique Pellé <[hidden email]>:

> Hi
>
> I see an error with Valgrind with Vim-7.2.416 on Linux:
>
> Steps to reproduce:
>
> 1) Create a file with illegal utf-8 sequence:
>
>   $ perl -e 'print chr(0)' > sample.txt
>
> 2) Start Vim with valgrind:
>
>   $ valgrind --num-callers=20 --log-file=vg.log vim --noplugin -u NONE
>
> 3) Type Ex commands:
>
>   :syntax on
>   :au BufReadPre *.dat let &bin=1
>   :e ++enc=utf-8 ++bad=keep sample.txt
>
> 4) Observe the following error in vg.log:
>
> ==6333== Invalid read of size 1
> ==6333==    at 0x4026088: strlen (mc_replace_strmem.c:282)
> ==6333==    by 0x8088E64: set_cmdarg (eval.c:18309)
> ==6333==    by 0x80C86A5: apply_autocmds_group (fileio.c:9067)
> ==6333==    by 0x80C818B: apply_autocmds_exarg (fileio.c:8706)
> ==6333==    by 0x80BD19E: readfile (fileio.c:649)
> ==6333==    by 0x8052E96: open_buffer (buffer.c:131)
> ==6333==    by 0x80952F6: do_ecmd (ex_cmds.c:3657)
> ==6333==    by 0x80AA7AD: do_exedit (ex_docmd.c:7585)
> ==6333==    by 0x80AA46A: ex_edit (ex_docmd.c:7481)
> ==6333==    by 0x80A312D: do_one_cmd (ex_docmd.c:2629)
> ==6333==    by 0x80A0A77: do_cmdline (ex_docmd.c:1098)
> ==6333==    by 0x811C867: nv_colon (normal.c:5224)
> ==6333==    by 0x8116188: normal_cmd (normal.c:1188)
> ==6333==    by 0x80DE26F: main_loop (main.c:1211)
> ==6333==    by 0x80DDD66: main (main.c:955)
> ==6333==  Address 0x56543bf is 1 bytes before a block of size 100 alloc'd
> ==6333==    at 0x4024F70: malloc (vg_replace_malloc.c:236)
> ==6333==    by 0x8107152: lalloc (misc2.c:868)
> ==6333==    by 0x810706F: alloc (misc2.c:767)
> ==6333==    by 0x80B66AC: alloc_cmdbuff (ex_getln.c:2509)
> ==6333==    by 0x80B2E6B: getcmdline (ex_getln.c:229)
> ==6333==    by 0x80B5E47: getexline (ex_getln.c:2124)
> ==6333==    by 0x80A080F: do_cmdline (ex_docmd.c:994)
> ==6333==    by 0x811C867: nv_colon (normal.c:5224)
> ==6333==    by 0x8116188: normal_cmd (normal.c:1188)
> ==6333==    by 0x80DE26F: main_loop (main.c:1211)
> ==6333==    by 0x80DDD66: main (main.c:955)
>
> The same kind of error happens at least with:
>
>     :e ++enc=utf-8 ++bad=keep sample.txt
> or   :e ++enc=utf-8 ++bad=drop sample.txt
> or   :e ++enc=utf-8 ++bad=x    sample.txt
>
> Error happens in eval.c:18309:
>
> 18308     if (eap->bad_char != 0)
> 18309         len += (unsigned)STRLEN(eap->cmd + eap->bad_char) + 7;
>
> Adding a printf, I see:
> - eap->cmd is "e ++enc=utf-8"
> - eap->bad_char is -1  !!!
>
> eap->bad_char was previously set to -1 (BAD_KEEP) at ex_docmd.c:4776:
>
> 4773         p = eap->cmd + eap->bad_char;
> 4774
> 4775         if (STRICMP(p, "keep") == 0)
> 4776             eap->bad_char = BAD_KEEP;
> 4777         else if (STRICMP(p, "drop") == 0)
> 4778             eap->bad_char = BAD_DROP;
> 4779         else if (MB_BYTE2LEN(*p) == 1 && p[1] == NUL)
> 4780             eap->bad_char = *p;
>
> This code is odd since field 'eap->bad_char' is used for two different things:
>
> - before ex_docmd.c:4773, 'eap->bad_char' contains an index in eap->cmd
> - after line ex_docmd.c:4776, 4777 or 4780, the meaning of eap->bad_char
>  changes, it then contains: BAD_KEEP, BAD_DROP or the character to
>  replace bad char with).
>
> Using 'eap->bad_char' for 2 different things is dangerous.  And it bites us
> at eval.c:18309 where code expects 'eap->bad_char' to still contain an
> index in 'eap->cmd', but it's already been replaced by BAD_KEEP (bug).
>
> I'm not sure how it should be fixed.  Maybe field 'eap->bad_char' should
> be replaced by 2 different fields to avoid mixing 2 different things:
> - bad_char_idx ... index in eap->cmd
> - bad_char ....... char to replace bad characters with
>
> Does it look like the right way to fix it?
>
> -- Dominique
Hi

Attached patch fixes the Valgrind errors in Vim-7.2.416 described
in above email.

After re-reading my previous email, I noticed that bug
happens when loading any kind of file with...
  :e ++enc=utf-8 ++bad=keep filename
... with or without invalid utf-8 sequences, and that the sample,txt
file I gave in above email was in fact a valid utf-8 file.

This is how I checked that Vim works after patch:

1) create a file with invalid utf-8 char

$ perl -e 'print chr(254); print chr(255)' > invalid-utf-8.txt

2) Try to load the file with:

:e ++enc=utf-8 ++bad=keep invalid-utf-8.txt

-> should display <fe><ff>

:e ++enc=utf-8 ++bad=X invalid-utf-8.txt

-> should display XX

:e ++enc-utf-8 ++bad=drop invalid-utf-8.txt

-> should display nothing

Cheers
-- Dominique

--
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

fixed-bad_char-7.2.416.patch (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [bug] illegal memory access with ":e ++enc=utf-8 ++bad=keep" and BufReadPre autocmd

Bram Moolenaar

Dominique Pellé wrote:

> > Hi
> >
> > I see an error with Valgrind with Vim-7.2.416 on Linux:
> >
> > Steps to reproduce:
> >
> > 1) Create a file with illegal utf-8 sequence:
> >
> >   $ perl -e 'print chr(0)' > sample.txt
> >
> > 2) Start Vim with valgrind:
> >
> >   $ valgrind --num-callers=20 --log-file=vg.log vim --noplugin -u NONE
> >
> > 3) Type Ex commands:
> >
> >   :syntax on
> >   :au BufReadPre *.dat let &bin=1
> >   :e ++enc=utf-8 ++bad=keep sample.txt
> >
> > 4) Observe the following error in vg.log:
> >
> > ==6333== Invalid read of size 1
> > ==6333==    at 0x4026088: strlen (mc_replace_strmem.c:282)
> > ==6333==    by 0x8088E64: set_cmdarg (eval.c:18309)
> > ==6333==    by 0x80C86A5: apply_autocmds_group (fileio.c:9067)
> > ==6333==    by 0x80C818B: apply_autocmds_exarg (fileio.c:8706)
> > ==6333==    by 0x80BD19E: readfile (fileio.c:649)
> > ==6333==    by 0x8052E96: open_buffer (buffer.c:131)
> > ==6333==    by 0x80952F6: do_ecmd (ex_cmds.c:3657)
> > ==6333==    by 0x80AA7AD: do_exedit (ex_docmd.c:7585)
> > ==6333==    by 0x80AA46A: ex_edit (ex_docmd.c:7481)
> > ==6333==    by 0x80A312D: do_one_cmd (ex_docmd.c:2629)
> > ==6333==    by 0x80A0A77: do_cmdline (ex_docmd.c:1098)
> > ==6333==    by 0x811C867: nv_colon (normal.c:5224)
> > ==6333==    by 0x8116188: normal_cmd (normal.c:1188)
> > ==6333==    by 0x80DE26F: main_loop (main.c:1211)
> > ==6333==    by 0x80DDD66: main (main.c:955)
> > ==6333==  Address 0x56543bf is 1 bytes before a block of size 100 alloc'd
> > ==6333==    at 0x4024F70: malloc (vg_replace_malloc.c:236)
> > ==6333==    by 0x8107152: lalloc (misc2.c:868)
> > ==6333==    by 0x810706F: alloc (misc2.c:767)
> > ==6333==    by 0x80B66AC: alloc_cmdbuff (ex_getln.c:2509)
> > ==6333==    by 0x80B2E6B: getcmdline (ex_getln.c:229)
> > ==6333==    by 0x80B5E47: getexline (ex_getln.c:2124)
> > ==6333==    by 0x80A080F: do_cmdline (ex_docmd.c:994)
> > ==6333==    by 0x811C867: nv_colon (normal.c:5224)
> > ==6333==    by 0x8116188: normal_cmd (normal.c:1188)
> > ==6333==    by 0x80DE26F: main_loop (main.c:1211)
> > ==6333==    by 0x80DDD66: main (main.c:955)
> >
> > The same kind of error happens at least with:
> >
> >     :e ++enc=utf-8 ++bad=keep sample.txt
> > or   :e ++enc=utf-8 ++bad=drop sample.txt
> > or   :e ++enc=utf-8 ++bad=x    sample.txt
> >
> > Error happens in eval.c:18309:
> >
> > 18308     if (eap->bad_char != 0)
> > 18309         len += (unsigned)STRLEN(eap->cmd + eap->bad_char) + 7;
> >
> > Adding a printf, I see:
> > - eap->cmd is "e ++enc=utf-8"
> > - eap->bad_char is -1  !!!
> >
> > eap->bad_char was previously set to -1 (BAD_KEEP) at ex_docmd.c:4776:
> >
> > 4773         p = eap->cmd + eap->bad_char;
> > 4774
> > 4775         if (STRICMP(p, "keep") == 0)
> > 4776             eap->bad_char = BAD_KEEP;
> > 4777         else if (STRICMP(p, "drop") == 0)
> > 4778             eap->bad_char = BAD_DROP;
> > 4779         else if (MB_BYTE2LEN(*p) == 1 && p[1] == NUL)
> > 4780             eap->bad_char = *p;
> >
> > This code is odd since field 'eap->bad_char' is used for two different things:
> >
> > - before ex_docmd.c:4773, 'eap->bad_char' contains an index in eap->cmd
> > - after line ex_docmd.c:4776, 4777 or 4780, the meaning of eap->bad_char
> >  changes, it then contains: BAD_KEEP, BAD_DROP or the character to
> >  replace bad char with).
> >
> > Using 'eap->bad_char' for 2 different things is dangerous.  And it bites us
> > at eval.c:18309 where code expects 'eap->bad_char' to still contain an
> > index in 'eap->cmd', but it's already been replaced by BAD_KEEP (bug).
> >
> > I'm not sure how it should be fixed.  Maybe field 'eap->bad_char' should
> > be replaced by 2 different fields to avoid mixing 2 different things:
> > - bad_char_idx ... index in eap->cmd
> > - bad_char ....... char to replace bad characters with
> >
> > Does it look like the right way to fix it?
> >
> > -- Dominique
>
> Hi
>
> Attached patch fixes the Valgrind errors in Vim-7.2.416 described
> in above email.
>
> After re-reading my previous email, I noticed that bug
> happens when loading any kind of file with...
>   :e ++enc=utf-8 ++bad=keep filename
> ... with or without invalid utf-8 sequences, and that the sample,txt
> file I gave in above email was in fact a valid utf-8 file.
>
> This is how I checked that Vim works after patch:
>
> 1) create a file with invalid utf-8 char
>
> $ perl -e 'print chr(254); print chr(255)' > invalid-utf-8.txt
>
> 2) Try to load the file with:
>
> :e ++enc=utf-8 ++bad=keep invalid-utf-8.txt
>
> -> should display <fe><ff>
>
> :e ++enc=utf-8 ++bad=X invalid-utf-8.txt
>
> -> should display XX
>
> :e ++enc-utf-8 ++bad=drop invalid-utf-8.txt
>
> -> should display nothing

Thanks!  I'll look into it soon.

--
'Well, here's something to occupy you and keep your mind off things.'
'It won't work, I have an exceptionally large mind.'
                -- Douglas Adams, "The Hitchhiker's Guide to the Galaxy"

 /// Bram Moolenaar -- [hidden email] -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\        download, build and distribute -- http://www.A-A-P.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///

--
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php