Overlong sequences are not handled specially and displayed like a valid
character. However, search patterns may not match on an overlong sequence.
(An overlong sequence is where more bytes are used than required for the
character.) An exception is NUL (zero) which is displayed as "<00>".
IOW, Vim accepts overlong sequences in file data, but you can't enter them
from the keyboard. IIUC, "forbidden" codepoints (such as all those ending in
FFFE or FFFF in any plane) are not rejected by Vim either. Neither are, of
course, those left out "for future use".
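
To make the terms above concrete, here is a minimal Python sketch (not Vim's
code) of what "overlong" means: the same codepoint encoded with more bytes
than the shortest form needs. The helper names decode_lenient, min_length,
is_overlong and ends_in_fffe_or_ffff are made up for illustration, and only
1- to 4-byte forms are handled.

```python
def decode_lenient(seq: bytes) -> int:
    """Decode one UTF-8-style sequence WITHOUT checking for minimal length."""
    first = seq[0]
    if first < 0x80:
        length, value = 1, first
    elif first >> 5 == 0b110:
        length, value = 2, first & 0x1F
    elif first >> 4 == 0b1110:
        length, value = 3, first & 0x0F
    elif first >> 3 == 0b11110:
        length, value = 4, first & 0x07
    else:
        raise ValueError("invalid lead byte")
    if len(seq) != length:
        raise ValueError("wrong sequence length")
    for byte in seq[1:]:
        if byte >> 6 != 0b10:
            raise ValueError("invalid continuation byte")
        # Each continuation byte contributes its low six bits.
        value = (value << 6) | (byte & 0x3F)
    return value


def min_length(cp: int) -> int:
    """Number of bytes in the shortest UTF-8 encoding of a codepoint."""
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4


def is_overlong(seq: bytes) -> bool:
    """True if the sequence uses more bytes than its codepoint requires."""
    return len(seq) > min_length(decode_lenient(seq))


def ends_in_fffe_or_ffff(cp: int) -> bool:
    """True for codepoints whose low 16 bits are FFFE or FFFF, in any plane."""
    return (cp & 0xFFFE) == 0xFFFE


# "/" is U+002F: one byte suffices, so the two-byte form C0 AF is overlong.
print(is_overlong(b"/"), is_overlong(b"\xc0\xaf"))
```

Both b"/" and b"\xc0\xaf" decode to the same codepoint here, which is why a
search pattern built from the short form may fail to match the long one.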
----- Original Message -----
From: "Bram Moolenaar" <[hidden email]>
To: "Ron Aaron" <[hidden email]>
Cc: <[hidden email]>
Sent: Wednesday, July 27, 2005 11:29 PM
Subject: Re: vim (6.3.85) accepts overlong UTF-8 sequences
> Ron Aaron wrote:
>> Loading the 'UTF-8 Stress test' file:
>> http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
>>
>> I can see all the "unsafe" UTF-8 sequences. Probably it would be a
>> good idea to at least have a flag to say whether or not to be
> What do you suggest, reject to edit the file?
I guess the todo.txt line 1352 applies here:
8 Detect overlong UTF-8 sequences and handle them like illegal bytes.
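
As a rough illustration of what "handle them like illegal bytes" could look
like, here is a Python sketch. It is not Vim's implementation: it leans on
Python's own strict UTF-8 decoder, which already rejects overlong forms, and
renders every rejected byte as <xx>, modeled on the "<00>" notation quoted
above. The handler name "hexbytes" and the function render are made up for
this example.

```python
import codecs


def _hex_escape(err):
    """Error handler: show each undecodable byte as <xx> and carry on."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    return "".join(f"<{b:02x}>" for b in bad), err.end


# Register under a made-up name so decode() can refer to it.
codecs.register_error("hexbytes", _hex_escape)


def render(data: bytes) -> str:
    """Decode strictly; illegal and overlong bytes become <xx> markers."""
    return data.decode("utf-8", errors="hexbytes")


print(render(b"a\xc0\xafb"))        # overlong "/" shown as raw bytes
print(render("\u20ac".encode()))    # a valid sequence passes through
```

With this approach an overlong "/" (C0 AF) no longer looks like a real "/",
so it cannot be mistaken for one on screen or in a search.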