Dear Forum,
I maintain an application that reads and writes text in UTF-8. Due to bad joss incurred during my (and others') UTF-8 learning curve, I now have some garbled characters in my input. These show up in vim -b as '<nn>', where nn is a lower-case hex string. Here's an example:

3020 tuomas jorma juhani r<e4>s<e4>nen

My question is: how do I search for these characters in vim so I can fix or delete them? Treating them as literal strings doesn't work.

Thanks!

--
--
You received this message from the "vim_multibyte" maillist.
For more information, visit http://www.vim.org/maillist.php

---
You received this message because you are subscribed to the Google Groups "vim_multibyte" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
James Barnett wrote:
> I maintain an application that reads and writes text in utf-8.
> Due to bad joss incurred during my (and others) utf-8 learning
> curve I now have some garbled characters in my input. These
> show up in vim -b as '<nn>', where nn is a lower-case hex
> string. Here's an example:
> 3020 tuomas jorma juhani r<e4>s<e4>nen
>
> My question is, how do I search for these characters in vim so
> I can fix or delete them? Treating them as literal strings
> doesn't work.

In principle, vim_multibyte is the right mailing list, but in practice it is hardly ever used. I suggest using the main vim_use mailing list in the future, unless a very esoteric multibyte issue needs to be discussed at length.

There are three very useful commands entered in normal mode:

    ga
    g8
    8g8

ga and g8 display information about the character at the cursor. 8g8 finds the next illegal UTF-8 sequence (it does nothing if none is found).

Use ':help 8g8' for info.

John
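For anyone who wants to locate the same bytes outside Vim, the idea behind 8g8 (step to the next illegal UTF-8 sequence) can be sketched in a few lines of Python. This is only an illustration, not how Vim implements it. The function name find_invalid_utf8 is made up for this example, and the sample bytes are the "r<e4>s<e4>nen" fragment from the question, assuming each <e4> is the single raw byte 0xE4.

```python
def find_invalid_utf8(data: bytes):
    """Yield (offset, bad_bytes) for each byte run that is not valid UTF-8.

    Hypothetical helper for illustration only.
    """
    pos = 0
    while pos < len(data):
        try:
            # Strict decoding raises at the first invalid sequence.
            data[pos:].decode("utf-8")
            return  # everything from pos onward is valid
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice we decoded.
            start = pos + err.start
            end = pos + err.end
            yield start, data[start:end]
            pos = end  # resume scanning just after the bad bytes


# Assumed sample: the fragment from the question with raw 0xE4 bytes.
sample = b"r\xe4s\xe4nen"
for offset, bad in find_invalid_utf8(sample):
    print(offset, bad.hex())
```

Each reported offset is a byte offset into the data, so it is most useful together with a hex view or with vim -b.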
On 27Oct2014, at 17:20, John Beckett <[hidden email]> wrote:

> James Barnett wrote:
>> I maintain an application that reads and writes text in utf-8.
>> Due to bad joss incurred during my (and others) utf-8 learning
>> curve I now have some garbled characters in my input. These
>> show up in vim -b as '<nn>', where nn is a lower-case hex
>> string.
>
> There are three very useful commands entered in normal mode:
>    ga
>    g8
>    8g8
>
> ga and g8 display information about the character at the cursor.
> 8g8 finds the next illegal UTF-8 sequences (it does nothing if
> none found).
>
> Use ':help 8g8' for info.

I assume that your ‘encoding’ (vim buffer internal encoding) is UTF-8.

Once you know the hex value that you want to find, e.g. 00E4, I think that you should be able to search for it by entering / (the slash), Ctrl-v, u, 00E4.

********************************
Kenneth R. Beesley, D.Phil.
P.O. Box 540475
North Salt Lake, UT
84054 USA
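One thing worth checking before relying on that search: the character U+00E4 and the lone byte 0xE4 are different byte sequences in a UTF-8 file. The small Python sketch below shows the difference. It assumes the garbled line from the question contains raw 0xE4 bytes; the variable names are only illustrative.

```python
# The character that Ctrl-v u 00e4 enters is U+00E4 (a-umlaut).
char = "\u00e4"

# In UTF-8 that character is stored as two bytes, c3 a4.
utf8_bytes = char.encode("utf-8")

# Assumed content of the garbled line: a lone 0xE4 byte, not c3 a4.
stray = b"\xe4"
line = b"r\xe4s\xe4nen"

print(utf8_bytes.hex())    # hex form of the UTF-8 encoding of U+00E4
print(utf8_bytes in line)  # is the properly encoded character present?
print(stray in line)       # is the lone byte present?
```

If the file holds the lone byte, a search for the character U+00E4 looks for c3 a4 and would not be expected to match it.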
On October 28, 2014 7:08:27 PM EAT, Kenneth Reid Beesley <[hidden email]> wrote:

> On 27Oct2014, at 17:20, John Beckett <[hidden email]> wrote:
>
>> James Barnett wrote:
>>> I maintain an application that reads and writes text in utf-8.
>>> Due to bad joss incurred during my (and others) utf-8 learning
>>> curve I now have some garbled characters in my input. These
>>> show up in vim -b as '<nn>', where nn is a lower-case hex
>>> string.
>>
>> There are three very useful commands entered in normal mode:
>>    ga
>>    g8
>>    8g8
>>
>> ga and g8 display information about the character at the cursor.
>> 8g8 finds the next illegal UTF-8 sequences (it does nothing if
>> none found).
>>
>> Use ':help 8g8' for info.
>
> I assume that your ‘encoding’ (vim buffer internal encoding) is UTF-8.
>
> Once you know the hex value that you want to find, e.g 00E4,
> I think that you should be able to search for it by entering / (the
> slash),
> Ctrl-v, u, 00E4.

This only allows you to search for Unicode characters, and those never show up as <xx>, AFAIK. To enter an invalid character, one needs to use <C-r>="\xXX"<CR>.
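Once the stray bytes are located, fixing them in bulk can also be done outside Vim. The Python sketch below keeps every valid UTF-8 sequence and reinterprets each invalid byte as Latin-1. That is an assumption about where the garbling came from: it fits the example, because 0xE4 is "ä" in Latin-1, but it will not suit data that was damaged some other way. The function name repair_latin1 is made up for this example.

```python
def repair_latin1(data: bytes) -> str:
    """Decode UTF-8, reinterpreting each invalid byte run as Latin-1.

    Hypothetical helper; assumes the stray bytes were originally Latin-1.
    """
    out = []
    pos = 0
    while pos < len(data):
        try:
            out.append(data[pos:].decode("utf-8"))
            break  # the remainder was valid UTF-8
        except UnicodeDecodeError as err:
            # Keep the valid prefix before the error as it is.
            out.append(data[pos:pos + err.start].decode("utf-8"))
            # Reinterpret the offending bytes as Latin-1 (assumption).
            bad = data[pos + err.start:pos + err.end]
            out.append(bad.decode("latin-1"))
            pos += err.end
    return "".join(out)


# Assumed sample: the fragment from the question with raw 0xE4 bytes.
fixed = repair_latin1(b"r\xe4s\xe4nen")
print(fixed.encode("utf-8").hex())
```

Writing the result back with .encode("utf-8") gives a file in which the repaired characters are proper two-byte sequences. It is worth running this on a copy first and diffing the result, since the Latin-1 guess may be wrong for some lines.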