LOCALE settings and regexp classes

Mészáros Gergely
Hi!

I tried googling for a solution without success, so I'm here. :)

It seems Vim regexps completely ignore the locale's collation settings,
which is very unfortunate. The OS is Linux, the locale is hu_HU.utf8.
For example, I would like to search for /[a-z]*/, but it fails on "néz".

Am I missing some setting, or can this not be done?

While googling I found that the range expression [a-z] is not well defined
across regexp implementations. I think that's OK, but Vim
should be *intuitive*, and even if [a-z] is not well defined, Vim should use
the locale's collation. (It does not.)

OK, I thought, there are no miracles, let's try a "word character".
\w should definitely include every character in the range a-z according to
the locale's collation (plain egrep handles it well). Unfortunately,
in Vim /\w*/ also fails. :-(

Any hints or advice, please? Any gurus out there?

Gergely

PS: 'encoding' is set to utf-8, which is presumably OK.

--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

Re: LOCALE settings and regexp classes

Bram Moolenaar

Gergely wrote:

> I tried googling for a solution without success, so I'm here. :)
>
> It seems Vim regexps completely ignore the locale's collation settings,
> which is very unfortunate. The OS is Linux, the locale is hu_HU.utf8.
> For example, I would like to search for /[a-z]*/, but it fails on "néz".
>
> Am I missing some setting, or can this not be done?
>
> While googling I found that the range expression [a-z] is not well defined
> across regexp implementations. I think that's OK, but Vim
> should be *intuitive*, and even if [a-z] is not well defined, Vim should use
> the locale's collation. (It does not.)
>
> OK, I thought, there are no miracles, let's try a "word character".
> \w should definitely include every character in the range a-z according to
> the locale's collation (plain egrep handles it well). Unfortunately,
> in Vim /\w*/ also fails. :-(
>
> Any hints or advice, please? Any gurus out there?
>
> Gergely
>
> PS: 'encoding' is set to utf-8, which is presumably OK.

You can try using [[:alpha:]] or \i or \k.
See :help whitespace, note the remark just above that.

--
hundred-and-one symptoms of being an internet addict:
23. You can't call your mother...she doesn't have a modem.

 /// Bram Moolenaar -- [hidden email] -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\  an exciting new programming language -- http://www.Zimbu.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///


Re: LOCALE settings and regexp classes

Mészáros Gergely

> You can try using [[:alpha:]] or \i or \k.
> See :help whitespace, note the remark just above that.

I had already tried [[:alpha:]] (it does not work), but \i and \k work like
a charm. Thank you very much.

However, it's a bit strange, because \i is supposed to be used for checking
environment variable names, where national characters are explicitly
forbidden (in bash at least).

Moreover, \i and \k are supposed to depend on the file format, and in many
(most?) programming languages accented characters cannot be used in
keywords or identifiers.

Overall, I'm not really pleased with the multibyte handling of this
otherwise superior editor. It seems to be *very* inconsistent. Can somebody
please tell me the proper way to contact the development team to
ask for a bugfix or post a feature request?

Thanks in advance,
Goteguru


Re: LOCALE settings and regexp classes

Павлов Николай Александрович
Reply to message «Re: LOCALE settings and regexp classes»,
sent 15:49:23 14 December 2010, Tuesday
by GoteGuru:

> I had already tried [[:alpha:]] (it does not work), but \i and \k work like
> a charm. Thank you very much.
[:...:] classes do not work with Unicode, only with ASCII.

> However, it's a bit strange, because \i is supposed to be used for checking
> environment variable names, where national characters are explicitly
> forbidden (in bash at least).
But not forbidden in zsh and csh.

> Moreover, \i and \k are supposed to depend on the file format, and in many
> (most?) programming languages accented characters cannot be used in
> keywords or identifiers.
If you want to use regexes with non-ASCII characters, use Perl. If you want to
use them in Vim, use Vim with Python support (Vim has very bad Perl
integration).
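
As a rough illustration of the Python route, here is a minimal sketch in plain
Python 3 using only the standard re module. It does not go through Vim's own
regexp engine, and the variable names are made up for this example. It shows
that a code point range such as [a-z] stops at "é", while \w is Unicode-aware
by default for str patterns:

```python
import re

# "néz" written with an escape so the é is the precomposed U+00E9.
word = "n\u00e9z"

# [a-z] is a code point range (U+0061..U+007A), so the match stops before é.
ascii_run = re.match(r"[a-z]*", word).group()

# For str patterns, \w matches Unicode word characters by default.
unicode_run = re.match(r"\w*", word).group()

# Letters only: a word character that is neither a digit nor an underscore.
letters_run = re.match(r"[^\W\d_]*", word).group()

# re.ASCII restricts \w to [a-zA-Z0-9_], so the match stops before é again.
ascii_word_run = re.match(r"\w*", word, re.ASCII).group()

print(ascii_run, unicode_run, letters_run, ascii_word_run)
```

None of these patterns consults the locale's collation order; the range is
still defined by code points.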

> Overall, I'm not really pleased with the multibyte handling of this
> otherwise superior editor. It seems to be *very* inconsistent.
Yes, it is. I know of some more bugs, such as the impossibility of specifying
a character range that is more than 255 characters wide in a [...] collection,
and a strange difference in how [\uXXXX] and \%uXXXX are handled when XXXX is
the hex code of a diacritic character. All of these bugs have been reported.

> Can somebody please tell me the proper way to contact the development
> team to ask for a bugfix or post a feature request?
Use the vim-dev mailing list. Note that your requests are probably already on
the todo list.
