Substitution ignoring combining marks

BPJ
It seems that at some point I, or a default, set s/// to
automatically ignore differences in combining marks. At the
moment that is a problem, since I'm searching for the word "ánd", with
a combining accent, in a text with English as the metalanguage![^1]
I've looked around in the help and in my .vimrc to no avail. I
guess I could comment out every uncommented :set in my .vimrc
until I hit the right one :-) but hopefully one of you can give me
the right answer quicker than that!

TIA,

/bpj

[^1]: The metalanguage is the language a text about a linguistic
subject is written *in*, as opposed to the language(s) it is
written *about*, which is/are the object language(s).

--
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

---
You received this message because you are subscribed to the Google Groups "vim_use" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: Substitution ignoring combining marks

Tony Mechelynck
On Sat, Apr 15, 2017 at 2:54 PM, BPJ <[hidden email]> wrote:
> It seems that I at some point, or a default, has set s/// to automatically
> ignore differences in combining marks, which at the moment is a problem
> since I'm searching for the word "ánd", with a combining accent, in a text
> with English as metalanguage![^1] I've looked around in the help and in my
> .vimrc to no avail. I guess I could comment out every uncommented :set in my
> .vimrc until I hit the right one :-) but hopefully one of you can give me
> the right answer quicker than that!

See :help patterns-composing.

\Z anywhere in a pattern makes the whole pattern insensitive to
combining characters.

\%C makes the immediately preceding atom (usually a letter) match
regardless of combining characters.
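
To make "insensitive to combining characters" concrete, here is a rough model in Python. It is not how Vim implements \Z, only a sketch of the described behaviour: combining marks are dropped from both sides and the base characters are compared. The helper names `strip_combining` and `matches_ignoring_marks` are made up for illustration.

```python
import unicodedata

def strip_combining(s):
    """Drop characters with a non-zero canonical combining class.

    No normalization is applied, so a precomposed letter such as
    U+00E1 is left untouched.
    """
    return "".join(ch for ch in s if not unicodedata.combining(ch))

def matches_ignoring_marks(pattern, text):
    """Literal substring test that compares base characters only."""
    return strip_combining(pattern) in strip_combining(text)

decomposed = "a\u0301nd"   # a + COMBINING ACUTE ACCENT + n + d
precomposed = "\u00e1nd"   # single code point U+00E1 + n + d

# With marks ignored, the decomposed word matches plain "and" ...
plain_hit = matches_ignoring_marks(decomposed, "and")
# ... but not the precomposed spelling, whose base character differs.
precomposed_hit = matches_ignoring_marks(decomposed, precomposed)
```

Under this model the accented search word finds unaccented "and", which is the effect reported in the question.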

Note: á (small Latin a with acute, U+00E1), á (small Latin a U+0061 +
combining acute U+0301) and of course а́ (small Cyrillic a U+0430 +
combining acute U+0301) are all different; none of them matches either
of the other two.
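
The three look-alike forms can be told apart by listing their code points. The short Python sketch below (standard `unicodedata` module; the dictionary keys are just labels chosen here) also shows that NFC normalization unifies the two Latin spellings but leaves the Cyrillic one distinct, since it has no precomposed form.

```python
import unicodedata

# The three visually similar forms, written with escapes so the
# source file's own normalization cannot blur them.
forms = {
    "precomposed": "\u00e1",        # LATIN SMALL LETTER A WITH ACUTE
    "decomposed": "a\u0301",        # LATIN SMALL LETTER A + COMBINING ACUTE ACCENT
    "cyrillic": "\u0430\u0301",     # CYRILLIC SMALL LETTER A + COMBINING ACUTE ACCENT
}

def code_points(s):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)

listing = {name: code_points(s) for name, s in forms.items()}
nfc = {name: unicodedata.normalize("NFC", s) for name, s in forms.items()}

for name in forms:
    print(f"{name:12} raw: {listing[name]:14} NFC: {code_points(nfc[name])}")
```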


Best regards,
Tony.

Re: Substitution ignoring combining marks

BPJ


On 15 Apr 2017 23:11, "Tony Mechelynck" <[hidden email]> wrote:
> Note: á (small latin a with acute U+00E1), á (small latin a U+0061
> combining acute U+0301) and of course а́ (small cyrillic a U+0430
> combining acute U+0301) are all different, none of them matches either
> of the other two.

I'm aware of that. My problem is that "\<ánd\>" (with U+0061 + U+0301)
also matches "and" without U+0301, even though there is no explicit \Z
in the pattern. I solved the task at hand by converting the files to
NFC, doing the substitution there, and then converting them back to
NFD, but the behavior is strange and annoying.
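
The NFC round trip described above could be scripted outside Vim roughly as follows. This is a sketch under assumptions, not the commands actually used in the thread: the function name `substitute_via_nfc` and the sample text are invented, and it assumes the input is already in NFD so that converting back reproduces the original form.

```python
import re
import unicodedata

def substitute_via_nfc(text, word, replacement):
    """Replace whole-word occurrences of `word`, matching in NFC.

    In NFC the accented letter is a single code point, so it cannot be
    confused with the bare base letter. The result is returned in NFD.
    """
    nfc_text = unicodedata.normalize("NFC", text)
    nfc_word = unicodedata.normalize("NFC", word)
    nfc_repl = unicodedata.normalize("NFC", replacement)
    pattern = re.compile(r"\b" + re.escape(nfc_word) + r"\b")
    # A function replacement avoids backslash processing in nfc_repl.
    result = pattern.sub(lambda m: nfc_repl, nfc_text)
    return unicodedata.normalize("NFD", result)

# Hypothetical NFD sample: accented object-language word, plain English "and".
sample = "a\u0301nd is not and"
out = substitute_via_nfc(sample, "a\u0301nd", "A\u0301ND")
```

Only the accented word is replaced; the plain "and" is left alone, and the output is again decomposed.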

/bpj
