Make langmap accept multi-byte characters

Konstantin Korikov
Hello all.

I made a patch to support multi-byte characters in langmap.

http://lostclus.linux.kiev.ua/patches/all/vim70-langmapmb.patch

--
Best regards,
Konstantin Korikov
--------------------------------------------------
OpenEXR-1.2.2 - A high dynamic-range (HDR) image file format
--------------------------------------------------

Re: Make langmap accept multi-byte characters

Bram Moolenaar

Konstantin Korikov wrote:

> I made a patch to support multi-byte characters in langmap.
>
> http://lostclus.linux.kiev.ua/patches/all/vim70-langmapmb.patch

Thanks for taking a shot at this.  I assume that the binary search
lookup should be fast enough in most situations.

Instead of using a fixed size array consider using a growarray.  Then
the size won't be limited.  See ga_init() and related functions.
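
To illustrate the idea of a growable array in isolation, here is a minimal
standalone sketch. It is not Vim's growarray (the ga_init() family mentioned
above); the names entry_T, growlist_T, growlist_init(), growlist_add() and
growlist_free() are hypothetical and exist only for this example. The point
is that the table is reallocated on demand, so no fixed upper limit on the
number of langmap entries is needed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a growable array; NOT Vim's own growarray. */
typedef struct {
    int from;   /* character typed (may be above 255) */
    int to;     /* character it is translated to */
} entry_T;

typedef struct {
    entry_T *data;  /* heap block holding the entries */
    size_t   len;   /* entries in use */
    size_t   cap;   /* entries allocated */
} growlist_T;

static void growlist_init(growlist_T *gl)
{
    assert(gl != NULL);
    gl->data = NULL;
    gl->len = 0;
    gl->cap = 0;
}

/* Append one entry, doubling the capacity when the block is full.
 * Returns 1 on success, 0 if memory could not be allocated. */
static int growlist_add(growlist_T *gl, int from, int to)
{
    assert(gl != NULL);
    if (gl->len == gl->cap) {
        size_t new_cap = gl->cap ? gl->cap * 2 : 8;
        entry_T *p = realloc(gl->data, new_cap * sizeof(entry_T));

        if (p == NULL)
            return 0;           /* old block is still valid */
        gl->data = p;
        gl->cap = new_cap;
    }
    gl->data[gl->len].from = from;
    gl->data[gl->len].to = to;
    gl->len++;
    return 1;
}

/* Release the block and return the list to its empty state. */
static void growlist_free(growlist_T *gl)
{
    assert(gl != NULL);
    free(gl->data);
    growlist_init(gl);
}
```

A caller would run growlist_init() once, call growlist_add() for every
from/to pair parsed out of the option value, and call growlist_free() when
the option is reset.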

--
hundred-and-one symptoms of being an internet addict:
37. You start looking for hot HTML addresses in public restrooms.

 /// Bram Moolenaar -- [hidden email] -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\        download, build and distribute -- http://www.A-A-P.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///

Re: Make langmap accept multi-byte characters

Konstantin Korikov
> Thanks for taking a shot at this.  I assume that the binary search
> lookup should be fast enough in most situations.

Does this mean that I should remove the old algorithm completely? Does it
mean that the program should use the binary search lookup even when
FEAT_MBYTE is not defined?

> Instead of using a fixed size array consider using a growarray.  Then
> the size won't be limited.  See ga_init() and related functions.

Thanks for the advice. I will rewrite the patch.

--
Best regards,
Konstantin Korikov
--------------------------------------------------
cups-lpd-1.2.4 - Common Unix Printing System - lpd emulation
--------------------------------------------------

Re: Make langmap accept multi-byte characters

Bram Moolenaar

Konstantin Korikov wrote:

> > Thanks for taking a shot at this.  I assume that the binary search
> > lookup should be fast enough in most situations.
>
> Does this mean that I should remove the old algorithm completely? Does it
> mean that the program should use the binary search lookup even when
> FEAT_MBYTE is not defined?

I think it's better to keep the code for now.  Some people may compile
without the multi-byte feature to save on code.

> > Instead of using a fixed size array consider using a growarray.  Then
> > the size won't be limited.  See ga_init() and related functions.
>
> Thanks for the advice. I will rewrite the patch.

Good.

--
Those who live by the sword get shot by those who don't.


Re: Make langmap accept multi-byte characters

Konstantin Korikov
> > > Instead of using a fixed size array consider using a growarray.  Then
> > > the size won't be limited.  See ga_init() and related functions.  
> >
> > Thanks for the advice. I will rewrite the patch.
>
> Good.

http://lostclus.linux.kiev.ua/patches/all/vim70-langmapmb-2.patch

I use a combination of binary search and simple array lookup.

--
Best regards,
Konstantin Korikov
--------------------------------------------------
python-2.4.3 - An interpreted, interactive, object-oriented programming language.
--------------------------------------------------

Re: Make langmap accept multi-byte characters

Bram Moolenaar

Konstantin Korikov wrote:

> > > > Instead of using a fixed size array consider using a growarray.  Then
> > > > the size won't be limited.  See ga_init() and related functions.  
> > >
> > > Thanks for the advice. I will rewrite the patch.
> >
> > Good.
>
> http://lostclus.linux.kiev.ua/patches/all/vim70-langmapmb-2.patch
>
> I use a combination of binary search and simple array lookup.

It looks like you use pages of 256 characters.  Isn't that a bit much?
Memory will be wasted if in some pages only one character is used.

I wonder why you are using pages.  What is the problem with having a
separate entry for each key?

--
I wonder how much deeper the ocean would be without sponges.


Re: Make langmap accept multi-byte characters

Konstantin Korikov
> > I use a combination of binary search and simple array lookup.
>
> It looks like you use pages of 256 characters.  Isn't that a bit much?
> Memory will be wasted if in some pages only one character is used.

Typically a particular language has its own page of roughly 256
characters in the Unicode table.

For example, Cyrillic: 0x400..0x4ff; Hebrew: 0x590..0x5FF;
Arabic: 0x600..0x6ff.

In most cases a user sets the langmap option for their own native
language only, and the value consists of mappings for all alphabetic
keys that exist on the keyboard.

So an appropriate page size may be 64..256 characters. I have no
objection to using a page size of 64 or 128 characters.

> I wonder why you are using pages.  What is the problem with having a
> separate entry for each key?

The answer is simple: this way is faster when we have a lot of entries
(more than 100, for example).
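
The paged scheme described above can be sketched roughly as follows. This
is an illustration written for this thread, not code from the patch: the
names page_T, find_page(), paged_set() and paged_lookup() are hypothetical,
the page table is a fixed-size array only to keep the example short, and
each slot stores the full target character as an int. A binary search
finds the page, then the character inside the page is found by direct
indexing.

```c
#include <assert.h>
#include <string.h>

#define PAGE_BITS 8                 /* assumed: 256 characters per page */
#define PAGE_SIZE (1 << PAGE_BITS)
#define MAX_PAGES 16                /* fixed only to keep the sketch short */

typedef struct {
    int num;                /* page number: character >> PAGE_BITS */
    int map[PAGE_SIZE];     /* full target character for every slot */
} page_T;

static page_T pages[MAX_PAGES];     /* kept sorted by num */
static int    page_count = 0;

/* Binary search for page_num.  Returns its index when present,
 * otherwise -(insert position) - 1. */
static int find_page(int page_num)
{
    int lo = 0, hi = page_count;

    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;

        if (pages[mid].num < page_num)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < page_count && pages[lo].num == page_num)
        return lo;
    return -lo - 1;
}

/* Record the mapping from -> to.  Returns 0 when the table is full. */
static int paged_set(int from, int to)
{
    int page_num, idx;

    assert(from >= 0);
    page_num = from >> PAGE_BITS;
    idx = find_page(page_num);
    if (idx < 0) {
        int pos = -idx - 1, b;

        if (page_count == MAX_PAGES)
            return 0;
        /* open a gap at pos so the table stays sorted */
        memmove(&pages[pos + 1], &pages[pos],
                (size_t)(page_count - pos) * sizeof(page_T));
        page_count++;
        pages[pos].num = page_num;
        /* identity map: every slot holds its own full character */
        for (b = 0; b < PAGE_SIZE; b++)
            pages[pos].map[b] = (page_num << PAGE_BITS) + b;
        idx = pos;
    }
    pages[idx].map[from & (PAGE_SIZE - 1)] = to;
    return 1;
}

/* Translate c; characters on unknown pages are returned unchanged. */
static int paged_lookup(int c)
{
    int idx;

    assert(c >= 0);
    idx = find_page(c >> PAGE_BITS);
    if (idx < 0)
        return c;
    return pages[idx].map[c & (PAGE_SIZE - 1)];
}
```

With this layout the cost of a lookup depends on the number of pages, not
on the number of mapped keys, at the price of one full page of memory for
every page that holds at least one mapping.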

--
Best regards,
Konstantin Korikov
--------------------------------------------------
libcdio-0.76 - CD-ROM input and control library
--------------------------------------------------

Re: Make langmap accept multi-byte characters

Konstantin Korikov
In reply to this post by Konstantin Korikov
> http://lostclus.linux.kiev.ua/patches/all/vim70-langmapmb-2.patch
>
> I use a combination of binary search and simple array lookup.

This patch has a mistake here:

        /* insert new page at position a */
        pages = (langmap_page_T*)(langmap_pages.ga_data) + a;
        mch_memmove(pages + 1, pages,
                (langmap_pages.ga_len - a) * sizeof(langmap_page_T));
        ++langmap_pages.ga_len;
        pages[0].num = page_num;
        /* init with a-one-to one map */
        for (b = 0; b < (1 << LANGMAP_PAGESIZE_POT); b++)
            pages[0].charmap[b] = b;

The last loop correctly initializes the array only if the page number is
zero. To resolve this problem,

  char_u charmap[1 << LANGMAP_PAGESIZE_POT];

needs to be replaced with

  int charmap[1 << LANGMAP_PAGESIZE_POT];

But this would increase memory usage fourfold (on 32-bit machines).
Another solution: fill the array with zeros and insert an additional
check in LANGMAP_ADJUST.

But I think there is no reason to complicate this task, and a separate
entry for each key really is the best way, since it keeps the task simple.
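
For comparison, a separate entry for each key can be sketched as a sorted
table searched with binary search. This is a minimal illustration, not the
code of the patch: langmap_entry and entry_lookup() are hypothetical names,
and the table is assumed to be sorted by the "from" character.

```c
#include <assert.h>
#include <stddef.h>

/* One entry per mapped key; the table must be sorted by from. */
typedef struct {
    int from;   /* character typed */
    int to;     /* character it is translated to */
} langmap_entry;

/* Binary search over n entries.  Returns the mapped character, or c
 * itself when no entry exists for it. */
static int entry_lookup(const langmap_entry *tab, size_t n, int c)
{
    size_t lo = 0, hi = n;

    assert(tab != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].from < c)
            lo = mid + 1;
        else if (tab[mid].from > c)
            hi = mid;
        else
            return tab[mid].to;
    }
    return c;
}
```

Memory use is proportional to the number of mapped keys, and unmapped
characters need no initialization at all, which is why the identity-map
problem described above does not arise here.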

Thanks.

http://lostclus.linux.kiev.ua/patches/all/vim70-langmapmb-3.patch

--
Best regards,
Konstantin Korikov
--------------------------------------------------
gstreamer-0.10.4 - GStreamer streaming media framework runtime
--------------------------------------------------

Re: Make langmap accept multi-byte characters

Bram Moolenaar

Konstantin Korikov wrote:

> > http://lostclus.linux.kiev.ua/patches/all/vim70-langmapmb-2.patch
> >
> > I use a combination of binary search and simple array lookup.
>
> This patch has a mistake here:
>
>         /* insert new page at position a */
>         pages = (langmap_page_T*)(langmap_pages.ga_data) + a;
>         mch_memmove(pages + 1, pages,
>                 (langmap_pages.ga_len - a) * sizeof(langmap_page_T));
>         ++langmap_pages.ga_len;
>         pages[0].num = page_num;
>         /* init with a-one-to one map */
>         for (b = 0; b < (1 << LANGMAP_PAGESIZE_POT); b++)
>             pages[0].charmap[b] = b;
>
> The last loop correctly initializes the array only if the page number is
> zero. To resolve this problem,
>
>   char_u charmap[1 << LANGMAP_PAGESIZE_POT];
>
> needs to be replaced with
>
>   int charmap[1 << LANGMAP_PAGESIZE_POT];
>
> But this would increase memory usage fourfold (on 32-bit machines).
> Another solution: fill the array with zeros and insert an additional
> check in LANGMAP_ADJUST.
>
> But I think there is no reason to complicate this task, and a separate
> entry for each key really is the best way, since it keeps the task simple.
>
> Thanks.
>
> http://lostclus.linux.kiev.ua/patches/all/vim70-langmapmb-3.patch

Looks OK.  The lookup will be a little slower, but I doubt if someone
would notice.

I wonder if there is any "to" entry that doesn't fit in 8 bits.  All
Normal mode commands are 8 bit.  Perhaps we should give a warning when
someone tries to langmap a character to a multi-byte character?
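
Such a check could be as small as the sketch below. It is only an
illustration of the idea; langmap_to_fits() is a hypothetical name, the
message text is made up, and how Vim itself would report the problem is
not shown here.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 when the "to" character fits in 8 bits.  Otherwise prints a
 * warning to stderr and returns 0 so the caller can skip the mapping. */
static int langmap_to_fits(int from, int to)
{
    if (to < 0 || to > 255) {
        fprintf(stderr,
                "langmap: target 0x%x of 0x%x does not fit in 8 bits\n",
                (unsigned)to, (unsigned)from);
        return 0;
    }
    return 1;
}
```

The "from" side is deliberately not restricted, since accepting
multi-byte "from" characters is the purpose of the patch.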

--
Just remember...if the world didn't suck, we'd all fall off.


Re: Make langmap accept multi-byte characters

Konstantin Korikov
> I wonder if there is any "to" entry that doesn't fit in 8 bits.  All
> Normal mode commands are 8 bit.  Perhaps we should give a warning when
> someone tries to langmap a character to a multi-byte character?

I added such a warning.

http://lostclus.linux.kiev.ua/patches/all/vim70-langmapmb-4.patch

--
Best regards,
Konstantin Korikov
--------------------------------------------------
eel2-2.14.3 - Eazel Extensions Library.
--------------------------------------------------