rare prefixes

Bram Moolenaar

Moshe Kaminsky wrote:

> I have some more issues: Some of the prefixes are really rare.
> I saw there is a way to specify rare words. Is there a way to
> specify that a suffix is rare?

I have now implemented rare prefixes.  It's actually also possible for
suffixes.  This works with and without PFXPOSTPONE.  Simply add "rare"
after the entry in the .aff file.  Example:

        PFX F   0     con         .    rare

Have a try with last night's snapshot.

The implementation was easier than I expected.  Just a matter of passing
the "rare" flag from reading the affix file all the way through the .spl
file to where the word is found.  There was a bit available in most
places.
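
As a rough illustration of the idea described above (not Vim's actual C code), the following Python sketch reads affix entries and carries the "rare" marker as a single bit in a flags field. The names RARE, parse_affix_line and is_rare, and the bit value itself, are assumptions made up for this example.

```python
# Hypothetical flag bit; Vim's real .spl format uses its own values.
RARE = 0x01


def parse_affix_line(line):
    """Parse one PFX/SFX entry line; return None for anything else."""
    fields = line.split()
    # Entry lines have at least: kind, name, strip, add, condition.
    # Header lines such as "PFX F Y 1" have only four fields and are skipped.
    if len(fields) < 5 or fields[0] not in ("PFX", "SFX"):
        return None
    kind, name, strip, add, cond = fields[:5]
    # Anything after the condition is treated as an optional marker.
    flags = RARE if "rare" in fields[5:] else 0
    return {
        "kind": kind,
        "name": name,
        "strip": "" if strip == "0" else strip,
        "add": add,
        "cond": cond,
        "flags": flags,
    }


def is_rare(entry):
    """True when the rare bit set at parse time is still present."""
    return bool(entry["flags"] & RARE)


entry = parse_affix_line("PFX F   0     con         .    rare")
print(entry["add"], is_rare(entry))
```

The point of the sketch is only that one spare bit, set once while reading the .aff file, is enough to tell a rare match from a normal one later on.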

--
hundred-and-one symptoms of being an internet addict:
170. You introduce your wife as "[hidden email]" and refer to your
     children as "forked processes."

 /// Bram Moolenaar -- [hidden email] -- http://www.Moolenaar.net   \\\
///        Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\              Project leader for A-A-P -- http://www.A-A-P.org        ///
 \\\     Buy LOTR 3 and help AIDS victims -- http://ICCF.nl/lotr.html   ///
Re: rare prefixes

Moshe Kaminsky
Hi,

* Bram Moolenaar <[hidden email]> [28/06/05 17:42]:

>
> Moshe Kaminsky wrote:
>
> > I have some more issues: Some of the prefixes are really rare.
> > I saw there is a way to specify rare words. Is there a way to
> > specify that a suffix is rare?
>
> I have now implemented rare prefixes.  It's actually also possible for
> suffixes.  This works with and without PFXPOSTPONE.  Simply add "rare"
> after the entry in the .aff file.  Example:
>
> PFX F   0     con         .    rare
>
> Have a try with last night's snapshot
>
> The implementation was easier than I expected.  Just a matter of passing
> the "rare" flag from reading the affix file all the way through the .spl
> file to where the word is found.  There was a bit available in most
> places.
Thanks. I have adapted some of the programs in the hspell package into a
script that generates input files with rare prefixes. It appears to work
fine. The script is attached.
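
For readers without the attachment, here is a minimal sketch of the general approach: append "rare" to the PFX entries whose added text is in a chosen set. This is not the attached hspell2vim script; mark_rare and the example prefixes are hypothetical.

```python
def mark_rare(aff_lines, rare_prefixes):
    """Return a copy of aff_lines with "rare" appended to selected PFX entries."""
    out = []
    for line in aff_lines:
        fields = line.split()
        # Entry lines have five or more fields; the fourth is the added text.
        is_entry = len(fields) >= 5 and fields[0] == "PFX"
        if is_entry and fields[3] in rare_prefixes and "rare" not in fields[5:]:
            line = line.rstrip() + " rare"
        out.append(line)
    return out


lines = [
    "PFX F Y 2",
    "PFX F 0 con .",
    "PFX F 0 in .",
]
for line in mark_rare(lines, {"con"}):
    print(line)
```

Running the function twice on the same input leaves already marked entries unchanged.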

Thanks,
Moshe

--
I love deadlines. I like the whooshing sound they make as they fly by.
                                        -- Douglas Adams
   
    Moshe Kaminsky <[hidden email]>
    Home: 08-9456841


Attachment: hspell2vim.gz (7K)
Re: rare prefixes

Bram Moolenaar

Moshe Kaminsky wrote:

> > I have now implemented rare prefixes.
[...]

> Thanks. I have adapted some of the programs in the hspell package into a
> script that generates input files with rare prefixes. It appears to work
> fine. The script is attached.

I'm glad it works properly.  I won't use the script, since I don't want
to be the maintainer for all the various spell files!

I have now moved the building blocks for the spell files to
runtime/spell/LL, where LL is the language name.  I am using an Aap
recipe to download the OpenOffice.org spell files, apply a patch and
generate the .spl file.  That works quite nicely to avoid having to do
it all manually.
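
The patch-and-generate part of that workflow could be scripted roughly as follows. This is a sketch in Python, not the Aap recipe itself. The file names are placeholders, the download step is left out because the source URLs are not given here, and it assumes that `patch` and `vim` are on PATH.

```python
import os


def build_commands(lang, inname, patch_file):
    """Return the working directory and the argv lists to run in it."""
    workdir = os.path.join("runtime", "spell", lang)
    commands = [
        # Apply the maintainer's changes to the downloaded .aff/.dic files.
        ["patch", "-p0", "-i", patch_file],
        # Let Vim write the .spl file: ":mkspell! {outname} {inname}".
        ["vim", "-u", "NONE", "-e",
         "-c", f"mkspell! {lang} {inname}",
         "-c", "q"],
    ]
    return workdir, commands


workdir, commands = build_commands("he", "he_IL", "he_IL.diff")
for argv in commands:
    print(workdir, argv)
# To actually run them, call subprocess.run(argv, cwd=workdir, check=True)
# for each argv in order.
```

Keeping the commands as data makes it easy to inspect what would run before touching any files.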

Now we need maintainers!  I have already made a few examples for
English, French, Polish, etc.  The idea is that the maintainer tweaks
the .aff and .dic files for generating a .spl file with Vim.  Then make
a patch with "aap diff" and send that to me.  I'll include that with the
distribution, so that everybody can produce their own .spl files.

I also make the .spl files available on the ftp site (for those people
who can't or don't want to build .spl files themselves):

        ftp://ftp.vim.org/pub/vim/unstable/runtime/spell/

Please have a look at the Hebrew one and provide me with files/patches
for runtime/spell/he/.  I'll also generate an iso-8859-2 one, but I
doubt that it's valid (my FreeBSD system doesn't have an iso-8859-2
locale for Hebrew).  Adding the UPP/FOL/LOW lines in the .aff file will
help.

--
hundred-and-one symptoms of being an internet addict:
236. You start saving URL's in your digital watch.

Re: rare prefixes

Moshe Kaminsky
* Bram Moolenaar <[hidden email]> [05/07/05 12:32]:

>
> Moshe Kaminsky wrote:
>
> > > I have now implemented rare prefixes.
> [...]
>
> > Thanks. I have adapted some of the programs in the hspell package into a
> > script that generates input files with rare prefixes. It appears to work
> > fine. The script is attached.
>
> I'm glad it works properly.  I won't use the script, since I don't want
> to be the maintainer for all the various spell files!
>
> I have now moved the building blocks for the spell files to
> runtime/spell/LL, where LL is the language name.  I am using an Aap
> recipe to download the OpenOffice.org spell files, apply a patch and
> generate the .spl file.  That works quite nicely to avoid having to do
> it all manually.
I didn't mean you should use the script, I just wanted to make it
available somewhere, so that the files can be generated. I could send a
patch against the files I generated, but it will be longer than the
original version: Many of the prefixes are rare, quite a lot of prefixes
are omitted in the Myspell version since they are rare, and there are
other differences.

I could simply supply my version of the files, but I'm not sure that my
choice of rare prefixes is good. The script allows them to be specified,
and also has some other options to affect the list.

In fact, playing with the suggestions, I realised that my approach was
too simple-minded. I will send a new version with more options in a few days.

>
> Now we need maintainers!  I have already made a few examples for
> English, French, Polish, etc.  The idea is that the maintainer tweaks
> the .aff and .dic files for generating a .spl file with Vim.  Then make
> a patch with "aap diff" and send that to me.  I'll include that with the
> distribution, so that everybody can produce their own .spl files.
>
> I also make the .spl files available on the ftp site (for those people
> who can't or don't want to build .spl files themselves):
>
> ftp://ftp.vim.org/pub/vim/unstable/runtime/spell/
>
> Please have a look at the Hebrew one and provide me with files/patches
> for runtime/spell/he/.  I'll also generate an iso-8859-2 one, but I
> doubt that it's valid (my FreeBSD system doesn't have a iso-8859-2
> locale for Hebrew).
Did you mean iso-8859-8? I can generate them, but again, we first need
to choose the input.

> Adding the UPP/FOL/LOW lines in the .aff file will help.

I had a question about that. The help says that they need not be
specified when the encoding is UTF-8. I guess we're talking about the
encoding of the input files, not the 'encoding' option, right? I thought
this information is available in some Unicode tables.

Anyway, in Hebrew there is no case at all, so I thought it can be
omitted. I saw that I had words marked as having wrong capitalisation,
so I guess that's not true. I'll just add three identical lines.

BTW, the anchor *hl-SpellCap* is missing under *highlight-groups*.

Thanks,
Moshe



Re: rare prefixes

Bram Moolenaar

Moshe Kaminsky wrote:

> > I also make the .spl files available on the ftp site (for those people
> > who can't or don't want to build .spl files themselves):
> >
> > ftp://ftp.vim.org/pub/vim/unstable/runtime/spell/
> >
> > Please have a look at the Hebrew one and provide me with files/patches
> > for runtime/spell/he/.  I'll also generate an iso-8859-2 one, but I
> > doubt that it's valid (my FreeBSD system doesn't have a iso-8859-2
> > locale for Hebrew).
>
> Did you mean iso-8859-8?

Yes.

> I can generate them, but again, we first need to choose the input.

OK.

> > Adding the UPP/FOL/LOW lines in the .aff file will help.
>
> I had a question about that. The help says that they need not be
> specified when the encoding is UTF-8. I guess we're talking about the
> encoding of the input files, not the 'encoding' option, right? I thought
> this information is available in some Unicode tables.

That's about the encoding of the output file.

The idea is that when 'encoding' is "utf-8" then we know what characters
are word characters; this doesn't depend on the locale.  However, I just
fixed a bug where 'iskeyword' was being used instead.

When 'encoding' is iso-8859-8 then Vim would have to use isalpha() to
find out what characters are word characters.  But isalpha() depends on
the current locale, thus this only works when the system has a Hebrew
locale for iso-8859-8.  Mine doesn't.  I can still set 'encoding' to
iso-8859-8, but isalpha() won't work properly.  That's what the
UPP/FOL/LOW tables are for.
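
The difference can be sketched in Python: rather than asking a locale-dependent isalpha(), build the set of word characters from explicit FOL/LOW/UPP strings. The CaseTable class and its layout are assumptions for illustration, not Vim's data structures. For Hebrew, which has no case, the three strings are identical.

```python
# The Hebrew letters occupy bytes 0xE0-0xFA in iso-8859-8.
HEBREW = bytes(range(0xE0, 0xFB)).decode("iso8859_8")


class CaseTable:
    """Word-character and case data taken from FOL/LOW/UPP lines."""

    def __init__(self, fol, low, upp):
        # Assumed here: the three strings list the same characters in the
        # same order, so they must have equal length.
        if not (len(fol) == len(low) == len(upp)):
            raise ValueError("FOL, LOW and UPP must have the same length")
        self.word_chars = set(fol) | set(low) | set(upp)
        # Map every listed lower/upper character to its folded form.
        self.fold_map = {}
        for f, l, u in zip(fol, low, upp):
            self.fold_map[l] = f
            self.fold_map[u] = f

    def is_word_char(self, ch):
        # ASCII letters are handled directly; the table covers the rest.
        return (ch.isascii() and ch.isalpha()) or ch in self.word_chars

    def fold(self, ch):
        # Unlisted non-ASCII characters are returned unchanged.
        return self.fold_map.get(ch, ch.lower() if ch.isascii() else ch)


table = CaseTable(HEBREW, HEBREW, HEBREW)
# First Hebrew letter versus the multiplication sign (also in iso-8859-8).
print(table.is_word_char(HEBREW[0]), table.is_word_char("\u00d7"))
```

No locale is consulted anywhere, so the result is the same on a system without a Hebrew locale.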

> Anyway, in Hebrew there is no case at all, so I thought it can be
> omitted. I saw that I had words marked as having wrong capitalisation,
> so I guess that's not true. I'll just add three identical lines.

That should be fine.  They will then only be used to recognize word
characters.

> BTW, the anchor *hl-SpellCap* is missing under *highlight-groups*.

I'll add it.

--
hundred-and-one symptoms of being an internet addict:
247. You use www.switchboard.com instead of dialing 411 and 555-12-12
     for directory assistance.
