Re: Very bad smell in MSVCRT (Was: Re: Details of the crash caused by "language messages zh_CN.UTF-8")

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Very bad smell in MSVCRT (Was: Re: Details of the crash caused by "language messages zh_CN.UTF-8")

adah
Bram Moolenaar wrote:
 

> Yongwei wrote:
>
>> Bram Moolenaar wrote on 2005-05-24 19:21 +0800:
>>
>>>> So setlocale(LC_CTYPE, "C") is safe?  Glad to know that.
>>>
>>> I didn't say it was safe, just that we won't have problems with
>>> those functions.  It's hard to know for sure it's safe for all
>>> functions.  I could introduce this and wait for things to go
>>> wrong...
>>>
>>> Actually, I would expect X libraries to break when the locale isn't
>>> set properly.  Perhaps we should do it for Win32 only.
>>
>> I am mentioning this again, because I applied the following patch in
>> May and have not encountered any problems since then (only _WIN32 is
>> tested).  Maybe it can be accepted now?
>>
>>
========================================================================

>> --- main.c.orig Mon Sep 26 13:49:52 2005
>> +++ main.c      Mon Sep 26 14:04:18 2005
>> @@ -218,6 +218,9 @@
>>       * work until set_init_1() has been called!
>>       */
>>      setlocale(LC_ALL, "");
>> +#if defined(_WIN32) || defined(_WIN64)
>> +    setlocale(LC_CTYPE, "C");
>> +#endif
>>
>>  # ifdef FEAT_GETTEXT
>>      {
>>
========================================================================
>
> OK, let's include it now and await any problem reports.
>
> For completeness this also has to be added, for the ":language"
> command:
>
> [snipped]

Hi, Bram, I did not see an official patch appear regarding this issue,
so I am asking here again.  Do you think it good enough now?  Can it
enter 6.4?

I have currently the patch applied as attached --- in case you forgot
where you put your old and unapplied patches :-).  No problems are found
and everything is OK.

Best regards,

Yongwei


win32_locale_printf.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Very bad smell in MSVCRT (Was: Re: Details of the crash caused by "language messages zh_CN.UTF-8")

Bram Moolenaar

Yongwei wrote:

> >> I am mentioning this again, because I applied the following patch in
> >> May and have not encountered any problems since then (only _WIN32 is
> >> tested).  Maybe it can be accepted now?
> >>
> >>
> ========================================================================
> >> --- main.c.orig Mon Sep 26 13:49:52 2005
> >> +++ main.c      Mon Sep 26 14:04:18 2005
> >> @@ -218,6 +218,9 @@
> >>       * work until set_init_1() has been called!
> >>       */
> >>      setlocale(LC_ALL, "");
> >> +#if defined(_WIN32) || defined(_WIN64)
> >> +    setlocale(LC_CTYPE, "C");
> >> +#endif
> >>
> >>  # ifdef FEAT_GETTEXT
> >>      {
> >>
> ========================================================================
> >
> > OK, let's include it now and await any problem reports.
> >
> > For completeness this also has to be added, for the ":language"
> > command:
> >
> > [snipped]
>
> Hi, Bram, I did not see an official patch appear regarding this issue,
> so I am asking here again.  Do you think it good enough now?  Can it
> enter 6.4?
>
> I have currently the patch applied as attached --- in case you forgot
> where you put your old and unapplied patches :-).  No problems are found
> and everything is OK.

This is a change in Vim 7, it has been included in the latest
snapshots.

I don't want to include this in Vim 6.4, the risk that it breaks
something is too big.

--
BEDEVERE: And that, my lord, is how we know the Earth to be banana-shaped.
                 "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

 /// Bram Moolenaar -- [hidden email] -- http://www.Moolenaar.net   \\\
///        Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\              Project leader for A-A-P -- http://www.A-A-P.org        ///
 \\\     Buy LOTR 3 and help AIDS victims -- http://ICCF.nl/lotr.html   ///
Reply | Threaded
Open this post in threaded view
|

Re: Very bad smell in MSVCRT (Was: Re: Details of the crash caused by "language messages zh_CN.UTF-8")

adah
In reply to this post by adah
Bram Moolenaar wrote:

>> Hi, Bram, I did not see an official patch appear regarding this
>> issue, so I am asking here again.  Do you think it good enough now?
>> Can it enter 6.4?
>>
>> I have currently the patch applied as attached --- in case you forgot
>> where you put your old and unapplied patches :-).  No problems are
>> found and everything is OK.
>
> This is a change in Vim 7, it has been included in the latest
> snapshots.
>
> I don't want to include this in Vim 6.4, the risk that it breaks
> something is too big.

Highly as I respect your opinions, I want to try again to persuade you
on this issue:

1) The risk is small.  It mainly affects the is* functions and the mb*
functions.  You have already taken care that mb* functions are not used
on Win32, I suppose, and do you think Vim will be affected by isalpha
not being true on e-acute?  BTW, I have used my locally patched Vim
successfully on Chinese Windows 2000 and English Windows XP: No problems
are seen.

2) The benefit is high.  Without this patch, Chinese/Japanese/Korean
Windows users cannot reliably use `encoding=utf-8' and the localized
interface at the same time at all.

If it still cannot be formally accepted, maybe I'll have to post my
patched binary on my personal site to encourage Chinese Vimmers to use
UTF-8 ;-).

Best regards,

Yongwei
Reply | Threaded
Open this post in threaded view
|

Re: Very bad smell in MSVCRT (Was: Re: Details of the crash caused by "language messages zh_CN.UTF-8")

adah
In reply to this post by adah
Bram Moolenaar wrote:
> This is a change in Vim 7, it has been included in the latest
> snapshots.
>
> I don't want to include this in Vim 6.4, the risk that it breaks
> something is too big.

If you are still doubting about the risk (yes, the doubt is reasonable),
may I suggest some mechanisms that is safer?  The pseudo-code is as
follows:

check_locale()
{
  if (encoding == 'utf-8')
    setlocale(LC_CTYPE, "C");
}

set_lang(a:lang)
{
  ...
  setlocale(LC_ALL, ...);
#ifdef _WIN32
  check_locale();
#endif
}

I.e., setting LC_CTYPE=C only when encoding == UTF-8 (or when encoding
!= <ANSI code page>?).  That definitely will not break things, but
instead prevent things from breaking.  How do you think about it?

Best regards,

Yongwei
Reply | Threaded
Open this post in threaded view
|

Re: Very bad smell in MSVCRT (Was: Re: Details of the crash caused by "language messages zh_CN.UTF-8")

adah
In reply to this post by adah
> The pseudo-code is as follows:
>
> check_locale()
> {
>   if (encoding == 'utf-8')
>     setlocale(LC_CTYPE, "C");
> }
>
> set_lang(a:lang)
> {
>   ...
>   setlocale(LC_ALL, ...);
> #ifdef _WIN32
>   check_locale();
> #endif
> }

Missed one thing:

set_encoding(a:encoding)
{
  ...
  encoding = a:encoding;
#ifdef _WIN32
  check_locale();
#endif
}

Best regards,

Yongwei
Reply | Threaded
Open this post in threaded view
|

Re: Very bad smell in MSVCRT (Was: Re: Details of the crash caused by "language messages zh_CN.UTF-8")

Bram Moolenaar
In reply to this post by adah

Yongwei wrote:

> Bram Moolenaar wrote:
> > This is a change in Vim 7, it has been included in the latest
> > snapshots.
> >
> > I don't want to include this in Vim 6.4, the risk that it breaks
> > something is too big.
>
> If you are still doubting about the risk (yes, the doubt is reasonable),
> may I suggest some mechanisms that is safer?  The pseudo-code is as
> follows:
>
> check_locale()
> {
>   if (encoding == 'utf-8')
>     setlocale(LC_CTYPE, "C");
> }
>
> set_lang(a:lang)
> {
>   ...
>   setlocale(LC_ALL, ...);
> #ifdef _WIN32
>   check_locale();
> #endif
> }
>
> I.e., setting LC_CTYPE=C only when encoding == UTF-8 (or when encoding
> != <ANSI code page>?).  That definitely will not break things, but
> instead prevent things from breaking.  How do you think about it?

If there are any problems with calling setlocale() then this code would
restrict it to a smaller group of users.  But the problems would still
be there.

The problem is that setlocale() is a library function, we don't know how
it's implemented exactly, and it changes global things.  Therefore it is
impossible to predict what the effect is.  I don't want to change
anything like that just before a release of a stable version.

--
How To Keep A Healthy Level Of Insanity:
2. Page yourself over the intercom. Don't disguise your voice.

 /// Bram Moolenaar -- [hidden email] -- http://www.Moolenaar.net   \\\
///        Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\              Project leader for A-A-P -- http://www.A-A-P.org        ///
 \\\     Buy LOTR 3 and help AIDS victims -- http://ICCF.nl/lotr.html   ///
Reply | Threaded
Open this post in threaded view
|

Re: Very bad smell in MSVCRT (Was: Re: Details of the crash caused by "language messages zh_CN.UTF-8")

adah
In reply to this post by adah
Bram Moolenaar wrote:

>> check_locale()
>> {
>>   if (encoding == 'utf-8')
>>     setlocale(LC_CTYPE, "C");
>> }
>> ...
>> I.e., setting LC_CTYPE=C only when encoding == UTF-8 (or when
>> encoding != <ANSI code page>?).  That definitely will not break
>> things, but instead prevent things from breaking.  How do you think
>> about it?
>
> If there are any problems with calling setlocale() then this code
> would restrict it to a smaller group of users.  But the problems would
> still be there.
>
> The problem is that setlocale() is a library function, we don't know
> how it's implemented exactly, and it changes global things.  Therefore
> it is impossible to predict what the effect is.  I don't want to
> change anything like that just before a release of a stable version.

Agreed.  I was not intending it for 6.4.000 either.  Just want to show
some safer way: My test shows that MSVCRT make no difference between a
UTF-8 locale and a C locale character-wise; even if there is some
difference, we need only change check_locale to set locale to something
like ".65001", and it can be extended to other platforms if necessary
(say, matchstr(CurrentLocale, "^[^.]\+") . ".UTF-8" on Linux).  So it is
really better for the future.

Best regards,

Yongwei