Printing with utf-8 characters on Windows

Đức Minh Thái
Hello,
I cannot get utf-8 characters printed correctly. For example:
 
bột
 
becomes
 
bá»™t 
My printing options are:
 
set printfont=LMMono10:h10 " This is the LMMono from LaTeX Latin Modern
set printoptions=number:y
set printencoding=ucs-2le bomb
 
 
Please help. Thank you!

--
You received this message from the "vim_use" maillist.
For more information, visit http://www.vim.org/maillist.php

Re: Printing with utf-8 characters on Windows

Chris Jones-44
On Sun, Dec 20, 2009 at 11:36:27AM EST, Đức Minh Thái wrote:
> Hello,
> I cannot get utf-8 characters printed correctly. For example:
>
> bột
>
> becomes
>
> bá»™t

U+1ED9   ộ   LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW

See:

:help ga

In utf-8, this character is encoded by the following sequence of three
bytes:

0xe1, 0xbb, 0x99

See:

:help g8

This is what a utf-8 encoded file with the three characters 'bột'
actually contains:

00000000  62 e1 bb 99 74 0a                                 |b...t.|
00000006

0x62             b   LATIN SMALL LETTER B
0xe1,0xbb,0x99   ộ   LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW  
0x74             t   LATIN SMALL LETTER T  

The final 0x0a is a line feed control character.
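For anyone without Vim at hand, the facts that ga and g8 report can be cross-checked with a short Python sketch. Python and the helper name describe are assumptions made for illustration; neither is part of the original exchange.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name, UTF-8 bytes as hex) for each character."""
    rows = []
    for ch in text:
        utf8_hex = " ".join(f"{byte:02x}" for byte in ch.encode("utf-8"))
        rows.append((f"U+{ord(ch):04X}", unicodedata.name(ch), utf8_hex))
    return rows

# "b\u1ed9t" is the sample word from this thread, written with an escape
# so the script itself stays pure ASCII.
for code_point, name, utf8_hex in describe("b\u1ed9t"):
    print(code_point, name, utf8_hex, sep="  ")
```

The middle row should match what ga and g8 show for the character under the cursor.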

In Microsoft Windows' cp1252:

0xe1    á
0xbb    »
0x99    ™

  http://en.wikipedia.org/wiki/Windows-1252

You do not give much detail as to where you see what, but I am probably
not far off the mark assuming that 'bột' is what you see when editing a
utf-8 encoded file in vim, and that 'bá»™t' is what you see on your
printout.

Being unfamiliar with Microsoft Windows, I'm speculating a bit, but it
does look like your printing software is processing the file as if it
were cp1252 rather than utf-8.
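The suspected misreading is easy to reproduce outside of Vim. The following is a minimal Python sketch (the function name misread is hypothetical) that encodes the word as utf-8 and then decodes the same bytes as cp1252, which is what the printout suggests is happening somewhere in the print path.

```python
def misread(text, wrong_codec="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)

garbled = misread("b\u1ed9t")
# ascii() keeps the output printable on any terminal.
print(ascii(garbled))

# The damage is reversible as long as every byte survived the round trip.
restored = garbled.encode("cp1252").decode("utf-8")
print(restored == "b\u1ed9t")
```

The garbled string consists of b, a-acute, a right guillemet, the trademark sign and t, i.e. the same five characters as on the faulty printout.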

> My printing options are:
>
> set printfont=LMMono10:h10 " This is the LMMono from LaTeX Latin Modern
> set printoptions=number:y
> set printencoding=ucs-2le bomb

If your file is utf-8 encoded, why do you tell vim that it is ucs2..?

:h penc-option

In particular, this help file states that:

Code page 1252 print character encoding is used by default on Windows
and OS/2 platforms.

> Please help. Thank you!

I am not familiar with Microsoft Windows, so I don't really have an
answer to your question but you could try:

:set penc=

or..

:set penc=utf-8

and see if the 'bột' string prints correctly.

My understanding is that, when compiled with the appropriate +features, Vim
should be able to process utf-8 encoded files transparently on any platform,
but you may also want to ask Vim to convert the file.

Take a look at:

:h ++enc
:h ++ff

If that doesn't help, please attach a small sample file so that someone
on the list can come up with something more conclusive.

CJ




Re: Printing with utf-8 characters on Windows

Đức Minh Thái
Hello Chris,

I've tried to print in Linux (I use Linux Mint version 8, the printer is the Print To PDF) and the result is the same as in Windows.

I think this is a bug.

On Wed, Dec 23, 2009 at 3:31 AM, Chris Jones <[hidden email]> wrote:
[..]



--
Minh Duc Thai - StudentID: 0711040
Faculty of Mathematics and Computer Science
University of Science
Vietnam National University - Ho Chi Minh City
227 Nguyen Van Cu street, District 5, Ho Chi Minh City, Vietnam


Re: Printing with utf-8 characters on Windows

Đức Minh Thái

As for ucs2: I tried it because utf8 has failed so many times. :(
On Sat, Jan 2, 2010 at 6:16 PM, Minh Duc Thai <[hidden email]> wrote:
[..]


Re: Printing with utf-8 characters on Windows

Chris Jones-44
On Sat, Jan 02, 2010 at 06:17:35AM EST, Minh Duc Thai wrote:

> About the ucs2, because the utf8 has failed so many times. :(
> On Sat, Jan 2, 2010 at 6:16 PM, Minh Duc Thai <[hidden email]> wrote:
>
> > Hello Chris,

> > I've tried to print in Linux (I use Linux Mint version 8, the
> > printer is the Print To PDF) and the result is the same as in
> > Windows.

> > I think this is a bug.

Possibly, but this doesn't tell us much about where the bug might be:
Vim, the printing software, etc.

Anyway, the fact that you are able to recreate it on a Linux system is good
news, since I don't have access to any version of Microsoft Windows.

I believe I suggested attaching a _short_ sample file, so I can take a
look at the file, possibly give printing a shot, and see if I can
recreate the problem.

Also, please trim your posts to something manageable. There is no sense
in repeating my initial reply - about 100 lines - to add just a couple
of lines of your own. Just keep whatever is relevant of the post you are
replying to.

And lastly, try to avoid posting an html copy of your message, unless
you have good cause to do so.

| http://www.vim.org/maillist.php

So please post back with a sample file attached to your message (not
something copied and pasted into your message) and see if I or someone
else can come up with some idea as to what's going on.

CJ


Re: Printing with utf-8 characters on Windows

Đức Minh Thái
Hello Chris,

I've attached example files and my .vimrc configuration file. Hope you or someone can help.

Thank you!


Attachments: 0.txt (18 bytes), 0.pdf (8K), .vimrc (4K)

Re: Printing with utf-8 characters on Windows

Chris Jones-44
On Mon, Jan 04, 2010 at 08:57:36AM EST, Minh Duc Thai wrote:
> Hello Chris,
>
> I've attached example files and my .vimrc configuration file. Hope you or
> someone can help.
>
> Thank you!

> Bột bột

It looks like you sent this to me directly instead of the list.

CJ




Re: Printing with utf-8 characters on Windows

Yongwei Wu
In reply to this post by Đức Minh Thái
2010/1/4 Minh Duc Thai <[hidden email]>:
> Hello Chris,
> I've attached example files and my .vimrc configuration file. Hope you or
> someone can help.
> Thank you!

I tested using your example, and can confirm that Print does not work
well on Windows when enc=utf-8.  I tried enc=latin1 and enc=cp936, and
in both cases Print is successful (as long as the characters can be
displayed in that encoding).

I tested using the Adobe PDF driver.

--
Wu Yongwei
URL: http://wyw.dcweb.cn/


Re: Printing with utf-8 characters on Windows

Chris Jones-44
In reply to this post by Chris Jones-44
On Tue, Jan 05, 2010 at 08:18:12AM EST, Chris Jones wrote:
> On Mon, Jan 04, 2010 at 08:57:36AM EST, Minh Duc Thai wrote:

> > Hello Chris,

> > I've attached example files and my .vimrc configuration file. Hope
> > you or someone can help.

> > Thank you!
>
> > Bột bột
>
> It looks like you sent this to me directly instead of the list.
>
> CJ

Hmm well.. your .vimrc apparently did not make it to the list.

In any case, it looks like I am able to recreate your problem here, or at
least get similar results.

I tried changing the printfont to GNU/unifont, which I know has a glyph
for U+1ED9 -  :set printfont=unifont - and I was still getting the same
results. I tried other fonts and it looked like my 'printfont' settings
were silently ignored.

Then I saw this, under :help postscript-printing:

| There are currently a number of limitations with PostScript printing:
|
| - 'printfont' - The font name is ignored (the Courier family is always
|    used - it should be available on all PostScript printers) but the
|    font size is used.

I'm not sure how I could determine what font might correspond to 'the
Courier family' but it looks like it's defaulting to a font that has
no support for anything beyond U+0100.

Maybe someone could shed some light on this?

Anyway, I was able to print your sample by invoking the paps converter:

| :%w ! paps --font="arial 8" --paper letter | lpr     " proportional

| :%w ! paps --font="unifont 10" --paper letter | lpr  " monospace

You may need to install paps on your Linux system since it's not part of
Vim, and then you could give this a try, possibly dropping the '--paper
letter' switch if you want 'paps' to default to A4.

Let me know if this helps,

CJ


Re: Printing with utf-8 characters on Windows

Chris Jones-44
In reply to this post by Yongwei Wu
On Tue, Jan 05, 2010 at 08:22:48AM EST, Yongwei Wu wrote:
> 2010/1/4 Minh Duc Thai <[hidden email]>:

> > Hello Chris,

> > I've attached example files and my .vimrc configuration file. Hope you or
> > someone can help.
> > Thank you!
>
> I tested using your example, and can confirm that Print does not work
> well on Windows when enc=utf-8.  I tried enc=latin1 and enc=cp936, and
> in both cases Print is successful (as long as the characters can be
> displayed in that encoding).
>
> I tested using the Adobe PDF driver.

On debian stable, I also tested printing utf-8 encoded files containing
samples of CJK, Devanagari, and a couple other Eastern scripts and I was
unable to get :hardcopy to print their contents.

Since utf-8 is the default encoding on Debian Lenny, I find it hard to
believe that the Vim to PostScript implementation would not function out
of the box with utf-8 encoded files, and even harder to believe that I was
unable to find anyone reporting this issue while searching online, apart
from a few reports that involved Vim 7.0 or older and date back
7-8 years.

This leads me to think that there's more to it than the speculations in my
earlier post today.

Note that I also tried to implement the following in my .vimrc, without
success:

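| set printexpr=PrintFile(v:fname_in)
| function! PrintFile(fname)
|   " a:fname is the temp file that :hardcopy has already generated;
|   " feed it to paps on stdin and pipe the result to lpr
|   call system('paps --font="unifont 8" --paper letter < ' . shellescape(a:fname) . ' | lpr')
|   call delete(a:fname)
|   return v:shell_error
| endfunction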

The characters from the 'exotic' scripts were replaced by inverted
question marks or blanks, and _as far as I can tell_ it looked as if the
same ASCII or latin1 font was used no matter what font I passed to the
paps converter.

Can anyone shed some light on this matter?

Thanks,

CJ


Re: Printing with utf-8 characters on Windows

Benjamin R. Haskell-8
On Tue, 5 Jan 2010, Chris Jones wrote:

> On debian stable, I also tested printing utf-8 encoded files containing
> samples of CJK, Devanagari, and a couple other Eastern scripts and I was
> unable to get :hardcopy to print their contents.
>
> Since utf-8 is the default encoding on debian Lenny, I find it hard to
> believe that the Vim to Postscript implementation would not function out
> of the box with utf-8 encoded files, and even less plausible that I was
> unable to find anyone reporting this issue while searching online, apart
> from a few reports where Vim 7.0 or older was involved, and dating back
> 7-8 years ago.

Printing UTF-8 text is hard, since PostScript doesn't support it
natively.  I was pretty surprised that 'enscript' never made it into the
Unicode age.  'paps' is the only thing I found that seems to do a
reasonable job.  Though, just now (while trying to find the page I found
yesterday) I found a few entries in a UTF-8 and Unicode FAQ under
'Printing'[1].

[1] http://www.cl.cam.ac.uk/~mgk25/unicode.html

CUPS supposedly handles UTF-8 via the texttops filter, but I was unable
to get anything reasonable (even fiddling with 'CHARSET=' and '-o
document-format=text/plain;charset=' options).  I eventually gave up and
replaced /usr/libexec/cups/filter/texttops with the following script:

#!/bin/sh
paps < "$6" | title="$3" perl -lpwe 's/stdin/$ENV{title}/ if 2==$.'



> Leads me to think that there's more to it than the speculations in my
> earlier post today.
>
> Note, that I tried to implement the following in my .vimrc, also without
> success:
>
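> | set printexpr=PrintFile(v:fname_in)
> | function! PrintFile(fname)
> |   " a:fname is the temp file that :hardcopy has already generated;
> |   " feed it to paps on stdin and pipe the result to lpr
> |   call system('paps --font="unifont 8" --paper letter < ' . shellescape(a:fname) . ' | lpr')
> |   call delete(a:fname)
> |   return v:shell_error
> | endfunction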
>
> The characters from the 'exotic' scripts were replaced by inverted
> question marks or blanks, and _as far as I can tell_ it looked as if the
> same ASCII or latin1 font was used no matter what font I passed to the
> paps converter.
>
> Can anyone shed some light on this matter?

From the docs, printexpr only affects how the generated PS temp file
gets printed.  So, if Vim's already subbing out the chars in the PS,
it's not going to matter what happens next.

Testing with :ha > test.ps shows that no matter what encoding or
fileencoding or printencoding or printmbencoding I tried, it still shows
up as latin1 in the resulting PostScript.  Which is weird considering
the various charset handling that appears to be done in src/hardcopy.c.

The only way I was able to get decent printouts was by just shelling out
to paps:

:!paps < % > test.ps

Best,
Ben


Re: Printing with utf-8 characters on Windows

Chris Jones-44
On Tue, Jan 05, 2010 at 11:39:44AM EST, Benjamin R. Haskell wrote:
> On Tue, 5 Jan 2010, Chris Jones wrote:

[..]

> > Since utf-8 is the default encoding on debian Lenny, I find it hard
> > to believe that the Vim to Postscript implementation would not
> > function out of the box with utf-8 encoded files,  

[..]

> Printing UTF-8 text is hard, since PostScript doesn't support it
> natively.  

Actually, since this is rather messy and I'm probably not going to take
another look at it for some time, I decided to write my own personal
mini-howto on the subject, and since I was unable to quickly think of a
short elegant preamble, I wrote: "Printing UTF8-encoded files is tricky
at best.." ;-)

> I was pretty surprised that 'enscript' never made it into the
> Unicode age.  'paps' is the only thing I found that seems to do a
> reasonable job.  Though, just now (while trying to find the page I found
> yesterday) I found a few entries in a UTF-8 and Unicode FAQ under
> 'Printing'[1].
>
> [1] http://www.cl.cam.ac.uk/~mgk25/unicode.html

Saw that too.. Nothing helpful.

> CUPS supposedly handles UTF-8 via the texttops filter, but I was unable
> to get anything reasonable (even fiddling with 'CHARSET=' and '-o
> document-format=text/plain;charset=' options).  I eventually gave up and
> replaced /usr/libexec/cups/filter/texttops with the following script:

Went down that road, only to reach the same dead end.

> #!/bin/sh
> paps < "$6" | title="$3" perl -lpwe 's/stdin/$ENV{title}/ if 2==$.'

[..]

> > Can anyone shed some light on this matter?

> From the docs, printexpr only affects how the generated PS temp file
> gets printed.  So, if Vim's already subbing out the chars in the PS,
> it's not going to matter what happens next.

Pretty much what I speculated.

> Testing with :ha > test.ps shows that no matter what encoding or
> fileencoding or printencoding or printmbencoding I tried, it still
> shows up as latin1 in the resulting PostScript.  Which is weird
> considering the various charset handling that appears to be done in
> src/hardcopy.c.

I was expecting to find a bug report somewhere (or would that be a Vim
enhancement request, i.e. lifting this limitation?), but saw nothing.

> The only way I was able to get decent printouts was by just shelling
> out to paps:

> :!paps < % > test.ps

Looks like I was on the right track re: the OP's problem then, and one
variation or other involving paps should fix it for him.

Thank you for your comments,

CJ


Re: Printing with utf-8 characters on Windows

BPJ
Chris Jones wrote:

> On Tue, Jan 05, 2010 at 11:39:44AM EST, Benjamin R. Haskell wrote:
>> The only way I was able to get decent printouts was by just shelling
>> out to paps:
>
>> :!paps < % > test.ps
>
> Looks like I was on the right track re: the OP's problem then, and one
> variation or other involving paps should fix it for him.
>
> Thank you for your comments,

Could someone please put this on vim.wikia.com?
It's really useful.

/BP




Re: Printing with utf-8 characters on Windows

Yongwei Wu
In reply to this post by Chris Jones-44
2010/1/6 Chris Jones <[hidden email]>:
>> The only way I was able to get decent printouts was by just shelling
>> out to paps:
>
>> :!paps < % > test.ps

Except that the message subject is "Printing with utf-8 characters on
WINDOWS"....

The real universal solution for non-ASCII characters is NOT to print from
Vim. Convert the document to HTML with ":TOhtml", and then print from
your browser.

--
Wu Yongwei
URL: http://wyw.dcweb.cn/

--
You received this message from the "vim_use" maillist.
For more information, visit http://www.vim.org/maillist.php
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Printing with utf-8 characters on Windows

Đức Minh Thái

> Except that the message subject is "Printing with utf-8 characters on
> WINDOWS"....

Yes, the message subject is "... on Windows" since my printer driver works better there than its cousin on Linux.

I'm thinking about a feature that generates PDF directly from Vim. Is it a reasonable feature? Should it be implemented?


Re: Printing with utf-8 characters on Windows

Mike Williams
In reply to this post by Chris Jones-44
Hi,

I wrote the original PS driver for VIM, several years ago now.  This is
somewhat OT from the OP as it is not Windows related.  If you are not
interested, stop reading now.

The PS driver relies on fonts being present in the printer.  The only
ones guaranteed to be there are the base 35 western fonts (Courier,
Times, etc).  However, far east printers will have a few multi-byte
fonts to support CJK printing, for which the printmbcharset et al
options and handling of multi-byte encodings was added.  It is possible
to install additional multi-byte fonts on the printer which could also
be used.

Technically PostScript is text encoding agnostic - it just deals with
sequences of byte values.  The selected font defines how to interpret
the byte sequence, as single bytes or a multi-byte encoding of some kind.

A lot depends on the characters being used.  If you are using UTF-8
encoding for text that exists in a single ISO-8859 character set then
you can just set printencoding and VIM should translate the UTF-8
encoded text to single bytes for printing.  If you are using characters
from multiple ISO-8859 character sets then things start to get complicated.
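To get a feel for which case a given text falls into, one can test whether all of its characters fit into one single-byte character set. The sketch below is only an illustration under stated assumptions: it uses Python's codec names (which are not necessarily valid 'printencoding' values), and the helper single_byte_candidates is hypothetical.

```python
# Python's names for ISO-8859-1 .. ISO-8859-16 (there is no part 12), plus cp1252.
SINGLE_BYTE_CODECS = (
    ["latin_1"]
    + [f"iso8859_{n}" for n in (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16)]
    + ["cp1252"]
)

def single_byte_candidates(text):
    """Return the codecs from SINGLE_BYTE_CODECS that can encode all of text."""
    usable = []
    for codec in SINGLE_BYTE_CODECS:
        try:
            text.encode(codec)
        except UnicodeEncodeError:
            continue
        usable.append(codec)
    return usable

print(single_byte_candidates("caf\u00e9"))      # e-acute: several sets qualify
print(single_byte_candidates("\u00e9\u0416"))   # e-acute plus Cyrillic: no single set
print(single_byte_candidates("b\u1ed9t"))       # the word from this thread
```

An empty list for the last two inputs corresponds to the "things start to get complicated" case: no single character set from this list covers the text.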

If you are just using ISO-8859 characters then it would be possible (but
not currently implemented) to support many such character sets when
printing with a single font.

If you are using true multiple-byte characters (i.e. ones not present in
any of the ISO-8859 or cp character sets) then you will need to use a
multi-byte font and the big issue is with handling them - their
discovery on the host system, metrics calculation for text layout,
selection of a sub-set of the contents (multi-byte fonts tend to be
large - do you want to generate a 12MB PS file to print <1K of text?),
and embedding in the generated PS.

Not a trivial problem to solve at the time.  When discussed with Bram it
was decided this was not wanted.  Dunno if time has changed the argument
at all.

TTFN

On 05/01/2010 17:47, Chris Jones wrote:

> [..]

Mike
--
yip yip yip yip yap yap yip *BANG* - NO TERRIER


Re: Printing with utf-8 characters on Windows

Chris Jones-44
On Wed, Jan 06, 2010 at 06:20:42AM EST, Mike Williams wrote:
> Hi,
>
> I wrote the original PS driver for VIM, several years ago now.  This is  
> somewhat OT from the OP as it is not Windows related.  

I think he wrote somewhere that he prints from Windows because his
printer is better supported.

> If you are not  interested, stop reading now.

I'm not sure who wouldn't be. As far as I'm concerned, you are salvaging
the thread from guesswork and speculations, thank goodness for that.

> The PS driver relies on fonts being present in the printer.  The only
> ones guaranteed to be there are the base 35 western fonts (Courier,
> Times, etc).  However, far east printers will have a few multi-byte
> fonts to support CJK printing, for which the printmbcharset et al
> options and handling of multi-byte encodings was added.  It is
> possible to install additional multi-byte fonts on the printer which
> could also be used

I have an old HP LaserJet 2100 that's still running on the original
cartridge. Do you mean that if I wanted to be able to use :hardcopy to
successfully print any character from the Unicode BMP, I would be able
to do so after installing a universal font such as GNU/Unifont on the
printer?

> Technically PostScript is text encoding agnostic - it just deals with
> sequences of byte values.  The selected font defines how to interpret
> the byte sequence, as single bytes or a multi-byte encoding of some
> kind.

So, in a UTF-8 context and with multi-byte characters, I'm still unclear
as to why I can use paps to create a .ps file that will print correctly
on my printer, but cannot use Vim's :hardcopy command to do the same
thing.

Why can't the :hardcopy command perform the same magic?

> A lot depends on the characters being used.  If you are using UTF-8
> encoding for text that exists in a single ISO-8859 character set then
> you can just set printencoding and VIM should translate the UTF-8
> encoded text to single bytes for printing.  If you are using
> characters from multiple ISO-8859 character sets then things start to
> get complicated.
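
As a rough way to tell which of these cases a given text falls into, the sketch below tries every ISO-8859 part that Python ships a codec for and reports the ones that can hold the whole text. This is an illustration only, not what Vim does internally; the function name iso8859_candidates and the sample strings are assumptions made for the example.

```python
# ISO-8859 parts with a codec in the Python standard library
# (there is no part 12).
ISO_8859_PARTS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16]


def iso8859_candidates(text: str) -> list:
    """Return the ISO-8859 codec names that can encode every character of text."""
    fits = []
    for part in ISO_8859_PARTS:
        name = f"iso8859_{part}"
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue                    # at least one character is missing here
        fits.append(name)
    return fits


print(iso8859_candidates("caf\u00e9"))  # includes 'iso8859_1'
print(iso8859_candidates("b\u1ed9t"))   # [] : U+1ED9 is in none of them
```

If the result is non-empty, one of the listed names is a plausible value for printencoding. An empty result, as for 'bột', means the text is in the "true multi-byte" case described further down.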

> If you are just using ISO-8859 characters then it would be possible
> (but  not currently implemented) to support many such character sets
> when  printing with a single font.

> If you are using true multiple-byte characters (i.e. ones not present
> in  any of the ISO-8859 or cp character sets) then you will need to
> use a  multi-byte font and the big issue is with handling them - their
> discovery on the host system, metrics calculation for text layout,
> selection of a sub-set of the contents (multi-byte fonts tend to be
> large - do you want to generate a 12MB PS file to print <1K of text?),
> and embedding in the generated PS.
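
To make the sub-setting point concrete, here is a small sketch that counts how many distinct characters a piece of text actually needs from a font. It is a back-of-the-envelope illustration, assuming one glyph per non-space character; real font sub-setting (and whatever paps or a PS driver does) is more involved. The name glyphs_needed is hypothetical.

```python
def glyphs_needed(text: str) -> set:
    """Distinct non-whitespace characters in text (one glyph each, as a simplification)."""
    return {ch for ch in text if not ch.isspace()}


line = "B\u1ed9t b\u1ed9t"              # the one-line buffer 'Bột bột'
needed = glyphs_needed(line)

print(len(needed))                      # 4
print(sorted(f"U+{ord(ch):04X}" for ch in needed))
```

A short text touches only a handful of glyphs, so a sub-set font can stay small even when the full font file runs to many megabytes.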

Yes, GNU/unifont, at least the file on my HDD, is 16MB and it would
hardly make sense to download it to the printer with each and every print
job. But that would not be necessary if the font resided on the printer.

In any event, the size of the .ps file created by paps from a one-line
Vim buffer containing 'Bột bột' and nothing else is only 7.2. I looked
at a 16K UTF-8 encoded text file containing multi-byte characters and
the resulting .ps file that paps created was 329K.

So, I'm definitely missing something [some things] :-)

> Not a trivial problem to solve at the time.  When discussed with Bram
> it  was decided this was not wanted.  Dunno if time has changed the
> argument  at all.

Maybe these aspects should be clarified under :h postscript-printing
under limitations:multi-byte support.

Sorry if I'm asking the wrong questions, I don't know Postscript and I
have no experience with printers.

> TTFN
>
> Mike
> --
> yip yip yip yip yap yap yip *BANG* - NO TERRIER

That can't have been a *BULL*Terrier, then.. ;-)

CJ


Re: Printing with utf-8 characters on Windows

Valery Kondakoff
In reply to this post by Mike Williams
On 06.01.2010 14:20, Mike Williams wrote:

> Not a trivial problem to solve at the time. When discussed with Bram it
> was decided this was not wanted. Dunno if time has changed the argument
> at all.

I've been complaining about this issue for the last ten years. This is
just unbelievable: such a mighty text editor as gVim just does not allow
Windows international users to print their texts when gVim is set to use
UTF-8 as its internal encoding... :(

Note, please: you are _forced_ to use the UTF-8 as gVim internal
encoding if you want to be able to perform encoding conversions...

I just don't remember any other text editor with such a restriction (not
counting the crippleware ones)...

For many of us printing is as important as saving your edits.
Can you imagine a full-featured text editor in the year 2010 which does
not allow users to save or print their text files? :(

--
Best regards,
  Valery Kondakoff

PGP key:
http://pool.sks-keyservers.net:11371/pks/lookup?op=get&search=0xEEDF8590
np: The Big Pink'2009 (A Brief History Of Love) - Crystal Visions


Re: Printing with utf-8 characters on Windows

bill lam
In reply to this post by Mike Williams
Wed, 06 Jan 2010, Mike Williams wrote:

> If you are using true multiple-byte characters (i.e. ones not present
> in any of the ISO-8859 or cp character sets) then you will need to
> use a multi-byte font and the big issue is with handling them - their
> discovery on the host system, metrics calculation for text layout,
> selection of a sub-set of the contents (multi-byte fonts tend to be
> large - do you want to generate a 12MB PS file to print <1K of
> text?), and embedding in the generated PS.
>
> Not a trivial problem to solve at the time.  When discussed with Bram
> it was decided this was not wanted.  Dunno if time has changed the
> argument at all.

I don't know how to print in PS or GTK. However, from my experience
using the GDI API to print Unicode CJK text on Windows, I don't think
it is all that difficult to print CJK characters.

--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3


Re: Printing with utf-8 characters on Windows

Mike Williams
In reply to this post by Valery Kondakoff
On 07/01/2010 10:04, Valery Kondakoff wrote:

> On 06.01.2010 14:20, Mike Williams wrote:
>
>> Not a trivial problem to solve at the time. When discussed with Bram it
>> was decided this was not wanted. Dunno if time has changed the argument
>> at all.
>
> I've been complaining about this issue for the last ten years. This is
> just unbelievable: such a mighty text editor as gVim just does not allow
> Windows international users to print their texts when gVim is set to use
> UTF-8 as its internal encoding... :(

Indeed.  Windows supports encoding conversion so it should be possible
to do it as part of gvim without having to find a copy of iconv.  It
just hasn't been an issue for any of the Windows VIM developers.  There
is not a lot I can do about that.

> Note, please: you are _forced_ to use the UTF-8 as gVim internal
> encoding if you want to be able to perform encoding conversions...
>
> I just don't remember any other text editor with such a restriction (not
> counting the crippleware ones)...
>
> For many of us printing is as important as saving your edits.
> Can you imagine a full-featured text editor in the year 2010 which does
> not allow users to save or print their text files? :(

I hear you.  New features seem to be all the rage; I doubt this would be
a candidate for GSoC, which would be a nice way to sort this all out.

TTFN

Mike
--
Education is what you get from reading the small print; experience is
what you get from not reading it.

