MacVim file encoding and Quicklook


MacVim file encoding and Quicklook

Nico Weber-3

Hi all,

Here's a little experiment. Take the following text and copy it to the  
clipboard:

<text>
|♜|♞|♝|♛|♚|♝|♞|♜|
|♟|♟|♟|♟|♟|♟|♟|♟|
|♙|♙|♙|♙|♙|♙|♙|♙|
|♙|♙|♙|♙|♙|♙|♙|♙|
|♙|♙|♙|♙|♙|♙|♙|♙|
|♙|♙|♙|♙|♙|♙|♙|♙|
|♙|♙|♙|♙|♙|♙|♙|♙|
|♙|♙|♙|♙|♙|♙|♙|♙|
</text>

Now paste it (1) into TextEdit and save it, and (2) into  
MacVim and save it. (Other text with non-ASCII characters will do, too.)

For the Leopard users out there, take a look at the resulting files  
with Quicklook.

Results: The file written by TextEdit looks ok in QL, the file written  
by MacVim does not. If you compare the two files, you'll see that they  
both use UTF-8 encoding, so that's not what confuses Quicklook. So,  
what's the cause? /Developer/Examples/AppKit/TextEdit/README.rtf hints  
at some extended file attributes that TextEdit stores in Leopard and  
refers to http://developer.apple.com/releasenotes/Cocoa/Foundation.html  
for more information. If you search the latter page  
for "com.apple.TextEncoding", you'll see that `-[NSString  
writeToFile:atomically:encoding:error:]` and friends write the encoding  
to an extended attribute. There's also a description of the format of  
the extended attribute ("this is not some undocumented stuff").

Indeed, if you check the two files with `ls -l`, you'll see that the  
file written by TextEdit has an "@" (that means it has an extended  
attribute). Now, if you do `ls -l@` or `xattr -l filename`, you'll see  
that the TextEdit file has the com.apple.TextEncoding attribute set:

     Macintosh-2:b nico$ xattr -l texteditfile.txt
     com.apple.TextEncoding: UTF-8;134217984

The file written by MacVim does not have this attribute. If you add it  
(`xattr -w com.apple.TextEncoding 'UTF-8;134217984' macvimfile.txt`),  
the file shows up correctly in Quicklook (and in TextEdit too; it  
didn't do that before).
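
The attribute value shown above has two fields separated by a semicolon: an encoding name and a number. The number 134217984 is 0x08000100, which matches Core Foundation's kCFStringEncodingUTF8 constant. The following is only a minimal sketch of splitting such a value; the function name `parse_text_encoding` is made up for illustration, and treating an empty field as "missing" is an assumption rather than something verified against the release notes.

```python
# Attribute name and sample value are taken from the xattr output above.
ATTR_NAME = "com.apple.TextEncoding"


def parse_text_encoding(value):
    """Split 'NAME;NUMBER' into (name, number).

    Assumption: either field may be empty, in which case None is returned
    for that field.
    """
    name, _, number = value.partition(";")
    name = name.strip() or None
    number = int(number) if number.strip() else None
    return name, number


name, number = parse_text_encoding("UTF-8;134217984")
print(name, hex(number))  # prints: UTF-8 0x8000100
```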

Obvious suggestion: MacVim should have an option 'encodingxattr' that  
can have the values 'read' and 'write' (default 'read,write'). If  
'write' is set, the com.apple.TextEncoding xattr is written when  
saving a file. When 'read' is set, the xattr is checked when reading a  
file.
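
To make the 'write' half of this proposal concrete, here is a rough sketch of what a save hook could do from outside Vim: look up the attribute value for the buffer's encoding and run the same `xattr -w` command as above. The lookup table and both function names are hypothetical; only the UTF-8 entry comes from the xattr output in this post, and real code inside Vim would more likely call the xattr API directly than shell out.

```python
import subprocess

ATTR_NAME = "com.apple.TextEncoding"

# Hypothetical table: Vim 'fileencoding' name -> attribute value.
# Only the UTF-8 entry is confirmed by the xattr output quoted in this post.
VALUE_FOR_FENC = {"utf-8": "UTF-8;134217984"}


def xattr_write_command(path, fenc):
    """Return the xattr command line for this encoding, or None if unknown."""
    value = VALUE_FOR_FENC.get(fenc.lower())
    if value is None:
        return None
    return ["xattr", "-w", ATTR_NAME, value, path]


def tag_file(path, fenc):
    """Run the command on a system that has the xattr tool (Mac OS X)."""
    cmd = xattr_write_command(path, fenc)
    if cmd is None:
        return False  # unknown encoding: leave the file untouched
    subprocess.run(cmd, check=True)
    return True


print(xattr_write_command("macvimfile.txt", "utf-8"))
```

Returning None for an unknown encoding keeps the sketch from writing an attribute it cannot justify, which seems safer than guessing a value.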

According to http://arstechnica.com/reviews/os/macosx-10-4.ars/7 , the  
xattr API is already present in Tiger, so the option could be  
supported in both Tiger and Leopard (TextEdit and Quicklook only seem  
to honor the xattr on Leopard, though).

Comments?

Thanks,
Nico
--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_mac" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---


Re: MacVim file encoding and Quicklook

Björn Winckler
2008/5/17 Nico Weber <[hidden email]>:

>
> Hi all,
>
> Here's a little experiment. Take the following text and copy it to the
> clipboard:
>
> <text>
> |♜|♞|♝|♛|♚|♝|♞|♜|
> |♟|♟|♟|♟|♟|♟|♟|♟|
> |♙|♙|♙|♙|♙|♙|♙|♙|
> |♙|♙|♙|♙|♙|♙|♙|♙|
> |♙|♙|♙|♙|♙|♙|♙|♙|
> |♙|♙|♙|♙|♙|♙|♙|♙|
> |♙|♙|♙|♙|♙|♙|♙|♙|
> |♙|♙|♙|♙|♙|♙|♙|♙|
> </text>
>
> Now paste it (1) into TextEdit and save it  and (2) paste it into
> MacVim and save it. (Other text with non-ascii characters will do, too)
>
> For the Leopard users out there, take a look at the resulting files
> with Quicklook.
>
> Results: The file written by TextEdit looks ok in QL, the file written
> by MacVim does not. If you compare the two files, you'll see that they
> both use UTF-8 encoding, so that's not what confuses Quicklook. So,
> what's the cause? /Developer/Examples/AppKit/TextEdit/README.rtf hints
> at some extended file attributes that TextEdit stores in Leopard and
> refers to http://developer.apple.com/releasenotes/Cocoa/
> Foundation.html for more information. If you search the latter page
> for "com.apple.TextEncoding", you'll see that `-[NSString
> writeToFile:atomically:encoding:error]` and friends write the encoding
> to an extended attribute. There's also a description of the format of
> the extended attribute ("this is not some undocumented stuff").
>
>  Indeed, if you check the two files with `ls -l`, you'll see that the
> file written by TextEdit has an "@" (that means it has an extended
> attribute). Now, if you do `ls -l@` or `xattr -l filename`, you'll see
> that the TextEdit file has the com.apple.TextEncoding attribute set:
>
>     Macintosh-2:b nico$ xattr -l texteditfile.txt
>     com.apple.TextEncoding: UTF-8;134217984
>
> The file written by MacVim does not have this attribute. If you add it
> (`xattr -w com.apple.TextEncoding 'UTF-8;134217984' macvimfile.txt`),
> the file shows up correctly in Quicklook (and in TextEdit too; it
> didn't do that before).
>
> Obvious suggestion: MacVim should have an option 'encodingxattr` that
> can have the values 'read' and 'write' (default 'read,write'). If
> 'write' is set, the com.apple.TextEncoding xattr is written when
> saving a file. When 'read' is set, the xattr is checked when reading a
> file.
>
> According to http://arstechnica.com/reviews/os/macosx-10-4.ars/7 , the
> xattr api is already present in Tiger, so the option could be
> supported in both Tiger and Leopard (TextEdit and Quicklook only seem
> to honor the xattr on Leopard, though).
>
> Comments?

Thanks for an interesting read!

I think what you propose sounds good but this is more in the domain of
Vim than MacVim so I guess it is up to Bram to decide whether he'd
include such a feature in Vim or not.  Until then I'm quite willing to
merge a patch with the git repo if you decide to write one.

Björn


Re: MacVim file encoding and Quicklook

Nikola Knežević
In reply to this post by Nico Weber-3

On 17 May 2008, at 10:57 , Nico Weber wrote:
> Obvious suggestion: MacVim should have an option 'encodingxattr` that
> can have the values 'read' and 'write' (default 'read,write'). If
> 'write' is set, the com.apple.TextEncoding xattr is written when
> saving a file. When 'read' is set, the xattr is checked when reading a
> file.

Maybe this should go in vim, not just MacVim?

Cheers,
Nikola


Re: MacVim file encoding and Quicklook

Steven Michalske

:set bomb

Then save.

Open the file in Quick Look and see that Quick Look behaves.

See :help bomb for more info.
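
For readers who want to see what the option changes on disk: with 'bomb' set and a UTF-8 'fileencoding', the file starts with the three bytes EF BB BF. A small sketch for checking or adding that marker outside Vim follows; the helper names are made up for illustration.

```python
import codecs


def has_utf8_bom(data):
    """True if the byte string starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def add_utf8_bom(data):
    """Prepend the BOM unless it is already there."""
    return data if has_utf8_bom(data) else codecs.BOM_UTF8 + data


text = "|♜|♞|♝|".encode("utf-8")
marked = add_utf8_bom(text)
print(marked[:3].hex())  # prints: efbbbf
```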

On May 18, 4:43 am, Nikola Knežević <[hidden email]> wrote:

> On 17 May 2008, at 10:57 , Nico Weber wrote:
>
> > Obvious suggestion: MacVim should have an option 'encodingxattr` that
> > can have the values 'read' and 'write' (default 'read,write'). If
> > 'write' is set, the com.apple.TextEncoding xattr is written when
> > saving a file. When 'read' is set, the xattr is checked when reading a
> > file.
>
> Maybe this should go in vim, not just MacVim?
>
> Cheers,
> Nikola

Re: MacVim file encoding and Quicklook

Steven Michalske

On a side note, I feel that this is more a bug in Quick Look.

The Unix file command correctly identifies the files as UTF-8.

As for reasons not to set this attribute:

It is not portable.
If I mail a UTF-8 file to someone, this attribute would be stripped.
I believe it is not supported on FAT32/NTFS partitions.

Quick Look is broken: if it sees a UTF-8 file it should behave  
correctly, as the BOM characters are optional.

Steve

On May 18, 2008, at 11:39 AM, hardkrash wrote:

>
> :set bomb
>
> then save
>
> do quick look and see that quick look behaves.
>
> see :help bomb for more info.
>
> On May 18, 4:43 am, Nikola Knežević <[hidden email]>  
> wrote:
>> On 17 May 2008, at 10:57 , Nico Weber wrote:
>>
>>> Obvious suggestion: MacVim should have an option 'encodingxattr`  
>>> that
>>> can have the values 'read' and 'write' (default 'read,write'). If
>>> 'write' is set, the com.apple.TextEncoding xattr is written when
>>> saving a file. When 'read' is set, the xattr is checked when  
>>> reading a
>>> file.
>>
>> Maybe this should go in vim, not just MacVim?
>>
>> Cheers,
>> Nikola
> >



Re: MacVim file encoding and Quicklook

Bram Moolenaar
In reply to this post by Björn Winckler


Bjorn Winckler wrote to Nico Weber:

> > Hi all,
> >
> > Here's a little experiment. Take the following text and copy it to the
> > clipboard:
> >
> > <text>
> > |♜|♞|♝|♛|♚|♝|♞|♜|
> > |♟|♟|♟|♟|♟|♟|♟|♟|
> > |♙|♙|♙|♙|♙|♙|♙|♙|
> > |♙|♙|♙|♙|♙|♙|♙|♙|
> > |♙|♙|♙|♙|♙|♙|♙|♙|
> > |♙|♙|♙|♙|♙|♙|♙|♙|
> > |♙|♙|♙|♙|♙|♙|♙|♙|
> > |♙|♙|♙|♙|♙|♙|♙|♙|
> > </text>
> >
> > Now paste it (1) into TextEdit and save it  and (2) paste it into
> > MacVim and save it. (Other text with non-ascii characters will do, too)
> >
> > For the Leopard users out there, take a look at the resulting files
> > with Quicklook.
> >
> > Results: The file written by TextEdit looks ok in QL, the file written
> > by MacVim does not. If you compare the two files, you'll see that they
> > both use UTF-8 encoding, so that's not what confuses Quicklook. So,
> > what's the cause? /Developer/Examples/AppKit/TextEdit/README.rtf hints
> > at some extended file attributes that TextEdit stores in Leopard and
> > refers to http://developer.apple.com/releasenotes/Cocoa/
> > Foundation.html for more information. If you search the latter page
> > for "com.apple.TextEncoding", you'll see that `-[NSString
> > writeToFile:atomically:encoding:error]` and friends write the encoding
> > to an extended attribute. There's also a description of the format of
> > the extended attribute ("this is not some undocumented stuff").
> >
> >  Indeed, if you check the two files with `ls -l`, you'll see that the
> > file written by TextEdit has an "@" (that means it has an extended
> > attribute). Now, if you do `ls -l@` or `xattr -l filename`, you'll see
> > that the TextEdit file has the com.apple.TextEncoding attribute set:
> >
> >     Macintosh-2:b nico$ xattr -l texteditfile.txt
> >     com.apple.TextEncoding: UTF-8;134217984
> >
> > The file written by MacVim does not have this attribute. If you add it
> > (`xattr -w com.apple.TextEncoding 'UTF-8;134217984' macvimfile.txt`),
> > the file shows up correctly in Quicklook (and in TextEdit too; it
> > didn't do that before).
> >
> > Obvious suggestion: MacVim should have an option 'encodingxattr` that
> > can have the values 'read' and 'write' (default 'read,write'). If
> > 'write' is set, the com.apple.TextEncoding xattr is written when
> > saving a file. When 'read' is set, the xattr is checked when reading a
> > file.
> >
> > According to http://arstechnica.com/reviews/os/macosx-10-4.ars/7 , the
> > xattr api is already present in Tiger, so the option could be
> > supported in both Tiger and Leopard (TextEdit and Quicklook only seem
> > to honor the xattr on Leopard, though).
> >
> > Comments?
>
> Thanks for an interesting read!
>
> I think what you propose sounds good but this is more in the domain of
> Vim than MacVim so I guess it is up to Bram to decide whether he'd
> include such a feature in Vim or not.  Until then I'm quite willing to
> merge a patch with the git repo if you decide to write one.

When writing a file, I don't think there is anything against writing
those extended attributes (when supported).  The value of fenc in
buf_write() can be used.

When reading a file, an item in 'fileencodings' similar to ucs-bom can
be used to use the encoding from extended attributes.
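
As an illustration of the reading side, the sketch below walks a 'fileencodings'-style list in which a special item consults the extended attribute, much as ucs-bom consults the start of the file. This is not Vim code: the item name "xattr" and the function `pick_encoding` are hypothetical, the BOM check only handles UTF-8, and the attribute value is passed in as a string instead of being read from the file system.

```python
import codecs


def pick_encoding(data, xattr_value,
                  candidates=("xattr", "ucs-bom", "utf-8", "latin-1")):
    """Return the first candidate that applies to the file contents."""
    for item in candidates:
        if item == "xattr":
            # Hypothetical item: use the name field of com.apple.TextEncoding.
            name = (xattr_value or "").partition(";")[0].strip()
            if name:
                try:
                    return codecs.lookup(name).name
                except LookupError:
                    pass  # unknown name: fall through to the next item
        elif item == "ucs-bom":
            # Simplified: only the UTF-8 BOM is recognised here.
            if data.startswith(codecs.BOM_UTF8):
                return "utf-8-sig"
        else:
            try:
                data.decode(item)
                return item
            except UnicodeDecodeError:
                pass
    return None


print(pick_encoding(b"abc", "UTF-8;134217984"))  # prints: utf-8
```

Keeping the attribute as one item in the list means its priority relative to BOM detection stays under the user's control.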

--
It is illegal to rob a bank and then shoot at the bank teller with a water
pistol.
                [real standing law in Louisiana, United States of America]

 /// Bram Moolenaar -- [hidden email] -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\        download, build and distribute -- http://www.A-A-P.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///
