[vox-tech] vim and utf-8 support (newbie alert)
Peter Jay Salzman
vox-tech@lists.lugod.org
Mon, 9 Jun 2003 13:59:58 -0700
thanks mark...
On Mon 09 Jun 03, 1:27 PM, Mark K. Kim <markslist@cbreak.org> said:
> On Mon, 9 Jun 2003, Peter Jay Salzman wrote:
>
> > * start an xterm with a suitable font: "xterm -fn <fontname> -e vim"
> > * use utf-8 encoding, which encodes unicode and ISO10646 text.
> > * load a suitable keymap to help make entering text easier.
> >
> >
> > is all this correct so far? even in a "touchy-feely" way? i'm a
> > complete newbie in this topic.
>
> It depends on the foreign language and how it's encoded.
>
> The XTerm has its own encoding and fonts (mostly designed for latin-based
> languages). VIM also has its own encoding and fonts. It gets really
> tricky because there are so many systems depending on each other, and you
> may have to trick one or more of the systems to make the foreign language
> work, but which systems you can trick depends on the foreign language
> you wanna work with.
>
> What language are you working with? Latin-based languages only need font
> change, and you can probably just change the fonts on XTerm. Multibyte
> languages (ie, CJK) generally need special XTerm that understands that
> language (generally using its own, non-utf-8, encoding). I won't even
> touch right-to-left or up-and-down languages (that requires both terminal
> and Vim support.)
right-to-left languages are really, really, really well supported in
vim. at least, they seem to be. check out:
:set rl
all the vim commands i can think of work well.
the language i'm thinking of is hebrew, but with some important issues.
1. i need vowel support.
2. i really want to have mixed hebrew/english
i believe taken together, i want to use ISO 10646 which can represent
all languages at the same time.
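
to make the "all languages at the same time" idea concrete, here's a minimal python sketch (not vim itself). it assumes a made-up sample string: the hebrew word "shalom" with vowel points, followed by english. the vowels are separate combining code points that sit after the letter they belong to, and utf-8 encodes the whole mixed string in one pass. the names `mixed` and `describe` are just for illustration.

```python
import unicodedata

# hypothetical sample text: "shalom" with vowel points, then plain english
mixed = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd means peace"

# utf-8 uses one byte per ascii character, two per hebrew letter or vowel point
encoded = mixed.encode("utf-8")

def describe(text):
    """return (code point, name, is_combining) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch) != 0)
            for ch in text]

# the first 7 code points are the hebrew word, vowel points included
for row in describe(mixed[:7]):
    print(row)
print(len(mixed), "code points ->", len(encoded), "bytes")
```

the point is only that vowels and mixed scripts are not a problem at the encoding level. whether they display correctly is up to the terminal and font.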
> > if this is about correct, how does one tell vim to encode the text using
> > utf-8?
>
> :set encoding=utf-8
> That tells VIM to interpret the file as though it's encoded in UTF-8.
> But VIM's got no idea how the data should be displayed so I think it
> attempts to display them in unicode by default. So your terminal should
> also be capable of unicode and got all the necessary fonts.
as a first stab at getting utf-8 capable xterms, i set:
LC_CTYPE=en_US.UTF-8
but weird things started to happen, like mutt's threading lines turned
into really strange characters. i guess the applications themselves
need to be utf-8 aware too.
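
one way to see what a program is actually told by the locale settings is to ask for the advertised character encoding. this is a rough python sketch, not what mutt or xterm do internally. the helper names `current_codeset` and `is_utf8` are made up for the example.

```python
import locale

def current_codeset():
    """return the character encoding the environment's locale advertises."""
    try:
        # pick up LC_CTYPE / LANG from the environment
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # the requested locale isn't installed; keep whatever is active
        pass
    return locale.getpreferredencoding(False)

def is_utf8(codeset):
    """loose comparison, so 'UTF-8', 'utf8' and 'utf_8' all match."""
    return codeset.lower().replace("-", "").replace("_", "") == "utf8"

codeset = current_codeset()
print(codeset, "-> utf-8 locale" if is_utf8(codeset) else "-> not a utf-8 locale")
```

run it once with and once without LC_CTYPE=en_US.UTF-8 set to compare. if the locale isn't installed on the system, the setting has no effect.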
> Works great under WindowsXP (everything's in unicode; just make sure you
> got the fonts installed.)
that makes me very sad... :(
> > and how do you tell vim "i want to use language X whose characters are
> > unicode number UT-Y through UT-Z"? or doesn't it work quite that way?
>
> I don't think the unicode characters are marked by languages. Some are
> obvious (CJK, though subset of C is also used by JK), but others are less
> so (punctuation, alphabets, etc.) Many characters are also not in
> sequence (I think Chinese is broken up in two or more sets -- unicode is
> constantly evolving and they need to maintain backwards compatibility.)
okay. it never is that easy, eh? :-)
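
a small python sketch of mark's point, for the record. unicode characters carry names, not language tags. the first word of the name is only a rough hint (the helper `name_hint` is made up here, and it's a heuristic, not a real script lookup). it also shows that CJK ideographs live in more than one range: U+4E00 and U+3400 are in different blocks.

```python
import unicodedata

def name_hint(ch):
    """first word of the unicode character name, a rough hint only."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]

# a hebrew letter, two CJK ideographs from different ranges, a latin letter,
# and a punctuation mark that belongs to no particular language
samples = ["\u05e9", "\u4e00", "\u3400", "A", "!"]
for ch in samples:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
```

so there's no single "range for language X" to hand to an editor. text is just a sequence of code points.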
it totally sucks that mixed hebrew-with-vowels/english turned out to be
such a hard thing to do. :( sucks even worse that it's easy on windows
xp. :(
pete
--
GPG Instructions: http://www.dirac.org/linux/gpg
GPG Fingerprint: B9F1 6CF3 47C4 7CD8 D33E 70A9 A3B9 1945 67EA 951D