[vox-tech] vim and utf-8 support (newbie alert)

Jay Strauss vox-tech@lists.lugod.org
Mon, 9 Jun 2003 16:06:01 -0500


You don't need vowels in Hebrew, you figure out the word by context :)

----- Original Message ----- 
From: "Peter Jay Salzman" <p@dirac.org>
To: <vox-tech@lists.lugod.org>
Sent: Monday, June 09, 2003 3:59 PM
Subject: Re: [vox-tech] vim and utf-8 support (newbie alert)


> thanks mark...
>
> On Mon 09 Jun 03,  1:27 PM, Mark K. Kim <markslist@cbreak.org> said:
> > On Mon, 9 Jun 2003, Peter Jay Salzman wrote:
> >
> > > * start an xterm with a suitable font: "xterm -fn <fontname> -e vim"
> > > * use utf-8 encoding, which encodes unicode and ISO 10646 text.
> > > * load a suitable keymap to help make entering text easier.
> > >
> > >
> > > is all this correct so far?  even in a "touchy-feely" way?   i'm a
> > > complete newbie in this topic.
> >
> > It depends on the foreign language and how it's encoded.
> >
> > The XTerm has its own encoding and fonts (mostly designed for
> > latin-based languages).  VIM also has its own encoding and fonts.  It
> > gets really tricky because there are so many systems depending on each
> > other, and you may have to trick one or more of the systems to make the
> > foreign language work, but which systems you can trick depends on the
> > foreign language you wanna work with.
> >
> > What language are you working with?  Latin-based languages only need
> > font change, and you can probably just change the fonts on XTerm.
> > Multibyte languages (ie, CJK) generally need special XTerm that
> > understands that language (generally using its own, non-utf-8,
> > encoding).  I won't even touch right-to-left or up-and-down languages
> > (that requires both terminal and Vim support.)
>
> right-to-left languages are really, really, really well supported in
> vim.  at least, they seem to be.  check out:
>
>    :set rl
>
> all the vim commands i can think of work well.
>
>
> the language i'm thinking of is hebrew, but with some important issues.
>
> 1. i need vowel support.
> 2. i really want to have mixed hebrew/english
>
> i believe taken together, i want to use ISO 10646 which can represent
> all languages at the same time.
>
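
A small Python sketch of the point above, under the assumption that the text is plain Unicode stored as UTF-8: Hebrew letters, vowel points and English can sit in one string, and the vowel points are separate combining code points. The sample word and the variable names are only illustrative.

```python
import unicodedata

# "shalom" with vowel points, followed by English text in the same string.
# Escapes are used so the example does not depend on the editor's encoding.
text = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd hello"

# Vowel points (niqqud) are separate combining code points (category "Mn").
marks = [c for c in text if unicodedata.category(c) == "Mn"]

# Dropping the combining marks leaves the bare consonants plus the English.
consonants_only = "".join(c for c in text if unicodedata.category(c) != "Mn")

# UTF-8 uses two bytes for each Hebrew code point and one per ASCII character.
encoded = text.encode("utf-8")

print(len(text), len(marks), len(encoded))
print([unicodedata.name(c) for c in marks])
```

Whether the vowels then show up correctly on screen is a separate question: that depends on the terminal and font handling combining characters.
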
> > > if this is about correct, how does one tell vim to encode the text
> > > using utf-8?
> >
> >    :set encoding=utf-8
>
> > That tells VIM to interpret the file as though it's encoded in UTF-8.
> > But VIM's got no idea how the data should be displayed so I think it
> > attempts to display them in unicode by default.  So your terminal should
> > also be capable of unicode and got all the necessary fonts.
>
> as a first stab at getting utf-8 capable xterms, i set:
>
>    LC_CTYPE=en_US.UTF-8
>
> but weird things started to happen, like mutt's threading lines turned
> into really strange characters.  i guess the applications themselves
> need to be utf-8 aware too.
>
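
As a rough illustration of why every layer has to agree on the encoding, here is a Python sketch of the same UTF-8 bytes read by a UTF-8-aware reader and by one that assumes a single byte per character. Latin-1 is only an assumed stand-in for the unaware side; this is not a diagnosis of what mutt was actually doing.

```python
# A short Hebrew word, written with escapes (shin, lamed, vav, final mem).
word = "\u05e9\u05dc\u05d5\u05dd"

utf8_bytes = word.encode("utf-8")

# A UTF-8-aware reader recovers the original four characters.
aware = utf8_bytes.decode("utf-8")

# A reader that assumes one byte per character (Latin-1 here, purely as an
# illustration) sees eight unrelated characters instead.
unaware = utf8_bytes.decode("latin-1")

print(len(aware), len(unaware))
```
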
> > Works great under WindowsXP (everything's in unicode; just make sure you
> > got the fonts installed.)
>
> that makes me very sad...   :(
>
> > > and how do you tell vim "i want to use language X whose characters are
> > > unicode number UT-Y through UT-Z"?   or doesn't it work quite that way?
> >
> > I don't think the unicode characters are marked by languages.  Some are
> > obvious (CJK, though subset of C is also used by JK), but others are
> > less so (punctuation, alphabets, etc.)  Many characters are also not in
> > sequence (I think Chinese is broken up in two or more sets -- unicode is
> > constantly evolving and they need to maintain backwards compatibility.)
>
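
To make the "not in sequence" point concrete, here is a minimal Python sketch that maps a character to a Unicode block. The `BLOCKS` table and the `block_of` helper are hypothetical names, and the table is a hand-picked subset of the ranges in the Unicode code charts, not a complete list.

```python
import unicodedata

# Block ranges from the Unicode code charts; a hand-picked subset for
# illustration only, not a complete table.
BLOCKS = [
    (0x0590, 0x05FF, "Hebrew"),
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xFB00, 0xFB4F, "Alphabetic Presentation Forms"),
]


def block_of(ch):
    """Return the name of the listed block containing ch, or None."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return None


# Hebrew characters turn up in more than one block.
for ch in ("\u05d0", "\ufb2a"):
    print(unicodedata.name(ch), "->", block_of(ch))
```

So a single "Y through Z" range is not enough even for Hebrew alone: the basic letters and points live in one block, and some precomposed forms live in another.
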
> okay.  it never is that easy, eh?   :-)
>
> it totally sucks that mixed hebrew-with-vowels/english turned out to be
> such a hard thing to do.  :( sucks even worse that it's easy on windows
> xp.   :(
>
> pete
>
> -- 
> GPG Instructions: http://www.dirac.org/linux/gpg
> GPG Fingerprint: B9F1 6CF3 47C4 7CD8 D33E 70A9 A3B9 1945 67EA 951D
> _______________________________________________
> vox-tech mailing list
> vox-tech@lists.lugod.org
> http://lists.lugod.org/mailman/listinfo/vox-tech