[vox-tech] vim and utf-8 support (newbie alert)

Henry House vox-tech@lists.lugod.org
Mon, 9 Jun 2003 17:18:56 -0700


On Mon, Jun 09, 2003 at 01:59:58PM -0700, Peter Jay Salzman wrote:
[...]
> as a first stab at getting utf-8 capable xterms, i set:
>
>    LC_CTYPE=en_US.UTF-8

Locales do not work unless they are generated. Under Debian, run
dpkg-reconfigure locales and ask for en_US.UTF-8 and en_US.ISO-8859-1.
Your /etc/locale.gen should then look like mine:

	# This file lists locales that you wish to have built. You can find a list
	# of valid supported locales at /usr/share/i18n/SUPPORTED. Other
	# combinations are possible, but may not be well tested. If you change
	# this file, you need to rerun locale-gen.
	#
	# XXX GENERATED XXX
	#
	# NOTE!!! If you change this file by hand, and want to continue
	# maintaining manually, remove the above line. Otherwise, use the command
	# "dpkg-reconfigure locales" to manipulate this file. You can manually
	# change this file without affecting the use of debconf, however, since it
	# does read in your changes.
	
	en_US ISO-8859-1
	en_US.UTF-8 UTF-8

As it says, run locale-gen. Then set LANG=en_US.UTF-8 in all shells.
Multibyte-supporting programs will Just Work. Multibyte-disabled software
will break, but you can then fall back on ISO-8859-1 (aka Latin1).
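A minimal sketch of that check in POSIX shell, for anyone who wants to
confirm the locale was generated before exporting LANG. It assumes a glibc
system where "locale -a" lists generated locales with the codeset spelled
like "utf8". The function names normalize_locale and locale_in_list are
made up for illustration, and locale modifiers such as @euro are not
handled.

```shell
#!/bin/sh
# Normalise a locale name the way glibc's `locale -a` usually prints it:
# en_US.UTF-8 -> en_US.utf8 (codeset lowercased, hyphens dropped).
# Names without a codeset (en_US, C, POSIX) are passed through unchanged.
normalize_locale() {
    case $1 in
        *.*)
            name=${1%%.*}
            codeset=$(printf '%s' "${1#*.}" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')
            printf '%s.%s\n' "$name" "$codeset"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

# Succeed if the locale named in $1 appears in the list read from stdin
# (one locale per line, as printed by `locale -a`).
locale_in_list() {
    want=$(normalize_locale "$1")
    while IFS= read -r have; do
        [ "$(normalize_locale "$have")" = "$want" ] && return 0
    done
    return 1
}

# Only switch LANG once the locale is known to exist on this machine.
if locale -a 2>/dev/null | locale_in_list en_US.UTF-8; then
    LANG=en_US.UTF-8
    export LANG
else
    echo "en_US.UTF-8 is not generated; check /etc/locale.gen and run locale-gen" >&2
fi
```

Putting something like this in a shell startup file avoids exporting a
LANG value that the C library cannot actually load.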

As you already figured out, you need to use a UTF-8 capable terminal emulator
and a suitable font. Under the console, run unicode_start and UTF-8 should
start working. You may need to fiddle with console fonts as well.
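If output still looks wrong, it may help to confirm what the shell itself
thinks the character set is before blaming the terminal or the font. The
sketch below applies the standard precedence of the locale variables
(LC_ALL, then LC_CTYPE, then LANG). The function names effective_ctype and
is_utf8_locale are hypothetical, and the check only looks at the variable
names, not at whether the locale was actually generated.

```shell
#!/bin/sh
# Print the locale name that governs character handling, following the
# POSIX precedence: a non-empty LC_ALL wins, then LC_CTYPE, then LANG.
# With none of them set, the "C" locale applies.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'C\n'
    fi
}

# Succeed if the effective locale name carries a UTF-8 codeset
# (accepts both the "UTF-8" and "utf8" spellings).
is_utf8_locale() {
    case $(effective_ctype | tr '[:upper:]' '[:lower:]') in
        *.utf-8*|*.utf8*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_locale; then
    echo "shell locale is UTF-8: $(effective_ctype)"
else
    echo "shell locale is not UTF-8: $(effective_ctype)" >&2
fi
```

Running "locale charmap" is a quicker cross-check of the same thing, since
it asks the C library directly.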

> but weird things started to happen, like mutt's threading lines turned
> into really strange characters.  i guess the applications themselves
> need to be utf-8 aware too.

Mutt works fine under the default configuration as long as you generated the
locales. I'm using Mutt under UTF-8 right now.

> > Works great under WindowsXP (everything's in unicode; just make sure you
> > got the fonts installed.)
>
> that makes me very sad...   :(

Sad indeed. The library and kernel system support are there, but four
deficiencies remain:

1. Fonts. This is being solved, but there may never be an optimal solution,
   because most folks balk at the huge sizes and memory consumption of full
   Unicode fonts.

2. PostScript has abysmal Unicode support. Gnome Print and other projects
   address this, but not with much apparent success.

3. Keymaps. They exist, but normal people cannot figure out how to use them.

4. Old apps. Many are broken and need to be corrected to support Unicode.

Linux has a ways to go yet :-(.

For further reference, I recommend:

A Debian HOWTO (start with this):
	http://melkor.dnp.fmph.uniba.sk/~garabik/debian-utf8/howto.html

UTF-8 and Unicode FAQ (for Linux and Unix):
	http://www.cl.cam.ac.uk/~mgk25/unicode.html

LDP Unicode HOWTO (a little out-of-date):
	http://en.tldp.org/HOWTO/Unicode-HOWTO-1.html

-- 
Henry House
The attached file is a digital signature. See <http://romana.hajhouse.org/pgp>
for information.  My OpenPGP key: <http://romana.hajhouse.org/hajhouse.asc>.
