recommended kernel char sets (was Re: [vox-tech] Removing Files [SOLVED])

Mark K. Kim vox-tech@lists.lugod.org
Thu, 27 May 2004 11:22:04 -0700 (PDT)


I recommend ISO-8859-1 and UTF-8.  ISO-8859-1 handles most Western
European (Latin-script) languages, including English (and is probably the
default for everything you do on the system).  UTF-8 *can* represent every
other language, though whether it actually helps depends on the charset
used by the system you're trying to access; at least it gives you *some*
ability to access international charsets.  These days, with the Internet
spanning the globe, that can be very useful.
The Debian binary kernel apparently has all the charsets available as
modules, which is a good idea if you plan on accessing random foreign
languages at some point.  Whether to build them into the kernel or as
modules depends on how often you plan to use them, I suppose.
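
If you want to check what a given kernel config actually enables, here is
a rough sketch in Python.  It assumes the usual .config syntax (OPTION=y,
OPTION=m, or "# OPTION is not set") and the option names
CONFIG_NLS_ISO8859_1 and CONFIG_NLS_UTF8; the function name nls_status and
the sample text are made up for illustration.

```python
import re

# Kernel option names as they appear in a .config file; these correspond
# to the "NLS ISO 8859-1" and "NLS UTF8" menu entries.
RECOMMENDED = ("CONFIG_NLS_ISO8859_1", "CONFIG_NLS_UTF8")


def nls_status(config_text, options=RECOMMENDED):
    """Return {option: 'y' | 'm' | 'n'} for each option in config_text.

    'y' = built in, 'm' = module, 'n' = absent or "is not set".
    """
    status = {opt: "n" for opt in options}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=([ym])$", line.strip())
        if match and match.group(1) in status:
            status[match.group(1)] = match.group(2)
    return status


# Hypothetical excerpt resembling the defaults quoted below.
sample = """\
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_UTF8 is not set
"""

print(nls_status(sample))
```

On many distributions the running kernel's config can be read from
/boot/config-<version>, or from /proc/config.gz if the kernel was built
with that feature; either text could be passed to the function above.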

-Mark

PS: Anyone else noticed Google runs on the UTF-8 charset?


On Thu, 27 May 2004, Jonathan Stickel wrote:

> Mark K. Kim wrote:
> <snip>
> > It's weird that Daniel's kernel didn't have UTF-8 support!  I figured it'd
> > be supported by default
> <snip>
>
> What are the recommended character sets to have in a kernel?  Obviously,
> it will depend on whether you want exotic language support.  But for
> the Joe Shmoe English-only user, what should be turned on?  Pasted below
> are the defaults for my kernel.  UTF8 is not on by default, and it
> sounds like it should be.  Are there any others I might run into and
> should turn on in future kernel compiles?  Would modules be OK?
>
> Thanks,
> Jonathan
>
>
> <*> Base native language support
> (iso8859-1) Default NLS Option
> <*>   Codepage 437 (United States, Canada)
> < >   Codepage 737 (Greek)
> < >   Codepage 775 (Baltic Rim)
> < >   Codepage 850 (Europe)
> < >   Codepage 852 (Central/Eastern Europe)
> < >   Codepage 855 (Cyrillic)
> < >   Codepage 857 (Turkish)
> < >   Codepage 860 (Portuguese)
> < >   Codepage 861 (Icelandic)
> < >   Codepage 862 (Hebrew)
> < >   Codepage 863 (Canadian French)
> < >   Codepage 864 (Arabic)
> < >   Codepage 865 (Norwegian, Danish)
> < >   Codepage 866 (Cyrillic/Russian)
> < >   Codepage 869 (Greek)
> < >   Simplified Chinese charset (CP936, GB2312)
> < >   Traditional Chinese charset (Big5)
> < >   Japanese charsets (Shift-JIS, EUC-JP)
> < >   Korean charset (CP949, EUC-KR)
> < >   Thai charset (CP874, TIS-620)
> < >   Hebrew charsets (ISO-8859-8, CP1255)
> < >   Windows CP1250 (Slavic/Central European Languages)
> < >   Windows CP1251 (Bulgarian, Belarusian)
> <*>   NLS ISO 8859-1  (Latin 1; Western European Languages)
> < >   NLS ISO 8859-2  (Latin 2; Slavic/Central European Languages)
> < >   NLS ISO 8859-3  (Latin 3; Esperanto, Galician, Maltese, Turkish)
> < >   NLS ISO 8859-4  (Latin 4; old Baltic charset)
> < >   NLS ISO 8859-5  (Cyrillic)
> < >   NLS ISO 8859-6  (Arabic)
> < >   NLS ISO 8859-7  (Modern Greek)
> < >   NLS ISO 8859-9  (Latin 5; Turkish)
> < >   NLS ISO 8859-13 (Latin 7; Baltic)
> < >   NLS ISO 8859-14 (Latin 8; Celtic)
> < >   NLS ISO 8859-15 (Latin 9; Western European Languages with Euro)
> < >   NLS KOI8-R (Russian)
> < >   NLS KOI8-U/RU (Ukrainian, Belarusian)
> < >   NLS UTF8

-- 
Mark K. Kim
AIM: markus kimius
Homepage: http://www.cbreak.org/
Xanga: http://www.xanga.com/vindaci
Friendster: http://www.friendster.com/user.jsp?id=13046
PGP key fingerprint: 7324 BACA 53AD E504 A76E  5167 6822 94F0 F298 5DCE
PGP key available on the homepage