Help regarding Unicode in Linux

Submitted by Anonymous
on October 6, 2007 - 12:15pm

Hi

I would like to get Unicode working completely on my Linux boxes. I am using RHEL 4 and Fedora 6. I would like to know which packages are related to text input and display.

I am trying to get the Malayalam language working on my systems. I already have good Unicode-compliant fonts (which work fine in Windows). I need to configure my Linux boxes for the following tasks:

1. Read web pages written in Unicode.
2. Type Unicode characters directly.
3. Search using Unicode text.

Any help is really appreciated. I am in the process of setting up Linux for local-language use, so the challenge is to show that "it is possible with Linux, and better than the others"!

Thanks
-B

Works by default on Debian Unstable

cushioncritter
on October 6, 2007 - 2:10pm

My system was originally a standard EN-US system.

To display Traditional Chinese characters, I ran:

aptsh: install ttf-arphic*

The next test is to go to news.google.com in Firefox. At the bottom of the page, every entry should display in its own script (no boxes containing Unicode code points). In my case, China, Hong Kong, Japan, Taiwan, Israel, Greece, Arabic, Russia, and India display correctly (because I had previously installed the fonts), but Korea displays as boxes, so I now run:

aptsh: search Korean font
aptsh: install ttf-alee ttf-unfonts ttf-baekmuk

Now if I exit the browser and restart, I see Korean characters at the bottom of news.google.com.

I can't really help you with input methods, because I do my translations from English to Chinese at babelfish.altavista.com and just copy/paste the Chinese characters into my text editor (Leafpad). After Unicode characters have been pasted in, Leafpad will no longer allow saving the file as ANSI_X3.4-1968 or ISO-8859-1, for example, but you can do a "Save As" with the same file name and the character encoding set to UTF-8.
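What Leafpad is doing there can be reproduced in a few lines of Python. This is only a sketch: the helper name can_encode and the sample text are made up for illustration, and "ascii" stands in for ANSI_X3.4-1968, which is the formal name of US-ASCII.

```python
def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


sample = "北京"  # two Chinese characters, used only as sample text

# ASCII and ISO-8859-1 have no Chinese characters, so encoding fails;
# UTF-8 can represent every Unicode character, so it succeeds.
for encoding in ("ascii", "iso-8859-1", "utf-8"):
    print(encoding, can_encode(sample, encoding))

# Each of these two characters takes three bytes in UTF-8.
print(len(sample.encode("utf-8")))  # 6
```

Plain English text still fits in all three encodings, which is why the editor only starts refusing after the paste.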

I use the CSS 2 generated-content feature (:before with the content property) to separate NLS elements from my HTML code:

div.myclass:before { content: "Chinese Characters"; }

This allows dynamic changes of language by clicking a button, as opposed to the flawed 'entity' approach used for translation that goes back 20 years.

A few years back, when I pasted Unicode into a text editor, I would get something like \u20\u21\u22\u23, which worked in the CSS file when changed to \u20212223. This is no longer necessary; the raw Unicode characters now go into the file. I'm sure web purists will be horrified, as such characters are not a part of the XML character set.
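For anyone who still wants an ASCII-only stylesheet: in standard CSS an escape is a backslash followed by one to six hexadecimal digits, and one whitespace character after it is treated as part of the escape. A rough Python sketch for generating such escapes follows. The function name css_escape_non_ascii is hypothetical, and it only handles non-ASCII characters; quotes and backslashes in the input are assumed to be dealt with separately.

```python
def css_escape_non_ascii(text):
    """Replace each non-ASCII character with a CSS hex escape.

    A single space is appended after every escape so that a following
    hex digit or a real space in the text is not misread as part of it.
    """
    pieces = []
    for ch in text:
        if ord(ch) < 128:
            pieces.append(ch)  # ASCII passes through unchanged
        else:
            pieces.append("\\{:X} ".format(ord(ch)))
    return "".join(pieces)


# Sample text only; any string works.
rule = 'div.myclass:before { content: "%s"; }' % css_escape_non_ascii("中文")
print(rule)
```

With a UTF-8 stylesheet this step is optional, since the raw characters can go straight into the file as described above.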

As for searches, I find I can copy/paste the Chinese translation of "Beijing Olympics" from the Babelfish translator into Google and get hits (in Chinese) that are apparently correct.
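If you ever need to build such a search link by hand, a non-ASCII query is commonly carried in the URL as percent-encoded UTF-8 bytes. A small Python sketch, with some assumptions: build_search_url is a made-up helper, and the base URL and the "q" parameter name are only what I would expect for a Google-style search, not something I have checked against any documentation.

```python
from urllib.parse import urlencode, parse_qs


def build_search_url(base_url, query):
    """Append the query as a percent-encoded UTF-8 'q' parameter."""
    return base_url + "?" + urlencode({"q": query})


# Assumed base URL and sample query text, for illustration only.
url = build_search_url("https://www.google.com/search", "北京")
print(url)

# Decoding the query string gives back the original characters.
print(parse_qs(url.split("?", 1)[1])["q"][0])
```

The browser does the same thing behind the scenes when you paste the characters into the search box, so no extra setup is needed for the copy/paste approach.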

As usual, running the latest release is best, and RHEL 4 and Fedora 6 are not the latest of either distribution.

Anonymous (not verified)
on October 11, 2007 - 3:24pm

Thanks for the reply. In fact, my systems are Fedora-based. Is there a set of instructions, or some packages, that could be used to get this working in Fedora?
