[Oriya-group] Collation data for all Indic scripts
gora_mohanty at yahoo.co.in
Thu Oct 28 21:48:58 IST 2004
I have uploaded a suite of programs to
This includes a single collation table for all Indian
scripts in Unicode (Devanagari, Bengali, Gurmukhi,
Gujarati, Oriya, Tamil, Telugu, Kannada, and
Malayalam). Please note that it is formally licensed
under the GPL.
Instructions for how to use this table are in the
file INSTALL. Besides this, there is a Perl program,
uniprint, that prints various combinations of letters
from these scripts, and can thus be used to test the
sorting. Some other Perl modules used by this program
are also included. I have taken the liberty of
bundling the CPAN Perl module, Readonly. Please
read the file README, or try "./uniprint -m" for
instructions on how to use the program.
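
As a rough illustration of the kind of test described above (this is not uniprint itself, and it does not use col_indic), the Python sketch below builds a few Devanagari consonant plus vowel-sign combinations and sorts them by plain Unicode code point order. The function names and the choice of letters are made up for the example. The resulting list is only a baseline that one could compare against the same words sorted with the collation table.

```python
# Hypothetical sketch, not part of the uploaded suite.
# Generates letter combinations and sorts them by raw code point order.

# A small, arbitrary sample of Devanagari letters (an assumption for this example).
CONSONANTS = ["\u0915", "\u0916", "\u0917"]   # KA, KHA, GA
VOWEL_SIGNS = ["", "\u093E", "\u093F"]        # no sign, sign AA, sign I


def combinations(consonants, vowel_signs):
    """Return every consonant followed by every vowel sign."""
    return [c + v for c in consonants for v in vowel_signs]


def codepoint_sorted(words):
    """Sort strings by Unicode code point, which is Python's default string order."""
    return sorted(words)


if __name__ == "__main__":
    # Print the baseline ordering, one combination per line.
    for word in codepoint_sorted(combinations(CONSONANTS, VOWEL_SIGNS)):
        print(word)
```

Any difference between this code point ordering and the ordering produced with the collation table shows where the table actually changes the sort.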
Please note that this should be considered an alpha
release, i.e., do not use this for anything critical.
As I am not familiar with languages other than Hindi
and Oriya, I have gone by the Unicode names and the
ITRANS documentation. Some outstanding issues are
noted in the file BUGS, and in comments within the
collation table, col_indic, itself. Also, please feel
free to advertise this elsewhere, as I would like to
P.S. For the Oriya folk, an updated glossary is also
available now at
I will make a more general announcement after updating
some more stuff tomorrow night.