[Oriya-group] Re: [indic] Indic Sorting/collation challenging Problem - CLDR 1.3- data for Indic

Hariram Pansari hrpansari at yahoo.com
Tue Jan 18 23:48:41 IST 2005


--- Gora Mohanty <gora_mohanty at yahoo.co.in> wrote:

> I don't follow what you are saying here. To take a
> practical example with the letter "ka", as per my
> understanding the correct order should be
> ka
> ka + anusvara
> ka + visarga
> ka + candrabindu
> Are you saying that we should change this? 

No. You have set it correctly. I also support this
order.

According to phoneticians (as measured with frequency
meters), the right order is:

k [ka+halant]
ka
kaa
ki
kii
ku
kuu
kr [k+vocalic R]
krr [k+vocalic RR]
kl [k+vocalic L]
kll [k+vocalic LL]
ke
kai
ko
kou
ka + candrabindu
ka + anusvara
ka + visarga
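
A minimal sketch of the order listed above, written as a Python sort key
for Devanagari "ka". The names SUFFIX_ORDER, SUFFIX_RANK and syllable_key
are made up for illustration, and the key only handles one consonant
followed by at most one sign. It is not how glibc or CLDR implement
collation.

```python
# Signs that can follow the consonant, in the order listed above.
# The empty string stands for the bare consonant with its inherent "a".
SUFFIX_ORDER = [
    "\u094D",  # halant
    "",        # bare consonant (ka)
    "\u093E",  # aa
    "\u093F",  # i
    "\u0940",  # ii
    "\u0941",  # u
    "\u0942",  # uu
    "\u0943",  # vocalic R
    "\u0944",  # vocalic RR
    "\u0962",  # vocalic L
    "\u0963",  # vocalic LL
    "\u0947",  # e
    "\u0948",  # ai
    "\u094B",  # o
    "\u094C",  # au (written "kou" above)
    "\u0901",  # candrabindu
    "\u0902",  # anusvara
    "\u0903",  # visarga
]
SUFFIX_RANK = {suffix: rank for rank, suffix in enumerate(SUFFIX_ORDER)}
KA = "\u0915"


def syllable_key(syllable):
    """Sort key for one consonant plus at most one following sign."""
    consonant, suffix = syllable[0], syllable[1:]
    return (ord(consonant), SUFFIX_RANK[suffix])


expected = [KA + suffix for suffix in SUFFIX_ORDER]
by_code_point = sorted(expected)                        # plain Unicode order
by_listed_order = sorted(by_code_point, key=syllable_key)
```

Plain code point sorting puts the bare "ka" first and the three vowel
modifiers right after it, so it does not reproduce this order without
such a key.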

But traditional dictionaries for Hindi, Oriya and other
Indic languages also follow "ka+VM < ka".
(VM = vowel modifier, i.e. anusvara < visarga <
candrabindu, as per Oriya traditional
dictionaries.)

(candrabindu < anusvara < visarga, as per the traditional
dictionaries of Hindi and other Indic languages.)

[< = less than]
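
The two vowel modifier orders can be kept in one table per language. The
following is only a sketch: the names VM_ORDER and vm_sort are
hypothetical, the language codes "or" and "hi" are used as plain
dictionary keys, and only the relative order of the three modifiers is
modelled, not their position relative to the bare consonant.

```python
# Relative order of the vowel modifiers, one list per language.
VM_ORDER = {
    # Oriya: anusvara < visarga < candrabindu
    "or": ["\u0B02", "\u0B03", "\u0B01"],
    # Hindi (Devanagari): candrabindu < anusvara < visarga
    "hi": ["\u0901", "\u0902", "\u0903"],
}


def vm_sort(syllables, lang):
    """Sort consonant+VM strings by the rank of their final modifier."""
    rank = {vm: i for i, vm in enumerate(VM_ORDER[lang])}
    return sorted(syllables, key=lambda s: (s[:-1], rank[s[-1]]))


ORIYA_KA = "\u0B15"
HINDI_KA = "\u0915"
oriya = vm_sort([ORIYA_KA + vm for vm in "\u0B01\u0B02\u0B03"], "or")
hindi = vm_sort([HINDI_KA + vm for vm in "\u0903\u0902\u0901"], "hi")
```

For Hindi the table happens to match code point order, while for Oriya
the candrabindu has to be moved from first to last.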


The literary experts and pandits all press us to keep
that unscientific order. It is too difficult and complex
for a computer's default sorting.


> If so, I
> don't think that will be possible as that is the
> accepted dictionary order, and we cannot confront
> computer users with a different order. If you agree
> with this order, the current Indic collation table
> implements this, regardless of where the combined
> letter lies in the word, i.e., it works correctly
> even at the end of the word.

Right. But the traditional order of the (old, orthodox)
literary pandits is not in line with the computer's
defaults. It works where the VM appears at the
beginning or in the middle of a word, but does not work
where the VM appears at the end of a word (i.e. before a
space character).

 
> > (2) The character with a halant practically has a
> > lower value, how to set computer's direct sorting
> > order that character with halant should come first
> > character without halant should come after
> > even if when it occures at the end of a word?
> 
> This is possible to do with the present POSIX
> locales
> used by glibc by defining what is called a collating
> element, e.g., one defines ka+halant as a single
> element, and orders it before ka in the collation
> table. The problem is that there is a glibc bug so
> that
> the locale compiler crashes when defining all the
> collating elements required for all Indian
> languages.
> I am looking for a workaround. The alternative would
> be to split up the comprehensive sorting table into
> one for each language. Doing this should not be a
> problem with the CLDR locale.

Right. But this applies to all Indic languages.
Making "X+halant < X" the default setting is too complex
for all computer operating systems.
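
To make the "X+halant < X" rule concrete, here is a small sketch that
treats consonant+halant as one collating element ranked before the bare
consonant. It is plain Python, not a glibc locale definition, and the
names is_consonant and collation_key are invented for this example. Only
the Devanagari consonant range U+0915 to U+0939 is assumed.

```python
HALANT = "\u094D"


def is_consonant(ch):
    """True for Devanagari consonants ka..ha (U+0915 to U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def collation_key(word):
    """Build a key in which consonant+halant sorts before the consonant."""
    key = []
    i = 0
    while i < len(word):
        ch = word[i]
        if is_consonant(ch) and word[i + 1:i + 2] == HALANT:
            key.append((ord(ch), 0))   # consonant+halant as one element
            i += 2
        else:
            key.append((ord(ch), 1))   # any other character
            i += 1
    return key


sak_halant = "\u0938\u0915\u094D"   # sa + ka + halant (halant at word end)
saka = "\u0938\u0915"               # sa + ka

by_code_point = sorted([sak_halant, saka])
by_key = sorted([saka, sak_halant], key=collation_key)
```

Because the halant is folded into the element of its consonant, the rule
holds at the end of a word as well, where plain code point order places
the shorter string first.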

What would be the root-level solution to this? Does it
not lie in the root-level understanding and encoding of
Indic scripts?

My main aim is to create an awareness among the literary
pandits, so that they set aside the old, unscientific
traditional methods and adopt the new, correct and
scientific order, giving Indic languages global
compatibility in this era of the
internet. Indic could then be simplified and could
become better and easier than English etc.

With regards.

Hariram Pansari


		
__________________________________ 
Do you Yahoo!? 
The all-new My Yahoo! - Get yours free! 
http://my.yahoo.com 
 