[Oriya-group] Re: [indic] Re: Indic Sorting/collation challenging Problem - CLDR 1.3- data for Indic

Hariram Pansari hrpansari at yahoo.com
Tue Jan 18 21:56:04 IST 2005


Sir, 
Before sending the email (from my PC running Windows XP with IE6), I
checked that View --> Encoding was set to Unicode (UTF-8).

I cannot trace how it still got garbled (I also see the garbling in
the copy of the mail I received back). Would you kindly guide me
further?
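
One common way such garbling arises (an assumption here; nothing in this thread establishes which step went wrong) is that the UTF-8 bytes of a message get decoded somewhere along the way as a single-byte charset such as Windows-1252. A minimal Python sketch of that effect, and of reversing it when no bytes were lost:

```python
# Sketch: UTF-8 bytes misread as Windows-1252 turn into mojibake.
original = "\u0915\u0939\u093E"          # Devanagari KA + HA + vowel sign AA

utf8_bytes = original.encode("utf-8")     # 3 characters become 9 bytes
garbled = utf8_bytes.decode("cp1252")     # each byte is shown as one wrong character

# If every byte survived the misreading, reversing the two steps restores the text.
repaired = garbled.encode("cp1252").decode("utf-8")

print(len(original), len(garbled))        # character counts before and after
print(repaired == original)
```

The repair only works when the wrong charset defines a character for every byte involved; some UTF-8 byte values have no Windows-1252 mapping, and text containing them cannot be round-tripped this way.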

This is a great problem with Unicode: the text of a mail can arrive
at its destination changed into different formats, such as

1. U+xxxx
2. /uxxxx
3. &#xxxx(Decimal value)
4. उमड
5. #xxxx (Decimal value)
6. UTF8
7. UTF7
8. Proper Displayable Unicode character (with correct
font)
9.... others

There should be a program to convert between these formats. This is a
bare necessity. Could such a utility (preferably open source) be
developed and hosted on the Unicode.org website for free use?
Otherwise Unicode may not become popular, and people will be afraid
of it or hate it.
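
As a rough illustration of how small such a converter can be, here is a Python sketch covering several of the notations listed above. The function names are hypothetical, chosen only for this example, and the sketch relies solely on the Python standard library.

```python
import html

def to_u_plus(text):
    """'U+XXXX' notation, one token per code point, separated by spaces."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

def from_u_plus(notation):
    """Inverse of to_u_plus: parse space-separated 'U+XXXX' tokens."""
    return "".join(chr(int(token[2:], 16)) for token in notation.split())

def to_decimal_ncr(text):
    """HTML decimal numeric character references, e.g. '&#2325;'."""
    return "".join(f"&#{ord(ch)};" for ch in text)

def to_hex_ncr(text):
    """HTML hexadecimal numeric character references, e.g. '&#x0915;'."""
    return "".join(f"&#x{ord(ch):04X};" for ch in text)

def from_ncr(markup):
    """Decode decimal or hexadecimal references back to characters."""
    return html.unescape(markup)

word = "\u0915\u0939\u093E"               # Devanagari KA + HA + vowel sign AA

print(to_u_plus(word))
print(to_decimal_ncr(word))
print(to_hex_ncr(word))
print(word.encode("utf-8").hex())         # UTF-8 bytes, shown as hex digits
print(word.encode("utf-7"))               # UTF-7 bytes
```

Each notation is only a different spelling of the same code points, so converting between any two of them amounts to decoding to a string and encoding again.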

However, I am rewriting my Unicode characters as hexadecimal code
points in plain ASCII text, for easy understanding and to avoid such
problems.

Please see my message again below.

With regards.

Hariram Pansari

--- Mark Davis <mark.davis at jtcsv.com> wrote:

> You have the charset settings on your emailer set incorrectly, and
> your characters are getting garbled. Could you please fix the
> settings and resend?
> 
> > Should the default collation order not be based on the scientific
> > and practical sense of computing, overriding all traditions?
> 
> Not quite. Our goal for the default order is to represent the
> sorting order that users are most likely to expect, no matter how it
> was derived. That must be tempered somewhat by the fact that in many
> cases users' expectations are not set in stone. And where there *is*
> leeway in user expectations, we do want to choose the tailoring that
> will give the best performance and cost the least in terms of
> memory, especially for database indexes.
> 
> Mark
> 
> ----- Original Message -----
> From: "Hariram Pansari" <hrpansari at yahoo.com>
> To: "Mark Davis" <mark.davis at jtcsv.com>;
>     <gora_mohanty at yahoo.co.in>; "Indic" <indic at unicode.org>;
>     <hrpansari at vsnl.net>; <omvikas at mit.gov.in>
> Sent: Monday, January 17, 2005 05:20
> Subject: Re: [indic] Re: Indic Sorting/collation challenging
>     Problem - CLDR 1.3- data for Indic
> 
> 
> > Sir, you have elaborated my point very correctly. Many thanks.
> >
> > (1)
> > But again I would like to say that,
> >
> > without using any special collation routine or settings, and using
> > only a computer's default sorting order, I faced this problem:
> > (1)
> > words ending in
> > U+0901 ( ◌ँ ) Devanagari Sign Candrabindu
> > U+0902 ( ◌ं ) Devanagari Sign Anusvara
> > U+0903 ( ◌ः ) Devanagari Sign Visarga
> > (i.e. with the sign just before a space character)
> > automatically appear after the words without these signs
> > (कहा<कहाँ).

U0915+U0939+U093E < U0915+U0939+U093E+U0901

> > (as this is truly scientific order)
> >
> > whereas as per traditional dictionary sort order these should
> > appear before
> > (कहाँ<कहा).

U0915+U0939+U093E+U0901 < U0915+U0939+U093E
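
The "default" order described here matches a plain code point comparison, in which a string that is a prefix of another always sorts first. A minimal Python sketch of that behaviour, assuming the default sort in question compares code points (Python's built-in string comparison does; other systems may differ):

```python
# Sketch: plain code point comparison puts the shorter prefix first.
without_sign = "\u0915\u0939\u093E"         # KA + HA + vowel sign AA
with_sign = "\u0915\u0939\u093E\u0901"      # the same word plus CANDRABINDU

ordered = sorted([with_sign, without_sign])

for word in ordered:
    print(" ".join(f"U+{ord(ch):04X}" for ch in word))
```

Because the two words agree on their first three code points, the comparison is decided only by length, so the word without the sign comes first.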

> >
> > (2)
> > words ending in
> > U+094D ( ◌् ) Devanagari Sign Virama (aka halant)
> > (i.e. with the sign just before a space character)
> > automatically appear after the words without it
> > (रम<रम्).

U0930+U092E < U0930+U092E+U094D

> > (This seems to be neither the scientific order nor the
> > traditional order.)
> >
> > X+Virama, having the lower value, should appear before
> > (रम्<रम).

U0930+U092E+U094D < U0930+U092E
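
The ordering asked for in (1) and (2) can be sketched with a custom sort key in which the four signs weigh less than an end-of-word marker, which in turn weighs less than every other character. This is only an illustration of the requested ordering in Python; it is not the Unicode Collation Algorithm or any CLDR tailoring, and the names `LOW_SIGNS` and `sign_first_key` are hypothetical.

```python
# Sketch: make words ending in certain signs sort before the bare word.
LOW_SIGNS = {0x0901, 0x0902, 0x0903, 0x094D}   # candrabindu, anusvara, visarga, virama

def sign_first_key(word):
    """Weights: listed signs (0) < end of word (1) < any other character (2)."""
    weights = [(0, ord(ch)) if ord(ch) in LOW_SIGNS else (2, ord(ch)) for ch in word]
    weights.append((1, 0))                      # explicit end-of-word marker
    return tuple(weights)

kaha = "\u0915\u0939\u093E"
kaha_candrabindu = "\u0915\u0939\u093E\u0901"
ram = "\u0930\u092E"
ram_virama = "\u0930\u092E\u094D"

ordered = sorted([kaha, kaha_candrabindu, ram, ram_virama], key=sign_first_key)

for word in ordered:
    print(" ".join(f"U+{ord(ch):04X}" for ch in word))
```

The key treats every occurrence of these signs alike, including a Virama inside a conjunct, so it reproduces the word-final cases above and nothing more; a real tailoring would need to consider the rest of the script.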


> >
> > (3)
> > As you elaborated, CLDR provides, and can provide, many collation
> > orders for a language.
> >
> > Should a UNIQUE collation order for each language not be
> > standardised?
> >
> > Is standardising a collation order also part of Unicode's area of
> > responsibility?
> >
> > People would like a collation order standardised universally for
> > all OSs (Windows/Linux/Unix/Mac/all...), because different
> > collations will create many problems for a language's data
> > indexing.
> >
> > (4)
> > Should the default collation order not be based on the scientific
> > and practical sense of computing, overriding all traditions?
> >
> > With regards.
> >
> > Hariram Pansari



		