On Tue, 07 Dec 2004 01:40:38 +0600, Donald Gaminitillake <semage@mail.ewisl.net> wrote:
> Let me have the Set C and D from unicode.

Set C and set D are derived from set A and set B. Please see the following technical report from the Unicode Consortium on how they are generated. It covers the unions and subsets you wanted a link for.

http://www.unicode.org/reports/tr17/tr17-5.html

I think you are confusing glyphs, characters and code points. Please read this article if you haven't done so already:

http://www.unicode.org/standard/where/

> Your Union is not listed. I enclose the pdf files from unicode for your perusal.

In basic set theory, a set can be defined either by listing all its elements (e.g. set A and set B above) or by stating the rule that generates it. The technical report above explains how this definition is done, and the paper by Dr Gihan and Aruni, which you have somewhat misquoted on www.akuru.org, defines them for Sinhala specifically.

Anuradha

On Tue, 7 Dec 2004 10:52:41 +0600, Anuradha Ratnaweera <gnu.slash.linux@gmail.com> wrote:
> I think you are confusing glyphs, characters and code points. Please read this article if you haven't done so already:
>
> http://www.unicode.org/standard/where/

Quoting a complete paragraph from the above text:

"The character you are looking for may be represented as a sequence of code points in Unicode. Here are examples of such characters, and their representation as a sequence of code points."

Anuradha
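
The point of the quoted paragraph, that one written letter may be stored as a sequence of code points, can be checked directly. Below is a minimal Python sketch, assuming the two code points that come up elsewhere in this thread (0d9a followed by 0dcf) and the single code point 0d85. The helper name `code_points` is hypothetical and only there for illustration; how a font draws the result is a rendering matter the sketch does not cover.

```python
import unicodedata

# Two code points from the thread: the consonant at 0d9a followed by
# the vowel sign AELA-PILLA at 0dcf. Together they spell one written letter.
sequence = "\u0d9a\u0dcf"

# A single independent vowel (AYANNA) occupies one code point.
ayanna = "\u0d85"


def code_points(text):
    """Return the code points of text in U+XXXX notation (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


for label, text in (("ayanna", ayanna), ("ka + aela-pilla", sequence)):
    print(label, code_points(text))
    for ch in text:
        # unicodedata.category starts with "L" for letters and "M" for marks
        print("   ", unicodedata.name(ch), unicodedata.category(ch))
```

The second string has length 2 in Python even though a reader sees one letter, which is the distinction between characters and code points that the article describes.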
Received on Tue Dec 07 15:43:37 2004

Dear Harshula

SINHALA LETTER AYANNA = 0d85
Sinhala letter ka = 0d9a

In your system, with a joiner (in set B) + 0dcf AELA-PILLA, you get "KA".

"Dumriya" (train): with set A you cannot get DU or RI correctly.

SLS 1134 does not talk of a "UNION", but of Unicode 0d80-0dff.

Can you show me a document from SLSI where they define SLS 1134 as a union and give all the locations? (Even a photocopy of a page would do, if you can send it as an image.)

Now show me where sets C and D are located in Unicode or in SLSI, so that I can download them as a PDF.

This is where the problem is. All characters need individual locations, as was done in the past with Monotype etc. (see letterpress lead characters).

Then you can have more magic in Sri Lanka's ICT.

Best
Donald

harshula wrote:
> On Mon, 2004-12-06 at 19:54 +0600, Donald Gaminitillake wrote:
>> Dear Harshula
>>
>> I am still at the same place (as you wrote, word by word):
>>
>> SLS 1134 = Unicode 0d80-0dff locations (this is the incomplete Sinhala alphabet)
>>
>> Is this correct?
>
> Hi Donald,
>
> Absolutely not! SLS 1134 is the union of:
>
> Set A = Set A1 Union Set A2 Union Set A3 = 0d80-0dff
> Set B = 200C-200D
> Set C = Cartesian product of A2 and A3
> (Set C2 = Union of C and A2)
> Set D = a subset of the Cartesian product of B, A3, A2 and C2
>
>> I find your "set b" contains General Punctuation Range: 2000–206F this includes
>
> No. Set B does not contain general punctuation. As I stated in my previous email:
>
> "Set B consists of two elements, the magical ZWNJ and ZWJ"
>
> When I say that Set B consists of two elements, why would you list all these other unrelated elements? Set B consists of TWO elements, ZWNJ and ZWJ.
>
> Before we continue any further, please answer these two questions with an explanation.
>
> 1) Can 0d85 encode a single Sinhala letter?
> 2) Can 0d9a0dcf encode a single Sinhala letter?
>
> Regards,
> Harshula
>
>> Spaces, Dashes, EM SPACE, FOUR-PER-EM SPACE, DOUBLE VERTICAL LINEs etc, LINE SEPARATORs, PER MILLE SIGN etc, DOUBLE DAGGERs etc, QUESTION EXCLAMATION MARKs etc, COMMERCIAL MINUS SIGNs etc, REVERSED SEMICOLON etc, QUADRUPLE PRIME, Greek enotikon, Urdu paragraph separator, Japanese kome, INVISIBLE SEPARATOR etc
>>
>> and Formatting characters
>>
>> 200C ZERO WIDTH NON-JOINER = ZWNJ
>> 200D ZERO WIDTH JOINER = ZWJ
>> 200E LEFT-TO-RIGHT MARK = LRM
>> 200F RIGHT-TO-LEFT MARK
>>
>> Where are the Unicode locations and/or SLSI locations for the Sinhala language beyond this? Which you refer to as
>>
>> Set C = Cartesian product of A2 and A3 &
>> Set D = a subset of the Cartesian product of B, A3, A2 and C2
>>
>> Can you give me similar locations to download from Unicode?
>>
>> I will be in and out of Colombo during the next few days and emails may get delayed.
>>
>> Best
>> Donald
>>
>> harshula wrote:
>>> Hi Donald,
>>>
>>> Set A = Set A1 Union Set A2 Union Set A3 = 0d80-0dff
>>> Set B = 200C-200D
>>> Set C = Cartesian product of A2 and A3
>>> (Set C2 = Union of C and A2)
>>> Set D = a subset of the Cartesian product of B, A3, A2 and C2
>>>
>>> You seem to be very interested in Set B, so I assume you already understand and acknowledge Set C, as it's derivable from Set A2 and A3.
>>>
>>> Set B consists of two elements, the magical ZWNJ and ZWJ:
>>>
>>> http://www.unicode.org/charts/PDF/U2000.pdf
>>>
>>> Regards,
>>> Harshula

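
The construction Harshula describes, where Set C is generated from A2 and A3 and is not listed at locations of its own, can be sketched in Python. This is only an illustration under stated assumptions: the thread never spells out A1, A2 and A3, so the sketch assumes A2 holds consonants and A3 holds vowel signs, and it uses three sample members of each, not the full sets of SLS 1134. The names `A2_SAMPLE`, `A3_SAMPLE` and `cartesian` are hypothetical. Set D is left out because the thread does not say which subset it is.

```python
from itertools import product

ZWNJ, ZWJ = 0x200C, 0x200D

# Set A is defined by listing: every location in 0d80-0dff.
SET_A = set(range(0x0D80, 0x0E00))

# Set B is defined by listing: exactly two elements.
SET_B = {ZWNJ, ZWJ}

# ASSUMPTION: A2 = consonants, A3 = vowel signs. These are small
# illustrative samples, not the complete sets from SLS 1134.
A2_SAMPLE = [0x0D9A, 0x0DAF, 0x0DBB]  # ka, da, ra
A3_SAMPLE = [0x0DCF, 0x0DD2, 0x0DD4]  # vowel signs aa, i, u


def cartesian(a2, a3):
    """Generate every (A2, A3) pair as a two-code-point string."""
    return {chr(c) + chr(v) for c, v in product(a2, a3)}


# Set C is defined by rule: it is generated, not stored at new locations.
SET_C_SAMPLE = cartesian(A2_SAMPLE, A3_SAMPLE)

du = "\u0daf\u0dd4"  # da + vowel sign u
ri = "\u0dbb\u0dd2"  # ra + vowel sign i
print(len(SET_C_SAMPLE), du in SET_C_SAMPLE, ri in SET_C_SAMPLE)
```

Every string in the generated set is built only from locations already in Set A, which is why no separate chart of locations exists for Set C. Whether a font then draws combinations such as DU or RI with the right shape is a rendering question that this sketch does not address.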
This archive was generated by hypermail 2.1.8 : Wed Dec 08 2004 - 17:56:45 LKT