Friday, February 29, 2008

Syllable Segmentation Software

Please download and try the Burmese Syllable Segmenter.
It is free software, although the source may not be open. Once the documentation is complete, we will publish the technology.

Publications

Some of our publications:

2008

Wunna Ko Ko & Thin Zar Phyo, 2008. Selection of XML Tag Set for Myanmar National Corpus. The 6th Workshop on Asian Language Resources (ALR6), IJCNLP08, Hyderabad, India. Slides

2007

Wunna Ko Ko, 2007. Burmese Language Enabling at OpenOffice.org. Annual Conference of Myanmar Engineering Society, Yangon, Myanmar. Slides

2006

Wunna Ko Ko, Yoshiki Mikami. Languages of Myanmar in Cyberspace. Nagaoka University of Technology Bulletin on Language Science and Humanity, Vol. 19, pp. 249-264 (2005)

Zavarsky Pavol, Wunna Ko Ko, Yew Choong Chew, Yoshiki Mikami, Tatsuo Kobayashi. Unicode Spreading on the Web: A case of Asian & African Domains. Internationalization and Unicode Conference 2006, San Francisco, USA, March 2006

Thursday, November 8, 2007

Sample Burmese Sorting Program

I have uploaded a trial version of the Burmese Sorting Software. It sorts entries in the order used by the Burmese-Burmese dictionary published by the Myanmar Language Commission.

Burmese Sorting Software Download

Thursday, October 18, 2007

Burmese Sorting

I have done a lot of research on Burmese sorting in recent years, and I have now successfully developed an algorithm. It can sort any data that complies with the ISO/IEC JTC1/SC2/WG2 standard and the technical note UTN#11.

The fonts that comply with the above standard are, as far as I know, Myanmar2, Myanmar3, Parabike and Padauk. There may be many more compliant fonts. The input method (keyboard) should also comply with the same standard.

It sorts input in the dictionary order of the Burmese-Burmese and Burmese-English dictionaries released by the Myanmar Language Commission.
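The algorithm itself is not published here, so the following is only a generic sketch of how a dictionary-order sort can be built from a weight table. The function name `make_sort_key` and the table `ILLUSTRATIVE_ORDER` are assumptions made for this example. The table lists only the first five Burmese consonants and is not the Myanmar Language Commission order, which may well require more than per-character weights.

```python
from typing import Callable, Dict, List, Tuple


def make_sort_key(order: str) -> Callable[[str], List[Tuple[int, int]]]:
    """Build a sort key from a string that lists characters in the desired order."""
    rank: Dict[str, int] = {ch: i for i, ch in enumerate(order)}

    def key(word: str) -> List[Tuple[int, int]]:
        # Characters in the table sort by their rank; all others sort
        # after them, by code point, so the key never fails on unknown input.
        return [(0, rank[ch]) if ch in rank else (1, ord(ch)) for ch in word]

    return key


# Illustrative table only (an assumption): U+1000..U+1004, the first five consonants.
ILLUSTRATIVE_ORDER = "\u1000\u1001\u1002\u1003\u1004"

words = ["\u1002", "\u1000\u1002", "\u1001", "\u1000"]
sorted_words = sorted(words, key=make_sort_key(ILLUSTRATIVE_ORDER))
```

Because the key is a list of weights, a shorter word sorts before any longer word that begins with it, which is the usual dictionary convention.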

With the help of my colleague, I have developed sample software and a .dll file for sorting.

Since I bear all the costs myself, I need some money to recover the research cost. I will provide the sample software for free, but I will sell the .dll file for a fee. The .dll file can be used by developers who wish to include Burmese sorting in their own programs.

I will upload the sample software in a few days. (I can't use the internet at the office, so I will upload it as soon as I get a chance.)

Developers, please contact me at wunnakoko at gmail dot com. I will provide the details of the cost.

No Internet Connection at Myanmar Unicode & NLP Research Center

Internet access was not available to the public in Myanmar from 28th September 2007, but it was restored to normal by 15th October 2007. However, the authorities at the Myanmar Computer Federation (MCF) have blocked the staff of the Myanmar Unicode and NLP Research Center to this day.

I don't know when they will open it again. This has slowed the advancement of NLP research considerably.

Saturday, September 22, 2007

Burmese Language Project at OpenOffice.org

I have created the "Burmese Language Project" at OpenOffice.org. First, we need to enable the Burmese language in the I18N (internationalization) part. I18N covers the support needed to use Burmese with OpenOffice. It includes:
  • Word, line and sentence break
  • Search and replace
  • Paragraph numbering
  • Transliteration
  • Character classification
  • Number Formats
  • Calendars
  • Collation
  • Locale data
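As a small illustration of one item in the list, number formats, here is a sketch that maps ASCII digits to the Myanmar digits U+1040 to U+1049. It is not OpenOffice.org code, and the function name `to_myanmar_digits` is an assumption made for this example.

```python
# Translation table from ASCII digits 0-9 to Myanmar digits U+1040-U+1049.
ASCII_TO_MYANMAR = {ord(str(d)): chr(0x1040 + d) for d in range(10)}


def to_myanmar_digits(text: str) -> str:
    """Replace every ASCII digit in text with the matching Myanmar digit."""
    return text.translate(ASCII_TO_MYANMAR)


year = to_myanmar_digits("2008")
```

Characters other than ASCII digits pass through unchanged, so the function can be applied to a whole formatted number or date.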
I have put together a team to work on it. Our team members do not work for money, but they need money to work, so I am looking for a research grant. Anyone who wants to contribute, whether money or work, please contact me at wunnakoko at gmail dot com.

Saturday, September 15, 2007

Syllable Segmentation & Line Breaking Software

I hope we can publish the beta version of the Syllable Segmentation Software in the near future. Although it is a beta version, we have already tested it for some time. The software will do the following:
1. Phonologic segmentation of Burmese Syllables
2. Orthographic segmentation of Burmese Syllables
3. Line Breaking or Word Wrapping
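To give a rough idea of the last two items, here is a simplified sketch of orthographic segmentation and syllable-based line wrapping. It is not the software described in this post. It assumes text encoded as in Unicode 5.1 or later, and it uses one rule only: a syllable starts at a consonant (U+1000 to U+1021) that is not stacked under the previous character (U+1039) and is not itself a final consonant marked with asat (U+103A). The names `segment` and `wrap` are assumptions made for this example, and the width is counted in code points, not display columns.

```python
import re
from typing import List

# Zero-width match at the start of each syllable (simplified rule):
#   - not preceded by the stacking sign U+1039
#   - followed by a consonant that is not closed by asat U+103A
#     (optionally after dot below U+1037) and does not start a stack
_SYLLABLE_START = re.compile(
    r"(?<!\u1039)(?=[\u1000-\u1021](?!\u1037?[\u103A\u1039]))"
)


def segment(text: str) -> List[str]:
    """Split text into syllable-like chunks (needs Python 3.7+ for empty-match split)."""
    return [part for part in _SYLLABLE_START.split(text) if part]


def wrap(text: str, width: int) -> List[str]:
    """Greedy line breaking that only breaks between syllables."""
    lines: List[str] = []
    current = ""
    for syllable in segment(text):
        if current and len(current) + len(syllable) > width:
            lines.append(current)
            current = syllable
        else:
            current += syllable
    if current:
        lines.append(current)
    return lines


# The word "Myanmar": U+1019 U+103C U+1014 U+103A U+1019 U+102C
word = "\u1019\u103C\u1014\u103A\u1019\u102C"
syllables = segment(word)
```

Digits, punctuation, independent vowels and non-Burmese text are not treated specially here, and phonological segmentation would need further rules.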

It can handle three types of documents:
1. Text documents (.txt files)
2. XML documents (.xml files)
3. MS Word documents (.doc files)

By handling XML documents, we hope it will also be useful for segmenting other types of documents, such as spreadsheet files and database files.
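One way to process an XML document without touching its markup is to apply the segmenter to the text nodes only. The sketch below uses Python's standard `xml.etree.ElementTree`. The function name `transform_text_nodes` is an assumption made for this example, and `mark_boundaries` is a stand-in for a real segmenter.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def transform_text_nodes(xml_source: str, fn: Callable[[str], str]) -> str:
    """Apply fn to every text node and leave tags and attributes as they are."""
    root = ET.fromstring(xml_source)
    for elem in root.iter():
        if elem.text:   # text directly after the opening tag
            elem.text = fn(elem.text)
        if elem.tail:   # text after the closing tag, inside the parent
            elem.tail = fn(elem.tail)
    return ET.tostring(root, encoding="unicode")


def mark_boundaries(text: str) -> str:
    # Stand-in for a real segmenter: marks each space with a vertical bar.
    return text.replace(" ", "|")


result = transform_text_nodes("<doc><p>a b</p> c d</doc>", mark_boundaries)
```

Any format that can be exported to XML could be handled the same way, as long as the text to segment sits in text nodes and not in attributes.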

I hope all of our friends and the Burmese community will help us test it.