Monday, April 25, 2011

How does it work?

First, the user is required to log in to the system. There are three user types: Administrators, Translators and Reviewers.


Once logged in as an admin, the user can register new authorized users by giving them a username and password. The administrator can also upload the projects/resource files to be translated. Those projects are stored in a specific folder in the server directory and are later used by translators. In addition, the admin can correct words in the Translation memory and add words to the corpus and glossary.


When logged in as a Translator, the user can retrieve the projects uploaded to the server by the admin and translate them using the translate manager module. First, the translator can ‘autowrite’ the words that already have translations. If a word is not translated automatically, the translator enters the translation manually, with the support of word suggestions from the corpus and glossary. After the translations are entered, the project is saved and the translation memory is updated as well.
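The autowrite step described above can be sketched roughly as follows. This is only an illustration in Python, not the project's PHP code: the function names, the in-memory dictionary standing in for the translation memory database, and the sample words are all assumptions.

```python
# Hypothetical in-memory translation memory; the real system keeps it in a database.
translation_memory = {"File": "ගොනුව", "Open": "විවෘත කරන්න"}


def autowrite(source_strings, memory):
    """Split strings into those already translated and those still pending."""
    translated = {}
    pending = []
    for text in source_strings:
        if text in memory:
            translated[text] = memory[text]  # reuse the stored translation
        else:
            pending.append(text)  # the translator has to enter this one
    return translated, pending


def save_translation(source, target, memory):
    """Record a manually entered translation so it can be autowritten next time."""
    memory[source] = target


translated, pending = autowrite(["File", "Save"], translation_memory)
save_translation("Save", "සුරකින්න", translation_memory)
```

After the manual entry is saved, a second call to `autowrite` with the same strings would leave nothing pending.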


A user logged in as a Reviewer can view the words and translations in the Translation memory. If there are wrong translations, the reviewer can mark them and notify the Administrators, who can then correct those words.
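The division of work between the three user types can be summarized as a simple privilege table. The sketch below is illustrative Python; the privilege names are hypothetical labels paraphrased from the description above, not identifiers from the actual system.

```python
# Privileges per user type, paraphrased from the description above.
# All privilege names are hypothetical labels chosen for this sketch.
PRIVILEGES = {
    "administrator": {
        "create_user",
        "upload_project",
        "correct_translation_memory",
        "add_corpus_word",
        "add_glossary_word",
    },
    "translator": {"retrieve_project", "translate_project", "view_suggestions"},
    "reviewer": {"view_translation_memory", "mark_wrong_translation"},
}


def can(role, action):
    """Return True if the given user type is allowed to perform the action."""
    return action in PRIVILEGES.get(role, set())
```

For example, `can("reviewer", "mark_wrong_translation")` is true, while `can("translator", "upload_project")` is false.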


The Universal Open Source Localizer follows a proper localization process in which Administrators upload projects, Translators are dedicated to performing the translations, and Reviewers verify the accuracy of the translations. It is the only localization tool of this type available for the Sinhala language.
The major components of the system have been successfully integrated. We have carried out system integration testing and verified the compatibility of each module in the system. In addition, we designed the web pages to be user friendly. The Universal Open Source Localizer performs properly as per the requirements.

We put some effort into adapting a Fuzzy machine for our localizer. However, we found that existing Fuzzy machines are not appropriate for the Sinhala language. Hence a Fuzzy machine for the Sinhala language needs to be researched and developed separately.

When we tried to develop the Sinhala spell checker, we came across a limitation of Unicode support in PHP. In future work we will have to develop the Sinhala spell checker in another language and embed it into the system.


The limitation of Unicode support in PHP is also one reason why we use the .XLIFF format in our translate manager module. As PHP does not read Sinhala Unicode, we store the encoded Sinhala words in the .XLIFF file and decode them in the .po file.
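The post does not say which encoding is used for the stored words. One plausible approach, shown here purely as an assumption, is to write each Sinhala character as an ASCII-only numeric character reference and to turn it back into Unicode when the output file is produced. The sketch is in Python for illustration and the function names are hypothetical.

```python
import html


def encode_for_xliff(text):
    """Replace every non-ASCII character with a numeric reference such as &#3484;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def decode_for_po(encoded):
    """Turn numeric character references back into the original Unicode text."""
    return html.unescape(encoded)


encoded = encode_for_xliff("ගොනුව")   # ASCII-only string, safe for byte-oriented code
restored = decode_for_po(encoded)      # the original Sinhala word again
```

The encoded form contains only ASCII characters, so string handling that is not Unicode-aware can pass it through unchanged.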


The Universal Open Source Localizer can be regarded as a framework or an API. Additional plug-ins can be added to this framework by changing the source code. The Universal Open Source Localizer:
• Provides a common platform for the Sinhala and Tamil languages
• Is open source and freely available
• Gives Sinhala suggestions
• Is highly accurate (a proper localization process is followed, managed by different types of users)

Wednesday, March 30, 2011


Our team is currently working on the development of the Universal Open Source Localizer. We have completed the system design phase. The overall system is divided into four modules, and the coding is carried out separately for each module. So far we have developed the Translate manager module and the Corpus, and adapted the Dictionary.

The translate manager handles the string conversion from English to Sinhala. It takes a resource file as its input and converts it into the .xliff file type. Thereafter a dedicated process is called to translate that new file into Sinhala, using string processing techniques and Unicode support. Once the translation is done, we convert the file back to its original file type, which is provided as the output of this module.
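The three stages of the translate manager (convert to XLIFF, translate, convert back) can be sketched as below. This is a simplified Python illustration rather than the module's PHP code: it handles only `key=value` lines of a .properties file, omits the XLIFF namespace, and every function name and sample key is an assumption.

```python
import xml.etree.ElementTree as ET


def properties_to_xliff(text, original="messages.properties"):
    """Build a minimal XLIFF 1.2 style tree from key=value lines."""
    xliff = ET.Element("xliff", version="1.2")
    file_el = ET.SubElement(xliff, "file", {
        "original": original,
        "source-language": "en",
        "target-language": "si",
        "datatype": "javapropertyfile",
    })
    body = ET.SubElement(file_el, "body")
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments and malformed lines
        key, value = line.split("=", 1)
        unit = ET.SubElement(body, "trans-unit", id=key.strip())
        ET.SubElement(unit, "source").text = value.strip()
        ET.SubElement(unit, "target").text = ""
    return xliff


def translate(xliff, memory):
    """Fill each target from a source-to-translation dictionary."""
    for unit in xliff.iter("trans-unit"):
        source = unit.find("source").text
        unit.find("target").text = memory.get(source, "")


def xliff_to_properties(xliff):
    """Convert back, falling back to the source text where no translation exists."""
    lines = []
    for unit in xliff.iter("trans-unit"):
        target = unit.find("target").text or unit.find("source").text
        lines.append(f"{unit.get('id')}={target}")
    return "\n".join(lines)


doc = properties_to_xliff("# menu labels\nmenu.file=File\nmenu.save=Save")
translate(doc, {"File": "ගොනුව"})
output = xliff_to_properties(doc)
```

Here `menu.file` receives its Sinhala translation, while `menu.save` keeps the English text because the dictionary has no entry for it.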

The corpus module is developed as a database of words and provides word suggestions to the translators. The dictionary module is used to check the accuracy of words and to give suggestions for incorrect words. By now we have completed most of the coding and the unit testing of these modules.

Along with the system development, our team is working on the project documentation. We still have to develop the User management module. Thereafter the separate modules need to be integrated and integration testing carried out.


Friday, January 14, 2011

The Work Breakdown Structure for Universal Localizer




The Universal Open Source Localizer comprises four major modules, namely the Translate manager, Corpus, Glossary and User management modules. The above work breakdown structure conveys the basic functions to be performed by each module.

Translate manager :
- Identify the type of resource files
- Convert into .XLIFF file
- Translate into Sinhala and Tamil
- Convert back to relevant file type
- Store the Translation Memory (TM) in a database
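The first function in the list, identifying the type of a resource file, could be as simple as a lookup on the file extension. The following Python sketch assumes exactly that; the mapping and the function name are hypothetical, and only the three extensions named elsewhere on this blog are included.

```python
import os

# Hypothetical mapping from file extension to a format label.
SUPPORTED_TYPES = {
    ".po": "gettext",
    ".dtd": "dtd",
    ".properties": "java-properties",
}


def identify_resource_type(filename):
    """Return the format label for a resource file, based on its extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported resource file: {filename}")
    return SUPPORTED_TYPES[extension]
```

The returned label can then be used to choose the matching converter to and from .XLIFF.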

Corpus :

- Store Sinhala and Tamil words separately
- Use those words to check spellings in translation
- Provide word suggestions
- Allow users to add words to the corpus
- Handle program specific symbols
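A minimal sketch of the spelling check and word suggestion functions is given below, using Python's standard `difflib` module for approximate matching. This is an assumption made for illustration: the matching technique, the function names and the small sample corpus are not taken from the actual module.

```python
import difflib

# Tiny sample corpus; the real corpus is a database of Sinhala and Tamil words.
sinhala_corpus = ["ගොනුව", "සංස්කරණය", "සුරකින්න"]


def is_known(word, corpus):
    """Spelling check: a word counts as correct if it is in the corpus."""
    return word in corpus


def suggest(word, corpus, limit=3):
    """Return up to `limit` corpus words that closely resemble the input."""
    return difflib.get_close_matches(word, corpus, n=limit, cutoff=0.6)


def add_word(word, corpus):
    """Let a user add a new word to the corpus, avoiding duplicates."""
    if word not in corpus:
        corpus.append(word)


suggestions = suggest("ගොනව", sinhala_corpus)  # input with one vowel sign missing
```

A separate list per language keeps the Sinhala and Tamil words apart, as the first function in the list requires.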

Glossary :
- Store Sinhala and Tamil words for technical words
- Provide word suggestions

User management module :
- Create different user types (Admin, Translators, and Viewers)
- Store users' data (basically user name and password) in a database
- User authentication and access control
- Assign appropriate privileges for users
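The storage and authentication functions above might look roughly like the following Python sketch. The breakdown does not say how passwords are stored; keeping a salted hash instead of the plain password is an assumption of this example, as are the function names and the dictionary standing in for the user database.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000

# Stand-in for the user table; the real module uses a database.
users = {}


def hash_password(password, salt):
    """Derive a salted hash so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(username, password, role):
    """Create a user with a random salt and the given user type."""
    salt = os.urandom(16)
    users[username] = {
        "salt": salt,
        "digest": hash_password(password, salt),
        "role": role,
    }


def authenticate(username, password):
    """Return the user's type on success, or None if the login fails."""
    record = users.get(username)
    if record is None:
        return None
    candidate = hash_password(password, record["salt"])
    if hmac.compare_digest(candidate, record["digest"]):
        return record["role"]
    return None
```

The user type returned by `authenticate` is what the access control step would use to decide which privileges apply.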

Saturday, December 4, 2010

Universal Open Source Localizer

The Universal Open Source Localizer is web-based software that provides a common platform for translating software applications from English to the Sinhala and Tamil languages. The translation is done using Unicode. Unlike existing localization software, this system is capable of translating any kind of system file, such as .po, .dtd and .properties files.

The Universal Open Source Localizer is composed of four main components, namely:
- The user management module
- The corpus
- The glossary
- The translate manager

This system is developed using PHP and MySQL.