---
I'm trying to create a public domain database of common words and expressions that developers can use in their mobile applications. In a few years, several billion people are going to have smartphones and most won't speak English.
I've been writing iOS language learning software myself (http://www.h4labs.com), so I realize that translation and localization are a huge pain, being both costly and time-consuming.
Creating a database, and building a better process, should help app developers reach a much larger audience. I'm trying to get people to contribute to the database. Even if it's simply a matter of taking your current localization files and adding them to the sheet, I'll sort through the data for duplicates.
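For contributors starting from iOS localization files, here is a minimal sketch of turning a `Localizable.strings` file into tab-separated rows that could be pasted into the sheet. The function names (`parse_strings`, `strings_to_tsv`) and the two-column output layout are assumptions for illustration, not the spreadsheet's actual format, and the parser only handles plain `"key" = "value";` entries.

```python
import csv
import io
import re

# Matches one `"key" = "value";` entry of an iOS .strings file.
# Backslash escapes (such as \") inside the key or value are tolerated
# and left exactly as written.
ENTRY = re.compile(r'"((?:[^"\\]|\\.)*)"\s*=\s*"((?:[^"\\]|\\.)*)"\s*;')


def parse_strings(text):
    """Return the (key, value) pairs found in the text of a .strings file."""
    # Comments are not parsed specially; they are skipped simply because
    # they do not match the entry pattern.
    return ENTRY.findall(text)


def strings_to_tsv(text, language):
    """Render the entries as TSV with a header row: key, <language>."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["key", language])  # hypothetical column layout
    for key, value in parse_strings(text):
        writer.writerow([key, value])
    return out.getvalue()


if __name__ == "__main__":
    sample = '/* Buttons */\n"Cancel" = "Annuler";\n"OK" = "OK";\n'
    print(strings_to_tsv(sample, "french"), end="")
```

The output can be pasted straight into a spreadsheet, since tabs map to columns.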
I have a Google Spreadsheet, which is probably the easiest way to gather the initial data:
https://docs.google.com/spreadsheet/ccc?key=0ArVkFagUZg7bdHB0MTNuMDJySGpnazFpWVZMVUVVNmc&usp=sharing
However, GitHub might be better in the long run, especially since it renders CSV and TSV files as formatted tables: https://help.github.com/articles/rendering-csv-and-tsv-data
Here's what the data looks like on GitHub:
https://github.com/melling/AppDB/blob/master/tbls/AppLocalization.tsv
I've created a MySQL table so I can check for duplicates and try to build some supporting scripts: https://github.com/melling/app-localization-dictionary/blob/master/mysql/tbl/app_localization_dictionary.tbl
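As an illustration of the duplicate check, here is a minimal Python sketch that works directly on TSV text rather than on the MySQL table. It assumes a header row and a column named `english` holding the source phrase; both are assumptions, as is the function name `find_duplicates`, and the real file may be laid out differently.

```python
import csv
import io
from collections import defaultdict


def find_duplicates(tsv_text, column="english"):
    """Map each phrase that occurs more than once to its row numbers.

    Phrases are compared after collapsing whitespace and ignoring case,
    so "Cancel" and "cancel " count as the same entry.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    seen = defaultdict(list)
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        phrase = " ".join((row.get(column) or "").split()).casefold()
        if phrase:
            seen[phrase].append(row_number)
    return {phrase: rows for phrase, rows in seen.items() if len(rows) > 1}


if __name__ == "__main__":
    sample = "english\tfrench\nCancel\tAnnuler\nOK\tOK\ncancel \tAnnuler\n"
    for phrase, rows in find_duplicates(sample).items():
        print(f"{phrase!r} appears on rows {rows}")
```

Reporting row numbers rather than deleting rows keeps the decision about which translation to keep with a human reviewer.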
Here are the initial languages that I put in the spreadsheet:
arabic
bengali
chinese
chinese_traditional
czech
danish
dutch
farsi
finnish
french
german
greek
hebrew
hindi
hungarian
indonesian
italian
japanese
javanese /* Not japanese */
korean
norwegian
polish
portuguese
portuguese_br
russian
spanish
swedish
thai
turkish
vietnamese