Concept
Pocyr is an application for transliterating text into other alphabets. It can be used for entertainment, scientific purposes and savings. Savings? That’s right, because our research showed that using the optimal alphabet can significantly reduce the average number of letters in words.
At the moment, the application supports the Polish language and is available for Android devices: link. New pairs of languages and alphabets may be added in the future.
Justification
It is no secret that the writing systems of the world’s natural languages are not optimal. The simplest proof is that a compressed archive of a text file is smaller than the text file itself. But language encoding has an important constraint that ordinary compression algorithms do not have: human readability. You can’t simply replace sound patterns with new symbols at will, otherwise the number of symbols will grow and the text will become unreadable.
The simplest and most logical option is to have one symbol for each sound, at least for the frequently used ones. Yet most languages fall short of even this obvious idea, to varying degrees. Polish is one of the worst cases: a single sound can sometimes be written with as many as three symbols - czsz. This is not surprising, because the Latin alphabet is not adapted to the sounds of the Slavic languages. Czech, for example, adds extra letters such as š and č. We offer an alternative: writing Polish in the Cyrillic alphabet.
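To make the "one symbol per sound" idea concrete, here is a minimal sketch of how multi-letter Polish groups could be collapsed into single Cyrillic letters. The mapping table and the `transliterate` function are hypothetical and deliberately incomplete; they are not Pocyr's actual rules, which are not described here. The sketch assumes a greedy longest-match strategy and lowercases its input for simplicity.

```python
# Hypothetical, incomplete mapping for illustration only.
# It is NOT the table used by Pocyr.
MAPPING = {
    "szcz": "щ",
    "sz": "ш",
    "cz": "ч",
    "ch": "х",
    "a": "а",
    "e": "е",
    "k": "к",
    "o": "о",
    "r": "р",
    "s": "с",
    "t": "т",
    "u": "у",
}

# Length of the longest Latin group in the table (4 for "szcz").
MAX_KEY = max(len(key) for key in MAPPING)


def transliterate(text: str) -> str:
    """Greedy longest-match transliteration of lowercased text."""
    lowered = text.lower()
    out = []
    i = 0
    while i < len(lowered):
        # Try the longest group first so "szcz" wins over "sz" + "cz".
        for size in range(MAX_KEY, 0, -1):
            chunk = lowered[i:i + size]
            if chunk in MAPPING:
                out.append(MAPPING[chunk])
                i += size
                break
        else:
            # Unknown character (space, digit, unmapped letter): keep it.
            out.append(lowered[i])
            i += 1
    return "".join(out)


word = "szczur"
converted = transliterate(word)
print(word, len(word), "->", converted, len(converted))
```

Trying the longest group first matters: without it, a four-letter group would be split into two shorter ones and the result would be longer than necessary.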
Research
We used various resources for research, but for the demonstration we chose one of the most famous Polish books - The Witcher.
The book contains 609,763 characters in total, of which 497,388 are in words; the rest are punctuation marks, spaces, and numbers. After transliteration into Cyrillic, the total decreased to 576,524 characters, of which 464,149 are in words. The average word length decreased from 5.426 to 5.130 characters. The most representative indicator for us is the reduction in the total number of characters, which in this case is about 5.5%.
The exact number of printed books is unknown, but a lower bound can be taken from the sales of the game of the same name in Poland: more than a million copies. If these books were printed in transliterated form, the same million copies would fit on the amount of paper needed for 945 thousand non-transliterated books. The 55,000 saved books correspond to 12,925,000 printed pages, or about 6,462,000 sheets of paper.
After rounding these approximate values, we get about 80 saved trees by the most conservative estimate. And this is only one book by one author. A complete switch to transliterated printing could save thousands of trees every year.
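The arithmetic behind these figures can be retraced with a short script. This is only a sketch under stated assumptions: the 235 pages per book is the value implied by 12,925,000 pages over 55,000 books, two printed pages per sheet is assumed, and the yield of roughly 80,500 sheets per tree is an assumed figure chosen because it is consistent with the result of 80 trees. All variable names are illustrative.

```python
# Character counts reported for the book.
total_before = 609_763
total_after = 576_524

saved_chars = total_before - total_after
reduction = saved_chars / total_before  # fraction of characters saved

# Lower-bound print run, taken from the game's sales in Poland.
copies = 1_000_000
saved_books = round(copies * round(reduction, 3))  # uses the rounded 5.5%

# Assumptions, not stated explicitly in the text:
pages_per_book = 235      # implied by 12,925,000 / 55,000
pages_per_sheet = 2       # one sheet is printed on both sides
sheets_per_tree = 80_500  # assumed yield of a single tree

saved_pages = saved_books * pages_per_book
saved_sheets = saved_pages / pages_per_sheet
saved_trees = saved_sheets / sheets_per_tree

print(f"characters saved: {saved_chars} ({reduction:.1%})")
print(f"books saved:      {saved_books}")
print(f"pages saved:      {saved_pages}")
print(f"sheets saved:     {saved_sheets:.0f}")
print(f"trees saved:      {saved_trees:.1f}")
```

Changing any of the three assumed constants scales the tree estimate proportionally, so the final number should be read as an order-of-magnitude figure rather than a measurement.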
Sources
https://en.wikipedia.org/wiki/The_Witcher
https://spidersweb.pl/2016/01/cdp-pl-sprzedaz-wiedzmin-polska.html
https://science.howstuffworks.com/environmental/green-science/question16.html