• I could split every two characters which would be a bit more efficient, but it's just added complexity and it sounds like in some cases it might mess things up?

    Hi, please DO NOT add spaces or split anything if you are unsure!
    The following is an example (though perhaps only those who understand Chinese can follow it):
    https://www.ptt.cc/bbs/Learn_Buddha/M.1315569690.A.648.html

    This is a joke from a famous Chinese movie.

    original, NO spaces:
    本人林大福將大樹街石屋租于恩人黃老十一家
    未能報恩萬一不交租亦可收回黃公年租銀兩三
    十萬不能轉租別人立此爲據本人兒孫不得有違

    The good guy reads this as:
    本人林大福_將大樹街石屋租于恩人黃老十一家_未能報恩萬一不交租亦可
    收回黃公年租銀兩三十_萬不能轉租別人_立此爲據本人兒孫不得有違

    The bad guy reads this as:
    本人林大福_將大樹街石屋租于恩人_黃老十一家未能報恩萬一不交租
    亦可收回黃公年租銀兩三十萬_不能轉租別人_立此爲據本人兒孫不得有違

    which has the completely opposite meaning!

    thank you.
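To make the ambiguity above concrete, here is a minimal sketch using a short fragment of the quoted contract. The helper names (`break_positions`, `split_every`) are hypothetical and only for illustration. It shows that the two readings contain exactly the same characters and differ only in where the break falls, which a mechanical splitter has no way to decide.

```python
# Fragment of the contract quoted above, written with no spaces or punctuation.
text = "收回黃公年租銀兩三十萬不能轉租別人"

# The two readings from the post, with "_" marking where each reader pauses.
good_reading = "收回黃公年租銀兩三十_萬不能轉租別人"
bad_reading = "收回黃公年租銀兩三十萬_不能轉租別人"


def break_positions(marked, marker="_"):
    """Return the character offsets (in the unmarked text) where breaks fall."""
    positions = []
    offset = 0
    for ch in marked:
        if ch == marker:
            positions.append(offset)
        else:
            offset += 1
    return positions


def split_every(s, n):
    """Naive fixed-width split: cut the string into chunks of n characters."""
    return [s[i:i + n] for i in range(0, len(s), n)]


# Both readings are the same characters as the original fragment.
same_characters = (
    good_reading.replace("_", "") == bad_reading.replace("_", "") == text
)

good_breaks = break_positions(good_reading)  # break after "三十"
bad_breaks = break_positions(bad_reading)    # break after "三十萬"

# A fixed two-character split knows nothing about either reading.
chunks = split_every(text, 2)

print(same_characters, good_breaks, bad_breaks)
print(chunks)
```

In this fragment the fixed two-character split produces the chunk "萬不", gluing together two characters that both readings agree belong to different phrases, so any break inserted automatically can change how the text is read.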

  • I could split every two characters which would be a bit more efficient, but it's just added complexity and it sounds like in some cases it might mess things up?

    what do you guys mean by every two characters?

    A Chinese word = a Chinese character. In the past, it took up twice as many bytes as an ASCII character.
    e.g.

    In the number of bytes needed to store "AB",
    you can store only one Chinese character, "中".

    thanks
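The byte counts above can be checked with a short sketch. It assumes that "in the past" refers to legacy double-byte encodings such as Big5 or GBK (the post does not name one), and it adds UTF-8 for comparison, where the same character takes three bytes. The function name `byte_lengths` is hypothetical.

```python
# Encodings to compare. Big5 and GBK are assumed here as examples of the
# older double-byte encodings; UTF-8 is included for comparison.
ENCODINGS = ["big5", "gbk", "utf-8"]


def byte_lengths(text):
    """Return the encoded length in bytes of text under each encoding."""
    return {enc: len(text.encode(enc)) for enc in ENCODINGS}


for sample in ["AB", "中"]:
    print(sample, byte_lengths(sample))
```

Under Big5 and GBK, "AB" and "中" both occupy two bytes, which matches the description in the post. So "two characters" only equals "one Chinese character" when counting bytes in such an encoding, not when counting characters.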
