[Solved] Message was sometimes cropped and rendered unusable with Chinese fonts

Posted on Sat 28th, October 2023

• #1

ccchan
Hi,
I am using Chinese.
1. It seems that for gadgetbridge.js I have to use font size 12, not 14 or 16 (the default size!), otherwise the message is empty.
Assume I can tolerate font size 12.

But when the message is too long, only the middle part is shown,
which makes it quite useless.

Would it be possible to provide an option to wrap the message?

Attached below are a short Google message that works,
the longer message that is not usable, the message text itself, and the contents of the log file.
I believe I have removed my personal info from them.
Thanks

PS: the log file is at the bottom, as the earlier one (the storagefile one) had a problem, thanks

In the message app, the setting is "font min: small"

4 Attachments
• #2

ccchan

In the 1st screenshot, the "9" in the long line is the 6th digit of the code, so the message is quite useless... thanks

Original message: 【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。

Currently displayed as:
【知乎】你的验证码是
" 69844" missing
9，此验证码用于登录知乎或重置密
"码。10 " missing
分钟内有效。

PS: in gadgetbridge.js I have to tick "use bitmap if font not available",
and only size 12 works; 14 and 16 do not. Sometimes 18 works.

Setup info:
Using only the default Bangle message UI app.
• #3

ccchan
Even opening the message doesn't help.

I tried all of the message apps in the store,
and I prefer the default one, as most of the time it still works if the message is short.
Thanks

(This messagesdebug.log is a working one, for the message in #1)

2 Attachments
- messagesdebug.log
• #4

Gordon
Thanks for those logs, that's great - I'm not entirely sure what's up with the larger font sizes not working, but the issue with wrapping is that when deciding what to convert to an image, Gadgetbridge uses the following logic:
- Split words up by space/newline/etc
- If any word contains a character Bangle.js itself doesn't have a font for, convert the whole word to an image.
The issue here is that Gadgetbridge thinks it is just one giant word. I'm afraid I don't know enough about Chinese, but is there some 'separator' character that we should be looking out for and just aren't?

Or maybe Gadgetbridge will just have to split any bitmap of a 'word' that's greater than 100px wide into multiple bitmaps...
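A minimal sketch of the word-splitting logic described above, written in Python for brevity (Gadgetbridge itself is Java, and its real code also splits on other characters). The `has_font` check is a hypothetical stand-in that treats printable ASCII as "has a font"; the actual set depends on the Bangle.js firmware.

```python
import re

def has_font(ch):
    # Hypothetical stand-in: assume only printable ASCII has a built-in font.
    return " " <= ch <= "~"

def classify_words(message):
    """Split on whitespace, then mark each word as 'text' or 'image'."""
    result = []
    for word in re.split(r"\s+", message.strip()):
        if not word:
            continue
        kind = "text" if all(has_font(c) for c in word) else "image"
        result.append((kind, word))
    return result

msg = "【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。"
words = classify_words(msg)
for kind, word in words:
    print(kind, len(word), word)
```

Under these assumptions the message yields only three "words", and the middle one is a single 25-character image, which is consistent with only the middle of that line being visible on the watch.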
• #5

ccchan in reply to @Gordon
Hi,
1. Let me try different font sizes, and if it is really bad I'll provide you with the log file?
2. Regarding the font 12 setting, how about if GB.js split the text at punctuation?
  Maybe starting with:
  a comma (Chinese fonts usually draw punctuation in the middle of the cell) "，",
  a full stop (the Chinese full stop is an empty circle) "。"
  Original message: 【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。
Because in this message, if GB.js understood those 2 symbols,
it could show the message as:
【知乎】你的验证码是 698449，
此验证码用于登录知乎或重置密码。
10 分钟内有效。

I hope this can happen, maybe only for Chinese messages at first, i.e. leaving other languages alone?

Thanks

3.
Or maybe, if an SMS is in Chinese, GB.js could simply fill a line with it, and when the line is full, start a new line?

Because in Chinese, each character is indeed already a word in itself.

Take the line above, 此验证码用于登录知乎或重置密码.
I worry it is still too long to fit on 1 line.
However, you can cut it anywhere as long as the sequence stays correct.
I mean, as long as you ONLY cut it and don't ADD spaces/punctuation, it's OK:
此验证码
用于
登录知乎
或重置密码.

is the same as
此
验证码
用于
登录
知乎
或
重置密码.

Even if you cut it like this, everyone still understands it:
此
验
证
码
用
于
登
录
知
乎
或
重
置
密
码.

So you can cut however you like, just keep the sequence and don't add spaces/punctuation. Thanks
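A small sketch of suggestion 2 above: start a new line after each "，" or "。" without adding or removing any character. This is Python for illustration only, not Gadgetbridge code, and the function name is made up.

```python
def break_after_punctuation(message, marks="，。"):
    """Cut the message after each mark, keeping the mark on its line."""
    lines, current = [], ""
    for ch in message:
        current += ch
        if ch in marks:
            lines.append(current)
            current = ""
    if current:
        lines.append(current)
    return lines

msg = "【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。"
lines = break_after_punctuation(msg)
for line in lines:
    print(line)
```

Joining the lines back together gives the original message, so the sequence is preserved and nothing is inserted.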
• #6

ccchan

this is overkill but just for reference:
https://www.omniglot.com/chinese/structure.htm

So as long as the sequence is correct and no spaces/punctuation are added,
you can arrange it horizontally or vertically, and the number of rows or columns doesn't matter.
• #7

houshou_m

The typical Mandarin written word consists of either one or two graphs, with two being more common. For reference, the notification text can be broken down into the following "dictionary words":

你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。

The simplest solution might be to simply preserve groupings of two. Unlike the Latin script, Chinese characters are all meant to be written such that they take up the same amount of space (like monospace fonts do), so perhaps that wouldn't be too difficult a task?
• #8

ccchan

Hi, but without AI, for such a string of 20 Chinese characters/words, GB.js won't know where to break it up. I am sure it would need a dictionary just to GUESS the break points, and I am sure it won't perform as well as a human brain.

So, for this message from zhihu.com
【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。

I just want it to fill the screen width and, when a line is full, start a new line.
So if the screen allows about 15 characters, please simply output:

【知乎】你的验证码是 698
449，此验证码用于登录知
乎或重置密码。10 分钟内有
效。

ANYONE who knows Chinese can read it as long as the sequence is unchanged and nothing like spaces or punctuation is added.
At the extreme, as I mentioned before, even if you cut at every character/word without changing the sequence and without adding anything (except new lines), everyone could still understand it.
【
知
乎
】
你
的
验
证
码
是

6
9
8
4
4
9
，
此
验
证
码
用
于
登
录
知
乎
或
重
置
密
码
。
1
0

分
钟
内
有
效
。

Breaking it up into phrases of 2-3 characters introduces spaces, which could imply an opposite meaning, so I completely disagree with that.

thanks
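A rough sketch of "fill the line, then wrap" for text that mixes full-width Chinese characters with half-width digits and spaces. It assumes a full-width character takes two cells and a half-width one takes one, using Python's `unicodedata.east_asian_width`; the 30-cell limit (15 full-width characters) is only an illustrative guess, not the real width of the Bangle.js screen.

```python
import unicodedata

def char_cells(ch):
    # Wide (W) and full-width (F) characters are assumed to take two cells.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

def fill_lines(message, max_cells):
    """Fill each line up to max_cells, then start a new one. Nothing is added."""
    lines, current, used = [], "", 0
    for ch in message:
        width = char_cells(ch)
        if current and used + width > max_cells:
            lines.append(current)
            current, used = "", 0
        current += ch
        used += width
    if current:
        lines.append(current)
    return lines

msg = "【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。"
lines = fill_lines(msg, 30)
for line in lines:
    print(line)
```

The only change to the message is where the line breaks fall; concatenating the lines reproduces the original text.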
• #9

ccchan

Please don't add spaces or punctuation, it could ruin the message, thanks
• #10

Gordon
Thanks - well, that's promising. I'd propose:
- Allow split on "，" and "。" - we use comma already but maybe chinese uses a separate charcode for it: https://codeberg.org/Freeyourgadget/Gadgetbridge/src/branch/master/app/src/main/java/nodomain/freeyourgadget/gadgetbridge/service/devices/banglejs/BangleJSDeviceSupport.java#L1152
- If it's a Chinese character (char code range U+4E00..U+9FFF ? This includes Japanese too) then encode each character as its own image. This will use a bit more memory, but will have the effect of wrapping each character whenever there is a new line (otherwise they will still be placed inline)
I could split every two characters which would be a bit more efficient, but it's just added complexity and it sounds like in some cases it might mess things up?
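To illustrate the second proposal, here is a hedged Python sketch (not the Gadgetbridge implementation) that treats every character in U+4E00..U+9FFF as its own token, which would then be rendered as its own image, while leaving other runs of characters together. The function names are invented for this example.

```python
def is_cjk_ideograph(ch):
    # The range proposed above: CJK Unified Ideographs, U+4E00..U+9FFF.
    return 0x4E00 <= ord(ch) <= 0x9FFF

def tokenize(word):
    """Each ideograph becomes its own token; other runs stay together."""
    tokens, run = [], ""
    for ch in word:
        if is_cjk_ideograph(ch):
            if run:
                tokens.append(run)
                run = ""
            tokens.append(ch)
        else:
            run += ch
    if run:
        tokens.append(run)
    return tokens

tokens = tokenize("698449，此验证码用于")
print(tokens)  # ['698449，', '此', '验', '证', '码', '用', '于']
```

Note that this range covers the ideographs shared with Japanese (kanji), but hiragana and katakana (U+3040..U+30FF) and full-width punctuation such as "，" fall outside it, so they would need separate handling.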
• #11

ccchan in reply to @Gordon

I could split every two characters which would be a bit more efficient, but it's just added complexity and it sounds like in some cases it might mess things up?

Hi, please DO NOT add spaces or split anything if you are unsure!
The following is an example (though maybe only those who understand Chinese will get it)
https://www.ptt.cc/bbs/Learn_Buddha/M.1315569690.A.648.html

This is a joke from a famous Chinese movie.

Original, NO spaces:
本人林大福將大樹街石屋租于恩人黃老十一家
未能報恩萬一不交租亦可收回黃公年租銀兩三
十萬不能轉租別人立此爲據本人兒孫不得有違

The good guy reads this as:
本人林大福_將大樹街石屋租于恩人黃老十一家_未能報恩萬一不交租亦可
收回黃公年租銀兩三十_萬不能轉租別人_立此爲據本人兒孫不得有違

The bad guy reads this as:
本人林大福_將大樹街石屋租于恩人_黃老十一家未能報恩萬一不交租
亦可收回黃公年租銀兩三十萬_不能轉租別人_立此爲據本人兒孫不得有違

which has the completely opposite meaning!

thank you.
• #12

Gordon

hi, please DO NOT add spaces or split whatever if you are unsure!

That's a bit of a problem then. What do you suggest we do instead? Is there some simple rule, like 'if char code is within this range we can put a newline after if needed'
• #13

ccchan
I mean that adding a new line / wrapping to the screen is usually OK.

Could you please try your method(s) on the 2 messages above and let me see what happens?
1. An SMS from Zhihu:
  【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。
2. A continuous text from the movie:
  本人林大福將大樹街石屋租于恩人黃老十一家未能報恩萬一不交租亦可收回黃公年租銀兩三十萬不能轉租別人立此爲據本人兒孫不得有違
If message 1 now becomes:
【知乎】你的验证码是 698449，
此验证码用于登录知乎或重置密码。
10 分钟内有效。
then splitting at ， and 。 works.

If message 2 becomes:
"fill the whole line, then wrap to the next line", like
(suppose the Bangle can display only 15 Chinese characters per line)
本人林大福將大樹街石屋租于恩人
黃老十一家未能報恩萬一不交租亦
可收回黃公年租銀兩三十萬不能轉
租別人立此爲據本人兒孫不得有違

(the following supposes the Bangle shows 18 Chinese characters on 1 line)
本人林大福將大樹街石屋租于恩人黃老十
一家未能報恩萬一不交租亦可收回黃公年
租銀兩三十萬不能轉租別人立此爲據本人
兒孫不得有違

thanks
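The two layouts for message 2 can be reproduced with a fixed-count wrap. This is a Python sketch under the assumption that every character has the same width, which holds for this all-Chinese text; the 15 and 18 per-line figures are the hypothetical limits from the post.

```python
def wrap_fixed(message, per_line):
    """Cut the message every per_line characters; nothing is added or removed."""
    return [message[i:i + per_line] for i in range(0, len(message), per_line)]

text = "本人林大福將大樹街石屋租于恩人黃老十一家未能報恩萬一不交租亦可收回黃公年租銀兩三十萬不能轉租別人立此爲據本人兒孫不得有違"

for per_line in (15, 18):
    print(f"--- {per_line} per line ---")
    for line in wrap_fixed(text, per_line):
        print(line)
```

With 15 per line the 60 characters fill four full lines; with 18 per line the last line holds the remaining 6 characters.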
• #14

houshou_m

Hello, Gordon,

As with all languages, it's certainly the case that if you unnaturally break things up, there's the chance of introducing ambiguity. However, it's also the case that context will resolve the matter most of the time. And in the worst case scenario, it's not as if the watch is the only means by which we can read the messages. We can think about the well known joke in English about the importance of commas for a comparable example: "Let's eat grandpa!"

Unfortunately, although I am familiar with the linguistic and orthographic side of this problem, I am not familiar with the coding side. But a method that could theoretically be used to handle run on English sentences such as "Thequickbrownfoxjumpsoverthelazydog" would be our best starting point.

One solution I can imagine would be to cut the line once it approaches the side margin of the screen at a point where the preceding characters form a block divisible by two and then starting a new line; the software could go on like this until it both reaches the bottom margin and cannot reasonably shrink the bitmaps down any further, at which point it must necessarily elide any remaining text.

So applying this proposal for how to cut up lines to article 1 of the Universal Declaration of Human Rights, we would get the following if we assume a character limit of 11 per line (characters bolded to show where words get split up):

人人生而自由﹐在尊嚴 (10 characters)
和權利上一律平等。他 (10 characters)
們賦有理性和良心﹐並 (10 characters)
應以兄弟關係的精神互 (10 characters)
相對待。 (4 characters)

As you can see from the above, only two words get split up (thanks in part to punctuation in Mandarin being full-width, hence taking up the same space as a "normal" character). These splits do not result in any ambiguity in this example either.
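The proposal above amounts to rounding the per-line limit down to an even number before wrapping. A minimal Python sketch, assuming every character (including full-width punctuation) takes one equal-width slot; the function name is invented for this example.

```python
def wrap_even(message, limit):
    """Wrap at the largest even count that fits within limit."""
    per_line = limit - (limit % 2)  # e.g. 11 -> 10, 13 -> 12
    if per_line < 2:
        raise ValueError("limit must be at least 2")
    return [message[i:i + per_line] for i in range(0, len(message), per_line)]

article = "人人生而自由﹐在尊嚴和權利上一律平等。他們賦有理性和良心﹐並應以兄弟關係的精神互相對待。"
lines = wrap_even(article, 11)
for line in lines:
    print(line, f"({len(line)} characters)")
```

With a limit of 11 this gives four lines of 10 characters and a final line of 4, matching the layout shown above.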
• #15

ccchan in reply to @Gordon

I could split every two characters which would be a bit more efficient, but it's just added complexity and it sounds like in some cases it might mess things up?

What do you guys mean by every two characters?

A Chinese word = a Chinese character. In the past, it took up double the bytes of an ASCII character.
E.g.

in the number of bytes needed to store "AB",
you can store only 1 Chinese character, "中".

Thanks
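The byte counts can be checked quickly in Python. This assumes "in the past" refers to legacy double-byte encodings such as GBK or Big5; in UTF-8, which is common today, the same character takes three bytes.

```python
ascii_pair = "AB".encode("ascii")   # two ASCII characters
zhong_gbk = "中".encode("gbk")       # legacy Simplified Chinese encoding
zhong_big5 = "中".encode("big5")     # legacy Traditional Chinese encoding
zhong_utf8 = "中".encode("utf-8")    # modern Unicode encoding

print(len(ascii_pair), len(zhong_gbk), len(zhong_big5), len(zhong_utf8))  # 2 2 2 3
```

Either way it is still a single character, so the "two characters" discussed in this thread are two Chinese characters, not two bytes.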
• #16

houshou_m in reply to @ccchan

While in Classical Chinese it was certainly the case that one character most often wrote one word, in modern Mandarin, it is more often the case that a word consists of two syllables, represented by two characters. This change can actually start to be seen as early as Xunzi, who uses a noticeably larger amount of disyllabic words than Mencius. Let's look again at the example sentence from the first article from the Universal Declaration of Human Rights, this time adding spaces between words*:

人人 生 而 自由﹐在 尊嚴 和 權利 上 一律 平等。他們 賦有 理性 和 良心﹐並 應 以 兄弟 關係 的 精神 互相 對待。

Sum of monosyllabic words (one graph, one word): 10
Sum of disyllabic words (two graphs, one word): 15

*Note that for the purposes of this conversation, a word should be understood as something one would encounter in daily life and also something that one would find in a dictionary of modern Mandarin.
• #17

ccchan in reply to @houshou_m

I am not sure whether you speak Chinese natively.
For me, my mother tongue is Cantonese Chinese.
For your sentence:
人人生而自由﹐在尊嚴和權利上一律平等。他們賦有理性和良心﹐並應以兄弟關係的精神互相對待。

you are correct that some are "two-character words", i.e. short phrases, like 人人, 自由 , 尊嚴, 權利, 一律, 平等。他們 , 理性 , 良心﹐, 兄弟 ,關係 , 精神, 互相, 對待。

But some are "single words", like 在, 和 , 和 ,並 , 的.

For a Bangle.js v2, with that 512KB RAM, there won't be a dictionary or an AI, so I am pretty sure there is NO WAY for the Bangle to understand Chinese.

You may think this looks better, with empty spaces added for easier reading:
人人生而自由﹐在尊嚴和權利上一律平等。他們賦有理性和良心﹐並應以兄弟關係的精神互相對待。

But I am 100.000000% sure that the line is FUNCTIONALLY 100.000000% the same without the spaces, and the Bangle watch doesn't have the luxury of adding them.

Just A. wrapping the message untouched onto new lines will be OK.
If possible, then B. start new lines at punctuation.

Please only do C. introduce empty spaces if you have an AI or a dictionary to check against.

PS: the author of the message will already have considered whether the message is understandable, so you really DON'T need to introduce empty spaces. This is simple logic.

人人生而自由﹐在尊嚴和權利上一律平等。他們賦有理性和良心﹐並應以兄弟關係的精神互相對待。
This version without empty spaces is in fact more formal than the one above with the introduced spaces.

As seen in #1, the 2 messages I received don't have empty spaces.
Those empty spaces may help some people, but formally they are not needed.
PS: when I reply in Chinese in YouTube comments, I do use those empty spaces, so that the keywords are separated out and look more "highlighted", to emphasise what I want to express.
• #18

ccchan

To put it simply:
As in my original message: 【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。
the author has already made sure that other people who know Chinese will reasonably understand the message.
So you simply wrap it at the width of the screen. It makes no sense to add empty spaces or modify it, unless you want to help the author emphasise something, which should not be the case.

It is the same as a string of digits, 8374018374018374018237401837083:
the author doesn't expect you to add empty spaces to it,
yet I guess they would expect the string to be wrapped at the width of the screen because of the physical limitation.

Thanks
• #19

Gordon

@user156881 thanks for the info! From what you say, it feels to me like wrapping (if needed!) every 2 chars is probably best.

Because the alternative to not wrapping every two chars is that the text goes off the screen, in which case it is definitely unreadable.

But for now, I have just added conversion of ， and 。 to chars that will wrap, so if you try a Gadgetbridge Nightly tomorrow that should really help and may fix most of your issues without having to add any mid-sentence wrapping
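One way such a conversion could look, as a hedged Python sketch. The post does not say which characters "，" and "。" are converted to, so the ASCII "," and "." used here are an assumption, and this is not the actual Gadgetbridge change.

```python
# Assumed mapping: full-width marks to ASCII look-alikes that an existing
# word splitter may already treat as break points.
WRAPPABLE = str.maketrans({"，": ",", "。": "."})

def to_wrappable(message):
    """Replace full-width comma and full stop one-for-one; length is unchanged."""
    return message.translate(WRAPPABLE)

msg = "【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。"
converted = to_wrappable(msg)
print(converted)
```

Because the replacement is one character for one character, no spaces are introduced and the character sequence is otherwise untouched.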
• #20

ccchan in reply to @Gordon

Hi,
I still don't really understand what you guys mean by "wrap by 2 chars".
Suppose this is my message:
【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。

Would you mind clarifying what you mean by that?

Because currently I can only think of
【知乎】
你的
验证
码是
698449，
此验
证码
用于
登录
知乎
或重
置密
码。
10 分钟
内有
效。
which does work, like the single vertical line below, but is less common to see.
Because if the wrap falls between the 2nd and 3rd block of a 3-block phrase, that will be more unnatural than a single line, as the brain is led to assume that everything comes in "2 blocks".

As I said before, in that long Chinese sentence some phrases consist of
1 "block" (one Chinese character), e.g. 你,的, 是, 此, 或, 内,
some of 2, e.g. 用于, 登录, 重置, 密码, 分钟, 有效
and some of 3, e.g. 验证码
【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。

And honestly, in practice a non-human could parse it into a different set of phrases, which is to be expected, as even English and other languages sometimes have ambiguities for historical reasons.

But if you mean 1 "block" = 2 chars, and wrap on that,
thus making it like a single vertical line, that will be OK:
【
知
乎
】
你
的
验
证
码
是

6
9
8
4
4
9
，
此
验
证
码
用
于
登
录
知
乎
或
重
置
密
码
。
1
0

分
钟
内
有
效
。

thanks
• #21

ccchan in reply to @houshou_m

你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效

So in this example,
你, 1
验证码, 3
登录, 2

I'd say that in the original line the author has already considered that others will understand it without spaces: 【知乎】你的验证码是 698449，此验证码用于登录知乎或重置密码。10 分钟内有效。

I am just afraid that if you rely on the Android phone or the Bangle to choose the length of the phrases, it may hurt, because the phrases in a dictionary sometimes overlap, and pre-GPT4 AI doesn't have the ability to handle that. Even GPT4 is not good at Chinese either (but the Chinese have made their own AI, which processes Chinese better).
• #22

houshou_m

@Gordon I'm happy to contribute to the development of this device. As for your suggested change, I think that's a fantastic start to tackling this matter. It really might be enough! I will test this out for a few days before reporting back to you. I will also see how it fares with Japanese.

Additionally, I want to clarify my suggestion in the space below, since it seems that I was not able to communicate what I meant well. I apologize for that. I was writing late at night and wasn't as careful about my words as I should have been.

I am assuming that there is a character limit to each line of text which the device observes with the default messaging app. If that is the case, then should the punctuation fix not be enough, it would be ideal to be able to wrap the lines in such a way as to preserve a number of characters which is divisible by two. So if there is a limit of, say, 13 characters per line, we wrap at the 12th. Would that be possible?

To give an example of what I mean using ccchan's original message, let's say hypothetically that the Bangle.js does have a 13 character limit with the default messaging app. The message should be split up like so:

【知乎】你的验证码是 6 (12)
98449，此验证码用于 (12)
登录知乎或重置密码。10 (13)
分钟内有效。 (6)

Each line of text is wrapped perfectly in this example. No Chinese is split up in a way that splits a two syllable word across two lines. I had the second to last line be 13 characters on the assumption that Gadgetbridge would be able to handle whitespace normally, even within a message consisting of both full- and half-width characters.

@ccchan I hope the above explanation clarifies what I mean. We are only concerned with the number of characters per line. There is no need for any kind of advanced programming to deal with splitting lines of Chinese text this way. Gadgetbridge only needs to be able to count characters and be aware of character limits. Again, I apologize for not being clear.
• #23

ccchan in reply to @houshou_m

I still can't understand what you guys mean by "wrapping (if needed!) every 2 chars is probably best", because your example looks like it is wrapping at even numbers, not at two.

Anyway, maybe you guys could go ahead. Just please try to leave me an option to choose the version, so I can still use a working one while it's being improved, thanks
• #24

Gordon

@houshou_m great! Yes, I think we're on the same page there. I'm not expecting to wrap every 2 characters, but merely to as you say: "wrap the lines in such a way as to preserve a number of characters which is divisible by two"

Also, I just wanted to check - are 【/】 basically just like [/]? If so it would help if I added that conversion in Gadgetbridge to render them better and allow wrapping after them
• #25

houshou_m
Yes, for all intents and purposes, those characters are equivalent. Chinese also has the optional 、 type of comma for when enumerating items (as in, "milk, butter, eggs"). This is also the standard comma used in Japanese.

I did some tests with the nightly build of Bangle.js Gadgetbridge (commit 1aadc04fd) today, and unfortunately nothing seems to have changed. Chinese text was still running off the screen, and no line wrapping was being done. Also, I don't know if I was just experiencing some exceptionally well-timed bad luck, but I was not always being notified on my watch when a message containing Chinese was sent to me --- this despite the fact that I never encounter this issue with messages using ASCII characters in the body. The messages would simply be completely absent from my watch, even when there was a lull of 10 seconds between the messages. I do not have Gadgetbridge set to limit messages if multiple messages come in too quickly, and I of course had the screen of my phone off while performing these tests, which I did using two different instant messaging platforms.

I also tested to see how Japanese would be handled. Although I received the Japanese messages on my watch, the device strangely would not buzz to alert me to them. Furthermore, the messages were all blank. Perhaps these are two related issues? These issues occurred with messages that had both English and Japanese, as well as only Japanese.

I took pictures to show the issues I have described and have attached them to this message. I hope this helps. The Chinese message sent in full is supposed to be: 孟子見梁惠王，王立於沼上，顧鴻鴈麋鹿，曰：「賢者亦樂此乎？」

2 Attachments
