[Solved] Messages were sometimes cropped and rendered unusable for Chinese fonts.

Posted on Sat 28th, October 2023

• #26

ccchan in reply to @houshou_m

The Chinese message sent in full is supposed to be: 孟子見梁惠王，王立於沼上，顧鴻鴈麋鹿，曰：「賢者亦樂此乎？」

Hi, do you need to pay to send this test message?
Also, how do you send it, and from what device? Thanks
• #27

ccchan in reply to @houshou_m
Furthermore, the messages were all blank.

Yeah, quite often my Bangle ends up like this and I don't know what to do.

I guess some messages containing an error cause it.

thanks

1 Attachment
• #28

Gordon

You could try looking at the Gadgetbridge log to see if any errors were reported by Bangle.js when the screen got displayed like that?

If it's reproducible, you could ensure that Keep Msgs is set in the Android app settings on Bangle.js. Then you can disconnect from Gadgetbridge, connect with the Web IDE, and see if anything gets reported to the console when you try to view the message.
• #29

ccchan in reply to @Gordon

Okay, will do that next time.
It happens randomly, so I have to wait for it to happen again.
thanks
• #30

houshou_m in reply to @Gordon

Gordon, I sent you a direct message with my watch's log and a description of the testing I did. Please let me know if there is anything else I can do.

@ccchan You can get your friends to send you quotes from Mencius if you just ask. I'm sure they'll do it for free. Mine did :)
• #31

ccchan

PS: as I installed both the default "message UI" and the "message list", today I found that some Chinese SMS messages were wrapped correctly in the "message list".

Yet the blank msgs problem still occurs with it.

So I'll try to record the logs for it/them later.
thanks
• #32

Gordon
Thanks for the update.

I believe I have now fixed this - there were a bunch of issues:
- Chinese Punctuation wasn't handled so words weren't being split on punctuation (Gadgetbridge)
- Words with Chinese chars couldn't be split at any point (Gadgetbridge)
- Very long words that produced an image over 255 pixels wide resulted in an invalid image, which is what caused the blank messages you saw (Gadgetbridge)
- When there's an image+image, Bangle.js wouldn't 'wrap' them to the screen (image+space+image worked)
So you need to update Gadgetbridge to nightlies, as well as the Bangle.js firmware, but hopefully it's sorted now!
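
The splitting rules in the list above can be illustrated with a minimal sketch. This is not the Gadgetbridge or Bangle.js code: `is_cjk`, `tokenize`, `wrap` and the width units are hypothetical names and values chosen for illustration. It assumes each Chinese character or full-width punctuation mark may start a new line, that runs of other characters stay together as one word, and that a Chinese glyph is twice as wide as a Latin one.

```python
def is_cjk(ch: str) -> bool:
    """True for CJK Unified Ideographs and common CJK/full-width punctuation."""
    cp = ord(ch)
    return (0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
            or 0x3000 <= cp <= 0x303F   # CJK Symbols and Punctuation
            or 0xFF00 <= cp <= 0xFFEF)  # Halfwidth and Fullwidth Forms


def tokenize(text: str) -> list:
    """Each CJK character or space is its own token; other runs stay together."""
    tokens, word = [], ""
    for ch in text:
        if is_cjk(ch) or ch.isspace():
            if word:
                tokens.append(word)
                word = ""
            tokens.append(ch)
        else:
            word += ch
    if word:
        tokens.append(word)
    return tokens


def wrap(text: str, max_width: int, cjk_width: int = 2, other_width: int = 1) -> list:
    """Greedy wrap: start a new line when the next token would not fit.

    Widths are abstract units (an assumption), not real pixel widths.
    A single token wider than max_width is not split in this sketch.
    """
    lines, line, used = [], "", 0
    for tok in tokenize(text):
        width = sum(cjk_width if is_cjk(c) else other_width for c in tok)
        if used + width > max_width and line:
            lines.append(line.rstrip())
            line, used = "", 0
            if tok.isspace():
                continue  # do not carry a space over to the next line
        line += tok
        used += width
    if line.strip():
        lines.append(line.rstrip())
    return lines


print(wrap("你的验证码是 488379，此验证码", 12))
```

A real implementation would measure glyph widths from the font and would also need to handle tokens wider than the screen, which this sketch leaves out.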
• #33

ccchan in reply to @Gordon
@houshou_m do you have experience using the GB nightly and ... updating the Bangle.js firmware? It will take me some time to figure it out, thanks.
Please see if "msg list" works too, if possible. Thanks.

Or in a few days' time I could try it myself.

===========extra, msg list also not working=======
However, today I got a long SMS in Chinese. It seems to wrap OK in "msg list" but not in "msg UI" (old version, as I don't know about the GB nightly and firmware update).

Anyway, I'll try to screenshot it and post it here first, for reference, with software version numbers.
Thanks

Summary: both are still not wrapping correctly.

I'd better try out Gordon's modifications later, thanks.

The 1st attachment is "msg UI", the 2nd one is "msg list".

2 Attachments
• #34

Ganblejs in reply to @ccchan

You'll find a link to Gadgetbridge 'nightly' builds on this page: https://www.espruino.com/Gadgetbridge#how-to-set-up

Here's the bangle.js firmware updater on the app loader (there will be one file that's the 'cutting edge' one):
https://banglejs.com/apps/?q=fwupdate
• #35

ccchan in reply to @Gordon
Hi,
1. Same as before, nothing seems to have changed. I am already on the latest nightly (through the F-Droid nightly repo, commit 5abd46d7b) and 2.19v60 firmware (commit cfbc4040d).
  Phone config: font12, text as bitmap, everything else should be default
  Watch config: message UI only
  I'll try to upload the screenshot later.
2. I just found I can send myself a WhatsApp msg to test it. I am still using the msg above.
  Original msg: 【知乎】你的验证码是 488379，此验证码用于登录知乎或重置密码。10 分钟内有效。
3. What is the text message that you used? Maybe I can send it to myself to test? Thanks
PS: previously I was using SMS; currently I am using WhatsApp to test the text.

1 Attachment
• #36

ccchan in reply to @Ganblejs

ok, thanks for the instructions.
• #37

ccchan in reply to @Gordon

When I compare the screenshots,
the original msg's Chinese comma and Chinese full stop are in the middle of the square.
In previous versions, the msg's Chinese commas were still in the middle of the square,
but the latest version modified it into a comma at bottom....
is this the problem? thx

you can view the screenshots in the thread, thanks
• #38

Gordon

but the latest version modified it into a comma at bottom....

Ok, so I think you're on the latest Gadgetbridge, that's good, and should mean that you no longer have messages that won't show on the Bangle. Now you need to update your Bangle.js firmware to a cutting edge build.

As an aside, I've been looking into storing the full Chinese fonts on the Bangle using the PBF font format that's been supported for a few versions now. However it seems the font format itself, while designed for Unicode, wasn't designed for Chinese unicode usage. It can only store around 10,000 characters maximum inside it before overflowing the indices in the file format.

https://stackoverflow.com/questions/1366068/whats-the-complete-range-for-chinese-characters-in-unicode would seem to imply that there are 90,000 characters in full, with around 20,000 that we have to support (even though that also supports Japanese and Korean in the same set of chars).

Do you believe that really is the case? If it's possible to get down under 10,000 chars then I can use the current PBF file format with current firmware, otherwise I'll have to make some custom changes to the file format and update the Bangle.js firmware to get it working
• #39

ccchan in reply to @Gordon

isn't i am already on the "edge" version of 2v19?
I mean I am on (commit cfbc4040d) <-- this appears in the "About" app.
• #40

ccchan in reply to @Gordon

Hi,
I remember the Chinese characters in daily use number just 6,000-10,000.
I searched Google, and it returned similar figures.
I also asked GPT-4, and it gave a similar result, as below.

HOWEVER there are 1. Simplified Chinese characters, invented by china-mainland after the communists took power in 1949 (now used by the 1.4 billion china-mainland users), and 2. Traditional Chinese characters, used in china-taiwan, china-hongkong, china-macau (21 million + 7 million + 1 million users).

Yeah, so now I know why it says 20,000.

It would be an honor to have Chinese fonts stored on the Bangle, but will it use up a lot of resources? Also, although Chinese is 1 of 6 languages used in the United Nations, what if other languages users also want to have their fonts stored on the watch?

thanks

=========================

User
how many chinese characters are commonly used daily? i am in hongkong

ChatGPT
In the context of Chinese language use, especially in a place like Hong Kong, the number of commonly used Chinese characters can vary significantly based on context, literacy level, and the specific requirements of communication.

In general, a literate person in a Chinese-speaking environment is usually familiar with at least 3,000 to 4,000 characters, which are enough for everyday reading and writing. However, for more advanced literacy, such as reading newspapers, literature, or academic texts, knowledge of 6,000 to 8,000 characters is often necessary.

The education system in Hong Kong, like in many Chinese-speaking regions, emphasizes the learning of a large number of characters from an early age. However, the number of characters used in daily life can be much lower, as many characters are used infrequently.

For daily practical purposes, such as texting, social media, casual reading, and signage, a smaller subset of these characters is often sufficient. It's also worth noting that Hong Kong uses traditional Chinese characters, which are more complex than the simplified characters used in Mainland China.

To sum up, while a literate person in Hong Kong might know several thousand Chinese characters, the number actively used in daily life is likely lower, focusing on a core set of commonly used characters.
• #41

ccchan

https://eriksen.com/language/simplified-vs-traditional-chinese/

It's quite complex....
I'd suggest simply "break the line at the punctuation" AND "break the line when it reaches the end of the horizontal display line". Thx
• #42

Gordon

isn't i am already on the "edge" version of 2v19?

No - that's just the last release. Go to the firmware update app and use 'cutting edge' - it should be something like 2v19.60 or later

how many chinese characters

Well it's heartening to see that maybe 8000 chars would do it - now can you find out what their character codes are? :)

will it use up a lot of resources?

Adding all 20,000 chars uses about 850kB of flash. It takes a while to upload, but the Bangle has 8MB and usually uses maybe only 2% of that - so there is plenty to go around.

what if other languages users also ...

Well, most of the UN could be handled by under 1000 chars which can easily be part of the main font. It's nothing compared to 20,000.

But my plan was to have separate font apps for different Locales, but it would seem that splitting the characters that way isn't actually very straightforward.

If someone had a list of characters for each locale it'd make it really easy.
• #43

ccchan in reply to @Gordon
Hi,
1. I just re-installed the firmware. The commit is still the same as above, but this time I am pretty sure I clicked 2v19.62 something.
2. Negative, still not wrapped at the punctuation. BTW, I found that, even in the message UI, the same SMS displays differently when it pops up and when it is "clicked into". By chance I can read the code in one of the two views, but since the other parts are off the screen, this is still not good enough.
3. I just asked GPT-4; please fact-check if possible. BTW, GPT-4 is "sold out" now. Too many people want to buy their plans.
GPT-4 result:
You:

in unicode, there are just about 10000 daily use traditional chinese characters and some 10000 daily used simplified chinese characters. do you know their codes? the code ranges?

ChatGPT:

In Unicode, Traditional and Simplified Chinese characters are not separated into distinct blocks based solely on their writing system. Instead, they are mostly included together in several blocks, primarily in the "CJK Unified Ideographs" block. The range of these characters in Unicode is quite extensive, encompassing multiple blocks. Here's an overview:

CJK Unified Ideographs: This is the largest block of Chinese characters. It includes both Traditional and Simplified forms. The range is U+4E00 to U+9FFF. This block contains the majority of characters used in daily Chinese text.

CJK Unified Ideographs Extension A: This block extends the basic set and includes more obscure and less frequently used characters. The range is U+3400 to U+4DBF.

CJK Unified Ideographs Extension B to Extension G: These blocks contain even more rarely used, historic, or dialect-specific characters. The ranges are:

Extension B: U+20000 to U+2A6DF
Extension C: U+2A700 to U+2B73F
Extension D: U+2B740 to U+2B81F
Extension E: U+2B820 to U+2CEAF
Extension F: U+2CEB0 to U+2EBEF
Extension G: U+30000 to U+3134F

CJK Compatibility Ideographs: This block includes characters that are variants of those in the main CJK block and are used for compatibility purposes with other character sets. The range is U+F900 to U+FAFF.

CJK Compatibility Ideographs Supplement: This is an extension of the compatibility ideographs. The range is U+2F800 to U+2FA1F.

These ranges cover both Traditional and Simplified characters, and the distinction between the two is often based on context and regional usage rather than Unicode encoding. Unicode primarily focuses on the unification of characters that have the same etymological root, regardless of regional script variations.

You:

for the above ranges, can you also tell how many characters are within them? thanks

ChatGPT:

Here is the number of characters in each Unicode block relevant to Chinese characters:

CJK Unified Ideographs: 20,992 characters
CJK Unified Ideographs Extension A: 6,592 characters
CJK Unified Ideographs Extension B: 42,720 characters
CJK Unified Ideographs Extension C: 4,160 characters
CJK Unified Ideographs Extension D: 224 characters
CJK Unified Ideographs Extension E: 5,776 characters
CJK Unified Ideographs Extension F: 7,488 characters
CJK Unified Ideographs Extension G: 4,944 characters
CJK Compatibility Ideographs: 512 characters
CJK Compatibility Ideographs Supplement: 544 characters

These counts reflect the number of individual characters encoded in each respective block.
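
As a quick check on the figures quoted above, each count equals `end - start + 1` for the quoted range, so these are block sizes in code points. The number of characters actually assigned in a block can be smaller. The sketch below recomputes the sizes from the ranges in the quoted answer; `BLOCKS`, `block_size` and `total_code_points` are names chosen for illustration.

```python
# Inclusive code point ranges, copied from the quoted answer above.
BLOCKS = {
    "CJK Unified Ideographs": (0x4E00, 0x9FFF),
    "CJK Unified Ideographs Extension A": (0x3400, 0x4DBF),
    "CJK Unified Ideographs Extension B": (0x20000, 0x2A6DF),
    "CJK Unified Ideographs Extension C": (0x2A700, 0x2B73F),
    "CJK Unified Ideographs Extension D": (0x2B740, 0x2B81F),
    "CJK Unified Ideographs Extension E": (0x2B820, 0x2CEAF),
    "CJK Unified Ideographs Extension F": (0x2CEB0, 0x2EBEF),
    "CJK Unified Ideographs Extension G": (0x30000, 0x3134F),
    "CJK Compatibility Ideographs": (0xF900, 0xFAFF),
    "CJK Compatibility Ideographs Supplement": (0x2F800, 0x2FA1F),
}


def block_size(start: int, end: int) -> int:
    """Number of code points in an inclusive range."""
    return end - start + 1


def total_code_points(blocks: dict) -> int:
    """Sum of the sizes of all listed ranges."""
    return sum(block_size(start, end) for start, end in blocks.values())


for name, (start, end) in BLOCKS.items():
    print(f"{name}: U+{start:04X}..U+{end:04X} = {block_size(start, end):,} code points")
print(f"Total: {total_code_points(BLOCKS):,}")
```

Summing the ranges gives 93,952 code points, which is in line with the figure of roughly 90,000 mentioned earlier in the thread.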
• #44

ccchan

let me try upload the 2 screenshot for the SAME msg later. thanks
• #45

ccchan

PS: so GPT-4 says Traditional and Simplified Chinese share a common set of characters, and each also has its own unique characters. All three parts are within the first block of about 20,000 characters.

gpt4 record:

You:

i dont understand, for the english "old", tell me the code for the chinese word 舊and 旧 then.

ChatGPT:

The Chinese characters "舊" (Traditional) and "旧" (Simplified), both meaning "old" in English, have different Unicode code points. Here are their respective codes:

舊 (Traditional Chinese): The Unicode code for this character is U+820A.

旧 (Simplified Chinese): The Unicode code for this character is U+65E7.

These codes are hexadecimal representations of their respective positions in the Unicode character set. You can use these codes in various digital systems to represent these specific characters.

You:

so, the 1st block of 20000 you mentioned will include both the traditional and simplified daily use chinese characters?

ChatGPT:

Yes, the first block, "CJK Unified Ideographs" (U+4E00 to U+9FFF), includes both Traditional and Simplified Chinese characters commonly used in daily life. This block is designed to encompass a wide range of frequently used Chinese characters, irrespective of whether they are in Traditional or Simplified form.

The Unicode system aims to unify characters with the same etymological roots, so many characters that have both Traditional and Simplified forms are encoded only once in this block. The distinction between Traditional and Simplified in Unicode is often a matter of font rendering and context rather than separate encoding.

For specific characters that have different forms in Traditional and Simplified Chinese, like "舊" (U+820A) and "旧" (U+65E7), they are encoded separately, as they have distinct graphical representations. Such characters may appear in the same block (as is the case with these examples) or in different blocks, depending on their Unicode allocation.

You:

will some trad. or simpl. chinese have the same coding then?

ChatGPT:

Yes, many Traditional and Simplified Chinese characters share the same Unicode code point. This is because Unicode is designed to unify characters that are essentially the same, regardless of regional script variations. The distinction between Traditional and Simplified Chinese is often a matter of how the character is rendered (i.e., displayed or printed), rather than how it is encoded in Unicode.

For example, a character that has the same form in both Traditional and Simplified Chinese scripts will have only one Unicode code point. It's the font and language settings of the user's system that determine how the character is displayed.

In cases where the Traditional and Simplified forms of a character are graphically different, they are assigned separate Unicode code points, as seen with "舊" (U+820A) and "旧" (U+65E7).

Therefore, while there are distinct Traditional and Simplified forms for some characters, many others are encoded just once in Unicode, covering both scripts. This approach reduces redundancy and simplifies text processing across different languages that use Chinese characters.

You:

show me one that is same in trad. and simpl chinese.

ChatGPT:

A common example of a character that is the same in both Traditional and Simplified Chinese is the character for "water," which is written as 水 in both scripts. In Unicode, this character has the code point U+6C34.

This character exemplifies many instances where the Traditional and Simplified forms of a Chinese character are identical, and therefore, they share the same Unicode encoding. The character 水 (U+6C34) is used in a wide range of contexts and is a fundamental character in both Traditional and Simplified Chinese writing systems.
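
The code points quoted in this exchange are easy to check locally. The sketch below uses Python's built-in `ord`; `code_point` and `in_main_cjk_block` are helper names made up for illustration.

```python
def code_point(ch: str) -> str:
    """Format a single character's Unicode code point as U+XXXX."""
    return f"U+{ord(ch):04X}"


def in_main_cjk_block(ch: str) -> bool:
    """True if the character is in CJK Unified Ideographs (U+4E00 to U+9FFF)."""
    return 0x4E00 <= ord(ch) <= 0x9FFF


# Traditional "old", Simplified "old", and "water" (same in both scripts).
for ch in "舊旧水":
    print(ch, code_point(ch), in_main_cjk_block(ch))
```

All three characters fall inside the main block, which matches the point made above that the common Traditional and Simplified forms share U+4E00 to U+9FFF.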
• #46

Gordon

So I guess that means 'no', you can't find a list of common characters?

If we were to go for it, perhaps the only other option is to get a big corpus of representative Chinese text, and to check what character codes are in it.
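
A minimal sketch of that corpus idea, assuming the text is already loaded as a string: count how often each character from the main CJK block occurs, then keep the most frequent code points. `cjk_counts` and `most_common_codes` are hypothetical names, and the sample string stands in for a real corpus.

```python
from collections import Counter


def cjk_counts(text: str) -> Counter:
    """Count characters in the CJK Unified Ideographs block (U+4E00 to U+9FFF)."""
    return Counter(ch for ch in text if 0x4E00 <= ord(ch) <= 0x9FFF)


def most_common_codes(text: str, limit: int) -> list:
    """Code points of the `limit` most frequent CJK characters, most frequent first."""
    return [f"U+{ord(ch):04X}" for ch, _ in cjk_counts(text).most_common(limit)]


# Tiny stand-in for a representative corpus (an assumption for illustration).
sample = "水水水旧旧舊abc，"
print(len(cjk_counts(sample)), "distinct CJK characters")
print(most_common_codes(sample, 2))
```

With a large enough corpus, the number of distinct characters found would show whether a set of under 10,000 glyphs is realistic.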

... but I did fix the PBF file format now, so as long as you're on the latest cutting edge firmware it is possible to load a font file with all 20,000 glyphs onto the Bangle.
• #47

ccchan in reply to @Gordon
1. will try the latest cutting edge firmware and report then.
2. but that doesn't have anything to do with the wrapping, correct?
Incorrect wrapping is the major problem, as sometimes the important info is off the screen. Thanks
• #48

ccchan in reply to @Gordon
Hi,
1. I think I am on the latest GB.js and the cutting edge firmware.
2. Now I use a fake WhatsApp account to send the msg to my own WhatsApp, to test.
  I only use message UI now (but sometimes message list performs better).
  (Let's focus on msg UI first, as msg list is by another author?)
When the msg shows up by itself it appears as "it-show-up";
if i manually click into the msg app, and click view msg, it appear differently as "press-into".

original msg: 【知乎】你的验证码是 488379，此验证码用于登录知乎或重置密码。10 分钟内有效。

I would suggest just adding a new line after each punctuation mark.
【知乎】<--(new line after close bracket, or not is also ok)
你的验证码是 488379，
此验证码用于登录知乎或重置密码。
10 分钟内有效。
thanks
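
The suggestion above can be sketched in a few lines. This only illustrates the proposed rule and is not how Gadgetbridge or Bangle.js wrap text; `BREAK_AFTER` and `break_after_punctuation` are hypothetical names, and the set of punctuation marks is an assumption.

```python
# Full-width punctuation after which a line break is inserted (assumed set).
BREAK_AFTER = "，。！？；】"


def break_after_punctuation(text: str) -> list:
    """Split text into lines, ending a line after each listed punctuation mark."""
    lines, current = [], ""
    for ch in text:
        current += ch
        if ch in BREAK_AFTER:
            lines.append(current.strip())
            current = ""
    if current.strip():
        lines.append(current.strip())
    return lines


msg = "【知乎】你的验证码是 488379，此验证码用于登录知乎或重置密码。10 分钟内有效。"
for line in break_after_punctuation(msg):
    print(line)
```

On its own this rule does not guarantee that each line fits the screen, so it would still need to be combined with width-based wrapping. It also does not handle a closing quote that directly follows a punctuation mark.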

2 Attachments
• #49

Gordon
will try the latest cutting edge firmware and report then.
but that doesn't have anything to do with the wrapping, correct?

It has a lot to do with the wrapping - what version does it say on the about page of the Bangle?

if i manually click into the msg app, and click view msg, it appear differently as "press-into".

Yes, the font size is different? Otherwise it appears to wrap at the same points?

But I'm not sure what's going on with your firmware because I just tried the exact text you pasted on a fresh install of latest Gadgetbridge and Bangle.js firmware and I see this - which seems a lot better?

2 Attachments
• #50

Ganblejs

Looking at the commits on Gadgetbridge master branch you seem to have a recent enough Gadgetbridge nightly (https://codeberg.org/Freeyourgadget/Gadgetbridge/commits/branch/master).

But judging by that Bangle.js screenshot of the about app the firmware is not updated to cutting edge version there. So update that via the app loader on https://banglejs.com/apps/?q=fwupdate (I guess you already know this step. But something seems to have made it so it didn't update for you).
