Best strategy to append to binary files

  • Hello,

    We are building a fitness-tracker type of app. The idea is to store data at a high(ish) frequency (minute by minute or so), which does not fit the existing health database that you have implemented.

    To use space efficiently, we would like to save the data in a binary format, and we need to append data to the file. Basically, the files are written in append mode on each sensor reading and read every week or so for synchronization with a smartphone.

    Which option do you suggest using?

    1. Storage.write() and Storage.read(). The problem with these is that there is no append mode. Maybe pre-allocating the file size and then writing with an offset would work? If yes, is there an efficient way to read bytes instead of strings (ideally without preloading everything into RAM)?

    2. StorageFiles offer a handy append mode but, AFAIK, one can only write strings.

    3. The Flash library would give access to the underlying flash memory, but when I checked how much memory was available I got only 250kB, which makes me think it points at the 1024kB on-chip flash instead of the external 8MB flash. Is there no way to access that memory directly (and safely)?

    Any other ideas?

    Thanks.

  • StorageFiles are not well suited for writing binary data: \xFF is used to detect the end of a file and cannot appear inside it. Flash is dangerous to use, and you could easily overwrite your bootloader or destroy the file system created by the Storage module.

    I think you would be best off using Storage.write/read, allocating a whole day/week at once and using offsets to write data points. For reading, you can either use Storage.read, which returns memory-mapped strings and is light on RAM, or, if your data is actually binary encoded and not easily parsed as a string, use Storage.readArrayBuffer, whose result can be wrapped in one of the typed array types for easier parsing.
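
    A minimal sketch of that approach, under some assumptions: the record layout (a uint16 minute plus a uint16 value), the file name "log.day0" and the makeFakeStorage helper are all made up for illustration. The helper is only a rough in-memory stand-in so the code also runs off-device; on the watch you would pass require("Storage") instead. Note that the file is created by the first write at offset 0, which is where the total size is given.

```javascript
// Rough in-memory stand-in for Espruino's Storage module (illustration only).
// On the watch, use `var storage = require("Storage");` instead.
function makeFakeStorage() {
  var files = {};
  return {
    // Mirrors Storage.write(name, data, offset, size).
    write: function (name, data, offset, size) {
      if (!offset) {
        // A write at offset 0 (re)creates the file, filled with erased bytes.
        files[name] = new Uint8Array(size || data.length).fill(0xFF);
      }
      var file = files[name];
      if (!file) throw new Error("create the file with a write at offset 0 first");
      // Flash can only clear bits, so a write is modelled as a bitwise AND.
      for (var i = 0; i < data.length; i++) file[(offset || 0) + i] &= data[i];
      return true;
    },
    readArrayBuffer: function (name) { return files[name].buffer; }
  };
}

var RECORD_SIZE = 4;              // hypothetical layout: uint16 minute + uint16 value
var RECORDS_PER_DAY = 24 * 60;    // one record per minute
var FILE_SIZE = RECORD_SIZE * RECORDS_PER_DAY;

function writeRecord(storage, name, index, minute, value) {
  var rec = new Uint8Array(RECORD_SIZE);
  var view = new DataView(rec.buffer);
  view.setUint16(0, minute, true);  // little-endian
  view.setUint16(2, value, true);
  // Record 0 creates the pre-allocated file; later records only pass an offset.
  if (index === 0) return storage.write(name, rec, 0, FILE_SIZE);
  return storage.write(name, rec, index * RECORD_SIZE);
}

function readRecord(storage, name, index) {
  // A DataView over the ArrayBuffer reads the record in place.
  var view = new DataView(storage.readArrayBuffer(name), index * RECORD_SIZE, RECORD_SIZE);
  return { minute: view.getUint16(0, true), value: view.getUint16(2, true) };
}

var storage = makeFakeStorage();
writeRecord(storage, "log.day0", 0, 0, 72);
writeRecord(storage, "log.day0", 1, 1, 75);
var r = readRecord(storage, "log.day0", 1);
console.log(r.minute, r.value);   // record 1: minute 1, value 75
```

    Slots that have not been written yet read back as 0xFF bytes, so the reader needs some way to tell them apart from real data.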

  • You can convert a string via E.toArrayBuffer or E.toUint8Array, or to any typed array, e.g. var b=Uint16Array(E.toArrayBuffer(s))

    Or you can actually use Storage.readArrayBuffer, since the typed array constructor can take an offset and a length into the ArrayBuffer, so something like b=Uint16Array(Storage.readArrayBuffer(..),byteOffset,length).
    All these will reuse underlying data in storage and won't create copies in RAM.
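
    As a small illustration of the offset/length form, here is a sketch in plain JavaScript. The buf variable is just a stand-in for whatever Storage.readArrayBuffer(..) would return, and the byte values are made up. The typed array reads in the platform's byte order, while a DataView lets you state the byte order explicitly and has no alignment requirement.

```javascript
// Stand-in for the ArrayBuffer that Storage.readArrayBuffer(..) would return.
var buf = new Uint8Array([1, 0, 2, 0, 3, 0, 4, 0]).buffer;

// View two uint16 values starting at byte 2. For a Uint16Array the byteOffset
// must be a multiple of 2, and `length` counts elements, not bytes.
var b = new Uint16Array(buf, 2, 2);

// The view shares the original buffer instead of copying it.
console.log(b.buffer === buf, b.byteOffset, b.length);

// A DataView reads the same bytes with an explicit byte order (true = little-endian).
var dv = new DataView(buf, 2, 4);
console.log(dv.getUint16(0, true), dv.getUint16(2, true)); // 2 3
```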

    Too bad there is no API to erase pages in the middle of storage files or to create a page-aligned storage file. That would give the advantages of the Flash module without the danger of breaking something. That would allow to do circular buffer in the file.

  • Actually, E.toUint8Array seems to make a copy; I just tried it in the emulator:

    >var h=E.toFlatString("Hello")
    ="Hello"
    >var hb=E.toArrayBuffer(h)
    =new Uint8Array([72, 101, 108, 108, 111]).buffer
    >var hu=Uint8Array(hb)
    =new Uint8Array([72, 101, 108, 108, 111])
    >var hu2=E.toUint8Array(h)
    =new Uint8Array([72, 101, 108, 108, 111])
    >E.getAddressOf(hu,true)
    =179852
    >E.getAddressOf(hb,true)
    =179852
    >E.getAddressOf(h,true)
    =179852
    >E.getAddressOf(hu2,true)
    =180006
    >var hu3=E.toUint8Array(hb)
    =new Uint8Array([72, 101, 108, 108, 111])
    >E.getAddressOf(hu3,true)
    =180300
    

    So hu2 and hu3, made by E.toUint8Array, are new arrays holding their own copy of the data (their addresses differ). The others share the same address, as expected.

  • Thanks for the answers!

    That would allow to do circular buffer in the file.

    A circular buffer is roughly the idea I am thinking about: as new data comes in, it gets appended, and when the file is full, old data gets overwritten. This should be possible by pre-allocating enough space with the size parameter of write(), and then using the offset parameter of write() to append to the same file. However, the documentation says that I can't rewrite the same position in the same file, which brings two problems:

    Problem 1: I cannot rewrite older data. The workaround is to pre-allocate enough space and not overwrite. If space runs out, I either create a larger file (and copy the existing content), or I give up and warn the user. I can live with the second, though it is not great.

    Problem 2: I need to keep track of the last byte that was written to the file; otherwise, when I read the file back I will get all the "empty" data, the 0xFF bytes. Either I impose that 0xFF cannot be used, or I keep track of the actual size of the storage file in separate persistent memory, but Storage (and StorageFiles) would not work for this because I cannot rewrite them.
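
    One way to handle Problem 2 without a separate counter could be to recover the write position from the file itself. The sketch below assumes fixed-size records that are always appended in order, and that no valid record consists entirely of 0xFF bytes (for example because one field can never be 0xFFFF). The function names and the 4-byte record size are hypothetical. Because written records form a prefix of the file, a binary search finds the first erased slot.

```javascript
// True if every byte of record `index` is still in the erased state (0xFF).
function isErased(bytes, index, recordSize) {
  for (var i = 0; i < recordSize; i++) {
    if (bytes[index * recordSize + i] !== 0xFF) return false;
  }
  return true;
}

// Returns how many records have been written, i.e. the index of the next free
// slot. Relies on written records forming a contiguous prefix of the file.
function countWritten(bytes, recordSize) {
  var lo = 0;
  var hi = Math.floor(bytes.length / recordSize);
  while (lo < hi) {
    var mid = (lo + hi) >> 1;
    if (isErased(bytes, mid, recordSize)) hi = mid;
    else lo = mid + 1;
  }
  return lo;
}

// Stand-in for new Uint8Array(Storage.readArrayBuffer(name)): 10 slots of 4 bytes.
var file = new Uint8Array(40).fill(0xFF);
file.set([0, 0, 72, 0, 1, 0, 75, 0, 2, 0, 71, 0]); // three records already written
console.log(countWritten(file, 4)); // 3
```

    The next append would then go to offset countWritten(...) * recordSize, and a reader can stop at the same position.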

    Do you wise people have a better idea?
    Gordon, maybe you can extend the current API to support overwriting already written positions?

  • Gordon, maybe you can extend the current API to support overwriting already written positions?

    In general that's not possible. Flash memory can only be erased in blocks aligned to some boundary. Storage files occupy a contiguous area of flash but are not page aligned (to save space). However, if there were a flag to create a page-aligned file and an API to erase blocks, it would be doable.
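
    To make the constraint concrete, here is a toy model of flash behaviour in plain JavaScript. It is not the Flash module's API: the program and erasePage functions are invented for the illustration, and the 4096-byte page size is an assumption (the real erase-block size depends on the chip). Programming can only clear bits, and setting them back to 1 requires erasing a whole aligned page, including whatever else lives in it.

```javascript
var PAGE_SIZE = 4096; // assumed erase-block size, for illustration only

// Two pages of simulated flash, initially erased (all bits set).
var flash = new Uint8Array(2 * PAGE_SIZE).fill(0xFF);

// Programming can only clear bits (1 -> 0), modelled as a bitwise AND.
function program(addr, value) {
  flash[addr] &= value;
}

// Erasing sets bits back to 1, but only for a whole aligned page at a time.
function erasePage(addr) {
  var start = addr - (addr % PAGE_SIZE);
  flash.fill(0xFF, start, start + PAGE_SIZE);
  return start;
}

program(5000, 0x0F);       // first write works: 0xFF & 0x0F = 0x0F
program(5000, 0xF0);       // "overwriting" cannot set bits back: 0x0F & 0xF0 = 0x00
console.log(flash[5000]);  // 0, not 0xF0

program(4999, 0x42);       // a neighbouring byte, e.g. part of another file
erasePage(5000);           // erasing to rewrite byte 5000 wipes the whole page
console.log(flash[4999]);  // 255: the neighbour is gone too
```

    This is why files that are not page aligned cannot safely be erased in the middle: the erased page may also contain the start or end of a neighbouring file.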

  • As others have said, the storage library with large files is best - it calls into the same flash write routines under the hood but it just ensures you're writing to the correct place. There are some examples at https://www.espruino.com/Data+Collection#flash-memory which might help you for reading/writing raw data

    This might help too - showing how to write data and also read it back automatically when the Bangle goes in range: https://www.espruino.com/Auto+Data+Download

    Personally, on Bangle.js 2 you've got a decent amount of RAM. If you're not doing anything else with the watch and are just using it for logging:

    • Stop the user easily resetting the watch: https://www.espruino.com/Bangle.js+Button+Reset
    • Make a big (100k?) RAM buffer with a Uint8Array and then access it with a DataView
    • When that gets full, write it with Storage.write to a new file

    Then you don't need to bother with appending or anything fancy. You've got 8MB of flash, so with 100k buffers it's only 80 files, which isn't really a huge problem (each file has a 32 byte header, so it's not like you're losing much space).

    Writing 100k to a file will take about 0.3s, but that might well not be a big problem for you? Even if you used slightly smaller files it may still be easier and safer than managing writing individual bits to a big file.

    The benefit is that when you're downloading from the watch you've also got a nice easy way to handle transfers - just search for and download any data files you see, then delete them afterwards.
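
    The buffer-then-flush idea above could look roughly like the sketch below. Everything named here is an assumption made for illustration: the makeLogger helper, the 6-byte record layout (uint32 timestamp plus uint16 value), the "log.N" file names and the tiny buffer size used in the demo. The fakeStorage object is only a stand-in so the code runs off-device; on the watch you would pass require("Storage") and a much larger buffer (the suggestion above is around 100k).

```javascript
var RECORD_SIZE = 6; // hypothetical layout: uint32 timestamp + uint16 value

function makeLogger(storage, bufferSize) {
  var buf = new Uint8Array(bufferSize);
  var view = new DataView(buf.buffer);
  var used = 0;    // bytes filled so far
  var fileNo = 0;  // suffix of the next file to write

  // Write the filled part of the buffer to a brand new file, then start over.
  function flush() {
    if (used === 0) return;
    storage.write("log." + fileNo, buf.subarray(0, used));
    fileNo++;
    used = 0;
  }

  // Append one record to the RAM buffer, flushing first if it would not fit.
  function add(timestamp, value) {
    if (used + RECORD_SIZE > bufferSize) flush();
    view.setUint32(used, timestamp, true);   // little-endian
    view.setUint16(used + 4, value, true);
    used += RECORD_SIZE;
  }

  return { add: add, flush: flush };
}

// Stand-in for require("Storage"): keeps a copy of whatever is written.
var written = {};
var fakeStorage = {
  write: function (name, data) { written[name] = new Uint8Array(data); return true; }
};

var logger = makeLogger(fakeStorage, 3 * RECORD_SIZE); // room for 3 records only
for (var i = 0; i < 4; i++) logger.add(1000 + i, 70 + i);
console.log(Object.keys(written)); // only "log.0" so far, holding the first 3 records
```

    Data still sitting in the RAM buffer is lost if the watch resets, so calling flush() before a download (or on a timer) may be worth considering.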

Posted by @user107850