"ERROR Connection Timeout" on gatt.connect()

  • I am using the MDBT42Q with software version 2v04 and the console on Serial1.

    The problem is that gatt.connect() only rarely succeeds in connecting to either of 2 BLE peripheral devices; more than 9 times out of 10 it throws "ERROR Connection Timeout". I am using very basic code:

    NRF.requestDevice({ filters: [{ name: 'devicename'}] })
    .then(function(device){
        console.log("found - trying to connect");
        return device.gatt.connect();
    })
    .then(function() {
      console.log("Done!");
    })
    .catch(function(err) {
      console.log("ERROR", err);
    });
    

    I am trying this with 2 BLE devices: A BLE sensor and the nRF app on Android set as an advertiser (discoverable, connectable).

    I am using a 2nd Android phone with the nRF app as BLE Central to scan and connect (and then disconnect) to either of these 2 BLE devices, which works reliably. As far as I know no other BLE central device is trying to connect to the devices.

    Given that it works from the nRF app and there are very few parameters to tweak the connection establishment in the MDBT42Q I am not sure what to try next. Any help is appreciated.

    Thanks

  • Bump ... any ideas? I was thinking maybe it's related to security/key exchange, but I don't believe security is an issue here as neither peripheral (sensor nor the nRF app) has security enabled.

  • Sat 2019.10.05

    @user103949 no experience with gatt but;

    Does this thread with more verbose code provide some insight?

    GATT connection timeout?


    https://www.espruino.com/BLE+Communications

  • Hi Robin,

    Thanks for the pointers, I've looked at these two before.

    The "GATT connection timeout" doesn't apply because as far as I can tell in that thread the connect() call initially succeeds but then quickly disconnects due to bonding issues. In my case I can't even get the initial connect() call to succeed.

    I've also reviewed "BLE Communications" and tried various versions of BLE example code, including using the "ble_uart" module to see if my BLE sensor responds to the Nordic UART service (even though the nRF app doesn't seem to need it), but no success.

    At this point, connect() continues to fail most of the time with a timeout, and when it succeeds (rarely) it is not reproducible.

    Best,
    -- Terrence

  • Ok, some more info:

    I tried connecting to the BLE sensor from iOS and that worked as well, so I am pretty sure the sensor accepts incoming connections just fine and the issue is with the way the MDBT42Q tries to connect to the sensor.

    I noticed https://www.espruino.com/Reference#l_NRF_setSecurity, which indicates that "bond" is true by default. The BLE sensor does not do bonding, so trying to bond will fail ... however, I believe since the connection attempt comes before bonding it should still succeed.

    Just to be sure, I tried
    NRF.setSecurity({bond:false});
    but that fails with "Uncaught Error: BLE error 0x7 (INVALID_PARAM)"
    Other security parameters can be set just fine, so something is wrong with the "bond" parameter or the documentation.

    In any case, I am still stuck.

    Thanks,
    -- Terrence

  • Terrence, until others respond, my guesses may not be of much assistance, but to help the others:

    When testing from iOS is that also/only with the WebIDE and iOS? (e.g. not testing using windows10 - Is it possible to test with Win10? not an Android device)

    Which BLE sensor is in use? Would you post a link to its docs please.

    Is it possible that the 'bond' attribute must be used with the/some of the other attributes? (S.W.A.G. guessing here)

    Posting *ALL* the code may better assist others.

    Is this of any use beneath heading 'Bonding / Whitelisting'

    https://www.espruino.com/BLE+Security#line=37,38

     

    'Just to be sure, I tried NRF.setSecurity({bond:false});'

    I notice in the documentation there, the display attribute has a value of 1 rather than true. Has a numerical value been tried instead?

    'so something is wrong with the "bond" parameter or the documentation'

    My guess is that the docs are okay as they have been around and in use for over two years now. But there are many related forum responses, too many to mention in this thread; try Googling gatt ble site:espruino.com (with the site specifier) to obtain a list of many tens of entries. That might provide some insight while waiting.

  • Hi again,

    so this is really weird ... for reasons I don't understand it just started working. I can connect to the sensor now (well, most of the time, see below) and I also just added some code to read a service and characteristic and that works as expected. But the connect part of the code is still the same as before, so it's a bit of a mystery.

    What I did change was the console settings. I switched from serial back to Web Bluetooth and that seems to have changed the MDBT42Q Central behavior. Although this is still not 100% reliable ... sometimes Web Bluetooth fails to connect to the MDBT42Q and sometimes Web Bluetooth unexpectedly disconnects and I have to power-cycle the MDBT42Q to get it to connect to the IDE again.

    It feels like there is some combination of serial, Bluetooth, the console setting, and some other factor that messes up the MDBT42Q Bluetooth stack to a point where it won't connect to anything as a Central. I also still occasionally get "Uncaught InternalError" from the MDBT42Q.

    I will keep an eye on this and try to create reproducible behavior. Thanks for helping so far!

    Best,
    -- Terrence

  • 'I switched from serial back to Web Bluetooth and that seems to have changed the MDBT42Q Central behavior'

    Have you seen this note beneath heading 'Serial Console' from:

    https://www.espruino.com/MDBT42Q#Serial-Console

    "When you connect via Bluetooth, the console will automatically move over. To stop this, execute Serial1.setConsole(true) to force the console to stay on Serial1"

    The entire section contains valuable tidbits.


    'I can connect to the sensor now'

    Which sensor? Link would be valuable to others attempting to diagnose.

    What is being used to power the MDBT42Q breakout board? Could an old battery be the culprit? Power spikes, decoupling caps?

  • Hi Robin,

    I experimented with this some more now. I simplified the setup and removed the TX/RX Serial1 lines, so only using Web Bluetooth to connect the MDBT42Q to the IDE.

    I am connecting to this sensor:
    https://www.aspion.de/en/transport-data-logger-aspion-g-log-2/
    I have the protocol specs but cannot provide them as they are not public. The sensor behaves like a Nordic UART. I am pretty sure the sensor is not the culprit as I am seeing the same MDBT42Q behavior using the nRF app as a peripheral.

    Thanks for the hint with the power supply. I just switched to a fresh 4 x AA (6V) battery power supply, so power should not be an issue ... and it doesn't appear to be (no change in behavior).

    Right now, the MDBT42Q is just very unreliable. I can connect to the sensor and get data with the Android nRF app reliably 100% of the time, but executing the below MDBT42Q code gives me the following results:

    • About 30% of the time: "ERROR No device found matching filters"
    • About 30% of the time: "ERROR Connection Timeout"
    • About 30% of the time: IDE disconnects from MDBT42Q during execution of the program (usually around connect()), so I lose the console and don't know what's happening
    • About 10% of the time: Connection succeeds, sensor data is displayed, then connection is successfully closed

    I see the same behavior regardless of whether I freshly power-cycle the MDBT42Q or run the program several times in a row using the IDE upload button. I also checked the Bluetooth traffic in my vicinity but the nRF app actually only sees about 10 devices advertising, so traffic density is not a problem (my 2.4 GHz WiFi environment is also not crowded) ... and again, the Android nRF app works fine so the wireless environment is likely not the issue.

    At this point the evidence points to a bug in the Espruino stack making the MDBT42Q too unreliable to be useful.

    Thanks,
    -- Terrence

    var gatt;
    var service;
    
    console.log("start: searching for devices ...");
    
    NRF.requestDevice({ timeout: 20000, filters: [{ name: 'devicename'}] })
    .then(function(device){
        console.log("found - connecting ...");
        return device.gatt.connect();
    })
    .then(function(gattp){
        console.log("get primary service");
        gatt = gattp;
        return gatt.getPrimaryService("6e400001-b5a3-f393-e0a9-e50e24dcca9e");
    })
    .then(function(servicep){
        console.log("get TX characteristic");
        service = servicep;
        return service.getCharacteristic("6e400003-b5a3-f393-e0a9-e50e24dcca9e");
    })
    .then(function(characteristicp){
        console.log("start notifications");
        characteristicp.on('characteristicvaluechanged', function(event) {
          console.log("RX: "+JSON.stringify(event.target.value.buffer));
        });
        return characteristicp.startNotifications();
    })
    .then(function(){
        console.log("get RX characteristic");
        return service.getCharacteristic("6e400002-b5a3-f393-e0a9-e50e24dcca9e");
    })
    .then(function(characteristicp){
        console.log("send command");
        // return the promise so the chain waits for the write to complete
        return characteristicp.writeValue('g');
    })
    .then(function() {
      // keep the connection open for 10s so notifications can arrive
      console.log("delaying for 10s");
      return new Promise(function(resolve) {
        setTimeout(resolve, 10000);
      });
    })
    .then(function() {
      console.log("end delay - disconnecting ...");
      gatt.disconnect();
      console.log("Done!");
    })
    .catch(function(err) {
      console.log("ERROR", err);
    });
    
  • Sun 2019.10.06

    Good Morning Terrence,
    Nothing immediately stands out, and having a solid power bank should rule out a power issue.

    I posted a link in #3

    GATT connection timeout?

    It would be really helpful to others if you would post the output, as the first posting (#1) there shows, when connected and, if possible, any state when not connected.

    Confirming we have: 'MDBT42Q with software version 2v04'

    and is that with the breakout board or just the 42Q module itself?

  • Hi Robin,

    Confirming this is the MDBT42Q breakout board with 2V04.

    I poked around various BLE documentation some more and -- running out of other options -- set the scan to 'active' in NRF.requestDevice({active:true, ...}). That made a big improvement in the connect() success rate!

    The sensor I'm using supports active scanning ... in my BLE understanding this should be compatible with passive scanning (it just allows the peripheral to send additional advertising data in a second packet) but apparently the MDBT42Q stack doesn't handle this transparently.

    I found a hint here: Passive scan returns different address type than active scan for BLE devices -- this appears to be related to the nRF BLE stack, but this post is for a Linux Python driver.

    The point is that for (some?) devices a passive scan appears to return a different MAC address of the peripheral than an active scan. So when NRF.requestDevice() does a passive scan (which is the default) and finds a device with a specific MAC, and you then pass that MAC into connect(), it doesn't match the active scan MAC of the device and the connection times out. This is very reproducible in my setup if I flip between {active:true} and {active:false}.

    The reason this worked in the nRF app may be because the nRF app handles passive vs. active scans transparently ... there appears to be no way in the app to select passive vs. active scanning and maybe the underlying stack deals with the change in MAC automatically.

    In any case, using NRF.requestDevice({active:true, ...}) raised the rate of connect() success significantly ... I still get occasional errors that the device is not found or the connect attempt times out (BLE isn't 100% reliable, so that's OK), but now the code succeeds in pulling off the data from the sensor in >80% of runs so that's a huge improvement.
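
    A minimal sketch of the approach described above: scan with active:true and simply re-scan when a scan or connect attempt fails. The helper name connectActive and the maxAttempts parameter are hypothetical, not part of the Espruino API; the NRF object is passed in as an argument only so the function can be exercised off-device with a stub.

```javascript
// Sketch only. On Espruino, pass the global NRF object as `nrf`.
// `connectActive` and `maxAttempts` are illustrative names, not Espruino API.
function connectActive(nrf, name, maxAttempts) {
  // active:true requests scan responses; timeout as used earlier in this thread
  var options = { active: true, timeout: 20000, filters: [{ name: name }] };

  function attempt(n) {
    return nrf.requestDevice(options)
      .then(function (device) {
        console.log("found - connecting ...");
        return device.gatt.connect();
      })
      .catch(function (err) {
        // give up after the last attempt, otherwise scan again
        if (n >= maxAttempts) throw err;
        console.log("attempt " + n + " failed (" + err + "), re-scanning ...");
        return attempt(n + 1);
      });
  }

  return attempt(1);
}
```

    On the device this might be called as connectActive(NRF, 'devicename', 3).then(function(gatt) { ... }), with the rest of the promise chain unchanged.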

    Best,
    -- Terrence

  • Hi,

    Sorry you've had so much trouble with this - I'll try to answer what I think might still be outstanding but let me know if I missed anything...

    set the scan to 'active' in NRF.requestDevice... MAC addresses change

    Wow, I have never come across that before at all. And you actually see a different MAC address if you do:

    NRF.requestDevice({ timeout: 20000, filters: [{ name: 'devicename'}] }).then(print);
    

    Is it a completely different number, or just the difference between public/random?

    Usually non-active scan works fine and as you note, Active just allows extra data to be transferred (via a scan response packet). Normally devices don't rely on that because it's super unreliable when you get in congested radio areas.

    The fix applied for that Python issue you found looks odd as well: https://github.com/IanHarvey/bluepy/commit/91227b7335acdc23703be4b7de45e804b3231be8

    It seems like the address type being reported isn't public or random - and I guess it's possible that in that case Espruino defaults to the wrong one...

    About 30% of the time: "ERROR No device found matching filters"

    The usual fix for this would be a long timeout, but it seems you're already doing that. My guess would be that the device is advertising its name in a scan response packet (only available with active:true rather than in the main packet).

    If you could find another way to filter (maybe run NRF.findDevices(print) and see if anything recognizable like a service UUID is present) then you might get more reliable connections.

    About 30% of the time: "ERROR Connection Timeout"

    What does NRF Connect say the advertising interval is?

    I guess what might be happening is: the BLE device you connect to advertises very rarely (once per second or less often) and only speeds up when it gets asked to provide a scan response in an active scan.

    If you're then connecting after doing a non-active scan, the advertisements are too far apart for the MDBT42 to be able to reliably connect.

    About 30% of the time: IDE disconnects from MDBT42Q during execution of the program (usually around connect()), so I lose the console and don't know what's happening

    That's really strange. The 2v04 firmware is really very reliable with Bluetooth and I can't remember the last time I crashed it while connecting to anything. You mentioned a Serial connection earlier - if this happens again please can you try while connected on Serial and see if any errors get reported.

    Anyway, glad you got it working more reliably with active:true. It's very hard for me to debug any firmware issues if I can't reproduce though.

    Am I right in thinking you're in Germany? If this continues to be an issue is it possible that you could lend me one of those devices to try out?

    I just tried this connecting to nRF connect on my phone, and definitely have issues connecting. On closer inspection it seems to be because the phone is using a RANDOM_PRIVATE_RESOLVABLE address (most devices use either public or random static), and Espruino was interpreting that as a public address.

    I have now fixed it, so if you try with a 'cutting edge' firmware from http://www.espruino.com/binaries/travis/master/ you may have more success.

    It still doesn't explain the crashes you were seeing though as mine just failed with ERROR Connection Timeout each time.

  • Hi Gordon,

    Thanks for the reply. Some answers to your questions:

    Is it a completely different number, or just the difference between public/random?

    This is what I get when I print(device) right after NRF.requestDevice():

    BluetoothDevice: {
      "id": "d1:1a:a8:29:8c:0d random",
      "rssi": -62,
      "data": new Uint8Array([2, 1, 6, 9, 9, 78, 88, 84, 71, 78, 32, 35, 50]).buffer,
      "name": "devicename"
     }
    

    I get the same for active:true and active:false (except for RSSI, of course)
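
    The id string above carries both the address and its type. As a rough sketch (describeAddress is a hypothetical helper, not an Espruino function), the two parts can be split apart, and for a random address the two most significant bits of the address indicate the subtype defined by the Bluetooth Core Specification: 11 is static, 01 is resolvable private, 00 is non-resolvable private. This assumes the first octet in the printed string is the most significant one.

```javascript
// Sketch only: split an id such as "d1:1a:a8:29:8c:0d random" into address
// and type, and classify random addresses by their top two bits.
// `describeAddress` is an illustrative name, not part of the Espruino API.
function describeAddress(id) {
  var parts = id.split(" ");
  var address = parts[0];
  var type = parts[1] || "unknown";
  var subtype = null;
  if (type === "random") {
    // top two bits of the most significant octet (assumed to be printed first)
    var top = parseInt(address.substr(0, 2), 16) >> 6;
    subtype = ["non-resolvable private", "resolvable private", "reserved", "static"][top];
  }
  return { address: address, type: type, subtype: subtype };
}

console.log(describeAddress("d1:1a:a8:29:8c:0d random").subtype); // static
```

    For the sensor in this thread the first octet is 0xd1 (binary 1101 0001), so the address classifies as random static rather than resolvable private.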

    The usual fix for this would be a long timeout, but it seems you're already doing that. My guess would be that the device is advertising its name in a scan response packet (only available with active:true rather than in the main packet).

    According to the sensor documentation the local device name is included in the regular advertising packet, but the manufacturer-specific data is included in the scan response (active scan).
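
    The data bytes in the printout above can be checked against this. A BLE advertising payload is a sequence of structures, each made of a length byte, a type byte, and length-1 bytes of data. The sketch below splits the payload and decodes type 0x09 (Complete Local Name); parseAdvertising and localName are hypothetical helper names, and on the device the input would be something like new Uint8Array(device.data).

```javascript
// Sketch only: split a raw advertising payload into {type, data} fields.
// Each structure is [length][type][length-1 data bytes].
function parseAdvertising(bytes) {
  var fields = [];
  var i = 0;
  while (i < bytes.length) {
    var len = bytes[i];
    // stop on padding or a structure that runs past the end of the payload
    if (len === 0 || i + len + 1 > bytes.length) break;
    var data = [];
    for (var j = i + 2; j <= i + len; j++) data.push(bytes[j]);
    fields.push({ type: bytes[i + 1], data: data });
    i += len + 1;
  }
  return fields;
}

// Return the Complete Local Name (AD type 0x09) if present, else null.
function localName(bytes) {
  var fields = parseAdvertising(bytes);
  for (var k = 0; k < fields.length; k++) {
    if (fields[k].type === 0x09) return String.fromCharCode.apply(null, fields[k].data);
  }
  return null;
}

var adv = [2, 1, 6, 9, 9, 78, 88, 84, 71, 78, 32, 35, 50];
console.log(parseAdvertising(adv).length, localName(adv));
```

    For the bytes posted here this yields two fields: flags (type 0x01, value 6) and an 8-character Complete Local Name, which is consistent with the name being carried in the main advertising packet rather than only in the scan response.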

    The device is designed to live for many months on battery, so I suspect it sleeps a lot and minimizes transmissions. nRF Connect says the advertising time is 7ms, but I can see 5 to 10s between packets from the device. So I am not surprised that Espruino misses the device occasionally ... it's not a problem, I can just re-scan.

    I guess what might be happening is: the BLE device you connect to advertises very rarely (1 sec or less) and only speeds up when it gets asked to provide a scan response in an active scan.

    Sounds plausible.

    That's really strange. The 2v04 firmware is really very reliable with Bluetooth and I can't remember the last time I crashed it while connecting to anything. You mentioned a Serial connection earlier - if this happens again please can you try while connected on Serial and see if any errors get reported.

    Could the cause of the closed connection also be on the Mac OS side (e.g. the Mac Espruino IDE or the Mac OS Bluetooth support)? I will try and capture a reproducible situation for this disconnect and let you know.

    Am I right in thinking you're in Germany? If this continues to be an issue is it possible that you could lend me one of those devices to try out?

    Yes. I only have one loaner device from the company, but it may be possible to get another one for testing.

    I have now fixed it, so if you try with a 'cutting edge' firmware from http://www.espruino.com/binaries/travis/master/ you may have more success.

    Great, thanks. I will try that out tomorrow, I hope.

    It still doesn't explain the crashes you were seeing though as mine just failed with ERROR Connection Timeout each time.

    I tried many different permutations over the last few days so I am not sure about the exact circumstances under which this happened. Will try and reproduce.

    Thanks again,
    -- Terrence

  • This is what I get

    If the address is reported as the same then I'm surprised it's an issue - it's actually the exact address string you see there that is used to initiate the connection, so if it's the same there should be no difference.

    I can see 5 to 10s between packets

    This would probably explain some connection issues too - I forget the timeout on connections but it's probably no more than 5 seconds. If it's reporting 7ms intervals but is only sending every 5 sec, that's probably upset Espruino's Bluetooth stack a bit too!

    If the interval is reliably 5 seconds, you might get more reliable connections by doing requestDevice, waiting 4.5 sec once you get the response, and then initiating a connection.
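
    A sketch of that suggestion, assuming the advertising interval really is about 5 seconds. The helper names delay and connectAfterDelay are hypothetical, and the optional timer argument exists only so the delay can be replaced with a stub when trying this off-device; on Espruino it can be left out and setTimeout is used.

```javascript
// Sketch only: wait a fixed time after the scan result, then connect,
// so the connection attempt starts shortly before the next advertisement.
function delay(ms, timer) {
  timer = timer || setTimeout; // optional override, for testing only
  return new Promise(function (resolve) {
    timer(resolve, ms);
  });
}

function connectAfterDelay(device, waitMs, timer) {
  return delay(waitMs, timer).then(function () {
    console.log("delay over - connecting ...");
    return device.gatt.connect();
  });
}
```

    In the earlier promise chain this would replace the direct connect call, for example .then(function(device) { return connectAfterDelay(device, 4500); }). The 4500 ms value comes from the suggestion above and would need tuning to the interval actually observed.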

    Could the cause of the closed connection also be on the Mac OS side

    Maybe. If you're thrashing WiFi as well it can definitely make things flaky, but usually it's quite good. If it happens again and the LEDs on the MDBT42Q breakout flash, then it'll be a reboot caused by an error reported from the BLE stack. In that case the error should be reported on Serial, and it's something I could hopefully track down.

    I only have one loaner device from the company

    I guess they might loan one out to me.

    Good luck with the testing. It looks from the device address string that this may not be the problem you were seeing, but you should now be able to connect to your phone at least!


Posted by @user103949
