Problems with "access the internet"

  • Hi,
    I have tried the "access the internet" function, but I have run into problems.
    I tried several web pages, and each one gives me a (different) SAXParseException.

    For example, the following call results in the exception:

    Bluetooth.println(JSON.stringify({t:"http",url:"https://banglejs.com/reference",xpath:"/html/body/div/h2"}));
    

    Does the call work for anyone? Is it possibly a problem with my phone?

    Using the relevant lines from the BanglejsGadgetBridge project, I was able to recreate the error in a small sample project of my own.

    InputSource inputXML = new InputSource(new StringReader(response));
    XPath xPath = XPathFactory.newInstance().newXPath();
    response = xPath.evaluate(xmlPath, inputXML);
    

    Simple HTML/XML structures work, but complex web pages do not. I suspect the parser can't handle the embedded JavaScript, but I am not sure.
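
    The behaviour described above can be reproduced without a phone. The following is a minimal sketch, assuming a desktop JDK rather than Android's parser, so the exception text will differ from the one seen on the device. The class name, the helper names and the sample markup are hypothetical and only serve as illustration. It evaluates the same kind of XPath call against a well-formed snippet and against a snippet with an inline script containing a bare `&`, which is legal in HTML but not in XML.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Gadgetbridge.
class XPathOnHtmlDemo {

    // Same three calls as in the snippet above: the input is parsed as XML, not as HTML.
    static String evaluate(String markup, String xmlPath) throws XPathExpressionException {
        InputSource inputXML = new InputSource(new StringReader(markup));
        XPath xPath = XPathFactory.newInstance().newXPath();
        return xPath.evaluate(xmlPath, inputXML);
    }

    // Convenience wrapper: returns null when the markup cannot be parsed as XML
    // (or the XPath expression itself is invalid).
    static String evaluateOrNull(String markup, String xmlPath) {
        try {
            return evaluate(markup, xmlPath);
        } catch (XPathExpressionException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String xmlPath = "/html/body/div/h2";

        // Well-formed markup: every tag is closed and there are no bare '&' characters.
        String wellFormed =
            "<html><body><div><h2>Reference</h2></div></body></html>";

        // Typical HTML: the inline script contains "&&", which an XML parser
        // reads as the start of an entity reference and rejects.
        String typicalHtml =
            "<html><body><script>if (a && b) { run(); }</script>"
            + "<div><h2>Reference</h2></div></body></html>";

        System.out.println("well-formed: " + evaluateOrNull(wellFormed, xmlPath));

        try {
            evaluate(typicalHtml, xmlPath);
        } catch (XPathExpressionException e) {
            // The cause is the underlying SAXParseException from the XML parser.
            System.out.println("typical HTML failed: " + e.getCause());
        }
    }
}
```

    Unclosed tags such as `<br>` fail in the same way, which would explain why different pages produce different exceptions.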

    I searched for other solutions and tried Jsoup. It worked very well for me.

    Document doc = Jsoup.parse(response);
    response = doc.select(xmlPath).html();
    

    A big disadvantage is that this approach does not take XPath expressions; a selector has to be used instead.

    So, am I alone with this problem? Can someone help me get the default implementation to work? Would an implementation with Jsoup be possible, or is there a reason not to use it?

  • So, which SAXParseException do you get?

  • @myxor, thank you for your response.

    I'll have another look this evening.

    But there were different exceptions depending on the page I wanted to load.
    If you open the HTML in, for example, the firstobject XML Editor, you will also see structural errors caused by the embedded JavaScript. As I wrote above, very simple HTML/XML structures work.

    Does the example work for you/anyone?

    Bluetooth.println(JSON.stringify({t:"http",url:"https://banglejs.com/reference",xpath:"/html/body/div/h2"}));
    

    If so, then it is probably a problem with my phone. I can't rule that out. It is a Teracube 2e with Android 11 (there are some open issues with the phone).

    I have now modified Bangle.js Gadgetbridge for my own use and added Jsoup. I also added the option to execute multiple XPath expressions (or selectors) with one call. If I ever have some time, I'll refactor the code and create a merge request.
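
    The modified code is not shown in this thread, so the following is only a rough sketch of what "multiple XPath expressions with one call" could look like. It uses the standard javax.xml classes instead of Jsoup to stay dependency-free, which means it still requires well-formed input. The class and method names are hypothetical and are not taken from Gadgetbridge. The point is that the response is parsed once and every expression is evaluated against the same document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not the code from the modified Gadgetbridge build.
class MultiXPathSketch {

    // Parses the response once, then evaluates each XPath expression against
    // the resulting DOM. Results are returned in the order of the expressions.
    static List<String> evaluateAll(String response, List<String> xmlPaths) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(response)));
            XPath xPath = XPathFactory.newInstance().newXPath();

            List<String> results = new ArrayList<>();
            for (String xmlPath : xmlPaths) {
                // evaluate(String, Object) returns the string value of the first match,
                // or an empty string when nothing matches.
                results.add(xPath.evaluate(xmlPath, doc));
            }
            return results;
        } catch (Exception e) {
            // Covers non-well-formed input as well as invalid XPath expressions.
            throw new IllegalArgumentException("Could not evaluate XPath expressions", e);
        }
    }

    public static void main(String[] args) {
        String response = "<page><title>Reference</title><version>2</version></page>";
        List<String> results =
                evaluateAll(response, Arrays.asList("/page/title", "/page/version"));
        System.out.println(results);
    }
}
```

    Parsing once avoids fetching and parsing the same page for every expression, which matters when each request travels over Bluetooth.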

  • With the URL "https://banglejs.com/reference" I get the following exception:

    org.xml.sax.SAXParseException: unterminated entity ref (position:ENTITY_REF @1:1230 in java.io.StringReader@299f0e3)
        at org.apache.harmony.xml.parsers.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:147)
        at org.apache.xpath.jaxp.XPathImpl.evaluate(XPathImpl.java:474)
        at org.apache.xpath.jaxp.XPathImpl.evaluate(XPathImpl.java:521)
        at gutzeit.com.xpathtest.MainActivity.lambda$testAsync$0(MainActivity.java:123)
        at gutzeit.com.xpathtest.MainActivity$$ExternalSyntheticLambda0.run(Unknown Source:0)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
        at java.lang.Thread.run(Thread.java:923)
    --------------- linked to ------------------
    javax.xml.xpath.XPathExpressionException: org.xml.sax.SAXParseException: unterminated entity ref (position:ENTITY_REF @1:1230 in java.io.StringReader@299f0e3)
        at org.apache.xpath.jaxp.XPathImpl.evaluate(XPathImpl.java:479)
        at org.apache.xpath.jaxp.XPathImpl.evaluate(XPathImpl.java:521)
        at gutzeit.com.xpathtest.MainActivity.lambda$testAsync$0(MainActivity.java:123)
        at gutzeit.com.xpathtest.MainActivity$$ExternalSyntheticLambda0.run(Unknown Source:0)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
        at java.lang.Thread.run(Thread.java:923)
    Caused by: org.xml.sax.SAXParseException: unterminated entity ref (position:ENTITY_REF @1:1230 in java.io.StringReader@299f0e3)
        at org.apache.harmony.xml.parsers.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:147)
        at org.apache.xpath.jaxp.XPathImpl.evaluate(XPathImpl.java:474)
        ... 6 more

    I would be interested to know whether this works for anyone. It would be great if someone could try it out.

  • So... I believe if you don't specify an XPath you can load a webpage just fine (albeit only smallish ones).

    But I think I've had this error too - because HTML != XML. If someone wanted to contribute some changes that would ensure we used a parser in Gadgetbridge that handled HTML too that'd be great - but as I remember I didn't find an easy solution so I decided that some XPath support was better than none.

Posted by @ingo