-
Thanks @Gordon, I think that would be a good short-term solution until BLE is fully supported in all browsers. I'll have a play and see if I can get it working.
-
I've already exhausted that as an option. The issue is that it only works in Chrome right now, and Google requires an internet connection for it to work.
I am at a juncture where I've formulated 2 possible solutions:-
Snips
Install Snips on a Raspberry Pi and have the Puck act as a speech recognition controller, sending start and stop signals directly over MQTT to override Snips' hotword detection. From reading their forums this seems to be possible, and you can use JS to do it.
Pros
- Completely offline/private and the trained assistant is very accurate at transcribing numbers.
- Whilst there would be much more product development involved, we'd have a nice branded product at the end of it that would differentiate us from competitors.
Cons
- Potentially high licensing costs for commercial use. I always worry when a company doesn't publish its costs; it usually means it's expensive :(
- Getting a Bluetooth headset to behave reliably with a Raspberry Pi for audio input/output. From a few things I've read, Pis seem better suited to USB mics, which wouldn't suit our situation as the operator needs to be mobile and hands-free.
- The time Snips spends listening is short, so if we can't reliably extend it this might not be a workable solution. I tried changing a config setting and it did listen a bit longer, but not as long as it should have.
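To make the start/stop idea a bit more concrete, here is a minimal sketch of how a Puck button event might be turned into an MQTT message for Snips. The topic names (`hermes/asr/startListening`, `hermes/asr/stopListening`) and the `siteId`/`sessionId` payload fields are my assumptions about the Snips Hermes protocol and should be checked against their docs; `snipsMessageFor` and `SITE_ID` are hypothetical names made up for illustration.

```javascript
// Hypothetical helper: maps a Puck button event to a Snips (Hermes) MQTT message.
// Topic names and payload fields are ASSUMPTIONS, not confirmed against Snips docs.
const SITE_ID = 'default'; // assumed site id of the Pi running Snips

function snipsMessageFor(event, sessionId) {
  const topics = {
    start: 'hermes/asr/startListening', // assumed topic
    stop: 'hermes/asr/stopListening',   // assumed topic
  };
  const topic = topics[event];
  if (!topic) {
    throw new Error('Unknown event: ' + event);
  }
  // MQTT payloads are sent as strings, so serialise the JSON here.
  return { topic, payload: JSON.stringify({ siteId: SITE_ID, sessionId }) };
}
```

With an MQTT client such as MQTT.js, the result could then be sent with something like `client.publish(msg.topic, msg.payload)`. Whether this really overrides the hotword and extends the listening window is exactly the part that would need testing.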
Audio Recorder
Record mono WAV files in the browser (this would work whether offline or online) using the tried and tested Recorderjs, then upload the WAVs to Google's Cloud Speech API for transcribing. Use the Puck as a Record/Pause/Stop controller. From my understanding this would work out of the box in all browsers except Safari, but that's OK; hopefully they will add BLE at some point.
Pros
- We could store the WAVs for a period of time, allowing users to audit the results. For example, if a particular transcribed number was inaccurate it could be corrected via the web app at a later time.
- The whole system could be run on Google App Engine, so it should be quick to load and to run the transcription process.
- Much quicker to bring to market, as it really is just the Puck communicating with a web app.
- Potentially bigger market because it would work in most browsers out of the box. Other than the Puck and a Bluetooth headset, end users wouldn't have to invest in any special equipment to use the service.
Cons
- Difficult to calculate/forecast Google costs.
- Not a fully offline solution.
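The Record/Pause/Stop controller part of this option can be sketched as a small state machine. This is only an illustration under assumptions: the state names, the 1000 ms long-press threshold and the helpers `nextState` and `commandForPress` are all hypothetical, and the wiring to the Puck's button and to Recorderjs is left out.

```javascript
// Allowed transitions for a Record/Pause/Stop controller (illustrative only).
const TRANSITIONS = {
  idle:      { record: 'recording' },
  recording: { pause: 'paused', stop: 'idle' },
  paused:    { record: 'recording', stop: 'idle' },
};

// Returns the next state, or the current state if the command is not valid here.
function nextState(state, command) {
  const next = (TRANSITIONS[state] || {})[command];
  return next || state;
}

// Hypothetical mapping from a Puck button press to a command:
// a long press (assumed >= 1000 ms) stops, a short press toggles record/pause.
function commandForPress(state, pressMs) {
  if (pressMs >= 1000) {
    return 'stop';
  }
  return state === 'recording' ? 'pause' : 'record';
}
```

In the web app, each button notification from the Puck would call `commandForPress` and then `nextState`, and the recorder would be started, paused or stopped whenever the state actually changes.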
I thought I'd post my thoughts here to get them out of my head. I'm currently leaning more toward the Audio Recorder solution. My instinct is telling me the Pi route would be challenging, and tech moves so fast that the product could be outdated in no time. A web app can keep evolving, and I can envisage a really nice UI where you can check the Puck's battery life and other stats.
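For the upload step of the Audio Recorder option, the request to Google could look roughly like the sketch below. It builds the JSON body for the Cloud Speech-to-Text v1 `speech:recognize` REST call; `buildRecognizeRequest` is a hypothetical helper name, the sample rate and language code are placeholders rather than tested values, and the number phrase hints are an untested idea for nudging it toward digits.

```javascript
// Builds a JSON body for the Cloud Speech-to-Text v1 `speech:recognize` call.
// wavBase64: the recorded mono WAV file, base64-encoded.
// sampleRateHertz and languageCode must match the recording (placeholder values in usage).
function buildRecognizeRequest(wavBase64, sampleRateHertz, languageCode, phraseHints) {
  const config = {
    encoding: 'LINEAR16', // 16-bit PCM, which is what a standard WAV contains
    sampleRateHertz: sampleRateHertz,
    languageCode: languageCode,
  };
  // Optional phrase hints (e.g. number words); untested assumption that this helps.
  if (phraseHints && phraseHints.length > 0) {
    config.speechContexts = [{ phrases: phraseHints }];
  }
  return { config: config, audio: { content: wavBase64 } };
}
```

The body would be POSTed as JSON to `https://speech.googleapis.com/v1/speech:recognize` with suitable credentials, ideally from the App Engine backend so the key never reaches the browser. As far as I know the synchronous call only accepts short clips (around a minute), so longer recordings would need to be split or sent through the long-running variant.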
-
Thanks for that @Gordon. I spent yesterday getting my head around the Snips console and have trained an assistant that very accurately recognises numbers and various other keywords I've thrown in. I currently have it running on my Mac and am using JS to respond to intents. Would Snips run on any of your microprocessor devices?
-
I'm researching ways of having an offline ASR that just recognises numbers. Does anyone know of anything that could do that?
I came across https://www.matrix.one and https://snips.ai but it wasn't obvious to me if they could handle numbers. The Matrix sure looks a neat bit of tech though!
-
Thanks Gordon, I ordered a Puck last night and will have a play. Yes, I watched that demo and it looks to be along the lines of what I'd need. I plan to use NativeScript to code the app, so I'm hoping I can retrieve the Puck data via its Bluetooth plug-in: https://github.com/EddyVerbruggen/nativescript-bluetooth
-
I've had a look on the site but can't find any info: is the device always on? We are looking at using it to start and stop audio recording in a mobile app. There will be a need to turn it on and off multiple times in a time window that could be anywhere between 1 and 4 hours. Is that doable, do you think? What's the battery life like?
-
Look forward to it.