Recent research about open source speech recognition libraries

What is the issue?

The app I am developing requires:

- customized hotword (keyword) detection
- Korean text to Korean speech
- Korean speech to Korean text

Hotword detection (Continuous Speech Recognition)

Snowboy (KITT.AI)
- Customizable hotword detection engine for creating your own hotword, like “OK Google” or “Alexa”
- Based on deep neural networks (DNN)
- Personal (private) models can be customized at https://snowboy.kitt.ai/ (login required; Korean keywords are okay)
- Always listening (the recording thread can run in a foreground service)
  - https://stackoverflow.com/questions/50956228/thread-within-service-android-app
- No internet connection required
- Lightweight (CPU usage may increase if there is noise around)
- Supports various platforms and programming languages
- Commercial use requires a license (contact snowboy@kitt.ai)
- Android Demo https://github.com/Kitt-AI/snowboy/tree/master/examples/Android
Porcupine (Picovoice)
- Alireza Kenarsari (founder of Picovoice) says Porcupine has a lower miss rate than Snowboy
  - https://github.com/Picovoice/wakeword-benchmark
  - https://medium.com/@alirezakenarsarianhari/yet-another-wake-word-detection-engine-a2486d36d8d4
- However, when I actually tried it, Snowboy worked better in my development environment (I may need to try more models)
- Packaged as an AAR library
- Available in Standard and Tiny versions
PocketSphinx
- demo: https://github.com/cmusphinx/pocketsphinx-android-demo
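
To compare the engines above on a desktop before wiring them into an Android service, the "always listening" loop can be prototyped roughly as follows. This is only a minimal sketch: `FRAME_LENGTH`, `frames`, `listen`, and the energy-threshold `detect` callback are hypothetical names of my own, not part of the Snowboy, Porcupine, or PocketSphinx APIs. A real engine would replace the callback and dictate its own frame size.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical frame size in samples; each real engine defines its own.
FRAME_LENGTH = 512


def frames(samples: Iterable[int], frame_length: int = FRAME_LENGTH) -> Iterator[List[int]]:
    """Group a stream of PCM samples into fixed-size frames.

    A trailing partial frame is dropped, since detectors expect full frames.
    """
    buf: List[int] = []
    for sample in samples:
        buf.append(sample)
        if len(buf) == frame_length:
            yield buf
            buf = []


def listen(
    samples: Iterable[int],
    detect: Callable[[List[int]], bool],
    frame_length: int = FRAME_LENGTH,
) -> List[int]:
    """Feed every frame to `detect` and return the indices of frames that fired."""
    hits: List[int] = []
    for index, frame in enumerate(frames(samples, frame_length)):
        if detect(frame):
            hits.append(index)
    return hits


def loud(frame: List[int]) -> bool:
    """Placeholder detector: fires on loud frames. NOT a real hotword model."""
    return max(abs(s) for s in frame) > 500
```

For example, `listen([0] * 4 + [1000] * 4 + [0] * 3, loud, frame_length=4)` reports only the second frame. On Android the same loop would sit in the recording thread of the foreground service, reading from the microphone instead of a list.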

Speech to Text & Text to Speech (Korean)

Kaldi is a speech recognition toolkit written in C++.
Zeroth Project (Kaldi based)
- MoreCoin: a mobile app for collecting voice data from various users: link
- Explanation of how Korean speech recognition works: link
  - Multi-layer structure for analyzing voices
  - Explains the difference between English and Korean
- Uses an AWS server and a web crawler to collect a 13 GB pronunciation dictionary and language models
Kakao API (Newton API): 20,000 free requests
- Provides both STT and TTS
- Research on open source APIs (in Korean): link
- Good speech quality
Naver Clova Speech Recognition API (STT): link, 4 Won per 15 seconds
Naver Clova Speech Synthesis API (TTS): link, 4 Won per 1,000 characters
CMUSphinx
- How to use other languages? link
- Requires a Korean acoustic model and language model
Amazon Lex (STT) & Polly (TTS)
- We may be able to use a third-party language translation solution
- IoT solution: link
IBM Watson (SK NUGU uses this API: http://www.newspim.com/news/view/20170206000145)
- STT: link
- TTS: link
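
To get a feel for the Naver Clova prices listed above (4 Won per 15 seconds of STT, 4 Won per 1,000 characters of TTS), a rough cost estimate can be sketched as below. The function names are my own, and rounding usage up to whole billing units is an assumption; the actual billing rules should be checked against Naver's documentation.

```python
import math

# Prices taken from the notes above.
STT_WON_PER_UNIT = 4
STT_UNIT_SECONDS = 15
TTS_WON_PER_UNIT = 4
TTS_UNIT_CHARS = 1000


def stt_cost_won(seconds: float) -> int:
    """Estimated STT cost, ASSUMING usage is rounded up to whole 15-second units."""
    return math.ceil(seconds / STT_UNIT_SECONDS) * STT_WON_PER_UNIT


def tts_cost_won(chars: int) -> int:
    """Estimated TTS cost, ASSUMING usage is rounded up to whole 1,000-character units."""
    return math.ceil(chars / TTS_UNIT_CHARS) * TTS_WON_PER_UNIT
```

Under these assumptions, one minute of speech recognition costs `stt_cost_won(60)`, which is 16 Won, and synthesizing 2,500 characters costs `tts_cost_won(2500)`, which is 12 Won.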