It looks as though Google has made offline speech recognition available from Google Now for third-party apps. It is being used by the app named Utter.
Has anyone seen any implementations of how to do simple voice commands with this offline speech rec? Do you just use the regular SpeechRecognizer API and it works automatically?
This question is related to
android
speech-recognition
offline
speech-to-text
google-now
I would like to improve the guide that the answer https://stackoverflow.com/a/17674655/2987828 sends to its users, with images. It is the sentence "For those that it doesn't, this is the ‘guide’ I supply them with." that I want to improve.
The user should click on the four buttons highlighted in blue in these images:
Then the user can select any desired languages. When the download is done, he should disconnect from network, and then click on the "microphone" button of the keyboard.
It worked for me (android 4.1.2), then language recognition worked out of the box, without rebooting. I can now dictates instructions to the shell of Terminal Emulator ! And it is twice faster offline than online, on a padfone 2 from ASUS.
These images are licensed under cc by-sa 3.0 with attribution required to stackoverflow.com/a/21329845/2987828 ; you may hence add these images anywhere along with this attribution.
(This the standard policy of all images and texts at stackoverflow.com)
I successfully implemented my Speech-Service with offline capabilities by using onPartialResults when offline and onResults when online.
It is apparently possible to manually install offline voice recognition by downloading the files directly and installing them in the right locations manually. I guess this is just a way to bypass Google hardware requirements. However, personally I didn't have to reboot or anything, simply changing to UK and back again did it.
Working example is given below,
MyService.class
public class MyService extends Service implements SpeechDelegate, Speech.stopDueToDelay {
public static SpeechDelegate delegate;
@Override
public int onStartCommand(Intent intent, int flags, int startId) {
//TODO do something useful
try {
if (VERSION.SDK_INT >= VERSION_CODES.KITKAT) {
((AudioManager) Objects.requireNonNull(
getSystemService(Context.AUDIO_SERVICE))).setStreamMute(AudioManager.STREAM_SYSTEM, true);
}
} catch (Exception e) {
e.printStackTrace();
}
Speech.init(this);
delegate = this;
Speech.getInstance().setListener(this);
if (Speech.getInstance().isListening()) {
Speech.getInstance().stopListening();
} else {
System.setProperty("rx.unsafe-disable", "True");
RxPermissions.getInstance(this).request(permission.RECORD_AUDIO).subscribe(granted -> {
if (granted) { // Always true pre-M
try {
Speech.getInstance().stopTextToSpeech();
Speech.getInstance().startListening(null, this);
} catch (SpeechRecognitionNotAvailable exc) {
//showSpeechNotSupportedDialog();
} catch (GoogleVoiceTypingDisabledException exc) {
//showEnableGoogleVoiceTyping();
}
} else {
Toast.makeText(this, R.string.permission_required, Toast.LENGTH_LONG).show();
}
});
}
return Service.START_STICKY;
}
@Override
public IBinder onBind(Intent intent) {
//TODO for communication return IBinder implementation
return null;
}
@Override
public void onStartOfSpeech() {
}
@Override
public void onSpeechRmsChanged(float value) {
}
@Override
public void onSpeechPartialResults(List<String> results) {
for (String partial : results) {
Log.d("Result", partial+"");
}
}
@Override
public void onSpeechResult(String result) {
Log.d("Result", result+"");
if (!TextUtils.isEmpty(result)) {
Toast.makeText(this, result, Toast.LENGTH_SHORT).show();
}
}
@Override
public void onSpecifiedCommandPronounced(String event) {
try {
if (VERSION.SDK_INT >= VERSION_CODES.KITKAT) {
((AudioManager) Objects.requireNonNull(
getSystemService(Context.AUDIO_SERVICE))).setStreamMute(AudioManager.STREAM_SYSTEM, true);
}
} catch (Exception e) {
e.printStackTrace();
}
if (Speech.getInstance().isListening()) {
Speech.getInstance().stopListening();
} else {
RxPermissions.getInstance(this).request(permission.RECORD_AUDIO).subscribe(granted -> {
if (granted) { // Always true pre-M
try {
Speech.getInstance().stopTextToSpeech();
Speech.getInstance().startListening(null, this);
} catch (SpeechRecognitionNotAvailable exc) {
//showSpeechNotSupportedDialog();
} catch (GoogleVoiceTypingDisabledException exc) {
//showEnableGoogleVoiceTyping();
}
} else {
Toast.makeText(this, R.string.permission_required, Toast.LENGTH_LONG).show();
}
});
}
}
@Override
public void onTaskRemoved(Intent rootIntent) {
//Restarting the service if it is removed.
PendingIntent service =
PendingIntent.getService(getApplicationContext(), new Random().nextInt(),
new Intent(getApplicationContext(), MyService.class), PendingIntent.FLAG_ONE_SHOT);
AlarmManager alarmManager = (AlarmManager) getSystemService(Context.ALARM_SERVICE);
assert alarmManager != null;
alarmManager.set(AlarmManager.ELAPSED_REALTIME_WAKEUP, 1000, service);
super.onTaskRemoved(rootIntent);
}
}
For more details,
Hope this will help someone in future.
I was dealing with this and I noticed that you need to install the offline package for your Language. My language setting was "Español (Estados Unidos)" but there is not offline package for that language, so when I turned off all network connectivity I was getting an alert from RecognizerIntent saying that can't reach Google, then I change the language to "English (US)" (because I already have the offline package) and launched the RecognizerIntent it just worked out.
Keys: Language setting == Offline Voice Recognizer Package
In short, I don't have the implementation, but the explanation.
Google did not make offline speech recognition available to third party apps. Offline recognition is only accessable via the keyboard. Ben Randall (the developer of utter!) explains his workaround in an article at Android Police:
I had implemented my own keyboard and was switching between Google Voice Typing and the users default keyboard with an invisible edit text field and transparent Activity to get the input. Dirty hack!
This was the only way to do it, as offline Voice Typing could only be triggered by an IME or a system application (that was my root hack) . The other type of recognition API … didn't trigger it and just failed with a server error. … A lot of work wasted for me on the workaround! But at least I was ready for the implementation...
From Utter! Claims To Be The First Non-IME App To Utilize Offline Voice Recognition In Jelly Bean
A simple and flexible offline recognition on Android is implemented by CMUSphinx, an open source speech recognition toolkit. It works purely offline, fast and configurable It can listen continuously for keyword, for example.
You can find latest code and tutorial here.
Update in 2019: Time goes fast, CMUSphinx is not that accurate anymore. I recommend to try Kaldi toolkit instead. The demo is here.
Source: Stackoverflow.com