Speech Input

People love their mobile phones because they can stay in touch wherever they are. That means not just talking, but e-mailing, texting, microblogging, and so on.

Speech input adds another dimension to staying in touch. Google's Voice Search application, which is pre-installed on many Android devices and available on Google Play, provides powerful features like "search by voice" and Voice Actions like "Navigate to." Further enhancing the voice experience, Android 2.1 introduces a voice-enabled keyboard, which makes it even easier to stay connected. Now you can dictate your message instead of typing it. Just tap the new microphone button on the keyboard, and you can speak in just about any context in which you would normally type.

We believe speech can fundamentally change the mobile experience, and we invite every Android application developer to consider integrating speech input capabilities via the Android SDK. One of our favorite apps on Google Play that integrates speech input is Handcent SMS, because you can dictate a reply to any SMS with a quick tap on the SMS popup window.

[Screenshot: speech input integrated into Handcent SMS]

The Android SDK makes it easy to integrate speech input directly into your own application. Just copy and paste from this sample application to get started. The sample application first verifies that the target device is able to recognize speech input:

// Check to see if a recognition activity is present on the device
PackageManager pm = getPackageManager();
List<ResolveInfo> activities = pm.queryIntentActivities(
  new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH), 0);
if (!activities.isEmpty()) {
  // At least one activity can handle the request, so enable the button
  speakButton.setOnClickListener(this);
} else {
  // No recognizer is installed: disable the button and say why
  speakButton.setEnabled(false);
  speakButton.setText("Recognizer not present");
}

The sample application then calls startActivityForResult() with an intent that requests voice recognition, including an extra parameter that specifies one of two language models. Note that this starts an activity rather than sending a broadcast. The voice recognition application that handles the intent processes the voice input, then passes the recognized text back to your application, which receives it in the onActivityResult() callback.
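As a rough sketch of the result-handling step: the recognizer's reply intent carries its candidate transcriptions as an ArrayList<String> under RecognizerIntent.EXTRA_RESULTS, generally ordered from most to least likely. The helper below is hypothetical (the class and method names are ours for illustration, not part of the SDK) and is written as plain Java so it can be tried off-device. It simply picks the first usable candidate.

```java
import java.util.List;

/** Hypothetical helper; not part of the Android SDK. */
final class RecognitionResults {
    private RecognitionResults() {}

    /**
     * Returns the first non-blank candidate, trimmed, or null if none exists.
     * Assumes the list is ordered from most to least likely, as the
     * recognizer generally returns it.
     */
    static String bestMatch(List<String> candidates) {
        if (candidates == null) {
            return null; // the recognizer returned no results
        }
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate.trim();
            }
        }
        return null; // every candidate was null or blank
    }
}
```

Inside onActivityResult(), once the request code matches and the result code is RESULT_OK, you might call RecognitionResults.bestMatch(data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS)). An app that wants the user to choose among alternatives would show the whole list instead.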

Android is an open platform, so your application can potentially make use of any speech recognition service on the device that's registered to receive a RecognizerIntent. Google's Voice Search application, which is pre-installed on many Android devices, responds to a RecognizerIntent by displaying the "Speak now" dialog and streaming audio to Google's servers. These are the same servers used when a user taps the microphone button on the search widget or the voice-enabled keyboard. You can check whether Voice Search is installed in Settings > Applications > Manage applications.

One important tip: for speech input to be as accurate as possible, it's helpful to have an idea of what words are likely to be spoken. While a message like "Mom, I'm writing you this message with my voice!" might be appropriate for an email or SMS message, you're probably more likely to say something like "weather in Mountain View" if you're using Google Search. You can make sure your users have the best experience possible by requesting the appropriate language model: free_form for dictation, or web_search for shorter, search-like phrases. We developed the "free form" model to improve dictation accuracy for the voice keyboard, while the "web search" model is used when users want to search by voice.
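One way to keep the model choice in one place is sketched below. The strings "free_form" and "web_search" are the values of RecognizerIntent.LANGUAGE_MODEL_FREE_FORM and RecognizerIntent.LANGUAGE_MODEL_WEB_SEARCH; the class, the InputKind enum, and its members are hypothetical names chosen for this illustration.

```java
/** Hypothetical helper that maps the kind of input field to a language model. */
final class LanguageModels {
    private LanguageModels() {}

    // Same values as RecognizerIntent.LANGUAGE_MODEL_FREE_FORM / _WEB_SEARCH.
    static final String FREE_FORM = "free_form";
    static final String WEB_SEARCH = "web_search";

    /** Illustrative categories of text input; an assumption, not an SDK type. */
    enum InputKind { MESSAGE, EMAIL, SEARCH_BOX }

    /** Short, search-like phrases get web_search; dictation gets free_form. */
    static String modelFor(InputKind kind) {
        switch (kind) {
            case SEARCH_BOX:
                return WEB_SEARCH;
            case MESSAGE:
            case EMAIL:
            default:
                return FREE_FORM;
        }
    }
}
```

When building the recognition intent, the result would be passed along with intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, LanguageModels.modelFor(kind)).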

Google's servers support many languages for voice input, with more arriving regularly. You can use the ACTION_GET_LANGUAGE_DETAILS broadcast intent to query for the list of supported languages. The web search model is available for all languages, while the free-form model may not be optimized for all languages. As we work hard to support more models in more languages, and to improve the accuracy of the speech recognition technology we use in our products, Android developers who integrate speech capabilities directly into their applications can reap the benefits as well.
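To illustrate what you might do with the answer: when the ordered broadcast completes, its result extras include RecognizerIntent.EXTRA_SUPPORTED_LANGUAGES, a list of language tags such as "en-US". The plain-Java sketch below assumes you have already pulled that list out of the result extras. The class and method names are hypothetical, and the fallback rule (same primary language, first match wins) is only one possible policy.

```java
import java.util.List;

/** Hypothetical helper; not part of the Android SDK. */
final class SupportedLanguages {
    private SupportedLanguages() {}

    /**
     * Picks a tag from the supported list: an exact (case-insensitive) match
     * if there is one, otherwise the first tag sharing the same primary
     * language, otherwise null.
     */
    static String chooseLanguage(List<String> supported, String preferred) {
        if (supported == null || preferred == null) {
            return null;
        }
        for (String tag : supported) {
            if (tag != null && tag.equalsIgnoreCase(preferred)) {
                return tag; // exact match, e.g. "en-GB"
            }
        }
        String primary = primarySubtag(preferred);
        for (String tag : supported) {
            if (tag != null && primarySubtag(tag).equalsIgnoreCase(primary)) {
                return tag; // same language, different region
            }
        }
        return null; // nothing suitable; let the recognizer use its default
    }

    /** Returns the part of the tag before the first '-', e.g. "en" for "en-US". */
    private static String primarySubtag(String tag) {
        int dash = tag.indexOf('-');
        return dash < 0 ? tag : tag.substring(0, dash);
    }
}
```

If a tag is returned, it could be supplied to the recognizer through RecognizerIntent.EXTRA_LANGUAGE. If the result is null, omitting that extra leaves the choice to the recognizer.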
