These samples show how to use the Google Cloud Speech API to transcribe audio files, as well as live audio from your computer's microphone.
If you have not already done so, enable the Google Cloud Speech API for your project.
These samples use service accounts for authentication.
-
Visit the Cloud Console, and navigate to:
API Manager > Credentials > Create credentials > Service account key > New service account.
-
Create a new service account, and download the JSON credentials file.
-
Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to your downloaded service account credentials:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/credentials-key.json
If you do not do this, the streaming sample will hang silently instead of reporting an error.
See the Cloud Platform Auth Guide for more information.
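Because a missing credentials variable makes the streaming sample hang rather than fail, it can help to check the variable before running anything. The following is a minimal sketch, not part of the samples: `check_credentials` is a hypothetical helper that only verifies that the variable is set and points to a readable JSON file. It does not validate the key against Google Cloud.

```python
import json
import os


def check_credentials(env=os.environ):
    """Return a list of problems with GOOGLE_APPLICATION_CREDENTIALS (empty if none).

    Hypothetical helper for illustration; it performs local checks only.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        return ["no file at {}".format(path)]
    try:
        with open(path) as f:
            json.load(f)  # a service account key file is expected to be JSON
    except ValueError:
        return ["{} is not valid JSON".format(path)]
    return []


if __name__ == "__main__":
    for problem in check_credentials():
        print("Credential problem:", problem)
```

Running this before `transcribe_streaming.py` prints nothing when the local checks pass.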
-
Clone this repo:
git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
cd python-docs-samples/speech/api
-
Create a virtualenv. This isolates the Python dependencies you're about to install, minimizing conflicts with libraries you already have installed.
virtualenv env
source env/bin/activate
-
Install PortAudio. The transcribe_streaming.py sample uses the PyAudio library to stream audio from your computer's microphone. PyAudio depends on PortAudio for cross-platform compatibility, and PortAudio is installed differently depending on the platform. For example:
-
For Mac OS X, you can use Homebrew:
brew install portaudio
-
For Debian / Ubuntu Linux:
apt-get install portaudio19-dev python-all-dev
-
Windows may work without installing PortAudio explicitly (it gets installed with PyAudio when you run python -m pip install ... below).
-
For more details, see the PyAudio installation page.
-
-
Install the Python dependencies:
pip install -r requirements.txt
If you see the error
fatal error: 'portaudio.h' file not found
try adding the following to your ~/.pydistutils.cfg file,
substituting in your appropriate brew Cellar directory:
include_dirs=/usr/local/Cellar/portaudio/19.20140130/include/
library_dirs=/usr/local/$USER/homebrew/Cellar/portaudio/19.20140130/lib/
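To find the directory to substitute, you can search your Homebrew Cellar for the header. The sketch below is an illustration only: `find_header_dirs` is a hypothetical helper, and the two Cellar roots it searches are assumptions about common Homebrew locations that may not match your machine.

```python
import os


def find_header_dirs(roots, header="portaudio.h"):
    """Return the sorted directories under any of `roots` that contain `header`."""
    found = []
    for root in roots:
        # os.walk yields nothing for a root that does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            if header in filenames:
                found.append(dirpath)
    return sorted(found)


if __name__ == "__main__":
    # Assumed Homebrew locations; adjust for your installation.
    candidates = ["/usr/local/Cellar", "/opt/homebrew/Cellar"]
    for include_dir in find_header_dirs(candidates):
        print("include_dirs=" + include_dir)
```

Each printed directory is a candidate for include_dirs. The matching lib directory usually sits beside it, but confirm that on your system before using it for library_dirs.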
-
To run the transcribe_streaming.py sample:
python transcribe_streaming.py
The sample runs in a continuous loop, printing the data and metadata it receives from the Speech API, including alternative transcriptions of what it hears and a confidence score. Say "exit" to exit the loop.
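The loop's exit check can be pictured with a small sketch. This is not the sample's actual code. It assumes each alternative is a plain dict with hypothetical "transcript" and "confidence" keys, standing in for whatever response objects the Speech API client returns.

```python
import re


def heard_exit(alternatives, keyword="exit"):
    """Return True if any alternative transcript contains `keyword` as a whole word."""
    pattern = re.compile(r"\b{}\b".format(re.escape(keyword)), re.IGNORECASE)
    return any(pattern.search(alt.get("transcript", "")) for alt in alternatives)


def best_alternative(alternatives):
    """Return the alternative with the highest confidence, or None if there are none."""
    if not alternatives:
        return None
    return max(alternatives, key=lambda alt: alt.get("confidence", 0.0))


if __name__ == "__main__":
    # Hypothetical data shaped like the description above.
    heard = [
        {"transcript": "please exit now", "confidence": 0.91},
        {"transcript": "please exist now", "confidence": 0.42},
    ]
    print(best_alternative(heard)["transcript"])
    print(heard_exit(heard))
```

Matching on a whole word means a transcript such as "exiting" would not end the loop in this sketch. The real sample may use a different rule.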
-
To run the transcribe_async.py sample:
python transcribe_async.py gs://python-docs-samples-tests/speech/audio.flac
You should see a response with the transcription result.
-
To run the transcribe.py sample:
python transcribe.py gs://python-docs-samples-tests/speech/audio.flac
You should see a response with the transcription result.
-
Note that gs://python-docs-samples-tests/speech/audio.flac is the path to a sample audio file. You can transcribe your own audio files the same way by uploading them to Google Cloud Storage. (The gsutil tool is often used for this purpose.)
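A gs:// path names a bucket followed by an object within it. As a minimal sketch of that structure, the hypothetical helper below splits such a path using only the standard library. It is not part of the samples and does not contact Cloud Storage.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split a gs://bucket/object path into (bucket, object_name)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError("expected gs://bucket/object, got {!r}".format(uri))
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    bucket, name = split_gcs_uri("gs://python-docs-samples-tests/speech/audio.flac")
    print(bucket)  # python-docs-samples-tests
    print(name)    # speech/audio.flac
```

A check like this can catch a local file path passed by mistake before the request is sent.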
When you're done running the samples, you can exit your virtualenv:
deactivate
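If you are unsure whether a virtualenv is currently active, for example before installing dependencies or after deactivating, the interpreter can tell you. The sketch below uses a hypothetical helper named `in_virtualenv`. It relies on `sys.real_prefix`, which older virtualenv releases set, and on `sys.prefix` differing from `sys.base_prefix`, which is how venv and newer virtualenv releases behave.

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter appears to belong to a virtual environment."""
    # Older virtualenv releases add sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
```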