Google Cloud Speech gRPC API Samples

These samples show how to use the Google Cloud Speech API to transcribe audio files, as well as live audio from your computer's microphone.

Prerequisites

Enable the Speech API

If you have not already done so, enable the Google Cloud Speech API for your project.

Authentication

These samples use service accounts for authentication.

  • Visit the Cloud Console, and navigate to:

    API Manager > Credentials > Create credentials > Service account key > New service account.

  • Create a new service account, and download the json credentials file.

  • Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to your downloaded service account credentials:

    export GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/credentials-key.json
    

    If you skip this step, the streaming sample hangs silently instead of reporting an error.

See the Cloud Platform Auth Guide for more information.
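
Because a missing credentials variable makes the streaming sample hang rather than fail, it can help to check the variable before running anything. The following is a minimal sketch, not part of the samples: check_credentials is a hypothetical helper that only confirms the variable is set and points at a parseable JSON file. It does not verify that the key is valid for your project.

```python
import json
import os


def check_credentials(env=os.environ):
    """Return the credentials path if it looks usable, else raise ValueError.

    Hypothetical helper: checks only that the variable is set and that the
    file exists and parses as JSON, not that the key is accepted by the API.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise ValueError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise ValueError("credentials file not found: " + path)
    with open(path) as f:
        try:
            json.load(f)
        except ValueError:
            raise ValueError("credentials file is not valid JSON: " + path)
    return path
```

Calling check_credentials() with no arguments reads the real environment and raises a ValueError with a short explanation if something is off.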

Setup

  • Clone this repo

    git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
    cd python-docs-samples/speech/api
  • Create a virtualenv. This isolates the Python dependencies you're about to install, to minimize conflicts with any libraries you might already have.

    virtualenv env
    source env/bin/activate
  • Install PortAudio. The transcribe_streaming.py sample uses the PyAudio library to stream audio from your computer's microphone. PyAudio depends on PortAudio for cross-platform compatibility, and PortAudio is installed differently on each platform. For example:

    • For Mac OS X, you can use Homebrew:

      brew install portaudio
    • For Debian / Ubuntu Linux:

      apt-get install portaudio19-dev python-all-dev
    • Windows may work without installing PortAudio explicitly (it gets installed with PyAudio when you install the Python dependencies below).

    • For more details, see the PyAudio installation page.

  • Install the Python dependencies:

    pip install -r requirements.txt
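
After the setup steps above, a quick way to confirm that the install worked is to ask Python whether the modules can be found. This is a minimal sketch using only the standard library. missing_modules is a hypothetical helper, and pyaudio is the only import name this README confirms (it is the import name of PyAudio). The other entries in requirements.txt are not listed here, so add their import names yourself.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For example, missing_modules(["pyaudio"]) returns an empty list once PyAudio is installed in the active virtualenv, and ["pyaudio"] otherwise.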

Troubleshooting

PortAudio on OS X

If you see the error

fatal error: 'portaudio.h' file not found

try adding the following to your ~/.pydistutils.cfg file, substituting the Homebrew Cellar directory and PortAudio version installed on your machine:

[build_ext]
include_dirs=/usr/local/Cellar/portaudio/19.20140130/include/
library_dirs=/usr/local/Cellar/portaudio/19.20140130/lib/
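
The Cellar directory and version differ between machines, so the two paths above usually need adjusting. The sketch below is one way to find them, assuming PortAudio was installed under a directory you can search (for Homebrew, often /usr/local/Cellar, as in the example above). find_portaudio_prefix and distutils_lines are hypothetical helper names, not part of the samples.

```python
import os


def find_portaudio_prefix(search_roots):
    """Return the first directory that contains include/portaudio.h, or None."""
    for root in search_roots:
        for dirpath, dirnames, filenames in os.walk(root):
            if "portaudio.h" in filenames and os.path.basename(dirpath) == "include":
                # The install prefix is the parent of the include directory.
                return os.path.dirname(dirpath)
    return None


def distutils_lines(prefix):
    """Format the include_dirs and library_dirs settings for an install prefix."""
    return "include_dirs={0}/\nlibrary_dirs={1}/".format(
        os.path.join(prefix, "include"), os.path.join(prefix, "lib"))
```

Call find_portaudio_prefix(["/usr/local/Cellar"]) and, if the result is not None, pass it to distutils_lines to get the two lines to paste into ~/.pydistutils.cfg.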

Run the sample

  • To run the transcribe_streaming.py sample:

    python transcribe_streaming.py

    The sample runs in a continuous loop, printing the data and metadata it receives from the Speech API, including alternative transcriptions of what it hears and a confidence score. Say "exit" to exit the loop.

  • To run the transcribe_async.py sample:

    python transcribe_async.py gs://python-docs-samples-tests/speech/audio.flac

    You should see a response with the transcription result.

  • To run the transcribe.py sample:

    python transcribe.py gs://python-docs-samples-tests/speech/audio.flac

    You should see a response with the transcription result.

  • Note that gs://python-docs-samples-tests/speech/audio.flac is the path to a sample audio file. To transcribe your own audio files the same way, upload them to Google Cloud Storage first. (The gsutil tool is often used for this purpose.)
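
Both file-based samples take a gs:// path, which names a bucket followed by an object inside it. As an illustration of that structure, here is a minimal sketch that splits such a path into its two parts. parse_gcs_uri is a hypothetical helper, not a function from the samples or from any Google library.

```python
def parse_gcs_uri(uri):
    """Split 'gs://bucket/object' into (bucket, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError("expected a gs:// URI, got: " + uri)
    # Everything up to the first slash is the bucket; the rest is the object.
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError("URI must include a bucket and an object name: " + uri)
    return bucket, object_name
```

For the sample file above, this yields the bucket python-docs-samples-tests and the object name speech/audio.flac. A path for your own upload has the same shape.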

Deactivate virtualenv

When you're done running the samples, you can exit your virtualenv:

deactivate