DataCloudies: Transcribe

Transcribe - AWS Service

Amazon Transcribe is an automatic speech recognition (ASR) service that uses deep learning to convert audio files into text. It is useful for transcribing customer service calls, creating captions and subtitles, and generating metadata that makes media archives searchable.

It lets us add speech-to-text capability to applications with very little effort. By calling the Amazon Transcribe API, we can analyze audio files stored in Amazon S3 and have the service return a text file of the transcribed speech.

1. Store the audio file in an S3 bucket, then initialize the service client.

import boto3

transcribe = boto3.client('transcribe')

2. Create a Lambda function with a Python runtime and call the Transcribe API:

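import uuid

# bucket, key, logger and outfile_bucket are assumed to be defined earlier in the handler
jobname = uuid.uuid1()  # unique name for this transcription job
job_uri = "https://" + bucket + ".s3.amazonaws.com/" + key
logger.info(job_uri)

# Start an asynchronous job; the transcript JSON is written to outfile_bucket
transcribe.start_transcription_job(
    TranscriptionJobName=str(jobname),
    Media={'MediaFileUri': job_uri},
    MediaFormat='mp3',
    LanguageCode='en-US',
    OutputBucketName=outfile_bucket)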
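The snippet in step 2 leaves out where `bucket` and `key` come from. A minimal sketch of one way to wire it up, assuming the Lambda is triggered by an S3 upload event: the helper name `build_job_request` and the `OUTPUT_BUCKET` environment variable are hypothetical, the media format is guessed from the file extension, and the media location is passed as an `s3://` URI rather than the `https://` form used above.

```python
import os
import uuid
from urllib.parse import unquote_plus


def build_job_request(event, output_bucket):
    """Build the keyword arguments for start_transcription_job from an S3 event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys in S3 event notifications are URL-encoded (spaces arrive as '+')
    key = unquote_plus(record["object"]["key"])
    # Assumption: the file extension matches a format Transcribe accepts (mp3, wav, ...)
    media_format = key.rsplit(".", 1)[-1].lower()
    return {
        "TranscriptionJobName": str(uuid.uuid1()),
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "MediaFormat": media_format,
        "LanguageCode": "en-US",
        "OutputBucketName": output_bucket,
    }


def lambda_handler(event, context):
    # Imported here so build_job_request can be exercised without AWS access
    import boto3

    transcribe = boto3.client("transcribe")
    # OUTPUT_BUCKET is a hypothetical environment variable set on the function
    request = build_job_request(event, os.environ["OUTPUT_BUCKET"])
    response = transcribe.start_transcription_job(**request)
    return response["TranscriptionJob"]["TranscriptionJobName"]
```

Keeping the request-building logic in its own function means it can be checked locally with a hand-written event before the Lambda is deployed.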

3. Check the job status and verify the JSON output:

# The job name must be passed as a string, not a UUID object
transcribe.get_transcription_job(TranscriptionJobName=str(jobname))
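A single status call rarely finds the job finished, because transcription runs asynchronously. The following is a rough sketch of a polling loop around `get_transcription_job`; the function name `wait_for_job` and its `poll_seconds` and `max_attempts` parameters are illustrative choices, not part of the Transcribe API.

```python
import time


def wait_for_job(client, job_name, poll_seconds=5.0, max_attempts=60):
    """Poll get_transcription_job until the job completes, fails, or we give up."""
    for _ in range(max_attempts):
        response = client.get_transcription_job(TranscriptionJobName=job_name)
        job = response["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job
        if status == "FAILED":
            # FailureReason is only present when the job has failed
            raise RuntimeError(job.get("FailureReason", "transcription failed"))
        # Any other status (for example QUEUED or IN_PROGRESS): wait and retry
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_name} did not finish after {max_attempts} checks")
```

It could be called as `wait_for_job(transcribe, str(jobname))`. For long recordings, keep the Lambda timeout in mind, since a polling loop like this occupies the function for the full duration of the job.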

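Once the job status is COMPLETED, the transcript JSON written to the output bucket contains an entry like this for each recognized word:

{
    "start_time": "0.710",
    "end_time": "1.080",
    "alternatives": [
        {
            "confidence": 0.91,
            "word": "Hello"
        }
    ]
}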
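One way to use the per-word confidence is to flag words that may need a manual check. The sketch below works on a list of items shaped like the sample above; the function name `low_confidence_words` and the 0.8 threshold are arbitrary choices. It reads the text from either a `content` or a `word` key, since the key name in real transcript files may differ from this sample, and it skips entries without a `start_time` on the assumption that those are punctuation.

```python
def low_confidence_words(items, threshold=0.8):
    """Return (text, start_time, confidence) for words below the threshold."""
    flagged = []
    for item in items:
        if "start_time" not in item:
            # Assumption: entries without timing are punctuation, not spoken words
            continue
        best = item["alternatives"][0]
        # Confidence may be stored as a number or as a string
        confidence = float(best["confidence"])
        if confidence < threshold:
            text = best.get("content", best.get("word"))
            flagged.append((text, float(item["start_time"]), confidence))
    return flagged
```

With the sample item above, nothing is flagged at the default threshold because its confidence is 0.91.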
