Transcribe Audio to Text for FREE | Whisper AI Step-by-Step Tutorial

186.56k views, 1379 words

Jennifer Marie

Learn how to transcribe automatically and convert audio to text instantly using OpenAI's Whisper AI ...

Video Transcript:

Hello, everyone, and welcome back to my channel, Jennifer Marie, where I teach you different ways to make money online and how to become a work-from-home freelancer. So some of my most popular videos talk to you about transcription, how to transcribe audio to text, and in today's tutorial, I'm excited to show you how you can convert audio files or video files to text completely for free without any limit. We are going to be using something called Whisper, and Whisper is a machine learning model for speech recognition and transcription.

And it's created by OpenAI. OpenAI are also the creators of ChatGPT. This is completely free, and Whisper supports 99 languages, so you can convert audio or video files to text in 99 different languages using this method.

Now there is a way that you can install this on your computer. But I know a lot of you don't have really fast, powerful computers. So in this method, we will not be installing it on our computer.

Instead, we're going to use Google Colaboratory within our Google Drive account. And this method allows you to write and run code directly in your browser. So that way you could do this if you're on your computer, your friend's computer, at work, because you're not installing something on the computer itself.

Okay, so first of all, let's open Google Drive. All you need is your Gmail account to access Google Drive, and it's also free. Then you're going to click here on New.

Then go down and click More. Then click Connect More Apps. So now we have to search for the app that we want to install.

So click on Search apps, and type in Colaboratory. And you're going to click on the first one that pops up. And now just click Install.

Then click Continue. They may ask you to sign in with your Google account. So just click on your Google account, and it will be installed instantly.

So now just click Done and close the marketplace window. And now we have to open Google Colaboratory. So to open it, just click on New once again.

Click on More, and it will show up right here Google Colaboratory. So just click that. So I'm going to do a demo of how we can transcribe an audio file and a video file.

So first, we're going to transcribe an audio file. Double click where it says Untitled to rename the file, but keep the extension as it is and then press Enter. So now click on Runtime and click Change runtime type.

So we want to change the hardware accelerator from CPU to T4 GPU, then click Save. So now we need to install Whisper AI and FFmpeg to be able to work with both audio and video files. And remember, we are not installing this on our computer but instead in Google Colab.
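Before installing anything, it can be worth confirming that the T4 GPU runtime is actually active. The following is a minimal sketch, not code from the video: it assumes the `nvidia-smi` tool is on the PATH whenever a GPU runtime is attached, and the helper name `gpu_name` is made up for illustration.

```python
import shutil
import subprocess
from typing import Optional


def gpu_name() -> Optional[str]:
    """Return the first GPU line reported by nvidia-smi, or None if no GPU is visible."""
    # nvidia-smi is normally only on PATH when an NVIDIA driver is present.
    if shutil.which("nvidia-smi") is None:
        return None
    # "nvidia-smi -L" prints one line per GPU.
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return lines[0]


name = gpu_name()
print(name if name else "No GPU found: check Runtime > Change runtime type")
```

If this prints the "No GPU found" message, the runtime is probably still set to CPU, and transcription is likely to be much slower than the times shown in this tutorial.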

And this might seem complicated, but just follow the instructions and you'll see how easy it is. So in the description below, I have pasted this code. So go into the description below and copy and paste this exact code.

And you're going to paste it in this field right here. Then click Run Cell on this icon to run the code. And this will go ahead and install Whisper and FFmpeg.
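The exact code from the video description is not included in this transcript. As a rough sketch of what an install cell could look like, the Python below runs two installs: `openai-whisper` (the package name on PyPI) via pip, and FFmpeg via `apt-get`. Treating the Colab machine as a Debian-style system where `apt-get` works without `sudo` is an assumption, and the function names are hypothetical.

```python
import subprocess
import sys


def install_commands() -> list[list[str]]:
    """Commands that install Whisper and FFmpeg (a sketch, not the video's code)."""
    return [
        # Install the Whisper package into the notebook's own Python.
        [sys.executable, "-m", "pip", "install", "-U", "openai-whisper"],
        # FFmpeg lets Whisper read both audio and video containers.
        ["apt-get", "install", "-y", "ffmpeg"],
    ]


def run_install() -> None:
    """Run each install command and stop on the first failure."""
    for cmd in install_commands():
        subprocess.run(cmd, check=True)


# Uncomment inside a Colab cell to perform the install:
# run_install()
```

The install call is left commented out so the cell does nothing destructive if it is pasted somewhere other than Colab.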

And it should only take a few minutes. You can see here it took three minutes to install. So now we're ready to upload our file. On the left, click here on this folder icon.

And what you're going to do is drag and drop your audio or video file into this section here on the left. So this warning will pop up, basically telling you to save your files on your computer because the runtime's files will be deleted when this runtime is terminated. So once it's finished transcribing and you've finished your session on Google Colab, it will erase this audio or video file.

So now we want to get the text from this file. So click here on Code, and we're going to insert this code here. Again, I have pasted this code in the description below.

So paste that in here and then replace "your file name" with your exact file name, including the spaces and the extension. So in my case, it was Corporate-Sample.mp3, then click Run Cell.
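The transcription code itself is also only in the video description. A hedged sketch of the idea: Whisper ships a `whisper` command-line tool that takes the file name plus a `--model` option. The choice of `medium` here is an assumption, the second file name is a made-up example, and `build_whisper_command` is a hypothetical helper. It mainly shows why spaces in the file name matter.

```python
import shlex


def build_whisper_command(filename: str, model: str = "medium") -> list[str]:
    """Build the argument list for Whisper's command-line tool."""
    # Keeping the file name as one list item preserves any spaces in it.
    return ["whisper", filename, "--model", model]


cmd = build_whisper_command("Corporate-Sample.mp3")
# shlex.join shows how the command would need to be quoted in a shell.
print(shlex.join(cmd))
print(shlex.join(build_whisper_command("My Interview File.mp3")))
```

The second printed command wraps the file name in quotes, which is what a shell needs when the name contains spaces. Without the quotes, each word would be treated as a separate file.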

So that will begin extracting the text from the file. You can see it's automatically detecting that this file is in English. And right here it is transcribing it perfectly with punctuation, capitalization, and even with time stamps.
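For anyone curious where the detected language and the timestamps come from, here is a minimal sketch using Whisper's Python interface instead of the pasted command. `whisper.load_model` and `model.transcribe` are part of the `openai-whisper` package. The model size `base`, the helper names, and the sample segment are illustrative assumptions, not output from the video.

```python
def transcribe(path: str, model_name: str = "base") -> dict:
    """Transcribe a file with Whisper; needs openai-whisper and FFmpeg installed."""
    import whisper  # imported here so the helper below works without Whisper

    model = whisper.load_model(model_name)
    # The result dict includes "text", "language" and a list of "segments".
    return model.transcribe(path)


def format_segment(segment: dict) -> str:
    """Render one segment as '[MM:SS.mmm --> MM:SS.mmm] text'."""

    def clock(seconds: float) -> str:
        total_ms = round(seconds * 1000)
        minutes, ms = divmod(total_ms, 60_000)
        return f"{minutes:02d}:{ms // 1000:02d}.{ms % 1000:03d}"

    start, end = clock(segment["start"]), clock(segment["end"])
    return f"[{start} --> {end}] {segment['text'].strip()}"


# Made-up segment, only to show the layout of a timestamped line.
sample = {"start": 0.0, "end": 2.5, "text": " Hello, everyone."}
print(format_segment(sample))

# In Colab, after installing Whisper and uploading the file:
# result = transcribe("Corporate-Sample.mp3")
# print(result["language"])
# for seg in result["segments"]:
#     print(format_segment(seg))
```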

So in our first demo, this is around a two-minute file. So we're going to see how long it takes to transcribe a two-minute file. You can see it took 50 seconds.

So in order to download this transcript, just wait a few seconds, and you will have a few different options on the side here. So you can see here, there's a .srt file, which is your typical subtitle file that you can upload to YouTube, for example, and a .txt file.

If these haven't popped up for you, just click on the Refresh icon here. So to download any of these files, let's try the .txt one: just hover over it, and then click on the icon here and click Download.

And let's do the same for the subtitle file. And I'll show you what they look like. So this is the .txt file, and you can see it's done an amazing job.

This is perfect. There's punctuation; it's broken up the sentences correctly; it's even used hyphens correctly. And if we open up the .srt file, you can see it has done captions for us.
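To make the .srt format less mysterious: each caption is a number, a time range written as `HH:MM:SS,mmm --> HH:MM:SS,mmm`, and the caption text, with a blank line between captions. The sketch below builds that layout from timestamped segments. The function names and the sample segments are made up for illustration. Whisper writes this file for you, so this is only to show what is inside it.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Turn segments with 'start', 'end' and 'text' keys into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Joining with a newline leaves one blank line between captions.
    return "\n".join(blocks)


# Made-up segments, only to show the layout.
demo = [
    {"start": 0.0, "end": 2.5, "text": " Hello, everyone."},
    {"start": 2.5, "end": 5.0, "text": " Welcome back to my channel."},
]
print(segments_to_srt(demo))
```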

So we could go ahead and upload this to YouTube. So now I want to quickly show you what it's like when you upload a video file. And this video file is around 12 minutes long.

So once again, you're going to drag and drop your file over here on the left, and the file will start to upload. And once the file has finished uploading, you'll see it in the list here. So you can see types of sentences.

And once again, you're going to click on Code. And we're going to paste what we did before. Again, you can find this code in the description below.

So we have to replace your file name, and in this case, it's a really long file name. And I don't feel like typing it out. So I can actually rename it by hovering my mouse over the file and clicking on the three dots icon.

And then click Rename file. I'm going to rename this to Sentences, so it's easier. So now replace your file name with sentences.

And remember to put the extension. It won't work if you don't put the extension, so Sentences.mp4 in this case, and then click Run cell.
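Since the file name has to match exactly, extension included, one way to avoid typos is to list what was actually uploaded and copy the name from there. This is a small sketch: `/content` is Colab's default working folder, while the list of extensions and the function name are assumptions chosen for illustration.

```python
from pathlib import Path

# A few common audio and video extensions (illustrative, not exhaustive).
MEDIA_SUFFIXES = {".mp3", ".wav", ".m4a", ".mp4", ".mov"}


def list_media_files(folder: str = "/content") -> list[str]:
    """Return the names of audio and video files in a folder, sorted."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_SUFFIXES
    )


# Prints the exact names, extension included, ready to copy into the code.
print(list_media_files())
```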

So once again, it's going to begin transcribing. And it only took two minutes to transcribe this 12-minute file. And if you know anything about transcription, you know it takes a long time to manually type this out, especially considering it's added punctuation and capitalization and everything else.
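To put the two demos side by side, here is the quick arithmetic, using the approximate durations mentioned in the video: a roughly 2-minute file in 50 seconds, and a roughly 12-minute file in 2 minutes. The function name is hypothetical, and the results are only as precise as those rounded figures.

```python
def speed_factor(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are transcribed per second of processing."""
    return audio_seconds / processing_seconds


# Audio demo: about 2 minutes of audio, transcribed in 50 seconds.
print(speed_factor(2 * 60, 50))       # 2.4
# Video demo: about 12 minutes of video, transcribed in 2 minutes.
print(speed_factor(12 * 60, 2 * 60))  # 6.0
```

So in these two runs, the transcription ran at roughly 2.4 times and 6 times real time.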

So again, if you wait a few seconds, you can see here we can download the .txt file or the .srt file.

And it's done an amazing job so quickly. So you can go ahead and transcribe as many files as you like using this method. Now once you're done with your session and you close Google Drive, when you open it again to transcribe, you'll have to repeat this process once again.

So it does take around three minutes or so to install Whisper, but it's definitely worth it, considering how fast it transcribes. And if you were to do this manually, it could take you hours. So I hope you guys enjoyed this tutorial.

Make sure to subscribe to my channel for more videos like this one. If you have any questions, feel free to ask me in the comments section. I really hope you enjoy this.

Let me know if it works for you. And I'll see you guys in my next tutorial!