Write Better Video Transcriptions with Whisper AI

Introducing Whisper AI for Transcription

Hey team, Brian here. Today, I want to show you a video transcription method using something called Whisper AI. I’m sharing this because Whisper does such a good job with video transcriptions. I don’t know about you, but in the past, I spent a lot of time reformatting and making corrections to video transcriptions. With this method, I just didn’t have to do that. So let’s hop right in.

From LinkedIn to Blog

Basically, I started with a LinkedIn post. I posted a video about a major update to Google Workspace, and I thought this would be cool to put on my blog as well. So I hopped over to Google Colab.

Setting Up the Environment

To get started, I did two things. Number one, go to runtime and change the runtime type to GPU. That’s the first thing. Second, add this code and run it. You can pause the video to see what I typed in, or I will include it in the comments or description depending on what platform you’re currently using. Give this about 30 to 60 seconds to run.
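The setup code only appears on screen in the video, so it isn't reproduced here. As a rough sketch of what that install step could look like in Python, assuming it installs the open-source Whisper package under its PyPI name `openai-whisper` (an assumption; the video may have installed it another way), with function names made up for illustration:

```python
import subprocess
import sys


def build_install_command(package: str = "openai-whisper") -> list[str]:
    """Return a pip command that installs Whisper into the current runtime.

    Assumption: "openai-whisper" is the package being installed; the video
    does not spell out the exact install line.
    """
    return [sys.executable, "-m", "pip", "install", "--upgrade", package]


def install_whisper() -> None:
    """Run the install; in a fresh Colab runtime this can take a little while."""
    subprocess.run(build_install_command(), check=True)
```

Calling `install_whisper()` in a Colab cell runs the install. In a notebook, a single `!pip install -U openai-whisper` line should do the same job.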

Uploading the Video File

Once that’s done, what you’re going to do is drag your raw video file over here to the left and just drop it in. You can’t see mine right now because the Google Colab instance reset itself, and I don’t want to drag it in again because the video actually takes a while to load in; it took probably five to seven minutes.

Running the Transcription

This is where you’re going to drag your file. Once you’ve done that, you can reference it with this second bit of code. So you’re going to drop this code in and run it. Here you can see I’m calling Whisper AI. You can see I’m referencing the raw video file I dragged in, and then here I’m calling the transcription model I’m going to use. There are four or five different model sizes you can use, and I used medium. I learned this on YouTube, and that was the recommendation.
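The second bit of code is also only visible in the video. A minimal sketch of the same idea, assuming it calls the `whisper` command-line tool with the uploaded file and the `--model` option, might look like the following. The file name `my_video.mp4` and the function names are hypothetical, and the list of model sizes is my assumption about what "four or five different types" refers to.

```python
import subprocess

# Model sizes assumed to be available in the open-source Whisper release.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def build_transcribe_command(video_path: str, model: str = "medium") -> list[str]:
    """Build the command line: whisper <file> --model <size>."""
    if model not in MODEL_SIZES:
        raise ValueError(f"Unknown model size: {model!r}")
    return ["whisper", video_path, "--model", model]


def transcribe(video_path: str, model: str = "medium") -> None:
    """Run Whisper on the uploaded file; requires Whisper to be installed."""
    subprocess.run(build_transcribe_command(video_path, model), check=True)
```

For example, `transcribe("my_video.mp4")` would run the medium model on a file with that name in the Colab file panel. Smaller sizes generally run faster and larger ones tend to be more accurate, which is likely why medium was the recommended middle ground.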

Getting the Transcription Results

Now, after you run that, you will see your video transcription appear after a short time. You’re also going to see some files appear on this left side below your raw video file name. The one I think you’re going to be most interested in is the text file that’s just got the text and doesn’t have any of these timestamps, so you won’t have to clean those up.
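If you would rather grab that text file from code than click on it, here is a small sketch. It assumes the output files are named after the video's base name and written to the working directory unless an output folder is given; the function names and `my_video.mp4` are made up for illustration.

```python
from pathlib import Path


def transcript_text_path(video_path: str, output_dir: str = ".") -> Path:
    """Return where the plain-text transcript is expected to be.

    Assumption: the transcript sits in output_dir and shares the video's
    base name, with a .txt extension.
    """
    return Path(output_dir) / (Path(video_path).stem + ".txt")


def read_transcript(video_path: str, output_dir: str = ".") -> str:
    """Read the timestamp-free transcript text for the given video."""
    return transcript_text_path(video_path, output_dir).read_text(encoding="utf-8")
```

With a video called `my_video.mp4`, `read_transcript("my_video.mp4")` would look for `my_video.txt` next to it and return the text, ready to copy into the next step.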

Final Formatting with ChatGPT

As you can see, this is kind of like sentence by sentence; it’s not formatted, and I wanted to put this on my blog. So next I grabbed this and went over to ChatGPT. This prompt actually took me a few tries. Let me share this with you. I asked it to reformat this text in the following ways: group sentences into paragraphs, add relevant subheadings, and don’t reword anything. That last one to me was the most important because ChatGPT often takes liberties if you don’t tell it not to do that. After I entered that in with the transcript, that’s exactly what I got. I got subheadings, I got paragraphs, and these are all my own words from the video.
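The exact prompt is only shown on screen, so what follows is a reconstruction from the description above, not the original wording. As a sketch, the three instructions and the transcript could be combined into one prompt like this (the function name is hypothetical):

```python
# The three instructions described in the video, paraphrased.
INSTRUCTIONS = (
    "Group sentences into paragraphs.",
    "Add relevant subheadings.",
    "Don't reword anything.",
)


def build_reformat_prompt(transcript: str) -> str:
    """Combine the formatting instructions and the raw transcript into one prompt."""
    bullet_list = "\n".join(f"- {item}" for item in INSTRUCTIONS)
    return (
        "Reformat this text in the following ways:\n"
        f"{bullet_list}\n\n"
        f"{transcript.strip()}"
    )
```

Printing `build_reformat_prompt(text)` gives you something to paste into ChatGPT along with the transcript. Keeping "Don't reword anything" as its own line makes it harder to lose when you tweak the other instructions.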

Conclusion and Encouragement

Then I hopped over to my blog and added it in. Whisper did a great job here, and if you have been spending a lot of time making corrections to your video transcriptions, I suggest this process. It is a bit technical, and some of the steps did take a while, like waiting for the file to upload here, but it’s really not that bad. I found it to be a little bit easier than some of the other methods I’ve used in the past, and I wanted to share it with you today. So I hope you found that interesting, and if you enjoyed the video, give me a like, give me a follow, and I’ll see you in the next one.