In this video, I'll show you how I solved the problem of my endless YouTube watch list and keeping up with the latest industry news. Imagine being able to get all the key insights from your favorite YouTube channels without spending hours watching those videos. Sounds amazing, right?
But first, for those new to the channel, my name is Nadia. I have been a software developer my whole career, and last year I started my AI automation agency, where we help service businesses create automations and free up time for their clients instead of spending it on boring, repetitive tasks.
Okay, enough about me. Let's jump into building. This automation consists of three key parts.
First is gathering the YouTube channels we want to monitor and saving them into our database. Second is the flow that requests a transcription and saves the transcript to our database. And third is our transcription API.
I'll also be dropping some insider tips along the way and sharing common pitfalls that beginners spend hours and days solving, so watch the whole video so you don't miss them. I'll be using a mix of no-code and code solutions, including Zapier, Make.com, Python, and our friend ChatGPT.
The best part is that it all happens automatically. Let's get started. This is our database. Here we have links to our videos, the channel we monitor, and the date the video was posted, and the transcript and summary will be added here by our automation.
I also have to mention that those links are added here by another automation, which monitors our YouTube channels. But first, let's run our automation and see how the transcript and summary appear here. For that, we'll go to Make.com, and I'll also show you how to create this part and the server, which sits right here inside the HTTP module. Let's run this once and wait.
Meanwhile, I'll show you what happens behind the scenes. The first module fetches all the records from our YouTube feed table, and then a filter keeps only the records that don't have transcripts yet, so we don't transcribe the same video over and over again. Currently, I run this automation just once a day.
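In code terms, that filter boils down to something like this (a sketch of the logic, not the Make.com module itself; the `Transcript` field name matches my Airtable column, so adjust it to yours):

```python
def needs_transcription(records: list) -> list:
    """Keep only rows whose Transcript field is still empty,
    so the same video is never transcribed twice across daily runs."""
    return [r for r in records if not r.get("Transcript")]
```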
Once we have the videos that need processing, we send their URLs to our HTTP module, which calls the server I'll show you how to create later in this video, quickly and without even learning how to code. But wait until we get there. For now, let's wait until this module completes.
Meanwhile, if we monitor the logs on our server, we'll see that it logs the transcript from the video. Behind the scenes, ChatGPT then helps us format it, because if we take a closer look right now, you can notice that there is no punctuation: it's not clear where one sentence starts and where it ends. So we ask ChatGPT to reformat the text before we save the transcript.
This will take quite some time, because the video we are transcribing is 40 minutes long, and ChatGPT needs a while to output all those tokens. It reads the whole text quite quickly, but outputting it takes much longer.
Okay, that one has finished and the second transcription has started. And now, boom, we have the transcript and the summary right here in our table. Let's take a look: this video is over 40 minutes long, and the whole transcript is here. And then, just look at this.
A summary with key takeaways from the video and a conclusion. Oh, I wish I'd had this a year ago. By the way, if you want to learn sales, I highly recommend watching the videos on this channel. I really like how this guy presents information, and he shares a lot of useful material. Okay, anyway, let's get to building.
The first challenge I faced was how to gather all those YouTube videos and put them into one table. It may not sound complicated, but it turns out you have to monitor each channel separately; for example, in my automation I monitor 10 channels. How would you approach this?
Usually, if you want to use Make.com, you would start by monitoring YouTube channels. You see, there is a YouTube module that watches videos in a channel. But the thing is, it only watches one channel, so if you want to monitor, say, 10 channels, you would need to create the same automation for each of them.
It is absolutely not scalable. One way to solve it is to create your own RSS feed. You can do it in this RSS app.
It will cost you some money. The second option, which I use in this video, is Zapier. In Zapier, I currently have this RSS monitor. It's the most basic automation that can exist. Let's take a look inside.
What we have here is RSS by Zapier, and we monitor new items in our channels. All I did here was put in the links to the channels I want to monitor.
They say they allow only 10 channels; I haven't tested more than that, but even 10 channels is a good starting point. It runs every 15 minutes and then adds the new videos it found to our database, which I have in Airtable.
We basically just copy the link from the RSS feed into the Link field, the author name into Channel, and the publication date into Date. That is how we build the list of videos we want to transcribe later.
Let's take a look at how many operations it consumed. I'm on the free plan, since I don't use Zapier much these days; I just haven't found this RSS monitor anywhere else, which is why I use it here. After a whole day it consumed only five tasks, one for each of the five videos now in my database. First challenge completed.
We have a list of videos inside our database. Our next challenge is to create transcripts. There are two ways to approach transcribing those videos.
The first would be to extract the audio from the video. It may not sound complicated, but when I started doing it, it turned out I needed to run a piece of code to extract the audio. There are some tools that can do that.
But if you want to place this inside an automation, you still have to run some code, which can be challenging, especially because Make.com doesn't let you run arbitrary code right inside Make, so you need other solutions for that. The other way is to use YouTube's native features and just grab the transcript that YouTube already provides. In that case, though, the transcript has no punctuation and is not very human-friendly.
To make it look better, we then run it through ChatGPT and ask it to reformat the transcript. In this video, we explore this second way. But again, it turned out that to get a transcript from YouTube, I had to run some code, and I tried different ways to do it.
I tried to find applications that would let me run that code, but it turned out there aren't many. So the better option for me was simply to write the code and create my own server.
So let's take a look at how to do that. I found a library on GitHub called youtube-transcript-api. Even though it's called an API, you still have to install the package and run Python code. It lets you extract the transcript from any video. Now our new challenge is to create our own API and run this code inside it.
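Here is a minimal sketch of how the library can be called; `get_transcript` is the entry point from the releases I've used, so check the repo's README for the current interface, and `video_id_from_url` is my own helper:

```python
from urllib.parse import urlparse, parse_qs

def video_id_from_url(url: str) -> str:
    """Pull the video ID out of the common YouTube URL shapes."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        return parsed.path.lstrip("/")                 # https://youtu.be/<id>
    query = parse_qs(parsed.query)
    if "v" in query:
        return query["v"][0]                           # .../watch?v=<id>
    return parsed.path.rstrip("/").split("/")[-1]      # .../shorts/<id>, .../embed/<id>

def fetch_transcript(url: str) -> str:
    # Imported here so the helper above stays usable without the package.
    from youtube_transcript_api import YouTubeTranscriptApi
    segments = YouTubeTranscriptApi.get_transcript(video_id_from_url(url))
    # Each segment is a dict with "text", "start", and "duration".
    return " ".join(segment["text"] for segment in segments)
```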
I don't really like writing code from scratch, so I went to my old friend Claude and asked it to create a Python server with one endpoint that takes a YouTube URL and does the following. The code snippet I included came from the library's description after I found it.
What it returned was almost fully functioning code. I copied this code and first tried to run it on Replit, an online coding platform that also lets you host your code. But then I decided to go the proven way: run all the code on my local machine first and then deploy it to another server.
If you know a little bit of coding, you know what Visual Studio Code is. I basically copied what Claude gave me and pasted it into Visual Studio Code. If I scroll this conversation down, you'll see that I asked it to use the OpenAI library instead, because the initial code it gave me didn't work well. From there I just iterated: I asked Claude to modify the code a little and then to target Visual Studio Code instead.
So instead of Replit, I asked it to write the code for Visual Studio Code, and then I wanted to deploy to Render; we'll get back to that later. It gave me a step-by-step blueprint I could execute, and it actually worked well.
I can confirm that the advice Claude gives is enough for you to create very basic APIs. After that, I copied the whole code and ran it locally.
I checked that everything worked and deployed it to Render. Render is my go-to platform for deploying applications, and it has a free tier. On the free tier you get slightly slower responses,
but overall it's fine for a test project like this one, and even the paid tier doesn't cost much. So if you're a beginner, ask Claude to help you with this. In my case, I have the code in Visual Studio Code; I tested it, published it to GitHub, and from GitHub it automatically deploys to Render.
Each time I push new changes, it deploys the new version of my solution, and it works like a charm. Now I have my API at /transcribe. It's a POST method: we receive the URL of the video and extract the YouTube video ID from it. After that, we transcribe the video with the library I showed you before, and then we improve the output with the help of GPT-4o mini. GPT-4o mini is very cheap, so don't worry about the cost at all.
And don't worry about writing this code yourself: in the description, you'll find a link to the GitHub repo with this code. Now that we have our own server, it's hosted at this URL.
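To give you an idea of the shape of that endpoint, here is a simplified sketch of its core logic; it's not the actual repo code, and `fetch_transcript` and `reformat` stand in for the library call and the ChatGPT call so the flow is visible (and testable) without the web-framework wiring:

```python
import json

def handle_transcribe(body: str, fetch_transcript, reformat):
    """Core logic of a hypothetical POST /transcribe handler.
    Returns an (http_status, response_dict) pair."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, {"error": "body must be JSON"}
    url = payload.get("url")
    if not url:
        return 400, {"error": "missing 'url' field"}
    raw = fetch_transcript(url)          # unpunctuated caption text
    return 200, {"transcript": reformat(raw)}
```

In the real server this function would be attached to a route in whatever framework Claude generated for you, and `reformat` would call the OpenAI API.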
Just copy that link from your Render application into the URL field in Make.com and add /transcribe, the name of our endpoint. This is a POST method, and I also added some basic header authentication; I actually asked Claude to help me with that.
Then we need to send the actual URL. This is the parameter our API expects to get; here is how it looks. So this is the JSON with a url key and the actual URL of our video, and this is where we put our JSON. The first time I did it, I just copied the URL straight in here, thinking: I put my request here, this is our request content, it sounds like exactly what we need, right?
Okay. Now let's see what happens if we run this module. We get an error.
I have to say, when I first saw it, I couldn't understand what was going on and why I was getting this error. It turned out we have to add a JSON module. So pay attention to this.
You can't just put your request content here as-is. If your API expects JSON, then you have to send it actual JSON. Before any API call, put a JSON module and convert your data into JSON. Let's take a look at how to do that together.
We have a Create JSON module here. If you tested your API locally, just copy what you used there into the body. Now we add a new data structure: click Generate, paste your sample data, and you get your data structure ready to use. Just like here: url as the key and your link as the value.
Then this HTTP block will work as it should. After my last video, someone also pointed out that if you pick Yes for parsing the response here, the response is parsed automatically and you don't need yet another JSON module, so do that. Now I'll rerun the automation so you can see it in action again.
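For reference, the request Make.com ends up sending can be written out in Python. This is just to show the required shape of the call; the Render URL and the header token are placeholders:

```python
import json
import urllib.request

API_URL = "https://your-app.onrender.com/transcribe"  # placeholder Render URL

def build_request(video_url: str, token: str) -> urllib.request.Request:
    # The body must be real JSON, not a raw string pasted into the field,
    # which is exactly the mistake the Create JSON module fixes.
    body = json.dumps({"url": video_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": token,  # the basic header auth the server checks
        },
        method="POST",
    )
```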
This time it was a YouTube Short, so it didn't take long to process, and you can see we get our data here with the result. And this was the result: the transcription, already formatted with the help of ChatGPT.
Next, we want to create a summary of this transcript. I put another filter here: we only create summaries if the previous block ran successfully and didn't produce an error.
I first got this error when I tried to deploy my API onto the server. Remember the youtube-transcript-api?
It turns out this is not an official API, and YouTube doesn't like it when someone scrapes its content, so it blocks the IP addresses those requests come from. If you scrape YouTube transcripts only from your local environment, that's fine, but once you deploy to a server, your application may get blocked. Here is another tip I learned: use a proxy server. Lately, a lot of people have been facing the same issue, and they wrote that using a proxy helps you avoid being blocked by YouTube.
After that, I spent a couple of hours figuring out the best way to get a proxy address, and then I found one solution: this one. You can see I've used it a couple of times today. This is Webshare; they also have a free tier, which is what I'm currently on.
They provide you with different proxy addresses. In my application, I created a proxy environment variable, and you put your proxy address there; just make sure it's in the correct format. If you decide to copy this application, you'll need your OpenAI API key and your proxy address, and when you deploy to your server, you go to your environment settings and add those keys there.
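Inside the app, the proxy wiring boils down to reading the address from an environment variable and handing it to the transcript library. A sketch, assuming a `PROXY_URL` variable (the variable name is my choice, and the `proxies=` keyword is from the library versions I've used, so check the docs for your release):

```python
import os

def proxy_config():
    """Build a requests-style proxies mapping from the environment.
    Webshare addresses typically look like http://user:password@host:port."""
    proxy_url = os.environ.get("PROXY_URL")
    if not proxy_url:
        return None  # no proxy set, e.g. when running locally
    return {"http": proxy_url, "https": proxy_url}

# In the server this gets passed along to the library, e.g.:
# YouTubeTranscriptApi.get_transcript(video_id, proxies=proxy_config())
```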
This way it will work. Now back to our ChatGPT module: you can see I use GPT-4o mini to create a summary of the YouTube video. So instead of watching a video for 40 minutes, I can just read the summarized version and decide whether I want to watch the whole thing, or just take the key takeaways and move on, because I'd never remember everything in it anyway.
Writing the prompt here is not complicated. It's a basic summarization prompt, so you can adjust it to your own needs depending on what you're looking for in those videos. I'm looking for the takeaways and lessons from the video.
That is what I care about. Then you add another message here with the user role, and there you put the transcript from the previous step. Max tokens is set to zero, so the output isn't cut short and we keep all the information.
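If you're building the same prompt in code rather than in Make.com, the two-message setup maps directly onto the OpenAI chat API. A sketch, with the system prompt as my paraphrase rather than the exact text from the video:

```python
def build_summary_messages(transcript: str) -> list:
    """Assemble the system + user messages for the summarization call."""
    system_prompt = (
        "Summarize the YouTube transcript below. "
        "Focus on the key takeaways and lessons."  # adjust to your needs
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": transcript},
    ]

def summarize(transcript: str) -> str:
    # Requires the openai package and OPENAI_API_KEY in the environment,
    # so the import lives here to keep the builder above dependency-free.
    from openai import OpenAI
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=build_summary_messages(transcript),
    )
    return response.choices[0].message.content
```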
And that's basically it. After we get the summary, we save everything back into the same Airtable table, using the record ID we got in the first step, the ID of the record we just transcribed. We put the transcript from our HTTP module and the summary from our OpenAI module. This is how our automation works.
We now have a transcript, a summary, and an automated way to transcribe all the videos and get all the summaries. Of course, you can take it one step further and create another automation that runs after this one.
Once the transcripts and summaries are populated, that automation could send all those summaries to your email list. But for me, it's fine just like this. I know many of you don't like the idea of writing and deploying code, because it can be a huge pain.
I actually found another way to achieve exactly the same result, and in the next video I'll show you how to transcribe YouTube videos with no-code solutions only.