VEO 2 is now Publicly Available! Hands on Testing! (BEST AI Video)

MattVidPro AI
HUGE Thank you to Invideo AI for sponsoring today's video. Check em out here: https://invideo.io/i/m...
Video Transcript:
folks this is not what I was expecting to make a video about today but I'm pleasantly surprised Google actually gave us access to Veo 2 I know it's a shocker for those of you who aren't caught up Veo 2 is the best AI video generator on the planet and it has been for the past 2 months no one has been able to lap it however it's been nonpublic for those two months only a select few beta testers have gotten access but now finally it is public and this is the Google Veo 2 model this is not a fine-tuned
model for YouTube Shorts that's lighter weight and not as good this is the real deal and coming to pretty much everyone's surprise it's available on Freepik check out this announcement Veo 2 is here first on Freepik worldwide I mean I had to rub my eyes and do a double check here excuse me on Freepik Google partnered with us to debut the most advanced AI video model in the world unmatched realism precision and smooth animations what a surprise honestly I'm shocked because I would think that Google would debut Veo 2 on its own web
page it's pretty much what they've done with all of their other technology that they've wanted to make public they've made it available through some Google app or website but not this one no this one's going to debut on Freepik for some reason now this is going through Google's API it's not being hosted by Freepik and the servers are getting clobbered right now but hopefully we'll get some generations through today I really want to test this thing and play around with it I also assume that Freepik is just the first of a lot
of other AI video generator companies that are going to be debuting the Veo 2 API on their site I can imagine sites like Krea AI and Pollo AI might get access pretty soon obviously though all of these clips I mean they're downright insane in terms of the quality level Google Veo 2 beats everything on the market no questions asked now you'll see here that the first 10,000 users get two free generations this is already used up to the best of my knowledge and when I tried to cash out those free generations earlier I was met with a
nice little screen here saying that Veo 2 is at full capacity I'm also going to be showing you guys a free way you can use Google Veo 2 with a special code that I've just been given I just got a DM with a code so you guys can get some free credits for Veo 2 so that'll be in this video but before we talk about pricing and all that stuff let's actually try to generate a Google Veo 2 output in the creation tab under generate a custom video we can select Google Veo 2 as our model we can either do
16x9 wide or 9x16 just widescreen flipped the other way around for a phone or something like that and 5-second max duration for generated clips even now that it is public it's still quite limited in terms of what settings you're able to access first prompt scene from a 3D animated feature film 3D animated lemon character Pixar style is eating a cheeseburger in a restaurant and we'll click generate and on my plan I have 18 total generations available to me using Google Veo 2 yeah it's pretty expensive we'll go over pricing and credit comparison later in the video it's
not cheap though let's try the other aspect ratio of 16x9 for our next prompt a first-person POV video uploaded to the web recorded on an iPhone meeting a green alien in the middle of the woods on the East Coast we shake hands lovely send it through let's test it out on video game gameplay footage Minecraft gameplay footage building a snow golem in Minecraft that's something that you could probably see in 5 seconds so I'm going to send that through and you can see things are kind of just building up in a queue and hey our
first generation is now done oh there he is uh our 3D Pixar animated lemon character he looks all right he's definitely got a burger in a restaurant but he's sort of just talking he's not actually picking up and eating it very interesting indeed I mean it's very good quality it's very coherent it's definitely 3D animation and I like how consistent the hands and the character are it's just he's not eating the burger maybe if this was a 10-second generation he'd eventually pick it up and take a bite or something like that but as far as
a generation goes that didn't follow the prompt to a T this is pretty good quality overall oh looks like our alien video has also completed so we got the first-person POV uh with the iPhone okay I think it didn't really capture the idea that I wanted but it makes sense given the prompt this guy's literally just kind of fumbling with his phone trying to take a video of this alien yeah first-person POV video uploaded to the web recorded on an iPhone so it's trying to put the iPhone in the POV as well and it does
that pretty well this is not the first video generator I've seen to take a prompt like this and include the iPhone as a part of the prompt as well overall though I mean it does look pretty clear and it does look very much realistic we can even see the person's reflection in the iPhone if you look very closely you can see that it looks like there's a face or person there in the reflection now we have Minecraft gameplay footage oh my God it's decided to go with Minecraft Pocket Edition with the hand controls and everything
which is pretty crazy this is staggering it's definitely like a little snow golem it kind of looks cool but yeah there you can see it is trained on Plenty of YouTube data because it Nails the Minecraft interface the correct amount of slots it's got the Playstation controls down here uh that looks like some wheat some snow blocks that's supposed to be a bed but like even the UI is pretty darn close pretty crazy to see something like this you can tell it's trained on some insane data and the textures are so close before we dive
into some more Veo 2 generations I've got a quick word from today's sponsor huge thank you to you guys for putting up with sponsors they quite literally make this channel a full-time job for me they make that a reality today's video is brought to you by Invideo AI V3 this is the only AI video tool that makes you full-length ready-to-publish videos using nothing but a simple text prompt for a special demo here I'm going to showcase the hyper-realistic films workflow check this out I'm creating a three minute long generative film for YouTube
about Oppenheimer's legacy with Einstein this custom prompt is asking to tell the story of how these two legends first met and how Einstein supported Oppenheimer through his journey we also specifically request Invideo AI to use details from a particular article which we will link and check it out guys I think the results speak for themselves the pursuit of knowledge is noble Robert but we must always consider the consequences of our discoveries I didn't fully understand his warning then how could I the Manhattan Project consumed me day and night we raced against time and our own
consciences Einstein's letter to Roosevelt had set it all in motion yet he remained outside the [Music] project and on the plus side if there is something that we do want to change about this they have a really easy-to-use and seamless editing capability we can literally just use natural language to ask what we want to be changed in the video and the AI will go ahead and make those changes for you if you want to be extra precise though you can use the built-in editing framework to tweak any piece or portion of your project
if you're ready to take a dive into the cutting edge of content creation click the link down in the description below to check out Invideo AI V3 for yourself it's quite a surreal experience huge thanks to Invideo for sponsoring today's video now back to your regularly scheduled content welcome back folks let's get back into it okay I want to try to fix some of these generations since I know this is trained on a vast amount of YouTube data we're going to try this prompt POV I met a real alien in the woods his name is
glorp so we're just going to title it like a YouTube video and send that through and see how it does with a more natural or or YouTube title prompt maybe that will you know hit the right chords in the training data so to speak well now that we know it will try some copyrighted stuff especially this Minecraft gameplay let's see if we can get some other famous characters in here cinematic wedding ceremony Shrek and Bigfoot get married yes that is what I want to see tech reviewer YouTube video throwing my gaming computer into a volcano
that would definitely get a lot of views on YouTube Linus Tech Tips better get on that so for this next prompt I'd like to enhance it close-up macro video of a small mushroom village teeming with life enhance the prompt with AI obviously it's using some sort of LLM here to rewrite my prompt give it maybe a little bit more detail and life sure uh that's a very detailed prompt put that through whoops I actually put that one in twice first up we've got I met a real alien in the woods glorp there he is he
kind of looks disgruntled again it does just sort of look like a man in the suit kind of like how the other one looked like a statue it seems to be having a difficult time making like an alien that looks like it's from a movie or something like that but to be fair our prompt here is POV I met a real alien in the woods his name is glorp so honestly guys I wouldn't be able to tell that this isn't real video footage it looks straight up real to me there isn't really any telltale signs
it looks like a guy in a big alien costume standing in the middle of the woods there's not a ton of movement though to be fair so it's not really pushing the model as far as it can go next up we've got the beautiful wedding ceremony between Shrek and Bigfoot and uh this one definitely was having a little bit more of an issue here but it still looks pretty cool I really like the sweeping immediate growing plants across the circle which kind of makes this a beautiful romantic moment you know they're taking pictures of Shrek
and Bigfoot obviously the clear issue here is that we're getting subject confusion doesn't know which one to make Shrek it looks like Shrek is both of them but this is more Bigfoot than this so I'm assuming that the female here is Shrek and Bigfoot is just a Bigfoot with like a Shrek mask on it's like a a Bigfoot Shrek hybrid so there you go but the people look so realistic it's very interesting the way that the model is trying to combine different concepts together and mash them into one somewhat coherent video I like that they're
holding hands there and then the video cuts and then they're no longer holding hands and they're just taking photos and videos well this Shrek is is holding the flowers it definitely looks like a real wedding all right here is our close-up macro footage of a mushroom village and here it is okay very interesting the model seems to lean more towards realism especially in this scenario and probably some of the other scenarios we've seen so far tiny insects flit between the mushrooms their movements quick and lively yeah ladybug crawling up the stem okay so it's trying
to capture the miniature ecosystem that the prompt describes it's definitely not a mushroom Village though it sort of just decided to ignore that piece and make you know some mushrooms with ladybugs kind of crawling all over them and I mean this looks a little gross but it's also kind of cool it's very intriguing to me how well the model is able to keep all of these legs and movements of the legs consistent I mean there's a little bit of mushing there's a little bit of warping but overall you can at least while the video is
playing distinguish the individual legs and it's not super jarring and also you know lots of detail here plenty of detail on the mushroom super closeup macro shot I love the reflections on the ladybugs as well some of you might be grossed out uh with this and I'm sorry for that but it's not a half bad generation it's pretty impressive home VHS video Christmas opening a present to reveal a mini nuclear reactor inside which is glowing green and radiating oh lovely let's generate that here's the tech reviewer video all right definitely uh your average Tech reviewer
with the background and the computer but doesn't look like he's throwing the computer into a thing of lava it's interesting because as I'm generating these videos I'm learning more and more about how you should prompt Veo 2 it definitely doesn't want to be prompted like a YouTube video here I mean it definitely will make something that's like a tech reviewer YouTube video but it's not going to capture the part where it's you know actually throwing the gaming computer into a volcano you want to be very descriptive I think and very literal with your prompts I
want to retry that one wide angle view POV at the edge of an active volcano we see a man huck a brand new RGB gaming PC into the volcano yes all right that's a little bit more literal and descriptive cinematic movie footage tracking a woman with an umbrella in New York City she looks mildly displeased lemons are plummeting to the ground from the sky bouncing off her umbrella see I'm trying to get those lemons to have a little bit of physics there and we've got another mushroom village prompt here that went through and finally
generated and it's pretty much exactly the same as the last one they're extremely consistent very difficult to tell apart here but again you know we see the ladybugs are sort of crawling all over the mushrooms it seems like this model really does you know focus on realism here we're going to try this same exact mushroom prompt in Sora oh and our VHS Christmas uh mini nuclear reactor is here as well and you can see definitely does not look like VHS footage which is really interesting I prompted home VHS video but it seems to be more
like you know a YouTube video of some guy exploring this mini nuclear reactor instead it's definitely still Christmas time I mean he's wearing the Christmas hat which is pretty cool we've got the Christmas tree here and it looks very realistic and I do really like the mini nuclear reactor as well but it's not home VHS video which is so interesting to me it seems like the fine-tuning that's gone on with Google Veo 2 is preventing it from accessing certain artistic qualities shall we say again seems very much poised for realism I wonder if they're working
on some other submodels in the background that are focused on different tasks maybe more focused on animation or focused on different styles of video not just things that look like YouTube videos I suppose all right now we've got the gaming computer going into the volcano here and again you know this comes so close this comes so close it's only 5 Seconds long so maybe if it was a 10-second generation we'd actually see it getting thrown into the volcano but yeah it's just sort of some guy here holding this gaming computer he's ready to throw it
in but he's not doing it just yet I got to say though looks like a pretty realistic gaming computer you got the power supply down here CPU Cooler looks like there's some Ram we got the fans and the GPU it looks pretty solid and I like this video like I like the way that he's holding and manipulating the object and everything everything and I like how realistic it looks it's just he's not throwing it into the volcano let's retry this one and we'll also enhance it maybe the enhancement will work wonders you know what I
got to say OpenAI Sora is also just sort of generating mushrooms with ladybugs it looks like this prompt maybe is just too long honestly I definitely prefer the Google Veo 2 result although these camera movements are pretty cool like the legs on the ladybug don't look nearly as good as it crawls on the mushroom this one again is pretty cool but we only have one ladybug here and it doesn't look very nice it looks a little deformed almost let's see if we can get this uh gaming computer into the volcano and you know what let's try
the home VHS video as well in Sora little comparison well we've got the disgruntled woman as well with the lemons but they don't really seem to be falling from the sky it looks more like we've got this video of a woman with an umbrella and then the lemons are like an animation that was pasted over the video very realistic still and definitely lemons falling from the sky but they're not bouncing off of her umbrella like I requested in the prompt this is the volcano gaming computer prompt like we've got the guy here at the edge
of the volcano and then we've got you know something that looks kind of like a gaming computer but it's sort of just being like lowered down here and then we've got someone's legs it's like literally someone's holding it with their toes and they're being held up by a crane or a helicopter maybe a very very silly video it's kind of funny though honestly all right now we've got another guy oh there's another gaming computer kind of going into the volcano here it's having a hard time stringing these concepts together for sure see this is
what I'm saying V2 is more coherent in this sense in comparison to Sora like these are very obviously AI generated that doesn't look like a normal gaming computer Google V2 though definitely comes out here with uh something that looks a lot more realistic here let's see do we get the gaming oh my God he's going into the lava himself he's trying to throw it in very clearly again AI generated but it comes a lot closer it looks like he's trying to get rid of it but somehow the universe is holding that will against it saying
like no the computer must not go in the lava no matter what oh wow interestingly Sora does a lot better of a job trying to capture the VHS footage with the mini nuclear reactor hey this video of the kids looks very realistic though good job Sora for that but yeah definitely kind of looks like home VHS footage yes the mini nuclear reactor oh my God just what I wanted for Christmas there it is interesting oh we can help improve Sora folks which one is the best one with the lemons falling from the sky all right
all of these are pretty terrible not going to lie the one that's closest to being mostly accurate is probably I want to say this one maybe keep selected video I guess yeah it looks all right doesn't really get the physics at all with the lemons still Sora is worse for sure in comparison to Google Veo which makes something a lot more coherent but still not even close to being perfect here let's test out my favorite Sora prompt of all time this is real found footage an alien getting a slushie inside a gas station this actually
turns out shockingly good for a Sora generation here you can see he's drinking the slushie I want to see how Veo tackles the same exact prompt this is also a decent generation by Sora as well this is found footage CCTV gas station clip man riding a giant bullfrog and that came out pretty darn good from OpenAI Sora here like that's not bad he's kind of drifting around the parking lot with the bullfrog which is cool so I'm going to send this over to Google Veo 2 as well we're also going to go ahead and
try this close-up of an orange tabby cat eating a lemon sort of did an all right job attempting to replicate this the cat looks great the lemon slice looks great but he's not really eating the lemon is he sort of just kind of moving his mouth around and we've only got three generations left using Google Veo 2 so we'll send this one off we'll wait for these to generate in the meantime honestly guys I got to be real with you so far I'm not overly impressed with Google Veo 2 I was expecting it to be a little
bit better than this I wouldn't consider these prompts to be straight up easy and they definitely all have great qualities for sure and come really close to getting what I'm asking for but none of them really seem to nail it it's definitely very clear and very realistic and even a lot of times difficult to tell the video is even AI generated in the first place but it's having a hard time really nailing down exactly the prompt I want I wonder I really do wonder if it's because we're capped at 5-second Generations I wonder if with
the longer generations we'd cut eventually to a clip where it's actually doing what we propose but we gave it a lot of tries to get you know for example the gaming computer into the volcano correct it gets a lot of aspects that we want the gaming computer looks great it's very realistic wide angle POV at the edge of the volcano but just isn't really bringing it down into the volcano it's not throwing it off as we would expect but to be fair Sora is much much worse like take this clip for example I could show
this to someone random and they would probably have no clue that this is AI generated I mean it looks like a very realistic person very plausible you know gaming computer in the background it's blurred out so it looks straight up realistic RAM CPU cooler fans and then you know this just looks like some wall art perhaps hard to tell that it's even AI generated in the first place and again the Minecraft footage as well comes very very close if you have a keen eye and have played Minecraft before you might be able to tell you
know these items don't look quite correct at the bottom but it looks very much like things that I've already seen online all right here's the found footage with the alien getting a slushie in the gas station here he is it actually did some 2D animation definitely an alien with a slushie it looks like a pretty cool animation realistic background here but again it's not really following the prompt I actually think the Sora example for this prompt was a little better definitely doesn't look like found footage and I mean he's getting a slushie at a
gas station for sure like that's a gas station in the background but he's animated this is not what we were looking for here let's take a look at this one you know this is this is the Sora generation with the same prompt I think this is way better it actually looks like real video footage of an alien getting his slushie it looks a little cinematic I wouldn't say it's found footage but you know this is definitely a lot closer to the idea I was going for than this video by Google V2 even though it still
is coherent and decent for 2D animation generated by AI these days right it's still clear it still conveys things well like the car is driving by but something is off anime Ghibli style 2D animation giant lemon monster destroying skyscrapers and again but also try this in Veo 2 all right our cat eating the lemon is back up here is the Sora generation for a reminder this is what it looks like he's not directly eating the lemon but we see both subjects here and it looks pretty realistic overall so not a terrible generation but not what we
were specifically asking for now okay V2 does a much much better job here all right yeah this is pretty much exactly what I asked for here's a little orange tabby cat eating a lemon very simple he actually looks like he's licking and trying to bite into the lemon and enjoying it he even closes his eyes as cats typically do when they're enjoying a meal so there you go actually a really solid generation that captured my prompt but it was pretty dead simple at the end of the day just an orange tabby cat eating a lemon
nothing too crazy finally let's do one more generation video depicting a man sitting at a table on the table there's a huge pile of nails in parentheses the tool he is casually grabbing handfuls of these nails and eating them quickly all right that's our final Google Veo 2 generation also going to throw this into Sora all right looks like our Ghibli style 2D animation of a giant lemon monster destroying skyscrapers is done let's take a look here definitely 2D animation I actually like the 2D animation style quite a lot for sure anime style here definitely a
lemon monster he's not really destroying the buildings though the buildings are more or less static while this little lemon guy kind of moves around again not exactly what we were looking for here it didn't nail it perfectly definitely 2D animation though definitely pretty close to anime or Ghibli style so we'll give it that here's Sora coming out of the woodwork though absolutely not Ghibli style but we are getting closer to destroying buildings definitely not 2D though this is very much 3D animation scary lemon monster though I mean man is ripping through these buildings doing who
even knows what it looks like Pacman that has taken a wrong turn in life you know he he's taken a dark Road and now he is trying to destroy buildings I guess I definitely think that the Google prompt is closer to what we were looking for but I like the fact that Sora actually tried to make it destroy buildings at the very least all right here's our clip of the man riding the bullfrog at the gas station and you know what I actually really like this generation for some reason it starts off super weird but
it goes quickly away from that and we just see a dude riding a bullfrog he's pulling up to the gas station probably just getting some chips or something but I really really like this clip I know it doesn't look anything like found footage or CCTV but uh I'm pretty happy with this I think it's cool here is the Sora generation again also pretty awesome here riding the bullfrog around this one definitely looks a lot more like CCTV footage for sure so Sora of I mean I wouldn't be surprised if people preferred this generation just because
it followed the prompt better but yeah there he is just riding the bullfrog around Fan freaking tastic again the model just doesn't have the coherency to understand that this makes no sense all right I hope hope V2 can give us something around this quality big pile of nails he's grabbing them shoving them into his mouth like a maniac and eating them kind of still has that slow motion effect going on and it it's not very obvious that he's actually chewing and eating the nails it looks a little bit like confused and blurry when he brings
him up to his mouth but overall gets the gist of the prompt all right here we go now this this is definitely better I think than the Sora generation we just witnessed it's just a guy sitting there I love the background here I love that it's like looks like it's from maybe the Philippines or something like that again a lot of YouTube videos actually come out of you know different countries and different areas so again more proof of the YouTube training data rearing its head in here but I think this is a pretty great one
it has a good understanding of the physics with the nails as they move around and he's picking them up putting him in his mouth for the most part I mean it's not again entirely obvious or coherent that that's exactly what he's doing but it's good enough I think it's good enough to kind of give off the impression and show off him biting down and and having a snack of big Nails there he goes just sort of eating it he looks very coherent though perfect looking human being perfect looking hands the background looks great I love
that the trees and stuff are swaying and moving in the background can't really complain here I think this is a solid final generation to leave off on in terms of actually following the prompt so yeah guys first impressions for Google Veo 2 I got to say it's definitely at least capturing portions of the prompt accurately every single time but it's not nailing the prompts in fact only a couple of prompts actually went through fully ready and coherent following exactly what I asked for almost all of them had some sort of caveat or area where they weren't nearly
as good still though the overall quality is great it's very much centered towards realism there are quite a few videos where like I wouldn't be able to tell it was generated by AI in the first place it's definitely no slouch when it comes to video generation however I'm a little underwhelmed overall I was really hoping that it was better at following prompt it seems like they still have a little bit to go in that regard and even Sora is able to hold its own in some certain prompts against Google V2 most notably with certain different
styles like more VHS focused style or CCTV found film footage or you know the the alien at the gas station having a sip of a slushie for example so really telling in that sense in terms of pricing to get access to Google V2 any paid plan will grant you access but it cost 1,000 AI credits per singular V2 generation so it's very expensive in fact with the premium plus plan that's 40 bucks per month you're only getting 45 VO2 Generations here I bought the Premium plan here for 20 bucks that got me 18 Generations so
that's what I bought to show this off to you guys today and then you only get seven generations for n bucks very very very expensive over a dollar per generation for both premium and essential plans and of course you can save money by switching to the annual plan which will give you more Generations sure but I don't recommend switching to the annual plan because you never know V2 could show up on another website or even a Google website for a lower price in a few days I recommend that you hold off for now unless you
really want to try Google Veo 2 and wait to see if it shows up on any other platforms soon our friend sha rollston over at OpenAI just tweeted this out Veo 2 is available not only on Freepik but it also seems to be available on fal.ai where you can pay per generation instead again very realistic video generations here in his examples I think maybe these came out a little bit better than mine but yeah pretty darn cool here's fal.ai access to Veo 2 this one actually has a duration that goes up to 8
seconds very nice same aspect ratio settings available here for this one and oh man though it's more money for sure than Freepik for an 8-second video your request will cost $4 oh man it's expensive so yeah it is available on fal.ai here if you want to do 8-second long videos instead but it's going to cost you four bucks per generation so fal.ai actually just DM'd me and they're like hey do you want to give some free credits away to your audience to use Veo 2 on fal.ai I'm like sure I'm not going
to keep free credits from my awesome viewers and this isn't sponsored by the way they literally just DM'd me and were like hey do you want to you want to get a code to just give to your viewers for free credits I'm like sure not sponsored in any way all you have to do is make sure that you're logged into an account here on fal.ai then click the link that's going to be in the pinned comment and in the description it will be super obvious and it should just give you 10 free fal.ai
credits that you can use on Veo 2 so that's like two generations I know it's not a lot but it's something and here is the generation of our little 3D lemon dude he's sipping his drink floating around in the tropical waters and enjoying himself definitely a pretty good generation no doubt I think this generated a little bit faster in comparison to Freepik so yeah two free generations for you guys on fal.ai so yeah guys Google Veo 2 is here but with a few catches first of all the model wasn't as good as I expected
it didn't follow all the prompts I wanted it to but it was still pretty darn good and very very much realistic and several videos without me even necessarily trying just looked straight up real I wouldn't be able to tell they're AI generated otherwise it's that expensive your best bet right now is probably still Freepik even though it's expensive on Freepik 20 bucks for 18 gens I don't know tell me your thoughts on this very interesting launch for Google Veo 2 I really thought that Google would come out with their own website to use this
maybe they're still working on it and they're going to undercut everyone on pricing here with the cheapest way to use V2 being Google's own website I still feel though that 5 Seconds isn't really enough it's enough to get a taste of the model but it's not enough to really get fully creative I think with it I'd like to see at least 10 seconds if not 20 seconds become the standard for AI video generation models at this point and I mean the pricing is just out of this world insanely expensive so I think the open source
models are still my hope for the future of AI video generation at this moment Veo 2 is absolutely a great model but if you're a heavy heavy user of AI video and you really want to create something meaningful using the tools we have accessible today I think your best bet is to go with like Hunyuan Video Kling some of those other options that are going to give you better bang for the buck Veo 2 is just too expensive to justify tomorrow I'm going to upload one of two videos it's either going to be a deep dive into
Grok 3 I've been testing Grok 3 quite a bit lately and I've definitely made some discoveries or it's going to be an AI news roundup I mean let me know in the comments which one of those two videos you guys want to see and I'll pick the one that gets the most likes or the most comments for it anyways also check out the Discord server it's fantastic in there they're always talking about the latest as soon as Veo 2 dropped publicly like they were on it they knew it had happened in there so if you want
to know what's going on in the AI space join the Discord server and also follow me on Twitter because I'm always reposting stuff and I'm pretty active on there thanks so much I'll see you guys in the next video and goodbye