I Gave the Same Prompts to ChatGPT and Midjourney… Who Won?


Futurepedia


Video Transcript:

We have had huge updates with image generation recently. The most viral was ChatGPT's new image capabilities, which were a massive leap forward from DALL-E 3. Then just a few days ago, Midjourney released version 7, and that's been hyped up for a very long time.

So, I did a super deep dive comparison between those two. I ran prompts across tons of different styles and challenges. We'll see which one comes out on top.

There are things that Midjourney is better at and things ChatGPT is better at. So, it really depends on what you plan to use these models for. I tried to cover as many categories and shot types as I could so you can see what's best for you.

I'll do just a quick overview of how I'm using these. For ChatGPT, it's integrated natively as part of GPT-4o. You can use it anytime you're chatting.

I'll be using it mainly on sora.com for this since that's easier for organization, but it's the same model. The aspect ratios are limited.

So I went with 3:2 or 2:3 for all of these. Now in Midjourney, to use V7, you have to rank 200 images first. That is annoying, but it only takes around 5 minutes.

It just puts two images up on the screen, and you choose which you like more. This is for personalization.

It trains Midjourney on what types of images you personally like, so it will generate images more customized to you. But once that's trained, you can turn it on or off. I kept it off for all of these because everyone's personalization will be different.

I wanted to work with the base model. Also, mine leans pretty dark and surreal or abstract, so it's not the best comparison for most people. Also, I used the default settings to generate, but you can adjust some of the parameters in Midjourney, like stylization.

So, there's a lot of customization there. I usually do that when I'm trying to get the perfect shot, but I left it all default for this. The final caveat is I selected the best from a grid of four in Midjourney.

In ChatGPT, I have the Plus plan, which only lets you generate two at a time. And since it's so much slower in there, I wasn't going to generate four for every one of these. So, I did have extras to choose from in Midjourney.

Although, with ChatGPT, when you do generate four with a specific prompt, the outputs are way less diverse than they are in Midjourney. So, I actually don't think that would have affected this much. Anyways, we'll start with some portrait photography.

Here's a very basic one without too many details in the prompt. I'll have the prompt at the bottom for every one of these. They both follow the prompt well, but Midjourney looks more authentic. The ChatGPT side doesn't look very real.

The water droplets feel off. So does the depth of field. Her neck is more out of focus than it should be compared to her arms.

And just generally, Midjourney feels a lot more lifelike. To get some more portrait prompts, I asked ChatGPT for a list of diverse portrait prompts from around the world. I'll go through a few of those, but a little quicker because, spoiler alert, basically everything I said about the last one also applies to these.

Like in this one, ChatGPT is more saturated. The skin doesn't have the same natural imperfections. Just overall it feels more posed and edited.

This one from Midjourney feels like it could have been taken by a professional travel photographer. This one is just awesome. This next one, the skin texture from Midjourney isn't as great, but I still prefer it over the saturated ChatGPT version.

About the same with this one and this one and this one. This one was a little closer. So, I really like this shot from Midjourney.

There's a really creative composition. The 4o option is solid, too. It still is oversaturated and a little on the yellow side.

It's like someone shot it with the white balance a little too warm. That's actually part of the issue with all of these. They all have that slightly too warm yellow hue.

Not sure what's up with that. In this one, again, all the same stuff. Like again, with that white balance issue, it's basically every image.

If you don't know what I mean, here it is with that fixed. Just a tiny white balance adjustment makes it way better. But I like the Midjourney one more anyways.

However, if we zoom in on the hands, you'll see some issues. In ChatGPT, the hands look great. A lot of image models have mostly solved hands at this point, so it's pretty frustrating that Midjourney didn't solve them with this update.

Like, they are better and look good in a lot of cases, especially if hands are the focus. So, I'll just jump into some of the hands tests really quick. Actually, both of the hands look great on this one.

They're all anatomically correct, with good texture. Same with the one on the guitar.

Although, I did ask for a C major chord to see if it could do that, and neither of them did at all. In fact, it looks like at least one of the fingers is directly on the fret in both of these. So, not just the wrong chord, but bad technique, too.

And when I asked for one hand holding up two fingers and the other four, ChatGPT nailed that. But Midjourney, not so much. It just gave me a hand with four fingers.

I guess technically they are all sort of held up. So maybe you could say that one passed. The next one is where the prompt understanding went out the window with Midjourney.

I asked for them playing rock paper scissors. Rock on the left, scissors on the right. ChatGPT nailed it.

Midjourney doesn't know what rock paper scissors is, apparently. The hands did actually look decent, though. I tried that again but with it zoomed out, and same thing.

ChatGPT got it and Midjourney did this. Basically, ChatGPT is better at hands. Midjourney gets it sometimes.

You probably noticed the prompt understanding was better in ChatGPT, too. We'll get much more into that later. For a lot of people and businesses, setting up a website can be a time-consuming task that's been made much easier by HubSpot, the sponsor of today's video.

They have a free AI website generator. All you need to do is answer a quick series of prompts about your business. Then the AI generator will create a custom site without any need for you to code or use any complex interfaces.

The site looks great out of the box, but that's just the starting point. From there, you can tailor it with no-code drag-and-drop editing. And HubSpot has hosting and mobile friendliness all baked in.

You don't even need to think about it. Maybe you're testing out a prototype or you built a landing page to validate an idea and it started gaining traction. With HubSpot, you can scale your AI built site as your business grows.

They have all sorts of tools for SEO, AI-powered email marketing campaigns, and lead gen. They're one of the leading customer platforms, and the AI-built website integrates natively with all of those tools to help attract and maintain customers. You can create your site for free using the link in the description.

And thanks again to HubSpot for sponsoring this video. I did have a couple more quick portrait shots that Midjourney was better at. All the same stuff with these, but then I tried for a more cinematic look.

Midjourney was still the leader in aesthetics here, but we start to run into prompt adherence and coherence issues. I liked this clown one, but that's a gigantic iron. And if you look close at the body in the back, it's weird and morphed.

The signs don't have real letters. There will be full text comparisons in a little bit, but this one I'd have to give to ChatGPT, especially since this was the best from four with Midjourney. The others were even worse.

This one in the diner with Midjourney, a lot of the areas are morphed when you zoom in. Even her face isn't very good, but I do have a lot of tests on further away faces and crowds later. I also did a few that didn't involve people.

Midjourney aesthetics win there, as long as you don't have any super specific prompts you need. I've got a couple action shots, too. I really liked the style from Midjourney with this one, although I did ask for the car flipping.

This T-Rex chase shows just another face issue in Midjourney. I'd give this to ChatGPT since this is just a blob head. But I did test an old trick in Midjourney where if you run a creative upscale, it fixes the face a lot of the time.

It looks like that still works pretty well. I also switched it up to kind of the opposite of cinematic, a more unpolished and candid look. They both did really well with that.

I wouldn't say there was a clear winner there. The final test in this area was ramping up the specifics and complexity in some portraits to see how good the prompt adherence was. ChatGPT nailed every part of this prompt with the lighting, her clothing, hair, necklace, and all that, but even down to a small scar on her nose.

Also, every tattoo: the owl and the clock tower on her arm, a small constellation on her collarbone, compass and coiled serpent on her forearm, matching tally marks on her knuckles. It got all of that perfect. And I did a few more of these.

I won't read off each part of the prompt, but the consensus across all of these tests was the aesthetics look better in Midjourney, but it missed multiple parts of each prompt. So, for those specifics, ChatGPT is way better, and you could use something like Magnific. That's probably the route I'd take.

Overall, some of these tests show areas where Midjourney shines. But an area it is terrible at, even with this V7 update, is text. There was very little improvement from V6.

You can get things that are really simple, but for anything even remotely complex, it is not even close. ChatGPT nails every part of this prompt, which is incredible. Like, this was unthinkable not too long ago.

And here's another one. Midjourney only got the name time loops. Everything else is a garbled mess.

ChatGPT got that. Also got the tagline. Got the free mini wormhole inside.

It wasn't quite there on the made-up vitamins like chronoton B and antiparadoxium. So there were some mistakes, but it's night and day compared to Midjourney. And here was a fake Wikipedia page.

I asked it to write the description itself in the prompt. ChatGPT did great. Midjourney just seemed to have no idea what to do with that prompt.

And those are just some silly prompts, but this carries over into use cases like advertising. What GPT-4o is able to do, and some other tools as well, will just completely transform the way designers, marketers, and advertisers work.

Next up, we've got some complex prompt adherence. Just adding a lot more details and specifics into the prompt to see how well it follows. For this first one, Midjourney was pretty close, but again, this was the best out of four, and the dog was supposed to be sitting on top of the cube.

ChatGPT got it perfect on the first try. For this one with Midjourney, the toaster is a little wonky. It did get the three apples right, but the spoon isn't resting on one, and it was supposed to be a single sunflower.

ChatGPT got it almost perfect. The spoon was supposed to be resting on the second apple, but it nailed everything else. All right, then I ramped it up a lot and asked for a chessboard with alternating sapphire and marble tiles.

I also described how each of the pieces should look. Pawns are robed travelers holding staffs. Knights are armored wolves with glowing eyes.

I described each piece. They both got the tile part, and ChatGPT got more of the pieces right than Midjourney. It actually got almost perfect on some of the attempts.

Midjourney was actually closer than I thought, too, but not as good as ChatGPT. That is a crazy difficult prompt. Something you never would have even thought to attempt a couple months ago.

Another thing I like to test because it is just such a struggle for image models and usually ends up looking really funny is actions or poses where the face is upside down like a handstand or mid cartwheel. It usually gets the face and body wrong in just really funny ways. However, when you ask for a close-up, it's not as bad.

But I love the results of these further away ones. As a side note, I tried this in Reve and it was actually pretty good at these, but that comparison will be for a future video. I plan to make one that includes all the other image generators like Reve, Ideogram, Imagen, Flux, Frames, Leonardo, but that will take a lot longer to make.

Another issue in Midjourney that I mentioned earlier is with crowds. When there are a lot of faces, it really struggles. There's just a ton of morphing.

Here's at a concert. There's a couple good faces in here, but that's it. If you zoom in and look around, a bunch of them are all morphed.

It's the same with a busy street. This image looks good at first glance, but if you look closely, most of the faces aren't good. ChatGPT was much better with this.

Next, I tried some prompts to see what gets censored. Midjourney was a lot more relaxed with IP, especially when it comes to the big names like Disney or Pixar. Understandably, some of that is censored in ChatGPT.

They could both generate public figures, but had limits on what they could be doing. They were both fairly loose about it, but ChatGPT was less censored. I don't think I should show everything I was generating on YouTube, though.

Now, as far as likeness, when you ventured beyond people who are ultra famous, Midjourney struggled a lot. The explore page of Sora showcased how many celebrities it could do with this prompt of a low-quality selfie. There were a bunch of these, and they are really good.

I was surprised at how many lesser-known people it could do. And here's just a few more celebrity images I came across looking through the explore page. And another really fun prompt I saw was these old-school photos of people or characters playing their own video games.

Then I had to try out some anime prompts, especially since that's gone super viral, the whole Ghiblification of everything. They can both do Studio Ghibli. But I was surprised to see that ChatGPT wouldn't do most other anime styles when I asked for them, like Masaaki Yuasa, Makoto Shinkai, or Mamoru Hosoda.

But I did discover that it worked better if I asked by movie title instead of the artist, like in the style of Wolf Children instead of Mamoru Hosoda. ChatGPT was much better at replicating these styles than Midjourney was, when it worked. Now, I tried out some other styles too. There's this tilt-shift style with workers that I've seen go viral a bunch of times on Instagram.

I tried Fauvism, mixed media, and Pixar style, although I couldn't say Pixar in ChatGPT, so that's why it's not as good. They both made some solid character sheets. They're both good in a lot of styles, but one thing that's a huge benefit for ChatGPT right now is you can ask for more scenes with the same character.

With Midjourney, at least right now, they don't have the same character reference tools that they had with version 6. Although, they are going to be rolling out an omni reference tool within the next couple months where you'll be able to use references for characters, logos, scenes, objects, things like that. But as of right now, there's none of that.

So, ChatGPT is the clear winner for that type of thing right now. Now, to take this in a different direction, here are some surreal type prompts. They both did really well here.

Like Midjourney missed some of the specifics. Like this one was supposed to have a suit made of clouds. Or in this one, the hands were supposed to be clouds, but I wouldn't say there was a clear winner here aesthetically.

It's more of a personal choice on what you like. Sort of in this realm, something I like to do is use one-word prompts or sort of vague topics and see what happens.

It's kind of a test of the model's default creativity. Overall, I think Midjourney leads in default creativity and, in my opinion, has maintained its lead in aesthetics by far. It is definitely behind in prompt adherence, text, anatomy, coherency, multiple faces, consistent characters, and some other areas, but it still wins out in aesthetics.

Especially when you start trying to make abstract or weird images, it's not even a comparison. I haven't done too many of these yet in V7, but here's just a few. So, I guess to summarize, generally the times where Midjourney gets the prompt right, I tend to prefer it, but it doesn't get the prompts right as often.

The more specifics you need, the further off it will be, and furthest if that involves text. ChatGPT is way slower though, so that's also a factor to consider. And this was just comparing these two models at their base prompting level.

They both have a lot of additional features. Like there's Midjourney's new draft mode where you can just talk to it and it generates at an insanely fast speed. It's more like, you know, vibe prompting.

It's pretty fun. You just say what you want and Midjourney comes up with the prompt itself, then generates at a lower quality, so it can go really fast. Then you can just quickly iterate, like having a creative partner, and upscale to a high-quality image once you're ready. You can do that in ChatGPT, but it takes so long to generate you just wouldn't want to.

They do both have editors as well, with inpainting and adjusting the images. Midjourney's has a lot more features currently, but it's not as good as ChatGPT's at some things. It's hard to cover all those small details like that.

I wanted to compare just the bulk of what these are used for. I did mention before there are a lot of other image generators. I do want to make a video testing every one of them.

That's a much bigger undertaking, but I do plan on doing that. Let me know in the comments if you have any prompts or specific aspects you'd like me to test in that. And if you want to go way more in depth on learning AI, on Futurepedia we have over 20 comprehensive courses on how to incorporate AI into your life and career to get ahead and save time.

You can get started for free using the link in the description or check out this video with 13 AI tools that can save you 1,000 hours in 2025.