Oh, lonely duck man. Okay, I spent the last week trying to break GPT-4o's image generation, prompt after prompt, pushing it to its limits, getting stuck on rate limits, waiting. But I've come away with 10 examples that genuinely surprised me.
Some of them were total failures, and others turned into paid client work almost instantly, which is really cool. And a few have completely changed how I think about the creative workflows I have, especially for product marketing, brand shoots, and rapid concepting. If you work in art direction, videography, branding, or even niche marketing, I'm telling you this isn't hype.
The tool is here and it's already incredibly useful. So, in this video, I'm going to walk through the exact prompts I used, what worked, what didn't, and where this tool fits into real world creative stacks. A quick note before we dive in.
A lot of people in the comments previously asked whether they actually had access to the GPT-4o image model, the new one. So, let's just clear that up first. I'll show you where you can find it.
So when you land in ChatGPT, whether that's Pro or not, you'll notice the create image button. This is a sure sign that you've got the new image model. Everyone does now, so I wouldn't worry too much.
Let's just stick in an image prompt, because I want to show you another cool thing I've just realized we can do: "An image of a man in a sad duck costume eating a hot dog on the beach in the rain." Generate. That's the final thing you should be seeing.
As long as you're seeing the "Getting started" and the image generation like this, then you're definitely using the new model. So, let's look at some of the stuff that I've been playing with while this duck man gets built. Here is the stuff I did last time around.
So, the photo shoot. A couple of people asked in the comments how I was able to achieve such consistency between the characters and such lifelike imagery. The answer was I just asked ChatGPT how to go about it, what art direction we should work with, and these were the results.
Now, I stuck these into a couple of different video generators just to see how they looked, and I'll pop them up on the screen now. Then I came to this idea of using ChatGPT to create these kind of branded photo shoots.
So, working with GPT, I asked for a series of art direction prompts and then composition prompts for a company. The first one I tried was an electrical contractors company. The idea is these guys would be on site and it's as if they had paid for a branded shoot.
They were models in branded clothing dealing with electrical contractor stuff. I just used the website text to help me come up with ideas and these were fine. They're just slow to build and they don't look that realistic.
They look fine, a bit dead-eyed. Really cool compared to what we could get. Similarly, I had quite a better idea here with a landscape gardener on the coast in England.
Here I wanted everything to be shot in golden hour. I wanted to have men in it in these polo shirts that were branded. And I think you could get somewhere with this.
It still feels a little bit artificial, but I think that's more in the art direction I went with. To be honest though, for these kind of shoots, I had much more success still doing this stuff in Replicate. So within Replicate, I can within 90 seconds generate all these images.
Each one of these took at least 90 seconds. In Replicate, using the Flux Pro Ultra model, which I'll show you in a second, we can generate tons of imagery over here. I think this looks much more realistic because I use the RAW output, which is over there.
And in this one, everyone's in lanyards. Everyone feels like they're part of the same company. It's really consistent.
We've got these three similar-ish looking dudes throughout. Now, if I need to batch process images, we're still not at a place where the GPT-4o image generation is fast enough to be viable for commercial use. It's fun for hobbies and stuff, but I'm not going to be sat there doing that.
Maybe I'd pay someone else to do it, but we're waiting for API access and we're waiting for things to speed up. Third application here is just Mother's Day stuff. I also had some pictures of my wife and kids and put them into GPT, and worked around getting it to deliver Disney-ish style imagery without using the big scary D-word, which scared it.
These came out really cool, some Wallace and Gromit style, some more Pixar style. Occasionally it would add a deformed half-dog, half-human, which we don't actually have in the family. Also, even more terrifyingly, a fourth child, which we definitely don't have in the family.
Otherwise it did really well, I think, in terms of making everyone look pretty and beautiful, and my wife and family really like them for Mother's Day. So this is really cool and opens up personalized messages and cards and stuff for people on a whole new level now, because unlike previously, this stuff comes out looking so much more like the photos that you started with. Next, a good friend of mine; I was a bit jealous.
He has a boiler firm, and he went away and had a real human graphic designer come up with these really beautiful old-school, skate-style t-shirts for his company. I thought, "Wouldn't it be funny if I could get this working for you, showing you what this would look like on a skater in Camden in London?" And so my prompt, with a low-res screenshot, was: "Put a British skateboarder in this t-shirt in Camden, holding his board."
And we got a couple that came out quite well in terms of just the structure, just whacking that onto a t-shirt and trying its hardest to get it. Now it got a bit confused. It brought over my steady bow colors, which is my company, because I got jealous and decided I also needed skateboarding t-shirts.
I asked for my own range of skateboarding t-shirts and it kind of got the colors right. I think, ironically, the one here that's best is the skateboard ding, which I would probably wear. Then I asked it to take a picture of me and put me in a skate shop wearing the skateboard ding t-shirt.
And facially it's not right. I swear that's not me. But it's really cool stuff that, you know, 10 days ago, 7 days ago, was not possible: to just be like, hey, I want this on a t-shirt.
I want to see what this looks like over here. That mindset, that way of thinking about how you can compose images, has completely changed, which brings me on to structured information and how powerful that can be with this image generation model. User Jack here put together this amazing JSON prompt and it set me off on the whole journey.
I just don't use JSON enough in my communications with language models. JSON is a way of structuring information in plain text, in a way that language models seem to really like. Now, I'm not an expert at all, but he shared this prompt about how to generate very stylized logos that could be pulled out as SVGs very easily because, well, you'll see, they're bold and stylized.
There's no curves. It's all very chunky. And using his scripting, I was able to go away and make a whole collection of really fun icons.
Everything I could think of, from ice cream to wallpaper decorators, a luchador penguin eating fries, a sloth knight, a horned mummy gorilla, an axolotl transformer builder, and then add some shading on. So logo stuff has got really cool, and this is using this JSON to make sure that the outputs are very, very similar. So you could use this, for example, on a website if you wanted to have a site branded with icons in the navbar, or icons for your services, all feeling cohesive.
You could start with a baseline prompt like this and then build it out. Maybe you do want curves. Maybe you want something that's specific to your brand.
This opens up infinite possibilities around logo and icon creation, which is really cool. This is fun.
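To make the idea concrete, here's a rough sketch of what a reusable JSON icon prompt could look like. This is not Jack's actual prompt: the field names and the `build_icon_prompt` helper are made up for illustration. The point is just that the style block stays fixed and only the subject changes, which is what keeps a set of icons feeling cohesive.

```python
import json

# Hypothetical shared style block. These field names are illustrative
# assumptions, not the actual prompt shared in the video.
BASE_STYLE = {
    "type": "icon",
    "style": {
        "look": "bold and stylized",
        "shapes": "chunky, straight edges, no curves",
        "output_goal": "easy to trace as an SVG",
    },
}


def build_icon_prompt(subject: str, shading: bool = False) -> str:
    """Return a JSON prompt string that reuses the shared style for one subject."""
    prompt = {**BASE_STYLE, "subject": subject, "shading": shading}
    return json.dumps(prompt, indent=2)


# Only the subject varies between prompts; the style block is identical.
subjects = ["ice cream", "luchador penguin eating fries", "sloth knight"]
prompts = [build_icon_prompt(s) for s in subjects]
print(prompts[0])
```

You'd paste each printed JSON block into the chat as the prompt. If you do want curves, or something specific to your brand, you'd change the style block once and regenerate the whole set.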
I've done a bit of work for a client whose social media channels I look after, and we were able to very quickly create an April Fool's prank, which was that they had not only been in the concrete game but had moved into ice cream as well. So, a lovely little Instagram thing: we just fed in the picture of the truck and asked for it to be spat out as an ice cream truck, and it did a really good job. So, back to JSON here, and this was combining a couple of things.
If I zoom back over. I wanted to combine this idea of a branded photo shoot, with art direction and composition and consistent characters, with structure. Instead of using plain English and making all these tables and trying to understand image plans, couldn't we just structure that in the same way, as JSON? That's what I wanted to do. So, I just asked GPT to help me get there.
I knew I wanted this idea of a rugged character called the Maverick. Using AI and going back and forth, we came up with this together, which was JSON that talks about who the Maverick is. We look at his sunglasses, his wardrobe, the visual tone, the art style.
It's always going to be cinematic, shallow depth of field. We're going with heavy canvas textures, raw denim, aged leather. We're giving it an essence of the art direction, and then on top of that, sticking in more plain English prompts.
So, I wanted to see him walking away from a jet, like a jet fighter. I wanted him chilling out in the Pennines, having a chill with his socks off on a hike. I should probably be charging you for the feet pics.
There he is back with the jet. There he is in Cuba.
He's over in Berlin having a coffee, being super cool. He's on the 1980s subway puffing on a massive stogie. He's on an abandoned Soviet roller coaster.
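If you want to try the same pattern, here's a minimal sketch of how it could be wired up. The key names and the `build_scene_prompt` helper are my own assumptions, not the exact JSON from the video. The character sheet is fixed JSON, and each scene is a plain English line stuck on top.

```python
import json

# Hypothetical character sheet. Key names and structure are assumptions;
# the values are taken from the description in the video.
MAVERICK = {
    "character": {
        "name": "The Maverick",
        "description": "rugged",
        "accessories": ["sunglasses"],
    },
    "wardrobe": {"materials": ["heavy canvas", "raw denim", "leather"]},
    "art_direction": {
        "visual_tone": "cinematic",
        "camera": "shallow depth of field",
    },
}

# Plain English scene prompts layered on top of the fixed character sheet.
SCENES = [
    "walking away from a jet fighter",
    "resting on a hike in the Pennines with his socks off",
    "having a coffee in Berlin",
    "on an abandoned Soviet roller coaster",
]


def build_scene_prompt(scene: str) -> str:
    """Combine the unchanged character JSON with one plain English scene."""
    return json.dumps(MAVERICK, indent=2) + "\n\nScene: " + scene


scene_prompts = [build_scene_prompt(s) for s in SCENES]
print(scene_prompts[0])
```

Because the JSON part never changes between prompts, every scene starts from the same description of the man, his wardrobe and the art direction, and only the last line varies.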
But the thing here that works is these images. The man is the same man throughout, much more so than with other models, which I'll show you right now. I'll go and show you what I've just got trying to spit the same thing out using Flux Pro Ultra in the playground here in Replicate.
While it's great I can generate these images instantly, none of them are consistent from a character point of view. These are great images. I think some of these come out really well, but it's not the same dude.
So, if I was working with influencers or trying to sell something with a personality behind it, while I think you can still get some great imagery from other models (I mean, this is also a fraction of the price and a fraction of the time; there are three more being generated that I just turned on), we get a much better sense of art direction consistency and character consistency using GPT.
This is a bit more of a practical application: asking it, how is my lighting setup? I don't really know what I'm doing. I don't know.
I've been blinded by the key light and this thing. Oh god. And I've got lights dotted around in the background to try and make it pretty, but I don't really know what I'm doing.
I've asked for advice from other people and they've given me notes, but I thought I could take that over to GPT. As you can see, I asked it to try and show me how I could look, and it's a bit of a disaster, but the lighting advice seemed interesting. But then I asked it, okay, well, show me the optimal version of how I would look.
And it came up with this, which I thought was not good at all. I don't want to look like that. That's too dark and gloomy.
So, another one I saw on Twitter was to feed some normal product shots to GPT and then ask it for some macro shots of that product. That means very, very close-up, high-res imagery. So, I fed in some low-res shots of this random Halfords track pump, which just popped into my mind, and these got spat out.
Now, these are entirely fictional. Pumps go way above 80 PSI. That's not actually usable as a as a pump.
And that looks nice, and it's kind of based on this, but it's not the same. So, that one was interesting from a conceptual point of view, and I can imagine shops wanting to use this kind of stuff, but you would have to get a few iterations out. I'd probably take high-res images up close with a camera and then take those images and ask for them to be turned into macros.
I then did the same thing with this Swiss watch style, just to see what would happen. So, again, I fed it two low-res images straight from its website and asked it to spit out some macro shots and a man modeling it, standing in the middle of the train tracks. And it's done fine.
The text is readable. It's not really good enough, but there is a situation where, with enough prompting and enough good reference images up front, you might be able to get some really interesting stuff here. Again, stuff like this was just not possible a week ago.
So, it's still fresh, coming up with all these ideas. This was borrowed from someone on Twitter, and it does start to make you think of all the other possibilities that you could explore with this kind of thinking. Back over to our sad duck man.
He's done now. Cool. Fine.
We can generate images like this all the time. But now, if I said I want a profile shot, a side profile shot, a shot from behind, we can take the camera and move it around. We could change the lens.
We could add another subject. Manipulating in that space is really interesting when it comes to taking this stuff to video generation, because we can then tween two of those images, and the video generation can generate the frames in between, from shot one to shot two. That's the lot.
10 prompts, a few total flops, and a bunch that I think really show where this tech is going. But now I need to shift gears a bit. My core focus is still building custom tools and automations, bots, backend logic, front-end tools, whatever it takes that solves real problems for my current client base and hopefully more that come on.
Starting with improving their ad conversion rates and going from there. So, I'm back working inside Lovable, sorting out some of the SEO quirks that we've uncovered and largely solved and scaling out tools that are already generating quite serious revenue for me with some crazy potential. These aren't hypotheticals, but things actually out in the wild delivering results for clients.
So, it's a crazy exciting time at the moment. If you want to see more on ChatGPT image generation, the cool prompts, automation flows, whatever, drop it in the comments. I can't wait to get API access, which will again change everything.
I'll reply where I can and I'll make more videos. But honestly, the best thing you can do is just ask the GPT tool itself. If you want to create something, try it, break it, see where it bends.
This stuff only gets more valuable when you start plugging it into real workflows, not just poking prompts, but automating the whole thing end to end for business owners that have these problems and need them solved. And that's where I'm heading.