Discover how AI tools like DALL-E and GPT-4o can transform simple text prompts into vivid, detailed images. Learn key techniques for effective AI-generated image creation, from crafting precise photo prompts to understanding AI limitations.
Key Insights
- Use photography-specific terms like lens types (e.g., wide angle, 50mm), aperture settings (e.g., F4 for shallow depth of field), and lighting conditions (e.g., golden hour, midday sun) for generating realistic photos with AI.
- Be aware of AI image generation limitations, including difficulty accurately depicting human hands, faces, and readable text, as well as struggles with continuity in character creation across multiple images.
- While ChatGPT integrated with DALL-E can produce creative and realistic visuals, advanced editing tasks like cropping or detailed photo retouching are better performed using dedicated editing software such as Adobe Photoshop with Adobe Firefly integration.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
This is a lesson preview only. For the full lesson, purchase the course here.
Next up, we can generate images. Behind the scenes there's a technology called DALL-E; that's what OpenAI named its image generation model.
To generate these, we need to use something like GPT-4o or the older GPT-4. Prior to that, we had 3.5, which couldn't do it. If you run out of messages, by the way, bing.com/create is basically the same thing.
It does have some limitations as well, as far as how many images you can create. But if you run out, if you have, let's say, a free account, you can always go over to Bing, because GPT-4o mini can't create images. The mini versions are less capable, so you can't do as much with them.
When you fall back to a mini, you can't do data analysis, and you can't do image creation.
You need the full-fledged GPT-4o. Or, if you've used up all your GPT-4o messages, you can fall back to the older GPT-4. I'll tell you, the naming is so confusing, because most people read "4o" as "4.0" and then wonder what the difference is between 4, the older model, and what they think is version 4.0. But it's not 4.0; it's 4o, with the letter o. That's bad naming. So, you can go there and describe the image you want to create.
So, you might be able to get this for free, or if you run out of your messages, you might have to pay. So, what do we want to create an image of? What do you want me to type in here? Any ideas? Puppies running in the backyard? Okay: puppies running in a backyard. There we go.
Now, it's funny that you said puppies, because the one thing that I save in my account is this.
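As an aside: the class demo uses the ChatGPT interface, but the same DALL-E model is also reachable programmatically. Here is a minimal sketch, assuming OpenAI's official Python SDK (the `openai` package, v1-style client) and an `OPENAI_API_KEY` in your environment; the prompt is just the example from class.

```python
# Minimal sketch: generating an image with DALL-E 3 via the OpenAI Python SDK.
# Assumes the `openai` package (v1+) is installed and OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.images.generate(
    model="dall-e-3",
    prompt="Puppies running in a backyard",
    size="1024x1024",
    n=1,
)

# The API returns a URL (or base64 data) for the generated image.
print(response.data[0].url)
```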
That is just the cutest little fluffy puppy: a photograph of a poodle with puffy hair playing with a chew toy. Oh, I just saw him hugging it. I always save him in my account so I can show him. So, this image was never taken.
It is not a real thing. It is created. And I don't know what that is on the ground right there.
Almost kind of looks like a little football. Or it could be a dog poop. Not sure.
But I don't like that, whatever that is, because it could be interpreted as a little dog poop. Up here there's a select tool. Where is it? Oh, I see, it's this little brush; I have to click on it to open it up.
Once I expand it, I can go into Select, paint over that little area, and say: remove the brown thing.
I'm not saying anything more than that. Just remove the brown thing. Whatever that brown thing is.
So, that's actually a little bit better. Because I kind of wondered where his third leg was. Because the other ones are kind of hopping along.
So, that's pretty convincing, actually. Maybe something is a little bit not quite 100%, but that's pretty darn good as far as being a convincing image.
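For what it's worth, that select-and-repaint step has a programmatic counterpart: the API's image-edit (inpainting) endpoint takes the original image plus a mask whose transparent pixels mark the region to regenerate. A hedged sketch follows; note that, as far as I know, this endpoint supports DALL-E 2 rather than DALL-E 3, and `puppy.png` and `mask.png` are hypothetical local files. Also notice the prompt describes the result you want, not the removal instruction.

```python
# Sketch of mask-based inpainting ("remove the brown thing") via the API.
# The edits endpoint currently targets DALL-E 2; transparent areas of the
# mask tell the model which pixels to repaint. File names are hypothetical.
from openai import OpenAI

client = OpenAI()

response = client.images.edit(
    model="dall-e-2",
    image=open("puppy.png", "rb"),   # the original generated image
    mask=open("mask.png", "rb"),     # transparent where the object was
    prompt="A grassy backyard lawn with nothing lying on the grass",
    size="1024x1024",
    n=1,
)
print(response.data[0].url)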
Anybody else have an idea? Well, I put one of them on my screen, and it looks fake. Yes, so that's what I want to find.
So, I started with an image of skiers at a ski resort in Colorado. Okay, image of skiers.
Skiers... okay, I'll just say "at a ski area," because I'm sure it's not going to know one specific place in the world. Probably. It might.
So, let's see. Oh, I typed "image or." They'll probably figure out what I meant: "image of." Okay.
So, first of all, they're very evenly placed, which is a bit odd. Now, notice it says: here's a lively ski area scene with skiers enjoying the snowy slopes.
It captures the essence of a vibrant winter day at the resort. And if you click the little "i" for the prompt, this is what they sent to DALL-E. What? I wrote "image of skiers at a ski area."
They said: a lively scene at a ski area with skiers of various ages and skill levels gliding down snowy slopes. The landscape includes tall, snow-covered pine trees, a clear blue sky, and a cozy ski lodge at the base of a mountain. Some skiers are wearing colorful winter gear.
There are ski lifts in the background. The atmosphere is vibrant and energetic, capturing the joy of winter sports and a festive holiday vibe. They turned my prompt into this.
They basically wrote a story and sent the story to DALL-E, because they were like: hey, your prompt isn't good enough; you were too basic, you didn't give it enough information, and DALL-E wants more.
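Incidentally, this rewriting step is visible if you call DALL-E 3 through the API: each returned image carries a `revised_prompt` field showing the expanded prompt the model actually received. A small sketch, same assumed SDK setup as before:

```python
# DALL-E 3 silently expands short prompts; the API exposes the rewrite.
from openai import OpenAI

client = OpenAI()

response = client.images.generate(
    model="dall-e-3",
    prompt="image of skiers at a ski area",
)

image = response.data[0]
print(image.revised_prompt)  # the detailed "story" DALL-E was actually sent
print(image.url)
```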
Now, if I don't like this, I can copy it, paste it back in, and revise it if I think it's a good base. Right? Also, did I ever say it was a photograph? No. So I'm just going to add "a photo of."
That's all I'm adding, just to see, because I only said "an image." Can images be illustrations? They can.
I didn't say "a photo of." Still not great. But I also didn't say anything about the camera it was taken on.
I didn't say what lens or f-stop, or give any general information about it. And I mentioned earlier about refining an area if you want to fix something.
But this is a very common thing: illustration versus photography. For the prompt on the left, I said: realistic dog playing with a chew toy in a living room with a couch and a rug on the floor. I said "realistic dog" because I thought, well, I want this to be realistic.
Right? But do you ever say of a photo, "that's a realistic photo"? Is "realistic" a term you ever use about photos? Are photos ever unrealistic? No. But in painting:
oh, that's realism, or that's a realistic painting. That is language you use to describe illustration.
"That's a very realistic illustration." Right? So, when you want a photograph, say "photograph" or "photo." You might want to name the lens it was shot on.
You might want the lighting, at a certain time of day, because that's the way you would describe a photograph.
So, for example, you can say a wide-angle lens. And if you know anything about photography, an F4 has a much shallower depth of field, versus something like an F22, where more things are in focus.
So, do you want a shallow depth of field? Do you want a big depth of field? And this is another photo; look at how real that looks, because of the prompt that I used to get it.
Very, very different results. So, going back to this prompt here: instead of "image," I'm going to say "photo of skiers at a ski area, shot with a wide angle lens at F4." So it's going to have a nice shallow depth of field. "At sunset." And let's see what different result I get.
Because these models train on real photographs, and real photographs have metadata: the lens it was shot on, the aperture, the ISO.
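If you end up iterating like this a lot, it can help to template the photography vocabulary instead of retyping it. A purely illustrative helper (hypothetical, just string assembly) that composes a photo-style prompt from subject, lens, aperture, and lighting:

```python
# Hypothetical helper: assemble a photo-style prompt from camera vocabulary.
def photo_prompt(subject: str,
                 lens: str = "wide angle lens",
                 aperture: str = "F4",
                 lighting: str = "at sunset") -> str:
    return f"Photo of {subject}, shot with a {lens} at {aperture}, {lighting}"

print(photo_prompt("skiers at a ski area"))
# Photo of skiers at a ski area, shot with a wide angle lens at F4, at sunset
```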
Those are some crazy people. Okay, I did say wide-angle lens, and that's a little too wide angle.
We're starting to get better, right? No, I didn't say perfect.
So maybe I would say a 50 millimeter, because that looks like fisheye. I could have said maybe a 32. 50 is not wide angle, but I'm going to go with 50 to see what 50 looks like. We could try 32, which would be more wide angle, like 28 to 32. This is probably like an 18 millimeter or something, which is like fisheye.
I said skiers, and there are too many skiers, I feel like. So I need to change something, although it's kind of cool.
Okay, so I'm not going to say sunset; I'll say "during midday sunlight," and I'll say "three skiers."
Let's try that. This is where you can definitely burn through a bunch of image generations trying to get exactly your vision.
See, now we're getting somewhere. Look at the people in the background, though: there's something a little fishy, and I don't know why that spot is kind of glowing. That's a little weird, but this is looking much better; it's definitely much more realistic. You can still see little areas, like here, that are strange. I don't know what's going on there.
It's like there was a wheel and the thing should go around it... well, no, actually it would go around there, but it still just looks weird. If you look closely, a lot of AI images will have weird things like that. If you're looking for mistakes, you will find them.
It's hard to get rid of all of those mistakes right now. If there are just some little things, you can select the area and tell it to fix it, or tell it what to put there if you want something else.
But especially when it comes to hands: hands are hard. Look for missing fingers; they're very common. So is the wrong number of teeth. Granted, they're pretty good about having two eyes, one nose, and one mouth.
They've gotten a lot better with people; people used to be really challenging, and that's a little better now. The thing is, you can probably get some of these images to the point where most people, just glancing at them, would not notice anything; they only start seeing problems when they look really closely.
They might think, wait, there's something a little off here, if they start analyzing it too much. But this is really where your prompt is important. And as minimal as mine was, look at what they wrote.
Although they did use "a realistic photo" again. They wrote that; I didn't say that. They're dressed in colorful ski gear, standing with ski poles in their hands.
So, you know, they're writing this. Let me copy it and paste it in; I'm going to keep the realistic part going.
A photo of three skiers. Midday sun is definitely better. I do like "colorful ski gear."
The background shows tree-covered mountains. I don't like the ski lift, maybe.
Oh, "with the previously problematic area now seamlessly fixed." That's funny. So I'm just going to go with that.
Let's see. Maybe I shouldn't say "at a ski area"; maybe I could say "on a ski slope."
But you see how specific you have to be? And where are their faces? They don't have faces. So this is also an issue.
So let's say they "are dressed in colorful ski wear with the bottom part of their face exposed." What about "naturalistic"? Rather than realistic, naturalistic? Any of those: any time you start using words like that, those are ways you describe illustration.
You don't ever say a photograph is "natural" or "realistic." Use photography terms.
Dude, why do you have a camera? And he's got two... now they've got two sets of goggles. Those goggles are not correct.
I put in "women skiers" with "three skiers" because I just got an enormous number of skiers, and it put in four skiers.
Yeah. And this is where you ask: does it truly understand what you're saying? No, it really doesn't. Most of the time it does not.
And so sometimes... okay, for example: if I say, give me a photograph of a geek doing something, and do not put glasses on them. Geeks, to ChatGPT, always wear glasses.
I can highlight the glasses and say, remove the glasses. It puts the glasses back in. Geek equals glasses.
I've not figured out a way to remove the glasses from a geek. Even if I say "a smart person," sometimes it puts the glasses on, and no matter how many times you say "don't put glasses" or "give them contacts," no: they have glasses.
I cannot get those glasses off; I've tried so many different ways. And I thought, well, okay, maybe it doesn't handle the "not" part. You know, like if I say, don't think of a pink elephant, you go, crap, I just thought of a pink elephant.
Right? So instead of saying "don't put glasses," I tried it differently: I said, give them contacts.
No, still glasses. So there are just times when you can't get ChatGPT to do what you want it to do. And this applies to all AI image generators, by the way, not just ChatGPT.
Also, when it comes to text in images, it's really bad. Most of the text is illegible; you can't read it.
It looks like gibberish. So don't ask it to generate text inside an image. Generally speaking, it's not going to work very well.
It's just not. Hopefully at some point they'll get through some of these issues. They're better, I feel, at text generation than they are at image generation.
Now, of course, add video on top of this: video is 30 pictures per second, so imagine these problems at 30 pictures per second. Video is even harder. They're getting better with image generation, and I'm sure they're getting there with video generation.
We'll talk about that more in a little bit. ChatGPT right now cannot generate videos; it can only do images.
They've announced that video is coming; they just haven't released it yet. So: tips. If you're saying "realistic dog," that's probably not what you want, unless you want an illustration.
Say "photograph of a dog," or better, photograph of the dog with the lens and the lighting, right? You could say golden hour, or you could say ceiling lights. Just changing things like the lighting will make a difference for you.
Be as specific as possible with these kinds of things. Really. So let's try each one of these.
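To see how much the lighting term alone changes things, you can sweep that one variable while holding the rest of the prompt fixed. A sketch under the same assumed SDK setup as earlier (each variant consumes one generation):

```python
# Sweep lighting terms while keeping the rest of the prompt constant.
from openai import OpenAI

client = OpenAI()
base = "Photograph of a brown poodle playing with a chew toy, 50mm lens, F4"

for lighting in ["golden hour light", "ceiling light", "midday sun"]:
    response = client.images.generate(
        model="dall-e-3",
        prompt=f"{base}, {lighting}",
    )
    print(lighting, "->", response.data[0].url)
```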
So, in my paid account here: realistic dog with a chew toy. Also, I didn't say what kind of dog, or what color.
See how much more detail you could give it if you had something in mind? And see, that looks like an illustration. It's not a photo, but then, I didn't ask it for a photo. Notice it added "happy."
What if I want a sad dog? There are probably constraints there, too. For example, you probably couldn't say "a dog that's been abused by its owner," even if, say, you're doing an article on animal abuse. There are probably things it's not going to do.
Okay, that's not very photorealistic. And why is his tongue on top? And why does his tongue look like the same texture as the chew toy? It's kind of funny, except when you're trying to be realistic. But notice I still wasn't specific about any camera details.
If I throw in the lens it was shot on, the aperture, and the lighting environment, if I describe it more in photography terms, that matches up with the metadata from a camera. See, that's the first one: when I named the lens it was shot on, see how that cue pushes it to be more realistic? That's much more realistic. And that was golden hour.
This one is ceiling light. If I just change to ceiling light, see how this feels different versus golden hour light? Golden hour is when the sun is going down (or coming up) and everything gets that nice warm glow.
And that is a floating dog toy, just floating in the air, because that's normal. See, this is a time when you might want to just say to put it on the ground and do it again.
Interestingly enough, they don't have a generate-again button here anymore, which they used to. I don't know why they got rid of the regenerate button.
So I would just go back, edit the prompt, and send it again.
The only reason I can think of is that people were generating too many images when it was made too easy, because they realized it wasn't perfect the first time. But notice how many different types of dogs we got, right? Because I didn't say a poodle. If I said a poodle, or a Great Dane, or a brown dog, or a white dog with black spots, now a Dalmatian...
I wonder if you could have a purple Dalmatian. A poodle, right? With two claws that are kind of weird. They've gotten the face and the teeth pretty much right, although there's this random blue spot; I guess he got part of the chew toy on him.
But look at feet and paws and those kinds of things; they tend to not be very good.
Now, if somebody's just looking at it briefly, small like this, it might be passable. But definitely be specific with this one; otherwise you're not going to get the result you want.
So be descriptive. If you're looking for paintings, name a specific art style; for photographs, the lighting. If there's a color palette where you want certain colors, say the colors you want to see. And this is probably going to require more regenerations, trying things out until you get something that's descriptive enough for you. For example, these were some tests back with GPT-4 versus GPT-4o. If I just said, very simply, "hummingbird feeding from a flower," this is what I got.
And if I said "a colorful hummingbird feeding from an intricate flower with a wide array of other hummingbirds in the background; it's golden hour in the forest," now it has a much bigger context, and look at how much more colorful this one is. So be as descriptive as necessary.
And if there are little things, like maybe there are too many hummingbirds here, you can highlight that with the select tool and say remove that, or add to that. And earlier Terrence had asked: can you crop a picture in Photoshop for use in Illustrator, or… When it comes to functions, ChatGPT is not an image editor; it's an image generator. So you can't upload images and say crop this, do that.
You're generating new imagery. You could upload an image as a reference, maybe, to use similar colors, but it's not going to replace Photoshop for things like "take this object out."
Its goal is to generate new images, not to be a photo editor. That's where it's different. If you're using Photoshop, Photoshop builds Adobe Firefly into it.
So in Photoshop, you can select an area, much like ChatGPT's select tool, and say: I have a photograph, let me just retouch this area. That's Adobe's Firefly built into their app.
If you want to use AI in Photoshop, you're using Adobe Firefly. The way you describe things in prompts still translates over, because you describe what you want to see in an area. But in Photoshop, if you highlight an area and don't type anything, it will try to figure out what should be there using AI.
It'll fill in the area with what it thinks belongs. Or if you want to put a dog somewhere, you can highlight an area and say, "add a dog," and it'll add a dog.
Or you can say, "a poodle sitting up, begging for food," and it'll do that. Once again, what you can't do is name a specific person. It'll generate, say, a woman or a man, maybe of a certain age, but you can't say a famous person, or "use this actor doing this thing."
It's just not going to do that. And just remember: this image will never be generated again. If you want it, download it, because even if you run the same prompt again, you're never going to get that exact same image.
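In the ChatGPT interface that just means hitting download. If you're on the API side, the returned URL is only temporary, so here's a small sketch of saving the file locally (assuming the `requests` package alongside the SDK):

```python
# Download a generated image: the URL is temporary, and the exact image
# can never be regenerated, so save anything you want to keep.
import requests
from openai import OpenAI

client = OpenAI()
response = client.images.generate(
    model="dall-e-3",
    prompt="Photo of three skiers in colorful gear, midday sun",
)

reply = requests.get(response.data[0].url, timeout=30)
reply.raise_for_status()

with open("skiers.png", "wb") as f:
    f.write(reply.content)
```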
If you're not familiar with photography, this article is a good resource on the realism-versus-photography part of it: things like exposure and lenses and that sort of stuff. It goes through the different qualities, like slow shutter speed versus fast shutter speed.
So if you want light trails, you'd want a slow shutter speed, a long exposure; a fast shutter speed captures action. If you're not so familiar with photography, this is a great way to go through and see what the prompts would be for different lighting types. Do you want lens flare on something? Do you want studio lighting? Bokeh refers to the blurry background, so you can say "make sure there's bokeh," or you can say something like F4.
These give you some of the terminology of photography if you're not familiar with it, along with examples of different camera lenses, and different artistic styles too. So this can be a good source.
So. And then you said that these can't be copyrighted so we can download and use. Yeah, so you're not gonna be able to copyright this image but you can still use it.
Just know that. There's no issue about using it. No, you can certainly use it commercially.
Just know that you can't copyright it. Because. No, you don't need attribution.
No. So you can do anything you want with it. You just can't copyright it.
That's the only limitation. But you can do whatever you want with it. You can add a photographer to your front, it says.
So, like, to make a photo in the fashion of some famous photographer? Yeah, kind of. Like if you wanted to say:
make a painting of, let's say, a cup in the style of Picasso. Ah, okay, so it couldn't do Picasso. Can you say cubist? Yeah, cubist.
"In the cubist style." That it can do. Some other image generators might take "Picasso's style."
ChatGPT does not. For example, Midjourney does more stuff. And Grok, which is made by Elon Musk's xAI and built into X.com,
doesn't put those safeguards on. You can do whatever you want there. Not necessarily legally, but you can do it.
So other image generators might not have the constraints that ChatGPT does. This does look like a painting, and it is in the cubist style. I'm going to say, "make it a bit simpler."
Because I think it's maybe a little complex; it doesn't break up the objects much. Yeah, that's simpler.
And it does kind of have a painterly style.
So, just to clarify: I just opened Photoshop, and I wasn't seeing this Firefly set up in Photoshop. Where do I find it?
You opened up Photoshop? Sorry, this copy is not updated, so let me see; I would need to update first.
Let me sign in and have it update in the background while we're doing some other stuff. Sorry about that. Thank you.
I'll sign in here. The older versions did not have Firefly built in; you won't see terms like "fill with Firefly."
Basically, you highlight something and a little contextual taskbar pops up. So is it just the Content-Aware thing? Well, no. Content-Aware was kind of pre-Firefly. They don't call Content-Aware "AI," but in a way it was the first AI-ish feature, even though they never officially called it AI.
Content-Aware was more: let me look at the stuff that's nearby and create a fill based on the surroundings. It didn't generate completely brand-new content; it used what was already in the image.
To me, that's the differentiator: Content-Aware Fill is aware of what's around, but it's still using what's in the image, versus generative fill, where you can make up anything you want, not limited to what's actually in the image. Content-Aware can sometimes be more accurate in some ways, because it uses what's in the image at the image's full resolution.
If you're using Firefly, the generative fill (let me just update to this version) will actually generate anything you want, but it's also limited to a specific resolution.
I don't know what the latest is, but it used to be 1024 × 1024, that is, 1,024 pixels by 1,024 pixels. I think they've since upped it to over 2,000 pixels. Because it's generating its own content rather than reusing what's in the image, it produces its own pixels.
That'll update in the background. Okay, let's go back to here. Also: Inception.
Anybody remember the movie Inception? Very good movie, where worlds would bend and interesting things happened.
If you never saw it, it's a cool movie. Unrealistic things happen because, in the mind of the dreamer, things can be changed. The point is, ChatGPT can create things that are not based in realism, because it can make up its own stuff. This image here reminds me of that Inception kind of visual. Creating futuristic worlds or impossible scenes: they don't have to be based in reality.
So you can use your creativity. If you have a story in mind, a children's story, an adult story, a fiction book you're writing, it could be really cool to create fictional characters and scenes. Now, one of the issues with characters is that it cannot reproduce a person across image generations, because it can't make something look like somebody else. That's a real issue with ChatGPT right now: even for a character it invented, it won't keep making multiple images that look like that same person.
Every time, it creates a new person who's kind of like that person but doesn't look exactly the same, because it's generating a fresh image every single time. If you said "a young boy," you'll still get a young boy, maybe he even has brown hair, but it won't be the same face every time. So continuity across images is not something we can really do right now. I'm hopeful that at some point we'll get continuity across images, so we can create characters we can reuse.
I understand it's a tricky thing to solve, because how do you make an AI that knows when it's allowed to copy a person and when it isn't? Like: hey, you generated this character, so you can make more people who look like this; you're not copying a real-life person. Keep the same character, remember them, and generate them again. That capability is actually required for video generation, to keep the same person from frame to frame.
Because if you're generating video, every frame, 30 frames per second, needs to be the same person, slightly moving. So they have to solve it for video. Hopefully they'll solve it for photos too.
Because imagine being able to create an illustrated book with characters that have continuity: this is this character, that is that character. Can we keep continuity with, let's say, a comic book character? There's no continuity of anything, unfortunately; there's no way to keep the same thing across image generations.
So if I try to write a book and I need illustrations with, say, a main character, I can't use this. Yeah, exactly.
When it comes to characters, it's going to be very, very difficult to get image generations that are consistent across all of those, correct. I'm hopeful for someday, but that day is not here yet.
I just don't know when that will happen, or if it will. I'm on a slightly older system here, and I don't want to update this Photoshop and break it mid-class. But basically, in Photoshop you just select an area, and in the little context toolbar that pops up there's a field where you can type a prompt if you want to.
If you don't type a prompt and hit Fill, it fills with whatever it thinks should be there. If you do type a prompt, it fills with what you described. And in general, when you're having Photoshop fill things in, don't give it instructions.
Don't say "remove this" or "add this"; just describe what you want to see. They say: don't give us instructions, just describe the thing you want to see. So say "dog sitting up" or "child fishing," or say nothing and it will fill in what it thinks belongs in the area.
So that's one way to do it in Photoshop. There's also firefly.adobe.com if you want to generate images, kind of like I'm doing with DALL-E. If you want DALL-E-like generation and you have an Adobe Creative Cloud subscription, it's included: you can come in here, describe what you want, and it will generate images for you.
They have design-specific features, like color schemes you can choose. You can give it reference images and say, create another image like this with the same pose. For example, if you took a picture of yourself posing, you can say, make a person with this pose, and it will use your reference image to create another person in that pose.
And you can steer it: say you want an older Korean man in that pose, and it'll make that. Can you only do that in Adobe? Right; ChatGPT doesn't have that kind of design-specific tooling with reference images. Adobe has more of an interface for creating those types of things, because it's more suited to creatives who have something very specific in mind.
Video generation is coming to both ChatGPT and Adobe Firefly, but neither has it yet; at some point. They have to get better with images to really do video well. Now, OpenAI has shown what they call Sora.
Sora is their video model. These videos were created from a text prompt. Those are little flying paper airplanes;
not realistic, but like paper planes flying as a flock of birds. Okay, now this video right here, as it starts playing: this video is not real.
This video is AI. She is really quite good. Now, if you look at the people, they kind of float in a bit of an odd way.
And if you look at their feet, they're not quite right. They're close, but there's still something a bit off.
Text is really hard for AI, yes. That's probably why they chose a scene with Chinese signage: they know their audience is primarily American,
so most viewers won't notice if the characters are wrong. But yeah, you're right. I don't know what's on the side of her glasses there.
At the same time, even though it's not perfect, look at the detail. She is completely computer generated, down to the reflections on her glasses.
They wrote text to generate this. But this part over here is the worst part for me: the way the people move, they're not quite walking along, they're almost floating. It's that uncanny valley, where you feel something's off
but can't quite put your finger on it. And this is an unreleased version one, right? So I'm sure when they actually release it, it'll be better. And then over here, check it out.
Woolly mammoths. This is the prompt they typed in, down below here. They wrote this prompt to get that video.
That you can write text and get that video is pretty darn good. Again, this hasn't been released yet. This feels like a Hollywood movie.
Low-budget aircraft there. Okay, now, I have no idea what that guy just did there. He's putting a piece of metal onto something;
what did you just do? But that aside, to me he looks great. And the camera-wobble thing is just weird; I don't know what he was doing there.
The people are totally AI here; this is 100% AI generated. But I would not know that that's AI generated.
It looks very, very good. If I showed you that video and you didn't know it was AI, you would never pick it out. And this one just looks like a movie.
Cool. That looks like it could be a Pixar movie. The crazy thing is: imagine you're J.K. Rowling before she's famous.
You write Harry Potter, an amazing book, and then you say, I am going to make my own movie, without the need for Hollywood. Imagine being able to do this on your own.
The creativity, and the fact that you can keep all the rewards for your creative work, is kind of mind-boggling. And this one is very creative: a papercraft coral reef.
Look at that; look at how cool that is. Everything is paper-cut.
You'd normally need a Hollywood studio to create 3D graphics like this, and yet they wrote a text prompt: "a gorgeously rendered papercraft world of a coral reef, with colorful fish and sea creatures."
Literally, that one sentence is what they wrote to generate this. So I'm curious when they're actually going to release it, because right now they're just showing it on this website.
This bird, by the way, is not a real bird. It's supposedly a video of a real bird, but they just ran a text prompt. And then these are the coffee pirates, because they're in a coffee cup: the coffee pirates.
And then this guy reading a book. Now watch the book; watch him flip the pages.
There's something weird going on. The pages in that book are not acting normally.
But still: a guy sitting in a cloud, reading, and you wrote a text prompt to get that. And theoretically, this is the worst it will ever be, because they haven't even released it yet. It'll probably be better by the time they release it.
And it should only improve as time goes on, with more and more training. Now, by the way, the first AI-generated video. Which one was it? Oh, where is it? Here it is.
Just so you can see how far it's come from where we started. This will be slightly disturbing. I'm going to turn off the sound.
His face is kind of off; his face is weird. This was one of the first AI-generated videos.
Look at how weird it is. He's eating his own face. That's just weird.
And that was a year ago. So, yeah, they've come a long way. Well, remember that when you're OpenAI, you can remove your own constraints.
This is a demonstration. Just because you can demonstrate something doesn't mean you'll actually release it to the public.
So just look at the difference there.
This is one year of advancement. Granted, it's not perfect, but that's amazing for one year of progress. But you can also see why OpenAI will not release video generation of real people.
Because once they do that, you can't trust anything you see, ever. And even now it's a little debatable, because there are already tools out there that will generate AI video. Same thing with photography.
You now have to be suspicious of any video or photo you see as to whether it's real or not. Now, this is actually the most impressive thing: Runway. RunwayML.
This is their text-to-video generation. This is AI-generated video, not OpenAI; this is another company.
This is crazy stuff; they're on the leading edge. They're working with Lionsgate and other studios, trying to bring this to real, full-fledged film production.
Imagine not having to hire a 3D company, and just being able to do text prompts. They have a feature where you can record a video of yourself and say, create this character doing this thing. Your facial expressions, your movements: you act it out,
and they'll map any scene onto that. They recreate your motions, your emotions, your acting, as any character doing that thing. And they give you camera controls so you can pan and zoom, like a real-life camera:
zoom in, zoom out, go left, go right, change focus, those kinds of things. They're getting a lot of press right now for that. I think video generation is the next big thing, because it will change how we create and consume entertainment.
It'll make creating these things, turning your ideas into something, much more accessible. If you want to play with this a little yourself, there's a tool called Dream Machine, by Luma. I link out to these because, even though they're not ChatGPT, they relate to it.
You can do this for free: sign up for an account with them and try it out.
They'll let you take an image and turn it into video. So, say you create an image in ChatGPT; you take it over to Dream Machine, say "generate a video," and it generates one for you. Right now they're very short,
and they take a while to generate; currently it just produces five-second shots. Do people use this on TikTok and stuff? I have seen some AI videos on TikTok.
A lot of people are talking about the Runway thing right now and making videos about it; I see that. So is this the continuity of character that we were talking about? It is: in this case, they're keeping continuity for a five-second shot.
Now, if you're starting from images, how do you create the multiple consistent images to feed it those shots? That's still a challenge, and it may stay a challenge for ChatGPT.
Some other image creation apps might be fine with, "hey, take this image and change the pose," or something else. ChatGPT has more constraints than some of the other platforms; Midjourney lets you do more than ChatGPT will. ChatGPT is trying to keep you more in line; they don't want you misusing it.
So they limit what you can do more than some other AI platforms.