Beyond The Prompt - How to use AI in your company

Use AI for Stunning (fast and cheap) Product Images: Insights from Salma Aboukar

Episode Summary

In this episode, we explore the world of AI-generated imagery with the incredibly creative Salma Aboukar, founder of CREATE. Salma shares her journey from e-commerce to becoming a pioneer in AI-driven creative production, revealing how she harnesses tools like Midjourney, Stable Diffusion, and GPT to transform product photography. It's a very practical episode for everyone who is looking to use AI to make images for their products, services, and sites.

Episode Notes

In this exciting episode, we dive into the world of AI-generated imagery with the incredibly creative Salma Aboukar, founder of CREATE. Salma shares her journey from e-commerce to becoming a pioneer in AI-driven creative production, revealing how she harnesses tools like Midjourney, Stable Diffusion, and GPT to transform product photography.

Key Takeaways:

  1. Start with a conversation: Engage with GPT conversationally to generate initial ideas and refine your vision for the perfect product image (see the sketch after this list).
  2. Leverage Midjourney for aesthetics: While GPT is great for ideation, Midjourney excels at capturing artistic elements like lighting, reflection, and refraction.
  3. Upscale for photorealism: Use upscalers like Krea or Magnific to add details and make images look photorealistic, especially when working with GPT-generated images.
  4. Experiment with styles: While focusing on photorealism for work, don't be afraid to explore different styles like anime or niche aesthetics for personal projects.
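
A minimal sketch of that workflow, using the OpenAI Python SDK. This is an editorial illustration, not Salma's exact setup: she works in the ChatGPT app itself, and the prompt text and model names below are placeholder assumptions.

```python
# Sketch: ideate conversationally, then generate a base image.
# Assumes the `openai` package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# 1. "Start with a conversation": ask for creative directions first.
ideas = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": (
        "I'm building a skincare brand for men. Suggest three product-photo "
        "concepts: setting, lighting, and colour palette for each."
    )}],
)
concept = ideas.choices[0].message.content
print(concept)

# 2. Turn the concept you like into a base image (crude slice for brevity).
image = client.images.generate(
    model="dall-e-3",
    prompt="Product photo, matte black skincare bottle, " + concept[:300],
    size="1024x1024",
    n=1,
)
print(image.data[0].url)  # 3. Run the result through an external upscaler for detail.
```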

Resources Mentioned:

📜 Read the transcript for this episode: Transcript of Use AI for Stunning (fast and cheap) Product Images: Insights from Salma Aboukar

Thanks for listening to this episode of Beyond The Prompt! If you enjoyed the conversation, please share it with a friend and subscribe to the podcast on your favorite platform.

Episode Transcription

[00:00:00] Henrik Werdelin: Hi, and welcome. It's time for another exciting episode of Beyond the Prompt.

In this one, we're digging into the image generation part of AI, speaking to the amazingly creative Salma, founder of CREATE. Salma is a true innovator in the world of AI-generated imagery. She harnesses the power of tools like Midjourney, Stable Diffusion, and DALL·E to transform the realm of product photography. In this episode, Salma shares her fascinating journey from the world of e-commerce to becoming a pioneer in AI-driven creative production. She dives into her process, revealing how she crafts the perfect prompt to bring her visions to life. She also discusses the future of AI in design and the ethical considerations surrounding this rapidly evolving technology. So grab your favorite beverage, get comfortable, and join us for another great conversation at the intersection of art and artificial intelligence.

[00:00:54] Salma Aboukar: So good to be here. I'm Salma Aboukar. I'm the founder of CREATE. We create photorealistic product photos using AI products like Midjourney, Stable Diffusion, and ChatGPT.

And similar to you guys, my journey started around 2018. I was actually in the e-commerce space selling makeup products, and then around 2020, when COVID happened, I couldn't really create content at speed. So that's when I started learning about 3D design and really fell in love with that. And then in 2021, what started out as an agency, CREATE, started doing really amazing photorealistic product imagery using traditional tools.

And then around 2022, that's really when I started getting serious about what at the time was DALL·E, and then Midjourney. And ever since, I've been exploring how to basically put it into your creative workflow and create amazing images just using tools that are readily available.

[00:01:50] Henrik Werdelin: When people ask you to help them use generative AI, is it mostly for cost or is it for creativity? What do you think is the driver, why might people consider doing this instead of taking a photograph of their makeup product?

[00:02:07] Salma Aboukar: Yeah, well, that's a very good question. So I find there are two types of audiences. The first type is people who want to brainstorm: they would like to have many different creative ideas that they can experiment with. Say, for example, you want to create, just say, a t-shirt or a product.

Instead of, you know, going through the traditional method of ideating, you can just prompt out these ideas very quickly and cost-effectively, right? And then the other segment is people who want to actually incorporate this into their creative workflow. As you guys know, traditional product photography, creating set designs, studios, these take a lot of time, effort, and money.

What these image generation tools do is make it so you're able to basically create very high quality, photorealistic images that are usable, right? And we only really got to that part, I'd say, last year. Before then, it was quite difficult to incorporate them in a meaningful way.

But now what we're seeing is a lot of creatives and a lot of businesses are able to generate these amazing images, which are photorealistic, and they're using them for their ads, for their social media, for their website, and reducing costs to a significant degree.

[00:03:20] Jeremy Utley: The thing that I'm curious about is, can you talk about the process from, call it, initial prompt to usable? What's the journey like to get from, you know, original prompt to "this is production ready"?

[00:03:37] Salma Aboukar: Yeah. So let's give an example of a skincare product. If you want to come up with a skincare brand for men, I like to use ChatGPT, and I would prompt out the base.

So you'd create the base image fast. That's your first iteration. You'd give it your target demographic, you explain the type of branding, the colours, and you would have a conversation, that's how I like to describe it, with GPT, really laying out the idea. And once you've done that, then you would ask GPT to generate a few images around the initial prompt.

And then you actually get those images, and I like to throw them into Midjourney. And when you put them into Midjourney, now you have your set of images that you can actually send to your manufacturer or whichever production team you're working with.

[00:04:28] Henrik Werdelin: Is that because Midjourney renders higher quality imagery, or is it just because it has a better look?

[00:04:35] Salma Aboukar: So what I find is that Midjourney has a better aesthetic, right? ChatGPT has more of a futuristic vibe to it, I'm not quite sure why that is, but I think it's their training model, the way they train it, they train it with a variety of styles. Whereas Midjourney is very artistic, it's very creative, it captures things like lighting very well, reflection, refraction. It even captures caustics, which is basically, if you have, for example, a wine glass, the reflection that comes off it, it really captures the elements in a natural way. Whereas with GPT, it's not there yet for products.

However, having said that, if you were to create, say, a t-shirt, clothing, I do find that GPT is definitely the way forward. One thing to be mindful of, though: if you wanted to create a t-shirt and you wanted it right out of GPT, you'd want to use an upscaler to make sure the fabrics, those details, are captured.

[00:05:36] Henrik Werdelin: So maybe you talk about these as kind of creating the different layers of the image, right? Do you ever render the product itself, or do you tend to have, like, a product shot that you then superimpose on top of this?

[00:05:51] Salma Aboukar: Yeah. So one of the things that we found with products is that these tools don't give you the creative control. You can't actually export a product and put it into a scene. That's actually one of the things that we saw as an opportunity, and we've created a studio. That's where CREATE Studio comes into play: you actually upload your product images onto the studio with your AI background images, and move things around.

[00:06:18] Jeremy Utley: You say, "I'm going to have a conversation with GPT." Can you tell us what that conversation looks like? How long does it last? How much back and forth is there? Define what you mean by conversation.

[00:06:38] Salma Aboukar: So a good use case would be one that I did yesterday. I was trying to create an image for a brand. They wanted a black model who is having coffee in a very luxurious scene. So the initial prompt I gave to GPT was: Hey GPT, this is the brand description. We want a model. Can you only show her arms?

The table needs to be luxurious. The scenery needs to be luxurious. And on the table, can we have a red velvet cake? And also a cappuccino. So when I gave it that initial prompt, it generated something similar, but the model was white. So all I said was: good try, GPT, but the skin colour isn't quite there yet.

Could you make it darker? Could you, you know, make the skin a little bit black? And then it will generate another image. And then I'm like, okay, this is great. Could you maybe make her clothing silk? Because it had generated an image of a model who was wearing a woolly jumper. So I just asked it: Hey, could you make sure she's wearing a golden silk dress?

And then it generated that. So it's just a back-and-forth conversation.
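
For readers who'd rather script this back-and-forth than use the ChatGPT app, here is a rough sketch with the OpenAI Python SDK. The prompts, model names, and the `revise` helper are illustrative assumptions, not Salma's actual setup; the point is that keeping the message history is what lets each revision build on the last.

```python
from openai import OpenAI

client = OpenAI()

# Keep the whole history so every revision builds on the previous one.
messages = [{"role": "user", "content": (
    "Write a detailed image prompt: a Black model, showing only her arms, "
    "having coffee in a luxurious scene; a red velvet cake and a cappuccino "
    "on a luxurious table."
)}]

def revise(feedback=None):
    """Hypothetical helper: refine the prompt in conversation, then render it."""
    if feedback:
        messages.append({"role": "user", "content": feedback})
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    prompt = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": prompt})
    image = client.images.generate(model="dall-e-3", prompt=prompt, n=1)
    return image.data[0].url

print(revise())
print(revise("Good try, but make her skin darker."))
print(revise("Swap the woolly jumper for a golden silk dress."))
```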

[00:07:47] Jeremy Utley: And are you finding there that it's basically the same image, 90 percent the same, but now instead of the woolly jumper you've got silk? Or are you finding what my own experience has been? I say, for example: I'd like two talking heads with thought bubbles coming out of each of their heads.

First image: the thought bubbles are inside their heads. I'm like, this is a great start, but the thought bubbles are actually inside their heads, can we pull them outside? And then all of a sudden, it's not that the thought bubbles have moved, it's a totally different image. So how do you get to the point where you're getting a consistent image, but instead of wool it's silk? Or do you have the same experience I've had?

[00:08:27] Salma Aboukar: Yeah, I've had a similar experience to you. So it does generate a new image, right? But what I tend to do is I always ask it to stick to the previous image. So whenever it completely changes the scenery, I ask it: Hey, okay, this is good. The silk is good. The colour is great, but the scenery is off. Could you copy this image? And sometimes I'll take the image that it initially generated, upload it onto GPT, and ask it to stay close to that. One thing I've found is that if you give it more data, so more base images, it generates something more accurate.

[00:09:00] Henrik Werdelin: Also, you could make a reference saying: I like basically this ad, could you make a similar ad, but with, you know, a model wearing a silk shirt?

[00:09:09] Salma Aboukar: Yes. One thing to note though, is you'd want to put in many base images, the more data. So I like to now put in six base images and then I give it a detail and it captures it closely. If you give it only one or two images, sometimes it doesn't really capture it as close as you want it. It all depends on what you're trying to achieve, but if you want to really like.

Get GPT to understand your vision. You'd want to start off with base images.
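
As a rough sketch of the "many base images" idea: the chat models accept several images in one message, so you can anchor the conversation on your references before asking for variations. The file names and prompt text below are hypothetical.

```python
import base64
from openai import OpenAI

client = OpenAI()

def as_data_url(path):
    # Inline a local reference image as a base64 data URL.
    with open(path, "rb") as f:
        return "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()

# Six references give the model far more to anchor on than one or two.
refs = [as_data_url(f"ref{i}.jpg") for i in range(1, 7)]

content = [{"type": "text", "text": (
    "These are reference images for an ad. Keep this exact scenery and "
    "styling, but describe a new version where the model wears golden silk."
)}]
content += [{"type": "image_url", "image_url": {"url": u}} for u in refs]

reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": content}],
)
print(reply.choices[0].message.content)
```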

[00:09:34] Jeremy Utley: Well, you're obviously an experienced expert. Just take yesterday as the example. I love that brand description, I just wrote down a few notes: black model, only her arms, luxurious setting, red velvet cake.

Okay, so that's where you started. Then: good job, but this is a white person. Good job, but I don't want a woolly jumper, right? So how long does it take to get the base image that you're excited about plugging into Midjourney? And then, do you have a similar kind of conversation with Midjourney?

[00:10:05] Salma Aboukar: Yeah. In that workflow, what I tend to do is, once GPT has sent me the reference images, the base images, I just copy that over. So I copy the prompt over from GPT and send it to Midjourney, and surprisingly, Midjourney captures exactly what I want in a nicer, more stylized way. It depends on the type of content, obviously, but what I find is that for products, Midjourney captures it really nicely.

For models and anything around that, it captures it closely, but not as close. So sometimes I am having to go back and forth. With Midjourney, though, it's not conversational. You'd want to edit the prompts, add things, and move things around, and it's more technical. With GPT, they've done a great job in terms of making it very simple for the average layman, who doesn't know how to use AI models, to just generate images and get started.

The only downside with GPT is that the output images aren't really photorealistic, so you'd want to use an upscaler.

[00:11:11] Henrik Werdelin: I feel we're bumping the question, but I guess that's the point. It is really seldom that I meet somebody who knows so much about this specific area. When you have the backgrounds, I would imagine sometimes the angle of the camera doesn't quite fit, for example, the angle that you have the product at. How do you think about making sure the composition of the different layers fits proportionally?

[00:11:34] Salma Aboukar: So that is the problem that we were trying to solve, and it's what CREATE is for. With CREATE, in this case, you're able to actually move things around, move the camera around, et cetera. But if you wanted an image on GPT or Midjourney, you just have to define the camera. So, for example, front view shot, bird's eye, flat lay: these key prompts, key tokens, work really well.

And it does capture the image within those prompts.
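
A tiny illustration of those camera tokens: the viewpoint lives in the prompt text itself, so you generate variants by swapping one phrase. The base prompt here is made up.

```python
# Camera angle is defined in the prompt, not in a separate setting.
base = "luxury skincare bottle on a marble table, soft morning light"

for angle in ["front view shot", "bird's eye view", "flat lay"]:
    print(f"{base}, {angle}")  # paste into Midjourney or any image model
```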

[00:12:05] Henrik Werdelin: What are some of the things that it's horrible at? What should you not even start to try?

[00:12:11] Salma Aboukar: That is a very good question, because these models have really upgraded. Before, it was fingers and hands. Those are still risky, it does get them wrong.

Sometimes you have a hand with five, sorry, six fingers. And sometimes it messes up eyes as well, GPT does. But right now the models have been trained so well that it's getting most things right, I'd say. And if it isn't, you can always use an upscaler, right?

An upscaler really does bring all those details out and makes it look amazing.

[00:12:47] Henrik Werdelin: Maybe I'll jump a little bit. Quite a few of the people we've spoken to on this podcast don't come from a computer science background. They haven't been in engineering before, and then they stumbled over this tool and got curious about it.

And now it's a big part of their job. Could you maybe talk a little bit about how COVID happened and then how you stumbled into it? And what were some of the points where you went: hey, wait a minute, this is actually going to be something I can make a living on?

[00:13:21] Salma Aboukar: Yeah. So when COVID happened, one of the key challenges was creating content at speed.

I hired a few designers to begin with, and it was taking them very long to create just one image. And, as you can imagine, it was very expensive as well. So I thought, okay, well, if that's the case, why don't I just learn 3D design? And I started playing around with 3ds Max, and I started learning physically based rendering, which is PBR, and learning the details, what components come into play: lighting, which is a big factor, materials, composition. And then it was towards the end of 2020 when I finally mastered glass. Glass was one of the most difficult things to create a photorealistic image of.

And then I, you know, created an agency and thought, okay, well, this is good, and there was so much demand for it at the time. The next challenge was the background images, right? Creating background images using 3D tools is incredibly painful. And I also didn't want to give background images that looked similar to different customers.

So I was already in search of a solution. I was thinking, okay, is there a way to make the 2D and 3D components work together?

You taught yourself the 3D software? Just YouTube and figuring it out?

Really? Yeah, so what I did was I watched YouTube videos, but also one of my designers, he still works with me, he taught me most of it as well.

So we were working together every single day. Blender is quite easy, by the way, so if you guys want to learn 3D design, start with Blender, I'd say. I went ahead and did 3ds Max. And it took me a year to really master some of those components. But really and truly, as I said, the next challenge was building environments, right?

Because I specialize in products, the environment bit was another challenge, and we were outsourcing that for a very long time.

[00:15:15] Jeremy Utley: Do you have a community of practice where you're now sharing? How do you learn new tips and tricks with Midjourney and GPT? Are there places that you go for reference? Are there people that you lean on?

[00:15:31] Salma Aboukar: Yep. So we actually do Spaces each week on Twitter. It's called, like, a generative AI Spaces, and everyone talks about new workflows, new creative tools, how they piece it all together. And I'm also part of Avalanza, which is a Discord channel, and they are very good at prompting Midjourney.

They use a lot of Stable Diffusion as well, and also ChatGPT. So these communities, the way it works is: when I joined, it was last year, around April, I think, there was just literally a back and forth of, you know, I want to create this image, what prompts work better? And there's a support channel where you can just ask questions, and everyone gives their tips and tricks, and then you just combine the pieces together.

[00:16:11] Henrik Werdelin: It seems like you use GPT a lot for the ease of the prompting and the dialogue that you can have, and Midjourney because the design aesthetic is pleasing. Where are you on Stable Diffusion and making your own tuned models and stuff like that? Is that just not worth the effort? What's your thinking on that?

[00:16:30] Salma Aboukar: Yeah, I was playing around with Stable Diffusion last year, before the upscalers, right? And what I used to do was just grab a Midjourney image and upscale it on Stable Diffusion.

Stable Diffusion is really good, and worth the time investment, if you are a brand with quite a big budget and you want to fine-tune a model, a LoRA, for skin, hair, and also your products. If you get it to a level where it's trained, you can just prompt it out, and it will consistently generate images that follow the guidelines you've given it.

So that's the good thing about Stable Diffusion, especially last year, obviously, when the upscalers weren't around, so we were using it a lot more to make images look photorealistic. But right now, if the goal is to make images photorealistic, I wouldn't recommend Stable Diffusion. I'd recommend it only if you wanted to train an actual model.
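
For the brand-with-a-budget path she mentions, a minimal sketch using Hugging Face's diffusers library: load a Stable Diffusion pipeline, then attach a LoRA fine-tuned on your own skin, hair, or product data. The LoRA repo name is a placeholder, and training the LoRA itself is a separate project.

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Base Stable Diffusion XL pipeline (public checkpoint).
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Attach a brand-specific LoRA -- hypothetical repo name.
pipe.load_lora_weights("your-org/your-brand-lora")

# Once trained, the same prompt style yields consistent, on-brand images.
image = pipe(
    "studio product shot of a skincare bottle, soft light, marble backdrop",
    num_inference_steps=30,
).images[0]
image.save("product.png")
```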

[00:17:24] Henrik Werdelin: And since you mentioned photorealistic: are you mostly in the photorealistic world, or do you sometimes try to create backgrounds in a different style, like animation or anime?

[00:17:37] Salma Aboukar: I flip-flop between the two. I normally do realism when it comes to work, but when it comes to hobbies and other things, I flip-flop. I do a bit of Niji as well, and that's where I have to use Stable Diffusion to get those details to really come out.

[00:17:53] Henrik Werdelin: And so, since Jeremy is an expert in ideation: I think you were glossing quickly over what seems to be a pretty creative process, where you have an inner picture of what the model would look like, what she would wear, and stuff like that. Is that just something that comes intuitively to you, or do you have a bit of a process of thinking: hey, wait a minute, I need to have these different elements?

[00:18:21] Salma Aboukar: Yeah, it depends. Sometimes I have a starting point, other times not so much. And when I don't have a starting point, I again go to GPT and I ask it questions. I say: Hey GPT, can you help me? I have an idea, it's quite vague though. And I'll give it some context. If I can think of the last one, I think it was food: a kitchen, food, set design. So I was like, I really want something fun, pastel colours, can you help me generate a few ideas? And then it would give me bullet points of the types of set design ideas I could potentially use. And from there I'd work through it and tell it: okay, give me more.

Or sometimes I even let GPT come up with its own ideas. I say: okay, well, why don't you give me some? And then it generates a bunch of random stuff, and out of the random stuff I pick one, and then I'll refine it from there.

[00:19:17] Jeremy Utley: And what does it know about you? When you say, show me stuff you think I would like, how have you set up your custom instructions so that it knows who you are and what you like? Or is that just based on the history of the conversation?

[00:19:27] Salma Aboukar: For example, I'll say: I like pastel colours, I like cake, I like chocolate cake, I also want something that's fun, pop, think pop colours. Sometimes I'll also give it references, so I might find a reference from the internet, something that I like, and give it a bunch of links.

And then I'll say, could you generate ideas based on what I've just given you? And then it will just come up with its own ideas.

[00:19:51] Jeremy Utley: We've primarily been talking about image generation, but you're obviously quite fluent in ChatGPT, you're AI literate. Can you talk for a minute about other workflows that have been impacted by generative AI, and also potentially personal use cases beyond the job?

[00:20:10] Salma Aboukar: Well, right now I do use GPT internally as well. So I give GPT instructions for task management. If I want to deploy a task for my team members, for example, I'll give it details. I'll say things like: Hey, I'd like my assistant to work on, for example, right now we're working on a tracker. Could you give me a task for Trello? I want my assistant to complete the tracking sheet, to update the Trello board, and to do it by the end of the week. And then it literally gives me the whole checklist, and I just copy and paste that over to Trello.
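
A rough sketch of that task-management prompt as a script, again assuming the OpenAI SDK; the brief and system prompt are illustrative. In practice she does this in the ChatGPT app and pastes the result into Trello by hand.

```python
from openai import OpenAI

client = OpenAI()

task_brief = (
    "My assistant should work on our tracker this week: complete the "
    "tracking sheet and update the Trello board, due end of week. "
    "Write a Trello card: title, short description, and a checklist."
)

reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You write clear, copy-pasteable Trello cards."},
        {"role": "user", "content": task_brief},
    ],
)
print(reply.choices[0].message.content)  # copy and paste into Trello
```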

[00:20:49] Jeremy Utley: Okay, that's great. Can you talk about the impact of that? It sounds like it would be a relief. What are other times where you've found yourself going: oh, I'm so glad I can ask ChatGPT to do that?

[00:21:05] Salma Aboukar: Yeah. So with GPT, other workflows beyond image generation would be things like, as I said, internal team task management and KPI reporting. Also, I sometimes just upload a spreadsheet and say: Hey, could you highlight the things on this spreadsheet that I need to flag up?

And then it tells me the numbers, and I just go in and review them. It's such a relief, because instead of me having to go through a lot of Excel spreadsheets, it flags up the things I should be raising, and then I just have to double-check.
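
And a sketch of the spreadsheet review: she uploads the file straight into ChatGPT, but the same idea scripted might look like this, sending the sheet as CSV text. The file name and prompt are assumptions; for anything sensitive or large you'd want a more careful approach.

```python
import pandas as pd
from openai import OpenAI

client = OpenAI()

df = pd.read_csv("kpi_report.csv")  # hypothetical spreadsheet

reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": (
        "Here is a KPI sheet as CSV. List the rows I should flag for review, "
        "with a one-line reason for each:\n\n" + df.to_csv(index=False)
    )}],
)
print(reply.choices[0].message.content)  # then double-check the flagged rows
```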

[00:21:44] Jeremy Utley: So how many GPT, Midjourney, or Stable Diffusion windows do you have open right now? How many tabs do you have open right now? Is it in the hundreds? Is it in the scores? What are we talking about here?

[00:21:57] Salma Aboukar: It was in the hundreds before the call. I shut everything off because my Zoom started to do some weird things. But normally I'd have at least 50, and then another window has another 50. So it'd all be running in the background the whole time.

[00:22:13] Henrik Werdelin: Where are you on video? Obviously you have Runway ML, you have Pika, you have a bunch of things coming up there. Do you think that's the next step for you?

[00:22:24] Salma Aboukar: So with video generation, there's still quite a way to go, right? The output quality, I find, is good, but it's not great. A lot of people, including myself, when we use video generation models, have to do a lot of post-production. It's almost not worth it, right? The amount of post-production that goes into it takes a lot of time and effort.

However, having said that, I did find a very good tool. It's called LeiaPix, and it adds subtle movements. So if you want GIFs or subtle animation, it doesn't really change the output quality. It doesn't mess up your cans or whatever things you put in there. It doesn't mess up faces.

The output quality is really consistent and really clean, good enough to use anyway. That's the only one I've been able to implement into my workflow. But I think with videos, give it a few more months and it will be different. They have been improving their models at a rapid pace.

So I'm really, really excited to see, and I've definitely been keeping tabs on Runway especially, I've been using that for a while. I did a Barbie satire clip last year just for jokes and giggles, and it was fun.

[00:23:38] Henrik Werdelin: One thing we've been working quite a bit on is something we call neighborhooding, which is basically trying to use AI to create an environment for our customer that makes them feel a little bit more at home, you know, like a local shop would, where they speak your dialect and they know what the weather is like, that kind of stuff. They make you feel at home without being overly personal and creepy, which personalization obviously sometimes can become.

And so I'll give you an example. At BARK we sometimes render a lot of different dog breeds, so that if I come onto the site and I have a golden Lab mix, then I can see more Lab-based dogs. And, you know, let's say Jeremy is a, I don't know, French Bulldog type guy, then we'll, we'll...

[00:24:27] Jeremy Utley: You know, you'll have more Malamute. Malamute, come on.

[00:24:30] Henrik Werdelin: Well, we'll go with that. How are you thinking about, you know, scaling the output, so that you don't just get the one perfect image, but hundreds of thousands of images in the same universe, which might be used for different kinds of flows in a product or service?

[00:24:49] Salma Aboukar: We don't do it to a massive degree, but it is definitely something worth trying. I think we did do a few cats. Not dogs, but cats. We've done cats.

[00:24:59] Henrik Werdelin: Why do the sneaky cat people always come into my life?

[00:25:05] Salma Aboukar: Yeah, it was interesting, because I think it was a friend of mine. She wanted a bunch of images of cats in different scenes and scenarios, and we just generated a bunch, because she wanted to use them as stock images. But yeah, I think there is definitely an opportunity there. I haven't really dabbled too much in it.

[00:25:25] Henrik Werdelin: Yeah, you shouldn't dabble more in the cat stuff. I mean, my mission in life is to make dogs more famous on the internet than cats. When I've completed that, then I'm done. We don't need more good-looking cats.

[00:25:39] Jeremy Utley: Okay, Henrik's question about a bunch of images in the same universe makes me think of this. I saw this tweet the other day from Nick St. Pierre. He said: what gets generated versus what gets posted; expecting perfect results every time you run a prompt is a fantasy. And then he posted this image. It's got to be like 2,500 or 3,000 images, I think, to get to the one that he ended up posting, right?

Yeah. Can you talk for a second about volume? Does this accord with your experience? Because I think everybody expects that image from the outset, right? Nobody expects, maybe it's 10x, it's certainly not 1,000x, right? But talk about the quantity-to-quality ratio.

[00:26:24] Salma Aboukar: I think in order to get an amazing output image, especially on Midjourney, and this is one of the reasons why I've moved a little bit towards GPT, right? But on Midjourney, for example, especially if you have an idea in mind, if you have a vision in mind, then it is worth generating and using Vary: they've got a Strong one, they've got a Subtle one, and they've got creative ones as well that you can use.

And the more output images you generate, the more likely you are to get close to the end result, whatever vision you have. So yes, I've generated, I think right now I'm probably at 50,000 generations on Midjourney. And to get to one amazing image, sometimes I'm having to prompt for at least 30 minutes straight.

And it could be anywhere from a hundred, 200, 300 images. It depends on the prompt as well. My approach is, if I have something that comes close to it, I just edit the prompt. They have a Remix function, and then I edit it, actually write out the prompt. I say add this, and I put image weights on there as well.

So image weights: what they do is prioritize certain elements. For example, I might put image weights on "red dress", and that will tell the model to generate an image that has a vibrant red dress. So that's the approach I take. And what I find is, by the time I have a few generations going, and I run loads at a time and leave them all running in the background, I pull them all up, see which ones I like, and I can upscale the one I like best.
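
For reference, Midjourney's weighting lives in the prompt string itself: a `::` suffix weights part of the text prompt, and `--iw` weights an image prompt against the text. The URL and values below are made-up examples of the "red dress" idea, not her actual prompt.

```python
# Midjourney prompts are plain strings pasted into Discord, not an API call.
reference = "https://example.com/base.png"  # hypothetical image prompt URL

prompt = (
    f"{reference} "
    "editorial photo, model having coffee on a luxurious terrace, "
    "vibrant red dress::2 "   # '::2' roughly doubles this phrase's weight
    "--iw 1.5"                # lean more heavily on the reference image
)
print(prompt)
```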

[00:28:03] Jeremy Utley: For people who don't know the meaning of the word upscale, I mean, imagine that there's someone in the world who doesn't, but it's a word you've probably used 10 times. What does it mean to upscale?

[00:28:12] Salma Aboukar: Okay, so the Midjourney upscale is slightly different to the other upscalers I've mentioned during the call.

Midjourney upscale is the functionality where, when you prompt an image on Midjourney, four images get generated in one go. If you want to select any of the images, you click on the Upscale button, and that just makes that image bigger and more refined. The other upscalers, the ones I use for GPT, actually add the details, and you use an external tool.

Right now there's Krea or Magnific, and what they're using is Stable Diffusion. So before these upscalers were popular, we'd use Stable Diffusion to grab some of those elements and make them better. For example, you have an output image where the jumper is smooth, the fabric is very smooth, unnatural, uncanny.

Then you'd want to bring that to an upscaler. You just basically use the easy settings, the ones that are available right now: you make the resemblance high, you keep everything the same, and you click on the upscale button, and it refines and brings the fabric to life.

So it just adds detail and makes it look photorealistic.

[00:29:28] Jeremy Utley: It's beautiful. It's magical. Thank you.

[00:29:32] Henrik Werdelin: If somebody is listening to this and they have not tried it: for example, some people I meet think Midjourney is hard, because you have to figure out how to use Discord, you have to go into a chat window, you have to figure out that it's not the main channel but that you have to DM the bot, and stuff like that. Am I hearing you right that you're saying: just use GPT for now?

[00:29:55] Salma Aboukar: If this, what I'm seeing is if you. are intimidated by discord, and you are a beginner and you don't want to put in the effort to find the bot and all of the rest of it. To get started, yes, just use GPT and upscaler.

You have to use a very good upscaler though, right? Because GPT comes out very smooth. Everything comes out smooth. It doesn't look photorealistic, right? But a good starting point would be GPT. Plus an upscaler of,~ uh,~ any, any upscalers. If though you can, you know, take maybe 30 minutes of your time and set up your discord bot.

And it's quite simple. All you do is you sign up to mid journey, you go onto the mid journey server and then on the members tab. You select, invite bot to your server, and then you have the bot ready to go. So it's quite simple, but it is intimidating, especially if you haven't used Discord before.

[00:30:53] Henrik Werdelin: This is super fascinating. I think many companies are obviously going to start to use it. What are your thoughts on customer expectations, now that anybody sitting with a laptop can basically make million-dollar photo shoots? Is that just kind of an arms race that we'll all need to be prepared for? Like, are all Instagram ads going to look like a gazillion dollars?

[00:31:22] Salma Aboukar: That is a very good question. Because, going back to that point about Stable Diffusion, if you wanted to generate images at volume, you can actually do that on Stable Diffusion, right? And I think it's going to get to a point where the lines will be blurred.

You wouldn't know, especially for those brands that know what they're doing. I've worked with brands whose output quality, when you see it on Instagram, you wouldn't believe it's been generated with AI, right? So we are getting closer to where things are going to be blurred to a significant degree.

And I think that's probably going to pose some ethical issues down the line. But yeah, I think it's already happening. And I think the brands that have tapped into it right now are obviously slightly ahead of the curve.

[00:32:09] Jeremy Utley: You see it with stuff like, you know, Henrik and I were just texting last night about the whole uproar over Duolingo using AI. And it's interesting, right? Customers have a different perception if a human is doing this versus if an AI is doing it, right or wrong. By the way, I would imagine that a well-run, data-driven company like Duolingo isn't doing it unless the AI is delivering consistently better results.

And yet, somewhat paradoxically, humans are basically saying: I don't want the consistently better option if it's an AI, right? And so, getting to Henrik's question about customer expectations: is there some premium we're going to apply to "this is 100 percent human generated"?

If I think about my favorite albums, for example: I love music, and I'm really drawn to, call it, sixties rock and things like that. All of the albums I love are actually imperfect. Bob Dylan sounds like he's trying to catch up with the beat the whole time. He's never actually on beat, right? And then you learn they only did two takes of that song and thought both of them were garbage, and then they figured, actually one's not, let's put it on the album, right? Whereas now the band does a thousand takes, they're perfectly on cue, and it's so perfect that I don't come back to it.

And yet the stuff that's imperfect, I come back to. So I wonder how we think about imbuing imperfection as well, as we, to use your word, upscale and upscale ad nauseam. Does that trigger anything for you?

[00:33:44] Salma Aboukar: Yeah, that's interesting. Because these images that are being generated with AI, and AI just keeps getting better and better, are definitely shifting the standards, right? It's exceeding standards to such a high degree that now I can't even imagine some of these images coming from traditional photographers, unless you want that type of aesthetic, unless that's what's sought out. Because I've seen it as well: I've seen photography studios that are now putting out human-driven content, et cetera.

But it's going to be an interesting one, because I think it depends on whether customers want that type of imperfection. And I think most of these customers, especially CPG brands, for example, almost prefer the stylized images. They like the creative images, right? And for the imperfect ones, they tend to use user-generated images.

I do wonder, though. Now, with Stable Diffusion, I have seen people generate user-generated-style images. So they're adding the imperfection to satisfy the users and customers. It's so interesting, because now it's all about the imperfections, to make it look, I guess, human-passing, like something that's been created by a human. I don't know how to even describe it anymore.

[00:35:09] Henrik Werdelin: I think that's some of the stuff that we've found. I did a podcast a few years ago with Gabe from MSCHF, which is this amazing art collective, shoe maker, I don't even know what to call it, but they do incredible work. And he talks about what is called brutal design. What he put me on to, and which I now know from BARK's social handles, which have millions and millions of followers, is that if we make it too slick, people kind of don't like it. And so the more we can make it this brutal design, which is what Gabe called it, the more it seems to perform. So I think, Jeremy, what you're talking to, and the way that I compute it, is that it's all about originality; obviously that comes with authenticity, and that comes with emotional connection. And so all this very polished stuff might have a very rapid regression to the mean.

And so, I mean, my big takeaway from having this conversation is also that Salma is kind of like the new photographer, right? What you are doing is directing the model to where it should go. Obviously now it's a data model, not a human model, but you are the one who decides if it should be a red dress or not.

And I think if you just said, hey, make an ad for, you know, a kitchen product, you would probably get, almost per definition, the very standard kind of output. And so you do need the creativity and the humanity to go: no, actually, what would be really cool here was to put the model in a silk blouse and make it really red. And then, you know, all the stuff that you seem to be doing. And so that's probably why it's working, I would theorize.

[00:36:51] Jeremy Utley: All right, Salma, I think we're just about at time. Wonderful to meet you. Huge fan. Talk to you soon. Thank you.

[00:36:58] Henrik Werdelin: What was the thing, what's your takeaway?

[00:37:00] Jeremy Utley: I'll start. Yes, that's fine. Go for it. You start.

[00:37:05] Henrik Werdelin: So, I'm impressed by Salma, who seems to be yet another one of these non-engineers who are now basically the experts in a new technology, and by how she's really taken her clear creativity and understanding of a specific space, creating product shots, and then just had a conversation with the tool, like a craftsman of some kind, so that she can express the visions she has in her eye for whatever she wants to create. And what blows my mind is that she has a lot of Twitter followers, and I've seen a lot of her work and it looks amazing, but it's from somebody who, a few years ago, was doing something completely different.

[00:37:55] Jeremy Utley: Yeah, I'm humbled by her curiosity, her rate of learning, and her willingness to teach herself new tools. The thing that I couldn't help but observe was how using AI in one area, with one particular application, starts to get tentacles into other parts of your life. I loved her talking about managing her assistant's Trello board, for example; it's almost like finding a particular niche then begets other applications and other experiments.

And her saying that she had hundreds of tabs open before the call just shows this technology is so powerful. A lot of times there's a starting point for a user, but once you start there, you start learning some of the conversational techniques and instincts, and it just ends up taking on a life of its own.

I mean, her describing uploading a spreadsheet and asking ChatGPT to flag the areas she needs to review is an incredible example of how these technologies are growing into people's lives. Wherever you can find a starting point, go there, and then the technology itself will inspire you.

And that's it for this time. We very much hope you enjoyed the episode. And if you did, we'd love it if you could like and subscribe, and maybe share it with a friend. Until next time, take care.
