Christian Keller from Meta Superintelligence Labs joins Henrik and Jeremy to explain how world models help AI understand context, change, and cause and effect. He discusses why text alone captures only part of the picture, and how adding modalities like video can provide richer inputs. They explore how AI is now used to help train future models, how research workflows have evolved, and how personal experimentation often uncovers new possibilities.
In this episode, Christian Keller joins Henrik and Jeremy to explain how world models are shaping the next stage of generative AI. He talks through how AI learns using different types of inputs, and why video adds a sense of continuity, change, and cause and effect that text alone does not provide. Christian shares vivid analogies and clear examples to show what multimodal models make possible.
The conversation moves into how AI is now used throughout the research process, from generating synthetic data to evaluating model outputs. Christian shares how this loop is already in motion and how AI is helping scale and accelerate experimentation. He also reflects on the shift after ChatGPT launched, and how that changed the pace and structure of research work.
Later in the episode, Christian describes how individual workflows are evolving, and how asking simple questions like “Could AI help with this?” often opens new possibilities. He shares examples from his own work and home life, including how his wife built and graded her own French exercises using generative tools.
Key Takeaways:
LinkedIn: Christian Keller
00:00 Intro: Information Compression
00:37 Meet Christian Keller: AI Expert
01:13 The Evolution of AI Products
02:11 Impact of ChatGPT on AI Development
02:38 Understanding PyTorch and Its Role
07:41 The Bitter Lesson in AI
09:12 Challenges and Future of AI Models
18:57 Using AI to Build AI
23:25 Innovative Chat Interfaces
23:41 Building the Audos Platform
24:35 Epiphanies in AI Integration
25:18 AI in Entrepreneurial Workflows
26:32 Challenges in AI Integration
31:15 Bias in AI Models
38:06 Debrief
[00:00:00] Christian Keller: So text is by definition already a compression of information. Information is lost. And so using other modalities helps get other information. And oftentimes when you get text and images, or text and videos, you get the context of the text in an image, in the setting, with the colors. When you add the video, you get the temporality of it, what happened before and after.
How things can change, how things can enter the screen or not. And so I think the first point is that LLMs alone, text-based LLMs, will never be able to learn like a human, because they're missing so much information that we've removed, by definition, by putting it in words. Hi, my name is Christian Keller.
I am an angel investor, startup advisor, and currently work as a product lead at Meta Superintelligence Labs. I've been building AI products and models for the past 15 years, and I'm really excited today to talk to you about research and building AI products.
[00:00:53] Jeremy Utley: Okay, so, uh, one of the things, Christian, that Henrik and I love to do is invite our cool friends and introduce them to each other. So, uh, Henrik needs very little introduction. Why don't you tell Henrik, and also listeners who may not know you, why they should be interested in this conversation today.
[00:01:13] Christian Keller: Um, well, I've been working in AI for the past 15 years, so I like to say it was before it was cool. And, uh, I've taken a pretty kind of deterministic approach with my experience, where I started with building AI products myself, thinking through problems that I knew and leveraging AI to solve them.
And then I had some experience in infrastructure around, uh, my work on PyTorch, and understanding what it takes to actually deploy AI at scale, and at the time, how researchers were using these tools. And lately I've been doing, uh, a lot of work in AI research, working on Llama 3, Llama 4, and world models, which I think gives me kind of a good overview of the three main categories I see in the AI ecosystem and what it takes to build successful products in it.
[00:02:05] Jeremy Utley: So obviously your career spans long before the ChatGPT moment. Maybe we could start with that moment as an inflection point. What changed for you, and for the teams that you lead who are building AI systems, when ChatGPT came out? And I'm sure there's one element, which is people's awareness of what you do and understanding somewhat of what you do, but then also you and your team's ability to do what you do in a different way.
I'd love to hear about that.
[00:02:38] Christian Keller: So at the time that ChatGPT launched, I was working on PyTorch. And PyTorch was already quite known in the space of AI, and I was asked, could you just explain shortly what that is for people who don't know? Oh, right. Yeah. So PyTorch is a framework that's used to build AI models.
And so it's the language by which researchers are able to formulate, uh, what an AI model should be and what it should do. And it's a translation layer, in a sense, between these functions that the researchers have in mind and the hardware. So in effect, when you code normally, you might use Python, right?
So PyTorch is an additional layer on top of it that allows you to build AI models. It's an open source tool, built at Meta initially, but it's used broadly now. It's part of a foundation, and, uh, ChatGPT was built leveraging PyTorch. So it's used in many, many other areas.
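The "translation layer" idea Christian describes can be sketched in a few lines. This is a toy scalar autograd, not PyTorch itself: you declare what the model computes, and the framework does the gradient bookkeeping needed for training. All names here are illustrative; real PyTorch works on tensors and runs on accelerators.

```python
# Toy sketch of the abstraction a framework like PyTorch provides:
# describe the computation; the framework derives the gradients.

class Value:
    """A scalar that remembers how it was computed, for backprop."""
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._grad_fn = None  # distributes d(out) to parents

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad
            other.grad += out.grad
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topologically sort the compute graph, then apply the chain rule.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            if v._grad_fn:
                v._grad_fn()

# "Model": y = w * x + b; the framework derives dy/dw and dy/db.
w, b, x = Value(2.0), Value(1.0), Value(3.0)
y = w * x + b
y.backward()
print(y.data, w.grad, b.grad)  # 7.0, dy/dw = x = 3.0, dy/db = 1.0
```

In real PyTorch the same idea is `loss.backward()` on tensors with `requires_grad=True`; this sketch only shows why researchers can think in terms of the function they want rather than the gradient code.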
So what happened was, while on PyTorch, I think what was interesting is that, first of all, people started talking to me about it. You know, oh, is AI a thing? Like, have you guys seen it? And it's interesting because from the PyTorch standpoint, we saw a lot of what researchers were doing ahead of time, and language models were not something new.
I think the technology had been there for some time, and there were many, many teams that were leveraging it using, uh, a key technology called Transformers, which had changed the space quite a bit at the time. So we were seeing a lot of innovation already happening. I think the surprising part for a lot of people was how well packaged this technology was in ChatGPT, and how the team that developed ChatGPT was able to make it accessible to a lot of people, uh, at scale.
So suddenly there was an awareness that wasn't there before of what this technology could do and how well it could be applied. And so I think there was a shift at that time in terms of, um, of how people were using it.
[00:04:32] Jeremy Utley: And why, why was packaging not a consideration before? I mean, is it the classic "engineers aren't thinking about end-user consumption," or what's going on there?
[00:04:42] Christian Keller: I don't know, but I think my, my bet on this would be that oftentimes people were looking at things like assistants. And so you had the Google Assistants, you had the Alexas, you had even assistants we were working on at Meta. But each time, I think, we were considering them more from the perspective of one use case, uh, with one specific deployment, whereas ChatGPT was a way to say, look, we've got this really cool technology, we don't know what people are gonna do with it.
They sort of created the most generic, the most accessible way to do it. And discussion, chatting, is literally, you know, the glue of society, in a sense. So, which probably
[00:05:14] Henrik Werdelin: is a little bit of a big bet, right? Because I think as a product developer you would normally kind of tell people, you need to figure out who's the customer, what's the problem, and then be very specific. Niche.
[00:05:25] Jeremy Utley: Niche down. Niche down. Yeah. Yeah.
[00:05:26] Henrik Werdelin: So in many ways, I mean, do you think they were very purposeful with that, kind of understanding that this is such a foundational technology? This is more like electricity than TCP/IP, so therefore we just need to kind of put it out there and, and basically throw spaghetti at the wall.
[00:05:44] Christian Keller: Well, there, there's two things that are going on, I think. I don't know what they were thinking about it at the time, but it's one of these problems where you've got a really nice hammer and you're looking for nails. Uh, and that's the opposite of the approach you usually want to take when you build products, right?
You wanna find the problem first and then develop the tool that's gonna help solve it. So what do you do in this case? I think, um, what was very interesting in their approach is that the chat interface allowed people to, in a sense, explore what use cases could be interesting. You've seen some more or less formalization, I think, of some of these use cases in the last several years that were not quite clear, um, initially.
And so I think what they did is they created this learning engine, by having all this data from all these users that they were able to capture. And so today you can see the use cases. Um, the way I formalize use cases for generative AI: one could be the UI interface type.
Something that helps you better communicate with, uh, a machine. So for example, the voice interface or the chat interface. It's a different way from being explicit about clicking and typing things. The second thing I see is more around text generation. And so that's initially your chat.
That could be your thought leader assistant. That could be the editor that's gonna help correct your write-up, your newsletter, or whatever you wanna say. So anything that has to do with interactions, text, knowledge creation, uh, whether it's with text and images. And then there's the kind of automation, which is: let's actually take actions and tasks, and not necessarily remove the person from it, but take some chunks of work away entirely that can be done better, or faster, or, even if it's not better or faster, maybe cheaper, and, you know, free people up to do other things that are more valuable for them.
[00:07:33] Jeremy Utley: You know, what you describe about what the field had been doing in creating narrow assistants, versus what OpenAI did when they released ChatGPT in creating a general technology, is what I've heard called the Bitter Lesson. I don't know if you all are familiar with that, but it's what researchers call the Bitter Lesson, and it's been established time and time again.
You can work to train a model very specifically. For example, Deep Blue, the model that beat Kasparov, had been trained by chess masters for the purpose of chess. The models that beat Deep Blue aren't trained on chess. They're just told, play chess, a bunch of times. And the Bitter Lesson basically is the following: any attempt humans have made to train for a narrow task ultimately gets done better by the scaling laws of a general model.
That's been proven again and again. That's called the Bitter Lesson, which is to say, in a way, it's bitter because there can be a sense of exasperation or fatalism. Why bother training it for this narrow thing? Because if scaling laws catch up, eventually a generally trained model's gonna be able to do this better than what I can train to do it.
[00:08:43] Henrik Werdelin: Now, it's a little bit also, you know, as you know, I'm obsessed with, uh, Kenneth Stanley's book, Why Greatness Cannot Be Planned. He's an AI researcher turned entrepreneur. Sold his company to Uber, then became head of AI at Uber, and then went to OpenAI. And he has the same argument. And basically this book is about how you get to AGI.
And his point is that you can't define what it is. You have to have these systems of open-endedness, and then that becomes basically the stepping stones to get there. And so I think I'm fascinated by that. Um, I have a bunch of questions,
[00:09:11] Christian Keller: but one thing I'll say about the chess part is that, uh, ChatGPT does not beat the current best players at chess.
So that's, uh, something interesting to see, which is, there is some level of specialization in the functions that some models can do. And the counterpoint, I think, here is: I think your point proves that there is definitely value in building these frontier models that are very general, both because, I think, they unlock new capabilities that we were not expecting, and because they allow, I think, for the discovery of use cases that we wouldn't even have known would be possible. But we know today how to take these models and make them very specialized, and even better in a way that's more narrow and more efficient to run.
Which is kind of the other side of the coin here. Yeah.
[00:09:59] Henrik Werdelin: Is that because of that generalism? Obviously the models are statistical kind of averages, in many ways. Often when you want to beat a chess master, you need elements of originality, or differences, and the general models don't do that.
And so when you go very purpose-built, is it because either you need to work a very specific workflow, and you don't really want the average making that up, you want something that you know is the best, or you sometimes need basically outlier answers, and you need to train the model to look a little bit away from the regression to the mean?
[00:10:33] Christian Keller: There's two things actually, and you're making a really important point here. The first thing is that I think different model architectures will solve different types of problems in some cases. So the way LLMs are built right now is very much focused on language, or, for omni models, maybe a few more modalities, and they're good at doing that.
They're not necessarily good at planning initially, right? That's not what they've been trained for. They've been trained for finding the next most likely token, or the next most likely word. So there's a question around which models are best suited for what type of application. But the second point here is that there is something to be said about these frontier models once they're post-trained, when they've actually gone through fine-tuning and other rounds of, you know, reinforcement learning, whatnot.
We've reduced the entropy for them. We've kind of tailored them towards very specific types of answers, and very unique approaches or ways we want them to answer some question. And so, when you take a pre-trained model, there's still a lot of variability in it. There's more variation in the types of answers that it can give you.
And so that's an interesting point, because you can take that pre-trained model and then specialize it in different ways, based on the post-training that you're gonna do, to make it, you know, useful for, I don't know, maybe knowledge about chess, or knowledge about, like, math theorem proving.
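The "next most likely token" objective and the "reduced entropy" point can both be sketched with a toy bigram model. The counts below are made-up numbers for illustration, not anything a real LLM learned; temperature is used here as a simple stand-in for how sharpening a distribution lowers its entropy, analogous in spirit (though not in mechanism) to how post-training narrows the variability of answers.

```python
import math

# Toy "language model": bigram counts stand in for learned probabilities.
bigram_counts = {
    "the": {"apple": 6, "hand": 3, "video": 1},
    "apple": {"falls": 8, "is": 2},
}

def next_token_probs(context, temperature=1.0):
    """Turn counts into a probability distribution over next tokens.

    Temperature < 1 sharpens the distribution (less entropy, more
    deterministic); temperature > 1 flattens it (more variability).
    """
    counts = bigram_counts[context]
    logits = {tok: math.log(c) / temperature for tok, c in counts.items()}
    z = sum(math.exp(v) for v in logits.values())
    return {tok: math.exp(v) / z for tok, v in logits.items()}

def greedy_next(context):
    """'Find the next most likely token': just take the argmax."""
    probs = next_token_probs(context)
    return max(probs, key=probs.get)

def entropy(probs):
    return -sum(p * math.log(p) for p in probs.values())

print(greedy_next("the"))    # 'apple'
print(greedy_next("apple"))  # 'falls'
# Sharper distribution -> lower entropy, i.e. more "tailored" answers.
print(entropy(next_token_probs("the", 1.0)) >
      entropy(next_token_probs("the", 0.5)))  # True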
Is there a
[00:11:54] Henrik Werdelin: way that users who are just using the models as they are can use that insight to get better outputs from their generalized models? So for example, if I'm being asked to make an agent that, uh, does HR stuff, or brand stuff, or whatever it is, obviously you'd use RAG, I would imagine, or the context window, to kind of do some of that.
Are there other kinds of tricks to make sure you achieve that non-generalized-ness?
[00:12:27] Christian Keller: I mean, I think it depends what stage of product development you're in. Right now, I'd say most teams I've talked to, uh, in startups and whatnot, they usually start with taking a generic model, the best they can find, and they'll test some prompts, and then they'll generate some system prompts, in a sense, that will somewhat help narrow the model usage. But you're still starting from a model that's been post-trained already, that's been narrowed to following instructions and to having certain language. And so you can move it, but you've got only so much leeway to move it around. Once you validate that, maybe with your product-market fit, and see that it's working, and you want your model to improve, then this is when you take a model that's been just pre-trained, before instruction tuning, for example, and then you fine-tune it.
And so if you're willing to invest in it, you have a whole strategy you have to figure out around benchmark definition, data strategy and acquisition, in order to create models that are gonna be even better at doing these things through that fine-tuning process. The question
[00:13:30] Henrik Werdelin: that I was keen to ask you before, uh, we got on today is: I saw a YouTube video of Yann LeCun, who is from your world.
He was making an argument that I kind of understood, but not really, and so I thought I would use the excuse to ask you. He was saying that these language models have built-in, basically, limitations, and so as we race towards AGI, whatever that is, he felt that we need to start training on video models, or other kinds of multimodal models, I think is how I computed it.
Does that sound familiar? Yeah. Yeah. And can you explain to me why the language models, the text-based models, might not be right, and why suddenly having video would make it better?
[00:14:17] Christian Keller: So I am very much, uh, aligned with Yann, I think, on his message. So let me kind of summarize. There's two main points here.
The first one is around the modality. So text is by definition already a compression of information, right? The example I usually give is, you can go and watch a game at the stadium. There's the crowds, there's the smells. There's the game you see, maybe not as well 'cause it's further out, you know?
But you've got this whole emotion. You've got a feeling, there's a sense of community with the people around you. And then you can watch it on TV. You get maybe that, if you've got a few friends over, but you're missing what's happening in the stadium, you're missing that experience. Then you can read an article about it, in the newspaper the next day. I can tell you, you're gonna have a very different emotional reaction between reading the article and actually experiencing the game in the stadium. So information got compressed. I even tell people, when you read Tolkien, there's a three-page-long description of one tree, right?
You still haven't touched that tree, you still haven't smelt it. So information is lost. And so using other modalities helps get other information. And oftentimes when you get text and images, or text and videos, um, you get the context of the text in an image, in the setting, with the colors.
When you add the video, you get the temporality of it, what happened before and after. How things can change, how things can enter the screen or not. And so I think the first point is that LLMs alone, text-based LLMs, will never be able to learn like a human, because they're missing so much information that we've already removed, by definition, by putting it in words.
The second point, I think, and you didn't mention it, but it's his main point, I think, is around the fact that LLMs, uh, hallucinate, right? They hallucinate by design. Literally, they're just trying to make predictions of the most likely things that are coming next. So they're really good at it. And with post-training we create an illusion of intelligence, right?
Because it seems like that's what I would want to hear, and, um, it makes sense when we're having a conversation out loud, that the LLM totally gets me, you know, gets my problems, and, um, I'm delighted. And that's actually fine for a lot of use cases. But when it comes to using LLMs for superintelligence, or AGI, or however you wanna call it these days, you're gonna wanna solve really hard problems, you're gonna need robustness.
And so there's this second concept of world models, which Yann pushes for and which I'm a firm believer in, which is: we need to create models that can predict a change of state in the world, robustly. Not simply say, look, uh, this is the next likely word. But: I understand that if, in a video, I'm holding an apple, and I stop the video, and then, you know, I say, okay, the hand is gonna open, what's gonna happen? Well, the apple's gonna fall, right? Because there's gravity. Uh, we need to understand that if it's, say, in the space station, that's gonna act differently also, right? And so you need to be able to build models that are gonna help increase this robustness, increase this level of predictability of outcomes, for the models to solve much bigger and harder problems, uh, in the world, to get to superintelligence.
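Christian's apple example is about predicting state transitions rather than next words. The sketch below is a hand-written toy transition function, not a learned world model like JEPA; the state variables and numbers are hypothetical, chosen only to show the "same event, different environment, different outcome" point.

```python
# Toy "world model": predicts the next physical state from the current
# one, conditioned on the environment -- not the next most likely word.

def step(state, dt=0.1):
    """Advance a held/released object by one timestep."""
    s = dict(state)
    if s["held"]:
        return s  # a held apple doesn't move
    s["velocity"] += s["gravity"] * dt   # gravity changes velocity...
    s["height"] += s["velocity"] * dt    # ...which changes position
    s["height"] = max(s["height"], 0.0)  # floor
    return s

def release_and_simulate(gravity, steps=10):
    state = {"held": True, "height": 1.5, "velocity": 0.0, "gravity": gravity}
    state["held"] = False  # "the hand is gonna open"
    for _ in range(steps):
        state = step(state)
    return state

on_earth = release_and_simulate(gravity=-9.8)
in_station = release_and_simulate(gravity=0.0)   # microgravity
print(on_earth["height"] < 1.5)     # True: the apple falls
print(in_station["height"] == 1.5)  # True: it just floats
```

The point of the contrast: a next-word predictor can only say what usually follows "the hand opens," while a world model (learned rather than hand-written, in the research Christian describes) has to get the conditional physics right.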
[00:17:28] Henrik Werdelin: And video is probably the thing that is most available to train on.
But there might be more. Remember when suddenly Google put cameras on cars, and suddenly there's Street View, and it was mind-blowing that somebody would actually drive a car through all the streets in the world? Do you think we almost need to get to a point where you need the Google Street View equivalent for models, to go around and kind of suck in smells and sounds and everything, in order to figure out how to map the world?
[00:17:55] Christian Keller: Well, you know, if you have an iPhone, your iPhone, when it takes pictures, has a mini radar for distance in it, in some cases, depending on the model and whatnot. So you're already getting some of that additional information, beyond simply the image or the video, that can start being recorded.
I don't know if anybody's using it. But we'll definitely need more. What the video adds is the temporality. I think that's a really important one. Uh, and so it's better for teaching cause and effect, in a sense, if you can see things that happened before and after, at least at our scale of the universe.
But we will need, I think, uh, more. But we don't know yet. I think we haven't pushed the research far enough yet around these JEPAs, so, world models, and videos, and others, to understand yet if that might be enough, right, to get to what we consider, you know, AGI.
[00:18:41] Jeremy Utley: Okay, so I'm dying to know the other part of our first question. You could basically describe our interviews as a series of rabbit trails, right?
Going back to the first question: what's changed for you in developing AI models now that we have LLMs? And part of the reason I'm asking this question is, I heard an interview with Zuck recently, where he talked about one of the things Meta is doing is really using its Llama models, et cetera, and tuning them for the development of AI.
And I think there's a kind of broad belief in the developer community that using AI to build AI is kind of, you know, the way to achieve escape velocity, right? I'm curious, not so much from an abstract, existential point of view, but from a practical point of view: for you personally, as a developer of this stuff, what started changing, and when did you feel like, whoa, all of a sudden I'm riding an electric bike, not a regular bike, or, you know, whatever the right metaphor
[00:19:36] Christian Keller: is.
Yeah, no. Essentially, I think it's been continuous. Every time AI improves, we find new ways to use it. But I mean, AI has been building AI for quite some time. The whole post-training process, in a way, has a component to it where we generate, you know, synthetic data, data traces that are generated by AI models, right?
And this is all in the Llama 3 paper or Llama 4 papers, it's all very public. You can see that initially what we used is what we call SFT, supervised fine-tuning, which is, uh, you have humans making annotations, saying, okay, what should the answer be? Let's have a human write it, because a human will know what's the best answer there.
And so you would do that. You train models to try to replicate a little bit what that answer is gonna be, and kind of guide them towards that. But then what we realized is that, you know, you can scale generating a bunch of different solutions with AI itself much faster than with a human, right? Uh, and so we would generate a bunch of data points of outcomes with models, and then we'd say, well, the human is still, you know, the better judge here.
So the human would look at all these different answers and say, okay, this one's the favorite. Let's try to train the model to do that. And, uh, then we're like, wait a second. Wouldn't the AI be a better judge?
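The progression Christian walks through — generate many candidate answers, have a judge pick the best, keep that pair as training data — is essentially a best-of-n loop. In the sketch below, `generate_candidates` and `judge_score` are hypothetical stand-ins for model calls (a policy model and a reward model), not a real training pipeline; the "quality" scores are faked so the loop is runnable.

```python
import random

def generate_candidates(prompt, n=4, seed=0):
    """Stand-in for sampling n answers from a model."""
    rng = random.Random(seed)
    # Hypothetical candidates carrying a fake 'quality' score.
    return [f"{prompt} -> draft {i} (q={rng.random():.2f})" for i in range(n)]

def judge_score(answer):
    """Stand-in for an AI judge (a reward model). Here it just parses
    the fake score; a real judge would rate helpfulness, correctness..."""
    return float(answer.split("q=")[1].rstrip(")"))

def best_of_n(prompt, n=4):
    """Generate n candidates, keep the judge's favorite. The pair
    (prompt, best answer) becomes synthetic fine-tuning data."""
    candidates = generate_candidates(prompt, n)
    return max(candidates, key=judge_score)

prompt = "Explain gravity to a child"
best = best_of_n(prompt)
print(best)  # the highest-scoring draft, kept for the training set
```

Swapping the human out of `judge_score` is exactly the "wouldn't the AI be a better judge?" step: the loop's shape stays the same, only the judge scales.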
[00:20:55] Jeremy Utley: Can we train it? So now, these are all, as you know, they say in literature, in medias res, like in the middle of the action: you're actively developing and realizing, wait a second.
The humans are, you know, to start with, as you said, doing all these annotations. Could AI help with that? And then you go, yeah. And then you go, okay, now humans, it's our job to evaluate. And then somebody has this epiphany: wait, could AI help with that? Is that basically how it's going? People who are fluent and facile with generative AI, in everything that they're doing right now, they end up having an epiphany where they go, wait, could AI help with that? Is it basically just kind of working its way through the progression of, you know, activities in the workflow? I'm lacking the vocabulary here, but you're getting my gist.
[00:21:40] Christian Keller: Yeah, yeah.
No, it's definitely that. I wouldn't say it's that for every single aspect of research, there's a lot of, like...
[00:21:46] Jeremy Utley: I didn't even mean it as a comprehensive claim, I just, I just meant it to be describing kinda the way it works. Yeah.
[00:21:50] Christian Keller: Yeah. No, and definitely. And that's the thing: I think about it in my personal, like, life flow, right?
This is exactly how I use AI. I'm like, wait a second, I'm not using AI for this. Would that be easier and better? And so I ask that constantly to myself, to try to optimize. You know, I have to eat my own dog food, in a sense, right? I have to keep using these things and understand what the use cases are, and, and researchers do that.
[00:22:11] Henrik Werdelin: What's the last use case you've done yourself, where you went like, Hey, wait a minute. That was smart.
[00:22:14] Jeremy Utley: Wait, I could do that with AI? Yeah. I'm dying to know. The last one, this one, this one. Actually,
[00:22:19] Christian Keller: I'll credit my, my wife for that one, because she's learning French right now. And, uh, she was taking this B1-level exam, uh, for her leveling.
She was like, wait a second, I'm trying to do more tests, but I can't find these tests online. And so she just worked with various AI models, you know, Llama, ChatGPT, and Gemini, to create tests for herself, and learning programs. And she didn't like typing, so she would even write everything by hand and just take a picture of the thing. Then it would grade it, but it would grade it because she had provided the model with the online reference for how these things should be graded. And she did some tests to figure out how the known, you know, grading for some specific examples would come up based on her answers.
And they came out exactly right, every time the exact same grade that was recommended. And so it was, uh, it was great, because that allowed her to, one, scale herself, generate a lot of test, you know, data for herself, to learn faster, that wasn't available online. At least she couldn't find it. And so that's an interesting point, right?
Because she found this way to do this with a chat interface. But a lot of innovation and products are happening right now where somebody realizes: look, I've done all this with a chat interface, let me just package it in a product and make it simpler for people to just go and use it. To give a fun example
[00:23:40] Henrik Werdelin: of that.
I'm building this platform called Autos, where we're trying to launch a hundred thousand companies a year. It's kind of like an entrepreneurial platform, and one of the things I saw the other day, it blew my mind. That is, um, the woman that is, she's American Chinese. She was learning, I think French too.
And one of the things that she found out was that people who grew up in different cultures, they hear sounds differently. And so the ton, and so she created this little kind of like, almost like guitar auto tune kind of visualization. So if you want to say, I don't know, she could actually go and she could kind of find the 10 that she had to do to say it correctly in the accent that was correct.
Because it, it just said it like, you know, phonetically or whatever. What she would kind of see and what she would say would seem right to her, but for a French person to be like, this is completely wrong. Fascinating. That's so cool. I think, yeah.
[00:24:35] Jeremy Utley: So Christian, what you're saying, in both your case and your wife's case, is there's this moment where you go, could it do that? You know, kind of an epiphany moment, which I love. And I often tell folks, I think sparking your own imagination is kind of the highest priority, and being in an environment where your imagination can be sparked is very important. Do you have practices or routines or rituals, either for seeing new, uh, use cases and applications, and/or for auditing your own kind of existing workflows with some kind of regularity, to identify what's the next area of your life you should be inviting AI into? Or is it more kind of serendipitous and random?
[00:25:18] Christian Keller: No, no, I, I do, I feel like, two or three different things.
The first one is, any job that I'm doing that's taking me some time, I always will try to apply AI to it. What I noticed is, initially it'll take me longer with AI, but if I commit to it, it'll usually make my life easier after a while, once I've kind of worked out the kinks.
The second one is, uh, you know, having been an entrepreneur before, I think I always look for problem spaces, 'cause all these ahas come from: this is part of my workflow, this is a problem that I'm solving right now, and I need to really make it better, or I need to scale it. In the case of research, I mean, I didn't invent these things, this is what researchers do. But when they are at these moments, it's like, look, I need to be able to scale more, right? I need to be able to get more data. I need to be able to better leverage the humans in the process. I need to be able to do these things. And so it's having a keen eye on the problems, because now we do have this hammer, and we're looking for the nails everywhere.
So looking for these problems constantly, I think, is an important one. But I think the pitfall long term with this is that it helps you optimize existing workflows and existing processes,
[00:26:30] Jeremy Utley: not necessarily doing new things.
[00:26:32] Christian Keller: Exactly. And so I think where you'll see real disruption is when somebody tackles a bigger problem, you know, white page, from scratch, and doesn't try to replicate whatever specific workflow was solving the problem before, but says, okay, I'm starting fresh. What should the workflow be entirely?
[00:26:52] Henrik Werdelin: I think one of the issues right now is that obviously it has to be integrated with the real world, right? You know, I've talked to people that have integrated AI in their organizations, and they feel a little bit kind of deflated, because they don't really see either top-line growth or the ability to be more efficient.
Right. And one of the things that I notice is, take for example a designer that makes dog toy designs, 'cause that's my work. Just hypothetically, just picking an example. And then you say to them, hey, now you can also write the marketing description of the toy itself, right? Because it's easy, they're doing that. Now, that obviously doesn't reduce the workload for the designer. It reduces the workload for the marketing team. Now the marketing team is doing other things, so you can't just, you know, you're not reducing an FTE just because the designer now can do that. And so there's this kind of interesting patchwork that needs to happen as we're putting AI Iron Man suits around people, which needs to get overlaid, and you probably need to have enough of these things happening that suddenly you can think of a new organizational design. Yeah. And so it's interesting, and I think you're truly right in what you're saying, but obviously the issue is that all this has to be deployed against the slowness and the conservatism of
[00:28:05] Christian Keller: humans.
Uh, and you're making an important point here, because it's not just the workflows that change, it's that the roles themselves don't make sense anymore. Like, I think about a developer today, you know, who might be spending a lot of time debugging, looking for things, I mean, finding where you missed a comma. You shouldn't be doing that on your own now. Like, you should definitely leverage it. Yeah. But that means, I'll give you an example for the entrepreneur, right? I always have these ideas that I wanna test out. When I think of an idea now, I'm not gonna create a slide deck or a wireframe, right? I'm just gonna actually build the MVP. I can vibe code it, it's gonna do the design for me, I'm gonna even deploy it, and it'll take me less than 24 hours from the ideation to the thing. And so suddenly, what's the role of my CTO when I start, if you're a technical enough person that you can spin it off yourself? I wouldn't be able to build an architecture design for large-scale deployment of some of these products, but building the initial one I can do on my own now. So do these roles make sense? Is the role of the product manager the same, the engineer's, the designer's? And are these the same across the different verticals you're in? Where it's gonna go is anybody's guess here. So
[00:29:16] Jeremy Utley: I wonder, you know, you're reminding me of three conversations we had. The first point: Qasar Younis, who's the CEO of Applied Intuition, mentioned in our last episode the movement towards, call it, tiger teams or small-team models, almost guilds. You can imagine an enterprise as a guild of a bunch of small teams doing different things, right? That's kind of one thing. The other thing you reminded me of is, uh, our conversation with John Waldman, who's the CEO of Homebase, Stanford guy, who said that the product development process has completely changed there. They used to be reviewing 20-page PRD docs. Now he's reviewing clickable Lovable prototypes. So they're spending far more time in the field than they spend in the conference room reading documents. Right. And then the third conversation you reminded me of, getting back to your question of are we just solving an existing thing versus doing something from a blank slate: we talked to Martin Reeves, who's the head of BCG's Henderson Institute, like their think tank. One of the things that he found, which I found fascinating, was there's very little overlap between the kind of person who is execution-oriented, kind of solving the known problem, and the kind of person who's doing something new. At the very least, per his research, only 3% of people are capable of doing both. And granted, the prior probability of the kind of starting-from-a-blank-page person is relatively small, so maybe, you know, but it's probably more than 3%, I would assume. But the point is, most people are kind of finding incremental opportunities. I gave Martin this hypothesis: you actually need to do the incremental stuff in order to imagine the new stuff. And he said to me, that presumes that the person who's doing incremental is capable of imagining the new, and that's a faulty assumption. That was really interesting to me. So anyway, if you haven't gone into the back catalog, you've now got three killer episodes.
[00:31:08] Henrik Werdelin: Next. This is basically just a pitch for our own episodes, and if you enjoyed this one, right?
Yeah, exactly. Um, let me ask you, just slightly changing gears, 'cause you know, we've been a little bit concrete, now we can go back to the philosophical stuff that we always end up with. So one of my colleagues went to Bhutan a few weeks back, and he did a lesson about entrepreneurship and AI for, like, a, you know, high school or college, I can't remember.
And one of the questions that came up was basically, uh, hey, do you guys use OpenAI or do you use DeepSeek? And, you know, he being an American, he was a little bit surprised by it, because obviously in the US, you know, the Chinese models aren't used as much over there. And so he came back with this interesting thought, which was basically that if we're gonna put AI on top of everything we do, the principles and the values and the thinking of the models are obviously gonna influence how we see the world.
Now, Meta obviously is very big on open source. Your models are kind of unique because everything is very transparent, the weights and stuff like that. So I was curious on your thoughts on that observation: as we're now using models, do we as humans need to be aware that we're looking at the world through a prism, you know, through a specific lens? Because these models will have different kinds of, I wouldn't call them personalities, but they will have different approaches. Or do you think that's a moot point, because the questions in are gonna define the answers out, and so your biases are the ones that are gonna carry the most weight?
[00:32:48] Christian Keller: Yeah, I'll temper the answer a little on this one also. I think the main point I wanna make is that all LLMs are biased, right? It's a question of how you define bias at the end of the day. It's biased because the information we put in is biased, because the internet is biased.
Uh, we, you know, we can filter, try to make it as unbiased as we can, but that's based on what we can measure, right? You can only change what you can measure. So, long story short, I think it's great that there are so many models out there that allow for so many different viewpoints, because, one, we might discover that certain models work better with certain types of applications than others. But I think we gotta go back and look at what these models are gonna be used for. And so 90% of the time, I think, uh, I could be wrong on this one, but I don't think the bias is gonna matter, because it's gonna be around, like, you know, automatically ordering my coffee, or making sure I get the right directions to go somewhere, or understanding how to use a more active voice versus a passive voice in what I'm writing. These things are very objective in nature. They're really about language or about actions or about things like that. Now, where it becomes tricky is for the rest of the use cases, when you use it as, say, somebody to help you think about a problem. You wanna make sure that the information that's provided back to you is either as complete or as objective as it can be, or, you know, based on what you see.
So my point on this is that we're gonna want a lot of different models out there, uh, that can help users, you know, pick and choose based on what they think they get the most utility out of, ultimately. So, uh, I'm not very surprised that maybe in Bhutan they would use DeepSeek more than, let's say, the other open source models that exist from Western companies.
But it's great that we have those coming up now.
[00:34:42] Henrik Werdelin: I think it's such an interesting point. The whole bias conversation obviously has a lot of permutations. I met the woman who at the time ran Google image search, and she told an interesting story, and she's a woman of Indian descent. She says, when you search for a business person on Google Image, what it will do is just statistically say, how many times did an image appear with that word next to it? And it happens to be a lot of white dudes, right? Obviously, you know, she being a high-powered professional, her position is, that's maybe not the world that I necessarily endorse, but it is the statistical model. Now, should I be the one who suddenly changes the model towards a worldview that I subscribe to more? And so I do think it's an interesting example, because it just is what it is, right? But then, she says, sometimes we do, if we see people go, like, "How do I kill myself?", we don't just give the best answer to that, right? We kind of put a disclaimer up there, because that seems to be the right call. And so it's hugely complicated making these decisions. So I agree with you, it's probably great that there just are a lot of models, so you can just kind of go towards the model of your choice.
[00:35:59] Christian Keller: But it depends on the type of model you use. Images are a good example, but even with language models you can test your model for, say, the male-female issue, which is: translate a sentence from English into French. Say, for example, "the doctor is crossing the street." In most cases the model is gonna say "le docteur", a masculine word. There are other types of professions where there's a different word for female and male, and every time the model would translate it as the male. And so you can test like that, also for LLMs, whether or not there is some bias. Now, there are three ways you can address bias, right? The first one is data in: you can filter the data initially and look at what you have. That's very difficult, especially at the scale of the data, but there's still some work being done there.
I think that that's helping. Uh, the second one depends on the type of model. Some models, not all, but some, for images for example, have what's called an embedding space, which is, call it, a vector representation of concepts. And so in this space there's a concept of male, there's a concept of female, and you can test whether, for certain professions, your vector for that profession is too close to the male one versus the female one.
And you can correct for that after the training, in some cases. That's very hard, and it depends on the type of model architecture you've got. And the third one is at post-training, where you teach the model. For example, the example you gave around, you know, "I want to kill myself", you don't want the model to answer that, so you can teach the model in very specific cases. But that's, let's call it, at the end, kind of narrowing down the scope of answers. It's not changing what's underneath, the initial core model.
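Christian's second approach, checking whether a profession's vector sits closer to "male" than to "female" in the embedding space, can be sketched with a toy example. The four vectors below are invented for illustration; real embeddings would come from a trained model and have hundreds of dimensions.

```python
import math

# Toy 3-dimensional "embedding space". These vectors are made up for
# illustration; they are not from any real model.
embeddings = {
    "man":    [0.9, 0.1, 0.2],
    "woman":  [0.1, 0.9, 0.2],
    "doctor": [0.8, 0.3, 0.5],
    "nurse":  [0.2, 0.8, 0.5],
}

def cosine(u, v):
    """Cosine similarity between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def gender_skew(profession):
    """Positive means closer to 'man', negative means closer to 'woman'."""
    p = embeddings[profession]
    return cosine(p, embeddings["man"]) - cosine(p, embeddings["woman"])

for job in ("doctor", "nurse"):
    print(f"{job}: skew {gender_skew(job):+.2f}")
```

In this toy space, "doctor" skews male and "nurse" skews female, which is exactly the kind of asymmetry the correction step Christian describes would then try to reduce, for instance by nudging profession vectors away from the gender direction.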
[00:37:40] Henrik Werdelin: Thanks for, uh, giving that introduction.
It's super fascinating. We do have an interesting other episode where we talk about biases in data collection, with
[00:37:50] Jeremy Utley: Phoebe Young, the founder of Reto AI, which is all about eliminating bias on the training side. So yeah, that's a really cool conversation.
[00:37:58] Henrik Werdelin: We unfortunately are running out of time, 'cause Jeremy has to go to a keynote.
Gotta run, gotta run, gotta run. The professor has to profess.
[00:38:06] Jeremy Utley: Professor Werdelin, what stood out to you out of that conversation?
[00:38:11] Henrik Werdelin: You know what, I'm always so impressed when you meet people with kind energy, who seem just wicked smart, and who are just in the middle of all this. I don't know, I just get highly energized by having this conversation, and I felt I literally could ask him a thousand more questions. So first, I'm just kind of high on being able to talk to people like Christian.
[00:38:32] Jeremy Utley: That's good. You're buzzing. I love it. You're glowing, Henrik. Are you glowing? Do you have the man crush?
[00:38:37] Henrik Werdelin: Yeah, a little bit, but also just warm and sweaty. Um, I was intrigued. I thought it was very fascinating to hear about this idea of the quality of models we can get out of the type of data we train on. And I think his explanation of how text doesn't have the same fidelity as video was intriguing, 'cause, yeah, thinking about it, I didn't quite understand it the last time I heard about it, and now I did. And obviously that other point he made, I found that to be very fascinating. I do think the whole bias stuff was nice to get an explanation on, because I do think, as we talked about on the pod before, we will increasingly have to be good at understanding some of the limitations, or some of the areas where using AI will nudge us in different directions, and then be a little bit more active in kind of understanding that, so that we don't have the same thing that happened with social media happen to all of us. Um, yeah. And so I thought this was a good conversation too,
you know, for me, the thing that
[00:39:51] Jeremy Utley: it's a big theme, I think, on the show: you use AI to use AI. And the more you work with AI, the more you discover ways you aren't using it, and the more you get, you know, kind of triggers of, oh, I never thought to do that. And what I find, or my hypothesis, I haven't tested it, is that those realizations start to increase over time. I don't know about you, but I feel like in the last two weeks I've had more "I can't believe I never thought to ask" moments than ever before. And, you know, we've been in the game for a while, right?
[00:40:28] Henrik Werdelin: I had this just this morning with something. There's something called Google Apps Script, which I don't fully understand, but I use, and it basically has access to my Gmail, right? Which is my email platform. And all my newsletters now come in, and then every morning it looks through all the newsletters and creates basically a summary. And I've been using that for a while. It's been very useful, and it's one of the only newsletters I read every day, basically this kind of meta-newsletter. It was fine, but it was kind of a better idea than I actually found utility out of it.
And so, back to the thing about finding problems: I then took the one from yesterday and went to ChatGPT. I was like, this is good, but it doesn't really do it for me, there seems to be something missing. And so I actually used the model to help me figure out what is the problem I'm trying to solve. Yeah. And then it went, here are the different things: you should basically look at not just the expression of what was in it, but you should see them in the context of some of the way that you compute it, whatever it was, right? Yeah. And so it had these five different things, like a description of the problem, and then it's like, okay, now here's the Google Apps Script code, redo it and make it so I can just copy-paste it back in. And then I put it back in, and it was a completely night-and-day kind of outcome, right? And so, I don't know, when he was talking, I was like, yeah, I could see how you need the problem, and sometimes you can even use AI to help you find the problem.
[00:41:54] Jeremy Utley: No, it's so important. And it just makes me think, you know, and by the way, Henrik, you've been doing this a long time, and you're a world leader and an expert, and it was just yesterday that you thought of it. And the moment you say it, it's like, oh, that's pretty obvious, and yet you just thought of it yesterday. Yeah. And the point there is not to belittle, but to say there are tons of things like this in people's lives, and if they aren't actively realizing, wait, could AI help? You know, we can have this moment where maybe we're self-conscious: oh my goodness, you know, forehead-slapping, I can't believe I didn't think of that.
But I think another step is to be willing to say, "Have you asked ChatGPT?", and that not be an insult. Because the honest truth is, for whoever I'm talking to, if I have the idea, chances are they haven't thought of it. And maybe the kindest thing I could do is be like, hey, you've got an amazing collaborator who's tireless and creative and willing to spitball with you.
Have you invited them in? You know? And I think that does a few things. One, if they haven't thought of it, it gives them that "oh, you're so right." Two, especially if you're a manager, if they have thought of it and they're kind of wondering whether they should, it gives them permission. And three, as a manager, it gives you the ability to now follow up and then incorporate that into your kind of story kit of cool things that you know your people are doing, right? All of that happens if you ask the simple question: did you try AI? Right? And just like we're having these kind of personal realizations ourselves, I think we can actually facilitate these epiphany moments for others if we're bold enough to ask that simple question.
[00:43:35] Henrik Werdelin: Amen. I appreciate you waking up very early in the morning to do this podcast with us in Europe, so thank you so much for that. And I know you have a, uh, keynote. Can you tell us what the keynote is about?
[00:43:50] Jeremy Utley: I am, uh, yeah, I am. I'm talking with a group of private equity CXOs about shifting their mindset in how they collaborate with ai, from treating it like a tool to treating it like a teammate and onboarding AI as a teammate in their companies.
[00:44:08] Christian Keller: That's awesome.
[00:44:09] Henrik Werdelin: Well, with that, I think it's time to say goodbye. Um, and as always, we very, very much appreciate when people share this with a friend, 'cause that's how we grow our audience and that's how we get better people on it. And so if you enjoyed this and you made it all the way here to the end, share it with somebody else. And, um, I was about to say the key word or the code word for this episode is "baguette", but then I realized I was stereotyping so much that I kind of made a complete mess.
[00:44:47] Jeremy Utley: Can we say croissant? Is that better? Croissant?
[00:44:49] Henrik Werdelin: Yeah.
[00:44:50] Jeremy Utley: Anyway,
[00:44:51] Henrik Werdelin: let's make, let's not make it that. Let's make the, uh, pass the, the cutch having could be, could be pie, torch. And with that Byebye, thank you. Bye.