Jonathan Ross, Founder & CEO @ Groq: NVIDIA vs Groq - The Future of Training vs Inference | E1260

20VC with Harry Stebbings
Jonathan Ross is the Founder & CEO of Groq, the creator of the world’s first Language Processing Unit (LPU).
Video Transcript:
We did not raise $1.5 billion; that's revenue. That's actually about 30% of the revenue of OpenAI. Your job is not to follow the wave; your job is to get positioned for the wave. You could almost say we're one of the best things that ever happened to Nvidia, because they can make every single GPU they were going to make and sell it for training, high margin, amortized across deployment, and we'll take the low-margin, high-volume inference business off their hands, so they won't have to sell at either margin. We are growing faster than exponential, and when you are growing faster than exponential, there is no amount of profit you can make that matters. What matters is getting a toehold in the market and becoming relevant. Ready to go?

Jonathan, thank you so much for agreeing to do this in Paris. You look fantastic, by the way; I feel so underdressed. Thank you. I could take the tie off if you want, but I'll never be able to tie it again; I don't know how to tie a tie. No, literally, my chief of staff has to tie it for me, and it's a struggle because he's putting it on himself while he's tying it. I literally only bought this suit recently. Well, you look fantastic. I don't think I have a suit, so you're one up on me. I want to split the show into two parts: first the landscape and where we're at, and then I want to dive specifically into Groq and where you're at. You've announced a massive new deal that I think everyone's slightly misunderstanding, which is what we were just talking about.
I just want to start on where we're at in terms of scaling laws. Everyone says we are at the limits of scaling laws, and yet there seems to be exponential innovation happening with the likes of DeepSeek and others. Where are we in terms of the limits of scaling laws? So, "scaling laws" is a paper that was published by OpenAI, and what it effectively says is that the more parameters your model has, the better it can absorb information. You'll see these curves that they draw, and they're amazing, you should show one if you can, but effectively you have these asymptotic drop-offs where you keep getting better and better, yet you only get a logarithmic improvement when you put a linear number of tokens in. This is why you see people doing 15 trillion tokens of training and whatnot. But they're misunderstood, because the assumption is that all of the data is the same quality. You have a kid now, right? So eventually you're going to be training your kid, and, play along with me here: what's 1 + 1? Two. What's 2 × 3? Six. What's the second derivative of the square of the hyperbolic tangent? Yeah, good question. But that's how we train these models: we give them really simple problems to solve, and then we give them these really hard ones. We don't really train them up; we don't do it smart. So what some people do is train on the dregs of the internet and save some high-quality data for the end, to make the model better. But what you can do, and this is where I think everyone's getting confused, is something like AlphaGo Zero, which generated its own data and trained on it. You could have an LLM generate synthetic data, and when it generates synthetic data, the data is better. You then train on that synthetic data. Why is synthetic data better than real data? Because the model is smarter. Reddit is great, but it's not as high quality as talking to someone with a PhD in the topic. Sure. Just as with more expert people, who are more knowledgeable and more capable, if you have a better model, it generates better data. So you train the model, it gets better, it produces better data. It produces a range of data, and you get rid of all the parts that are wrong, so what remains is the best part. That data is a little better than the model is, because you're pruning it, and you get to do that offline. Then you train the model, the model comes up a level, and then you do this again.
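The generate, prune, retrain loop Ross describes can be sketched in a few lines. This is a toy illustration, not Groq's or anyone's actual pipeline: the "model" here is just a lookup table that otherwise guesses noisily, and the offline verifier stands in for whatever filter removes the wrong synthetic samples.

```python
import random

random.seed(0)

def verifier(a, b, answer):
    # Offline check that lets us keep only correct samples (the "pruning").
    return a * b == answer

def model_answer(memory, a, b):
    # The "model": answers from its training set if seen, otherwise
    # guesses with some noise (imperfect intuition).
    if (a, b) in memory:
        return memory[(a, b)]
    return a * b + random.choice([-2, -1, 0, 1, 2])

def accuracy(memory, problems):
    return sum(model_answer(memory, a, b) == a * b for a, b in problems) / len(problems)

problems = [(a, b) for a in range(10) for b in range(10)]
memory = {}   # the distilled training set so far
history = []

for generation in range(5):
    history.append(accuracy(memory, problems))  # accuracy before this round
    # 1. The current model generates synthetic data (some of it wrong).
    synthetic = [(a, b, model_answer(memory, a, b)) for a, b in problems]
    # 2. Prune: keep only what the verifier accepts, so the kept data is
    #    better than the model that produced it.
    kept = [(a, b, ans) for a, b, ans in synthetic if verifier(a, b, ans)]
    # 3. Retrain on the pruned data; the model ratchets upward on average.
    for a, b, ans in kept:
        memory[(a, b)] = ans

print(history)
```

Because only verified samples survive each pass, the training set is strictly better than the model that generated it, which is the point Ross is making about why synthetic data can beat scraped data.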
And then you keep the better data, train it again, and just keep moving up. When you do that, the actual scaling laws don't look like those asymptotes at all. But there has to be a ceiling on efficiency? No. Does there? So, there's a mathematical limit. If you've studied computer science, you've probably heard of big-O complexity. Big-O complexity says that if I'm solving a problem, I might need more steps with one algorithm than with another. For example, quicksort versus bubble sort: with quicksort I need n log n steps; with bubble sort I need n² steps. What's the difference? If I'm sorting 1,000 numbers, n log n is about 10,000 steps, but n² is a million steps, because it's either 10 × 1,000 or 1,000 × 1,000. One of the reasons these LLMs struggle to multiply large numbers is that multiplication is not linear. These LLMs can do anything linear without needing to think, but, just like working it out on paper, you need room for the intermediate steps.
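Ross's arithmetic here checks out; a quick sanity check of the step counts (idealized, since real algorithms differ by constant factors):

```python
import math

n = 1_000

# Quicksort-style algorithm: ~n * log2(n) steps; bubble-sort-style: n^2 steps.
n_log_n = n * math.log2(n)  # roughly 10 x 1,000
n_squared = n * n           # 1,000 x 1,000

print(f"n log n ~ {n_log_n:,.0f} steps; n squared = {n_squared:,} steps")
```

At n = 1,000 the gap is already two orders of magnitude, and it widens as n grows, which is the same reason a fixed number of forward passes can't cover super-linear problems like long multiplication.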
These LLMs need that intermediate space, those written-out steps, in order to compute these things; it's a mathematical requirement. You cannot train a model enough that it can see any arbitrarily large number and just multiply it. But you can have it memorize bigger and bigger groupings of numbers, in which case it can do the work in fewer steps. Effectively, as you train the model on more and more data, it sees more and more examples, so it just has the answer for more specific situations and doesn't need to do as much reasoning. But it still needs to do reasoning for some of these problems. So what does that mean for the next step? If we have no efficiency ceiling, what does that actually mean? You need both. The training of the model makes it more intuitive: it can just come up with the answer, more stream-of-consciousness. The reasoning part is different; reasoning is the algorithm on top, the big-O-complexity portion. It's system-one and system-two thinking, thinking fast and thinking slow, like Daniel Kahneman's book. When you pair them together, when you make it more intuitive, you get better this way; the improvement from either alone is small, but when you do both you get, "polylinear" is the term, but you could think of it as a geometrically increasing improvement in the model when you combine that improved training with what they call test-time compute, or runtime compute. Totally get that. So just so I understand: when we think about bottlenecks, if we have synthetic data that powers the training, the model gets more intuitive, it gets to the answer more quickly, sort of like a grandmaster in chess just seeing the right moves. Sure. But synthetic data is not constrained on the supply side. If we think about the other bottlenecks, there's hardware, there's energy efficiency, there are algorithmic limits. What is the... But if your job is to get better at multiplying numbers, and I tell you I want you to do it with fewer steps, more intuitively, then to multiply three-digit numbers versus two-digit numbers you need 10x the data and 10x the examples. So as you get better on the intuitive part, you need more examples to train on. Makes sense, totally. So what is the bottleneck then? Is it the hardware quality, is it compute, is it algorithms? Because it's not data. It is the compute, it is the data, it is the algorithms; it's all three of them. But people misunderstand the concept of a bottleneck. Compute has been less of a bottleneck and more of a soft bottleneck: when you provide even more compute, you can overpower the lack of data and the lack of improvement in algorithms. So it's not a hard bottleneck, it's a soft one. But ideally you would improve all three: you would get better data and better algorithms, and the algorithm improvements are going to be there, the data improvements are going to be there, but compute has always been the easiest lever because it's so fungible. If I just give you more compute, it works better. Has DeepSeek not shown that actually we don't need the compute, that you can do more with less? Not exactly. There was an algorithmic improvement there, and the algorithmic improvement, as I explained, is this seemingly silly thing where they just wrote the answer in a box and then knew what to look for, rather than having a human being check it, or something like that. It was very simple, but it was an algorithmic improvement, and it made it easier to generate the data that was then trained on. Can I ask: I think there are misconceptions around compute, data (especially synthetic data, as you said), and algorithms. What do you think the biggest misconceptions are around AI, and specifically around inference?
When we started, the first misconception, which people don't hold anymore, was that training was more expensive than inference. At Google, any time we would train a new model, we would end up using 10 to 20 times as much compute on the inference as on the training, so inference was always the critical infrastructure piece we needed. But past that, now everyone understands inference is important. Do you think they fully do? Because when you look at Nvidia's stock price post-DeepSeek, it was down 15%. If you understood the value of inference, and Jevons paradox and all that, it shouldn't have been down 15%. Yeah, I don't agree that Nvidia's stock should have gone down on that; I think that was a misunderstanding on most people's part. But I think it also shows that everyone keeps saying Nvidia's stock can't possibly go higher, and they were looking for an excuse: "Oh, that's it, that's why we were wrong, and we need to sell now." That's just the popularity-contest side of the market; it had nothing to do with the weighing-machine side of the market. So if a founder is building today, should they build with the assumption that scaling laws will continue, or with what we have today? How do you advise them? I would advise you to build based on things getting better, but I would also focus a little more on the big quantum steps. The analogy I like: if you look at the information age, we had the printing press, the telephone, the telegram, the internet, and smartphones. If you had built Uber back when we only had the internet, it wouldn't have worked: you'd book a ride, you'd go somewhere, and then how do you get home? Exactly. We're in the same sort of space now. The models hallucinate, so it would be hard to build a medical-diagnosis company; it would be hard to build a legal company. However, if you were doing that and the algorithmic enhancements happened that got the hallucination rate down, you'd be perfectly positioned. Just like Groq: we were around for seven years before we had product-market fit. Our bet was scaled inference, that inference was going to be the bottleneck, and that we were going to need to run really big, heavy models. Everyone was assuming you would have a single PCIe card running inference, because training was the complicated part. The reality was we made the right bet ahead of time, and then we were perfectly positioned.
Your job is not to follow the wave; your job is to get positioned for the wave. And that's the hardest thing to do, because everyone is trying to talk you into coming back on shore. Almost everyone was telling us, "Don't do LLMs, they're going to be terrible for you." We're like: this is literally what we built for. Did you ever doubt yourself? Seven years is an incredibly long wait. There was doubt, but there was never a pause. The reason is that even back before starting the TPU, I was concerned that AI was going to be a technology that would allow some people to have outsized control, outsized influence. If you allow that to just happen, in potentially not the best hands, it doesn't matter how rich you are; nothing matters. It's the most important technology. So it didn't matter how hard it got; there was no choice but to be successful. Our goal is to preserve human agency in the age of AI, and if we don't do that, we have failed. So it wouldn't matter whether there was doubt or not, and yes, there was plenty of doubt. There was a point where we were so close to running out of money that we did this thing we called Groq Bonds. You know war bonds from World War II? Of course, but for anyone who doesn't, what's a war bond? World War II was funded with bonds. The US government had these posters, "fund your troops" and whatnot, and you'd buy the bonds, they'd pay you a return, and that funded the war effort. We were very close to running out of money at one point. Rather than trying to pretend to be strong, we were vulnerable with our employees, and we said: we're going to run out of money, and we need you to trade salary for equity. We literally took pictures of the war bonds and put "Groq Bonds" on them instead, and we had an all-hands where we said this. We were worried everyone was going to leave. Instead, about 80% of the employees participated, and I think 50% went to the statutory minimum salary allowed by law. When we finally raised the first bit of our $300 million round, we had so little money left in the bank that it was less than we had saved doing Groq Bonds. So had we not done that, we would have literally run out of money. There were some really hard times, and I know every founder has these. From the outside it's so hard to understand; it's like watching a TV show, you're not in it. But when you are there, everything is 10 to 100 times more intense, because people left their jobs, they left their careers, their families are banking on this, and you have to make decisions like going out there with something like Groq Bonds. What would have happened if we had asked everyone to do Groq Bonds and everyone quit? The shareholders would have said: you have all of these people depending on you. But if you lean toward that vulnerability, people will often go with you on it. So what does a world look like where inference is so crucial, 20 times more important than training? What does that world look like?
I think the simplest way to understand it is to equate an LPU or a GPU to an employee. If you have enough of them, LPUs or GPUs, you can do work, just as with an employee. But it's a little different in the sense that they can't quit and take another job, and you don't have to retrain: once you get a model to a certain capability, it'll always be at least that capable; it's not going to regress. So you get consistency out of it. Now imagine you're a startup, and rather than having to go out and hire 100 people, you hire 10 and buy the amount of compute equivalent to 90 employees. That's a very different way of thinking about the world, because now capex, or in some cases different types of opex, can be used instead of just employees. In terms of inference, just to give you a sense of our scaling: we started 2024 with about 640 chips in production, we ended with over 40,000, this year we want to be at over 2 million, and next year the number is much, much larger. So are we seeing constraints on chip supply? I mean, that is an unbelievable scaling story. Yeah. For us to hit our numbers next year, which I'm not sharing publicly, we're going to need almost all of the capacity of the fab we're using. The biggest issue: Seven Powers. We love Seven Powers, right? Hamilton Helmer. You don't normally think of tech companies as having a cornered resource, but Nvidia has one: they're a monopsony, the opposite of a monopoly, a single buyer, for HBM and the interposer, the CoWoS. So what is HBM? HBM is high-bandwidth memory. And who produces HBM? Sorry for the dumb questions. There are three companies in the world that do this: SK Hynix, Samsung, and Micron. It's a specialty memory, only used in high-end servers, so there's a limited quantity built; it's very expensive to ramp up and a very technically challenging type of memory to build, more so than others. So there's a very limited supply. And GPUs are so fast computationally that if you were using regular memory, it'd be like drinking out of a martini straw; it would just take forever. This is why you see people preferring to do even inference, but especially training, on GPUs rather than CPUs: the memory bandwidth is too limited, and CPUs rarely use HBM; they mostly use regular memory. Our architecture: so, the observation we had when we started Groq. Everyone knows Moore's law: every 18 to 24 months, like clockwork, the transistors double, and double the transistors means double the compute. But we noticed that AI was getting better faster than that. It clearly wasn't the algorithms, because algorithms make discontinuous jumps. It didn't seem to be the data, because there wasn't that much more data. And the transistors were only doubling every 18 to 24 months. So where was all of this capability coming from? It turns out the number of chips was also doubling every 18 to 24 months, so rather than 2x, it was 4x. The question we asked was: if you're effectively going to have an unlimited number of chips, do you do something architecturally different? The answer is: absolutely.
So rather than using external memory, we just use a large number of chips and keep all of the parameters of the model live in the chips, and then we have this pipeline where the computation flows through, sort of like an assembly line. Imagine you were trying to build a factory, but the factory was only 1/100th of the size needed for the assembly line. You'd run a bunch of cars through that 1/100th, tear it down, set up the next 1/100th of the assembly line, and do this over and over.
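The factory analogy can be put into a toy cost model. Every constant below is invented for illustration; it only shows why repeatedly reloading weights from external memory (the tear-down-and-rebuild factory) loses to keeping all weights resident across many chips once enough batches flow through.

```python
layers = 100            # "assembly line" stations = model layers
batches = 1_000         # cars run through the line
compute_per_layer = 1   # time units to run one layer on one batch (invented)
load_per_layer = 5      # time units to fetch one layer's weights from
                        # external memory (invented)

# GPU-style: the chip holds only a fraction of the model, so weights are
# fetched from external memory again and again as each batch flows through.
gpu_time = batches * layers * (compute_per_layer + load_per_layer)

# LPU-style: spread the layers across enough chips that every weight stays
# resident; pay the load cost once, then batches stream through the pipeline.
lpu_time = layers * load_per_layer + batches * layers * compute_per_layer

print(f"reload-every-batch: {gpu_time:,}  vs  weights-resident: {lpu_time:,}")
```

In practice GPUs amortize weight loads over large batches, which is exactly why GPU inference favors batching; the pipeline approach removes the reload term entirely, which is the effect Ross attributes the speed and energy advantage to.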
That's the way a GPU works. LPUs are very different: we actually just have the computation flow through a whole bunch of chips, so rather than using eight chips, we'll use 600 or 3,000 for a model. How does that change energy efficiency? It improves it by about 3x. How does it improve when you use more chips? Because you use less energy per token, even though the footprint is larger. Think of it as the difference between a factory and a backyard garage: the backyard garage is not going to be as efficient, even though it has a lower energy footprint. Another example: if you were trying to transport a ton of coal from one side of the city to the other, would you do it on mopeds or with a freight train? Which would be more efficient? The moped uses less energy per trip, but it needs more trips, and therefore it uses more energy overall. In fact, this is one of the things most people misunderstand: they think edge computing is lower energy. Actually, edge computing is less energy efficient than computing in the data center. Why is that? When you're computing in the data center, it's a little like that freight train: you get to do a whole bunch of jobs simultaneously. And the fact that we don't have to read from external memory means we don't have to spend the energy doing that. Even with GPUs you get to batch, but going back to why it's so energy efficient: the energy used in a chip comes down to physical wires. The wires have a width and a length; you charge a wire up to set it to a one and discharge it to set it to a zero, which is like charging and discharging a capacitor, and that uses energy. The longer the wire, the more charge. When you have HBM here and another chip over there, you're having to charge a wire between the chips and then discharge it every time you send a bit. That's a long distance to travel, and the wires between chips are also wider than the wires inside the chips, so you use a lot more energy. When we keep that memory in the chip, the data only travels a small distance over much thinner wires, and therefore uses a lot less energy. So do we see a world of combined LPU and GPU usage? How does the distribution look between LPU usage and GPU usage? There are a couple of things. The first is that training should be done on GPUs, and I actually think Nvidia will sell every single GPU they make, for training.
Right now about 40% of their market is inference. I think if we were to deploy a lot of much-lower-cost inference chips, what you would see is that the same number of GPUs would still be sold, but the demand for training would increase, because the more inference you have, the more training you need, and vice versa. The other use case: we're actually so much faster than GPUs that we've experimented a little with taking some portions of a model and running them on our LPUs while letting the rest run on the GPU, and it actually speeds things up and makes the GPU more economical. Since people have already deployed a bunch of GPUs, one use case we've contemplated is selling some of our LPUs to, sort of, nitro-boost those GPUs. This is my question: people have bought GPUs so far ahead of time that by the time they're delivered and installed, they're almost out of date. Actually, we've spoken with some customers that put orders in over a year in advance, paid a year in advance, and still haven't gotten them. The recent deployment we did in Saudi Arabia took 51 days from contract to the first tokens being served in production, in country. How are you able to do it so quickly? 51 days is astonishing. Part of it is that architecturally things are much simpler for us. We don't have a bunch of other hardware components; we don't use switches to communicate between our chips, we just plug our chips into our chips, our chips are the switch, and we don't have all of this network tuning. Think about it this way: when you're going across town in France, how long does it take to get from one side to the other? A long time. A long time, but a variable amount of long time. If you do it in the middle of the night, it might be fast; if you do it in the middle of the day during an event, like the AI Summit we've got going on, it's terribly slow. Exactly, but it's unpredictable. However, certain kinds of transportation, like trains, can be predictable, and with what we're doing, it is 100% predictable. Given the energy efficiency and the predictability, why is Nvidia not being more proactive on LPUs? What makes you think they don't want to be? They don't talk about it. Well, why would they talk about it? That would be like talking about something you don't have when you're trying to project strength rather than vulnerability. Well, I think if you wanted to protect shareholder value and protect a Wall Street image of dominance, of being ahead of the game, you'd at least say, "Oh, of course we're working on that as well." But then, until they had LPUs, they would effectively be exposing that something is missing. If you look at the last GTC, there was an announcement that the latest GPUs were 30x faster than the previous generation. When you look at how that was done, there was a curve that looked like this, it basically ended here, and then there was another curve like this; that 30x was measured from the end of this curve to that curve.
If you had measured it here, it would have been less than 30x; if you had measured it there, it would have been infinite, so their chip is infinitely faster than the previous one, but that wouldn't have sounded reasonable. There's a history in this market of specsmanship, because it's so hard to get access to chips, and this is a lesson on enterprise sales. In enterprise sales, people rely on specsmanship: my specs are better than your specs, my chip is faster than your chip, I get more teraflops per second than you do. But who cares? Just tell me what the tokens per dollar is, and tell me what the tokens per watt is; nothing else really matters. People will find all these other weird things to measure that they might be better on. It's like selling you a car with better RPMs; RPMs don't matter. What matters is miles per gallon, and maybe the speed you can drive, although speed limits kind of render that moot. In enterprise sales, well, there was a time when the way you would market soap was billboards saying "our soap has more bubbles" than the other brand's soap. Who cares? What they figured out was: let's put really happy people up on a billboard after they've used the soap, and maybe people will associate that happiness with it. Lifestyle marketing, sure. For some reason, enterprise still hasn't learned this lesson. It's still "we have more bubbles": we have more teraflops, we have more whatever, things people literally don't care about. So you think Nvidia's "30 times faster" is not good marketing? I think it worked, because it's what people are used to. But our counter was a press release that said: "Groq: still faster." That was it, and people went gaga over it, because it was just: we're still faster, so who cares? I totally get that. Do you think Wall Street understands it that way? I think they're starting to. But again, I don't think there's real competition here. I think if you are competing, you have done something seriously wrong. If you're competing, it means you haven't found an unsolved customer problem, because if you're competing, someone else has already solved the problem, so why are you spending time on it? So you don't view Nvidia as a competitor? No. They don't offer fast tokens and they don't offer low-cost tokens; it's a very different product. What they do very, very well is training. They do it better than anyone else, and by such a wide degree that it's a solved problem. Why would we bother trying to solve a problem that's already been solved?
So you cede the training market to them, and you'll own the inference market. Yeah. And they're saying they also want the inference market. Of course; that's the way it always works. So what do we do now? Now we are competing in the inference market. But are we? We don't really have people saying, "We're going to buy GPUs instead of you." We do have people saying, "We're going to buy both." That happens, but we don't care. I showed a demo to someone and he asked, "Should we just not buy any more GPUs?" I said no, you should buy every single GPU you can get your hands on. He looked at me very perplexed, and I said: well, how are you going to do training? We don't do training. Buy the GPUs, get every single one you can, because I want your models running on us to be really good. But for inference, they don't need to buy Nvidia anymore; they don't need to buy GPUs for inference. If you can get them, I mean, they're a little expensive, but if you're used to it, why not; plenty of people still sell mainframes. But if you want lower cost and faster, then you want an LPU. How much lower cost? More than 5x lower. More than 5x lower: just the memory alone in the latest GPUs costs more than our fully loaded capex per chip deployed. And on top of that, we talked about the energy efficiency: we use about a third of the energy per token. Over a three-year period, one-third of our cost is the opex, which is mostly energy and data-center rent, and two-thirds is the capex.
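The cost claim here can be laid out explicitly. Normalizing Groq's total three-year cost per token to 1 (these ratios are Ross's claims from the interview, not independent measurements):

```python
groq_total = 1.0
groq_opex = groq_total / 3       # mostly energy and data-center rent
groq_capex = 2 * groq_total / 3  # two-thirds of the cost is the hardware

# Ross: Groq uses ~1/3 the energy per token, so for the same tokens the
# GPU's (energy-dominated) opex is ~3x Groq's opex...
gpu_opex = 3 * groq_opex

# ...which already matches Groq's entire cost, before the GPU's own
# capex is even counted.
print(f"GPU opex alone = {gpu_opex:.2f}, Groq capex + opex = {groq_total:.2f}")
```

So under his numbers, running an already-paid-for GPU costs as much per token as buying and running the LPU fleet outright.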
Which means that, since we're at one-third of the energy, the cost to run a GPU to produce the same number of tokens for inference equals our total cost: just the opex for the GPU is the same as our capex plus our opex. Why is 40% of their revenue inference, then? Why have you not taken much more of that? At the beginning of 2024 we only had 640 chips; at the end we had 40,000. We're not at that scale yet. You have to provide quality, you have to provide low cost, you have to provide speed, but you also have to provide capacity. And this is where the most important part of not using HBM comes in: it means we effectively have no scale limits. Our chip is actually manufactured using the same process used for your mobile phone; the same silicon that's in your mobile phone is the same silicon in the GPU. In fact, they build the mobile-phone chips first, because they're smaller, so they yield better; Nvidia actually gets the process after Apple. The difference is that memory. That's the only difference, but that memory is the hard part to manufacture; that's what's limited in scale. By avoiding it, we effectively have almost no limit on how much we can scale up, and that's important for inference. What is Nvidia's margin? 70 to 80%. 70 to 80%, so they could take 70 to 80% off and be radically more cost-effective compared to you; they could destroy your margin. But why would they? In that same vein, you could almost say we're one of the best things that ever happened to Nvidia, because they can make every single GPU they were going to make and sell it for training, high margin, amortized across the deployment, and we'll take the low-margin, high-volume inference business off their hands, so they won't have to sell at either margin. What's low margin? Anywhere from, depending on the deal; we do get some on the back side, but up front it's about 20%. About 20%. Okay, so theirs is 80 and yours is 20, but then you're looking at... But then we get more later off of it, so we take some of the risk. What do you mean, you get more later? So in the deals we do, because we don't spend money on our own capex, the partner puts up the money for us to deploy. We pay them back with a decent IRR; we split the proceeds, and most of it goes to the partner, and then once we hit the IRR, it flips the other way.
So others are putting up the capex for you. What does it look like at the end, then? It's not like other business models. We didn't just innovate on the chip; we also innovated on the business model. We're limited in how much money we can make by how much we can deploy, not by how much money we have, because the partners are putting that money up. So when I look at what we can do, it's all about how much we can scale. What are the limits to your deployment? Is it purely chip constraints? Mostly. You asked about misconceptions in AI; I think one of them is about power. It is true that there's a mismatch in the market between people with chips and people with power, but that's partially because you need a data center in the middle, and there aren't enough data centers. Those aren't the hardest thing in the world to build; they're not easy, but they're not the hardest thing. It's harder to build up the power. However, because of that mismatch, you have big hyperscalers going around saying "I need a gigawatt of power," and they'll say this to 60 different potential data-center builders. Then all of a sudden you hear this echo: "Well, I heard there's a gigawatt here, and a gigawatt here, and a gigawatt here," and all of a sudden there are 60 gigawatts of demand, all an echo of that first gigawatt. The thing is, I'm aware of about 20 gigawatts of power that people want to make available for data centers. Right now there are about 15 gigawatts of data centers worldwide, so that's more than double the current capacity. The concern I have is that people are now building up more power, and what's going to happen in the next three to four years is people are going to say: "I built up all this power and no one's using it; this was a complete waste, and we're never going to do this again." Then what happens? Remember that doubling of chips every 18 to 24 months: keep doubling that 15 gigawatts a couple of times and pretty soon you're talking about 120 gigawatts. There isn't that much power available.
then another one after that now you're at 240 and so what's going to happen is we're going to overbuild slightly right now just because of that mismatch and the miscommunication that's going on right now and then we're going to dampen our building and we're going to you know close down on that and then we're going to have the real need for the power that's my big concern right now because that that power will become a hard bottleneck in three to four years okay just so I understand why will we have that data over data sent
to over Supply when we are moving into world of inference which will be 20x larger than training so the problem with data centers is everyone thinks that data centers are real estate and a lot of people do real estate data centers are not real estate um the common joke in the industry now is someone says I'm going to have you know 100 megawatts of capacity for you and I'm going to have it in three months are you willing to sign and then you ask a question like well um what's your up time and they're like
"I don't know, whatever the power grid is." "Wait, what? Where are your generators?" "Oh, I haven't ordered those, I'll order them now." "You know that there's a 90-month lead time on generators right now?" "Oh, really?" And then the next: 90, 90, 90. And then the next question is, "So where are you getting the water from?" "Wait, data centers need water? I thought it was a bunch of chips. What do you mean, water?" So there's a bunch of people who have no idea what they're doing going into it, because they think it's real estate, and so those people are now building an over-supply of data centers. But they're not really building them, so they're fake data centers that people think are real. What happens to those data centers? Because they're not going to be utilized, are they? Amazon is not going to pay for a data center that doesn't... Well, Amazon doesn't fall for this; Amazon has really good people. Whoever the buyer is is not going to pay for a data center that's got no water or got no power. Yeah. And so is it just wasted? From your... these projects will never be developed? Will we build them fast enough? You said, about the over-supply... It does take time to build a data center, and it's almost okay. If you train a model, you really want to amortise it over about six months. If you deploy chips, you really want to amortise them over three to five years; we're more on the three-year side, others are more on the five-year side. If you build a data center, you're probably talking 10 to 15 years, and a power plant, you're talking 15 to 20 years. So the problem we have in the industry is this mismatch between the financing and the needs. You have someone who wants to train a model: they're going to be doing this for six months, and they don't understand why people want three-to-five-year commitments on the chips. And then the people deploying the chips don't understand why someone wants a 15-to-20-year commitment on the data center. Right, it's at seven years now on the data centers. And then the people building the data centers need a long... Seven-year commitment? Yeah, that's the kind of thing they're asking for. So you've got this complete mismatch throughout the ecosystem. But the funny part about it is, while they all want to take zero risk and have a committed, sovereign-wealth-level credit rating on the other side of it, with long commits, the longer the payoff time, the more generic the infrastructure is. A model has a pretty specific use, but accelerators like LPUs and GPUs can be used for other things besides generative AI or LLMs. The data center can be used for other things besides the accelerators. The power can be used for anything. So while they're looking for the least risk over here, it's actually the place where there is the least risk, because if we don't use it for AI, we'll use it to power all of the electric cars. So is this a case where incumbents win, because they're one of the only ones who are able to match the durations required
by data center providers? Well, and this is why we've partnered with Aramco and this new entity in Saudi Arabia: because they have an enormous ability to fund this over the long term, they have a very long-term perspective, and they have an amazing credit rating. I mean, listen... So when you say they have an ability to fund it... and this is why the misconception: people think it's a funding round of a billion and a half. It's not a funding round of a billion... No, we did not raise 1.5 billion. That's revenue. That's actually about 30% of the revenue of OpenAI. Can you just walk me through how that deal is structured? Yeah. So, we started off last year, and we got to 19,000 of our chips deployed; we did that in about 51 days. And the question was, what could we do this year? So they've gone off, they've collected up a bunch of power in the country, and the deal is structured so that they will put up the capex for us to deploy our chips in that data center, or those data centers, and we pay back based on the money that we make. So it's a little bit different from debt, in that they participate in the upside, but it's similar in nature. But it is revenue, because we actually make profit up front. How does that change what you can do? Well, we are not limited by capital anymore. And one of the unique reasons we can do this... there is one misconception around Groq: there was a paper that was written that said that we couldn't be profitable while being lowest price. We could charge more, but actually we have a very positive contribution margin right now. And as far as we know, we're the only ones that are actually making money running these open-source models, because with the open-source models everyone's sort of competing with VC dollars, trying to take market share, Uber style. And meanwhile we're sitting here going, we could do this all day long, because we're making money, and we're able to even pay off an IRR and make our partners money.
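As a side note, the payback mechanics described here (partner funds the capex, takes most of the cash flow until it clears an IRR hurdle, then the split flips) can be sketched as a toy model. All numbers below, the capex, the revenue, the 80/20 split, and the 10% hurdle, are invented for illustration only; they are not Groq's actual deal terms, just the general shape of a flip structure.

```python
# Hypothetical sketch of a partner-funded capex "flip" structure:
# the partner fronts the capex and takes the majority of cash flows
# until it has earned a target IRR, after which the split reverses.
# All inputs are made-up illustrative numbers.

def deal_waterfall(capex, annual_revenue, years, hurdle_irr,
                   partner_share_before=0.8, partner_share_after=0.2):
    """Return per-year (partner_cash, operator_cash) tuples."""
    partner_total = 0.0
    flows = []
    for year in range(1, years + 1):
        # Simplified hurdle test: partner is "made whole" once its
        # cumulative cash matches capex compounded at the hurdle IRR.
        target = capex * (1 + hurdle_irr) ** year
        share = partner_share_before if partner_total < target else partner_share_after
        partner_cash = annual_revenue * share
        partner_total += partner_cash
        flows.append((partner_cash, annual_revenue - partner_cash))
    return flows

for year, (p, o) in enumerate(deal_waterfall(100, 60, 5, 0.10), start=1):
    print(f"year {year}: partner {p:.1f}, operator {o:.1f}")
```

Under these made-up inputs, the partner collects the larger share for the first four years, and the split flips once cumulative payments exceed capex compounded at the hurdle rate.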
The difference here is that there's another part of the model: we're also working with some proprietary model providers. We actually showed off the first one at LEAP on Sunday, where we did a voice model with PlayAI. That one is also a rev share, but the thing is, they get to make money off of that, whereas most others in the industry are losing money because of the commoditization of the models. So do you have cheaper pricing over time, as you bluntly have less monopoly power, or do you have higher prices as your monopoly increases? Well, we want the margin to stay about the same, but we want the prices to go down, because then we get into Jevons' paradox and life gets great, because we're going to scale, and our focus is on getting to scale. To preserve human agency in the age of AI, we need to be one of the most important compute providers in the world, and our goal by the end of 2027 is to be providing at least half of the world's AI inference compute. We think we could go further than 2x, given that we don't have all the constraints, but in order to get there we do need to be very aggressively building out, and we need to give people no excuse, by charging extra, for not running their models on us and using the models that are on us. And what I keep telling the team over and over again, because you have to remind them sometimes, is: we are growing faster than exponential, and when you are growing faster than exponential, there is no amount of profit that you can make that matters. What matters is getting a toehold in the market and becoming relevant. What would prevent that? We used to be worried that someone would try to price below us, and then we realized that wasn't a concern, because there's so much money going into this that people are going to want to lose less money by running on us. So that isn't a concern, but that was the big one early on, until we realized that... When we see Zuck investing $65 billion in data centers, what does that actually mean? That means
he's internalizing all of the margins that he would have had to spend on data centers with the providers that we mentioned earlier, and Facebook is doing full stack. What does that mean? So, Meta is doing 65 billion a year, I think Google said 70 or 75, and Satya said Microsoft's doing 80, and then you've also got Stargate. These are crazy sums of money, and this is all for data center building? No, it also includes the stuff that goes in: it includes the chips as well, the systems, everything. Okay. We've never seen money like this. No, there's never been anything like this. But there's also never been a case where it was so clear that there was going to be value at the end. If you knew how successful search was going to be... Remember, Google stayed private as long as they did because they were afraid that Microsoft would figure out how much money search was making and would try to replicate it. And the moment that they went public: Bing. They called that perfectly. Everyone knows how much money there is in AI, so everyone's going after it. Do you think that value is distributed amongst many players, or concentrated towards one or two? I completely agree with you in terms of the clear value, but when it's assigned, is it distributed somewhat evenly, or concentrated? It's a power law, and the more value there is in the economy, the more risk there is of a single entity being so far on one end that they just dominate. You see this with the Mag 7, and it's predictable: the bigger the economy gets, the more you will have big swings in the economic outcomes. Right now the hyperscalers are all sort of even in their market caps. It's strange; you would expect one of them to just be killing it and taking it much further, and so I don't understand why they're so closely grouped. So when we think about that distribution, how do we think about changing it? Obviously with Groq, you want to be one of the Mag 7, you want to be one of the most important companies in the world. How do you see that? So, the
way that you get there and the way that you stay there are two very different things, and there's sort of a circle of life that happens in startups. The first stage is: solve an unsolved problem. That's how you go viral, that's how you do well. The second stage is the marketing stage, which is: now other people are trying to copy what you've done, because they can't think of something themselves, and now you have to fight it out in advertising and marketing and whatnot. You see CPG companies often get stuck there, and it becomes more about where on the shelf they are than anything else. And then the final stage is the seven powers: it's once you've found some of those and you've really started improving, and you have sort of systemic advantages. And then what happens is someone solves an unsolved customer problem, and the whole circle of life continues. Right now Google has to redo this, because LLMs are better than search. So, the way that you start off to become a Mag 7 is you solve that unsolved problem. The way that you stay there is, first, you find one of those seven powers, or multiple, but then you have to be ready for when you get disrupted, to continue fighting back and solving customer problems.

Can we... we mentioned the huge amounts of money being spent here. Is this a good bubble, that bluntly lays the foundations for an incredible next 10 to 20 years, where bluntly the capital actually turns out to be productive, but not seemingly so on paper? Or is it one where actually a huge amount of money is incinerated on depreciating assets? I can guarantee you that a huge amount of money will be incinerated, but I also bet that in total more money will be made than will be put in. And this is the problem: you have to look at it either in aggregate or as individual bets. When everyone is making investments in the market, some people are going to lose money, because not every company is going to be successful. So what you always see is, when there are some real tech improvements or things coming, you've got the things that were early, that people are investing in heavily, that are super successful, and then everyone else wants to get in on it, and it goes from, you have AI chips and AI models, to now you've got AI t-shirts, and next thing you know you've got AI thermal grease. People just start applying AI to everything; next thing you know you'll have an AI condo. Sure, yeah. And so the trick is discerning what is real and what isn't. You're always going to have all of these really obnoxious charlatans coming in
whenever there's something real, and that's unfortunate, but eventually they get cleared away once people start to understand the technology and what's real and what isn't. And so the job is to start educating, and the more educated people are, the less they'll invest in AI thermal grease. What is the largest individual bet that will lead to the largest incineration of cash? I'm not going to call anyone out in particular, but I actually think it will happen across every single discipline. Are you aware of the Keynesian beauty contest? No. Okay, so John Maynard Keynes, the economist, he has this great... this will explain everything you need to know about VC. So I'm nervous, but keep going. Take a magazine full of models, human models, good-looking models, and have a whole bunch of VCs in the room, and they're allowed to make bets on who the most beautiful model is. In the end, whoever has the most money on them is the winner, and based on the proportion that you put on that particular model's face, you get a share of all of the money. So if you put money on one that isn't the most beautiful by dollars, then you lose your money to the people who bet on that one. And that was sort of the bet that SoftBank was making, which was that they could win the Keynesian beauty contest: "I'm just going to put more money in, and I'm going to win." That is problematic when you have true technological advantages, as opposed to marketing. When you're solving customer problems, it's a weighing machine; once the customer problem has been solved, you then get into this sort of popularity contest of marketing. Now, something unusual has happened this time around, which I don't think has ever happened in VC before, which is you see people raising billions of dollars who have competitors who've raised billions of dollars. Usually there is a clear winner in the Keynesian beauty contest; you don't have this fight where it's sort of, "Well, I've got to put a little more money in, I've got to put a little more, I've got to put 10 billion in, I'm going to put 20 billion in, I'm going to put 500 billion in," because the Keynesian beauty contest has gone completely amok. This has never happened before, and so now people don't even understand how to react, because it used to be that if someone had raised a billion dollars, you'd say, "Oh, they're the winner." Now there are three or four competitors who each have a billion dollars, so who wins and who loses? Like, is Masa going to incinerate the largest amount of cash ever? I think the Keynesian beauty contest no longer applies here, because there's so much money available,
being spread out, and I think you're going to see that the people who have the best products are actually going to be the winners, because everyone can be capitalized. But there will be problems for the winners because of this. The problems are going to be of this sort: you had this employee that you were going to hire, and someone offered them a ridiculous amount of money. Yeah, you see this all the time now. And they could have gone and contributed to the winner, but now they're contributing to a competitor that shouldn't exist, or is equally likely to win, and now you're splitting the talent. What do you also do when you have such high salaries? We've seen a million, two million, for kind of junior-to-mid level at some of these companies, and they are living an amazing life, actually, in great places. You think they're living that amazing life in Guangdong, working for DeepSeek or any other Chinese alternative? I don't think so. I think they're actually getting paid much less, working their ass off 20 hours a day, not getting kombucha or being paid 2 million a year. Fair. Not only fair: we have a policy that we never offer the highest, because we want people to choose us, not choose the salary. If we win in a bidding war, then that means the next time someone comes along with a higher salary, that's it, they're just going to go take that other job. There's no loyalty; they don't believe in the mission. Instead we focus on: look, we're going to build this; this is your opportunity; you're going to get to work with amazing people; spend some time with the team; are these the people you want to be working with? Because frankly, you're going to make so much cash it doesn't matter, but bet on the equity, the outcome. Help us make this thing valuable. And people who buy into that, they're so much easier to manage, because they're mission oriented; they all want to do the same thing. They're not there because they want the kombucha, and they're not going to complain because the cappuccino machine is broken; they'll just go and buy their coffee next door. Will you and Nvidia move into the model layer? Everyone talks
about model makers becoming application providers; will infrastructure providers become model providers? We have decided that we're not going to train our own models. We'll do a little fine-tuning for specific cases or whatnot, but we don't want to compete, and that's really important, because people are putting their models, with their weights, on us, and they don't want us to learn from and take that stuff for our own benefit. This is the problem you have when you work with a hyperscaler: you know they're also doing everything that you are doing. So we've decided: model providers, you make the model; we don't do that. I think there's also the data side: the users, the queries. So, the other thing that we could do, that we do not do, is log the queries, and then we'd have data if we wanted to train. We don't train; we have no reason to hold the data. We only temporarily store things in the DRAM, so there's no persistent storage: if the power went out, everything's gone. And DRAM is limited, so we can't hold things for a long time. So you know that we don't have your data. Now, people who are building businesses on top of us: you can obviously keep the data from your customers if you want; we have no control over that; that's fine. But we don't take any data. Do you think Nvidia will move into model providing? It's possible, but if I were them I would avoid it, because I wouldn't want to compete with my own customers. I mean, Nvidia is great at training; it's crazy. It would be like being an automotive company, a car company, and then creating your own taxi service: you're now competing directly with your customer. And I think tech companies love to do this. We have a management philosophy, and it's based on big-O complexity: we only do things that require a sublinear number of employees. What I mean by that is, if someone comes to me and says, "I need 10 people to go do this thing," a lot of people would say, "Well, why can't you do it with five?" I would say, "Okay, you're supporting customers. If we double the number of customers, do you need 20, or do you need 11?" Because I want to know, what's that growth rate? Are they automating everything? We completely automated our compiler; we completely automated large portions of our cloud; and that means that we can scale with a small team. We have 300 people. You have 300 people?! We built our own chip, we built our own networking hardware and software, we built our own runtime, we built our own orchestration layer, we built our own compiler, we built our own cloud. We built all this with 300 people.
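The "do you need 20 or do you need 11?" screen above is really just a ratio test on headcount growth when the workload doubles. Here is a toy sketch of that test; it is purely illustrative and not an actual Groq tool.

```python
# Toy version of the big-O hiring screen: if doubling the customer base
# requires ~2x the team, the process scales linearly (reject); if it needs
# only ~1.1x, the work is mostly automated and scales sublinearly.

def headcount_growth_order(people_now, people_after_doubling):
    """Classify how headcount scales when the workload doubles."""
    ratio = people_after_doubling / people_now
    if ratio >= 2:
        return "linear or worse"   # e.g. 10 -> 20: headcount tracks customers
    elif ratio > 1:
        return "sublinear"         # e.g. 10 -> 11: mostly automated
    else:
        return "constant"          # fully automated

print(headcount_growth_order(10, 20))  # linear or worse
print(headcount_growth_order(10, 11))  # sublinear
```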
Now, we would only be able to do this with a small number of people, because you don't have the communication overhead. But you have to decide what your constants and variables are, what the things are that you want to preserve, and one of our constants is talent density: we want to stay small, we want to stay nimble. The other side of this is that growth is a problem, so we measure our growth in what I call problem units. A problem unit: every time you triple something, you have about the same number of problems as the last time you tripled. Going from 100 employees to 300, 300 to 1,000, 1,000 to 3,000: each one of those has about the same number of problems. We scaled from 620 or 640 LPUs at the beginning of last year to 40,000; that's four problem units, four triplings of the number of chips. If we were also tripling the number of employees, that would be another problem unit. Management bandwidth is limited; you can only solve so many problems, so you have to decide where you're going to allocate them. If you build things really well from the beginning, and you can scale up with the number of employees you have, then you can scale over here. If you want to triple the number of customers, that's another problem unit that you have to solve.

What's the biggest challenge when you are scaling at that rate but the team is not scaling in conjunction with it? There's this common belief that the people that you have early on are right for the job, and the people that you get later, maybe they're better in a sort of more corporate environment. I don't think that's the case. I think you should always try to get generalists; otherwise you get stuck and ossified in a particular way of doing things, because that's what that one person knew how to do. But there are people who burn out; being in a startup is hard, and there are people who just literally burn out. There are also people who were the best that you could get at the time, and then there are people who are just unmanageable wild children, and they should go off and start another startup; they shouldn't be scaling with you. That happens, but it's the rarer case. I think saying that you're going to hire B players because you've gotten large enough is laziness and an excuse; it's a lack of creativity in your business model and in the algorithm of how you're going to scale. Think of it this way: Walmart versus Amazon. If Walmart wants to double the number of customers, they have to double the number of stores and employees. Amazon does not need to double the number of websites; that's a fundamental advantage. But Amazon still has to double and improve the logistics. They don't have as many problems where they have to scale linearly, but they have some. If you wanted to disrupt Amazon, what you would do is build a completely robotic logistics system, bring the overhead and complexity of that down, and then you can outmaneuver them. That's how you need to do it: don't just say "I need more people"; focus on the algorithm of your business.

The last time we spoke, we discussed DeepSeek. I think more has come out over the last few
weeks about, bluntly, their innovations, some of the distillation that they used. Where is China better than us today? Well, as we discussed, they're more willing to use things that maybe they shouldn't be using. They distilled the OpenAI model. A lot of people have the opinion, "Well, OpenAI was scraping the internet, so good for DeepSeek," but whether that's right or wrong, most of the model providers had considered that a red line they didn't want to cross. I don't know if that's going to change, but it might. But the open-source nature of DeepSeek: OpenAI now benefits from the innovations that they did, also? Well, and they also probably have all the data that DeepSeek paid them to generate. So, yeah. But they also were clever; they innovated. I think the biggest thing is that this is a shot in the arm for morale in China, and it gives them a sense... but again, as I said, Sputnik 2.0. It's also woken up the US. It totally has. How do you compare Stargate to the $128 billion that China has now committed? So, China has a more complicated situation and a simpler one at the same time. The problem is they don't have the technology that we have in terms of chip efficiency. On the other hand, they have scale: if they want to deploy 150 nuclear reactors, which I think is the plan, no big deal, they just do it. So if the chips aren't as efficient, they can just deploy more of them. On the other hand, if they want to go out into the world and deploy chips like they did with Huawei and networking gear, that's going to be complicated, because people around the world aren't going to have the power to run more expensive accelerators. At home, I don't think anything is a problem; it's only as they're trying to expand that it's going to be an issue. China is quite opaque in everything. What do we not know about China that we would like to know? I think the most important thing to understand is where they're going to end up on the censorship and privacy of these models. We come from democratic countries; we
have an expectation that companies can build something that says anything. Are they going to be permissive and allow models to make mistakes and hallucinate, or are they going to shut them down? Because I think if you know that, you know whether or not China has a shot. One of the biggest nightmares that they have is free speech; it's the exact opposite of that vulnerability we talked about earlier. Can you imagine Xi Jinping going out and saying, "Country, we've lost our advantage in AI, I need your help"? Never, ever. It's always going to be "we're the greatest, we're the best." Everyone's going to know differently, but they're all going to have to toe the party line. Because of that, I think it's really hard for them to just allow these models to say anything. Say, you know, "the US is great and better at that": that's a bad thing for them. And so that's going to really tell you a lot about the AI story in China: if they aren't permissive of more open, truthful models, then they're inherently disadvantaged. You're saying, well... I forget, was it Jack Ma who got in trouble with the CCP? Yeah. If they aren't more permissive, then if you are running a Chinese tech company, your fear is that you become Jack Ma. That's really going to stifle innovation. If I were in China right now, I'd be looking for the exit. Like, if your craft is AI, I would want to do it someplace that's supportive. Do you really buy that they don't have access to Blackwell? This is China. I think Xi Jinping's like, "Well, sorry, no Blackwell"? Well, I don't think it matters whether or not they physically have it, because right now most of the cloud providers are happy, if you swipe a credit card, to rent it to you. But there are limits to renting? No, I... I think... So, one of the concerns right now is about Malaysia, or Singapore, that region over there, being a place where people are deploying GPUs with the wink-wink, "we're not going to rent it to China." But that's a belief, that a lot of people are doing that; otherwise that's a lot of GPUs for that region. That feels like it's even more of a safety net, just in case the tap ever gets turned off at the hyperscalers, because right now you could just write a check to any of the hyperscalers and say, "I need these chips"; they'll deploy them and you can run on them. It doesn't really matter where you're coming from. I mean, if you're a sanctioned country... No, China's not sanctioned. Okay. So we have China, where China is obviously, in terms of innovation,
actually proving that they are in the race; we have the US; and then we have Europe, which feels like it's languishing. Yeah. Is this the ultimate nail in Europe's coffin? We talked about how Groq almost died, but we had the right technology all along; we were just waiting for the LLMs to arrive. And I think Europe's very similar. I think Europe has amazing talent, amazing talent, but that talent leaves and goes to the US or other places. So the question is, how do you have Europe's LLM moment? How do you position yourselves? And it's not that complicated. The problem is who you surround yourself with: you become the average of your five closest friends. If your five closest friends are like, "That'll never succeed; ah, you should just keep your job; ah, startups, they're terrible," then you're going to be risk averse. But if your five closest friends say, "You should do it, that's great, I support you," then you're going to be more likely to do a startup. Even in Silicon Valley, people make that transition from the big tech company to the startup, and it's hard. They're comfortable, they're making those crazy salaries, the big companies take care of them, and they have a fiduciary obligation to their family. How do they make that leap? It's because you've got tons of entrepreneurs trying to hire them, and they hear the pitch all the time, and they get used to it. They also see the success around them, and then VCs come in and try to close some of the candidates in the early stage, too. Europe needs the same thing: you need a place where people are surrounded, just surrounded, by entrepreneurial people who are risk-on and who aren't going to try to talk people out of joining a startup. From a regulation perspective, Europe is, you know, unbelievably efficient: the masters of regulation. I was speaking to someone the other day; the EU has supposedly hired 1,500 people for AI safety and policing. What would you do if I put you in charge of European AI regulation? Well, I wouldn't waste my time regulating something that doesn't exist. Instead of regulating: what are you going to promote? You want to promote
risk-taking you want to promote that onclave of people are risk on so I was just visiting station F yesterday amazing macron was there it was like full of people right vibrant you feel it I would and I was talking to the the person who runs station f roxan um and Xavier Neil and with roxan we were talking about what about a city F what about a place where there's you start off with like 10,000 people in in the center right a little radius and then once that's full you expand it once it's full you expand
it and so on to get to like a million people in Europe who are all risk on the little Silicon Valley here and I would give it special economic dispensations I would allow everything that employers need I would make it simple and I would say you know what if you don't want to buy into that that's fine go to other regions in France go to other regions in Europe but if you want to participate in what is going to be the biggest technological revolution in human history this is the city for you you know inherently
punishing incumbents, then? And what I mean by that is: if we are talking about, and I'm just using this as an example, AI insurance underwriting startups. Yeah. There's many companies that are going after insurance underwriting with AI, and if you are giving them benefits like that, you are inherently punishing some of the biggest providers of insurance in your region. You're inherently punishing people who hire 200,000 people. That feels unfair. So there is no right to be an incumbent, especially a slothful incumbent that is not reacting to disruption, and you want to encourage disruption. And this is one of the things in Silicon Valley: you can move from one place to another. I mean, we had non-solicits when I started, but even that's gone, right? So that free movement of people is very important. Are you allowed to start work straight away? Straight away, but not before. If you start before... We have six months. There's no such thing, and so in that region I would say you can immediately start, like literally the next day. That is so good. We have to wait six months. It's not good. If you are a company right now, it feels like, okay, well, it's harder to poach, but what does that do? It suppresses wages. It's harder to hire someone, they're less likely to move, there's less competition, it suppresses wages. And by the way, the company has to pay for the six months anyway. It makes no sense at all. So I totally get you and understand that. Can I ask you, you know, you mentioned what you would promote. A lot of people would promote... I love the way you said
risk on. Being a European, I actually thought first of safety and regulation, but specifically safety. So, sticking with that: all that Dario will talk about these days is safety. Is he losing a step by being so focused on safety when, bluntly, his competitors are talking about product? So safety matters in AI. It's a little bit like nuclear power: lots of pros, lots of cons. I'm worried about different things than I think Dario is worried about. I'm more worried about people voluntarily giving up their decision-making authority because it's so easy, and this is what I mean by preserving human agency in the age of AI. A good analogy: you probably know plenty of wealthy people and the struggles they have bringing up children with wealth. I refer to it as financial diabetes, right? You have children who aren't incented to strive to succeed. I was very fortunate when I was growing up, and I actually just told this story for the first time today, so no one's heard it. I was fortunate because my father lost all of his money multiple times. And I've heard you say the same thing. Yeah. He would sell a billion-dollar life insurance policy and get all the commissions from that, and you would have tons of money, and then you would spend it all. And so there was one time we were living in a $20 million mansion, and there were a couple of times where we ordered food and he would talk to the delivery guy and convince him to give us the food and pay him back later, because we'd get money later. But this time he was so despondent he sort of locked himself in his office and wouldn't come out. My little brother came to me and said I had to go and talk to the Chinese food delivery guy and convince him to give us the food. And I was mentally preparing how to convince him, and I walk out and I walk up to him, and I'm getting ready to do my whole spiel, and he hands it to me, and I'm like, I don't have the money right now. He's like, oh yeah, pay me later. I fortunately didn't have to do anything. He just trusted, because we were living in a $20 million mansion. But that happened multiple times. And when that happens multiple times... Like, I have a friend who was homeless once for a couple of weeks, and he'd almost been homeless a couple of times, and he said the best thing that ever happened to him was that he was homeless for a couple of weeks, because he survived it. And he's like, I always viewed this as the worst thing that could ever happen in the world, but now that I've been through it, I know I can survive it. I'm not worried anymore. I think we live incredibly comfortable lives, way too comfortable. Most people don't have to go through that, and so we have this sort of financial diabetes as a society, and I think it's going to get worse with AI. I think we're really going into an age of abundance. Very few people have to worry about food security now, but what happens if you don't need to worry about
home security, or anything? What happens if you can just live a life without working, and what is that going to do to your psychology? And so as we enter an age of abundance, how do we get people to still be making their own decisions and have a fulfilled life? Do we get better, or do we get accepting of good enough? And what I mean by that is, you know, now, bluntly, with the majority of schedules, we will start with OpenAI and we will do deep research, and then we will kind of use different prompts depending on different guests, and then we supplement it with a huge amount of research from speaking to Chamath and speaking to Scooter and speaking to everyone in between. We care about it being good enough first and then great later, with all the references. Most people will actually just be happy with good enough and get away with it. Do we as a human society get happy with good enough? When we hire, we hire for something that we call booking the win early. So one of the most important driving forces for people is loss bias. When you have something, you don't want to lose it. People are less likely to go after something that they've already had. And you did grow up in a family that was well off, and then you lost that. That might be part of the drive, because you want to get back to it. When we have an engineer that we're hiring, and there's a room full of people who are saying, you know, if we do this thing we could be twice as fast, I want that engineer to hear: wait, if we don't do that, we're going to be half the speed we could have been. The loss bias, right? Book the win early, because if it's possible, it must be done. I think that's a smaller segment of the population. Those are the people who deliver amazing things that no one else is going to do, because everyone else is like, that's good enough. However, I think with AI it's so easy to create a prototype that to stand out you're going to need to do that. And one of the things that's happened with the ability to communicate more freely and see what other people are doing... Like, can you think back to what the restaurant experience was 20 years ago versus what it is now? The average restaurant is better than what high-end restaurants were 20 years ago, because people see all of the stuff that others are doing the best, and they start to expect that. You have less localization, more globalization; you have to compete at the highest end, and AI is no exception. There's going to be 40 people creating that app, you know; you have to polish it in order to stand out. Listen, dude, I
could talk to you all day, but I do want to do a quick-fire five. What do you believe that most around you disbelieve? I'm going to go with anti-founder mode here. I'm anti-founder mode. I believe in delegation. I think when you are telling people how to do their job, that is an indication... it's not necessarily a problem with you; it could just be that that person is not right for that job, and it's much easier to just direct them than to go find someone else competent. But it also means you probably haven't aligned them. So we align people through this challenge coin. Everyone at Groq carries this 25-million-tokens-per-second challenge coin, and what it does is tell everyone what we're doing. It's an alignment. And I can't tell you how many people I've shown that to who are like, that's awesome, and yet no one else is making them. And it's heavy. It's heavy, but also, you know, you like to say the greatest things in life, the heaviest things, aren't gold or whatever. Yeah, I know. Gold, but weighty decisions. Exactly. Well, this was a very weighty decision, because I had to consolidate everything we were doing into one very simple message: we're going to get to 25 million tokens per second. And then we engraved it on a coin, in this tiny amount of space right here, and gave it to everyone at Groq. And now whenever we're in a meeting and something doesn't help with this, they can just tap their coin on the table and be like, no, no, no, that's not the way this is going to go. So is
everyone wrong on founder mode, then? I think that's what you do when you don't have the quality of people working for you. You need the right gearing ratio between you and your direct reports. It's a really unfair question, but I have to ask it: how do you analyze Elon's attempt to buy Twitter... uh, not buy Twitter, to buy OpenAI? So I was sitting at the Élysée Palace, or however I pronounce it, at the dinner with Macron and Sam Altman. So it was Macron, it was JD Vance, and it was Sam Altman. And frankly, I think Elon was a little jealous that Sam Altman was sitting next to JD Vance and it wasn't him, because it was right around the time that Sam Altman was speaking that he announced it. And frankly, I thought part of Sam's tweet response was pretty good. Instead of whatever he said about 9 billion, I would have probably said, yeah, I'm going to take Twitter public at $420 a share. It was attention-grabbing, right? Some people can't stand to not be getting attention, and so my revenge on this is to give as little attention as possible. So let's move on. What would you do if you knew you couldn't fail? I would put in 100% of the orders for every single chip we could possibly manufacture, because right now the demand is unlimited. But every time you triple, you find the same number of problems, and so you've got to do it a little judiciously. But if I knew that no matter what problem was going to come up, we didn't need to be safe at all, I would just go, great, we're going to go build 20 million chips. Done. In 10 years' time, is NVIDIA 3x bigger, 10x bigger, or 50x bigger? I think they will be bigger; I couldn't tell you a number. Training will become more important. I wouldn't be surprised if they were 3x bigger; I also wouldn't be surprised if they stayed around the same. Wow. It's so hard to tell where things are going, because remember, a lot of the assumptions in the investment in NVIDIA were that they were going to run away with the entire market, including the inference market, and they just haven't built the right thing for inference. I do think that as a weighing machine they should increase in value, but there's so much popularity contest applied to it that I don't know... they might need to grow their revenue to get to where they are, but it's a pretty fair multiple given everything going on. So I couldn't tell you; the popularity contest skews everything. What's
a crazy AI prediction you have that everyone else thinks is science fiction? I would assume that in the next 10 years, and I know this is going to be crazy, but you saw that picture of me and my weight loss, right? Unbelievable, dude. 70 pounds. 70 pounds, yeah. But I was on Mounjaro, so if you know anyone who is overweight and it's hurting their health, get them on Mounjaro as soon as you can. It works. What is Mounjaro? It's one of those GLP-1s, one of the weight-loss drugs that have become popular recently. It works. But my crazy AI belief is that if it is possible, if it is possible, to significantly slow or stop aging, I think that you will have a Mounjaro moment in maybe the next 10 years, because that came out of nowhere. All of a sudden, you know, you could just lose weight. Something finally worked, and it's worked for a bunch of people. You probably know a bunch of people who've lost weight. Yeah, exactly. And I don't know if it is possible to slow or stop aging, right? Some wear and tear is a real thing, and it might just be impossible. But if it is possible to slow or stop aging, then I think in the next 10 years we will do it, and it will be sudden. It'll be like the Mounjaro, and the other one as well. It'll be like that moment. I don't see how it is not possible. Like, when you look at the advances that will come in medical research, I don't see how it's not possible that we will at least extend, you know, longevity by 60 years. I mean, Dario says we'll live to 150. I don't see why that's impossible. I don't either, but I also don't know that it is possible, and until I know that, I'm going to stick that conditional in there and say: if possible. What have you changed your mind on in the last 12 months? And this is less of a mental one and more of an emotional one: we didn't have product-market fit for seven years. Yeah. Like, that is terrible. Like, the morale... When you find product-market fit,
the world is brighter, the birds sing. Like, I feel like hugging people. You sleep? I sleep. Life is better. And, you know, I forget if it was you or someone else, but someone was talking about type-one and type-two happiness. Yeah. And I think there's a third. As a founder, the only type of happiness you get is this third type, which is future happiness. The other two, the common ones, are: the present is happy, right? And the other one is you went through some real crappy stuff, but the memories make you happy, right? So there's past, there's present, and there's future. As a founder, you're living 100% in future happiness. When you get product-market fit, you start to get that past happiness, and when you start to get that revenue and everything, then you end up getting the present happiness, and it changes everything. I love that. If you had to bet on one company other than Groq to define the AI era, who would it be? I would probably focus more on the companies that you haven't heard about. I don't know what the companies are, but I can tell you what they're going to do, and I can tell you what each one of them will be. The first one will be the one that solves the hallucination problem. The second one will be the one that is best able to break down subgoals for agentic AI. I think agentic comes after you solve the hallucination problem, because otherwise you've got these long chains where you can introduce hallucinations. It'll kind of work, but it'll work much better after. I think the
next one is what I call the invent stage. So right now, the way LLMs work, they make the most probable prediction. It's actually kind of amazing. It's like, I'm going to take an entire novel and delete the end. You've got a detective, you know, a murder mystery, and you get to the point where the detective says, "and the murderer is...", and it can actually predict it. It had to understand everything, right? But it's going to give you the most probable answer, and that's not good for invention. It's not good for, you know, art, writing. The reason the writing from LLMs is terrible is because it's predictable. So how do you actually say something that's non-obvious but is obvious when you see it? We don't even have the right word for it, right? Non-obvious but obvious. And that is going to unlock invention. And then the final one is what I call the proxy stage, when someone makes it so that models can just make decisions for you. You can proxy your decisions, like the decision to do this interview, right? Like, other things had to be canceled, the flight had to be booked, we had to get a ride over, right? You would trust an EA or a chief of staff to make that decision; you wouldn't trust an LLM. And that's the final stage, I think, before you get to general AI. But each company that does that is going to be a defining company. Can I ask you: you said that we're going to have to fix hallucination before we get, like, efficient agents. Does that mean that money going into agents, say, will be burned? No. Let's take an example on
hallucination. So the examples I gave you were medical diagnosis and law; those were two areas that will be unlocked once we get rid of the hallucinations. But there are startups like Perplexity that are doing just fine right now even though there are hallucinations, because it's not high-risk and, you know, it's for entertainment only. But if you click those links you can check them, and it works kind of okay. It depends on how risky the industry you're in is, whether or not you can get started trying to position for the wave early and generating. But if you're in the right position... we were in the right position for seven years, and the wave came. So that money isn't incinerated. In fact, that recent deal we just announced is more revenue than the money we've raised. So how does that cash hit? Throughout the year. Throughout the year? Yeah. But it's this year; is there potentially more next year? A lot more. What is that contract in three years? If we sell everything that we possibly can this year, it is many billions, just from the capacity alone. There's tens of billions of capacity of hardware that we could build next year in these sorts of deals. But we're also doing it at, you know, high volume, low margin. So if we were talking about GPU sales and GPU prices, I mean, we'd be talking about hundreds of billions; we're just not charging that much. Final one for you. The thing that I'm singly most excited for is actually, like, disease discovery, in terms of drugs. Obviously my mother has MS, and it was always taught to
me that it was incurable, and actually now it's like, actually, maybe not. What are you singly most excited for? We went from a phase where people were hardware engineers to one where they were software engineers. To be a hardware engineer is ridiculously difficult: the training, having to get things right, the real expense if you get it wrong. Becoming a software engineer is so much easier; all you have to do is get a little bit of time on a machine and you can teach yourself. Nowadays you can just download manuals from the internet, or tutorials, or whatever. I think prompt engineering is going to unlock a huge swath of human society. There's 1.3, 1.4 billion people in Africa who know how to speak, and if you were to give them access to a tool where they could create applications live just by speaking to it, that would be another 1.3, 1.4 billion potential entrepreneurs. There's 8 billion people on the planet, and the difference is: hardware was ridiculously difficult, arcane knowledge that's hard to get; software was plentiful; language, you already know it, you don't have to learn a thing. What's that going to do for venture? What's that going to do for entrepreneurialism? Jonathan, I love talking to you. It's always such a broad and wide-ranging discussion. Thank you so much for putting up with me in person, and I've loved it. Awesome, so glad to be here.