This article doesn't mention TPUs anywhere. I don't think it's obvious to people outside of Google's ecosystem just how extraordinarily good the JAX + TPU ecosystem is. Google has several structural advantages over other major players, but the largest one is that they roll their own compute solution, which is actually very mature and competitive. TPUs are extremely good at both training and inference[1], especially at scale. Google's ability to tailor their mature hardware to exactly what they need gives them a massive leg up on the competition. AI companies fundamentally have to answer the question "what can you do that no one else can?". Google's hardware advantage provides an actual answer to that question, one that can't be erased the next time someone drops a new model onto Hugging Face.
[1]https://blog.google/products/google-cloud/ironwood-tpu-age-o...
From the article:
> I’m forgetting something. Oh, of course, Google is also a hardware company. With its left arm, Google is fighting Nvidia in the AI chip market (both to eliminate its former GPU dependence and to eventually sell its chips to other companies). How well are they doing? They just announced the 7th version of their TPU, Ironwood. The specifications are impressive. It’s a chip made for the AI era of inference, just like Nvidia Blackwell
Nice to see that they added that, but that section wasn't in the article when I wrote that comment.
It was there.
To be fair to thunderbird120, the author of this piece made edits at some point. See https://archive.is/K4n9E. No discussion of the recent TPU releases, or TPUs at all, for that matter.
You are correct. I misjudged. I thought I had read the article early; it must have been just after the edits.
"I’m forgetting something." was a giant blaring clue. Take this as an opportunity to learn the lesson of not calling someone a liar unless you are very very sure and have taken all the evidence into account.
whoah. I accused no one of lying. It is amazing how many people mix up this basic idea: Saying something that is incorrect is not the same as lying about it.
"that section wasn't in the article when I wrote that comment."
"It was there."
Everyone understands that such naysaying is effectively an accusation of lying. In any case it was a totally low effort utterly inappropriate comment. Clearly you aren't going to learn the lesson.
Assuming that DeepSeek continues to open-source, then we can assume that in the future there won't be any "secret sauce" in model architecture. Only data and training/serving infrastructure, and Google is in a good position with regard to both.
Google is also in a great position wrt distribution - to get users at scale, and attach to pre-existing revenue streams. Via Android, Gmail, Docs, Search - they have a lot of reach. YouTube as well, though fit there is maybe less obvious. Combined with the two factors you mention, and the size of their warchest - they are really excellently positioned.
Over the last nine months, I have periodically tested Gemini’s access to and effective use of data from Gmail/Docs/Calendar/Keep-notes, etc.
The improvement has been steady and impressive. The entire integration is becoming a product that I want to use.
Yea, I just ended up trying out their Gemini stuff in Sheets and Slides. In Sheets it's pretty cool to have it just directly insert formulas for me. In Slides it's...okay...it was useful for me to rush to get a presentation done. But I can tell it's pretty bad compared to literally anyone who has enough time to just create a decent presentation. But I can also tell it will get better at least.
Does that ever provide you with anything more than a lame summary? I mean, Gemini models can do a lot, but I don't have the feeling they're well integrated, tbh.
YouTube is very well positioned - all these video-generating models etc. I am sure there'll be loads of AI editors too.
Good, maybe Youtube will finally recommend something to me I actually want to watch.
Personally, I've never actually had this problem. Do you watch enough industry-specific videos in a non-anonymized browser session? Once you watch, like, 5 videos on topics you care about, the algorithm has no shortage of astute suggestions.
I'm not the person you're replying to, but in my experience the YouTube algorithm is quite bad at satisfying my wish for a variety of topics and tones at all levels. I feel like watching one or two clips from the same channel suddenly floods me with that going forward, which is rarely what I want. Personally, I have a core set of things I want lots of, plus I'd really appreciate brief forays into new topics with similar creators, or new creators with similar topics, but this feels completely impossible for me on YT.
I think they've jumped the shark and need to give me more control, because currently I actively avoid watching videos I think MIGHT be interesting because the risk is too high. This is a terrible position to put your users in, both from a specific-experience perspective and from a "how they feel about your product" perspective.
> I actively avoid watching videos I think MIGHT be interesting because the risk is too high
You can remove videos from your viewing history. I do this when I start watching something but the content turns out to not be what I expected. It seems to prevent polluting my recommendations.
The problem with that is, I have more than one interest. So any algorithm will make a salad out of that and never hit me with what I want. Even if this particular session was about topic X, they will still fill it with proposals from yesterday's topic Y, or extrapolate from them to some topic Z I don't care about, and so on. Mostly useless either way.
There are videos which YouTube will very, very rarely recommend to you, especially if they are not monetised and are, let's say, critical of or opposed to the mainstream.
Making your own hardware would seem to yield freedoms in model architectures as well since performance is closely related to how the model architecture fits the hardware.
Huh? I don't think it's that simple. As far as we know, everyone has some secret sauce. You're assuming deepseek will find all of that.
... except that it still pretty much requires Nvidia hardware. Maybe not for edge inference, but even inference at scale (ie. say at companies, or governments) will still require it.
TPUs aren't necessarily a pro. They go back 15 years and don't seem to have yielded any kind of durable advantage. Developing them is expensive but their architecture was often over-fit to yesterday's algorithms which is why they've been through so many redesigns. Their competitors have routinely moved much faster using CUDA.
Once the space settles down, the balance might tip towards specialized accelerators but NVIDIA has plenty of room to make specialized silicon and cut prices too. Google has still to prove that the TPU investment is worth it.
Not sure how familiar you are with the internal situation... But from my experience I think it's safe to say that TPU basically multiplies Google's computation capability by 10x, if not 20x. Also, they don't need to compete with others to secure expensive Nvidia chips. If this is not an advantage, I don't see what would count as one. The entire point of vertical integration is to secure full control of your stack so your capability won't be limited by potential competitors, and TPU is one of the key components of that strategy.
Also worth noting that its Ads division is the largest, heaviest user of TPUs. Thanks to that, it can flex by running a bunch of different expensive models that you could not realistically afford with GPUs. The revenue delta from this is more than enough to pay off the entire investment history of TPU.
They must very much compete with others. All these chips are being fabbed at the same facilities in Taiwan and capacity trades off against each other. Google has to compete for the same fab capacity alongside everyone else, as well as skilled chip designers etc.
> The revenue delta from this is more than enough to pay off the entire investment history for TPU.
Possibly; such statements were common when I was there too but digging in would often reveal that the numbers being used for what things cost, or how revenue was being allocated, were kind of ad hoc and semi-fictional. It doesn't matter as long as the company itself makes money, but I heard a lot of very odd accounting when I was there. Doubtful that changed in the years since.
Regardless, the question is not whether some ads launches can pay for the TPUs; the question is whether it'd have worked out cheaper in the end to just buy lots of GPUs. Answering that would require a lot of data that's certainly considered very sensitive, and it makes some assumptions about whether Google could have negotiated private deals etc.
> They must very much compete with others. All these chips are being fabbed at the same facilities in Taiwan and capacity trades off against each other.
I'm not sure what you're trying to deliver here. Following your logic, even if you have a fab you need to compete for rare metals, ASML, etc. etc. That's logic built for nothing but its own sake. In the real world, it is much easier to compete outside Nvidia's own allocation, since you get rid of the critical bottleneck. And Nvidia has every incentive to control the supply to maximize its own profit, not to meet demand.
> Possibly; such statements were common when I was there too but digging in would often reveal that the numbers being used for what things cost, or how revenue was being allocated, were kind of ad hoc and semi-fictional.
> Regardless the question is not whether some ads launches can pay for the TPUs, the question is whether it'd have worked out cheaper in the end to just buy lots of GPUs.
Of course everyone can build their own narratives in favor of their launch, but I've been involved in some of those ads quality launches and can say pretty confidently that most of those launches would not have been launchable without TPUs at all. This was especially true in the early days of TPU, when the supply of datacenter GPUs was extremely limited and immature.
Could more GPUs solve it? Companies are talking about 100k~200k H100s as a massive cluster, and Google already has much larger TPU clusters with computation capability of a different order of magnitude. The problem is, you cannot simply buy more computation even if you have lots of money. I've been pretty clear about how relying on Nvidia's supply could be a critical limiting factor from a strategic point of view, but you're trying to move the point. Please don't.
> Developing them is expensive
So are the electric and cooling costs at Google's scale. Improving perf-per-watt efficiency can pay for itself. The fact that they keep iterating on it suggests it's not a negative-return exercise.
TPUs probably can pay for themselves, especially given NVIDIA's huge margins. But it's not a given that it's so just because they fund it. When I worked there Google routinely funded all kinds of things without even the foggiest idea of whether it was profitable or not. There was just a really strong philosophical commitment to doing everything in house no matter what.
> When I worked there Google routinely funded all kinds of things without even the foggiest idea of whether it was profitable or not.
You're talking about small-money bets. The technical infrastructure group at Google makes a lot of them, to explore options or hedge risks, but they only scale the things that make financial sense. They aren't dumb people after all.
The TPU was a small-money bet for quite a few years until this latest AI boom.
Maybe it's changed. I'm going back a long way but part of my view on this was shaped by an internal white paper written by an engineer who analyzed the cost of building a Gmail clone using commodity tech vs Google's in house approach, this was maybe circa 2010. He didn't even look at people costs, just hardware, and the commodity tech stack smoked Gmail's on cost without much difference in features (this was focused on storage and serving, not spam filtering where there was no comparably good commodity solution).
The cost delta was massive and really quite astounding to see spelled out because it was hardly talked about internally even after the paper was written. And if you took into account the very high comp Google engineers got, even back then when it was lower than today, the delta became comic. If Gmail had been a normal business it'd have been outcompeted on price and gone broke instantly, the cost disadvantage was so huge.
The people who built Gmail were far from dumb but they just weren't being measured on cost efficiency at all. The same issues could be seen at all levels of the Google stack at that time. For instance, one reason for Gmail's cost problem was that the underlying shared storage systems like replicated BigTables were very expensive compared to more ordinary SANs. And Google's insistence on being able to take clusters offline at will with very little notice required a higher replication factor than a normal company would have used. There were certainly benefits in terms of rapid iteration on advanced datacenter tech, but did every product really need such advanced datacenters to begin with? Probably not. The products I worked on didn't seem to.
Occasionally we'd get a reality check when acquiring companies and discovering they ran competitive products on what was for Google an unimaginably thrifty budget.
So Google was certainly willing to scale things up that only made financial sense if you were in an environment totally unconstrained by normal budgets. Perhaps the hardware divisions operate differently, but it was true of the software side at least.
Haven't Nvidia published roughly as many chip designs in the same period?
The issue isn't number of designs but architectural stability. NVIDIA's chips have been general purpose for a long time. They get faster and more powerful but CUDA has always been able to run any kind of neural network. TPUs used to be over-specialised to specific NN types and couldn't handle even quite small evolutions in algorithm design whereas NVIDIA cards could. Google has used a lot of GPU hardware too, as a consequence.
At the same time if the TPU didn't exist NVIDIA would pretty much have a complete monopoly on the market.
While Nv does have an unlimited money printer at the moment, the fact that at least some potential future competition exists does represent a threat to that.
They go back about 11 years.
Depending how you count, parent comment is accurate. Hardware doesn't just appear. 4 years of planning and R&D for the first generation chip is probably right.
The first TPU (Seastar) was designed, tested, and deployed in 15 months: https://arxiv.org/pdf/1704.04760
They started becoming available internally in mid 2015.
I was wrong, ironically because Google's AI overview says it's 15 years if you search. The article it's quoting from appears to be counting the creation of TensorFlow as an "origin".
That's awesome. :) and even that article is off. They probably were thinking of DistBelief, the predecessor to TF.
Google is what everyone thinks OpenAI is.
Google has their own cloud with their data centers with their own custom designed hardware using their own machine learning software stack running their in-house designed neural networks.
The only thing Google is missing is designing a computer memory that is specifically tailored for machine learning. Something like processing in memory.
The one thing they lack that OpenAI has is… product focus. There's some kind of management issue that makes Google all over the shop, cancelling products for no reason. Whereas Sam Altman's team is right on the money.
Google is catching up fast on product though.
I've used Jax quite a bit and it's so much better than tf/pytorch.
Now for the life of me, I still haven't been able to understand what a TPU is. Is it Google's marketing term for a GPU? Or is it something different entirely?
There's basically a difference in philosophy. GPU chips have a bunch of cores, each of which is semi-capable, whereas TPU chips have (effectively) one enormous core.
So GPUs have ~120 small systolic arrays, one per SM (aka, a tensor core), plus passable off-chip bandwidth (aka 16 lanes of PCIe).
Whereas TPUs have one honking big systolic array, plus large amounts of off-chip bandwidth.
This roughly translates to GPUs being better if you're doing a bunch of different small-ish things in parallel, but TPUs are better if you're doing lots of large matrix multiplies.
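To make that concrete with a minimal JAX sketch (the shapes are arbitrary, and both versions compile and run on any backend; the point is only which hardware shape each workload favors):

    import jax
    import jax.numpy as jnp

    # One big matmul: the kind of work a TPU's single large systolic array is built for.
    @jax.jit
    def one_big_matmul(a, b):
        return a @ b

    # Many small, independent matmuls: closer to the GPU sweet spot of many
    # semi-capable cores each doing its own modest piece of work in parallel.
    @jax.jit
    def many_small_matmuls(a, b):
        return jax.vmap(jnp.matmul)(a, b)

    key = jax.random.PRNGKey(0)
    big = jax.random.normal(key, (4096, 4096))
    small = jax.random.normal(key, (512, 64, 64))

    print(one_big_matmul(big, big).shape)           # (4096, 4096)
    print(many_small_matmuls(small, small).shape)   # (512, 64, 64)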
Way back when, most of a GPU was for graphics. Google decided to design a completely new chip, which focused on the operations for neural networks (mainly vectorized matmul). This is the TPU.
It's not a GPU, as there is no graphics hardware there anymore. Just memory and very efficient cores, capable of doing massively parallel matmuls on the memory. The instruction set is tiny, basically only capable of doing transformer operations fast.
Today, I'm not sure how much graphics an A100 GPU still can do. But I guess the answer is "too much"?
Less and less with each generation. The A100 has 160 ROPs, a 5090 has 176, the H100 and GB100 have just 24.
TPUs (short for Tensor Processing Units) are Google's custom AI accelerator hardware, completely separate from GPUs. I remember they introduced them around 2015, but I imagine they're really starting to pay off with Gemini.
Believe it or not, I'm also familiar with Wikipedia. It says that they're optimized for low precision and high throughput. To me this sounds like a GPU with a specific optimization.
Perhaps this chapter can help? https://jax-ml.github.io/scaling-book/tpus/
It's a chip (and associated hardware) that can do linear algebra operations really fast. XLA and TPUs were co-designed, so as long as what you are doing is expressible in XLA's HLO language (https://openxla.org/xla/operation_semantics), the TPU can run it, and in many cases run it very efficiently. TPUs have different scaling properties than GPUs (think sparser but much larger communication), no graphics hardware inside them (no shader hardware, no raytracing hardware, etc), and a different control flow regime ("single-threaded" with very-wide SIMD primitives, as opposed to massively-multithreaded GPUs).
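If you want to see concretely what "expressible in XLA's HLO" means, JAX will show you the program it hands to the compiler before any device-specific work happens (a minimal sketch; whether the printed text is StableHLO or classic HLO depends on your JAX version):

    import jax
    import jax.numpy as jnp

    def f(x, w):
        # Plain dense linear algebra: the kind of program XLA and TPUs were co-designed for.
        return jax.nn.relu(x @ w)

    x = jnp.ones((128, 512))
    w = jnp.ones((512, 256))

    # Inspect the lowered program XLA receives before it targets TPU, GPU, or CPU.
    print(jax.jit(f).lower(x, w).as_text())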
Thank you for the answer! You see, up until now I had never appreciated that a GPU does more than matmuls... And that first reference, what a find :-)
Edit: And btw, another question that I had had before was what's the difference between a tensor core and a GPU, and based on your answer, my speculative answer to that would be that the tensor core is the part inside the GPU that actually does the matmuls.
You asked a question, people tried to help, and you lashed out at them in a way that makes you look quite bad.
Did you also read just after that "without hardware for rasterisation/texture mapping"? Does that sound like a _G_PU?
I mean, yes. But GPUs also have a specific optimization, for graphics. This is a different optimization.
Amazon also invests in its own hardware and silicon -- the Inferentia and Trainium chips, for example.
But I am not sure how AWS and Google Cloud match up in terms of making this vertical integration work for their competitive advantage.
Any insight there - would be curious to read up on.
I guess Microsoft for that matter has also been investing -- we heard about the latest quantum breakthrough that was reported as creating a fundamentally new physical state of matter. Not sure if they also have some traction with GPUs and others with more immediate applications.
I think Amazon and Meta have been trying on inference hardware; they throw their hands up on training. But TPUs can actually be used in training, based on what I saw in Google's Colab.
The problem is always their company, never the product. They have had countless great products. You can't depend on a product if the company is reliably unreliable enough. If they don't simply delete it for being expensive and "unprofitable", it might initially win, but eventually, like Search and YouTube, it will be so watered down you can't taste the wine.
I am the author of the article. It was there since the beginning, just behind the paywall, which I removed due to the amount of interest the topic was receiving.
And yet google's main structural disadvantage is being google.
Modern BERT with the extended context has solved natural language web search. I mean it as no exaggeration that _everything_ Google does for search is now obsolete. The only reason why Google search isn't dead yet is that it takes a while to index all web pages into a vector database.
And yet it wasn't Google that released the architecture update, it was Hugging Face, as a summer collaboration between a dozen people. Google's version came out in 2018 and languished for years because it would destroy their business model.
Google is too risk averse to do anything, but completely doomed if they don't cannibalize their cash cow product. Web search is no longer a crown jewel, but plumbing that answering services, like perplexity, need. I don't see google being able to pull off an iPhone moment where they killed the iPod to win the next 20 years.
> Modern BERT with the extended context has solved natural language web search. I mean it as no exaggeration that _everything_ google does for search is now obsolete.
The web UI for people using search may be obsolete, but search is hot, all AIs need it, both web and local. It's because models don't have recent information in them and are unable to reliably quote from memory.
And models often make reasoning errors. Many users will want to check that the sources substantiate the conclusion.
As they should do even for a Google search.
I see search engines as a dripfeed from a firehose, not some magical thing that's going to get me the 100% correct 100% accurate result.
Humans are the most prolific liars; I could never trust search results anyway since Google may find something that looks right but the author may be heavily biased, uninformed and all manner of other things anyways.
The point is that the secret sauce in Google's search was better retrieval, and the assertion above is that the advantage there is gone. While crawling the web isn't a piece of cake, it's a much smaller moat than retrieval quality was.
Eh, I don't really see that.
Crawling the web has a huge moat because a huge number of sites have blocked 'abusive' crawlers except Google and possibly Bing.
For example just try to crawl sites like Reddit and see how long before you're blocked and get a "please pay us for our data" message.
My experience running a few hundred very successful shops (hundreds of thousands of orders per month) is that there's no need for quotes around 'abusive'.
95% of our load is from crawlers, so we have to pick who to serve.
If they want our data all they need to do is offer a way for us to send it, we're happy to increase exposure and shopping aggregation site updates are our second highest priority task after price and availability updates.
> Google is too risk averse to do anything, but completely doomed if they don't cannibalize their cash cow product.
Google's cash-cow product is relevant ads. You can display relevant ads in LLM output or natural language web-search. As long as people are interacting with a Google property, I really don't think it matters what that product is, as long as there are ad views. Also:
> Web search is no longer a crown jewel, but plumbing that answering services, like perplexity, need
This sounds like a gigantic competitive advantage if you're selling AI-based products. You don't have to give everyone access to the good search via API, just your inhouse AI generator.
Kodak was well placed to profit from the rise of digital imaging - in the late 1970s and early 1980s Kodak labs pioneered colour image sensors, and was producing some of the highest resolution CCDs out there.
Bryce Bayer worked for Kodak when he invented and patented the Bayer pattern filter used in essentially every colour image sensor to this day.
But the problem was: Kodak had a big film business - with a lot of film factories, a lot of employees, a lot of executives, and a lot of recurring revenue. And jumping into digital with both feet would have threatened all that.
So they didn't capitalise on their early lead - and now they're bankrupt, reduced to licensing their brand to third-party battery makers.
> You can display relevant ads in LLM output or natural language web-search.
Maybe. But the LLM costs a lot more per response.
Making half a cent is very profitable if it only takes 0.2s of CPU to do it. Making half a cent with 30 seconds of multiple GPUs, consuming 1000W of power... isn't.
This is a good anecdote and it reminds me of how Sony had cloud architecture/digital distribution, a music label, movie studio, mobile phones, music players, speakers, tvs, laptops, mobile apps... and totally missed out on building Spotify or Netflix.
I do think Google is a little different to Kodak however; their scale and influence is on another level. GSuite, Cloud, YouTube and Android are pretty huge diversifications from Search in my mind even if Search is still the money maker...
Sony's Achilles heel was and remains software. You can't build a Spotify or Netflix if you can't build a proper website.
That, and while Sony had all these big groups, they often didn't play nice with each other. Look at how they failed to make MiniDisc into any useful data platform with PCs, largely because MDs were consumer devices, not personal computers, so they were pretty much only seen as music hardware.
Even on the few Vaios that had MD drives on them, they're pretty much just an external MD player permanently glued to the device instead of being a full and deeply integrated PC component.
It goes to internal corporate culture, and what happens to you when you point out an uncomfortable truth. Do we shoot the messenger, or heed her warnings and pivot the hopefully not Titanic? RIM/Blackberry didn't manage to avoid it either.
People like to believe CEOs aren't worth their pay package, and sometimes they're not. But looking at a couple of these failures, and how a different CEO of Kodak might not have let what happened happen, makes me think that sometimes some of them do deserve it.
I firmly believe the majority of CEOs and executives may do something useful (often not) but none of them truly _earn_ their multi-million salaries (for those that are on the modern 100-1000x salaries). It's just suits, handshakes and social connections from certain schools/families. That's all.
Constantly I see them dodging responsibility or resigning (as an "apology") during a crisis they caused and then moving on to the next place they got buddies at for another multi-mil salary.
Many here would defend 'em tho. HN/SV tech people seem to aspire to such things from what I've seen. The rest of us just really think computers are super cool.
If the king/ceo is great, autocracy works well.
When a fool inevitably takes the throne, disaster ensues.
I can't say for sure that a different system of government would have saved Kodak. But when one man's choices result in disaster for a massive organization, I don't blame the man. I blame the structure that laid the power to make such a mistake on his shoulders.
That seems weird. Why hold up one person as being great while not also holding up one person as not? If my leader led me into battle and we were victorious, we'd put it on them. If they led us to ruin, why should I blame the organizational structure that let them get power, instead of blaming them directly?
I'm not saying you'd be wrong to blame the bad leader - just that blaming them doesn't achieve much.
The CEO takes the blame, the board picks a new one (Unless the CEO has special shares that make them impossible to dismiss), and we go on hoping that the king isn't an idiot this time.
My reading of history is that some people are fools - we can blame them for their incompetence or we can set out to build foolproof systems. (Obviously, nothing will be truly foolproof. But we can build systems that are robust against a minority of the population being fools/defectors.)
1/2 kW-minute costs about $0.001, so you technically could make a profit at that rate. The real problem is the GPU cost - a $20k GPU amortized over five years costs $0.046 per second. :)
How do you get that? I get $0.0001 per second over 5 years to reach 20k.
Because I'm an idiot and left off a factor of 365. Thank you! A $20k GPU for 30 seconds is 1/3 of a cent. Still more than the power, but also potentially profitable under this scenario, ignoring all the other overhead and utilization.
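Spelling out the whole back-of-envelope calculation in one place (all inputs are the assumed figures from this thread: a $20k GPU, five-year straight-line amortization, ~1 kW draw, and an assumed ~$0.12/kWh):

    GPU_PRICE = 20_000                       # dollars, assumed
    LIFETIME_S = 5 * 365 * 24 * 3600         # five years, in seconds
    POWER_KW = 1.0                           # ~1000 W draw
    ELECTRICITY = 0.12                       # dollars per kWh, assumed
    RESPONSE_S = 30                          # seconds of GPU time per response

    amort_per_s = GPU_PRICE / LIFETIME_S             # ~$0.000127 per second
    hardware_cost = amort_per_s * RESPONSE_S         # ~$0.0038, about a third of a cent
    energy_cost = POWER_KW * (RESPONSE_S / 3600) * ELECTRICITY   # ~$0.001

    print(f"hardware ${hardware_cost:.4f} + energy ${energy_cost:.4f} "
          f"= ${hardware_cost + energy_cost:.4f} vs $0.005 ad revenue")

So roughly half a cent of cost against half a cent of revenue, before counting utilization, serving overhead, or the rest of the stack.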
> Google's cash-cow product is relevant ads.
As a business Google's interest is in showing ads that make it the most money - if they quickly show just the relevant information then Google loses advertising opportunities.
To an extent, it is the web equivalent of IRL supermarkets intentionally moving stuff around and having checkout displays.
> As a business Google's interest is in showing ads that make it the most money - if they quickly show just the relevant information then Google loses advertising opportunities.
This is just a question of UX: the purpose of their search engine was already to show the most relevant information (i.e. links), but they just put some semi-relevant information (i.e. sponsored links) first, and make a fortune. They can just do the same with AI results.
This would be like claiming in 2010 that because Page Rank is out there, search is a solved problem and there’s no secret sauce, and the following decade proved that false.
In a time when statistical models couldn't understand natural language, the click stream from users was their secret sauce.
Today a consumer-grade >8B decoder-only model does a better job of predicting whether some (long) string of text matches a user query than any bespoke algorithm would.
The only reason encoder-only models are better than decoder-only models here is that you can cache the results against the corpus ahead of time.
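To make the caching point concrete, here's a minimal sketch (it assumes the sentence-transformers package is installed; the model name is just one common small encoder, and the corpus is toy data):

    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")   # any encoder-only model works here

    # Offline: embed the corpus once and cache the vectors.
    corpus = ["how to set up a TPU pod", "best sourdough starter", "JAX sharding tutorial"]
    corpus_vecs = model.encode(corpus, normalize_embeddings=True)

    # Online: only the query needs a forward pass at serving time; a decoder-only
    # reranker would instead have to read every candidate document per query.
    query_vec = model.encode(["tutorial on sharding in JAX"], normalize_embeddings=True)
    scores = corpus_vecs @ query_vec.T                # cosine similarity via dot products
    print(corpus[int(np.argmax(scores))])             # expected: "JAX sharding tutorial"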
> Modern BERT with the extended context has solved natural language web search.
I doubt this. Embedding models are no panacea even with a lot simpler retrieval tasks like RAG.
RAG is literally what Google Search is.
Unlike the natural language queries that RAG has to deal with, Google searches are (usually) atomic ideas and encoder-only models have a much easier time with them.
Do we have insights on whether they knew that their business model was at risk? My understanding is that OpenAI’s credibility lies in seeing the potential of scaling up a transformer-based model and that Google was caught off guard.
I think what may save Google from an Innovator's Dilemma extinction is that none of the AI would-be Google killers (OpenAI etc.) have figured out how to achieve any degree of lock-in. We're in a phase right now where everybody gets excited by the latest model and the switching cost is next to zero. This is very different from the dynamics of, say, Intel missing the boat on mobile CPUs.
I've been wondering for some time what sustainable advantage will end up looking like in AI. The only obvious thing is that whoever invents an AI that can remember who you are and every conversation it's had with you -- that will be a sticky product.
Whoever gets AI to be able to search the whole corpus of human knowledge. I'm not just talking about web pages, I'm talking every book, every scientific paper, every newspaper, every piece of text stored somewhere.
I've built RAG systems that index tokens in the 1e12 range, and the main thing stopping us from having a super search that will make Google look like the library card catalogue is the copyright system.
A country that ignores that and builds the first XXX billion parameter encoder only model will do for knowledge work what the high pressure steam engine did for muscle work.
They can just plug the google.com web page into their AI. They already do that.
But because users are used to doing that for free, they can't charge money for it; and if they don't charge money for it and no one's seeing ads, then where does the money come from?
Well, it clearly affects search ads, but in terms of revenue streams Google is already somewhat diversified:
1. Search ads (at risk of disintermediation)
2. Display ads (not going anywhere)
3. Ad-supported YouTube
4. Ad-supported YouTube TV
5. Ad-supported Maps
6. Partnership/ad-supported Travel, YouTube, News, Shopping (and probably several more)
7. Hardware (ChromeOS licensing, Android, Pixel, Nest)
8. Cloud
There are probably more ad-supported or ad-enhanced properties, but what's been shifting over the past few years is the focus on subscription-supported products:
1. YouTube TV
2. YouTube Premium
3. GoogleOne (initially for storage, but now also for advanced AI access)
4. Nest Aware
5. Android Play Store
6. Google Fi
7. Workspace (and affiliated products)
In terms of search, we're already seeing a renaissance of new options, most of which are AI-powered or enhanced, like basic LLM interfaces (ChatGPT, Gemini, etc), or fundamentally improved products like Perplexity & Kagi. But Google has a broad and deep moat relative to any direct competitors. Its existential risk factors are mostly regulation/legal challenge and specific product competition, but not everything on all fronts all at once.
Unclear if they can actually beat GPUs in training throughput with 4D parallelism.
They're not alone in doing that, though... AWS also does it, and I believe Microsoft is into it too.