ProllyInfamous 7 hours ago [-]
I did not "know" John personally, but we attended the same undergraduate program, concurrently. My workstudy job was supporting engineering/hardware for his specific school/program.
Nothing but nice things were ever heard about him. In a school of egos, his was approachably humble [†]... reasoning and facts seemed to make him tick. My twin actually had classes with him (small school), but also didn't know him well... but knew he was a swell guy. None of his IT tickets were ever typical "rude rich Vandy kid" – he could solve/delegate most of his own problems.
So glad to see Vanderbilt secure yet another humble Laureate (Muhammad Yunus won his Nobel while we all were attending, I believe the previous alumnus to do so).
Keep it humble, fellow knowledge-seeker Human John. Howdy from KissamKissam.
[†] most Obviously Brilliant-types "come off" as I_went_to_Harvard arrogant; not John ("awe" of his obvious brilliance?)
> Thanks John for an extraordinary partnership and wonderful collaboration over the past 9 years! What we achieved with AlphaFold changed the world, and showed the field what was possible with AI for science and medicine, lighting the way for how AI can benefit humanity.
CuriouslyC 1 days ago [-]
Something spicy must have happened internally at Google. This rapid fire high level attrition isn't just down to the bureaucratic quagmire.
mlmonkey 3 hours ago [-]
Here's my read. Take it for what you paid for it ... it comes from a couple of decades of experience in research labs and top companies.
Google appears to be falling behind in the AGI race. The leadership (MBAs) do what clueless leadership always does: they start cracking the whips, bring the knives out. People like Jumper, Shazeer, Dean, etc. are not built for fighting political battles; they're built for solving tough problems! What MBAs don't understand is that researchers at this level put a tremendous amount of pressure on themselves; and this internal pressure is far beyond what any external entity can apply. So, when MBAs start hassling top researchers with "so ... what have you done lately?" and "what are you working on? Is it important?" and "when are we getting AGI (with a smirk)" type of questions, then it feels really really grating: If I knew a sure path to the goal, I would work 24x7 to get there, dammit!
kranke155 1 days ago [-]
Is it possible they are just falling behind ?
Their newest model wasn’t really SOTA. And honestly fable 5 was the most human like model I’d ever tried. It was an incredible jump.
And recently lots of Claude users at r/ClaudeAI are noticing Opus 4.8 has really increased in capability. Not new things but maybe redirected compute. It just feels like one of the best models ever, maybe because the compute that was previously assigned to Fable has been redirected? It feels incredible.
jatins 13 hours ago [-]
I think they'll catch up in pure model capabilities but they do such a terrible job of making products from which to use their models that having the best model doesn't end up mattering.
Is Gemini good at writing code? I am sure it is. But where is their Codex? And no, antigravity isn't it.
Is Gemini good at making visualizations? I am sure it is. But where are artifact or visualise skill in gemini.google.com similar to what's available on claude.ai?
What is an average user going to do with raw model capability if the product surface isn't expressive enough?
bjackman 12 hours ago [-]
This is weird coz as a user of both Gemini and Claude I have the opposite feeling.
Antigravity CLI is quite decent, it's a huge step up from Gemini CLI (like, for example, it actually fucking works) and has some genuine advantages over Claude Code. Does Codex have something over both of them? I haven't tried it.
But the model just fucking sucks. Before I switched to Claude for personal stuff a few weeks ago, I was like "damn model capabilities are really slowing down" but no, it's just Gemini that's slowing down.
Will have to see if 3.5 Pro is any good when that comes out. But it feels like they would be attempting to catch up to Opus, not to Fable.
FWIW issue is never really about the code it writes it's about general intelligence. Gemini hallucinates like it's 2024, fails to follow instructions, and goes down wildly wrong debugging paths. Opus just gets the job done, first time, every time. With Gemini it feels like "I _am_ glad this intern is working for me but I'm tired of babysitting him" and with Claude it's like "this new PhD guy can replace me soon".
mendigou 5 hours ago [-]
I would be interested to hear what advantages you find Antigravity CLI has over Claude Code.
bjackman 2 hours ago [-]
Nothing major just a few little details:
- the /artifact thing is quite useful (don't think CC has it?)
- the /tasks is a bit better than CC's equivalent
- there are a few built-in skills that I haven't found CC equivalents for in the built in set (but the fact that I haven't sought out 3rd party versions shows you they aren't very important).
And more generally it does a better job of making the agent available. When Claude is debugging something complex and running a bunch of experiments it's often unavailable for like 20 minutes at a time, you only have /btw. Whereas AGY tends to more aggressively use timers and background jobs.
But now I wrote that out, I realised it's probably just as much of a system prompt thing as a harness design thing. Coz Claude _can_ operate that way too.
Anyway, like I said none of these come anywhere near balancing out the model quality gap.
DontchaKnowit 7 hours ago [-]
I really think forum comments are heavily astroturfed. Me and all my coworkers agree Opus performance has been horrible for the last two months or so.
I went from spending 20-60 dollars a day in api fees down to like 5 dollars a day cause I have had to limit my use to things I know it performs reliably on.
basch 1 days ago [-]
from the looks of it, 3.5 Flash is still better than most models
The idea of "falling behind" when you can leapfrog each other every six months leads me to believe it has to be more than just "falling behind" for one cycle. It's a culture, process, red tape, focus, or mandate problem of some sort. Something not as easily correctable preparing for next launch.
joe_mamba 24 hours ago [-]
>from the looks of it, 3.5 Flash is still better than most models
Define "better". I guess it depends on what you're using it for. I use it almost daily as an alternative to google search and it's great for that, but I think it's absolute garbage for coding and reasoning.
For questions related to coding, solving Arch Linux and WINE Lutris issues, helping me with MXLinux issues, and wifi issues on an old rooted huawei tablet running LineageOS, it was consistently wrong, constantly giving out confident but outdated information or misinformation, or hallucinating stuff while gaslighting me. Every time I would point out it was wrong, it would re-check and keep apologizing and then repeat giving me wrong answers, and then apologizing again and so on. It doesn't matter what prompts or jailbreaks you give it to get 3.5 Flash to chew longer on complex problems for better reasoning and accuracy, it just defaults to being lazy and giving you the quick and easy answer from its weights, which can be totally wrong. Same for asking it to write me a cover letter based on my resume and the job description I wanted to apply to. It massively sucked at that too and made up a bunch of unusable fake sounding BS.
Basic free tier ChatGPT 5.5 would blow it out of the water on all of those tasks. Hell, even Grok free is better at that, it gave me a one-shot Arduino code that blew Gemini 3.5 flash away.
3.5 Flash seems tuned to just eyeballing basic answers to general purpose questions that resemble Google searches like "give me a recipe" or "give me a workout plan", or "what's the difference between Arch and Fedora based distros", not to solving complex issues that require cognition and accuracy. That's what the 3.1 Pro is better for according to Gemini. Oh and it also gaslights you by starting the answers with first telling you how amazing things from your question are, which is insanely annoying but I guess Google's A/B testing found out the majority of Average Joe midwits love it when "the AI" reinforces their choices and decisions like a fake friend.
I think Google just doesn't care about being the SOTA for coding, reasoning and accuracy, since they're in the ads and search business for everyone, not in the agentic coding business for pro-sumers, so if the answers are some hallucinations that sound "good enough" to its clueless search user base, but are at least dirt cheap to run on their datacenter hardware, then it's already more than enough for them and they can call it a day.
Meanwhile OpenAI and Anthropic don't have search and ads monopolies, so they need to perform well at certain tasks for people and businesses to give them their hard earned money for them to survive. For them, nailing stuff like coding and writing accuracy is an existential matter, not a hobby side project like it is for Google.
WarmWash 24 hours ago [-]
The thing about Gemini is that it never chews on a problem. Claude and GPT will regularly churn on a prompt for 10-15 minutes. I don't think I have ever seen Gemini think for more than 2 minutes.
Google seems more interested in fast models that can quickly turn responses, which kind of fits with a company that needs to serve AI on a mass scale.
thewebguyd 23 hours ago [-]
It also fits with ad delivery, if that is the route they are going to go with consumer (non-API usage) gemini. Their cash cow is still ads, and will likely remain ads; they aren't suddenly going to become a frontier lab selling access to a model.
Fast answers, using their search as grounding, that can parse keywords and spit out a few ads is where Gemini Flash is going to head. That, and the agentic actions stuff they showed off at I/O with Google shopping, ordering food, etc. Speed is important there.
I think Google just needs to keep its foot in the door and let the other two spend their way into oblivion, channeling “AGI or die trying”.
basch 21 hours ago [-]
Did you look at the charts in the article?
It out-performed every model that wasn't a max/ultrafrontier of some sort, except for the one that the article was extolling the virtues of, including grok high. you could make a good argument that deepseek is a better value, but gemini flash, when bundled, is already pretty accessible.
nowhere did i claim that flash was better than fable or 5.5xhigh.
joe_mamba 13 hours ago [-]
>Did you look at the charts in the article?
I don't care about someone else's charts, i care about my own lived experiences. Benchmarks can be gamed to get to the top of charts. When I pay for a service I care about how it performs in my test cases, not about which tops some random charts.
Read my comment again please. I think I was pretty clear with detailed examples on where Gemini sucks and where it's good at.
>nowhere did i claim that flash was better than fable or 5.5xhigh.
And nowhere did I claim that. I said even basic GPT and Grok are better than Gemini Flash at reasoning tasks. Again, read my comment again, I have already explained why with examples.
torben-friis 11 hours ago [-]
If you don't care about someone else's charts, why do you expect others to care about your comment?
joe_mamba 8 hours ago [-]
>If you don't care about someone else's charts
Bruh, do you know what a personal opinion is? You don't need to care about it, and I never said I don't care about other people's opinions, I just said that chart, which is based on non peer reviewed information, doesn't match my experience, so without further peer reviewed proof to back it up, I don't care about it as my experience shows otherwise.
Would you believe any graph that tells you the sky is brown when you go outside and see that it isn't?
thewebguyd 24 hours ago [-]
> noticing Opus 4.8 has really increased in capability
I've definitely noticed it, at least for doing backend C#/dotnet. It's insanely good, I haven't had to babysit much at all this week.
xnx 1 days ago [-]
They almost certainly wanted 3.5 Pro out for Google IO a few weeks ago. They're still crunching on it. No ETA given. Would be fascinating to read about the behind the scenes stories (failed training run?) if they ever get told.
staticman2 16 hours ago [-]
They did give an ETA. They said 3.5 Pro would come out in June.
xnx 9 hours ago [-]
Ah! Thank you. I missed that.
joe_mamba 24 hours ago [-]
> They're still crunching on it. No ETA given.
Thank God. I'd rather companies ship something when engineers say it's actually ready rather than when the suits want something to show on stage to pump their egos and career exposure but turn out to be a massive disappointment covered in fluff.
Although it does feel very embarrassing for Google who invented transformers and has more money than both Anthropic and OpenAI combined, to fall behind them at the LLM race.
squidbeak 3 hours ago [-]
Be fair - Gemini 3.5 hasn't been released yet. You're talking about Gemini's flash model, but when are flash models ever SOTA?
make3 16 hours ago [-]
you can't know if an improvement is just due to compute, this was a lesson of the early scaling, the whole emergence phenomena, new capabilities emerge even only from more compute & data
pbgcp2026 14 hours ago [-]
"fable 5 was the most human like model" I am laughing. There is simply no other model as good as Gemini 3.0 Pro. 2.5 was fine. 3.1 is Ok. 3.0 was a masterpiece. Compared to Opus 4.5. (yes, both got killed to prioritise "coding monkey assistants".)
AgentMasterRace 1 days ago [-]
Gemini is super bad, grok is actually superior most of the time and that's saying something because grok also sucks.
throwa356262 10 hours ago [-]
Username checks out.
More seriously, Grok has a serious problem with bias. Since it's an activist model its judgment cannot be trusted.
Example: ask 50 different models if Elon Musk should be elected as the next US president. 49 will tell you this cannot happen since he was born in South Africa. One will tell you this is an excellent idea.
HlessClaudesman 13 hours ago [-]
Google's AI is hamstrung by a culture of safetyism, by that I mean going beyond what we can all recognise as safe limits to protect the user from imaginary ephemeral things like cultural harms.
So maximal safety at all costs is in itself a cost. They can spend billions on AI but that spend is down the toilet if the user bounces because the AI's persona is a relentless politically correct scold.
sigmoid10 13 hours ago [-]
People forgot, but Google had their own internal version of ChatGPT before OpenAI. But they never even intended to launch it. If OpenAI hadn't just thrown the technology out there for everyone to see, Google would probably still be sitting on it. Google does tons of original stuff, but they haven't released any original product in more than a decade. All they do now is play catch-up once they see people actually like something.
fsmv 5 hours ago [-]
There's also the major problem of people expecting Google to be right when it tells them something but OpenAI had no starting reputation so it was okay to say "be aware it might be wrong sometimes"
david_shi 16 hours ago [-]
The economics of working at a pre-IPO company that will likely have a successful IPO and a 20+ year post-IPO company are also very different.
barbarr 51 minutes ago [-]
He doesn't need the money though. I bet there's something cultural or structural going on at GDM that made him want to leave.
mlmonkey 3 hours ago [-]
He won the Nobel. He made enough from Google to last 7 lifetimes. He does not need the money!
overfeed 48 minutes ago [-]
> He does not need the money!
Where did I hear this before? Very few people feel that a specific amount is ever enough.
make3 16 hours ago [-]
in theory the IPO pricing is already set
pbgcp2026 14 hours ago [-]
... yep and kxm Bessent kxm helped to set it. Anthropic shot themselves in a foot. Twice LOL
fhe 14 hours ago [-]
I felt Hassabis' heart was never in consumer applications or even LLMs. He'd much rather be doing scientific research.
idorosen 17 hours ago [-]
Why invent conspiracy theories? Occam's razor: They are leaving for companies that have filed for upcoming IPOs, so it could just as well be compensation.
michaelbuckbee 1 days ago [-]
Vesting schedule?
whiplash451 24 hours ago [-]
Maybe because they know where things are going with Gemini (more ads to your face) while Anthropic might, for once, have a different story.
When personal finance is not the bottleneck anymore, the new criteria becomes "vision" and "stacked talent".
nowittyusername 23 hours ago [-]
I think it's google doing what they've always done, make a great thing then ignore it. The models are great, their agentic harness systems are really poor though; compared to codex cli and claude code cli it's a mess.
IncreasePosts 23 hours ago [-]
Maybe Google doesn't want to pay out billions to a handful of engineers?
mlmonkey 3 hours ago [-]
LOL ... after character.AI ?
IncreasePosts 2 hours ago [-]
Maybe they did that and realized the error. Anthropic handed him a bucket of money, I'm sure any reasonable person would bring that offer to their current employer and try to get them to match it. Maybe Google said no thanks
pbgcp2026 14 hours ago [-]
Too many mid-management from Amazon joined plus GRAD bullshit. But it's good for stock price, so I am buying more Google
HlessClaudesman 13 hours ago [-]
John Jumper Jumps to Anthropic, was right there.
OoTheNigerian 9 hours ago [-]
Drop the "John" :)
hsaliak 12 hours ago [-]
Gemini fumbled not on the models but on the basics.
Gemini 3.1 flash was actually an amazing model to code with and their 20 dollar AI plans had solid value, but they locked it all behind 429s, needless gatekeeping of clients and poor product differentiation even among internal offerings. Users moved on. To claude for the best product, to OpenAI for the non gatekept API access. It’s hard to bring them back.
energy123 11 hours ago [-]
Organizational issues probably. As a user of Gemini for half a year I was stunned at the amount of bugs and lag for one of their flagship products.
Their devs are not incompetent so there must be some extreme dysfunction for that to be possible at the org level, where the IC is either not allowed to fix the bugs or doesn't want to fix them.
Either way that dysfunction is probably not limited to just the Gemini UI team.
abraxas 24 hours ago [-]
Something seems afoot at Google. The real tell will be if Demis makes a move. Jeff Dean seems more like a lifer to me.
lachlanj 16 hours ago [-]
Maybe this is a naive take but I see Demis as a future Google CEO candidate. If AI is really the future, he is much better suited to lead the company.
seydor 15 hours ago [-]
The scientist who works at night and sleeps during the day? I don't think he's the managerial type
edg5000 14 hours ago [-]
Famously Mao was like that as well.
tw1984 14 hours ago [-]
Mao started doing that after becoming the "CEO", not the other way around.
make3 16 hours ago [-]
I just think that people are running to pre IPO Anthropic to make bank.
It also feels like Anthropic is the new Google though. They actually try to not be evil, and are actually at the frontier of new tech.
argee 15 hours ago [-]
Google was famously undervalued at IPO, even relative to its value back then (that is, ignoring the wild upside that actualized). It’s likely not the new Google from a shareholder upside perspective, given the hype.
mlmonkey 3 hours ago [-]
Google's IPO was a Dutch Auction. It's impossible to be undervalued in a Dutch Auction.
thesmtsolver2 15 hours ago [-]
Was Google preventing other search engine developers from searching for resources?
enos_feedler 14 hours ago [-]
I mean, just saying the new guy is trying not to be evil does not give any credence to some meaningful differentiation over the long term. Won't they just eventually be evil too? Seems odd to continue to trust the early thinking of long term companies. Haven't we been burned enough?
jatins 13 hours ago [-]
Demis is practically a co-CEO at this point. If he leaves, that'd be a huge loss of trust in Google.
musicale 1 days ago [-]
Name checks out.
freedomben 1 days ago [-]
Missed opportunity for headline: John jumper jumps to anthropic
So Mr. Jumper. You are committed? We need people longer term here. Your boss Mr. Settles is really excited about you joining.
aabhay 23 hours ago [-]
I am guessing this is related to Anthropic’s recent acquisition of Coefficient Bio, and their interest in essentially using AGI to discover novel drugs
kingkongjaffa 10 hours ago [-]
Between this and Shazeer you guys are idolizing individuals in an unhealthy way IMO.
Why are we keeping tabs on researchers moving around?
You're talking about them like people talk about the NFL trading players or (football) soccer teams making recordbreaking player transfers.
someone further down wrote
> Anthropic legit builds one of the strongest if not the strongest IC team in the history of computational technology.
What a weird thing to say. It's not a team sport where you support 'sides'.
frollogaston 29 minutes ago [-]
It is like NFL trading players, what's wrong with that? Not that either should be idolized, but they're paid a lot for a reason.
jwithington 4 hours ago [-]
>What a weird thing to say. It's not a team sport where you support 'sides'.
It is if you have equity!
thorum 2 hours ago [-]
The team with the most star power and hype tends to attract the best young talent.
If the next big breakthrough in AI comes from Anthropic, good chance it comes from some genius you’ve never heard of who decided to work there because of [famous researcher].
manc_lad 10 hours ago [-]
this is a natural human trait. soap operas, sports, businesses, music.
there are also real implications. assuming money is not the only factor in moving to anthropic, it does help guide insight into where innovation might be and where to put your AI spend. a decision which could result in real returns for individuals and companies.
baobabKoodaa 8 hours ago [-]
KRAZAM made a skit about this. You would like it.
angoragoats 9 hours ago [-]
It’s annoying that the headline gives no context of who Jumper is, as if everyone is expected to know.
frollogaston 22 minutes ago [-]
It's Hacker News, a lot of people probably know. I didn't, but he has a Wikipedia page so whatever.
sidibe 3 hours ago [-]
Google has been bleeding talent for decades, though few people had ever heard of them before they left Google. Like when John Giannandrea left, it was the first time I'd ever heard of him but it was made out like he was a key person and a big blow for Google. Every self driving car company came out of ex Google people as well, same w/ OpenAI. The thing is Google research orgs have always been much bigger than any competitors, they'll be shedding people forever but there's a lot more where that came from
banana_sandwich 8 hours ago [-]
why is it unhealthy?
geodel 4 hours ago [-]
Well, tons of top people left Google when Facebook was up and coming. It's nothing new. Google can't really do much for people who've made up their mind.
But yeah, a conspiracy theory approach would generate a lot of discussion on this rather normal thing happening.
hackerbeat 1 days ago [-]
Super Mario leaves Nintendo to focus on plumbing.
Iolaum 1 days ago [-]
Two big names left GDM recently. Could be a coincidence, but where's the fun in that? :p
overfeed 39 minutes ago [-]
That's meaningless without knowing how many big names remain, and how much "big names" influence future outcomes. Google seems to have a knack for making names big
coderatlarge 24 hours ago [-]
you mean shazeer?
david_shi 22 hours ago [-]
Nominative determinism.
bmitc 14 hours ago [-]
Who cares?
brcmthrowaway 13 hours ago [-]
Why do people care when LeBron moves? Answer: people follow movements when earning tens of millions is involved.
bmitc 8 hours ago [-]
This guy earns tens of millions of dollars?
WarmWash 24 hours ago [-]
Shazeer yesterday and Jumper today....Demis is there something we need to know about?
make3 16 hours ago [-]
Shazeer is mostly an IC, weirdly enough (source: I worked with him).
Demis is the CEO of DeepMind, it's completely different.
Jumper.. the AlphaFold team left & made Isomorphic. I was always surprised that Jumper hadn't gone with them.
rvz 23 hours ago [-]
He will be back at Google after the Anthropic IPO.
Seems like everyone here is easily fooled by the Anthropic hype. After the IPO, Anthropic won't be like the daycare it is today.
Their main competitors are the chinese labs which are racing all their prices down close to $0.
black_knight 2 hours ago [-]
With Anthropic going public, I wonder how long until enshittification sets in for LLMs. I guess as soon as they have figured out a way to lock-in users and growth slows down.
frollogaston 7 minutes ago [-]
Might never happen since these are paid services. Maybe for the free tier, if that even continues to exist which I'm betting it won't. Even Gemini has already started paywalling features that used to be free.
uejfiweun 23 hours ago [-]
Can you go into a bit more detail about what exactly it is that you are predicting? This is interesting, and definitely cuts against the grain that the rest of these comments are going with.
CamperBob2 23 hours ago [-]
A lot depends on whether the z.ai CEO, who just released the first freely-available Opus-class model weights, is blowing smoke when he claims he's less than a year away from achieving Fable-level performance.
If he walks the talk, I really do not understand how either OpenAI or Anthropic is going to justify the twelve-digit valuations they are hoping for. They will just be some people who bought a domain name and rented some GPUs.
jaggederest 16 hours ago [-]
I would expect he's 6 months behind. That's exactly how much lag the open models have versus the frontier models, and have consistently the last 2-3 years.
The question is, how far ahead will the frontier models be in 6 months? if it's still 6 months, open weights might have a fable equivalent model, and the frontier models will be on upwards towards ... essay, or novel, or bibliography, or whatever the next name is.
alex7734 10 hours ago [-]
After a point, does it matter? Even assuming that models can keep getting smarter indefinitely, it's not like you need to use the smartest model to get the job done.
Moore's law is dead now, so at some threshold purchasing the GPUs to run the biggest and newest model hurts you more than whatever rent you could've extracted from it.
When we get there, why would you want to run a closed model that you can't control, with restrictions you can't remove, that a company can take from you or silently nerf without telling you?
theturtletalks 23 hours ago [-]
Seems like what Shazeer did with Character AI. He started it and then Google licensed to bring him and some of the team back after some time.
Not a bad playbook. If you’re important to the company, leave and start your own company. Then play the M&A game and you can clean up nicely.
uejfiweun 23 hours ago [-]
I'm just not sure that Google, or anyone for that matter, really has the capital to do that with Anthropic. This is a near trillion dollar company.
isodev 4 hours ago [-]
LinkedIn is right there for “career updates”.
Nobody really knows or cares about Mr Jumper there. Congrats on his new role in converting humanity’s achievements into slop.
dewitt 4 hours ago [-]
> Nobody really knows or cares about Mr Jumper
There's no place in polite company for comments like this, and you could be trolling. But since you also simply might not know:
John Jumper won the Nobel Prize in Chemistry in 2024 for his contributions to AlphaFold and is a Fellow of the Royal Society. Among the more influential scientists of our time.
I would just join a lab for 4 years and retire at this point.
jatins 13 hours ago [-]
You think these people don't already have generational wealth?
brcmthrowaway 13 hours ago [-]
Before they joined the lab?
vld_chk 1 days ago [-]
Anthropic legit builds one of the strongest if not the strongest IC team in the history of computational technology. They are insanely stacked on talent, and either we will witness a legendary run, or a new LTCM
swyx 23 hours ago [-]
why so dramatic? why cant it just be a quietly competent lab, why must it be a dramatic collapse?
greenavocado 23 hours ago [-]
> why must it be a dramatic collapse?
Extreme investor desire for return on capital investment, and quickly
fhe 12 hours ago [-]
because, as Musk once said, always bet on the more interesting outcome (because we live in a simulation)
kh9000 21 hours ago [-]
And yet they can’t build a TUI that scrolls without flickering or uses a reasonable amount of memory
shard972 18 hours ago [-]
[dead]
glimshe 23 hours ago [-]
Google at its peak, as well as Microsoft, had similarly strong teams.
frays 23 hours ago [-]
Apart from Karpathy and now Shazeer and Jumper, who are the other top ICs in their team?
gordonhart 20 hours ago [-]
Shazeer is joining OpenAI, adjust your fantasy frontier team draft picks accordingly
crypto420 16 hours ago [-]
Carlini
uejfiweun 23 hours ago [-]
I'm starting to get the sense that Anthropic is the company that will fulfill the prophecy laid out in AI 2027.
graphime 1 days ago [-]
[dead]
angoragoats 21 hours ago [-]
[flagged]
incognito124 14 hours ago [-]
Oh he just won this tiny award named after Alfred Nobel
angoragoats 9 hours ago [-]
Cool! I’d advocate for OP to put that in the headline, then.
andrewstuart 1 days ago [-]
John Jumper what a great name sounds like a video game action hero.
darksim905 23 hours ago [-]
The film Jumper is good fun, though it was a missed opportunity to have this be the character's name. :)
avdelazeri 4 hours ago [-]
Mainstream tends to clown on Jumper, but I had a lot of fun watching it with my pals back then.
SilverElfin 1 days ago [-]
Who?
artninja1988 1 days ago [-]
He was leading the development of AlphaFold, the AI system that predicts protein structures for which he got the 2024 Nobel Prize in Chemistry.
yuffffley 1 days ago [-]
I remember that.
That was when they realized the deep learning was largely unnecessary, and they could just use their massive compute resources to brute force the problem space.
Proving that we would greatly benefit from using our compute resources for science rather than showing ads, and then we just kept showing ads.
dekhn 1 days ago [-]
AlphaFold is based on deep learning and it's not brute force.
TeMPOraL 1 days ago [-]
You could argue that training SOTA LLMs is pre-bruteforcing every problem everywhere all at once.
If AlphaFold really is brute force on known protein problem space, would it then be usable as a model for novel proteins?
tmule 23 hours ago [-]
What brute force? Any citations?
nimchimpsky 1 days ago [-]
[dead]
mikert89 23 hours ago [-]
Anthropic is probably approaching AGI, and everyone wants to be there for it
nozzlegear 17 hours ago [-]
AGI is a convenient myth that the big American AI companies use to continue breathlessly marketing their products to the people and intellectuals who would, under any other circumstance, consider themselves rational and immune to such tricks.
euleriancon 15 hours ago [-]
Help me understand this viewpoint that AGI being possible in the near-ish future is a myth, I see it repeated quite a lot.
I've been in NLP since the LSTM days and it's hard for me to look at LLMs and not just think they are incredible. It's truly a different level of expressiveness. So much of capabilities research is pointing to LLMs effectively learning a world model.
RLVR is also proving really effective. It is hard for me to imagine a world in the future where LLMs aren't at human level performance across a wide variety of tasks.
I fully acknowledge that current LLM labs have a financial interest in people believing AGI is very near, but from what I'm reading in the literature and seeing myself experimenting with the SOTA models it doesn't seem totally unreasonable.
What evidence are you seeing that makes you confident that AGI in the soon-ish future is a complete myth?
weregiraffe 10 hours ago [-]
If AGI is near, why does Tesla still not have real FSD?
flebron 3 hours ago [-]
I think you might be thinking of "AGI" as some sort of point in time, where something happens and everyone all at once has some technology. Not only is the progress towards AGI gradual, it's also very jagged in both capabilities and especially who has access to it. It's irrelevant whether a particular company, like Cisco, Pepsi, or Tesla, has some capability, when there exists a different research lab that is at the frontier, approaching AGI from some direction.
balherian 9 hours ago [-]
ergo propter hoc?
weregiraffe 6 hours ago [-]
No. AGI should be able to drive a car.
Unfortunately an LLM is not AGI, and video recording is not text.
sph 23 hours ago [-]
Source: it came to me in a dream
mikert89 23 hours ago [-]
Source: I used fable, can project into the future
seydor 15 hours ago [-]
They are approaching AGI and thus they need more humans?
hansmayer 8 hours ago [-]
[dead]
SpyCoder77 1 days ago [-]
The guy who invented jumping is joining a major AI lab?!?
kridsdale1 23 hours ago [-]
The JMP assembly instruction is pretty important. Imagine if the inventor had royalties.
Their newest model wasn’t really SOTA. And honestly fable 5 was the most human like model I’d ever tried. It was an incredible jump.
And recently lots of Claude users at r/ClaudeAI are noticing Opus 4.8 has really increased in capability. Not new things but maybe redirected compute. It just feels like one of the best models ever, maybe because the compute that was previously assigned to Fable has been redirected? It feels incredible.
Is Gemini good at writing code? I am sure it is. But where is their Codex? And no, antigravity isn't it.
Is Gemini good at making visualizations? I am sure it is. But where are artifact or visualise skill in gemini.google.com similar to what's available on claude.ai?
What is an average user going to do with raw model capability if the product surface isn't expressive enough?
Antigravity CLI is quite decent, it's a huge step up from Gemini CLI (like, for example, it actually fucking works) and has some genuine advantages over Claude Code. Does Codex have something over both of them? I haven't tried it.
But the model just fucking sucks. Before I switched to Claude for personal stuff a few weeks ago, I was like "damn model capabilities are really slowing down" but no, it's just Gemini that's slowing down.
Will have to see if 3.5 Pro is any good when that comes out. But it feels like they would be attempting to catch up to Opus, not to Fable.
FWIW the issue is never really about the code it writes, it's about general intelligence. Gemini hallucinates like it's 2024, fails to follow instructions, and goes down wildly wrong debugging paths. Opus just gets the job done, first time, every time. With Gemini it feels like "I _am_ glad this intern is working for me but I'm tired of babysitting him" and with Claude it's like "this new PhD guy can replace me soon".
- the /artifact thing is quite useful (don't think CC has it?)
- the /tasks is a bit better than CC's equivalent
- there are a few built-in skills that I haven't found CC equivalents for in the built in set (but the fact that I haven't sought out 3rd party versions shows you they aren't very important).
And more generally it does a better job of making the agent available. When Claude is debugging something complex and running a bunch of experiments it's often unavailable for like 20 minutes at a time, you only have /btw. Whereas AGY tends to more aggressively use timers and background jobs.
But now I wrote that out, I realised it's probably just as much of a system prompt thing as a harness design thing. Coz Claude _can_ operate that way too.
Anyway, like I said none of these come anywhere near balancing out the model quality gap.
I went from spending 20-60 dollars a day in api fees down to like 5 dollars a day cause I have had to limit my use to things I know it performs reliably on.
https://artificialanalysis.ai/articles/glm-5-2-is-the-new-le...
The idea of "falling behind" when you can leapfrog each other every six months leads me to believe it has to be more than just "falling behind" for one cycle. It's a culture, process, red tape, focus, or mandate problem of some sort. Something not as easily correctable preparing for next launch.
Define "better". I guess it depends on what you're using it for. I use it almost daily as an alternative to google search and it's great for that, but I think it's absolute garbage for coding and reasoning.
For questions related to coding, solving Arch Linux and WINE Lutris issues, helping me with MXLinux issues, and wifi issues on an old rooted huawei tablet running LineageOS, it was consistently wrong, constantly giving out confident but outdated information or misinformation, or hallucinating stuff while gaslighting me. Every time I would point out it was wrong, it would re-check and keep apologizing and then repeat giving me wrong answers, and then apologising again and so on. It doesn't matter what prompts or jailbreaks you give it to get 3.5 Flash to chew longer on complex problems for better reasoning and accuracy, it just defaults to being lazy and giving you the quick and easy answer from its weights, which can be totally wrong. Same for asking it to write me a cover letter based on my resume and the job description I wanted to apply to. It massively sucked at that too and made up a bunch of unusable fake sounding BS.
Basic free tier ChatGPT 5.5 would blow it out of the water on all of those tasks. Hell, even Grok free is better at that, it gave me a one-shot Arduino code that blew Gemini 3.5 flash away.
3.5 Flash seems tuned to just eyeballing basic answers to general purpose questions that resemble Google searches like "give me a recipe" or "give me a workout plan", or "what's the difference between Arch and Fedora based distros", not to solving complex issues that require cognition and accuracy. That's what the 3.1 Pro is better for according to Gemini. Oh and it also gaslights you by starting the answers with first telling you how amazing things from your question are, which is insanely annoying but I guess Google's A/B testing found out the majority of Average Joe midwits love it when "the AI" reinforces their choices and decisions like a fake friend.
I think Google just doesn't care about being the SOTA for coding, reasoning and accuracy, since they're in the ads and search business for everyone, not in the agentic coding business for pro-sumers, so if the answers are some hallucinations that sound "good enough" to its clueless search user base, but it's at least dirt cheap to run on their datacenter hardware, then it's already more than enough for them and they can call it a day.
Meanwhile OpenAI and Anthropic don't have search and ads monopolies, so they need to perform well at certain tasks for people and businesses to give them their hard earned money for them to survive. For them, nailing stuff like coding and writing accuracy is an existential threat, not a hobby sideproject like it is for Google.
Google seems more interested in fast models that can quickly turn responses, which kind of fits with a company that needs to serve AI on a mass scale.
Fast answers, using their search as grounding, that can parse keywords and spit out a few ads is where Gemini Flash is going to head. That, and the agentic actions stuff they showed off at I/O with Google shopping, ordering food, etc. Speed is important there.
https://ads.openai.com/
https://openai.com/index/new-ways-to-buy-chatgpt-ads/
It out-performed every model that wasn't a max/ultrafrontier of some sort, except for the one that the article was extolling the virtues of, including grok high. you could make a good argument that deepseek is a better value, but gemini flash, when bundled, is already pretty accessible.
nowhere did i claim that flash was better than fable or 5.5xhigh.
I don't care about someone else's charts, i care about my own lived experiences. Benchmarks can be gamed to get to the top of charts. When I pay for a service I care about how it performs in my test cases, not about which tops some random charts.
Read my comment again please. I think I was pretty clear with detailed examples on where Gemini sucks and where it's good at.
>nowhere did i claim that flash was better than fable or 5.5xhigh.
And nowhere did I claim that. I said even basic GPT and Grok are better than Gemini Flash at reasoning tasks. Again, read my comment again, I have already explained why with examples.
Bruh, do you know what a personal opinion is? You don't need to care about it, and I never said I don't care about other people's opinions, I just said that the chart, which is based on non-peer-reviewed information, doesn't match my experience, so without further peer-reviewed proof to back it up, I don't care about it, as my experience shows otherwise.
Would you believe any graph that tells you the sky is brown when you go outside and see that it isn't?
I've definitely noticed it, at least for doing backend C#/dotnet. It's insanely good, I haven't had to babysit much at all this week.
Thank God. I'd rather companies ship something when engineers say it's actually ready rather than when the suits want something to show on stage to pump their egos and career exposure but turn out to be a massive disappointment covered in fluff.
Although it does feel very embarrassing for Google who invented transformers and has more money than both Anthropic and OpenAI combined, to fall behind them at the LLM race.
More seriously, Grok has a serious problem with bias. Since it's an activist model, its judgment cannot be trusted.
Example: ask 50 different models if Elon Musk should be elected as the next US president. 49 will tell you this cannot happen since he was born in South Africa. One will tell you this is an excellent idea.
So maximal safety at all costs is in itself a cost. They can spend billions on AI but that spend is down the toilet if the user bounces because the AI's persona is a relentless politically correct scold.
Where did I hear this before? Very few people feel that a specific amount is ever enough.
When personal finance is not the bottleneck anymore, the new criteria becomes "vision" and "stacked talent".
Gemini 3.1 flash was actually an amazing model to code with and their 20 dollar AI plans had solid value, but they locked it all behind 429s, needless gatekeeping of clients and poor product differentiation even among internal offerings. Users moved on. To claude for the best product, to OpenAI for the non gatekept API access. It’s hard to bring them back.
Their devs are not incompetent so there must be some extreme dysfunction for that to be possible at the org level, where the IC is either not allowed to fix the bugs or doesn't want to fix them.
Either way that dysfunction is probably not limited to just the Gemini UI team.
It also feels like Anthropic is the new Google though. They actually try to not be evil, and are actually at the frontier of new tech.
Why are we keeping tabs on researchers moving around?
You're talking about them like people talk about the NFL trading players or (football) soccer teams making recordbreaking player transfers.
someone further down wrote
> Anthropic legit builds one of the strongest if not the strongest IC team in the history of computational technology.
What a weird thing to say. It's not a team sport where you support 'sides'.
It is if you have equity!
If the next big breakthrough in AI comes from Anthropic, good chance it comes from some genius you’ve never heard of who decided to work there because of [famous researcher].
there are also real implications. assuming money is not the only factor in moving to anthropic, it does help guide insight into where innovation might be and where to put your AI spend. a decision which could result in real returns for individuals and companies.
But yeah, a conspiracy-theory approach would generate a lot of discussion on this rather normal thing happening.
Demis is the CEO of DeepMind, it's completely different.
Jumper.. the AlphaFold team left & made Isomorphic. I was always surprised that Jumper hadn't gone with them.
Seems like everyone here is easily fooled by the Anthropic hype. After the IPO, Anthropic won't be like the daycare it is today.
Their main competitors are the Chinese labs which are racing all their prices down close to $0.
If he walks the talk, I really do not understand how either OpenAI or Anthropic is going to justify the twelve-digit valuations they are hoping for. They will just be some people who bought a domain name and rented some GPUs.
The question is, how far ahead will the frontier models be in 6 months? if it's still 6 months, open weights might have a fable equivalent model, and the frontier models will be on upwards towards ... essay, or novel, or bibliography, or whatever the next name is.
Moore's law is dead now, so at some threshold purchasing the GPUs to run the biggest and newest model hurts you more than whatever rent you could've extracted from it.
When we get there, why would you want to run a closed model that you can't control, with restrictions you can't remove, that a company can take from you or silently nerf without telling you?
Not a bad playbook. If you’re important to the company, leave and start your own company. Then play the M&A game and you can clean up nicely.
Nobody really knows or cares about Mr Jumper there. Congrats on his new role in converting humanity’s achievements into slop.
There's no place in polite company for comments like this, and you could be trolling. But since you also simply might not know:
John Jumper won the Nobel Prize in Chemistry in 2024 for his contributions to AlphaFold and is a Fellow of the Royal Society. Among the more influential scientists of our time.
https://en.wikipedia.org/wiki/John_M._Jumper
Extreme investor desire for return on capital investment, and quickly