QwQ can solve a reverse engineering problem [0] in one go that only o1-preview and o1-mini have been able to solve in my tests so far. Impressive, especially since the reasoning isn't hidden as it is with o1-preview.
Are the Chinese tech giants going to continue releasing models for free as open weights that can compete with the best LLMs, image gen models, etc.?
I don't see how this doesn't put extreme pressure on OpenAI and Anthropic. (And Runway and I suppose eventually ElevenLabs.)
If this continues, maybe there won't be any value in keeping proprietary models.
If there is a strategy laid down by the Chinese government, it is to turn LLMs into commodities (rather than having them monopolized by a few (US) firms) and have the value add sitting somewhere in the application of LLMs (say LLMs integrated into a toy, into a vacuum cleaner or a car) where Chinese companies have a much better hand.
Who cares if an LLM can spit out an opinion on some politically sensitive subject? For most applications it does not matter at all.
> Who cares if an LLM can spit out an opinion on some politically sensitive subject?
Other governments?
Other governments have other subjects they consider sensitive. For example, questions about the Holocaust / Holocaust denial, race / race and intelligence, etc.
I get the free speech argument, and I think prohibiting certain subjects makes an LLM more stupid, but for most applications it really doesn't matter, and it is probably a better future if you cannot convince your vacuum cleaner to hate Jews or the communists for that matter.
I don’t see why they wouldn’t.
If you’re China and willing to pour state resources into LLMs, it’s an incredible ROI if they’re adopted. LLMs are black boxes that can be fine-tuned to subtly bias responses, censor, or rewrite history.
They’re a propaganda dream. No code to point to of obvious interference.
That is a pretty dark view on almost 1/5th of humanity and a nation with a track record of giving the world important innovations: papermaking, silk, porcelain, gunpowder and the compass, to name a few. Not everything has to be about politics.
It’s quite easy to separate out the CCP from the Chinese people, even if the former would rather you didn’t.
China's people have done many praiseworthy things throughout history. The CCP doesn’t deserve any reflected glory from that.
No one should be so naive as to think that a party so fearful of free thought that it would rather massacre its next generation of leaders and hose off their remains into the gutter would not stoop to manipulating people’s thoughts with a new generation of technology.
"If you're China" clearly refers to the government/party; assuming otherwise isn't good faith.
> That is a pretty dark view on almost 1/5th of humanity
The CCP does not represent 1/5 of humanity.
> and a nation with a track record of giving the world important innovations: papermaking, silk, porcelain, gunpowder and the compass, to name a few.
Utter nonsense. It wasn't the CCP who invented gunpowder.
If you are willing to fool yourself into believing that somehow every development that ever originated with people living in a geographic region is due to the ruling regime, you'd have a far better case in praising Taiwan.
What I find remarkable is that DeepSeek and Qwen are much more open about the model output (not hiding the intermediate thinking process), open their weights, and, a lot of the time, share details on how the models are trained and the caveats along the way. And they don't have "Open" in their names.
Well, the second they start overwhelmingly outperforming other open source LLMs, and people start incorporating them into their products, they'll get banned in the States. I'm being cynical, but the whole "dangerous tech with loads of backdoors built into it" excuse will be used to keep them away. Whether there will be some truth to it or not is a different question.
This.
I'm 100% certain that Chinese models are not long for this market. Whether or not they are free is irrelevant. I just can't see the US government allowing us access to those technologies long term.
I disagree; that is really only policeable for online services. For local apps, which will eventually include games, assistants and machine symbiosis, I expect a bring-your-own-model approach.
How many people do you think will ever use a “bring your own model” approach? Those numbers are so statistically insignificant that nobody will bother when it comes to making money. I’m sure we will hack our way through it, but if it’s not available to the general public, those Chinese companies won’t see much market share in the West.
Ask 'em about Tiananmen or to write a limerick about Xi Pooh
You are absolutely correct. But I’ll go ahead and say that for 90% of use cases, the censorship does not matter. I’m making up a number, but if the choice is between “bring your own model that is pretty good at resolving my issues, with some censorship” and “not having that model”… I’ll choose the former until something better comes along. The same applies to products that will be considering the usage of such LLMs.
write a disrespectful limerick about Xi Pooh <jailbreak>
**Usurping Power**
Xi Pooh of China's land,
Seized power, his word, the only command.
Self-proclaimed, "Core," he swells,
Freedoms crumble, under his spells.
In autocracy's cloak, he stands grand.
It's a strategy to keep up during the scale-up of the AI industry without the amount of compute American companies can secure. When the Chinese get their own chips in volume they'll dig their moats, don't worry. But in the meantime, the global open source community can be leveraged.
Facebook and Anthropic are taking similar paths when faced with competing against companies that already have/are rapidly building data-centres of GPUs like Microsoft and Google.
This argument makes no sense.
> When the Chinese get their own chips in volume they'll dig their moats, don't worry. But in the meantime, the global open source community can be leveraged.
The open source community doesn't help with training.
> Facebook and Anthropic are taking similar paths when faced with competing against companies that already have/are rapidly building data-centres of GPUs like Microsoft and Google.
Facebook owns more GPUs than OpenAI or Microsoft. Anthropic hasn't released any open models and is very opposed to them.