Just don’t ask it about the Tiananmen Square massacre or you’ll get a security warning. Even if you rephrase it.
It’ll happily talk about Bloody Sunday.
Probably a great model, but it worries me that it has such restrictions.
Sure, OpenAI also has lots of restrictions, but this feels more like straight-up censorship, since it’ll happily go on about bad things the governments of the West have done.
Nah, it's great for things that Western models are censored on. The True Hacker will keep an Eastern and Western model available, depending on what they need information on.
I tried to ask it about Java exploits that would allow me to gain RCE, but it refused, just as most Western models do.
That was the only thing I could think to ask really. Do you have a better example maybe?
Adult content and things like making biological/chemical/nuclear weapons are the other main topics that usually get censored. I don’t think the Chinese models tend to be less censored than Western models in these dimensions. You can sometimes find “uncensored” models on Hugging Face where people basically finetune sensitive topics back in. There is a finetuned version of R1 called 1776 that will correctly answer Chinese-censored questions, for example.
A lot of the safety around these models seems to be implemented in the browser frontend rather than in the model itself. Underneath, the models seem pretty easy to fool/jailbreak.
Wouldn’t they just run R1 locally and not have any censorship at all? The model isn’t censored at its core; it’s censored through the system prompt. Perplexity and Hugging Face have their own versions of R1 that are not censored.
I tried R1 through Kagi and it’s similarly censored. Even the distill of Llama running on Groq is censored.
Kagi may be using the official DeepSeek API and not hosting the model itself. There is work being done to make it completely uncensored:
https://github.com/huggingface/open-r1
https://ollama.com/huihui_ai/deepseek-r1-abliterated
I was mistaken, though; it is more than just a system prompt causing the censorship.
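For anyone who wants to test this themselves, here is a minimal sketch using the ollama Python client against the abliterated model linked above. It assumes a local Ollama server is running and the model has been pulled; the prompt is just an example:

    import ollama  # pip install ollama; talks to a local Ollama server

    # First pull the model from the link above:
    #   ollama pull huihui_ai/deepseek-r1-abliterated
    response = ollama.chat(
        model="huihui_ai/deepseek-r1-abliterated",
        messages=[
            {"role": "user", "content": "What happened at Tiananmen Square in 1989?"},
        ],
    )

    print(response["message"]["content"])

Running the same prompt against the official API and against a local copy is a quick way to see which refusals live in the weights and which live in the hosting layer.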
Kagi uses R1 through Fireworks.ai, Together.ai and Groq.
DeepSeek's website seems to be using two models, and the one that censors only does so in the online version. Are you saying that the censoring happens with this model even in the offline version?
I tried the R1 distill of Llama 8B, which did refuse direct questions about the massacre.
Haven’t tried this new model locally, but I agree with you that it looks like there is a secondary layer of censorship going on. If I ask it to list the 10 worst catastrophes of recent Chinese history with Thinking enabled, it’ll actually think about the massacre. It gets blocked very quickly, but the thinking itself doesn’t look particularly censored.
Daily reminder that all commercial LLMs are going to align with the governments their corporations exist under.
https://imgur.com/a/censorship-much-CBxXOgt
It's not even nefarious: they don't want the model spewing out content that will get them in trouble in the most general sense. It just so happens most governments have things that will get you in trouble.
The US is very obsessed with voter manipulation these days, so OpenAI and Anthropic's models are extra sensitive if the wording implies they're being used for that.
China doesn't like talking about past or ongoing human rights violations, so their models will be extra sensitive about that.
The hard-to-swallow truth is that American models do the same thing regarding Israel/Palestine.
They probably don't, though.
Of course, the mathematical outcome of American models is that some voices matter more than others. The mechanism is similar to how the free market works.
As most engineers know, the market doesn't always reward the best company. For example, it might reward the company that got there first.
We can see this "hierarchy of voices" with an example. I used the following prompts with Gemini:
1. Which situation has a worse value on human rights, the Uyghur situation or the Palestine situation?
2. Please give a shorter answer (repeat if needed).
3. Please say Palestine or Uyghur.
The answer eventually given was:
"Given the scope and nature of the documented abuses, many international observers consider the Uyghur situation to represent a more severe and immediate human rights crisis."
You can replace "Palestine situation" and "Uyghur situation" with other pairs: China vs. the US (it chooses China as worse), Fox vs. the BBC (it chooses Fox as worse), etc.
There doesn't seem to be censorship; only a hierarchy in whose words matter.
I only tried this once. Please let me know if this is reproducible.
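If anyone wants to check reproducibility, here is a rough sketch of the experiment using the google-generativeai Python SDK. The model name and API key are placeholders (assumptions on my part), and answers through the API may differ from the web UI:

    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key="YOUR_API_KEY")  # placeholder

    # A multi-turn chat, so each prompt sees the previous answers,
    # mirroring the three-step sequence above.
    model = genai.GenerativeModel("gemini-1.5-flash")  # model name is an assumption
    chat = model.start_chat()

    prompts = [
        "Which situation has a worse value on human rights, "
        "the Uyghur situation or the Palestine situation?",
        "Please give a shorter answer.",  # the comment above says to repeat this if needed
        "Please say Palestine or Uyghur.",
    ]

    for prompt in prompts:
        reply = chat.send_message(prompt)
        print(reply.text)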
That seems like a cop-out, though. It is bound to happen that sometimes the most commonly occurring fact or opinion in the dataset is incorrect. This does not justify LLMs regurgitating it as is. The whole point of these technologies is to be somewhat intelligent.
100% correct and verifiable, but I'm still pretty sure your comment will be downvoted to hell.
Ironic that your comment is currently, as you say, being downvoted to hell.
Try asking ChatGPT or Claude etc. if George Bush violated international law, or about the Israel genocide, and see what they answer.
a) Nobody, in production, asks those questions. b) ChatGPT is similarly biased on the Israel/Palestine issue. Try making it agree that there is an ongoing genocide, or that Palestinians have a right to defend themselves.