paxys 7 hours ago

Does anyone know what GPUs the Qwen team has access to for training these models? They can't be Nvidia, right?

jsheard 6 hours ago

Nvidia still sells GPUs to China; they made special SKUs specifically to slip under the spec limits imposed by the sanctions:

https://www.tomshardware.com/news/nvidia-reportedly-creating...

Those cards ship with 24GB of VRAM, but supposedly there are companies doing PCB rework to upgrade them to 48GB:

https://videocardz.com/newz/nvidia-geforce-rtx-4090d-with-48...

Assuming the regular SKUs aren't making it into China anyway through back channels...

paxys 6 hours ago

A company of Alibaba's scale probably isn't going to risk evading US sanctions. Even more so considering they are listed on the NYSE.

griomnib 4 hours ago

NVIDIA sure as hell is trying to evade the spirit of the sanctions. Seriously questioning the wisdom of that.

nl 4 hours ago

> the spirit of the sanctions

What does this mean? The sanctions are very specific on what can't be sold, so the spirit is to sell anything up to that limit.

chronic74930791 1 hour ago

> What does this mean? The sanctions are very specific on what can't be sold, so the spirit is to sell anything up to that limit.

25% of Nvidia revenue comes from the tiny country of Singapore. You think Nvidia is asking why? (Answer: they aren’t)

bovinejoni 40 minutes ago

Not according to their reported financials. You have a source for that number?

umeshunni 18 minutes ago

https://www.cnbc.com/amp/2023/12/01/this-tiny-country-drove-...

About 15% or $2.7 billion of Nvidia's revenue for the quarter ended October came from Singapore, a U.S. Securities and Exchange Commission filing showed. Revenue coming from Singapore in the third quarter jumped 404.1% from the $562 million in revenue recorded in the same period a year ago.

hyperknot 6 hours ago

There was also a video where they resolder memory chips on gaming-grade cards to make them usable for AI workloads.

ipsum2 6 hours ago

That only works for inference, not training.

willy_k 4 hours ago

Why so?

miki123211 3 hours ago

Because training usually requires bigger batches, a backward pass in addition to the forward pass, storing optimizer states in memory, etc. This means it takes a lot more RAM than inference, so much more that you often can't run it on a single GPU.

If you're training on more than one GPU, the speed at which you can exchange data between them suddenly becomes your bottleneck. To alleviate that problem, you need an extremely fast, direct GPU-to-GPU "interconnect", something like NVLink for example, and consumer GPUs don't provide that.

Even if you could train on a single GPU, you probably wouldn't want to, because of the sheer amount of time that would take.
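
To put rough numbers on the memory and interconnect points above, here is a back-of-the-envelope sketch. It assumes the common mixed-precision Adam rule of thumb (about 16 bytes per parameter for weights, gradients and optimizer state, activations not counted) and the standard ring all-reduce traffic formula. The 7B model size and both bandwidth figures are hypothetical illustration values, not specs of any particular card.

```python
# Rough estimate only: activations, KV cache and framework overhead are ignored.

def inference_bytes(n_params, bytes_per_weight=2):
    """Memory for fp16/bf16 weights alone."""
    return n_params * bytes_per_weight


def training_bytes(n_params, weight_bytes=2, grad_bytes=2, optimizer_bytes=12):
    """Mixed-precision Adam rule of thumb:
    2 (fp16 weights) + 2 (fp16 grads) + 12 (fp32 master copy, momentum, variance)."""
    return n_params * (weight_bytes + grad_bytes + optimizer_bytes)


def ring_allreduce_seconds(payload_bytes, n_gpus, bytes_per_second):
    """Time for one ring all-reduce: each GPU moves 2*(N-1)/N of the payload."""
    traffic = 2 * (n_gpus - 1) / n_gpus * payload_bytes
    return traffic / bytes_per_second


if __name__ == "__main__":
    GB = 1e9
    n_params = 7_000_000_000  # hypothetical 7B-parameter model

    print("inference GB:", inference_bytes(n_params) / GB)
    print("training GB: ", training_bytes(n_params) / GB)

    grads = n_params * 2  # fp16 gradients synced every step
    slow_link = 25e9      # hypothetical PCIe-class link, bytes/s
    fast_link = 400e9     # hypothetical NVLink-class link, bytes/s
    print("sync on slow link (s):", ring_allreduce_seconds(grads, 8, slow_link))
    print("sync on fast link (s):", ring_allreduce_seconds(grads, 8, fast_link))
```

Under these assumptions the weights of a 7B model fit on a 24GB or 48GB card for inference, while the training state alone is several times larger than either. The gradient sync also scales inversely with link bandwidth, so a slow link makes multi-GPU training slower rather than impossible.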

elashri 2 hours ago

But does this prevent a cluster of consumer GPUs from being used for training? Or does it just make it slower and less efficient?

These are genuine questions, not argumentative ones.

trebligdivad 4 hours ago

Alibaba's cloud has data centres around the world including the US, EU, UK, Japan, SK, etc., so I'd assume they can legally get recent tech. See:

https://www.alibabacloud.com/en/global-locations?_p_lc=1

lithiumii 4 hours ago

Many Chinese tech giants already had A100s and maybe some H100s before the sanctions. After the first wave of sanctions (which banned the A100 and H100), NVIDIA released the A800 and H800, which are nerfed versions of the A100 and H100.

Then there was a second round of sanctions that banned the H800, the A800, and everything down to much weaker cards like the A6000 and 4090. So NVIDIA released the H20 for China. The H20 is an especially interesting card because it has weaker compute but larger VRAM (96 GB instead of the typical 80 GB for the H100).

And of course they could have smuggled some more H100s.

hustwindmaple1 4 hours ago

Large Chinese companies usually have overseas subsidiaries, which can buy H100 GPUs from Nvidia.

nl 3 hours ago

Movement of the chips to China is under restriction too.

However, neither access to the chips via cloud compute providers nor Chinese nationals working in the US or other countries on clusters powered by the chips is restricted.

nextworddev 4 hours ago

Which is why the CHIPS Act is a joke.

nl 3 hours ago

The CHIPS Act isn't related to the sanctions.