ggregoire 6 days ago

We were using Llama 3.2 Vision a few months back and were very frustrated with it (both in terms of speed and result quality). One day we went looking for alternatives on Hugging Face and eventually stumbled upon Qwen. The difference in accuracy and speed absolutely blew our minds. We ask it to find something in an image and we get a response in about half a second on a 4090, and it's correct most of the time.

What's even more mind-blowing is that when we ask it to extract any entity name from the image and the entity name is truncated, it gives us the complete name without even having to ask for it (e.g. "Coca-C" is barely visible in the background, and it will return "Coca-Cola" on its own). It does this with entities far less well known than Coca-Cola, including ones only known in very specific regions. Haven't looked back at Llama or any other vision model since we tried Qwen.
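
If anyone wants to try something similar, here's a rough sketch of that kind of setup using Qwen2-VL through Hugging Face transformers. The model ID, image path and prompt below are placeholders, not our exact pipeline:

  # pip install torch transformers pillow  (transformers >= 4.45 for Qwen2-VL)
  from PIL import Image
  from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

  model_id = "Qwen/Qwen2-VL-7B-Instruct"  # placeholder; pick the size that fits your GPU
  model = Qwen2VLForConditionalGeneration.from_pretrained(
      model_id, torch_dtype="auto", device_map="auto"
  )
  processor = AutoProcessor.from_pretrained(model_id)

  image = Image.open("frame.jpg")  # placeholder image
  messages = [{
      "role": "user",
      "content": [
          {"type": "image"},
          {"type": "text", "text": "List every brand or entity name visible in this "
                                   "image, including partially visible ones."},
      ],
  }]

  # Build the chat prompt, bundle it with the image, and run generation
  prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
  inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
  output_ids = model.generate(**inputs, max_new_tokens=128)

  # Strip the prompt tokens so only the model's answer is decoded
  answer_ids = output_ids[:, inputs.input_ids.shape[1]:]
  print(processor.batch_decode(answer_ids, skip_special_tokens=True)[0])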

Alifatisk 5 days ago

Ever since I switched to Qwen as my go-to, it's been bliss. They have a model for many (if not all) use cases. No more daily quota! And you get to use their massive context window (1M tokens).

Hugsun 5 days ago

How are you using them? Who is enforcing the daily quota?

Alifatisk 5 days ago

I use them through chat.qwenlm.ai. What's nice is that you can run your prompt through 3 different modes in parallel to see which one suits the task best.

The daily quota I mentioned is on ChatGPT and Claude, which are very limited on usage (for free users at least, which is understandable), while on Qwen I've felt like I'm abusing it with how much I use it. It's very versatile in the sense that it has image generation, video generation, a massive context window, and both visual and textual reasoning all in one place.

Alibaba is really doing something amazing here.

exe34 5 days ago

what do you use to serve it, ollama or llama.cpp or similar?