osti 6 days ago

Are 5090s able to run 32B models?

regularfry 5 days ago

The 4090 can run 32B models in Q4_K_M, so yes, on that measure. Not unquantised, though: 16-bit weights for a 32B model come to about 64GB, and even Q8 (about 34GB of weights) is slightly more than the card holds. On a 32GB card you'll have more choices to trade off quantisation against context.
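For a rough feel of the numbers, here is a back-of-the-envelope sketch. The bits-per-weight figures are approximations (Q8_0 is 8.5 by construction; the 4.85 for Q4_K_M is an assumed typical value that varies by model), and `overhead_gb` is a made-up allowance for KV cache and runtime buffers, not a measured number. The helper names are hypothetical.

```python
# Approximate bits per weight for a few formats.
# Q8_0 stores 32 int8 weights plus one fp16 scale (8.5 bits per weight).
# The Q4_K_M value is an assumed typical figure, not exact for every model.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5, "F16": 16.0}


def weight_gb(params_billion: float, quant: str) -> float:
    """Estimated size of the weights alone, in decimal GB."""
    total_bits = params_billion * 1e9 * BITS_PER_WEIGHT[quant]
    return total_bits / 8 / 1e9


def fits(params_billion: float, quant: str, vram_gb: float,
         overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed overhead fit in vram_gb.

    overhead_gb is a placeholder for KV cache and buffers; real usage
    grows with context length.
    """
    return weight_gb(params_billion, quant) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for quant in BITS_PER_WEIGHT:
        size = weight_gb(32, quant)
        print(f"{quant}: ~{size:.1f} GB weights, "
              f"fits 24GB: {fits(32, quant, 24)}, "
              f"fits 32GB: {fits(32, quant, 32)}")
```

Whatever is left over after the weights is what you get to spend on context, which is where the extra 8GB helps.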