redundantly 2 days ago

Trivial after a substantial hardware investment, plus installation, configuration, testing, benchmarking, tweaking, hardening, and benchmarking again; then new models come out, so it's more tweaking and benchmarking and tweaking again, all while slamming your head against the wall dealing with the mediocre documentation surrounding every hardware and software component you're trying to deploy.

Yup. Trivial.

dvt 2 days ago

Even my 4-year-old M1 Pro can run a quantized DeepSeek R1 pretty well. Sure, productizing these models at full scale is hard work (and the average "just-make-shovels" startups are failing hard at this), but we'll 100% get there in the next 1-2 years.
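
For concreteness, a minimal sketch of what "running it" looks like on such a machine, assuming Ollama is serving a quantized DeepSeek R1 variant locally (the model tag and prompt are illustrative, not from the thread):

    import requests

    # Assumes a local Ollama server on its default port (11434) with a
    # quantized DeepSeek R1 variant already pulled, e.g. `ollama pull deepseek-r1:14b`.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "deepseek-r1:14b",  # assumption: use whatever tag you pulled
            "prompt": "Explain ITAR compliance in one sentence.",
            "stream": False,  # return a single JSON object instead of a stream
        },
        timeout=300,
    )
    print(resp.json()["response"])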

whatevaa 2 days ago

Those small models suck. You need the big guns to get those "amazing" coding agents.

bravesoul2 2 days ago

Local for emotional therapy. Big guns to generate code. Local to edit generated code once it is degooped and worth something.

benoau 2 days ago

I put LM Studio on an old gaming rig with a 3060 Ti; it took about 10 minutes to start using it, and most of that time was downloading a model.
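
For anyone curious, LM Studio's built-in local server speaks the OpenAI wire format (port 1234 by default), so a few lines of Python are enough to talk to it; the model name below is a placeholder for whatever you loaded:

    from openai import OpenAI

    # LM Studio's local server is OpenAI-compatible; the API key is ignored.
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    reply = client.chat.completions.create(
        model="local-model",  # placeholder: use the model you loaded in LM Studio
        messages=[{"role": "user", "content": "Hello from the old gaming rig"}],
    )
    print(reply.choices[0].message.content)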

jjmarr 2 days ago

If you're dealing with ITAR compliance, you should already have experience hosting things on-premises.

dlivingston 1 day ago

Yes. The past two companies I've been at have self-hosted enterprise LLMs running on their own servers and connected to internal documentation. There is also Azure Government and other similar privacy-first ways of doing this.

But also, running LLMs locally is easy. I don't know what goes into hosting them as a service for your org, but just getting an LLM running locally is a straightforward 30-minute task.

genewitch 2 days ago

I'm for hire, I'll do all that for any company that needs it. Email in profile. Contract or employee, makes no difference to me.

blastro 2 days ago

This hasn't been my experience. Pretty easy with AWS Bedrock.
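
"Pretty easy" here means a few lines against a managed endpoint rather than self-hosting; a minimal boto3 sketch, assuming AWS credentials are configured and the model ID (illustrative) is enabled in your account and region:

    import boto3

    # Managed inference, not self-hosting: Bedrock runs the model on AWS's side.
    client = boto3.client("bedrock-runtime", region_name="us-east-1")

    resp = client.converse(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumption: any enabled model
        messages=[{"role": "user", "content": [{"text": "Summarize ITAR in one line."}]}],
    )
    print(resp["output"]["message"]["content"][0]["text"])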

paxys 2 days ago

Ah yes, "self host" by using a fully Amazon-managed service on Amazon's servers. How would a US court ever access those logs?

garyfirestorm 2 days ago

Run a vLLM Docker container. Yeah, the assumption is that you already know what hardware you need, or you already have it on-prem. Assuming this is ITAR stuff, you must be self-hosting everything.
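
A minimal sketch of that path, assuming the vLLM OpenAI-compatible container is already running on the box (model name and port are illustrative):

    from openai import OpenAI

    # Assumes the server was started with something like:
    #   docker run --gpus all -p 8000:8000 vllm/vllm-openai \
    #       --model mistralai/Mistral-7B-Instruct-v0.3
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    out = client.completions.create(
        model="mistralai/Mistral-7B-Instruct-v0.3",  # must match the served model
        prompt="Hello from the air-gapped box",
        max_tokens=64,
    )
    print(out.choices[0].text)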