Llama 3 8B can be run in 6GB of VRAM, and it’s fairly competent. Gemma has a 9B I think, which would also be worth looking into.
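For anyone curious what that looks like in practice, here’s a minimal sketch using llama-cpp-python with a 4-bit GGUF quant (which is roughly what fits in 6GB); the model filename is just a placeholder, not a specific download:

```python
# Sketch: run a quantized Llama 3 8B locally with llama-cpp-python.
# Assumes a ~4-5GB Q4 GGUF file on disk; the path below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",  # hypothetical filename
    n_gpu_layers=-1,  # offload all layers to the GPU (a Q4 8B fits in ~6GB)
    n_ctx=4096,       # context window; bigger contexts eat more VRAM
)

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Why run an LLM locally?"}]
)
print(resp["choices"][0]["message"]["content"])
```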
Comment on “Self-Hosted AI is pretty darn cool”
coffee_with_cream@sh.itjust.works 2 months ago
You probably want 48GB of VRAM or more to run the good stuff. I recommend renting GPU time instead of using your own hardware, via AWS or other vendors - runpod.io is pretty good.
theterrasque@infosec.pub 2 months ago
31337@sh.itjust.works 2 months ago
IDK, looks like 48GB cloud pricing would be $0.35/hr => $255/month. Used 3090s go for $700. Two 3090s would give you 48GB of VRAM, and cost $1400 (I’m assuming you can do “model-parallel” with Llama; never tried running an LLM, but it should be possible and work well). So, the break-even point would be <6 months. Hmm, but if Serverless works well, that could be pretty cheap. Would probably take a few minutes to process and load a ~48GB model every cold start though?
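To make the arithmetic explicit, here’s the back-of-the-envelope version; the $0.35/hr and $700-per-3090 figures are the ones quoted above, not current market prices:

```python
# Back-of-the-envelope break-even: renting a 48GB GPU vs. buying two used 3090s.
# Figures are the ones quoted in the comment above, not authoritative prices.
CLOUD_RATE_PER_HOUR = 0.35        # $/hr for ~48GB of VRAM in the cloud
HOURS_PER_MONTH = 24 * 30         # assuming the instance runs continuously

cloud_monthly = CLOUD_RATE_PER_HOUR * HOURS_PER_MONTH   # ~$252/month
used_3090_price = 700
local_hardware = 2 * used_3090_price                    # $1400 for 48GB total

break_even_months = local_hardware / cloud_monthly
print(f"Cloud: ~${cloud_monthly:.0f}/month, hardware: ${local_hardware}")
print(f"Break-even after ~{break_even_months:.1f} months")  # ~5.6 months
```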
fhein@lemmy.world 2 months ago
Assuming they already own a PC, if someone buys two 3090s for it they’ll probably also have to upgrade their PSU, so that might be worth including in the budget. But it’s definitely a relatively low-cost way to get more VRAM; there are people who run 3 or 4 RTX 3090s too.
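As a rough power-budget sketch: assuming the stock 350W TDP per 3090, a ballpark 200W for the rest of the system, and ~30% headroom for transient spikes (the last two numbers are assumptions, not measurements):

```python
# Rough PSU sizing for a multi-3090 build.
# 350W is the stock RTX 3090 TDP; the 200W "rest of system" figure and the
# 30% headroom are ballpark assumptions, not measurements.
GPU_TDP_W = 350
REST_OF_SYSTEM_W = 200
HEADROOM = 1.3  # leave ~30% margin for transient spikes

def recommended_psu_watts(num_gpus: int) -> int:
    peak_draw = num_gpus * GPU_TDP_W + REST_OF_SYSTEM_W
    return int(peak_draw * HEADROOM)

for gpus in (2, 3, 4):
    print(f"{gpus}x 3090: ~{recommended_psu_watts(gpus)}W PSU recommended")
# 2x -> ~1170W, 3x -> ~1625W, 4x -> ~2080W
```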
NotMyOldRedditName@lemmy.world 2 months ago
Kinda defeats the purpose of doing it private and local.
I wouldn’t trust any claims a 3rd party service makes with regards to being private.